MTJs everywhere

I attended an IEEE Magnetics Society Distinguished Lecture on the MTJ (magnetic tunnel junction), a technology that has been under development since 1960. It turns out that since about 2004, the TMR (tunneling magneto-resistance) read head based on MTJs has been the dominant technology used in HDDs for reading magnetic bits recorded on media. TMR is a magneto-resistance effect that occurs in an MTJ device.

And MTJ devices are also used in today’s MRAM devices, one of the contenders to replace flash (e.g., see our The end of NAND… post). Given that TLC-QLC-PLC (NAND) flash and 3D NAND layering (now at 230+ layers; check out the GreyBeards FMS2022 wrap-up podcast) have taken off, that replacement seems way down the line, if ever. But MRAM has other uses.

Disk heads

An MTJ is a device with two ferromagnets (iron, magnetite, or similar magnetizable materials) separated by a thin insulator that exhibits quantum tunneling. The tunnelers in an MTJ are electrons, and the tunneling across the insulating material occurs in the presence of a magnetic field (like the bits on disk media).

Depending on the magnetic field, electron tunneling is more successful (low resistance) or less successful (high resistance), and this allows the magnetic fields (bits) on the recording material to be read.
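To make the read mechanism concrete, here’s a toy sketch of thresholding a resistance measurement to recover bit values. The resistance numbers are made up for illustration, not real head parameters:

```python
# Toy model of TMR-based bit readout (illustrative only; resistance
# values are invented, not real head parameters).
R_PARALLEL = 1_000       # ohms: magnetizations aligned -> easy tunneling
R_ANTIPARALLEL = 2_000   # ohms: magnetizations opposed -> hard tunneling

def tmr_ratio(r_p: float, r_ap: float) -> float:
    """TMR ratio = (R_ap - R_p) / R_p, the usual figure of merit."""
    return (r_ap - r_p) / r_p

def read_bit(measured_resistance: float) -> int:
    """Threshold halfway between the two states to decide the bit."""
    threshold = (R_PARALLEL + R_ANTIPARALLEL) / 2
    return 1 if measured_resistance > threshold else 0

print(tmr_ratio(R_PARALLEL, R_ANTIPARALLEL))          # 1.0, i.e. 100% TMR
print([read_bit(r) for r in (980, 2050, 1010, 1990)])  # [0, 1, 0, 1]
```

Real TMR heads are analog sensors with far more signal processing behind them, but the decision at the bottom is essentially this comparison.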

In the past, all disk drives used GMR (giant magnetoresistance) heads, which use a different technique to sense magnetic fields on recording media. Suffice it to say here that GMR heads had a limitation when miniaturized which TMR heads do not. And this (along with shrinking write heads) has allowed bit density, track density and disk capacity to skyrocket over the past few years, with nothing apparently stopping it.

So MTJ and the TMR head is one of the critical technologies keeping disks the dominant form of data storage in the market today.

MRAM

On the other hand MRAM devices have been around a long time but they are still mainly used in niche applications.

There are two types of MRAM devices based on the MTJ: Toggle MRAM, which uses current pulses to write MTJ bits, and STT-MRAM, which uses spin-transfer torque (STT) to write a bit. It turns out that electrons have a spin, or angular momentum. And most currents in use today have electrons with mixed-up spins; that is, ~50% of the electrons in a current spin one way and 50% the other.

But if one sends current through a thicker “fixed” magnetic layer, one can create a spin-polarized current. If one then sends this spin-polarized current through another, smaller “free” magnetic layer, the spin can be transferred to that layer, changing its resistance and thereby writing a bit of data. Presumably reading can be done with less current using the same MTJ. So once again it’s a magneto-resistance effect being measured, which can be used to read bit values and, in this case, to write them.
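A crude model of the write side, with an invented switching threshold and resistance numbers (purely illustrative, not a real device model): current above a critical magnitude flips the free layer, and its polarity selects the final orientation, while small read currents leave the state undisturbed.

```python
# Toy model of an STT-MRAM cell (illustrative; all numbers invented).
I_CRITICAL = 50e-6          # amps, hypothetical switching threshold
R_P, R_AP = 5_000, 10_000   # ohms, parallel vs antiparallel resistance

class SttMtjBit:
    def __init__(self):
        self.parallel = True  # free layer starts aligned with fixed layer

    def write(self, current: float) -> None:
        """Spin-polarized current above the threshold flips the free layer;
        polarity picks the orientation."""
        if abs(current) >= I_CRITICAL:
            self.parallel = current > 0

    def read(self) -> int:
        """Reads use far less current, so the state is undisturbed."""
        return 0 if self.parallel else 1

bit = SttMtjBit()
bit.write(-60e-6)   # above threshold, negative polarity -> antiparallel
print(bit.read())   # 1
bit.write(10e-6)    # below threshold: state unchanged
print(bit.read())   # 1
bit.write(+60e-6)   # flips back to parallel
print(bit.read())   # 0
```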

There’s a new, faster form of MRAM called SOT-MRAM (spin-orbit torque) which uses a different write mechanism but operates similarly for reads.

It turns out that Toggle MRAM devices don’t work well when you shrink bit dimensions, which is not a problem for STT-MRAM devices.

One slide showed that the space taken up by one bit in Toggle MRAM could instead store 2 bytes (16 bits) worth of data.

The only problem with STT-MRAM is that fabricating the bit pillars cannot be done with lithography (using photoresists, which is how most semiconductors are made today) alone, but rather requires a combination of lithography and ion beam etching (IBE), or milling.

IBE is a physical process that directs ions (charged particles) at the device to blast away the metal and other material surrounding an MRAM bit pillar. The debris from IBE creates even more problems and has to be carefully removed. And IBE doesn’t scale as easily as lithographic processes.

However, the nice thing about MRAM is that it can potentially replace both NAND flash and DRAM.

In NAND flash applications, MRAM has endurance of over 1M write cycles and great bit retention of over 10 years.

In addition, unlike DRAM and NAND, both of which hold data as stored electrical charge, MRAM holds data magnetically. As such, it’s less susceptible to the radiation damage present in space and around reactors, and MRAM has found a niche market in satellites and other extreme environments.

As for MRAM scaling, Everspin announced, in December of 2018, a 1Gb MRAM storage chip on a 28nm-node fab process, still the highest capacity MRAM chip on their website. TSMC and others have roadmaps to take MRAM technology to 14nm/12nm-node fab processes, which should increase capacity by 2X or more.

In contrast, Micron just announced a 232-layer 3D TLC NAND chip with a raw capacity of 1Tb, using what they call a 1α node process. So today Everspin’s latest chip is roughly 1000 times less dense than best-in-class 3D TLC NAND that will be shipping soon.
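A quick back-of-the-envelope check on that density gap and the node-shrink claim (per-chip capacity only; this ignores die size differences, so it’s a rough comparison):

```python
# Capacity gap between the chips mentioned above (per-chip bits only).
everspin_mram_bits = 1 * 2**30     # 1 Gb
micron_tlc_nand_bits = 1 * 2**40   # 1 Tb
print(micron_tlc_nand_bits / everspin_mram_bits)  # 1024.0, the "1000x" above

# A 28nm -> 14nm shrink ideally quadruples areal density, since area
# scales with the square of the node. MRAM cells don't scale perfectly,
# which is presumably why roadmaps claim the more conservative "2X or more".
print((28 / 14) ** 2)  # 4.0
```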

I did ask at the IEEE Magnetics session if MRAM could be scaled vertically and the answer was yes, but they didn’t provide any more details.

The other thing NAND has done is move from a single bit per cell (SLC) to three bits per cell (TLC), which has helped increase density. I suppose there’s nothing stopping MRAM from offering more than one bit per pillar, which would help as well. But I didn’t even ask that one.

~~~~

MRAM may one day overcome its IBE scaling and single-bit-per-pillar constraints to become a more viable competitor to 3D NAND flash and DRAM, but that seems a long way off.


CTERA, Cloud NAS on steroids

We attended SFD22 last week, and one of the presenters was CTERA (for more information, please see the SFD22 videos of their session), discussing their enterprise-class cloud NAS solution.

We’ve heard a lot about cloud NAS systems lately (see/listen to our GreyBeards on Storage podcast with LucidLink from last month). Cloud NAS systems provide a NAS (SMB, NFS, and S3 object storage) front end that uses cloud or on-prem object storage to hold customer data, which is accessed through (virtual or hardware) caching appliances.

These differ from file sync and share in that cloud NAS systems:

  • Don’t copy lots of or all customer data to user devices; the only data that resides locally is metadata and the user’s or site’s working set (of files)
  • Do cache working-set data locally to provide faster access
  • Do provide NFS, SMB and S3 access, along with user drive, mobile app, API and web-based access to customer data
  • Do provide multiple options to host user data in multiple clouds or on prem
  • Do allow for some levels of collaboration on the same files

Admittedly, though, the boundary lines between sync and share and cloud NAS are starting to blur.

CTERA is a software-defined solution. But they also offer a whole gaggle of hardware options for edge filers, ranging from a smartphone-sized, 1TB flash cache for the home office user to a multi-RU media edge server with 128TB of hybrid disk-SSD storage for 8K video editing.

They have HC100 edge filers, X-Series HCI edge servers, branch-in-a-box, edge and media edge filers. These latter systems have specialized support for macOS and Adobe suite systems. For their HCI edge systems, they support Nutanix, SimpliVity, HyperFlex and VxRail systems.

CTERA edge filers/servers can be clustered together to provide higher performance and HA. This way, customers can scale out their filers to supply whatever level of IO performance they need. And CTERA allows customers to segregate file workloads/directories to be serviced by specific edge filer devices, to minimize noisy-neighbor performance problems.

CTERA supports a number of ways to access cloud NAS data:

  • Through (virtual or real) edge filers, which present NFS, SMB or S3 access protocols
  • Through the use of CTERA Drive on macOS or Windows desktop/laptop devices
  • Through a mobile device app for iOS or Android
  • Through their web portal
  • Through their API

CTERA uses an HA, dual-redundant Portal service, a cloud (or on-prem) service that provides the CTERA metadata database, edge filer/server management and other services, such as web access, cloud drive end points, mobile apps, API, etc.

CTERA uses S3- or Azure-compatible object storage for its backend, the source-of-truth repository holding customer file data. CTERA currently supports 36 on-prem and in-cloud object storage services. Customers can have their data in multiple object storage repositories. Customer files are mapped one-to-one to objects.

CTERA offers global dedupe, virus scanning, policy-based scheduled snapshots and end-to-end encryption of customer data. Encryption keys can be held in the Portals or in a KMIP service connected to the Portals.

CTERA has impressive data security support. Besides the end-to-end data encryption mentioned above, they also support dark sites and zero-trust authentication, and are DISA (Defense Information Systems Agency) certified.

Customer data can also be pinned to edge filers. Moreover, specific customer (directory/sub-directory) data can be hosted on specific buckets so that data can:

  • Stay within specified geographies,
  • Support multi-cloud services to eliminate vendor lock-in

CTERA file locking is what I would call hybrid. They offer strict consistency for file locking within sites but eventual consistency for file locking across sites. There are performance tradeoffs for strict consistency, so by using a hybrid approach, they offer most of what the world needs from file locking without incurring the performance overhead of strict consistency across sites. For another way to support hybrid file-locking consistency, check out LucidLink’s approach (see the GreyBeards podcast with LucidLink above).

At the end of their session, Aron Brand got up and took us into a deep dive on select portions of their system software. One thing I noticed is that the Portal is NOT in the data path. When an edge filer wants to access a file, the Portal provides the credential verification and points the filer to the appropriate object, and the filer takes off from there.

CTERA’s customer list is very impressive. Many (50 of the worldwide F500) large enterprises are customers of theirs. Some of the more prominent include GE, McDonald’s, the US Navy, and the US Air Force.

Oh, and besides supporting potentially 1000s of sites and 100K users in the same name space, they also have intrinsic support for multi-tenancy and offer cloud data migration services. For example, one can use Portal services to migrate cloud data from one cloud object storage provider to another.

They also mentioned they are working on supplying K8S container access to CTERA’s global file system data.

There’s a lot to like in CTERA. We hadn’t heard of them before, but they seem focused on enterprises with lots of sites, boatloads of users and massive amounts of data. It seems like our kind of storage system.

Comments?

Scality’s Open Source S3 Driver

The view from Scality’s conference room

We were at Scality last week for Cloud Field Day 1 (CFD1) and one of the items they discussed was their open source S3 driver. (Videos available here).

Scality was on the 25th floor of a downtown San Francisco office tower, and the view outside the conference room was great. Giorgio Regni, CTO, Scality, said that on the two days a year it isn’t foggy out, you can even see the Golden Gate Bridge from their conference room.

Scality

As you may recall, Scality is an object storage solution that came out of the telecom/consumer networking industry to provide Google/Facebook-like storage services to other customers.

Scality RING is software-defined object storage that supports a full complement of legacy and advanced interface protocols, including NFS, CIFS/SMB, Linux FUSE, RESTful native, SWIFT, CDMI and Amazon Web Services (AWS) S3. Scality also supports replication and erasure coding, selected based on object size.

RING 6.0 brings AWS IAM style authentication to Scality object storage. Scality pricing is based on usable storage and you bring your own hardware.

Giorgio also gave a session on the RING’s durability (reliability), which showed they support 13-9s data durability. He flashed up the math on this, but it was too fast for me to take down. :)
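I didn’t catch his math, but durability figures like this are typically computed along these lines (a generic erasure-coding sketch with invented parameters, not Scality’s actual model): data split k-of-n survives unless more than n-k fragments fail before repair completes.

```python
from math import comb, log10

# Generic erasure-coding durability sketch (invented parameters,
# NOT Scality's actual model).
def loss_probability(n: int, k: int, p: float) -> float:
    """P(more than n-k of n fragments fail), each fragment failing
    independently with probability p during a repair window."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(n - k + 1, n + 1))

def nines(n: int, k: int, p: float) -> float:
    """Durability expressed as a number of nines."""
    return -log10(loss_probability(n, k, p))

# e.g. 16-of-20 coding, 0.1% chance a fragment is lost per repair window
print(round(nines(20, 16, 0.001), 1))  # ~10.8 nines
```

Stack another protection layer or repair faster (lower p) and the nines climb quickly, which is how vendors get to numbers like 13-9s.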

Scality has been on the market since 2010 and has been having a lot of success lately, having grown 150% in revenue this past year. In the media and entertainment space, Scality has won a lot of business with their S3 support. But their other interface protocols are also very popular.

Why S3?

It looks as if AWS S3 is becoming the de facto standard for object storage. AWS S3 is the largest current repository of objects. As such, other vendors and solution providers now offer support for S3 services whenever they need an object/bulk storage tier behind their appliances/applications/solutions.

This has driven every object storage vendor to also offer S3 “compatible” services to entice these users to move to their object storage solution. In essence, the object storage industry, like it or not, is standardizing on S3 because everyone is using it.

But how can you tell if a vendor’s S3 solution is any good? You could always try it out to see if it works properly with your S3 application, but that involves a lot of heavy lifting.

However, there is another way. Take an S3 driver and run your application against that. Assuming your vendor supports all the functionality exercised through the S3 driver, your application should then work with the real object storage solution.

Open source S3 driver

Scality open-sourced their S3 driver just to make this process easier. Now one can just download their S3server driver (available from Scality’s GitHub) and start it up.

Scality’s S3 driver runs on top of a Docker engine, so to run it on your desktop you would need to install Docker Toolbox (for older Mac or Windows systems) or Docker for Mac/Docker for Windows (for newer systems). (We also talked with Docker at CFD1.)

Firing up the S3server on my Mac

I used Docker for Mac, but I assume the terminal CLI is the same for both. Downloading and installing Docker for Mac was pretty straightforward. Starting it up took just a double-click on the Docker application, which generates a toolbar Docker icon. You do need to enter your login password to run Docker for Mac, but once that’s done, you have Docker running on your Mac.

Open up a terminal window and you have the full Docker CLI at your disposal. You can download the latest S3server from Scality’s Docker Hub by executing a pull command (docker pull scality/s3server); to fire it up, you define and start a new container (docker run -d --name s3server -p 8000:8000 scality/s3server); and if it’s been stopped, you can restart it (docker start s3server).

It’s that simple to have an S3server running on your Mac. The Toolbox approach for older Macs and PCs is a bit more complicated but seems simple enough.

The data is stored in the container and persists until you stop/delete the container. However, there’s an option to store the data elsewhere as well.

I tried to use Cyberduck to load some objects into my Mac’s S3server but couldn’t get it to connect properly. I wrote up a ticket to the S3server community. It seemed to be talking to the right port, but maybe I needed to use s3cmd to initialize the bucket first – I think.

[Update 2016Sep19: Turns out the S3server getting-started doc says you should download an S3 profile for Cyberduck. I didn’t do that originally because I had already been using S3 with Cyberduck. But I did that just now, and it works just like it’s supposed to. My mistake.]

~~~~

Anyways, it all seemed pretty straightforward to run S3server on my Mac. If I were an application developer, it would make a lot of sense to try S3 this way before doing anything on the real AWS S3. And some day, when I grew tired of paying AWS, I could always migrate to Scality RING S3 object storage – or at least that’s the idea.

Comments?

BlockStack, a Bitcoin secured global name space for distributed storage

At the USENIX ATC conference a couple of weeks ago, there was a presentation by a number of researchers on BlockStack, their global name space and storage system built on the blockchain-based Bitcoin network. Their paper was titled “Blockstack: A global naming and storage system secured by blockchain” (see pp. 181-194 in the USENIX ATC’16 proceedings).

Bitcoin blockchain simplified

Blockchains like Bitcoin have a number of interesting properties, including a completely distributed understanding of current state, based on hashing and an always-appended-to log of transactions.

Blockchain nodes all participate in validating the current block of transactions and some nodes (deemed “miners” in Bitcoin) supply new blocks of transactions for validation.

All blockchain transactions are sent to each node, and the blockchain software in each node timestamps the transactions and accumulates them in an ordered append log (the “block“), which is then hashed; each new block contains a hash of the previous block (the “chain” in blockchain).

The miner’s block is then compared against the non-miner nodes’ blocks (hashes are compared), and if equal, everyone reaches consensus (agrees) that the transaction block is valid. Then the next miner supplies a new block of transactions, and the process repeats. (See Wikipedia’s article for more info.)
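The append-and-hash structure described above can be sketched in a few lines of Python (illustrative only: no proof-of-work, networking or consensus), showing why tampering with any past transaction breaks the chain:

```python
import hashlib
import json
import time

# Minimal hash-chained block structure (no mining, no consensus).
def make_block(transactions, prev_hash):
    block = {
        "timestamp": time.time(),
        "transactions": transactions,
        "prev_hash": prev_hash,  # the "chain" in blockchain
    }
    payload = json.dumps({k: block[k] for k in ("timestamp", "transactions", "prev_hash")},
                         sort_keys=True).encode()
    block["hash"] = hashlib.sha256(payload).hexdigest()
    return block

def chain_is_valid(chain):
    """Recompute every hash and check each block points at its parent."""
    for i, block in enumerate(chain):
        payload = json.dumps({k: block[k] for k in ("timestamp", "transactions", "prev_hash")},
                             sort_keys=True).encode()
        if block["hash"] != hashlib.sha256(payload).hexdigest():
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True

genesis = make_block(["coinbase -> alice"], prev_hash="0" * 64)
chain = [genesis, make_block(["alice -> bob"], genesis["hash"])]
print(chain_is_valid(chain))  # True
chain[0]["transactions"].append("alice -> mallory")  # tamper with history
print(chain_is_valid(chain))  # False: the stored hash no longer matches
```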

All blockchain transactions are owned by a cryptographic address. Each cryptographic address has a public and private key associated with it.
Continue reading “BlockStack, a Bitcoin secured global name space for distributed storage”

Testing filesystems for CPU core scalability

I attended the HotStorage’16 and USENIX ATC’16 conferences this past week, and there was a paper presented at ATC titled “Understanding Manycore Scalability of File Systems” (see p. 71 in the PDF) by Changwoo Min and others at Georgia Institute of Technology. This team of researchers set out to understand the bottlenecks in typical file systems as they scale from 1 to 80 (or more) CPU cores on the same server.

FxMark, a new scalability benchmark

They created a new benchmark to probe CPU core scalability, called FxMark (source code available at FxMark), consisting of 19 microbenchmarks stressing specific scalability scenarios and three application-level benchmarks representing popular file system activities.

The application benchmarks in FxMark include a standard mail server (Exim), a NoSQL DB (RocksDB) and a standard user file server (DBENCH).

In the microbenchmarks, they stressed 7 different components of file systems: 1) path name resolution; 2) page cache for buffered IO; 3) inode management; 4) disk block management; 5) file offset to disk block mapping; 6) directory management; and 7) consistency guarantee mechanisms.
Continue reading “Testing filesystems for CPU core scalability”