Tape still alive, well and growing at Spectra Logic

T-Finity library at SpectraLogic's test facility (c) 2011 Silverton Consulting, All Rights Reserved

Today I met with Spectra Logic execs and some of their Media and Entertainment (M&E) customers, and toured their manufacturing, test labs and briefing center.  The tour was a blast and the customers Kyle Knack from National Geographic (Nat Geo) Global Media, Toni Perez from Medcom (Panama based entertainment company) and Lee Coleman from Entertainment Tonight (ET) all talked about their use of the T-950 Spectra Logic tape libraries in the media ingest, editing and production processes.

Mr. Coleman from ET spoke almost reverently about their T-950 and how it has enabled ET to access over 30 years of video interviews, movie segments and other media they can now use to put together clips on just about any entertainment subject imaginable.

He  talked specifically about the obit they did for Michael Jackson and how they were able to grab footage from an interview they did years ago and splice it together with more recent media to show a more complete story.  He also showed a piece on some early Eddie Murphy film footage and interviews they had done at the time which they used in a recent segment about his new movie.

All this was made possible by moving to digital file formats and placing digital media in their T-950 tape libraries.

Spectra Logic T-950 (I think) with TeraPack loaded in robot (c) 2011 Silverton Consulting, All Rights Reserved

Mr. Knack from Nat Geo Media said every bit of media they get now automatically goes into the library archive and becomes the “original copy” of the media, used in case other copies are corrupted or lost.  Nat Geo started out putting only important media in the library but found it cost so much less to store media in the tape archive that they decided it made more sense to move all media to the tape library.

Typically they keep two copies in their tape library and important media is also copied to tape and shipped offsite (3 copies for this data).  They have a 4-frame T-950 with around 4000 slots and 14 drives (combination of LTO-4 and -5).  They use FC and FCoE storage for their primary storage and depend on 1000s of SATA drives for primary storage access.

He said they only use SSDs for some metadata support for their web site. He found that SATA drives can handle their big block sequential workload and provide consistent throughput and, especially important to M&E companies, consistent latency.

3D printer at Spectra Logic (for mechanical parts fabrication) (c) 2011 Silverton Consulting, All Rights Reserved

Mr. Perez from MedCom had much the same story. They were in the process of moving off a proprietary video tape format (Sony Betacam) to LTO media and digital files. The process is still ongoing although they are more than halfway there for current production.

They still have a lot of old media in Betacam format which will take them years to convert to digital files but they are at least starting this activity.  He said a recent move from one site to another revealed that many of the Betacam tapes were no longer readable.  Digital files on LTO tape should solve that problem for them when they finally get there.

Matt Starr, Spectra Logic CTO, talked about the history of tape libraries at Spectra Logic, which was founded in 1979 and has been laser focused on tape data protection and tape libraries.

I find it pleasantly surprising that a company today can just supply tape libraries with software and make an ongoing concern of it. Spectra Logic must be doing something right: revenue grew 30% YoY last year, and they are outgrowing the (88K sq ft) office, lab, and manufacturing building they moved into earlier this year and have just signed to occupy another building providing 55K sq ft more space.

T-Series robot returning TeraPack to shelf (c) 2011 Silverton Consulting, All Rights Reserved

Molly Rector, Spectra Logic CMO, talked about the shift in the market from peta-scale (10**15 bytes) storage repositories to exa-scale (10**18 bytes) ones.  Ms. Rector believed that today’s cloud storage environments can take advantage of these large tape-based archives to provide much more economical storage for their users without suffering any performance penalty.

At lunch with Matt Starr, Fred Moore (Horison Information Strategies), Mark Peters (Enterprise Strategy Group) and I were talking about HPSS (High Performance Storage System), developed in conjunction with IBM and 5 US national labs, which supports vast amounts of data residing across primary disk and tape libraries.

Matt said that there are about a dozen large HPSS sites (HPSS website shows at least 30 sites using it) that store a significant portion of the world’s 1ZB (10**21 bytes) of digital data created this past year (see my 3.3 exabytes of data a day!? post).  Later that day, talking with Nathan Thompson, Spectra Logic CEO, he said these large HPSS sites probably store ~10% of the world’s data, or 100EB.  I find it difficult to comprehend that much data at only ~12 sites, but the national labs do have lots of data on hand.
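As a rough sanity check on those figures, here is a minimal Python sketch. The variable names are mine, and the 12-site count is just my reading of "about a dozen", so treat the per-site average as illustrative only.

```python
# Back-of-envelope check of the HPSS figures quoted above.
# Assumptions: 1 ZB of data created in the year, ~10% of it held at
# large HPSS sites, and ~12 such sites ("about a dozen").
ZB = 10**21
EB = 10**18

world_data_bytes = 1 * ZB
hpss_share_percent = 10
large_sites = 12

hpss_bytes = world_data_bytes * hpss_share_percent // 100
per_site_eb = hpss_bytes / large_sites / EB

print(f"HPSS total: {hpss_bytes // EB} EB")
print(f"Average per site: {per_site_eb:.1f} EB")
```

That works out to an average of over 8EB per site, which is why the number is so hard to comprehend.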

Nowadays you can get a Spectra Logic T-Finity tape complex with 122K slots, using LTO-4/-5 or IBM TS1140 (enterprise class) tape drives.  This large a T-Finity has 4 rows of tape libraries which use the ‘Skyway’ to transport a TeraPack of tape cartridges from one library row to another.   All Spectra Logic libraries are built around a tape cartridge package they call the TeraPack which contains 10 LTO cartridges or (I think) 9 TS1140 tape cartridges (they are bigger than LTO tapes).  The TeraPack is used to import or export tapes from the library and for all the tape slots in the library.

The software used to control all this is called BlueScale and is used in their T50e, a small, 50 slot library all the way up to the 122K T-Finity tape complex.  There are some changes for configuration, robotics and other personalization for each library type but the UI looks exactly the same across any of their libraries. Moreover, BlueScale offers the same enterprise level of functionality (e.g., drive and media life management) services for all Spectra Logic tape libraries.

Day 1 for SpectraPRDay closed with the lab tour and dinner.  Day 2 will start discussing futures and will be under NDA so there won’t be much to talk about right away. But from what I can see, Spectra Logic seems to be breaking down the barriers inhibiting tape use and providing tape library systems that people almost revere.

I haven’t seen that sort of reaction about a tape library since the STK 4400 first came out last century.

—-

Comments?

Disk drive density multiplying by 6X

Sodium Chloride by amandabhslater (cc) (From Flickr)

In a news story out of Singapore Institute of Materials Research and Engineering (IMRE), Dr. Joel Yang has demonstrated 6X the current density on disk platter media, or up to 3.3 Terabits /square inch (Tb/sqin). And it all happens due to salt (sodium chloride) crystals.

I have previously discussed some of the problems encountered by the disk industry going to the next technology transition trying to continue current density trends.  At the time, the then best solution was to use bit-patterned media (BPM) and shingled writes discussed in my Sequential Only Disk!? and Disk trends, revisited posts.  However, this may have been premature.

Just add salt

It turns out that by adding salt to the lithographic process used to disperse magnetic particles onto disk platters for BPM, the particles are more regularly spaced. In contrast, today’s process used in current disk media manufacturing causes the particles to be randomly spaced.

More regular magnetic particle spacing on media provides two immediate benefits for disk density:

  • More particles can be packed in the same area. With increased magnetic particles located in a square inch of media, more data can be recorded.
  • Bigger particles can be used for recording data. With larger grains, data can be recorded using a single structure rather than using multiple, smaller particles, increasing density yet again.

Combining these two attributes increases disk platter capacities by a factor of 6 without having to alter read-write head technology.  The IMRE team demonstrated 1.9Tb/sqin recording capacity but fabricated media with particles at levels that could provide 3.3Tb/sqin.  Currently, the disk industry is demonstrating 0.5Tb/sqin.

Other changes needed

I suppose other changes will also be needed to accommodate the increased capacity, not the least of which is speeding up the read-write channels to support 6X more bits being accessed per revolution.  Probably other items need to be changed as well,  but these all come with increased disk density.

Before this technique came along, reaching the next density levels was turning out to be a significant issue. But now that salt is in use, we can all rest easy knowing that disk capacity trends can continue to increase with today’s recording head technology.

Using the recent 4TB 7200RPM hard drives (see my Disk capacity growing out-of-sight post), but moving to salt and BPM, the industry could potentially create a 24TB 7200RPM drive or, for the high performance 600GB 15KRPM drives, 3.6TB high performance disks!  Gosh, not too long ago 24TB of storage was a good size storage system for SMB shops; with this technology, it’s just a single disk drive.
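The capacity projections above are simple scaling. Here is a small Python sketch, assuming drive capacity scales linearly with areal density (same platter count and size); the function and dictionary names are made up for illustration.

```python
# Scale today's drive capacities by the claimed areal-density gain.
# Assumption: capacity scales linearly with areal density.
DENSITY_GAIN = 6

drives_tb = {
    "7200RPM capacity drive": 4.0,
    "15KRPM performance drive": 0.6,
}

def scaled_capacity_tb(capacity_tb, gain=DENSITY_GAIN):
    """Return the projected capacity in TB after the density gain."""
    return capacity_tb * gain

for name, tb in drives_tb.items():
    print(f"{name}: {tb} TB -> {scaled_capacity_tb(tb):.1f} TB")

# Ratios of the IMRE densities to the ~0.5Tb/sqin the industry shows today
fabricated_gain = 3.3 / 0.5
demonstrated_gain = 1.9 / 0.5
print(f"fabricated: {fabricated_gain:.1f}X, demonstrated: {demonstrated_gain:.1f}X")
```

Note that the 6X figure tracks the fabricated 3.3Tb/sqin media; the 1.9Tb/sqin actually demonstrated is closer to a 4X gain.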

—-

Comments?

IBM’s 120PB storage system

Susitna Glacier, Alaska by NASA Goddard Photo and Video (cc) (from Flickr)

Talk about big data, Technology Review reported this week that IBM is building a 120PB storage system for some unnamed customer.  Details are sketchy and I cannot seem to find any announcement of this on IBM.com.

Hardware

It appears that the system uses 200K disk drives to support the 120PB of storage.  The disk drives are packed in a new wider rack and are water cooled.  According to the news report the new wider drive trays hold more drives than current drive trays available on the market.

For instance, HP has a hot pluggable, 100 SFF (small form factor 2.5″) disk enclosure that sits in 3U of standard rack space.  200K SFF disks would take up about 154 full racks, not counting the interconnect switching that would be required.  Unclear whether water cooling would increase the density much but I suppose a wider tray with special cooling might get you more drives per floor tile.
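The rack estimate can be reproduced with a few lines of Python. This is only a sketch: the 40U usable rack height and the whole-enclosures-only packing are my assumptions, and the function name is made up for illustration.

```python
import math

TOTAL_DRIVES = 200_000
TOTAL_BYTES = 120 * 10**15      # 120PB
DRIVES_PER_ENCLOSURE = 100      # HP SFF enclosure cited above
ENCLOSURE_HEIGHT_U = 3
RACK_HEIGHT_U = 40              # assumed usable rack height

def racks_needed(total_drives, drives_per_enclosure, enclosure_u, rack_u):
    """Racks required when only whole enclosures fit in a rack."""
    enclosures_per_rack = rack_u // enclosure_u
    drives_per_rack = enclosures_per_rack * drives_per_enclosure
    return math.ceil(total_drives / drives_per_rack)

racks = racks_needed(TOTAL_DRIVES, DRIVES_PER_ENCLOSURE,
                     ENCLOSURE_HEIGHT_U, RACK_HEIGHT_U)
gb_per_drive = TOTAL_BYTES // TOTAL_DRIVES // 10**9

print(f"Racks for drives alone: {racks}")
print(f"Implied capacity per drive: {gb_per_drive} GB")
```

The implied 600GB per drive is consistent with SFF enterprise drives, though the report did not say which drives are used.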

There was no mention of interconnect, but today’s drives use either SAS or SATA.  SAS interconnects for 200K drives would require many separate SAS busses. With a SAS expander addressing 255 drives or other expanders, one would need at least 4 SAS busses but this would have up to ~64K drives per bus and would not perform well.  Something more like 64-128 drives per bus would have much better performance. Each drive would also need dual pathing, so if we use 100 drives per SAS string, that’s 2000 SAS drive strings or at least 4000 SAS busses (dual port access to the drives).
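Here is the SAS bus arithmetic as a short Python sketch. I am reading the ~64K drives per bus as two levels of 255-way expanders (255 x 255); that reading, along with the function name, is my assumption and not something from the report.

```python
import math

TOTAL_DRIVES = 200_000

def buses_needed(total_drives, drives_per_bus, paths_per_drive=1):
    """Number of SAS busses, counting one bus per path per drive string."""
    return math.ceil(total_drives / drives_per_bus) * paths_per_drive

# Assumed: two levels of 255-way expanders give 255 * 255 drives per bus
max_fanout = 255 * 255

min_buses = buses_needed(TOTAL_DRIVES, max_fanout)
strings = buses_needed(TOTAL_DRIVES, 100)
dual_path_buses = buses_needed(TOTAL_DRIVES, 100, paths_per_drive=2)

print(f"Max drives per bus: {max_fanout}")
print(f"Minimum busses at max fan-out: {min_buses}")
print(f"Strings at 100 drives each: {strings}")
print(f"Busses with dual pathing: {dual_path_buses}")
```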

The report mentioned GPFS as the underlying software which supports three cluster types today:

  • Shared storage cluster – where GPFS front end nodes access shared storage across the backend. This is generally SAN storage system(s).  But given the requirements for high density, it doesn’t seem likely that the 120PB storage system uses SAN storage in the backend.
  • Network based cluster – here the GPFS front end nodes talk over a LAN to a cluster of NSD (network shared disk) servers which can have access to all or some of the storage. My guess is this is what will be used in the 120PB storage system.
  • Shared Network based clusters – this looks just like a bunch of NSD servers but provides access across multiple NSD clusters.

Given the above, ~100 drives per NSD server means an extra 1U per 100 drives or (given HP drive density) 4U per 100 drives, which allows 1000 drives and 10 IO servers per 40U rack (not counting switching).  At this density it takes ~200 racks for 120PB of raw storage plus NSD nodes, or 2000 NSD nodes in all.

Unclear how many GPFS front end nodes would be needed on top of this but even if it were 1 GPFS frontend node for every 5 NSD nodes, we are talking another 400 GPFS frontend nodes and at 1U per server, another 10 racks or so (not counting switching).

If my calculations are correct we are talking over 210 racks with switching thrown in to support the storage.  According to IBM’s discussion on the Storage challenges for petascale systems, it probably provides ~6TB/sec of data transfer which should be easy with 200K disks but may require even more SAS busses (maybe ~10K vs. the 2K discussed above).
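To make the node and rack counts easier to check, here is a minimal Python sketch of the arithmetic. Every ratio in it (100 drives per NSD server, 5 NSD nodes per GPFS front end node, 40U racks) is one of my guesses from the text above, not anything IBM has confirmed.

```python
import math

TOTAL_DRIVES = 200_000
DRIVES_PER_NSD = 100                 # assumed drives behind each NSD server
U_PER_NSD_GROUP = 4                  # 3U drive enclosure + 1U NSD server
RACK_U = 40                          # assumed usable rack height
NSD_PER_FRONTEND = 5                 # assumed NSD nodes per GPFS front end node
TARGET_BYTES_PER_SEC = 6 * 10**12    # ~6TB/sec

nsd_nodes = TOTAL_DRIVES // DRIVES_PER_NSD
storage_racks = math.ceil(nsd_nodes / (RACK_U // U_PER_NSD_GROUP))
frontend_nodes = nsd_nodes // NSD_PER_FRONTEND
frontend_racks = math.ceil(frontend_nodes / RACK_U)   # 1U per front end server
total_racks = storage_racks + frontend_racks
mb_per_sec_per_drive = TARGET_BYTES_PER_SEC / TOTAL_DRIVES / 10**6

print(f"NSD nodes: {nsd_nodes}, storage racks: {storage_racks}")
print(f"Front end nodes: {frontend_nodes}, front end racks: {frontend_racks}")
print(f"Total racks before switching: {total_racks}")
print(f"Throughput needed per drive: {mb_per_sec_per_drive:.0f} MB/sec")
```

At roughly 30MB/sec per drive the disks themselves are not the bottleneck, which is why the interconnect count matters more.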

Software

IBM GPFS is used behind the scenes in IBM’s commercial SONAS storage system but has been around as a cluster file system designed for HPC environments for over 15 years now.

Given this many disk drives something needs to be done about protecting against drive failure.  IBM has been talking about declustered RAID algorithms for their next generation HPC storage system which spreads the parity across more disks and as such, speeds up rebuild time at the cost of reducing effective capacity. There was no mention of effective capacity in the report but this would be a reasonable tradeoff.  A 200K drive storage system should have a drive failure every 10 hours, on average (assuming a 2 million hour MTBF).  Let’s hope they get drive rebuild time down much below that.
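The failure interval follows directly from the MTBF. A minimal Python sketch, assuming independent drive failures at a constant rate (the 2 million hour MTBF is the assumption stated above):

```python
# Expected time between drive failures across the whole system.
# Assumption: independent failures at a constant rate of 1/MTBF per drive.
MTBF_HOURS = 2_000_000
DRIVES = 200_000
HOURS_PER_YEAR = 24 * 365

hours_between_failures = MTBF_HOURS / DRIVES
failures_per_year = HOURS_PER_YEAR / hours_between_failures

print(f"One drive failure every {hours_between_failures:.0f} hours")
print(f"About {failures_per_year:.0f} drive failures per year")
```

So a rebuild that takes longer than about 10 hours means the system would, on average, always have at least one rebuild in flight.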

The system is expected to hold around a trillion files.  Not sure but even at 1024 bytes of metadata per file, this number of files would chew up ~1PB of metadata storage space.
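The metadata estimate is easy to verify. A quick Python sketch, where the 1024 bytes per file is my guess from above rather than a GPFS figure:

```python
# Metadata footprint for a trillion files.
# Assumption: 1024 bytes of metadata per file (a guess, not a GPFS number).
FILES = 10**12
METADATA_BYTES_PER_FILE = 1024
RAW_CAPACITY_BYTES = 120 * 10**15   # 120PB

metadata_bytes = FILES * METADATA_BYTES_PER_FILE
metadata_pb = metadata_bytes / 10**15
share_of_raw = metadata_bytes / RAW_CAPACITY_BYTES

print(f"Metadata: {metadata_pb:.3f} PB")
print(f"Share of raw capacity: {share_of_raw:.2%}")
```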

GPFS provides ILM (information life cycle management, or data placement based on information attributes) using automated policies and supports external storage pools outside the GPFS cluster storage.  ILM within the GPFS cluster supports file placement across different tiers of storage.

All the discussion up to now revolved around homogeneous backend storage but it’s quite possible that multiple storage tiers could also be used.  For example, a high density but slower storage tier could be combined with a low density but faster storage tier to provide a more cost effective storage system.  Although, it’s unclear whether the application (real world modeling) could readily utilize this sort of storage architecture nor whether they would care about system cost.

Nonetheless, presumably an external storage pool would be a useful adjunct to any 120PB storage system for HPC applications.

Can it be done?

Let’s see, 400 GPFS nodes, 2000 NSD nodes, and 200K drives. Seems like the hardware would be readily doable (not sure why they needed watercooling but hopefully they obtained better drive density that way).

Luckily GPFS supports Infiniband which can support 10,000 nodes within a single subnet.  Thus an Infiniband interconnect between the GPFS and NSD nodes could easily support a 2400 node cluster.

The only real question is can a GPFS software system handle 2000 NSD nodes and 400 GPFS nodes with trillions of files over 120PB of raw storage.

As a comparison here are some recent examples of scale out NAS systems:

It would seem that a 20X multiplier times a current Isilon cluster or even a 10X multiple of a currently supported SONAS system would take some software effort to work together, but seems entirely within reason.

On the other hand, Yahoo supports a 4000-node Hadoop cluster and it seems to work just fine.  So from a feasibility perspective, a 2400 node GPFS-NSD node system seems just a walk in the park for Hadoop.

Of course, IBM Almaden is working on a project to support Hadoop over GPFS which might not be optimum for real world modeling but would nonetheless support the node count being talked about here.

——

I wish there was some real technical information on the project out on the web but I could not find any. Much of this is informed conjecture based on current GPFS system and storage hardware capabilities. But hopefully, I haven’t traveled too far astray.

Comments?

 

When will disks become extinct?

A head assembly on a Seagate disk drive by Robert Scoble (cc) (from flickr)

Yesterday, it was announced that Hitachi Global Storage Technologies (HGST) is being sold to Western Digital for $4.3B and after that there was much discussion in the tweeterverse about the end of enterprise disk as we know it.  Also, last week I was at a dinner at an analyst meeting with Hitachi, where the conversation turned to when disks will no longer be available. This discussion was between Mr. Takashi Oeda of Hitachi RSD, Mr. John Webster of Evaluator Group and myself.

Why SSDs will replace disks

John was of the opinion that disks would stop being economically viable in about 5 years’ time and would no longer be shipping in volume, mainly due to energy costs.  Oeda-san said that Hitachi had predicted that NAND pricing on a $/GB basis would cross over (become less expensive than) 15Krpm disk pricing sometime around 2013.  Later he said that NAND pricing had not come down as fast as projected and that it was going to take longer than anticipated.  Note that Oeda-san mentioned density price cross over for only 15Krpm disk, not 7200rpm disk.  In all honesty, he said SATA disk would take longer, but he did not predict when.

I think both arguments are flawed:

  • Energy costs for disk drives drop on a Watts/GB basis every time disk density increases. So the energy it takes to run a 600GB drive today will likely be able to run a 1.2TB drive tomorrow.  I don’t think energy costs are going to be the main factor to drive disks out of the enterprise.
  • Density costs for NAND storage are certainly declining but cost/GB is not the only factor in technology adoption. Disk storage has cost more than tape capacity since the ’50s, yet they continue to coexist in the enterprise. I contend that disks will remain viable for at least the next 15-20 years over SSDs, primarily because disks have unique functional advantages which are vital to enterprise storage.

Most analysts would say I am wrong, but I disagree. I believe disks will continue to play an important role in the storage hierarchy of future enterprise data centers.

NAND/SSD flaws from an enterprise storage perspective

All costs aside, NAND based SSDs have serious disadvantages when it comes to:

  • Data retention – the problem with NAND data cells is that they can only be written so many times before they fail.  And as NAND cells become smaller, this rate seems to be going the wrong way, i.e., today’s NAND technology can support 100K writes before failure but tomorrow’s NAND technology may only support 15K writes before failure.  This is not a beneficial trend if one is going to depend on NAND technology for the storage of tomorrow.
  • Sequential access – although NAND SSDs perform much better than disk when it comes to random reads and, less so, random writes, the performance advantage for sequential access is not that dramatic.  NAND sequential access can be sped up by deploying multiple parallel channels but it starts looking like internal forms of wide striping across multiple disk drives.
  • Unbalanced performance – with NAND technology, reads operate quicker than writes. Sometimes 10X faster.  Such unbalanced performance can make dealing with this technology more difficult and less advantageous than disk drives of today with much more balanced performance.

None of these problems will halt SSD use in the enterprise. They can all be dealt with through more complexity in the SSD or in the storage controller managing the SSDs, e.g., wear leveling to try to prolong data retention, multi-data channels for sequential access, etc. But all this additional complexity increases SSD cost, and time to market.

SSD vendors would respond that yes, it’s more complex, but such complexity is a one time charge, mostly a one time delay, and once done, incremental costs are minimal. And when you come down to it, today’s disk drives are not that simple either, with defect skipping, fault handling, etc.

So why won’t disk drives go away soon?  I think the other major concern in NAND/SSD ascendancy is the fact that the bulk NAND market is moving away from SLC (single level cell or bit/cell) NAND to MLC (multi-level cell) NAND due to its cost advantage.  When SLC NAND is no longer the main technology being manufactured, its price will not drop as fast and its availability will become more limited.

Some vendors also counter this trend by incorporating MLC technology into enterprise SSDs. However, all the problems discussed earlier become an order of magnitude more severe with MLC NAND. For example, rather than 100K write operations to failure with SLC NAND today, it’s more like 10K write operations to failure on current MLC NAND.  The fact that you get 2 to 3 times more storage per cell with MLC doesn’t help that much when one gets 10X fewer writes per cell. And the next generation of MLC is 10X worse, maybe getting on the order of 1000 writes/cell prior to failure.  Similar issues occur for write performance: MLC writes are much slower than SLC writes.
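One way to see why the extra bits per cell don't make up for the lost endurance is to multiply the two together. A minimal Python sketch, assuming 2 bits per cell for MLC and the round write-cycle numbers quoted above (the names are made up for illustration):

```python
# Lifetime bits written per cell = bits stored per cell * write cycles.
# Assumptions: MLC stores 2 bits per cell; cycle counts are the round
# figures from the text, not vendor specifications.
cell_types = {
    "SLC": {"bits_per_cell": 1, "write_cycles": 100_000},
    "MLC (current)": {"bits_per_cell": 2, "write_cycles": 10_000},
    "MLC (next gen)": {"bits_per_cell": 2, "write_cycles": 1_000},
}

def lifetime_bits_per_cell(bits_per_cell, write_cycles):
    """Total bits a single cell can absorb before wearing out."""
    return bits_per_cell * write_cycles

for name, spec in cell_types.items():
    total = lifetime_bits_per_cell(**spec)
    print(f"{name}: {total:,} bits written per cell over its life")
```

Under these assumptions an SLC cell absorbs 5X more data over its life than a current MLC cell, and 50X more than the next generation.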

So yes, raw NAND may become cheaper than 15Krpm Disks on a $/GB basis someday but the complexity to deal with such technology is also going up at an alarming rate.

Why disks will persist

Now something similar can be said for disk density, what with the transition to thermally assisted recording heads/media and the rise of bit-patterned media.  All of which are making disk drives more complex with each generation that comes out.  So what allows disks to persist long after $/GB is cheaper for NAND than disk:

  • Current infrastructure supports disk technology well in enterprise storage. Disks have been around so long, that storage controllers and server applications have all been designed around them.  This legacy provides an advantage that will be difficult and time consuming to overcome. All this will delay NAND/SSD adoption in the enterprise for some time, at least until this infrastructural bias towards disk is neutralized.
  • Disk technology is not standing still.  It’s essentially a race to see who will win the next generation’s storage.  There is enough of an eco-system around disk that will keep pushing media, heads and mechanisms ever forward into higher densities, better throughput, and more economical storage.

However, any infrastructural advantage can be overcome in time.  What will make this go away even quicker is the existence of a significant advantage over current disk technology in one or more dimensions. Cheaper and faster storage can make this a reality.

Moreover, as for the ecosystem discussion, arguably the NAND ecosystem is even larger than disk.  I don’t have the figures but if one includes SSD drive producers as well as NAND semiconductor manufacturers the amount of capital investment in R&D is at least the size of disk technology if not orders of magnitude larger.

Disks will go extinct someday

So will disks become extinct? Yes, someday undoubtedly, but when is harder to nail down. Earlier in my career there was talk of a super-paramagnetic effect that would limit how much data could be stored on a disk. Advances in heads and media moved that limit out of the way. However, there will come a time when it becomes impossible (or more likely too expensive) to increase magnetic recording density.

I was at a meeting a few years back where a magnetic head researcher predicted that such an end point to disk density increase would come in 25 years time for disk and 30 years for tape.  When this occurs disk density increase will stand still and then it’s a certainty that some other technology will take over.  Because as we all know data storage requirements will never stop increasing.

I think the other major unknown is other, non-NAND semiconductor storage technologies still under research.  They have the potential for  unlimited data retention, balanced performance and sequential performance orders of magnitude faster than disk and can become a much more functional equivalent of disk storage.  Such technologies are not commercially available today in sufficient densities and cost to even threaten NAND let alone disk devices.

—-

So when do disks go extinct?  I would say in 15 to 20 years’ time we may see the last disks in enterprise storage.  That would give disks almost an 80 year dominance over storage technology.

But in any event I don’t see disks going away anytime soon in enterprise storage.

Comments?

Hitachi-HDS Strategy Sessions in Japan

Went to Tokyo at HDS’s expense this week and had a great time talking with Hitachi and HDS about their current and future product portfolio.

Gaggle of Analysts (c) 2011 Silverton Consulting, All Rights Reserved

Arrived Monday evening and went out for an informal dinner.  In the photo foreground one can see  John Webster (Evaluator Group), Andrew Reichman (Forrester Research), Richard Jew (HDS), and the back of Sean Moser (HDS).   In the background, Claus Egge, (Claus Egge), Laura DuBois and Nick Sundby (both from IDC).  We had a great Japanese dinner, close to the hotel.

Monday started off with a ride on Japan’s Bullet Train, the Shinkansen.  Some debate as to whether it went 300 or 500 Km/hour but it seemed fast enough and got us from our hotel near Tokyo’s Shinagawa station to Odawara in under an hour.  The train was quiet, comfortable and quick.

Shinkansen coming to a stop (c) 2011 Silverton Consulting, All Rights Reserved

In the picture at the station one can see Christophe Bertrand (HDS) and Tony Lock (Freeform Dynamics) as well as Laura and Nick (IDC) in the background.  Another picture of the Shinkansen from the return trip.

Shinkansen (c) 2011 Silverton Consulting, All Rights Reserved

Spent the day at Odawara Hitachi’s Customer Demo center listening to Hitachi RSD (hardware division) and HDS talk about the upcoming storage product plans.

Hitachi, the parent company of HDS, is into a lot of technology areas.  We talked briefly about some of these which included hard disk drives, server technology, and telecommunications equipment.  In addition to these info tech arenas, they provide MRIs, construction equipment, energy generation and other equipment.  The bullet trains we were using to get back and forth to the hotel were also manufactured by Hitachi.  It turns out information technology and hard disk drive technology represent just under 30% of Hitachi’s global revenue.

Platter data density history (c) 2011 Silverton Consulting, All Rights Reserved

They had an interesting historical study of storage technology there. Took some photos.  They also had on display just about every Hitachi-HDS storage product currently sold as well as some of their prior storage systems.

They held discussions on upcoming products and other capabilities, most of which were under NDA.  I and the other analysts had a chance to critique some of their plans with more feedback to come.

The second day was also spent in Odawara but at another facility, with the software division (responsible for HiCommand and other Hitachi software) discussing the next generation of HiCommand and other software functionality.  After lunch we got a chance to tour the factory premises.

Hitachi manufactures all the PCBs (printed circuit boards) for all HDS storage products in Japan. I believe they said they were manufacturing 7500 PCBs a day across the PCB lines they operate here.  We walked through one PCB line as well as the final assembly and test area for VSP and AMS products. I was pretty impressed with what I saw.  We weren’t able to take many pictures here but I was allowed a few.

Hitachi's PCB line (c) 2011 Silverton Consulting, All Rights Reserved

I have seen a lot of assembly areas and test areas and this seemed to be far and away more sophisticated than most.  I have also seen my fair share of ESS (environmental stress screening) chambers but these looked more like waiting rooms than ESS chambers. Everyone looked busy but not too harried. I suppose had it been end of year rather than middle of quarter it might have looked different.

On the metro in Tokyo (c) 2011 Silverton Consulting, All Rights Reserved

The last day was spent at Hitachi Central Research Lab (HCRL) in Tokyo. We had to take two Tokyo metro trains to get there. I thought we would have the metro train pushers to “compress” us into the train cars but apparently it just wasn’t that busy, so we were spared that “experience”.

Tokyo Metro Train Ride (c) 2011 Silverton Consulting, All Rights Reserved

Nonetheless, the ride on the metro lines was fun, loud and crowded.  In the pictures on the platform one can see Michael Hay (Hitachi), Jason Knadler (HDS), Richard, Andrew, Tony and a few other analysts.  In the car, one can readily see Andrew, Nick and Sean. Also the back of Micky Sandorfi’s (HDS) head.  Had lots of “close bonding” on that trip.

Hitachi’s Central Research Lab (HCRL) was founded during WWII and has been doing fundamental and business sponsored research there ever since.  It seems like just about everyone we met there had a PhD after their name.  We were shown some of Hitachi’s advanced research in optical interconnects, next generation MRI, explosives detection, and other technologies.

One of the highlights of the day’s events though was the tour of the grounds. HCRL is in a campus-like setting with forested areas, hot springs and ponds to no doubt help the scientists invent the next big thing.  We were allowed to take pictures here and I have included a few of them.

Plum blossom on HCRL Grounds (c) 2011 Silverton Consulting, All Rights Reserved

In the plum blossom picture, Mr. IRIE Naohika who led the tour can be easily seen as well as the back of Rajnish Arora (IDC). I think one can see the back of Claus’s and Nick’s heads here as well as Ms. NAKAMURA Yuko.  In the other picture one can see Mr. MAEDA Yuki (HDS) who led most of the trip in Japan and was showing us the way to the lab, as well as bits of Sean, Micky, Rajnish and Tony.

At the entrance to HCRL grounds (c) 2011 Silverton Consulting, All Rights Reserved

—-

All in all I had a great trip. We learned a lot about Hitachi-HDS technology and upcoming products. Got to see Tokyo and had a wonderful time. Overall I thought the meetings were productive for both analysts and Hitachi-HDS.

The only negative I would have to mention was the mad dash through the Shinagawa station with Sean and Yuki to get me to the “Narita Express” train on time, but other than that it was lots of fun.

Top 10 storage technologies over the last decade

Aurora's Perception or I Schrive When I See Technology by Wonderlane (cc) (from Flickr)

Some of these technologies were in development prior to 2000, some were available in other domains but not in storage, and some were in a few subsystems but had yet to become popular as they are today.  In no particular order here are my top 10 storage technologies for the decade:

  1. NAND based SSDs – DRAM and other technology solid state drives (SSDs) were available last century but over the last decade NAND Flash based devices have dominated SSD technology and have altered the storage industry forever more.  Today, it’s nigh impossible to find enterprise class storage that doesn’t support NAND SSDs.
  2. GMR head – Giant Magneto Resistance disk heads have become commonplace over the last decade and have allowed disk drive manufacturers to double data density every 18-24 months.  Now GMR heads are starting to transition over to tape storage and will enable that technology to increase data density dramatically.
  3. Data Deduplication – Deduplication technologies emerged over the last decade as a complement to higher density disk drives as a means to more efficiently back up data.  Deduplication technology can be found in many different forms today, ranging from file and block storage systems and backup storage systems to backup software only solutions.
  4. Thin provisioning – No one would dispute that thin provisioning emerged last century but it took the last decade to really find its place in the storage pantheon.  One almost cannot find a data center class storage device that does not support thin provisioning today.
  5. Scale-out storage – Last century if you wanted to get higher IOPS from a storage subsystem you could add cache or disk drives but at some point you hit a subsystem performance wall.  With scale-out storage, one can now add more processing elements to a storage system cluster without having to replace the controller to obtain more IO processing power.  The link reference talks about the use of commodity hardware to provide added performance but scale-out storage can also be done with non-commodity hardware (see Hitachi’s VSP vs. VMAX).
  6. Storage virtualization – Server virtualization has taken off as the dominant data center paradigm over the last decade but a counterpart to this in storage has also become more viable.  Storage virtualization was originally used to migrate data from old subsystems to new storage but today can be used to manage and migrate data over PBs of physical storage, dynamically optimizing data placement for cost and/or performance.
  7. LTO tape – When IBM dominated IT in the mid to late last century, the tape format du jour always matched IBM’s tape technology.  As the decade dawned, IBM was no longer the dominant player and tape technology was starting to diverge into a babble of differing formats.  As a result, IBM, Quantum, and HP put their technology together and created a standard tape format, called LTO, which has become the new dominant tape format for the data center.
  8. Cloud storage – It is unclear just when over the last decade cloud storage emerged but it seemed to be a supplement to cloud computing, which also appeared this past decade.  Storage service providers had existed earlier but due to bandwidth limitations and storage costs didn’t survive the dotcom bubble. But over this past decade both bandwidth and storage costs have come down considerably and cloud storage has now become a viable technological solution to many data center issues.
  9. iSCSI – SCSI has taken on many forms over the last couple of decades but iSCSI has altered the dominant block storage paradigm from a single, pure FC based SAN to a plurality of technologies.  Nowadays, SMB shops can have block storage over the LAN networking technology they already use, without the cost and complexity of FC SANs.
  10. FCoE – One could argue that this technology is still maturing today but once again SCSI has opened up another way to access storage. FCoE has the potential to offer all the robustness and performance of FC SANs over data center Ethernet hardware, simplifying and unifying data center networking onto one technology.

No doubt others would differ on their top 10 storage technologies over the last decade but I strove to find technologies that significantly changed data storage between 2000 and today.  These 10 seemed to me to fit the bill better than most.

Comments?

The future of data storage is MRAM

Core Memory by teclasorg

We have been discussing NAND technology for quite a while now but this month I ran across an article in IEEE Spectrum titled “a SPIN to REMEMBER – Spintronic memories to revolutionize data storage“. The article discussed a form of magneto-resistive random access memory or MRAM that uses quantum mechanical spin effects or spintronics to record data. We have talked about MRAM technology before and progress has been made since then.

Many in the industry will recall that current GMR (Giant Magneto-resistance) heads and next generation TMR (Tunnel Magneto-resistance) disk read heads already make use of spintronics to detect magnetized bit values in disk media. GMR heads detect bit values on media through changes in their own electrical resistance.

Spintronics however can be used to record data as well as read it. These capabilities are being exploited in MRAM technology which uses a ferromagnetic material to record data in magnetic spin alignment – spin up means 0; spin down means 1 (or vice versa).

The technologists claim that when MRAM reaches its full potential it could conceivably replace DRAM, SRAM, NAND, and hard disk drives, or all current electrical and magnetic data storage. Some of MRAM’s advantages include unlimited write passes, fast reads and writes and data non-volatility.

MRAM reminds me of old fashioned magnetic core memory (in photo above) which used magnetic polarity to record non-volatile data bits. Core was a memory mainstay in the early years of computing before the advent of semi-conductor devices like DRAM.

Back to future – MRAM

However, the problems with MRAM today are that it is low-density, takes lots of power and is very expensive. But technologists are working on all these problems with the view that the future of data storage will be MRAM. In fact, researchers at the North Carolina State University (NCSU) Electrical Engineering department have been having some success with reducing power requirements and increasing density.

As for data density, NCSU researchers now believe they can record data in cells approximating 20 nm across, better than current bit patterned media which is the next generation disk recording media. However reading data out of such a small cell will prove to be difficult and may require a separate read head on top of each cell. The fact that all of this is created with normal silicon fabrication methods makes doing so at least feasible but the added chip costs may be hard to justify.

Regarding high power, their most recent design records data by electronically controlling the magnetism of a cell. They are using dilute magnetic semiconductor material doped with gallium manganese which can hold spin value alignment (see the article for more information). They are also using a semiconductor p-n junction on top of the MRAM cell. Apparently at the p-n junction they can control the magnetization of the MRAM cells by applying -5 volts or removing it. Today the magnetization is temporary but they are working on solutions for this as well.

NCSU researchers would be the first to admit that none of this is ready for prime time and they have yet to demonstrate in the lab an MRAM memory device with 20nm cells, but the feeling is it’s all just a matter of time and lots of research.

Fortunately, NCSU has lots of help. It seems Freescale, Honeywell, IBM, Toshiba and Micron are also looking into MRAM technology and its applications.

—–

Let’s see, using electron spin alignment in a magnetic medium to record data bits, needs a read head to read out the spin values – couldn’t something like this be used in some sort of next generation disk drive that uses the ferromagnetic material as a recording medium? Hey, aren’t disks already using a ferromagnetic material for recording media? Could MRAM be fabricated/laid down as a form of magnetic disk media?? Maybe there’s life in disks yet….

What do you think?

Save the planet – buy fatter disks and flash

PC hard drive capacity over time (from commons.wikimedia.org) (cc)

Well maybe that overstates the case but there is no denying that both fatter (higher capacity) drives and flash memory (used as cache or in SSDs) save energy in today’s data center.  The interesting thing is that the trend to higher capacity drives has been going on for decades now (see chart) but only within the last few years has it been given any credit for energy reduction.  In contrast, flash in SSDs and cache is a relative newcomer but saves energy nonetheless.

I almost can’t recall when disk drives weren’t doubling in capacity every 18 to 24 months.  The above chart only shows PC drive capacities over time but enterprise drives have followed a similar curve.  The coming hard drive capacity wall may slow things down in the future but just last week IBM announced they were moving from a 300GB to a 600GB 15Krpm enterprise class disk drive in their DS8700 subsystem.  While doubling capacity may not quite halve energy use per unit of storage, it’s still significant.   Such energy reductions are even more dramatic with slower, higher density disks. These SATA disks are moving from 1TB to 2TB later this year and should cut energy use considerably.
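To put rough numbers on the paragraph above, here is a minimal Python sketch. The 18 to 24 month doubling period and the 300GB to 600GB step come from the text; the wattage figures are purely hypothetical placeholders (I have not looked up actual drive specs), as are the function names.

```python
def growth_factor(years, doubling_months):
    """Capacity multiple after `years` if capacity doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)


def watts_per_tb(drive_watts, capacity_gb):
    """Power drawn per TB stored, for a single drive."""
    return drive_watts * 1000 / capacity_gb


# HYPOTHETICAL power figures, for illustration only (not vendor data).
# Assumption: the denser drive draws slightly more power than the old one.
OLD_DRIVE_WATTS = 15.0   # assumed 300GB 15Krpm drive
NEW_DRIVE_WATTS = 16.0   # assumed 600GB 15Krpm drive

old = watts_per_tb(OLD_DRIVE_WATTS, 300)
new = watts_per_tb(NEW_DRIVE_WATTS, 600)
saving = 1 - new / old

print(f"Decade growth at 24-month doubling: {growth_factor(10, 24):.0f}x")
print(f"Decade growth at 18-month doubling: {growth_factor(10, 18):.0f}x")
print(f"Watts per TB: {old:.1f} -> {new:.1f} ({saving:.0%} less)")
```

With these made-up wattages the per-TB saving comes out a bit under half, which is the "not quite halve" effect: capacity doubles but the drive itself doesn't draw exactly the same power.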

Similarly, NAND flash density used in SSDs is increasing capacity at almost a faster rate than disk storage.  ASIC feature size continues to shrink and as such, more and more flash storage is packed onto the same die size.  Improvements like these are doubling the capacity of SSDs and flash memory.  While SSD power reduction due to density improvements may not be as significant as disk, we hope to see a flattening out of power use per NAND cell over time.  This flattening out of power use is now happening with processing chips and we see little reason why similar techniques couldn’t apply to NAND.

But the story with flash/SSDs is a bit more complicated:

  • SSDs don’t consume as much energy as a standard disk drive at the same capacity, so a 146GB enterprise class SSD should consume much less energy than a 146GB enterprise class disk drive.
  • SSDs don’t exhibit the significant energy spike that hard disk drives encounter when driven at higher IOPS, as was discussed in SSDs vs. Drives energy use.
  • SSDs can often replace many more disk spindles than pure capacity equivalence would dictate.  Some data centers use more disks than necessary to spread workload performance over more spindles, wasting storage, power and cooling.  By moving this data to SSDs or adding flash cache to a subsystem, spindle counts can be reduced dramatically and energy use for storage slashed as a result.
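The spindle count point in the last bullet can be sketched in a few lines of Python. Everything numeric here is a hypothetical assumption chosen for illustration (the workload size, the per-device IOPS and the device capacities other than the 146GB and 300GB sizes mentioned in the text), and `drives_needed` is just an illustrative helper name.

```python
import math


def drives_needed(capacity_gb, iops, drive_gb, drive_iops):
    """Devices required to satisfy BOTH the capacity and the IOPS demand."""
    for_capacity = math.ceil(capacity_gb / drive_gb)
    for_iops = math.ceil(iops / drive_iops)
    return max(for_capacity, for_iops)


# HYPOTHETICAL hot workload: 2TB of data needing 20,000 IOPS.
WORKLOAD_GB = 2000
WORKLOAD_IOPS = 20000

# HYPOTHETICAL per-device figures (not measured or vendor numbers).
hdd_count = drives_needed(WORKLOAD_GB, WORKLOAD_IOPS, drive_gb=300, drive_iops=200)
ssd_count = drives_needed(WORKLOAD_GB, WORKLOAD_IOPS, drive_gb=146, drive_iops=5000)

print(f"Disk drives needed: {hdd_count}")  # driven by IOPS, not capacity
print(f"SSDs needed:        {ssd_count}")  # driven by capacity, not IOPS
```

Under these assumptions the disk configuration is sized by IOPS (far more spindles than the capacity calls for) while the SSD configuration is sized by capacity, so the device count, and with it power and cooling, drops sharply.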

All this says that using SSDs or flash in place of disk drives reduces data center power requirements.  So if you’re interested in saving energy and thus, helping to save the planet, buy fat(ter) disks and flash for your data storage needs.

Brought to you on behalf of Planet Earth in honor of Earth Day.