Primary Data’s path to better data storage presented at SFD8

A couple of weeks ago we met with Primary Data (Lance Smith, CEO; David Flynn, CTO; and Kaycee Lai, SVP Product & Sales), who were presenting at Storage Field Day 8 (SFD8, videos of their sessions available here). Primary Data emerged from stealth late last year and has ~$60M in funding. They also have Steve Wozniak (of Apple fame) as Chief Scientist, but he wasn’t at the SFD8 session 🙁

Primary Data seems out to change the world. At first I thought this was just another form of storage virtualization, but they are laser-focused on data virtualization or what they call data mobility, which differs from pure storage virtualization by being outside the data path. (I have written about data virtualization before, as well as the data hypervisor a long time ago.) Nowadays they seem to be using the tagline of data in motion.

Why move data?

David has a theory about the proliferation of storage startups: the spectrum of capacity and performance has widened immensely over time, which has opened the door for a number of companies to address these widening needs.

David believes that caching, at the storage system or in the servers, is an attempt to address this issue by “loaning” data from the storage silo to the cache, trying to supply a lower $/IOPS for the data. Similar considerations are apparent at the other end of the spectrum, where customers use archive or backup services to take advantage of much cheaper $/GB storage.

However, given the difficulty of moving data around in present day storage environments, customer data has become essentially immobile. Primary Data is trying to bring about a data mobility revolution and allow data to move over this spectrum of storage performance and capacity with ease. Doing so easily will provide significant benefits, as customers can more fully take advantage of the various levels of performance and capacity in their data center storage environments.

Primary Data architecture

Primary Data provides data mobility by using their metadata service, called the DataSphere appliance, and their client software running on host servers, called the Data Portal. Their offering is best explained in three layers:

  • Data virtualization layer – provides continuity of identity and continuity of access across multiple physical storage systems. That is, the same data (identity continuity) can be accessed wherever it resides (access continuity) by server applications. Such access and identity must transcend access protocols and interfaces. The Data Portal client software intercepts server data activity, performs control plane operations via the DataSphere appliance, and performs IO directly against the physical storage.
  • Objective based data management – supplies a data affinity service. That is, data can have a temporary location relationship with physical storage depending on the current performance (R:W, IOPS, bandwidth, latency) and protection (durability, availability, disaster recoverability, security, copy-ability, version-ability) characteristics of the data. These data objectives are matched to the capabilities or service catalog of the storage infrastructure, and data objectives can change over time (a rough sketch of this matching appears just after this list).
  • Analytics in the loop – detects the performance and other characteristics of the storage and data in real time. That is, by monitoring storage IO activity Primary Data can determine the actual performance attributes of the storage. Similarly, by monitoring an application’s IO characteristics over time, the system can determine the performance objectives of its data. The system also takes advantage of SMI-S to define some of the other characteristics of the storage systems.
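To make the objective matching idea a bit more concrete, here’s a minimal sketch (mine, not Primary Data’s actual engine; the objective names, tier names, thresholds, and prices are all made up for illustration) of picking the cheapest storage tier that satisfies a data object’s current objectives:

```python
# Illustrative sketch of objective-based data placement (not Primary Data's engine).
# A data object's objectives are matched against a catalog of storage capabilities
# and the cheapest tier that satisfies every objective wins.

storage_catalog = [
    # name, max latency (ms), max IOPS, durability (nines), $/GB/month -- hypothetical numbers
    {"name": "all-flash NAS",  "latency_ms": 1.0,   "iops": 200_000, "nines": 5, "cost": 0.30},
    {"name": "hybrid NAS",     "latency_ms": 10.0,  "iops": 20_000,  "nines": 4, "cost": 0.10},
    {"name": "cloud archive",  "latency_ms": 500.0, "iops": 500,     "nines": 6, "cost": 0.01},
]

def place(objectives):
    """Return the cheapest storage tier meeting the object's current objectives."""
    candidates = [
        s for s in storage_catalog
        if s["latency_ms"] <= objectives["latency_ms"]
        and s["iops"] >= objectives["iops"]
        and s["nines"] >= objectives["nines"]
    ]
    return min(candidates, key=lambda s: s["cost"])["name"] if candidates else None

# Objectives can change over time; re-running place() is what would trigger a
# (non-disruptive) migration to a different tier.
print(place({"latency_ms": 2.0, "iops": 50_000, "nines": 4}))    # -> all-flash NAS
print(place({"latency_ms": 1000.0, "iops": 100, "nines": 5}))    # -> cloud archive
```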

How does Primary Data work?

Primary Data has taken advantage of the parallel NFS (pNFS) extensions in NFSv4.1 to externalize and separate the storage control plane from the IO data plane. This works well for native Linux, where the main developer of the Linux file system stack is on their payroll.

In Windows they put a filter driver in front of SMB to split the control plane from the data IO plane. Something similar is done for VMware ESX servers to supply the control/data plane split, but in this case there is a software defined Data Portal that goes along with the DataSphere service client and can do it all within the same ESX server. Another alternative is to use the Data Portal appliance as a storage virtualization service, but then the IO data path goes through the portal.
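To picture what splitting the control plane from the IO data plane buys you, here’s a toy sketch (my illustration, not Primary Data’s client software or the pNFS wire protocol): a small round trip to a metadata service answers “where does this live?”, and the bulk reads then go straight to the storage location:

```python
# Toy illustration of a control/data plane split (my sketch, not Primary Data's API).
# The control plane answers "where does this file live?"; the data plane then
# reads the bytes directly from that location, bypassing the metadata service.
import os
import tempfile

class MetadataService:
    """Stands in for a DataSphere-like control plane: name -> physical location."""
    def __init__(self):
        self.layouts = {}

    def register(self, name, physical_path):
        self.layouts[name] = physical_path

    def get_layout(self, name):
        # Control-plane round trip: a small message, no bulk data moves here.
        return self.layouts[name]

class Client:
    """Stands in for Data Portal-like client software running on the host."""
    def __init__(self, mds):
        self.mds = mds

    def read(self, name):
        layout = self.mds.get_layout(name)   # control plane
        with open(layout, "rb") as f:        # data plane: direct IO to the storage
            return f.read()

if __name__ == "__main__":
    mds = MetadataService()
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:           # pretend this file is the physical storage
        f.write(b"hello from the data plane")
    mds.register("projects/report.doc", path)
    print(Client(mds).read("projects/report.doc"))
```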

According to their datasheet they currently support data virtualization services for NetApp cDOT and 7-mode, EMC Isilon OneFS 7.2, and Nexenta 4.x & 5.0, with more planned.

They are not quite GA yet, but are close.

Comments?

When 64 nodes are not enough

Why would VMware, with years of ESX development behind it, want to develop a whole new virtualization system for Docker and other container frameworks? Especially since it already has compatible Docker support in its current product line.

The main reason I can think of is that a 64-node cluster may be limiting for some container services, and VMware ESX/vSphere is unlikely to support 1000s of nodes in a single cluster anytime soon. So given that more and more cloud services are being deployed across 1000s of nodes using container frameworks, VMware had to do something or say goodbye to a potentially lucrative use case for virtualization.

Yes, over time VMware may indeed extend vSphere clusters to 128 or even 256 nodes, but by then the world will have moved beyond vSphere for these use cases, and where will VMware be then? Left behind.

Photon to the rescue

With the new Photon system VMware has an answer for anyone that needs 1000 to 10,000 server cluster environments. Now these customers can easily deploy their services on the VMware Photon Platform, which was developed from ESX but doesn’t have ESX’s cluster limitations.

Thus the need for Photon is now. Customers can easily deploy container frameworks that span 1000s of nodes. Of course it won’t be as easy to manage as a 64-node vSphere cluster, but it will be easily automated, easier to deploy, and easier to scale when necessary, especially beyond 64 nodes.

The claim is that the new Photon will be able to support multiple container frameworks without modification.

So what’s stopping you from taking on the Amazons, Googles, and Apples of the world’s data centers?

  • Maybe storage, but then there’s ScaleIO and the other software defined storage solutions that can support local DAS clusters of almost incredible size.
  • Maybe networking. I am not sure just where NSX is in the scheme of things; maybe it’s capable of handling 1000s of nodes and maybe not, but networking could be a clear limit on how many nodes can be deployed in this sort of environment.

Where does this leave vSphere? Probably continuing on its current trajectory: making VMware clusters easier and more efficient to run and, over time, extending current limitations. So for the moment there are two development streams based off of ESX, each being enhanced for its own market.

How much of ESX survives is an open question, but it’s likely that Photon will never see the familiar VMware services and operations that are readily available to vSphere clusters.

Comments?

Photo Credit(s): A first look into Dockerfile system

EMCWorld2015 day 1 news

We are at EMCWorld2015 in Vegas this week. Day 1 was great with new XtremIO 4.0, “The Beast”, new enhanced Data Protection, and a new VCE VxRACK converged infrastructure solution announcements. Somewhere in all the hoopla I saw an all flash VNXe appliance and VMAX3 with a cloud storage tier but these seemed to be just teasers.

XtremIO 4.0

The new hardware provides 40TB per X-brick; with compression/dedupe, the new 8-Xbrick cluster provides 320TB raw or 1.9PB effective capacity. As XtremIO supports 150K mixed IOPS per X-brick, an 8-Xbrick cluster could do 1.2M IOPS, or with 250K read IOPS per X-brick, that’s 2.0M read IOPS.
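For what it’s worth, the capacity and IOPS claims above multiply out as follows (the rough 6:1 data reduction ratio implied by 320TB raw vs. 1.9PB effective is my inference, not an EMC-stated number):

```python
# Arithmetic behind the 8-Xbrick XtremIO 4.0 numbers quoted above.
xbricks = 8
raw_tb_per_brick = 40
mixed_iops_per_brick = 150_000
read_iops_per_brick = 250_000

print(xbricks * raw_tb_per_brick, "TB raw")              # 320 TB raw
print(xbricks * mixed_iops_per_brick, "mixed IOPS")      # 1,200,000 mixed IOPS
print(xbricks * read_iops_per_brick, "read IOPS")        # 2,000,000 read IOPS
# 320TB x ~6 (compression + dedupe) ≈ 1.9PB effective -- my inference, not EMC's stated ratio.
```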

XtremIO 4.0 now also includes RecoverPoint integration. (I assume this means they have integrated the write splitter directly into XtremIO, so that you don’t need the host or switch version of the write splitter.)

The other thing XtremIO 4.0 introduces is non-disruptive upgrades. This means that they can expand or contract the cluster without taking down IO activity.

There was also some mention of better application consistent snapshots, which I suspect means Microsoft VSS integration.

XtremIO 4.0 is a free software upgrade, so the ability to scale up to 8 X-bricks, non-disruptive cluster changes, and RecoverPoint integration can all be added to current XtremIO systems.

Data Protection

EMC introduced a new top end DataDomain hardware appliance the DataDomain 9500, which has 1.5X the performance (58.7TB/hr) and 4X the capacity (1.7PB) of their nearest competitor solution.

They also added a new software feature (from Maginatics) called CloudBoost™. CloudBoost allows Networker and Avamar to back up to cloud storage. EMC also added Microsoft Office 365 cloud backup to Spanning’s previous Google Apps and Salesforce cloud backups.

VMAX3 ProtectPoint was also enhanced to provide native backup for Oracle, Microsoft SQL Server, and IBM DB2 application environments. ProtectPoint offers a direct path between VMAX3 and DataDomain appliances and can speed up backup performance by 20X.

EMC also announced Project Falcon, which is a virtual appliance version of DataDomain software.

VCE VxRACK

This is a rack-sized stack of VSPEX Blue appliances (a VMware EVO:RAIL solution) with new software that brings VCE usability and data-center-scale services to a hyper-converged solution. Each appliance is a 2U rack-mounted compute-intensive or storage-intensive unit. The Blue appliances are configured in a rack for VxRACK, and with version 1 you can choose either VMware or KVM as your hypervisor. Version 2 will come out later this year and will be based on a complete VMware stack known as EVO: RACK.

Storage services are supplied by EMC ScaleIO. You can purchase a 1/4 rack, 1/2 rack, or full rack, which includes top-of-rack networking. You can also scale out by adding more full racks to the system. EMC said that it can technically support 1000s of racks of VSPEX Blue appliances for up to ~38PB of storage.

The significant thing is that the VCE VxRACK supplies the VCE customer experience in a hyper-converged solution. However, the focus for VxRACK is tier 2 applications that don’t need the extremely high availability, low response times, and high performance of the tier 1 applications that run on VBLOCK solutions (with VNX, VMAX or XtremIO storage).

VMAX3

They had a 5th grader provision a VMAX3 gold storage LUN and convert it to a diamond storage LUN in 20.48 seconds. It seemed pretty simple to me, but the kid blazed through the screens a bit fast for me to see what was going on. It wasn’t nearly as complex as it used to be.

VMAX3 also introduces CloudArray™, which uses FastX storage tiering to cloud storage (using onboard TwinStrata software). This could be used as tier 3 or 4 storage. EMC also mentioned that you can have an XtremIO (maybe an X-brick) behind a VMAX3 storage system. VMAX3’s software rewrite has separated data services from backend storage, and one can see EMC rolling out different backend storage (like cloud storage or XtremIO) in future offerings.

Other Notes

There was a lot of discussion about the “Information Generation”, a new kind of customer for IT services. This is tied to the 3rd platform transformation that’s happening in the industry today. To address this new world, IT needs to have 5 attributes:

  1. Predictively spot new opportunities for services/products
  2. Deliver a personalized experience
  3. Innovate in an agile way
  4. Develop trusted programs/apps and demonstrate transparency & trust
  5. Operate in real time

David Goulden talked a lot about what this all means and I encourage you to take a look at the video stream to learn more.

Speaking of video, last year was the first year there were more online viewers of EMCWorld than actual participants. So this year EMC upped their game with more entertainment value. The opening dance sequence was pretty impressive.

A lot of talk today was on the 3rd platform and the transition from the 2nd platform. EMC says their new products are Platform 2.5, which are enablers for the 3rd platform. I asked what the 3rd platform storage environment looks like, and they said a scale-out (read: ScaleIO), converged storage environment with flash for metadata/indexing.

As the 3rd platform transforms IT there will be some customers that will want to own the infrastructure, some that will want to use service providers and some that will use public cloud services. EMC’s hope is to capture those customers that want to own it or use service providers.

Tomorrow the focus will be on the Federation with Pivotal and VMware being up for keynotes and other sessions. Stay tuned.

What’s next for Nexenta

We talked with Nexenta at Storage Field Day 6 where they discussed their current and future software defined storage solutions. I highly encourage you to see the SFD6 videos of their sessions if you want to learn more about them.

Nexenta was an early adopter of software defined storage and has recently signed with Solinea to support Nexenta under OpenStack Cinder block storage. Nexenta’s storage software is based on ZFS and supports inline deduplication and advanced performance functionality.

NexentaStor™

NexentaStor™ is their base storage software and comes as a download in both Enterprise and Community editions. NexentaStor can run on most industry standard x86 server platforms.

  • The Community edition supports up to 18TB and uses DAS and/or SAS connected storage to supply NFS and SMB file services.
  • The Enterprise edition extends capacity into the PB and supports FC and iSCSI block storage services as well as file services. The Enterprise edition supports plugins for HA solutions and storage replication.

Nexenta mentioned that they had over 6500 customers for NexentaStor, of which 1500 are cloud service providers. But they have a whole lot more to offer than just NexentaStor, including NexentaConnect™ and, coming soon, NexentaEdge™ and NexentaFusion™.

NexentaConnect™

NexentaConnect software works with VMware or Citrix solutions to provide advanced storage services, such as file services, IO acceleration, and storage automation/analytics. There are three products in the NexentaConnect family:

  • NexentaConnect for VMware Virtual SAN – by combining NexentaConnect together with VMware Virtual SAN software and DAS or SAS storage one can offer NFS and SMB/CIFS file services.  Prior to NexentaConnect, VMware Virtual SAN storage only provided VMware dedicated SAN storage, but now that same infrastructure can be used for any NFS or SMB/CIFS file system storage.
  • NexentaConnect for VMware Horizon – by combining NexentaConnect with VMware Horizon and DAS plus local SSD storage, one can provide accelerated virtual desktop IO with state of the art write logging, inline deduplication, and GUI based storage automation/analytics.
  • NexentaConnect for Citrix XenDesktop (in beta now) – by combining NexentaConnect with Citrix XenDesktop software and DAS plus local SSD storage, one can accelerate XenDesktop IO and ease the management of XenDesktop storage.

Nexenta has teamed up with Dell to offer Dell-Nexenta (and VMware) storage solution using NexentaConnect and VMware Virtual SAN software on Dell hardware.

NexentaEdge™

They spent a lot of time on NexentaEdge, a planned software defined object storage solution. Most object storage systems on the market either started as software only or currently support a software-only version, but Nexenta is the first I know of to come at it from a file services heritage.

NexentaEdge will offer iSCSI services as well as standard object storage services such as Amazon S3 and OpenStack SWIFT. Their solution splits up objects into chunks and replicates/distributes the object chunks across their software defined (object) storage cluster.

Cluster communications use UDP (not TCP) and so have less overhead. NexentaEdge uses its own Replicast protocol to send messages and data out across the cluster.
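To illustrate the chunk-and-distribute idea, here’s a toy sketch of content-hash based placement (my illustration, not the Replicast protocol itself; the node names, chunk size and replica count are invented):

```python
# Toy sketch of chunking + replica placement (my illustration, not Replicast itself).
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]
CHUNK_SIZE = 4          # tiny, just for the demo; real chunks would be KB-MB sized
REPLICAS = 3

def chunk(data, size=CHUNK_SIZE):
    """Split an object into fixed-size chunks."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def place_chunk(chunk_bytes, replicas=REPLICAS):
    """Pick `replicas` distinct nodes for a chunk based on its content hash."""
    h = int(hashlib.sha256(chunk_bytes).hexdigest(), 16)
    start = h % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(replicas)]

obj = b"an object split into chunks and replicated"
for i, c in enumerate(chunk(obj)):
    print(f"chunk {i}: {c!r} -> {place_chunk(c)}")
```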

They designed NexentaEdge to be able to support Shingled Magnetic Recording (SMR) disks, which are very dense storage but occasionally have to go “away” while they perform garbage collection/re-organization. I did two posts about SMR disks a while back (see Shingled magnetic recording disks and Sequential-only disk for more information on SMR).

I have to admit I had a BIG problem with support for iSCSI over eventually consistent storage. I don’t see how this can be used to support ACID database requests but I suppose Nexenta would argue that anyone using object storage for ACID database IO needs to have their head examined.

NexentaFusion™

Although this was not discussed as much, NexentaFusion is another future offering supplying software defined storage analytics and orchestration automation. The intent is to use NexentaFusion with NexentaStor, NexentaConnect and/or NexentaEdge. As you scale up your Nexenta storage cluster, automation/orchestration and storage analytics become a more pressing need. According to Nexenta’s website, NexentaFusion 1.0 will support multi-tenant storage monitoring and real-time storage analytics, while NexentaFusion 2.0 will support storage provisioning and orchestration.

~~~~

Nexenta provided Converse all-star shoes to all the participants as well as pens and notebooks. I had to admit I liked the look of the new tennis shoes but my wife and kids thought I was crazy.

Different views on Nexenta from the other SFD6 bloggers can be found below:

SFD6 – Day 2 – Nexenta from PenguinPunk (Dan Firth, @PenguinPunk)

Nexenta – Back in da house by Nigel Poulton (@NigelPoulton)

Sorry Nexenta, but I don’t get it … and questions arise by Juku (Enrico Signoretti, @ESignoretti)

Day 2 at SFD6: Nexenta by Absolutely Windows (John Obeto, @JohnObeto)

Data virtualization surfaces

There’s a new storage startup out of stealth called Primary Data, and it’s implementing data (note, not storage) virtualization.

They already have $60M in funding with some pretty high-powered talent from Fusion-io, namely David Flynn, Rick White and Steve Wozniak (the ‘Woz’, also of Apple fame).

There have been a number of attempts at creating a virtualization layer for data, namely ViPR (see my post ViPR virtues, vexations but no storage virtualization), but Primary Data is taking a different tack on the problem.

Data virtualization explained

[Figure: Data hypervisor, software defined storage, data plane, control plane. (c) 2012 Silverton Consulting, Inc. All rights reserved]

Essentially they want to separate the data plane from the control plane (See my Data Hypervisor post and comments for another view on this).

  • The data plane consists of those storage system activities that actually perform IO, i.e., reads and writes.
  • The control plane is those storage system activities that do everything else that has to be done by a storage system, including provisioning, monitoring, and managing the storage.

Separating the data plane from the control plane offers a number of advantages. EMC ViPR does this, but its data plane is either standard storage systems like VMAX, VNX, Isilon, etc., or software defined storage solutions. Primary Data wants to do it all.

Their metadata or control plane engine is called a Data Director, which holds information about the data objects stored in the Primary Data system, runs a data policy management engine, and handles data migration.

Primary Data relies on purpose-built Data Hypervisor (client) software that talks to Data Directors to understand where data objects reside and how to go about accessing them. But once the metadata information is transferred to the client software, IO activity can go directly between the host and the storage system in a protocol-independent fashion.

[The graphic above is from my prior post and I assumed the data hypervisor (DH) would be co-located with the data but Primary Data has rightly implemented this as a separate layer in host software.]

Data Hypervisor protocol independence?

As I understand it this means that customers could use file storage, object storage or block storage to support any application requirement. This also means that file data (objects) could be migrated to block storage and still be accessed as file data. But the converse is also true, i.e., block data (objects) could be migrated to file storage and still be accessed as block data. You need to add object, DAS, PCIe flash and cloud storage to the mix to see where they are headed.

All data in Primary Data’s system are object encapsulated, and all data objects are catalogued within a single, global namespace that spans file, block, object and cloud storage repositories.

Data objects can reside on Primary storage systems, external non-Primary data aware file or block storage systems, DAS, PCIe Flash, and even cloud storage.
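One way to picture the identity and access continuity this buys: a single catalog keyed by object identity, where a migration only rewrites the location record. A minimal sketch (mine, with made-up backends, protocols and paths):

```python
# Sketch of a single global namespace over mixed back ends (my illustration).
# Identity (the key) never changes; only the location/protocol value does,
# which is what lets data migrate without applications noticing.

catalog = {
    "vol1/db/datafile01": {"backend": "all-flash NAS", "protocol": "NFSv4", "path": "/flash/db/datafile01"},
    "vol1/logs/2015.tar": {"backend": "cloud",         "protocol": "S3",    "path": "bucket/logs/2015.tar"},
}

def migrate(object_id, backend, protocol, path):
    """Move an object: its identity stays the same, only the location record changes."""
    catalog[object_id] = {"backend": backend, "protocol": protocol, "path": path}

migrate("vol1/logs/2015.tar", "DAS", "block", "/dev/sdb:offset=0")
print(catalog["vol1/logs/2015.tar"])
```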

How does Data Virtualization compare to Storage Virtualization?

There are a number of differences:

  1. Most storage virtualization solutions are in the middle of the data path and because of this have to be fairly significant, highly fault-tolerant solutions.
  2. Most storage virtualization solutions don’t have a separate and distinct meta-data engine.
  3. Most storage virtualization systems don’t require any special (data hypervisor) software running on hosts or clients.
  4. Most storage virtualization systems don’t support protocol independent access to data storage.
  5. Most storage virtualization systems don’t support DAS or server based, PCIe flash for permanent storage. (Yes this is not supported in the first release but the intent is to support this soon.)
  6. Most storage virtualization systems support internal storage that resides directly inside the storage virtualization system hardware.
  7. Most storage virtualization systems support an internal DRAM cache layer which is used to speed up IO to internal and external storage and is in addition to any caching done at the external storage system level.
  8. Most storage virtualization systems only support external block storage.

There are a few similarities as well:

  1. They both manage data migration in a non-disruptive fashion.
  2. They both support automated policy management over data placement, data protection, data performance, and other QoS attributes.
  3. They both support multiple vendors of external storage.
  4. They both can support different host access protocols.

Data Virtualization Policy Management

A policy engine runs in the Data Directors and provides SLAs for data objects. This would include performance attributes, protection attributes, security requirements and cost requirements.  Presumably, policy specifications for data protection would include RAID level, erasure coding level and geographic dispersion.

In Primary Data, backup becomes nothing more than object snapshots with different protection characteristics, like offsite full copy. Moreover, data object migration can be handled completely outboard and without causing data access disruption and on an automated policy basis.

Primary Data first release

Primary Data will initially be deployed as an integrated data virtualization solution, which includes an all-flash NAS storage system and a standard NAS system. Over time, Primary Data will add non-Primary Data external storage and internal storage (DAS, SSD, PCIe Flash).

The Data Policy Engine and Data Migrator functionality will be charged for separately as software solutions. Data Directors are sold in pairs (active-passive) and can be non-disruptively upgraded. Storage (directors?) are also sold separately.

Data Hypervisor (client) software is available for most flavors of Linux and OpenStack, and is coming for ESX. Windows SMB support is not split yet (control plane/data plane), but Primary Data does support SMB. I believe the Data Hypervisor software will also be released in an upcoming version of the Linux kernel.

They are currently in testing. No official date for GA but they did say they would announce pricing in 2015.

~~~~

Comments?

Disclosure: We have done work for Primary Data over the past year.

Photo Credits:

  1. Screen shot of beta test system supplied by Primary Data
  2. Graphic created by SCI for prior Data Hypervisor post

Cloud storage growth is hurting NAS & SAN storage vendors

Strange Clouds by michaelroper (cc) (from Flickr)

My friend Alex Teu (@alexteu) from Oxygen Cloud wrote a post today about how Cloud Storage is Eating the World Alive. Alex reports that all major NAS and SAN storage vendors lost revenue this year over the previous year, ranging from a ~3% loss to over a 20% loss (Q1-2014 compared to Q1-2013, from IDC).

Although an interesting development, it’s hard to say that this is the end of enterprise storage as we know it. I believe there are a number of factors impacting enterprise storage revenues, and cloud storage adoption may be only one of them.

Other trends impacting NAS & SAN storage adoption

One thing that has emerged over the last decade or so is the advance of flash storage. Some of this is used in storage controllers to speed up IO access and some is used in servers to speed up IO access. But any speedup of IO could potentially reduce the need for high-performing disk drives and could allow customers to use higher capacity/slower disk drives instead, which could definitely reduce the cost of storage systems. A little bit of flash goes a long way to speed up IO access.

The other thing is that disk capacity is trending upward at exponential rates. Yesterday’s 2TB disk drive is today’s 4TB disk drive, and we are already seeing 6TB from Seagate, HGST and others. This too is driving down the cost of NAS and SAN storage.

Nowadays you can configure 1PB of storage with just over 170 drives. Somewhere in there you might want a couple of hundred TB of flash to speed up IO access to these slow disks, but flash is also coming down in ($/GB) price (see SanDisk’s recent consumer-grade TLC drive at $0.44/GB). Also, the move to MLC flash has increased the capacity of flash devices, leading to fewer SSDs/flash cache cards needed to store/speed up more data.
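The “just over 170 drives” figure falls straight out of today’s 6TB drives:

```python
# Back-of-the-envelope behind "1PB with just over 170 drives".
pb_in_tb = 1000
drive_tb = 6
print(pb_in_tb / drive_tb)   # ~167 drives, before any RAID/spare overhead
```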

Finally, the other trend that seems to have emerged recently is the movement away from enterprise class storage to server storage. One can see this in VMware’s VSAN, hyper-converged systems such as Nutanix and Scale Computing, as well as a general trend in Windows Server applications (SQL Server, Exchange Server, etc.) to make better use of DAS storage. So some customers are moving their data to shared DAS storage today, whereas before this was more difficult to accomplish effectively, so they purchased networked storage instead.

What about cloud storage?

Yes, as Alex has noted, the price of cloud storage has declined precipitously over the last year or so. Alex’s cloud storage pricing graph shows how the entry of Microsoft and Google has seemingly forced Amazon to match their price reductions. But the other thing of note is that they have all come down to about the same basic price of $0.024/GB/month.
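To put $0.024/GB/month in perspective, here’s what a 100TB working set costs for a year at that list price (ignoring request and egress charges):

```python
# What $0.024/GB/month means for a 100TB working set over a year (list price only).
price_per_gb_month = 0.024
tb = 100
gb = tb * 1000
print(f"${price_per_gb_month * gb * 12:,.0f} per year")   # ~$28,800/year
```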

It’s interesting that Amazon delayed their first serious S3 price reductions until about 4 months after Azure and Google Cloud Storage dropped theirs, and then within another month after that they were all at price parity.

What’s cloud storage real growth?

I reported last August that Microsoft Azure and Amazon S3 were respectively storing 8 trillion and over 2 trillion objects (see my Is object storage outpacing structured and unstructured data growth). This year (April 2014) Microsoft mentioned at TechEd that Azure was storing 20 trillion objects and servicing 2 million requests per second.

I could find no update to Amazon S3’s numbers from last year, but the 2.5x growth in Azure’s object count in ~8 months and the roughly doubled requests/second (in my post I didn’t mention that last year they were processing 900K requests/second) say something interesting is going on in cloud storage.
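The growth implied by those Azure numbers works out as follows:

```python
# Growth implied by the Azure numbers above (Aug 2013 -> Apr 2014, ~8 months).
objects_then, objects_now = 8e12, 20e12
reqs_then, reqs_now = 900_000, 2_000_000
print(f"objects: {objects_now / objects_then:.1f}x in ~8 months")   # 2.5x
print(f"requests/sec: {reqs_now / reqs_then:.1f}x")                 # ~2.2x
```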

I suppose Google’s cloud storage service is too new to report serious results and maybe Amazon wants to keep their growth a secret. But considering Amazon’s recent matching of Azure’s and Google’s pricing, it probably means that their growth wasn’t what they expected.

The other interesting item from the Microsoft discussions on Azure, was that they were already hosting 1M SQL databases in Azure and that 57% of Fortune 500 customers are currently using Azure.

In the “olden days”, before cloud storage, all these SQL databases and Fortune 500 data sets would more than likely have resided on NAS or SAN storage of some kind. And possibly, due to traditional storage’s higher cost and greater complexity, some of this data would never have been spun up in the first place if it had to use traditional storage. But with cloud storage so cheap, rapidly configurable and easy to use, all this new data was placed in the cloud.

So I must conclude from Microsoft’s growth numbers, and their implication for the rest of the cloud storage industry, that maybe Alex was right: more data is moving to the cloud, and this is impacting traditional storage revenues. With IDC’s (2013) data growth at ~43% per year, it would seem that Microsoft’s cloud storage is growing more rapidly than the worldwide data growth, ~14X faster!

On the other hand, if cloud storage were consuming most of the world’s data growth, it would seem to precipitate the collapse of traditional storage revenues, not just a ~3-20% decline. So maybe most new cloud storage applications would never have been implemented if they had to use traditional storage, which means that only some of this new data would ever have been stored on traditional storage in the first place, leading to a relatively smaller decline in revenue.

One question remains: is this a short term impact or more of a long running trend that will play out over the next decade or so? From my perspective, new applications spinning up on non-traditional storage are a long running threat to traditional NAS and SAN storage, which will ultimately see traditional storage relegated to a niche. How big this niche will ultimately be and how well it can be defended needs to be the subject of another post.

~~~~

Comments?

MCS, UltraDIMMs and memory IO, the new path ahead – part 2

In part 1 (see previous post here), we discussed the underlying technology for SanDisk’s UltraDIMMs based on Diablo Technologies MCS hardware and software. IBM will be shipping UltraDIMMs in their high end servers later this year as their new eXFlash.

In this segment we will discuss what SanDisk has put on top of Diablo Technologies’ MCS to supply SSD storage.

SanDisk UltraDIMM SSD storage

In the UltraDIMM package, SanDisk supports 200 or 400GB of 19nm MLC NAND SSD storage that is accessed via SATA [corrected after this went out, Ed.] internally, but the main interface is 1600MHz DDR3 to the UltraDIMMs. As each UltraDIMM card plugs into any DDR3 memory slot, you can potentially support multiples of these cards in a single server, dependent on the number of memory slots in your server [corrected after this went out, Ed.]. IBM on their x3850 and x3950 can support up to 32 UltraDIMMs per server.
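With all 32 slots populated on an x3850/x3950, the per-server flash capacity works out to:

```python
# Maximum UltraDIMM flash per server on an x3850/x3950, per the numbers above.
dimms = 32
gb_per_dimm = 400          # the larger of the 200/400GB options
print(dimms * gb_per_dimm / 1000, "TB of DDR3-attached flash")   # 12.8 TB
```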

SanDisk uses their Guardian Technology to enhance NAND endurance beyond what’s possible with native NAND controllers. One of the things Guardian Technology does is vary the voltage used to program the NAND bits over the life of the bit cells/pages. So early on, when a cell is fresh, it can use less voltage, and as the cell ages it increases the voltage to insure that the bits are properly programmed. With other NAND controllers that use the same voltage across the whole NAND lifetime, the NAND bits are unduly stressed early on and, later as they age, can no longer be programmed properly and need to be flagged as bad. The NAND chips/bits are characterized so that SanDisk’s Guardian Technology can use an optimum voltage curve over the chip’s lifetime.
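Purely to illustrate the idea (the curve shape and numbers below are invented; SanDisk’s actual Guardian Technology relies on per-chip characterization data that isn’t public), an adaptive program-voltage schedule might look something like this:

```python
# Illustrative only: an adaptive program-voltage schedule of the sort described above.
# The curve shape and numbers are invented for illustration -- not SanDisk's algorithm.

def program_voltage(pe_cycles, v_start=14.0, v_end=19.0, rated_cycles=3000):
    """Start gentle on fresh cells, ramp toward a higher voltage as cells wear."""
    wear = min(pe_cycles / rated_cycles, 1.0)
    return v_start + wear * (v_end - v_start)

for cycles in (0, 500, 1500, 3000):
    print(f"{cycles:>5} P/E cycles -> program at ~{program_voltage(cycles):.1f} V")
```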

The UltraDIMMs also have powerloss protection. This means that any write to an UltraDIMM memory that’s been acknowledged to the server is guaranteed to have sufficient power to make it all the way to the SSD storage.

Another thing the MCS memory interface brings to the picture is Error Correction Circuitry (ECC). Data written to UltraDIMMs has ECC protection throughout the data path, from the server DRAM memory, through the DIMM socket, all the way to the SSD flash.

As discussed extensively in Part 1 of this post, access times for UltraDIMM storage are on the order of 7µsec, which is ~7X faster than best-of-class PCIe Flash storage, and a single UltraDIMM card is capable of sustaining 20GB/second of data throughput. I know of enterprise class storage systems that can’t do half that in throughput.

On the other hand, one problem with UltraDIMM storage is that it is not hot swappable. This is primarily a memory interface problem and not an UltraDIMM issue, but nonetheless, you can’t swap an UltraDIMM module until the server is powered down. And who would want to do such a thing while the server is powered on anyway?

SanDisk long history in NAND

As you can see from the three photos at right, SanDisk seems to have been involved in flash/NAND technology innovation since the early 1990s. At the time, NOR and NAND were competing for almost the same market.

But sometime in the mid to late 1990s NAND found a niche in consumer cameras and never looked back. Not sure where the NOR market is today, but it’s a drop in the bucket compared to the NAND market.

UltraDIMMs are just the latest platform to support NAND storage access. They happen to be one with blazingly fast access times and high IO parallelism, but in the end they just represent another way to obtain the benefits of NAND for IT customers.

Also, SanDisk’s commercial NAND (Memory Card) business seems to be very healthy. What with higher resolution photos/video/audio coming online over the next decade or so it doesn’t seem to be going away anytime soon.

SanDisk is in a new joint venture (JV) with Toshiba to produce 3D NAND flash. But in the meantime they are still using 2D flash for their current SSD storage. Toshiba and SanDisk, in their current JV, together manufacture about half the NAND bits in the world today.

The rest of SanDisk’s NAND business also seems to be doing well. And the aforementioned JV with Toshiba on 3D NAND looks positioned to take all of this NAND to the next level of density, which should make all of us happy.

SanDisk acquiring FusionIO

SanDisk was in the news lately as they have recently filed to acquire FusionIO, a prominent and early PCIe flash supplier that in recent years has broadened its portfolio to include enterprise storage with its acquisition of NexGen Storage (renamed IO Control).

When FusionIO IPO’d, the stock sold at ~$19/share, and SanDisk is purchasing the company in an all-cash deal for $11.25/share, almost a 40% reduction in share price in 3 years (June ’11 IPO). Ouch. At IPO the company was valued at ~$2B (some pundits said ~$1.5B, so there’s some debate on the original valuation). SanDisk is buying the company for ~$1.1B in cash. Any way you look at it, they paid significantly less than what the company was worth at IPO. Granted, it was valued at 41X earnings then, and its recent stock price of $11.59 represents a 3.3 P/E (ttm).

Not exactly certain what happened. Analysts seem to indicate that Apple and Facebook, FusionIO’s biggest customers, were buying less FusionIO product. I also happen to think that the PCIe flash space has gotten pretty crowded over the last 3 years, with entrants from Micron Technology, Intel, LSI, Virident/Western Digital, and others.

In addition, for PCIe flash to broaden its market there’s a serious need to surround it with sophisticated caching software to enable a more general purpose IO solution (see Pernix Data, Proximal Data, and others). These general purpose caching solutions have finally reached high levels of sophistication and are just now becoming more widely available.

~~~~

Originally, part 3 of this series was going to be on IBM’s release of the UltraDIMM technology as their new eXFlash. However, I am somewhat surprised not to see other vendors taking up the MCS/UltraDIMM technology; IBM may have limited exclusivity to it.

The only other thing that’s this interesting happening in solid state storage is HP’s memristor Machine, which is still a ways off.

Nonetheless, a new, much faster, memory channel based SSD is hitting the market, and if history is any indication, it won’t be long until the data storage world sits up and takes notice.

Comments?

MCS, UltraDIMMs and memory IO, the new path ahead – part 1

I was at Storage Field Day 5 (SFD5) last month and got a chance to talk with SanDisk and Diablo Technologies. It turns out that SanDisk’s UltraDIMM product is based on Diablo Technologies MCS hardware. So the two of them provided a pretty deep dive into the technology and where they want to go with it. Before we go any deeper: the UltraDIMMs will be released to the field by IBM under the eXFlash name.

Diablo Technologies

The team at Diablo has been focusing on the standard x86 memory channel for a while now and lately has been trying out different sorts of technologies to connect as CPU memory. The first Memory Channel Storage (MCS) product converts memory channel IO to SATA IO. This allows any SATA device to be attached as memory and enjoy lightning-fast memory access times, clocked at 7µsec. Most PCIe Flash cards have an access latency of 50µsec or more, so this is ~7X faster than PCIe Flash. They also claim MCS is capable of 20GB/sec; I know enterprise class storage systems that can’t do that. Also, MCS utilizes 2 memory channels.
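The 7X claim is simply the ratio of the two access latencies quoted above:

```python
# The "7X faster" claim is the ratio of the two access latencies quoted above.
pcie_flash_latency_us = 50
mcs_latency_us = 7
print(f"{pcie_flash_latency_us / mcs_latency_us:.1f}x lower latency")   # ~7.1x
```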

Diablo delivers a chip (that converts memory IO to SATA IO) and software that provides block IO access to the MCS device. Customers of MCS supply their own SATA flash storage device and presumably package it all together in a DIMM-compatible card.

But the main constraint is that the whole MCS chip and SATA IO flash device have to fit in the form factor of a DIMM and cannot draw any more power than a memory device can, ~10-15W, with its corresponding thermal load.

But this seems plenty for a small flash drive. The MCS is configured as a 4GB DDR3 DIMM. The BIOS needs to be patched so that it doesn’t run diagnostic memory tests on the MCS device, and their software needs to be loaded to access the device as a block device. I believe they currently support the Linux O/S, with more O/Ss on the way.

Diablo has looked at other applications for their technology; a memory IO accessed Ethernet NIC was mentioned. But flash storage seems a great first application of the technology. It’s not clear to me, but SAS might also be something that could be done.

Whatever happens after NAND with the next generation of semiconductor storage (see my The end of NAND is near post), it seems to me that accessing it as memory IO would make an awful lot of sense. Using MCS as the access channel would seem to be a logical next step.

Part 1 of this story is on Diablo Technologies, Part 2 will be on SanDisk and I am not sure but maybe there will be a Part 3 on IBM eXFlash. So stay tuned.

Comments?