Latest SPECsfs2008 results, over 1 million NFS ops/sec – chart-of-the-month

Column chart showing the top 10 NFS throughput operations per second for SPECsfs2008
(SCISFS111221-001) (c) 2011 Silverton Consulting, All Rights Reserved

[We are still catching up on our charts for the past quarter but this one brings us up to date through last month]

There’s just something about a million SPECsfs2008(r) NFS throughput operations per second that kind of excites me (weird, I know).  Yes, it takes over 44 nodes of Avere FXT 3500 with over 6TB of DRAM cache, 140 nodes of EMC Isilon S200 with almost 7TB of DRAM cache and 25TB of SSDs, or at least 16 nodes of NetApp FAS6240 in Data ONTAP 8.1 cluster mode with 8TB of FlashCache to get to that level.

Nevertheless, a million NFS throughput operations is something worth celebrating.  It’s not often one achieves a 2X improvement in performance over a previous record.  Something significant has changed here.

The age of scale-out

We have reached a point where scaling systems out can provide linear performance improvements, at least up to a point.  For example, the EMC Isilon and NetApp FAS6240 showed close to linear speedups in performance as they added nodes, indicating (to me at least) that there may be more to be had if they just throw more storage nodes at the problem.  Then again, maybe they saw some drop off and didn’t wish to show the world, or the costs became prohibitive and they had to stop someplace.   On the other hand, Avere only benchmarked a 44-node system with their current hardware (FXT 3500); they must have figured winning the crown was enough.

However, I would like to point out that throwing just any hardware at these systems doesn’t necessarily increase performance.  Previously (see my CIFS vs NFS corrected post), we had shown a linear regression of NFS throughput against spindle count, and although the regression coefficient was good (~R**2 of 0.82), it wasn’t perfect. And of course we eliminated any SSDs from that prior analysis. (We probably should consider eliminating any system with more than a TB of DRAM as well – but this was before the 44-node Avere result was out.)
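(For anyone who wants to reproduce that kind of analysis, here’s a minimal sketch of how the fit and R**2 can be computed with scipy.  The data points below are placeholders, not the actual SPECsfs2008 submissions.)

```python
# Minimal regression sketch: NFS throughput vs. spindle count.
# The numbers below are placeholders, NOT real SPECsfs2008 results.
from scipy import stats

spindles = [100, 200, 400, 600, 800, 1200]           # drives per submission
nfs_ops  = [30e3, 55e3, 110e3, 170e3, 220e3, 340e3]  # NFS throughput ops/sec

fit = stats.linregress(spindles, nfs_ops)
print(f"ops/sec per spindle: {fit.slope:.0f}")
print(f"R**2: {fit.rvalue ** 2:.2f}")   # the real-data fit was ~0.82
```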

Speaking of disk drives, each FAS6240 node had 72 450GB 15Krpm disks, each Isilon node had 24 300GB 10Krpm disks and each Avere node had 15 600GB 7.2Krpm SAS disks.  However, the Avere system also had 4 Solaris ZFS file storage systems behind it, each of which had another 22 3TB (7.2Krpm, I think) disks.  Given all that, the 16-node NetApp, 140-node Isilon and 44-node Avere systems had a total of 1152, 3360 and 748 disk drives respectively.   Of course, this doesn’t count the system disks for the Isilon and Avere systems nor any of the SSDs or FlashCache in the various configurations.
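A quick tally of those drive counts (data disks only, ignoring the system disks, SSDs and FlashCache as noted above):

```python
# Data-disk counts per configuration, from the node counts and drives/node above
netapp = 16 * 72             # 16 FAS6240 nodes x 72 drives         = 1152
isilon = 140 * 24            # 140 Isilon S200 nodes x 24 drives    = 3360
avere  = 44 * 15 + 4 * 22    # 44 FXT 3500 x 15 + 4 ZFS boxes x 22  = 748
print(netapp, isilon, avere)
```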

I would say that with this round of SPECsfs2008 benchmarks, scale-out NAS systems have truly arrived.  It’s too bad that neither NetApp nor Avere released comparable CIFS benchmark results, which would have helped in my perennial discussion of CIFS vs. NFS.

But there’s always next time.

~~~~

The full SPECsfs2008 performance report went out to our newsletter subscribers last December.  A copy of the full report will be up on the dispatches page of our site sometime later this month (if all goes well). However, you can see our full SPECsfs2008 performance analysis now and subscribe to our free monthly newsletter to receive future reports directly by just sending us an email or using the signup form above right.

For a more extensive discussion of file and NAS storage performance covering top 30 SPECsfs2008 results and NAS storage system features and functionality, please consider purchasing our NAS Buying Guide available from SCI’s website.

As always, we welcome any suggestions on how to improve our analysis of SPECsfs2008 results or any of our other storage system performance discussions.

Comments?

Will Hybrid drives conquer enterprise storage?

Toyota Hybrid Synergy Drive Decal: RAC Future Car Challenge by Dominic's pics (cc) (from Flickr)

I saw where Seagate announced the next generation of their Momentus XT Hybrid (SSD & disk) drive this week.  We haven’t discussed Hybrid drives much on this blog but they have become a viable product family.

I am not planning on describing the new drive specs here as there was an excellent review by Greg Schulz at StorageIOblog.

However, the question some in the storage industry have asked is whether Hybrid drives can supplant data center storage.  I believe the answer to that is no, and I will tell you why.

Hybrid drive secrets

The secret to Seagate’s Hybrid drive lies in its FAST technology.  It provides a sort of automated disk caching that moves frequently accessed OS or boot data to NAND/SSD providing quicker access times.

Caching logic has been around in storage subsystems for decades now, ever since the IBM 3880 Mod 11 & 13 storage control systems came out last century.  However, these algorithms have gotten much more sophisticated over time and today can make a significant difference in storage system performance.  This can be easily witnessed by the wide variance in storage system performance on a per disk drive basis (e.g., see my post on Latest SPC-2 results – chart of the month).
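To make the basic idea concrete, here is a toy least-recently-used (LRU) block cache in Python.  It’s a generic illustration of frequently-accessed-block caching only, not Seagate’s FAST logic nor any particular controller’s algorithm:

```python
from collections import OrderedDict

class LRUBlockCache:
    """Toy least-recently-used block cache (illustrative only)."""
    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()          # block number -> data

    def read(self, lba, backend_read):
        if lba in self.blocks:               # cache hit: fast path
            self.blocks.move_to_end(lba)
            return self.blocks[lba]
        data = backend_read(lba)             # cache miss: go to disk/NAND
        self.blocks[lba] = data
        if len(self.blocks) > self.capacity: # evict the coldest block
            self.blocks.popitem(last=False)
        return data
```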

Enterprise storage use of Hybrid drives?

The problem with using Hybrid drives in enterprise storage is that caching algorithms are based on some predictability of access/reference patterns.  When you have a Hybrid drive directly connected to a server or a PC it can view a significant portion of server IO (at least to the boot/OS volume) but more importantly, that boot/OS data is statically allocated, i.e., doesn’t move around all that much.   This means that one PC session looks pretty much like the next PC session and as such, the hybrid drive can learn an awful lot about the next IO session just by remembering the last one.

However, enterprise storage IO changes significantly from one storage session (day?) to another.  Not only are the end-user generated database transactions moving around the data, but the data itself is much more dynamically allocated, i.e., moves around a lot.

Backend data movement is especially true for automated storage tiering used in subsystems that contain both SSDs and disk drives. But it’s also true in systems that map data placement using log structured file systems.  NetApp’s Write Anywhere File Layout (WAFL) is a prominent example of this approach, but other storage systems do it as well.

In addition, any fixed, permanent mapping of a user data block to a physical disk location is becoming less useful over time as advanced storage features make dynamic or virtualized mapping a necessity.  Just consider snapshots based on copy-on-write technology: all it takes is a single write for a snapshotted block to be moved to a different location.
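A minimal sketch of why copy-on-write defeats any fixed block-to-location mapping: once a snapshot exists, the next write to a block gets redirected to a new physical location so the snapshot’s view stays intact (illustrative only, not any vendor’s implementation):

```python
class CowVolume:
    """Toy copy-on-write volume: after a snapshot, writes land in new locations."""
    def __init__(self, size_blocks):
        self.map = {lba: lba for lba in range(size_blocks)}  # logical -> physical
        self.snap = None                                      # frozen mapping
        self.next_free = size_blocks                          # next free physical block

    def take_snapshot(self):
        self.snap = dict(self.map)

    def write(self, lba):
        # If the snapshot still references this physical block, redirect the write
        if self.snap is not None and self.snap[lba] == self.map[lba]:
            self.map[lba] = self.next_free
            self.next_free += 1
        return self.map[lba]    # physical location the new data lands in

vol = CowVolume(4)
vol.take_snapshot()
print(vol.write(2))   # block 2 is redirected to a new physical location (4)
print(vol.write(2))   # further writes stay at the relocated block
```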

Nonetheless, the main problem is that all the smarts about what is happening to data on backend storage primarily lies at the controller level not at the drive level.  This not only applies to data mapping but also end-user/application data access, as cache hits are never even seen by a drive.  As such, Hybrid drives alone don’t make much sense in enterprise storage.

Maybe, if they were intricately tied to the subsystem

I guess one way this could all work better is if the Hybrid drive caching logic were somehow controlled by the storage subsystem.  In this way, the controller could provide hints as to which disk blocks to move into NAND.  Perhaps this is a way to distribute storage tiering activity to the backend devices, without the subsystem having to do any of the heavy lifting, i.e., the hybrid drives would do all the data movement under the guidance of the controller.
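Purely as a thought experiment, and assuming a hint command set that does not exist in any SATA standard today, such a controller-to-drive interface might look something like this:

```python
# Hypothetical controller-to-hybrid-drive hint interface (no such SATA
# command set exists; this is a thought-experiment sketch only).
from dataclasses import dataclass

@dataclass
class CacheHint:
    lba_start: int     # first logical block of the extent
    block_count: int   # length of the extent
    action: str        # e.g. "pin_to_nand", "evict_from_nand"

def send_hints(drive, hot_extents):
    """Controller tells the hybrid drive which extents it expects to be hot.
    'drive.submit' is a made-up method standing in for a vendor command path."""
    for lba, count in hot_extents:
        drive.submit(CacheHint(lba, count, "pin_to_nand"))
```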

I don’t think this is likely because it would take industry standardization to define any new “hint” commands and they would be specific to Hybrid drives.  Barring standards, it’s an interface between one storage vendor and one drive vendor.  That’s probably OK if you made both the storage subsystem and the hybrid drives, but there aren’t any vendors left that make both drives and storage controllers.

~~~~

So, given the state of enterprise storage today and its continuing proclivity to move data around across its backend storage, I believe Hybrid drives won’t be used in enterprise storage anytime soon.

Comments?

 

OCZ’s new Octane SATA SSD pushes latency limits below 100μsec

(c) 2011 OCZ (from their website)

OCZ just announced that their new Octane 1TB SSD can perform reads and writes in under 100 μsec (specifically “Read: 0.06ms; Write: 0.09ms”).  Such fast access times boggle the imagination and, even with SATA 3, seem almost unattainable.

Speed matters, especially with SSDs

Why would any device try to reach a 90μsec write access time and a 60μsec read access time? With the advent of high-speed stock trading, where even distance matters a lot, latency is becoming a hot topic once again.

Although from my perspective it never really went away (see my Storage throughput vs. IO response time and why it matters post).  So access times measured in 10s of μsec are just fine by me.

How SSD access time translates into storage system latency or response time is another matter.  But one can see some seriously fast storage system latencies (or LRT) in TMS’s latest RAMSAN SPC-1 benchmark results, under ~90μsec measured at the host level! (See my May dispatch on latest SPC performance).  On the other hand, how they measure 90μsec host level latencies without a logic analyzer attached is beyond me.

How are they doing this?

How can OCZ’s SATA SSD deliver such fast access times? NAND is too slow to provide this access time for writes, so there must be some magic.  For instance, NAND writes (programming) can take on the order of a couple of hundred μsec, and that doesn’t include the erase time of more like 0.5msec.  So the only way to support a 90μsec write access time with NAND chips is by buffering write data into an on-device DRAM cache.

NAND reads are quite a bit faster, on the order of 25μsec for the first byte and 25nsec for each byte after that.  As such, SSD read data could conceivably come directly from NAND.  However, you have to set aside some device latency/access time for IO command processing, chip addressing, channel setup, etc.  Thus, it wouldn’t surprise me to see them using the DRAM cache for read data as well.
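A back-of-the-envelope calculation using the rough NAND timings above (my assumptions, not OCZ’s published design) shows how tight both numbers are against the advertised access times:

```python
# Back-of-the-envelope NAND timing using the rough figures discussed above
first_byte_us = 25       # ~25 usec to the first byte of a NAND read
per_byte_ns   = 25       # ~25 nsec for each subsequent byte
page_bytes    = 4096

read_4k_us = first_byte_us + page_bytes * per_byte_ns / 1000
print(f"raw NAND 4KB read : ~{read_4k_us:.0f} usec")   # ~127 usec vs. a 60 usec spec

program_us = 200         # rough NAND program (write) time
erase_us   = 500         # block erase time, handled separately in the background
print(f"raw NAND program  : ~{program_us} usec vs. a 90 usec spec")
# Both exceed the advertised access times, hence the likely on-device DRAM cache.
```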

—–

I never thought I would see sub-1msec storage system response times, but that barrier was broken a couple of years ago with IBM’s Turbo 8300.   With the advent of DRAM caching for NAND SSDs and new, purpose-built all-SSD storage systems, it seems we are already in the age of sub-100μsec response times.

I fear that to get much below this we may need something like the next generation of SATA or SAS to come out, plus even faster processing/memory speeds. But from where I sit, sub-10μsec response times don’t seem that far away.  By then, distance will matter even more.

Comments?

Pure Storage surfaces

1 controller X 1 storage shelf (c) 2011 Pure Storage (from their website)

We were talking with Pure Storage last week, another SSD startup, which just emerged out of stealth mode today.  Somewhat like SolidFire, which we discussed a month or so ago, Pure Storage uses only SSDs to provide primary storage.  In this case, they support an FC front end with an all-SSD backend and implement internal data deduplication and compression to try to address the needs of enterprise tier 1 storage.

Pure Storage is in final beta testing with their product and plans to GA sometime around the end of the year.

Pure Storage hardware

Their system is built around MLC SSDs, which are available from many vendors, but with a strategic investment from Samsung they currently use that vendor’s SSDs.  As we know, MLC has write endurance limitations, but Pure Storage was built from the ground up knowing they were going to use this technology and they have built their IP to counteract these issues.

The system is available in one or two controller configurations, with an Infiniband interconnect between the controllers, 6Gbps SAS backend, 48GB of DRAM per controller for caching purposes, and NV-RAM for power outages.  Each controller has 12-cores supplied by 2-Intel Xeon processor chips.

With the first release they are limiting the controllers to one or two (HA option) but their storage system is capable of clustering together many more, maybe even up to 8-controllers using the Infiniband back end.

Each storage shelf provides 5.5TB of raw storage using 2.5″ 256GB MLC SSDs.  It looks like each controller can handle up to 2 storage shelves, with the HA (dual controller) option supporting 4 drive shelves for up to 22TB of raw storage.
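The raw capacity arithmetic, assuming roughly 22 of the 256GB SSDs per shelf (which is what the 5.5TB per shelf figure implies):

```python
# Rough raw-capacity arithmetic for the shelf configurations described above
ssd_gb         = 256
ssds_per_shelf = 22                               # assumption implied by the 5.5TB/shelf figure
shelf_tb       = ssds_per_shelf * ssd_gb / 1024   # ~5.5 TB raw per shelf
print(f"single shelf   : ~{shelf_tb:.1f} TB raw")
print(f"4 shelves (HA) : ~{4 * shelf_tb:.1f} TB raw")   # ~22 TB raw
```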

Pure Storage Performance

Although these numbers are not independently verified, the company says a single controller (with 1 storage shelf) can do 200K sustained 4K random read IOPS, 2GB/sec of bandwidth, 140K sustained write IOPS, or 500MB/sec of write bandwidth.  A dual controller system (with 2 storage shelves) can achieve 300K random read IOPS, 3GB/sec of bandwidth, 180K write IOPS or 1GB/sec of write bandwidth.  They also claim that they can do all this IO with under 1 msec of latency.

One of the things they pride themselves on is consistent performance.  They have built their storage such that they can deliver this consistent performance even under load conditions.

Given the number of SSDs in their system this isn’t screaming performance, but it is certainly up there with many enterprise class systems sporting over 1000 disks.  The random write performance is not bad considering this is MLC.  On the other hand, the sequential write bandwidth is probably their weakest spec and reflects their use of MLC flash.

Purity software

One key to Pure Storage (and SolidFire for that matter) is their use of inline data compression and deduplication. By using these techniques and basing their system storage on MLC, Pure Storage believes they can close the price gap between disk and SSD storage systems.

The problem with data reduction technologies is that not all environments can benefit from them and both require lots of CPU power to perform well.  Pure Storage believes they have the horsepower (with 12 cores per controller) to support these services and are focusing their sales activities on those environments (VMware, Oracle, and SQL Server) which have historically proven to be good candidates for data reduction.

In addition, they perform a lot of optimizations in their backend data layout to prolong the life of MLC storage. Specifically, they use a write chunk size that matches the underlying MLC SSDs page width so as not to waste endurance with partial data writes.  Also they migrate old data to new locations occasionally to maintain “data freshness” which can be a problem with MLC storage if the data is not touched often enough.  Probably other stuff as well, but essentially they are tuning their backend use to optimize endurance and performance of their SSD storage.
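As a generic illustration of the page-width matching idea (not Pure Storage’s actual code), a write path could coalesce incoming data into full NAND-page-sized chunks before programming the flash, so that no page gets programmed partially:

```python
class PageCoalescer:
    """Toy write buffer that only issues full, page-aligned NAND writes."""
    def __init__(self, page_bytes=8192):       # assumed page size, for illustration
        self.page_bytes = page_bytes
        self.buffer = bytearray()

    def write(self, data, program_page):
        self.buffer += data
        # Only program the flash in whole pages; partial pages wait in NVRAM/DRAM
        while len(self.buffer) >= self.page_bytes:
            program_page(bytes(self.buffer[:self.page_bytes]))
            del self.buffer[:self.page_bytes]
```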

Furthermore, they have created a new RAID 3D scheme which provides an adaptive parity scheme based on the number of available drives that protects against any dual SSD failure.  They provide triple parity, dual parity for drive failures and another parity for unrecoverable bit errors within a data payload.  In most cases, a failed drive will not induce an immediate rebuild but rather a reconfiguration of data and parity to accommodate the failing drive and rebuild it onto new drives over time.

At the moment, they don’t have snapshots or data replication but they said these capabilities are on their roadmap for future delivery.

—-

In the meantime, all-SSD storage systems seem to be coming out of the woodwork. We mentioned SolidFire, but WhipTail is another one and I am sure there are plenty more in stealth waiting for the right moment to emerge.

I was at a conference about two months ago where I predicted that all SSD systems would be coming out with little of the engineering development of storage systems of yore. Based on the performance available from a single SSD, one wouldn’t need 100s of SSDs to generate 100K IOPS or more.  Pure Storage is doing this level of IO with only 22 MLC SSDs and a high-end, but essentially off-the-shelf controller.

Just imagine what one could do if you threw some custom hardware at it…

Comments?

SATA Express combines PCIe and SATA

SATA Express plug configuration (c) SATA-IO (from SATA-IO.org website)

SATA-IO recently announced a new working specification, SATA Express (better described in their presentation), that will provide a SATA device interface directly connected to a server’s PCIe bus.

The new working specification offers either 8Gbps or 16Gbps depending on the number of PCIe lanes being used and provides a new PCIe/SATA-IO plug configuration.

While this may be a boon to normal SATA disk drives, I see the real advantage as an easier interface for PCIe-based NAND storage cards or Hybrid disk drives.

New generation of PCIe SSDs based on SATA Express

For example, previously if you wanted to produce a PCIe NAND storage card, you either had to surround it with IO drivers to provide storage/cache interfaces (as FusionIO does) or provide enough smarts on the card to emulate an IO controller along with the backend storage device (see my post on OCZ’s new bootable PCIe Z-drive).  With the new SATA Express interface, one no longer needs to provide any additional smarts with the PCIe card as long as it supports SATA Express.

It would seem that SATA Express would be the best of all worlds.

  • If you wanted a directly accessed SATA SSD you could plug it into your SATA-IO controller.
  • If you wanted networked SATA SSDs you could plug it into your storage array.
  • If you wanted even better performance than either of those two alternatives you could plug the SATA SSD directly into the PCIe bus with the PCIe/SATA-IO interface.

Of course supporting SATA Express will take additional smarts on the part of any SATA-IO device but with all new SATA devices supporting the new interface, additional cost differentials should shrink substantially.

SATA-IO 3.2

The PCIe/SATA-IO plug design is just a concept now, but SATA-IO expects to have the specification finalized by year end with product availability near the end of 2012.  The SATA-IO organization has designated the SATA Express standard to be part of SATA 3.2.

One other new capability is being introduced with SATA 3.2, specifically a µSATA specification designed to provide storage for embedded system applications.

The prior generation SATA 3.1, coming out in products soon, includes the mSATA interface specification for mobile device storage and the USM SATA interface specification for consumer electronics storage.   And as most should recall, SATA 3.0 provided 6Gbps data transfer rates for SATA storage devices.

—-

Can “SAS Express” be far behind?

Comments?

OCZ’s latest Z-Drive R4 series PCIe SSD

OCZ_Z-Drive_RSeries (from http://www.ocztechnology.com/ocz-z-drive-r4-r-series-pci-express-ssd.html)

OCZ just released a new version of their enterprise class Z-drive SSD storage with pretty impressive performance numbers (up to 500K IOPS [probably read] with 2.8GB/sec read data transfer).

Bootability

These new drives are bootable SCSI devices and connect directly to a server’s PCIe bus. They come in half height and full height card form factors and support 800GB to 3.2TB (full height) or 300GB to 1.2TB (half height) raw storage capacities.

OCZ also offers their Velo PCIe SSD series, which are not bootable and as such require an IO driver for each operating system. However, the Z-drive has more onboard intelligence, presenting itself as a SCSI device, and as such can be used anywhere.

Naturally this comes at the price of additional hardware and overhead, all of which could impact performance, but given their specified IO rates it doesn’t seem to be a problem.

It’s unclear how many other PCIe SSDs available today offer bootability, but it certainly puts these drives in a different class than previous generation PCIe SSDs, such as those available from FusionIO and other vendors, that require IO drivers.

MLC NAND

One concern with new Z-drives might be their use of MLC NAND technology.  Although OCZ’s press release said the new drives would be available in either SLC or MLC configurations, current Z-drive spec sheets only indicate MLC availability.

As discussed previously (see my eMLC & eSLC and STEC’s MLC posts), MLC supports less write endurance (program-erase and write cycles) than SLC NAND cells.  Normally the difference is on the order of 10X fewer cycles before NAND cell erase/write failure.

I also noticed there was no write endurance specification on their spec sheet for the new Z-drives.  Possibly,  at these capacities it may not matter but, in our view, a write endurance specification should be supplied for any SSD drive, and especially for enterprise class ones.

Z-drive series

OCZ offers two versions of their Z-drive, the R and C series, both of which offer the same capacities and high performance, but as far as I could tell the R series appears to have more enterprise-class availability and functionality. Specifically, this drive has power-fail protection for writes (capacitor power backup) as well as better SMART support (with “enterprise attributes”). Both seem to be missing from the C series drives.

We hope the enterprise-attribute SMART support provides write endurance monitoring and reporting, but no definition of these attributes was easily findable.

Also, the R series power backup, called DataWrite Assurance Technology, would be a necessary component for any enterprise disk device.  It essentially keeps data that has been written to the device but not yet committed to NAND from disappearing during a power outage/failure.

Given the above, we would certainly opt for the R series drive in any enterprise configuration.

Storage system using Z-drives

Just consider what one could do with a gaggle of Z-drives in a standard storage system.  For example, with 5 Z-drives in a server, it could potentially support 2.5M IOPS and 14GB/sec of data transfer, with some resulting loss of performance due to front-end emulation.  Moreover, at 3.2TB per drive, even in a RAID5 4+1 configuration the storage system would support 12.8TB of user capacity. One could conceivably do away with any DRAM cache in such a system and still provide excellent performance.
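The arithmetic behind that thought experiment, using OCZ’s headline numbers from above:

```python
# Aggregating 5 Z-drives, using the headline specs quoted above
drives     = 5
iops       = drives * 500_000     # 2.5M IOPS, before any front-end emulation overhead
read_gbps  = drives * 2.8         # 14 GB/sec of read data transfer
raid5_user = (drives - 1) * 3.2   # RAID5 4+1 of 3.2TB drives = 12.8TB user capacity
print(iops, read_gbps, raid5_user)
```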

What the cost for such a system would be is another question. But with MLC NAND it shouldn’t be too obscene.

On the other hand serviceability might be a concern as it would be difficult to swap out a failed drive (bad SSD/PCIe card) while continuing IO operations. This could be done with some special hardware but it’s typically not present in standard, off the shelf servers.

—-

All in all a very interesting announcement from OCZ.  The likelihood that a single server will need this sort of IO performance from a lone drive is not that high (except maybe for massive website front ends) but putting a bunch of these in a storage box is another matter.  Such a configuration would make one screaming storage system with minimal hardware changes and only a modest amount of software development.

Comments?

SolidFire supplies scale-out SSD storage for cloud service providers

SolidFire SF3010 node (c) 2011 SolidFire (from their website)

I was talking with a local startup called SolidFire the other day with an interesting twist on SSD storage.  They are targeting cloud service providers with a scale-out, cluster-based SSD iSCSI storage system.  Apparently a portion of their team came from Lefthand (now owned by HP), another local storage company, and the rest came from Rackspace, a national cloud service provider.

The hardware

Their storage system is a scale-out cluster of storage nodes that can range from 3 to a theoretical maximum of 100 nodes (validated node count ?). Each node comes equipped with 2-2.4GHz, 6-core Intel processors and 10-300GB SSDs for a total of 3TB raw storage per node.  Also they have 8GB of non-volatile DRAM for write buffering and 72GB read cache resident on each node.

The system also uses two 10GbE links for host-to-storage IO and inter-cluster communications and supports iSCSI LUNs.  There are another two 1GigE links used for management communications.

SolidFire states that they can sustain 50K IOPS per node. (This looks conservative from my viewpoint, but they didn’t state any specific R:W ratio or block size for this performance number.)

The software

They are targeting cloud service providers and as such the management interface was designed from the start as a RESTful API but they also have a web GUI built out of their API.  Cloud service providers will automate whatever they can and having a RESTful API seems like the right choice.

QoS and data reliability

The cluster supports 100K iSCSI LUNs and each LUN can have a QoS SLA associated with it.  According to SolidFire one can specify a minimum/maximum/burst level for IOPS and a maximum or burst level for throughput at a LUN granularity.

With LUN-based QoS, one can divide cluster performance into many levels of support for multiple customers of a cloud provider.  Given these unique QoS capabilities, it should be relatively easy for cloud providers to support multiple customers on the same storage, providing very fine-grained multi-tenancy capabilities.

This could potentially lead to system over-commitment, but presumably they have some way to detect when over-commitment is near and prevent it from occurring.
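As a generic illustration (not SolidFire’s code) of how a per-LUN max/burst IOPS limit could be enforced, a token bucket is one common approach: the bucket depth allows short bursts while the refill rate caps sustained IOPS.

```python
import time

class TokenBucket:
    """Generic token-bucket limiter, one way a per-LUN max/burst IOPS limit
    could be enforced (illustrative only, not SolidFire's implementation)."""
    def __init__(self, max_iops, burst_iops):
        self.rate = max_iops          # steady-state refill rate (tokens/sec)
        self.capacity = burst_iops    # bucket depth allows short bursts
        self.tokens = burst_iops
        self.last = time.monotonic()

    def allow_io(self):
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True               # admit the IO
        return False                  # defer/queue the IO
```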

Data reliability is supplied through replication across nodes which they call Helix(tm) data protection.  In this way if an SSD or node fails, it’s relatively easy to reconstruct the lost data onto another node’s SSD storage.  Which is probably why the minimum number of nodes per cluster is set at 3.

Storage efficiency

Aside from the QoS capabilities, the other interesting twist from a customer perspective is that they are trying to price an all-SSD storage solution at the $/GB of normal enterprise disk storage. They believe their node with 3TB raw SSD storage supports 12TB of “effective” data storage.

They are able to do this by offering storage efficiency features of enterprise storage using an all SSD configuration. Specifically they provide,

  • Thin provisioned storage – which allows physical storage to be over subscribed and shared across multiple LUNs as long as their configured space hasn’t been completely written.
  • Data compression – which searches for underlying redundancy in a chunk of data and compresses it out of the storage.
  • Data deduplication – which searches multiple blocks and multiple LUNs to see what data is duplicated and eliminates duplicative data across blocks and LUNs.
  • Space efficient snapshot and cloning – which allows users to take point-in-time copies which consume little space useful for backups and test-dev requirements.

Tape data compression gets anywhere from 2:1 to 3:1 reduction in storage space for typical data loads. Whether SolidFire’s system can reach these numbers is another question.  However, tape uses hardware compression, and the traditional problem with software data compression is that it takes lots of processing power and/or time to perform well.  As such, SolidFire has configured their node hardware to dedicate a CPU core to each physical drive (two 6-core processors for 10 SSDs in a node).

Deduplication savings are somewhat trickier to predict but ultimately depends on the data being stored in a system and the algorithm used to deduplicate it.  For user home directories, typical deduplication levels of 25-40% are readily attainable.  SolidFire stated that their deduplication algorithm is their own patented design and uses a small fixed block approach.

The savings from thin provisioning ultimately depends on how much physical data is actually consumed on a storage LUN but in typical environments can save 10-30% of physical storage by pooling non-written or free storage across all the LUNs configured on a storage system.

Space savings from point-in-time copies like snapshots and clones depends on data change rates and how long it’s been since a copy was made.  But, with space efficient copies and a short period of existence, (used for backups or temporary copies in test-development environments) such copies should take little physical storage.

Whether all of this can create a 4:1 multiplier for raw to effective data storage is another question, but they also have an eScanner tool which can estimate the savings one could achieve in their own data center. Apparently the eScanner can be used by anyone to scan real customer LUNs, and it will compute how much SolidFire storage would be required to support the scanned volumes.
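As a rough illustration of how the individual savings might compound, here’s the multiplier under some assumed (not SolidFire-supplied) reduction ratios:

```python
# Compounding illustrative efficiency factors; the percentages are my
# assumptions, not SolidFire's figures.
compression    = 2.0     # 2:1 data compression
dedupe_savings = 0.30    # 30% of blocks eliminated by deduplication
thin_savings   = 0.20    # 20% of configured space never physically written

multiplier = compression / ((1 - dedupe_savings) * (1 - thin_savings))
print(f"effective:raw ≈ {multiplier:.1f}:1")   # ≈ 3.6:1 with these inputs
```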

—–

There are a few items left on their current road map to be delivered later, namely remote replication or mirroring. But for now this looks to be a pretty complete package of iSCSI storage functionality.

SolidFire is currently signing up customers for Early Access but plan to go GA sometime around the end of the year. No pricing was disclosed at this time.

I was at SNIA’s BoD meeting the other week and stated my belief that SSDs will ultimately lead to the commoditization of storage.  By that I meant that it would be relatively easy to configure enough SSD hardware to create a 100K IO/sec or 1GB/sec system without having to manage 1000 disk drives.  Lo and behold, SolidFire comes out the next week.  Of course, I said this would happen over the next decade – so I am off by 9.99 years…

Comments?

EMCWorld news Day1 1st half

EMC World keynote stage, storage, vblocks, and cloud...

EMC announced today a couple of new twists on the flash/SSD storage end of the product spectrum.  Specifically,

  • They now support all flash/no-disk storage systems. Apparently they have been getting requests to eliminate disk storage altogether. Probably government IT but maybe some high-end enterprise customers with low-power, high performance requirements.
  • They are going to roll out enterprise MLC flash.  It’s unclear when it will  be released but it’s coming soon, different price curve, different longevity (maybe), but brings down the cost of flash by ~2X.
  • EMC is going to start selling server-side Flash, using FAST-like caching algorithms to knit the storage to the server-side Flash.  It’s unclear what server Flash they will be using, but it sounds a lot like a Fusion-IO type of product.  How well the server cache and the storage cache talk to each other is another matter.  Chuck Hollis said EMC decided to redraw the boundary between storage and server, and now there is a dotted line that spans the SAN/NAS boundary and carves out a piece of the server for what is essentially on-server caching.

Interesting to say the least.  How well it’s tied to the rest of the FAST suite is critical. What happens when one or the other loses power? As Flash is non-volatile, no data would be lost, but the currency of the data for shared storage may be another question.  Also, having multiple servers in the environment may require cache coherence across the servers and storage participating in this data network!?

Some teaser announcements from Joe’s keynote:

  • VPLEX asynchronous, active-active support for two-datacenter access to the same data over 1700km apart, Pittsburgh to Dallas.
  • New Isilon record scalability and capacity with the NL appliance, which can now support a 15PB file system with trillions of files in it.  One gene sequencer says a typical assay generates 500M objects/files…
  • Embracing Hadoop open source products, so EMC will support a Hadoop distro in an appliance or software-only solution.

Pat G also showed an EMC Greenplum appliance searching an 8B-row database to find out how many products have been shipped to a specific zip code…