Fall SNWUSA wrap-up

Attended SNWUSA this week in San Jose. It's hard to see the show gradually change when you attend each one, but it does seem that end-user content and attendance are increasing proportionally. This should bode well for future SNWs. There was always a good number of end users at the show, but in the past the bulk of the attendees were from storage vendors.

Another large storage vendor dropped their sponsorship. HDS no longer sponsors the show, and the last large vendor still standing is HP. Some of this is cyclical; perhaps the large vendors will come back for the spring show, next year in Orlando, FL. But EMC, NetApp and IBM seem to have dropped sponsorship for at least the last couple of shows.

SSD startup of the show

Skyhawk hardware (c) 2012 Skyera, all rights reserved (from their website)

The best new SSD startup had to be Skyera: a 48TB raw flash, dual-controller system supporting the iSCSI block protocol and using real commercial-grade MLC. The team at Skyera seems to be all ex-SandForce executives and technical people.

Skyera's team has designed a 1U box called the Skyhawk, with a phalanx of NAND chips, their own controller(s) and other logic as well. They support software compression and deduplication, plus specially designed RAID logic that they claim reduces extraneous writes to just over 1 while providing RAID 6, dual-drive-failure equivalent protection.

Skyera's underlying belief is that, just as consumer HDAs took over from the big monster 14″ and 11″ disk drives in the 90's, sooner or later commercial NAND will take over from eMLC and SLC. And if one elects to stay with eMLC and SLC technology, one is destined to be one to two technology nodes behind. That is, commercial MLC (in USB sticks, SD cards, etc.) is currently manufactured with 19nm technology, while eMLC and SLC NAND technology is back at 24 or 25nm. But 80-90% of the NAND market is being driven by commercial MLC NAND. Skyera launched this past August.

Coming in second place was Arkologic, an all-flash NAS box using SSD drives from multiple vendors. In their case a fully populated rack holds about 192TB (raw?) with an active-passive controller configuration. The main concern I have with this product is that all their metadata is held in UPS-backed DRAM (??), with up to 128GB of DRAM in the controller.

Arkologic's main differentiation is supporting QoS on a file system basis and having some connection with a NIC vendor that can provide end-to-end QoS. The other thing they have is a new RAID-AS, which is specially designed for flash.

I just hope their UPS is pretty hefty and that they don't sell the system someplace where power is very flaky, because when that UPS gives out, kiss your data goodbye, as the metadata is held nowhere else – at least that's what they told me.

Cloud storage startup of the show

There was more cloud stuff going on at the show; I talked to at least three or four cloud gateway providers. But the cloud startup of the show had to be Egnyte. They supply storage services that span cloud storage and on-premises storage, with an in-band or out-of-band solution, and provide file synchronization services for file sharing across multiple locations. They have some hooks into NetApp and other major storage vendors' products that allow them to be out-of-band in those environments, but they would need to be in-band for other storage systems. It seems an interesting solution that, if successful, may help accelerate the adoption of cloud storage in the enterprise, as it makes it transparent whether the storage you access is local or in the cloud. How they deal with the response time differences is another question.

Different idea startup of the show

The new technology showplace had a bunch of vendors, some of which I had never heard of before, but the one that caught my eye was Actifio. They were at VMworld but I never got time to stop by. They seem to be taking another shot at storage virtualization, only in this case, rather than focusing on non-disruptive file migration, they are taking on the task of doing a better job of point-in-time copies for iSCSI and FC attached storage.

I assume they sit in the middle of the data path in order to do this, and they seem to be using copy-on-write technology for point-in-time snapshots. Not sure where this fits, but I suspect SME and maybe up to mid-range.

Most enterprise vendors solved these problems a long time ago, but at the low end it's a little more variable. I wish them luck; although most customers use snapshots if their storage has them, those that don't seem unable to understand what they are missing. And then there's the matter of being in the data path?!

~~~~

If there was a hybrid storage startup at the show, I must have missed them. I did talk with Nimble Storage, and they seem to be firing on all cylinders. Maybe someday we can do a deep dive on their technology. Tintri was there as well, in the new technology showcase; we talked with them earlier this year at Storage Tech Field Day.

The big news at the show was Microsoft purchasing StorSimple, a cloud storage gateway/cache. Apparently StorSimple did a majority of their business with Microsoft's Azure cloud storage, so the deal seemed to make sense to everyone.

The SNIA suite was hopping as usual, and the venue seemed to work well, although I would say the exhibit floor and lab area were a bit too big. But everything else seemed to work out fine.

On Wednesday, the CIO of Dish talked about what it took to completely transform their IT environment from a management and leadership perspective. It seemed like an awfully big risk, but they were able to pull it off.

All in all, SNW is still a great show to learn about storage technology, at least from an end-user perspective. I just wish more of the large vendors would return, but alas, that seems to be a dream for now.

vSphere 5.1 storage enhancements and future vision

We discussed last year's vSphere 5 storage changes in a previous post. At last week's VMworld 2012 in San Francisco, VMware announced a few new enhancements for vSphere 5.1 but showed more of their vision for the future of storage in VMware environments.

vSphere 5.1's storage enhancements were not as significant as last year's. Specifically, the vSphere 5.1 storage-oriented changes include:

  • VDP – vSphere Data Protection is a new agentless, deduplicating backup solution from VMware (and EMC), now bundled into vSphere and free for all users at the Essentials+ level and above. VDP is based on EMC's Avamar Virtual Edition and provides a new integrated data protection management tab in the vCenter Operations Manager GUI. VDP replaces VDR.
  • vMotion changes – vMotion now supports non-shared storage and, specifically, VSA storage environments. To do this, vMotion first performs a standard storage vMotion to move the data to the new location before the VM vMotion to the targeted host takes place.
  • vSphere replication auto-failback with SRM – SRM 5.1 now supports automated failback for the vSphere replication service. SRM 5 supported automated failback for storage-array-based replication but had no support for the then newly announced VMware host-based replication service. This has been rectified with SRM 5.1.
  • SRM packaging changes – SRM Standard now comes at no additional charge with the vCloud Suite Standard license option. And a new entry-level SRM (for 6 CPUs, 3 hosts) comes with Essentials+ to match and provide DR services for VSA environments.

VMware storage vision

VMware took the opportunity to discuss their vision for future offerings in the storage arena. Specifically:

  • Virtual Volumes (vVols) – vVols will become the new de facto standard unit of granularity and abstraction for storage systems, providing a new allocation unit behind VMDKs and eliminating VMFS. vVols are intended to define a new interface between vSphere and networked storage systems so that an individual VMDK can be replicated, snapshotted, cloned, etc. without impacting other VMDKs on the storage system. vVols are intended to replace the LUNs and/or files previously used as holding containers for VMDKs, and should eliminate the mess of having to define 1000s of LUNs to support VDI or cloud data center implementations.
  • Virtual flash – VMware's first internal support for server-side flash. VMware will now be able to partition and allocate the flash on PCIe cards to VMs executing in the ESX server, just as physical memory and vCPUs are today. VMware will also be able to copy flash cache contents when vMotioning VMs to other physical servers. The intent is to fully support PCIe flash cards under vMotion by warm-starting the flash cache in the target server, bringing fast-access storage closer to VMs.
  • VSAN – also called distributed storage, this takes VSA-like services and scales them out to support many more hosts/CPUs and networked storage. The ultimate goal here seems to be a shared, mid-tier, distributed storage system based on VMware DAS, which will better support vSphere execution and high availability. VSAN will provide compute and storage within the same host, and it's intended that VSAN be easier to configure, deploy and manage than current VM shared storage solutions.

Where are they going with all this?

I believe VMware is signaling an intent to get more involved in the storage arena. Last year's move with VSA now seems like just the beginning.

If examined together with their other thrusts for the virtual data center, it all starts to make sense. When these three future storage capabilities are in place, VMware should be better able to configure and support virtual cloud data centers (VCDs) built out of commodity servers, commodity storage and commodity networking gear. With all this in place, VCDs should be better able to compete with AWS and other cloud service providers.

The end of enterprise storage, …

I was talking with one IT analyst, Dr. Kevin McIsaac of IBRS in Australia, who feels that when these three capabilities start rolling out, it signals the beginning of the end of enterprise storage as we know it. He compares this to what happened to the specialized Unix servers (from HP, Sun, IBM, etc.), prominent at the end of the last century and early this century, with the introduction of VMware and commodity, high-performance Intel servers/microprocessor chips. Although these proprietary Unix servers still exist, they are no longer growing market share.

In Kevin's view, VMware is just following that playbook again, only this time with enterprise storage in their sights. Of course, the other side of this is the enterprise networking that starts to be commoditized by all the virtual networking capabilities VMware is rolling out with VXLAN and the Nicira integration. (Perhaps a subject for another post.)

… Not quite yet.

I understand his point and can't help but agree with parts of it, at least for low-end and potentially mid-tier storage. IMHO, however, enterprise storage vendors have a viable defense to all this, but it involves providing even more functionality, performance and capabilities than are available in their systems today.

I see it every time I look at my performance charts: anytime you get over 300 disk drives, storage sophistication matters more to performance than just throwing more hardware in the mix. For an example of this effect, check out my last post on SPC-2 performance correlations.

And of course, VMware might be straining its very profitable relationships with storage vendors such as Dell, HP, IBM, NetApp, EMC, etc., all of which today highlight and push VMware's virtualization solution throughout their partner communities. If those vendors decide to stop recommending VMware and start focusing on other virtualization offerings, this might also stall VMware's vision.

~~~~

In the end, I can't help but feel that in VMware's view their challenge, in the long run, will come from AWS, Google and other cloud service providers. Whatever they can do to better compete with this gaggle of cloud purveyors, the better they serve their enterprise customers. And ultimately that means more business for VMware. If enterprise networking and storage vendors have to adapt to that vision, then so be it.

Comments?

The end of NAND is near, maybe…

In honor of today's Flash Memory Summit conference, I give my semi-annual amateur view of competing NAND technologies.

I was talking with a major storage vendor today, and they said they were sampling sub-20nm NAND chips with 300 P/E cycles and a data retention period of under a week at room temperature. With those specifications, these chips almost can't get out of the factory with any life left in them.

On the other hand, the only sub-20nm (19nm) NAND information I could find online was for the new Toshiba THNSNF SSDs, with toggle MLC NAND that guarantees data retention of 3 months at 40°C. I could not find any published P/E cycle specifications for the NAND in that drive, but presumably it is at most equivalent to, or somewhat below, the P/E cycles of their prior generation 24nm NAND. (Of course, I couldn't find P/E cycle specifications for that generation either, but similar technology in other drives seems to offer 3000 native P/E cycles.)

Intel-Micron, SanDisk and others have all recently announced 20nm MLC NAND chips with P/E cycles around 3K to 5K.

Nevertheless, as NAND chips go beyond their rated P/E cycle counts, NAND bit errors increase. With a more powerful ECC algorithm in SSD and NAND controllers, one can still correct the data coming off the NAND chips. However, at some point beyond 24-bit ECC this probably becomes unsustainable. (See the interesting post by NexGen on ECC capabilities as NAND die sizes shrink.)
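To see why stronger ECC eventually stops helping, here's a rough back-of-the-envelope sketch (my own arithmetic, not NexGen's): if bit errors are independent, a codeword becomes uncorrectable whenever more errors land in it than the code can fix. The codeword size and raw bit error rates below are illustrative assumptions only:

```python
# Sketch: probability a codeword is uncorrectable given a raw bit
# error rate (raw_ber) and an ECC that corrects up to t bit errors.
# Codeword size and BER values are illustrative assumptions.
from math import comb

def uncorrectable_rate(raw_ber: float, bits: int, t: int) -> float:
    """P(more than t bit errors in a codeword of `bits` bits)."""
    correctable = sum(comb(bits, k) * raw_ber**k * (1 - raw_ber)**(bits - k)
                      for k in range(t + 1))
    return 1 - correctable

CODEWORD = 512 * 8  # assume one 512-byte sector protected per codeword
for ber in (1e-3, 5e-3, 1e-2):
    print(f"raw BER {ber:.0e}: uncorrectable rate "
          f"{uncorrectable_rate(ber, CODEWORD, t=24):.3g}")
```

The point is the cliff: once wear pushes the expected error count per codeword near the correction limit (here, around 24), the uncorrectable rate jumps from negligible to near-certain, and a few more bits of correction only move the cliff slightly.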

I'm not sure how to bridge the gap between the 3-5K P/E cycles above and the 300 P/E cycles being seen by the storage vendor, but it may be a function of prototype vs. production technology, and possibly the chips had other characteristics the vendor was interested in.

But given the declining endurance of NAND below 20nm, some industry players are investigating other solid state storage technologies to replace it; MRAM, FeRAM, PCM and ReRAM are all current contenders, at least from a research perspective.

MRAM is currently available in small capacities from Everspin and elsewhere, but it hasn't yet achieved densities on the order of today's NAND technologies.

ReRAM is starting to emerge in low-power applications as a substitute for SRAM/DRAM, but it's still early yet.

I haven't heard much about FeRAM, other than last year, when researchers at Purdue invented a new non-destructive-read FeRAM they call FeTRAM. Standard FeRAMs are already in commercial use from Ramtron and others, albeit in limited applications; density is still a hurdle and write performance is a problem.

Recently the PCM approach has heated up, as PCM technology has been commercially released by Micron. Yes, the technology has a long way to go to catch up with NAND densities (it's available at 45nm technology), but it's yet another start down a technology pathway: building volume while researching ways to reduce cost, increase density and generally improve the technology. In the meantime, I hear it's an order of magnitude faster than NAND.

Racetrack memory, a form of MRAM using nanowires to store multiple bits, isn't standing still either. Last December, IBM announced they had demonstrated Racetrack memory chips in their labs. With this milestone, IBM has shown how a complete Racetrack memory chip could be fabricated on CMOS technology lines.

Also, in the same press release on recent research results, IBM announced a new technique to construct CMOS-compatible graphene devices on a chip. As we have previously reported, another approach to replacing standard NAND technology uses graphene transistors to replace the storage layer of NAND flash. Graphene NAND holds the promise of increased density with much better endurance, retention and reliability than today's NAND.

So as of today, NAND is still the king of solid state storage technologies but there are a number of princelings and other emerging pretenders, all vying for its throne of tomorrow.

Comments?

Image: 20 nanometer NAND Flash chip by IntelFreePress

Storage utilization is broke

Storage virtualization is a great concept, but it has some serious limitations. One that comes to mind is the way we measure storage utilization.

Just look at what made VMware so successful. As far as I can see, it was mainly the fact that x86 servers were vastly underutilized, sometimes achieving only single-digit utilization with normal applications. This meant you could potentially run 10 or more of these workloads on a single server. Then dual-core, quad-core and eight-core processor chips started showing up, which made non-virtualized systems seem even more of a waste.

We need a better way to measure storage utilization in order to show customers one of the ways storage virtualization can help.

Although the storage industry talks a lot about storage utilization, it really means capacity utilization. There is no sense, no measurement, no idea of what the performance utilization of a storage system is.

There is one storage startup looking at performance utilization – NexGen, with its hybrid SSD-disk storage system – but from the standpoint of partitioning out system performance to different applications, not as a better way to measure system utilization.

Historical problems with storage performance utilization

I think one problem may be that it's much harder to measure storage performance utilization. With a server processor it's relatively easy to measure idle time; one just needs someplace in the O/S to start clocking idle time whenever the server has nothing else to do.

But it's not so easy in storage systems. Yes, there are still plenty of idle loops, but they can be used to wait until a device delivers or accepts some data, and in that case the storage system is not "technically" idle. On the other hand, when a storage system is actually waiting for work, that is "true" idle time.

Potential performance utilization metrics

From a storage performance utilization perspective, I see at least three different metrics:

  • Idle IO time – this is probably closest to what standard server utilization looks like. It could be accumulated during intervals when no IO is active on the system. Its complement, Busy IO time, would be accumulated whenever IO activity is present in the storage (from the storage server[s] perspective). The sad fact is that plenty of storage systems measure something akin to Idle IO time, but seldom report on it in any sophisticated manner.
  • Idle IOP time – this could be based on some theoretical IOPS rate the system could achieve in its present configuration; anytime activity was below that level, the system would accumulate Idle IOP time. The target doesn't have to be 100% of its rated IOPS performance; it could be 75%, 50% or even 25% of its configuration-dependent theoretical maximum. But whenever IOPS dropped below this rate, the system would start counting Idle IOP time. Its complement, Busy IOP time, would be counted anytime the system exceeded the targeted IOPS rate.
  • Idle Throughput time – this could be based on some theoretical data transfer rate the system, in its current configuration, is capable of sustaining; anytime throughput was less than this rate, the system would accumulate Idle Throughput time. Again, this doesn't have to be the maximum throughput of the storage system, but it needs to be some representative, sustainable level. Its counterpart, Busy Throughput time, would be accumulated anytime the system reached the targeted throughput level.

Either of the last two measures could be a continuous value rather than some absolute quantity. For example, if the targeted IOPS rate was 100K and the system achieved 50K IOPS over some time interval, then the Idle IOPS time would be the time interval times 50% (1 – 50K IOPS achieved/100K targeted IOPS).

To calculate storage performance utilization, one would take the Idle IO, IOPS and/or Throughput time over a wall-clock interval (say 15 minutes) and average this across multiple time periods. Storage systems could chart these values to show end users the periodicity of their activity.
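To make the proposed calculation concrete, here's a minimal sketch in Python of the continuous Idle IOPS time metric and the resulting utilization figure. The 100K IOPS target and 15-minute interval are just the illustrative values used above, not anything a real storage system reports:

```python
# Sketch of the proposed Idle IOPS time metric; target rate and
# interval length are illustrative assumptions from the text above.
TARGET_IOPS = 100_000     # configuration-dependent targeted IOPS rate
INTERVAL_SECS = 15 * 60   # wall-clock measurement interval

def idle_iops_time(achieved_iops: float) -> float:
    """Idle IOPS time (seconds) accumulated over one interval."""
    idle_fraction = max(0.0, 1.0 - achieved_iops / TARGET_IOPS)
    return INTERVAL_SECS * idle_fraction

def iops_utilization(achieved_per_interval: list[float]) -> float:
    """Average busy fraction (0.0-1.0) across several intervals."""
    idle = sum(idle_iops_time(a) for a in achieved_per_interval)
    return 1.0 - idle / (INTERVAL_SECS * len(achieved_per_interval))

# E.g., three 15-minute intervals averaging 50K, 25K and 5K IOPS:
print(iops_utilization([50_000, 25_000, 5_000]))  # ~0.27, i.e. 27% busy
```

The throughput variant would look the same, with a targeted MB/sec rate swapped in for the IOPS target.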

This way, on a 15-minute basis, we could understand the busy-ness of a storage system. And if we found that the storage was running at 5% IOPS or Throughput utilization most of the time, then implementing storage virtualization on that storage system would make a lot of sense.

Problems with proposed metrics

One problem with the foregoing is that IOPS and throughput rates vary tremendously depending on storage system configuration as well as the type of workload the system encounters; e.g., 256KB blocks vs. 512-byte blocks can have a significant bearing on the IOPS and throughput rates attainable by any storage system.

There are solutions to these issues, but they all require more work in development, testing and performance modeling. This may argue for the simpler Idle IO time metric, but I prefer the other measures, as they provide more accurate and continuous data.

~~~~

I believe metrics such as these would be a great start toward supplying the information IT staff need to understand how storage virtualization would benefit an organization.

There are other problems with the storage virtualization capabilities present today, but those must be subjects for future posts.

Image: Biblioteca José Vasconcelos / Vasconcelos Library by * CliNKer *

SPECsfs2008 NFS SSD/NAND performance, take two – chart-of-the-month

SCISFS120623-010(002) (c) 2012 Silverton Consulting, Inc. All Rights Reserved

For some time now I have been experimenting with different approaches to normalizing IO activity (in the chart above, it's NFS throughput operations per second) for systems that use SSDs or Flash Cache. My previous attempt (see the prior SPECsfs2008 chart-of-the-month post) normalized based on the GB of NAND capacity used in a submission.

I found the previous chart somewhat lacking, so this quarter I decided to use the SSD device and/or Flash Cache card count instead. This approach is shown in the chart above. Funny thing: although the rankings were exactly the same between the two charts, one can see significant changes in the magnitudes achieved, especially in the relative values between the top two rankings.

For example, the Avere FXT 3500 result still came in at number one, but whereas here they achieved ~390K NFS ops/sec/SSD, on the prior chart they obtained ~2000 NFS ops/sec/NAND-GB. More interesting was the number two result: here the NetApp FAS6240 with a 1TB Flash Cache card achieved ~190K NFS ops/sec/FC-card, but on the prior chart they only hit ~185 NFS ops/sec/NAND-GB.

That means that under this normalization the Avere is about 2X more effective than the NetApp FAS6240 with the 1TB Flash Cache card, whereas on the prior chart it was 10X more effective in ops/sec/NAND-GB. I feel this is getting closer to the truth, but it's not quite there yet.
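For what it's worth, the relative-effectiveness figures above are just ratios of the normalized values; here's a trivial sketch using the approximate numbers quoted in this post (not exact SPECsfs2008 submission data):

```python
# Approximate normalized values quoted above, per submission.
avere  = {"ops_per_device": 390_000, "ops_per_nand_gb": 2_000}
netapp = {"ops_per_device": 190_000, "ops_per_nand_gb": 185}

print(avere["ops_per_device"] / netapp["ops_per_device"])    # ~2.1X per SSD/card
print(avere["ops_per_nand_gb"] / netapp["ops_per_nand_gb"])  # ~10.8X per NAND-GB
```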

We still have the problem that all the SPECsfs2008 submissions that use SSDs or Flash Cache also have disk drives, as well as (sometimes significant) DRAM cache, in them. So a pure SSD normalization may never suffice for these systems.

On the other hand, I have taken a shot at normalizing SPECsfs2008 performance for SSD/NAND, disk devices and DRAM caching as one dimension in a ChampionsChart™ I use for a NAS Buying Guide, for sale on my website. If you're interested in seeing it, drop me a line, or better yet, purchase the guide.

~~~~

The complete SPECsfs2008 performance report went out in SCI's June newsletter. A copy of the report will be posted on our dispatches page sometime next month (if all goes well). However, you can get the SPECsfs2008 performance analysis now, and subscribe to future free newsletters, by using the signup form above right.

For a more extensive discussion of current NAS or file system storage performance covering SPECsfs2008 (Top 20) results and our new ChampionsChart™ for NFS and CIFS storage systems, please see SCI’s NAS Buying Guide available from our website.

As always, we welcome any suggestions or comments on how to improve our analysis of SPECsfs2008 results or any of our other storage performance analyses.


NetApp Analyst Summit Customer Panel – how to survive a category 5 tornado

NetApp had three of their customer innovation winners come up on stage for a panel discussion, with Dave Hitz moderating. All three had interesting deployments of NetApp storage systems:

  • Andrew Henderson from ING DIRECT talked about their need to deploy copies of the bank's IT environment for test, development, optimization and security testing. This process took 12 weeks the first time they tried it, and it created only a single copy. They wanted to speed this up and be able to deploy 10 or more copies if necessary. Andrew looked at Microsoft Hyper-V, System Center and NetApp FlexClones and transformed this process so it now generates a copy of the entire bank's IT services in under 10 minutes. And since the new capabilities have been in place, they have created over 400 copies of the bank (he called these bank-in-a-box) for various purposes.
  • Teresa Wahlert from the Iowa Workforce Development Agency was up next and talked about their VDI implementation. Iowa cut the agency's budget, which forced them to shut down a number of physical offices. But with VDI, VMware and NetApp storage, Workforce was able to disperse their services to over 3000 locations, now in prisons, libraries and other venues where they had no presence before. They put out a general call for all the tired, dying PCs in Iowa government and used these to host VDI services. Now Workforce services are available 24X7 at these locations – pretty amazing for government work. Apparently they had tried VDI before and their previous storage couldn't handle it. They moved to NetApp with Flash Cache and it worked just fine. That's when they rolled out VDI services to their customers and businesses. With NetApp they were able to implement VDI, reduce storage costs (via deduplication and other storage efficiency features) and increase department services.
  • Jeff Bell of Mercy Healthcare talked about the difficulties of rolling out electronic health records (EHR) and the challenges of integrating ~30 hospitals and ~400 medical clinics. They started with EHR fairly early (2006-2007), well before the latest governmental push. He mentioned Joplin, MO and last year's category 5 tornado, which about wiped out their hospital there. He said that within 2 hours of the disaster, Mercy Healthcare was printing out the EHRs of the 183 patients present in the hospital at the time, who had to be moved to other care facilities. The promise of EHR is that the information travels with the patient, can be recovered in the event of a disaster and is immediately available. It seems that, at least at Mercy Healthcare, EHR is living up to its promise. In addition, they just built a new data center, as they were running out of space, power and cooling at the old one. They installed new NetApp storage there, and for the first few months had to run heaters to keep the data center livable because the new power/cooling load was so far below what they had experienced previously. Looking back on what they had accomplished, Jeff was not so sure they would build a new data center again. With new cloud offerings coming out and the reduced power/cooling and increased density of NetApp storage, they could almost get by without another data center at all.

That’s about it from the customer session.

NetApp execs spent the rest of the day on innovation, mostly at NetApp but also in the IT industry in general.

There was lots of discussion of the new release of Data ONTAP 8.1.1, with its latest cluster-mode features. NetApp positioned it as completing the transition to data/storage as an infrastructure that IT has been pushing toward for the last decade or so, following in the grand tradition of what IBM did for computing infrastructure with the 360 and what Cisco and others did for networking infrastructure in the mid-80's.

Comments?

Dell Storage Forum 2012 day 1

At the Dell Storage Forum today in Boston, they announced:

  • A new M4110 EqualLogic blade storage system that fits in Dell's M1000e blade chassis. Each M4110 blade is equivalent to a dual-controller P4000 EqualLogic storage system. Up to 4 M4110 controller blades can be configured within the same M1000e chassis. The drive storage is also configured as a blade, with 14 1TB drives on a card; you can have up to 4 of these (56 drives/56TB of storage) connected to the EqualLogic storage blades. The M4110 storage blades can be peered with external P6000 EqualLogic storage systems if you need more expansion. Also announced today was a 1/4-blade form factor for Dell blade servers, so within a single M1000e chassis one can have enough storage (24TB) and compute (24 compute nodes, 384 cores) to support up to 384 VMs in a single blade chassis.
  • vStart 1000 for converged infrastructure, with Compellent storage, Force10 networking and PowerEdge M620 servers that can support up to 1000 VMs in a single rack, all in a pre-tested, pre-integrated solution. The vStart 1000 joins the already available vStart 50, vStart 100 and vStart 200, all based on EqualLogic storage.

More to come …


IBM Edge 2012 Day 1 part A – new Ultra drawer, integrated real-time compression and more

Brian got up and talked about the Smarter Storage initiatives coming out of STG. A couple of items of specific interest:

  • IBM's new Ultra drawer – apparently a server-side all-flash array; it's unclear if this is a shared storage device, but I would think so.
  • EasyTier cross-domain support – in conjunction with the Ultra drawer rollout, Brian mentioned that EasyTier will support multiple domains (tiering outside of just shared storage?!).
  • Virtual Storage Center – a new administration capability targeted at private and public cloud services, which supports a storage service catalog and more self-service provisioning.
  • Real-time data compression – for SVC and Storwize V7000, a new (integrated) storage data compression capability based on LZ compression, providing real-time compression of primary active data. This will be integrated with other IBM storage over time.


That's about it from the keynote, other than that the electronic strings music was awesome…
