EMCworld 2013 Day 3

Rich Napolitano, President of the Unified Storage Division, got up and brought some of his long-time engineers on stage to demonstrate technology they had running in their labs.

  • First up was a dual-controller system with two 8-core processors per controller (32 cores in all) running against an all-SSD backend. The configuration was only on screen for a short time but it looked like 96 SSDs, so effectively an all-flash VNX array.  They used Iometer with random 8KB IO to drive almost 975K IOPS at sub-msec. response time, and hit 1M IOPS at just slightly above 1 msec. response time. You could see the utilization of the 32 cores going up as the workload reached higher levels.  I couldn’t see precisely, but all the cores were running at ~70-80% busy at the 1M IOPS level and it seemed like system performance was entering the knee of the curve.
  • Next up was the new VNX data app store demonstration, similar to the iPhone and Android app stores. EMC has identified a select set of apps that can be run directly on VNX hardware. The current demonstration had two versions of anti-virus, RecoverPoint Virtual Appliance (vRPA), (v?)VPLEX, CloudAccess and MySQL server.  The engineers showed how AV software could be installed and run on the VNX, as well as how vRPA could be installed to provide onboard replication services.
  • Then, they demonstrated a VNX virtual appliance (vVNX?) which was able to run on a white box server which I think was running ESX.  In this case, vVNX was running with onboard DAS storage but had all the advanced functionality of VNX.
  • Finally, they showed a vVNX running in a cloud services environment. Not sure if this was VMware vCloud or some other compute cloud but Rich stated that they will support many clouds.  With vVNX running in the cloud accessing storage behind the compute engine it’s unclear what the performance would be and how one would access the storage (file or iSCSI no doubt) but it did open up new possibilities as to where one could run VNX services.
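
As a rough sanity check on the all-flash numbers in the first bullet above, Little’s law (IOs in flight = IO rate × response time) gives a feel for the concurrency and bandwidth involved. This is only a sketch: the exact 1 msec response time and the reading of 8KB as 8192 bytes are my assumptions, since the demo only showed “slightly above 1 msec”.

```python
# Back-of-the-envelope check of the all-flash demo numbers using Little's law.
# Assumptions (mine, not EMC's): response time of exactly 1 msec at 1M IOPS,
# and "8KB" meaning 8192 bytes.

def outstanding_ios(iops, response_time_s):
    """Little's law: average IOs in flight = IO rate x time each IO spends in the system."""
    return iops * response_time_s


def bandwidth_gb_per_s(iops, io_size_bytes=8192):
    """Aggregate data rate in decimal GB/s for a given IO rate and IO size."""
    return iops * io_size_bytes / 1e9


def iops_per_core(iops, cores=32):
    """IO rate each core would handle if the work were spread evenly."""
    return iops / cores


if __name__ == "__main__":
    print(f"IOs in flight at 1M IOPS: {outstanding_ios(1_000_000, 0.001):.0f}")
    print(f"Bandwidth at 1M IOPS:     {bandwidth_gb_per_s(1_000_000):.2f} GB/s")
    print(f"IOPS per core:            {iops_per_core(1_000_000):.0f}")
```

Under those assumptions the array would need roughly 1,000 IOs in flight, moving a bit over 8 GB/s, with each of the 32 cores handling about 31K IOPS.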

The next iteration of VNX software appears focused on three things: taking advantage of multi-core processing (called MCx) to boost storage system performance, providing a virtualized environment within the VNX engine to run specialized data services, and supplying new vVNX functionality that can be deployed just about anywhere you would want.

That’s all for the public sessions; I spent much of the rest of the day in NDA sessions.

I had a good time at EMCworld 2013, seeing old friends again and meeting new ones and thank EMC for inviting me.  For information on previous days at EMCworld 2013 please see my Day 1 and Day 2 posts.

Posted in Block Storage, Disk storage, Distributed computing, File Storage, Server virtualization, SSD storage, Storage architecture, Storage Features, Storage performance, Strategic Inflection Points

EMCworld 2013 Day 2

The first session of the day was with Joe Tucci, EMC Chairman and CEO.  He talked about the trends transforming IT today. These include Mobile, Cloud, Big Data and Social Networking. He then discussed IDC’s 1st, 2nd and 3rd computing platform framework, where the first was mainframe, the second was client-server and the third is mobile. Each of these platforms had winners and losers.  EMC definitely wants to be one of the winners in the coming age of mobile and they are charting multiple paths to get there.

Mainly they will use Pivotal, VMware, RSA and their software defined storage (SDS) product to go after the 3rd platform applications.  Pivotal becomes the main enabler to help companies gain value out of the mobile-social networking-cloud computing data deluge.  SDS helps provide the different pathways for companies to access all that data. VMware provides the software defined data center (SDDC) where SDS, server virtualization and software defined networking (SDN) live, breathe and interoperate to provide services to applications running in the data center.

Joe started talking about the federation of EMC companies. These include EMC, VMware, RSA and now Pivotal. He sees these four brands as almost standalone entities whose identities will remain distinct and separate for a long time to come.

Joe mentioned the internet of things, or the sensor cloud, as opening up new opportunities for data gathering and analysis that dwarf what’s coming from mobile today. He quoted an IDC estimate that by 2020 there will be 200B devices connected to the internet; today there are just 2 to 3B.

Pivotal’s debut

Paul Maritz, Pivotal CEO, got up and took us through the Pivotal story. Essentially they have three components: a data fabric, an application development fabric and a cloud fabric. He believes mobile and the internet of things will open up new opportunities for organizations to gain value from their data, wherever it may lie, that go well beyond what’s available today. These activities center around consumer grade technologies which 1) store and reason over very large amounts of data; 2) use rapid application development; and 3) operate at scale in an entirely automated fashion.

He mentioned that humans are a serious risk to continuous availability. Automation is the answer to the human problem for the “always on”, consumer grade technologies needed in the future.

Parts of Pivotal come from VMware, Greenplum and EMC, with some components available today. However, by year end they will come out with Pivotal One, which will be the first framework with the data, app development and cloud fabrics coupled together.

Paul called Pivotal Labs the special forces of his service organization, helping leading tech companies pull together the awesome apps needed for the technology of tomorrow using Extreme Programming, Agile development and very technically astute individuals.  Also, CETAS was mentioned as an analytics-as-a-service group that provides analytics capabilities to gaming companies doing log analysis, but Paul believes there’s a much broader market coming.

Paul also showed some impressive numbers on their new Pivotal HD/HAWQ offering, which handled many more queries than Hive and Cloudera/Impala. In essence, parts of Pivotal are available today but later this year the whole cloud-app dev-big data framework will be released for the first time.

Next up was a media-analyst event where David Goulden, EMC President and COO, gave a talk on where EMC has come from and where they are headed from a business perspective.

Then he and Joe did a Q&A with the combined media and analyst community.  The questions were mostly on the financial aspects of the company rather than their technology, but there will be a more focused Q&A session tomorrow with the analyst community.

Joe was asked about Vblock status. He said last quarter they announced it had reached a $1B revenue run rate, which he said was the fastest in the industry.  Joe mentioned EMC is all about choice, such as the different Vblock product offerings, the VSPEX product offerings and now ViPR providing more choice in storage.

Sometime today Joe had mentioned that they don’t really do custom hardware anymore.  He said of the 13,000 engineers they currently have ~500 are hardware engineers. He also mentioned that they have only one internally designed ASIC in current shipping product.

Then Paul got up and did a Q&A on Pivotal.  He believes there’s definitely an opportunity in providing services surrounding big data and specifically mentioned CETAS as offering analytics-as-a-service, as well as the Pivotal Labs professional services organization.  Paul hopes that Pivotal will be a $1B revenue company in 5 years.  They already have $300M, so it’s well on its way.

Next, there was a very interesting, visually stimulating media and analyst session from Jer Thorp, co-founder of The Office for Creative Research. About the best way to describe him is as a data visualization scientist.

He took a NASA Kepler research paper with very dry data and brought it to life. He also did a number of analyses of public Twitter data and showed Twitter user travel patterns, a Twitter “good morning” analysis, retweets of NYT articles, etc.  He also showed a video depicting people on airplanes around the world. He said it is a little known fact but over a million people are in the air at any given moment of the day.

Jer talked about the need for data ethics and an informed data ownership discussion with people about the breadcrumbs they leave around in the mobile connected world of today. If you get a chance, you should definitely watch his session.

Next, Juergen Urbanski, CTO of T-Systems, got up and talked about the importance of Hadoop to what they are trying to do. He mentioned that in 5 years, 80% of all new data will land on Hadoop first.  He showed how Hadoop is entirely different from what went before and will take T-Systems in vastly new directions.

Next up in the EMCworld main hall was the keynote from Pat Gelsinger, VMware CEO.  The story was all about the Software Defined Data Center (SDDC) and the components needed to make this happen.  He said data was the fourth factor of production behind land, capital and labor.

Pat said that networking was becoming a barrier to the realization of SDDC and that they had been working on it for some time prior to the Nicira acquisition. But now they are hard at work merging the organic VMware development with Nicira to create VMware NSX, a new software defined networking layer that will be deployed as part of the SDDC.

Pat also talked a little bit about how ViPR and other software defined storage solutions will provide the ease of use they are looking for to be able to deploy VMs in seconds.

Pat demoed a solution specifically designed for Hadoop clusters and was able to configure a Hadoop cluster with about 4 clicks and have it start deploying. It was going to take 4-6 minutes to get it fully provisioned, so they had a couple of clusters already configured. They ran a pseudo Hadoop benchmark on them using visual recognition and showed how vCenter could be used to monitor the cluster in real time.

Pat mentioned that there are over 500,000 physical servers running Hadoop. Needless to say VMware sees this as a prime opportunity for new and enhanced server virtualization capabilities.

That’s about it for the major keynotes and media sessions from today.

Tomorrow looks to be another fun day.

Posted in Cloud services, desktop virtualization, Information economy, Internet traffic, Mobile computing, R&D measures, Server virtualization, Software Defined Network, Visionary leadership

EMCworld 2013 day 1

Lines for coffee at the Cafe were pretty long this morning and I skipped breakfast to get some work done. But I eventually made my way to the press room and got some food and coffee.

Spent the morning in Analyst sessions mostly under NDA but it seems safe to say that EMC sees plenty of opportunity ahead.

The first session, a Q&A with BRS executives and customers, was enlightening. The main message from the customers was that data protection is hard, legacy systems often can’t adjust quickly enough and sometimes a completely new architecture is warranted. The executives were upbeat about current BRS business and where they were headed in the future.

The rest of the morning was with Jeremy Burton, EVP Product, Operations and Marketing, and John Roese, the new SVP and CTO of EMC (6 months on the job). Jeremy talked about an IDC insight that there’s a new world emerging of so-called 3rd platform applications, based on mobile and consumer grade technology, with literally billions of users and millions of apps built on mobile-cloud-bigdata-social infrastructure, which complements the 2nd platform built on LAN/WAN, client-server frameworks.

For an example of this environment Jeremy mentioned that AT&T provisions 12PB of storage a month.

What’s needed for this new platform is a new type of storage built for the 3rd platform but taking advantage of current enterprise storage characteristics.  This is ViPR (more on that later).

John comes by way of Huawei, Nortel and myriad others and offers broad insight into the way forward for EMC. It looks like a bright future ahead if they can do half of what John has outlined.

John talked about the intersections between the carrier market (or services), enterprise IT and consumer market.  There is convergence between these regions and at each of these intersections new technology is going to answer many of the problems which exist. For instance in the carrier space:

  • The amount of information carriers gather is frightening; they know everything about you. Pivotal will be the key here because 1) it can correlate information across different information sources (most carriers have a whole bunch of disparate information stores); and 2) it’s not just focused on Big Data as a non-realtime problem but provides realtime analytics as well.
  • Capital costs are going down but $/bit is going way down.  VMware and the software defined data center are the right way to drive down costs.  Today servers are ~50% virtualized but networking is not virtualized at all.
  • Customers are dissatisfied with service providers (carriers).  Again Pivotal is key here. One carrier customer was focused on customer churn and trying to figure out how to minimize it. They used GemFire’s high speed infrastructure to watch all transactions on the cell tower infrastructure, pick out dropped calls, send them to Greenplum, correlate them with the customer’s attributes (good or bad) and, within 100 msec, supply an interaction with the customer to apologize and offer some services to make it better.
  • The internet is the new wild west: use at your own risk, spoofed websites, email that could be from anyone, security chaos. RSA can become the trusted internet provider by looking at the internet holistically, combining information from many customers, and aggregating and sharing these interactions to determine the trust of every transaction. Trust is becoming a new big data problem.
  • Hybrid and public cloud is their biggest opportunity but they don’t know how to attack it. VMware and SDDC will evolve to provide orchestrated movement from private to public and closed to open.

The thinking seems pretty straightforward given what they are trying to accomplish and the framework he applied to EMC’s strategy going forward made a lot of sense.

Brian Gallagher did a keynote on new enterprise storage functions and features, which covered VMAX, VPLEX, RecoverPoint, and XtremIO/SF/SW. He mentioned the RecoverPoint virtual appliance and made sort of a statement of direction on being able to move application functionality directly onto VMAX. He kind of demoed this with VPLEX running on VMAX.

He also talked about FAST speed of reaction versus the competition, mentioned that FAST provides information about the storage tiering to up to 4 different VMAX arrays. Showed a comparison of VMAX 10K against another prime competitor that looked downright embarrassing.  And talked about VMAX cloud edition.

After that came 1-on-1 meetings, all under strict NDA. Then came the big keynote on ViPR with Jeremy again and David Goulden, President and COO. They have implemented software defined storage (SDS).  Last week I did a post on SDS trying to lay out some of the problems and promises of SDS (please see The promise of SDS post).

But what I missed was the data path transformation that ViPR can do to provide object and HDFS access to traditional and commodity storage systems.  ViPR starts out primarily in the control layer providing automated provisioning, self management, across heterogeneous storage pools. With ViPR one can define virtual storage arrays and then configure virtual storage pools across those arrays regardless of the physical infrastructure underneath them.

More on ViPR in a separate post, but suffice it to say EMC has been working on this for a while now. How it’s positioned with VPLEX and the other storage virtualization capabilities in VMAX and other products is another matter, but it seems they are carving out a space for ViPR between and above the current storage solutions.

End of day one is in the Expo and then cocktail parties… stay tuned for day 2.

 

Posted in Block Storage, Cloud services, Cloud storage, DAS, Disk storage, Distributed computing, File Storage, Information economy, Market dynamics, Mobile computing, Object storage, Server virtualization, Software Defined Network, SSD storage, Storage, Storage architecture, Storage virtualization, Strategic Inflection Points, System effectiveness

The promise of software defined storage

(c) 2012 Silverton Consulting, Inc. All rights reserved

Not sure why but all the hype around software defined storage seems to be reaching a crescendo.  Possibly it’s due to conference season coming up, but it started earlier this year.  I attended an SNW analyst session on software defined storage that had on its panel technical people from HDS, IBM, DataCore and VMware.  It seems the distinction between storage virtualization and software defined storage is getting slimmer every time we talk about it.  I have written before about software defined storage (see my Data Hypervisor post).

Server, networking and storage virtualization today

Server virtualization makes an awful lot of sense, has made lots of money and has arguably been around for decades now, especially in mainframe systems.  Servers have so much power today that dedicating one to a single workload just doesn’t make any sense anymore.

Network virtualization from OpenFlow and others also makes a lot of sense (see OpenFlow the next wave in networking and OpenFlow part 2, Cisco’s response posts). Here we aren’t necessarily boosting network utilization as much as changing resource allocation to deal with altered traffic flows.  That and the fact that provisioning, monitoring and other management characteristics can now be under programmatic control from the user makes these systems very appealing, especially to organizations that exhibit varying network activity over time.

Storage virtualization has been around for a long time too and essentially places a storage system abstraction layer on top of a group of other, heterogeneous storage systems. This provides a number of capabilities, such as allowing data to be migrated from one storage system to another without host knowledge or intervention.  Other storage virtualization features include centralized management, common storage features, different storage personalities (protocols), etc. But just being able to migrate data from one storage system to another without host intervention or knowledge provides an awful lot of value, especially to large data centers which refresh technology frequently.

Software defined storage compared to server virtualization

Software defined storage seems to imply some ability to marry storage virtualization services to RESTful and other APIs which would allow programmatic storage provisioning, monitoring and management.  This would allow data centers to manage and control their storage without involving storage administrators in day-to-day activities.
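
To make the idea of programmatic provisioning a little more concrete, here is a minimal sketch of what building such a RESTful request might look like. Everything in it is hypothetical: the base URL, the `/pools/<pool>/volumes` path and the JSON field names are made up for illustration and don’t correspond to any vendor’s actual API. The request is only constructed, never sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_provision_request(base_url, pool, name, size_gb, protocol="iscsi"):
    """Build (but do not send) a POST asking a storage service for a new volume.

    The endpoint layout and field names are hypothetical, for illustration only.
    """
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    body = {"name": name, "pool": pool, "size_gb": size_gb, "protocol": protocol}
    return Request(
        url=f"{base_url.rstrip('/')}/pools/{quote(pool)}/volumes",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # storage.example.invalid is a placeholder host, not a real service.
    req = build_provision_request(
        "https://storage.example.invalid/api/v1", "gold", "vol01", 100
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

The point is simply that an application or orchestration tool, rather than a storage administrator, could issue a call like this whenever it needs capacity.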

When I compare this to server virtualization the above described capabilities really don’t increase storage utilization much.  Yes, by automating provisioning or even running thin provisioning one can potentially boost storage capacity utilization but you really haven’t increased the IO utilization much by doing this.

Looking under the covers of most storage systems one might find that CPU cores are pretty idle, but data paths and storage devices are typically running flat out.  One problem is that today’s enterprise storage subsystems are already highly shared across applications and users.  So there is really no barrier to sharing these resources as widely as they can.   As such, storage system IOPS and/or bandwidth utilization is already pretty high.   I would say typical enterprise storage subsystem performance utilization usually runs above 30%, reaching 50% or more during peak periods. Increasing IOPS utilization much beyond that risks seriously impacting peak performance periods.

Now if somehow one could migrate slower data around a complex to lower performing storage when there’s no need for high performance, and higher performing data to higher performing storage when there is a need, then that could help increase performance utilization considerably.   But many storage systems already do this internally through automated storage tiering, and some can even do this across storage systems using storage virtualization.

But the underlying problem here is that it takes a lot of time, resources and effort to move TBs of data around a data center, especially when it’s doing other work.  So other than something akin to storage tiering across storage systems, we are unlikely to see much increase in storage performance utilization with a gaggle of multiple storage systems.  I suppose in the future moving TBs of data may take much less time & resources than today, but then the problem becomes moving PBs of data around.
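
For what it’s worth, the core decision in automated tiering can be sketched in a few lines: rank data extents by how hot they are and give the limited fast tier to the hottest ones. This is only an illustration under my own assumptions (the extent names, the access counts and a fast tier sized in whole extents are all made up); real tiering engines weigh recency, IO size, migration cost and much more.

```python
def assign_tiers(extent_heat, fast_capacity):
    """Place the hottest extents on the fast tier and everything else on the slow tier.

    extent_heat maps an extent name to its recent access count;
    fast_capacity is how many extents fit on the fast tier.
    """
    if fast_capacity < 0:
        raise ValueError("fast_capacity must not be negative")
    # Hottest first; ties broken by name so the result is deterministic.
    ranked = sorted(extent_heat, key=lambda extent: (-extent_heat[extent], extent))
    fast = set(ranked[:fast_capacity])
    return {extent: ("fast" if extent in fast else "slow") for extent in extent_heat}


if __name__ == "__main__":
    heat = {"e1": 900, "e2": 5, "e3": 450, "e4": 0}  # made-up access counts
    print(assign_tiers(heat, fast_capacity=2))
```

Rerunning a placement like this as access patterns shift is what moves “performance” toward whichever data currently needs it.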
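
A quick bit of arithmetic shows why moving data is the bottleneck. The sketch below estimates transfer time for a given amount of data over a link; the 8 Gb/s link speed and the 50% of bandwidth left over for migration are assumptions I picked for illustration, not measurements.

```python
def transfer_hours(data_tb, link_gbit_per_s, usable_fraction=0.5):
    """Hours needed to move data_tb (decimal TB) over a link of the given speed.

    usable_fraction is the share of link bandwidth available to the migration
    while the systems keep doing their normal work (an assumed figure).
    """
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    total_bytes = data_tb * 1e12
    bytes_per_second = link_gbit_per_s * 1e9 / 8 * usable_fraction
    return total_bytes / bytes_per_second / 3600


if __name__ == "__main__":
    print(f"10 TB: {transfer_hours(10, 8):.1f} hours")
    print(f"1 PB:  {transfer_hours(1000, 8) / 24:.1f} days")
```

With these made-up numbers, 10 TB takes about 5.6 hours and 1 PB takes about 23 days, which is why tiering at the extent level is far more practical than shuffling whole data sets between systems.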

Software defined storage compared to network virtualization

When I compare the above capabilities to network virtualization it doesn’t look very similar.   There’s really no way to change the storage performance to optimize it for one direction (or application) at this instant and then move storage performance around to another application a couple of hours later.  Yes, again automated storage tiering can do this, and yes some of these systems can tier across storage systems using storage virtualization but in general barring storage tiering there’s nothing like this available today.  

Maybe the data paths inside a storage system could somehow be programmatically reconfigured to offer, say, more internal bandwidth to the Device-to-Cache path vs. the Cache-to-Frontend path. Changing or reconfiguring data path resources like this could certainly optimize the internal performance of a storage system and would be a worthwhile feature of any software defined storage.  Knowing which is more important to one application and less important to all the others will take some smarts across the storage system and host O/S, but it’s certainly feasible.  So, with RESTful interfaces, APIs or application hints, data paths could be reconfigured on demand to support applications that are all vying for IO activity.

With these sorts of capabilities software defined storage starts to look a little more like software defined networking.

Software defined storage on its own

But in the end we always reach a fundamental limit of IO capabilities in today’s storage systems which is the devices. Yes you can have 2000 or more devices in high-end storage  today and yes you can have all-flash arrays. However, most storage systems are configured to keep whatever devices they have pretty busy as much of the time as possible.

Until we create some sort of storage device that can provide more performance than most applications can ever use, even when they are shared via a storage system, software defined storage capabilities will be limited.  Today’s SSDs have certainly boosted performance considerably but this just means that most applications that warrant all flash arrays are performing faster.  It just so happens that some applications can take all the performance you throw at them and still want more.

I suppose if SSD costs were to come down to match NL-SAS storage prices and still maintain the 100X faster IOPS rate, then maybe a storage system built on such devices could be more “software defined” than others.  And maybe that’s where everyone is headed, believing NAND/SSD price trends will drive costs down so much that everyone can have all the IOPS performance they will ever need out of a single storage system.

Yet this still just looks like the shared storage we have today, only more of it. So we return to our roots and see that software defined storage is just another way to add more storage sharing. Storage virtualization is nice, new more programmable storage systems are even better, but faster, cheaper storage devices are best of all.

So what we really need is much cheaper SSDs to realize the full promise of software defined storage.   In the meantime, opening up APIs and providing RESTful interfaces for programmatic provisioning, monitoring, managing and tuning of storage system data paths and other performance characteristics is all we can hope for.

Comments?

 

 

 

Posted in Server virtualization, Software Defined Network, Storage, Storage performance, Storage utilization, Storage virtualization, System effectiveness

Cheap phones + big data = better world

Facebook friend carrousel by antjeverena (cc) (from flickr)

I read an article today on the MIT Technology Review website (Big data from cheap phones) that shows how cheap phones, call detail records (CDRs) and other phone logs can be used to help fight disease and help understand disaster impacts.

Cheap phones generate big data

In one example, researchers took cell phone data from Kenya and used it to plot people’s movements throughout the country. What they were looking for was people who frequented malaria hot spots, so that they could try to intervene in the transmission of the disease. Researchers discovered one region (cell tower) that had many people frequenting a particularly bad location for malaria.  It turned out the region they identified had a large plantation with many migrant workers. These workers moved around a lot.  In order to reduce the transmission of the disease, public health authorities could target this region to use more bed nets or try to reduce infestation at the source of the disease.  In either case, people’s mobility was easier to see with cell phone data than by actually putting people on the ground and counting where people go to or come from.

In another example, researchers took cell phone data from Haiti before and after the earthquake and were able to calculate how many people were in the region hardest hit by the earthquake.  They were also able to identify how many people left the region and where they went.  As a follow on to this, researchers were able to show in real time how many people had fled the cholera epidemic.
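
The article doesn’t describe the researchers’ actual method, but the general shape of this kind of CDR analysis is easy to sketch: treat each subscriber’s most-used tower as “home”, then count, per home tower, how many subscribers also showed up at a known hot-spot tower. The record layout, tower names and the most-used-tower rule below are all my own simplifying assumptions, and the data is a toy example.

```python
from collections import Counter, defaultdict


def home_towers(records):
    """Map each subscriber to the tower they appear at most often.

    records is an iterable of (subscriber_id, tower_id) pairs, one per call or text.
    """
    counts = defaultdict(Counter)
    for subscriber, tower in records:
        counts[subscriber][tower] += 1
    return {sub: towers.most_common(1)[0][0] for sub, towers in counts.items()}


def hotspot_visitors_by_home(records, hotspots):
    """Count, per home tower, the subscribers seen at any hot-spot tower."""
    records = list(records)
    homes = home_towers(records)
    visitors = {sub for sub, tower in records if tower in hotspots}
    return Counter(homes[sub] for sub in visitors)


if __name__ == "__main__":
    # Toy records: subscribers a and b live near T1, c and d near T2; H is a hot spot.
    toy = [
        ("a", "T1"), ("a", "T1"), ("a", "H"),
        ("b", "T1"), ("b", "T1"), ("b", "H"),
        ("c", "T2"), ("c", "T2"),
        ("d", "T2"), ("d", "T2"), ("d", "H"),
    ]
    print(hotspot_visitors_by_home(toy, hotspots={"H"}).most_common())
```

Home towers with unusually many hot-spot visitors would be the regions to look at first, much like the plantation region in the Kenya example.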

Gaining access to cheap phone data

Most of this call detail record data is limited to specific researchers for very specialized activities requested by the host countries. But recently Orange released 2.5 billion cell phone call and text data records, covering five months of activity by five million of its customers in Ivory Coast.  They released the data to the public under some specific restrictions in order to see what data scientists could do with it. The papers detailing their activities will be published at an MIT Data for Development conference.

~~~~

Big data’s contribution to a better world is just beginning but from what we see here there’s real value in data that already exists, if only the data were made more widely available.

Comments?

Posted in Crowdsourcing, Data analytics, Data availability, Data science, Information economy, Strategic Inflection Points, Visionary organizations

The antifragility of disk RAID groups, the fragility of SSDs and what to do about it

Hard Disk by Jeff Kubina (cc) (from Flickr)

[A long post today] I picked up the book Antifragile: Things That Gain from Disorder, by Nassim N. Taleb, and despite trying to put it away at least 3 times now, I can’t stop turning back to it.  In his view fragility is defined by having a negative (or bad) response to variation, volatility or randomness in general.  Antifragile is the exact opposite of fragile in that it has a positive (or good) response to more variation, volatility or randomness.  Somewhere between antifragility and fragility is robustness, which has neither a positive nor a negative response (is indifferent) to high volatility, variation or randomness.

Why disks are robust …

To me there are plenty of issues with disks. To name just a few:

  • They are energy hogs,
  • They are slow (at least in comparison to SSDs and flash memory), and
  • They are mechanical contrivances which can be harmed by excess shock/vibration.

But, aside from their capacity benefits, they have a tendency to fail at a normalized failure rate unless there is a particular (special) problem with media, batch, electronics or micro-programming.  I saw plenty of these other types of problems at StorageTek over the years, enough to know that there are many things that can disturb disk failure rate normalization. However, in general, absent some systematic cause of failure, disks fail at a predictable rate with a relatively wide distribution (although, being away from the engineering of storage systems, I have no statistics for the standard deviation of disk failures; it just feels right [Nassim would probably disavow me for saying that]).

The other aspect of disk anti-fragility is that as they degrade over time, they seem to get slower and louder.  The former is predominantly due to defect skipping, an error recovery procedure for bad blocks.  And they get louder as bearings start to wear out, signaling imminent failure ahead.

In defect skipping, when a disk drive detects a bad block, the drive marks the block as bad and uses a spare block it has somewhere else on the disk for all subsequent writes. The new block is typically “far” away from the old block, so when reading multiple blocks the drive now has to seek to the new block and seek back to read them, increasing response time in the process.

The other phenomenon seen in disk failures is the head crash. These seem to occur completely at random with disks from “mature processes”.

So, I believe disks from mature processes have a normalized failure rate with a reasonably wide standard deviation around this MTBF rate. As such, disk drives should be classified as robust.
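
A toy model makes the cost of defect skipping visible. The sketch below is my own simplification, not how any particular drive firmware works: spare blocks live just past the data area, a remap table redirects bad blocks to spares, and any head movement other than stepping to the next adjacent physical block counts as a seek.

```python
class Disk:
    """Toy disk with a spare area and a remap table for bad blocks."""

    def __init__(self, data_blocks, spare_blocks):
        self.data_blocks = data_blocks
        self.next_spare = data_blocks              # spares sit past the data area
        self.spare_limit = data_blocks + spare_blocks
        self.remap = {}                            # logical block -> spare block

    def mark_bad(self, lba):
        """Redirect a bad logical block to the next free spare block."""
        if not 0 <= lba < self.data_blocks:
            raise ValueError("lba outside the data area")
        if lba not in self.remap:
            if self.next_spare >= self.spare_limit:
                raise RuntimeError("out of spare blocks")
            self.remap[lba] = self.next_spare
            self.next_spare += 1
        return self.remap[lba]

    def physical(self, lba):
        """Physical location of a logical block after any remapping."""
        return self.remap.get(lba, lba)

    def seeks_for_read(self, start, count):
        """Count head movements beyond stepping to the adjacent block."""
        seeks = 0
        previous = None
        for lba in range(start, start + count):
            location = self.physical(lba)
            if previous is not None and location != previous + 1:
                seeks += 1
            previous = location
        return seeks


if __name__ == "__main__":
    disk = Disk(data_blocks=1000, spare_blocks=10)
    print(disk.seeks_for_read(100, 8))   # sequential read, no remapped blocks
    disk.mark_bad(103)
    print(disk.seeks_for_read(100, 8))   # same read after block 103 is remapped
```

In this model a single remapped block in the middle of an 8-block sequential read adds two seeks (out to the spare and back), which is the slowdown described above.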

… and RAID groups of disk drives are antifragile

But, while disk drives are robust, when you place such devices in a RAID group with others, the RAID groups survive better.  As long as the failure rate of the devices is randomized and there is a wide variance on this failure rate, when a RAID group encounters a single drive failure it is unlikely that a second (or, for RAID DP/6, a third) drive will also fail while trying to recover from the first.  (Yes, as disk drives get larger the time to recover gets longer, thus increasing the probability of multiple drive failures, but absent systematic causes of drive failures, data loss should be rare).

In a past life we had multiple disk systems in a location subject to volcanic activity. Somehow, sulfuric fumes from the volcano had found their way into the computer room and were degrading the optical transceivers in our disk drives, causing drive failures.   The subsystem at the time had RAID 6 (dual parity) and over the course of a few weeks we had 20 or more disk drives die in these systems. The customer lost no data during this time, but only because the disk drive failure rate was randomly distributed over time with a wide dispersion.

So from Nassim’s definition disk RAID groups are anti-fragile: they do operate better with more randomness.
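
To put a rough number on “unlikely”, here is a small sketch that assumes drive failures are independent and exponentially distributed (i.e. purely random with a constant rate). That model, the 1M hour MTBF and the rebuild times are all my assumptions for illustration; I have no field statistics behind them.

```python
import math


def p_additional_failure(remaining_drives, rebuild_hours, mtbf_hours):
    """Probability that at least one remaining drive fails during the rebuild window.

    Assumes independent drives with exponentially distributed lifetimes,
    so the group's combined failure rate is remaining_drives / mtbf_hours.
    """
    if remaining_drives < 0 or rebuild_hours < 0 or mtbf_hours <= 0:
        raise ValueError("invalid parameters")
    return 1.0 - math.exp(-remaining_drives * rebuild_hours / mtbf_hours)


if __name__ == "__main__":
    # Made-up example: 8-drive group with one drive down, 1M hour MTBF.
    for hours in (24, 72):
        p = p_additional_failure(7, hours, 1_000_000)
        print(f"{hours}h rebuild: {p:.6f}")
```

With these inputs the chance of a second failure during a 24 hour rebuild comes out well under 0.1%, and tripling the rebuild time roughly triples it, which is the larger-drive effect mentioned in the parenthetical above.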

Why SSD and SSD RAID groups are fragile

Toshiba’s New 2.5″ SSD from SSD.Toshiba.com

SSDs have a number of good things going for them. For example:

  • They are blisteringly fast, at least compared to rotating disks.
  • They are relatively green storage devices, meaning they use less energy than rotating disks.
  • They are semiconductor devices and as such, are relatively immune to shock and vibration.

Despite all that, given today’s propensity to use wear leveling, RAID groups composed of SSDs can exhibit fragility because all the SSDs will fail at approximately the same number of Program/Erase cycles.

My assumption is that because NAND wear out is essentially an electro-chemical phenomenon, its failure rate, while normally distributed, probably has a very narrow variance.  Given the technology, NAND pages will fail after so many writes. It may be 10K, 30K or 100K (for MLC, eMLC, or SLC), but all the NAND pages from the same technology (manufactured on the same fab line) will likely fail at about the same number of P/E cycles. With wear leveling equalizing the P/E cycles across all pages in an SSD, this means that there is some number of writes that an SSD will endure and then go no farther.  (Again, I have no hard statistics to support this presumption and Nassim will probabilistically not be pleased with me for saying this).

As such, for a RAID group made up of wear leveling SSDs, especially with data striping across the group, all the SSDs will probabilistically fail at almost the same time because they all will have had the same amount of data written to them.  This means that as we reach wear out on one SSD in the group, assuming all the others were also fresh when the group was originally created, all the other devices will be near wear out.  As a result, when one SSD fails, others in the RAID group will have a high probability of failure, leading to data loss.
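The effect of a narrow versus wide failure distribution can be sketched with a small Monte Carlo run. This is only an illustration of the argument, assuming normally distributed device lifetimes; the lifetimes (in days), the spreads and the one day rebuild window are all made-up numbers, not measurements.

```python
import random

def p_second_failure_in_window(n_drives, mean_life, std_life, window,
                               trials=5000, seed=1):
    """Fraction of simulated RAID groups where the second device fails
    within `window` of the first, i.e. before a rebuild could finish."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one lifetime per device and look at the two earliest failures.
        lives = sorted(rng.gauss(mean_life, std_life) for _ in range(n_drives))
        if lives[1] - lives[0] <= window:
            hits += 1
    return hits / trials

# Hypothetical 8-device groups with the same mean life but different spreads.
narrow = p_second_failure_in_window(8, mean_life=1000.0, std_life=5.0, window=1.0)
wide = p_second_failure_in_window(8, mean_life=1000.0, std_life=200.0, window=1.0)
print(f"narrow spread: {narrow:.3f}, wide spread: {wide:.3f}")
```

With identical mean lifetimes, the tightly clustered group sees back-to-back failures far more often than the widely dispersed one, which is the fragility described above.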

I have written about this before, see my Potential data loss using SSD RAID groups post for more information.

What can we do about the fragility of SSD RAID groups?

A couple of items come to mind that can be done to reduce the fragility of a RAID group of SSDs:

  • Intermix older and newer (fresher) SSDs in a single RAID group so that they do not all fail at the same time.
  • Don’t use data striping across RAID groups of SSDs. This would allow some devices to be written more than others and, by doing so, add some randomness to the SSD failures in the group.
  • Don’t use RAID 1, as this will always cause the same number of writes to be written to pairs of SSDs.
  • Don’t use RAID 5 or other protection methodologies that spread parity writes across the group. Using these would be akin to data striping in that all parity writes would be spread evenly across the group.
  • Consider using different technology SSDs in a RAID group. Intermixing MLC, eMLC and SLC drives in a RAID group would have the effect of varying the SSD failure rates.
  • Move away from wear leveling to defect skipping. While doing so will cause some SSDs to fail earlier than today, their failures will be randomly distributed.

The last one probably deserves some discussion.  There are many reasons for wear leveling, one of which is to speed up writes (by always having a fresh page to write); another is that NAND blocks cannot be updated in place, they need to be erased to be written.  But another major reason is to distribute write activity across all NAND pages to equalize wear out.

In order to speed up writing sans wear leveling one would need some sort of DRAM buffer to absorb the write activity and then later destage it to NAND when available.   The inability to update in place is more problematic but could potentially be dealt with by using the same DRAM cache to read in the previous information and write back the updates.  Other solutions to this latter problem exist but seem to be more trouble than they are worth.

But for the aspect of wear leveling done to equalize NAND page wearout, I believe there’s a less fragile solution.  If we were to institute some form of defect skipping with a certain amount of spare NAND pages, we could easily extend the life of an SSD, at least until we run out of spare pages.

Today, there is a considerable amount of spare capacity shipped with most SSDs, over 10% in most enterprise class storage and more with consumer grade. With this much capacity a single NAND logical block could be rewritten an awfully high number of times. For instance, using defect skipping with a 100GB MLC SSD at 10K write endurance, 10% spare pages and a 1MB page size, one single logical block address page could be written ~100 million times (assuming no other pages were being written beyond their maximum).

The main advantage is that SSD failure rates would now be more widely distributed. Yes, there would be more early life failures, especially for SSDs that get hit a lot. But they would no longer fail in unison at some magical write level.
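The ~100 million figure follows from simple arithmetic, sketched below using the numbers from the example (100GB, 10% spare, 1MB pages, 10K endurance). The function name is just for illustration, and I am assuming 1GB = 1000MB and that the hot logical page can consume its original physical page plus every spare page in turn.

```python
def max_writes_to_one_logical_page(capacity_gb, spare_percent, page_size_mb, endurance):
    """Writes a single hot logical page could absorb under defect skipping,
    assuming it alone consumes the spare pool."""
    capacity_mb = capacity_gb * 1000
    spare_pages = capacity_mb * spare_percent // 100 // page_size_mb
    # The original physical page plus each spare page lasts `endurance` writes.
    return spare_pages, (spare_pages + 1) * endurance

spare_pages, total_writes = max_writes_to_one_logical_page(
    capacity_gb=100, spare_percent=10, page_size_mb=1, endurance=10_000)
print(spare_pages, total_writes)
```

That is 10,000 spare pages, each good for 10K writes, or roughly 100 million writes to the one logical page.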

Making SSDs less fragile

While doing all the above may help a RAID group full of SSDs be less fragile, addressing the inherent fragility of an SSD is more problematic.  Nonetheless, some ideas do come to mind:

  • Randomly mix NAND chips from different fabs/vendors. SSDs that use this intermixture could have a more randomly distributed failure rate, which should increase the standard deviation of MTBF.
  • Use different NAND technologies in an SSD, say MLC for the bulk of the storage capacity and SLC for the defect skip capacity (with no wear leveling). Doing this would elongate the lifetime of the average SSD and randomly distribute failures of SSDs based on write locality of reference, thereby increasing the standard deviation of MTBF.  Of course, this would also have the effect of speeding up heavily written blocks, which would now come out of SLC rather than slower MLC, making these SSDs even faster for those blocks which are written more frequently.
  • Use more random, less deterministic predictive maintenance. SSD predictive maintenance is used to limit the damage from a failing SSD by replacing it before death. By using less deterministic and more randomized algorithms (such as varying how close to wear out we let the SSD get before signaling failure), we would increase the variance of failure.
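As a rough sketch of the last idea, each SSD could be assigned its own randomly chosen replacement threshold, expressed as a fraction of its rated P/E cycles. The function names and the 80% to 95% range are hypothetical choices of mine, not anything a vendor is known to ship.

```python
import random

def replacement_thresholds(n_drives, low=0.80, high=0.95, seed=0):
    """Give each drive its own wear fraction at which it signals for replacement."""
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(n_drives)]

def should_replace(pe_cycles_used, rated_pe_cycles, threshold):
    """True once the drive's wear has reached its individual threshold."""
    return pe_cycles_used >= threshold * rated_pe_cycles

thresholds = replacement_thresholds(n_drives=8)
for drive, threshold in enumerate(thresholds):
    flagged = should_replace(pe_cycles_used=9_000, rated_pe_cycles=10_000,
                             threshold=threshold)
    print(drive, round(threshold, 3), flagged)
```

Even with identical wear on every drive, the replacement signals would arrive at different times rather than all at once.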

This post is almost too long now but there are probably other ideas to increase the robustness of SSDs and PCIe Flash cards that deserve mention someplace. Maybe we can explore these in a subsequent post.

Comments?

[Full disclosure:  I have a number of desktops that use single disk drives (without RAID) that are backed up to other disk drives.  I own and use a laptop, iPads, and an iPhone that all use SSDs or NAND technology (without RAID). I own neither disk nor SSD storage subsystems.]

 

Posted in Disk storage, SSD storage, Storage availability, Storage Quality

SNWUSA Spring 2013 summary

For starters, the parties were a bit more subdued this year, although I heard Wayne’s suite was hopping to 4am last night (not that I would ever be there that late).

And a trend seen the past couple of years was even more evident this year, many large vendors and vendor spokespeople went missing. I heard that there were a lot more analyst presentations this SNW than prior ones although it was hard to quantify.  But it did seem that the analyst community was pulling double duty in presentations.

I would say that SNW still provides a good venue for storage marketing across all verticals. But these days many large vendors find success elsewhere, leaving SNW Expo mostly to smaller vendors and niche products.  Nonetheless, there were a few big vendors (Dell, Oracle and HP) still in evidence. But EMC, HDS, IBM and NetApp were not showing on the floor.

I would have to say the theme for this year’s SNW was hybrid storage. Last fall the products that impressed me were either cloud storage gateways or all flash arrays. This year there weren’t as many of these at the show, but hybrid storage certainly has arrived.

Best hybrid storage array of the show

It’s hard to pick just one hybrid storage vendor as my top pick, especially since there were at least 3 others talking to me under NDA, but from my perspective the Hybrid vendor of the show had to be Tegile (pronounced I think, as te’-jile). They seemed to have a fully functional system with snapshot, thin provisioning, deduplication and pretty good VMware support (only time I have heard a vendor talk about VASA “stun” support for thin provisioned volumes).

They made the statement that SSD in their system is used as a cache, not a tier. This use is similar to NetApp’s FlashCache and has proven to be a particularly well performing approach to hybrid storage. (For more information on that, take a look at some of my NFS and recent SPC-1 benchmark review dispatches.) How well this is integrated with their home grown dedupe logic is another question.

On the negative side, they seem to be lacking a true HA/dual controller version but could use two separate systems with synch (I think) replication between them to cover this ground?? They also claimed their dedupe had no performance penalty, a pretty bold claim that cries out for some serious lab validation and/or benchmarking to prove. They also offer an all flash version of their storage (but then how can it be used as a cache?).

The marketing team seemed pretty knowledgeable about the market space and they seem to be going after the mid-range storage space.

The product supports file (NFS and CIFS/SMB), iSCSI and FC with GigE, 10GbE and 8Gbps FC. They quote “effective” capacities with dedupe enabled but it can be disabled on a volume basis.

Overall, I was impressed by their marketing and the product (what little I saw).

Best storage tool of the show

Moving on to other product categories, it was hard to see anything that caught my eye. Perhaps I have just been to too many storage conferences, but I did get somewhat excited when I looked at SwiftTest.  Essentially they offer an application profiling, storage modeling, workload generating tool set.

The team seems to be branching out of their traditional vendor market focus and going after large service providers and F100 companies with large storage requirements.

Way back, when I was in Engineering, we were always looking for some information as to how customers actually used storage products. Well what SwiftTest has, is an appliance to instrument your application environment (through network taps/hardware port connections) to monitor your storage IO and create a statistical operational profile of your I/O environment. Then take that profile and play it against a storage configuration model to show how well it’s likely to perform.  And if that’s not enough the same appliance can be used to drive a simulated version of the operational profile back onto a storage system.

It offers NFS (v2, v3, v4), CIFS/SMB (SMB1, SMB2, SMB3), FC, iSCSI, and HTTP/REST (what, no FCoE?). They mentioned an $80K price tag for the base appliance (one protocol?) but the price grows pretty fast if you want all of them.  They also seem to have three levels of appliances (my guess is more performance and more protocols come with the bigger boxes).

Not sure where they top out, but simulating an operational profile can be quite complex, especially when you have to be able to control data patterns to match deduplication potential in customer data, drive Markov chains with probability representations of operational profiles, and actually execute IO operations. They said something about their ports having dedicated CPU cores to ensure adequate performance or something similar, but still it seems too little to hit high IO workloads.

Like I said, when I was in engineering we were searching for this type of solution back in the late 90s and we would have probably bought it in a moment, if it was available.

GoDaddy.com, the domain/web site services provider, was one of their customers that used the appliance to test storage configurations. They presented at SNW on some of their results but I missed their session (the case study is available on SwiftTest’s website).
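To give a feel for what driving a Markov chain from an operational profile means, here is a toy sketch. It is not SwiftTest's implementation (I have no visibility into that); the states, the transition probabilities and the function name are invented for illustration.

```python
import random

# Hypothetical operational profile: probability of the next IO type given the current one.
TRANSITIONS = {
    "read":  {"read": 0.7, "write": 0.3},
    "write": {"read": 0.4, "write": 0.6},
}

def generate_ops(transitions, start, count, seed=0):
    """Walk the Markov chain and return a sequence of IO operation types."""
    rng = random.Random(seed)
    state = start
    ops = []
    for _ in range(count):
        ops.append(state)
        next_states = list(transitions[state].keys())
        weights = list(transitions[state].values())
        state = rng.choices(next_states, weights=weights, k=1)[0]
    return ops

ops = generate_ops(TRANSITIONS, start="read", count=20)
print(ops)
```

A real tool would also have to attach block addresses, transfer sizes, timing and data patterns to each step and then actually issue the IO, which is where the complexity (and the CPU demand) comes from.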

~~~~

In short, SNW had a diverse mixture of end user customers but lacked a full complement of vendors to show off to them.   The ratio of vendors to customers has definitely shifted to end-users the last couple of years and if anything has gotten more skewed to end-users (which paradoxically should appeal to more storage vendors?!).

I talked with lots of end-users, from companies like FedEx, Northrop Grumman and AOL to name just a few big ones. But there were plenty of smaller ones as well.

The show lasted three days and had sessions scheduled all the way to the end. I was surprised at the length and the fact that it started on Tuesday rather than Monday as in years past.  Apparently, SNIA and Computerworld are still tweaking the formula.

It seemed to me there were more cancelled sessions than in years past but again this was hard to quantify.

Some of the customers I talked with thought SNW should go to once a year and can’t understand why it’s still twice a year.  Many mentioned VMworld as having taken the place of SNW in being a showplace for storage vendors of all sizes and styles.  That and the vendor specific shows from EMC, IBM, Dell and others.

The fall show is moving to Long Beach, CA. Probably, a further experiment to find a formula that works.  Let’s hope they succeed.

Comments?

 

Posted in Block Storage, DAS, Disk storage, Ethernet, FC, File Storage, iSCSI, Networking, R&D measures, SSD storage, Storage, Storage performance

Latest Microsoft ESRP ChampionsChart™ for over 5K mailboxes – chart of the month

(c) 2013 Silverton Consulting, Inc., All Rights Reserved

The above, from our November 2012 StorInt Performance Dispatch, is another of our ChampionsCharts™ showing optimum storage performance.  This one displays the Q4-2012 Exchange Solution Reviewed Program champions for the over 5,000 mailbox solutions.

All of SCI’s ChampionsCharts are divided into four quadrants: the best is upper right and is labeled Champions, the next best is upper left and is labeled Marathoners, then come Sprinters and finally Slowpokes.

The ESRP chart shows relative database latency across the horizontal axis and relative log playback performance on the vertical axis for all ESRP submissions for the over 5,000 mailbox category.  Systems provide better relative log playback performance the higher in the chart they are placed and better relative database latencies the further to the right one goes.

We believe that ESRP performance is a great way to see how well a similarly configured storage system will perform on the more diverse and mixed workloads seen in everyday data center application environments.

All SCI ChampionsCharts represent normalized storage performance for the metrics we choose. For ESRP this indicates that, given the hardware used in the submission, Champions performed relatively better than expected in both log playback and database latency.  Marathoner systems performed relatively better in log playback and not as well in database latency.  Similarly, systems in the Sprinters quadrant provided relatively better database latency but worse log playback performance. Systems in the Slowpokes quadrant performed relatively worse on both performance measures.
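The quadrant logic can be sketched in a few lines. This is only an illustration: the scores, the zero midpoint and the function name are assumptions, and it says nothing about how SCI actually normalizes the underlying metrics.

```python
def classify_quadrant(db_latency_score, log_playback_score, midpoint=0.0):
    """Place a system in a quadrant from two relative scores, where a higher
    score means better than expected on that measure."""
    better_latency = db_latency_score > midpoint    # further right on the chart
    better_playback = log_playback_score > midpoint  # higher on the chart
    if better_latency and better_playback:
        return "Champions"
    if better_playback:
        return "Marathoners"
    if better_latency:
        return "Sprinters"
    return "Slowpokes"

# Hypothetical relative scores for four made-up systems.
results = [classify_quadrant(x, y) for x, y in
           [(0.8, 0.9), (-0.3, 0.5), (0.4, -0.2), (-0.6, -0.7)]]
print(results)
```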

There is a gaggle of storage systems in the Champions quadrant.  Here we identify the top 5 systems that stand out over all the rest and are readily separated into 3 groups. Specifically:

Group 1 is a single system and is the IBM DS8700, which had both best relative log playback and database latency in their group of storage systems. The IBM DS8700 was configured to support 20,000 mailboxes.

Group 2 represents four storage systems, of which two are HDS USP-V systems (the previous generation of HDS VSP enterprise class storage) and the other two are HDS AMS 2100 storage systems (the previous generation entry-level, mid-range class storage systems from HDS). The two USP-V systems were configured for 96,000 and 32,000 mailboxes respectively. The two AMS 2100 storage systems were configured to support 17,000 and 5,800 mailboxes respectively.

Group 3 represents a single storage system and is the HP4400 EVA configured with 6,000 mailboxes.

The November ESRP Performance Dispatch identified top storage system performers for both the 1,001 to 5,000 and the 1,000 and under mailbox categories as well.  More performance information and ChampionsCharts for ESRP, SPC-1 and SPC-2 are available in our SAN Storage Buying Guide, available for purchase on our website.

~~~~

The complete ESRP performance report went out in SCI’s November 2012 newsletter.  But a copy of the report will be posted on our dispatches page sometime this month (if all goes well).  However, you can get the latest storage performance analysis now and subscribe to future free newsletters by just using the signup form above right.

As always, we welcome any suggestions or comments on how to improve our ESRP performance reports or any of our other storage performance analyses.

Posted in ESRP, ESRPv3/Exchange 2010, Storage performance, System effectiveness