Since our last blog post on this subject there have been 6 new entries in the LRT Top 10 (#3-6 & 9-10). As can be seen in the chart, which combines SPC-1 and SPC-1/E results, response times vary considerably. Seven of these top 10 LRT results come from subsystems which either have all SSDs (#1-4, 7 & 9) or have a large NAND cache (#5). The newest members on this chart were the recently published NetApp FAS3270A and the Xiotech Emprise 5000 with 300GB disk drives.
The NetApp FAS3270A, a mid-range subsystem with 1TB of NAND cache (512GB in each controller), did pretty well here, with some all-SSD systems doing better than it and a pair of all-SSD systems doing worse. Coming in under 1 msec LRT is no small feat. We are certain the NAND cache helped NetApp achieve this superior responsiveness.
What the Xiotech Emprise 5000-300GB storage subsystem is doing here is another question. Xiotech has always done well on an IOPS/drive basis (see our SPC-1 & -1/E IOPS/drive chart of the month) but being top ten in LRT had not previously been their forte. How one coaxes a 1.47 msec LRT out of a 20-drive system that costs only ~$41K, roughly 12X lower than the median price (~$509K) of the other subsystems here, is a mystery. Of course, they were using RAID 1, but so were half of the subsystems on this chart.
The full performance dispatch will be up on our website in a couple of weeks but if you are interested in seeing it sooner just sign up for our free monthly newsletter (see upper right) or subscribe by email and we will send you the current issue with download instructions for this and other reports.
As always, we welcome any constructive suggestions on how to improve our storage performance analysis.
Dell and Compellent may be a great match because Compellent uses commodity hardware combined with specialized software to create their storage subsystem. If there's any company out there that can take advantage of commodity hardware it's probably Dell. (Of course commodity hardware always loses in the end, but that's another story.)
Similarly, Dell's EqualLogic iSCSI storage system uses commodity hardware to provide its iSCSI storage services. It doesn't take a big leap of imagination to envision one storage system that combines the functionality of EqualLogic's iSCSI and Compellent's FC storage capabilities. Of course, there are others already doing this, including Compellent itself, which already has iSCSI support built into its FC storage system.
Which way to integrate?
Does EqualLogic survive such a merger? I think so. It's easy to imagine that EqualLogic has the bigger market share today. If that's so, the right thing might be to merge Compellent's FC functionality into EqualLogic. If Compellent has the larger market, the correct approach is the opposite. The answer probably lies with a little of both. It seems easier to add iSCSI functionality to an FC storage system than the converse, but the FC-into-iSCSI approach may be the optimum path for Dell because of the popularity of their EqualLogic storage.
What about NAS?
The only thing missing from this storage system is NAS. Of course, Compellent storage offers a NAS option through the use of a separate Windows Storage Server (WSS) front end. Dell's EqualLogic does much the same to offer NAS protocols for their iSCSI system. Neither of these is a bad solution, but they are not fully integrated NAS offerings such as are available from NetApp and others.
However, there is a little-discussed piece here: the Dell-Exanet acquisition, which happened earlier this year. Perhaps the right approach is to integrate Exanet with Compellent first and target this at the high-end enterprise/HPC marketplace, keeping EqualLogic at the SMB end of the marketplace. It's been a while since I have heard about Exanet, and nothing since the acquisition. Does it make sense to back-end a clustered NAS solution with FC storage? Probably.
Much of this seems doable to me, but it all depends on making the right moves once the purchase is closed. If I look at where Dell is weakest (barring their OEM agreement with EMC), it's in the high-end storage space. Compellent probably didn't have much of a footprint there, likely due to their limited distribution and support channels. A Dell acquisition could easily eliminate these problems and open up this space without having to do much other than start marketing, selling, and supporting Compellent.
In the end, a storage solution supporting clustered NAS, FC, and iSCSI that combined functionality equivalent to Exanet, Compellent and EqualLogic based on commodity hardware (ouch!) could make a formidable competitor to what’s out there today if done properly. Whether Dell could actually pull this off and in a timely manner even if they purchase Compellent, is another question.
Lost in much of the discussions on storage system performance is the need for both throughput and response time measurements.
By IO throughput I generally mean data transfer speed in megabytes per second (MB/s or MBPS), however another definition of throughput is IO operations per second (IO/s or IOPS). I prefer the MB/s designation for storage system throughput because it’s very complementary with respect to response time whereas IO/s can often be confounded with response time. Nevertheless, both metrics qualify as storage system throughput.
By IO response time I mean the time it takes a storage system to perform an IO operation from start to finish, usually measured in milliseconds, although lately some subsystems have dropped below the 1 msec threshold. (See last year's post on SPC LRT results for information on some top response time results.)
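To make the distinction between these metrics concrete, here is a back-of-the-envelope sketch in Python. The numbers are purely illustrative (not from any benchmark result), and the queue-depth relation is just Little's law applied to storage IO:

```python
# Back-of-the-envelope relationships between the two throughput
# metrics (MB/s and IO/s) and response time.
# All numbers below are illustrative, not benchmark results.

def mbps(iops: float, transfer_size_kb: float) -> float:
    """Data throughput in MB/s for a given IO rate and transfer size."""
    return iops * transfer_size_kb / 1024.0

def iops_little(outstanding_ios: float, response_time_ms: float) -> float:
    """Little's law: sustainable IO/s for a given queue depth and latency."""
    return outstanding_ios / (response_time_ms / 1000.0)

# A 4KB-transfer OLTP-style workload at 50K IO/s moves far less data...
print(mbps(50_000, 4))      # ~195 MB/s
# ...than a 1MB-transfer sequential workload at only 500 IO/s.
print(mbps(500, 1024))      # 500.0 MB/s

# At 1 msec response time, a single outstanding IO caps out at 1000 IO/s,
# which is why low LRT matters so much for lightly-queued workloads.
print(iops_little(1, 1.0))  # 1000.0
```

The point of the sketch: IO/s and MB/s can rank the same two workloads in opposite order, which is why quoting one without the transfer size (or the response time) can mislead.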
Benchmark measurements of response time and throughput
Both Standard Performance Evaluation Corporation’s SPECsfs2008 and Storage Performance Council’s SPC-1 provide response time measurements although they measure substantially different quantities. The problem with SPECsfs2008’s measurement of ORT (overall response time) is that it’s calculated as a mean across the whole benchmark run rather than a strict measurement of least response time at low file request rates. I believe any response time metric should measure the minimum response time achievable from a storage system although I can understand SPECsfs2008’s point of view.
On the other hand SPC-1 measurement of LRT (least response time) is just what I would like to see in a response time measurement. SPC-1 provides the time it takes to complete an IO operation at very low request rates.
In regards to throughput, once again SPECsfs2008’s measurement of throughput leaves something to be desired as it’s strictly a measurement of NFS or CIFS operations per second. Of course this includes a number (>40%) of non-data transfer requests as well as data transfers, so confounds any measurement of how much data can be transferred per second. But, from their perspective a file system needs to do more than just read and write data which is why they mix these other requests in with their measurement of NAS throughput.
Storage Performance Council's SPC-1 reports throughput results as IOPS and provides no direct measure of MB/s unless one looks to their SPC-2 benchmark results. SPC-2 reports a direct measure of MBPS, which is an average across three different data-intensive workloads: large file access, video-on-demand, and a large database query workload.
Why response time and throughput matter
Historically, we used to say that OLTP (online transaction processing) performance was entirely dependent on response time – the better the storage system response time, the better your OLTP systems performed. Nowadays it's a bit more complex, as some of today's database queries can depend as much on sequential database transfers (or throughput) as on individual IO response time. Nonetheless, I feel that there is still a large population of response-time-critical workloads out there that perform much better with shorter response times.
On the other hand, high throughput has its growing gaggle of adherents as well. When it comes to high sequential data transfer workloads such as data warehouse queries, video or audio editing/download or large file data transfers, throughput as measured by MB/s reigns supreme – higher MB/s can lead to much faster workloads.
The only question that remains is who needs higher throughput as measured by IO/s rather than MB/s. I would contend that mixed workloads which contain components of random as well as sequential IOs and typically smaller data transfers can benefit from high IO/s storage systems. The only confounding matter is that these workloads obviously benefit from better response times as well. That’s why throughput as measured by IO/s is a much more difficult number to understand than any pure MB/s numbers.
Now there is a contingent of performance gurus today who believe that IO response times no longer matter. In fact, if one looks at SPC-1 results, it takes some effort to find the LRT measurement. It's not included in the summary report.
Also, in the post mentioned above there appears to be a definite bifurcation of storage subsystems with respect to response time, i.e., some subsystems are focused on response time while others are not. I would have liked to see some more of the top enterprise storage subsystems represented in the top LRT subsystems but alas, they are missing.
Call me old fashioned but I feel that response time represents a very important and orthogonal performance measure with respect to throughput of any storage subsystem and as such, should be much more widely disseminated than it is today.
For example, there is a substantive difference between a fighter jet's or race car's top speed and its maneuverability. I would compare top speed to storage throughput and maneuverability to IO response time. Perhaps this doesn't matter as much for a jet liner or family car, but it can matter a lot in the right domain.
Now do you want your storage subsystem to be a jet fighter or a jet liner – you decide.
Many in the industry will recall that current GMR (giant magnetoresistance) heads and next-generation TMR (tunnel magnetoresistance) disk read heads already make use of spintronics to detect magnetized bit values in disk media. GMR heads detect bit values on media through changes in their electrical resistance.
Spintronics, however, can also be used to record data as well as read it. These capabilities are being exploited in MRAM technology, which uses a ferromagnetic material to record data in magnetic spin alignment – spin up means 0; spin down means 1 (or vice versa).
Technologists claim that when MRAM reaches its full potential it could conceivably replace DRAM, SRAM, NAND, and hard disk drives – in other words, all current electrical and magnetic data storage. Some of MRAM's advantages include unlimited write passes, fast reads and writes, and data non-volatility.
MRAM reminds me of old-fashioned magnetic core memory (in the photo above), which used magnetic polarity to record non-volatile data bits. Core was a memory mainstay in the early years of computing, before the advent of semiconductor devices like DRAM.
Back to the future – MRAM
However, the problems with MRAM today are that it is low-density, takes lots of power and is very expensive. But technologists are working on all these problems with the view that the future of data storage will be MRAM. In fact, researchers at the North Carolina State University (NCSU) Electrical Engineering department have been having some success with reducing power requirements and increasing density.
As for data density, NCSU researchers now believe they can record data in cells approximately 20 nm across, better than bit-patterned media, the next-generation disk recording media. However, reading data out of such a small cell will prove difficult and may require a separate read head on top of each cell. The fact that all of this is created with normal silicon fabrication methods makes doing so at least feasible, but the added chip costs may be hard to justify.
Regarding high power, their most recent design records data by electronically controlling the magnetism of a cell. They are using a dilute magnetic semiconductor material doped with gallium manganese which can hold spin alignment (see the article for more information). They also place a semiconductor p-n junction on top of the MRAM cell. Apparently, at the p-n junction they can control the magnetization of the MRAM cell by applying -5 volts or removing it. Today the magnetization is temporary, but they are working on solutions for this as well.
NCSU researchers would be the first to admit that none of this is ready for prime time, and they have yet to demonstrate an MRAM memory device with 20nm cells in the lab, but the feeling is it's all just a matter of time and lots of research.
Fortunately, NCSU has lots of help. It seems Freescale, Honeywell, IBM, Toshiba and Micron are also looking into MRAM technology and its applications.
Let's see: using electron spin alignment in a magnetic medium to record data bits requires a read head to read out the spin values – couldn't something like this be used in some sort of next-generation disk drive that uses the ferromagnetic material as a recording medium? Hey, aren't disks already using a ferromagnetic material for recording media? Could MRAM be fabricated/laid down as a form of magnetic disk media? Maybe there's life in disks yet…
We have been using the Wi-Fi iPad for just under 6 months now and I have a few suggestions to make it even easier to use.
Aside from the lack of Flash support, there are a few things that would make web surfing easier on the iPad:
Tabbed windows option – I use tabbed windows on my desktop/laptop all the time, but for some reason on the iPad Apple chose to use a grid of distinct windows accessible via a special-purpose Safari icon. While this approach probably makes a lot of sense for the iPhone/iPod, there is little reason to do it this way on the iPad. There is ample screen real estate to show tabs selectable with the touch of a finger. As it is now, it takes two touches to select an alternate screen for web browsing, not to mention some time to paint the thumbnail screen when you have multiple web pages open.
Non-mobile mode – It seems that many websites nowadays detect whether one is accessing a web page from a mobile device and, if so, shrink their text/window displays to accommodate a much smaller display screen. With the iPad this shows up as wasted screen space and takes more screen paging than necessary to get to data that is retrievable on a single screen with a desktop/laptop. Not sure whether the problem is in the web server or in how the iPad signals what device it is, but it seems to me that if the iPad/Safari could signal to web servers that it is a laptop/small desktop, web browsing would be better.
There are a number of apps freely available on the iPhone/iPod that are not available on the iPad without purchase. For some reason, I find I can't live without some of these:
Clock app – On the iPhone/iPod I use the clock app at least 3 times a day. I time my kids' use of video games, my own time until I have to do something, how much time I am willing/able to spend on a task, and myriad other things. It's one reason why I keep the iPhone on my body or close by whenever I am at home. I occasionally use the clock app as a stopwatch and a world clock, but what I really need on the iPad is a timer of some sort. I have been unable to find an iPad app that matches the functionality of the iPhone/iPod Clock app.
Calculator app – On the iPhone/iPod I use the calculator sporadically, mostly when I am away from my desktop/office (probably because I have a calculator on my desk). However, I don’t have other calculators that are easily accessible throughout my household and having one on the iPad would just make my life easier. BTW, I ended up purchasing a calculator app that Apple says is equal to the iPhone Calc App which works fine but it should have come free.
Weather app – This is probably the next most popular app on my iPhone. I know this information is completely available on the web, but by the time I have to enter the url/scan my bookmarks it takes at least 3-4 touches to get the current weather forecast. By having the Weather app available on the iPhone it takes just one touch to get this same information. I believe there is some way to transform a web page into an app icon on the iPad but this is not the same.
iOS software tweaks
There are some things I think could make iOS much better from my standpoint. I assume all the stuff in iOS 4.2 will be coming shortly, so I won't belabor those items:
File access – This is probably heresy, but I would really like a way to cross application boundaries and access all files on the iPad. That is, have something besides Mail, iBooks, and Pages be able to access PDF files, and something besides Mail, Photos, and Pages/Keynote be able to access photos. Specifically, some of the FTP upload utilities should be able to access any file on the iPad. Not sure where this belongs, but there should be some sort of data viewer at the iOS level that can allow access to any file on the iPad.
Dvorak soft keypad – Ok, maybe I am a bit weird, but I spent the time and effort to learn the Dvorak keyboard layout to be able to type faster and would like to see this same option available for the iPad soft keypad. I currently use Dvorak with the iPad’s external BT keyboard hardware but I see no reason that it couldn’t work for the soft keypad as well.
Widgets – The weather app discussed above looks to me like the weather widget on my desktop iMac. It's unclear why iOS couldn't also support other widgets so that app developers/users could easily create and use their desktop widgets on the iPad.
iPad hardware changes
There are some things that scream out to me for hardware changes.
Ethernet access – I have been burned before and wish not to be burned again, but some sort of adaptor that would allow an Ethernet plug connection would make the tethered iPad a much more complete computing platform. I don't care if such a thing comes as a Bluetooth converter or has to use the same plug as the power adaptor, but having this would make accessing the internet (under some circumstances) that much easier.
USB access – This just opens up another whole dimension to storage access and information/data portability that is sorely missing from the iPad. It would probably need some sort of “file access” viewer discussed above but it would make the iPad much more extensible as a computing platform.
Front facing camera – I am not an avid user of FaceTime (yet) but if I were, I would really need a front camera on the iPad. Such a camera would also provide some sort of snapshot capability with the iPad (although a rear facing camera would make more sense for this). In any event, a camera is a very useful device to record whiteboard notes, scan paper documents, and record other items of the moment and even a front-facing one could do this effectively.
Solar panels – Probably off the wall, but having to lug a power adaptor everywhere I go with the iPad is just another thing to misplace/lose. Of course, when traveling to other countries, one also needs a plug adaptor for each country. It seems to me that some sort of solar panel on the back or front that could provide adequate power to charge the iPad would make things that much simpler.
Well that’s about it for now. We are planning on taking a vacation soon and we will be taking both a laptop and the iPad (because we can no longer live without it). I would rather just leave the laptop home but can’t really do that given my problems in the past with the iPad. Some changes described above could make hauling the laptop on vacation a much harder decision.
As for how the iPad fares on the beach, I will have to let you know…
It's all about delivering value to the end user. If one can deliver value with commodity hardware equivalent to what's possible with special-purpose hardware, then obviously commodity hardware wins – no question about it.
But, and it’s a big BUT, when some company invests in special purpose hardware, they have an opportunity to deliver better value to their customers. Yes it’s going to be more expensive on a per unit basis but that doesn’t mean it can’t deliver commensurate benefits to offset that cost disadvantage.
Look around; one sees special-purpose hardware everywhere. For example, just check out Apple's iPad, iPhone, and iPod, to name a few. None of these would be possible without special, non-commodity hardware. Yes, if one disassembles these products, you may find some commodity chips, but I venture that the majority of the componentry is special-purpose, one-off designs that aren't readily purchasable from any chip vendor. And the benefits they bring, aside from the coolness factor, are significant miniaturization with advanced functionality. The popularity of these products proves my point entirely – value sells, and special-purpose hardware adds significant value.
One may argue that the storage industry doesn't need such radical miniaturization. I disagree, of course, but even so there are other more pressing concerns worthy of hardware specialization, such as reduced power and cooling, increased data density, and higher IO performance, to name just a few. Can some of this be delivered with SBB and other mass-produced hardware designs? Perhaps. But I believe that with judicious selection of special-purpose hardware, the storage value delivered along these dimensions can be 10 times more than what can be done with commodity hardware.
Special purpose HW cost and development disadvantages denied
The other advantage to commodity hardware is the belief that it’s just easier to develop and deliver functionality in software than hardware. (I disagree, software functionality can be much harder to deliver than hardware functionality, maybe a subject for a different post). But hardware development is becoming more software like every day. Most hardware engineers do as much coding as any software engineer I know and then some.
Then there's the cost of special-purpose hardware, but ASIC manufacturing is getting more commodity-like every day. Several hardware design shops sell off-the-shelf processor and other logic blocks one can readily incorporate into an ASIC, and fabs can be found that will manufacture any ASIC design at a moderate price in reasonable volumes. And if one doesn't need the cost advantage of ASICs, FPGAs and CPLDs can be used to develop special-purpose hardware with programmable logic. This will cut engineering and development lead times considerably but will cost commensurately more than ASICs.
Do we ever stop innovating?
Probably the hardest argument to counteract is that over time, commodity hardware becomes more proficient at providing the same value as special purpose hardware. Although this may be true, products don’t have to stand still. One can continue to innovate and always increase the market delivered value for any product.
If there comes a time when further product innovation is not valued by the market, then and only then does commodity hardware win. However, chairs, cars, and buildings have all been around for many years – decades, even centuries now – and innovation continues to deliver added value. I can't see where the data storage business will be any different a century or two from now…
I was talking with another cloud storage gateway provider today and asked if they do any sort of backup for data sent to the cloud. The answer disturbed me – they depend on the backend cloud storage provider's replication services to provide data protection – sigh. Curtis and I have written about this before (see my Does Cloud Storage need Backup? post and Replication is not backup by W. Curtis Preston).
Cloud replication is not backup
Cloud replication is not data protection for anything but hardware failures! Much more common than hardware failures are mistakes by end-users who inadvertently delete, overwrite, or corrupt files, or systems that corrupt files – any of which would just be replicated in error throughout the cloud storage multi-verse. (In fact, cloud storage itself can lead to corruption; see Eventual data consistency and cloud storage.)
Replication does a nice job of covering a data center or hardware failure which leaves data at one site inaccessible but allows access to a replica of the data from another site. As far as I am concerned, there's nothing better than replication for these sorts of DR purposes, but it does nothing for someone deleting the wrong file. (I once did an "rm * *" command on a shared Unix directory – it wasn't pretty.)
Some (backend) cloud storage vendors delay the deletion of blobs/containers until sometime later as one solution to this problem. By doing this, the data "stays around" for "some time" after being deleted and can be restored via special request to the cloud storage vendor. The only problem is that "some time" is an ill-defined, nebulous quantity which is not guaranteed or specified in any way. Also, depending on the "fullness" of the cloud storage, this time frame may be much shorter or longer. End-user data protection cannot depend on such a wishy-washy arrangement.
Other solutions to data protection for cloud storage
One way is to keep a local backup of any data located in cloud storage. But this kind of defeats the purpose of cloud storage, with the cloud data stored both locally (as backups) and remotely. I suppose the backup data could be sent to another cloud storage provider, but someone, somewhere would need to support some sort of versioning to keep multiple iterations of the data around, e.g., 90 days' worth of backups. Sounds like a backup package front-ending cloud storage to me…
Another approach is to have the gateway provider supply some sort of backup internally, using the very same cloud storage to hold various versions of data. As long as the user can specify how many days or versions of backups are held, this works great: cloud replication supports availability in the face of hardware failures, and multiple versions support availability in the face of finger checks/logical corruption.
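As a rough sketch of this version-retention idea (the class and its API here are entirely hypothetical, not any vendor's product), a gateway could keep the last N versions of each object so a bad overwrite can be rolled back:

```python
# Hypothetical sketch of gateway-side version retention: keep the last
# N versions of each object so a fat-fingered overwrite or delete can
# be undone. Real gateways would persist versions to cloud storage;
# this toy keeps them in memory to show the retention logic only.
from collections import defaultdict, deque

class VersionedStore:
    def __init__(self, versions_to_keep: int = 3):
        self.versions_to_keep = versions_to_keep
        self._versions = defaultdict(deque)  # object key -> deque of payloads

    def put(self, key: str, data: bytes) -> None:
        history = self._versions[key]
        history.append(data)
        # Prune the oldest versions beyond the user-specified retention limit.
        while len(history) > self.versions_to_keep:
            history.popleft()

    def get(self, key: str, versions_back: int = 0) -> bytes:
        # versions_back=0 is the current copy, 1 the previous one, etc.
        return self._versions[key][-1 - versions_back]

store = VersionedStore(versions_to_keep=2)
store.put("report.doc", b"good draft")
store.put("report.doc", b"corrupted by mistake")
print(store.get("report.doc", versions_back=1))  # recovers b"good draft"
```

The design choice worth noting: replication protects every version equally, including the corrupted one, whereas retention like this is what actually lets a user step back in time.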
This problem can be solved in many ways, but just using cloud replication is not one of them.
Listen up folks, whenever you think about putting data in the cloud, you need to ask about backups among other things. If they say we only offer data replication provided by the cloud storage backend – go somewhere else. Trust me, there are solutions out there that really backup cloud data.
Above one can see a chart from our September SPECsfs2008 Performance Dispatch displaying the scatter plot of NFS Throughput Operations/Second vs. number of disk drives in the solution. Over the last month or so there has been a lot of Twitter traffic on the theory that benchmark results such as this and Storage Performance Council‘s SPC-1&2 are mostly a measure of the number of disk drives in a system under test and have little relation to the actual effectiveness of a system. I disagree.
As proof of my disagreement I offer the above chart. On the chart we have drawn a linear regression line (supplied by Microsoft Excel) and displayed the resultant regression equation. A couple of items to note on the chart:
Regression coefficient – Even though there are only 37 submissions, spanning anywhere from 1K to over 330K NFS throughput operations per second, we do not have a perfect correlation between #disks and NFS ops (R**2 = ~0.8, not 1.0).
Superior systems exist – Any of the storage systems above the linear regression line have superior effectiveness or utilization of their disk resources than systems below the line.
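For those who want to reproduce this kind of analysis, here is a minimal sketch of the least-squares fit and R**2 calculation behind such a chart. The data points below are made up for illustration (the 37 actual SPECsfs2008 submissions aren't reproduced here):

```python
# Least-squares linear fit and coefficient of determination (R**2),
# as used to draw the regression line on a drives-vs-throughput chart.
# The sample points below are illustrative only, not real submissions.

def linear_fit(xs, ys):
    """Least-squares slope and intercept for y = slope*x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(xs, ys):
    """R**2 = 1 - SS_residual / SS_total for the least-squares line."""
    slope, intercept = linear_fit(xs, ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical (disk count, NFS ops/sec) points, loosely echoing the chart.
drives = [79, 120, 200, 300, 448, 592]
nfs_ops = [131_500, 90_000, 150_000, 210_000, 260_000, 119_600]
slope, intercept = linear_fit(drives, nfs_ops)
print(r_squared(drives, nfs_ops))  # imperfect fit, well below 1.0
```

Systems whose point sits above the fitted line (positive residual) deliver more throughput than their drive count predicts, which is exactly the "superior effectiveness" noted above.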
As one example, take a look at the two circled points on the chart.
The one above the line is from Avere Systems: a 6-node FXT 2500 tiered NAS storage system which has internal disk cache (8-450GB SAS disks per node) and an external mass storage NFS server (24-1TB SATA disks) for data, with each node having a system disk as well, totaling 79 disk drives in the solution. The Avere system was able to attain ~131.5K NFS throughput ops/sec on SPECsfs2008.
The one below the line is from Exanet Ltd. (recently purchased by Dell): an 8-node ExaStore clustered NAS system which has attached storage (576-146GB SAS disks) as well as mirrored boot disks (16-73GB disks), totaling 592 disk drives in the solution. They were only able to attain ~119.6K NFS throughput ops/sec on the benchmark.
Now the two systems' respective architectures are significantly different, but if we just count the data drives alone, Avere Systems (with 72 data disks) attained ~1.8K NFS throughput ops per second per data disk spindle while Exanet (with 576 data disks) attained only ~0.2K NFS throughput ops per second per data disk spindle – roughly a 9X difference in per-drive performance on the same benchmark.
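The per-spindle comparison above is simple arithmetic, using the throughput and data-disk counts cited for the two submissions:

```python
# Per-spindle throughput for the two SPECsfs2008 submissions compared
# above (throughput figures and data-disk counts as cited in the text).

def ops_per_data_disk(nfs_ops: float, data_disks: int) -> float:
    """NFS throughput ops/sec delivered per data disk spindle."""
    return nfs_ops / data_disks

avere = ops_per_data_disk(131_500, 72)    # ~1.8K ops/sec per spindle
exanet = ops_per_data_disk(119_600, 576)  # ~0.2K ops/sec per spindle
print(avere / exanet)                     # roughly the 9X difference cited
```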
As far as I am concerned this definitively disproves the contention that benchmark results are dictated by the number of disk drives in the solution. Similar comparisons can be seen looking horizontally at any points with equivalent NFS throughput levels.
Ray's reading: NAS system performance is driven by a number of factors, and the number of disk drives is not the lone determinant of benchmark results. Indeed, one can easily see differences of almost 10X in throughput ops per second per disk spindle for NFS storage without looking very hard.
We would contend that similar results can be seen for block and CIFS storage benchmarks as well, which we will cover in future posts.
The full SPECsfs2008 performance report will go up on SCI’s website next month in our dispatches directory. However, if you are interested in receiving this sooner, just subscribe by email to our free newsletter and we will send you the current issue with download instructions for this and other reports.
As always, we welcome any suggestions on how to improve our analysis of SPECsfs2008 performance information so please comment here or drop us a line.