Exchange 2010/ESRP 3.0 results – chart of the month

(c) 2010 Silverton Consulting, Inc., All Rights Reserved

Well, after last month’s performance reversals and revelations, we now return to the more typical review of the latest Exchange Solution Reviewed Program (ESRP 3.0) results for Exchange 2010.  Microsoft’s new Exchange 2010 has substantially changed the efficiency and effectiveness of Exchange database I/O.  This will necessitate a new round of ESRP results for all vendors to once again show how many Exchange 2010 mail users their storage can support.  IBM was the first vendor to take this on with their XIV and SVC results.  But within the last quarter EMC and HP also submitted results.  This marks our first blog review of ESRP 3.0 results.

We show here a chart on database latency for current ESRP 3.0 results.  The three lines for each subsystem show the latency in milliseconds for an ESE database read, database write and log write.  In prior ESRP reviews, one may recall that write latency was impacted by the Exchange redundancy in use.  In this chart all four subsystems were using database availability group (DAG) redundancy, so write activity should truly show subsystem overhead and not redundancy options.

It’s unclear why IBM’s XIV showed up so poorly here.  The HP EVA 8400 is considered a high end subsystem but all the rest are midrange.  If one considers the drives being used, the HP used 15Krpm FC disk drives, the SVC used 15Krpm SAS drives and both the CLARiiON and the XIV used 7.2Krpm SATA drives.  That still doesn’t explain the poor showing.

Of course the XIV had the heaviest user mail workload, at 40,000 user mailboxes being simulated, and it did perform relatively better from a normalized database transactions perspective (not shown).  Given all this, perhaps this XIV submission was intended to show the top end of what the XIV could do from a mailbox count perspective rather than a latency one.

Which points up one failing in our analysis. In past ESRP reviews we have always split results into one of three categories: <1Kmbx, 1001..5Kmbx, and >5Kmbx.  As ESRP 3.0 is so new there are only 4 results to date and as such, we have focused only on “normalized” quantities in our full newsletter analysis and here.  We believe database latency should not “normally” be impacted by the count of mail users being simulated and must say we are surprised by the XIV’s showing because of this.  But in all fairness, it sustained 8 times the workload that the CLARiiON did.

Interpreting ESRP 3.0 results

As discussed above, all four tested subsystems were operating with database availability group (DAG) redundancy and as such, 1/2 of the simulated mail user workload was actually being executed on a subsystem while the other 1/2 was being executed as if it were a DAG copy being updated on the subsystem under test.  For example, the #1 HP EVA configuration requires two 8400s to sustain a real 9K mailbox configuration with DAG in operation.  Such a configuration would support 2 mailbox databases (with 4500 mailboxes each), with one active mailbox database residing on each 8400 and the inactive copy of this database residing on its brethren.  (Naturally, the HP ESRP submission also supported VSS shadow copies for the DAGs, which added yet another wrinkle to our comparisons.)
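To make the DAG arithmetic concrete, below is a minimal sketch (in Python) of how such a submission splits the simulated load across the two arrays.  The 9,000 mailbox count and two-array split come from the HP EVA example above; the IOPS-per-mailbox figure is a made-up placeholder, and treating the copy workload as equal to the active workload is exactly the simplification questioned in the list that follows.

```python
# Sketch of how a DAG-based ESRP submission splits simulated load across two arrays.
# Mailbox count and array count come from the HP EVA 8400 example above; the
# IOPS-per-mailbox rate is a hypothetical placeholder, not a number from the report.

TOTAL_MAILBOXES = 9_000      # total simulated users in the submission
ARRAYS = 2                   # two 8400s, one active database copy on each
IOPS_PER_MAILBOX = 0.15      # hypothetical per-user I/O rate

active_mbx = TOTAL_MAILBOXES // ARRAYS          # 4,500 active mailboxes per array
active_iops = active_mbx * IOPS_PER_MAILBOX     # real mail server I/O per array
copy_iops = active_iops                         # DAG copy updates for the other array's
                                                # database (assumed equal here, which is
                                                # the simplification questioned below)

print(f"Per array: {active_mbx} active mailboxes (~{active_iops:.0f} IOPS) "
      f"plus ~{copy_iops:.0f} IOPS of DAG copy updates")
```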

A couple of concerns about simulating DAGs in this manner:

  • Comparing DAG and non-DAG ESRP results will be difficult at best.  It’s unclear to me whether all future ESRP 3.0 submissions will be required to use DAGs or not.  But if not, comparing DAG to non-DAG results will be almost meaningless.
  • Vendors could potentially perform ESRP 3.0 tests with less server and storage hardware. By using DAGs, the storage under test need only endure 1/2 the real mail server I/O workload and 1/2 a DAG copy workload.  The other half of this workload simulation may not actually be present as it’s exactly equivalent to the first workload.
  • Hard to determine if all the hardware was present or only half.  From a casual skimming of the ESRP report it’s unclear whether all the hardware described was actually installed and tested.
  • 1/2 the real mail server I/O is not the same as 1/2 the DAG copy workload. As such, it’s unclear whether 1/2 the proposed configuration could actually sustain a non-DAG version of an equivalent user mailbox count.

All this makes for exciting times in interpreting current and future ESRP 3.0 results.  Look for more discussion on future ESRP results in about a quarter from now.

As always if you wish to obtain a free copy of our monthly Storage Intelligence newsletter please drop us a line. The full report on ESRP 3.0 results will be up on the dispatches section of our website later this month.

Is M&A the only way to grow?

Photograph of Women Working at a Bell System Telephone Switchboard by US National Archives (cc) (from flickr)

Oracle buys Sun, EMC buys Data Domain, Cisco buys Tandberg; it seems like every month another major billion dollar acquisition occurs.  Part of this is because of the recent economic troubles, which have left many companies valued at their lowest in years, making it cheaper to acquire good (and/or failing) companies.  But one has to wonder: is this the only way to grow?

I don’t think so.

Corporate growth can be internally driven, or organic, just as well as driven by acquisition.  But it’s definitely harder to do internally.  Why?

  • Companies are focused on current revenue producing products – Revolutionary products rarely make it into development in today’s corporations because they take resources away from other (revenue producing) products.
  • Companies are focused on their current customer base – Products that serve other customers rarely make it out into the market from today’s corporations because such markets are foreign to the company’s current marketing channels.
  • Company personnel understand current customer problems – To be successful, any new product must address its customers’ pain points and offer some sort of a unique, differentiated solution to those issues.  Because this takes understanding other customers’ problems, it seldom happens.
  • New products can sometimes threaten old product revenue streams – It’s a rare new product that doesn’t take market share away from some old way of doing business.  As companies focus on a particular market, any new product development will no doubt focus on those customers as well.  Thus, many new internally developed products will often displace (or eat away at) current product revenue.  Early on, it’s hard to see how any such product can be justified with respect to current corporate revenue.
  • New products often take efforts above and beyond current product activities – To develop, market and sell revolutionary products takes enormous, “all-out” efforts to get off the ground.  Most corporations are unable to sustain this level of effort for long, as their startup phase was long ago and long forgotten.

We now know how hard it can be, but how does Apple do it?  The iPod and iPhone were revolutionary products (at least from Apple’s perspective) and yet they both undeniably became great successes and helped to redefine industries in the process.  And no one can argue that they haven’t helped Apple to grow significantly along the way.  So how can this be done?

  • It takes strong visionary leadership in the company at the highest level – Such management can make the tough decisions to take resources away from current, revenue producing products and devote time and effort to new ones.
  • It takes marketing genius – Going after new markets, even if they are adjacent, requires in-depth understanding of new market dynamics and total engagement to be successful.
  • It takes development genius – Developing entirely new products, even if based on current technology, takes development expertise above and beyond evolutionary product enhancement.
  • It takes hard work and a dedicated team – Getting new products off the ground takes a level of effort above and beyond current ongoing product activities.
  • It takes a willingness to fail – Most new internally developed products and/or startups fail.  This fact can be hard to live with and makes justifying future products even harder.

In general, all these items are easier to find in startups than in an ongoing corporation today.  This is why most companies today find it easier, and more often successful, to grow through acquisitions rather than through organic or internal development.

However, it’s not the only way.  AT&T did it for almost a century in the telecom industry, but they owned a monopoly.  IBM and HP did it occasionally over the past 60 years or so, but they had strong visionary leadership for much of that time and stumbled miserably when such leadership was lacking.  Apple has done it over the past couple of decades or so, but this is mainly due to Steve Jobs.  There are others of course, but I would venture to say all had strong leadership at the helm.

But these are the exceptions.  Strong visionary leaders usually don’t make it to the top of today’s corporations.  Why that’s the case needs to be the subject of a future post…

Latest SPC-2 results – chart of the month

SPC-2* benchmark results, spider chart for LFP, LDQ and VOD throughput

Latest SPC-2 (Storage Performance Council-2) benchmark results chart, displaying the top ten in aggregate MBPS(TM) broken down into Large File Processing (LFP), Large Database Query (LDQ) and Video On Demand (VOD) throughput results. One problem with this chart is that it really only shows 4 subsystems: HDS and their OEM partner HP; the IBM DS5300 and Sun 6780 w/8GFC at RAID 5&6, which appear to be the same OEMed subsystem; the IBM DS5300 and Sun 6780 w/4GFC at RAID 5&6, which also appear to be the same OEMed subsystem; and the IBM SVC 4.2 (with IBM 4700s behind it).

What’s interesting about this chart is what’s going on at the top end. Both the HDS (#1&2) and IBM SVC (#3) seem to have found some secret sauce for performing better on the LDQ workload, or conversely some dumbing down of the other two workloads (LFP and VOD). According to the SPC-2 specification:

  • LDQ is a workload consisting of 1024KiB and 64KiB transfers, whereas LFP consists of 1024KiB and 256KiB transfers and VOD consists of only 256KiB transfers, so transfer size doesn’t tell the whole story.
  • LDQ has a low write proportion (1%), mimicking joining two tables into one or scanning a data warehouse to create output; LFP has a read rate of 50% (R:W of 1:1) across a write-only phase, a read-write phase and a read-only phase; and VOD apparently has a 100% read-only workload mimicking streaming video.
  • 50% of the LDQ workload uses 4 I/Os outstanding and the remainder 1 I/O outstanding. LFP uses only 1 I/O outstanding and VOD uses only 8 I/Os outstanding.  (These parameters are pulled together in the sketch following this list.)
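Here is the sketch referred to above: it simply collects the workload parameters quoted from the specification into one place and computes a rough blended transfer size and queue depth per workload. The even weighting across the listed transfer sizes and queue depths is an illustrative assumption on my part, not something stated in the SPC-2 specification.

```python
# SPC-2 workload parameters as quoted above (transfer sizes, read ratio, queue depth).
# The even weighting used for the averages is an illustrative assumption, not
# something taken from the SPC-2 specification.

workloads = {
    "LFP": {"xfer_kib": [1024, 256], "read_pct": 50,  "outstanding": [1]},
    "LDQ": {"xfer_kib": [1024, 64],  "read_pct": 99,  "outstanding": [4, 1]},
    "VOD": {"xfer_kib": [256],       "read_pct": 100, "outstanding": [8]},
}

for name, w in workloads.items():
    avg_xfer = sum(w["xfer_kib"]) / len(w["xfer_kib"])      # assumes an even mix of sizes
    avg_q = sum(w["outstanding"]) / len(w["outstanding"])   # assumes an even split of phases
    print(f"{name}: ~{avg_xfer:.0f}KiB avg transfer, {w['read_pct']}% reads, "
          f"~{avg_q:.1f} I/Os outstanding")
```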

These seem to be the major differences between the three workloads. I would have to say that some sort of caching sophistication is evident in the HDS and SVC systems that is less present in the remaining systems. And I was hoping to provide some sort of guidance as to what that sophistication looked like, but:

  • I was going to say they must have a better sequential detection algorithm, but the VOD, LDQ and LFP workloads have 100%, 99% and 50% read ratios respectively, and sequential detection should perform better with VOD and LDQ than LFP. So that’s not all of it.
  • Next I was going to say it had something to do with I/O outstanding counts. But VOD has 8 I/Os outstanding and LFP only has 1, so if this were true VOD should perform better than LFP, while LDQ, having two sets of phases with 1 and 4 I/Os outstanding, should have results somewhere in between the two. So that’s not all of it.
  • Next I was going to say stream (or file) size is an important differentiator but “Segment Stream Size” for all workloads is 0.5GiB. So that doesn’t help.

So now I am at a complete loss to understand why the LDQ throughputs are so much better than the LFP and VOD workload throughputs for HDS and SVC.

I can only conclude that the little write activity (1%) thrown into the LDQ mix is enough to give the backend storage a breather and allow the subsystem to respond better to the other (99%) read activity. Why this would be so much better for the top performers than for the remaining results is not entirely evident. But I would add that being able to handle lots of writes or lots of reads is relatively straightforward; handling an unbalanced mixture is harder to do well.

To validate this conjecture would take some effort. I thought it would be easy to understand what’s happening but as with most performance conundrums the deeper you look the more confounding the results often seem to be.

The full report on the latest SPC results will be up on my website later this year but if you want to get this information earlier and receive your own copy of our newsletter – email me at SubscribeNews@SilvertonConsulting.com?Subject=Subscribe_to_Newsletter.

I will be taking the rest of the week off so Happy Holidays to all my readers and a special thanks to all my commenters. See you next week.

ESRP results over 5K mbox – chart of the month

ESRP Results, over 5K mailbox, normalized (per 5Kmbx) read and write DB transfers as of 30 October 2009

In our quarterly study on Exchange Solution Reviewed Program (ESRP) results we show a number of charts to get a good picture of storage subsystem performance under Exchange workloads. The two that are of interest to most data centers are both the normalized and un-normalized database transfer (DB xfer) charts. The problem with un-normalized DB xfer charts is that the subsystem supporting the largest mailbox count normally shows up best, and the rest of the results are highly correlated to mailbox count. In contrast, the normalized view of DB xfers tends to discount high mailbox counts and shows a more even handed view of performance.
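For anyone new to our normalized charts, the sketch below shows the per-5K-mailbox calculation we use; the two submissions in it are hypothetical placeholders, not results from the chart above.

```python
# Sketch of the per-5K-mailbox normalization used in our ESRP charts.
# Both submissions below are hypothetical placeholders, not actual ESRP results.

NORMALIZATION_BASE = 5_000   # express DB transfers "per 5K mailboxes"

submissions = [
    {"system": "Array A", "mailboxes": 100_000, "db_xfers_per_sec": 12_000},
    {"system": "Array B", "mailboxes": 5_400,   "db_xfers_per_sec": 1_000},
]

for s in submissions:
    normalized = s["db_xfers_per_sec"] / (s["mailboxes"] / NORMALIZATION_BASE)
    print(f"{s['system']}: {normalized:.0f} DB xfers/sec per 5K mailboxes")

# Un-normalized, Array A looks ~12x better; normalized, Array B comes out ahead,
# which is the "more even handed view" of performance described above.
```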

 

We show above a normalized view of the ESRP results for this category that were available as of last month. A couple of caveats are warranted here:

  • Normalized results don’t necessarily scale – results shown in the chart range from 5,400 mailboxes (#1) to 100,000 mailboxes (#6).  While normalization should allow one to see what a storage subsystem could do for any mailbox count, it is highly unlikely that one would configure the HDS AMS2100 to support 100,000 mailboxes and it is equally unlikely that one would configure the HDS USP-V to support 5,400 mailboxes.
  • The higher count mailbox results tend to cluster when normalized – With over 20,000 mailboxes, one can no longer just use one big Exchange server, and given the multiple servers driving the single storage subsystem, results tend to shrink when normalized.  So one should probably compare like mailbox counts rather than just depend on normalization to iron out the mailbox count differences.

There are a number of storage vendors in this Top 10. There are no standouts here: the midrange systems from HDS, HP, and IBM seem to hold down the top 5 and the high end subsystems from EMC, HDS, and 3PAR seem to own the bottom 5 slots.

However, Pillar is fairly unusual in that their 8.5Kmbx result came in at #4 and their 12.8Kmbx result came in at #8, even though their un-normalized results appear exactly the same. This brings up yet another caveat: when running two benchmarks with the same system, normalization may show a difference where none exists.

The full report on the latest ESRP results will be up on our website later this month but if you want to get this information earlier and receive your own copy of our newsletter – just subscribe by emailing us.

Ibrix reborn as HP X9000 Network Storage

HP X9000 appliances pictures from HP(c) presentation

On Wednesday 4 November, HP announced a new network storage system based on the Ibrix Fusion file system called the X9000. Three versions were announced:

  • X9300 gateway appliance which can be attached to SAN storage (HP EVA, MSA, P4000, or 3rd party SAN storage) and provides scale out file system services
  • X9320 performance storage appliance which includes a fixed server gateway and storage configuration in one appliance targeted at high performance application environments
  • X9720 extreme storage appliance using blade servers for file servers and separate storage in one appliance but can be scaled up (with additional servers and storage) as well as out (by adding more X9720 appliances) to target more differentiated application environments

The new X9000 appliances support a global name space of 16PB by adding additional X9000 network storage appliances to a cluster. The X9000 supports a distributed metadata architecture which allows the system to scale performance by adding more storage appliances.
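To illustrate what a distributed (segmented) metadata architecture buys you, here is a conceptual sketch; it is not HP/Ibrix’s actual algorithm, just a simple hash-based segment-to-node mapping showing how adding appliances spreads metadata ownership, and with it metadata performance, across the cluster.

```python
# Conceptual sketch of segmented/distributed file system metadata: each path hashes
# to a metadata segment owned by one node, so adding nodes spreads metadata work
# across more servers. Illustration only; not HP/Ibrix's actual implementation.

import hashlib

def metadata_owner(path: str, nodes: list[str]) -> str:
    digest = int(hashlib.md5(path.encode()).hexdigest(), 16)
    return nodes[digest % len(nodes)]    # map the path's segment to an owning node

nodes = ["x9000-node1", "x9000-node2", "x9000-node3"]   # hypothetical cluster members
for path in ["/home/ray/report.doc", "/scratch/sim/run42.dat", "/video/clip.mov"]:
    print(f"{path} -> metadata served by {metadata_owner(path, nodes)}")
```

Adding a fourth entry to the nodes list redistributes some of the metadata ownership, which is the sense in which performance scales as appliances are added.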

X9000 Network Storage appliances

With the X9300 gateway appliance, storage can be increased by adding more SAN arrays. Presumably, multiple gateways can be configured to share the same SAN storage, creating a highly available file server node. The gateway can be configured with GigE, 10GbE, and/or QDR (40Gb/s) InfiniBand interfaces for added throughput.

The Extreme appliance (X9720) comes with 82TB in the starting configuration, and storage can be increased in 82TB raw capacity block increments (7U, 1/2-rack wide, 35*2 drive enclosures + 1 12-drive tray per capacity block) up to a maximum of 656TB in a two-rack (42U) configuration. Capacity blocks are connected to the file servers via 3Gb SAS, and the X9720 includes a SAS switch as well as two ProCurve 10GbE Ethernet switches. Also, file system performance can be scaled by independently adding performance blocks, essentially C-class HP blade servers. The starter configuration includes 3 performance blocks (blades) but up to 8 can be added to one X9720 appliance.
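As a quick sanity check on the scaling described above, the arithmetic below uses only the 82TB capacity block and 656TB maximum figures from the announcement:

```python
# Back-of-the-envelope check on X9720 capacity scaling, using only the 82TB
# capacity block and 656TB maximum quoted above.

CAPACITY_BLOCK_TB = 82
MAX_RAW_TB = 656

max_blocks = MAX_RAW_TB // CAPACITY_BLOCK_TB
print(f"{MAX_RAW_TB}TB / {CAPACITY_BLOCK_TB}TB per block = {max_blocks} capacity blocks "
      "in the full two-rack configuration")
```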

For the X9320 scale out appliance, performance and capacity are fixed in a 12U rack mountable appliance that includes two X9300 gateways and 21.7TB SAS or 48TB SATA raw storage per appliance. The X9320 comes with either GigE or 10GbE attachments for added performance. The 10GbE version supports up to 700MB/s raw potential throughput per gateway (node).

X9000 capabilities

All these systems have separate, distinct internal-like storage devoted to O/S, file server software and presumably metadata services. In the X9300 and X9320 storage, this internal storage is packaged in the X9300 gateway server itself. In the X9720, presumably this internal storage is configured via storage blades in the blade server cabinet which would need to be added with each performance block.

All X9000 storage is now based on the Fusion file system technology acquired by HP from Ibrix, an acquisition which closed this summer. Ibrix’s Fusion file system provided a software-only implementation of a file system with distributed (or segmented) metadata services, which allowed the product to scale out performance and/or capacity independently by adding the appropriate hardware.

HP’s X9000 supports both NFS and CIFS interfaces. Moreover, advanced storage features such as continuous remote file replication, snapshots, high availability (with two or more gateways/performance blocks), and automated policy driven data tiering also come with the X9000 Network Storage system. In addition, file data is automatically re-distributed across all nodes in an X9000 appliance to balance storage performance across nodes. Every X9000 Network Storage system requires a separate management server to manage the X9000 Network Storage nodes, but one server can support the whole 16PB name space.

I like the X9720 and look forward to seeing some performance benchmarks on what it can do. In the past Ibrix never released a SPECsfs(tm) benchmark, presumably because they were a software-only solution. But now that HP has instantiated it with top-end hardware, there seems to be no excuse for not providing benchmark comparisons.

Full disclosure: I have a current contract with another group within HP StorageWorks, not associated with HP X9000 storage.

Repositioning of tape

HP LTO 4 Tape Media
In my past life, I worked for a dominant tape vendor. Over the years, we had heard a number of times that tape was dead. But it never happened. BTW, it’s also not happening today.

Just a couple of weeks ago, I was at SNW and a vendor friend of mine asked if I knew anyone with tape library expertise because they were bidding on more and more tape archive opportunities. Tape seems alive and kicking from what I can see.

However, the fact is that tape use is being repositioned. Tape is no longer the direct target for backups that it once was. Most backup packages nowadays backup to disk and then later, if at all, migrate this data to tape (D2D2T). Tape is being relegated to a third tier of storage, a long-term archive and/or a long term backup repository.

The economics of tape are not hard to understand. You pay for robotics, media and drives. Tape, just like any removable media, requires no additional power once it’s removed from the transport/drive used to write it. Removable media can be transported to an offsite repository or across the continent. There it can await recall with nary an ounce (volt) of power consumed.

Problems with tape

So what’s wrong with tape? Why aren’t more shops using it? Let me count the problems:

  1. Tape, without robotics, requires manual intervention
  2. Tape, because of its transportability, can be lost or stolen, leading to data security breaches
  3. Tape processing, in general, is more error prone than disk. Tape can have media and drive errors which cause data transfer operations to fail
  4. Tape is accessed sequentially; it cannot be randomly accessed (quickly) and only one stream of data can be accepted per drive
  5. Much of a tape volume is wasted, never written space
  6. Tape technology doesn’t stay around forever, eventually causing data obsolescence
  7. Tape media doesn’t last forever, causing media loss and potentially data loss

There are likely some other issues with tape missed here, but these seem the major ones from my perspective.

It’s no surprise that most of these problems are addressed or mitigated in one form or another by the major tape vendors, software suppliers and others interested in continuing tape technology.

Robotics can answer the manual intervention problem, if you can afford it. Tape encryption deals effectively with stolen tapes, but requires key management somewhere. Many applications exist today to help predict when media will go bad or transports need servicing. Tape data is, and always will be, accessed sequentially, but then so is lots of other data in today’s IT shops. Tape transports are most definitely single threaded, but sophisticated applications can intersperse multiple streams of data onto that single tape. Tape volume stacking is old technology, not necessarily easy to deploy outside of some sort of VTL front-end, but it is available. Drive and media technology obsolescence will never go away, but this indicates a healthy tape market place.

Future of tape

Say what you will about Ultrium or the Linear Tape-Open (LTO) technology consortium, made up of HP, IBM, and Quantum as research partners, but it has solidified/consolidated the mid-range tape technology. Is it as advanced as it could be, or pushing to open new markets – probably not. But they are advancing tape technology, providing higher capacity, higher performance and more functionality over recent generations. And they have not stopped: Ultrium’s roadmap shows LTO-6 right after LTO-5, and delivery of LTO-5, at 1.6TB uncompressed capacity per tape, is right around the corner.

Also IBM and Sun continue to advance their own proprietary tape technology. Yes, some groups have moved away from their own tape formats but that’s alright and reflects the repositioning that’s happening in the tape marketplace.

As for the future, I was at an IEEE magnetics meeting a couple of years back and the leader said that tape technology was always a decade behind disk technology. So the disk recording heads/media in use today will likely see some application to tape technology in about 10 years. As such, as long as disk technology advances, tape will come out with similar capabilities sometime later.

Still, it’s somewhat surprising that tape is able to provide so much volumetric density with decade old disk technology, but that’s the way tape works. Packing a ribbon of media around a hub can provide a lot more volumetric storage density than a platter of media using similar recording technology.

In the end, tape has a future to exploit if vendors continue to push its technology. As a long term archive storage, it’s hard to beat its economics. As a backup target it may be less viable. Nonetheless, it still has a significant install base which turns over very slowly, given the sunk costs in media, drives and robotics.

Full disclosure: I have no active contracts with LTO or any of the other tape groups mentioned in this post.

The price of quality

At HPTechDay this week we had a tour of the EVA test lab, in the south building of HP’s Colorado Springs Facility. I was pretty impressed and I have seen more than my fair share of labs in my day.

Tony Green HP's EVA Lab Manager
The fact that they have 1200 servers and 500 EVA arrays was pretty impressive, but they also happen to have about 20PB of storage across those 500 arrays. In my day a couple of dozen arrays and a hundred or so servers seemed to be enough to test a storage subsystem.

Nowadays it seems to have increased by an order of magnitude. Of course they have sold something like 70,000 EVAs over the years and some of these 500 arrays happen to be older subsystems used to validate problems and debug issues for the current field population.

Another picture of the EVA lab with older EVAs

They had some old Compaq equipment there but I seem to have flubbed the picture of that equipment. This one will have to suffice. It seems to have both vertically and horizontally oriented drive shelves. I couldn’t tell you which EVAs these were, but as they were earlier in the tour, I figured they were older equipment. It seemed that as you got farther into the tour you moved closer to the current iterations of EVA, like an archaeological dig in reverse: instead of having the most current layers/levels first, they were last.

I asked Tony how many FC ports he had and he said it was probably easiest to count the switch ports and double them but something in the thousands seemed reasonable.

FC switch rack with just a small selection of switch equipment

There were parts of the lab, deep in its bowels, which were off limits to both cameras and bloggers. But we were talking about some of the remote replication support that EVA had and how they tested this over distance. Tony said they had to ship their reel of 100 miles of FC up north (probably for some other testing), but he said they have a surrogate machine which can be programmed to create the proper FC delay to meet any required distance.

FC delay generator box

The blue box in the adjacent picture seemed to be this magic FC delay inducer box. Had interesting lights on it.

Nigel Poulton of Ruptured Monkeys and Devang Panchigar of StorageNerve Blog were also on the tour taking pictures&video. You can barely make out Devang in the picture next to Nigel. Calvin Zito from HP StorageWorks Blog was also on tour but not in any of my pictures.

Nigel and Devang (not pictured) taking videos on EVA lab tour

Throughout our tour of the lab I can say I only saw one logic analyzer although I am sure there were plenty more in the off limits area.

Lonely logic analyzer in EVA lab
During HPTechDay they hit on the topic of storage-server convergence and the use of commodity, X86 hardware for future storage systems. From the lack of logic analyzers I would have to concur with this analysis.

Nonetheless, I saw some hardware workstations, although this was another lonely workstation surrounded by a sea of EVAs.

Hardware workstation in the EVA lab, covered in parts and HW stuff
Believe it or not I actually saw one stereo microscope but failed to take a picture of it. Yet another indicator of hardware descent and my inadequacies as a photographer.

Here is one picture of an EVA obviously undergoing some error injection test, with drives tagged as removed and being rebuilt or reborn as part of RAID testing.

Drives tagged for removal during EVA test
In my day we would save particularly “squirrelly drives” from the field and use them to verify storage subsystem error handling. I would bet anything these tagged drives had specific error injection points used to validate EVA drive error handling.

I could go on, and I have a couple of more decent lab pictures, but you get the gist of the tour.

For some reason I enjoy lab tours. You can tell a lot about an organization by how their labs look, how they are manned, organized and set up. What HP’s EVA lab tells me is that they spare no expense to ensure their product is literally bulletproof, bug proof, and works every time for their customer base. I must say I was pretty impressed.

At the end of the HPTechDay event, Greg Knieriemen of Storage Monkeys and Stephen Foskett of GestaltIT hosted an InfoSmack podcast to be broadcast next Sunday 10/4/2009. There we talked a little more on commodity hardware versus purpose built storage subsystem hardware; it was a brief but interesting counterpoint to the discussions earlier in the week and the evidence from our portion of the lab tour.

Why SO/CFS, Why Now

Why all the interest in Scale-out/Cluster File Systems (SO/CFS) and why now?

Why now is probably easiest to answer: valuations are down. NetApp is migrating GX to their main platform, IBM continues to breathe life into GPFS, HP buys IBRIX, and now LSI buys ONStor. It seems every day brings some new activity with scale out/cluster file system products. Interest seems to be based on the perception that SO/CFS would make a good storage backbone/infrastructure for Cloud Computing. But this takes some discussion…

What can one do with a SO/CFS?

  • As I see it SO/CFS provides a way to quickly scale out and scale up NAS system performance. This doesn’t mean that file data can be in multiple locations/sites or that files can be supplied across the WAN, but file performance can be scaled independently of file storage.
  • What seems even more appealing is the amount of data/size of the file systems supported by SO/CFS systems. It seems like PBs of storage can be supported and served up as millions of files. Now that sounds like something useful to Cloud environments if one could front end it with some Cloud enabled services.

So why aren’t they taking off? The low valuations signal to me that they aren’t doing well. I think today few end-users need to support millions of files, PBs of data or the performance these products could sustain. Currently, their main market is the high performance computing (HPC) labs, but there are only so many physics/genomics labs out there that need this much data/performance.

That’s where the cloud enters the picture. Cloud’s promise is that it can aggregate everybody’s computing and storage demand into a service offering where 1,000s of users can login from the internet and do their work. With 1,000s of users each with 1,000s of files, we now start to talk in the millions-of-files range.

Ok, so if the cloud market is coming, then maybe SO/CFS has some lasting/broad appeal. One can see preliminary cloud services emerging today, especially in backup services such as Mozy or Norton Online Backup (see Norton Online Backup), but not many cloud services exist today with general purpose/generic capabilities, Amazon notwithstanding. If the Cloud market takes time to develop, then buying into SO/CFS technology while it’s relatively cheap and early in its adoption cycle makes sense.

There are many ways to supply cloud storage. Some companies have developed their own brand new solutions here; EMC/Atmos and DataDirect Network/WOS (see DataDirect Network WOS) seem most prominent. Many others exist, toiling away to address this very same market. Which of these solutions survive/succeed in the Cloud market is an open question that will take years to answer.