I understand the rationale behind EMC’s purchase of Isilon scale out NAS technology for big data applications. More and more data is being created every day and most of that is unstructured. How can one begin to support the multiple PBs of file data coming online in the next couple of years without scale out NAS? Scale out NAS has the advantage that within the same architecture one can scale from TBs to PBs of file storage by just adding storage and/or accessor nodes. Sounds great.
Isilon for backup storage?
But what’s surprising to me is the use of Isilon NL-Series storage in more mundane applications like database backup. A couple of weeks ago I wrote a post on how Oracle RMAN compressed backups don’t dedupe very well. The impetus for that post was that a very large enterprise customer I was talking with had just started deploying Isilon NAS systems in their backup environment to handle non-dedupable data. The customer was backing up PBs of storage, a good portion of which was non-dedupable, and as such, they planned to use Isilon systems to store this data.
I had never seen scale out NAS systems used for backup storage so I was intrigued to find out why. Essentially, this customer was in the throes of replacing tape and, between deduplication appliances and Isilon storage, they believed they had the solutions to eliminate tape forever from their backup systems.
All this raises the question of where EMC puts Isilon – with Celerra and other storage platforms, with Atmos and other cloud services, or with Data Domain and other backup systems? It seems one could almost break out the three Isilon storage systems and split them into these three business groups, but given Isilon’s flexibility it probably belongs in storage platforms.
However, I would think that BRS would have an immediate market requirement for Isilon’s NL-Series storage to complement its other backup systems. I guess we will know shortly where EMC puts it – until then it’s anyone’s guess.
On Wednesday 4 November, HP announced a new network storage system based on the Ibrix Fusion file system called the X9000. Three versions were announced:
X9300 gateway appliance which can be attached to SAN storage (HP EVA, MSA, P4000, or 3rd party SAN storage) and provides scale out file system services
X9320 performance storage appliance which includes a fixed server gateway and storage configuration in one appliance targeted at high performance application environments
X9720 extreme storage appliance using blade servers for file servers and separate storage in one appliance but can be scaled up (with additional servers and storage) as well as out (by adding more X9720 appliances) to target more differentiated application environments
The new X9000 appliances support a global name space of 16PB by adding additional X9000 network storage appliances to a cluster. The X9000 supports a distributed metadata architecture which allows the system to scale performance by adding more storage appliances.
X9000 Network Storage appliances
With the X9300 gateway appliance, storage can be increased by adding more SAN arrays. Presumably, multiple gateways can be configured to share the same SAN storage, creating a highly available file server node. The gateway can be configured to support GigE, 10GbE, and/or QDR (40Gb/s) InfiniBand interfaces for added throughput.
The Extreme appliance (X9720) comes with 82TB in the starting configuration, and storage can be increased in 82TB raw capacity block increments (each capacity block is 7U and half a rack wide, with two 35-drive enclosures plus one 12-drive tray) up to a maximum of 656TB in a two-rack (42U) configuration. Capacity blocks are connected to the file servers via 3Gb/s SAS, and the X9720 includes a SAS switch as well as two ProCurve 10GbE Ethernet switches. Also, file system performance can be scaled by independently adding performance blocks, essentially C-class HP blade servers. The starter configuration includes 3 performance blocks (blades) but up to 8 can be added to one X9720 appliance.
For the X9320 scale out appliance, performance and capacity are fixed in a 12U rack mountable appliance that includes two X9300 gateways and 21.7TB SAS or 48TB SATA raw storage per appliance. The X9320 comes with either GigE or 10GbE attachments for added performance. The 10GbE version supports up to 700MB/s raw potential throughput per gateway (node).
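The capacity block arithmetic above is easy to check. Here is a minimal Python sketch that uses only the figures quoted in this post (82TB per block, two 35-drive enclosures plus a 12-drive tray, 656TB maximum). The constant and function names are mine, made up for illustration, and the drive count assumes I have read the enclosure description correctly.

```python
# Figures come from the post; all names here are hypothetical helpers.
CAPACITY_BLOCK_TB = 82           # raw TB per capacity block
DRIVES_PER_BLOCK = 35 * 2 + 12   # two 35-drive enclosures plus one 12-drive tray
MAX_RAW_TB = 656                 # stated maximum for a two-rack configuration
MAX_BLOCKS = MAX_RAW_TB // CAPACITY_BLOCK_TB


def raw_capacity_tb(blocks: int) -> int:
    """Return raw capacity in TB for a given number of capacity blocks."""
    if not 1 <= blocks <= MAX_BLOCKS:
        raise ValueError(f"blocks must be between 1 and {MAX_BLOCKS}")
    return blocks * CAPACITY_BLOCK_TB


print("drives per capacity block:", DRIVES_PER_BLOCK)
print("capacity blocks at maximum:", MAX_BLOCKS)
for blocks in (1, 4, MAX_BLOCKS):
    print(blocks, "block(s):", raw_capacity_tb(blocks), "TB raw")
```

If both the 82TB figure and the drive count describe the same capacity block, that works out to 1TB raw per drive, and the 656TB maximum corresponds to eight capacity blocks.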
All these systems have separate, distinct internal-like storage devoted to O/S, file server software and presumably metadata services. In the X9300 and X9320 storage, this internal storage is packaged in the X9300 gateway server itself. In the X9720, presumably this internal storage is configured via storage blades in the blade server cabinet which would need to be added with each performance block.
All X9000 storage is now based on the Fusion file system technology acquired by HP from Ibrix, an acquisition which closed this summer. Ibrix’s Fusion file system provided a software only implementation of a distributed (or segmented) metadata serviced file system which allowed the product to scale out performance and/or capacity independently by adding appropriate hardware.
HP’s X9000 supports both NFS and CIFS interfaces. Moreover, advanced storage features such as continuous remote file replication, snapshot, high availability (with two or more gateways/performance blocks), and automated policy driven data tiering also come with the X9000 Network Storage system. In addition, file data is automatically re-distributed across all nodes in an X9000 appliance to balance storage performance across nodes. Every X9000 Network Storage system requires a separate management server to manage the X9000 Network Storage nodes, but one server can support the whole 16PB name space.
I like the X9720 and look forward to seeing some performance benchmarks on what it can do. In the past Ibrix never released a SPECsfs(tm) benchmark, presumably because they were a software only solution. But now that HP has instantiated it with top-end hardware there seems to be no excuse for not providing benchmark comparisons.
Full disclosure: I have a current contract with another group within HP StorageWorks, not associated with HP X9000 storage.
Earlier this week Symantec GA’ed their Veritas FileStore software. This software was an outgrowth of earlier Symantec Veritas Cluster File System and Storage Foundation software which were combined with new frontend software to create scalable NAS storage.
FileStore is another scale-out, cluster file system (SO/CFS) implemented as a NAS head via software. The software runs on a hardened Linux OS and can run on any commodity x86 hardware. It can be configured with up to 16 nodes. Also, it currently supports any storage supported by Veritas Storage Foundation, which includes FC, iSCSI, and JBODs. Symantec claims FileStore has the broadest storage hardware compatibility list in the industry for a NAS head.
As a NAS head, FileStore supports NFS, CIFS, HTTP, and FTP file services and can be configured to support anywhere from under a TB to over 2PB of file storage. Currently FileStore can support up to 200M files per file system and up to 100K file systems.
FileStore nodes work in an Active-Active configuration. This means any node can fail and the other, active nodes will take over providing the failed node’s file services. Theoretically this means that in a 16 node system, 15 nodes could fail and the lone remaining node could continue to service file requests (of course performance would suffer considerably).
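To make the 15-of-16 scenario concrete, here is a toy Python model of Active-Active takeover. It is not how FileStore actually implements failover, which Symantec did not describe; the node names, share names, and the least-loaded placement rule are all my assumptions for illustration.

```python
def redistribute(cluster: dict[str, list[str]], failed: str) -> dict[str, list[str]]:
    """Return a new cluster map with the failed node's file services
    handed to the surviving nodes (hypothetical least-loaded policy)."""
    survivors = sorted(node for node in cluster if node != failed)
    if not survivors:
        raise RuntimeError("no surviving nodes left to take over")
    result = {node: list(cluster[node]) for node in survivors}
    for service in cluster[failed]:
        # Give each orphaned service to whichever survivor serves the fewest.
        target = min(survivors, key=lambda node: len(result[node]))
        result[target].append(service)
    return result


# Sixteen nodes, each initially serving one (made-up) file share.
cluster = {f"node{i:02d}": [f"share{i:02d}"] for i in range(1, 17)}

# Fail nodes 1 through 15, one after another.
for i in range(1, 16):
    cluster = redistribute(cluster, f"node{i:02d}")

for node, services in cluster.items():
    print(node, "now serves", len(services), "shares")
```

In this toy model the lone survivor ends up serving all 16 shares, which is exactly why performance would suffer considerably.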
As part of the cluster file system, FileStore supports quick failover of active nodes. This can be accomplished in under 20 seconds. In addition, FileStore supports asynchronous replication to other FileStore clusters to support DR and BC in the event of a data center outage.
One of the things that FileStore brings to the table is that it’s running standard Linux O/S services. This means other Symantec functionality can also be hosted on FileStore nodes. The first Symantec service to be co-hosted with FileStore functionality is NetBackup Advanced Client services. Such a service can have the FileStore node act as a media server for its own backup, considerably cutting the network traffic required to do a backup.
FileStore also supports storage tiering whereby files can be demoted and promoted between storage tiers in the multi-volume file system. Also, Symantec EndPoint Protection can be hosted on a FileStore node, providing anti-virus protection completely onboard. Other Symantec capabilities will soon follow to add to the capabilities already available.
FileStore’s NFS performance
Regarding performance, Symantec has submitted a 12 node FileStore system for the SPECsfs2008 NFS performance benchmark. I looked today to see if it was published yet and it’s not available, but they claim to currently be the top performer for SPECsfs2008 NFS operations. I asked about CIFS and they said they had yet to submit one. Also they didn’t mention what the backend storage looked like for the benchmark, but one can assume it had lots of drives (look to the SPECsfs2008 report whenever it’s published to find out).
In their presentation they showed a chart depicting FileStore performance scalability. According to this chart, at 16 nodes, the actual NFS Ops performance was 93% of theoretical NFS Ops performance. In my view, scalability is great, but often as the number of nodes increases you reach a point of diminishing marginal utility and the net performance improvement from each added node decreases. The fact that at 16 nodes they were able to hit 93% of a linear extrapolation of NFS ops performance from 2 to 8 nodes is pretty impressive. (I asked to show the chart but hadn’t heard back by post time.)
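For readers who want to see how a figure like 93% is derived, here is a small Python sketch of the calculation as I understand it: draw a line through the 2-node and 8-node results, extend it to 16 nodes, and divide the measured result by that extrapolated value. The NFS ops numbers below are invented purely so the ratio comes out at 93%; Symantec’s actual figures were not available to me.

```python
def linear_extrapolation(n1: int, ops1: float, n2: int, ops2: float, n: int) -> float:
    """Extend the straight line through (n1, ops1) and (n2, ops2) out to n nodes."""
    slope = (ops2 - ops1) / (n2 - n1)
    return ops1 + slope * (n - n1)


def scaling_efficiency(actual_ops: float, theoretical_ops: float) -> float:
    """Fraction of the extrapolated (theoretical) performance actually achieved."""
    return actual_ops / theoretical_ops


# Hypothetical measurements, NOT taken from Symantec's chart.
ops_at_2_nodes = 20_000
ops_at_8_nodes = 80_000
actual_ops_at_16_nodes = 148_800

theoretical = linear_extrapolation(2, ops_at_2_nodes, 8, ops_at_8_nodes, 16)
efficiency = scaling_efficiency(actual_ops_at_16_nodes, theoretical)
print(f"theoretical NFS ops at 16 nodes: {theoretical:,.0f}")
print(f"scaling efficiency: {efficiency:.0%}")
```

The point of the exercise is that the baseline is a straight-line projection, so any contention or coordination overhead shows up as a ratio below 100%.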
Pricing and market space
At the low end, FileStore is meant to compete with Windows Storage Server and would seem to provide better performance and availability versus Windows. At the high end, I am not sure but the competition would be with HP/PolyServe and standalone NAS heads from EMC and NetApp/IBM and others. List pricing is about US$7K/node and that top performing SPECsfs2008 12-node system would set you back about $84K for the software alone (please note that list pricing <> street pricing). You would need to add node hardware and the storage hardware to provide a true apples-to-apples pricing comparison with other NAS storage.
As for current customers, they range from large E-retailers and SAAS providers (Symantec SAAS offering) at the high end (>1PB) to universities and hospitals at the low end (<10TB). FileStore, with its inherent scalability and ability to host storage applications from Symantec on the storage nodes, can offer a viable solution to many hard file system problems.
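The software pricing math is simple enough to sketch in a few lines of Python. The per-node list price and the 16-node limit come from this post; the function names and the hardware cost parameters are hypothetical placeholders, since I have no hardware pricing to offer.

```python
LIST_PRICE_PER_NODE_USD = 7_000   # approximate list price quoted above
MAX_NODES = 16                    # maximum cluster size quoted above


def software_list_cost(nodes: int) -> int:
    """List price of the FileStore software alone for a cluster of this size."""
    if not 1 <= nodes <= MAX_NODES:
        raise ValueError(f"nodes must be between 1 and {MAX_NODES}")
    return nodes * LIST_PRICE_PER_NODE_USD


def total_list_cost(nodes: int, node_hw_usd: int, storage_hw_usd: int) -> int:
    """Software plus hardware; both hardware figures must be supplied by the
    reader (hypothetical parameters, no real prices are assumed here)."""
    return software_list_cost(nodes) + nodes * node_hw_usd + storage_hw_usd


print("12-node software list cost: $", software_list_cost(12))
```

For the 12-node benchmark configuration this gives the $84K quoted above; a fair comparison with other NAS storage would use something like total_list_cost with real node and storage hardware prices plugged in.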
We have discussed scale-out and cluster file systems (SO/CFS) in a prior post (Why SO/CFS, Why Now) so I won’t elaborate on why they are so popular today. But, suffice it to say Cloud and SAAS will need SO/CFS to be viable solutions and everybody is responding to supply that market as it emerges.
Full disclosure: I currently have no active or pending contracts with Symantec.