CTERA, Cloud NAS on steroids

We attended SFD22 last week, where CTERA presented their enterprise class cloud NAS solution (for more information, please see the SFD22 videos of their session).

We’ve heard a lot about cloud NAS systems lately (see, or listen to, our GreyBeards on Storage podcast with LucidLink from last month). Cloud NAS systems provide a NAS front end (SMB, NFS and S3 object access) that uses cloud or on-prem object storage to hold customer data, which is accessed through (virtual or hardware) caching appliances.

These differ from file synch and share in that Cloud NAS systems

  • Don’t copy lots or all customer data to user devices. The only data that resides locally is metadata and the user’s or site’s working set (of files).
  • Do cache working set data locally to provide faster access
  • Do provide NFS, SMB and S3 access along with user drive, mobile app, API and web based access to customer data.
  • Do provide multiple options to host user data in multiple clouds or on prem
  • Do allow for some levels of collaboration on the same files

Although admittedly, the boundary lines between synch and share and Cloud NAS are starting to blur.
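To make the working set idea concrete, here is a minimal sketch of an edge cache that keeps recently read files locally and evicts the least recently used ones when it runs out of room. It is only an illustration: the WorkingSetCache class, the LRU eviction policy and the dictionary standing in for object storage are all assumptions of ours, not a description of how any vendor's caching appliance actually works.

```python
from collections import OrderedDict


class WorkingSetCache:
    """Toy edge cache: keeps recently used files locally, fetches the rest."""

    def __init__(self, capacity_bytes, fetch_from_object_store):
        self.capacity_bytes = capacity_bytes
        self.fetch = fetch_from_object_store  # hypothetical callable: path -> bytes
        self.files = OrderedDict()            # path -> bytes, least recently used first
        self.used_bytes = 0
        self.hits = 0
        self.misses = 0

    def read(self, path):
        if path in self.files:
            self.hits += 1
            self.files.move_to_end(path)      # mark as most recently used
            return self.files[path]
        self.misses += 1
        data = self.fetch(path)               # slow path: go to the object store
        self._admit(path, data)
        return data

    def _admit(self, path, data):
        if len(data) > self.capacity_bytes:
            return                            # too big to cache; serve it uncached
        # Evict least recently used files until the new one fits.
        while self.used_bytes + len(data) > self.capacity_bytes:
            _, evicted = self.files.popitem(last=False)
            self.used_bytes -= len(evicted)
        self.files[path] = data
        self.used_bytes += len(data)


# A dictionary stands in for cloud or on-prem object storage.
object_store = {
    "/proj/a.mov": b"a" * 6,
    "/proj/b.mov": b"b" * 6,
    "/proj/c.txt": b"c" * 2,
}
cache = WorkingSetCache(capacity_bytes=10,
                        fetch_from_object_store=object_store.__getitem__)

for p in ["/proj/a.mov", "/proj/c.txt", "/proj/a.mov", "/proj/b.mov", "/proj/c.txt"]:
    cache.read(p)
print(cache.hits, cache.misses, sorted(cache.files))
```

The point of the sketch is that only the files actually being touched occupy local capacity, while everything else stays in the object store until someone asks for it.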

CTERA is a software defined solution, but they also offer a whole gaggle of hardware options for edge filers, ranging from a smartphone sized, 1TB flash cache for home office users to a multi-RU media edge server with 128TB of hybrid disk-SSD storage for 8K video editing.

They have HC100 edge filers, X-Series HCI edge servers, branch in a box, edge and Media edge filers. These latter systems have specialized support for MacOS and Adobe suite systems. For their HCI edge systems they support Nutanix, SimpliVity, HyperFlex and VxRail systems.

CTERA edge filers/servers can be clustered together to provide higher performance and HA. This way customers can scale out their filers to supply whatever levels of IO performance they need. CTERA also allows customers to segregate file workloads (or directories) so that they are serviced by specific edge filer devices, which minimizes noisy neighbor performance problems.

CTERA supports a number of ways to access cloud NAS data:

  • Through (virtual or real) edge filers which present NFS, SMB or S3 access protocols
  • Through the use of CTERA Drive on MacOS or Windows desktop/laptop devices
  • Through a mobile device app for iOS or Android
  • Through their web portal
  • Through their API

CTERA uses an HA, dual redundant Portal service, a cloud (or on prem) service that provides the CTERA metadata database, edge filer/server management and other services, such as web access, cloud drive end points, mobile apps, API, etc.

CTERA uses S3 or Azure compatible object storage for its backend, source of truth repository to hold customer file data. CTERA currently supports 36 on-prem and in cloud object storage services. Customers can have their data in multiple object storage repositories. Customer files are mapped one to one to objects.

CTERA offers global dedupe, virus scanning, policy based scheduled snapshots and end to end encryption of customer data. Encryption keys can be held in the Portals or in a KMIP service that’s connected to the Portals.

CTERA has impressive data security support. Besides the end-to-end data encryption mentioned above, they also support dark sites and zero-trust authentication, and are DISA (Defense Information Systems Agency) certified.

Customer data can also be pinned to edge filers. Moreover, data in specific customer directories (or sub-directories) can be hosted on specific buckets so that data can:

  • Stay within specified geographies,
  • Support multi-cloud services to eliminate vendor lock-in
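One way to picture directory-to-bucket placement is as a lookup that walks up from a file's directory until it finds a configured rule. The sketch below assumes a made-up PLACEMENT table, store names and bucket names; none of them come from CTERA, and the deepest-directory-wins rule is our own guess at a sensible policy.

```python
from pathlib import PurePosixPath

# Hypothetical placement policy: directory -> (object store, bucket).
PLACEMENT = {
    "/": ("aws-us-east", "corp-default"),
    "/emea": ("azure-west-europe", "corp-emea"),
    "/emea/germany": ("onprem-frankfurt", "corp-de"),
}


def bucket_for(path):
    """Return the placement of the deepest configured directory containing path."""
    # parents runs from the file's own directory up to the root,
    # so the first match is the most specific rule.
    for ancestor in PurePosixPath(path).parents:
        key = str(ancestor)
        if key in PLACEMENT:
            return PLACEMENT[key]
    raise LookupError(f"no placement rule covers {path}")


for f in ["/emea/germany/hr/payroll.xlsx", "/emea/france/plan.doc", "/apac/notes.txt"]:
    print(f, "->", bucket_for(f))
```

With a table like this, keeping German data on an on-prem store in Frankfurt and everything else in a public cloud is just a matter of which rules are configured.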

CTERA file locking is what I would call hybrid. They offer strict consistency for file locking within sites but eventual consistency for file locking across sites. There are performance tradeoffs for strict consistency, so by using a hybrid approach, they offer most of what the world needs from file locking without incurring the performance overhead of strict consistency across sites. For another way to support hybrid file locking consistency, check out LucidLink’s approach (see the GreyBeards podcast with LucidLink above).

At the end of their session Aron Brand got up and took us into a deep dive on select portions of their system software. One thing I noticed is that the portal is NOT in the data path. Once the edge filers want to access a file, the Portal provides the credential verification and points the filer(s) to the appropriate object and the filers take off from there.
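Here is a minimal sketch of what "not in the data path" means in practice: the portal only answers "are you allowed, and where is the object?", while the file bytes flow straight from object storage to the edge filer. The ObjectStore, Portal and EdgeFiler classes, the token check and all the names below are hypothetical stand-ins of ours, not CTERA's actual interfaces.

```python
class ObjectStore:
    """Stand-in for an S3 or Azure compatible object store."""

    def __init__(self, objects):
        self.objects = objects          # object key -> bytes
        self.bytes_served = 0

    def get(self, key):
        data = self.objects[key]
        self.bytes_served += len(data)
        return data


class Portal:
    """Control plane only: checks credentials and maps a file to its object."""

    def __init__(self, file_to_object, valid_tokens):
        self.file_to_object = file_to_object   # file path -> (store name, object key)
        self.valid_tokens = valid_tokens

    def locate(self, token, path):
        if token not in self.valid_tokens:
            raise PermissionError("edge filer not authorized")
        return self.file_to_object[path]       # returns a location, never file data


class EdgeFiler:
    def __init__(self, token, portal, stores):
        self.token, self.portal, self.stores = token, portal, stores

    def read(self, path):
        store_name, key = self.portal.locate(self.token, path)  # control path
        return self.stores[store_name].get(key)                 # data path


stores = {"s3-east": ObjectStore({"obj-001": b"8K video frame data"})}
portal = Portal({"/media/clip.mov": ("s3-east", "obj-001")}, valid_tokens={"filer-42"})
filer = EdgeFiler("filer-42", portal, stores)

data = filer.read("/media/clip.mov")
print(len(data), stores["s3-east"].bytes_served)
```

Keeping the portal out of the data path means its load grows with the number of lookups rather than with the number of bytes moved, which matters a lot when the files are 8K video.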

CTERA’s customer list is very impressive. It seems that many (50 of WW F500) large enterprises are customers of theirs. Some of the more prominent include GE, McDonalds, US Navy, and the US Air Force.

Oh, and besides supporting potentially 1000s of sites and 100K users in the same name space, they also have intrinsic support for multi-tenancy and offer cloud data migration services. For example, one can use Portal services to migrate cloud data from one cloud object storage provider to another.

They also mentioned they are working on supplying K8S container access to CTERA’s global file system data.

There’s a lot to like in CTERA. We hadn’t heard of them before but they seem focused on enterprises with lots of sites, boatloads of users and massive amounts of data. It seems like our kind of storage system.


SMB2.2 (CIFS) screams over InfiniBand

Microsoft MVP Summit 2010 by David McCarter (cc) (From Flickr)

I missed the MVP summit last month in Redmond, but I heard there was some more discussion of the Server Message Block v2.2 (SMB2.2, previously known as CIFS) coming in Windows Server (R) 8.

The big news is SMB2.2 now supports RDMA and can use InfiniBand (announced at SNIA Developer Conference last fall). It also supports RDMA over Ethernet via RoCE (see my Intel buys Qlogic’s Infiniband post) and iWARP.

SMB2.2 over InfiniBand performance

As reported last fall at the SNIA Developer Conference, SMB2.2 using RDMA over InfiniBand reached over 3.7GB/sec and 160K IOPS, with no server configuration changes, using two QDR cards (the IOPS are from an SQLIO run using 8KB IOs, not SPECsfs2008). The pre-beta SMB2.2 code was running on commodity server hardware using 32Gbps InfiniBand links. I couldn’t find any performance numbers with RoCE or iWARP, but I would suspect that running on 10GbE these would be much slower than InfiniBand.
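A quick back-of-the-envelope check puts these numbers in perspective. The sketch below uses only the figures quoted above, plus a few assumptions of ours: that each of the two QDR cards contributes one 32Gbps link, that GB means 10^9 bytes, and that an 8KB IO is 8 x 1024 bytes.

```python
GBPS_PER_LINK = 32          # data rate of one QDR link, as quoted above
LINKS = 2                   # "two QDR cards"; assumes one link per card
MEASURED_GB_PER_SEC = 3.7   # throughput quoted above
IOPS = 160_000              # SQLIO result quoted above
IO_SIZE_BYTES = 8 * 1024    # 8KB IOs; assumes KB = 1024 bytes

# Raw ceiling of the two links, converted from gigabits to gigabytes.
link_ceiling_gb_per_sec = LINKS * GBPS_PER_LINK / 8

# Fraction of that ceiling the measured throughput represents.
utilization = MEASURED_GB_PER_SEC / link_ceiling_gb_per_sec

# Data rate implied by the small-IO run (GB = 10**9 bytes).
small_io_gb_per_sec = IOPS * IO_SIZE_BYTES / 1e9

print(f"link ceiling: {link_ceiling_gb_per_sec:.1f} GB/sec")
print(f"utilization:  {utilization:.0%}")
print(f"8KB IO run:   {small_io_gb_per_sec:.2f} GB/sec")
```

Under those assumptions, 3.7GB/sec is a bit under half of what two 32Gbps links could carry in theory, and the 160K IOPS run moves roughly 1.3GB/sec, so the two results clearly come from different workloads rather than one run.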

Hints are that performance gets even better with the released versions of the code coming out in Windows Server 8.

SMB2.2 gets even faster than NFS

We have noted in the past that SMB (CIFS) on average, shows better throughput (IOPS) performance than NFS in SPECsfs2008 results (for example, see our latest Chart-of-the-Month post on SPECsfs results). However, those results were all at best SMB2 or even SMB1 results, and commonly using Ethernet links.

NFS already supports InfiniBand but I am unsure whether it makes use of RDMA. Nevertheless, the significant speed up shown here for SMB2.2 will potentially take SPECsfs2008 SMB2.2 performance up to a whole new level.

Why InfiniBand?

As you may recall, InfiniBand is primarily deployed as a server to server interface and has been used extensively in the past for high performance computing environments. Nowadays, however, we find storage clusters, such as EMC Isilon, HP X9000 (Ibrix), IBM XIV and others using InfiniBand for their inter-node communications. The use of InfiniBand in these storage clusters is probably due primarily to its superior latency over Ethernet.

But InfiniBand has another advantage: fast data throughput. When using RDMA, it can transfer data faster than almost any other networking protocol alive today. SMB2.2 takes advantage of this throughput boost by using RDMA only for large blocks of data and avoiding it for smaller blocks of data. Not sure what the cutoff is, but this would certainly help in large SQL database queries, disk copies, and any other large file data transfer operations.
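The large-versus-small split can be sketched as a simple size test. To be clear, the 8KB cutoff below is a placeholder we made up for illustration (the real cutoff is not known to us), and choose_transfer is a hypothetical helper, not part of any SMB implementation.

```python
RDMA_CUTOFF_BYTES = 8 * 1024   # hypothetical value; the real cutoff is unknown to us


def choose_transfer(payload_bytes, cutoff=RDMA_CUTOFF_BYTES):
    """Pick how to move a payload: inline in the message, or via an RDMA transfer."""
    if payload_bytes < 0:
        raise ValueError("payload size cannot be negative")
    # RDMA has a per-transfer setup cost, so it only pays off for larger payloads.
    return "rdma" if payload_bytes >= cutoff else "inline"


requests = [512, 4096, 65536, 1048576]
plan = {size: choose_transfer(size) for size in requests}
for size, how in plan.items():
    print(f"{size:>8} bytes -> {how}")
```

The design idea is that small requests avoid the overhead of setting up a remote memory transfer, while big database reads and file copies get the full benefit of RDMA.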

Of course, with 56Gbps FDR InfiniBand available today and faster transfer rates coming (see IBTA roadmap), there appears to be every reason to believe that superior throughput performance will continue, at least for the foreseeable future. Better latency is also certain to be retained.

Now that Intel’s pushing it, Mellanox is continuing to push InfiniBand and storage clusters are using it more frequently, we may start to see more storage protocols supporting it.

We thought that FC only had Ethernet to worry about. With SMB2.2 moving to InfiniBand and NFS already supporting it, can a fully functional FCoIB be far behind?


New file system capacity tool – Microsoft’s FSCT

Filing System by BinaryApe (cc) (from Flickr)

Jose Barreto blogged about a recent report Microsoft did on File Server Capacity Tool (FSCT) results (blog here, report here). As you may know, FSCT is a free tool, released in September of 2009 and available from Microsoft, that verifies an SMB (CIFS) and/or SMB2 storage server configuration.

The FSCT can be used by anyone to verify that an SMB/SMB2 file server configuration can adequately support a particular number of users, doing typical Microsoft Office/Windows Explorer work with home folders.

Jetstress for SMB file systems?

FSCT reminds me a little of Microsoft’s Jetstress tool used in the Exchange Solution Review Program (ESRP) which I have discussed extensively in prior blog posts (search my blog) and other reports (search my website).  Essentially, FSCT has a simulated “home folder” workload which can be dialed up or down by the number of users selected.  As such, it can be used to validate any NAS system which supports SMB/SMB2 or CIFS protocol.

Both Jetstress and FSCT are capacity verification tools.  However, I look at all such tools as a way of measuring system performance for a solution environment and FSCT is no exception.

Microsoft FSCT results

In Jose’s post on the report he discusses performance for five different storage server configurations running anywhere from 4500 to 23,000 active home directory users, employing white box servers running Windows (Storage) Server 2008 and 2008 R2 with various server hardware and SAS disk configurations.

Network throughput ranged from 114 to 650 MB/sec. Certainly respectable numbers, and somewhat orthogonal to the NFS and CIFS throughput operations/second reported by SPECsfs2008. It is unclear whether FSCT reports activity in operations/second.

Microsoft’s FSCT reports did not specifically state what the throughput was other than at the scenario level. I assume the network throughput that Jose reported was extracted concurrently with the test run from something akin to Perfmon. FSCT seems to only report performance or throughput as the number of home folder scenarios sustainable per second and the number of users. Perhaps there is an easy way to convert user scenarios to network throughput?

While the results for the file server runs look interesting, I always want more. For whatever reason, I have lately become enamored with ESRP’s log playback results (see my latest ESRP blog post) and it’s not clear whether FSCT reports anything similar to this. Something like file server simulated backup performance would suffice from my perspective.


Despite that, another performance tool is always of interest and I am sure my readers will want to take a look as well.  The current FSCT tester can be downloaded here.

Not sure whether Microsoft will be posting vendor results for FSCT similar to what they do for Jetstress via ESRP but that would be a great next step.  Getting the vendors onboard is another problem entirely.  SPECsfs2008 took almost a year to get the first 12 (NFS) submissions and today, almost 9 months later there are still only ~40 NFS and ~20 CIFS submissions.


Latest SPECsfs2008 CIFS performance – chart of the month

Above we reproduce a chart from our latest newsletter StorInt™ Dispatch on SPECsfs(R) 2008 benchmark results. This chart shows the top 10 CIFS throughput benchmark results as of the end of last year. As observed in the chart, Apple’s Xserve running Snow Leopard took top performance with over 40K CIFS throughput operations per second. My problem with this chart is that there are no enterprise class systems represented in the top 10 or, for that matter (not shown in the above), in any CIFS result.

Now some would say it’s still early yet in the life of the 2008 benchmark but it has been out now for 18 months and still has not a single enterprise class system submission reported.  Possibly, CIFS is not considered an enterprise class protocol but I can’t believe that given the proliferation of Windows.  So what’s the problem?

I have to believe it’s part tradition, part not wanting to look bad, and part just lack of awareness on the part of CIFS users.

  • Traditionally, NFS benchmarks were supplied by SPECsfs and CIFS benchmarks were supplied elsewhere, i.e., NetBench. However, there never was a central repository for NetBench results, so comparing system performance was cumbersome at best. I believe that’s one reason for SPECsfs’s CIFS benchmark. Seeing the lack of a central repository for a popular protocol, SPECsfs created their own CIFS benchmark.
  • Performance on system benchmarks is always a mixed bag. No one wants to look bad and any top performing result is temporary until the next vendor comes along. So most vendors won’t release a benchmark result unless it shows well for them. Not clear if Apple’s 40K CIFS ops is a hard number to beat, but it’s been up there for quite a while now, and has to tell us something.
  • CIFS users seem to be aware of and understand NetBench but don’t have similar awareness of the SPECsfs CIFS benchmark yet. So, given today’s economic climate, any vendor wanting to impress CIFS customers would probably choose to ignore SPECsfs and spend their $s on NetBench. The fact that comparing results was nigh impossible could be considered an advantage for many vendors.

So SPECsfs CIFS just keeps going on. One way to change this dynamic is to raise awareness. So as more IT staff/consultants/vendors discuss SPECsfs CIFS results, its awareness will increase. I realize some of my analysis on CIFS and NFS performance results doesn’t always agree with the SPECsfs party line, but we all agree that this benchmark needs wider adoption. Anything that can be done to facilitate that deserves my (and their) support.

So for all my friends out there who are storage admins, CIOs and other influencers of NAS system purchases, you need to start asking about SPECsfs CIFS benchmark results. All my peers out there in the consultant community, get on the bandwagon. As for my friends in the vendor community, SPECsfs CIFS benchmark results should be part of any new product introduction. Whether you want to release results is, and always will be, a marketing question, but you all should be willing to spend the time and effort to see how well new systems perform on this and other benchmarks.

Now if I could just get somebody to define an iSCSI benchmark, …

Our full report on the latest SPECsfs 2008 results, including both NFS and CIFS performance, will be up on our website later this month. However, you can get this information now and subscribe to future newsletters to receive the full report even earlier. Just email us at SubscribeNews@SilvertonConsulting.com.