Object store and hybrid clouds at Cloudian

IMG_4364Out of Japan comes another object storage provider called Cloudian.  We talked with Cloudian at Storage Field Day 7 (SFD7) last month in San Jose (see the videos of their presentations here).

Cloudian history

Cloudian has been out on the market since March of 2011 but we haven’t heard much about them, probably because their focus has been East Asia.  The same day that the  Tōhoku Earthquake and Tsunami hit the company announced Cloudian, an Amazon S3 Compliant Multi-Tenant Cloud Storage solution.

Their timing couldn’t have been better. Japanese IT organizations were beating down their door over the next two years for a useable and (earthquake and tsunami) resilient storage solution.

Cloudian spent the next 2 years, hardening their object storage system, the HyperStore, and now they are ready to take on the rest of the world.

Currently Cloudian has about 20PB of storage under management and are shipping a HyperStore Appliance or a software only distribution of their solution. Cloudian’s solutions support S3 and NFS access protocols.

Their solution uses Cassandra, a highly scaleable, distributed NoSQL database which came out of FaceBook for their meta-data database. This provides a scaleable, non-sharable meta-data data base for object meta-data repository and lookup.

Cloudian creates virtual storage pools on backend storage which can be optimized for small objects, replication or erasure coding and can include automatic tiering to any Amazon S3 and Glacier compatible cloud storage. I would guess this is how they qualify for Hybrid Cloud status.

The HyperStore appliance

Cloudian creates a HyperStore P2P ring structure. Each appliance has Cloudian management console services as well as the HyperStore engine which supports three different data stores: Cassandra, Replicas, and Erasure coding. Unlike Scality, it appears as if one HyperStore Ring must exist in a region. But it can be split across data centers. Unclear what their definition of a “region” is.

HyperStore hardware come in entry level (HSA-2024: 24TB/1U), capacity optimized (HSA-2048: 48TB/1U), performance optimized (HSA-2060: all flash, 60TB/2U

Replication with Dynamic Consistency

The other thing that Cloudian supports is different levels of consistency for replicated data. Most object stores support eventual consistency (see Eventual Data Consistency and Cloud Storage post).  HyperStore supports 3 (well maybe 5) different levels of consistency:

  1. One – object written to one replica and committed there before responding to client
  2. Quorum – object written to N/2+1 replicas before responding to client
    1. Local Quorum – replicas are written to N/2+1 nodes in same data center  before responding to client
    2. Each Quorum – replicas are written to N/2+1 nodes in each data center before responding to client.
  3. All – all replicas must have received and committed the object write before responding to client

There are corresponding read consistency levels as well. The objects writes have a “coordinator” node which handles this consistency. The implication is that consistency could be established on an object basis. Unclear to me whether Read and Write dynamic consistency can be different?

Apparently small objects are also stored in the  Cassandra datastore.  That way HyperStore optimizes for object size. Also, HyperStore nodes can be added to a ring and the system will auto balance the data across the old and new nodes automatically.

Cloudian also support object versioning, ACL, and QoS services as well.

~~~

I was a bit surprised by Cloudian. I thought I knew all the object storage solutions out on the market. But then again they made their major strides in Asia and as an on-premises Amazon S3 solution, rather than a generic object store.

For more on Cloudian from SFD7 see:

Cloudian – Storage Field Day 7 Preview by @VirtualizedGeek (Keith Townsend)

Transporter, a private Dropbox in a tower

Move over DropboxBox and all you synch&share wannabees, there’s a new synch and share in town.

At SFD7 last month, we were visiting with Connected Data where CEO, Geoff Barrell was telling us all about what was wrong with today’s cloud storage solutions. In front of all the participants was this strange, blue glowing device. As it turns out, Connected Data’s main product is the File Transporter, which is a private file synch and share solution.

All the participants were given a new, 1TB Transporter system to take home. It was an interesting sight to see a dozen of these Transporter towers sitting in front of all the bloggers.

I was quickly, established a new account, installed the software, and activated the client service. I must admit, I took it upon myself to “claim” just about all of the Transporter towers as the other bloggers were still paying attention to the presentation.  Sigh, they later made me give back (unclaim) all but mine, but for a minute there I had about 10TB of synch and share space at my disposal.

Transporters rule

transporterB2So what is it. The Transporter is both a device and an Internet service, where you own the storage and networking hardware.

The home-office version comes as a 1 or 2TB 2.5” hard drive, in a tower configuration that plugs into a base module. The base module runs a secured version of Linux and their synch and share control software.

As tower power on, it connects to the Internet and invokes the Transporter control service. This service identifies the node, who owns it, and provides access to the storage on the Transporter to all desktops, laptops, and mobile applications that have access to it.

At initiation of the client service on a desktop/laptop it creates (by default) a new Transporter directory (folder). Files that are placed in this directory are automatically synched to the Transporter tower and then synchronized to any and all online client devices that have claimed the tower.

Apparently you can have multiple towers that are claimed to the same account. I personally tested up to 10 ;/ and it didn’t appear as if there was any substantive limit beyond that but I’m sure there’s some maximum count somewhere.

A couple of nice things about the tower. It’s your’s so you can move it to any location you want. That means, you could take it with you to your hotel or other remote offices and have a local synch point.

Also, initial synchronization can take place over your local network so it can occur as fast as your LAN can handle it. I remember the first time I up-synched 40GB to DropBox, it seemed to take weeks to complete and then took less time to down-synch for my laptop but still days of time. With the tower on my local network, I can synch my data much faster and then take the tower with me to my other office location and have a local synch datastore. (I may have to start taking mine to conferences. Howard (@deepstorage.net, co-host on our  GreyBeards on Storage podcast) had his operating in all the subsequent SFD7 sessions.

The Transporter also allows sharing of data. Steve immediately started sharing all the presentations on his Transporter service so the bloggers could access the data in real time.

They call the Transporter a private cloud but in my view, it’s more a private synch and share service.

Transporter heritage

The Transporter people were all familiar to the SFD crowd as they were formerly with  Drobo which was at a previous SFD sessions (see SFD1). And like Drobo, you can install any 2.5″ disk drive in your Transporter and it will work.

There’s workgroup and business class versions of the Transporter storage system. The workgroup versions are desktop configurations (looks very much like a Drobo box) that support up to 8TB or 12TB supporting 15 or 30 users respectively.  The also have two business class, rack mounted appliances that have up to 12TB or 24TB each and support 75 or 150 users each. The business class solution has onboard SSDs for meta-data acceleration. Similar to the Transporter tower, the workgroup and business class appliances are bring your own disk drives.

Connected Data’s presentation

transporterA1Geoff’s discussion (see SFD7 video) was a tour of the cloud storage business model. His view was that most of these companies are losing money. In fact, even Amazon S3/Glacier appears to be bleeding money, although this may not stop Amazon. Of course, DropBox and other synch and share services all depend on cloud storage for their datastores. So, the lack of a viable, profitable business model threatens all of these services in the long run.

But the business model is different when a customer owns the storage. Here the customer owns the actual storage cost. The only thing that Connected Data provides is the client software and the internet service that runs it. Pricing for the 1TB and 2TB transporters with disk drives are $150 and $240.

Having a Transporter

One thing I don’t like is the lack of data-at-rest encryption. They use TLS for data transfers across your LAN and the Internet. But the nice thing about having possession of the actual storage is that you can move it around. But the downside is that you may move it to less secure environments (like conference hotel rooms). And as with the any disk storage, someone can come up to the device and steel the disk. Whether the data would be easily recognizable is another question but having it be encrypted would put that question to rest. There’s some indication on the Transporter support site that encryption may be coming for the business class solution. But nothing was said about the Transporter tower.

On the Mac, the Transporter folder has the shared folders as direct links (real sub-folders) but the local data is under a Transporter Library soft link. It turns out to be a hidden file (“.Transporter Library”) under the Transporter folder. When you Control click on this file your are given the option to view deleted files. You can also do this with shared files as well.

One problem with synch and share services is once someone in your collaboration group deletes some shared files they are gone (over time) from all other group users. Even if some of them wanted them. Transporter makes it a bit easier to view these files and save them elsewhere. But I assume at some point they have to be purged to free up space.

When I first installed the Transporter, it showed up as a network node on my finder shared servers. But the latest desktop version (3.1.17) has removed this.

Also some of the bloggers complained about files seeing files “in flux” or duplicates of the shared files but with unusual file suffixes appended to them, such as ” filename124224_f367b3b1-63fa-4d29-8d7b-a534e0323389.jpg”. Enrico (@ESignoretti) opened up a support ticket on this and it’s supposedly been fixed in the latest desktop and was a temporary filename used only during upload and should have been deleted-renamed after the upload was completed. I just uploaded 22MB with about 40 files and didn’t see any of this.

I really want encryption as I wanted one transporter in a remote office and another in the home office with everything synched locally and then I would hand carry the remote one to the other location. But without encryption this isn’t going to work for me. So I guess I will limit myself to just one and move it around to wherever I want to my data to go.

Here are some of the other blog posts by SFD7 participants on Transporter:

Storage field day 7 – day 2 – Connected Data by Dan Firth (@PenguinPunk)

File Transporter, private Synch&Share made easy by Enrico Signoretti (@ESignoretti)

Transporter – Storage Field Day 7 preview by Keith Townsend (@VirtualizedGeek)

Comments?

Data virtualization surfaces

There’s a new storage startup out of stealth, called Primary Data and it’s implementing data (note, not storage) virtualization.

They already have $60M in funding with some pretty highpowered talent from Fusion IO, namely David Flynn, Rick White and Steve Wozniak (the ‘Woz’)  (also of Apple fame).

There have been a number of attempts at creating a virtualization layers for data namely ViPR (See my post ViPR virtues, vexations but no storage virtualization) but Primary Data is taking a different tack to the problem.

Data virtualization explained

Data hypervisor, software defined storage, data plane, control plane
(c) 2012 Silverton Consulting, Inc. All rights reserved

Essentially they want to separate the data plane from the control plane (See my Data Hypervisor post and comments for another view on this).

  • The data plane consists of those storage system activities that actually perform IO or read and writes.
  • The control plane is those storage system activities that do everything else that has to be done by a storage system, including provisioning, monitoring, and managing the storage.

Separating the data plane from the control plane offers a number of advantages. EMC ViPR does this but it’s data plane is either standard storage systems like VMAX, VNX, Isilon etc, or software defined storage solutions. Primary Data wants to do it all.

Their meta data or control plane engine is called a Data Director which holds information about the data objects that are stored in the Primary Data system, runs a data policy management engine and handles data migration.

Primary Data relies on purpose-built, Data Hypervisor (client) software that talks to Data Directors to understand where data objects reside and how to go about accessing them. But once the metadata information is transferred to the client SW, then IO activity can go directly between the host and the storage system in a protocol independent fashion.

[The graphic above is from my prior post and I assumed the data hypervisor (DH) would be co-located with the data but Primary Data has rightly implemented this as a separate layer in host software.]

Data Hypervisor protocol independence?

As I understand it this means that customers could use file storage, object storage or block storage to support any application requirement. This also means that file data (objects) could be migrated to block storage and still be accessed as file data. But the converse is also true, i.e., block data (objects) could be migrated to file storage and still be accessed as block data. You need to add object, DAS, PCIe flash and cloud storage to the mix to see where they are headed.

All data in Primary Data’s system are object encapsulated and all data objects are catalogued within a single, global namespace that spans file, block, object and cloud storage repositories

Data objects can reside on Primary storage systems, external non-Primary data aware file or block storage systems, DAS, PCIe Flash, and even cloud storage.

How does Data Virtualization compare to Storage Virtualization?

There are a number of differences:

  1. Most storage virtualization solutions are in the middle of the data path and because of this have to be fairly significant, highly fault-tolerant solutions.
  2. Most storage virtualization solutions don’t have a separate and distinct meta-data engine.
  3. Most storage virtualization systems don’t require any special (data hypervisor) software running on hosts or clients.
  4. Most storage virtualization systems don’t support protocol independent access to data storage.
  5. Most storage virtualization systems don’t support DAS or server based, PCIe flash for permanent storage. (Yes this is not supported in the first release but the intent is to support this soon.)
  6. Most storage virtualization systems support internal storage that resides directly inside the storage virtualization system hardware.
  7. Most storage virtualization systems support an internal DRAM cache layer which is used to speed up IO to internal and external storage and is in addition to any caching done at the external storage system level.
  8. Most storage virtualization systems only support external block storage.

There are a few similarities as well:

  1. They both manage data migration in a non-disruptive fashion.
  2. They both support automated policy management over data placement, data protection, data performance, and other QoS attributes.
  3. They both support multiple vendors of external storage.
  4. They both can support different host access protocols.

Data Virtualization Policy Management

A policy engine runs in the Data Directors and provides SLAs for data objects. This would include performance attributes, protection attributes, security requirements and cost requirements.  Presumably, policy specifications for data protection would include RAID level, erasure coding level and geographic dispersion.

In Primary Data, backup becomes nothing more than object snapshots with different protection characteristics, like offsite full copy. Moreover, data object migration can be handled completely outboard and without causing data access disruption and on an automated policy basis.

Primary Data first release

Primary Data will be initially deployed as an integrated data virtualization solution which includes an all flash NAS storage system and a standard NAS system. Over time, Primary Data will add non-Primary Data external storage and internal storage (DAS, SSD, PCIe Flash).

The Data Policy Engine and Data Migrator functionality will be separately charged for software solutions. Data Directors are sold in pairs (active-passive) and can be non-disruptively upgraded. Storage (directors?) are also sold separately.

Data Hypervisor (client) software is available for most styles of Linux, Openstack and coming for ESX. Windows SMB support is not split yet (control plane/data plane) but Primary data does support SMB. I believe the Data Hypervisor software will also be released in an upcoming version of the Linux kernel.

They are currently in testing. No official date for GA but they did say they would announce pricing in 2015.

~~~~

Comments?

Disclosure: We have done work for Primary Data over the past year.

Photo Credits:

  1. Screen shot of beta test system supplied by Primary Data
  2. Graphic created by SCI for prior Data Hypervisor post

Coho Data, hyperloglog and the quest for IO performance

We were at SFD6, last month and Coho Data‘s CTO & Co-Founder, Andy Warfield got up to tell us what’s happening at Coho. (We also met with Andy at SFD4, check out the videolinks to learn more.)

What’s new at Coho Data

Coho Data has been shipping GA product for about 3 quarters and is a simple to use, scale-out, hybrid (SSD & disk) storage system for VMware NFS datastores. Coho Data storage uses Software Defined Networking (SDN) switches to perform faster networking handoffs and optimized data flow across storage nodes. They use standard servers and a SDN switch that can scale from two nodes (micro-arrays) to lots (100 or more?).

Version 2.0 will add remote asynch replication and enhanced API enhancements. We won’t discuss the update anymore but if you want your storage to tweet its messages/alerts check it out. Thank Chris Wahl when you start seeing storage system tweets pollute your  twitter feed.

The highlight of the session, was Andy’s discussion of HyperLogLog, a new approach to understanding customer workloads.

HyperLogLog

Coho Data was designed from the start using Microsoft IO traces (1-week of MSR Cambridge datacenter block IO traces available at SNIA IO Trace repository).  [bold italics added later, ed.] which recorded all IO from 10 But Coho also recorded linux developers developer desktop IO activity for a year, amounting to ~ 1B 7.6B IOs and multi-TBs of data. I just got a call looking for some file activity tracing, so everybody in storage could use more IO traces. But detailed IO traces take up CPU cycles and lot’s of space. HyperLogLogs can solve a portion of this.

Before we go there, a little background. For instance, with a Bloom Filter you can tell whether a block has been referenced or not. In a bloom filter you hash a key, term or whatever multiple times and then OR them into separate bitfields, one per hash. Bloom filters have a small possibility of a false positive (block-id present in filter but was not really in IO stream) but no possibility of a false negative (block-id NOT present in filter but it really was in IO stream). However, bloom filters tells us nothing about how frequently blocks were read.

With a HyperLogLog, one can approximate (within ~2%) how many times a block was referenced. By capturing multiple HyperLogLogs pictures over time, one can determine block access frequency during application processing. Each HyperLogLog trace only occupies ~2 KB, so recording one/hour takes ~50KB/day. The math is beyond me but there’s plenty info online (e.g. here).

HyperLogLog functionality will be included in a future Coho Data update. Coho Data will be implementing what they call “Counter Stacks” which makes use of hyperloglogs in a future release (see Jake Wire’s Usenix Session video/PDF)Once present, Coho Data will save hyperloglog counter stack data, analyze it, and use it to better characterize customer IO with the goal of better optimizing their storage system to actual workloads

For more info please see other SFD6 blogger posts on Coho Data:

~~~~

Now if someone could just develop a super efficient algorithm/storage structure to record block sequences I think we have this licked.

Disclosure statement: I have done work for Coho Data over the last year.

Picture credits: (Lego) Me holding a Coho (Data) Salmon 🙂

Veeam’s upcoming V8 virtues

[Not] Vamoosing VMworld

We were at Storage Field Day 5 (SFD5, see the videos here) last month and had a briefing on Veeam’s upcoming V8 release.

They also told us (news to me) that they are leaving VMworld[I sit corrected, I have been informed after this went to press that Veeam is not leaving VMworld2014, and never said anything about it at the session – My mistake and I take full responsibility, sorry for any confusion] (sigh, now who’s going to have THE after conference, KILLER PARTY at VMworld) and moving to [but they did say they are definitely starting up] their own VeeamON conference at The Cosmopolitan in Las Vegas on October 6,7 & 8 this year. If their VMworld parties are any indication, the conference in the Cosmo should be a fun and rewarding time for all. Pre-registration is open and they have a call out for papers.

Doug Hazelman (@VMDoug), Rick Vanover (@RickVanover) and Luca Dell’Oca (@dellock6) all presented although Luca’s session was under strict NDA to be revealed later. I think sometime later this summer.

Doug mentioned that after 6 years, Veeam now has over 100,000 customers world wide.  One of their more popular, early innovations was the ability to run a VM directly off of a backup and sometime over the past couple of years they have moved from a VMware only backup & replication solution to also supporting Microsoft Hyper-V (more news to me).

V8’s virtues

Veeam V8 will add some interesting capabilities to the Veeam product solutions:

  • (VMware only) Built-in backups from storage snapshots – (Enterprise Plus edition only) Backup from VMware snapshots can sometimes impact app performance, especially when it comes time to commit changes. But with V7, Veeam now offers backup utilizing VMware’s Change Block Tracking (CBT)and taking backups from storage snapshots directly for HP 3PAR StoreServ, HP (Lefthand) StoreVirtual/StoreVirtual VSA and in soon to be available V8, NetApp FAS (Data ONTAP 8.1 or above, 7- or cluster-mode, clones too) storage systems. First Veeam does its application level processing (under Windows Server does VSS operations), after that completes tells VMware to take (a VMware) snapshot, when that completes they tell the storage to take a (storage) snapshot, when that completes they release the VMware snapshot. What all this does is allows them to utilize VMware CBT as well as storage snapshots which makes it up to 20 times faster than normal VMware snapshot backups. This way they can backup directly from the storage snapshot using the Veeam proxy. Also because the VMware snapshot is so short lived there is little overhead for committing any changes.  Also there is no need to use a proxy ESX server to do this, i.e., promote the VMware snapshot to a LUN, add it to an ESX, resignature, add the VM, and do all the backups, which, of course destroys CBT. This works for FC, iSCSI and NFS data stores. With NetApp storage you can also take the (VSS) application consistent snapshot and copy it to SnapVault.
  • Veeam Explorer (recovery) for storage snapshots – (Free backup edition) Recovery from (HP in V7 & NetApp in V8) storage snapshots is yet another feature and provides item (e.g., emails, contacts, email folders for Exchange), granular (VM level or file level) or full (volume) recovery from storage based snapshots, regardless of how those storage snapshots were created.
  • Veeam Explorer for SQL Server (V8 only) – (unsure what license is required) Similar to the Explorer for snapshots discussed above, this would allow a Veeam admin to do item level recovery for an SQL database. This also includes recovery from Veeam Backup repositories as well as storage snapshots. But this means that you could restore a ROW of an SQL table, an SQL TABLE as well as a whole SQL database. Now DBAs always had these sorts of abilities which required using Log services. But allowing a Veeam admin to do these sorts of activities seems like putting a gun in the hands of a child (or maybe a bazooka in the hands of an untrained civilian).
  • Veeam Explorer for Active Directory (V8 only) – (unsure what license is required) You’ve seen whats’ available above and just consider these same capabilities only applied to active directory. This means you can restore a password hash, user, group or organizational unit (OU). I don’t know about you but this seems more akin to a howitzer in the hands of a civilian.

They showed an example of competitive situation where running V8 (in beta?) with NetApp backups using snapshots versus some unnamed competition. They were able to complete a full backup in 1/4 the time of their competition (2hrs. vs. 8hrs.) and completed incremental backups in 35min. vs. 2hrs. for the competition.

“Thar be dragons there …”

Ok, maybe I am a little more paranoid than the average IT guy/gal. But in my (old world, greybeards) view, SQL databases belong in the realm of DBAs and Active Directory databases belong to domain controller admins. Messing around with production versions of SQL DBs or AD DBs seems hazardous to a data centers health. We’re not just talking files anymore here guys.

In Veeam’s defense, these new Explorer recovery tools are only probably going to be used to do something that needs to be done right away, to get things back operating again, and would not be used unless there’s a real need/emergency to do so. Otherwise let the DBA and security admins do it with their log recovery tools.  And another thing, they have had similar capabilities for Exchange emails, folders, contacts, etc. and no ones shot their foot off yet so why the concern.

Nonetheless, I feel strongly that these tools ought to be placed under lock and key and the key put in a safe with the combination under a glass case labeled IN CASE OF EMERGENCY, BREAK GLASS.

Comments.

New SPECsfs2008 CIFS/SMB vs. NFS (again) – chart of the month

SPECsfs2008 benchmark results for CIFS/SMB vs. NFS protocol performance
SCISFS140326-001 (c) 2014 Silverton Consulting, All Rights Reserved

The above chart represents another in a long line of charts on the relative performance of CIFS[/SMB] versus NFS file interface protocols. The information on the chart are taken from vendor submissions that used the same exact hardware configurations for both NFS and CIFS/SMB protocol SPECsfs2008 benchmark submissions.

There are generally two charts I show in our CIFS/SMB vs. NFS analysis, the one above and another that shows a ops/sec per spindle count analysis for all NFS and CIFS/SMB submissions.  Both have historically indicated that CIFS/SMB had an advantage. The one above shows the total number of NFS or CIFS/SMB operations per second on the two separate axes and provides a linear regression across the data. The above shows that, on average, the CIFS/SMB protocol provides about 40% more (~36.9%) operations per second than NFS protocol does with the same hardware configuration.

However, there are a few caveats about this and my other CIFS/SMB vs. NFS comparison charts:

  • The SPECsfs2008 organization has informed me (and posted on their website) that  CIFS[/SMB] and NFS are not comparable.  CIFS/SMB is a stateful protocol and NFS is stateless and the corresponding commands act accordingly. My response to them and my readers is that they both provide file access, to a comparable set of file data (we assume, see my previous post on What’s wrong with SPECsfs2008) and in many cases today, can provide access to the exact same file, using both protocols on the same storage system.
  • The SPECsfs2008 CIFS/SMB benchmark does slightly more read and slightly less write data operations than their corresponding NFS workloads. Specifically, their CIFS/SMB workload does 20.5% and 8.6% READ_ANDX and WRITE_ANDX respectively CIFS commands vs. 18% and 9% READ and WRITE respectively NFS commands.
  • There are fewer CIFS/SMB benchmark submissions than NFS and even fewer with the same exact hardware (only 13). So the statistics comparing the two in this way must be considered preliminary, even though the above linear regression is very good (R**2 at ~0.98).
  • Many of the submissions on the above chart are for smaller systems. In fact 5 of the 13 submissions were for storage systems that delivered less than 20K NFS ops/sec which may be skewing the results and most of which can be seen above bunched up around the origin of the graph.

And all of this would all be wonderfully consistent if not for a recent benchmark submission by NetApp on their FAS8020 storage subsystem.  For once NetApp submitted the exact same hardware for both a NFS and a CIFS/SMB submission and lo and behold they performed better on NFS (110.3K NFS ops/sec) than they did on CIFS/SMB (105.1K CIFS ops/sec) or just under ~5% better on NFS.

Luckily for the chart above this was a rare event and most others that submitted both did better on CIFS/SMB. But I have been proven wrong before and will no doubt be proven wrong again. So I plan to update this chart whenever we get more submissions for both CIFS/SMB and NFS with the exact same hardware so we can see a truer picture over time.

For those with an eagle eye, you can see NetApp’s FAS8020 submission as the one below the line in the first box above the origin which indicates they did better on NFS than CIFS/SMB.

Comments?

~~~~

The complete SPECsfs2008  performance report went out in SCI’s March 2014 newsletter.  But a copy of the report will be posted on our dispatches page sometime next quarter (if all goes well).  However, you can get the latest storage performance analysis now and subscribe to future free newsletters by just using the signup form above right.

Even more performance information on NFS and CIFS/SMB protocols, including our ChampionCharts™ for file storage can be found in  SCI’s recently (March 2014) updated NAS Buying Guide, on sale from our website.

As always, we welcome any suggestions or comments on how to improve our SPECsfs2008 performance reports or any of our other storage performance analyses.

What’s wrong with SPECsfs2008?

I have been analyzing SPECsfs results now for almost 7 years now and I feel that maybe it’s time for me to discuss some of the t problems with SPECsfs2008 today that should be fixed in the next SPECsfs20xx whenever that comes out.

CIFS/SMB

First and foremost, for CIFS SMB 1 is no longer pertinent to today’s data center. The world of Microsoft has moved on to SMB 2 mostly and are currently migrating to SMB 3.  There were plenty of performance fixes in the last years SMB 3.0 release which would be useful to test with current storage systems. But I would be even be somewhat happy with SMB2 if that’s all I can hope for.

My friends at Microsoft would consider me remiss if I didn’t mention that since SMB 2 they no longer call it CIFS and have moved to SMB. SPECsfs should follow this trend. I have tried to use CIFS/SMB in my blog posts/dispatches as a step in this direction mainly because SPEC continues to use CIFS and Microsoft wants me to use SMB.

In my continuing quest to better compare different protocol performance I believe it would be useful to insure that the same file size distributions are used for both CIFS and NFS benchmarks. Although the current Users Guide discusses some file size information for NFS it is silent when it comes to CIFS. I have been assuming that they were the same because of lack of information but this would be worthy to have confirmed in documentation.

Finally for CIFS, it would be very useful if there could be a closer approximation of the same amount of data transfers that are done for NFS.  This is a nit but when I compare CIFS to NFS storage system results there is a slight advantage to NFS because NFS’s workload definition doesn’t do as much reading as CIFS. In contrast, CIFS has slightly less file data write activity than the NFS benchmark workload. Having them be exactly the same would help in any (unsanctioned) comparisons.

NFSv3

As for NFSv3, although NFSv4 has been out for more than 3 years now, it has taken a long time to be widely adopted. However, these days there seems to be more client and storage support coming online every day and maybe this would be a good time to move on to NFSv4.

The current NFS workloads, while great for the normal file server activities, have not kept pace with much of how NFS is used today especially in virtualized environments. As far as I can tell under VMware NFS data stores don’t do a lot of meta-data operations and do an awful lot more data transfers than normal file servers do. Similar concerns apply to NFS used for Oracle or other databases. Unclear how one could incorporate a more data intensive workload mix into the standard SPECsfs NFS benchmark but it’s worthy of some thought. Perhaps we could create a SPECvms20xx benchmark that would test these types of more data intensive workloads.

For both NFSv3 and CIFs benchmarks

Both the NFSv3 and CIFS benchmarks typically report [throughput] ops/sec. These are a mix of all the meta-data activities and the data transfer activities.  However, I think many storage customers and users would like a finer view of system performance. .

I have often been asked just how many files a storage system actually support. This depends of course on the workload and file size distributions but SPECsfs already defines this. As a storage performance expert, I would also like to know how much data transfer can a storage system support in MB/sec read and written.  I believe both of these metrics can be extracted from the current benchmark programs with a little additional effort. Probably another half dozen metrics that would be useful maybe we could sit down and have an open discussion of what these might be.

Also the world has changed significantly over the last 6 years and SSD and flash has become much more prevalent. Some of your standard configuration tables could be better laid out to insure that readers understand just how much DRAM, flash, SSDs and disk drives are in a configuration.

Beyond file NAS

Going beyond SPECsfs there is a whole new class of storage, namely object storage where there are no benchmarks available. I would think now that Amazon S3 and Openstack Cinder are well defined and available that maybe a new set of SPECobj20xx benchmarks would be warranted. I believe with the adoption of software defined data centers, object storage may become the storage of choice over the next decade or so. If that’s the case then having some a benchmark to measure object storage performance would help in its adoption. Much like the original SPECsfs did for NFS.

Then there’s the whole realm of server SAN or (hyper-)converged storage which uses DAS inside a cluster of compute servers to support block and file services. Not sure exactly where this belongs but NFS is typically the first protocol of choice for these systems and having some sort of benchmark configuration that supports converged storage would help adoption of this new type of storage as well.

I think thats about it for now but there’s probably a whole bunch more that I am missing out here.

Comments?

Securing synch & share data-at-rest

 

1003163361_ba156d12f7Snowden at SXSW said last week that it’s up to the vendors to encrypt customer data. I think he was talking mostly about data-in-flight but there’s just a big an exposure for data-at-rest, maybe more so because then, all the data is available, at one sitting.

iMessage security

A couple of weeks ago there was a TechCrunch article (see Apple Explains Exactly How Secure iMessage Really Is or see the Apple IOS Security document) about Apple’s iMessage security.

The documents said that Apple iMessage uses public key encryption where every IOS/OS X device generates a pair of public and private keys (one for messages and one for signing) which are used to encrypt the data while it is transmitted through Apple’s iMessage service.  Apple encrypts the data on its iMessage App running in the devices with every destination device’s public key before it’s saved on the iMessage server cloud, which can then be decrypted on the device with its private key whenever the message is received by the device.

It’s a bit more complex for longer messages and attachments but the gist is that this data is encrypted with a random key at the device and is saved in encrypted form while residing iMessage servers. This random key and URI is then encrypted with the destination devices public keys which is then stored on the iMessage servers. Once the destination device retrieves the message with an attachment it has the location and the random key to decrypt the attachment.

According to Apple’s documentation when you start an iMessage you identify the recipient, the app retrieves the public keys for all these devices and then it encrypts the message (with each destination device’s public message key) and signs the message (with the originating device’s private signing key). This way Apple servers never see the plain text message and never holds the decryption keys.

Synch & share data security today

As mentioned in prior posts, I am now a Dropbox user and utilize this service to synch various IOS and OSX device file data. Which means a copy of all this synch data is sitting on Dropbox (AWS S3) servers, someplace (possibly multiple places) in the cloud.

Dropbox data-at-rest security is explained in their How secure is Dropbox document. Essentially they use SSL for data-in-flight security and AES-256 encryption with a random key for data-at-rest security.

This probably makes it easier to support multiple devices and perhaps data sharing because they only need to encrypt/save the data once and can decrypt the data on its servers before sending it through (SSL encrypted, of course) to other devices.

The only problem is that Dropbox holds all the encryption keys for all the data that sits on its servers. I (and possibly the rest of the tech community) would much prefer that the data be encrypted at the customer’s devices and never decrypted again except at other customer devices. This would be true end-to-end data security for sync&share

As far as I know from a data-at-rest security perspective Box looks about the same, so does EMC’s Syncplicity, Oxygen Cloud, and probably all the others. There are some subtle differences about how and where the keys are kept and how many security domains exist in each service, but in the end, the service holds the keys to all data that is encrypted on their storage cloud.

Public key cryptography to the rescue

I think we could do better and public key cryptography should show us the way. I suppose it would probably be easiest to follow the iMessage approach and just encrypt all the data with each device’s public key at the time you create/update the data and send it to the service but,

  • That would further delay the transfer of new and updated data to the synch service, also further delaying its availability at other devices linked to the login.
  • That would cause the storage requirement for your sync&share data to be multiplied by the number of devices you wish to synch with.

Synch data-at-rest security

If we just take on the synch side of the discussion first maybe it would be easiest. For example,  if a new public and private key pair for encryption and signing were to be assigned to each new device at login to the service then the service could retain a directory of the device’s public keys for data encryption and signing.

The first device to login to a synch service with a new user-id, would assign a single encryption key for all data to be shared by all devices that could use this login.  As other devices log into the service, the prime device sends the single service encryption key encrypted using the target device’s public key and signing the message with the source device’s private key. Actually any device in the service ring could do this but the primary device could be used to authenticate the new devices login credentials. Each device’s synch service would have a list of all the public keys for all the devices in the “synch” region.

As data is created or updated there are two segments of each file that are created, the AES-256 encrypted data package using the “synch” region’s random encryption key and the signature package, signed by the device doing the creation/update of the file.  Any device could authenticate the signature package at the time it receives a file, as could the service. But ONLY the devices with the AES-256 encryption key would have access to the plain text version of the data.

There are some potential holes in this process, first is that the service could still intercept the random encryption key, at the primary device when it’s created or could retrieve it anytime later at its leisure using the app running in the device. This same exposure exists for the iMessage App running in IOS/OS X devices, the private keys in this instance could be sent to another party at any time. We would need to depend on service guarantees to not do this.

Share data-at-rest security

For Apple’s iMessage attachment security the data is kept in the cloud encrypted by a random key but the key and the URI are sent to the devices when they receive the original message. I suppose this could just as easily work for a file share service but the sharing activity might require a share service app running in the target device to create public-private key pairs and access the file.

Yes this leaves any “shared” data keys being held by the service but it can’t be helped. The data is being shared with others so maybe having it be a little more accessible to prying eyes would be acceptable.

~~~~

I still prefer the iMessage approach, having multiple copies of encrypted shared data, that is encrypted by each device’s public key. It’s simpler this way, a bit more verifiable and doesn’t need to have as much out-of-channel communication (to send keys to other devices).

Yes it would cost more to store any amount of data and would take longer to transmit, but I feel we would all would be willing to support this extra constraints as long as the service guaranteed that private keys were only kept on devices that have logged into the service.

Data-at-rest and -in-flight security is becoming more important these days. Especially since Snowden’s exposure of what’s happening to web data. I love the great convenience of sync&share services, I just wish that the encryption keys weren’t so vulnerable…

Comments?

Photo Credits: Prizon Planet by AZRainman