Cloud storage growth is hurting NAS & SAN storage vendors

Strange Clouds by michaelroper (cc) (from Flickr)

My friend Alex Teu (@alexteu) of Oxygen Cloud wrote a post today about how Cloud Storage is Eating the World Alive. Alex reports that all the major NAS and SAN storage vendors lost revenue this year over the previous year, with losses ranging from ~3% to over 20% (Q1-2014 compared to Q1-2013, per IDC).

Although it’s an interesting development, it’s hard to say that this is the end of enterprise storage as we know it. I believe a number of factors are impacting enterprise storage revenues, and cloud storage adoption may be only one of them.

Other trends impacting NAS & SAN storage adoption

One thing that has emerged over the last decade or so is the advance of flash storage. Some of it is used in storage controllers and some in servers, in both cases to speed up IO access. But any speedup of IO can reduce the need for high-performing disk drives and allow customers to use higher-capacity/slower disk drives instead, which definitely reduces the cost of storage systems. A little bit of flash goes a long way toward speeding up IO access.

The other thing is that disk capacity is trending upward at exponential rates. Yesterday’s 2TB disk drive is today’s 4TB disk drive, and we are already seeing 6TB drives from Seagate, HGST and others. This too is driving down the cost of NAS and SAN storage.

Nowadays you can configure 1PB of storage with just over 170 drives. Somewhere in there you might want a couple hundred TB of flash to speed up IO access to those slow disks, but flash is also coming down in price ($/GB; see SanDisk’s recent consumer-grade TLC drive at $0.44/GB). Also, the move to MLC flash has increased the capacity of flash devices, leading to fewer SSDs/flash cache cards needed to store and speed up more data.
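
A quick back-of-the-envelope check of that configuration (my own arithmetic, assuming 6TB drives, decimal units, and no RAID or spare overhead):

```python
# Rough capacity math behind the 1PB configuration above -- assumes 6TB
# drives, decimal units, and ignores RAID, spares and formatting overhead.
import math

PB = 1000 ** 5                          # bytes in a decimal petabyte
TB = 1000 ** 4                          # bytes in a decimal terabyte

drives_needed = math.ceil(PB / (6 * TB))
print(f"6TB drives needed for 1PB: {drives_needed}")         # -> 167

# A couple hundred TB of flash cache at roughly $0.44/GB (consumer TLC pricing)
flash_tb = 200
flash_cost = flash_tb * 1000 * 0.44                           # GB x $/GB
print(f"200TB of flash at $0.44/GB: ${flash_cost:,.0f}")      # -> $88,000
```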

Finally, another trend that seems to have emerged recently is the movement away from enterprise-class storage to server storage. One can see this in VMware’s VSAN, in hyperconverged systems such as Nutanix and Scale Computing, and in a general trend among Windows Server applications (SQL Server, Exchange Server, etc.) to make better use of DAS storage. So some customers are moving their data to shared DAS storage today, whereas before this was more difficult to accomplish effectively, which is why they previously purchased networked storage.

What about cloud storage?

Yes, as Alex has noted, the price of cloud storage has declined precipitously over the last year or so. Alex’s cloud storage pricing graph shows how the entry of Microsoft and Google has seemingly forced Amazon to match their price reductions. The other thing of note is that they have all come down to about the same basic price of $0.024/GB/month.

It’s interesting that Amazon delayed its first serious S3 price reduction until about 4 months after Azure and Google Cloud Storage dropped theirs, and that within another month after that they were all at price parity.

What’s cloud storage’s real growth?

I reported last August that Microsoft Azure and Amazon S3 were storing 8 trillion and over 2 trillion objects, respectively (see my Is object storage outpacing structured and unstructured data growth post). This year (April 2014) Microsoft mentioned at TechEd that Azure was storing 20 trillion objects and servicing 2 million requests per second.

I could find no update to Amazon S3’s numbers from last year, but the ~2.5x growth in Azure’s object count in ~8 months and the roughly doubled requests/second (last year they were processing 900K requests/second, though I didn’t mention it in my post) say something interesting is going on in cloud storage.

I suppose Google’s cloud storage service is too new to report serious results and maybe Amazon wants to keep their growth a secret. But considering Amazon’s recent matching of Azure’s and Google’s pricing, it probably means that their growth wasn’t what they expected.

The other interesting item from the Microsoft discussions on Azure was that they were already hosting 1M SQL databases in Azure and that 57% of the Fortune 500 are currently using Azure.

In the “olden days”, before cloud storage, all these SQL databases and Fortune 500 data sets would more than likely have resided on NAS or SAN storage of some kind. And given traditional storage’s higher cost and greater complexity, some of this data might never have been spun up in the first place; but with cloud storage so cheap, rapidly configurable and easy to use, all this new data was placed in the cloud.

So I must conclude from Microsoft’s growth numbers, and their implication for the rest of the cloud storage industry, that maybe Alex is right: more data is moving to the cloud, and this is impacting traditional storage revenues. With IDC’s (2013) estimate of data growth at ~43% per year, Microsoft’s cloud storage appears to be growing far more rapidly than worldwide data growth, ~14X faster!

On the other hand, if cloud storage were consuming most of the world’s data growth, it would seem to precipitate a collapse of traditional storage revenues, not just a ~3-20% decline. So maybe most new cloud storage applications would never have been implemented at all if they had to use traditional storage, which means only some of this new data would ever have been stored on traditional storage in the first place, leading to a relatively smaller decline in revenue.

One question remains: is this a short-term impact or more of a long-running trend that will play out over the next decade or so? From my perspective, new applications spinning up on non-traditional storage is a long-running threat to traditional NAS and SAN storage, one that will ultimately see traditional storage relegated to a niche. How big that niche will ultimately be, and how well it can be defended, will need to be the subject of another post.

~~~~

Comments?

Securing synch & share data-at-rest

 

Snowden at SXSW said last week that it’s up to the vendors to encrypt customer data. I think he was talking mostly about data-in-flight, but there’s just as big an exposure for data-at-rest, maybe more so, because then all the data is available in one sitting.

iMessage security

A couple of weeks ago there was a TechCrunch article (see Apple Explains Exactly How Secure iMessage Really Is or see the Apple IOS Security document) about Apple’s iMessage security.

The documents say that Apple iMessage uses public key encryption: every IOS/OS X device generates two key pairs (one for encrypting messages and one for signing), which are used to protect the data while it moves through Apple’s iMessage service. The iMessage app on the sending device encrypts the data with every destination device’s public key before it’s saved on the iMessage server cloud, and each destination device can then decrypt the message with its own private key when it receives it.

It’s a bit more complex for longer messages and attachments, but the gist is that this data is encrypted with a random key at the device and is saved in encrypted form while residing on iMessage servers. The random key and the attachment’s URI are then encrypted with the destination devices’ public keys and stored on the iMessage servers. Once the destination device retrieves the message with an attachment, it has both the location and the random key needed to decrypt the attachment.
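
Here’s a minimal sketch of that attachment flow using the Python cryptography package; the key sizes, padding choices and URI below are my own illustrative assumptions, not Apple’s actual parameters:

```python
# Sketch of iMessage-style attachment handling: the attachment is encrypted
# once with a random key, and only the (key, URI) envelope is wrapped per
# destination device. Key sizes, padding and the URI are illustrative only.
import os, json
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

attachment = b"big photo or video blob..."

# 1. Encrypt the attachment once with a random AES-256 key.
blob_key = AESGCM.generate_key(bit_length=256)
nonce = os.urandom(12)
ciphertext = AESGCM(blob_key).encrypt(nonce, attachment, None)
uri = "https://example-store/attachments/abc123"    # where the ciphertext lives

# 2. Wrap (key, URI) separately with every destination device's public key.
device_keys = [rsa.generate_private_key(public_exponent=65537, key_size=2048)
               for _ in range(3)]                    # stand-ins for recipient devices
oaep = padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
envelope = json.dumps({"uri": uri, "key": blob_key.hex(),
                       "nonce": nonce.hex()}).encode()
wrapped = [dev.public_key().encrypt(envelope, oaep) for dev in device_keys]

# 3. A destination device unwraps with its private key, fetches the blob
#    from the URI and decrypts it locally.
opened = json.loads(device_keys[0].decrypt(wrapped[0], oaep))
plain = AESGCM(bytes.fromhex(opened["key"])).decrypt(bytes.fromhex(opened["nonce"]),
                                                     ciphertext, None)
assert plain == attachment
```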

According to Apple’s documentation, when you start an iMessage you identify the recipient, the app retrieves the public keys for all of that recipient’s devices, and then it encrypts the message (with each destination device’s public message key) and signs it (with the originating device’s private signing key). This way Apple’s servers never see the plain text message and never hold the decryption keys.
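
A rough sketch of that per-device encrypt-and-sign flow follows (RSA with OAEP/PSS here is just an illustrative stand-in; Apple’s actual algorithms and message format differ):

```python
# Sketch of the basic iMessage flow: encrypt with each destination device's
# public *message* key, sign with the sender's private *signing* key.
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes

def new_device():
    """Each device holds two key pairs: one for messages, one for signing."""
    return {"msg": rsa.generate_private_key(public_exponent=65537, key_size=2048),
            "sig": rsa.generate_private_key(public_exponent=65537, key_size=2048)}

sender = new_device()
recipients = [new_device(), new_device()]      # say, the recipient's phone and laptop

oaep = padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                  salt_length=padding.PSS.MAX_LENGTH)

message = b"meet at 7?"
signature = sender["sig"].sign(message, pss, hashes.SHA256())

# The service only ever stores these ciphertexts -- never the plain text.
per_device_ciphertext = [r["msg"].public_key().encrypt(message, oaep)
                         for r in recipients]

# A recipient device decrypts with its private message key and verifies
# the signature with the sender's public signing key.
plain = recipients[0]["msg"].decrypt(per_device_ciphertext[0], oaep)
sender["sig"].public_key().verify(signature, plain, pss, hashes.SHA256())
print(plain)                                    # b'meet at 7?'
```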

Synch & share data security today

As mentioned in prior posts, I am now a Dropbox user and utilize this service to synch file data across various IOS and OSX devices. This means a copy of all this synched data is sitting on Dropbox (AWS S3) servers, someplace (possibly multiple places) in the cloud.

Dropbox data-at-rest security is explained in their How secure is Dropbox document. Essentially they use SSL for data-in-flight security and AES-256 encryption with a random key for data-at-rest security.

This probably makes it easier to support multiple devices and perhaps data sharing, because they only need to encrypt and save the data once and can decrypt it on their servers before sending it (SSL encrypted, of course) on to other devices.

The only problem is that Dropbox holds all the encryption keys for all the data that sits on its servers. I (and possibly the rest of the tech community) would much prefer that the data be encrypted at the customer’s devices and never decrypted again except at other customer devices. That would be true end-to-end data security for sync&share.
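
A toy sketch of this service-held-key model (my simplification, not Dropbox’s actual implementation) makes the exposure obvious:

```python
# Simplified view of service-side encryption: the data is AES-256 encrypted
# at rest, but the service generates and holds the key, so it can decrypt on
# its own servers before re-sending over SSL/TLS to another device.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

class SyncService:
    def __init__(self):
        self._keys = {}                        # file_id -> (key, nonce), held by the SERVICE

    def upload(self, file_id: str, plaintext: bytes) -> bytes:
        key = AESGCM.generate_key(bit_length=256)
        nonce = os.urandom(12)
        self._keys[file_id] = (key, nonce)     # <-- the exposure: keys live server-side
        return AESGCM(key).encrypt(nonce, plaintext, None)    # stored ciphertext

    def download(self, file_id: str, stored: bytes) -> bytes:
        key, nonce = self._keys[file_id]
        return AESGCM(key).decrypt(nonce, stored, None)       # service sees plaintext

service = SyncService()
blob = service.upload("report.docx", b"quarterly numbers...")
print(service.download("report.docx", blob))   # the service can always read the data
```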

As far as I know, from a data-at-rest security perspective Box looks about the same, as do EMC’s Syncplicity, Oxygen Cloud, and probably all the others. There are some subtle differences in how and where the keys are kept and how many security domains exist in each service, but in the end the service holds the keys to all the data encrypted in its storage cloud.

Public key cryptography to the rescue

I think we could do better, and public key cryptography should show us the way. I suppose it would be easiest to follow the iMessage approach and just encrypt all the data with each device’s public key at the time you create/update the data and send it to the service, but:

  • That would delay the transfer of new and updated data to the synch service, further delaying its availability at the other devices linked to the login.
  • That would cause the storage requirement for your sync&share data to be multiplied by the number of devices you wish to synch with.

Synch data-at-rest security

If we just take on the synch side of the discussion first, maybe it would be easiest. For example, if new public and private key pairs for encryption and signing were assigned to each new device at login to the service, then the service could retain a directory of each device’s public keys for data encryption and signing.

The first device to log into the synch service with a new user-id would assign a single encryption key for all the data to be shared by every device using this login. As other devices log into the service, the prime device sends them this single service encryption key, encrypted with the target device’s public key and signed with the source device’s private key. Actually, any device in the service ring could do this, but the primary device could also be used to authenticate the new device’s login credentials. Each device’s synch service would have a list of all the public keys for all the devices in the “synch” region.
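
A sketch of that enrollment flow, with all names hypothetical (this is just my rendering of the proposal above, not any vendor’s protocol):

```python
# Sketch of the proposed enrollment flow: the prime device wraps the single
# "synch" region AES-256 key with a new device's public key and signs the
# wrapped key so the new device can verify who sent it.
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

OAEP = padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
PSS = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                  salt_length=padding.PSS.MAX_LENGTH)

def new_device_keys():
    """On login every device generates an encryption pair and a signing pair."""
    return {"enc": rsa.generate_private_key(public_exponent=65537, key_size=2048),
            "sig": rsa.generate_private_key(public_exponent=65537, key_size=2048)}

# First device to log in creates the region key.
prime = new_device_keys()
region_key = AESGCM.generate_key(bit_length=256)

# A second device logs in; the service forwards only its *public* keys.
laptop = new_device_keys()

# The prime device wraps the region key for the laptop and signs the wrapped blob.
wrapped = laptop["enc"].public_key().encrypt(region_key, OAEP)
signature = prime["sig"].sign(wrapped, PSS, hashes.SHA256())

# The laptop verifies the prime device's signature, then unwraps the region key.
prime["sig"].public_key().verify(signature, wrapped, PSS, hashes.SHA256())
laptop_region_key = laptop["enc"].decrypt(wrapped, OAEP)
assert laptop_region_key == region_key          # both devices now share the key
```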

As data is created or updated, two segments of each file are created: the AES-256 encrypted data package, using the “synch” region’s random encryption key, and the signature package, signed by the device doing the creation/update of the file. Any device could authenticate the signature package at the time it receives a file, as could the service. But ONLY the devices with the AES-256 encryption key would have access to the plain text version of the data.
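
And a sketch of those two per-file segments (again, a hypothetical illustration of the proposal, not any existing product’s format):

```python
# Two per-file segments: an AES-256-GCM data package encrypted with the
# region key, plus a signature package from the authoring device. Anyone,
# including the service, can verify the signature; only devices holding the
# region key can recover the plaintext.
import os
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

PSS = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                  salt_length=padding.PSS.MAX_LENGTH)

region_key = AESGCM.generate_key(bit_length=256)             # shared by the synch region
author_sig_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

def package_file(contents: bytes):
    nonce = os.urandom(12)
    data_pkg = nonce + AESGCM(region_key).encrypt(nonce, contents, None)
    sig_pkg = author_sig_key.sign(data_pkg, PSS, hashes.SHA256())
    return data_pkg, sig_pkg

def verify_package(data_pkg: bytes, sig_pkg: bytes, author_public_key):
    """Anyone -- including the service -- can check authenticity without the region key."""
    author_public_key.verify(sig_pkg, data_pkg, PSS, hashes.SHA256())

def open_package(data_pkg: bytes):
    """Only devices holding the region key get the plain text back."""
    nonce, ciphertext = data_pkg[:12], data_pkg[12:]
    return AESGCM(region_key).decrypt(nonce, ciphertext, None)

data_pkg, sig_pkg = package_file(b"draft of the quarterly report")
verify_package(data_pkg, sig_pkg, author_sig_key.public_key())
print(open_package(data_pkg))
```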

There are some potential holes in this process. First, the service could still intercept the random encryption key at the primary device when it’s created, or could retrieve it anytime later, at its leisure, using the app running on the device. The same exposure exists for the iMessage app running on IOS/OS X devices: the private keys in that case could be sent to another party at any time. We would need to depend on service guarantees that this won’t happen.

Share data-at-rest security

For Apple’s iMessage attachment security, the data is kept in the cloud encrypted by a random key, but the key and the URI are sent to the devices when they receive the original message. I suppose this could just as easily work for a file share service, though the sharing activity might require a share-service app running on the target device to create public-private key pairs and access the file.
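
Sketched below is how that might extend to a share service; it mirrors the attachment example above, except that here the service holds the random file key and the sharee’s app generates its key pair on demand (all names are hypothetical):

```python
# Hypothetical sketch of the share flow: the file sits in the cloud encrypted
# with a random key that the share service holds; when a sharee's app creates
# its key pair, the service wraps the (key, URI) envelope for that public key.
import os, json
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

OAEP = padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# Shared file stored encrypted under a random key known to the share service.
file_key, nonce = AESGCM.generate_key(bit_length=256), os.urandom(12)
stored_blob = AESGCM(file_key).encrypt(nonce, b"shared spreadsheet", None)
uri = "https://example-share/files/xyz789"           # illustrative location

# The sharee's share-service app creates a key pair when the share is accepted...
sharee = rsa.generate_private_key(public_exponent=65537, key_size=2048)

# ...and the service wraps the key + URI envelope with that public key.
envelope = json.dumps({"uri": uri, "key": file_key.hex(),
                       "nonce": nonce.hex()}).encode()
wrapped = sharee.public_key().encrypt(envelope, OAEP)

# The sharee unwraps the envelope, fetches the ciphertext at the URI and decrypts.
opened = json.loads(sharee.decrypt(wrapped, OAEP))
print(AESGCM(bytes.fromhex(opened["key"])).decrypt(bytes.fromhex(opened["nonce"]),
                                                   stored_blob, None))
```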

Yes, this leaves any “shared” data keys being held by the service, but that can’t be helped. The data is being shared with others, so maybe having it be a little more accessible to prying eyes is acceptable.

~~~~

I still prefer the iMessage approach: keeping multiple copies of the encrypted shared data, each encrypted with a device’s public key. It’s simpler, a bit more verifiable, and doesn’t need as much out-of-channel communication (to send keys to other devices).

Yes, it would cost more to store any amount of data and would take longer to transmit, but I feel we would all be willing to live with these extra constraints as long as the service guaranteed that private keys were only kept on devices that have logged into the service.

Data-at-rest and -in-flight security is becoming more important these days, especially since Snowden’s exposure of what’s happening to web data. I love the great convenience of sync&share services; I just wish the encryption keys weren’t so vulnerable…

Comments?

Photo Credits: Prizon Planet by AZRainman