Eventual data consistency and cloud storage

cloud map by psyberartist (cc) (from Flickr)

We were talking with Ursheet Parikh at StorSimple today about their new cloud gateway product (to be covered in a future post).  At the end of the talk he described some IP they have to handle cloud storage’s “eventual consistency”.  Dumbfounded, I asked him to clarify, having never heard the term before.

Apparently, eventual data consistency is what you get when you use most cloud storage providers.  With eventual consistency, the provider does not guarantee that reading back a recently updated object will return the latest copy.

In contrast, “immediate consistency” means that once you update an object, the cloud storage provider guarantees that every subsequent read returns the latest version.  To me, all storage up until cloud storage guaranteed immediate consistency; anything less was considered a data integrity failure.

To explain, cloud storage providers keep multiple copies of every object replicated throughout their environment, and an update must reach all of them.  Until it does, they cannot guarantee that a read returns the updated version rather than one of the down-level copies.  Yikes!
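To make the replica lag concrete, here is a minimal toy sketch in Python.  It is not any provider's actual design: the `EventuallyConsistentStore` class, its `propagate` step and the object name are all made up for illustration, and real services replicate asynchronously rather than on demand.

```python
class EventuallyConsistentStore:
    """Toy model (hypothetical): writes land on one replica, others catch up later."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # queued (replica_index, key, value) updates

    def put(self, key, value):
        # The write is acknowledged once replica 0 has it.
        self.replicas[0][key] = value
        # Updates for the remaining replicas are only queued.
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def get(self, key, replica):
        # A read may be served by any replica, up to date or not.
        return self.replicas[replica].get(key)

    def propagate(self):
        # Stand-in for background replication: apply queued updates in order.
        for index, key, value in self.pending:
            self.replicas[index][key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.put("report.doc", "v1")
store.propagate()                               # every replica now holds v1

store.put("report.doc", "v2")                   # update acknowledged
fresh = store.get("report.doc", replica=0)      # served by the updated replica
stale = store.get("report.doc", replica=2)      # served by a down-level replica

store.propagate()                               # replication catches up
converged = store.get("report.doc", replica=2)

print(fresh, stale, converged)
```

Between the second `put` and the second `propagate`, which version you read depends entirely on which replica happens to answer.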

What does this mean for your cloud storage?

First, Microsoft’s Azure cloud storage is the only provider that guarantees immediate consistency, but in order to do so it places some restrictions on object size.  That means all the other cloud storage providers guarantee only eventual consistency.

Second, cloud storage that guarantees only eventual consistency should not be used for data that’s updated frequently and then read back.  It’s probably ok for archive or backup storage (that’s not restored for a while) BUT it’s not ok for “normal” file or block data, which is updated frequently and then read back with the expectation of seeing the updates.
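If an application does have to read back recent updates from such a store, one possible guard is to remember a checksum of what was written and retry the read until the returned data matches.  The sketch below is only an illustration of that idea, not StorSimple's technique or any provider's API: `read_after_write`, the `read_object` callable and the retry parameters are hypothetical names chosen for the example.

```python
import hashlib
import time


def read_after_write(read_object, key, expected_bytes, attempts=5, delay_s=0.0):
    """Retry a read until it returns the bytes we last wrote (hypothetical helper).

    read_object is any callable that takes a key and returns bytes or None.
    """
    expected = hashlib.sha256(expected_bytes).hexdigest()
    for attempt in range(attempts):
        data = read_object(key)
        if data is not None and hashlib.sha256(data).hexdigest() == expected:
            return data  # the read reflects our latest write
        # Back off a little longer each time before asking again.
        time.sleep(delay_s * (2 ** attempt))
    raise TimeoutError(f"{key!r} still stale after {attempts} attempts")


# Fake reader standing in for a cloud store: two stale answers, then the update.
responses = iter([b"old", b"old", b"new"])
result = read_after_write(lambda key: next(responses), "report.doc", b"new")
print(result)
```

This only detects and waits out a stale read; it does nothing to shorten the window, which is why frequently updated data remains a poor fit.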

According to Ursheet, the cloud storage providers have been completely up-front about their consistency levels, and his product, StorSimple, has been specifically designed to accommodate variable levels of consistency.  We would need to ask the other providers how they handle cloud storage consistency to understand whether they have tried to deal with this as well.

However, from my perspective, eventual consistency is scary.  It appears that cloud storage has redefined what we mean by storage, or at the very least has eliminated data integrity.  Moreover, this seriously limits the usability of raw cloud storage to archive-like, infrequently updated data.

And I thought cloud storage was going to take over the data center – not like this…