Thinly provisioned compute clouds

Thin provisioning has been around in storage since StorageTek’s Iceberg hit the enterprise market in 1995.  However, thin provisioning has never taken off for system servers or virtual machines (VMs).

But a recent article out of MIT, "Making cloud computing more efficient", discusses research that came up with the idea of monitoring system activity to model and predict application performance.

So how does this enable thinly provisioned VMs?

With a model like this in place, one could conceivably provide a thinly provisioned virtual server that could guarantee a QoS and still minimize resource consumption.  For example, have the application VM consume just the resources needed at any instant in time, adjusted as demands on the system change.  Thus, as an application's needs grew, more resources could be supplied, and as its needs shrank, resources could be given up for other uses.
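To make that concrete, here is a minimal sketch (in Python, and not any vendor's actual API) of how a hypervisor-side controller might thinly provision a VM: the allocation tracks a model's predicted load but never drops below a QoS floor or exceeds the configured ceiling. The `VMAllocation` class, the `resize_vm` function, and the load numbers are all hypothetical.

```python
# Hypothetical sketch of a thin-provisioning controller for a VM:
# grow or shrink the allocation as a performance model predicts demand,
# while honoring a QoS floor and the configured ceiling.
from dataclasses import dataclass

@dataclass
class VMAllocation:
    vcpus: int
    memory_gb: int

def resize_vm(alloc: VMAllocation, predicted_load: float,
              min_alloc: VMAllocation, max_alloc: VMAllocation) -> VMAllocation:
    """Scale the allocation to predicted load (0.0-1.0 of the configured maximum),
    never dropping below the QoS floor or exceeding the ceiling."""
    vcpus = max(min_alloc.vcpus,
                min(max_alloc.vcpus, round(predicted_load * max_alloc.vcpus)))
    mem = max(min_alloc.memory_gb,
              min(max_alloc.memory_gb, round(predicted_load * max_alloc.memory_gb)))
    return VMAllocation(vcpus=vcpus, memory_gb=mem)

# Example: the demand forecast says the app needs ~30% of its peak configuration.
current = resize_vm(VMAllocation(8, 64), predicted_load=0.3,
                    min_alloc=VMAllocation(2, 8), max_alloc=VMAllocation(8, 64))
print(current)   # VMAllocation(vcpus=2, memory_gb=19)
```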

With this sort of server QoS, certain classes of application VMs would need to have variable or no QoS, so that they could be sacrificed in times of need to those that required guaranteed QoS. But in a cloud service environment, a multiplicity of service classes like these could be supplied at different price points.

Thin provisioning grew up in storage because it's relatively straightforward for a storage subsystem to understand capacity demands at any instant in time.  A storage system only needs to monitor data write activity: if a data block was written or consumed, then it is backed by real storage; if it has never been written, then it's relatively easy to fabricate a block of zeros if it is ever read.
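A toy illustration of that write-tracking logic (my own simplification, not how any real array implements it): a thin volume backs a block with physical storage only on first write, and fabricates zeros for blocks that were never written.

```python
# Simplified thin volume: real capacity is consumed only when a block is written.
BLOCK_SIZE = 4096

class ThinVolume:
    def __init__(self, virtual_blocks: int):
        self.virtual_blocks = virtual_blocks   # advertised (virtual) capacity
        self.backing = {}                      # block # -> data, allocated on demand

    def write(self, block: int, data: bytes) -> None:
        self.backing[block] = data             # physical capacity consumed only here

    def read(self, block: int) -> bytes:
        # Never-written blocks cost nothing; just return zeros.
        return self.backing.get(block, b"\x00" * BLOCK_SIZE)

    def consumed_blocks(self) -> int:
        return len(self.backing)               # physical usage vs. virtual size

vol = ThinVolume(virtual_blocks=1_000_000)     # ~4 GB advertised
vol.write(42, b"hello".ljust(BLOCK_SIZE, b"\x00"))
print(vol.consumed_blocks(), "of", vol.virtual_blocks, "blocks backed by real storage")
```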

Prior to thinly provisioned storage, fat provisioning required that storage be configured to the maximum capacity required of it. Similarly, fully (or fat) provisioned VMs must be configured for peak workloads. With the advent of thin provisioning in storage, otherwise wasted resources (capacity in the case of storage) could be shared across multiple thinly provisioned volumes (LUNs), thereby freeing up these resources for other users.

Problems with server thin provisioning

I see some potential problems with the model and my assumptions as to how thinly provisioned VMs would work. First, the modeled performance is a lagging indicator at best.  Just as system transactions start to get slower, a hypervisor would need to interrupt the VM to add more physical (or virtual) resources.  Naturally, during the interruption system performance would suffer.

It would be helpful if resources could be added to a VM dynamically, in real time, without impacting the applications running in the VM. But it seems to me that adding physical or virtual CPU cores, memory, bandwidth, etc., to a VM would require at least some sort of interruption to a pair of VMs [the one giving up the resource(s) and the one gaining the freed-up resource(s)].

Similar issues occur for thinly provisioned storage. As storage is consumed for a thinly provisioned volume, allocating more physical capacity takes some amount of storage subsystem resources and time to accomplish.

How does the model work?

It appears that the software model works by predicting system performance based on a limited set of measurements. Indeed, their model is bi-modal; that is, there are two approaches:

  • Black box model – tracks server or VM indicators such as "number and type of user requests" as well as system performance and uses AI to correlate the two (a simplified version of this idea is sketched after this list). This works well for moderate fluctuations in demand but doesn't help when requests for services fall beyond those boundaries.
  • Grey box model – is more sophisticated and is based on an understanding of specific database functionality, such as how frequently it flushes host buffers, commits transactions to disk logs, etc.  In this case, they are able to predict system performance when demand peaks at 4X to 400X current system requirements.
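For the black box case, here is a deliberately simplified stand-in: correlate observed request rates with measured CPU use and extrapolate. The MIT work uses more sophisticated machine learning; ordinary least-squares regression and the sample numbers below are just to make the idea concrete.

```python
# Simplified "black box" model: fit a correlation between observed request
# rates and measured resource use, then use it to predict demand.
import numpy as np

# Hypothetical training data: requests/sec observed vs. CPU utilization measured.
requests_per_sec = np.array([100, 200, 400, 800, 1200], dtype=float)
cpu_utilization  = np.array([0.08, 0.15, 0.31, 0.58, 0.86])

# Fit cpu ~= a * requests + b
a, b = np.polyfit(requests_per_sec, cpu_utilization, deg=1)

def predict_cpu(req_rate: float) -> float:
    return a * req_rate + b

# Works reasonably for demand within the observed range...
print(round(predict_cpu(600), 2))
# ...but extrapolating far beyond it is where the black box model breaks down
# and the grey box (database-aware) model is needed.
print(round(predict_cpu(10_000), 2))   # almost certainly wrong
```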

They have implemented the grey box model for MySQL and are in the process of doing the same for PostgreSQL.

Model validation and availability

They tested their prediction algorithm against published TPC-C benchmark results and were able to achieve 80% accuracy for CPU use and 99% accuracy for disk bandwidth consumption.

It appears that the team has released their code as open source. At least one database vendor, Teradata, is porting it over to their own database machine to better allocate physical resources to data warehouse queries.

It seems to me that this would be a natural for cloud compute providers and even more important for hypervisor solutions such as vSphere, Hyper-V, etc.  Any place one could use more flexibility in assigning virtual or physical resources to an application or server would find use for this performance modeling.

~~~~

Now, if they could just do something to help create thinly provisioned highways, …

Image: Intel Team Inside Facebook Data Center By IntelFreePress

Top 10 storage technologies over the last decade

Aurora's Perception or I Schrive When I See Technology by Wonderlane (cc) (from Flickr)

Some of these technologies were in development prior to 2000, some were available in other domains but not in storage, and some were in a few subsystems but had yet to become popular as they are today.  In no particular order here are my top 10 storage technologies for the decade:

  1. NAND based SSDs – DRAM and other technology solid state drives (SSDs) were available last century but over the last decade NAND Flash based devices have dominated SSD technology and have altered the storage industry forever more.  Today, it’s nigh impossible to find enterprise class storage that doesn’t support NAND SSDs.
  2. GMR heads – Giant Magneto Resistance disk heads have become commonplace over the last decade and have allowed disk drive manufacturers to double data density every 18-24 months.  Now GMR heads are starting to transition over to tape storage and will enable that technology to increase data density dramatically.
  3. Data Deduplication – Deduplication technologies emerged over the last decade as a complement to higher density disk drives and as a means to more efficiently back up data.  Deduplication technology can be found in many different forms today, ranging from file and block storage systems and backup storage systems to backup-software-only solutions.
  4. Thin provisioning – No one would argue that thin provisioning emerged last century but it took the last decade to really find its place in the storage pantheon.  One almost cannot find a data center class storage device that does not support thin provisioning today.
  5. Scale-out storage – Last century if you wanted to get higher IOPS from a storage subsystem you could add cache or disk drives but at some point you hit a subsystem performance wall.  With scale-out storage, one can now add more processing elements to a storage system cluster without having to replace the controller to obtain more IO processing power.  The link reference talks about the use of commodity hardware to provide added performance but scale-out storage can also be done with non-commodity hardware (see Hitachi’s VSP vs. VMAX).
  6. Storage virtualization – Server virtualization has taken off as the dominant data center paradigm over the last decade, but a counterpart to this in storage has also become more viable as well.  Storage virtualization was originally used to migrate data from old subsystems to new storage but today can be used to manage and migrate data over PBs of physical storage, dynamically optimizing data placement for cost and/or performance.
  7. LTO tape – When IBM dominated IT in the mid-to-late last century, the tape format du jour always matched IBM's tape technology.  As the decade dawned, IBM was no longer the dominant player and tape technology was starting to diverge into a babble of differing formats.  As a result, IBM, Quantum, and HP put their technology together and created a standard tape format, called LTO, which has become the new dominant tape format for the data center.
  8. Cloud storage – It's unclear just when over the last decade cloud storage emerged, but it seemed to be a supplement to cloud computing, which also appeared this past decade.  Storage service providers had existed earlier but, due to bandwidth limitations and storage costs, didn't survive the dotcom bubble. But over this past decade both bandwidth and storage costs have come down considerably and cloud storage has now become a viable technological solution to many data center issues.
  9. iSCSI – SCSI has taken on many forms over the last couple of decades, but iSCSI has altered the dominant block storage paradigm from a single, pure FC-based SAN to a plurality of technologies.  Nowadays, SMB shops can have block storage without the cost and complexity of FC SANs, over the LAN networking technology they already use.
  10. FCoE – One could argue that this technology is still maturing today, but once again SCSI has opened up another way to access storage. FCoE has the potential to offer all the robustness and performance of FC SANs over data center Ethernet hardware, simplifying and unifying data center networking onto one technology.

No doubt others would differ on their top 10 storage technologies over the last decade, but I strove to find technologies that significantly changed data storage between 2000 and today.  These 10 seemed to me to fit the bill better than most.

Comments?

HDS Dynamic Provisioning for AMS

HDS announced support today for their thin provisioning feature (called Dynamic Provisioning) to be available in their mid-range storage subsystem family, the AMS. Expanding the subsystems that support thin provisioning can only help the customer in the long run.

It’s not clear whether you can add Dynamic Provisioning to an already installed AMS subsystem or if it’s only available on a fresh installation of an AMS subsystem. Also, no pricing was announced for this feature. In the past, HDS charged double the price of a GB of storage when it was in a thinly provisioned pool.

As you may recall, thin provisioning is a little like a room with a bunch of inflatable castles inside. Each castle starts with its initial inflation amount. As demand dictates, each castle can independently inflate to whatever level is needed to support the current workload, up to that castle's limit and the overall limit imposed by the room the castles inhabit. In this analogy, the castles are LUN storage volumes, the room the castles are located in is the physical storage pool for the thinly provisioned volumes, and the air inside the castles is the physical disk space consumed by the thinly provisioned volumes.

In contrast, hard provisioning is like building permanent castles (LUNs) in stone; any change to the size of a structure would require major renovation and/or possible destruction of the original castle (deletion of the LUN).
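Here is a toy rendering of the castle analogy in code (purely illustrative, not HDS's implementation): a shared pool "room" hands out physical capacity to thin LUN "castles" on demand, bounded by each LUN's own limit and by the pool's overall limit.

```python
# Toy thin pool: physical capacity is handed out on demand, subject to both
# the per-LUN (castle) limit and the overall pool (room) limit.
class ThinPool:
    def __init__(self, pool_capacity_gb: int):
        self.pool_capacity_gb = pool_capacity_gb   # size of the room
        self.consumed_gb = 0                       # air in use across all castles

    def allocate(self, lun: "ThinLUN", gb: int) -> bool:
        if lun.consumed_gb + gb > lun.max_size_gb:
            return False                           # castle hit its own limit
        if self.consumed_gb + gb > self.pool_capacity_gb:
            return False                           # the room itself is full
        lun.consumed_gb += gb
        self.consumed_gb += gb
        return True

class ThinLUN:
    def __init__(self, max_size_gb: int):
        self.max_size_gb = max_size_gb             # how big this castle may inflate
        self.consumed_gb = 0                       # current inflation

pool = ThinPool(pool_capacity_gb=1000)
lun_a, lun_b = ThinLUN(max_size_gb=800), ThinLUN(max_size_gb=800)
print(pool.allocate(lun_a, 600))   # True: plenty of room left
print(pool.allocate(lun_b, 600))   # False: the pool is oversubscribed (800 + 800 > 1000)
```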

When HDS first came out with dynamic provisioning it was only available for USP-V internal storage, later they released the functionality for USP-V external storage. This announcement seems to complete the roll out to all their SAN storage subsystems.

HDS also announced today a new service called the Storage Reclamation Service that helps:
1) Assess whether thin provisioning will work well in your environment,
2) Provide tools and support to identify candidate LUNs for thin provisioning (a simple version of this idea is sketched below), and
3) Configure new thinly provisioned LUNs and migrate your data over to the thinly provisioned storage.
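For step 2, a plausible heuristic (my own guess at how such a tool might work, not HDS's actual Storage Reclamation Service logic) is to flag fat-provisioned LUNs whose written capacity is a small fraction of what was allocated, since those stand to return the most capacity to a thin pool.

```python
# Hypothetical heuristic for finding thin provisioning candidates: LUNs whose
# written capacity is well below their allocated capacity.
def thin_provisioning_candidates(luns, utilization_threshold=0.4):
    """luns: iterable of (name, allocated_gb, written_gb) tuples."""
    candidates = []
    for name, allocated_gb, written_gb in luns:
        if allocated_gb > 0 and written_gb / allocated_gb < utilization_threshold:
            reclaimable = allocated_gb - written_gb
            candidates.append((name, reclaimable))
    return sorted(candidates, key=lambda c: c[1], reverse=True)

# Example inventory (made-up numbers): name, allocated GB, written GB.
inventory = [("oracle_data", 2000, 450), ("exchange_logs", 500, 480), ("file_share", 1000, 200)]
for name, reclaimable_gb in thin_provisioning_candidates(inventory):
    print(f"{name}: ~{reclaimable_gb} GB could be returned to the pool")
```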

Other products that support SAN storage thin provisioning include 3PAR, Compellent, EMC DMX, IBM SVC, NetApp and PillarData.