Faster Docker initialization through Slacker snapshots & NFS storage

Just got back from EMCWorld2016 this week but on the way there and back I was perusing the FAST’16 papers. One of the papers I read  (see Slacker: Fast Distribution with Lazy Docker Containers, p. 181) discussed performance problems with initializing Docker container micro-services and how they could be solved using persistent, intelligent NFS storage.

It appears that Docker container initialization spends a lot of time provisioning and initializing a local file system for each container.  Docker containers typically make use of an AUFS (Another Union File System) storage driver which makes use of another file system (like ext4) as its underlying storage which has beneath it either DAS or external storage.

When using persistent and intelligent NFS storage, Docker can take advantage of storage system snapshots and cloning to improve container initialization significantly. In the paper, the researchers used Tintri as the underlying persistent, enterprise class NFS storage but I believe the functionality that’s taken advantage of is available with most enterprise class NAS systems and as such, is readily available with other storage subsystems.
Continue reading “Faster Docker initialization through Slacker snapshots & NFS storage”

Data storage features for virtual desktop infrastructure (VDI) deployments

The Planet Data Center by The Planet (cc) (from Flickr)
The Planet Data Center by The Planet (cc) (from Flickr)

Was talking with someone yesterday about one of my favorite topics, data storage for virtual desktop infrastructure (VDI) deployments.  In my mind there are a few advanced storage features that help considerably with VDI implemetations:

  • Deduplication – almost every one of your virtual desktops will share 75-90% of their O/S disk data with every other virtual desktop.  Having sub-file/sub-block deduplication can be a godsend for all this replicated data and reduce O/S storage requirements considerably.
  • 0 storage snapshots/clones – another solution to the duplication of O/S data is to use some sort of space conserving snapshots.  For example, one creates a master (gold) disk image and makes 100s if not 1000s of snapshots of it, taking almost no additional space.
  • Highly available/highly reliable storage – when you have a lone desktop dependent on DAS for it’s O/S, it doesn’t impact a lot of users if that device fails. However, when you have 100s to 1000s of users dependent on DAS device(s) for their O/S software, any DAS failure could impact all of them at the same time.  As such, one needs to move off DAS and invest in highly reliable and available external storage of some kind to sustain reasonable uptime for your user community.

Those seem to me to be the most important attributes for VDI storage but there are a couple more features/facilities which can also:

  • NAS systems with NFS – VDI deployments will generate lots of VMDKs for all the user desktop C: drives.  Although this can be managed with block level storage as separate LUNs or multi-VMDK LUNs, who want’s to configure a 100 to 1000 LUNs.  NFS files can perform just as well and are much easier to create on the fly and thus, for VDI it’s hard to beat NFS storage.
  • Boot storm enhancements – Another problem with VDI is that everyone gets to work 8am Monday and proceeds to boot up their (virtual) machines, which drives an awful lot of IO to their virtual C: drives.  Deduplication and 0 storage snapshots can help manage the boot storm as long as these characteristics are retained throughout system cache, i.e, deduplication exists in cache as well as on backend disk.  But there are other approaches to the problem as well, available from various vendors to better manage boot storms.
  • Anti-Virus scan enhancements – Similar to boot storms, A-V scans also typically happen around the same time for many desktop users and can be just as bad for virtual C: drive performance.  Again, deduplication or 0 storage snapshots can help (with above caveats) but some vendor storage can offload these activities from the desktop alltogether.  Also last weeks VMworld release of VMware’s vShield Edge (see VMworld 2010 review) also supports some A-V scan enhancements. Any of these approaches should be able to help.

Regular “dumb” block storage will always work but it will require a lot more raw storage, performance will suffer just when everybody gets back to work, and the administrative burden will be much higher.

I may seem biased but enterprise class reliability&availability with some of the advanced storage features described above can help make your deployment of VDI that much better for you and all your knowledge workers.

Anything I missed?