TPU and hardware vs. software innovation (round 3)

At the Google I/O conference this week, Google revealed (see Google supercharges machine learning tasks …) that it has been designing and operating its own processor chips in order to optimize machine learning.

They call the new chip a Tensor Processing Unit (TPU). According to Google, the TPU delivers an order of magnitude better power efficiency for machine learning than what’s achievable with off the shelf GPUs/CPUs. TensorFlow is Google’s open sourced machine learning software.

This is very interesting, as Google and the rest of the hyper-scale hive seem to have latched onto open sourced software and commodity hardware for all their innovation. This has led the industry to believe that hardware customization/innovation is dead and the only thing anyone needs is software developers. I believe this is incorrect and that hardware innovation combined with software innovation is a better way (see Commodity hardware always loses and Better storage through hardware posts).
Faster Docker initialization through Slacker snapshots & NFS storage

Just got back from EMC World 2016 this week, but on the way there and back I was perusing the FAST’16 papers. One of the papers I read (see Slacker: Fast Distribution with Lazy Docker Containers, p. 181) discussed performance problems with initializing Docker container micro-services and how they could be solved using persistent, intelligent NFS storage.

It appears that Docker container initialization spends a lot of time provisioning and initializing a local file system for each container. Docker containers typically use the AUFS (Another Union File System) storage driver, which sits on top of another file system (like ext4), which in turn sits on either DAS or external storage.

When using persistent and intelligent NFS storage, Docker can take advantage of storage system snapshots and cloning to improve container initialization significantly. In the paper, the researchers used Tintri as the underlying persistent, enterprise class NFS storage but I believe the functionality that’s taken advantage of is available with most enterprise class NAS systems and as such, is readily available with other storage subsystems.
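To make the snapshot-and-clone idea concrete, here is a minimal sketch of copy-on-write cloning in Python. It is only an illustration of the general technique, not Slacker's code or any storage vendor's API: the `Snapshot` and `Clone` classes, the block numbering and the sample data are all hypothetical. The point is that each new container gets a clone that shares the image's blocks and stores only what it changes, rather than a full copy of the image.

```python
class Snapshot:
    """Immutable set of blocks, standing in for a storage-system snapshot."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)  # block number -> bytes

    def read(self, n):
        return self._blocks[n]


class Clone:
    """Writable view over a snapshot; only changed blocks are stored."""

    def __init__(self, snapshot):
        self._base = snapshot
        self._delta = {}  # blocks written since the clone was created

    def read(self, n):
        # Prefer this clone's own write, else fall back to the shared snapshot.
        if n in self._delta:
            return self._delta[n]
        return self._base.read(n)

    def write(self, n, data):
        self._delta[n] = data

    def private_blocks(self):
        return len(self._delta)


# Hypothetical container image held as a snapshot (made-up contents).
image = Snapshot({0: b"kernel libs", 1: b"app binary", 2: b"config"})

# Two containers start from the same image without copying any blocks.
c1, c2 = Clone(image), Clone(image)
c1.write(2, b"config for container 1")
```

After the write, `c1` sees its own version of block 2 while `c2` still reads the shared original, and only one block of extra space has been consumed.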
Surprises from 4 years of SSD experience at Google

Flash field experience at Google 

In a FAST’16 article I recently read (Flash reliability in production: the expected and unexpected, see p. 67), researchers at Google reported on field experience with flash drives in their data centers, totaling many millions of drive days covering MLC, eMLC and SLC drives with a minimum of 4 years of production use (3 years for eMLC). In some cases, they had 2 generations of the same drive in their field population. SSD reliability in the field is not what I would have expected and was a surprise to Google as well.

The SSDs seem to be used in a number of different application areas but mainly as SSDs with a custom designed PCIe interface (FusionIO drives maybe?). Aside from the technology changes, there were some lithographic changes as well: from 50 to 34nm for SLC, from 50 to 43nm for MLC and from 32 to 25nm for eMLC NAND technology.
Better erasure coding for scale-out & cloud storage

LRC(6,2,2) example layout
Microsoft Azure uses a different style of erasure coding for their cloud storage than what I have encountered in the past. Their erasure coding technique was documented in a paper presented at USENIX ATC’12 (for more info check out their Erasure coding in Windows Azure Storage paper).

The new erasure coding can be optimized for either rebuild read overhead or storage space overhead. It can also, at times, correct for more errors than equivalent, more traditional, Reed-Solomon (RS) erasure coding schemes.
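The trade-off between rebuild reads and space overhead can be worked through with some simple arithmetic. The sketch below compares an LRC(6,2,2) layout (6 data fragments in 2 local groups of 3, with 1 local parity per group and 2 global parities) against an RS(6,3) layout, which is my own choice of comparison point. The helper names are hypothetical, and using plain XOR for the local parity is a simplifying assumption made for illustration; a real implementation does its parity arithmetic over a finite field.

```python
from functools import reduce


def storage_overhead(data, parity):
    """Raw capacity consumed per unit of user data."""
    return (data + parity) / data


def xor_bytes(fragments):
    """Byte-wise XOR of equal-length fragments (simplified local parity)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*fragments))


# LRC(6,2,2): 6 data + 2 local parities + 2 global parities = 10 fragments.
lrc_overhead = storage_overhead(6, 2 + 2)  # 10/6, about 1.67x
# RS(6,3): 6 data + 3 parities = 9 fragments.
rs_overhead = storage_overhead(6, 3)       # 9/6 = 1.5x

# Fragments read to rebuild ONE lost data fragment:
lrc_rebuild_reads = 6 // 2  # the 2 surviving group members + 1 local parity = 3
rs_rebuild_reads = 6        # RS needs any 6 of the surviving fragments

# Rebuild a lost fragment from its local group only (made-up data).
group = [b"abc", b"def", b"ghi"]
local_parity = xor_bytes(group)
rebuilt = xor_bytes([group[0], group[2], local_parity])  # group[1] was "lost"
```

In this example the LRC layout pays a bit more in space (about 1.67x vs. 1.5x) but halves the reads needed for the most common repair, a single lost fragment, from 6 to 3.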
Intel Cloud Day 2016 news and views

A couple of weeks back I was at Intel Cloud Day 2016 with the rest of the TFD team. We listened to a number of presentations from Intel’s management team, mostly about how the IT world is changing and how they plan to help lead the transition to the new cloud world.

The view from Intel is that any organization with 1200 to 1500 servers has enough scale to do a private cloud deployment that would be more economical than using public cloud services. Intel’s new goal is to facilitate 10,000 (private) clouds being deployed across the world.

In order to facilitate the next 10,000, Intel is working hard to introduce a number of new technologies and programs that they feel can make it happen. One that was discussed at the show was the new OpenStack scheduler based on Google’s open sourced Kubernetes technology, which provides container management for Google’s own infrastructure but now supports the OpenStack framework.

Another way Intel is helping is by building a new 1000 (500 now) server cloud test lab in San Antonio, TX. Of course the servers will use the latest Xeon chips from Intel (see below for more info on the latest chips). The other enabling technology discussed a lot at the show was software defined infrastructure (SDI), which applies across the data center, networking and storage.

According to Intel, security isn’t the number 1 concern holding back cloud deployments anymore. Nowadays it’s more the lack of skills that’s governing how quickly the enterprise moves to the cloud.

At the event, Intel talked about a couple of verticals that seemed to be ahead of the pack in adopting cloud services, namely education and healthcare. They also spent a lot of time talking about the new technologies they were introducing today.
A tale of two AFAs: EMC DSSD D5 & Pure Storage FlashBlade

There’s been an ongoing debate in the analyst community about the advantages of software only innovation vs. hardware-software innovation (see Commodity hardware loses again and Commodity hardware always loses posts). Here is another example where two separate companies have turned to hardware innovation to take storage innovation to the next level.

DSSD D5 and FlashBlade

Within the last couple of weeks, two radically different AFAs were introduced: one by perennial heavyweight EMC with their new DSSD D5 rack scale flash system, and the other by relative newcomer Pure Storage with their new FlashBlade storage system.

These two arrays seem to be going after opposite ends of the storage market: the 5U DSSD D5 is going after both structured and unstructured data that needs ultra high speed IO access times (<100µsec), while the 4U FlashBlade is going after more general purpose unstructured data. And yet the two have many similarities, at least superficially.
5D storage for humanity’s archive

A group of researchers at the University of Southampton in the UK has invented a new type of optical recording, based on femtosecond laser pulses and silica/quartz media, that can store up to 300TB per (1″ diameter) disc platter with thermal stability at up to 1000°C or a media life of up to 13.8B years at room temperature (190°C?). The claim is that the memory device could outlive humanity and maybe the universe.

The new media/recording technique was used recently to create copies of text files (Holy Bible, pictured above). Other significant humanitarian, political and scientific treatises have also been stored on the new media. The new device has been nicknamed “Superman Memory Crystal”, due to the memory glass’s (quartz) likeness to Superman’s memory crystals.

We have written before on long term archives (see Super Long Term Archive and Today’s data and the 1000 year archive posts) but this one beats them all by many orders of magnitude.
(QoM16-002): Will Intel Omni-Path GA in scale out enterprise storage by February 2016 – NO 0.91 probability

Question of the month (QoM) for February is: Will Intel Omni-Path (Architecture, OPA) GA in scale out enterprise storage by February 2016?

In this forecast, enterprise storage means the major and startup vendors supplying storage to data center customers.

What is OPA?

OPA is Intel’s replacement for InfiniBand and starts out at 100Gbps. It’s intended more for high performance computing (HPC), to be used as an inter-cluster server interconnect or next generation fabric. Intel says it “will maintain consistency and compatibility with existing Intel True Scale Fabric and InfiniBand APIs by working through the open source OpenFabrics Alliance (OFA) software stack on leading Linux* distribution releases”. Seems like Intel is making it as easy as possible for vendors to adopt the technology.
