096: GreyBeards YE2019 IT Industry Trends podcast

In this, our year-end industry wrap-up episode, the GreyBeards discuss trends and technologies impacting the IT industry in 2019 and what’s ahead for 2020. This year we have Matt and Keith on the podcast along with Ray. Just like last year, we start off with NVMeoF.

NVMeoF unleashed

This year just about every major storage vendor announced new systems that either have support for NVMeoF or currently offer NVMeoF on their storage systems. Most offer FC based NVMeoF but a few offer NVMeoF/Ethernet, fewer still offer both.

All of the NVMeoF/Ethernet offerings seem to be using RoCE or iWARP. It’s unclear if one is more often used than the other, so for now both continue to be used in the market. Some storage vendors are offering NVMeoF as an internal fabric to access storage while still using iSCSI or FC/SCSI to access the data. This works better than SAS but won’t provide all the performance you can get from end-to-end NVMeoF.

NVMeoF is all about increasing IOPS and reducing response times. That and getting ready for SCM SSDs. In the meantime, the SSD industry has introduced some very attractive NVMe (NAND) SSDs that, in an NVMeoF storage system, can increase IOPS and reduce latencies.

We talked last year about NVMeoF standards finally stabilizing and this year the rollout across enterprise storage systems is testament to that.

SCM hits the enterprise

Most of us attended an Intel Data Center Event earlier this past year, where Optane DC PM was introduced. Optane DC PM is the memory version of Optane SCM (3D XPoint) technology. Intel offers two distinct modes of accessing Optane DC PM as memory: 1) App Direct mode, where data in Optane DC PM persists across power cycles but requires one to use a special API; and 2) Memory mode, where Optane DC PM is cleared during a power cycle (see our RayOnStorage post Need memory, Intel’s Optane DC PM…).

Vendors seem to be using Optane, as both memory and SCM, in different ways. Pure is using Optane SSDs plugged into their FlashArray as a sort of read cache for customer IO. They suggest that for well-behaved applications this can reduce IO response times considerably.

Dell EMC introduced SCM as a storage tier and are using their automated storage tiering to move the hottest data to SCM. Oracle’s latest Exadata appliance uses Optane DC PM as both a read and write caching layer.

It won’t be long before every enterprise vendor offers SCM drives in their storage systems, with a few offering Optane DC PM as an in-memory caching technology.

Of course, the big news for Optane DC PM is its use in in-memory databases, specifically SAP HANA. HANA can take advantage of the (6) TB of memory to handle larger databases. Keith mentioned that even Microsoft SQL Server can take advantage of the additional memory to provide faster responses to queries.

Keith also mentioned that there are some systems out there that can be configured to share Optane memory (or storage). When SAP or other databases use this solution they are able to amortize the cost of the technology over more use cases.

Of course, Optane DC PM is only available on the latest generation of Intel processors. None of us have heard anything from AMD (or Micron) on providing a second source for support of Optane DC PM (or the memory technology itself). Presumably most customers would want a second source for Optane DC PM processor support (as well as the technology).

Cloud enterprise storage hits mainstream

The other thing we saw more of this year is enterprise vendors offering versions of storage in public cloud environments. NetApp was an early proponent of doing this.

We saw at Pure that they have a new Cloud Block Store, which is a re-architected version of FlashArray//X storage using AWS hardware and networking services. We were very impressed with what they have accomplished and it was the subject of more than one late night discussion. Listen to the Keith & Ray show at Pure//Accelerate2019 podcast to learn more.

Matt mentioned Nimble’s cloud volume storage which is cloud adjacent. Most enterprise vendors offer something similar today. They differentiate on how easy it is to configure, use and where (which regions) it’s available in.

NetApp has arguably been at this the longest and has the deepest offerings available from cloud adjacent file and block storage, to offering native enterprise file services for all public cloud environments, to supplying a suite of dedicated data services to surround all of their storage technology operating in public clouds and on premises.

While Dell EMC may have missed the turn to the cloud, they are quickly trying to catch up. Keith mentioned Faction, a Dell partner that offers cloud storage services using VMware with VMC. With Faction and vSAN customers have access to software defined storage that uses cloud hardware to support data services.

What’s driving data growth

There seems to be no end to the need for storage to store data. The GreyBeards point to three trends driving data growth today.

  1. IoT seems to have no bounds. A recent RayOnStorage post, Internet of Tires, discussed how tire companies were tying their tires to the internet. And that’s just the start. Pretty soon every artifact, every device, every manufactured item will have a number of sensors attached, all of which will be creating massive amounts of data.
  2. AI ML DL has an insatiable appetite for data. IoT is being used largely to optimize products and services. But it’s DL, with a large dollop of data, that is behind much of that optimization.
  3. SaaS is a relatively new application approach that’s being rolled out to more arenas and, as it’s online and user oriented, seems to generate lots of data.

Containers storage debate

We closed the podcast with a heavy debate on whether container applications have need for storage. Keith was adamant that containers by their very nature are stateless and that Kubernetes’ ability to stop and start container applications at will almost requires stateless operations.

Ray was a bit more theoretical on the topic and believed that most container applications today take advantage of some sort of database or other services to store state and that state is just another word for storage.

Keith mentioned encoding as a typical container app. Encoding containers can be fired up and taken down at will without hurting anything but throughput. Yes, but those encoder container apps must access some database or other state information to find out what work is left to do and as they complete their work they update this data as well as store their newly encoded segments. This all involves the use of state information.
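To make the encoder example a bit more concrete, here is a minimal sketch of a "stateless" worker whose only memory is an external state store. Everything in it is an assumption made for illustration: the in-memory SQLite table stands in for whatever database or service a real deployment would use, and the names `make_state_store`, `run_encoder` and `pending` are hypothetical.

```python
import sqlite3


def make_state_store(segments):
    """Create the external state store (here, an in-memory SQLite table)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE work (segment TEXT PRIMARY KEY, status TEXT, output TEXT)"
    )
    db.executemany(
        "INSERT INTO work VALUES (?, 'pending', NULL)", [(s,) for s in segments]
    )
    db.commit()
    return db


def run_encoder(db, max_segments):
    """A stateless worker: all it knows comes from, and goes back to, the store."""
    done = 0
    while done < max_segments:
        row = db.execute(
            "SELECT segment FROM work WHERE status = 'pending' "
            "ORDER BY segment LIMIT 1"
        ).fetchone()
        if row is None:
            break  # no work left
        segment = row[0]
        encoded = segment.upper()  # stand-in for the real encoding work
        db.execute(
            "UPDATE work SET status = 'done', output = ? WHERE segment = ?",
            (encoded, segment),
        )
        db.commit()
        done += 1
    return done


def pending(db):
    """Count segments that still need encoding."""
    return db.execute(
        "SELECT COUNT(*) FROM work WHERE status = 'pending'"
    ).fetchone()[0]


if __name__ == "__main__":
    store = make_state_store(["seg-a", "seg-b", "seg-c"])
    run_encoder(store, 1)   # first container is taken down after one segment
    run_encoder(store, 10)  # a fresh container picks up where it left off
    print("segments still pending:", pending(store))
```

The worker can be killed and restarted at any point without losing track of the job, but only because the work list and the results live somewhere that persists, which is the point Ray was making.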

In the end, I think we were talking about the same thing but using different terminology. Keith believes that persistent state information is needed and Ray says that this is just another word for (containers) storage. Matt said we probably need Nigel (@NigelPoulton) on the podcast to straighten us both out.

The podcast ran a bit long and could have run longer. Keith and Matt bring a systems level perspective to what’s happening in the storage market, but they come at it from different sides. Ray seems to frame everything from a storage perspective. Diverse perspectives lead to a fuller and more interesting discussion. Listen to the podcast to learn more.



Ray Lucchesi (@RayLucchesi) is the host of GreyBeardsOnStorage and is President/Founder of Silverton Consulting, and a prominent blogger at RayOnStorage.com.

Keith Townsend (@CTOAdvisor) is an IT thought leader who has written articles for many industry publications, interviewed many industry heavyweights, worked with Silicon Valley startups, and engineered cloud infrastructure for large government organizations. Keith is the co-founder of The CTO Advisor and blogs at Virtualized Geek.

Matt Leib (@MBLeib), one of our co-hosts, has been blogging in the storage space for over 10 years, with work experience on both the engineering and presales/product marketing sides. His blog is at Virtually Tied to My Desktop.


93: GreyBeards talk HPC storage with Larry Jones, Dir. Storage Prod. Mngmt. and Mark Wiertalla, Dir. Storage Prod. Mkt., at Cray, an HPE Enterprise Company

Supercomputing Conference 2019 (SC19) is coming to Denver next week and in anticipation of that show, we thought it would be good to talk with an HPC storage group. We contacted HPE and given their recent acquisition of Cray, they offered up Larry and Mark to talk about their new ClusterStor E1000 storage system.

There are a number of components that go into Cray supercomputers and besides the ClusterStor, Larry and Mark mentioned their new SlingShot cluster interconnect which is Ethernet based with significant enhancements to congestion handling. But the call focused on ClusterStor.

What is ClusterStor

ClusterStor is a Lustre file system hardware appliance. Lustre has always been popular with the HPC crowd as it offered high bandwidth file services. But Lustre often took a team of (PhD) scientists to configure, deploy and run properly because of all the parameters that had to be set up for optimum performance.

Cray’s ClusterStor was designed to make configuring, deploying and running Lustre a lot simpler with a GUI and system defaults that provided an optimal running environment. But if customers still want access to all Lustre features and functionality, all the Lustre parameters can still be tweaked to personalize it.

What sort of appliance

The ClusterStor team has created a Lustre storage appliance using two systems, a 2U-24 NVMe SSD system and a 4U-106 disk drive system. Both systems use PCIe Gen 4 buses, which offer 2X the bandwidth of Gen 3, and NVMe Gen 4 SSDs. Each ClusterStor E1000 appliance comes with 2 servers for HA and the storage behind it.

Larry said the 2U NVMe Gen 4 appliance offers 80GB/sec of read and 60GB/sec of write data bandwidth. And a full rack of these could support ~2.5TB/sec of data bandwidth. One TB/sec seems like an awful lot to the GreyBeards, 2.5TB/sec, out of this world.

We asked if it supported InfiniBand interconnects. Yes, they said it supports the latest generation of InfiniBand but it also offers Cray’s own (SlingShot) Ethernet interconnect, unusual for HPC environments. And as in any Lustre parallel file system, servers accessing storage use Lustre client software.

ClusterStor Data Services

But on the backend, where normally one would see only LDISKFS for backend storage, ClusterStor also offers ZFS. Larry and Mark said that LDISKFS is faster but ZFS offers more functionality like snapshots and data compression.

Many of the Top 100 & Top 500 supercomputing environments are starting to deploy ML DL (machine learning-deep learning) workloads along with their normal HPC activities. But whereas HPC work has historically depended on bandwidth to read, write and move large files around, ML DL deals with small files and needs high IOPS. ClusterStor was designed to satisfy both high bandwidth and high IOPS workloads.

In previous HPC Lustre flash solutions, customers had to deal with the complexity of where to place data, such as on flash or on disk. But with the new ClusterStor E1000, the system can do all this for you. That is, it will move data from disk to flash when it sees an advantage to doing so and move it back again when that advantage is gone. But, just as with Lustre configuration parameters above, customers can still pre-stage data to flash.

The other challenge for HPC environments is extreme size. Cray and others are starting to see requirements for exascale (exabyte, 10**18 bytes) storage systems. In fact, Cray has a couple of ClusterStor E1000 configurations of 400PB or more already. As these systems age they may indeed grow to exceed an exabyte.

With an exabyte of data, systems need to support billions of files/inodes and better metadata services and indexing. ClusterStor offers optimized inode indexing and search to enable HPC users to quickly find the data they are looking for. Further, ClusterStor offers data at rest encryption and supports virtual file systems for multi-tenancy.

With a ZFS backend, ClusterStor can supply data compression and snapshots. Cray has tested ZFS compression on HPC scientific (mostly already application compressed) data and still sees a ~30% reduction in storage footprint. At an exabyte of storage, 30% can be a significant cost reduction.

The podcast ran long, ~46 minutes. Larry and Mark had a good knowledge of the HPC storage space and were easy to talk with. Matt’s an old ZFS hand, so wanted to talk even more about ZFS. I had a good time discussing ClusterStor and Lustre features/functionality and how the HPC workloads are changing. Listen to the podcast to learn more. [The podcast was recorded on November 6th, not the 5th as mentioned in the lead in, Ed.]
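As a back-of-the-envelope check on what ~30% means at these scales, here is a small sketch. The 30% figure and the 400PB and exabyte sizes come from the discussion above. The function name `compressed_footprint_pb` is hypothetical, and the assumption that the reduction applies uniformly to all stored data is ours.

```python
PB_PER_EB = 1000  # decimal units: 1 EB = 1000 PB


def compressed_footprint_pb(logical_pb, reduction_pct):
    """Return (stored_pb, saved_pb) for a given percentage footprint reduction."""
    saved = logical_pb * reduction_pct / 100
    return logical_pb - saved, saved


if __name__ == "__main__":
    for label, size_pb in (("400PB system", 400), ("1EB system", PB_PER_EB)):
        stored, saved = compressed_footprint_pb(size_pb, 30)
        print(f"{label}: stores {stored:.0f}PB, saves {saved:.0f}PB")
```

At an exabyte, that works out to roughly 300PB of capacity that does not have to be purchased, powered or cooled.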


Larry Jones, Director Storage Product Management

Larry Jones is a director of storage product management for Cray, a Hewlett Packard Enterprise company.

Jones previously held senior product management roles at Seagate, DDN and Panasas.

Mark Wiertalla, Director Storage Product Marketing

Mark Wiertalla is a product marketing director for Cray, a Hewlett Packard Enterprise company.

Prior to Cray, Wiertalla held product manager roles at EMC and SGI.

92: Ray talks AI with Mike McNamara, Sr. Manager, AI Solution Mkt., NetApp

Sponsored By: NetApp

NetApp’s been working in the AI DL (deep learning) space for a long time now and announced their partnership with NVIDIA DGX systems back in August of 2018. At NetApp Insight this week, they were showing off their new NVIDIA DGX systems reference architectures. These architectures use NetApp AFF A800 storage (for more info on AI DL, check out Ray’s Learning Machine (deep) Learning posts – part 1, – part 2 and – part 3).

Besides the ONTAP AI systems, NetApp also offers

  • FlexPod AI solution based on their partnership with Cisco using UCS C480 ML M5 rack servers which include 8 NVIDIA Tesla V100 GPUs and also features NetApp AFF A800 storage for use in core AI DL.
  • NetApp HCI has two configurations with 2- or 3-NVIDIA GPUs that come in 1U or 2U rack servers and run VMware vSphere or Red Hat OpenStack/OpenShift software hypervisors suitable for edge or core AI DL.
  • E-series reference architecture that uses the BeeGFS parallel file system and offers InfiniBand data access for HPC or core AI DL.

On the conference floor, NetApp showed AI DL demos for automotive, financial services, public sector and healthcare verticals. They also had a facial recognition application running that could estimate your age and emotional state (I didn’t try it, but Mike said they were hedging the model so it predicted a lower age).

Mike said one healthcare solution was focused on radiological image scans, to identify pathologies from X-ray, MRI, or CAT scan images. Mike mentioned there was a lot of burnout among radiological technologists due to the volume of work caused by the medical imaging explosion over the last decade or so. Mike said image analysis is something that AI DL can perform very effectively and doing so would improve the accuracy and reduce the volume of work being done by technologists.

He also mentioned another healthcare application that uses an AI DL app to count TB cells in blood samples and estimate the extent of TB infections. Historically, this has been time consuming, error prone and hard to do in the field. The app uses a microscope with a smart phone and can be deployed and run anywhere in the world.

Mike mentioned a genomics AI DL application that examined DNA sequences and tried to determine their functionality. He also mentioned a retail AI DL facial recognition application that would help women “see” what they would look like with different makeup on.

There was a lot of discussion on NetApp Cloud services at the show, such as Cloud Volumes Service and Azure NetApp Files (ANF). Both of these could easily be used to implement an AI DL application or be part of an edge to core to cloud data flow for an AI DL application deployment using NetApp Data Fabric.

NetApp also announced a new, all flash StorageGRID appliance that was targeted at heavy IO intensive uses of object store like AI DL model training and data analytics.

Finally, Mike mentioned NetApp’s ecosystem of partners working in the AI space to help customers deploy AI DL algorithms in their industries. Some of these include:

  1. Flexential, Try and Buy AI, so that customers could bring them in to supply AI DL expertise to generate an AI DL application using customer data and deploy it on customer cloud or on prem infrastructure.
  2. Core Scientific, AI-as-a-Service, so that customers could purchase a service to implement an AI DL application using customer data and running on Core Scientific infrastructure.
  3. Scale Matrix, Mobile data center AI, so that customers could create an AI DL application and run it on Scale Matrix infrastructure that was transported to wherever the customer wanted it to be run.

We recorded the podcast on the show floor, in a glass booth, so there’s some background noise (sorry about that, but can’t be helped). The podcast is ~27 minutes. Mike is a long time friend and NetApp product expert, recently working in AI DL solutions at NetApp. When I saw Mike at Insight, I just had to ask him about what NetApp’s been doing in the AI DL space. Listen to the podcast to learn more.


Mike McNamara, Senior Manager AI Solution Marketing, NetApp

With over 25 years of data management product and solution marketing experience, Mike’s background includes roles of increasing responsibility at NetApp (10+ years), Adaptec, EMC and Digital Equipment Corporation. 

In addition to his past role as marketing chairperson for the Fibre Channel Industry Association, he was a member of the Ethernet Technology Summit Conference Advisory Board, a member of the Ethernet Alliance, a regular contributor to industry journals, and a frequent speaker at events.

89: Keith & Ray show at Pure//Accelerate 2019

There were plenty of announcements at Pure//Accelerate in Austin this past week and we were given a preview of them at a StorageFieldDay Exclusive (SFDx), the day before the announcement.

First up is Pure’s DirectMemory. They have added Optane SSDs to FlashArray//X to be used as a read cache for customer data. As you may know, Pure already has an NVRAM write cache. With DirectMemory, customers can have 3TB or 6TB of Optane storage in a FlashArray//X70 or //X90 system. It almost looks plug and play: you take out one or two flash modules and plug in Optane SSD(s) and off it goes. DirectMemory went GA at the show.

Pure also announced FlashArray//C at Accelerate. This is a new capacity-optimized storage solution. They have re-designed their flash module to support higher capacity flash, and supply higher capacity storage (targeted for QLC flash but will originally ship with TLC). FlashArray//C supplies ~5PB of effective (~1.4PB raw) capacity in 9U. Although FlashArray//C offers cheaper storage on a $/GB basis, it is also much slower (RT latency on the order of 2-4msec) than FlashArray//X storage. Pure, like other vendors we have talked with, is trying to drive disk technology out of the enterprise. We had some interesting discussions with Pure (and others) on this topic at the reception. Just remember, tape is still alive and well in the enterprise AND cloud, 52 years after being pronounced dead.

Pure had announced CloudBlockStore (CBS) previously but it is now GA through partners or on AWS marketplace. Give them kudos, as they have taken a different approach to Pure storage in the cloud. With CBS, they have effectively re-architected and re-implemented Pure FlashArray using AWS EC2, IO1, EBS and S3 storage and ended up with highly available (iSCSI) block software defined storage. It will be interesting to see how well it’s adopted. Picture is from me explaining CBS architecture to @DVellante.
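For a sense of how much data reduction those FlashArray//C capacity numbers imply, here is a quick calculation. The ~5PB effective and ~1.4PB raw figures are from the announcement as described above. The function name is hypothetical, and the ratio is computed against raw flash, ignoring whatever is set aside for RAID and spares, so it is only a rough lower bound on the reduction applied to usable capacity.

```python
def implied_reduction_ratio(effective_pb, raw_pb):
    """Ratio of effective (post data reduction) capacity to raw flash capacity."""
    if raw_pb <= 0:
        raise ValueError("raw capacity must be positive")
    return effective_pb / raw_pb


if __name__ == "__main__":
    ratio = implied_reduction_ratio(5.0, 1.4)
    print(f"implied effective-to-raw ratio: {ratio:.2f}:1")
```

So the effective capacity claim assumes customer data reduces by somewhat more than 3.5:1 on average.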

For Pure’s FlashBlade storage, they have doubled the number of blades in a cluster (or name space), from 75 to 150 FlashBlades. Each FlashBlade contains storage and compute (almost computational storage), so one should see an increase in bandwidth with the added blades. None at Pure would go on record with specific numbers on any performance improvement because it’s still undergoing testing.

Finally, FlashArray//X will offer full NFS and SMB file support. This is coming from a recent acquisition (Compuverde). They plan to differentiate between file storage on FlashArray//X and on FlashBlade by saying that FlashArray//X file is for customers with mostly block storage requirements who also need a small amount of file storage, and FlashBlade is for everyone else that needs file.

The podcast is ~23 minutes. Keith is a long time friend and co-host of our GreyBeards On Storage podcast. He’s always got an interesting perspective on how new technology can benefit the data center today. Listen to the podcast to learn more.


Keith Townsend, The CTO Advisor

Keith Townsend (@CTOAdvisor) is an IT thought leader who has written articles for many industry publications, interviewed many industry heavyweights, worked with Silicon Valley startups, and engineered cloud infrastructure for large government organizations. Keith is the co-founder of The CTO Advisor, blogs at Virtualized Geek, and can be found on LinkedIn.

85: GreyBeards talk NVMe NAS with Howard Marks, Technologist Extraordinary and Plenipotentiary, VAST Data Inc.

As most of you know, Howard Marks was a founding co-Host of the GreyBeards-On-Storage podcast and has since joined VAST Data, an NVMe file and object storage vendor headquartered in NY with R&D out of Israel. We first met with VAST at StorageFieldDay18 (SFD18, video presentation). Howard announced his employment at that event. VAST was a bit circumspect at their SFD18 session but Howard seems to be more talkative, so on the podcast we learn a lot more about their solution.

VAST Data is essentially an NFS-S3 object store, scale out solution with both stateless, VAST Data storage servers and JBoF drive enclosures with Optane and NVMe QLC SSDs. Storage servers or JBoFs can be scaled independently. They don’t support tiering or DRAM caching of data but instead seem to use the Optane SSDs as a write buffer for the QLC SSDs.

At the SFD18 event their spokesperson said that they were going to kill off disk storage media. (Ed’s note: Disk shipments fell 18% y/y in 1Q 2019, with enterprise disk shipments at 11.5M units, desktop at 24.5M units and laptops at 37M units).

The hardware

The VAST Data storage servers are in a 2U/4 server configuration that runs interface protocols (NFS & S3), data reduction (see below), data reformatting/buffering, etc. They are stateless servers with all the metadata and other control state maintained on JBoF Optane drives.

Each drive enclosure JBoF has 12 Optane SSDs and 44 U.2 QLC (no DRAM/no super cap) SSDs. This means there are no write buffers on the QLC SSDs that can lose data when power failures occur. The interface to the JBoF is NVMeoF, either RDMA-RoCE Ethernet or InfiniBand (customer selected). Their JBoFs have high availability, with dual fabric modules that support 2-100Gbps Ethernet/InfiniBand ports per module, 4 per JBoF.

Minimum starting capacity is 500TB and they claim support up to Exabytes. Although how much has actually been tested is an open question. They also support billions of objects/files.

Guaranteed better data reduction

They have a rather unique, multi-level, data reduction scheme. At the start, data is chunked in variable length chunks. They use heuristics to determine the chunk size that fits best. (Ed note, unclear which is first in this sequence below so presented in (our view of) logical order)

  • 1st level computes a similarity hash (56 bit not SHA1), which is used to determine a similarity level with any other currently stored data chunk in the system.
  • 2nd level uses a ZSTD compression algorithm. If a similarity is found, the new data chunk is compressed with the ZSTD compression algorithm and a reference dictionary used by the earlier, similar data chunk. If no existing chunk is similar to this one, the algorithm identifies a semi-unique reference dictionary that optimizes the compression of this data chunk. This semi-unique dictionary is stored as metadata.
  • 3rd level: if it turns out to be a complete duplicate data chunk, then the dedupe count for the original data chunk is incremented, a pointer is saved to the original unique data and the data discarded. If not a complete duplicate of other data, the system computes a delta from the closest “similar” block and stores just the delta bytes, includes a pointer to the original similar block and increments a delta block counter.

So data is chunked, compressed with an optimized dictionary, and then delta-diffed or deduped. All data reduction is done post data write (after the client is ACKed), and presumably, data is re-hydrated after being read from SSD media. VAST Data guarantees better data reduction for your stored data than any other storage solution.
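To illustrate the general idea (and only the general idea), here is a toy sketch of dedupe plus dictionary compression against a similar chunk. It is not VAST Data's implementation. Python's standard `zlib` with a preset dictionary stands in for ZSTD, the similarity key is a crude "smallest 8-byte window" stand-in for their 56-bit similarity hash, chunks are fixed rather than variable length, and the separate delta step is left out. The `ReductionStore` class and all of its names are hypothetical.

```python
import hashlib
import zlib

SHINGLE = 8  # window size in bytes; an arbitrary choice for this sketch


def similarity_key(chunk):
    """Toy similarity key: the lexicographically smallest SHINGLE-byte window."""
    if len(chunk) <= SHINGLE:
        return bytes(chunk)
    return min(chunk[i:i + SHINGLE] for i in range(len(chunk) - SHINGLE + 1))


class ReductionStore:
    def __init__(self):
        self.by_digest = {}      # exact-content digest -> chunk id
        self.by_similarity = {}  # similarity key -> id of the reference chunk
        self.chunks = {}         # chunk id -> (compressed bytes, reference id or None)
        self.refcount = {}       # chunk id -> number of logical copies

    def put(self, chunk):
        """Store a chunk; return (chunk id, 'duplicate' | 'similar' | 'unique')."""
        digest = hashlib.sha256(chunk).digest()
        if digest in self.by_digest:  # complete duplicate: just bump the count
            cid = self.by_digest[digest]
            self.refcount[cid] += 1
            return cid, "duplicate"
        cid = len(self.chunks)
        key = similarity_key(chunk)
        ref = self.by_similarity.get(key)
        if ref is None:  # nothing similar stored yet: compress standalone
            comp = zlib.compress(chunk)
            self.by_similarity[key] = cid
            kind = "unique"
        else:  # compress using the similar chunk's contents as the dictionary
            compressor = zlib.compressobj(zdict=self.get(ref))
            comp = compressor.compress(chunk) + compressor.flush()
            kind = "similar"
        self.chunks[cid] = (comp, ref)
        self.by_digest[digest] = cid
        self.refcount[cid] = 1
        return cid, kind

    def get(self, cid):
        """Re-hydrate a chunk from its compressed form."""
        comp, ref = self.chunks[cid]
        if ref is None:
            return zlib.decompress(comp)
        decompressor = zlib.decompressobj(zdict=self.get(ref))
        return decompressor.decompress(comp) + decompressor.flush()


if __name__ == "__main__":
    store = ReductionStore()
    base = b"the quick brown fox jumps over the lazy dog " * 20
    near = base[:-10] + b"Z" * 10
    for data in (base, near, base):
        cid, kind = store.put(data)
        print(cid, kind, len(store.chunks[cid][0]), "compressed bytes")
```

Even in this toy form, the design trade-off is visible: reading back a "similar" chunk requires first re-hydrating the chunk it was compressed against, which is easier to live with when the media underneath has fast random reads.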

New data protection

They also supply a unique Locally Decodable Erasure Coding with 4 parity (-like) blocks and anywhere from 36 (single enclosure leaving 4 spare U.2 SSDs) to 150 data blocks per stripe, all of which support up to 4 device failures per stripe.

The locally decodable erasure coding scheme allows for rebuilds without having to read all remaining data blocks in a stripe. In this scheme, once you read the 4 parity (-like) blocks, one has all the information calculated from up to ¾ of the remaining drives in the stripe, so the system only has to read the remaining ¼ drives in the stripe to reconstruct one, two, three, or four failing drives. Given their data stripe width, this cuts down on the amount of data needing to be read considerably. Even so, with 150 data drives in a stripe, the system still has to read 38 drives worth of QLC SSD data to rebuild a data drive.

In addition to all the above, VAST Data also reblocks the data into much larger segments (it writes 1MB segments to the QLC drives) and uses a heat map along with other heuristics to separate actively written data from less actively written data, thus reducing garbage collection and write amplification.

The podcast is a long one and runs ~43 minutes. Howard has always been great to talk with and if anything, now being a vendor, has intensified this tendency. Listen to the podcast to learn more.
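The rebuild arithmetic can be checked with a few lines. This only reproduces the counting described above (read about a quarter of the data drives, rounded up); it says nothing about how the code itself is constructed. The function names are hypothetical, and the comparison assumes a conventional erasure code that has to read as many surviving blocks as there are data blocks in the stripe.

```python
import math


def ldec_data_reads(data_drives):
    """Data drives read for a rebuild when only ~1/4 of the stripe is needed."""
    return math.ceil(data_drives / 4)


def conventional_reads(data_drives):
    """Blocks read by a conventional erasure code rebuild: one per data block."""
    return data_drives


if __name__ == "__main__":
    for width in (36, 150):
        print(
            f"{width} data drives: read {ldec_data_reads(width)} "
            f"instead of {conventional_reads(width)}"
        )
```

For the widest stripe that is 38 drives rather than 150, and for a single enclosure it is 9 rather than 36.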

Howard Marks, Technologist Extraordinary and Plenipotentiary, VAST Data, Inc.

Howard Marks brings over forty years of experience as a technology architect for hire and industry observer to his role as VAST Data’s Technologist Extraordinary and Plenipotentiary. In this role, Howard demystifies VAST’s technologies for customers and customer requirements for VAST’s engineers.

Before joining VAST, Howard ran DeepStorage, an industry test lab and analyst firm. An award-winning speaker, he has appeared at events on three continents including Comdex, Interop and VMworld.

Howard is the author of several books (all gratefully out of print) and hundreds of articles since Bill Machrone taught him journalism at PC Magazine in the 1980s.

Listeners may also remember that Howard was a founding co-Host of the Greybeards-on-Storage Podcast.