096: GreyBeards YE2019 IT Industry Trends podcast

In this, our year-end industry wrap-up episode, the GreyBeards discuss trends and technologies impacting the IT industry in 2019 and what’s ahead for 2020. This year we have Matt and Keith on the podcast along with Ray. Just like last year, we start off with NVMeoF.

NVMeoF unleashed

This year just about every major storage vendor either announced new systems with NVMeoF support or already offers NVMeoF on their current storage systems. Most offer FC-based NVMeoF, but a few offer NVMeoF/Ethernet; fewer still offer both.

All of the NVMeoF/Ethernet offerings seem to use RoCE or iWARP. It’s unclear whether one is used more often than the other, so for now both continue to be used in the market. Some storage vendors offer NVMeoF as an internal fabric to access storage while still using iSCSI or FC/SCSI for host access to data. This works better than SAS but won’t provide all the performance you can get from end-to-end NVMeoF.

NVMeoF is all about increasing IOPS and reducing response times. That, and getting ready for SCM SSDs. In the meantime, the SSD industry has introduced some very attractive NVMe (NAND) SSDs that, in an NVMeoF storage system, can increase IOPS and reduce latencies.

We talked last year about NVMeoF standards finally stabilizing, and this year the rollout across enterprise storage systems is a testament to that.

SCM hits the enterprise

Most of us attended an Intel Data Center Event earlier this past year, where Optane DC PM was introduced. Optane DC PM is the memory version of Optane SCM (3D XPoint) technology. Intel offers two distinct modes of accessing Optane DC PM as memory: 1) App Direct mode, where data in Optane DC PM persists across power cycles but requires the use of a special API; and 2) Memory mode, where Optane DC PM is cleared during a power cycle (see our RayOnStorage post Need memory, Intel’s Optane DC PM…).
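
To make the two modes concrete, here’s a rough sketch of what App Direct-style persistence looks like, assuming a DAX-mounted file system at a hypothetical /mnt/pmem path. Real App Direct applications would typically use Intel’s PMDK libraries; this Python approximation just shows the load/store-plus-flush pattern that distinguishes App Direct from Memory mode.

```python
# Rough sketch of App Direct-style persistence from Python.
# Assumes a DAX-mounted filesystem at /mnt/pmem (hypothetical path);
# production code would use Intel's PMDK libraries instead.
import mmap
import os

PMEM_FILE = "/mnt/pmem/appdirect.dat"   # hypothetical DAX-backed file
SIZE = 4096

# Create/size the backing file, then map it into the address space.
fd = os.open(PMEM_FILE, os.O_CREAT | os.O_RDWR, 0o600)
os.ftruncate(fd, SIZE)
buf = mmap.mmap(fd, SIZE)

buf[0:5] = b"hello"          # ordinary loads/stores, no read()/write() syscalls
buf.flush()                  # force the stores out to the persistent media

# After a power cycle, remapping the same file sees the data intact --
# unlike Memory mode, where the DRAM-like contents are cleared.
buf.close()
os.close(fd)
```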

Vendors seem to be using Optane memory and SCM technology differently. Pure is using Optane SSDs plugged into their FlashArray as sort of a read cache for customer IO. They suggest that for well-behaved applications this can reduce IO response times considerably.

Dell EMC introduced SCM as a storage tier and are using their automated storage tiering to move the hottest data to SCM. Oracle’s latest Exadata appliance uses Optane DC PM as both a read and write caching layer.

It won’t be long before every enterprise vendor offers SCM drives in their storage systems, with a few offering Optane DC PM as an in-memory caching technology.

Of course, the big news for Optane DC PM is its use with in-memory databases, specifically SAP HANA. HANA can take advantage of the 6TB of memory to handle larger databases. Keith mentioned that even Microsoft SQL Server can take advantage of the additional memory to provide faster responses to queries.

Keith also mentioned that there are some systems out there that can be configured to share Optane memory (or storage). When SAP or other databases use this solution they are able to amortize the cost of the technology over more use cases.

Of course, Optane DC PM is only available on the latest generation of Intel processors. None of us have heard anything from AMD (or Micron) on providing a second source for Optane DC PM support (or the memory technology itself). Presumably most customers would want a second source for Optane DC PM processor support (as well as for the technology).

Cloud enterprise storage hits mainstream

The other thing we saw more of this year is enterprise vendors offering versions of their storage in public cloud environments. NetApp was an early proponent of doing this.

We saw at Pure that they have a new Cloud Block Store, which is a re-architected version of FlashArray//X storage using AWS hardware and networking services. We were very impressed with what they have accomplished and it was the subject of more than one late night discussion. Listen to the Keith & Ray show at Pure//Accelerate2019 podcast to learn more.

Matt mentioned Nimble’s cloud volume storage, which is cloud adjacent. Most enterprise vendors offer something similar today. They differentiate on how easy it is to configure and use, and where (which regions) it’s available.

NetApp has arguably been at this the longest and has the deepest offerings available, from cloud-adjacent file and block storage, to native enterprise file services for all public cloud environments, to a suite of dedicated data services that surround all of their storage technology operating in public clouds and on premises.

While Dell EMC may have missed the turn to the cloud, they are quickly trying to catch up. Keith mentioned Faction, a Dell partner that offers cloud storage services using VMware with VMC. With Faction and vSAN customers have access to software defined storage that uses cloud hardware to support data services.

What’s driving data growth

There seems to be no end to the need for storage to store data. The GreyBeards point to three trends driving data growth today.

  1. IoT seems to have no bounds. A recent RayOnStorage post, Internet of Tires, discussed how tire companies were tying their tires to the internet. And that’s just the start; pretty soon every artifact, every device, every manufactured item will have a number of sensors attached, all of which will be creating massive amounts of data.
  2. AI/ML/DL has an insatiable appetite for data. IoT is being used largely to optimize products and services. But it’s DL, with a large dollop of data, that is behind much of that optimization.
  3. SaaS is a relatively new application approach that’s being rolled out to more arenas, and as it’s online and user oriented, it seems to generate lots of data.

Containers storage debate

We closed the podcast with a heavy debate on whether container applications have need for storage. Keith was adamant that containers by their very nature are stateless and that Kubernetes’ ability to stop and start container applications at will almost requires stateless operation.

Ray was a bit more theoretical on the topic and believed that most container applications today take advantage of some sort of database or other services to store state and that state is just another word for storage.

Keith mentioned encoding as a typical container app. Encoding containers can be fired up and taken down at will without hurting anything but throughput. Yes, but those encoder container apps must access some database or other state information to find out what work is left to do, and as they complete their work they update this data as well as store their newly encoded segments. All of this involves the use of state information.

In the end, I think we were talking about the same thing but using different terminology. Keith believes that persistent state information is needed and Ray says that this is just another word for (containers) storage. Matt said we probably need Nigel (@NigelPoulton) on the podcast to straighten us both out.

The podcast ran a bit long and could have run longer. Keith and Matt bring a systems-level perspective to what’s happening in the storage market. But they come at it from different sides. Ray seems to frame everything from a storage perspective. Diverse perspectives lead to a fuller and more interesting discussion. Listen to the podcast to learn more.



Ray Lucchesi (@RayLucchesi) is the host of GreyBeardsOnStorage and is President/Founder of Silverton Consulting, and a prominent blogger at RayOnStorage.com.

Keith Townsend (@CTOAdvisor) is an IT thought leader who has written articles for many industry publications, interviewed many industry heavyweights, worked with Silicon Valley startups, and engineered cloud infrastructure for large government organizations. Keith is the co-founder of The CTO Advisor and blogs at Virtualized Geek.

Matt Leib (@MBLeib), one of our co-hosts, has been blogging in the storage space for over 10 years, with work experience on both the engineering and presales/product marketing sides. His blog is at Virtually Tied to My Desktop.


85: GreyBeards talk NVMe NAS with Howard Marks, Technologist Extraordinary and Plenipotentiary, VAST Data Inc.

As most of you know, Howard Marks was a founding co-host of the GreyBeards-On-Storage podcast and has since joined VAST Data, an NVMe file and object storage vendor headquartered in NY with R&D out of Israel. We first met with VAST at StorageFieldDay18 (SFD18, video presentation). Howard announced his employment at that event. VAST was a bit circumspect at their SFD18 session, but Howard seems to be more talkative, so on the podcast we learn a lot more about their solution.

VAST Data is essentially a scale-out NFS and S3 object store solution, with both stateless VAST Data storage servers and JBoF drive enclosures holding Optane and NVMe QLC SSDs. Storage servers and JBoFs can be scaled independently. They don’t support tiering or DRAM caching of data but instead seem to use the Optane SSDs as a write buffer for the QLC SSDs.

At the SFD18 event their spokesperson said that they were going to kill off disk storage media. (Ed’s note: Disk shipments fell 18% y/y in 1Q 2019, with enterprise disk shipments at 11.5M units, desktop at 24.5M units and laptops at 37M units).

The hardware

The VAST Data storage servers come in a 2U/4-server configuration that runs the interface protocols (NFS & S3), data reduction (see below), data reformatting/buffering, etc. They are stateless servers, with all the metadata and other control state maintained on JBoF Optane drives.

Each JBoF drive enclosure has 12 Optane SSDs and 44 U.2 QLC SSDs (no DRAM/no super cap). This means there are no write buffers on the QLC SSDs that can lose data when power failures occur. The interface to the JBoF is NVMeoF, either RDMA-RoCE Ethernet or InfiniBand (customer selected). Their JBoFs have high availability, with dual fabric modules that each support two 100Gbps Ethernet/InfiniBand ports, four per JBoF.

Minimum starting capacity is 500TB, and they claim support for up to exabytes, although how much has actually been tested is an open question. They also support billions of objects/files.

Guaranteed better data reduction

They have a rather unique, multi-level data reduction scheme. At the start, data is chunked into variable length chunks. They use heuristics to determine the chunk size that fits best. (Ed’s note: it’s unclear which step comes first in the sequence below, so they are presented in (our view of) logical order.)

  • 1st level computes a similarity hash (56-bit, not SHA-1), which is used to determine a similarity level with any other currently stored data chunk in the system.
  • 2nd level uses the ZSTD compression algorithm. If a similarity is found, the new data chunk is compressed with the ZSTD compression algorithm and a reference dictionary used by the earlier, similar data chunk. If no existing chunk is similar to this one, the algorithm identifies a semi-unique reference dictionary that optimizes the compression of this data chunk. This semi-unique dictionary is stored as metadata.
  • 3rd level: if the chunk turns out to be a complete duplicate of an existing data chunk, then the dedupe count for the original data chunk is incremented, a pointer is saved to the original unique data, and the new data is discarded. If it is not a complete duplicate of other data, the system computes a delta from the closest “similar” block, stores just the delta bytes along with a pointer to the original similar block, and increments a delta block counter.

So data is chunked, then compressed with an optimized dictionary, delta-diffed, or deduped. All data reduction is done post data write (after the client is ACKed) and, presumably, data is re-hydrated after being read from SSD media. VAST Data guarantees better data reduction for your stored data than any other storage solution.
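
To make the multi-level flow easier to follow, here’s a hedged sketch of how such a pipeline might be structured. The chunking, hash choices, and dictionary handling are our assumptions, not VAST’s implementation, and the delta-diff step is omitted for brevity; in particular, the truncated cryptographic hash below is only a stand-in, since a real similarity hash maps near-identical chunks to the same value.

```python
# Hedged sketch of the 3-level reduction flow; details are our assumptions.
# Requires the 'zstandard' package (pip install zstandard).
import hashlib
import zstandard

dedupe = {}     # full hash -> reference count (level 3: exact duplicates)
similar = {}    # similarity hash -> compression dictionary (levels 1 and 2)

def reduce_chunk(chunk: bytes):
    full = hashlib.sha256(chunk).digest()
    if full in dedupe:                        # complete duplicate: just count it
        dedupe[full] += 1
        return ("dedupe-ref", full)
    dedupe[full] = 1

    # Level 1: 56-bit similarity hash. A truncated cryptographic hash is only
    # a stand-in here -- a real similarity hash maps near-identical chunks to
    # the same value, which this one does not.
    sim = hashlib.blake2b(chunk, digest_size=7).digest()

    if sim in similar:                        # level 2: reuse the earlier,
        cdict = similar[sim]                  # similar chunk's dictionary
    else:                                     # else derive a semi-unique
        cdict = zstandard.ZstdCompressionDict(chunk)  # dictionary, stored
        similar[sim] = cdict                          # as metadata

    cctx = zstandard.ZstdCompressor(dict_data=cdict)
    return ("compressed", cctx.compress(chunk))
```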

New data protection

They also supply a unique Locally Decodable Erasure Coding scheme with 4 parity(-like) blocks and anywhere from 36 (single enclosure, leaving 4 spare U.2 SSDs) to 150 data blocks per stripe, all of which support up to 4 device failures per stripe.

The locally decodable erasure coding scheme allows for rebuilds without having to read all remaining data blocks in a stripe. In this scheme, once the system reads the 4 parity(-like) blocks, it has all the information calculated from up to ¾ of the remaining drives in the stripe, so it only has to read the remaining ¼ of the drives in the stripe to reconstruct one, two, three, or four failing drives. Given their data stripe width, this cuts down considerably on the amount of data that needs to be read. Still, with 150 data drives in a stripe, the system has to read 38 drives worth of QLC SSD data to rebuild a data drive.
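
The read savings are easy to put numbers on. A quick back-of-the-envelope, using the stripe geometries mentioned above and our reading of the scheme:

```python
# Rebuild read cost under locally decodable erasure coding (our reading):
# read the 4 parity(-like) blocks, which cover 3/4 of the data blocks,
# then read the remaining 1/4 of the data blocks directly.
import math

def rebuild_reads(data_blocks: int, parity_blocks: int = 4) -> int:
    return parity_blocks + math.ceil(data_blocks / 4)

print(rebuild_reads(36))    # narrow stripe:  4 + 9  = 13 reads
print(rebuild_reads(150))   # widest stripe:  4 + 38 = 42 reads, versus reading
                            # all 149 surviving blocks conventionally
```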

In addition to all the above, VAST Data also reblocks the data into much larger segments (it writes 1MB segments to the QLC drives) and uses a heat map along with other heuristics to separate actively written data from less actively written data, thus reducing garbage collection and write amplification.
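
Here’s a hedged sketch of how heat-based reblocking might work; the heat threshold and placement policy are our guesses, with only the 1MB segment size taken from the discussion.

```python
# Illustrative heat-map reblocking: small writes are appended to a "hot" or
# "cold" open segment, and full 1MB segments are written to the QLC SSDs.
# The threshold and policy here are our guesses, not VAST's design.
SEGMENT_SIZE = 1 << 20            # 1MB segments, per the discussion

heat = {}                         # block address -> recent write count
segments = {"hot": bytearray(), "cold": bytearray()}
qlc_writes = []                   # stand-in for writes to the QLC drives

def write_block(addr: int, data: bytes):
    heat[addr] = heat.get(addr, 0) + 1
    temp = "hot" if heat[addr] > 3 else "cold"   # assumed hotness threshold
    seg = segments[temp]
    seg += data
    if len(seg) >= SEGMENT_SIZE:
        # Data with similar lifetimes ages (and dies) together, so garbage
        # collection frees whole segments with little copying -- which is
        # what cuts write amplification.
        qlc_writes.append((temp, bytes(seg[:SEGMENT_SIZE])))
        del seg[:SEGMENT_SIZE]
```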

The podcast is long, running ~43 minutes. Howard has always been great to talk with and, if anything, now being a vendor has intensified this tendency. Listen to the podcast to learn more.

Howard Marks, Technologist Extraordinary and Plenipotentiary, VAST Data, Inc.

Howard Marks brings over forty years of experience as a technology architect for hire and industry observer to his role as VAST Data’s Technologist Extraordinary and Plenipotentiary. In this role, Howard demystifies VAST’s technologies for customers and customer requirements for VAST’s engineers.

Before joining VAST, Howard ran DeepStorage, an industry test lab and analyst firm. An award-winning speaker, he has appeared at events on three continents including Comdex, Interop and VMworld.

Howard is the author of several books (all gratefully out of print) and hundreds of articles since Bill Machrone taught him journalism at PC Magazine in the 1980s.

Listeners may also remember that Howard was a founding co-Host of the Greybeards-on-Storage Podcast.


83: GreyBeards talk NVMeoF/TCP with Muli Ben-Yehuda, Co-founder & CTO and Kam Eshghi, VP Strategy & Bus. Dev., Lightbits Labs

This is the first time we’ve talked with Muli Ben-Yehuda (@Muliby), Co-founder & CTO, and Kam Eshghi (@KamEshghi), VP of Strategy & Business Development, Lightbits Labs. Keith and I first saw them at Dell Tech World 2019 in Vegas, as they are a Dell Ventures funded organization. The company has 70 (mostly engineering) employees and is based in Israel, with offices in NY and the Valley as well as elsewhere around the world. Kam was previously with (Dell) EMC DSSD, and Muli spent years as a Master Inventor with IBM Research.

[This was Keith Townsend’s (@CTOAdvisor & The CTO Advisor), first time as a GreyBeard co-host and we had a great time with him on the show.]

I would have to say it was a far-ranging discussion, but it focused on their software defined, NVMeoF/TCP storage. As you may recall, we talked with Solarflare Communications last year, who were also working on NVMeoF/TCP, only in their case it was an accelerator board. After the recording, Muli said the hardware accelerator they have is their own design.

Why NVMeoF/TCP?

Most NVMeoF today that uses Ethernet requires RoCE or iWARP compatible NICs and switches. Lightbits Labs has long been active in the NVMeoF/RoCE-iWARP marketplace. Early on they noticed that enterprise and cloud service providers were reluctant to adopt NVMeoF technology because of the need to change out all their networking equipment to use it. This is what brought about their focus on NVMeoF/TCP.

The advantage of NVMeoF/TCP is that it can be run on any Ethernet NIC and switch available today. From Muli’s perspective, NVMeoF/TCP is going to become the next SAN of choice for the data center. They were active, early on, in the standards committee to push for NVMeoF/TCP adoption.

How does it work?

Their software defined solution runs LightOS® storage software, a Linux based package, on off-the-shelf server hardware with persistent storage (Optane DC PM/SSDs, NV DIMMs, V-NAND, etc.). They use persistent memory as a fast write buffer and a place where they can “mold” the written data into something that can be better written to the backend NVMe SSDs.

One surprise about the Lightbits solution is that it offers a decent set of data services. These include erasure coding, thin provisioning, wire-speed inline compression, QoS and wide striping. It seems like any of these can be disabled at a customer’s request. But they add only very little overhead. I think Muli mentioned one Lightbits customer with encrypted data that disabled compression.

Lightbits also offers a global FTL (flash translation layer), which means they control SSD addressing, mapping data to physical/raw NAND locations at the storage system level. If done well, a global FTL can help improve flash endurance and may offer better write performance (through increased parallelism).
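
As a rough illustration of what “global” means here, consider a logical-to-physical map kept at the system level rather than inside each SSD. The structure below is purely our sketch, not Lightbits’ design:

```python
# Minimal sketch of a system-level (global) FTL: one logical-to-physical map
# spanning the whole SSD pool. Purely illustrative, not Lightbits' design.
from itertools import cycle

NUM_DRIVES = 8
l2p = {}                               # logical block address -> (drive, page)
next_page = [0] * NUM_DRIVES
placement = cycle(range(NUM_DRIVES))   # round-robin for write parallelism

def ftl_write(lba: int) -> tuple:
    drive = next(placement)            # stripe fresh writes across drives
    loc = (drive, next_page[drive])    # always append, never overwrite in
    next_page[drive] += 1              # place -- good for NAND endurance
    l2p[lba] = loc                     # any old location becomes garbage
    return loc

def ftl_read(lba: int) -> tuple:
    return l2p[lba]                    # one lookup finds data on any drive
```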

Lightbits’ claim of inline, wire-speed data compression is premised on the use of more current CPUs with high (>=28) core counts in a storage server. If the storage server has older CPUs (<28 cores), they suggest you install their LightField™ hardware accelerator add-in card. LightField offers a number of hardware based performance accelerations in addition to compression speedups.

LightOS requires no host (client) software. Muli is a long-time Linux kernel contributor and indicated that the only thing LightOS needs is a current Linux kernel (5.0 or later), which has the NVMeoF/TCP driver software (and persistent memory). Lightbits believes it’s only a matter of time until other OSs also implement NVMeoF/TCP drivers.

Lightbits business considerations

Long term, Lightbits sees a need for compute-storage disaggregation in hyperscaler and enterprise cloud environments. Early on it was relatively easy to replicate servers with DAS storage, but as NVMe SSDs came out, the expense of doing this throughout a >>1000 server environment started to become exorbitant. If only there were an easy way to disaggregate storage from compute and still enjoy all the performance advantages of DAS NVMe SSDs. With LightOS they can do that.

Lightbits can be sold today through Dell, as a partner solution, which means that Dell can integrate, test and validate their servers with the LightField accelerator card and deliver that package to your data center. I believe you still need to purchase and install their LightOS software yourself.

Lightbits charges for LightOS software on a per storage node basis, but the charge differs based on the maximum number of NVMe SSD slots available in a server. There is no capacity charge. They also offer worldwide service and support for LightOS software and LightField hardware.

It’s all about performance

From a performance perspective, one Fortune 500 hyperscaler benchmarked their storage solution against a DAS NVMe server and found it added about 30 µsec to the IO latency as compared to DAS NVMe SSDs. From their perspective, the added data services, better endurance, and disaggregated compute-storage environment provided by LightOS more than made up for the additional overhead.

Finally, I asked whether multiple LightOS storage servers could be clustered together. Muli intervened and, after stating some legal stuff, said they were working on the next generation of LightOS and that it will support clustered storage servers, local data replication, as well as distributed (across storage servers) erasure coding.

The podcast is a long one and runs over ~47 minutes. There was a lot to talk about and Kam and Muli seem to know it all. It was interesting to hear the history of their pivot to TCP. They seem to have the right technology to address the market. Listen to the podcast to learn more.

Muli Ben-Yehuda, Co-founder and CTO, Lightbits Labs

Muli Ben-Yehuda is the CTO and Co-Founder of Lightbits Labs, where he leads technological developments.

Prior to founding Lightbits, he was chief scientist at Stratoscale and a researcher and Master Inventor at IBM Research.

He holds an M.Sc. in Computer Science (summa cum laude) from the Technion — Israel Institute of Technology and a B.A. (cum laude) from the Open University of Israel.

He is a long time Linux kernel contributor and his code and ideas are most likely included in an operating system or hypervisor running near you. He is also one of the authors of the NVMe/TCP standard and technology. 

Kam Eshghi, VP Strategy & Business Development, Lightbits Labs

Kam joined Lightbits Labs from Dell EMC and has over 20 years of experience in strategic marketing and business development with startups and public companies.

Most recently as VP of strategic alliances at startup DSSD, Kam led business development with technology partners and developed DSSD’s partnership with EMC, leading to EMC’s acquisition of DSSD.

Previously as Sr. Director of Marketing & Business Development at IDT, Kam built their NVMe Controller business from scratch. Previous to that, Kam worked in data center storage, compute and networking markets at HP, Intel, and Crosslayer Networks. 

Kam is a U.C. Berkeley and MIT graduate with a BS and MS in Electrical Engineering and Computer Science and an MBA.

78: GreyBeards YE2018 IT industry wrap-up podcast

In this, our year-end industry wrap-up episode, we discuss trends and technology impacting the IT industry in 2018 and what we can see ahead for 2019. First up is NVMeoF.

NVMeoF has matured

In prior years, NVMeoF was coming from startups, but last year it was major vendors like IBM (FlashSystem), Dell EMC (PowerMAX) and NetApp (AFF) releasing new NVMeoF storage systems. Pure Storage was arguably earliest with their NVMeoF JBoF.

Dell EMC, IBM and NetApp were not far behind this curve and no doubt see it as an easy way to reduce response time without having to rip and replace enterprise fabric infrastructure.

In addition, NVMeoF standards have finally started to stabilize. With the gang of startups, standards weren’t as much of an issue as they were more than willing to lead, ahead of standards. But major storage vendors prefer to follow behind standards committees.

As another example, VMware showed off an NVMeoF JBoF for vSAN. A JBoF like this improves vSAN storage efficiency for small clusters. Howard described how this works: with vSAN having direct access to shared storage, it can reduce data and server protection requirements for storage. Especially when dealing with small clusters of servers, which are becoming more popular these days for hosting application clusters.

The other thing about NVMeoF storage is that NVMe SSDs have also become very popular. We are seeing them come out in everyone’s servers and storage systems. Servers (and storage systems) hosting 24 NVMe SSDs are just not that unusual anymore. For the price of a PCIe switch, one can have blazingly fast, direct access to TBs of NVMe SSD storage.

HCI reaches critical mass

HCI has also moved out of the shadows. We recently heard news that HCI is outselling CI. Howard and I attribute this to the advances made in VMware’s vSAN 6.2 and the appliance-ification of HCI. That, and we suppose NVMe SSDs (see above).

HCI makes an awful lot of sense for the application clusters that VMware is touting these days. CI was easy, but an HCI appliance cluster is much simpler to deploy and manage.

For VMware HCI, vSAN Ready Nodes are available from just about any server vendor in existence. With ready nodes, VARs and distributors can offer an HCI appliance in the channel, just like the majors. Yes, it’s not the same as a vendor supplied appliance, doesn’t have the same level of software or service integration, but it’s enough.

[If you want to learn more, Howard is doing a series of deep dive webinars/classes on HCI as part of his friend Ivan’s ipSpace.net. The 1st 2hr session was recorded 11 December, part 2 goes live 22 January, and the final installment is on 5 February. The 1st session is available on demand to subscribers. Sign up here]

Computational storage finally makes sense

Howard and I first saw computational storage at FMS18, where we did a podcast with Scott Shadley of NGD Systems. Computational storage is an SSD with spare ARM cores and DRAM that can be used to run any storage intensive Linux application or Docker container.

Because it’s running in the SSD, it has lightning fast (even faster than NVMe) access to all the data on the SSD. Indeed, with 10s to 1000s of computational storage SSDs in a rack, each with multiple ARM cores, you can have many 1000s of cores available to perform your data intensive processing. Almost like GPUs, only for IO access to storage (SPUs?).
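
To show why this matters, here’s a toy sketch of the idea: push a filter down to the drive so only matching records cross the bus. The on_drive_scan() call is invented purely for illustration; NGD’s actual programming model (full Linux/Docker on the drive) looks different.

```python
# Toy sketch of computational storage: run the filter where the data lives,
# so only results cross the bus. on_drive_scan() is an invented stand-in for
# whatever the drive-side programming model actually looks like.
def on_drive_scan(records, predicate):
    """Imagine this executing on the SSD's spare ARM cores."""
    return [r for r in records if predicate(r)]

log_records = [b"error: disk 3", b"ok: disk 1", b"error: disk 7"]

# The host receives only the matches, not the full dataset. Multiply by 1000s
# of drives scanning in parallel and the host sees answers, not raw data.
matches = on_drive_scan(log_records, lambda r: r.startswith(b"error"))
print(matches)    # [b'error: disk 3', b'error: disk 7']
```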

We tried this at one vendor in the 90s, executing some database and backup services outboard, but it never took off. Then in the last couple of years, (Dell) EMC had some VM services that you could run on their midrange systems. But that didn’t seem to take off either.

The computational storage we’ve seen all runs Linux. With today’s data intensive applications coming from everywhere, and all the spare processing power in SSDs, it might finally make sense.

Futures

Finally, we turned to what we see coming in 2019. Howard was at an Intel Analyst event where they discussed Optane DIMMs. Our last podcast of 2018 was with Brian Bulkowski of Aerospike who discussed what Optane DIMMs will mean for high performance database systems and just about any memory intensive server application. For example, affordable, 6TB memory servers will be coming out shortly. What you can do with 6TB of memory is another question….

Howard Marks, Founder and Chief Scientist, DeepStorage

Howard Marks is the Founder and Chief Scientist of DeepStorage, a prominent blogger at Deep Storage Blog and can be found on twitter @DeepStorageNet.

Raymond Lucchesi, Founder and President, Silverton Consulting

Ray Lucchesi is the President and Founder of Silverton Consulting, a prominent blogger at RayOnStorage.com, and can be found on twitter @RayLucchesi. Sign up for SCI’s free, monthly e-newsletter here.