74: GreyBeards talk NVMe shared storage with Josh Goldenhar, VP Customer Success, Excelero

Sponsored by:

In this episode we talk NVMe shared storage with Josh Goldenhar (@eeschwa), VP, Customer Success at Excelero. Josh has been on our show before (see our April 2017 podcast), last time together with Excelero's CTO & Co-founder, Yaniv Romem.

This is Excelero's first sponsored GBoS podcast and we welcome them back to the show. Now that Excelero's NVMesh storage software is in customers' hands, Josh is adding customer support to his other duties.

NVMe storage industry trends

We started our discussion with the maturing NVMe market. Howard mentioned he heard that NVMe SSD sales have overtaken SATA SSD volumes. Josh said that NVMe SSDs are getting harder to come by, driven primarily by purchases from the Super 8 (the eight biggest hyper-scalers). And even when these SSDs can be found, customers are paying a premium for NVMe drives.

The industry is also starting to sell larger capacity NVMe SSDs. Customers view these as a way of buying cheaper ($/GB) storage. However, most NVMe shared storage systems use mirroring for data protection, which cuts effective (protected) capacity in half and doubles cost/GB.

Another change in the market is that, with today's apps, many customers no longer need all the read AND write IO performance their NVMe storage can deliver. For newer applications/workloads, writes are less frequent and thus less of a driver of application performance. But read performance is still critical.

The other industry trend is a number of new vendors offering NVMeoF (Ethernet) storage arrays (see: Pavilion Data's, Attala Systems' and Solarflare Communications' podcasts in just the last few months). Most of these startup systems are essentially top-of-rack shared NVMe SSDs, some with limited data protection/management services.

Excelero's NVMesh has offered a logical volume manager as well as protected NVMe shared storage since the start, with both RAID 0 (unprotected) and RAID 1/10 (protected) storage.

Excelero is coming out with a new release of its NVMesh™ software defined storage.

NVMesh 2

We were particularly interested in one of NVMesh 2's new capabilities, its distributed data protection, which is based on Erasure Coding (EC, like RAID 6), with a stripe that includes 8+2 segments. Unlike mirroring/RAID 1-10, EC reduces effective NVMe storage capacity by only 20% for protection, and it protects against two drive failures within a RAID group.
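
To put rough numbers on the capacity arithmetic (our illustration, not Excelero's), here's a quick sketch comparing an 8+2 EC stripe to mirroring:

```python
# Quick capacity arithmetic for data protection overhead (illustrative only).

def usable_fraction_ec(data_segments: int, parity_segments: int) -> float:
    """Fraction of raw capacity left for data with erasure coding."""
    return data_segments / (data_segments + parity_segments)

def usable_fraction_mirror(copies: int = 2) -> float:
    """Fraction of raw capacity left for data with N-way mirroring."""
    return 1 / copies

raw_tb = 100  # raw NVMe capacity in TB; pick any number

ec = usable_fraction_ec(8, 2)        # 8+2 stripe -> 0.80 usable, 20% overhead
mirror = usable_fraction_mirror(2)   # RAID 1/10  -> 0.50 usable, 50% overhead

print(f"8+2 EC:    {raw_tb * ec:.0f} TB usable ({(1 - ec):.0%} overhead)")
print(f"Mirroring: {raw_tb * mirror:.0f} TB usable ({(1 - mirror):.0%} overhead)")
# Effective $/GB scales inversely with the usable fraction, so mirroring
# costs 2x raw $/GB while 8+2 EC costs only 1.25x raw $/GB.
```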

However, with distributed data protection, write IO will not perform as well as reads. But reads perform just as fast as ever.

As with any data protection, customers will need sufficient spare capacity to rebuild data for a failed device.

The latest release will be available to all current customers on service contract. When it's available, customers should immediately start benefiting from the space-efficient, distributed data protection for new data on the system.

The new release also adds Fibre Channel (as Howard correctly guessed on the podcast) and TCP/IP protocols to their current InfiniBand, RoCE, and NVMeoF support, as well as new performance analytics to help diagnose performance issues faster and at scale.

The podcast runs ~25 minutes. Josh has an interesting perspective on the NVMe storage market as well as competitive solutions and was great to talk with again. The new data protection functionality in Excelero NVMesh 2 signals an evolving NVMe storage market. As NVMe storage matures, the tradeoff between performance and data services looks to be an active war zone for some time to come. Listen to the podcast to learn more.

Josh Goldenhar, Vice President Customer Success, Excelero

Josh has been responsible for product strategy and vision at leading storage companies for over two decades. His experience puts him in a unique position to understand the needs of customers.
Prior to joining Excelero, Josh was responsible for product strategy and management at EMC (XtremIO) and DataDirect Networks. Before that, his experience and passion were in large-scale systems architecture and administration with companies such as Cisco Systems. He's been a technology leader in Linux, Unix and other OSs for over 20 years. Josh holds a Bachelor's degree in Psychology/Cognitive Science from the University of California, San Diego.

72: GreyBeards talk Computational Storage with Scott Shadley, VP Marketing NGD Systems

For this episode the GreyBeards talked with another old friend, Scott Shadley, VP Marketing, NGD Systems. As we discussed on our FMS18 wrap-up show with Jim Handy, computational storage had sort of a coming out party at the show.

NGD Systems started in 2013 and has been working toward a solution that goes generally available at the end of this year. Their computational storage SSD supplies general purpose processing power sitting inside an SSD. NGD shipped their first prototypes in 2016, shipped an FPGA version of their smart SSD in 2017, and already has field-upgradable ASIC prototypes in customer hands.

NGD's smart SSDs have a 4-core ARM processor and run an Ubuntu distro on 3 of the cores. Essentially, anything that can be run on Ubuntu Linux, including Docker containers and Kubernetes, could be run on their smart SSDs.

NGD sells standard (storage only) SSDs as well as their smart SSDs. The smart hardware is shipped with all of their SSDs, but is only enabled after customers purchase a software license key. They currently offer their smart SSD solutions in America and Europe, with APAC coming later.

They offer smart SSDs in both 2.5" and M.2 form factors. NGD Systems is following the flash technology road map and currently offers a 16TB SSD in the 2.5" form factor.

How applications work on smart SSDs

They offer an open-source SDK which creates a TCP/IP tunnel across the NVMe bus that attaches their smart SSD. This allows the host and the SSD server to communicate and send (RPC) work back and forth between them.

A normal smart SSD workflow could be:

  1. Host server writes data onto the smart SSD;
  2. Host signals the smart SSD to perform work on the data on the smart SSD;
  3. Smart SSD processes the data that has been sent to the SSD; and
  4. When smart SSD work is done, it sends a response back to the host.

I assume somewhere before #2 above, you load application software onto the device.

The work to be done could be the same for every attached smart SSD, and it could easily be distributed across all attached smart SSDs and the host processor. For example, for image processing, a host processor would write images to be processed across all the SSDs, have each perform image recognition and append tags (or other results info) as metadata onto the image, and then respond back to the host. Or for media transcoding, video streams could be written to a smart SSD and have it perform transcoding completely outboard.
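
NGD's actual SDK API isn't covered in the podcast, so below is a purely hypothetical sketch of that fan-out, assuming each smart SSD shows up as a (host, port) endpoint over the SDK's TCP/IP tunnel and speaks a simple JSON-over-TCP RPC; the endpoint list and the "recognize" method are made up for illustration:

```python
# Hypothetical sketch of fanning image-recognition work out to smart SSDs.
# Assumes the SDK's TCP/IP tunnel exposes each device as a (host, port)
# endpoint speaking a simple JSON-over-TCP RPC -- the endpoints and the
# "recognize" method below are made up, not NGD's actual SDK API.
import json
import socket

SMART_SSD_ENDPOINTS = [("127.0.0.1", 9001), ("127.0.0.1", 9002)]  # one per SSD

def rpc(endpoint, request: dict) -> dict:
    """Send one JSON RPC over the TCP tunnel and wait for the reply."""
    with socket.create_connection(endpoint) as sock:
        sock.sendall(json.dumps(request).encode() + b"\n")
        reply = sock.makefile().readline()
    return json.loads(reply)

def tag_images(image_paths):
    """Round-robin images across smart SSDs; each device runs recognition
    locally and returns tags, so only small metadata crosses the NVMe bus."""
    results = {}
    for i, path in enumerate(image_paths):
        ssd = SMART_SSD_ENDPOINTS[i % len(SMART_SSD_ENDPOINTS)]
        # Steps 1-2 of the workflow: the data was already written to the
        # device; here we just tell the device which object to process.
        reply = rpc(ssd, {"method": "recognize", "object": path})
        # Step 4: the device responds with tags to append as metadata.
        results[path] = reply.get("tags", [])
    return results
```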

The smart SSD processors access the data just like the host processor would, or they can use services available in the SDK to access the data much faster. Just about any data processing you could do on the host processor could be done outboard, on smart SSD processor elements. Scott mentioned that memory-intensive applications are probably not a good fit for computational storage.

He also said that their processing (ARM) elements were specifically designed for low-power operation. So although AI training and inference processing might be much faster on GPUs, GPU power consumption is much higher. As a result, the power-performance of AI training and inference processing would be better on smart SSDs.

Markets for smart SSDs?

One target market for NGD's computational storage SSDs is hyper-scalers. At FMS18, Microsoft Research published a report on running FAISS software on NGD smart SSDs that led to a significant speedup. Scott also brought up one company they're working with that was testing to find out just how many 4K video streams can be processed on a gaggle of smart SSDs. There was also talk of three-letter (gov't) organizations interested in smart SSDs to encrypt data and perform other outboard processing of (intelligence) data.

Highly distributed applications and data remind me of a lot of HPC customers I know. But bandwidth is also a major concern for HPC. NVMe is fast, but there's a limit to how many SSDs can be attached to a server.

However, with NVMeoF, NGD Systems could support a lot more “attached” smart SSDs. Imagine a scoop of smart SSDs, all attached to a slurp of servers, performing data-intensive applications on their processing elements in a widely distributed fashion. Sounds like HPC to me.

The podcast runs ~39 minutes. Scott’s great to talk with and is very knowledgeable about the Flash/SSD industry and NGD Systems. His talk on their computational storage was mind expanding. Listen to the podcast to learn more.

Scott Shadley, VP Marketing, NGD Systems

Scott Shadley, Storage Technologist and VP of Marketing at NGD Systems, has more than 20 years of experience with Storage and Semiconductor technology. Working at STEC he was part of the team that enabled and created the world’s first Enterprise SSDs.

He spent 17 years at Micron, most recently leading the SATA SSD product line with record-breaking revenue and growth for the company. He is active on social media, a lover of all things high tech, enjoys educating and sharing, and is a self-proclaimed geek around mobile technologies.

70: GreyBeards talk FMS18 wrap-up and flash trends with Jim Handy, General Dir. Objective Analysis

In this episode we talk about Flash Memory Summit 2018 (FMS18) and recent trends affecting the flash market with Jim Handy, General Director, Objective Analysis. This is the 4th time Jim's been on our show; he's been our go-to guy on flash technology forever.

NAND supply?

Talking with Jim always makes for a far-reaching discussion. We quickly centered on recent NAND spot pricing trends. Jim said the market is seeing a 10 to 12% drop, Quarter/Quarter, in NAND spot pricing, almost 60% since the year started, which is starting to impact long-term contracts. During supply gluts like this, DRAM spot prices typically drop 40-60% Q/Q, so maybe there are more NAND price reductions on the way.

A new player in the NAND fab business was introduced at FMS18, Yangtze Memory Technology from China. Jim said they were one generation behind the leaders, which suggests their product costs ($/NAND bit) are likely 2X the industry's. But apparently, China is prepared to lose money until they can catch up.

I asked Jim if they have a hope of catching up – yes. For example, there have been some shenanigans with DRAM technology and a Chinese DRAM fab. They have (allegedly) stolen technology from Micron's Taiwan DRAM fab and, in turn, have sued Micron for patent infringement and won, locking Micron out of the Chinese DRAM market. With the DRAM market tightening, Micron's absence will hurt Chinese electronics producers. Others will step in, but Micron will have to focus its DRAM sales elsewhere.

3D Xpoint/Optane?

There wasn't much discussion on 3D XPoint. Intel did announce some customers for Optane SSDs and that they are starting to produce 3D XPoint in DIMMs. The Intel-Micron 3D XPoint partnership has dissolved. Intel seems willing to continue to price their Optane and 3D XPoint DIMMs below cost and make it up selling microprocessors.

Jim predicted years back there would be little to no market for 3D XPoint SSDs. With Optane SSDs at 5X the cost of NAND SSDs and only 5X faster, there isn't a significant enough advantage to generate the volumes needed to make a profitable product. But in a DIMM form factor, hanging off the memory bus, it's 1000X faster than NAND, and with that much performance, it shouldn't have a problem generating sufficient volumes to become profitable.
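
Putting back-of-the-envelope numbers on Jim's argument (our arithmetic, using only the 5X cost and 5X/1000X speed multipliers from the discussion above):

```python
# Back-of-the-envelope performance-per-dollar comparison vs. NAND (=1.0 each).
# The 5X cost and 5X/1000X speed multipliers come from the discussion above;
# everything else here is just arithmetic.
candidates = {
    "NAND SSD":       {"relative_cost": 1.0, "relative_speed": 1.0},
    "Optane SSD":     {"relative_cost": 5.0, "relative_speed": 5.0},
    "3D XPoint DIMM": {"relative_cost": 5.0, "relative_speed": 1000.0},
}

for name, c in candidates.items():
    perf_per_dollar = c["relative_speed"] / c["relative_cost"]
    print(f"{name:15s} perf/$ = {perf_per_dollar:7.1f}x NAND")
# Optane SSDs land at ~1x NAND's perf/$ (no compelling reason to switch),
# while the DIMM form factor lands at ~200x, which is why Jim expects the
# DIMM business to find the volumes that the SSDs never did.
```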

Other NAND/SCM news

We talked about the emergence of QLC NAND. With 3D NAND, there appear to be sufficient electrons to make QLC viable. The write speeds are still horrible, ~1000X slower than SLC. But vendors are now adding SLC NAND (as a write cache) to their SSDs to sustain faster writes.
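
To see why an SLC write cache helps, here's a toy model (illustrative numbers only, not any vendor's specs) where write bursts land in a fast SLC buffer and only fall back to the slow native QLC write rate once the buffer fills:

```python
# Toy model of an SLC write buffer in front of QLC NAND (made-up numbers).
# Bursts complete at SLC speed; once the buffer fills, writes are throttled
# to the QLC destage rate. The real SLC/QLC gap may be far larger.
SLC_BUFFER_GB  = 20      # fast SLC region carved out of the QLC die
SLC_WRITE_GBPS = 2.0     # host-visible write rate into SLC
QLC_WRITE_GBPS = 0.2     # native QLC write (destage) rate

def burst_time(burst_gb: float) -> float:
    """Seconds to absorb a write burst, starting from an empty SLC buffer."""
    if burst_gb <= SLC_BUFFER_GB:
        return burst_gb / SLC_WRITE_GBPS                    # all at SLC speed
    fast = SLC_BUFFER_GB / SLC_WRITE_GBPS                   # buffer fills fast
    slow = (burst_gb - SLC_BUFFER_GB) / QLC_WRITE_GBPS      # rest at QLC speed
    return fast + slow

for gb in (5, 20, 50):
    print(f"{gb:3d} GB burst -> {burst_time(gb):6.1f} s")
# Small and medium bursts finish at SLC speed; only sustained writes beyond
# the buffer size drop to the (much slower) native QLC write rate.
```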

The other new technology from FMS18 was computational storage. Computational storage vendors are putting compute near (inside) an SSD to better perform IO-intensive workloads. Some computational storage vendors talked about their technology and how it could speed up select workloads.

There are SCM technologies beyond 3D XPoint. These vendors have been quietly shipping for some time now; they just aren't at the capacities/bit densities needed to challenge NAND. Jim mentioned a few that are in production: EverSpin/MRAM, Adesto/ReRAM and Crossbar/FeRAM.

Jim said IBM was using EverSpin/MRAM technology in their latest FlashCore Modules for their FlashSystem 9100. And EverSpin MRAM is being used in satellites. Adesto/ReRAM is being used in the medical instrument market.

The podcast runs ~42 minutes. We apologize for the audio quality; we promise to do better next time. Jim's been the GreyBeards' memory and flash technology guru since before our hair turned grey and is always enlightening about the flash market and technology trends. Listen to the podcast to learn more.

Jim Handy, General Director, Objective Analysis

Jim Handy of Objective Analysis has over 35 years in the electronics industry including 20 years as a leading semiconductor and SSD industry analyst. Early in his career he held marketing and design positions at leading semiconductor suppliers including Intel, National Semiconductor, and Infineon.

A frequent presenter at trade shows, Mr. Handy is known for his technical depth, accurate forecasts, widespread industry presence and volume of publication. He has written hundreds of market reports, articles for trade journals, and white papers, and is frequently interviewed and quoted in the electronics trade press and other media. He posts blogs at www.TheMemoryGuy.com and www.TheSSDguy.com.

69: GreyBeards talk HCI with Lee Caswell, VP Products, Storage & Availability, VMware

Sponsored by:

For this episode we preview VMworld by talking with Lee Caswell (@LeeCaswell), Vice President of Product, Storage and Availability, VMware.

This is the third time Lee's been on our show; the previous time was back in August of last year. Lee's been at VMware for a couple of years now and, among other things, is leading the HCI journey at VMware.

The first topic we discussed was VMware’s expanded HCI software defined data center (SDDC) solution, which now includes compute, storage, networking and enhanced operations with alerts/monitoring/automation that ties it all together.

We asked Lee to explain VMware’s SDDC:

  • HCI operates at the edge – with ROBO 2-server environments, VMware's HCI can be deployed in a closet and remotely operated by a VI admin from the central site.
  • HCI operates in the data center – with vSphere-vSAN-NSX-vRealize and other software, VMware modernizes data centers for the pace of digital business.
  • HCI operates in the public cloud – with VMware Cloud (VMC) on AWS, IBM Cloud and over 400 service providers, VMware HCI also operates in the public cloud.
  • HCI operates for containers and cloud native apps – with support for containers under vSphere, vSAN and NSX, developers are finding VMware HCI an easy option to run container apps in the data center, at the edge, and in the public cloud.

The importance of the edge will become inescapable, as 50B edge-connected devices power IoT by 2020. Lee has heard Pat (Gelsinger) say compute processing is moving to the edge because of 3 laws:

  1. the law of physics, light/information only travels so fast;
  2. the law of economics, doing all processing at central sites would take too much bandwidth and cost; and
  3. the law(s) of the land, data sovereignty and control are ever more critical in today's world.

VMware SDDC is a full-stack option that executes just about anywhere the data center wants to go. Howard mentioned one customer he talked with at FMS18 who just wanted to take their 16-node VMware HCI rack and clone it forever, to supply infinite infrastructure.

Next, we turned our discussion to Virtual Volumes (VVols). Recently, VMware added replication support for VVols. Lee said VMware intends to provide an SRM SRA for VVols. But the real question is why there hasn't been higher field VVol adoption. We concluded it takes time.

VVols weren't available in vSphere 5.5, and nowadays three or more years have to go by before a significant portion of the field moves to a new release. Howard also said early storage systems didn't implement VVols right. Moreover, VMware vSphere 5.5 is just now (9/16/18) going EoGS.

Lee said 70% of all current vSAN deployments are AFA. With AFA, hand-tuning storage performance is no longer something admins need to worry about. It used to be we all spent time defragging/compressing data to squeeze more effective capacity out of storage, but hand capacity optimization like this has become a lost art. Just like capacity, hand-tuning AFA performance doesn't make sense anymore.

We then talked about the coming flash SSD supply glut. Howard sees flash pricing ($/GB) dropping by 40-50%, regardless of interface. This should drive AFA shipments above 70%, as long as the glut continues.

The podcast runs ~21 minutes. Lee’s always great to talk with and is very knowledgeable about the IT industry, HCI in general, and of course, VMware HCI in particular.  Listen to the podcast to learn more.

Lee Caswell, V.P. of Product, Storage & Availability, VMware

Lee Caswell leads the VMware storage marketing team driving vSAN products, partnerships, and integrations. Lee joined VMware in 2016 and has extensive experience in executive leadership within the storage, flash and virtualization markets.

Prior to VMware, Lee was vice president of Marketing at NetApp and vice president of Solution Marketing at Fusion-IO. Lee was a founding member of Pivot3, a company widely considered to be the founder of hyper-converged systems, where he served as the CEO and CMO. Earlier in his career, Lee held marketing leadership positions at Adaptec, and SEEQ Technology, a pioneer in non-volatile memory. He started his career at General Electric in Corporate Consulting.

Lee holds a bachelor of arts degree in economics from Carleton College and a master of business administration degree from Dartmouth College. Lee is a New York native and has lived in northern California for many years. He and his wife live in Palo Alto and have two children. In his spare time Lee enjoys cycling, playing guitar, and hiking the local hills.

68: GreyBeards talk NVMeoF/TCP with Ahmet Houssein, VP of Marketing & Strategy @ Solarflare Communications

In this episode we talk with Ahmet Houssein, VP of Marketing and Strategic Direction at Solarflare Communications (@solarflare_comm). Ahmet's been in the industry forever and has a unique view on where NVMeoF needs to go. Howard had talked with Ahmet at last year's FMS. Ahmet will also be speaking at this year's FMS (this week in Santa Clara, CA).

Solarflare Communications sells Ethernet communication gear, mostly to the financial services market, and has developed a software plugin for the standard TCP/IP stack on Linux that supports both target and client mode NVMeoF/TCP. That is, their software plugin provides a complete implementation of NVMeoF across TCP Ethernet that extends the TCP protocol but doesn't require RDMA (RoCE or iWARP) or data center bridging.

Implementing NVMeoF/TCP

Solarflare's NVMeoF/TCP is a free plugin that, once approved by the NVMe(oF) standards committees, anyone can use to create an NVMeoF storage system and consume that storage from almost anywhere. The standards committee is expected to approve the protocol extension soon, and sometime after that the plugin will be added to the Linux kernel. After standards approval, maybe VMware and Microsoft will adopt it as well, but that may take more work.

Over the last year plus, most of the NVMeoF/Ethernet storage we've encountered requires sophisticated RDMA hardware. When we talked with Pavilion Data Systems, a month or so ago, they had designed a more networking-like approach to NVMeoF using RoCE and TCP [updated 8/8/18 after VR's comment, the ed.]. When we talked with Attala Systems, they had a special purpose FPGA that's used in their RDMA NICs and Mellanox switches to support target & client mode NVMeoF/UDP [updated 8/8/18 after VR's comment, the ed.].

Solarflare is taking a different tack.

One problem with NVMeoF/Ethernet RDMA is compatibility: you can use either RoCE or iWARP RDMA NICs, but at the moment you can't mix the two. With TCP/IP plugins, there's no hardware compatibility issue. (Yes, there's still software compatibility to worry about at both ends of the pipe.)

Solarflare recently measured latencies for their NVMeoF/TCP (Iometer/FIO), which show that running the protocol adds about a 5-10% increase in latency versus running NVMeoF over RDMA (RoCE or iWARP).

Performance measurements were taken using a server running Red Hat Linux plus their TCP plugin, with NVMe SSDs on the storage side, and a similar configuration on the client side without the SSDs.

If they add 10% latency to a 10 microsec. IO (for Optane), latency becomes 11 microsec. Similarly, for flash NVMe SSDs it moves from 100 microsec. to 110 microsec.
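
A quick sanity check on that arithmetic, using the 5-10% overhead range from the measurements above:

```python
# Added latency from NVMeoF/TCP at the 5-10% overhead discussed above.
base_latencies_us = {"Optane SSD": 10, "flash NVMe SSD": 100}

for device, base in base_latencies_us.items():
    for overhead in (0.05, 0.10):
        print(f"{device:15s} {base:4d} us + {overhead:.0%} -> {base * (1 + overhead):5.1f} us")
# 10 us becomes 10.5-11 us and 100 us becomes 105-110 us -- a small price
# for dropping the RDMA NIC and DCB switch requirements.
```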

Ahmet did mention that their NICs have some hardware optimizations which bring this added latency down to something closer to 5%. Later we discuss the immense parallelism opportunities of using the TCP stack in user space. Their hardware also better supports more threads doing IO in parallel.

Why TCP

Ahmet's on a mission. He says there's a misbelief that Ethernet RDMA hardware is required to achieve lightning-fast response times using NVMeoF, but it's not true. Standard TCP with proper protocol enhancements is more than capable of performing at very close to the same latencies as RDMA, without special NICs and DCB switch configurations.

Furthermore, TCP/IP already has multipathing support. So the current high-availability characteristics of TCP are readily applicable to NVMeoF/TCP.

Parallelism through user space

NVMeoF/TCP was the subject of the 1st half of our discussion, but we spent the 2nd half talking about scaling and parallelism. Even if you can do 11 or 110 microsecond latency, at some point, if you do enough of these IOs, the kernel overhead of processing blocks and transferring control between kernel space and user space will become a bottleneck.

However, there's nothing stopping IT from running the TCP/IP stack in user space and eliminating any kernel control transfers whatsoever. By doing so, data centers could parallelize all this IO across as many cores as are available.

Running the plugin in a TCP/IP stack in user space allows you to scale NVMeoF lightning-fast IO to as many users as you have user spaces or cores, and the kernel doesn't even break into a sweat.

Anyone could simply download Solarflare's plugin, configure a white box server with Linux and 24 NVMe SSDs, and support ~8.4M IOPS (350K x 24) at ~110 microsec. latency. And with user-space scaling, one could easily have 1000s of user spaces connected to it.
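
For the curious, here's the aggregate IOPS math, plus a Little's Law aside (our arithmetic, not Solarflare's) on how much IO concurrency it takes to sustain that rate, which is exactly the kind of parallelism user-space TCP stacks across many cores can supply:

```python
# The aggregate IOPS math above, plus a Little's-Law aside (our arithmetic,
# not Solarflare's) showing how much concurrency is needed to sustain it.
iops_per_ssd = 350_000
ssd_count    = 24
latency_s    = 110e-6          # ~110 microseconds per IO

total_iops  = iops_per_ssd * ssd_count         # 8.4M IOPS
outstanding = total_iops * latency_s           # Little's Law: L = lambda * W

print(f"Aggregate: {total_iops / 1e6:.1f}M IOPS")
print(f"Outstanding IOs needed: ~{outstanding:.0f}")
# ~900+ IOs must be in flight at all times to hit 8.4M IOPS at 110 us --
# far more concurrency than a single kernel-mediated IO path can deliver
# without burning cores on context switches.
```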

They're going to need faster pipes!

The podcast runs ~39 minutes. Ahmet was very knowledgeable about NVMe, NVMeoF and TCP.  He was articulate and easy to talk with.  Listen to the podcast to learn more.

Ahmet Houssein, VP of Marketing and Strategic Direction at Solarflare Communications 

Ahmet Houssein is responsible for establishing marketing strategies and implementing programs to drive revenue growth, enter new markets and expand brand awareness to support Solarflare’s continuous development and global expansion.

He has over twenty-five years of experience in the server, storage, data center and networking industries, and has held senior-level executive positions in product development, marketing and business development at Intel and Honeywell. Most recently, Houssein was SVP/GM at QLogic, where he successfully delivered first-to-market 25Gb Ethernet products, securing design wins at HP and Dell.

One of the key leaders in the creation of the InfiniBand and PCI-Express industry standards, Houssein is a recipient of the Intel Achievement Award and was a founding board member of the Storage Networking Industry Association (SNIA), a global organization of 400 companies in the storage market. He was educated in London, UK and holds an Electrical Engineering degree equivalent.