75: GreyBeards talk persistent memory IO with Andy Grimes, Principal Technologist, NetApp

Sponsored By:  NetApp
In this episode we talk new persistent memory IO technology with Andy Grimes, Principal Technologist, NetApp. Andy presented at the NetApp Insight 2018 TechFieldDay Extra (TFDx) event (video available here). If you get a chance we encourage you to watch the videos, as Andy did a great job describing their new MAX Data persistent memory IO solution.

The technology for MAX Data came from NetApp’s Plexistor acquisition. Prior to the acquisition, Plexistor had also presented at SFD9 and TFD11.

Unlike NVMeoF storage systems, MAX Data is not sharing NVMe SSDs across servers. What MAX Data does is supply an application-neutral way to use persistent memory as a new, ultra fast, storage tier together with a backing store.

MAX Data performs a write or an “active” (Persistent Memory Tier) read in single digit µseconds for a single core/single thread server. Their software runs in user space and as such, for multi-core servers, it can take up to 40 µseconds. Access times for backend storage reads are the same as NetApp AFF, but once read, data is automatically promoted to persistent memory, and while there, reads are ultra fast.

One of the secrets of MAX Data is that they have completely replaced the Linux Posix File IO stack with their own software. Their software is streamlined and bypasses a lot of the overhead present in today’s Linux File Stack. For example, MAX Data doesn’t support metadata-journaling.

MAX Data works with many different types of (persistent) memory, including DRAM (non-persistent memory), NVDIMMs (DRAM+NAND persistent memory) and Optane DIMMs (Intel 3D XPoint memory, slated to be GA end of this year). We suspect it would work with anyone else’s persistent memory as soon as it comes on the market.

Even though the (Optane and NVDIMM) memory is persistent, server issues can still lead to access loss. In order to provide data availability for server outages, MAX Data also supports MAX Snap and MAX Recovery. 

With MAX Snap, MAX Data will upload all persistent memory data to ONTAP backing storage and take an ONTAP snapshot of it. This way you have a complete version of MAX Data storage that can then be backed up or SnapMirrored to other ONTAP storage.

With MAX Recovery, MAX Data will synchronously replicate persistent memory writes to a secondary MAX Data system. This way, if the primary MAX Data system goes down, you still have an RPO-0 copy of the data on another MAX Data system that can be used to restore the original data, if needed. Synchronous mirroring will add 3-4 µseconds to the write access times quoted above.

Given the extreme performance of MAX Data, it’s opening up a whole new set of customers to talking with NetApp, specifically high frequency traders (HFT) and high performance computing (HPC). HFT companies are attempting to reduce their stock transaction access times to as fast as humanly possible. HPC vendors have lots of data and processing all of it in a timely manner is almost impossible. Anything that can be done to improve throughput/access times should be very appealing to them.

To configure MAX Data, one uses a 1:25 ratio of persistent memory capacity to backing store. MAX Data also supports multiple LUNs.
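As a rough sizing aid, the 1:25 ratio can be turned into simple arithmetic. The sketch below is only an illustration: the function names are hypothetical and not part of any NetApp tool, and it assumes the ratio is applied as a straight multiplier.

```python
def backing_store_gb(pmem_gb: float, ratio: int = 25) -> float:
    """Backing-store capacity implied by a 1:ratio persistent memory tier."""
    return pmem_gb * ratio


def pmem_needed_gb(backing_gb: float, ratio: int = 25) -> float:
    """Persistent memory capacity implied by a given backing-store size."""
    return backing_gb / ratio


if __name__ == "__main__":
    # 512 GB of persistent memory fronts 12,800 GB of backing store at 1:25.
    print(backing_store_gb(512))
    # A 10,000 GB backing store calls for 400 GB of persistent memory.
    print(pmem_needed_gb(10_000))
```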

MAX Data only operates on Linux and supports (IBM) Red Hat and CentOS, but Andy said it’s not that difficult to add support for other Linux distros, and customers will dictate which other ones are supported over time.

As discussed above, MAX Data works with NetApp ONTAP storage, but it also works with SSD/NVMe SSDs as backend storage. In addition, MAX Data has been tested with NetApp HCI (with SolidFire storage, see our prior podcasts on NetApp HCI with Gabriel Chapman and Adam Carter) as well as E-Series storage. The Plexistor application has already been available on AWS Marketplace for use with EC2 DRAM and EBS backing store. It’s not much of a stretch to replace this with MAX Data.

MAX Data is expected to be GA released before the end of the year.

A key ability of the MAX Data solution is that it requires no application changes to use persistent memory for ultra-fast IO. This should help accelerate persistent memory adoption in data centers when the hardware becomes more available. Speaking to that, at Insight2018, Lenovo, Cisco and Intel were all on stage when NetApp announced MAX Data.

The podcast runs ~25 minutes. Andy’s an old storage hand (although no grey beard) and talks the talk, walks the walk of storage religion. Andy is new to TFD but we doubt it will be the last time we see him there. Andy was very conversant on the MAX Data technology and the market that it is apparently opening up. Listen to our podcast to learn more.

Andy Grimes, Principal Technologist, NetApp

Andy has been in the IT industry for 17 years, working in roles spanning development, technology architecture, strategic outsourcing and healthcare.

For the past 4 years Andy has worked with NetApp on taking the NetApp Flash business from #5 to #1 in the industry (according to IDC). During this period NetApp also became the fastest growing Flash and SAN vendor in the market and regained leadership in the Gartner quadrant.

Andy also works on NetApp’s product vision, competitive analysis and future technology direction, and works with the team bringing the MAX Data PMEM product to market.

Andy has a BS degree in psychology, a BPA in management information systems, and an MBA. He currently works as a Principal Technologist for the NetApp Cloud Infrastructure Business Unit with a focus on PMEM, HCI and Cloud Strategy. Andy lives in Apex, NC with his beautiful wife and has 2 children, a 4 year old and a 22 year old (yes don’t let this happen to you). For fun Andy likes to Mountain Bike, Rock Climb, Hike and Scuba Dive.

74: Greybeards talk NVMe shared storage with Josh Goldenhar, VP Cust. Success, Excelero

Sponsored by:

In this episode we talk NVMe shared storage with Josh Goldenhar (@eeschwa), VP, Customer Success at Excelero. Josh has been on our show before (please see our April 2017 podcast), the last time with Excelero’s CTO & Co-founder, Yaniv Romem.

This is Excelero’s 1st sponsored GBoS podcast and we wish to welcome them again to the show. Since Excelero’s NVMesh storage software is in customer hands now, Josh is transitioning to add customer support to his other duties.

NVMe storage industry trends

We started our discussion with the maturing NVMe market. Howard mentioned he heard that NVMe SSD sales have overtaken SATA SSD volumes. Josh mentioned that NVMe SSDs are getting harder to come by, driven primarily by Super 8 (the 8 biggest hyperscalers) purchases. And even when these SSDs can be found, customers are paying a premium for NVMe drives.

The industry is also starting to sell larger capacity NVMe SSDs. Customers view this as a way of buying cheaper ($/GB) storage. However, most NVMe shared storage systems use mirroring for data protection, which cuts effective (protected) capacity in half, doubling cost/GB.

Another change in the market is that with today’s apps many customers no longer need all the read AND write IO performance from their NVMe storage. For newer applications/workloads, writes are less frequent and as such, less of a driver of application performance. But read performance is still critical.

The other industry trend is a number of new vendors offering NVMeoF (Ethernet) storage arrays (see: Pavilion Data’s, Attala Systems’, and Solarflare Communications’ podcasts in just the last few months). Most of the startup systems are essentially top of rack shared NVMe SSDs, some with limited data protection/management services.

Excelero’s NVMesh has offered a logical volume manager as well as protected NVMe shared storage since the start, with RAID 0 and protected, RAID 1/10 storage.

Excelero is coming out with a new release of its NVMesh™ software defined storage.

NVMesh 2

We were particularly interested in one of NVMesh 2’s new capabilities, its distributed data protection, which is based on Erasure Coding (EC, like RAID 6), with a stripe that includes 8+2 segments. Unlike mirroring/RAID 1/10, EC only reduces effective NVMe storage capacity by 20% for protection. It also protects against 2 drive failures within a RAID group.
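The capacity tradeoff between mirroring and an 8+2 stripe is easy to check with a little arithmetic. The following is a minimal sketch with hypothetical helper names; it models only raw capacity overhead and ignores spare capacity, metadata and any vendor-specific layout details.

```python
def usable_fraction(data_segments: int, protection_segments: int) -> float:
    """Fraction of raw capacity left for data in one protection stripe."""
    return data_segments / (data_segments + protection_segments)


def raw_cost_multiplier(data_segments: int, protection_segments: int) -> float:
    """How much raw capacity must be bought per unit of protected capacity."""
    return (data_segments + protection_segments) / data_segments


if __name__ == "__main__":
    # Mirroring behaves like a 1+1 stripe: half the capacity, double the cost/GB.
    print(usable_fraction(1, 1), raw_cost_multiplier(1, 1))
    # An 8+2 erasure-coded stripe gives up 20% of raw capacity to protection.
    print(usable_fraction(8, 2), raw_cost_multiplier(8, 2))
```

Under these assumptions, mirroring doubles the raw capacity needed per protected GB, while 8+2 EC needs only 1.25x.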

However, with distributed data protection, write IO will not perform as well as reads. But reads perform just as fast as ever.

As with any data protection, customers will need sufficient spare capacity to rebuild data for a failed device.

The latest release will be available to all current customers, on service contract. When available, customers should immediately start benefiting from the space efficient, distributed data protection for new data on the system.

The new release also adds Fibre Channel (as Howard correctly guessed  on the podcast) and TCP/IP protocols to their current InfiniBand, RoCE, and NVMeoF support as well as new performance analytics to help diagnose performance issues faster and at scale.

The podcast runs ~25 minutes. Josh has an interesting perspective on the NVMe storage market as well as competitive solutions and was great to talk with again. The new data protection functionality in Excelero NVMesh 2 signals an evolving NVMe storage market. As NVMe storage matures, the tradeoff between performance and data services looks to be an active war zone for some time to come. Listen to the podcast to learn more.

Josh Goldenhar, Vice President Customer Success, Excelero

Josh has been responsible for product strategy and vision at leading storage companies for over two decades. His experience puts him in a unique position to understand the needs of customers.
Prior to joining Excelero, Josh was responsible for product strategy and management at EMC (XtremIO) and DataDirect Networks. Previous to that, his experience and passion was in large scale, systems architecture and administration with companies such as Cisco Systems. He’s been a technology leader in Linux, Unix and other OS’s for over 20 years. Josh holds a Bachelor’s degree in Psychology/Cognitive Science from the University of California, San Diego.

73: GreyBeards talk HCI with Gabriel Chapman, Sr. Mgr. Cloud Infrastructure NetApp

Sponsored by: NetApp

In this episode we talk HCI with Gabriel Chapman (@Bacon_Is_King), Senior Manager, Cloud Infrastructure, NetApp. Gabriel presented at the NetApp Insight 2018 TechFieldDay Extra (TFDx) event (video available here). Gabriel also presented last year at the VMworld 2017 TFDx event (video available here). If you get a chance we encourage you to watch the videos, as Gabriel did a great job providing some design intent and descriptions of NetApp HCI capabilities. Our podcast was recorded after the TFDx event.

NetApp HCI consists of NetApp SolidFire storage re-configured as a small enterprise class AFA storage node occupying one blade of a four blade system, where the other three blades are dedicated compute servers. NetApp HCI runs VMware vSphere but uses enterprise class iSCSI storage supplied by the NetApp SolidFire AFA.

On our podcast, we talked a bit about SolidFire storage. It’s not well known but the 1st few releases of SolidFire (before the NetApp acquisition) didn’t have a GUI and were entirely dependent on the API/CLI for operations. That heritage continues today, as the NetApp HCI management console is basically a front end GUI for NetApp HCI API calls.

Another advantage of SolidFire storage was its extensive QoS support, which included state of the art service credits as well as service limits. All that QoS sophistication is also available in NetApp HCI, so that customers can more effectively limit noisy neighbor interference on HCI storage.
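To give a feel for how credits and limits can work together, here is a generic credit-based rate limiter. This is an assumption-laden illustration, not SolidFire’s or NetApp HCI’s actual QoS algorithm: the class name, the per-second accounting and the parameters are all hypothetical.

```python
class CreditLimiter:
    """Generic per-volume limiter: unused IOPS are banked as burst credits."""

    def __init__(self, limit_iops: int, max_credits: int):
        self.limit_iops = limit_iops      # sustained service limit
        self.max_credits = max_credits    # cap on banked credits
        self.credits = 0

    def admit(self, requested_iops: int) -> int:
        """Return how many IOs are admitted in a one-second interval."""
        if requested_iops <= self.limit_iops:
            # Running under the limit: bank the unused headroom as credits.
            unused = self.limit_iops - requested_iops
            self.credits = min(self.max_credits, self.credits + unused)
            return requested_iops
        # Running over the limit: spend credits to burst, then clamp.
        burst = min(requested_iops - self.limit_iops, self.credits)
        self.credits -= burst
        return self.limit_iops + burst


if __name__ == "__main__":
    volume = CreditLimiter(limit_iops=1000, max_credits=500)
    for demand in (600, 1500, 1500):
        print(demand, "->", volume.admit(demand))
```

In this sketch a quiet volume earns the right to burst briefly, while a persistently noisy neighbor is held to its limit once its credits run out.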

Although NetApp HCI runs VMware vSphere as its preferred hypervisor, it’s also possible to run other hypervisors in bare metal clusters with NetApp HCI storage and compute servers. In contrast to other HCI solutions, with NetApp HCI, customers can run different hypervisors, all at the same time, sharing access to NetApp HCI storage.

On our podcast and the Insight TFDx talk, Gabriel mentioned some future deliveries and roadmap items such as:

  • Extending NetApp HCI hardware with a new low-end, 2U configuration designed specifically for RoBo and SMB customers;
  • Adding NetApp Cloud Volume support so that customers can extend their data fabric out to NetApp HCI; and
  • Adding (NFS) file services support so that customers using NFS data stores /VVols could take advantage of NetApp HCI storage.

Another thing we discussed was the new HCI development cadence. In the past they typically delivered new functionality about once a year. With the new development cycle they’re able to deliver functionality much faster, but they have settled on a 2 releases/year cycle, which seems about as quickly as their customer base can adopt new functionality.

The podcast runs ~22 minutes. We apologize for any quality issues with the audio. It was recorded at the show and we were novices with the onsite recording technology. We promise to do better in the future. Gabriel has almost become a TFDx regular these days and provides a lot of insight on both NetApp HCI and SolidFire storage.  Listen to our podcast to learn more.

Gabriel Chapman, Senior Manager, Cloud Infrastructure, NetApp

Gabriel is the Senior Manager for NetApp HCI Go to Market. Today he is mainly engaged with NetApp’s top tier customers and partners with a primary focus on Hyper Converged Infrastructure for the Next Generation Data Center.

As a 7 time vExpert that transitioned into the vendor side after spending 15 years working in the end user Information Technology arena, Gabriel specializes in storage and virtualization technologies. Today his primary area of expertise revolves around storage, data center virtualization, hyper-converged infrastructure, rack scale/hyper scale computing, cloud, DevOps, and enterprise infrastructure design.

Gabriel is a Prime Mover, Technologist, Unapologetic Randian, Social Media Junky, Writer, Bacon Lover, and Deep Thinker, whose goal is to speak truth on technology and make complex ideas sound simple. In his free time, Gabriel is the host of the In Tech We Trust podcast and enjoys blogging as well as public speaking.

Prior to joining SolidFire, Gabriel was a storage technologies specialist covering the United States with Cisco, focused on the Global Service Provider customer base. Before Cisco, he was part of the go-to-market team at SimpliVity, where he concentrated on crafting the customer facing messaging, pre-sales engagement, and evangelism efforts for the early adopters of Hyper Converged Infrastructure.

72: GreyBeards talk Computational Storage with Scott Shadley, VP Marketing NGD Systems

For this episode the GreyBeards talked with another old friend, Scott Shadley, VP Marketing, NGD Systems. As we discussed on our FMS18 wrap up show with Jim Handy, computational storage had sort of a coming out party at the show.

NGD Systems started in 2013 and has been working towards a solution that reaches general availability at the end of this year. Their computational storage SSD supplies general purpose processing power sitting inside an SSD. NGD shipped their first prototypes in 2016, shipped an FPGA version of their smart SSD in 2017 and already have their field upgradable ASIC prototypes in customer hands.

NGD’s smart SSDs have a 4-core ARM processor and run an Ubuntu distro on 3 of the cores. Essentially, anything that could be run on Ubuntu Linux, including Docker containers and Kubernetes, could be run on their smart SSDs.

NGD sells standard (storage only) SSDs as well as their smart SSDs. The smart hardware is shipped with all of their SSDs, but is only enabled after customers purchase a software license key. They currently offer their smart SSD solutions in America and Europe, with APAC coming later.

They offer smart SSDs in both a 2.5” and M.2 form factor. NGD Systems is following the flash technology road map and currently offers a 16TB SSD in 2.5” FF.

How applications work on smart SSDs

They offer an open-source SDK which creates a TCP/IP tunnel across the NVMe bus that attaches their smart SSD. This allows the host and the SSD server to communicate and send (RPC) work back and forth between them.

A normal smart SSD work flow could be:

  1. Host server writes data onto the smart SSD;
  2. Host signals the smart SSD to perform work on the data on the smart SSD;
  3. Smart SSD processes the data that has been sent to the SSD; and
  4. When smart SSD work is done, it sends a response back to the host.

I assume somewhere before #2 above, you load application software onto the device.

The work to be done could be the same on every attached smart SSD, and it could easily be distributed across all attached smart SSDs and the host processor. For example, for image processing, a host processor would write images to be processed across all the SSDs and have each perform image recognition, append tags (or other results info) as metadata to the image and then respond back to the host. Or for media transcoding, video streams could be written to a smart SSD and have it perform transcoding completely outboard.
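The work flow and the image tagging example above can be sketched as a small in-process simulation. Everything here is hypothetical: the `SmartSSD` class stands in for a real drive, `tag_image` is a placeholder for actual image recognition, and none of it uses NGD’s SDK or its TCP/IP tunnel.

```python
from concurrent.futures import ThreadPoolExecutor


class SmartSSD:
    """Hypothetical stand-in for a computational storage drive."""

    def __init__(self, name):
        self.name = name
        self.objects = {}   # data the host has written to this drive
        self.app = None     # application loaded onto the drive

    def load_app(self, func):
        # Load application software onto the drive before any work is signaled.
        self.app = func

    def write(self, key, data):
        # Step 1: host writes data onto the smart SSD.
        self.objects[key] = data

    def run(self):
        # Steps 2-4: host signals the drive, the drive processes its own
        # local data, and the results go back to the host as the response.
        return {key: self.app(data) for key, data in self.objects.items()}


def tag_image(data: bytes) -> str:
    # Placeholder for image recognition: tag by payload size only.
    return "large" if len(data) > 4 else "small"


def process_images(images, drives):
    """Spread images round-robin across drives and gather the tags."""
    for drive in drives:
        drive.load_app(tag_image)
    for i, (key, data) in enumerate(sorted(images.items())):
        drives[i % len(drives)].write(key, data)
    # Each drive works on its own data in parallel with the others.
    with ThreadPoolExecutor(max_workers=len(drives)) as pool:
        results = list(pool.map(SmartSSD.run, drives))
    merged = {}
    for result in results:
        merged.update(result)
    return merged


if __name__ == "__main__":
    images = {"a.jpg": b"123", "b.jpg": b"123456", "c.jpg": b"1"}
    print(process_images(images, [SmartSSD("ssd0"), SmartSSD("ssd1")]))
```

The point of the sketch is that only the small tag results travel back to the host; the image data stays on the drive that processed it.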

The smart SSD processors access the data just like the host processor or could use services available in their SDK which would access the data much faster. Just about any data processing you could do on the host processor could be done outboard, on smart SSD processor elements. Scott mentioned that memory intensive applications are probably not a good fit for computational storage.

He also said that their processing (ARM) elements were specifically designed for low power operations. So although AI training and inference processing might be much faster on GPUs, GPU power consumption is much higher. As a result, AI training and inference power-performance would be better on smart SSDs.

Markets for smart SSDs?

One target market for NGD’s computational storage SSDs is hyperscalers. At FMS18, Microsoft Research published a report on running FAISS software on NGD Smart SSDs that led to a significant speedup. Scott also brought up one company they’re working with that was testing to find out just how many 4K video streams can be processed on a gaggle of smart SSDs. There was also talk of three letter (gov’t) organizations interested in smart SSDs to encrypt data and perform other outboard processing of (intelligence) data.

Highly distributed applications and data remind me of a lot of HPC customers I know. But bandwidth is also a major concern for HPC. NVMe is fast, but there’s a limit to how many SSDs can be attached to a server.

However, with NVMeoF, NGD Systems could support a lot more “attached”  smart SSDs. Imagine a scoop of smart SSDs, all attached to a slurp of servers,  performing data intensive applications on their processing elements in a widely distributed fashion. Sounds like HPC to me.

The podcast runs ~39 minutes. Scott’s great to talk with and is very knowledgeable about the Flash/SSD industry and NGD Systems. His talk on their computational storage was mind expanding. Listen to the podcast to learn more.

Scott Shadley, VP Marketing, NGD Systems

Scott Shadley, Storage Technologist and VP of Marketing at NGD Systems, has more than 20 years of experience with Storage and Semiconductor technology. Working at STEC he was part of the team that enabled and created the world’s first Enterprise SSDs.

He spent 17 years at Micron, most recently leading the SATA SSD product line with record-breaking revenue and growth for the company. He is active on social media, a lover of all things High Tech, enjoys educating and sharing and a self-proclaimed geek around mobile technologies.

71: GreyBeards talk DP appliances with Sharad Rastogi, Sr. VP & Ranga Rajagopalan, Sr. Dir., Dell EMC DPD

Sponsored by:

In this episode we talk data protection appliances with Sharad Rastogi (@sharadrastogi), Senior VP Product Management, and Ranga Rajagopalan, Senior Director, Data Protection Appliances Product Management, Dell EMC Data Protection Division (DPD). Howard attended Ranga’s TFDx session (see TFDx videos here) on their new Integrated Data Protection Appliance (IDPA), the DP4400, at VMworld last month in Las Vegas.

This is the first time we have had anyone from Dell EMC DPD on our show. Ranga and Sharad were both knowledgeable about the data protection industry, industry trends and talked at length about the new IDPA DP4400.

Dell EMC IDPA DP4400

The IDPA DP4400 is the latest member of the Dell EMC IDPA product family.  All IDPA products package secondary storage, backup software and other solutions/services to make for a quick and easy deployment of a complete backup solution in your data center.  IDPA solutions include protection storage and software, search and analytics, system management — plus cloud readiness with cloud disaster recovery and long-term retention — in one 2U appliance. So there’s no need to buy any other secondary storage or backup software to provide data protection for your data center.

The IDPA DP4400 grows in place from 24 to 96TB of usable capacity and at an average 55:1 dedupe ratio, it could support over 5PB of backup storage on the appliance. The full capacity always ships with the appliance. Customers can select how much or how little they get to use by just purchasing a software license key.

In addition to the on appliance capacity, the IDPA DP4400 can use up to 192TB of cloud storage for a native Cloud tier. Cloud tiering takes place after a specified, appliance residency interval, after which backup data is moved from the appliance to the cloud. IDPA Cloud Tier works with AWS, Azure, IBM Cloud Object Storage, Ceph and Dell EMC Elastic Cloud Storage. With the 192TB of cloud and 96TB of on appliance usable storage, together with a 55:1 dedupe ratio, a single IDPA DP4400 can support over 14PB of logical backup data.
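The logical capacity figures follow from multiplying usable capacity by the dedupe ratio. The sketch below assumes decimal units (1PB = 1,000TB) and a flat 55:1 ratio across both tiers; the function name is hypothetical and the source does not say exactly how the quoted figures were derived.

```python
def logical_capacity_pb(usable_tb: float, dedupe_ratio: float = 55,
                        tb_per_pb: int = 1000) -> float:
    """Logical (pre-dedupe) backup data that fits in the given usable capacity."""
    return usable_tb * dedupe_ratio / tb_per_pb


if __name__ == "__main__":
    # On-appliance only: 96 TB usable at 55:1.
    print(logical_capacity_pb(96))
    # Appliance plus cloud tier: 96 TB + 192 TB at 55:1.
    print(logical_capacity_pb(96 + 192))
```

Under these assumptions the on-appliance figure comes to about 5.3PB and the combined figure to about 15.8PB, both consistent with the “over 5PB” and “over 14PB” statements above.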

Furthermore, IDPA supports Cloud DR. With Cloud DR, backed up VMs are copied to the public cloud (AWS) on a scheduled basis. In case of a disaster, there is an orchestrated failover with the VMs spun up in the cloud. The cloud workloads can then easily be failed back on site once the disaster is resolved.

The IDPA DP4400 also comes with native DD Boost™ support. This means Oracle, SQL Server and other applications that already support DD Boost can also use the appliance to back up and restore their application data. DD Boost customers can make use of native application services such as Oracle RAC to manage their database backups/restores with the appliance.

Dell EMC also offers their Future-Proof Loyalty Program guarantees for the IDPA DP4400, including a Data Protection Deduplication guarantee under which, if best practices are followed, Dell EMC will guarantee the appliance dedupe ratio for backup data. Additional guarantees from the Dell EMC Future-Proof Program for IDPA DP4400 include a 3-Year Satisfaction guarantee, a Clear Price guarantee which guarantees predictable pricing for future maintenance and service, as well as a Cloud Enabled guarantee. These are just a few of the Dell EMC guarantees provided for the IDPA DP4400.

The podcast runs ~16 minutes. Ranga and Sharad were both very knowledgeable about the DP industry, DP trends and the new IDPA DP4400. Listen to the podcast to learn more.

Sharad Rastogi, Senior V.P. Product Management, Dell EMC Data Protection Division

Sharad Rastogi is a global technology executive with a strong track record of transforming businesses and increasing shareholder value across a broad range of leadership roles, in both high growth and turnaround situations.

As SVP of Product Management at Dell EMC, Sharad is responsible for all products for the $3B Data Protection business.  He oversees a diverse portfolio, and is currently developing next generation integrated appliances, software and cloud based data protection solutions. In the past, Sharad has held senior roles in general management, products, marketing, corporate development and strategy at leading companies including Cisco, JDSU, Avid and Bain.

Sharad holds an MBA from the Wharton School at the University of Pennsylvania, an MS in engineering from Boston University and a B.Tech in engineering from the Indian Institute of Technology in New Delhi.

He is an advisor to Boston University, College of Engineering, and a Board member at Edventure More – a non-profit providing holistic education. Sharad is a world traveler, always seeking new adventures and experiences.

Rangaraaj (Ranga) Rajagopalan, Senior Director Data Protection Appliances Product Management, Dell EMC Data Protection Division

Ranga Rajagopalan is Senior Director of Product Management for Data Protection Appliances at Dell EMC. Ranga is responsible for driving the product strategy and vision for Data Protection Appliances, setting and delivering the multi-year roadmap for Data Domain and Integrated Data Protection Appliance.

Ranga has 15 years of experience in data protection, business continuity and disaster recovery, in both India and USA. Prior to Dell EMC, Ranga managed the Veritas Cluster Server and Veritas Resiliency Platform products for Veritas Technologies.