75: GreyBeards talk persistent memory IO with Andy Grimes, Principal Technologist, NetApp

Sponsored By:  NetApp
In this episode we talk new persistent memory IO technology  with Andy Grimes, Principal Technologist, NetApp. Andy presented at the NetApp Insight 2018 TechFieldDay Extra (TFDx) event (video available here). If you get a chance we encourage you to watch the videos as Andy, did a great job describing their new MAX Data persistent memory IO solution.

The technology for MAX Data came from NetApp’s Plexistor acquisition. Prior to the acquisition, Plexistor had also presented at a SFD9 and TFD11.

Unlike NVMeoF storage systems, MAX Data is not sharing NVMe SSDs across servers. What MAX Data does is supply an application-neutral way to use persistent memory as a new, ultra fast, storage tier together with a backing store.

MAX Data performs a write or an “active” (Persistent Memory Tier) read in single digit µseconds for a single core/single thread server. Their software runs in user space and as such, for multi-core servers, it can take up to 40  µseconds.  Access times for backend storage reads is the same as NetApp AFF but once read, data is automatically promoted to persistent memory, and while there, reads ultra fast.

One of the secrets of MAX Data is that they have completely replaced the Linux Posix File IO stack with their own software. Their software is streamlined and bypasses a lot of the overhead present in today’s Linux File Stack. For example, MAX Data doesn’t support metadata-journaling.

MAX Data works with many different types of (persistent) memory, including DRAM (non-persistent memory), NVDIMMs (DRAM+NAND persistent memory) and Optane DIMMs (Intel 3D Xpoint memory, slated to be GA end of this year). We suspect it would work with anyone else’s persistent memory as soon as they come on the market.

Even though the (Optane and NVDIMM) memory is persistent, server issues can still lead to access loss. In order to provide data availability for server outages, MAX Data also supports MAX Snap and MAX Recovery. 

With MAX Snap, MAX Data will upload all persistent memory data to ONTAP backing storage and ONTAP snapshot it. This way you have a complete version of MAX Data storage that can then be backed up or SnapMirrored to other ONTAP storage.

With MAX Recovery, MAX Data will synchronously replicate persistent memory writes to a secondary MAX Data system. This way, if the primary MAX Data system goes down, you still have an RPO-0 copy of the data on another MAX Data system that can be used to restore the original data, if needed. Synchronous mirroring will add 3-4  µseconds to the access time for writes, quoted above.

Given the extreme performance of MAX Data, it’s opening up whole new set of customers to talking with NetApp. Specifically, high frequency traders (HFT) and high performance computing (HPC). HFT companies are attempting to reduce their stock transactions access time to as fast as humanly possible. HPC vendors have lots of data and processing all of it in a timely manner is almost impossible. Anything that can be done to improve throughput/access times should be very appealing to them.

To configure MAX Data, one uses a 1:25 ratio of persistent memory capacity to backing store. MAX Data also supports multiple LUNs.

MAX Data only operates on Linux OS and supports (IBM) RedHat and CentOS, But Andy said it’s not that difficult to add support for other versions of Linux Distros and customers will dictate which other ones are supported, over time.

As discussed above, MAX Data works with NetApp ONTAP storage, but it also works with SSD/NVMe SSDs as backend storage. In addition, MAX Data has been tested with NetApp HCI (with SolidFire storage, see our prior podcasts on NetApp HCI with Gabriel Chapman and Adam Carter) as well as E-Series storage. The Plexistor application has been already available on AWS Marketplace for use with EC2 DRAM and EBS backing store. It’s not much of a stretch to replace this with MAX Data.

MAX Data is expected to be GA released before the end of the year.

A key ability of the MAX Data solution is that it requires no application changes to use persistent memory for ultra-fast IO. This should help accelerate persistent memory adoption in data centers when the hardware becomes more available. Speaking to that, at Insight2018, Lenovo, Cisco and Intel were all on stage when NetApp announced MAX Data.

The podcast runs ~25 minutes. Andy’s an old storage hand (although no grey beard) and talks the talk, walks the walk of storage religion. Andy is new to TFD but we doubt it will be the last time we see him there. Andy was very conversant on the MAX Data technology and the market that it apparently is opening up for this new technology.  Listen to our podcast to learn more.

Andy Grimes, Principal Technologiest, NetApp

Andy has been in the IT industry for 17 years, working in roles spanning development, technology architecture, strategic outsourcing and Healthcare..

For the past 4 years Andy has worked with NetApp on taking the NetApp Flash business from #5 to #1 in the industry (according to IDC). During this period NetApp also became the fastest growing Flash and SAN vendor in the market and regained leadership in the Gartner quadrant.

Andy also works with NetApp’s product vision, competitive analysis and future technology direction and working with the team bringing the MAX Data PMEM product to market.

Andy has a BS degree in psychology, a BPA in management information systems, and an MBA. He current works as a Principal Technologist for the NetApp Cloud Infrastructure Business Unit with a focus on PMEM, HCI and Cloud Strategy. Andy lives in Apex, NC with his beautiful wife and has 2 children, a 4 year old and a 22 year old (yes don’t let this happen to you). For fun Andy likes to Mountain Bike, Rock Climb, Hike and Scuba Dive.

73: GreyBeards talk HCI with Gabriel Chapman, Sr. Mgr. Cloud Infrastructure NetApp

Sponsored by: NetApp

In this episode we talk HCI  with Gabriel Chapman (@Bacon_Is_King), Senior Manager, Cloud Infrastructure, NetApp. Gabriel presented at the NetApp Insight 2018 TechFieldDay Extra (TFDx) event (video available here). Gabriel also presented last year at the VMworld 2017 TFDx event (video available here). If you get a chance we encourage you to watch the videos as Gabriel, did a great job providing some design intent and descriptions of NetApp HCI capabilities. Our podcast was recorded after the TFDx event.

NetApp HCI consists of NetApp Solidfire storage re-configured, as a small enterprise class AFA storage node occupying one blade of a four blade system, where the other three blades are dedicated compute servers. NetApp HCI runs VMware vSphere but uses enterprise class iSCSI storage supplied by the NetApp SolidFire AFA.

On our podcast, we talked a bit about SolidFire storage. It’s not well known but the 1st few releases of SolidFire (before NetApp acquisition) didn’t have a GUI and was entirely dependent on its API/CLI for operations. That heritage continues today as NetApp HCI management console is basically a front end GUI for NetApp HCI API calls.

Another advantage of SolidFire storage was it’s extensive QoS support which included state of the art service credits as well as service limits.  All that QoS sophistication is also available in NetApp HCI, so that customers can more effectively limit noisy neighbor interference on HCI storage.

Although NetApp HCI runs VMware vSphere as its preferred hypervisor, it’s also possible to run other hypervisors in bare metal clusters with NetApp HCI storage and compute servers. In contrast to other HCI solutions, with NetApp HCI, customers can run different hypervisors, all at the same time, sharing access to NetApp HCI storage.

On our podcast and the Insight TFDx talk, Gabriel mentioned some future deliveries and roadmap items such as:

  • Extending NetApp HCI hardware with a new low-end, 2U configuration designed specifically for RoBo and SMB customers;.
  • Adding NetApp Cloud Volume support so that customers can extend their data fabric out to NetApp HCI; and
  • Adding (NFS) file services support so that customers using NFS data stores /VVols could take advantage of NetApp HCI storage.

Another thing we discussed was the new development HCI cadence. In the past they typically delivered new functionality about 1/year. But with the new development cycle,  they’re able to deliver functionality much faster but have settled onto a 2 releases/year cycle, which seems about as quickly as their customer base can adopt new functionality.

The podcast runs ~22 minutes. We apologize for any quality issues with the audio. It was recorded at the show and we were novices with the onsite recording technology. We promise to do better in the future. Gabriel has almost become a TFDx regular these days and provides a lot of insight on both NetApp HCI and SolidFire storage.  Listen to our podcast to learn more.

Gabriel Chapman, Senior Manager, Cloud Infrastructure, NetApp

Gabriel is the Senior Manager for NetApp HCI Go to Market. Today he is mainly engaged with NetApp’s top tier customers and partners with a primary focus on Hyper Converged Infrastructure for the Next Generation Data Center.

As a 7 time vExpert that transitioned into the vendor side after spending 15 years working in the end user Information Technology arena, Gabriel specializes in storage and virtualization technologies. Today his primary area of expertise revolves around storage, data center virtualization, hyper-converged infrastructure, rack scale/hyper scale computing, cloud, DevOps, and enterprise infrastructure design.

Gabriel is a Prime Mover, Technologist, Unapologetic Randian, Social Media Junky, Writer, Bacon Lover, and Deep Thinker, whose goal is to speak truth on technology and make complex ideas sound simple. In his free time, Gabriel is the host of the In Tech We Trust podcast and enjoys blogging as well as public speaking.

Prior to joining SolidFire, Gabriel was a storage technologies specialist covering the United States with Cisco, focused on the Global Service Provider customer base. Before Cisco, he was part of the go-to-market team at SimpliVity, where he concentrated on crafting the customer facing messaging, pre-sales engagement, and evangelism efforts for the early adopters of Hyper Converged Infrastructure.

67: GreyBeards talk infrastructure monitoring with James Holden, Sr. Prod. Mgr. NetApp

Sponsored by: Howard and I first talked with James Holden, NetApp Senior Product Manager for OnCommand Insight and Cloud Insights,  last month, at Storage Field Day 16 (SFD16) in Waltham, MA. At the time, we thought it would be great to also have him on the show.

James has been with the NetApp OnCommand Insight (OCI) team for quite awhile now and is very knowledgeable about the product and its technology. NetApp Cloud Insights is a new SaaS offering that provides some of the same services as OCI without the footprint, focused on newer, non-traditional applications and available on a pay as you go model.

NetApp OnCommand Insight (OCI)

NetApp OCI is sort of a stripped down, souped up enterprise SRM tool, without storage and servers configuration-provisioning (see James’s introduction video from SFD15 for more info). It supports NetApp and just about anyone’s storage including Dell EMC, IBM, Hitachi Vantara (HDS), HPE, Infinidat, and Pure Storage as well as most major OSs such as VMware vSphere, Microsoft HyperV, RHEL, etc. Other storage can easily be  added to OCI through a patch/minor update and is typically done by customer request.

NetApp OCI currently runs in some of the biggest enterprises  in the world today, including top F500 companies and one of the world’s largest banks. OCI is agentless but does use a data collector server/VM onprem or in cloud that takes advantage of storage and system APIs to gather data.

OCI provides extensive end-to-end infrastructure monitoring and trouble shooting (see James’s SFD16 OCI monitoring & troubleshooting session). OCI monitors application workloads from VMs to the storage supporting them.

OCI also supplies extensive charge back capabilities (see his SFD16 OCI cost control/chargebacks session). In times like these when IT competes with public cloud offerings every day, charge backs can be very illuminating.

Also, OCI has extensive integration with ServiceNOW and similar offerings (see SFD16 OCI ecosystem session). With this level of integration, OCI can provide seamless tracking of service requests from initiation to completion through verification.

In addition, OCI can monitor public cloud infrastructure as well as onprem. For example, with Amazon Web Services (AWS), customers can use OCI to monitor EC2 instances EBS IO activity. OCI reports on AWS IOPS rates by EC2-EBS connection. Customers paying for EBS IOPS, can use OCI to monitor and tailor their EBS costs. OCI also supports Microsoft Azure environments.

NetApp Cloud Insights

NetApp Cloud Insights, a new SaaS offering, that is currently in Public Preview status but is expected to release in October, 2018 (checkout his SFD16 Cloud Insights session video).

Customers can currently register to use the preview version at Cloud.netapp.com/Cloud Insights. There’s a registration wall but that’s all it takes to get started. .

The minimum Cloud Insights instance is a single server and 5TB of storage. Unlike OCI, Cloud Insights is tailored to support smaller shops without significant infrastructure. However, Cloud Insight also offers standard onprem enterprise infrastructure monitoring as well.

Cloud Insights is also focused on modern, cloud-native applications whether they operate on prem or in the cloud. The problem with cloud native, container apps is that they come and go in seconds, and there’s thousands of them. Cloud Insights was designed specifically for container and other cloud native applications and as such, should provide a more accurate monitoring of operations for these systems.

We talked about Cloud Insight’s development cadence. James said that because it’s a SaaS offering new Cloud Insights functionality can be released daily, if not more frequently. Contrast that with OCI, where they schedule 3-4 releases a year.

Cloud Insight currently supports the Kubernetes container ecosystems today but more are on the way. Again, customers determine which Container or other cloud native ecosystems will be supported next.

The podcast runs ~22 minutes. James was very knowledgeable about OCI, Cloud Insights and infrastructure monitoring in general and he was easy to talk with. Howard and I had a great time at SFD16 and enjoyed our time talking with him again on the podcast.  Listen to the podcast to learn more.

James Holden, Senior Product Manager NetApp OCI and Cloud Insights 

 

James Holden is a Senior Manager of Product Management at NetApp, and for the last 5 years  has been building the infrastructure monitoring and reporting tool OnCommand Insight.

Today he is working across NetApp’s Cloud Analytics portfolio, including Cloud Insights, a new SaaS offering currently in preview.

Prior to NetApp, James worked for 14 years at CSC in both the US and the UK on their storage, compute and automation solutions.