63: GreyBeards talk with NetApp A-Team members John Woodall & Paul Stringfellow

Sponsored by NetApp:

In this episode, we talk with NetApp A-Team members John Woodall (@John_Woodall), VP of Engineering at Integrated Archive Systems, and Paul Stringfellow (@techstringy), Technical Director at Data Management Consultancy Gardner Systems Plc.

Both John and Paul have been NetApp partners for quite a while (John since the beginning of NetApp). Both work directly with infrastructure customers, solving real-world data problems.

The NetApp A-Team is a small, select group (only 25 members total) that is brought together periodically and briefed by NetApp execs and product managers. A-Team membership is for life (as long as members continue to work in IT and not for a competitor). The briefings span a number of topics but are typically about what NetApp plans to do in the near term. The A-Team is there to provide a customer perspective to NetApp management and product teams.

Oftentimes, big companies lose sight of customer problems, and a separate channel that's engaged directly with customers can bring these issues to light. Through the A-Team, NetApp gets feedback on customer problems and concerns from partners who engage with those customers every day.

Both Howard and I were very impressed that when John and Paul introduced themselves, they talked about DATA rather than storage. This signals a shift in perspective, from pure infrastructure to a more customer-centric view.

Following that theme, Howard asked how customers were seeing the NetApp Data Fabric. This led to a long discussion of just what NetApp Data Fabric represents to customers in today's multi-cloud world. NetApp's Data Fabric gives customers a choice of where to run their work, liberating workloads that previously may have been stuck in the cloud or on prem.

Ray asked how NetApp is embracing the cloud, what with Cloud Data Volumes (see our earlier NetApp sponsored podcast), NPS, Cloud ONTAP and the other cloud solutions NetApp has lit up in various public clouds. John mentioned that the public preview of Cloud Data Volumes should open up by the end of the year, at which point anyone can use it.

I was at a dinner with NetApp, 3-5 years ago, when the cloud looked like a steamroller that was going to grind infrastructure providers into dust. I was talking with a NetApp executive, who said they were doing everything they could at the time to figure out how to offer value alongside cloud providers rather than competing with them. Either you embrace change or you're buried by it.

At the end of the podcast, Howard turned the discussion to NetApp HCI. Paul said that, at first, HCI was just shrunk infrastructure, but now it's more about the software stack on top of HCI that matters. The stack enables simpler deployment and configuration flexibility. From a NetApp HCI perspective, the flexibility to separately add more compute or storage is a strong differentiator.

The podcast runs ~30 minutes. Both John and Paul were very knowledgeable about current IT trends. I think we could have easily talked with them for another hour or so and not exhausted the conversation. Listen to the podcast to learn more.

Paul Stringfellow, Technical Director, Data Management Consultancy Gardner Systems, Plc

An experienced technology professional, Paul Stringfellow is the Technical Director at Data Management Consultancy Gardner Systems Plc. He works with businesses of all types to assist with the development of technology strategies, and, increasingly, to help them manage, secure, and gain benefit from their data assets.

Paul is a NetApp A-Team member and is very involved in the tech community. He often presents at conferences and user group events. He also produces a wide range of business-focused technology content on his blog, techstringy.com, and the Tech Interviews Podcast (podcast.techstringy.com), and he writes regularly for a number of industry technology sites. You can find Paul on twitter at @techstringy.

John Woodall, VP Engineering, Integrated Archive Systems 

John Woodall is Vice President of Engineering at Integrated Archive Systems, Inc. (IAS). John has more than 28 years of experience in technology, with a background focused on Enterprise and Infrastructure Architecture, Systems Engineering and Technology Management. In these roles, John has developed a long string of successes designing and implementing complex systems in demanding, mission-critical, large-scale enterprise environments.

John is a NetApp A-Team member and has managed the complete range of IT disciplines. John brings that experience and perspective to his role at IAS. At IAS, his focus is on mapping the company's strategic direction, evaluating emerging technologies, trends and practices, and managing the technology portfolio for IAS, with the express goal of producing excellent customer experiences and business outcomes. Prior to joining IAS, John held architecture and management roles at Symantec, Solectron (now part of Flextronics), Madge Networks and Elsevier MDL. You can find John at @John_Woodall on twitter and on Skype: TechWood.

62: GreyBeards talk NVMeoF storage with VR Satish, Founder & CTO Pavilion Data Systems

In this episode, we continue our NVMeoF track by talking with VR Satish (@satish_vr), Founder and CTO of Pavilion Data Systems (@PavilionData). Howard has talked with Pavilion Data over the last year or so, and I just had a briefing with them this past week.

Pavilion Data is taking a different tack to NVMeoF, innovating in software and hardware design but using merchant silicon for their NVMeoF-accelerated array solution. They offer Ethernet-based NVMeoF block storage.

VR is a storage “lifer”, having worked at Veritas for a long time on their Volume Manager and other products. Pavilion Data has a number of execs from Pure Storage (including their CEO, Gurpreet Singh) and other storage technology companies, and is located in San Jose, CA.

VR says there were 5 overriding principles for Pavilion Data as they were considering a new storage architecture:

  1. The IT industry is moving to rack scale compute and hence, there is a need for rack scale storage.
  2. Great merchant silicon was coming online, so there was less of a need to design their own silicon/ASICs/FPGAs.
  3. Rack scale storage needs to provide “local” (within the rack) resiliency/high availability and let modern applications manage “global” (outside the rack) resiliency/HA.
  4. Rack scale storage needs to support advanced data management services.
  5. Rack scale storage has to be easy to deploy and run.

Pavilion Data's key insight was that, in order to meet all those principles and deal with high-performance NVMe flash and up-and-coming SCM SSDs, storage had to be redesigned to look more like a network switch.

Controller cards?

One can see this new networking approach in their bottom-of-rack, 4U storage appliance. The appliance has up to 20 controller cards creating a heavy-compute/high-bandwidth cluster, attached via an internal PCIe switch to a backend storage complex made up of up to 72 U.2 NVMe SSDs.

The SSDs fit into an interposer that plugs into their PCIe switch and maps single (or dual) ported SSDs to the appliance's PCIe bus. Each controller card supports an Intel Xeon D microprocessor and two 100GbE ports, for up to 40 100GbE ports per appliance. The controller cards are configured in an active-active, auto-failover mode for high availability. They don't use memory caching or have any NVRAM.

On their website, Pavilion Data shows 117 µsec response times and 114 GB/sec of throughput for IO performance.
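To put those numbers in perspective, here's a quick back-of-the-envelope check (a sketch; the port counts and link rates come from the architecture described above, the rest is simple arithmetic):

```python
# Aggregate wire bandwidth for a fully configured appliance:
# 20 controller cards x 2 100GbE ports each, per the discussion above.
controllers = 20
ports_per_controller = 2
link_rate_gbps = 100                                 # gigabits/sec per port

total_ports = controllers * ports_per_controller     # 40 ports
raw_gb_per_s = total_ports * link_rate_gbps / 8      # 4,000 Gb/s -> 500 GB/s raw

measured_gb_per_s = 114                              # throughput Pavilion publishes
print(f"{total_ports} ports, {raw_gb_per_s:.0f} GB/s raw wire bandwidth, "
      f"{measured_gb_per_s} GB/s measured ({measured_gb_per_s / raw_gb_per_s:.0%})")
```

So their published 114 GB/sec is roughly a quarter of the raw wire rate of a fully loaded box, which leaves plenty of headroom for protocol and parity overhead.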

Data management for NVMeoF storage

Pavilion Data storage supports widely striped RAID6 data protection (16+2), thin provisioning, space-efficient read-only (redirect-on-write) snapshots and space-efficient read-write clones. With RAID6, it takes more than 2 drive failures in a stripe to lose data.
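As a quick sanity check on the 16+2 layout, here's the parity-overhead math (a sketch; only the 16+2 geometry comes from the episode, the per-drive capacity is an invented example):

```python
# Usable-capacity math for a 16+2 RAID6 stripe (16 data strips + 2 parity strips).
data_strips, parity_strips = 16, 2
efficiency = data_strips / (data_strips + parity_strips)   # ~88.9% usable

drive_tb = 8          # hypothetical per-drive capacity, purely for illustration
drives = 72           # a fully populated appliance
usable_tb = drives * drive_tb * efficiency
print(f"efficiency {efficiency:.1%}; {drives} x {drive_tb}TB -> ~{usable_tb:.0f}TB usable")
# RAID6 survives any 2 concurrent drive failures per stripe; it takes a 3rd to lose data.
```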

Like traditional storage, volumes (NVMe namespaces) are assigned to RAID groups. The backend layout appears to be a log-structured file. VR mentioned that they don't do garbage collection, and with no NVRAM and no memory caching, there's a bit of secret sauce here.

Pavilion Data storage offers two NVMeoF/Ethernet protocols:

  • A standard, off-the-shelf NVMeoF/RoCE interface that uses the v1.x Linux kernel NVMeoF/RoCE drivers and requires special NIC/switch hardware
  • A new NVMeoF/TCP interface that doesn't need special networking hardware and, as such, offers NVMeoF over standard NICs/switches. I assume this requires host software to work.

In addition, Pavilion Data has developed their own multi-path IO (MPIO) driver for NVMeoF high availability, which they have contributed to the Linux kernel project.

Their management software uses RESTful APIs (documented on their website). They also offer a CLI and a GUI, both built using these APIs. Bottom-of-rack storage appliances are managed as separate storage units, so they don't support clusters of more than one appliance. However, there are only a few clustered block storage systems we know of that support 20 controllers today.
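Pavilion documents the actual APIs on their website; purely to illustrate the pattern of a CLI/GUI built over REST, here's a hypothetical sketch (the endpoint paths, field names and credentials below are all invented, not Pavilion's real API):

```python
import requests

# Hypothetical endpoint and payload -- see Pavilion's own docs for the real API.
BASE = "https://pavilion-array.example.com/api/v1"
AUTH = ("admin", "secret")

# Create a thin-provisioned volume (an NVMe namespace) in a RAID group.
vol = requests.post(f"{BASE}/volumes", auth=AUTH, json={
    "name": "mongo-shard-01",
    "size_gb": 2048,
    "raid_group": "rg0",
    "thin": True,
}).json()

# Take a space-efficient (redirect-on-write) snapshot of it.
requests.post(f"{BASE}/volumes/{vol['id']}/snapshots",
              auth=AUTH, json={"name": "nightly"}).raise_for_status()
```

The design point worth noting is that the CLI and GUI are just clients of the same REST surface, so anything an operator can click through can also be scripted.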

Market

VR mentioned that they are going after new applications like MongoDB, Cassandra, Couchbase, etc. These applications are designed around rack scaling and provide "global", off-rack/cross-datacenter availability themselves. But VR also mentioned Oracle and other, more traditional applications. Pavilion Data storage is sold on a capacity ($/GB) basis.

The system comes in a minimum configuration of 5 controller cards and 18 NVMe SSDs, and can be extended in increments of 5 controller cards and 18 NVMe SSDs up to the full 20 controller cards and 72 NVMe SSDs.
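The valid configurations scale in lockstep, as this trivial enumeration of the increments shows:

```python
# The appliance scales in building blocks of 5 controller cards + 18 NVMe SSDs.
for blocks in range(1, 5):
    print(f"{5 * blocks} controller cards / {18 * blocks} NVMe SSDs")
# -> 5/18 (minimum), 10/36, 15/54, 20/72 (maximum)
```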

The podcast runs ~42 minutes. VR was very knowledgeable about the storage industry, NVMeoF storage protocols, NVMe SSDs and advanced data management capabilities. We had a good talk with VR on what Pavilion Data does and how well it works. Listen to the podcast to learn more.

VR Satish, Founder and CTO, Pavilion Data Systems

VR Satish is the Chief Technology Officer at Pavilion Data Systems and brings more than 20 years of experience in enterprise storage software products.

Prior to joining Pavilion Data, he was an Entrepreneur-in-Residence at Artiman Ventures. Satish was an early employee of Veritas and later served as the Vice President and the Chief Technology Officer for the Information & Availability Group at Symantec Corporation prior to joining Artiman.

His current areas of interest include distributed computing, information-centric storage architectures and virtualization.

Satish holds multiple patents in storage management, and earned his Master’s degree in computer science from the University of Florida.

61: GreyBeards talk composable storage infrastructure with Taufik Ma, CEO, Attala Systems

In this episode, we talk with Taufik Ma, CEO, Attala Systems (@AttalaSystems). Howard had met Taufik at last year's Flash Memory Summit (FMS17) and was intrigued by their architecture, which he thought was a harbinger of future trends in storage. The fact that Attala Systems was innovating with new, proprietary hardware made for an interesting discussion in its own right, from my perspective.

Taufik has worked at startups and major hardware vendors in his past life and seems to have always been at the intersection of breakthrough solutions and hardware technology.

Attala Systems is based out of San Jose, CA. Taufik has a class-A team of executives, engineers and advisors making history again, this time in storage with JBoFs and NVMeoF.

Ray's written about JBoF (just a bunch of flash) before (see his Facebook moving to JBoF post). A JBoF is essentially a hardware box filled with lots of flash storage and drive interfaces that connects directly to servers. Attala Systems storage is JBoF on steroids.

Composable Storage Infrastructure™

Essentially, their composable storage infrastructure JBoF connects over NVMeoF (NVMe over Fabric) using Ethernet to provide direct host access to NVMe SSDs. They have implemented special-purpose, proprietary hardware in the form of an FPGA, used in a proprietary host network adapter (HNA), to support their NVMeoF storage.

The HNA comes in a host-side and a storage-side version, both utilizing Attala Systems' proprietary FPGA(s). With these HNAs, Attala has implemented their own NVMeoF-over-UDP stack in hardware. It supports multi-path IO and highly available, dual- or single-ported NVMe SSDs in a storage shelf. They use standard RDMA-capable 25/50/100GbE Ethernet (read Mellanox) switches to connect hosts to storage JBoFs.

They also support RDMA over Converged Ethernet (RoCE) NICs for additional host access. However, I believe this requires host software (their NVMeoF-over-UDP stack) to connect to their storage.

From the host, Attala Systems storage on HNAs looks like directly attached NVMe SSDs. Only they're hot pluggable and physically located across an Ethernet network. In fact, Taufik mentioned that they already support VMware vSphere servers accessing Attala Systems composable storage infrastructure.

Okay, on to the good stuff. Taufik said they measured their overhead and were able to perform an IO with only an additional 5 µsec of overhead over native NVMe SSD latencies. Current NVMe SSDs operate with response times of from 90 to 100 µsec, so with Attala Systems composable storage infrastructure you should see 95 to 105 µsec response times over a JBoF (or JBoFs) full of NVMe SSDs! Taufik said that with Intel Optane SSDs' 10 µsec response times, they see response times of ~16 µsec (the extra µsec seems to be network switch delay)!!
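The latency arithmetic is easy to verify (a sketch using only the figures quoted in the discussion):

```python
# End-to-end latency = native SSD latency + Attala's measured ~5 µsec fabric overhead.
fabric_overhead_us = 5

for name, ssd_us in [("NAND NVMe SSD (fast)", 90),
                     ("NAND NVMe SSD (slow)", 100),
                     ("Intel Optane SSD", 10)]:
    print(f"{name}: {ssd_us} + {fabric_overhead_us} = {ssd_us + fabric_overhead_us} µsec")
# NAND lands in the 95-105 µsec range quoted above. Optane computes to 15 µsec
# vs the ~16 µsec observed -- the extra ~1 µsec looks like network switch delay.
```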

Managing composable storage infrastructure

They also use a management "entity" (running on a server or as a VM) to manage their JBoF storage and configure NVMe namespaces (like SCSI LUNs/volumes). Hosts use NVMe namespaces to access and split up the JBoF NVMe storage space. That is, multiple Attala Systems namespaces can be configured over a single NVMe SSD, each one corresponding to a single (virtual-to-real) host NVMe SSD.
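To make the namespace model concrete, here's a minimal sketch of the mapping (the names and capacities are invented; the real management entity obviously tracks far more state):

```python
from dataclasses import dataclass, field

@dataclass
class NvmeSsd:
    """One physical NVMe SSD in the JBoF; the management entity carves
    namespaces out of it, each presented to a host as its own NVMe drive."""
    serial: str
    capacity_gb: int
    namespaces: dict[str, int] = field(default_factory=dict)  # name -> size_gb

    def carve(self, name: str, size_gb: int) -> None:
        # Refuse to over-allocate the physical device.
        if sum(self.namespaces.values()) + size_gb > self.capacity_gb:
            raise ValueError("not enough free capacity on this SSD")
        self.namespaces[name] = size_gb

# Two namespaces carved from a single physical SSD, each a virtual-to-real
# host NVMe drive, exactly the many-to-one mapping described above.
ssd = NvmeSsd(serial="SN-0001", capacity_gb=16000)
ssd.carve("host-a/ns1", 4000)
ssd.carve("host-b/ns1", 8000)
```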

The management entity has a GUI, but it just uses their RESTful APIs. They also support QoS, on an IOPS- or bandwidth-limiting basis per namespace, to control noisy neighbors.

Attala Systems architected their management system to support scale-out storage. This means they could support many JBoFs in a rack, and possibly multiple racks of JBoFs connected to swarms of servers. And nothing was said that would limit the number of Attala storage system JBoFs attached to a single server or under a single (dual for HA) management entity. I thought the host software might have a problem with this (e.g., 256 NVMe SSDs (namespaces) PCIe-connected to the same server), but Taufik said this isn't a problem for a modern OS.

Taufik mentioned that with their RESTful APIs, namespaces can be quickly created and torn down, on the fly. They envision their composable storage infrastructure as a great complement to cloud compute and container execution environments.
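Again purely illustrative (Attala's real API isn't shown in this post, so the endpoints, fields and QoS knob below are all invented), the on-the-fly create/tear-down flow for a container workload might look like:

```python
import requests

# Hypothetical management-entity endpoint -- not Attala's actual API.
BASE = "https://attala-mgmt.example.com/api"

# Create a namespace for a short-lived container workload, with a QoS cap
# (per the discussion, QoS can limit a namespace by IOPS or bandwidth).
ns = requests.post(f"{BASE}/namespaces", json={
    "size_gb": 100,
    "host": "k8s-node-17",
    "qos": {"max_iops": 50_000},
}).json()

# ... the container runs against what looks like a local NVMe drive ...

# Tear the namespace down, on the fly, when the workload exits.
requests.delete(f"{BASE}/namespaces/{ns['id']}").raise_for_status()
```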

For storage hardware, they use storage shelves from OEM vendors. One recent configuration from Supermicro has 32 hot-pluggable, dual-ported NVMe slots in a 1U chassis, which at today's ~16TB capacities is ~1/2PB of raw flash. Taufik mentioned that 32TB NVMe SSDs are being worked on as we speak. Imagine that, 1PB of NVMe SSD flash storage in 1U!!
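The raw-capacity arithmetic behind those claims (a sketch using the quoted slot count and drive sizes):

```python
slots = 32                       # NVMe slots in the 1U Supermicro shelf
for drive_tb in (16, 32):        # today's drives vs the 32TB drives in the works
    raw_tb = slots * drive_tb
    print(f"{slots} x {drive_tb}TB = {raw_tb}TB (~{raw_tb / 1000:.1f}PB raw) per 1U")
# -> 512TB (~0.5PB) per 1U today; 1,024TB (~1PB) per 1U with 32TB drives
```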

The podcast runs ~47 minutes. Taufik took a while to get warmed up, but once he got going, my jaw dropped. Listen to the podcast to learn more.

Taufik Ma, CEO Attala Systems

Tech-savvy business executive with a track record of commercializing disruptive data center technologies. After a short stint as an engineer at Intel following college, Taufik jumped to the business side, where he led a team to define Intel's crown jewels – CPUs & chipsets – during the ascendancy of the x86 server platform.

He honed his business skills as Co-GM of Intel's Server System BU before leaving for a storage/networking startup. The acquisition of this startup put him on the executive team at Emulex, where, as SVP of product management, he grew their networking business from scratch to deliver the industry's first million units of 10Gb Ethernet product.

These accomplishments draw from his ability to engage and acquire customers at all stages of product maturity, including partners when necessary.

59: GreyBeards talk IT trends with Marc Farley, Sr. Product Manager, HPE

In Episode 59, we talk with Marc Farley, Senior Product Manager at HPE, discussing trends in the storage industry today. Marc's been on our show before (GreyBeards talk Cloud Storage…, GreyBeards video discussing file analytics, Greybeards talk cars, storage and IT…) and has been a longtime friend and associate of both Howard and me.

Marc’s been at HPE for a while now but couldn’t discuss publicly what he is working on, so we spent time discussing industry trends rather than HPE products.

We discussed the public cloud and its impact on enterprise IT. Although the cloud has been arguably alive and well for almost a decade now, its impact is still being felt today and will be for the foreseeable future.

We next discussed AI and data storage. HPE's acquisition of Nimble brought InfoSight into their product family; InfoSight was arguably one of the first products to use big data analytics to improve field support and ongoing operations.

Howard mentioned that a logical next step is to apply AI to storage performance, using AI to fingerprint application workloads and thereby help determine when an app's data is needed in cache. We also mentioned that AI could be used to help workload optimization/orchestration in near real time, rather than after the fact.

We talked about containerization as the next big thing. Howard and Marc said it's sometimes less risky to just keep chugging away with what IT has always done than to move to a new paradigm/platform, AKA containers. As further evidence, Marc had seen a survey (by an unnamed research firm) of customers' pre-purchase expectations for new storage versus what they actually used it for post-purchase. Pre-purchase, customers expected to use the storage for server virtualization, but post-purchase, a majority used it for more traditional, non-virtualized applications.

We returned to a perennial theme: when will SSDs supplant disk? Howard talked about a recent vendor's introduction of a dual-head disk drive, which he thought was overreach. But all agreed the key metric is $/GB and getting the difference between rotating media and SSDs below 10X. Howard believes that when it's more like 4X, SSDs will kill off disk technology. Although some of us felt disks will never completely go away; witness tape.
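To make the crossover argument concrete, a sketch (the prices are invented placeholders; only the 10X and 4X ratio thresholds come from the discussion):

```python
# Only the 10X/4X ratio thresholds come from the discussion;
# the $/GB figures below are invented placeholders.
disk_per_gb, ssd_per_gb = 0.02, 0.16

ratio = ssd_per_gb / disk_per_gb
print(f"SSD-to-disk $/GB ratio: {ratio:.0f}X")
if ratio <= 4:
    print("At or below ~4X: Howard's threshold for SSDs killing off disk")
elif ratio <= 10:
    print("Below 10X: flash becomes competitive for many workloads")
```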

The podcast runs ~38 minutes. Marc's always a gas to talk with and is currently the most frequent guest we have had on our show (although Jim Handy was tied with him up until now). It's great to hear from him again. Listen to the podcast to learn more.

Marc Farley, Senior Product Manager, HPE

Marc is a storage greybeard who has worked for many storage companies and is currently providing product strategy for HPE. He has written three books on storage, including his most recent, Rethinking Enterprise Storage: A Hybrid Cloud Model, and his previous books Building Storage Networks and Storage Networking Fundamentals.

In addition to writing books, he has been a blogger and podcaster about storage topics while working for EqualLogic, Dell, 3PAR, HP, StorSimple, Microsoft, HPE and others.

When he is not working, Marc likes to ride bicycles, listen to music, spend time with his family and dote on his cats. Of course there’s that car video curation…

55: GreyBeards storage and system yearend review with Ray & Howard

In this episode, the GreyBeards discuss the year in systems and storage. We kick off the discussion with a long-running IT trend that has really taken off over the last couple of years: the industry has taken to buying pre-built appliances rather than building them from the ground up.

We can see this in all the hyper-converged solutions available today, but it goes even deeper than that. It seems to have started with the trend toward organizations getting by with fewer staff.

This led to a desire to purchase pre-built software applications and now appliances, rather than build from parts. It just takes too long to build, and lead architects have better things to do with their time than checking compatibility lists and testing and verifying that hardware works properly with software. The pre-built appliances are good enough, and doing it yourself doesn't really provide that much of an advantage over the pre-built solutions.

Next, we see the coming NVMe over Fabric storage systems as sort of a countertrend to the previous one. Here we see some customers paying well for special-purpose hardware with blazing speed that takes time and effort to get working right, but the advantages are significant. Both Howard and I were at the Excelero SFD12 event and it blew us away. Howard also attended the E8 Storage SFD14 event, which was another example along a similar vein.

Finally, the last trend we discussed was the rise of 3D TLC and the failure of 3DX and other storage class memory (SCM) technologies to make a dent in the marketplace. 3D TLC NAND is coming out of just about every fab these days, resulting in huge (but costly) SSDs in the multi-TB range. Combine these with NVMe interfaces and you have sub-msec access to almost a PB of storage without breaking a sweat.

The missing 3DX SCM tsunami some of us predicted is mainly due to the difficulties of bringing new fab technologies to market. We saw some of this in the stumbling over 3D NAND, but the transition to 3DX and other SCM technologies is a much bigger change, to new processes and technology. We all believe it will get there someday, but for the moment, the industry just needs to wait until the fabs get their yields up.

The podcast runs over 44 minutes. Howard and I could talk for hours about what's happening in IT today. Listen to the podcast to learn more.

Howard Marks is the Founder and Chief Scientist of DeepStorage, a prominent blogger at Deep Storage Blog, and can be found on twitter @DeepStorageNet.


Ray Lucchesi is the President and Founder of Silverton Consulting, a prominent blogger at RayOnStorage.com, and can be found on twitter @RayLucchesi.