GreyBeards talk data-aware, scale-out file systems with Peter Godman, Co-founder & CEO, Qumulo

In this podcast we discuss Qumulo’s data-aware, scale-out file system storage with Peter Godman, Co-founder & CEO of Qumulo. Peter has been involved in scale-out storage for a while now, coming from (EMC) Isilon before starting Qumulo. This time, though, he’s adding data-awareness to scale-out storage. Presently, Qumulo is vertically focused on the HPC and media/entertainment market spaces.

Qumulo is the first storage vendor we have heard of that develops its software with Agile development techniques. This allows them to release new functionality to the field every two weeks. Talk about rapidly turning out software. We believe this is pretty risky, and Ray talks more about Agile development for storage in his Storage on Agile blog post.

But Qumulo mostly sees itself as data-aware NAS, using POSIX metadata and a neat, internally designed and developed database to store, index and retrieve file system metadata. Qumulo’s proprietary database provides much faster responses to metadata queries, such as which files have changed since the last backup, how much storage space a specific owner consumes, or what inclusion/exclusion lists would split the file system into 100 partitions. The database is not a relational or conventional database but almost old-school: indexed data structures tailored to providing quick answers to the queries of most interest to customers and their application environments. In a scale-out NAS environment like Qumulo’s, with potentially billions of files, you just don’t have time to walk an inode tree to get these sorts of answers.
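To make the idea concrete, here is a minimal Python sketch of an index that is kept current on every write, so that the two queries mentioned above can be answered without walking a tree. This is purely illustrative and not Qumulo’s actual design, which the podcast does not describe at this level; the class name `MetadataIndex` and all of its methods are hypothetical.

```python
import bisect
from collections import defaultdict


class MetadataIndex:
    """Toy metadata index (hypothetical; not Qumulo's implementation)."""

    def __init__(self):
        self._files = {}                         # path -> (owner, size, mtime)
        self._bytes_by_owner = defaultdict(int)  # running total per owner
        self._by_mtime = []                      # sorted list of (mtime, path)

    def upsert(self, path, owner, size, mtime):
        """Record a create or modify, updating both indexes incrementally."""
        old = self._files.get(path)
        if old is not None:
            old_owner, old_size, old_mtime = old
            self._bytes_by_owner[old_owner] -= old_size
            # The exact (mtime, path) tuple exists, so bisect_left finds it.
            i = bisect.bisect_left(self._by_mtime, (old_mtime, path))
            del self._by_mtime[i]
        self._files[path] = (owner, size, mtime)
        self._bytes_by_owner[owner] += size
        bisect.insort(self._by_mtime, (mtime, path))

    def bytes_used_by(self, owner):
        """Space consumed by one owner: a dictionary lookup, no tree walk."""
        return self._bytes_by_owner.get(owner, 0)

    def changed_since(self, timestamp):
        """Paths modified at or after timestamp, oldest first."""
        # (timestamp,) sorts before every (timestamp, path) entry.
        i = bisect.bisect_left(self._by_mtime, (timestamp,))
        return [path for _, path in self._by_mtime[i:]]
```

The trade-off in a design like this is a little extra work on every write in exchange for queries whose cost depends on the size of the answer rather than on the number of files in the system.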

Qumulo supplies both hardware and software to its customers but also offers a software-only, or software-defined storage (SDS), version for those few customers that want it. SDS versions can help potential customers perform proofs of concept (PoCs) using VMs.

In its system nodes, Qumulo uses both SSDs and disks. The SSDs provide a sort of NVM that holds recently written data but can also be used for reading data. Behind the SSDs are 8TB disks. Today, Qumulo provides mirrored storage that’s widely dispersed across all the storage in the system. With this wide striping of data, the rebuild time for a failed 8TB disk is ~1:20 on a single QC204 (204TB) system node and halves every time you double the number of nodes.
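The scaling claim above amounts to rebuild time being inversely proportional to node count. A small Python sketch of that arithmetic follows. The function name `estimated_rebuild_time` is hypothetical, extending the rule to node counts that are not powers of two is an assumption, and the single-node figure is passed in as a plain number because the notes do not spell out the units of "~1:20".

```python
def estimated_rebuild_time(single_node_time, nodes):
    """Estimate rebuild time, assuming it halves with each doubling of nodes.

    single_node_time: rebuild time on one node, in any consistent unit.
    nodes: number of nodes in the cluster (must be >= 1).
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # Halving per doubling is equivalent to dividing by the node count.
    return single_node_time / nodes


# Treating "1:20" as 80 of its smaller unit, purely for illustration.
for n in (1, 2, 4, 8):
    print(n, estimated_rebuild_time(80, n))
```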

It was refreshing to hear a startup vendor clearly answer what they have and don’t have implemented in their current system. Some startups try to obfuscate or talk around the lack of functionality, but Peter’s answers were always clear and (sometimes too) concise on what’s in and not in current Qumulo functionality.

This month’s edition runs just over 47 minutes and gets pretty technical in places, but mostly stays at a high functional level. We hope you enjoy the podcast.

Peter Godman, Co-founder & CEO Qumulo

Peter brings 20 years of industry systems experience to Qumulo. As VP of Engineering and CEO of Corensic, Peter brought the world’s first thin-hypervisor based product to market. As Director of Software Engineering at Isilon, he led development of several major releases of Isilon’s award-winning OneFS distributed file system and was inventor of 18 patented technologies. Peter studied math and computer science at MIT.
