As most of you know, Howard Marks (@deepstoragenet), Technologist Extraordinary & Plenipotentiary at VAST Data used to be a Greybeards co-host and is still on our roster as a co-host emeritus. When I started to schedule this podcast, it was going to be our 100th podcast and we wanted to invite Howard and the rest of the co-hosts to be on the call to discuss our podcast. But alas, the 100th Greybeards podcast came and went, before we could get it done. So we decided to refocus this podcast back on VAST Data.
We talked with Howard last year about VAST and some of this podcast covers the same ground (see last year’s podcast with Howard on VAST Data) but I highlighted below different aspects of their product that we also discussed.
For starters, VAST just finalized a recent round of funding, which if I recall, valued them at over $1B USD, or yet another data storage unicorn.
VAST is a scale out, disaggregated, unstructured data platform that takes advantage of the economics of QLC SSD (from Intel) combined with the speed of 3D XPoint storage class memory (Optane SSD, also from Intel) to support customer data. Intel is an investor in VAST.
VAST uses mutliple front end (controller) servers, with one or more HA NVMe drive module(s) connected via a dual infiniband or 100Gbps Ethernet RDMA cluster interconnect. The HA NVMe drive module has two (IO modules) adapter cards, one for each connection that takes IO and data requests and transfers them across a PCIe bus which connects to QLC and Optane SSDs. They also have a Mellanox (another investor) switch on their backend with a (round robin) DNS router to connect hosts to their storage (front-end) servers.
Each backend HA NVMe drive module has 12 1.5TB Optane U.2 SSDs and 44 15.4TB QLC SSDs, for a total of 56 drives. Customer data is first written to Optane and then destaged to QLC SSD.
QLC has the advantage of being 4 bits per cell (for a lower $/GB stored) but it’s endurance or drive writes/day (dw/d)) is significantly worse than TLC. So VAST has had to work to increase QLC endurance in their system.
Natively, QLC offers ~0.2 dw/d when doing random 4K writes. However, if your system does 128KB sequential writes, it offers 4.0 dw/d. VAST destages data from Optane SSDs to QLC in 1MB chunks which both optimizes endurance and reduces garbage collection write amplification within the drive.
Howard mentioned their frontend servers are stateless, i.e., maintain no state information about any IO activity going on. Any IO state information is maintained by their system in Optane SSDs. Each server maintains a work log (like) structure on Optane that describes what they are doing in support of host IO and other activities. That way, if one front end server goes down, another one can access its log and take over its activity.
Metadata is also maintained only on Optane SSDs. Howard called their metadata structure a V-tree (B-tree). VAST mirrors all meta-data and customer data to two Optane SSDs. So if one Optane SSD goes down, its pair can be used to continue operations.
In last years podcast we talked at length about VAST data protection and data reduction capabilities so we won’t discuss these any further here.
However, one thing worth noting is that VAST has a very large RAID (erasure code protection) stripe. Data is written to the QLC SSDs in a VAST designed, locally decodable erasure coding format.
One problem with large stripes is rebuild time. VAST’s locally decodable parity codes help with this but the other thing that helps is distributing rebuild IO activity to all front end servers in the system.
The other problem with large stripe sizes is garbage collection. VAST segregates customer data by “temporariness” based on their best guess. In this way all data in one stripe should have similar lifetimes. When it’s time for stripe garbage collection, having all temporary data allows VAST to jettison the whole stripe (or most of it) rather than having to collect and re-write old stripe data to another new stripe.
VAST came out supporting NFSv3 and S3 object storage protocols, Their next release adds support for SMB 2.2, data-at-rest encryption and snapshotting to an external S3 store. As you may recall SMB is a stateful protocol. In VAST’s home grown, SMB implementation, front end servers can take over SMB transactions from other failed servers, without having to fail the whole transaction and start over again.
VAST uses a fail in place, maintenance policy. That is failed SSDs are not normally replaced in customer deployments, rather blocks, pages, or SSDs are marked as failed and the spare capacity available in the drive enclosure is used to provide space for any needed rebuilt data.
VAST offers a 10 year maintenance option where the customer keeps the same storage for 10 full years. That way customers don’t have to migrate data from one system to another until their 10 years are up.
The podcast runs a little under 44 minutes. Howard and I can talk forever. He is always a pleasure to talk with as well as extremely knowledgeable about (VAST) storage and other industry solutions. The co-hosts and I had a great time talking with him again. Listen to the podcast to learn more.
Howard Marks, Technologist Extraordinary and Plenipotentiary, VAST Data, Inc.
Howard Marks brings over forty years of experience as a technology architect for hire and Industry observer to his role as VAST Data’s Technologist Extraordinary and Plienopotentary. In this role, Howard demystifies VAST’s technologies for customers and customer requirements for VAST’s engineers.
Before joining VAST, Howard ran DeepStorage an industry test lab and analyst firm. An award-winning speaker, he has appeared at events on three continents including Comdex, Interop and VMworld.
Howard is the author of several books (all gratefully out of print) and hundreds of articles since Bill Machrone taught him journalism at PC Magazine in the 1980s.
Listeners may also remember that Howard was a founding co-Host of the Greybeards-on-Storage Podcast.