Category Archives: Scale-out storage

54: GreyBeards talk scale-out secondary storage with Jonathan Howard, Dir. Tech. Alliances at Commvault

This month we talk scale-out secondary storage with Jonathan Howard, Director of Technical Alliances at Commvault. Both Howard and I attended Commvault GO 2017 for Tech Field Day this past month in Washington DC. We had an interesting overview of their Hyperscale secondary storage solution, and Jonathan answered most of our questions, so we thought he would make a good guest for our podcast.

Commvault has been providing data protection solutions for a long time, using anyone’s secondary storage, but recently they released a software defined, scale-out secondary storage solution that runs their software on a clustered file system.

Hyperscale secondary storage

They call their solution Hyperscale secondary storage, and it’s available both as a hardware-software appliance and as a software-only configuration on compatible, off-the-shelf commercial hardware. Hyperscale uses the Red Hat Gluster clustered file system, which together with the Commvault Data Platform provides a highly scalable secondary storage cluster that can meet anyone’s secondary storage needs while providing high availability and high throughput performance.

Commvault’s Hyperscale secondary storage system operates on-prem in customer data centers. Hyperscale uses flash storage for system metadata, but most secondary storage resides on local server disk.

Combined with Commvault Data Platform

With the sophistication of the Commvault Data Platform, one can have all the capabilities of a standalone Commvault environment with software defined storage. This supports just about any RTO/RPO needed by today’s enterprise and includes Live Sync secondary storage replication, IntelliSnap for on-storage snapshot management, Live Mount for instant recovery (using secondary storage directly to boot your VMs without waiting for data recovery), and all the other recovery sophistication available from Commvault.

Hyperscale storage is capable of doing up to 5 Live Mount recoveries simultaneously per node without a problem, and more are possible depending on performance requirements.

We also talked about Commvault’s cloud secondary storage solution which can make use of AWS S3 storage to hold backups.

Commvault’s organic growth

Most of the other data protection companies have come about through mergers, acquisitions or spinoffs. Commvault has continued along, enhancing their solution while basing everything on an underlying centralized metadata database. So their codebase has grown from the bottom up and supports pretty much any and all data protection requirements.

The podcast runs ~50 minutes. Jonathan was very knowledgeable about the technology and was great to talk with. Listen to the podcast to learn more.

Jonathan Howard, Director, Technical and Engineering Alliances, Commvault

Jonathan Howard is a Director, Technology & Engineering Alliances for Commvault. A 20-year veteran of the IT industry, Jonathan has worked at Commvault for the past 8 years in various field, product management, and now alliance-facing roles.

In his present role with Alliances, Jonathan works with business and technology leaders to design and create numerous joint solutions that have empowered Commvault alliance partners to create and deliver their own new customer solutions.

51: GreyBeards talk hyper convergence with Lee Caswell, VP Product, Storage & Availability BU, VMware

Sponsored by:

VMware

In this episode we talk with Lee Caswell (@LeeCaswell), Vice President of Product, Storage and Availability Business Unit, VMware.  This is the second time Lee’s been on our show, the previous one back in April of last year when he was with his prior employer. Lee’s been at VMware for a little over a year now and has helped lead some significant changes in their HCI offering, vSAN.

VMware vSAN/HCI business

Many customers struggle to modernize their data centers with funding being the primary issue. This is very similar to what happened in the early 2000s as customers started virtualizing servers and consolidating storage. But today, there’s a new option, server based/software defined storage like VMware’s vSAN, which can be deployed for little expense and grown incrementally as needed. VMware’s vSAN customer base is currently growing by 150% CAGR, and VMware is adding over 100 new vSAN customers a week.

Many companies say they offer HCI, but few have adopted the software-only business model this entails. The transition from a hardware-software, appliance-based business model to a software-only business model is difficult and means a move from a high revenue-lower margin business to a lower revenue-higher margin business. VMware, from its very beginnings, has built a sustainable software-only business model that extends to vSAN today.

The software business model means that VMware can partner easily with a wide variety of server OEM partners to supply vSAN ReadyNodes that are pre-certified and jointly supported in the field. There are currently 14 server partners for vSAN ReadyNodes. In addition, VMware has co-designed the VxRail HCI Appliance with Dell EMC, which adds integrated life-cycle management as well as Dell EMC data protection software licenses.

As a result, customers can adopt vSAN as a build or a buy option for on-prem use and can also leverage vSAN in the cloud from a variety of cloud providers, including AWS very soon. It’s the software-only business model that sets the stage for this common data management across the hybrid cloud.

VMware vSAN software defined storage (SDS)

The advent of Intel Xeon processors and plentiful, relatively cheap SSD storage has made vSAN an easy storage solution for most virtualized data centers today. SSDs removed any performance concerns that customers had with hybrid HCI configurations. And with Intel’s latest Xeon Scalable processors, there’s more than enough power to handle both application compute and storage compute workloads.

From Lee’s perspective, there’s still a place for traditional SAN storage, but he sees it more for cold storage that is scaled independently from servers or for bare metal/non-virtualized storage environments. But everyone else running virtualized data centers really needs to give vSAN a look.

Storage vendors shifting sales

It used to be that major storage vendor sales teams would lead with hardware appliance storage solutions and then move to HCI when pushed. The problem was that a typical SAN storage sale takes 9 months to complete and is then followed by 3 years of limited additional sales.

To address this, some vendors have taken the approach where they lead with HCI and only move to legacy storage when it’s a better fit. With VMware vSAN, it’s a quicker sales cycle than legacy storage because HCI costs less up front and there’s no need to buy the final storage configuration with the first purchase. VMware vSAN HCI can grow as the customer applications needs dictate, generating additional incremental sales over time.

VMware vSAN in AWS

Recently, VMware announced VMware Cloud on AWS. What this means is that you can have vSAN storage operating in the AWS cloud just as you would on-prem. In this case, workloads could migrate from cloud to on-prem and back again with almost no changes. How the data gets from on-prem to cloud is another question.

Also, the pricing model for VMware Cloud on AWS moves to a consumption-based model, where you pay for just what you use on a monthly basis. This way, VMware Cloud on AWS and vSAN are billed monthly, consistent with other AWS offerings.

VMware vs. Microsoft on cloud

There’s a subtle difference in how Microsoft and VMware are adopting cloud. VMware came from an infrastructure platform and is now implementing their infrastructure on cloud. Microsoft started as a development platform and is taking their cloud development platform/stack and bringing it to on-prem.

It’s really two different philosophies in action. We now see VMware doing more for the development community with vSphere Integrated Containers (VIC), Docker containers, Kubernetes, and Pivotal Cloud Foundry. Meanwhile, Microsoft is looking to implement Azure Stack for on-prem environments, focusing more on infrastructure. In the end, enterprises will have terrific choices as the software defined data center frees up customer dollars and management time.

The podcast runs ~25 minutes. Lee is a very knowledgeable individual, and although he doesn’t qualify as a Greybeard (just yet), he has been in and around the data center and flash storage environments throughout most of his career. From his diverse history, Lee has developed a very businesslike perspective on data center and storage technologies, and it’s always a pleasure talking with him. Listen to the podcast to learn more.

Lee Caswell, V.P. of Product, Storage & Availability Business Unit, VMware

Lee Caswell leads the VMware storage marketing team driving vSAN products, partnerships, and integrations. Lee joined VMware in 2016 and has extensive experience in executive leadership within the storage, flash and virtualization markets.

Prior to VMware, Lee was vice president of Marketing at NetApp and vice president of Solution Marketing at Fusion-io (now SanDisk). Lee was a founding member of Pivot3, a company widely considered a pioneer of hyper-converged systems, where he served as CEO and CMO. Earlier in his career, Lee held marketing leadership positions at Adaptec and SEEQ Technology, a pioneer in non-volatile memory. He started his career at General Electric in Corporate Consulting.

Lee holds a bachelor of arts degree in economics from Carleton College and a master of business administration degree from Dartmouth College. Lee is a New York native and has lived in northern California for many years. He and his wife live in Palo Alto and have two children. In his spare time Lee enjoys cycling, playing guitar, and hiking the local hills.

49: Greybeards talk open convergence with Brian Biles, CEO and Co-founder of Datrium

Sponsored By:

In this episode we talk with Brian Biles, CEO and Co-founder of Datrium. We last talked with Brian and Datrium in May of 2016 and at that time we called it deconstructed storage. These days, Datrium offers a converged infrastructure (C/I) solution, which they call “open convergence”.

Datrium C/I

Datrium’s C/I solution stores persistent data off server onto data nodes and uses onboard flash as a local, host read-write IO cache. They also use host CPU resources to perform other data services such as compression and local deduplication.

In contrast to the hyper converged infrastructure solutions available on the market today, customer data is never split across host nodes. That is, data residing on a host has only been created and accessed by that host.

Datrium uses on host SSD storage/flash as a fast access layer for data accessed by the host. As data is (re-)written, it’s compressed and locally deduplicated before being persisted (written) down to a data node.
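In rough outline, that write path amounts to: fingerprint the block, skip it if the host has seen it before, otherwise compress and persist it. Here’s a conceptual sketch of the idea, not Datrium’s actual implementation; the hash-based dedup check and the dict-backed "data node" are purely illustrative:

```python
import hashlib
import zlib

local_cache = {}   # host SSD cache: fingerprint -> compressed block
data_node = {}     # stand-in for the persistent data node store

def write_block(block: bytes) -> str:
    """Compress and locally dedup a block, then persist it to the data node."""
    fingerprint = hashlib.sha256(block).hexdigest()
    if fingerprint not in local_cache:          # local dedup check
        compressed = zlib.compress(block)
        local_cache[fingerprint] = compressed   # fast read cache on host flash
        data_node[fingerprint] = compressed     # persist off-server
    return fingerprint

fp = write_block(b"some VM data" * 100)
write_block(b"some VM data" * 100)  # duplicate: nothing new persisted
print(len(data_node))  # 1
```

The key point is that the dedup check happens on the host, before any data crosses the wire to the data node.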

A data node is a relatively lightweight, dual-controller/HA storage solution with 12 high capacity disk drives. Data node storage is global to all hosts running Datrium storage services in the cluster. Besides acting as a permanent repository for data written by the cluster of hosts, it also performs global deduplication of data across all hosts.

The nice thing about their approach to C/I is that it’s easily scalable: if you need more IO performance, just add more hosts or more SSDs/flash to servers already connected in the cluster. And if a host fails, it doesn’t impact cluster IO or data access for any other host.

Datrium originally came out supporting VMware virtualization, acting as an NFS datastore for VMDKs.

Recent enhancements

In July, Datrium released new support for Red Hat and KVM virtualization alongside VMware vSphere. They also added Docker persistent volume support. Now you can have mixed KVM, VMware, and Docker container environments, all accessing the same persistent storage.

KVM offered an opportunity to grow the user base and support Red Hat enterprise accounts. Red Hat is a popular software development environment in non-traditional data centers. Also, much of the public cloud is KVM based, which provides a great way to someday support Datrium storage services in public cloud environments.

One challenge with Docker support is that there are just a whole lot more Docker volumes than VMDKs in vSphere. So Datrium added sophisticated volume directory search capabilities and naming convention options for storage policy management. Customers can define a naming convention for application/container volumes and use it to define group storage policies, which then apply to any volume that matches the naming convention. This is a lot easier than having to do policy management at the volume level with 100s, 1,000s, even 10,000s of distinct volume IDs.
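Datrium hasn’t published the matching syntax, but the general idea of naming-convention-based policy grouping can be sketched with glob patterns. The pattern names and policy fields below are invented for illustration:

```python
import fnmatch

# Hypothetical policy table: first matching glob pattern wins.
POLICIES = [
    ("prod-db-*", {"replicas": 3, "snapshot_interval_min": 15}),
    ("dev-*",     {"replicas": 2, "snapshot_interval_min": 240}),
    ("*",         {"replicas": 2, "snapshot_interval_min": 60}),  # default
]

def policy_for(volume_name: str) -> dict:
    """Return the first group policy whose pattern matches the volume name."""
    for pattern, policy in POLICIES:
        if fnmatch.fnmatch(volume_name, pattern):
            return policy
    raise ValueError("no policy matched")  # unreachable given the "*" default

print(policy_for("prod-db-orders-001"))  # picks the "prod-db-*" policy
```

With thousands of container volumes, admins maintain a handful of patterns instead of per-volume settings.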

Docker is being used today to develop most cloud based applications. And many development organizations have adopted Docker containers for their development and application deployment environments. Many shops do development under Docker and production on vSphere. So now these shops can use Datrium to access development as well as production data.

More recently, Datrium also scaled the number of data nodes available in a cluster. Previously, you could only have one data node with 12 drives, or about 29TB of raw protected capacity, which when deduped and compressed yields an effective capacity of ~100TB. With this latest release, Datrium now supports up to 10 data nodes in a cluster, for a total of 1PB of effective capacity for your storage needs.
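Working through those numbers: 29TB raw yielding ~100TB effective implies roughly a 3.4:1 data-reduction ratio, and 10 nodes at ~100TB effective each gets you to the 1PB figure. A quick sanity check (the ratios are approximate, taken from the figures quoted above):

```python
raw_per_node_tb = 29          # raw protected capacity per data node
effective_per_node_tb = 100   # after dedup + compression (approximate)

reduction_ratio = effective_per_node_tb / raw_per_node_tb
print(f"implied data reduction: ~{reduction_ratio:.1f}:1")   # ~3.4:1

max_nodes = 10
cluster_effective_pb = max_nodes * effective_per_node_tb / 1000
print(f"cluster effective capacity: ~{cluster_effective_pb:.0f}PB")  # ~1PB
```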

The podcast runs ~25 minutes. Brian is very knowledgeable about the storage industry, has been successful at many other data storage companies and is always a great guest to have on our show. Listen to the podcast to learn more.

Brian Biles, Datrium CEO & Co-founder

Prior to Datrium, Brian was Founder and VP of Product Mgmt. at EMC Backup Recovery Systems Division. Prior to that he was Founder, VP of Product Mgmt. and Business Development for Data Domain (acquired by EMC in 2009).

48: Greybeards talk object storage with Enrico Signoretti, Head of Product Strategy, OpenIO

In this episode we talk with Enrico Signoretti, Head of Product Strategy for OpenIO, a software defined, object storage startup out of Europe. Enrico is an old friend, having been a member of many past Storage Field Day (SFD) events that both Howard and I attended, and we wanted to hear what he was up to nowadays.

OpenIO open source SDS

It turns out that OpenIO is an open source object storage project that’s been around since 2008 and was recently (2015) re-launched as a new storage startup. The open source, community version is still available, and OpenIO has download links so you can try it out. There’s even one for a Raspberry Pi (Raspbian 8, I believe) on their website.

As everyone should recall, object storage is meant for multi-PB data storage environments. Objects are assigned an ID and are stored in containers or buckets. Object storage has a flat hierarchy, unlike file systems, which have a multi-tiered hierarchy.

Currently, OpenIO is in a number of customer sites running 15-20PB storage environments. OpenIO supports AWS S3 compatible protocol and OpenStack Swift object storage API.

OpenIO is based on open source, but customer service and usability are built into the product they license to end customers on a usable-capacity basis. The minimum license is for 100TB and can go into the multi-PB range. There doesn’t appear to be any charge for additional features or additional cluster nodes.

The original code was developed for a big email service provider and supported a massive user community. So it was originally designed for small objects, fast access, and many cluster nodes. Nowadays, it can support very large objects as well.

OpenIO functionality

Each disk device in the OpenIO cluster is a dedicated service. By setting it up this way, load balancing across the cluster can be done at the disk level. Load balancing in OpenIO is also a dynamic operation. That is, every time an object is created, each node’s current capacity is examined to determine the node with the least used capacity, which is then chosen to hold that object. This way there’s no static allocation of object IDs to nodes.
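In outline, that placement decision is just a least-used-capacity selection at write time. Here’s a minimal sketch of the idea, not OpenIO’s actual code; the node names and the capacity figures are invented for illustration:

```python
def pick_node(nodes):
    """Pick the node with the lowest used-capacity fraction for a new object.

    `nodes` maps node name -> (used_bytes, total_bytes).
    """
    return min(nodes, key=lambda n: nodes[n][0] / nodes[n][1])

cluster = {
    "node-a": (600, 1000),   # 60% used
    "node-b": (300, 1000),   # 30% used  <- least used, gets the object
    "node-c": (450, 1000),   # 45% used
}
print(pick_node(cluster))  # node-b
```

Because the decision is made fresh for every object, the cluster self-balances as nodes fill up, with no static mapping of object IDs to nodes.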

Data protection in OpenIO supports erasure coding as well as mirroring (replication). This can be set by policy and can vary depending on object size. For example, an object under, say, 100MB might be replicated 3 times, while anything over 100MB uses erasure coding.

OpenIO supports hybrid tiering today. This means that an object can move from OpenIO residency to public cloud (AWS S3 or Backblaze B2) residency over time if the customer wishes. In a future release they will support replication to public cloud as well as tiering. Many larger customers don’t use tiering because of the expense; Enrico says S3 is cheap as long as you don’t access the data.

OpenIO provides compression of objects, although many object storage customers already compress and encrypt their data and so may not use it. For those who don’t, compression can often double the amount of effective storage.

Metadata is just another service in the OpenIO cluster. This means it can be assigned to a number of nodes, or all nodes, on a configuration basis. OpenIO keeps its metadata on SSDs, which are replicated for data protection, rather than in memory. This gives OpenIO a lightweight footprint. They call their solution “serverless”, but what I take from that is that it doesn’t use a lot of server resources to run.

OpenIO offers a number of adjunct services besides pure object storage such as video transcoding or streaming that can be invoked automatically on objects.

They also offer stretched clusters where an OpenIO cluster exists across multiple locations. Objects can have dispersal-like erasure coding for multi-site environments so that if one site goes down you still have access to the data. But Enrico said you have to have a minimum of 3 sites for this.

Enrico mentioned one media & entertainment customer that stored only one version of a video in the object storage; when it was requested in another format, OpenIO automatically transcoded it in real time. They kept the newly transcoded version in a CDN for future availability, until it aged out.

There seems to be a lot of policy and procedural flexibility available with OpenIO but that may just be an artifact of running in Linux.

They currently support Red Hat, Ubuntu and CentOS. They also have a Docker container in beta test for persistent objects, which is expected to ship later this year.

OpenIO hardware requirements

OpenIO has minimal hardware requirements for cluster nodes. The only thing I saw on their website was the need for at least 2GB of RAM on each node. And metadata services seem to require SSDs on multiple nodes.

As discussed above, OpenIO has a uniquely lightweight footprint (which is why it can run on a Raspberry Pi) and only seems to need about 500MB of DRAM and 1 core to run effectively.

OpenIO supports heterogeneous nodes. That is, nodes can have different numbers and types of disks/SSDs, different processor and memory configurations, and different OSs. We talked about the possibility of having a node or disks go down and operating without them for a month, at the end of which admins could go through and fix or replace them as needed. Enrico also mentioned it was very easy to add and decommission nodes.

OpenIO supports a nano-node, which is just an (ARM) CPU, RAM and a disk drive, sort of like Seagate Kinetic and other vendors’ Open Ethernet drive solutions. These drives have a lightweight processor with a small amount of memory, running Linux and accessing an attached disk drive.

Also, OpenIO nodes can offer different services. Some cluster nodes can offer metadata and object storage services, and others only object storage services. This seems configurable on a per-server basis. There’s probably some minimum number of metadata and object services required in a cluster; Enrico mentioned three nodes as a minimum cluster.

The podcast runs ~42 minutes. Enrico is a very knowledgeable industry expert and a great friend from multiple SFD/TFD events. Howard and I had fun talking with him again. Listen to the podcast to learn more.

Enrico Signoretti, Head of Product Strategy at OpenIO.

In his role as head of product strategy, Enrico is responsible for the planning, design and execution of OpenIO product strategy. With the support of his team, he develops product roadmaps from the planning stages through development to ensure their market fit.

Enrico promotes OpenIO products and represents the company and its products at several industry events, conferences and association meetings across different geographies. He actively participates in the company’s sales efforts with key accounts, as well as by exploring opportunities for developing new partnerships and innovative channel activities.

Prior to joining OpenIO, Enrico worked as an independent IT analyst, blogger and advisor for six years, serving clients among primary storage vendors, startups and end users in Europe and the US.

Enrico is constantly keeping an eye on how the market evolves and continuously looking for new ideas and innovative solutions.

Enrico is also a great sailor and an unsuccessful fisherman.

45: Greybeards talk desktop cloud backup/storage & disk reliability with Andy Klein, Director Marketing, Backblaze

In this episode, we talk with Andy Klein, Director of Marketing for Backblaze, which backs up desktops and computers to the cloud and also offers cloud storage.

Backblaze has a unique consumer data protection solution where customers pay a flat fee to back up their desktops and then may pay a separate fee for a large recovery. On their website, they have a counter indicating they have restored almost 22.5B files. Desktop/computer backup costs $50/year. To restore files, if the restore is under 500GB you can download a ZIP file at no charge; if it’s larger, you can have a USB flash stick or hard drive shipped via FedEx, but it will cost you.

They also offer a cloud storage service called B2 (not AWS S3 compatible), which costs $5/TB/month. Backblaze just celebrated their tenth anniversary last April.

Early on, Backblaze figured out the only way they were going to succeed was to use consumer-class disk drives, engineer their own hardware, and write their own software to manage it all.

Backblaze openness

Backblaze has always been a surprisingly open company. Their Storage Pod hardware (now in its 6th generation) has been open sourced from the start and holds 60 drives for 480TB of raw capacity.

A couple of years back, when there was a natural disaster in SE Asia, disk drive manufacturing was severely impacted and their cost per GB for disk drives almost doubled overnight. Considering they were buying about 50PB of drives during that period, it was going to cost them ~$1M extra. But you could still purchase drives, in limited quantities, at select discount outlets. So they convinced all their friends and family to go out and buy consumer drives for them (see their drive farming post[s] for more info).

Howard said that Gen 1 of their Storage Pod hardware used rubber bands to surround and hold disk drives, and as a result it looked like junk. The rubber bands were there to dampen drive rotational vibration, because the drives were inserted vertically. At the time, most if not all of the storage industry used horizontally inserted drives. Nowadays just about every vendor has a high density, vertically inserted drive tray, but we believe Backblaze was the first to use this approach in volume.

Hard drive reliability at Backblaze

These days Backblaze has over 300PB of storage, and they have been monitoring their disk drives’ SMART (error) logs since the start. Sometime during 2013, they decided to keep the log data rather than recycling the space. Since they had the data and were calculating drive reliability anyway, they thought the industry and consumers would appreciate seeing their reliability info. In December of 2014, Backblaze published their hard drive reliability report using Annualized Failure Rates (AFR) calculated from the many thousands of disk drives they run every day. They had not released Q2 2017 hard drive stats yet, but their Q1 2017 hard drive stats post has been out for about 3 months.

Most drive vendors report disk reliability using Mean Time Between Failure (MTBF), the average expected operating time between drive failures. AFR is an alternative reliability metric: the percentage of drives expected to fail in one year’s time. The two are roughly equivalent for low failure rates (for MTBF in hours, AFR ≈ 8766/MTBF), but AFR is more useful because it tells users the percentage of drives they can expect to fail over the next twelve months.
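The conversion above (using 8766 hours per year, i.e. 365.25 days) works out as follows; note that this linear approximation is only reasonable when the failure rate is low:

```python
HOURS_PER_YEAR = 8766  # 365.25 days * 24 hours

def mtbf_to_afr(mtbf_hours: float) -> float:
    """Annualized Failure Rate (%) from MTBF in hours (linear approximation)."""
    return 100 * HOURS_PER_YEAR / mtbf_hours

# A drive rated at 1.2M hours MTBF:
print(f"AFR = {mtbf_to_afr(1_200_000):.2f}%")  # ~0.73% per year
```

So a spec-sheet MTBF of 1.2M hours translates to fewer than 1 in 100 drives failing per year, which is much easier for a buyer to reason about.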

Drive costs matter, but performance matters more

It seemed to the Greybeards that SMR (shingled magnetic recording; read my RoS post for more info) disks would be a great fit for Backblaze’s application. But Andy said their engineering team looked at SMR disks and found that the 2nd write (overwrite of a zone) had terrible performance. Since Backblaze often has customers who delete files or drop the service, they reuse existing space all the time, and SMR disks would hurt performance too much.

We also talked a bit about their current data protection scheme: a Reed Solomon (RS) solution with data written to 17 Storage Pods and parity written to 3 Storage Pods, across a 20 Storage Pod group called a Vault. This way they can handle 3 Storage Pod failures across a Vault without losing customer data.
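The 17+3 layout trades a modest capacity overhead for three-pod fault tolerance. The arithmetic, from the figures above:

```python
data_shards, parity_shards = 17, 3
vault_size = data_shards + parity_shards                    # 20 Storage Pods per Vault

overhead = vault_size / data_shards                         # raw bytes per usable byte
print(f"storage overhead: {overhead:.2f}x")                 # ~1.18x
print(f"usable fraction: {data_shards / vault_size:.0%}")   # 85%
print(f"tolerated pod failures: {parity_shards}")           # 3
```

Compare that to 3x replication, which also survives losing any two copies but stores only 33% usable data; erasure coding is what makes three-failure tolerance affordable at Backblaze’s scale.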

Besides disk reliability and performance, Backblaze is also interested in finding the best $/GB for the drives they purchase. Andy said that nowadays consumer disk pricing (at Backblaze’s volumes) generally falls between ~$0.04/GB and ~$0.025/GB, with newer generation disks starting out at the higher price and falling to the lower price as the manufacturing lines mature. Currently, Backblaze is buying 8TB disk drives.

The podcast runs ~45 minutes. Andy was great to talk with and extremely knowledgeable about disk drives, reliability statistics and “big” storage environments. Listen to the podcast to learn more.

Andy Klein, Director of Marketing at Backblaze

Mr. Klein has 25 years of experience in cloud storage, computer security, and network security.

Prior to Backblaze he worked at Symantec, Checkpoint, PGP, and PeopleSoft, as well as startups throughout Silicon Valley.

He has presented at the Federal Trade Commission, RSA, the Commonwealth Club, Interop, and other computer security and cloud storage events.