47: Greybeards talk Storage as a Service with Lazarus Vekiarides, CTO & Co-Founder ClearSky Data

In this episode, we talk with ClearSky Data’s Lazarus Vekiarides, CTO and Co-founder, who we have talked with before (see our podcast from October 2015). ClearSky Data provides a storage-as-a-service offering that uses an on-premises appliance plus point of presence (PoP) storage in the local metro area to hold customer data and offloads this data to cloud storage. In addition to the on-premises storage-as-a-service, they offer access to customer data from an in-cloud virtual appliance. ClearSky provides the whole storage service, including gigabit metro Ethernet connections from the customer to the PoP, for a simple capacity-based charge every month.

How does it work

Their Edge (on premises) appliance supports 24 SSDs and can scale up to 4 appliances. Soon a single appliance will be able to hold up to 32TB of data. It’s intended to hold a data center’s entire working set for one week of activity. So essentially it’s a big caching appliance for the local data center.

For ClearSky Data the lone source of truth for customer data lies in the PoP. The PoP is connected to metro wide fibre that is available in a number of large metropolitan areas. Laz says they have measured sub 500 µsec round trip response time between their PoP equipment and Edge appliance. The PoP provides the backing store for the Edge appliance. Data written to the Edge appliance(s) is written through to the PoP storage. This data and its metadata (<1% of LUN size) are flushed to cloud storage, which holds the data indefinitely.

DR through the PoP

If customers have multiple data centers within the same metro area (100 km) then they can have a single “logical” array that accesses the same data, say a cluster file system across the two data centers. The PoP will take care of copying the metadata to the secondary Edge device and will invalidate any data sitting in the secondary device which is no longer valid. In this way customers can have a Recovery Point Objective (RPO)=0 seconds. That is, any data written to the primary data center is automatically available to the secondary data center as long as the PoP survives.

But even if you wanted to fail over to a different metro area, the PoP data is offloaded to the cloud continuously, so while you wouldn’t attain an RPO=0 seconds, it could be awfully short, on the order of a couple of seconds.

Recent enhancements

ClearSky Data has recently enhanced their storage as a service to provide policy management over snapshots. That is you can establish policies as to how often to take LUN snapshots and how long to retain them in the cloud.

Also, ClearSky Data has added VMware functionality via plugins that allow their storage to know which VMs are writing data or are being backed up to their appliance. And this is included in the metadata written for a LUN which is offloaded to the cloud. Someday soon when you can have vSphere running bare metal in a public cloud service, you will be able to run the Cloud Edge (cloud software version of their Edge appliance) and restore the data from your data center directly to the cloud and have an iSCSI LUN available to EC2 running VMware providing complete Cloud DR for a data center.

We talked a bit about our favorite topic, NVMe storage, and Laz sees a potential for it to help their Edge appliances, but at the moment fault tolerance/high availability is not there. And as they are primary storage for data centers, HA is a critical capability.

Pricing and availability

Their product is priced as a service in $0.nn/GB/Month and if you do a 36 month cost analysis they feel they would come out cheaper than hybrid storage. They currently have PoPs in Boston, New York, Northern Virginia, Dallas, and California. Laz says they believe there are 15 major metropolitan areas across the USA they have targeted for service. What, nothing in Europe or Asia? We would imagine this is merely a question of the number of customers, amount of data and metro infrastructure.

The podcast runs ~24 minutes. Laz has been in the storage industry across a number of companies and has been with a few startups as well. Laz is very knowledgeable about storage, cloud, and metro networking, a good friend and is always a pleasure to talk with.  Listen to the podcast to learn more.

Lazarus Vekiarides, CTO & Co-Founder ClearSky Data

For over 20 years Laz Vekiarides has served in key technical and leadership roles delivering breakthrough technologies to market. Most recently, he served as the Executive Director of Software Engineering for Dell’s EqualLogic Storage Engineering group, where he led the development of numerous storage innovations and established the EqualLogic product line as a leader in host OS and hypervisor integration.

Laz joined Dell from EqualLogic, which was acquired in early 2008, where he was a member of the core leadership team – playing a key role in the company’s early success as a Senior Engineering Manager and Architect for the PS Series SAN arrays and host tools. Prior to EqualLogic, Laz held senior engineering and management positions at several companies including 3COM and Banyan Systems.

An occasional blogger, Laz frequently speaks at industry conferences, particularly in the areas of virtualization and data storage. He holds several storage technology patents, as well as a BSEE from Northeastern University, and an MSCS from the Worcester Polytechnic Institute.

45: Greybeards talk desktop cloud backup/storage & disk reliability with Andy Klein, Director Marketing, Backblaze

In this episode, we talk with Andy Klein, Dir of Marketing for Backblaze, which backs up desktops and computers to the cloud and also offers cloud storage.

Backblaze has a unique consumer data protection solution where customers pay a flat fee to back up their desktops and then may pay a separate fee for a large recovery. On their website, they have a counter indicating they have restored almost 22.5B files. Desktop/computer backup costs $50/year. To restore files, if it’s under 500GB you can get a ZIP file downloaded at no charge, but if it’s larger, you can get a USB flash stick or hard drive shipped FedEx, but it will cost you.

They also offer a cloud storage service called B2 (not AWS S3 compatible) which costs $5/TB/month. Backblaze just celebrated their tenth anniversary last April.

Early on Backblaze figured out the only way they were going to succeed was to use consumer class disk drives and to engineer their own hardware and to write their own software to manage them.

Backblaze openness

Backblaze has always been a surprisingly open company. Their Storage Pod hardware (6th generation now) has been open sourced from the start and holds 60 drives for 480TB raw capacity.

A couple of years back when there was a natural disaster in SE Asia, disk drive manufacturing was severely impacted and their cost per GB for disk drives almost doubled overnight. Considering they were buying about 50PB of drives during that period, it was going to cost them ~$1M extra. But you could still purchase drives, in limited quantities, at select discount outlets. So, they convinced all their friends and family to go out and buy consumer drives for them (see their drive farming post[s] for more info).

Howard said that Gen 1 of their Storage Pod hardware used rubber bands to surround and hold disk drives and as a result, it looked like junk. The rubber bands were there to dampen drive rotational vibration because they were inserted vertically. At the time, most if not all of the storage industry used horizontally inserted drives.  Nowadays just about every vendor has a high density, vertically inserted drive tray but we believe Backblaze was the first to use this approach in volume.

Hard drive reliability at Backblaze

These days Backblaze has over 300PB of storage and they have been monitoring their disk drive SMART (error) logs since the start. Sometime during 2013 they decided to keep the log data rather than recycling the space. Since they had the data and were calculating drive reliability anyway, they thought that the industry and consumers would appreciate seeing their reliability info. In December of 2014 Backblaze published their hard drive reliability report using Annualized Failure Rates (AFR) they calculated from the many thousands of disk drives they ran every day. They had not released Q2 2017 hard drive stats yet but their Q1 2017 hard drive stats post has been out now for about 3 months.

Most drive vendors report disk reliability using Mean Time Between Failure (MTBF), which is the average operating time between failures across a population of drives. AFR is an alternative reliability metric, which is the percentage of drives expected to fail in one year’s time. The two are closely related (for MTBF in hours, AFR≈8766/MTBF as a fraction, when MTBF is much longer than a year), but AFR is more useful as it tells users the percent of drives they can expect to fail over the next twelve months.
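The MTBF to AFR conversion can be sketched in a few lines of Python. This is only an illustration: the 1,000,000 hour MTBF is a hypothetical vendor figure, not one from the podcast, and the second function assumes a constant failure rate (an exponential lifetime model), which real drives only approximate.

```python
import math

HOURS_PER_YEAR = 8766  # 365.25 days * 24 hours


def afr_simple(mtbf_hours: float) -> float:
    """Linear approximation: AFR = 8766 / MTBF, returned as a fraction."""
    return HOURS_PER_YEAR / mtbf_hours


def afr_exponential(mtbf_hours: float) -> float:
    """AFR under an assumed constant failure rate (exponential model)."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


if __name__ == "__main__":
    mtbf = 1_000_000  # hypothetical MTBF in hours, for illustration only
    print(f"simple approximation: {afr_simple(mtbf):.4%}")
    print(f"exponential model:    {afr_exponential(mtbf):.4%}")
```

For large MTBF values the two results are nearly identical, which is why the simple division is usually good enough.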

Drive costs matter, but performance matters more

It seemed to the Greybeards that SMR (shingled magnetic recording, read my RoS post for more info) disks would be a great fit for Backblaze’s application. But Andy said their engineering team looked at SMR disks and found the 2nd write (overwrite of a zone) had terrible performance. As Backblaze often has customers who delete files or drop the service, they reuse existing space all the time and SMR disks would hurt performance too much.

We also talked a bit about their current data protection scheme. The new scheme is a Reed Solomon (RS) solution with data written to 17 Storage Pods and parity written to 3 Storage Pods across a 20 Storage Pod group called a Vault. This way they can handle 3 Storage Pod failures across a Vault without losing customer data.
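The capacity cost of a 17 data + 3 parity layout is simple arithmetic, sketched below. The function and field names are made up for illustration, and this only computes overhead and failure tolerance; it does not implement Reed Solomon coding itself.

```python
def vault_stats(data_pods: int, parity_pods: int) -> dict:
    """Summarize an erasure-coded group of data_pods + parity_pods."""
    total = data_pods + parity_pods
    return {
        "pods_per_vault": total,
        # An RS layout survives the loss of as many pods as it has parity pods.
        "tolerated_pod_failures": parity_pods,
        # Fraction of raw capacity that holds customer data.
        "usable_fraction": data_pods / total,
        # Raw capacity needed per unit of customer data.
        "raw_per_usable": total / data_pods,
    }


if __name__ == "__main__":
    stats = vault_stats(17, 3)
    for name, value in stats.items():
        print(f"{name}: {value}")
```

With 17+3, 85% of raw capacity is usable, compared with 33% for three full copies of the data, which would tolerate only two failures.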

Besides disk reliability and performance, Backblaze is also interested in finding the best $/GB for drives they purchase. Andy said nowadays the consumer disk pricing (at Backblaze’s volumes) generally falls between ~$0.04/GB and ~$0.025/GB, with newer generation disks starting out at the higher price and as the manufacturing lines mature, fall to the lower price. Currently, Backblaze is buying 8TB disk drives.

The podcast runs ~45 minutes. Andy was great to talk with and was extremely knowledgeable about disk drives, reliability statistics and “big” storage environments. Listen to the podcast to learn more.

Andy Klein, Director of marketing at Backblaze

Mr. Klein has 25 years of experience in cloud storage, computer security, and network security.

Prior to Backblaze he worked at Symantec, Checkpoint, PGP, and PeopleSoft, as well as startups throughout Silicon Valley.

He has presented at the Federal Trade Commission, RSA, the Commonwealth Club, Interop, and other computer security and cloud storage events.


44: Greybeards talk 3rd platform backup with Tarun Thakur, CEO Datos IO

In this episode, we talk with a new vendor that’s created a new method to back up database information. Our guest for this podcast is Tarun Thakur, CEO of Datos IO. Datos IO was started in 2014 with the express purpose to provide a better way to back up and recover databases in the cloud. They started with noSQL, cloud based databases, such as MongoDB and Cassandra.

The problem with backing up noSQL, and any SQL databases for that matter, is that they are big files and always have some changes in them. So, for most typical backup systems, databases are always flagged as files that have been changed and thus need to be backed up. So each incremental backs up the whole database file, even if only a row has changed. All this results in a tremendous waste of storage.

Deduplication can help, but there are problems deduplicating databases. Many databases use compressed data for storing data, and deduplication that is based on fixed length blocks doesn’t work well for variable length, compressed data (see my RayOnStorage Poor deduplication … post).

Also, variable length deduplication algorithms are usually based on known start of record triggers to understand where a chunk of data can be found. Some databases do not use these start of row, start of table indicators, which throws off variable length deduplication algorithms.

So, with traditional backup systems most databases don’t deduplicate very well and are backed up all the time, resulting in lots of wasted storage space.

How’s Datos IO different?

Datos IO identifies and backs up only changed data, not changed (database) files. Their Datos IO RecoverX product extracts rows from a database, identifies whether this data has changed and then just backs up the changed data.

As more customers create applications for the cloud, backups become a critical component of cloud operations. Most cloud based applications are developed from the start, using noSQL databases.

Traditional backup packages don’t work well with NoSQL, cloud databases, if at all. And data center customers are reluctant to move their expensive, enterprise backup packages to the cloud, even if they could operate effectively there.

Datos IO saw backing up noSQL MongoDB and Cassandra databases in the cloud as a major new opportunity, if it could be done properly.

How does Datos IO back up changed data?

Essentially, RecoverX takes a point-in-time snapshot of the database and then reads each table, row by row, comparing (hashes of) each row’s data with the row data they previously backed up. If a row has changed, the new row’s data is added to the current backup. This provides a semantic deduplication of database data.

Furthermore, because RecoverX is looking at the data rather than files, compressed data works just as well as uncompressed. Datos IO uses standardized database APIs to extract the row data; that way they remain compatible with each release of database software.
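A minimal sketch of row-level change detection by hashing, in Python, might look like the following. This is not RecoverX code: the function names, the use of SHA-256, the JSON canonicalization and the "id" primary key are all assumptions chosen for illustration.

```python
import hashlib
import json


def row_hash(row: dict) -> str:
    """Hash a row's contents in a canonical (key-sorted) JSON form."""
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def incremental_backup(rows, previous_hashes, key="id"):
    """Return (changed_rows, current_hashes) for one snapshot.

    previous_hashes maps primary key -> hash recorded by the last backup.
    Only rows that are new or whose hash differs are returned as changed.
    """
    changed = []
    current_hashes = {}
    for row in rows:
        digest = row_hash(row)
        current_hashes[row[key]] = digest
        if previous_hashes.get(row[key]) != digest:
            changed.append(row)
    return changed, current_hashes


if __name__ == "__main__":
    monday = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]
    tuesday = [{"id": 1, "name": "ann"}, {"id": 2, "name": "rob"}]

    full, hashes = incremental_backup(monday, {})
    delta, hashes = incremental_backup(tuesday, hashes)
    print(len(full), "rows in first backup;", len(delta), "row in second")
```

Because the comparison is on row contents rather than file blocks, it does not matter how the database lays out or compresses the rows on disk. A real product would also need to track deleted rows, which this sketch ignores.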

RecoverX backups reside in S3 objects on the public cloud.

New in RecoverX Version 2

Many customers liked their approach so much they wanted RecoverX to do this for regular SQL databases as well. Major customers are not just developing new applications for the cloud; they also want to do enterprise application development, test and QA in the cloud, and these applications almost always use SQL databases.

So, Datos IO RecoverX Version 2 now supports migration and cloud backups for standardized SQL databases. They are starting with MySQL, with plans to support other SQL databases used by the enterprise. Migration occurs by backing up the datacenter MySQL databases to the cloud and then recovering them to the cloud.

They have also added backup and recovery support for Apache Hadoop, HDFS from Cloudera and HortonWorks. Another change is that Datos IO originally offered only a 3 node solution but with Version 2, it will now support up to a 5 node cluster.

They have also added more backup management and policy support. Admins can now change backup policies to add or subtract tables/databases at any time, on the fly, even while backups are taking place.

The podcast runs ~30 minutes. Tarun has been in the storage industry for a number of years from microcoding storage control logic to managing major industry development organizations. He has an in depth knowledge of storage and backup systems that’s hard to come by and was a pleasure to talk with.  Listen to the podcast to learn more.

Tarun Thakur, CEO, Datos IO

Tarun Thakur is co-founder and CEO of Datos IO, where he leads overall business strategy, day-to-day execution, and product efforts. Prior to founding Datos IO he held senior product and technical roles at several leading technology companies.

Most recently, he worked at Data Domain (EMC), where he led and delivered multiple product releases and new products. Prior to EMC, Thakur was at Veritas (Symantec), where he was responsible for inbound and outbound product management for their clustered NAS appliance products.

Prior to that, he worked at the IBM Almaden Research Center where he focused on distributed systems technology. Thakur started his career at Seagate, where he developed advanced storage architecture products.

Thakur has more than 10 patents granted from USPTO and holds an MBA from Duke University.

40: Greybeards storage industry yearend review podcast

In this episode, the Greybeards discuss the year in storage and naturally we kick off with the consolidation trend in the industry and the big one last year, the DELL-EMC acquisition. How the high margin EMC storage business is going to work in a low margin company like Dell is the subject of much speculation. That, and which of the combined companies’ storage products will make it through the transition, make for interesting discussions. And finally, what exactly Dell’s long term strategy is remains another question.

We next turn to the coming of age of object storage. A couple of years ago, object storage was being introduced to a wider market but few wanted to code to RESTful interfaces. Nowadays, that seems to be less of a concern and the fact that one can have onsite/offsite/cloud based object storage repositories from open source, proprietary solutions and everything in between is making object storage a much more appealing option to enterprise IT.

Finally, we discuss the new Tier 0. What with NVMe SSDs and the emergence of NVMe over Fabric coming out last year, Tier 0 has never looked so promising. You may recall that Tier 0 was hot about 5 years ago with TMS and Violin and others coming out with lightning fast storage IO. But with DELL-EMC DSSD; startups (E8 storage, Mangstor, Apeiron data systems, and others); NVDIMMs, CrossBar, and Everspin coming out with denser offerings; and other SCM (Micron, HPE, IBM, others?) technologies on the horizon, Tier 0 has become red hot again.

Sorry about the occasional airplane noise and other audio anomalies. The podcast runs over 47 minutes. Howard and I could talk for hours on what’s happening in the storage industry. Listen to the podcast to learn more.

Ray Lucchesi is the President and Founder of Silverton Consulting, a prominent blogger, and can be found on twitter @RayLucchesi.

Howard Marks is the Founder and Chief Scientist of DeepStorage, a prominent blogger at Deep Storage Blog and can be found on twitter @DeepStorageNet.


39: Greybeards talk deep storage/archive with Matt Starr, CTO Spectra Logic

In this episode, we talk with Matt Starr (@StarrFiles), CTO of Spectra Logic, the deep storage experts. Matt has been around a long time and Ray’s shared many a meal with Matt as we’re both in NW Denver. Howard has a minor quibble with Spectra Logic over the use of his company’s name (DeepStorage) in their product line but he’s also known Matt for a while now.

The Pearl

Matt and Spectra Logic have a number of customers with multi-PB to over an EB of data repository problems and how to take care of these ever expanding storage stashes is an ongoing concern. One of the solutions Spectra Logic offers is the Black Pearl Deep Storage, which provides an object storage, RESTful interface front end to a storage tiering/archive backend that uses flash, (spin-down) disk, (LTFS) tape (libraries) and the (AWS) cloud as backend storage.

Major portions of the Black Pearl are open sourced and available on GitHub. I see several (DS3-)SDKs for Java, Python, C, and others. Open sourcing the product provides an easy way for client customization. In fact, one customer was using Ceph and they modified their Ceph backup client to send a copy of data off to the Pearl.

We talk a bit about the Black Pearl’s data integrity. It uses a checksum, computed over the object at creation time which is then verified anytime the object is retrieved, copied, moved or migrated and can be validated periodically (scrubbed), even when it has not been touched.
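The checksum-on-create, verify-on-read and periodic scrub pattern can be sketched as below. This is a toy in-memory store, not the Black Pearl API: the class and method names are hypothetical and SHA-256 is an assumed checksum algorithm, since the podcast did not say which one Spectra Logic uses.

```python
import hashlib


class ChecksummedStore:
    """Toy object store that records a checksum when an object is created."""

    def __init__(self):
        self._objects = {}  # name -> (data, checksum recorded at creation)

    @staticmethod
    def _checksum(data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()

    def put(self, name: str, data: bytes) -> None:
        self._objects[name] = (data, self._checksum(data))

    def get(self, name: str) -> bytes:
        """Verify the stored checksum on every retrieval."""
        data, expected = self._objects[name]
        if self._checksum(data) != expected:
            raise IOError(f"checksum mismatch for object {name!r}")
        return data

    def scrub(self) -> list:
        """Re-verify every object, touched or not; return corrupted names."""
        return [
            name
            for name, (data, expected) in self._objects.items()
            if self._checksum(data) != expected
        ]


if __name__ == "__main__":
    store = ChecksummedStore()
    store.put("frame-0001", b"telescope data")
    print(store.get("frame-0001"))
    print("corrupted objects:", store.scrub())
```

The scrub pass matters for archives because most objects are rarely read, so silent corruption would otherwise go unnoticed until a restore.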

Super Computing’s interesting (storage) problems

Matt just returned from SC16 (Super Computing Conference 2016) in Salt Lake City last month. At the conference there were plenty of multi-PB customers that were looking for better storage alternatives.

One customer Matt mentioned was the Square Kilometer Array, the world’s largest radio telescope, which will be transmitting 700TB/hour, over 1EB per year. All that data has to land somewhere and for this quantity (>EB) of data, tape becomes a necessary choice.

Matt likened Spectra’s archive solutions to warehouses vs. factories. For the factory floor, you need responsive (AFA or hybrid) primary storage but for the warehouse, you just want cheap, bulk storage (capacity).

The podcast runs long, over 51 minutes, and reveals a different world from the GreyBeards’ everyday enterprise environments: specifically, customers that have extra large data repositories and how they manage to survive under the data deluge. Matt’s an articulate spokesperson for Spectra Logic and their archive solutions and we could have talked about >EB data repositories for hours. Listen to the podcast to learn more.
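As a quick sanity check on those numbers, the yearly volume implied by a sustained hourly rate can be worked out as follows. The sketch assumes the telescope transmits continuously all year and uses decimal units (1 EB = 1,000,000 TB); neither assumption was stated on the podcast.

```python
TB_PER_EB = 1_000_000   # decimal units: 1 EB = 10^6 TB
HOURS_PER_YEAR = 8766   # 365.25 days * 24 hours


def yearly_volume_eb(tb_per_hour: float) -> float:
    """Exabytes per year for a rate sustained around the clock."""
    return tb_per_hour * HOURS_PER_YEAR / TB_PER_EB


if __name__ == "__main__":
    print(f"{yearly_volume_eb(700):.2f} EB per year at 700 TB/hour")
```

At a full duty cycle that is roughly 6.1EB per year, so "over 1EB per year" holds even if the array transmits only a fraction of the time.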

Matt Starr, CTO, Spectra Logic

Matt Starr’s tenure with Spectra Logic spans 24 years and includes experience in service, hardware design, software development, operating systems, electronic design and management. As CTO, he is responsible for helping define the company’s product vision, and serves as the executive representative for the voice of the market. He leads Spectra’s efforts in high-performance computing, private cloud and other vertical markets.

Matt served as the lead engineering architect for the design and production of Spectra’s TSeries tape library family. Spectra Logic has secured more than 50 patents under Matt’s direction, establishing the company as the innovative technology leader in the data storage industry. He holds a BS in electrical engineering from the University of Colorado at Colorado Springs.