Tag Archives: scale-out storage

GreyBeards talk global storage with Ellen Rubin CEO & Laz Vekiarides CTO, ClearSky Data

In this edition we discuss ClearSky Data’s global storage service with Ellen Rubin (@ellen_rubin), CEO & Co-Founder, and Laz Vekiarides (@lazvek), CTO & Co-Founder of ClearSky Data. Both Ellen and Laz have been around the IT industry for decades, and Laz in particular was deeply involved in the development of EqualLogic storage systems, both at EqualLogic and, after the acquisition, at Dell.

ClearSky Data provides a global primary storage service. Edge devices in the data center supply a read/write cache for iSCSI block storage and connect to a point-of-presence (PoP) appliance in the metro area, which uses cloud storage as its backend storage repository. We get into the technology later, but essentially a customer pays a $/GB/month fee, and the edge hardware, PoP hardware and cloud storage repository are all bundled into that monthly price.

The service is implemented as a two-level caching service. Level one, at the edge, is a cluster of 2U appliances with compute, DRAM and up to 24 SSDs, plus a dedicated metro-Ethernet link to the PoP. Level two, at the PoP, is a dual-server HA configuration with a JBOD holding even more SSDs and a direct link to Amazon Web Services Simple Storage Service (AWS S3).

Data is compressed, deduped (inline or post-process) and encrypted at the edge. Encryption keys are kept by the customer. Data written to the edge is synchronously mirrored to the PoP. When the PoP fills up or the customer’s time interval has elapsed, the data is destaged to Amazon S3, from where it can be replicated to other regions as needed.
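
The write path described above can be sketched as a toy model. This is only an illustration under stated assumptions, not ClearSky Data’s implementation: the class name `TwoTierCache`, its capacity parameters and the plain dictionary standing in for S3 are all hypothetical, capacities are counted in blocks, and dedupe, encryption and the time-based destage trigger are left out.

```python
import zlib
from collections import OrderedDict


class TwoTierCache:
    """Toy model: edge cache -> PoP cache -> object store (a dict stands in for S3)."""

    def __init__(self, edge_capacity, pop_capacity):
        self.edge = OrderedDict()   # level one: edge appliance cache, oldest first
        self.pop = OrderedDict()    # level two: PoP cache, oldest first
        self.cloud = {}             # hypothetical stand-in for the cloud repository
        self.edge_capacity = edge_capacity
        self.pop_capacity = pop_capacity

    def write(self, block_id, data):
        payload = zlib.compress(data)       # data reduction happens at the edge
        self.edge[block_id] = payload
        self.edge.move_to_end(block_id)
        # Synchronous mirror: the PoP holds the block before the write is acknowledged.
        self.pop[block_id] = payload
        self.pop.move_to_end(block_id)
        # Dropping the oldest edge block is safe because the PoP or cloud has a copy.
        while len(self.edge) > self.edge_capacity:
            self.edge.popitem(last=False)
        if len(self.pop) > self.pop_capacity:
            self.destage()
        return True                         # acknowledgement to the host

    def destage(self):
        """Move the oldest PoP blocks to the cloud until the PoP fits again."""
        while len(self.pop) > self.pop_capacity:
            block_id, payload = self.pop.popitem(last=False)
            self.cloud[block_id] = payload

    def read(self, block_id):
        # Check the nearest tier first, so the newest copy always wins.
        for tier in (self.edge, self.pop, self.cloud):
            if block_id in tier:
                return zlib.decompress(tier[block_id])
        raise KeyError(block_id)
```

With `TwoTierCache(edge_capacity=1, pop_capacity=2)`, writing three blocks leaves the newest at the edge, the two newest at the PoP and the oldest in the cloud stand-in, while reads still return every block.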

As part of their service, ClearSky Data also offers disaster recovery. As all customer data resides in S3, it can easily be supplied to another edge appliance (with the proper keys) at any other metro area location connected to one of their PoPs.

ClearSky Data handles eventual consistency (copies of the data residing in cloud storage may not all be identical at a given moment) by versioning the cloud data objects and providing point-in-time consistency.

At the edge, the service can be deployed as a cluster of appliances that work together to support the IO workload, and the PoP is configured to handle whatever IO workload is required in the metro area. Activity at the edge is compute heavy (compressing, deduping and encrypting all incoming data), while the workload at the PoP is bound more by IO bandwidth and networking.

ClearSky Data currently has PoPs in Las Vegas, Philadelphia and Boston, with more on the way in the US. Today, ClearSky Data offers an iSCSI interface but plans to provide FC, NFS and SMB support as well. As we recorded the podcast, ClearSky Data’s service was not quite GA yet, but it was close.

Full Disclosure: Howard has worked for ClearSky in the past.

This month’s edition runs just under 41 minutes and gets into both the business side and the technical side of their service. Ellen provided the business view and Laz handled all the technical questions Howard and I threw at him. We hope you enjoy the podcast.

Ellen Rubin, CEO & Co-Founder ClearSky Data

Ellen Rubin is an experienced entrepreneur with a proven track record in leading strategy, market positioning and go-to-market for fast-growing companies. Most recently she was co-founder of CloudSwitch, a cloud enablement software company that was successfully acquired by Verizon in 2011. At Verizon, Ellen ran the cloud products group and was responsible for the strategy and roadmap for all cloud offerings.

Prior to founding CloudSwitch, Ellen was Vice President of Marketing at Netezza (NYSE: NZ), the pioneer and global leader in data warehouse appliances that power business intelligence and analytics at over 200 enterprises worldwide. As a member of the early management team at Netezza, Ellen helped grow the company to $130 million in revenues and a successful IPO in 2007. Ellen defined and created broad market acceptance of a new category, “data warehouse appliances,” and led market strategy, product marketing, complementary technology relationships and marketing communications.

Prior to Netezza, Ellen founded Manna, an Israeli and Boston-based developer of real-time personalization software. Ellen played a key role in raising over $18 million in venture financing from leading US and Israeli venture capital firms, recruiting the US-based management team and defining product and market strategy. Ellen began her career as a marketing strategy consultant at Booz, Allen & Hamilton, and holds an MBA from Harvard Business School and an undergraduate degree magna cum laude from Harvard College. She speaks regularly at industry events and has been recognized as one of the Top 10 Women in the Cloud by CloudNOW, as a Woman to Watch by Mass High Tech and Rising Star Entrepreneur by the New England Venture Capital Association.

Laz Vekiarides, CTO & Co-Founder ClearSky Data

For over 20 years Laz Vekiarides has served in key technical and leadership roles delivering breakthrough technologies to market. Most recently, he served as the Executive Director of Software Engineering for Dell’s EqualLogic Storage Engineering group, where he led the development of numerous storage innovations and established the EqualLogic product line as a leader in host OS and hypervisor integration.

Laz joined Dell from EqualLogic, which was acquired in early 2008, where he was a member of the core leadership team – playing a key role in the company’s early success as a Senior Engineering Manager and Architect for the PS Series SAN arrays and host tools. Prior to EqualLogic, Laz held senior engineering and management positions at several companies including 3COM and Banyan Systems.

An occasional blogger, Laz frequently speaks at industry conferences, particularly in the areas of virtualization and data storage. He holds several storage technology patents, as well as a BSEE from Northeastern University, and an MSCS from the Worcester Polytechnic Institute.

GreyBeards talk data-aware, scale-out file systems with Peter Godman, Co-founder & CEO, Qumulo

In this podcast we discuss Qumulo’s data-aware, scale-out file system storage with Peter Godman, Co-founder & CEO of Qumulo. Peter has been involved in scale-out storage for a while now, coming from (EMC) Isilon before starting Qumulo. This time, though, he’s adding data-awareness to scale-out storage. Presently, Qumulo is vertically focused on the HPC and media/entertainment market spaces.

Qumulo is the first storage vendor we have heard of that implements their software with Agile development techniques. This allows them to release new functionality to the field every two weeks – talk about rapidly turning out software. We believe this is pretty risky and Ray talks more about Agile development for storage in his Storage on Agile blog post.

But Qumulo mostly sees itself as data-aware NAS, using POSIX metadata and a neat, internally developed database to store, index and retrieve file system metadata. Qumulo’s proprietary database provides much faster responses to metadata queries, such as which files have changed since the last backup, how much storage space a specific owner consumes, or what inclusion/exclusion lists would split the file system into 100 partitions. The database is not relational or otherwise conventional; it is closer to old-school indexed data structures tailored to answer quickly the queries of most interest to customers and their application environments. In a scale-out NAS environment like Qumulo’s, with potentially billions of files, you just don’t have time to walk an inode tree to get these sorts of answers.
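
To make the idea concrete, here is a minimal sketch of an index that is kept current on every metadata update, so that queries never need a tree walk. It is not Qumulo’s database, whose design is proprietary; the class `MetadataIndex`, its method names and its in-memory structures are hypothetical, and modification times are treated as plain numbers.

```python
import bisect
from collections import defaultdict


class MetadataIndex:
    """Toy metadata index: per-owner byte totals plus a list sorted by mtime."""

    def __init__(self):
        self.files = {}                         # path -> (owner, size, mtime)
        self.bytes_by_owner = defaultdict(int)  # owner -> total bytes
        self.by_mtime = []                      # sorted list of (mtime, path)

    def upsert(self, path, owner, size, mtime):
        """Record a new or changed file and adjust both indexes in place."""
        old = self.files.get(path)
        if old is not None:
            old_owner, old_size, old_mtime = old
            self.bytes_by_owner[old_owner] -= old_size
            # list.remove is linear; a real index would use a tree structure.
            self.by_mtime.remove((old_mtime, path))
        self.files[path] = (owner, size, mtime)
        self.bytes_by_owner[owner] += size
        bisect.insort(self.by_mtime, (mtime, path))

    def space_used(self, owner):
        """Bytes consumed by one owner, answered from the running total."""
        return self.bytes_by_owner[owner]

    def changed_since(self, timestamp):
        """Paths modified at or after timestamp, oldest change first."""
        # The one-element tuple sorts before every (timestamp, path) entry.
        start = bisect.bisect_left(self.by_mtime, (timestamp,))
        return [path for _, path in self.by_mtime[start:]]
```

The design choice is to pay a small cost on every write so that the two queries above are answered from the index alone, which is the trade-off the paragraph describes for file systems with billions of files.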

Qumulo supplies both hardware and software to its customers but also offers a software-only, or software-defined storage (SDS), version for those few customers that want it. The SDS version can help potential customers run proofs of concept (PoCs) using VMs.

In their system nodes, Qumulo uses SSDs and disks. SSDs provide a sort of NVM that holds recently written data but can also serve reads. Behind the SSDs are 8TB disks. Today, Qumulo provides mirrored storage that’s widely dispersed across all the storage in the system. With this wide striping of data, the rebuild time for a failed 8TB disk is ~1:20 for a single QC204 (204TB) system node, and it halves every time you double the number of nodes.
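
The scaling rule can be worked through with a few lines of arithmetic. This sketch rests on two assumptions that are not stated in the podcast notes: that “~1:20” means roughly 1 hour 20 minutes (80 minutes), and that “halves when nodes double” extends to rebuild time being inversely proportional to node count. The function name `rebuild_minutes` is hypothetical.

```python
def rebuild_minutes(nodes, single_node_minutes=80.0):
    """Estimated rebuild time for one failed disk in a cluster of `nodes` nodes.

    Assumes ~1:20 means 80 minutes on a single node and that rebuild time
    scales as 1/nodes, which matches "halves every time you double".
    """
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return single_node_minutes / nodes


for n in (1, 2, 4, 8):
    print(f"{n} node(s): about {rebuild_minutes(n):.0f} minutes")
```

Under those assumptions, a four-node cluster would rebuild a failed disk in about a quarter of the single-node time, because every node contributes to the rebuild.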

It was refreshing to hear a startup vendor clearly state what they have and don’t have implemented in their current system. Some startups try to obfuscate or talk around missing functionality, but Peter’s answers were always clear and (sometimes too) concise about what’s in and not in current Qumulo functionality.

This month’s edition runs just over 47 minutes and gets pretty technical in places, but mostly stays at a high functional level. We hope you enjoy the podcast.

Peter Godman, Co-founder & CEO Qumulo

Peter brings 20 years of industry systems experience to Qumulo. As VP of Engineering and CEO of Corensic, Peter brought the world’s first thin-hypervisor based product to market. As Director of Software Engineering at Isilon, he led development of several major releases of Isilon’s award-winning OneFS distributed file system and was inventor of 18 patented technologies. Peter studied math and computer science at MIT.

GreyBeards talk hyper-convergence with Kelly Murphy, Founder & CTO, Gridstore

In our 14th podcast we return to hyper-converged systems and talk with Kelly Murphy, Founder and CTO of Gridstore. Gridstore is a startup supplying hyper-converged systems for Microsoft (Hyper-V) virtualization environments. Howard and I had a chance to talk with Gridstore at SFD4, just about a year ago.

Gridstore has recently added an all-flash version of their hyper-converged systems to their hybrid and pure SATA storage lineup. Howard, in a recent post, wrote about how all-flash hyper-converged systems make as much sense as chocolate-covered pickles. It just so happens that within a month of his writing the post, two hyper-converged vendors announced all-flash nodes. Kelly responds well to Howard’s critique of the idea.

Howard apparently has a mischievous side, as sometime in the past he blew up a Gridstore node to test its fault tolerance. The video went viral and made Howard a YouTube star.

In the podcast, we get into erasure coding, EVO:RAIL pricing vs. cost, and why Hyper-V and not VMware, to name just a few of the topics covered. At the end of the podcast there’s a nice bit about how Gridstore came about, and it involves disposable motherboards? Listen to the podcast to learn more…

This month’s episode comes in at a little more than 48 minutes.

Kelly Murphy, Founder and CTO, Gridstore

A serial entrepreneur with a track record of bringing disruptive technologies to market, Kelly Murphy brings 15 years of CEO experience with disruptive, venture-backed software companies. In 1998, almost a decade before the cloud became popular, Murphy founded Marrakech, the first software company that offered on-demand procurement and supply chain systems to over 30,000 trading partners including some of the world’s largest retailers, consumer food producers, packaging companies and utilities.

After selling Marrakech in 2007, he turned his sights onto what was his largest obstacle in growing his previous business — storage. In 2009, Murphy founded Gridstore — a pioneer of software-defined storage that is set to disrupt the traditional storage industry. Currently, he serves on Gridstore’s Board of Directors and is also the Chief Technology Officer.

Originally from Canada, Murphy obtained his BS in Computer Science from Michigan Technological University, played Division I hockey and was the seventh pick of the New York Islanders in the 1984 entry draft.

GreyBeards year end storage trends wrap-up

Welcome to our fourth episode. In this year-end wrap-up, Howard and Ray talk about three trends that have emerged over the last year or so, which are impacting the storage industry in a big way and will continue to affect it in the years to come.

First up is scale-out storage. Howard and Ray were part of Storage Field Day 4 (SFD4), where we met with at least 5 different vendors of scale-out storage. All the blogger participants were starting to call this the “Scale-Out” field day. It turns out that the compute requirements for storage are starting to increase, for many reasons, not the least of which is the performance of SSDs. This rising compute requirement generates a need for scale-out storage.

Second is software-defined storage. Howard took a stab at defining it, and in our view software-defined storage is delivered as a software-only solution that provides storage and compute services together in one server environment. With a 2U server, one can have a couple of SSDs and a gaggle of HDDs and still only use 4 of the 24 cores to supply storage services, leaving the other 20 for compute. What with VMware’s VSAN and the other software-defined storage players, this is becoming another hot trend this year.

Finally, whither the disk drive? Drive capacity continues to grow with no end in sight, thanks to helium, HAMR and shingled magnetic recording. SSDs are not killing disks off as quickly as we thought, even though SSD costs on a $/GB basis keep coming down. The net effect is that both of us believe disks are going to be around for the near term (5 yrs or so), but we differed on the long-term prospects of disk.

Listen to the podcast to learn more…