Tag Archives: HDFS

GreyBeards talk object storage with Russ Kennedy, Sr. VP Prod. Strategy & Cust. Solutions Cleversafe

In our 15th podcast we talk object storage with Russ Kennedy, Senior V.P. Product Strategy and Customer Solutions, Cleversafe. Cleversafe is a 10-year-old company selling scale-out object storage solutions with a number of interesting characteristics. Howard and I had the chance to talk with Cleversafe at SFD4 just about a year ago (we suggest you view the video if you want to learn more). But we have both known Russ for a number of years, and Ray has done work for Cleversafe in the past.

We haven’t talked about object storage in the past, so this podcast goes over some foundational information about it. Object storage is starting to become more mainstream and general purpose as more interfaces become available and as the amount of data being stored grows out of sight. Object storage has a flat name space, rich metadata, and relatively rudimentary native storage access methods. But on top of this one can build sophisticated PB-scale storage environments that can handle high data throughput, spread data across multiple sites, and provide highly fault tolerant, highly available storage. Object storage will never replace block-oriented storage for OLTP, but for environments with massive unstructured data repositories, it’s probably the best solution out there today.

Cleversafe has some unique characteristics, namely the ability to split object storage elements over multiple disparate locations and to use erasure coding to maintain data availability in the event of storage, server, or site failures. Some other object storage systems use 2- or 3-way replication to protect against data loss. But Russ makes the apt comment that when you are talking about PBs of data, replication can cause your storage costs to go up quite fast. Someone mentioned that there are Cleversafe customers that have 15-9’s data availability using erasure coding with only 150% of the original capacity. This is significantly more reliability than what could be obtained by dual or even triple redundancy alone. However, I find that the weak link in data reliability discussions such as these is always the software that implements the solution, not the data integrity architecture of the system.
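To make the replication versus erasure coding trade-off concrete, here is a minimal back-of-the-envelope sketch in Python. It is not Cleversafe's actual configuration or algorithm: the 10-of-15 layout is a hypothetical choice picked only because it yields the 150% capacity figure mentioned above, the 1% per-piece failure probability is made up, and the loss estimate assumes pieces fail independently, which real systems do not guarantee. Replication is modeled as the special case where any 1 of n copies suffices.

```python
from math import comb


def raw_capacity_pb(user_data_pb: float, k: int, n: int) -> float:
    """Raw capacity needed when data is cut into n pieces, any k of which rebuild it."""
    return user_data_pb * n / k


def tolerated_failures(k: int, n: int) -> int:
    """Number of pieces that can be lost without losing data."""
    return n - k


def loss_probability(k: int, n: int, p: float) -> float:
    """Chance that more than n - k of n pieces fail, each independently with probability p."""
    return sum(
        comb(n, f) * p**f * (1 - p) ** (n - f)
        for f in range(n - k + 1, n + 1)
    )


# Hypothetical layouts: 3-way replication is "1 of 3"; the erasure-coded
# layout is an assumed "10 of 15", i.e. 150% of the original capacity.
layouts = {"3-way replication": (1, 3), "10-of-15 erasure coding": (10, 15)}

for name, (k, n) in layouts.items():
    print(
        f"{name}: {raw_capacity_pb(1.0, k, n):.1f} PB raw per PB stored, "
        f"survives {tolerated_failures(k, n)} lost pieces, "
        f"loss chance at p=0.01: {loss_probability(k, n, 0.01):.2e}"
    )
```

Under these assumptions the erasure-coded layout uses half the raw capacity of triple replication while surviving more simultaneous failures, which is the gist of Russ's point about costs at PB scale.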

Currently, Cleversafe has many multi-PB installations, some of which span continents and others of which are looking to breach an EB (10**18 bytes of storage) of object data. We asked what these customers look like, and Russ said lots of Accessors® (stateless on- and off-ramps for object data) and a lot more Slicestors® (servers holding the stateful storage).

One of the significant barriers to higher object storage adoption has always been the unique, native access protocols of each object storage system. But these days, it turns out that Amazon’s S3 protocol has become the de facto standard for object storage, and this is helping accelerate object storage adoption. In the podcast we discuss how, historically, de facto standards have been a successful way to introduce new storage access protocols. Cleversafe offers its native RESTful access protocol, S3, and a smattering of others, but you can also use partner solutions if you need standard file access to the object store.

Cleversafe also offers HDFS as another access protocol. With Cleversafe HDFS, Hadoop can access all of its data from the Cleversafe object repository. In addition, you can run Hadoop MapReduce on the Slicestor nodes, if you want. Apparently, moving PBs of data to analyze it and then deleting it is an expensive and very time-consuming proposition, and of course native HDFS uses triple redundancy…

In the podcast, we get into object storage, some of Cleversafe’s advanced functionality, access protocol evolution and more. Listen to the podcast to learn more…

This month’s episode comes in at a little more than 47 minutes.

Russ Kennedy, Sr VP Product Strategy & Customer Solutions

Russ Kennedy brings more than 20 years of experience in the storage industry to Cleversafe as the company’s Senior Vice President of Product Strategy and Customer Solutions. Having rolled up his sleeves working on automated tape libraries, Russ is still attracted to the technological challenges that have shaped the industry and particularly to the innovative approach that Cleversafe delivers to storage.

Russ joined the company initially in 2007 and left in 2009, staying on in an advisory role. In 2011, Russ rejoined the company, seeing a clear opportunity to solve the storage needs surrounding the exponential growth of big data and the unique impact that Cleversafe delivers over traditional systems.

Previously, Russ served as the Vice President of Competitive Intelligence at CA Technologies, and was the Senior Director of Engineering and Product Management at Thin Identity Corporation. Russ has an MBA from the University of Colorado at Denver and a bachelor’s degree in Computer Science from Colorado State University.

GreyBeards talk enterprise Hadoop with Jack Norris, CMO MapR Technologies

Welcome to our third episode. In this podcast we take a step up from the technical depths to talk with Jack Norris, Chief Marketing Officer of MapR Technologies, about their enterprise-class Hadoop distribution, which customers far and wide are finding to be a viable solution to today’s Hadoop problems.

This month’s podcast runs a little over 37 minutes (still trying to get this down, but obviously not succeeding). It seems just when I think we’re close to the end, Howard or I jump in with yet another question that takes us down a different tack.

Back to the episode: the advantages of big data are obvious to many. Hadoop and its ecosystem allow IT to tackle jobs previously impossible to perform, clustering together 1000s of servers and co-locating compute with data. All this provides an immense platform to perform data analytics on a scale never before possible.

Nonetheless, Hadoop has some weaknesses based on its heritage which make it less than it could be, and that’s where MapR Technologies steps in. MapR has taken the Hadoop distribution and yanked out HDFS, replacing it with a completely re-architected and reimplemented enterprise-class, fault tolerant storage service. But MapR is more than just better and faster storage, so listen to our talk with Jack to find out more about it. I am really having trouble getting my head around snapshotting a PB of Hadoop data at a whack …

 


Jack Norris, CMO, MapR

Jack leads worldwide marketing efforts for MapR. Jack has over 20 years of enterprise software marketing and product management experience in defining and delivering analytics, storage, and information delivery products. Jack has also held senior executive roles with EMC, Rainfinity, Brio Technology, SQRIBE, and Bain and Company.