
34: GreyBeards talk Copy Data Management with Ash Ashutosh, CEO Actifio

In this episode, we talk with Ash Ashutosh (@ashashutosh), CEO of Actifio, a copy data virtualization company. Howard met up with Ash at TechFieldDay11 (TFD11) a couple of weeks back and wanted another chance to talk with him. Ash seems to have been around forever. The first time we met, I was at a former employer and he was with AppIQ (later purchased by HP). Actifio is populated by a number of industry veterans and, since being founded in 2009, has done really well, with over 1000 customers.

So what’s copy data virtualization (management) anyway? At my former employer, we did an industry study that determined that IT shops (back in the 90’s) were making 9-13 copies of their data. These days, IT is making even more copies of the exact same data.

Data copies proliferate like weeds

Engineers use snapshots for development, QA and validation. Analysts use data copies to better understand what’s going on in their customer-partner interactions, manufacturing activities, industry trends, etc. Finance, marketing, legal, etc. all have similar needs, which just makes the number of data copies grow out of sight. And we haven’t even started to discuss backup.

Ash says things reached a tipping point when server virtualization became the dominant approach to running applications, which led to an ever increasing need for data copies as apps started being developed and run all over the place. Then along came data deduplication, which displaced tape in IT’s backup process, so that backup data (copies) could now reside on disk. Finally, with the advent of disk deduplication, backups no longer had to be in TAR (backup) formats but could now be left in app-native formats. In native formats, any app/developer/analyst could access the backup data copy.

Actifio Copy Data Virtualization

So what is Actifio? It’s essentially a massively distributed object store with a global name space file system on top of it. Application hosts/servers run agents in their environments (VMware, SQL Server, Oracle, etc.) to provide change block tracking and other metadata as to what’s going on with the primary data to be backed up. So when a backup is requested, only changed blocks have to be transferred to Actifio and deduped. From that deduplicated change block backup, a full copy can be synthesized, in native format, for any and all purposes.

With change block tracking, backups become very efficient, and deduplication only has to work on changed data, so that also becomes more effective. Data copying can also be done more effectively since they’re only tracking deduplicated data. If necessary, changed blocks can also be applied to data copies to bring them up to date.
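To make the idea concrete, here is a minimal Python sketch of change block tracking feeding a deduplicated block store, from which a full copy is synthesized. It is not Actifio’s implementation; the names (DedupStore, full_backup, incremental_backup, synthesize), the SHA-256 content addressing and the tiny 4-byte block size are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, chosen only to keep the example readable


class DedupStore:
    """Hypothetical content-addressed block store: identical blocks are kept once."""

    def __init__(self):
        self.blocks = {}  # SHA-256 hex digest -> block bytes

    def put(self, block):
        digest = hashlib.sha256(block).hexdigest()
        self.blocks.setdefault(digest, block)  # store only if not already present
        return digest

    def get(self, digest):
        return self.blocks[digest]


def split_blocks(data):
    """Cut a byte string into fixed-size blocks."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def full_backup(store, data):
    """First backup: every block is sent. Returns a block map (list of digests)."""
    return [store.put(block) for block in split_blocks(data)]


def incremental_backup(store, previous_map, changed):
    """Later backup: only the changed blocks ({block index: bytes}) are sent."""
    new_map = list(previous_map)  # unchanged blocks are referenced, not copied
    for index, block in changed.items():
        new_map[index] = store.put(block)
    return new_map


def synthesize(store, block_map):
    """Rebuild a full copy, in its original format, from a block map."""
    return b"".join(store.get(digest) for digest in block_map)


store = DedupStore()
v1 = b"AAAABBBBCCCCAAAA"
map1 = full_backup(store, v1)
# The (assumed) host agent reports that only block 1 changed since the last backup.
map2 = incremental_backup(store, map1, {1: b"XXXX"})
```

Both synthesize(store, map1) and synthesize(store, map2) return complete images, yet the store holds each distinct block only once, which is why extra copies cost little beyond their block maps.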

With Actifio, one can apply SLA’s to copy data. These SLA’s can take the form of data governance, such that some copies can’t be viewed outside the country, or by certain users. And they can also provide analytics on data copies. Both of these capabilities take copy data to a whole new level.

We didn’t get into all Actifio’s offerings on the podcast, but Actifio CDS is a high availability appliance which runs their object/file system and contains data storage. Actifio also comes as a virtual appliance, Actifio SKY, which runs as a VM under VMware, using anyone’s storage. Actifio supports NFS, SMB/CIFS, FC, and iSCSI access to data copies, depending on the solution chosen. There’s a lot more information on their website.

It sounds a little bit like PrimaryData but focused on data copies rather than data migration and mostly tier 2 data access.

The podcast runs ~46 minutes and covers a lot of ground. I spent most of the time asking Ash to explain Actifio (for Howard, TFD11 filled this in). Howard had some technical difficulties during the call which caused him to go offline, but he then came back on the call. Ash and I never missed him :). Listen to the podcast to learn more.

Ash Ashutosh, CEO Actifio

Ash Ashutosh brings more than 25 years of storage industry and entrepreneurship experience to his role of CEO at Actifio. Ashutosh is a recognized leader and architect in the storage industry where he has spearheaded several major industry initiatives, including iSCSI and storage virtualization, and led the authoring of numerous storage industry standards. Ashutosh was most recently a Partner with Greylock Partners where he focused on making investments in enterprise IT companies. Prior to Greylock, he was Vice President and Chief Technologist for HP Storage.

Ashutosh founded and led AppIQ, a market leader of Storage Resource Management (SRM) solutions, which was acquired by HP in 2005. He was also the founder of Serano Systems, a Fibre Channel controller solutions provider, acquired by Vitesse Semiconductor in 1999. Prior to Serano, Ashutosh was Senior Vice President at StorageNetworks, the industry’s first Storage Service Provider. He previously worked as an architect and engineer at LSI and Intergraph.

GreyBeards talk HPC storage with Molly Rector, CMO & EVP, DDN

In our 27th episode we talk with Molly Rector (@MollyRector), CMO & EVP of Product Management/Worldwide Marketing for DDN. Howard and I have known Molly since her days at Spectra Logic. Molly is also on the BoD of SNIA and Active Archive Alliance (AAA), so she’s very active in the storage industry, on multiple dimensions, and a very busy lady.

We (or maybe just I) didn’t know that DDN has a 20 year history in storage and in servicing high performance computing (HPC) customers. It turns out that more enterprise IT organizations are starting to take on workloads that look like HPC activity.

In HPC there are 1000s of compute cores that are crunching on PB of data. For Oil&Gas companies, it’s seismic and wellhead analysis; with bio-informatics it’s genomic/proteomic analysis; and with financial services, it’s economic modeling/backtesting trading strategies. For today’s enterprises such as retailers, it’s customer activity analytics; for manufacturers, it’s machine sensor/log analysis;  and for banks/financial institutions, it’s credit/financial viability assessments. Enterprise IT might not have 1000s of cores at their disposal just yet, but it’s not far off. Molly thinks one way to help enterprise IT is to provide a SuperComputer as a service (ScaaS?) offering, where top 10 supercomputers can be rented out by the hour, sort of like a supercomputing compute/data cloud.

We start early talking about DDN WOS, their object store, which can handle archive to cloud or backend tape libraries. Later we discuss DDN ExaScaler and GridScaler, which are NAS appliances for Lustre and massively scale-out parallel file system storage, respectively.

Another key supercomputing storage requirement is predictable performance. Aside from sophisticated QoS offerings across their products, DDN also offers the IME solution, a bump-in-the-cable caching system that can optimize large and small file IO activity for backend DDN NAS scalers. DDN IME is stateless and can be removed from the data path while still allowing IT access to all their data.

While we were discussing DDN storage interfaces, Molly mentioned they were working on an Omni Path Fabric.  Intel’s new Omni Path Fabric is intended to replace rack scale PCIe networks for HPC.

This month’s edition is not too technical and runs just over 45 minutes. We only got to SNIA and AAA at the tail end and just for a minute or two. Molly’s always fun to talk to, with enough technical smarts to keep Howard and me at bay, at least for a while :). Listen to the podcast to learn more.

Molly Rector, CMO and EVP Product Management & Worldwide Marketing, DDN

With 15 years of experience working in the HPC, Media and Entertainment, and Enterprise IT industries running global marketing programs, Molly Rector serves as DDN’s Chief Marketing Officer (CMO) responsible for product management and worldwide marketing. Rector’s role includes providing customer and market input into the company’s product roadmap, raising the Corporate brand visibility outside traditional markets, expanding the partner ecosystem and driving the end-to-end customer experience from definition to delivery.

Rector is a founding member and currently serves as Chairman of the Board for the Active Archive Alliance. She is also the Storage Networking Industry Association’s (SNIA) Vice Chairman of the Board and the Analytics and Big Data committee Vice Chairman. Prior to joining DDN, Rector was responsible for product management and worldwide marketing as CMO at Spectra Logic. During her tenure at Spectra Logic, the company grew revenues consistently by double digits year-over-year, while also maintaining profitability. Rector holds certifications as CommVault Certified System Administrator; Veritas Certified Data Protection Administrator; and Oracle Certified Enterprise DBA: Backup and Recovery. She earned a Bachelor of Science degree in biology and chemistry.

GreyBeards talk data-aware, scale-out file systems with Peter Godman, Co-founder & CEO, Qumulo

In this podcast we discuss Qumulo’s data-aware, scale-out file system storage with Peter Godman, Co-founder & CEO of Qumulo. Peter has been involved in scale-out storage for a while now, coming from (EMC) Isilon before starting Qumulo. Although, this time he’s adding data-awareness to scale-out storage. Presently, Qumulo is vertically focused on the HPC and media/entertainment market spaces.

Qumulo is the first storage vendor we have heard of that implements their software with Agile development techniques. This allows them to release new functionality to the field every two weeks – talk about rapidly turning out software. We believe this is pretty risky and Ray talks more about Agile development for storage in his Storage on Agile blog post.

But Qumulo mostly sees itself as data-aware NAS, using POSIX metadata and a neat, internally designed/developed database to store, index and retrieve file system metadata. Qumulo’s proprietary database provides much faster response to queries on metadata, such as what files have changed since the last backup, how much storage space is consumed by a specific owner, what inclusion/exclusion lists would split the file system into 100 partitions, etc. The database is not a relational or conventional database, but almost old-school, indexed data structures tailored to providing quick answers to the queries of most interest to customers and their application environment. In a scale-out NAS environment like Qumulo’s, with potentially billions of files, you just don’t have time to walk an inode tree to get these sorts of answers anymore.
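As a rough illustration of why purpose-built indexes beat a tree walk, the Python sketch below keeps a per-owner byte total and a modification-time index current on every metadata update, so two of the queries mentioned above are answered without visiting every file. This is not Qumulo’s database; the MetadataIndex class, its fields and its method names are hypothetical.

```python
import bisect
from collections import defaultdict


class MetadataIndex:
    """Hypothetical in-memory index, maintained on every file create/update."""

    def __init__(self):
        self.files = {}                         # path -> (owner, size, mtime)
        self.bytes_by_owner = defaultdict(int)  # owner -> running total of bytes
        self.by_mtime = []                      # sorted list of (mtime, path)

    def upsert(self, path, owner, size, mtime):
        """Record a new file, or replace the metadata of an existing one."""
        old = self.files.get(path)
        if old is not None:
            old_owner, old_size, old_mtime = old
            self.bytes_by_owner[old_owner] -= old_size
            self.by_mtime.remove((old_mtime, path))
        self.files[path] = (owner, size, mtime)
        self.bytes_by_owner[owner] += size
        bisect.insort(self.by_mtime, (mtime, path))

    def space_used_by(self, owner):
        """Answered from the running total, not by summing over files."""
        return self.bytes_by_owner[owner]

    def changed_since(self, timestamp):
        """Paths modified at or after timestamp, oldest first."""
        # (timestamp,) sorts before every (timestamp, path) entry, so this
        # binary search lands on the first entry with mtime >= timestamp.
        start = bisect.bisect_left(self.by_mtime, (timestamp,))
        return [path for _, path in self.by_mtime[start:]]


index = MetadataIndex()
index.upsert("/a", "alice", 100, 10)
index.upsert("/b", "bob", 50, 20)
index.upsert("/c", "alice", 25, 30)
index.upsert("/a", "alice", 40, 40)  # /a is rewritten, smaller, at time 40
```

The trade-off is a little extra work on every write in exchange for queries whose cost depends on the size of the answer rather than on the number of files in the system.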

Qumulo supplies both hardware and software to its customers but also offers a software-only or software defined storage (SDS) version for those few customers that want it. SDS versions can help potential customers perform  proofs of concept (PoCs) using VMs.

In their system nodes, Qumulo uses SSDs and disks. SSDs provide a sort of NVM that holds recently written data but can also be used for reading data. Behind the SSDs are 8TB disks. Today, Qumulo provides mirrored storage that’s widely spread or dispersed across all the storage in their system. With this wide-striping of data, the rebuild time for an (8TB) disk failure is ~1:20 for a single QC204 (204TB) system node and halves every time you double the number of nodes.
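The scaling claim amounts to rebuild time being inversely proportional to node count. The short Python sketch below works that arithmetic through, assuming "~1:20" means roughly 1 hour 20 minutes (80 minutes) and that the halving continues indefinitely. Both are assumptions, and the function name is hypothetical.

```python
def estimated_rebuild_minutes(nodes, single_node_minutes=80.0):
    """Estimated rebuild time if it halves each time the node count doubles.

    single_node_minutes=80 assumes "~1:20" means 1 hour 20 minutes.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return single_node_minutes / nodes


# Estimated rebuild time for 1, 2, 4 and 8 nodes.
estimates = {n: estimated_rebuild_minutes(n) for n in (1, 2, 4, 8)}
```

Under these assumptions an 8-node system would rebuild a failed disk in about 10 minutes. Real systems will flatten out at some point, as network and drive limits take over.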

It was refreshing to hear a startup vendor clearly answer what they have and don’t have implemented in their current system. Some startups try to obfuscate or talk around the lack of functionality, but Peter’s answers were always clear and (sometimes too) concise on what’s in and not in current Qumulo functionality.

This month’s edition runs just over 47 minutes and gets pretty technical in places, but mostly stays at a high functional level. We hope you enjoy the podcast.

Peter Godman, Co-founder & CEO Qumulo

Peter brings 20 years of industry systems experience to Qumulo. As VP of Engineering and CEO of Corensic, Peter brought the world’s first thin-hypervisor based product to market. As Director of Software Engineering at Isilon, he led development of several major releases of Isilon’s award-winning OneFS distributed file system and was inventor of 18 patented technologies. Peter studied math and computer science at MIT.

GreyBeards talk data-aware storage with Paula Long & Dave Siles, CEO&CTO DataGravity

In this podcast we discuss data-aware storage with Paula Long, CEO/Co-Founder and Dave Siles, CTO of DataGravity. Paula comes from EqualLogic and Dave from Veeam so they both have a lot of history in and around the storage industry, almost qualifying them as grey hairs :/

Data-aware storage is a new paradigm in storage that combines primary (block and file) storage, file and data analytics and text indexing. Just to top it off, they also add data protection to a separate storage partition. Their system is VM aware and is able to crack open VMDKs to find out what’s inside. With all their file and data analytics, DataGravity is  able to supply data leakage detection and a much better understanding of what data is actually being stored on the system.

Paula believes that, in 5 years or so, this new approach to storage will become common. Their system also supports targeted data deduplication and compression, and provides self-service restore and a “google-like” rich search experience for their data-aware storage.

DataGravity was designed for the mid-market but is being pulled up market by workgroups, as department level storage for F500 companies. They find that once installed, they usually uncover some exposure and then other departments take notice. They’re also discovering an awful lot of dormant data, and moving this off of primary storage can save quite a lot.

DataGravity has a 2U controller with a 24-disk drive shelf but has SSDs inside the controllers. They use spinning disks for the majority of the data storage.

DataGravity has an interesting twist on the active-passive, standard dual controller/HA approach to storage, which you will have to listen to the podcast to truly understand.

This month’s episode runs a bit over 44 minutes and wanders over a lot of high ground but dips into technical waters occasionally.

Paula Long, CEO & Co-founder, DataGravity

Paula brings over 30 years of experience to DataGravity in delivering meaningful and game changing high-tech innovation. Prior to DataGravity, Paula served as vice president of product development at Heartland Robotics. In 2001 Paula co-founded storage provider EqualLogic, resetting the bar on how customers managed and purchased data storage. EqualLogic was acquired by Dell for $1.4 billion in 2008 and Paula remained at Dell as vice president of storage until 2010. Previous to EqualLogic, she served in several engineering management positions at Allaire Corporation and oversaw all aspects of the ClusterCATS product line while at Bright Tiger Technologies.

Her executive and technical leadership has been extensively recognized, including the New Hampshire High Tech Council Entrepreneur of the Year award and the Ernst & Young 2008 Northeast Regional “Entrepreneur of the Year” award, for which she was also a national finalist. Her technical awards span systems designs and enterprise software including the EqualLogic and ClusterCATS product lines. She is a graduate of Westfield State College.

Paula is also active in the startup community. Outside of high tech, she works with charities creating equality for professional women and girls, as well as with organizations enabling literacy for all children, regardless of economic status.

Dave Siles, CTO DataGravity

With more than 20 years in operations and leadership roles with growth companies, David serves as chief technology officer of DataGravity, responsible for leading the technical strategic vision for the company while guiding our product management teams and research and development efforts to better serve the needs of organizations looking for more from their data storage.

Prior to becoming CTO, David served as vice president of worldwide field operations at DataGravity. Previously, David was a member of the senior leadership team at Veeam Software, a leading data protection software provider for virtualized and cloud environments.

David also served as CTO and VP of professional services for systems integrator Hipskind TSG. He also served as CTO for Kane County, Ill., and has held technology leadership roles with various organizations. A graduate of DeVry University, he is a frequent speaker at top tier technology shows and is a recognized expert in virtualization.

 

GreyBeards talk edge-core filers with Ron Bianchini, President & CEO Avere Systems

Welcome to our 13th podcast where we talk edge filers with Ron Bianchini, President and CEO of Avere Systems. Avere has been around the industry for quite a while now and has always provided superior performance acceleration for backend NAS filers. But with their latest version, they now offer that same sort of performance acceleration for public cloud and object storage systems as well.

Ron has had quite a long history in the IT world. He was the CEO of Spinnaker Networks prior to its acquisition by NetApp, where its technology was used as the progenitor for FAS Cluster Mode services. He also worked for another startup and was a university professor before that, making him the second former professor on our podcast.

Avere Systems started out as an attempt to take NAS in another direction, this time performance at the edge with capacity filers at the core. That promise is now being taken to object store and public cloud storage as well.

This month’s episode comes in at a little more than 44 minutes.

We start our discussion with a short history of Avere Systems. It was originally targeted to offer an edge-core NAS solution where the Avere appliance supplied performance optimization and the NAS backend storage offered capacity optimization.

You may recall that there were a lot of edge-core NAS solutions on the market at one time, but Avere has outlasted them all. One secret to Avere’s success was that they constructed a virtualization layer with their own file system on top of backend NAS storage. This allowed them to offer unique capabilities such as global name space, non-disruptive migration and disaster recovery, but it also ultimately made it much easier for them to offer the same functionality for object storage and public cloud services.

Ron can talk NAS performance with the best of them and he shows us how Avere performs so well, even with relatively slow object and public cloud storage behind them. The Greybeards were duly impressed with Avere’s last quarter SPECsfs submissions (see Ray’s June SPECsfs2008 dispatch for more) with object storage (Cleversafe & Amplidata) and public cloud (Amazon’s Flash Storage) backends. Listen to the podcast to learn more.


Ron Bianchini, Jr. President & CEO Avere Systems

As president and chief executive officer of Avere Systems, co-founder Ron Bianchini has a long record of accomplishment in building and leading successful companies that deliver breakthrough technologies. Prior to Avere, Ron was a senior vice president at NetApp, where he served as the leader of the NetApp Pittsburgh Technology Center. Before NetApp, he was CEO and co-founder of Spinnaker Networks, which developed the Storage Grid architecture acquired by NetApp. Ron also served as vice president of product architecture of FORE Systems, where he was responsible for ATM products. Previously, he co-founded Scalable Networks [acquired by FORE], which designed and implemented a large-scale Gigabit Ethernet switch, and earlier in his career, he was a professor at Carnegie Mellon University.

Ron received an S.B. degree in Electrical Engineering from the Massachusetts Institute of Technology and M.S. and Ph.D. degrees in Electrical and Computer Engineering from Carnegie Mellon University. He also holds numerous patents in fault-tolerant distributed systems and high-speed network design and has published extensively in technical journals.