76: GreyBeards talk backup content, GDPR and cyber security with Jim McGann, VP Mkt & Bus. Dev., Index Engines

In this episode we talk indexing old backups, GDPR and CyberSense, a new approach to cyber security, with Jim McGann, VP Marketing and Business Development, Index Engines.

Jim’s an old industry hand who’s been around backups, e-discovery and security almost since the beginning. Index Engines’ solution to cyber security, CyberSense, is also offered by Dell EMC, and Jim presented at a TFDx event this past October hosted by Dell EMC (See Dell EMC-Index Engines TFDx session on CyberSense).

It seems Howard’s been using Index Engines for a long time but keeping them a trade secret. In one of his prior consulting engagements he used Index Engines technology to locate a multi-million dollar email for one customer.

Universal backup data scan and indexing tool

Index Engines has a long history as a tool to index and understand old backup tapes and files. Index Engines did all the work to understand the format and content of NetBackup, Dell EMC Networker, IBM TSM (now Spectrum Protect), Microsoft Exchange backups, database vendor backups and other backup files. Using this knowledge they are able to read just about anyone’s backup tapes or files and tell customers what’s on them.

But it’s not just a backup catalog tool. Index Engines can also crack open backup files and index the content of the data. In this way customers can search backup data with Google-like search terms. This is used day in and day out for e-discovery and the occasional consulting engagement.
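To make content indexing a bit more concrete, here is a minimal Python sketch of an inverted index with keyword search. It is not Index Engines' engine, which is proprietary. The file names, file contents and the function names build_index and search are all made up for illustration, and the sketch assumes the file contents have already been extracted from the backup image.

```python
import re
from collections import defaultdict

def build_index(documents):
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index

def search(index, query):
    """Return names of documents containing every word in the query (AND search)."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical file contents recovered from a backup image.
recovered = {
    "mail/0001.eml": "Contract renewal terms for the Acme account",
    "docs/notes.txt": "Meeting notes about the renewal schedule",
    "docs/plan.txt": "Acme project plan and contract milestones",
}
index = build_index(recovered)
print(sorted(search(index, "acme contract")))  # ['docs/plan.txt', 'mail/0001.eml']
```

A production engine would also index metadata (owner, dates, backup set) and handle many file formats, but the lookup idea is the same.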

Index Engines technology is also useful for companies complying with GDPR and similar legislation. When any user can request that information about them be purged from corporate data, being able to scan, index and search backups is a great feature.

In addition to backup file scanning, Index Engines has a multi-PB indexing solution which can be used to perform the same Google-like searching on a data center’s file storage. Once again, Index Engines has done the development work to implement their own highly parallelized metadata and content search engine, demonstrably faster than any open source (Lucene) search solution available today.

CyberSense

All that’s old news. What Jim presented at a TFDx event was their new CyberSense solution. CyberSense was designed to help organizations detect and head off ransomware, cyber assaults and other data corruption attacks.

CyberSense computes a data entropy (randomness) score as well as ~39 other characteristics for every file in backups or online in a customer’s data center. It then uses that information to detect when a cyber attack is taking place and determine the extent of the corruption. With current and previous entropy and other characteristics on every data file, CyberSense can flag files that look like they have been corrupted and warn customers that a cyber attack is in process before it corrupts all of a customer’s data files.
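For readers curious what an entropy score looks like, here is a minimal Python sketch that computes Shannon entropy in bits per byte and compares a file's current score with its previous one. This is not CyberSense's algorithm, which we didn't get into. The function names and both thresholds (a 7.5 bits/byte floor and a 2.0 bit jump) are assumptions picked for illustration.

```python
import math
import os
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 (constant data) up to 8.0 (uniform)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def looks_encrypted(previous: float, current: float,
                    jump: float = 2.0, floor: float = 7.5) -> bool:
    """Flag a file whose entropy is now near-random and rose sharply since the last scan.

    Both thresholds are illustrative assumptions, not vendor values.
    """
    return current >= floor and (current - previous) >= jump

# A text file before an attack, and random bytes standing in for its encrypted form.
before = b"The quick brown fox jumps over the lazy dog. " * 200
after = os.urandom(len(before))

prev_score = shannon_entropy(before)
curr_score = shannon_entropy(after)
print(f"previous={prev_score:.2f} current={curr_score:.2f} "
      f"flagged={looks_encrypted(prev_score, curr_score)}")
```

Comparing against the previous score matters because some legitimate files (compressed archives, video) are high entropy to begin with, so a high score alone would produce false alarms.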

One typical corruption is to change file extensions. CyberSense cracks open file contents and can determine if it’s an office or other standard document type and then check to see if its extension matches its content. Another common corruption is to encrypt files. Such files necessarily have an increased entropy and can be automatically detected by CyberSense.
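The extension check can be sketched in Python by comparing a file's leading "magic" bytes against its extension. The signatures below are the well known ones for PDF, ZIP-based Office documents, PNG and JPEG. The table is deliberately tiny, the function names are hypothetical, and this says nothing about how CyberSense actually inspects content.

```python
import os

# Leading byte signatures and the extensions they are consistent with.
# Modern Office files (.docx, .xlsx, .pptx) are ZIP containers.
SIGNATURES = {
    b"%PDF": {".pdf"},
    b"PK\x03\x04": {".zip", ".docx", ".xlsx", ".pptx"},
    b"\x89PNG\r\n\x1a\n": {".png"},
    b"\xff\xd8\xff": {".jpg", ".jpeg"},
}

def detect_extensions(header: bytes):
    """Return the extensions consistent with the leading bytes, or None if unknown."""
    for magic, extensions in SIGNATURES.items():
        if header.startswith(magic):
            return extensions
    return None

def extension_mismatch(filename: str, header: bytes) -> bool:
    """True when the content type is recognized but the extension does not match it."""
    expected = detect_extensions(header)
    if expected is None:
        return False  # unknown content type: nothing to compare against
    extension = os.path.splitext(filename)[1].lower()
    return extension not in expected

print(extension_mismatch("report.pdf", b"%PDF-1.7\n"))         # False
print(extension_mismatch("report.pdf.locked", b"%PDF-1.7\n"))  # True
```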

When CyberSense has detected some anomaly, it can determine who last accessed the file and what executable was used to modify it. In this way CyberSense can be used to provide forensics on the who, what, when and where of a corrupted file, so that IT can shut the corruption activity down before it’s gone too far.

CyberSense can be configured to periodically scan files online as well as just examine backup data (offline) during or after it’s backed up. Their partnership with Dell EMC is to do just that with Data Domain and Dell EMC backup software.

Index Engines’ proprietary indexing functionality has been optimized for parallel execution and for reduced index size. Jim mentioned that their content indexes average about 5% of the full storage capacity and that they can index content at a TB/hour.
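As a back-of-the-envelope check on those two numbers, here is a small Python sketch. The 200TB capacity is a made-up example. We also assume the TB/hour rate applies per indexing stream and scales linearly with the number of streams, which Jim did not confirm.

```python
def index_size_tb(capacity_tb: float, index_fraction: float = 0.05) -> float:
    """Estimated content index size, assuming the index averages ~5% of capacity."""
    return capacity_tb * index_fraction

def indexing_hours(capacity_tb: float, tb_per_hour: float = 1.0, streams: int = 1) -> float:
    """Estimated time to index capacity_tb, assuming each stream sustains tb_per_hour."""
    return capacity_tb / (tb_per_hour * streams)

capacity_tb = 200  # hypothetical data center file storage
print(f"index size: ~{index_size_tb(capacity_tb):.0f} TB")
print(f"one stream: ~{indexing_hours(capacity_tb):.0f} hours")
print(f"four streams: ~{indexing_hours(capacity_tb, streams=4):.0f} hours")
```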

Index Engines is a software only offering, but they also offer services for customers that want a turnkey solution. Their software is also available through a number of partners, Dell EMC being one.

The podcast runs ~44 minutes. Jim’s been around backups, storage and indexing forever, and he seems to have good knowledge of data compliance regimes and the current security threats facing customers across the world today. Listen to our podcast to learn more.

Jim McGann, VP Marketing and Business Development, Index Engines

Jim has extensive experience with eDiscovery and Information Management in the Fortune 2000 sector. Before joining Index Engines in 2004, he worked for leading software firms, including Information Builders and the French based engineering software provider Dassault Systemes.

In recent years he has worked for technology based start-ups that provided financial services and information management solutions. Prior to Index Engines, Jim was responsible for the business development of Scopeware at Mirror Worlds Technologies, the knowledge management software firm founded by Dr. David Gelernter of Yale University. Jim graduated from Villanova University with a degree in Mechanical Engineering.

Jim is a frequent writer and speaker on the topics of big data, backup tape remediation, electronic discovery and records management.

71: GreyBeards talk DP appliances with Sharad Rastogi, Sr. VP & Ranga Rajagopalan, Sr. Dir., Dell EMC DPD

Sponsored by:

In this episode we talk data protection appliances with Sharad Rastogi (@sharadrastogi), Senior VP Product Management, and Ranga Rajagopalan, Senior Director, Data Protection Appliances Product Management, Dell EMC Data Protection Division (DPD). Howard attended Ranga’s TFDx session (see TFDx videos here) on their new Integrated Data Protection Appliance (IDPA) the DP4400 at VMworld last month in Las Vegas.

This is the first time we have had anyone from Dell EMC DPD on our show. Ranga and Sharad were both knowledgeable about the data protection industry, industry trends and talked at length about the new IDPA DP4400.

Dell EMC IDPA DP4400

The IDPA DP4400 is the latest member of the Dell EMC IDPA product family.  All IDPA products package secondary storage, backup software and other solutions/services to make for a quick and easy deployment of a complete backup solution in your data center.  IDPA solutions include protection storage and software, search and analytics, system management — plus cloud readiness with cloud disaster recovery and long-term retention — in one 2U appliance. So there’s no need to buy any other secondary storage or backup software to provide data protection for your data center.

The IDPA DP4400 grows in place from 24 to 96TB of usable capacity. At an average 55:1 dedupe ratio, it could support over 5PB of backup storage on the appliance. The full capacity always ships with the appliance. Customers can select how much or little they get to use by just purchasing a software license key.

In addition to the on appliance capacity, the IDPA DP4400 can use up to 192TB of cloud storage for a native Cloud tier. Cloud tiering takes place after a specified, appliance residency interval, after which backup data is moved from the appliance to the cloud. IDPA Cloud Tier works with AWS, Azure, IBM Cloud Object Storage, Ceph and Dell EMC Elastic Cloud Storage. With the 192TB of cloud and 96TB of on appliance usable storage, together with a 55:1 dedupe ratio, a single IDPA DP4400 can support over 14PB of logical backup data.
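The logical capacity figures follow from simple multiplication, which this short Python sketch works through. It uses decimal units (1PB = 1000TB) and treats the 55:1 average dedupe ratio as a given. Actual ratios depend on the data being backed up, and the function name is ours.

```python
def logical_capacity_pb(physical_tb: float, dedupe_ratio: float) -> float:
    """Logical (pre-dedupe) backup data, in PB, that fits in physical_tb at the given ratio."""
    return physical_tb * dedupe_ratio / 1000  # decimal units: 1 PB = 1000 TB

APPLIANCE_TB = 96    # maximum usable capacity on the appliance
CLOUD_TIER_TB = 192  # maximum cloud tier capacity
DEDUPE_RATIO = 55    # average ratio quoted above

on_appliance = logical_capacity_pb(APPLIANCE_TB, DEDUPE_RATIO)
with_cloud = logical_capacity_pb(APPLIANCE_TB + CLOUD_TIER_TB, DEDUPE_RATIO)
print(f"appliance only: {on_appliance:.2f} PB")   # 5.28 PB
print(f"appliance + cloud: {with_cloud:.2f} PB")  # 15.84 PB
```

Both results are consistent with the "over 5PB" and "over 14PB" figures quoted above.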

Furthermore, IDPA supports Cloud DR. With Cloud DR, backed up VMs are copied to the public cloud (AWS) on a scheduled basis. In case of a disaster, there is an orchestrated failover with the VMs spun up in the cloud. The cloud workloads can then easily be failed back on site once the disaster is resolved.

The IDPA DP4400 also comes with native DD Boost™ support. This means Oracle, SQL Server and other applications that already support DD Boost can also use the appliance to back up and restore their application data. DD Boost customers can make use of native application services such as Oracle RAC to manage their database backups/restores with the appliance.

Dell EMC also offers their Future-Proof Loyalty Program guarantees for the IDPA DP4400. These include a Data Protection Deduplication guarantee, under which Dell EMC will guarantee the appliance dedupe ratio for backup data if best practices are followed. Additional guarantees from the Dell EMC Future-Proof Program for IDPA DP4400 include a 3-Year Satisfaction guarantee, a Clear Price guarantee, which guarantees predictable pricing for future maintenance and service, and a Cloud Enabled guarantee. These are just a few of the Dell EMC guarantees provided for the IDPA DP4400.

The podcast runs ~16 minutes. Ranga and Sharad were both very knowledgeable on the DP industry, DP trends and the new IDPA DP4400. Listen to the podcast to learn more.

Sharad Rastogi, Senior V.P. Product Management, Dell EMC Data Protection Division

Sharad Rastogi is a global technology executive with a strong track record of transforming businesses and increasing shareholder value across a broad range of leadership roles, in both high growth and turnaround situations.

As SVP of Product Management at Dell EMC, Sharad is responsible for all products for the $3B Data Protection business.  He oversees a diverse portfolio, and is currently developing next generation integrated appliances, software and cloud based data protection solutions. In the past, Sharad has held senior roles in general management, products, marketing, corporate development and strategy at leading companies including Cisco, JDSU, Avid and Bain.

Sharad holds an MBA from the Wharton School at the University of Pennsylvania, an MS in engineering from Boston University and a B.Tech in engineering from the Indian Institute of Technology in New Delhi.

He is an advisor to Boston University, College of Engineering, and a Board member at Edventure More – a non-profit providing holistic education. Sharad is a world traveler, always seeking new adventures and experiences.

Rangaraaj (Ranga) Rajagopalan, Senior Director Data Protection Appliances Product Management, Dell EMC Data Protection Division

Ranga Rajagopalan is Senior Director of Product Management for Data Protection Appliances at Dell EMC. Ranga is responsible for driving the product strategy and vision for Data Protection Appliances, setting and delivering the multi-year roadmap for Data Domain and Integrated Data Protection Appliance.

Ranga has 15 years of experience in data protection, business continuity and disaster recovery, in both India and USA. Prior to Dell EMC, Ranga managed the Veritas Cluster Server and Veritas Resiliency Platform products for Veritas Technologies.

64: GreyBeards discuss cloud data protection with Chris Wahl, Chief Technologist, Rubrik

Sponsored by:

In this episode we talk with Chris Wahl, Chief Technologist, Rubrik. This is our second time having Chris on our show. The last time was about three years ago (see our Chris on agentless backup podcast). Talking with Chris again was great and there’s been plenty of news since we last spoke with him.

Rubrik now has three products: the Rubrik Cloud Data Protection suite (onprem, virtual or in the [AWS & Azure] cloud); Rubrik Datos IO (a recent acquisition) for NoSQL databases with semantic dedupe; and Rubrik Polaris GPS, a SaaS monitoring/trending/management solution for your data protection environment. Polaris GPS monitors and watches data protection trends for you, to ensure all your data protection SLAs are being met. But we didn’t spend much time on Polaris.

Datos IO was designed from the start to back up new databases based on NoSQL technologies and provides a semantic based deduplication capability that’s unique in the industry. We talked with Datos IO before their acquisition by Rubrik (see our podcast with Tarun on 3rd generation data protection).

Cloud Data Protection

As for their Cloud Data Protection suite, one major differentiator is that all their functionality is available via RESTful APIs. Their GUI is completely built off their APIs. This means any customer could use their set of APIs to integrate Rubrik data protection with any application/workload on the planet.

Chris mentioned that Rubrik has 40+ specific application/system integrations that provide “strictly consistent” data protection. We assume this means application consistent backup and recovery, but that it also extends beyond applications alone.

With the Cloud Data Protection solution, data resides on the appliance for only a short (customer specifiable) period and then is migrated off to cloud or onprem object storage. The object storage could be any onprem S3 compatible storage or object storage in the AWS or Azure cloud. The migration is completely automatic. The data migrated to object storage is self-defining, meaning that metadata and data are all available in one spot and can be restored anywhere there’s a Rubrik Cloud Data Protection suite operating.
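The residency-then-migrate policy can be pictured with a short Python sketch. Everything here is hypothetical: the Backup class, the backup names, the 14 day residency period and the function name are ours for illustration, and this is not Rubrik's implementation or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Backup:
    name: str
    created: datetime
    location: str = "appliance"

def due_for_migration(backups, residency: timedelta, now: datetime):
    """Return backups still on the appliance whose age meets or exceeds the residency period."""
    return [b for b in backups
            if b.location == "appliance" and now - b.created >= residency]

now = datetime(2018, 7, 1)
backups = [
    Backup("vm-app01-0601", datetime(2018, 6, 1)),    # 30 days old
    Backup("vm-app01-0625", datetime(2018, 6, 25)),   # 6 days old
    Backup("vm-app01-0515", datetime(2018, 5, 15), location="object-store"),
]

# Customer-specified residency period (assumed to be 14 days here).
for backup in due_for_migration(backups, timedelta(days=14), now):
    backup.location = "object-store"

print([(b.name, b.location) for b in backups])
```

Only the 30 day old backup moves. The 6 day old one stays on the appliance and the third was already in object storage.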

The Cloud Data Protection appliance also supports onboard search and analytics to search backup/recovery metadata/catalogs. As such, there’s no need to purchase other tools to uncover which backup files exist. Their solution also uses data deduplication to reduce the data stored.
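For anyone new to deduplication, here is a generic Python sketch of the basic idea: split data into chunks, hash each chunk and store each distinct chunk only once. It uses fixed-size chunks and SHA-256 purely for illustration. It is not Rubrik's dedupe implementation, and it is not the semantic dedupe Datos IO does for NoSQL databases.

```python
import hashlib

def dedupe(streams, chunk_size=4096):
    """Store each distinct fixed-size chunk once; return (store, recipes).

    store maps a chunk digest to the chunk bytes; recipes maps a stream name
    to the ordered list of digests needed to rebuild it.
    """
    store = {}
    recipes = {}
    for name, data in streams.items():
        digests = []
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)  # keep only the first copy of each chunk
            digests.append(digest)
        recipes[name] = digests
    return store, recipes

def restore(store, recipe):
    """Rebuild (rehydrate) a stream from its list of chunk digests."""
    return b"".join(store[d] for d in recipe)

# Two made-up nightly backups that share most of their data.
monday = b"A" * 8192 + b"B" * 4096
tuesday = b"A" * 8192 + b"C" * 4096
store, recipes = dedupe({"monday": monday, "tuesday": tuesday})

logical = len(monday) + len(tuesday)
physical = sum(len(chunk) for chunk in store.values())
print(f"dedupe ratio {logical / physical:.1f}:1")  # dedupe ratio 2.0:1
```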

Stored data is also encrypted with customer keys and transferred over HTTPS. So, data is secured at rest, secured in flight and deduped. Cloud Data Protection also offers data mobility. That is, it can move your VMs and data from onprem to the cloud, then use Rubrik in the cloud to rehydrate the data and translate your VMs to run in AWS or Azure. It also works in reverse, translating AWS/Azure compute instances into VMs.

Rubrik’s major differentiator is simplicity. Traditionally, customers had been conditioned to think that data protection took hours to maintain, fix and keep running. But with Rubrik Cloud Data Protection, a customer just points it to an application and selects an SLA, and Rubrik takes over from there.

The secret behind Rubrik’s simplicity is Cerebro. Cerebro is where they have put all the smarts to understand a data center’s infrastructure, applications/VMs, protected data and requested SLAs, and to just make it work.

The podcast runs ~27 minutes. Chris was great to talk with again and given how long it’s been since we last talked, he had much to discuss. Rubrik seems like an easy solution to adopt and if their growth is any indicator, customers agree. Listen to the podcast to learn more.

Chris Wahl, Chief Technologist, Rubrik

Chris Wahl, author of the award winning Wahl Network blog and host of the Datanauts Podcast, focuses on creating content that revolves around virtualization, automation, infrastructure, and evangelizing products and services that benefit the technology community.

In addition to co-authoring “Networking for VMware Administrators” for VMware Press, he has published hundreds of articles and was voted the “Favorite Independent Blogger” by vSphere-Land three years in a row (2013 – 2015). Chris also travels globally to speak at industry events, provide subject matter expertise, and offer perspectives to startups and investors as a technical adviser.