In this episode, we talk with Matt Starr (@StarrFiles), CTO of Spectra Logic, the deep storage experts. Matt has been around a long time and Ray’s shared many a meal with Matt as we’re both in NW Denver. Howard has a minor quibble with Spectra Logic over the use of his company’s name (DeepStorage) in their product line but he’s also known Matt for awhile now.
Matt and Spectra Logic have a number of customers with multi-PB to over an EB of data repository problems and how to take care of these ever expanding storage stashes is an ongoing concern. One of the solutions Spectra Logic offers is the Black Pearl Deep Storage, which provides an object storage, RESTfull interface front end to storage tiering/archive backend that uses flash, (spin-down) disk, (LTFS) tape (libraries) and the (AWS) cloud as backend storage.
Major portions of the Black Pearl are open sourced and available on GitHub. I see several (DS3-)SDK’s for Java, Python, C, and others. Open sourcing the product provides an easy way for client customization. In fact, one customer was using CEPH and they modified their CEPH backup client to send a copy of data off to the Pearl.
We talk a bit about the Black Pearl’s data integrity. It uses a checksum, computed over the object at creation time which is then verified anytime the object is retrieved, copied, moved or migrated and can be validated periodically (scrubbed), even when it has not been touched.
Super Computing’s interesting (storage) problems
Matt just returned from the SC16 (Super Computing Conference 2016) in Salt Lake City last month. At the conference there were plenty of MultiPB customers that were looking for better storage alternatives.
One customer Matt mentioned was the Square Kilometer Array, the world’s largest radio telescope which will be transmitting 700TB/hour, over an 1EB per year. All that data has to land somewhere and for this quantity (>eb) of data, tape becomes an necessary choice.
Matt likened Spectra’s archive solutions to warehouses vs. factories. For the factory floor, you need responsive (AFA or hybrid) primary storage but for the warehouse, you just want cheap, bulk storage (capacity).
The podcast runs long, over 51 minutes, and reveals a different world from the GreyBeards everyday enterprise environments. Specifically customers that have extra large data repositories and how they manage to survive under the data deluge. Matt’s an articulate spokesperson for Spectra Logic and their archive solutions and we could have talked about >eb data repositories for hours. Listen to the podcast to learn more.
Matt Starr, CTO, Spectra Logic
Matt Starr’s tenure with Spectra Logic spans 24 years and includes experience in service, hardware design, software development, operating systems, electronic design and management. As CTO, he is responsible for helping define the company’s product vision, and serves as the executive representative for the voice of the market. He leads Spectra’s efforts in high-performance computing, private cloud and other vertical markets.
Matt served as the lead engineering architect for the design and production of Spectra’s TSeries tape library family. Spectra Logic has secured more than 50 patents under Matt’s direction, establishing the company as the innovative technology leader in the data storage industry. He holds a BS in electrical engineering from the University of Colorado at Colorado Springs.