Category Archives: Software defined storage

47: Greybeards talk Storage as a Service with Lazarus Vekiarides, CTO & Co-Founder ClearSky Data

In this episode, we talk with ClearSky Data’s Lazarus Vekiarides, CTO and Co-founder, whom we have talked with before (see our podcast from October 2015). ClearSky Data provides a storage-as-a-service offering that uses an on-premises appliance plus point of presence (PoP) storage in the local metro area to hold customer data, and offloads this data to cloud storage. In addition to the on-premises storage-as-a-service, they offer access to customer data from an in-cloud virtual appliance. ClearSky provides the whole storage service, including gigabit metro Ethernet connections from the customer to the PoP, for a simple capacity-based monthly charge.

How does it work

Their Edge (on-premises) appliance supports 24 SSDs and can scale up to 4 appliances. Soon a single appliance will be able to hold up to 32TB of data. It’s intended to hold a data center’s entire working set for one week of activity. So essentially it’s a big caching appliance for the local data center.

For ClearSky Data the lone source of truth for customer data lies in the PoP. The PoP is connected to metro-wide fibre that is available in a number of large metropolitan areas. Laz says they have measured sub-500 µsec round trip response time between their PoP equipment and Edge appliance. The PoP provides the backing store for the Edge appliance. Data written to the Edge appliance(s) is written through to the PoP storage. This data and its metadata (<1% of LUN size) are flushed to cloud storage, which holds the data indefinitely.
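The write-through relationship between the Edge appliance and the PoP can be illustrated with a small sketch. This is not ClearSky's implementation: the class names, the LRU eviction policy and the tiny capacity are assumptions chosen for illustration, and the later flush from PoP to cloud storage is not modeled.

```python
from collections import OrderedDict


class BackingStore:
    """Stands in for the PoP: the authoritative copy of every block."""

    def __init__(self):
        self.blocks = {}

    def write(self, lba, data):
        self.blocks[lba] = data

    def read(self, lba):
        return self.blocks[lba]


class WriteThroughCache:
    """Stands in for the Edge appliance: a bounded LRU cache (assumed policy)."""

    def __init__(self, backing, capacity):
        self.backing = backing
        self.capacity = capacity
        self.cache = OrderedDict()

    def _remember(self, lba, data):
        self.cache[lba] = data
        self.cache.move_to_end(lba)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used block

    def write(self, lba, data):
        self.backing.write(lba, data)  # write through to the source of truth first
        self._remember(lba, data)

    def read(self, lba):
        if lba in self.cache:
            self.cache.move_to_end(lba)
            return self.cache[lba]
        data = self.backing.read(lba)  # cache miss: fetch from the backing store
        self._remember(lba, data)
        return data


pop = BackingStore()
edge = WriteThroughCache(pop, capacity=2)
for lba in (1, 2, 3):
    edge.write(lba, f"block-{lba}".encode())
```

Because every write reaches the backing store before it is cached, evicting a block from the cache never loses data; a later read simply misses and refetches it.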

DR through the PoP

If customers have multiple data centers within the same metro area (100 km), then they can have a single “logical” array that accesses the same data, say a cluster file system across the two data centers. The PoP will take care of copying the metadata to the secondary edge device and will invalidate any data sitting in the secondary device which is no longer valid. In this way customers can have a Recovery Point Objective (RPO) of 0 seconds. That is, any data written to the primary data center is automatically available to the secondary data center as long as the PoP survives.
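A rough sketch of that invalidation idea follows. The podcast notes do not describe the actual protocol, so the `PoP` and `Edge` classes, the per-block invalidation and the method names are all hypothetical.

```python
class PoP:
    """Authoritative store that knows which Edge caches are attached (assumed design)."""

    def __init__(self):
        self.blocks = {}
        self.edges = []

    def attach(self, edge):
        self.edges.append(edge)

    def write(self, lba, data, source):
        self.blocks[lba] = data
        for edge in self.edges:
            if edge is not source:
                edge.invalidate(lba)  # other sites must not serve stale data


class Edge:
    """Per-site cache in front of the shared PoP."""

    def __init__(self, name, pop):
        self.name = name
        self.pop = pop
        self.cache = {}
        pop.attach(self)

    def write(self, lba, data):
        self.pop.write(lba, data, source=self)  # write through to the PoP
        self.cache[lba] = data

    def read(self, lba):
        if lba not in self.cache:
            self.cache[lba] = self.pop.blocks[lba]  # refetch after invalidation
        return self.cache[lba]

    def invalidate(self, lba):
        self.cache.pop(lba, None)


pop = PoP()
primary = Edge("primary", pop)
secondary = Edge("secondary", pop)

primary.write(7, b"v1")
secondary.read(7)        # the secondary site now caches v1
primary.write(7, b"v2")  # the PoP drops the secondary's stale copy
```

In this toy model the secondary site sees every acknowledged write on its next read, which is the sense in which the RPO is zero while the PoP survives.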

But even if you wanted to fail over to a different metro area the PoP data is offloaded to the cloud continuously so while you wouldn’t attain an RPO=0 seconds, it could be awfully short, on the order of a couple of seconds.

Recent enhancements

ClearSky Data has recently enhanced their storage as a service to provide policy management over snapshots. That is you can establish policies as to how often to take LUN snapshots and how long to retain them in the cloud.

Also, ClearSky Data has added VMware functionality via plugins that allow their storage to know which VMs are writing data or are being backed up to their appliance. And this is included in the metadata written for a LUN which is offloaded to the cloud. Someday soon when you can have vSphere running bare metal in a public cloud service, you will be able to run the Cloud Edge (cloud software version of their Edge appliance) and restore the data from your data center directly to the cloud and have an iSCSI LUN available to EC2 running VMware providing complete Cloud DR for a data center.

We talked a bit about our favorite topic, NVMe storage, and Laz sees a potential for it to help their Edge appliances, but at the moment fault tolerance/high availability is not there. And as they are primary storage for data centers, HA is a critical capability.

Pricing and availability

Their product is priced as a service in $0.nn/GB/Month and if you do a 36 month cost analysis they feel they would come out cheaper than hybrid storage. They currently have PoPs in Boston, New York, Northern Virginia, Dallas, and California. Laz says they believe there are 15 major metropolitan areas across the USA they have targeted for service. What, nothing in Europe or Asia? We would imagine this is merely a question of the number of customers, amount of data and metro infrastructure.

The podcast runs ~24 minutes. Laz has been in the storage industry across a number of companies and has been with a few startups as well. Laz is very knowledgeable about storage, cloud, and metro networking, a good friend and is always a pleasure to talk with.  Listen to the podcast to learn more.

Lazarus Vekiarides, CTO & Co-Founder ClearSky Data

For over 20 years Laz Vekiarides has served in key technical and leadership roles delivering breakthrough technologies to market. Most recently, he served as the Executive Director of Software Engineering for Dell’s EqualLogic Storage Engineering group, where he led the development of numerous storage innovations and established the EqualLogic product line as a leader in host OS and hypervisor integration.

Laz joined Dell from EqualLogic, which was acquired in early 2008, where he was a member of the core leadership team – playing a key role in the company’s early success as a Senior Engineering Manager and Architect for the PS Series SAN arrays and host tools. Prior to EqualLogic, Laz held senior engineering and management positions at several companies including 3COM and Banyan Systems.

An occasional blogger, Laz frequently speaks at industry conferences, particularly in the areas of virtualization and data storage. He holds several storage technology patents, as well as a BSEE from Northeastern University, and an MSCS from the Worcester Polytechnic Institute.

46: Greybeards discuss Dell EMC World2017 happenings on vBrownBag

In this episode Howard and I were both at Dell EMC World2017 this past month and Alastair Cooke (@DemitasseNZ) asked us to do a talk at the show for the vBrownBag group (Youtube video here). The GreyBeards asked for a copy of the audio for this podcast.

Sorry about the background noise, but we recorded live at the show, with a huge teleprompter in the background that was re-broadcasting keynotes/interviews from the show.

At the show

Howard was at Dell EMC World2017 on a media pass and I was at the show on an industry analyst pass. There were parts of the show that he saw, that I didn’t and vice versa, but all keynotes and major industry outreach were available to both of us.

As always the Dell EMC team put on a great show, and kudos have to go to their AR and PR teams for having both of us there and creating a great event. There was lots of news at the show and both of us were impressed by how well Dell EMC have come together in such a short time.

In addition, there were a number of Dell partners at the show. Howard met  Datadobi on the show floor who have a file migration tool that walks a filesystem tree and migrates files as well as reports on files it can’t. And we both saw Datrium (who we talked with last year).

Servers and other news

We both liked Dell’s new 14th generation server. But Howard objected to the lack of technical specs on it. Apparently, Intel won’t let specs be published until they announce their new CPU chipsets, sometime later this year. On the other hand, there were a few server specs discussed. For example, I was impressed the new servers would support many more NVMe cards. Howard liked the new server support for NV-DIMMs, mainly for the potential latency reduction that could provide software defined storage.

That led us on a tangent discussion about whether there is a place for non-software defined storage anymore.  Howard mentioned the downside of HCI/software defined storage on upgrading server (DIMM, PCIe card) hardware.

However, appliance hardware seems to be getting easier to upgrade. The new Unity AFA storage can be upgraded non-disruptively from the low end to the high end appliance by just swapping out controller hardware canisters.

Howard was also interested in Dell EMC’s new CloudFlex purchasing model for HCI solutions. This supplies an almost cloud-like purchasing option for customers: for a one year commitment, you pay as you go (no money down, just monthly payments) rather than making an up front capital purchase. After the year’s commitment expires you can send the hardware back to Dell EMC and stop paying.

We talked about Tier 0 storage. EMC DSSD was an early attempt to provide Tier 0 but came with lots of special purpose hardware. When commodity hardware and software emerged last year with NVMe SSD speed, DSSD was no longer viable at the premium pricing needed for all that hardware and was shut down. Howard and I discussed how doing special hardware requires one to be much faster (10-100X) than commodity hardware solutions to succeed, and that gap has to be sustained over time.

The other big storage news was the new VMAX 950F AFA and its performance numbers. Dell EMC said the new VMAX could do 6.7M IOPS of RRH (random read hit) and had a 350µsec response time. Howard noted that Dell EMC didn’t say at what IO load they achieved the 350µsec response time. I told him it almost didn’t matter, even if it was a single IO at that response time, it was significant.
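Howard's question about IO load can be made concrete with Little's law (average concurrency = throughput × average latency). The sketch below is only arithmetic on the two quoted figures; Dell EMC did not say they were measured at the same time, so treating them together is an assumption, and the function names are ours.

```python
def outstanding_ios(iops, latency_seconds):
    """Little's law: average IOs in flight = throughput x average latency."""
    return iops * latency_seconds


def iops_at_queue_depth(queue_depth, latency_seconds):
    """Throughput achievable when only `queue_depth` IOs are ever in flight."""
    return queue_depth / latency_seconds


# Assumption: both quoted numbers hold simultaneously (not stated by Dell EMC).
implied_concurrency = outstanding_ios(6.7e6, 350e-6)

# The other extreme: a single IO at a time at 350 microseconds each.
single_io_rate = iops_at_queue_depth(1, 350e-6)

print(f"IOs in flight if both figures hold together: {implied_concurrency:.0f}")
print(f"IO/sec at queue depth 1: {single_io_rate:.0f}")
```

Under that assumption roughly 2,345 IOs would be in flight at once, whereas a single outstanding IO at 350µsec works out to about 2,857 IO/sec, which is why the load at which latency was measured matters when comparing systems.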

The podcast runs about 40 minutes. It’s just Howard and I talking about what we saw/heard at the show and the occasional tangential topic. Listen to the podcast to learn more.


Howard Marks, DeepStorage

Howard Marks is the Founder and Chief Scientist of DeepStorage, a prominent blogger at Deep Storage Blog and can be found on twitter @DeepStorageNet.

Ray Lucchesi, Silverton Consulting

Ray Lucchesi is the President and Founder of Silverton Consulting, a prominent blogger at RayOnStorage Blog, and can be found on twitter @RayLucchesi.

43: GreyBeards talk Tier 0 again with Yaniv Romem CTO/Founder & Josh Goldenhar VP Products of Excelero

In this episode, we talk with another next gen, Tier 0 storage provider. This time our guests are Yaniv Romem CTO/Founder  & Josh Goldenhar (@eeschwa) VP Products from Excelero, another new storage startup out of Israel.  Both Howard and I talked with Excelero at SFD12 (videos here) earlier last month in San Jose. I was very impressed with their raw performance and wrote a popular RayOnStorage blog post on their system (see my 4M IO/sec@227µsec 4KB Read… post) from our discussions during SFD12.

As we have discussed previously, Tier 0, next generation flash arrays provide very high performing storage at very low latencies with modest to non-existent advanced storage services. They are intended to replace server, direct access SSD storage with a more shared, scalable storage solution.

Our guest on the last podcast, E8 Storage, sells a hardware Tier 0 appliance. As a different alternative, Excelero is a software defined, Tier 0 solution intended to be used on any commodity or off the shelf server hardware with high end networking and (low to high end) NVMe SSDs.

Indeed, what impressed me most with their 4M IO/sec was that the target storage system had almost 0 CPU utilization. (Read the post to learn how they did this.) Excelero mentioned that they were able to generate high (11M random 4KB) IO/sec on an Intel Core i7, desktop-class CPU. Their one need in a storage server is plenty of PCIe lanes. They don’t even need to have dual socket storage servers; single socket CPUs work just fine as long as the PCIe lanes are there.

Excelero software

Their intent is to bring Tier 0 capabilities out to all big storage environments. By providing a software only solution it could be easily OEMed by cluster file system vendors or HPC system vendors and generate amazing IO performance needed by their clients.

That’s also one of the reasons that they went with high end Ethernet networking rather than just Infiniband, which would have limited their market to mostly HPC environments. Excelero’s client software uses RoCE/RDMA hardware to perform IO operations with the storage server.

The other thing that having little to no target storage server CPU utilization per IO operation gives them is the ability to scale up to 1000s of hosts or storage servers without reaching any storage system bottlenecks. Another concern eliminated by minimal target server CPU utilization is the noisy neighbor problem, because there’s no target CPU processing to be shared. Yet another advantage with Excelero is that bandwidth is only limited by storage server PCIe lanes and networking. A final advantage of their approach is that they can support any of the current and upcoming storage class memory devices supporting NVMe (e.g., Intel Optane SSDs).

The storage services they offer include RAID 0, 1 and 10 and a client side logical volume manager which supports multi-pathing. Logical volumes can span up to 128 storage servers, but can be accessed by almost any number of hosts. And there doesn’t appear to be a specific limit on the number of logical volumes you can have.

 

They support two different protocols across the 40GbE/100GbE networks: standard NVMe over Fabrics or RDDA (Excelero’s patented, proprietary Remote Direct Drive Access). RDDA is what mainly provides the almost non-existent target storage server CPU utilization. But even with standard NVMe over Fabrics they support low target CPU utilization. One proviso: with NVMe over Fabrics, they do add shared volume functionality to support RAID device locking and additional fault tolerance capabilities.

On Excelero’s roadmap are thin provisioning, snapshots, compression and deduplication. However, they did mention that adding advanced storage functionality like this will impede performance. Currently, their distributed volume locking and configuration metadata is not normally accessed during an IO, but when you add thin provisioning, snapshots and data reduction, this metadata needs to become more sophisticated and will necessitate some amount of access during and after an IO operation.

Excelero’s client software runs as a Linux kernel-mode client and they don’t currently support VMware or Hyper-V. But they do support KVM as a hypervisor and would be willing to support the others, if APIs were published or made available.

They also have an internal OpenStack Cinder driver but it’s not part of the OpenStack release yet. They’re waiting for snapshots to be available before they push this into the main code base. Ditto for Docker Engine, but this is more of a beta capability today.

Excelero customer experience

One customer (NASA Ames/Moffett Field) deployed a single 2TB NVMe SSD in each of 128 hosts and had a single 256TB logical volume shared and accessed by all 128 hosts.
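The capacity figure in this example is consistent with pure striping across all 128 drives. As a small sketch, here is the usable-capacity arithmetic for the RAID levels mentioned earlier (0, 1 and 10). That the customer used RAID 0 is our inference from the numbers, and two-way mirroring for RAID 1/10 is an assumption; the function name is hypothetical.

```python
def usable_capacity_tb(drive_count, drive_tb, raid_level):
    """Usable capacity for RAID 0, 1 or 10 built from identical drives.

    Assumes two-way mirroring for RAID 1 and RAID 10.
    """
    raw = drive_count * drive_tb
    if raid_level == 0:
        return raw          # striping only, no redundancy
    if raid_level in (1, 10):
        return raw / 2      # every block is stored twice
    raise ValueError(f"unsupported RAID level: {raid_level}")


striped = usable_capacity_tb(128, 2, 0)    # 128 hosts x one 2TB SSD each
mirrored = usable_capacity_tb(128, 2, 10)  # same drives, mirrored and striped

print(f"RAID 0: {striped} TB, RAID 10: {mirrored} TB")
```

With the same 128 drives, a mirrored layout would halve the usable space to 128TB in exchange for surviving a drive or host failure.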

Another customer configured Excelero behind a clustered file system and was able to generate 30M randomized IO/sec at 200µsec latencies but, more important, 140GB/sec of bandwidth. It turns out high bandwidth is important to many big data applications that have to roll lots of data into their analytics clusters, process it, output results, and then do it all over again. Bandwidth limitations can impact the success of these types of applications.
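IO rate and bandwidth are linked by IO size (bandwidth = IO/sec × bytes per IO). The notes do not give the IO sizes behind either figure, so the 4KB and 128KB sizes below are assumptions used purely to show the arithmetic, and the helper names are ours.

```python
def bandwidth_gb_per_sec(iops, io_size_bytes):
    """Bandwidth in decimal GB/sec for a given IO rate and IO size."""
    return iops * io_size_bytes / 1e9


def iops_for_bandwidth(gb_per_sec, io_size_bytes):
    """IO rate needed to sustain a given bandwidth at a given IO size."""
    return gb_per_sec * 1e9 / io_size_bytes


# Assumption: the 30M random IOs were 4KB (4096 bytes) each.
small_io_bandwidth = bandwidth_gb_per_sec(30e6, 4096)

# Assumption: the 140GB/sec test used large 128KB (131072-byte) IOs.
large_io_rate = iops_for_bandwidth(140, 131072)

print(f"30M x 4KB IO/sec = {small_io_bandwidth:.2f} GB/sec")
print(f"140 GB/sec at 128KB needs about {large_io_rate:,.0f} IO/sec")
```

Under these assumptions, small random IO at 30M IO/sec moves about 123GB/sec, while a bandwidth test with large IOs needs only around a million IO/sec, so the two headline numbers stress different parts of the system.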

By being software only, they can be used in a standalone storage server or as a hyper-converged solution where applications and storage can be co-resident on the same server. As noted above, they currently support Linux O/Ss for their storage and client software and support any x86 Intel processor, any RDMA capable NIC, and any NVMe SSD.

Excelero GTM

Excelero is focused on the top 200 customers, which includes the hyper-scale providers like Facebook, Google, Microsoft and others. But hyper-scale customers have huge software teams and really a single or few, very large/complex applications for which they can create/optimize Tier 0 storage themselves.

It’s really the customers just below the hyper-scale class that have similar needs for high, low latency IO/sec or high IO bandwidth (or both) but have 100s to 1000s of applications, and they can’t afford to optimize them all for Tier 0 flash. If they solve sharing Tier 0 flash storage in a more general way, say as a block storage device, they can solve it for any application. And if the customer insists, they could put a clustered file system or even an object storage (who would want this) on top of this shared Tier 0 flash storage system.

These customers may currently be using NVMe SSDs within their servers as a DAS device. But with Excelero these resources can be shared across the data center. They think of themselves as a top of rack NVMe storage system.

On their website they have listed a few of their current customers and they’re pretty large and impressive.

NVMe competition

Aside from E8 Storage, there are few other competitors in Tier 0 storage. One recently announced a move to an NVMe flash storage solution and another killed their shipping solution. We talked about what all this means to them and their market at the end of the podcast. Suffice it to say, they’re not worried.

The podcast runs ~50 minutes. Josh and Yaniv were very knowledgeable about Tier 0, storage market dynamics and were a delight to talk with.   Listen to the podcast to learn more.


Yaniv Romem CTO and Founder, Excelero

Yaniv Romem has been a technology evangelist at disruptive startups for the better part of 20 years. His passions are in the domains of high performance distributed computing, storage, databases and networking.
Yaniv has been a founder at several startups such as Excelero, Xeround and Picatel in these domains. He has served in CTO and VP Engineering roles for the most part.


Josh Goldenhar, Vice President Products, Excelero

Josh has been responsible for product strategy and vision at leading storage companies for over two decades. His experience puts him in a unique position to understand the needs of our customers.
Prior to joining Excelero, Josh was responsible for product strategy and management at EMC (XtremIO) and DataDirect Networks. Previous to that, his experience and passion was in large scale, systems architecture and administration with companies such as Cisco Systems. He’s been a technology leader in Linux, Unix and other OS’s for over 20 years. Josh holds a Bachelor’s degree in Psychology/Cognitive Science from the University of California, San Diego.

GreyBeards talk with Pivot3 and NexGen Storage about their recent acquisition announcement

In our 29th episode, we talk with John Spiers (@lefthandsan), Co-founder & CEO of NexGen Storage and Ron Nash (@hronaldnash), Chairman & CEO of Pivot3, a hyper converged infrastructure provider. We have talked with John before (see last June’s podcast episode) about NexGen Storage technology. Recently, Pivot3 announced they were going to acquire NexGen Storage and Howard and I wanted to talk to them about what brought the two together.

We have discussed hyper converged solutions before (see ScaleComputing and Gridstore podcasts) dating all the way to the first GreyBeardsOnStorage podcast with Nutanix but this is the first time we have talked with Pivot3 and Ron Nash. As discussed in those podcasts hyper converged infrastructure (HCI) brings together compute, storage and sometimes networking under one overarching infrastructure framework and delivers all this as a single solution that customers can then tailor to their own needs. In a typical HCI solution, storage is software defined, compute is under the control of a hypervisor and can include software defined networking.

Sometime last fall both John and Ron were considering additional funding opportunities with their VC’s, when one of them, Brian Smith of S3 Ventures, suggested they look at combining their two operations into one company.

It seemed that John was looking to expand their sales and marketing team to take NexGen Storage to the next level while Ron was looking for some additional differentiation in storage technology that could take their solution beyond where they were today. It seemed to Mr. Smith that each of them had just what the other one was looking for.

As GreyBeardsOnStorage listeners should recall, NexGen Storage is known for their hybrid storage solution with fine grained QoS capabilities. Although, NexGen Storage is delivered as an appliance, their main IP is in storage software and so implementing a Software Defined Storage solution under HCI was certainly an option.

Pivot3 has been around since 2002 and has sales teams around the world with an extensive marketing team. Pivot3 uses Xen and now mostly VMware for their hypervisor environments and typically runs on whitebox servers with storage bridge bay boxes running software defined storage. Pivot3 had already implemented scalable erasure coding, which is something NexGen Storage was also looking at.

Pivot3 and the rest of the HC solutions market space seems split into two. That is there is a good market at the low end, where small companies, remote offices, small workgroups, etc. are looking for an easy to deploy, full IT stack solution. And at the high end, large web properties and other IT behemoths  also need an easy to deploy, readily automated solution, that can scale to whatever size they require.

Both Pivot3 and NexGen Storage work well in VDI deployments but NexGen was mostly deployed in currently running VDI environments, whereas Pivot3 primarily went into brand new deployments, that could take advantage of HCI solutions.

In the podcast we discuss some of these large organizations such as Google, Facebook, E*Trade and others and what they are looking for in an IT infrastructure. We also discuss some of the technology trends that are impacting both HCI and storage infrastructure. It turns out NexGen’s extensive QoS capabilities are what can make HCI deployments work even better than they do today.

In the past couple of days, the technology teams of the two companies have been hot and heavy, examining possible synergies and discussing how to reconcile their respective roadmaps. John and Ron were sitting in the back during these discussions throwing out ideas which the technical teams ran with as far as they could.

The podcast runs just over 41 minutes and the episode covers a lot of ground about both of their products’ market spaces, technology, and business dynamics and especially, on how they see the two solutions complementing each other. Apparently the acquisition is on a fast path to close soon. Listen to the podcast to learn more.

Ron Nash, Chairman and CEO, Pivot3

Ron brings senior leadership and experience as the chairman and CEO of Pivot3. He has held numerous leadership roles at both start-up and enterprise information technology companies including ExoLink (acquired by Alliance Data Systems), Advanced Telemarketing (now Aegis Global), Rubicon (acquired by Cerner), Perot Systems (now Dell Services) and EDS (now HP Enterprise Services). More recently, he served as a partner at InterWest Partners, investing in successful breakthrough technology companies like Pivot3 and Lombardi Software (acquired by IBM).

 

John Spiers, Founder and CEO, NexGen Storage

John is a serial entrepreneur based in Boulder, CO. John has been pioneering breakthrough data storage innovations for over 30 years. He co-founded venture-backed LeftHand Networks, a market leader in virtualized, scale-out data storage, and served as LeftHand’s Chief Technology Officer. In 2010 John co-founded NexGen Storage. John supports local entrepreneurs, serving on the boards of local technology startups and as an advisor for the Blackstone Entrepreneurs Network. John is a graduate from Colorado State University with a degree in Engineering.

 

GreyBeards on Storage year end 2015 podcast

In our annual year end podcast, it’s the Ray and Howard show, talking about storage futures, industry trends and some storage world excitement of the past year.

We start the discussion deconstructing recent reductions in year over year revenues at major storage vendors. It seems that with the advent of all flash arrays (AFAs), which all major vendors and most startups now offer, customers no longer feel the need to refresh old storage hardware with similarly (over-)configured new systems. Instead, most can get by with AFA storage at smaller capacities that provides the same, if not better, performance. Further, because AFAs are available from so many vendors and startups, customers no longer have to buy performance storage exclusively from major vendors. This is leading to a decline in major vendor storage revenues, which should play itself out over the next 1-2 years as most enterprise storage systems are refreshed.

Recent and future acquisitions also came up for discussion. NetApp’s purchase of SolidFire was a surprise, but SolidFire had carved out a good business with service providers and web-scale customers which should broaden NetApp’s portfolio. In the meantime, the Dell-EMC acquisition takes them out of the competition for new technology acquisitions, at least until it closes. NetApp’s new CEO, George Kurian, appears more willing than his predecessor to go after good storage technology, wherever it comes from.

Software delivered (defined) storage came up as well. With the compute available in today’s microprocessors, there’s very little a software delivered storage system can’t do. And with scale-out storage, there are even more cores to work with. Software delivered storage and scale-out will continue to play a spoiler role, at least in the low to mid-range, in the storage market throughout the next year.

Nonetheless, hardware still has some excitement left. Intel’s recent acquisition of Altera now makes Xeon/x86 processing available for embedded applications that previously had to rely on ARM and MIPS processing. Now, there’s nothing an FPGA hardware based system can’t do. Look for lots more activity here over the long term.

We talked about recent SMR disks coming out and how they could be used in storage systems today. There was some adjacent discussion on the flash-disk crossover, and we concluded it’s unlikely over the next 3-5 years, at least for capacity drives. Although there are plenty of analysts who say it’s already happened, on a pure $/GB basis there’s still no comparison.

We then turned to 3D TLC NAND and the reliability capabilities available from current controller technologies. Raw planar NAND available today is much less reliable than what we had 1-2 generations back, but the drives, if anything, have gotten more reliable. This is due to the reliability technology inherent in today’s SSD controllers.

We had an aside, on SSD overprovisioning and how this should become a customer level option.  Reducing overprovisioning would decrease drive endurance but it’s a tradeoff that the vendors/distributors make for customers today. We feel that at least for some customers, they could make this decision just as well. Especially if drive replacements were a customer maintenance activity with replacement SSDs shipped in a just-in-time manner.

We conclude on 3D XPoint (3DX) non-volatile memory. We both agreed 3DX adoption depends on pricing which will change over time. In the long term, we see the potential for a new storage system with 3DX or other new non-volatile memory as a top performing storage/caching/non-volatile memory tier, 3D TLC NAND as a middle tier and SMR disk as the bottom tier. When is another question.

Our year end discussion always wanders a bit, from high end business trends to in the weeds technologies and everything in-between. This one is no exception and runs over 49 minutes. We tried to do another Year End video this time but neither of our video recording systems worked out, but we had a good audio recording, so we went with the podcast this year. Next year should be back to video.  Listen to the podcast to learn more.

Howard Marks

Howard Marks is the Founder and Chief Scientist of DeepStorage, a prominent blogger at Deep Storage Blog and can be found on twitter @DeepStorageNet.

 

Ray Lucchesi

Ray Lucchesi is the President and Founder of Silverton Consulting, a prominent blogger at RayOnStorage.com, and can be found on twitter @RayLucchesi.