Scality’s Open Source S3 Driver

The view from Scality’s conference room

We were at Scality last week for Cloud Field Day 1 (CFD1) and one of the items they discussed was their open source S3 driver. (Videos available here).

Scality’s offices were on the 25th floor of a downtown San Francisco office tower, and the view outside the conference room was great. Giorgio Regni, CTO of Scality, said that on the two days a year it isn’t foggy out, you can even see the Golden Gate Bridge from their conference room.

Scality

As you may recall, Scality is an object storage solution that came out of the telecom and consumer networking industry to provide Google/Facebook-like storage services to other customers.

Scality RING is software defined object storage that supports a full complement of legacy and advanced interface protocols including NFS, CIFS/SMB, Linux FUSE, RESTful native, SWIFT, CDMI and Amazon Web Services (AWS) S3. Scality also supports replication and erasure coding, selected based on object size.

RING 6.0 brings AWS IAM style authentication to Scality object storage. Scality pricing is based on usable storage and you bring your own hardware.

Giorgio also gave a session on the RING’s durability (reliability), which showed they support 13-9’s (99.99999999999%) data availability. He flashed up the math on this but it went by too fast for me to take down. :)

Scality has been on the market since 2010 and has had a lot of success lately, growing 150% in revenue this past year. In the media and entertainment space, Scality has won a lot of business with their S3 support, but their other interface protocols are also very popular.

Why S3?

It looks as if AWS S3 is becoming the de facto standard for object storage. AWS S3 is the largest current repository of objects. As such, other vendors and solution providers now offer support for S3 services whenever they need an object/bulk storage tier behind their appliances/applications/solutions.

This has driven every object storage vendor to also offer S3 “compatible” services to entice these users to move to their object storage solution. In essence, the object storage industry, like it or not, is standardizing on S3 because everyone is using it.

But how can you tell if a vendor’s S3 solution is any good? You could always try it out to see if it works properly with your S3 application, but that involves a lot of heavy lifting.

However, there is another way. Take an S3 Driver and run your application against that. Assuming your vendor supports all the functionality used in the S3 Driver, it should all work with the real object storage solution.

Open source S3 driver

Scality open sourced their S3 driver just to make this process easier. Now one can just download their S3 Server driver (available from Scality’s GitHub) and start it up.

Scality’s S3 driver runs on top of Docker Engine, so to run it on your desktop you need to install Docker Toolbox for older Mac or Windows systems, or run Docker for Mac or Docker for Windows for newer systems. (We also talked with Docker at CFD1.)

Firing up the S3 Server on my Mac

I used Docker for Mac, but I assume the terminal CLI is the same for both. Downloading and installing Docker for Mac was pretty straightforward. Starting it up took just a double click on the Docker application, which puts a Docker icon in the toolbar. You do need to enter your login password to run Docker for Mac, but once that’s done, you have Docker running on your Mac.

Open up a terminal window and you have the full Docker CLI at your disposal. You can download the latest S3 Server image from Scality’s Docker Hub by executing a pull command (docker pull scality/s3server). To fire it up, you create and start a new container (docker run -d --name s3server -p 8000:8000 scality/s3server); if you later stop the container, docker start s3server brings it back.
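The steps above can be sketched as a short terminal session (this assumes a running Docker daemon; the image name and port are from Scality’s instructions):

```shell
# Pull the latest S3 Server image from Scality's Docker Hub
docker pull scality/s3server

# Create and start a container named "s3server",
# exposing its S3 endpoint on localhost:8000
docker run -d --name s3server -p 8000:8000 scality/s3server

# If the container is later stopped, restart it with:
docker start s3server

# Check that it is up and listening
docker ps --filter name=s3server
```

Note that docker run both creates and starts the container, so docker start is only needed after an explicit docker stop.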

It’s that simple to have an S3 Server running on your Mac. The toolbox approach for older Macs and PCs is a bit more complicated but seems simple enough.

The data is stored inside the container and persists until you delete the container. However, there’s an option to store the data elsewhere as well.
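Storing the data elsewhere would typically be done with Docker volume mounts. A hypothetical sketch (the container-side paths here are an assumption on my part; check Scality’s S3 Server docs for the actual directories):

```shell
# Hypothetical: keep object data and metadata on the host so they
# survive container deletion. The /usr/src/app/... paths are an
# assumption -- verify them against Scality's S3 Server docs.
mkdir -p ~/s3data ~/s3metadata
docker run -d --name s3server -p 8000:8000 \
  -v ~/s3data:/usr/src/app/localData \
  -v ~/s3metadata:/usr/src/app/localMetadata \
  scality/s3server
```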

I tried to use Cyberduck to load some objects into my Mac’s S3 Server but couldn’t get it to connect properly, so I wrote up a ticket to the S3 Server community. It seemed to be talking to the right port, but maybe I needed to use s3cmd to initialize the bucket first – I think.

[Update 2016Sep19: Turns out the S3 Server getting started doc says you should download an S3 profile for Cyberduck. I didn’t do that originally because I had already been using S3 with Cyberduck. But I did that just now and it works just like it’s supposed to. My mistake.]
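For anyone who’d rather test from the command line than Cyberduck, something like the following should work against a local S3 Server (the accessKey1/verySecretKey1 credentials are, as I understand it, the defaults shipped with S3 Server – treat them as an assumption and check the getting started doc):

```shell
# Point the AWS CLI at the local S3 Server instead of real AWS.
# accessKey1/verySecretKey1 are assumed to be S3 Server's default
# credentials -- verify against the getting started doc.
export AWS_ACCESS_KEY_ID=accessKey1
export AWS_SECRET_ACCESS_KEY=verySecretKey1

# Create a bucket, upload a file, then list it back
aws --endpoint-url http://localhost:8000 s3 mb s3://testbucket
aws --endpoint-url http://localhost:8000 s3 cp ./hello.txt s3://testbucket/
aws --endpoint-url http://localhost:8000 s3 ls s3://testbucket
```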

~~~~

Anyways, it all seemed pretty straightforward to run S3 Server on my Mac. If I were an application developer, it would make a lot of sense to try S3 this way before doing anything on the real AWS S3. And some day, when I grew tired of paying AWS, I could always migrate to Scality RING S3 object storage – or at least that’s the idea.

Comments?

Free & frictionless and sometimes open sourced

I was at EMCWorld2015 (see my posts on the day 1 news and day 2&3 news) and IBMEdge2015 this past month, and there was a lot of news on software defined storage. It turns out I was at an HP Storage Deep Dive the previous month, and they also spoke on the topic.

One key aspect of software defined storage is how customers can consume the product. I’m not talking about licensing but rather about product trial-ability. One approach championed by HP, EMC, IBM and others is to offer their software defined storage in a new way.

Free & frictionless?

Howard Marks (@DeepStorageNet, DeepStorage.net) and I had Chad Sakac (@sakacc, Virtual Geek) on a recent GreyBeards on Storage podcast to discuss the news coming out of EMCWorld2015, and he used the term free & frictionless to describe a new approach to offering emerging-technology, software-only storage solutions.

  • Frictionless refers to not having to encounter a sales person and not having to provide a lot of information to gain access to a software download. Frictionless is a matter of degree: at one extreme all you have is a direct link to a software download and it fires up without any registration requirements whatsoever; and at the other end, you have to fill out a couple of pages about your company and your plans for the product.
  • Free refers to the ability to use the product for free in limited situations (e.g., test & development), but it requires a fully paid-for license and support contract when used outside these limitations.

For example:

  • Microsoft Windows Storage Server 2012 is available for a free 180-day evaluation and can be directly downloaded. I was able to download it without having to supply any information whatsoever. Unfortunately, I don’t have any Windows Server hardware floating around that I could use to see if there were any further registration requirements for it.
  • HP StoreVirtual VSA and StoreOnce VSA are both available for a 60-day, free trial offer, downloadable from the StoreVirtual VSA and StoreOnce VSA websites. StoreVirtual VSA is also available for a free 1TB/3-year license. You have to register for this last option, and all three options require an HP Passport account to download the software. I didn’t have an HP Passport account, so I don’t know what else was required.
  • VMware Virtual SAN is available for a 60-day, free trial offer (with no capacity or other use restrictions). You will need a 3-server vSphere cluster so you also get vSphere and vCenter server software for free at the download website.  You will need a VMware account in order to download the software, beyond that, it’s unclear to me what’s required.
  • EMC ScaleIO will be available for free when used for test and development, by the end of this month. There is no limit on the time you can use the product, no limit on the amount of storage that can be defined and no limit on the number of servers it’s deployed on. Although the website for EMC’s ScaleIO download was up, there was no download link active on the page yet. So I can’t say what’s required to access the download.
  • IBM Spectrum Accelerate (software-only version of XIV) is going to be available for a 90-day, free trial offer. As far as I know you can do what you like with it for 90-days. I couldn’t find any links on their website for the download but it was just announced last week at IBMEdge2015.

I couldn’t find any information on a Hitachi or a NetApp software defined storage solution free trial offer, but I could have missed them in my searches.

There are plenty of other software defined storage solutions out there including Maxta, Nexenta, SpringPath, and probably a dozen others, many of which provide free trial offers. Not to mention software defined object/file systems such as Ceph, Gluster, Lustre, etc.

… And sometimes Open Source

One other item of interest out of EMCWorld2015 this month was that ViPR Controller is being open sourced as Project CoprHD (on GitHub). Its source code is scheduled to be loaded around June.

EMC, IBM, HDS, NetApp, VMware and others have all been very active in open source in the past, in areas such as storage support in Linux, OpenStack and other projects. But outside of Pivotal (an EMC Federation company), most of them have not open sourced a real product.

I believe it was Paul Maritz, CEO of Pivotal, who said on stage that one reason to open source a project is to help create an ecosystem around it.

EMC open sourced ViPR Controller primarily to add even more development resources to enhance the solution. The other consideration was that customers adopting ViPR Controller in their data centers were concerned about vendor lock-in. Open sourcing ViPR Controller addresses both of these issues.

My understanding is that Project CoprHD will be under a Mozilla Public License (MPL 2.0) as a standalone project. Customers can now add any storage system support they want, and anyone afraid of lock-in can download the software and modify it themselves. MPL 2.0 supports a copyleft style of licensing, which essentially means anyone can modify the source code, but any derivative work must be licensed under MPL as well.

My understanding is that ViPR Controller will still be available as a commercial product.

~~~~

From my perspective it all seems to make a lot of sense. Customers creating new applications that could use software defined storage want access to the product for free to try it out to see what it can and can’t do.

EMC’s taken a lead in offering theirs for free in test and dev situations; we’ll see if the others go along with them.

Comments?

Why Open-FCoE is important

FCoE Frame Format (from Wikipedia, http://en.wikipedia.org/wiki/File:Ff.jpg)

I don’t know much about O/S drivers but I do know lots about storage interfaces. One thing that’s apparent from yesterday’s announcement from Intel is that Fibre Channel over Ethernet (FCoE) has taken another big leap forward.

Chad Sakac’s chart of FC vs. Ethernet target unit shipments (meaning storage interface types, I think) clearly indicates a transition to Ethernet is taking place in the storage industry today. Of course, Ethernet targets can be used for NFS, CIFS, object storage, iSCSI and FCoE, so this doesn’t necessarily mean that FCoE is winning the game just yet.

WikiBon did a great post on FCoE market dynamics as well.

The advantage of FC, and iSCSI for that matter, is that every server, every OS, and just about every storage vendor in the world supports them. Also, there is a plethora of economical fabric switches available from multiple vendors that can support multi-port switching with high bandwidth. And there are many support matrices identifying server HBAs, O/S drivers for those HBAs, and compatible storage products to ensure compatibility. So there is no real problem (other than wading through the support matrices) in implementing either one of these storage protocols.

Enter Open-FCoE, the upstart

What’s missing from 10GbE FCoE is perhaps a really cheap solution, one that is universally available, uses commodity parts and can be had for next to nothing. The new Open-FCoE drivers, together with Intel’s X520 10GbE NIC, have the potential to answer that need.

But what is it? Essentially, Intel’s Open-FCoE is an O/S driver for Windows and Linux plus 10GbE NIC hardware from Intel. It’s unclear whether Intel’s Open-FCoE driver is a derivative of Open-FCoE.org’s Linux driver or not, but either driver works by performing some of the specialized FCoE functions in software rather than in hardware, as done by CNA cards available from other vendors. Using server processing MIPS rather than ASIC processing capabilities should make FCoE adoption even cheaper in the long run.

What about performance?

The proof of this will be in benchmark results, but it could quite possibly be a non-issue, especially if there is not a lot of extra processing involved in an FCoE transaction. For example, if Open-FCoE only takes, let’s say, 2-5% of server MIPS and bandwidth to perform the added FCoE frame processing, then this might be in the noise for most standalone servers and would show up only minimally in storage benchmarks (which always use big, standalone servers).

Yes, but what about virtualization?

However, real-world virtualized servers are another matter. I believe that virtualized servers generally demand more intensive I/O activity anyway, and as one creates 5-10 VMs on an ESX server, it’s almost guaranteed to have 5-10X the I/O happening. If each VM requires 2-5% of a standalone processor to perform Open-FCoE processing, then it could easily represent 5-7 X 2-5% on a 10-VM ESX server (assuming some optimization for virtualization; if virtualization degrades driver processing, it could be much worse), which would represent a serious burden.
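The back-of-envelope arithmetic above can be checked in a couple of lines (the 2-5% per-VM cost and the 5-7X effective multiplier are my guesses from the text, not measured numbers):

```shell
# Guessed per-VM soft-FCoE overhead (percent of one processor) times
# the guessed effective multiplier on a 10-VM ESX server
low=$((2 * 5))    # 2% per VM x 5X multiplier
high=$((5 * 7))   # 5% per VM x 7X multiplier
echo "Estimated soft-FCoE overhead: ${low}%-${high}% of a processor"
```

So even at the optimistic end, a 10-VM server could be giving up a tenth of a processor to FCoE frame processing, and at the pessimistic end over a third.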

Now these numbers are just guesses on my part, but there is some price to pay for using host server MIPS for every FCoE frame, and it does multiply when used with virtualized servers – that much I can guarantee you.

But the (storage) world is better now

Nonetheless, I must applaud Intel’s Open-FCoE thrust, as it can open up a whole new potential market space that today’s CNAs maybe couldn’t touch. If it does that and introduces low-end systems to the advantages of FCoE, then as those systems grow, moving their environments to real CNAs should be a relatively painless transition. And this is where the real advantage lies: getting smaller data centers on the right path early in life will make any subsequent adoption of hardware-accelerated capabilities much easier.

But is it really open?

One problem I have with the Intel announcement is the lack of other NIC vendors jumping in. In my mind, it can’t really be “open” until any 10GbE NIC can support it.

Which brings us back to Open-FCoE.org. I checked their website and could see no listing for a Windows driver, and there was no NIC compatibility list. So I am guessing their work has nothing to do with Intel’s driver, at least as presently defined – too bad.

However, when Open-FCoE is really supported by any 10GbE NIC, then economies of scale can take off, and it could really represent a low-end cost point for storage infrastructure.

It’s unclear to me what Intel has special in their X520 NIC to support Open-FCoE (maybe some TOE hardware with other special sauce), but anything special needs to be defined and standardized to allow broader adoption by other vendors. Then and only then will Open-FCoE reach its full potential.

—-

So, great for Intel, but it could be even better if a standardized definition of an “Open-FCoE NIC” were available, so other NIC manufacturers could readily adopt it.

Comments?