A tale of two storage companies – NetApp and Vantara (HDS-Insight Grp-Pentaho)

It was the worst of times. The industry changes had been gathering for almost a decade and by this time were starting to hurt.

The cloud was taking over all new business and some of the old. Flash was making high performance easy and reducing storage requirements commensurately. Software-defined storage was displacing low-end and midrange storage, which was fine for margins but injurious to revenues.

Both companies held user events in Vegas in the last month: the NetApp Insight 2017 conference last week and the Hitachi NEXT 2017 conference two weeks ago.

As both companies respond to industry trends, they provide an interesting comparison of companies in transition.

Company role

  • NetApp’s underlying theme is to change the world with data, and they want to change themselves to help companies do this.
  • Vantara’s philosophy is that data and processing are ultimately moving into the Internet of Things (IoT), and they want to be wherever the data takes them.

Hitachi Vantara is a brand new company that combines Hitachi Data Systems, Hitachi Insight Group and Pentaho (an analytics acquisition) into one organization to go after the IoT market. Pentaho will continue as a separate brand/subsidiary, but HDS and Insight Group cease to exist as separate companies/subsidiaries and are now inside Vantara.

NetApp sees transitions occurring in the way IT conducts business but ultimately, a continuing and ongoing role for IT. NetApp’s ultimate role is as a data service provider to IT.

Customer problem

  • Vantara believes the main customer issue is the need to digitize the business. Because competition is emerging everywhere, the only way for a company to succeed against this interminable onslaught is to digitize everything. That is, digitize your manufacturing/service production, sales, marketing, maintenance, and any and all customer touch points across your whole value chain, and do it as rapidly as possible. If you don’t, your competition will.
  • NetApp sees customers today have three potential concerns: 1) how to modernize current infrastructure; 2) how to take advantage of (hybrid) cloud; and 3) how to build out the next generation data center. Modernization is needed to free capital and expense from traditional IT for use in Hybrid cloud and next generation data centers. Most organizations have all three going on concurrently.

Vantara sees the threat of startups, regional operators and more advanced digitized competitors as existential for today’s companies. The only way to keep your business alive under these onslaughts is to optimize your value delivery. And to do that, you have to digitize every step in that path.

NetApp views the threat to IT as coming from LoB/shadow IT applications born and grown in the cloud, or from other groups creating next-gen applications using capabilities outside of IT.

Product direction

  • NetApp is looking mostly towards the cloud. At their conference they announced a new Azure NFS service powered by NetApp. They already had Cloud ONTAP and NPS, two current cloud offerings: software-defined storage in the cloud and a co-located hardware offering directly attached to the public cloud (Azure & AWS), respectively.
  • Vantara is looking towards IoT. At their conference they announced Lumada 2.0, an Industrial IoT (IIoT) product framework using plenty of Hitachi software functionality and intended to bring data and analytics under one software umbrella.

NetApp is following a path laid down years ago when they devised the data fabric. Now, they are integrating and implementing the data fabric across their whole product line, with the ultimate goal that wherever your data goes, the data fabric will be there to help you with it.

Vantara is broadening their focus from IT products and solutions to IoT. It’s not so much an abandonment of present-day IT as looking forward to the day when present-day IT is just one cog in an ever-expanding, completely integrated digital entity that the new organization becomes.

They both had other announcements: NetApp announced ONTAP 9.3, Active IQ (AI applied to predictive service) and FlexPod SF ([H]CI with SolidFire storage); Vantara announced a new IoT turnkey appliance running Lumada and a smart data center (IoT) solution.

Who’s right?

They both are.

Digitization is the future; the sooner organizations realize and embrace this, the better for their long-term health. Digitization will happen with or without them, and when it does, it will result in a significant re-ordering of today’s competitive landscape. IoT is one component of organizational digitization, specifically outside of IT data centers, but using IT resources.

In the meantime, IT must become more effective and efficient. This means it has to modernize to free up resources to support (hybrid) cloud applications and supply the infrastructure needed for next-gen applications.

One could argue that Vantara is positioning themselves for the long term and NetApp is positioning themselves for the short term. But that denies the possibility that IT will have a role in digitization. In the end both are correct and both can succeed if they deliver on their promise.

Comments?

 

Google releases new Cloud TPU & Machine Learning supercomputer in the cloud

Last year about this time Google released their 1st generation TPU chip to the world (see my TPU and HW vs. SW … post for more info).

This year they are releasing a new version of their hardware, called the Cloud TPU chip, and making it available in a cluster on their Google Cloud. Cloud TPU is in alpha testing now. As I understand it, access to the Cloud TPU will eventually be free to researchers who promise to freely publish their research, and at a price for everyone else.

What’s different between TPU v1 and Cloud TPU v2

The differences between version 1 and 2 mostly seem to be tied to training Machine Learning Models.

TPU v1 didn’t have any real ability to train machine learning (ML) models. It was a relatively dumb (8-bit ALU) chip, but if you had, say, an ML model already created to do something like understand speech, you could load that model into the TPU v1 board and have it executed very fast. The TPU v1 chip was also placed on a separate PCIe board (I think), connected to normal x86 CPUs as a sort of CPU accelerator. The advantage of TPU v1 over GPUs or normal x86 CPUs was mostly in power consumption and speed of ML model execution.

Cloud TPU v2 looks to be a standalone multi-processor device that’s connected to others via what look like Ethernet connections. One thing that Google seems to be highlighting is the Cloud TPU’s floating point performance. A Cloud TPU device (board) is capable of 180 TeraFlops (trillion or 10^12 floating point operations per second). A 64 Cloud TPU device pod can theoretically execute 11.5 PetaFlops (10^15 FLOPS).
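As a quick sanity check on those numbers, here’s the back-of-the-envelope arithmetic (just illustrative Python; the per-device and per-pod figures come from Google’s announcement):

```python
# Back-of-the-envelope check on the Cloud TPU pod performance figures
device_tflops = 180                  # TeraFlops per Cloud TPU device (board)
devices_per_pod = 64                 # Cloud TPU devices per pod
pod_pflops = device_tflops * devices_per_pod / 1000.0  # 1 PetaFlop = 1000 TeraFlops
print(pod_pflops)                    # 11.52, i.e., the ~11.5 PetaFlops quoted
```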

TPU v1 had no floating point capabilities whatsoever. So Cloud TPU is intended to speed up the training part of ML models which requires extensive floating point calculations. Presumably, they have also improved the ML model execution processing in Cloud TPU vs. TPU V1 as well. More information on their Cloud TPU chips is available here.

So how do you code a TPU?

Both TPU v1 and Cloud TPU are programmed by Google’s open source TensorFlow. TensorFlow is a set of software libraries to facilitate numerical computation via data flow graph programming.

Apparently with data flow programming you have many nodes and many more connections between them. When a connection fires between nodes, it transfers a multi-dimensional matrix (tensor) to the node. I guess the node takes this multidimensional array, does some (floating point) calculations on it, and then determines which of its outgoing connections to fire and how to alter the tensor it sends across those connections.

Apparently, TensorFlow works with X86 servers, GPU chips, TPU v1 or Cloud TPU. Google TensorFlow 1.2.0 is now available. Google says that TensorFlow is in use in over 6000 open source projects. TensorFlow uses Python and 1.2.0 runs on Linux, Mac, & Windows. More information on TensorFlow can be found here.
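To make the data flow graph idea concrete, here’s a minimal sketch using the TensorFlow 1.x Python API on an ordinary CPU or GPU (nothing TPU-specific; the shapes and values are made up):

```python
# Minimal TensorFlow 1.x data flow graph: nodes are operations, edges carry tensors
import tensorflow as tf

a = tf.placeholder(tf.float32, shape=[2, 3], name="a")   # graph inputs
b = tf.placeholder(tf.float32, shape=[3, 2], name="b")
product = tf.matmul(a, b)         # a node that consumes two tensors
result = tf.reduce_sum(product)   # another node downstream of it

# Nothing has executed yet; the session "fires" the graph with real data
with tf.Session() as sess:
    print(sess.run(result, feed_dict={
        a: [[1., 2., 3.], [4., 5., 6.]],
        b: [[1., 0.], [0., 1.], [1., 1.]],
    }))
```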

So where can I get some Cloud TPUs

Google is releasing their new Cloud TPU in the TensorFlow Research Cloud (TFRC). The TFRC has 1000 Cloud TPU devices connected together, which can be used by any organization to train and execute machine learning algorithms.

I signed up (here) to be an alpha tester. During the signup process the site asked me: what hardware (GPUs, CPUs) and platforms I was currently using to train my ML models; how long my ML models take to train; and how large a training (data) set I use (ranging from 10GB to >1PB), as well as other ML model oriented questions. I guess they’re trying to understand what the market requirements are outside of Google’s own use.

Google has been using more ML and other AI technologies in many of their products and this will no doubt accelerate with the introduction of the Cloud TPU. Making it available to others is an interesting play, but it would be one way to amortize the cost of creating the chip. Another way would be to sell the Cloud TPU directly to businesses, government agencies, non-governmental organizations, etc.

I have no real idea what I am going to do with alpha access to the TFRC, but I was thinking maybe I could feed it all my blog posts and train an ML model to start writing blog posts for me. If anyone has any other ideas, please let me know.

Comments?

Photo credit(s): From Google’s website on the new Cloud TPU

 

PCM based neuromorphic processors

Read an interesting article from The Register the other day about IBM’s Almaden Research lab using standard non-volatile memory devices to implement a neural net. They apparently used 2-PCM (Phase Change Memory) devices to implement a 913 neuron/165K synapse pattern recognition system.

This seems to be another (simpler, cheaper) way to create neuromorphic chips. We’ve written about neuromorphic chips before (see my posts on IBM SyNAPSE, IBM TrueNorth and MIT’s analog neuromorphic chip). The latest TrueNorth chip from IBM uses ~5B transistors and provides 1M neurons with 256M synapses.

But none of the other research I have read actually described the neuromorphic “programming” process at the same level nor provided a “success rate” on a standard AI pattern matching benchmark as IBM has with the PCM device.

PCM based AI

The IBM summary report on the research discusses at length how the pattern recognition neural network (NN) was “trained” and how the 913 neuron/165K synapse NN was able to achieve 82% accuracy on NIST’s handwritten digit training database.

The paper has many impressive graphics. The NN was designed as a 3-layer network and used back propagation for its learning process. They show how the back propagation training was used to determine the weights.
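For readers who want a concrete picture of what back propagation on a 3-layer network means, here is a minimal software sketch in Python/NumPy. This is emphatically not IBM’s PCM implementation, just the textbook training loop the paper describes; the layer sizes and learning rate are arbitrary assumptions:

```python
# Minimal 3-layer network trained with back propagation (software sketch only)
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 784, 250, 10         # e.g., 28x28 digit images -> 10 classes
W1 = rng.normal(0, 0.1, (n_in, n_hidden))    # "synaptic weights", input -> hidden
W2 = rng.normal(0, 0.1, (n_hidden, n_out))   # "synaptic weights", hidden -> output
lr = 0.1                                     # learning rate (arbitrary)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_batch(X, Y):
    """One back-propagation step on a batch of (inputs, one-hot labels)."""
    global W1, W2
    h = sigmoid(X @ W1)                        # forward pass, hidden layer
    y = sigmoid(h @ W2)                        # forward pass, output layer
    err_out = (y - Y) * y * (1 - y)            # output-layer error
    err_hid = (err_out @ W2.T) * h * (1 - h)   # error propagated back to hidden layer
    W2 -= lr * h.T @ err_out                   # weight updates = the "training"
    W1 -= lr * X.T @ err_hid
```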

The other interesting thing was they analyzed how hardware faults (stuck-ats, dead conductors, number of resets, etc.) and different learning parameters (stochasticity, learning batch size, variable maxima, etc.) impacted NN effectiveness on the test database.

Turns out the NN could tolerate ~30% dead conductors (in the synapses) or 20% stuck-ats in the PCM memory and still generate pretty good accuracy in training. Not sure I understand the learning parameters, but they varied batch size from 1 to 10 and this didn’t seem to impact NN accuracy whatsoever.
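As a rough illustration of what a “stuck-at” fault experiment looks like in software (again, not IBM’s methodology, just a conceptual sketch building on the NumPy network above; the fraction and stuck value are made up):

```python
# Conceptual fault injection: clamp a random fraction of weights to a fixed
# value ("stuck-at") and re-evaluate accuracy with the degraded weights.
def inject_stuck_at(W, fraction, stuck_value=0.0, rng=np.random.default_rng(1)):
    mask = rng.random(W.shape) < fraction   # pick ~fraction of the synapses
    W_faulty = W.copy()
    W_faulty[mask] = stuck_value            # those weights no longer vary or update
    return W_faulty

# e.g., evaluate the trained network with 20% of layer-1 weights stuck at 0
W1_faulty = inject_stuck_at(W1, fraction=0.20)
```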

Which PCM was used?

In trying to understand which PCM devices were in use, the only information available said it was a 180nm device. According to a 2012 Flash Memory Summit report on alternative NVM technologies, 180nm PCM devices have been around since 2004, a 90nm PCM device was introduced in 2008 with 128Mb, and even newer PCM devices at 45nm were introduced in 2010 with 1Gb of memory. So I would conclude that the 180nm PCM device supported ~16 to 32Mb.

What can we do with today’s PCM technology?

With the industry supporting a doubling of transistors/chip every 2 years, a PCM device in 2014 should have 4X the transistors of the 45nm, 2010 device above and ~4-8X the memory. So today we should be seeing 4-16Gb PCM chips at ~22nm. Given this, current PCM technology should support 32-64X more neurons than the 180nm devices, or ~29K to ~58K neurons or so.

It’s unclear what technology was used for the ‘synapses’, but based on the time frame for the PCM devices, this should also be able to scale up by a factor of 32-64X, or between ~5.3M to ~10.6M synapses.
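Here’s the rough arithmetic behind those estimates (illustrative only; the 32-64X density gain is my assumption from the Moore’s-law reasoning above):

```python
# Rough scaling arithmetic behind the neuron/synapse estimates above
neurons_180nm, synapses_180nm = 913, 165_000   # the demonstrated PCM network
scale_low, scale_high = 32, 64                 # assumed density gain, 180nm -> ~22nm
print(neurons_180nm * scale_low, neurons_180nm * scale_high)    # ~29K to ~58K neurons
print(synapses_180nm * scale_low, synapses_180nm * scale_high)  # ~5.3M to ~10.6M synapses
```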

Still, this doesn’t approach TrueNorth’s neuron/synapse levels, but it’s close. But then two 4-16Gb PCMs probably don’t cost nearly as much to purchase as TrueNorth costs to create.

The programming model for the TrueNorth/SyNAPSE chips doesn’t appear to be neural network like. So perhaps another advantage of the PCM model of hardware based AI is that you can use standard, well known NN programming methods to train and simulate it.

So, PCM based neural networks seem an easier way to create hardware based AI. Not sure this will ever match Neuron/Synapse levels that the dedicated, special purpose neuromorphic chips in development can accomplish but in the end, they both are hardware based AI that can support better pattern recognition.

Using commodity PCM devices any organization with suitable technological skills should be able to create a hardware based NN that operates much faster than any NN software simulation. And if PCM technology starts to obtain market acceptance, the funding available to advance PCMs will vastly exceed that which IBM/MIT can devote to TrueNorth and its descendants.

Now, what is HP up to with their memristor technology and The Machine?

Photo Credits: Neurons by Leandro Agrò

RoS video interview with Ron Redmer Sr. VP Cybergroup

Ray interviewed Ronald Redmer, Sr. VP Cybergroup at EMC’s Global Analyst Summit back in October. Ron is in charge of engineering and product management of their new document analytics service offering. Many of their new service offerings depend on EMC Federation solutions such as ViPR (see my post EMC ViPR virtues & vexations but no virtualization), Pivotal HD, and other offerings.

This was recorded on October 28th in Boston.

New Global Learning XPrize opens

Read a post this week in Gizmag about the new Global Learning XPrize. Past XPrize contests have dealt with suborbital spaceflight, super-efficient automobiles,  oil cleanup, and  lunar landers.

Current open XPrize contests include: Google Lunar Lander, Qualcomm Tricorder medical diagnosis, Nokia Health Sensing/monitoring and Wendy Schmidt Ocean Health Sensing. So what’s left?

World literacy

There are probably a host of issues that the next XPrize could go after, but one that might just change the world is to improve children’s literacy. According to UNESCO (2nd Global Report on Adult Learning and Education [GRALE 2]), there are over 250M children of primary school age in the world who will not reach grade 4 levels of education; these children cannot read, write or do basic arithmetic. Given current teaching methods, we would need an additional 1.6M teachers to teach all these children. Doing so, once we include teacher salaries, classroom space, supplies, etc., would be highly expensive. There has to be a better, more scalable way to do this.

Enter the Global Learning XPrize. The intent of this XPrize is to create a tablet application which can teach children how to read, write and do rudimentary arithmetic in 18 months without access to a teacher or other supervised learning.

Where are they in the XPrize process?

The Global Learning XPrize already has raised $15M for the actual XPrize but they are using a crowd funding approach to fund the last $500K which will be used to field test the  Global Learning XPrize candidates. The crowd funding is being done on Indiegogo.

Registration starts now and runs through March 2015. Software development runs through September 2016, at which time five finalists will be selected; each will receive the $1M finalist XPrize to fund a further round of coding. In May of 2017, the five apps will be loaded onto tablets, and field testing commences in June 2017 and runs through December 2018, at which time the winner will be selected and will receive the $10M XPrize.

What other projects have been tried?

I once read an article about the Hole in the Wall computer, where NIIT and their technologists placed an outdoor-hardened, internet-connected computer inside a brick wall in an underprivileged area of India. The intent was to show that children could learn how to use computers on their own, without adult supervision. Within days children were able to “browse, play games, create documents and paint pictures” on the computer. So minimally invasive education (MIE) can be made to work.

What’s the hardware environment going to look like?

There’s no reason that an Android tablet would be any worse, and it potentially could be much better, than an internet-connected computer.

Although the tablets will be internet connected, it is assumed that the connection will not always be on, so the intent is that the apps run standalone as much as possible. Also, I believe each child will be given a tablet which will be for their exclusive use during the 18 months. The Global Learning XPrize team will ensure that there are charging stations where the tablets can be charged once/day, but we shouldn’t assume that they can be charged while they are being used.

How are the entries to be judged?

The finalists will be judged against EGRA (early grade reading assessment), EGWA (early grade writing assessment), and EGMA (early grade math assessment). The chosen language is to be English, and the intent is to use children in countries which have an expressed interest in using English. The grand prize winner will be judged to have succeeded if its 7 to 12 year old students can score twice as well on the EGRA, EGWA and EGMA as a control group. [Not sure what a control group would look like for this, nor what they would be doing during the 18 months]. For more information checkout the XPrize guidelines v1 pdf.

The assumption is that there will be about 30 children per village and enough villages will be found to provide a statistically valid test of the five learning apps against a control group.

At the end of all this, the winning entry and the other four finalists will have their solutions open sourced, for the good of the world.

Registration is open now…

Entry applications are $500. Finalists win $1M and the winner will take home $10M.

I am willing to put up the $500 application fee for the Global Learning XPrize. Having never started an open source project, never worked on developing an Android tablet application, or done anything other than some limited professional training, this will be entirely new to me – so it should be great fun. I am thinking of creating a sort of educational video game (yet another thing I have no knowledge about, :).

We have until March of 2015 to see if we can put a team together to tackle this. I think if I can find four other (great) persons to take this on, we will give it a shot. I hope to enter an application by February of 2015, if we can put together a team by then to tackle this.

Anyone interested in tackling the Global Learning XPrize as an open source project from the get-go, please comment on this post to let me know.

Photo Credit(s): Kid iPad outside by Alice Keeler

More women in tech

Read an interesting article today in the NY Times on how Some Universities Crack Code in Drawing Women to Computer Science. The article discusses how Carnegie Mellon University, Harvey Mudd College and the University of Washington have been successful at attracting women to their Computer Science (CompSci) programs.

When I was more active in IEEE there was an affinity group called Women In Engineering (WIE) that worked towards encouraging female students to go into science, technology, engineering and math (STEM). I also attended a conference for school-age girls interested in science and helped to get the word out about IEEE and its activities. WIE is still active encouraging girls to go into STEM fields.

However, as I visit startups around the Valley and elsewhere, I see lots of coders who are male but very few who are female. On the other hand, the marketing and PR groups have an almost disproportionate representation of females, although not nearly as skewed as the male-to-female ratio in engineering (5:6 in marketing/PR versus 7:1 in engineering).

Some in the Valley are starting to report on diversity in their ranks and are saying that only 15 to 17% of their employees in technology are female.

On the other hand, bigger companies seem to do a little better than startups by encouraging more diversity in their technical ranks. But the problem is prevalent throughout the technical industry in the USA, at least.

Universities to the rescue

The article goes on to say that some universities have been more successful in recruiting females to CompSci than others and these have a number of attributes in common:

  • They train female teachers at the high school level in how to teach science better.
  • They host camps and activities where they invite girls to learn more about technology.
  • They provide direct mentors to supply additional help to girls in computer science.
  • They directly market to females by changing brochures and other material to show women in science.

Some universities eliminated programming experience as an entry criterion. They also broadened the appeal of the introductory courses in CompSci to show real world applications of technology, figuring that this would appeal more to females. Another university re-framed some of their course work to focus on creative problem solving rather than pure coding.

Other universities are not changing their programs at all and are finding that with better marketing, more mentorship support and early training they can still attract more females to computer science.

The article did mention one other thing that is attracting more females to CompSci and that is the plentiful, high paying jobs that are currently available in the field.

From my perspective, more females in tech is a good thing and we as an industry should do all we can to encourage this.

~~~~

Comments?

Photo credits: Circuit Bending Orchestra: Lara Grant at Diana Eng’s Fairytale Fashion Show, Eyebeam NYC / 20100224.7D.03621.P1.L1.SQ.BW / SML

Obsolescent no more

Read an article in the Economist’s Quarterly Technology section this week on 3D printers. There was mention of one company that was having problems keeping their MD-80 jets flying because of leaking toilets. It turned out that a plastic part needed to be replaced, but as the plane had reached end-of-service the parts were no longer available. Enter the 3D printer and aerospace-grade plastic, and now the planes are flying again.

The plague of obsolescence

When I worked at a storage vendor we often had problems with parts going end-of-life. These were parts that were no longer being manufactured, and we would have to buy up a bunch of them in order to keep products in the field. Now most of these obsolete parts were electronic, but the problems were still the same.

Over time, manufacturing volumes for some parts were just not worth it anymore. At that point, manufacturers would call it end-of-life and if your system depended on it, you either bought enough to last until your system went end-of-life or you re-designed your system to eliminate the obsolete part.  Most of the time it was a little of both approaches, with a race to see if you would run out of parts before the new design was deployed in the field.

3D printers save the day

Maybe with 3D printers that could print electronics, metal, ceramics and plastics these issues would go away. There are a few preliminary things that need to be done in order for all this to work.

  1. Part manufacturers need to provide a CAD drawing of any and all parts that go end-of-life.
  2. Component manufacturers need to provide detailed CAD drawings of all parts that go into their components that are going end-of-life or end-of-service.
  3. System manufacturers need to provide detailed CAD drawings of all components and parts that go into their systems that are going end-of-life or end-of-service.
  4. Bicycle, automobile, railway, aircraft, tractor, etc., designers need to provide detailed CAD drawings of all the parts, components and assemblies that go into their vehicles that are going end-of-life or end-of-service.

I could go on but you get the picture. With proper CAD drawings and appropriate 3D printers there should never be another problem with end-of-life mechanical or optical parts.

How to get manufacturers to go along?

There would need to be some sort of agreement on the CAD format for such archive information. And there would need to be some teeth behind the proposal to get manufacturers and vendors to provide this information, say from some large 3-letter organization that could start the ball rolling.

They would need to begin demanding as part of all new contracts for equipment purchases that detailed CAD information be put in escrow and made available to them when the systems, components and parts go end-of-life or end-of-service. Of course the system vendors and component manufacturers might want to get ahead of this curve and start demanding this of all their suppliers in anticipation of such.

The escrowed CAD information could easily be licensed to certain customers rather than just given away for free. Possibly this could be provided to independent service organizations as well to service the equipment long after the product is end-of-life and out-of-service.

The problem is that most system vendors and parts manufacturers would rather their customers purchase their more current parts and systems as that’s where they make most of their revenue, not servicing older equipment. But if part, component and system suppliers can provide enough of a benefit with the newer equipment this should just enable other customers that couldn’t afford the new equipment to buy the older stuff.

Something similar has to be done with the software code that goes into these systems. The source code needs to be put in escrow just like the CAD drawings. The nice thing about software is it’s easier to manufacture, as long as you have compatible build tools. Maybe the source code for the tools needs to be escrowed too.

What about electronics?

There are some limitations given today’s 3D printers. They currently seem much better at printing mechanical parts rather than electronic ones. But the technology is advancing rapidly. The Economist article indicated that there was a company, Optomec, based in Albuquerque, New Mexico, which can print electronics with features as small as 10 microns across. Today’s chip technology is around 20nm or so, which means they are off by a couple of orders of magnitude.

But we are talking about parts obsolescence. While 20nm electronic 3D printing might be 20 or more years away, it’s not outside the realm of possibility, and of course 20 years from now the electronics (if we keep on Moore’s curve) could be over 1000X smaller than 20nm. So there may be a significant gap for a while yet, at least until Moore’s curve starts slowing down.

But maybe there’s another solution to electronics parts obsolescence. If manufacturers were required to supply a detailed gate layout or other electrical design documentation for all electronic parts, then perhaps some ingenious hardware engineer could implement the parts in FPGAs or something similar. It seems to me that today’s ASICs should be able to be converted to FPGAs available 5-10 years down the line. If this wouldn’t work, maybe some foundry could take the designs and fabricate them. 5-year-old electronic technology should be easy to make. It might be costly at small volumes, but it should work.

Analog parts are another matter and I am no hardware engineer so have no idea what proper documentation for these electronics would be. Certainly, there is some standard that could be used, and 5 year old analog parts ought to be easy to make too.

But mechanical and electronics aren’t the only parts today

That leaves a couple of wide areas of materials that are used in every day systems such as magnetic media, fabrics, and other meta-materials that are fabricated with special technologies to perform differently. I suppose some chemical formula and process description might suffice to describe these items and maybe someday 3D printers could take these items on as well.

I am surely missing something. Cables seem like they should be easy, just a combination of metal and plastic, and the connectors should be more of the same, only in different configurations. I assume that 3D printers should have no problem with optics, but that may be naiveté on my part.

~~~~
Maybe this will take some time to work its way through the parts, component, and system suppliers before it can reach reality. Maybe today’s 3D printers aren’t up to creating all these parts. But as 3D printing technology matures, there will surely come a point in time where we will see the end of obsolescence, end-of-life and end-of-service.

Comments?

Picture Credits: Makerbot Industries – Replicator 2 – 3D-printer 04
Creative Tools

Who’s the next winner in data storage?

Strange Clouds by michaelroper (cc) (from Flickr)

“The future is already here – just not evenly distributed”, W. Gibson

It starts as it always does outside the enterprise data center. In the line of businesses, in the development teams, in the small business organizations that don’t know any better but still have an unquenchable need for data storage.

It’s essentially an Innovator’s Dilemma situation. The upstarts are coming into the market at the lower end, lower margin side of the business that the major vendors don’t seem to care about, don’t service very well and are ignoring to their peril.

Yes, it doesn’t offer all the data services that the big guns (EMC, Dell, HDS, IBM, and NetApp) have. It doesn’t offer the data availability and reliability that enterprise data centers have come to require from their storage. And it doesn’t have the performance of major enterprise data storage systems.

But what it does offer is lower CapEx, unlimited scalability, and much easier to manage and adopt data storage, albeit using a new protocol. It does have some inherent, hard to get around problems, not the least of which are speed of data ingest/egress, highly variable latency and eventual consistency. There are other problems which are more easily solvable, with work, but the three listed above are intrinsic to the solution and need to be dealt with systematically.

And the winner is …

It has to be cloud storage providers, and the big elephant in the room has to be Amazon. I know there’s a lot of hype surrounding AWS S3 and EC2, but you must admit that they are growing, doubling year over year. Yes, they are starting from a much lower capacity point and yes, they are essentially providing “rentable” data storage space with limited or even non-existent storage services. But they are opening up whole new ways to consume storage that never existed before. And therein lies their advantage and threat to the major storage players today, unless they act to counter this upstart.

On AWS’s EC2 website there must be four dozen different applications that can be fired up in the matter of a click or two. When I checked out S3, you only need to sign up and identify a bucket name to start depositing data (files, objects). After that, you are charged for the storage used, data transfer out (data in is free), and the number of HTTP GETs, PUTs, and other requests that are done, on a per-month basis. The first 5GB is free and comes with a judicious amount of gets, puts, and outbound data transfer bandwidth.
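As an illustration of how little is involved in getting started, here’s a minimal sketch of bucket-and-object access using the boto3 Python SDK (the bucket and key names are placeholders, and AWS credentials are assumed to be configured already):

```python
# Minimal S3 usage sketch with boto3 -- bucket/key names are placeholders
import boto3

s3 = boto3.client("s3")

# Create a bucket (bucket names are globally unique, so this one is hypothetical)
s3.create_bucket(Bucket="my-example-bucket-name")

# PUT an object: you're charged for the storage used and for the request itself
s3.put_object(Bucket="my-example-bucket-name", Key="hello.txt",
              Body=b"hello, cloud storage")

# GET it back: the GET request and data transfer out are what get billed
obj = s3.get_object(Bucket="my-example-bucket-name", Key="hello.txt")
print(obj["Body"].read())
```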

… but how can they attack the enterprise?

Aside from the three systemic weaknesses identified above, for enterprise customers they seem to lack enterprise security, advanced data services and high availability storage. Yes, NetApp’s Amazon Direct addresses some of the issues by placing enterprise-owned, secured and highly available storage where it can be accessed by EC2 applications. But to really take over and make a dent in enterprise storage sales, Amazon needs something with enterprise class data services, availability and security with an on-premises storage gateway that uses and consumes cloud storage, i.e., a cloud storage gateway. That way they can meet or exceed enterprise latency and services requirements at something that approximates S3 storage costs.

We have talked about cloud storage gateways before but none offer this level of storage service. An enterprise class S3 gateway would need to support all storage protocols, especially block (FC, FCoE, & iSCSI) and file (NFS & CIFS/SMB). It would need enterprise data services, such as read-writeable snapshots, thin provisioning, data deduplication/compression, and data mirroring/replication (synch and asynch). It would need to support standard management configuration capabilities, like VMware vCenter, Microsoft System Center, and SMI-S. It would need to mask the inherent variable latency of cloud storage through memory, SSD and hard disk data caching/tiering. It would need to conceal the eventual consistency nature of cloud storage (see link above). And it would need to provide iron-clad data security for cloud storage.
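To illustrate just one of those requirements, masking cloud latency with local caching/tiering, here is a toy write-back cache sketch in Python. It is purely conceptual: the class and method names are made up, and a real gateway would add persistent tiers, eviction policy, consistency handling and all the data services listed above:

```python
# Toy write-back cache in front of an object store -- purely conceptual;
# names are hypothetical, and real gateways need persistence, eviction,
# consistency handling, dedup, replication, etc.
class WriteBackGatewayCache:
    def __init__(self, backend):
        self.backend = backend   # e.g., an S3-like client (hypothetical interface)
        self.cache = {}          # in-memory tier; real gateways add SSD/HDD tiers
        self.dirty = set()       # keys written locally but not yet flushed to cloud

    def write(self, key, data):
        # Acknowledge the write at local (low) latency, flush to the cloud later
        self.cache[key] = data
        self.dirty.add(key)

    def read(self, key):
        # Serve hot data locally; only cache misses pay cloud latency
        if key not in self.cache:
            self.cache[key] = self.backend.get(key)
        return self.cache[key]

    def flush(self):
        # Background destage of dirty data to the cloud object store
        for key in list(self.dirty):
            self.backend.put(key, self.cache[key])
            self.dirty.discard(key)
```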

It would also need to be enterprise hardened, highly available and highly reliable. That means dually redundant, highly serviceable hardware FRUs, concurrent code load, and multiple controllers with multiple, independent, high speed links to the internet. Today’s highly available data storage requires multi-path storage networks, multiple independent power sources and resilient cooling, so adding multiple independent, high-speed internet links to use Amazon S3 in the enterprise is not out of the question. In addition to the highly available and serviceable storage gateway capabilities described above, it would need to supply high data integrity and reliability.

Who could build such a gateway?

I would say any of the major and some of the minor data storage players could easily do an S3 gateway if they desired. There are a couple of gateway startups (see link above) that have made a stab at it but none have it quite down pat or to the extent needed by the enterprise.

However, the problem with standalone gateways from other, non-Amazon vendors is that they could easily support other cloud storage platforms and most do. This is great for gateway suppliers but bad for Amazon’s market share.

So, I believe Amazon has to invest in its own storage gateway if they want to go after the enterprise. Of course, when they create an enterprise cloud storage gateway they will piss off all the other gateway providers and will signal their intention to target the enterprise storage market.

So who is the next winner in data storage? I have to believe it’s going to be, and already is, Amazon. Even if they don’t go after the enterprise, which I feel is the major prize, they have already carved out an unbreachable market share in a new way to implement and use storage. But when (not if) they go after the enterprise, they will threaten every major storage player.

Yes but what about others?

Arguably, Microsoft Azure is in a better position than Amazon to go after the enterprise. Since their acquisition of StorSimple last year, they already have a gateway that, with help, could be just what they need to provide enterprise class storage services using Azure. And they already have access to the enterprise, and already have the services, distribution and go-to-market capabilities that address enterprise needs and requirements. Maybe they have it all, but they are not yet at the scale of Amazon. Could they go after this? Certainly, but will they?

Google is the other major unknown. They certainly have the capability to go after enterprise cloud storage if they want. They already have Google Cloud Storage, which is priced under Amazon’s S3 and provides similar services as far as I can tell. But they have even farther to go to get to the scale of Amazon. And they have less of the marketing, selling and service capabilities that are required to be an enterprise player. So I think they are the least likely of the big three cloud providers to be successful here.

There are many other players in cloud services that could make a play for enterprise cloud storage and emerge out of the pack, namely Rackspace, Savvis, Terremark and others. I suppose DropBox, Box and the other file sharing/collaboration providers might also be able to take a shot at it, if they wanted. But I am not sure any of them have enterprise storage on their radar just yet.

And I wouldn’t leave out the current major storage, networking and server players as they all could potentially go after enterprise cloud storage if they wanted to. And some are partly there already.

Comments?

 
