Blockchains at IBM

I attended IBM Edge 2016 (videos available here, login required) this past week, and there was a lot of talk about their new blockchain service available on z Systems (LinuxONE).

IBM’s blockchain software/service is based on the open source Hyperledger project (originally known as the Open Ledger project).

Blockchains explained

We have discussed blockchains before (see my post on BlockStack). Blockchains can be used to implement an immutable ledger useful for smart contracts, electronic asset tracking, secured financial transactions, etc.

BlockStack was being used to implement a Public Key Infrastructure (PKI) and a worldwide, distributed file system.

IBM’s Blockchain-as-a-service offering has plugin-based consensus that can use supermajority rules (2/3+1 of the members of a blockchain must agree to ledger contents) or consensus based on the parties to a transaction (e.g., supplier and user of a component).
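To make the supermajority rule concrete, here’s a minimal sketch of the 2/3+1 check a pluggable consensus module might apply before committing a block. This is my own illustration of the rule, not IBM or Hyperledger code:

```python
# Minimal sketch of a pluggable supermajority check (illustrative only,
# not IBM/Hyperledger code): a block commits once 2/3 + 1 of members agree.

def supermajority_threshold(member_count: int) -> int:
    """Votes needed under a 2/3 + 1 rule."""
    return (2 * member_count) // 3 + 1

def can_commit(votes_for: int, member_count: int) -> bool:
    return votes_for >= supermajority_threshold(member_count)

# Example: a 10-member blockchain needs 7 agreeing members to commit.
print(supermajority_threshold(10))            # 7
print(can_commit(6, 10), can_commit(7, 10))   # False True
```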

Bitcoin (an early form of blockchain) uses consensus based on miners (performing hard cryptographic calculations) to determine the shared state of its ledger.

There can be any number of blockchains in existence at any one time. Microsoft Azure also offers Blockchain as a service.

The potential for blockchains is enormous and very disruptive to middlemen everywhere. Anything a ledger is used to keep track of today (assets, information, money, etc.) that undergoes transformations, transitions or transactions as it is further refined, produced and changes hands can be easily tracked in a blockchain.  The only question is whether these assets, information, currency, etc. can be digitally fingerprinted and whether that fingerprint can be read and verified. If so, then blockchains can be used to track them.
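As a rough illustration of what “digitally fingerprinted and tracked” might look like, here is a toy append-only ledger that hash-chains each asset event to the previous one. It’s a sketch of the idea with a made-up fingerprint value, not any vendor’s implementation:

```python
import hashlib, json, time

# Toy hash-chained ledger: each entry carries the asset's digital fingerprint
# and the hash of the prior entry, so any tampering breaks the chain.
ledger = []

def record_event(asset_fingerprint: str, event: str) -> dict:
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    entry = {"fingerprint": asset_fingerprint, "event": event,
             "time": time.time(), "prev": prev_hash}
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()).hexdigest()
    ledger.append(entry)
    return entry

# "part:ABC123" is a hypothetical fingerprint value for illustration.
record_event("part:ABC123", "shipped by supplier")
record_event("part:ABC123", "received by shipper")
```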

New uses for Blockchain

IBM showed a demo of their new supply chain management service, based on the z Systems blockchain, in action.  IBM component suppliers record when they ship component(s), shippers record when they receive the component(s), port authorities record when components arrive at port, and shippers record when parts clear customs and when they arrive at IBM facilities. I’m not sure whether every one of these transitions was recorded, but there were a number of records for each component shipment from supplier to IBM warehouse. This service is live and being used by IBM and its component suppliers right now.

Leanne Kemp, CEO of Everledger, presented another example at IBM Edge (presumably built on the z Systems Hyperledger service) used to track diamonds from mining, to cutter, to polishing, to wholesaler, to retailer, to purchaser, and beyond. Apparently each diamond has a digital bar code/fingerprint/signature that’s imprinted microscopically on the diamond during processing and can be used to track the diamond throughout the processing chain, all the way to the end-user. This diamond blockchain is used for fraud detection, verification of ownership and to digitally certify that the diamond was produced in accordance with the Kimberley Process.

Everledger can also be used to track any other asset that can be digitally fingerprinted as it flows from creation, to factory, to wholesaler, to retailer, to customer, and after purchase.

Why z System blockchains

What makes z Systems a great way to implement blockchains is its securely isolated partitioning and advanced cryptographic capabilities, such as accelerated hashing and signing and hardware-based encryption, which speed up blockchain processing.  z Systems also has FIPS 140-2 Level 4 certification, which can provide the highest security possible for blockchain and other security-based operations.

From IBM’s perspective, blockchains speak to the advantages of the mainframe environment. Blockchains are compute-intensive, they require sophisticated cryptographic services and they represent formal systems of record, all traditional strengths of z Systems.

Aside from the service offering, IBM has made numerous contributions to the Hyperledger project. I assume one could just download the z Systems code and run it in any LinuxONE processing environment. Also, since Hyperledger is Linux based, it could just as easily run on any OpenPower server running an appropriate version of Linux.

Blockchains will be used to maintain the systems of record of the future, just as mainframes maintain the systems of record of today and the past.

Comments?

 

EMCWorld2015 Day 2&3 news

Some additional news from EMCWorld2015 this week:

EMC announced directed availability for DSSD, their rack-scale shared flash storage solution using a PCIe3 (switched) fabric with 36 dual-ported flash modules, which hold 512 NAND chips for 144TB of NAND flash storage. On the stage floor they had a demonstration pitting a 40-node Hadoop cluster with DAS against a 15-node Hadoop cluster using the DSSD, both running Hive and working on the same query. By the time the 40-node/DAS solution got to about 2% of query completion, the 15-node/DSSD-based cluster had finished the query without breaking a sweat. They then ran an even more complex query and it took no time at all.

They also simulated a copy of a 4TB file (~32K-128K IOs) from memory to memory, and it took literally seconds. They then copied it to SSD, which took considerably longer (didn’t catch how long, but much longer than memory), and then they showed the same file copy to DSSD, and it only took seconds, just a smidgen slower than the memory-to-memory copy.

They said the PCIe fabric (no indication what the driver was) provided so much more parallelism to the dual-ported flash storage that the system was almost able to complete the 4TB copy at memory-to-memory speeds. It was all pretty impressive, albeit a simulation of the real thing.
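Some rough arithmetic on what “a 4TB copy in seconds” implies for throughput. The actual copy times weren’t disclosed, so the durations below are purely my assumptions:

```python
# Back-of-envelope throughput for the simulated 4TB copy.
# The demo's actual timings weren't disclosed; these durations are assumptions.
file_tb = 4
for assumed_seconds in (5, 10, 30):
    gb_per_sec = file_tb * 1000 / assumed_seconds
    print(f"4TB in {assumed_seconds:>2}s -> ~{gb_per_sec:,.0f} GB/s")
# 4TB in  5s -> ~800 GB/s; in 10s -> ~400 GB/s; in 30s -> ~133 GB/s
```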

EMC indicated that they designed the flash modules themselves and expect to double the capacity of the DSSD to 288TB shortly. They showed the controller board, which had a mezzanine board over part of it; together they had 12 major chips, which I assume have something to do with the PCIe fabric. They said there were two controllers in the system for high availability and that the 144TB DSSD was deployed in 5U of space.

I can see how this would play well for real-time analytics, high frequency trading and HPC environments, but there’s more to shared storage than just speed. Cost wasn’t mentioned and neither was the software driver, but given the ease with which it worked on the Hive query, I can only assume that at some level it must look something like a DAS device but with memory access times… NVMe anyone?

Project CoprHD was announced, which open sources EMC’s ViPR Controller software. Many ViPR customers were asking EMC to open source the ViPR Controller; apparently they’re listening. Hopefully this will enable some participation from non-EMC storage vendors, allowing their storage to be brought under the management of the ViPR Controller. I believe the intent is to have an EMC hardened/supported version of ViPR Controller coexist with the open source Project CoprHD version, which anyone can download and modify for themselves.

A non-production, downloadable version of ScaleIO was also announced. The test-dev version is a free download with unlimited capacity, full functionality and availability for an unlimited time, but only for non-production use.  Another of the demos onstage this morning was Chad configuring storage across a ScaleIO cluster and using its QoS services to limit the impact of a specific workload. There was talk that ScaleIO was available previously as a free download, but it took a bunch of effort to find and download. They have removed all these prior hindrances, and soon, if not today, it will be freely available to anyone. ScaleIO runs on VMware and other hypervisors (maybe bare metal as well). So if you wanted to get your feet wet with software defined storage, this sounds like the perfect opportunity.

ECS is being added to EMC’s Data Lake foundation. I’m not exactly sure what all the components of the Data Lake solution are, but previously the only Data Lake storage was Isilon based. This week EMC added Elastic Cloud Storage to the picture. Recall that Elastic Cloud Storage comes in either a software-only or hardware appliance deployment and provides object storage.

I missed Project Liberty before, but it’s a virtual VNX appliance, a software-only version.  I assume this is intended for ROBO deployments or very low-end business environments. Presumably it runs on VMware and has some sort of storage limitations. It seems more and more EMC products are coming out in virtual appliance versions.

Project Falcon was also announced, which is a virtual Data Domain appliance, a software-only solution targeted at ROBO environments and other small enterprises. The intent is to have an onramp for Data Domain backup storage.  I assume it runs under VMware.

Project Caspian – rolling out CloudScaling orchestration/automation for OpenStack deployments. On the big stage today, Chad and Jeremy demonstrated Project Caspian on a VCE VxRACK, deploying racks of servers under OpenStack control. Within a couple of clicks they were able to define and deploy OpenStack on bare-metal hardware and deploy applications to the OpenStack servers. They had a monitoring screen which showed the OpenStack server activity (transactions) in real time, and they showed an over-commit of the rack and how easy it was to add a new rack with more servers. All this seemed to take but a few clicks. The intent is not to create another OpenStack distribution but to provide an orchestration/automation/monitoring layer of software on top of OpenStack to “industrialize OpenStack” for enterprise users. Looked pretty impressive to me.

I would have to say the DSSD box was the most impressive. It would have been interesting to get an up-close look at the box, with some more specifications, but they didn’t have one on the Expo floor.

Existential threats

Not sure why but lately I have been hearing a lot about existential events. These are events that threaten the existence of humanity itself.

Massive Solar Storm

A couple of days ago I read about the Carrington Event, which was a massive geomagnetic solar storm in 1859. Apparently it wreaked havoc with the communications infrastructure of the time (telegraphs). Researchers have apparently been able to discover other similar events in earth’s history by analyzing ice cores from Greenland, which indicate that events of this magnitude occur about once every 500 years and that smaller events typically occur multiple times per century.

It’s unclear to me what a solar storm of the magnitude of the Carrington Event would do to the world as we know it today, but we are much more dependent on electronic communications, radio, electric power, etc. If such an event were to take out 50% of our electromagnetic infrastructure, frying power transformers, radio transceivers, magnetic storage/motors/turbines, etc., civilization as we know it would be set back to the mid-1800s, but with a 21st century population.

This would last until we could rebuild all the lost infrastructure, at tremendous cost. During this time we would be dependent on animal-human-water power, paper-optical based communications/storage, and animal-wind transport.

It appears that any optical based communication/computer systems would remain intact but powering them would be problematic without working transformers and generators.

One article (couldn’t locate this) stated that the odds of another Carrington Event happening are 12% by 2022. But the ice core research seems to indicate that they should be higher than this. By my reckoning, it’s been 155 years since the last event, which means we are ~1/3rd of the way through the next 500 years, so I would expect the probability of a similar event happening to be ~1/3 at this point, rising slightly every year until it happens again.
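Here’s a quick sketch of that reckoning, plus an alternative “memoryless” view assuming one event per ~500 years on average, just to show how much the answer depends on the model chosen:

```python
import math

# Two rough ways to reckon the odds of another Carrington-class event,
# assuming one event per ~500 years on average (per the ice core research).
years_since_last = 155
mean_interval = 500

# Simple elapsed-fraction reckoning (as in the text): ~31%, i.e. ~1/3.
print(f"Elapsed fraction: {years_since_last / mean_interval:.0%}")

# Memoryless (Poisson) model: probability of >=1 event over the 155 years elapsed.
print(f"Poisson estimate: {1 - math.exp(-years_since_last / mean_interval):.0%}")  # ~27%
```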

Superintelligence

I picked up a copy of Superintelligence: Paths, Dangers, Strategies by Nick Bostrom last week and started reading it last night. It’s about the dangers of AI gaining the ability to improve itself and, after that, becoming not just equivalent to human-level machine intelligence (HLMI) but greatly exceeding HLMI at a super-HLMI level (superintelligent). This means some superintelligent entity that would have more intelligence than our entire current human population, by many orders of magnitude.

Bostrom discusses the takeoff processes that would lead to superintelligence and some of the ways we could hope to control it. But his belief is that trying to install any of these controls after it has reached HLMI would be fruitless.

I haven’t finished the book, but what I have read so far has certainly scared me.

Bostrom presents three scenarios for a superintelligence takeoff: slow takeoff, medium takeoff and fast takeoff. He believes that in a slow takeoff scenario there may be many opportunities to control the emerging superintelligence. In a moderate or medium takeoff, we would know that something is wrong but would have only a limited opportunity to control it. In a fast takeoff (literally 18 months from HLMI to superintelligence in one scenario Bostrom presents), the likelihood of controlling it after it starts is non-existent.

The latter half of Bostrom’s book discusses potential control mechanisms and other ways to moderate the impacts of superintelligence.  So far I don’t see much hope for mankind in the controls he has proposed. But I am only halfway through the book and hope to see more substantial mechanisms in the 2nd half.

In the end, any superintelligence could substantially alter the resources of the world, and the impact this would have on humanity is essentially unpredictable. But by looking at recent history, one can see how other species have fared as humanity has altered the resources of the earth. Humanity’s rise has led to massive species die-offs for any species that happened to lie in the way of human progress.

The first part of Bostrom’s book discusses some estimates as to when the world will reach AI with HLMI. Most experts believe that we will see HLMI with a 90% probability by the year 2075 and a 50% probability by the year 2050. As for the duration of the takeoff to superintelligence, expert opinions are mixed, and he believes that they highly underestimate the speed of takeoff.

Humanity’s risks

The search for extraterrestrial intelligence has so far found nothing. One of the parameters for the odds of a successful search is the number of inhabitable planets in the universe. But another parameter is the ability of a technological civilization to survive long enough to be noticed – the likelihood that a civilization survives any existential risk that comes up.

Superintelligence and massive solar storms represent just two such risks but there are a multitude of others that can be identified today, and tomorrow’s technological advances will no doubt give rise to more.

Existential risks like these are ever-present and appear to be growing as our technological prowess grows. My only problem is that today the study of existential risks seems, at best, ad hoc and, at worst, outright disregarded.

I believe the best policy is to recognize known existential risks and have some intelligent debate on how probable they are and how we could potentially check them. There really needs to be some systematic study of existential risks around the world, bringing academics and technologists together to understand and to mitigate them. The threats to humanity are real; we can continue to ignore them, study the few that gain human interest, or actively seek out and mitigate all of them we can.

Comments?

Photo Credit(s): C3-class Solar Flare Erupts on Sept. 8, 2010 [Detail] by NASA Goddard’s space flight center photo stream

RoS video interview with Ron Redmer Sr. VP Cybergroup

Ray interviewed Ronald Redmer, Sr. VP Cybergroup at EMC’s Global Analyst Summit back in October. Ron is in charge of engineering and product management of their new document analytics service offering. Many of their new service offerings depend on EMC Federation solutions such as ViPR (see my post EMC ViPR virtues & vexations but no virtualization), Pivotal HD, and other offerings.

This was recorded on October 28th in Boston.

New Global Learning XPrize opens

Read a post this week in Gizmag about the new Global Learning XPrize. Past XPrize contests have dealt with suborbital spaceflight, super-efficient automobiles,  oil cleanup, and  lunar landers.

Current open XPrize contests include: Google Lunar Lander, Qualcomm Tricorder medical diagnosis, Nokia Health Sensing/monitoring and Wendy Schmidt Ocean Health Sensing. So what’s left?

World literacy

There are probably a host of issues that the next XPrize could go after, but one that might just change the world is improving childhood literacy.  According to UNESCO (2nd Global Report on Adult Learning and Education [GRALE 2]), there are over 250M children of primary school age around the world that will not reach grade 4 levels of education; these children cannot read, write or do basic arithmetic. Given current teaching methods, we would need an additional 1.6M teachers to teach all these children. As such, teaching all these children, when we include teacher salaries, classroom spaces, supplies, etc., would be highly expensive. There has to be a better, more scalable way to do this.

Enter the Global Learning XPrize. The intent of this XPrize is to create a tablet application which can teach children how to read, write and do rudimentary arithmetic in 18 months without access to a teacher or other supervised learning.

Where are they in the XPrize process?

The Global Learning XPrize has already raised $15M for the actual XPrize, but they are using a crowdfunding approach to raise the last $500K, which will be used to field test the Global Learning XPrize candidates. The crowdfunding is being done on Indiegogo.

Registration starts now and runs through March 2015. Software development runs through September 2016, at which time five finalists will be selected; each will receive a $1M finalist XPrize to fund a further round of coding. In May of 2017 the five apps will be loaded onto tablets, and field testing runs from June 2017 through December 2018, at which time the winner will be selected and will receive the $10M XPrize.

What other projects have been tried?

I once read an article about the Hole in the Wall computer, where NIIT and their technologists placed an outdoor-hardened, internet-connected computer in a brick wall in an underprivileged area of India. The intent was to show that children could learn how to use computers on their own, without adult supervision. Within days children were able to “browse, play games, create documents and paint pictures” on the computer. So minimally invasive education (MIE) can be made to work.

What’s the hardware environment going to look like

There’s no reason that an Android tablet would be any worse, and it potentially could be much better, than an internet-connected computer.

Although the tablets will be internet connected, it is assumed that the connection will not always be on, so the intent is that the apps run standalone as much as possible. Also, I believe each child will be given a tablet for their exclusive use during the 18 months. The Global Learning XPrize team will ensure that there are charging stations where the tablets can be charged once a day, but we shouldn’t assume that they can be charged while they are being used.
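For an intermittently connected tablet, the usual pattern is offline-first: keep lessons and progress local and sync opportunistically. Here’s a minimal sketch of that idea, with hypothetical file and function names of my own choosing; it’s not anything prescribed by the XPrize rules:

```python
import json, os

# Offline-first progress tracking sketch: lesson results queue up locally and
# are flushed whenever a network connection happens to be available.
QUEUE_FILE = "progress_queue.json"   # hypothetical local store on the tablet

def record_progress(child_id: str, lesson: str, score: float) -> None:
    queue = []
    if os.path.exists(QUEUE_FILE):
        with open(QUEUE_FILE) as f:
            queue = json.load(f)
    queue.append({"child": child_id, "lesson": lesson, "score": score})
    with open(QUEUE_FILE, "w") as f:
        json.dump(queue, f)

def sync_if_connected(connected: bool, upload) -> None:
    """Flush queued results through the supplied upload() callback when online."""
    if not connected or not os.path.exists(QUEUE_FILE):
        return
    with open(QUEUE_FILE) as f:
        for entry in json.load(f):
            upload(entry)
    os.remove(QUEUE_FILE)
```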

How are the entries to be judged

The finalists will be judged against EGRA (early grade reading assessment), EGWA (early grade writing assessment), and EGMA (early grade math assessment). The chosen language is to be English, and the intent is to use children in countries which have an expressed interest in using English. The grand winner will be judged to have succeeded if its 7 to 12 year old students can score twice as well on the EGRA, EGWA and EGMA as a control group. [Not sure what a control group would look like for this, nor what they would be doing during the 18 months.] For more information check out the XPrize guidelines v1 pdf.

The assumption is that there will be about 30 children per village and enough villages will be found to provide a statistically valid test of the five learning apps against a control group.

At the end of all this, the winning entry and the other four finalists will have their solutions open sourced, for the good of the world.

Registration is open now…

Entry applications are $500. Finalists win $1M and the winner will take home $10M.

I am willing to put up the $500 application fee for the Global Learning XPrize. Having never started an open source project, never worked on developing an Android tablet application, or done anything other than some limited professional training this will be entirely new to me – so it should be great fun.  I am thinking of creating a sort of educational video game  (yet another thing I have no knowledge about, :).

We have until March of 2015 to see if we can put a team together to tackle this. I think if I can find four other (great) persons to take this on, we will give it a shot. I hope to enter an application by February of 2015, if we can put together a team by then to tackle this.

Anyone interested in tackling the Global Learning XPrize as an open source project from the gitgo, please comment on this post to let me know.

Photo Credit(s): Kid iPad outside by Alice Keeler

Token ring road traffic control and congestion management

Read an article the other day in Wired, A system to cut traffic that just might work, about two MIT students doing research to help Singapore better manage traffic congestion. They have come up with a sort of token ring network for traffic.

In their approach, every car is supplied an electronic token when it enters a “congestion zone”, and when that car leaves the zone it retires its token (sound familiar?). When the zone is too congested, no new tokens are handed out and cars are re-routed around the zone using GPS-provided directions.

It seems a bit hokey but using tokens to control congestion is an old technology and works just fine. The problem with applying tokens to controlling road congestion is that it’s not so easy to re-route someone around a zone if you have to go into it for work or entertainment.
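In code, the basic mechanics look something like this. It’s a toy sketch of the token concept as I understand it, not the MIT researchers’ actual system:

```python
# Toy congestion-zone token pool: cars get a token on entry and retire it on
# exit (or when parked); when no tokens remain, new arrivals are re-routed.
class CongestionZone:
    def __init__(self, name: str, capacity: int):
        self.name = name
        self.tokens_free = capacity
        self.cars_inside = set()

    def request_entry(self, car_id: str) -> bool:
        if self.tokens_free == 0:
            return False           # caller should re-route around the zone
        self.tokens_free -= 1
        self.cars_inside.add(car_id)
        return True

    def leave_or_park(self, car_id: str) -> None:
        if car_id in self.cars_inside:
            self.cars_inside.remove(car_id)
            self.tokens_free += 1  # token retired, freed for another car

zone = CongestionZone("downtown", capacity=2)
print(zone.request_entry("car-1"), zone.request_entry("car-2"), zone.request_entry("car-3"))
# True True False  -> car-3 gets GPS directions around the zone
```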

Traffic congestion management today

Most congestion management schemes use congestion toll pricing, with transponders and radio transmitters/receivers at entry points into congestion zones. In this fashion metropolitan areas can raise and lower toll pricing on traffic that enters the zone as an incentive to reduce traffic. But this requires special-purpose transponders in every car and radio towers at every entry and exit point, which fixes the congestion zone boundaries and carries a high initial fixed cost.

Singapore’s congestion approach is similar with transponders and radio readers at select entry and exit point locations around the city.

Traffic management via tokens

What the MIT researchers have done is use a broader, WiFi-type radio transmitter with a wider range in their car transponders, along with cell tower-like receivers around a metro area, to triangulate where a car is, determine when it’s in a congestion zone, and transfer this information to a central repository.

One advantage of the MIT solution is that the congestion zones are no longer fixed, but can become whatever boundary a city administrator wants to create on a map of the city. This way, different zones could be attempted as experiments whenever it made sense to do so.  It’s sort of like having a completely configurable congestion zone that can be turned on and off based on the requirements of the moment. And the zones don’t even have to be polygons; any closed shape that could be drawn on a map could represent a new zone.  And of course you could have multiple layers of zones. All this could be almost instantly configurable and trial-able on a whim, like a software defined traffic management (SDTM) system.
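And since the zones are just shapes drawn on a map, deciding whether a triangulated car position falls inside one is a standard point-in-polygon test. A small sketch (illustrative only, not their implementation):

```python
# Ray-casting point-in-polygon test: is a car's (x, y) position inside a zone
# defined by an arbitrary closed polygon drawn on the map?
def in_zone(point, polygon):
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            # x-coordinate where this polygon edge crosses the point's y level
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

downtown = [(0, 0), (4, 0), (4, 3), (0, 3)]   # hypothetical zone boundary
print(in_zone((2, 1), downtown), in_zone((5, 1), downtown))   # True False
```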

I suppose one problem with using SDTM for toll pricing is that people would need to know ahead of time the cost of traveling through a zone. Maybe that’s why the token approach is better: without a token, you are directed to stop or onto another route, outside or around the zone. In one way, tokens could be used as a sort of sophisticated onramp stop signal that only allows passage when a token frees up.

Maybe tokens should be retired not just when you leave a zone but also when you stop moving or when the engine is turned off; that way, as cars are parked, their tokens could be freed up for other cars.

How to get people to go along with token management is another question. Since the system tracks cars automatically, one could automatically fine drivers for violating the token scheme.

~~~~

Thank goodness my commute days are long gone.  I get the feeling it’s going to become a lot more interesting driving to work in the future.

Comments?

Photo Credits: World Class Traffic Jam by JosieShowaa

Protest intensity, world news database and big data – chart of the month

[Chart: time domain run chart showing protest intensity every month for the last 30 years, with running average]

Read an article the other day in Foreign Policy on the analysis of the Arab Spring (Did the Arab Spring really spark a wave of global protests) using a Google Ideas sponsored project, the GDELT Project (Global Database of Events, Language and Tone), a file of events extracted from worldwide media sources.  The GDELT database uses sophisticated language processing to extract “event” data from news media streams and supplies this information in database form.  The database can be analyzed to identify trends in world events and possibly to better understand what led up to events that occur on our planet.

GDELT Project

The GDELT database records over 300 categories of events that are geo-referenced to city/mountaintop and time-referenced. The event data dates back to 1979.  The GDELT data captures 60 attributes of any event that occurs, generating a giant spreadsheet of event information with location, time, parties, and myriad other attributes all identified, and cross-referenceable.

Besides the extensive spreadsheet of world event attribute data, the GDELT project also supplies a knowledge graph oriented view of its event data. The GDELT knowledge graph “compiles a list of every person, organization, company, location and over 230 themes and emotions from every news report” that can then be used to create network diagrams/graphs to better visualize interactions between events.

For example, see the Global Conversation in Foreign Policy for a network diagram of every person mentioned in the news during 6 months of 2013.  You can zoom in or out to see how people identified in news reports are connected during the six months. So if you were interested in, let’s say, the Syrian civil war, you could easily see at a glance any news item that mentioned Syria or was located in Syria from 1979 to now. Wow!

Arab Spring and Worldwide Protest

Getting back to the chart-of-the-month, the graphic above shows the “protest intensity” by month for the last 30 years, with a running average charted in black, using GDELT data.  (It’s better seen in the FP article linked above, where you can click on it for an expanded view.)

One can see from the chart that there was a significant increase in protest activity after January 2011, which corresponds to the beginning of the Arab Spring.  But the amazing inference from the chart above is that this increase has continued ever since. This shows that the Arab Spring has made a lasting contribution that has significantly increased worldwide protest activity.
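As a sketch of how a chart like this could be reproduced from raw GDELT data: the column names below (MonthYear, EventRootCode) follow GDELT’s documented event fields, and CAMEO root code 14 covers protest events, but the file path and the presence of a header row are assumptions on my part:

```python
import pandas as pd

# Sketch: monthly protest-event counts from a tab-delimited GDELT event export.
# "gdelt_events.csv" is a placeholder path; MonthYear and EventRootCode are
# GDELT event-table fields, and CAMEO root code "14" covers protest events.
events = pd.read_csv("gdelt_events.csv", sep="\t", dtype=str)

protests = events[events["EventRootCode"] == "14"]
monthly = protests.groupby("MonthYear").size().sort_index()
running_avg = monthly.rolling(window=12).mean()   # 12-month running average

chart = pd.DataFrame({"protest_events": monthly, "running_avg": running_avg})
print(chart.tail())
```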

This is just one example of the types of research available with the GDELT data.

~~~~

I have talked in the past about how (telecom, social media and other) organizations should deposit their corporate/interaction data sets in some public repository for the better good of humanity so that any researcher could use it (see my Data of the world, lay down your chains post for more on this). The GDELT Project is Google Ideas doing this on a larger scale than I ever thought feasible. Way to go.

Comments?

 Image credits: (c) 2014 ForeignPolicy.com, All Rights Reserved

 

 

Cloud storage growth is hurting NAS & SAN storage vendors

Strange Clouds by michaelroper (cc) (from Flickr)

My friend Alex Teu (@alexteu) from Oxygen Cloud wrote a post today about how Cloud Storage is Eating the World Alive. Alex reports that all major NAS and SAN storage vendors lost revenue this year over the previous year, ranging from a ~3% loss to over a 20% loss (Q1-2014 compared to Q1-2013, from IDC).

Although an interesting development, it’s hard to say that this is the end of enterprise storage as we know it.  I believe there are a number of factors that are impacting  enterprise storage revenues and Cloud storage adoption may be only one of them.

Other trends impacting NAS & SAN storage adoption

One thing that has emerged over the last decade or so is the advance of flash storage. Some of this is used in storage controllers to speed up IO access and some is used in servers to speed up IO access. But any speedup of IO could potentially reduce the need for high-performing disk drives and could allow customers to use higher-capacity/slower disk drives instead. This could definitely reduce the cost of storage systems. A little bit of flash goes a long way to speed up IO access.

The other thing is that disk capacity is trending upward, at exponential rates. Yesterday’s 2TB disk drive is today’s 4TB disk drive, and we are already seeing 6TB drives from Seagate, HGST and others. This is also driving down the cost of NAS and SAN storage.

Nowadays you can configure 1PB of storage with just over 170 drives. Somewhere in there you might want a couple hundred TB of flash to speed up IO access to these slow disks, but flash is also coming down in ($/GB) price (see SanDisk’s recent consumer-grade TLC drive at $0.44/GB). Also, the move to MLC flash has increased the capacity of flash devices, leading to fewer SSDs/flash cache cards needed to store/speed up more data.
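Quick arithmetic behind those numbers (rough, decimal units, ignoring RAID overhead):

```python
# Rough arithmetic: drives needed for 1PB using 6TB drives, and the cost of a
# couple hundred TB of flash at SanDisk's quoted consumer-grade $0.44/GB.
drives_for_1pb = 1000 / 6                  # ~167 drives (ignoring RAID overhead)
flash_cost = 200 * 1000 * 0.44             # 200TB at $0.44/GB = $88,000
print(f"~{drives_for_1pb:.0f} drives, flash ~= ${flash_cost:,.0f}")
```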

Finally, the other trend which seems to have emerged recently is the movement away from enterprise-class storage to server storage. One can see this in VMware’s VSAN, hyperconverged systems such as Nutanix and Scale Computing, as well as a general trend in Windows Server applications (SQL Server, Exchange Server, etc.) to make better use of DAS storage. So some customers are moving their data to shared DAS storage today, whereas before this was more difficult to accomplish effectively, so they purchased networked storage instead.

What about cloud storage?

Yes, as Alex has noted, the price of cloud storage has declined precipitously over the last year or so. Alex’s cloud storage pricing graph shows how the entry of Microsoft and Google has seemingly forced Amazon to match their price reductions. But the other thing of note is that they have all come down to about the same basic price of $0.024/GB/month.

It’s interesting that Amazon delayed their first serious S3 price reductions until about 4 months after Azure and Google Cloud Storage dropped theirs, and then within another month after that, they were all at price parity.
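At that price point, the raw capacity cost is easy to reckon. This is rough arithmetic on the capacity charge only, ignoring request, egress and redundancy-tier fees:

```python
# Rough cost of parking 1PB in cloud object storage at ~$0.024/GB/month
# (capacity charge only; requests, egress and redundancy tiers not included).
price_per_gb_month = 0.024
pb_in_gb = 1_000_000
monthly = price_per_gb_month * pb_in_gb     # $24,000/month
print(f"${monthly:,.0f}/month, ${monthly * 12:,.0f}/year")
```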

What’s cloud storage real growth?

I reported last August that Microsoft Azure and Amazon S3 were respectively storing 8 trillion and over 2 trillion objects (see my post Is object storage outpacing structured and unstructured data growth?). This year (April 2014) Microsoft mentioned at TechEd that Azure was storing 20 trillion objects and servicing 2 million requests per second.

I could find no update to Amazon S3’s numbers from last year, but the 2.5x growth in Azure’s object count in ~8 months and the roughly doubled requests per second (in my post last year I didn’t mention that they were processing 900K requests/second) say something interesting is going on in cloud storage.

I suppose Google’s cloud storage service is too new to report serious results and maybe Amazon wants to keep their growth a secret. But considering Amazon’s recent matching of Azure’s and Google’s pricing, it probably means that their growth wasn’t what they expected.

The other interesting item from the Microsoft discussions on Azure was that they are already hosting 1M SQL databases in Azure and that 57% of the Fortune 500 are currently using Azure.

In the “olden days”, before cloud storage, all these SQL databases and Fortune 500 data sets would more than likely have resided on NAS or SAN storage of some kind. And possibly, due to traditional storage’s higher cost and greater complexity, some of this data would never have been spun up in the first place if they had to use traditional storage; but with cloud storage so cheap, rapidly configurable and easy to use, all this new data was placed in the cloud.

So I must conclude from Microsoft’s growth numbers, and their implication for the rest of the cloud storage industry, that maybe Alex was right: more data is moving to the cloud and this is impacting traditional storage revenues.  With IDC’s (2013) data growth at ~43% per year, it would seem that Microsoft’s cloud storage is growing several times faster than worldwide data growth (a rough annualization, sketched below, puts it at ~7X).
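That ~7X figure is just back-of-envelope arithmetic under a couple of assumptions: the 8 trillion and 20 trillion object counts above, roughly 8 months apart, and IDC’s ~43%/year worldwide data growth figure:

```python
# Rough annualization of Azure's object-count growth vs. worldwide data growth.
# Assumptions: 8 trillion objects in Aug 2013, 20 trillion in Apr 2014 (~8 months),
# and IDC's ~43%/year worldwide data growth figure.
growth_8mo = 20e12 / 8e12                 # 2.5x in ~8 months
growth_1yr = growth_8mo ** (12 / 8)       # ~3.95x per year if the pace held
azure_annual_rate = growth_1yr - 1        # ~295% per year
idc_annual_rate = 0.43                    # ~43% per year
print(f"Azure annualized growth: {azure_annual_rate:.0%}")
print(f"Ratio vs. worldwide data growth: {azure_annual_rate / idc_annual_rate:.1f}x")
```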

On the other hand, if cloud storage were consuming most of the world’s data growth, it would seem to precipitate the collapse of traditional storage revenues, not just a ~3-20% decline. So maybe most new cloud storage applications would never have been implemented if they had to use traditional storage, which means that only some of this new data would ever have been stored on traditional storage in the first place, leading to a relatively smaller decline in revenue.

One question remains: is this a short-term impact or more of a long-running trend that will play out over the next decade or so? From my perspective, new applications spinning up on non-traditional storage is a long-running threat to traditional NAS and SAN storage, which will ultimately see traditional storage relegated to a niche. How big this niche will ultimately be and how well it can be defended will need to be the subject of another post.

~~~~

Comments?