Why EMC is doing Project Lightning and Thunder

Picture of atmospheric lightning striking ground near a building at night
rayo 3 by El Garza (cc) (from Flickr)

Although technically Project Lightning and Project Thunder represent some interesting offshoots of EMC software, hardware and system prowess, I wonder why they would decide to go after this particular market space.

There are plenty of alternative offerings in the PCIe NAND memory card space.  Moreover, the PCIe card caching functionality, while interesting, is not that hard to replicate, and such software capability is not a serious barrier to entry for HP, IBM, NetApp and many, many others.  And the margins cannot be that great.

So why get into this low margin business?

I can see a couple of reasons why EMC might want to do this.

  • Believing in the commoditization of storage performance.  I have had this debate with a number of analysts over the years, but there remain many out there who firmly believe that storage performance will become a commodity sooner rather than later.  By entering the PCIe NAND card IO buffer space, EMC can create a beachhead in this movement that helps them build market awareness, higher manufacturing volumes, and support expertise.  As such, when the inevitable happens and high margins for enterprise storage start to deteriorate, EMC will be able to capitalize on this hard-won operational effectiveness.
  • Moving up the IO stack.  From an application's IO request to the disk device that actually services it is a long journey with multiple places to make money.  Currently, EMC has a significant share of everything that happens after the fabric switch, whether it is FC, iSCSI, NFS or CIFS.  What they don't have is a significant share of the switch infrastructure or anywhere on the other (host) side of that interface stack.  Yes, they have Avamar, Networker, Documentum, and other software that help manage, secure and protect IO activity, together with other significant investments in RSA and VMware.  But these represent adjacent market spaces rather than primary IO stack endeavors.  Lightning represents a hybrid software/hardware solution that moves EMC up the IO stack to inside the server.  As such, it represents yet another opportunity to profit from all the IO going on in the data center.
  • Making big data more effective.  The fact that Hadoop doesn't really need or use high end storage has not been lost on most storage vendors.  With Lightning, EMC has a storage enhancement offering that can readily improve Hadoop cluster processing.  Something like Lightning's caching software could easily be tailored to enhance HDFS file access and thus speed up cluster processing.  If Hadoop and big data are to be the next big consumer of storage, then speeding cluster processing will certainly help, and profiting by doing this only makes sense.
  • Believing that SSDs will transform storage.  To many of us the age of disks is waning.  SSDs, in some form or another, will be the underlying technology for the next age of storage.  The densities, performance and energy efficiency of current NAND based SSD technology are commendable, but they will only get better over time.  The capabilities brought about by such technology will certainly transform the storage industry as we know it, if they haven't already.  But where SSD technology actually emerges is still being played out in the marketplace.  Many believe that when industry transitions like this happen it's best to be engaged everywhere change is likely to happen, hoping that at least some of those bets will succeed.  Perhaps PCIe SSD cards won't take over all server IO activity, but if they do, not being there or being late will certainly hurt a company's chances to profit from it.

There may be more reasons I missed here, but these seem to be the main ones.  Of the above, I think the last one, that SSDs rule the next transition, is the most important to EMC.

They have been successful in the past during other industry transitions.  If anything, their acquisitions have shown similar indications, buying into transitions they don't already own; witness Data Domain, RSA, and VMware.  So I suspect the view inside EMC is that doubling down on SSDs will enable them to ride out the next storm and be in a profitable place for the next change, whatever that might be.

And following Lightning, Project Thunder

Similarly, Project Thunder seems to represent EMC doubling their bet yet again on SSDs.  Just about every month I talk to another storage startup coming to market with a new take on storage using every form of SSD imaginable.

However, Project Thunder as envisioned today is not storage, but rather some form of external shared memory.  I have heard this before, in the IBM mainframe space about 15-20 years ago.  At that time shared external memory was going to handle all mainframe IO processing and the only storage left was going to be bulk archive or migration storage – a big threat to the non-IBM mainframe storage vendors at the time.

One problem then was that the shared DRAM memory of the time was way more expensive than sophisticated disk storage, and the price wasn't coming down fast enough to counteract increased demand.  The other problem was that making shared memory work with all the existing mainframe applications was not easy.  IBM at least had control over the OS, hardware and most of the larger applications at the time.  Yet they still struggled to make it usable and effective; there is probably some lesson here for EMC.

Fast forward 20 years and NAND based SSDs are the right hardware technology to make  inexpensive shared memory happen.  In addition, the road map for NAND and other SSD technologies looks poised to continue the capacity increase and price reductions necessary to compete effectively with disk in the long run.

However, the challenges then and now seem as much to do with the software that makes shared external memory universally effective as with the hardware technology to implement it.  Providing a new storage tier in Linux, Windows and/or VMware is easier said than done.  Most recent successes have usually been offshoots of SCSI (iSCSI, FCoE, etc.).  Nevertheless, if it was good for mainframes then, it's certainly good for Linux, Windows and VMware today.

And that seems to be where Thunder is heading, I think.




How has IBM research changed?

IBM Neuromorphic Chip (from Wired story)

What do Watson, neuromorphic chips and racetrack memory have in common?  They have all emerged out of IBM research labs.

I have been wondering for some time now how a company once known for its cutting-edge research but lack of product breakthroughs has transformed itself into an innovation machine.

There has been a sea change in the research at IBM that is behind the recent productization of technology.

Talking over the past couple of days with various IBMers at STG's Smarter Computing Forum, I have formulated a preliminary hypothesis.

At first I heard that there was a change in the way research is reviewed for product potential.  Nowadays, it almost takes a business case for research projects to be approved and funded.  And the business case needs to contain a plan for how any project will eventually reach profitability.

In the past it was often said that IBM invented a lot of technology but productized only a little of it.  Much of their technology would emerge in other people's products and IBM would not receive anything for their efforts (other than some belated recognition for their research contribution).

Nowadays, it's more likely that research not productized by IBM is at least licensed out after IBM has patented the crucial technologies that underpin the advance.  And if it has anything to do with IT, the project is just as likely to end up as a product.

One executive at STG sees three phases to IBM research spanning the last 50 years or so.

Phase I: The ivory tower

IBM research during the Ivory Tower Era looked a lot like research universities but without the tenure of true professorships. Much of the research of this era was in materials and pure mathematics.

I suppose one example of this period was Mandelbrot and fractals.  Fractals probably had a lot of applications, but few of them ended up in IBM products; mostly they advanced the theory and practice of pure mathematics/systems science.

Such research had little to do with the problems of IT or IBM's customers.  The fact that it created pretty pictures and a way of seeing nature in a different light was an advance for mankind, but it didn't have much, if any, impact on IBM's bottom line.

Phase II: Joint project teams

In IBM research's phase II, the decision process on which research to move forward included people not just from IBM research but also from the product divisions.  At least now there could be a discussion across IBM's various divisions on how the technology could enhance customer outcomes.  I am certain profitability wasn't often discussed, but at least it was no longer purposefully ignored.

I suppose over time these discussions became more grounded in fact and business cases rather than just belief in the value of research for research's sake.  Technology roadmaps and projects were now examined for how well they could impact customer outcomes and how such technology enabled new products and solutions to come to market.

Phase III: Researchers and product people intermingle

The final step in IBM's transformation of research involved the human element.  People started moving around.

Researchers were assigned to the field and to product groups, and product people were brought into the research organization.  By doing this, ideas could cross-fertilize, applications could be envisioned, and the last finishing touches needed by new technology could be funded and implemented.  This probably led to the most productive transition of researchers into product developers.

On the flip side, when researchers returned from their multi-year product/field assignments they brought a new-found appreciation of problems encountered in the real world.  That, combined with their in-depth understanding of where technology could go, helped show the path that could take research projects into new, more fruitful (at least to IBM customers) arenas.  This movement of people provided the final piece in grounding research in areas that could solve customer problems.

In the end, many research projects at IBM may fail, but those that succeed have the potential to change IT as we know it.

I heard today that there are 700 to 800 projects in IBM research.  If any of them have the potential we see in the products shown today, like Watson in healthcare and neuromorphic chips, exciting times are ahead.

Is cloud a leapfrog technology?

Mobile Phone with Money in Kenya by whiteafrican (cc) (from Flickr)

Read an article today about Safaricom creating a domestic cloud service offering outside Nairobi in Kenya (see Chasing the African Cloud).

But this got me to thinking that cloud services may be just like mobile phones: developing countries can use them to skip over older technologies like wired phone lines and gain the advantages of more recent technology that offers similar services, without the expense and time of building telephone wires across the land.

Leapfrogging IT infrastructure buildout

In the USA, cloud computing, cloud storage, and SAAS services based in the cloud are essentially taking the place of small business IT infrastructure services today.  Many small businesses skip over building their own IT infrastructures, absolutely necessary years ago for email, web services, back office processing, etc., and are moving directly to using cloud service providers for these capabilities.

In some cases, it’s even more than  just the IT infrastructure, as the application, data and processing services all can be supplied from SAAS providers.

Today, it's entirely possible to run a complete, very large business without owning a stitch of IT infrastructure (other than desktops, laptops, tablets and mobile phones) by doing this.

Developing countries can show us the way

Developing countries can do much the same for their economic activity.  Rather than have their small businesses spend time building out homegrown IT infrastructure, they can lease it from one or more domestic (or international) cloud service providers and skip the time, effort and cost of doing it themselves.

Hanging out with Kenya Techies by whiteafrican (cc) (from Flickr)

Given this dynamic, cloud service vendors ought to be focusing more time and money on developing countries.  Such countries should adopt these services more rapidly because they don't have the sunk costs in current, private IT infrastructure and applications.

China moves into the cloud

I probably should have caught on earlier.  Earlier this year I was at a vendor analyst meeting, having dinner with a colleague from the China Center for Information Industry Development (CCID) Consulting.  He mentioned that cloud was one of a select set of technologies that China was focusing considerable state and industry resources on.  At the time, I just thought this was prudent thinking to keep up with industry trends.  What I didn't realize was that the cloud could be a leapfrog technology that would help them avoid a massive IT infrastructure build-out in millions of small companies in their nation.

One can see that early adopter nations have understood that with the capabilities of mobile phones they can create a fully functioning telecommunications infrastructure almost overnight.  Much the same can be done with cloud computing, storage and services.

Now if they can only get WiMAX up and running to eliminate cabling their cities for internet access.



Server virtualization vs. storage virtualization

Functional Fusion? by Cain Novocaine (cc) (from Flickr)

One can only be perplexed by the seemingly overwhelming adoption of server virtualization and contrast it with the ho-hum, almost underwhelming adoption of storage virtualization.  Why such a significant difference?

I think the problem is partly due to the lack of a common understanding of storage performance utilization.

Why server virtualization succeeded

One significant driver of server virtualization was the precipitous drop in server utilization that occurred over the last decade when running single applications on a physical server.  It was not unusual to see real processor utilization of less than 10%, so it was easy to envision executing 5-10 applications on a single server.  And what's more, each new generation of server kept getting more powerful, handling double the MIPS every 18 months or so, driven by Moore's law.

The other factor was that application workloads weren't increasing that much.  Yes, new applications would come online, but they seldom consumed an inordinate amount of MIPS and were often similar to what was already present.  So application processing growth, while not flatlining, was expanding at a relatively slow pace.

Why storage virtualization has failed

Data, on the other hand, continues its never-ending exponential growth, doubling every 3-5 years or less.  And having more data almost always requires more storage hardware to support the IOPS needed to access it.

In the past, storage IOPS rates were intrinsically tied to the number of disk heads available to service the load.  Although disk performance grew, it wasn't doubling every 18 months, and real per-disk performance, measured as IOPS per GB, was actually going down over time.

This drove the proliferation of disk spindles, and as such, storage subsystems in the data center.  Storage virtualization couldn't reduce the number of spindles required to support the workload.

Thus, if you look at storage performance from the perspective of the percentage of per-disk IOPS one could support, most sophisticated systems were running anywhere from 75% to 150% (the latter thanks to DRAM caching).

Paradigm shift ahead

But SSDs can change this dynamic considerably.  A typical SSD can sustain 10-100K IOPS, and there is some likelihood that this will increase with each generation that comes out, while application requirements will not increase as fast.  Hence, there is a high likelihood that normal data center utilization of SSD storage performance will start to drop below 50%.  When that happens, storage virtualization may start to make a lot more sense.
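To make the utilization argument concrete, here is a back-of-envelope sketch; all the device and workload numbers below are my own plausible-sounding assumptions, not figures from any vendor:

```python
from math import ceil

def devices_needed(workload_iops, workload_gb, dev_iops, dev_gb):
    """Device count required to satisfy both the IOPS and the capacity demand."""
    return max(ceil(workload_iops / dev_iops), ceil(workload_gb / dev_gb))

# Hypothetical workload: 20K IOPS against 10TB (10,000GB) of data.
disks = devices_needed(20_000, 10_000, dev_iops=150, dev_gb=600)    # enterprise disk
ssds = devices_needed(20_000, 10_000, dev_iops=50_000, dev_gb=400)  # SSD

print(disks, ssds)  # → 134 25
```

With disks, the IOPS term dominates: 134 spindles to hold what would fit on 17 drives by capacity alone, so every spindle runs near its performance limit.  With SSDs, capacity dominates, and the 25 devices together use under 2% of their aggregate IOPS capability, exactly the kind of headroom that storage virtualization could consolidate.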

Maybe when (SSD) data storage starts moving more in line with Moore’s law, storage virtualization will become a more dominant paradigm for data center storage use.

Any bets on who the VMware of storage virtualization will be?


Our long romance with Apple technology


Lisa 2/5 by MattsMacintosh (cc) (from Flickr)

We all heard last night of the passing of Steve Jobs.  But rather than going over his life, I would like here to discuss some of the Apple products I have used over my life and how they affected our family.


I don’t know why but I never got an Apple II. In fact the first time I saw one in use was in the early 80’s. But it certainly looked nifty.

But I was struck with love at first sight when I saw the Lisa, a progenitor of the Mac.  I was at a computer conference in the area which had a number of products on display, but when I saw the Lisa I couldn't see anything else.  It had a 3.5″ floppy drive encased in hard plastic, hardly a floppy anymore.  But the real striking aspect was its screen: a white-background, bit-mapped display that sported great black and white graphics.

At the time, I was using IBM 3270 terminals, which had green lettering on a dark screen, and the only graphics were ones made with rows and columns of asterisks.  To see the graphics pop to life on the Lisa, with different font options and what-you-see-is-what-you-get editing, was just extraordinary.  The only downside was its $10K price.  Sadly, we didn't buy one of these either.

Mac worship

Then the 1984 commercial came out in its Super Bowl spot, the one where Apple was going to free the computing world from the oppression of Big Brother with the introduction of the first Macintosh computer.

We got our hands on one soon after, and my wife used it for her small accounting business and just loved it.  Over time, as she took on partners, their office migrated to business applications that were better suited to PCs, but she stayed on the Mac long after it was sub-optimal, just because it was so easy to use.


Apple Fat Mac by Accretion Disc (cc) (From Flickr)

Ultimately, she moved to a PC  taking her Fat Mac home to be used there instead.  Over the next decade or so we updated the Mac to a color screen and a desktop configuration but didn’t really do much with it other than home stuff.


Then the iMacs came out.  We latched onto the half-basketball one, which had a screen protruding out of it.  We used this for some video and photo editing and just loved it.  Video upload and editing took forever, but there was nothing else out there that could even come close.


Our 1st iMac

I ended up using this machine the first few years after I left corporate America, but also bought a Mac laptop, encased in aluminum, for my business trips.  Both of these ran PowerPC microprocessors and eventually an early generation of Mac OS X.


A couple of years later we moved on to the all-in-one, Intel-based desktop iMacs, and over time updated to bigger screens, faster processors and more storage.  We are still on iMac desktops for home and office use today.

iPhone infatuation

In 2008 we moved from a dumb cell phone to a smart iPhone 3G.  We wanted to wait until the world phone came out which supported GSM.

But this was another paradigm shift for me. When working in the corporate world I had a blackberry and could use it for contacts, email, and calendar but seldom did anything else on it.  And in fact, at the time I used a PalmPilot for a number of business applications, games, and other computing needs.

When the iPhone 3G came out, both the PalmPilot and the dumb cell phone were retired, and we went completely Apple for all our cell phone needs.  Today, I probably scan email, tweet, and run a number of other applications on my iPhone almost as often as I do on the iMac.  Over time we moved one or the other of us to the 3GS and 4, and now the children are starting to get hand-me-down iPhones and love them just as much.

iPad devotion

Then in May of 2010, we bought an iPad.  This was a corporate purchase but everyone used it.  I tried to use it to replace my laptop a number of times (see my posts To iPad or Not to iPad parts 1, 2, 3 & 4) and ultimately concluded it wouldn't work for me.  We then went out and got a MacBook Air, and now the iPad is mainly used to check email and do some light editing, as well as for gaming, media and other light computing activities.

The fact is, sitting on our living room couch checking email, twitter and taking notes has made using all these tools that much easier.  When we saw the iPad 2 we liked what we saw, but it took so long to become available in the stores that we lost all gadget lust and are now waiting to see what the next generation looks like when it comes out.


All in all, almost 30 years with Apple products, both in the home and at work, have made me a lifelong advocate.

I never worked for Apple but have heard that most of these products were driven single-mindedly by Steve Jobs.  If that was the case, I would have to say that Steve Jobs was a singular technical visionary who understood what was then possible and took the steps needed to make it happen.  In doing that, he changed computing forever, and for that I salute him.

Steve Jobs RIP

Big data and eMedicine combine to improve healthcare

fix_me by ~! (cc) (from Flickr)

We have talked before about ePathology and data growth, but Technology Review recently reported that researchers at Stanford University have used Electronic Medical Records (EMR) from multiple medical institutions to identify a new harmful drug interaction.  Apparently, they found that when patients take Paxil (an antidepressant) and Pravachol (a cholesterol reducer) together, the drugs interact to raise blood sugar to levels similar to those seen in diabetics.

Data analytics to the rescue

The researchers started out looking for new drug interactions which could result in conditions seen by diabetics. Their initial study showed a strong signal that taking both Paxil and Pravachol could be a problem.

Their study used FDA Adverse Event Reports (AERs) data that hospitals and medical care institutions record.  Originally, the researchers at Stanford’s Biomedical Informatics group used AERs available at Stanford University School of Medicine but found that although they had a clear signal that there could be a problem, they didn’t have sufficient data to statistically prove the combined drug interaction.

They then went to Harvard Medical School and Vanderbilt University and asked to access their AERs to add to their data.  With the combined data, the researchers were able to clearly see and statistically prove the adverse interaction between the two drugs.

But how did they analyze the data?

I could find no information about what tools the biomedical informatics researchers used to analyze the set of AERs they amassed,  but it wouldn’t surprise me to find out that Hadoop played a part in this activity.  It would seem to be a natural fit to use Hadoop and MapReduce to aggregate the AERs together into a semi-structured data set and reduce this data set to extract the AERs which matched their interaction profile.
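I have no idea what record layout or toolchain the researchers actually used, but the kind of filter-and-count reduction described above can be sketched in a few lines; the field names and drug strings here are entirely my invention:

```python
from collections import Counter

# Toy adverse-event reports; real AERs carry far more detail than this.
aers = [
    {"drugs": {"paxil", "pravachol"}, "event": "hyperglycemia"},
    {"drugs": {"paxil"},              "event": "nausea"},
    {"drugs": {"pravachol"},          "event": "myalgia"},
    {"drugs": {"paxil", "pravachol"}, "event": "hyperglycemia"},
]

def interaction_signal(records, drug_pair):
    """Count adverse events among reports mentioning every drug in the pair."""
    matching = [r for r in records if drug_pair <= r["drugs"]]
    return Counter(r["event"] for r in matching)

print(interaction_signal(aers, {"paxil", "pravachol"}))
# → Counter({'hyperglycemia': 2})
```

At Hadoop scale the filtering would run in the map phase and the counting in the reduce phase, but the logic would be much the same.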

Then again, it's entirely possible that they used a standard database analytics tool to do the work.  After all, we are only talking about 100 to 200K records or so.

Nonetheless, the Technology Review article stated that some large hospitals and medical institutions using EMR are starting to have database analysts (maybe data scientists) on staff to mine their record data and electronic information to help improve healthcare.

Although EMR was originally envisioned as a way to keep better track of individual patients, when a single patient's data is combined with that of 1000s more patients one creates something entirely different, something that can be mined to extract information.  Such a data repository can be used to ask questions about healthcare that were inconceivable before.


Digitized medical imagery (X-rays, MRIs, & CAT scans), ePathology and now EMR are together giving rise to a new form of electronic medicine, or eMedicine.  With everything digitized, securely accessible and amenable to big data analytics, medical care as we know it is about to undergo a paradigm shift.

Big data and eMedicine combined together are about to change healthcare for the better.

IBM research introduces SyNAPSE chip

IBM, with the help of Columbia, Cornell, the University of Wisconsin (Madison) and the University of California, has created the first generation of neuromorphic chips (press release and video), which mimic the human brain's computational architecture in silicon.  The chip is a result of Project SyNAPSE (standing for Systems of Neuromorphic Adaptive Plastic Scalable Electronics).

Hardware emulating wetware

Apparently the chip supports two cores, one with 65K "learning" synapses and the other with ~256K "programmable" synapses.  It's not entirely clear from the press release, but it seems each core contains 256 neuronal computational elements.

Wikimedia commons (481px-Chemical_synapse_schema_cropped)

In contrast, the human brain contains between 100 trillion and 500 trillion synapses (wikipedia) and ~85 billion neurons (wikipedia).  Typical human neurons have 1000s of synapses.

IBM's goal is to have a trillion-neuron processing engine with 100 trillion synapses occupy a 2-liter volume (about the size of the brain) and consume less than one kilowatt of power (about 50X the ~20 watts the brain consumes).
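Taking the press-release figures at face value, the scale gap between one chip and that goal is easy to quantify; the arithmetic below is my own rough sketch, not anything IBM published:

```python
# Per-chip figures as I read the press release: two cores of ~256
# neuronal elements each, plus 65K "learning" and ~256K "programmable"
# synapses.  The goal figures are IBM's stated targets.
chip_neurons = 2 * 256
chip_synapses = 65_000 + 256_000

goal_neurons = 10**12   # a trillion neurons
goal_synapses = 10**14  # 100 trillion synapses

print(goal_neurons // chip_neurons)    # chips needed by neuron count
print(goal_synapses // chip_synapses)  # chips needed by synapse count
```

Even by the more favorable synapse count, that's over 300 million of today's chips, which puts the 2-liter, sub-kilowatt goal in perspective.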

I want one.

IBM is calling such a system built out of neuromorphic chips a cognitive computing system.

What to do with the system

The IBM research team has demonstrated some typical AI applications such as simple navigation, machine vision, pattern recognition, associative memory and classification applications with the chip.

Given my history with von Neumann computing, it's kind of hard for me to envision how synapses represent "programming" in the brain.  Nonetheless, wikipedia defines a synapse as a connection between two neurons, which can take one of two forms, electrical or chemical.  A chemical synapse (wikipedia) can have different levels of strength, plasticity, and receptivity.  Sounds like this might be where the programmability lies.

Just what the "learning" synapses do, how they relate to the programmable synapses, and how they do it is another question entirely.

Stay tuned; a new, non-von Neumann computing architecture was born today.  Two questions to ponder:

  1. I wonder if they will still call it artificial intelligence?
  2. Are we any closer to the Singularity now?




M-Disc provides a 1000 year archivable DVD

M-Disc (c) 2011 Millenniata (from their website)

I heard about this last week but saw another notice today.  Millenniata has made what they believe to be a DVD with a 1000-year archive life, which they call the M-Disc.

I have written before about the lack of long-term archives for digital data, mostly focused on disappearing formats, but this device, if it works, has the potential to solve the other problem (discussed here), namely that no storage media around today can last that long.

The new single layer DVD (4.7GB max) has a chemically stable, inorganic recording layer which is a heat resistant matrix of materials which can retain data while surviving temperatures of up to 500°C (932°F).

Unlike normal DVDs, which record data using organic dyes, M-Disc data is recorded on this stone-like layer embedded inside the DVD.  By doing so, Millenniata has created the modern-day equivalent of etching information in stone.

According to the vendor, M-Disc archivability was independently validated by the US DOD at their China Lake facilities.  While the DOD didn't say the M-Disc DVD has a 1000-year life, they did say that under their testing the M-Disc was the only DVD that did not lose data.  The DOD tested DVDs from Mitsubishi, Verbatim, Delkin, MAM-A and Taiyo Yuden (JVC) in addition to the M-Disc.

The other problems with long-term archives involve data formats and the availability of programs that can read formats from long ago.  Although Millenniata has no solution for this, something like a format repository with XML descriptions might provide a way forward.
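As a purely hypothetical illustration of what such an XML format description might contain (the element names below are my invention, not any real repository's schema), here is a tiny descriptor and how a future reader might consult it:

```python
import xml.etree.ElementTree as ET

# Invented descriptor for a well-known format; a real repository entry
# would carry far more detail (field layouts, encodings, rendering rules).
descriptor = """
<format id="tiff-6.0">
  <name>Tagged Image File Format, revision 6.0</name>
  <magic offset="0">49 49 2A 00</magic>
  <byte-order>little-endian</byte-order>
</format>
"""

root = ET.fromstring(descriptor)
print(root.get("id"), "-", root.find("name").text)
```

Given such machine-readable descriptions, an archivist centuries from now could at least identify what a recovered bitstream is and where its specification lives, even if the original reading software is long gone.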

Given the nature of the M-Disc recording surface, special-purpose DVD writers, with lasers 5X the intensity of normal DVD lasers, must be used.  But once recorded, any DVD reader can read the data off the disk.

Pricing for the media was suggested to be about equivalent, per disk, to archive-quality DVDs.  Pricing for the special DVD writers was not disclosed.

They did indicate they were working on a similar product for Blu-ray disks, which would take the single-layer capacity up to 26GB.