Fall SNWUSA wrap-up

Attended SNWUSA this week in San Jose.  It’s hard to see the show gradually change when you attend each one, but it does seem that end-user content and attendance are increasing proportionally.  This should bode well for future SNWs.  There was always a good number of end users at the show, but in the past the bulk of the attendees were from storage vendors.

Another large storage vendor dropped their sponsorship.  HDS no longer sponsors the show, and the last large vendor still standing at the show is HP.  Some of this is cyclical; perhaps the large vendors will come back for the spring show, next year in Orlando, FL.  But EMC, NetApp and IBM seem to have pretty much dropped sponsorship for the last couple of shows at least.

SSD startup of the show

Skyhawk hardware (c) 2012 Skyera, all rights reserved (from their website)

The best new SSD startup had to be Skyera: a 48TB raw flash, dual-controller system supporting the iSCSI block protocol and using real commercial-grade MLC.  The team at Skyera seems to be all ex-SandForce executives and technical people.

Skyera’s team has designed a 1U box called the Skyhawk, with a phalanx of NAND chips, their own controller(s) and other logic as well. They support software compression and deduplication, as well as specially designed RAID logic that they claim reduces extraneous writes to just over 1 for RAID 6, dual-drive-failure-equivalent protection.
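Skyera hasn’t published how its RAID logic works, but a back-of-envelope sketch shows why write amplification just over 1 is plausible if writes are log-structured and always fill whole stripes. The function names and the 20+2 stripe geometry below are my assumptions, not Skyera’s:

```python
# Illustrative only -- not Skyera's actual algorithm.
# Classic RAID-6 pays a fixed I/O penalty per small random write, while a
# log-structured layout that only writes full stripes pays just the
# parity overhead of the stripe.

def raid6_small_write_ios():
    # read old data + read both old parities, then write new data
    # and both new parities: 6 I/Os per small random write
    return 6

def full_stripe_write_amplification(data_chunks):
    # a full stripe of D data chunks carries 2 parity chunks, so
    # bytes written / user bytes = (D + 2) / D
    return (data_chunks + 2) / data_chunks

print(raid6_small_write_ios())                        # 6
print(round(full_stripe_write_amplification(20), 2))  # 1.1
```

With a wide enough stripe (20 data chunks here), the amplification approaches 1, which lines up with Skyera’s “just over 1” claim.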

Skyera’s underlying belief is that just as consumer HDAs took over from the big monster 14″ and 11″ disk drives in the ’90s, sooner or later commercial NAND will take over from eMLC and SLC.  And if one elects to stay with eMLC and SLC technology, you are destined to be one to two technology nodes behind. That is, commercial MLC (in USB sticks, SD cards, etc.) is currently manufactured with 19nm technology, while eMLC and SLC NAND technology is back at 24 or 25nm.  But 80-90% of the NAND market is being driven by commercial MLC NAND.  Skyera came out this past August.

Coming in second place was Arkologic, an all-flash NAS box using SSD drives from multiple vendors. In their case a fully populated rack holds about 192TB (raw?) with an active-passive controller configuration.  The main concern I have with this product is that all their metadata is held in UPS-backed DRAM (??), and they have up to 128GB of DRAM in the controller.

Arkologic’s main differentiation is supporting QoS on a file-system basis and having some connection with a NIC vendor that can provide end-to-end QoS.  The other thing they have is a new RAID-AS, which is specially designed for flash.

I just hope their UPS is pretty hefty and they don’t sell it someplace where power is very flaky, because when that UPS gives out, kiss your data goodbye, as your metadata is held nowhere else – at least that’s what they told me.

Cloud storage startup of the show

There was more cloud stuff going on at the show; I talked to at least three or four cloud gateway providers.  But the cloud startup of the show had to be Egnyte.  They supply storage services that span cloud storage and on-premises storage with an in-band or out-of-band solution, and provide file synchronization services for file sharing across multiple locations.  They have some hooks into NetApp and other major storage vendor products that allow them to be out-of-band for those environments, but they would need to be in-band for other storage systems.  It seems an interesting solution that, if successful, may help accelerate the adoption of cloud storage in the enterprise, as it makes it transparent whether the storage you access is local or in the cloud. How they deal with the response time differences is another question.

Different idea startup of the show

The new technology showplace had a bunch of vendors, some I had never heard of before, but one that caught my eye was Actifio. They were at VMworld but I never got time to stop by.  They seem to be taking another shot at storage virtualization. Only in this case, rather than focusing on non-disruptive file migration, they are taking on the task of doing a better job of point-in-time copies for iSCSI and FC attached storage.

I assume they are in the middle of the data path in order to do this and they seem to be using copy-on-write technology for point-in-time snapshots.  Not sure where this fits, but I suspect SME and maybe up to mid-range.

Most enterprise vendors solved these problems a long time ago, but at the low end it’s a little more variable.  I wish them luck.  Most customers use snapshots if their storage supports them, but those that don’t seem unable to understand what they are missing.  And then there’s the matter of being in the data path?!


If there was a hybrid startup at the show I must have missed them. I did talk with Nimble Storage and they seem to be firing on all cylinders.  Maybe someday we can do a deep dive on their technology.  Tintri was there as well, in the new technology showcase; we talked with them earlier this year at Storage Tech Field Day.

The big news at the show was Microsoft purchasing StorSimple, a cloud storage gateway/cache.  Apparently StorSimple did a majority of their business with Microsoft’s Azure cloud storage, and the acquisition seemed to make sense to everyone.

The SNIA suite was hopping as usual and the venue seemed to work well, although I would say the exhibit floor and lab area were a bit too big.  But everything else seemed to work out fine.

On Wednesday, the CIO of Dish talked about what it took to completely transform their IT environment from a management and leadership perspective.  It seemed like an awfully big risk, but they were able to pull it off.

All in all, SNW is still a great show to learn about storage technology at least from an end-user perspective.  I just wish some more large vendors would return once again, but alas that seems to be a dream for now.

OpenFlow part 2, Cisco’s response


organic growth by jurvetson

Cisco’s CTO, Padmasree Warrior, was interviewed today by NetworkWorld discussing the company’s response to all the recent press on OpenFlow coming out of the Open Networking Summit (see my OpenFlow, the next wave in networking post).  Apparently, Cisco is funding a new spin-in company to implement new networking technology congruent with Cisco’s current and future switches and routers.

Spin-in to the rescue

We have seen this act before: Andiamo was another Cisco spin-in company (bought back in ~2002), that time focused on FC/SAN switching technology.  Andiamo was successful in that it created FC switch technology which allowed Cisco to go after the storage networking market, and probably even helped them design and implement FCoE.

This time it’s a little different, however. It’s in Cisco’s backyard, so to speak.  The new spin-in is called Insieme and will be focused on “OpenStack switch hardware and distributed data storage”.

Distributed data storage sounds a lot like cloud storage to me.  OpenStack seems to be an open source approach to define cloud computing systems. What all that has to do with software defined networking I am unable to understand.

Nonetheless, Cisco has invested $100M in the startup and has capped its acquisition cost at $750M if it succeeds.

But is it SDN?

Ms. Warrior does go on to say that software-programmable switches will be integrated across Cisco’s product line sometime in the near future, but that OpenFlow and OpenStack are only two ways to do that. Other ways exist, such as adding new features to NX-OS today or modifying the Nexus 1000v (the software-only, VMware-based virtual switch) they have been shipping since 2009.

As for OpenFlow commoditizing networking technology, Ms. Warrior doesn’t believe that any single technology is going to change the leadership in networking.  Programmability is certainly of interest to one segment of users with massive infrastructure, but most data centers have no desire to program their own switches.  And in the end, networking success depends as much on channels and go-to-market programs as it does on great technology.

Cisco’s CTO was reluctant to claim that Insieme was their response to SDN, but it seems patently evident to the rest of us that it’s at least one of its objectives.  Something like this is a two-edged sword: on the one hand it helps Cisco go after and help define the new technology; on the other hand it legitimizes the current players.


Nicira is probably rejoicing today what with all the news coming out of the Summit and the creation of Insieme.  Probably yet another reason not to label it SDN…

New cloud storage and Hadoop managed service offering from Spring SNW

Strange Clouds by michaelroper (cc) (from Flickr)

Last week I posted my thoughts on Spring SNW in Dallas, but there were two more items that keep coming back to me (aside from the tornados).  The first was a new cloud storage startup called Symform and the other was an announcement from SunGard about their new Hadoop managed services offering.


Symform offers an interesting alternative in cloud storage that avoids the build-up of large multi-site data centers and uses your desktop storage as a sort of crowd-sourced storage cloud – a sort of bit-torrent cloud storage.

You may recall I discussed such peer-to-peer cloud storage and computing services in a posting a couple of years ago.  It seems Symform has taken this task on, at least for storage.

A customer downloads software (Windows or Mac) which is installed and executes on your desktop.  The first thing you have to do, after providing security credentials, is identify which directories will be moved to the cloud; the second is to say whether you wish to contribute to Symform’s cloud storage and where that storage is located.  Symform maintains a cloud management data center which records all the metadata about your cloud-resident data and everyone’s contributed storage space.

Symform cloud data is split up into 64MB blocks and encrypted (AES-256) using a randomly generated key (known only to Symform). Each block is then broken up into 64 fragments, with 32 parity fragments (computed via erasure coding) added to the stream, which is then written to 96 different locations.  With this arrangement, the system could potentially lose 32 fragments out of the 96 (any 64 suffice to rebuild) and still reconstitute your 64MB of data.  The metadata supporting all this activity sits in Symform’s data center.
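The arithmetic behind that scheme is simple, assuming a standard maximum-distance-separable erasure code where any 64 of the 96 fragments can rebuild the block (a sketch of the math, not Symform’s actual code):

```python
# Recoverability math for a 64+32 erasure-coded 64MB block, assuming an
# MDS code where any 64 of the 96 fragments suffice to rebuild.
DATA_FRAGMENTS = 64
PARITY_FRAGMENTS = 32
TOTAL_FRAGMENTS = DATA_FRAGMENTS + PARITY_FRAGMENTS  # 96 locations

BLOCK_MB = 64
fragment_mb = BLOCK_MB / DATA_FRAGMENTS        # each fragment is 1 MB
overhead = TOTAL_FRAGMENTS / DATA_FRAGMENTS    # 1.5x raw storage consumed
max_losses = TOTAL_FRAGMENTS - DATA_FRAGMENTS  # fragments you can lose

print(fragment_mb, overhead, max_losses)  # 1.0 1.5 32
```

Note the trade-off: this buys tolerance of 32 simultaneous fragment losses for only 1.5x raw storage, versus the 2x or 3x that plain replication would cost.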

It’s unclear to me what you have to provide as far as ongoing access to your contributed storage.  I would guess you need to provide 7X24 access to this storage, but the 32 parity fragments are there for possible network/power failures outside your control.

Cloud storage performance is an outcome of the many fragments that are dispersed throughout their storage cloud. It’s similar to a bit torrent stream, with all 96 locations participating in reconstituting your 64MB of data.  Of course, not all 96 locations have to be active, just some 64-fragment subset, but it’s still cloud storage, so data access latency is on the order of internet time (many seconds).  Nonetheless, once data transfer begins, throughput can be pretty high, which means your data should arrive shortly thereafter.

Pricing seemed comparable to other cloud storage services, with a monthly base access fee and a storage-amount fee on top of that.  But you can receive significant discounts if you contribute storage, and your first 200GB is free as long as you contribute 200GB of storage space to the Symform cloud.

Sungard’s new Apache Hadoop managed service

Hadoop Logo (from http://hadoop.apache.org website)

We are well aware of SunGard’s business continuity/disaster recovery (BC/DR) services, an IT mainstay for decades now. But sometime within the last decade or so SunGard has been expanding outside this space by moving into managed availability services.

Apparently this began when SunGard noticed the number of new web apps being deployed each year exceeded the number of client-server apps. Then along came virtualization, which reduced the need for lots of server and storage hardware for BC/DR.

As evidence of this trend, last year SunGard announced a new enterprise-class computing cloud service.  But in last week’s announcement, SunGard teamed up with EMC Greenplum to supply an enterprise-ready Apache Hadoop managed service offering.

Recall that EMC Greenplum offers their own supported Apache Hadoop distribution, Greenplum HD.  SunGard is basing their service on this distribution. But there’s more.

In conjunction with Hadoop, SunGard adds Greenplum appliances.  With this configuration SunGard can load Hadoop-processed, structured data into a Greenplum relational database for high-performance data analytics.  Once there, standard SQL analytics and queries can be used to analyze the data.
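The pattern described here (land the reduced output of a Hadoop job in a relational table, then use ordinary SQL) can be sketched as follows; SQLite stands in for Greenplum, and the table schema and rows are made up for illustration:

```python
# Sketch of the Hadoop -> relational -> SQL analytics pattern.
# SQLite is a stand-in for Greenplum; the schema is hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clickstream (page TEXT, visits INTEGER)")

# Pretend these rows are the structured output of a Hadoop job
conn.executemany("INSERT INTO clickstream VALUES (?, ?)",
                 [("home", 120), ("pricing", 45), ("docs", 80)])

# Once loaded, any standard SQL analytics apply
top = conn.execute(
    "SELECT page, visits FROM clickstream ORDER BY visits DESC LIMIT 1"
).fetchone()
print(top)  # ('home', 120)
```

The point of the appliance pairing is exactly this hand-off: Hadoop does the unstructured heavy lifting, and familiar SQL tooling takes over once the data is structured.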

With these services SunGard is attempting to provide a unified analytics service that spans structured, semi-structured and unstructured data.


There was probably more to Spring SNW, but given my limited time on the exhibition floor and in vendor discussions, these and my previously published post cover what was of most interest to me.

IT as a service on the Cloud is not the end

Prison Planet by AZRainman (cc) (from Flickr)

[Long post] I read another intriguing post by David Vellante at Wikibon today about the emergence of IT shops becoming service organizations to their industries, using the cloud to host these services.  I am not in complete agreement with Dave, but he certainly paints a convincing picture.

His main points are:

  • Cloud storage and cloud computing are emerging as a favorite platform for IT-as-a-service.
  • Specialization and economies of scale will generate an IT-as-a-service capability for any organization’s information processing needs.

I would have to say another tenet of his overall discussion is that IT matters, a lot and I couldn’t agree more.

Cloud reality

For some reason I have been talking a lot about cloud storage these past couple of weeks, in multiple distinct venues.  On the one hand, I was talking with a VAR the other day and they were extremely excited about the opportunity in cloud storage. It seems getting SMB customers to sign up for a slice of storage is easy, and once they have that, using more becomes a habit they can’t break.

I thought maybe the enterprise level would be immune to such inducements, but no.  Another cloud storage gateway vendor, StorSimple, whom I talked with recently, was touting the great success they were having displacing tier 2 storage in the enterprise.

Lately, I heard that some small businesses/startups have decided to abandon their own IT infrastructure altogether and depend entirely on cloud offerings from Amazon, RackSpace and others for all they need.  They argue that such infrastructure, for all its current faults, will have less downtime than anything they could create on their own within a limited budget.

So, cloud seems to be taking off, everywhere I look.

Vertical support for IT as a service

Dave mentions plenty of examples in his lengthy post of sophisticated IT organizations taking their internal services and becoming IT-as-a-service profit centers.  Yes, hard to disagree with this one as well.

But, it’s not the end of IT organizations

However, where I disagree with Dave is that he sees this as a winning solution, taking over all internal IT activities.  In his view, either your IT group becomes an external service profit center or it’s destined to be replaced by someone else’s service offering(s).

I don’t believe this. To say that IT as a service will displace 50+ years of technology development in the enterprise is just overstatement.

Dave talks about WINTEL displacing mainframes as the two monopolies created in IT.  But the fact remains, WINTEL has not eliminated mainframes.  Mainframes still exist and, arguably, are still expanding throughout the world today.

Dave states that the introduction of WINTEL reduced the switching cost of mainframes, and that the internet and the cloud that follows, have reduced the costs yet again. I agree.  But, that doesn’t mean that switching cost is 0.

Ask anyone whether SalesForce.com switching cost inhibits them from changing services and more than likely they will say yes.  Switching costs have come down, but they are still a viable barrier to change.

Cloud computing and storage generate similar switching costs, not to mention the time it takes to transfer TBs of data over a WAN.  Whether a cloud service uses the AWS interface, OpenStack, Azure or any of the other REST/SOAP cloud storage/cloud computing protocols is a formidable barrier to change.  It would be great if OpenStack were to take over, but it hasn’t yet, and most likely won’t in the long run, mainly because the entrenched suppliers don’t want to help their competition.

IT matters, a lot to my organization

What I see happening is not that much different from what Dave sees; it’s only a matter of degree.  Some IT shops will become service organizations to their vertical, but there will remain a large proportion of IT shops that believe:

  • That their technology is a differentiator.
  • That their technology is not something they want their competition using.
  • That their technology is too important to their corporate advantage to sell to others.

How much of this is reality vs. fiction is another matter.

Nonetheless, I firmly believe that a majority of IT shops that exist today will not convert to using IT as a service.   Some of this is due to sunk costs but a lot will be due to the belief that they are truly better than the service.

New organizations just starting out, on the other hand, may well be more interested in utilizing IT as a service.  For these entities, service offerings are going to be an appealing alternative.

However, a small portion of these startups may just as likely conclude that they can do better or believe it’s more important for them to develop their own IT services to help them get ahead.  Similarly, how much of this is make believe is TBD.

In the end, I believe IT as a service will take its place alongside IT-developed services and IT-outsourced development as yet another capability that any company can deploy to provide information processing for their organization.

The real problem

In my view, the real problem with IT-developed services today is development disease.  Most organizations would like increased functionality, and want it ASAP, but they just can’t develop working functionality fast enough.  Development disease is what I call slow functionality development that misses critical features and ships with lots of bugs.  It’s everywhere today and has never really gone away.

Some of this is due to poor IT infrastructure, some is due to the inability to use new development frameworks, and some is due to a lack of skills.  If IT had some pill they could take to help them develop business processing faster, consuming fewer resources, with far fewer bugs and fuller functionality, they would never consider IT as a service.

That’s where new frameworks like Ruby on Rails, SpringForce and the like are exciting. Their promise is faster functionality with fewer failures. When that happens, organizations will move away from IT as a service in droves, and back to internally developed capabilities.

But, we’re not there yet.



Is cloud a leapfrog technology?

Mobile Phone with Money in Kenya by whiteafrican (cc) (from Flickr)

Read an article today about Safaricom creating a domestic cloud service offering outside Nairobi in Kenya (see Chasing the African Cloud).

But this got me thinking that cloud services may be just like mobile phones: developing countries can use them to skip over older technologies, like wired phone lines, and gain the advantages of more recent technology that offers similar services (the mobile phone) without the expense and time of building telephone wires across the land.

Leapfrogging IT infrastructure buildout

In the USA, cloud computing, cloud storage, and cloud-based SAAS services are essentially taking the place of small business IT infrastructure today.  Many small businesses skip building their own IT infrastructures (absolutely necessary years ago for email, web services, back office processing, etc.) and move directly to using cloud service providers for these capabilities.

In some cases, it’s even more than just the IT infrastructure, as the application, data and processing services can all be supplied by SAAS providers.

Today, it’s entirely possible to run a complete, very large business without owning a stitch of IT infrastructure (other than desktops, laptops, tablets and mobile phones) by doing just this.

Developing countries can show us the way

Developing countries can do much the same for their economic activity. Rather than have their small businesses spend time building out homegrown IT infrastructure, they can lease it from one or more domestic (or international) cloud service providers and skip the time, effort and cost of doing it themselves.

Hanging out with Kenya Techies by whiteafrican (cc) (from Flickr)

Given this dynamic, cloud service vendors ought to be focusing more time and money on developing countries. Such countries should adopt these services more rapidly because they don’t have the sunk costs in current, private IT infrastructure and applications.

China moves into the cloud

I probably should have caught on earlier.  Earlier this year I was at a vendor analyst meeting, having dinner with a colleague from the China Center for Information Industry Development (CCID) Consulting.  He mentioned that cloud was one of a select set of technologies on which China was focusing considerable state and industry resources.   At the time, I just thought this was prudent thinking to keep up with industry trends. What I didn’t realize at the time was that the cloud could be a leapfrog technology that would help them avoid a massive IT infrastructure build-out in millions of small companies in their nation.

One can see that early adopter nations have understood that with the capabilities of mobile phones they can create a fully functioning telecommunications infrastructure almost overnight.  Much the same can be done with cloud computing, storage and services.

Now if they can only get WiMAX up and running to eliminate cabling their cities for internet access.



The sensor cloud comes home

We thought the advent of smart power meters would be the killer app for building the sensor cloud in the home.  But, this week Honeywell announced a new smart thermostat that attaches to the Internet and uses Opower’s cloud service to record and analyze home heating and cooling demand.  Looks to be an even better bet.

9/11 Memorial renderings, aerial view (c) 9/11 Memorial.org (from their website)

Just this past week, on an NPR NOVA telecast, Engineering Ground Zero, about building the 9/11 memorial in NYC, it was mentioned that all the trees planted in the memorial have individual sensors to measure soil chemistry, dampness, and other tree health indicators. Yes, even trees are getting on the sensor cloud.

And of course the buildings going up at Ground Zero are all smart buildings as well, containing sensors embedded in the structure, the infrastructure, and anywhere else that matters.

But what does this mean in terms of data?

Data requirements will explode as the smart home and other sensor clouds build out.  For example, even if a smart thermostat only issues a 256-byte message every 15 minutes, the data from the 130 million households in the US alone would be an additional ~3.2TB/day.  And that’s just one sensor per household.

If you add the smart power meter, lawn sensor, intrusion/fire/chemical sensor, and, god forbid, the refrigerator and freezer product sensors to the mix, that’s another ~16TB/day of incoming data.

And that’s just assuming a 256 byte payload per sensor every 15 minutes.  The intrusion sensors could easily be a combination of multiple, real time exterior video feeds as well as multi-point intrusion/motion/fire/chemical sensors which would generate much, much more data.

But we have smart roads/bridges, smart cars/trucks, smart skyscrapers, smart port facilities, smart railroads, smart boats/ferries, etc. to come.   I could go on, but the list seems long enough already.  Each of these could generate another ~19TB/day data stream, if not more.  Some of these infrastructure entities/devices are much more complex than a house, and there are a lot more cars on the road than houses in the US.
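For anyone who wants to check the arithmetic above, here is the back-of-envelope calculation (256-byte messages, one every 15 minutes, and 130 million US households are the stated assumptions):

```python
# Back-of-envelope check of the sensor-data volumes cited above.
MSG_BYTES = 256
MSGS_PER_DAY = 24 * 4           # one message every 15 minutes
US_HOUSEHOLDS = 130_000_000

per_sensor = MSG_BYTES * MSGS_PER_DAY              # 24,576 bytes/sensor/day
thermostat_tb = US_HOUSEHOLDS * per_sensor / 1e12  # ~3.2 TB/day
extra_tb = 5 * thermostat_tb                       # 5 more sensors: ~16 TB/day
total_tb = thermostat_tb + extra_tb                # ~19.2 TB/day for the smart home

print(round(thermostat_tb, 1), round(extra_tb, 1), round(total_tb, 1))
```

So each additional class of smart entity with a household-scale population and similar message rates adds roughly another 19TB/day, before any video or high-frequency telemetry is counted.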

It’s great to be in the (cloud) storage business

All that data has to be stored somewhere and that place is going to be the cloud.  The Honeywell smart thermostat uses Opower’s cloud storage and computing infrastructure specifically designed to support better power management for heating and cooling the home.  Following this approach, it’s certainly feasible that more cloud services would come online to support each of the smart entities discussed above.

Naturally, using this data to provide real-time understanding of the infrastructure they monitor will require big data analytics. Hadoop and its counterparts are the only platforms around today that are up to this task.


So cloud computing, cloud storage, and big data analytics have yet another part to play, this time in the upcoming sensor cloud that will envelop the world and all of its infrastructure.

Welcome to the future, it’s almost here already.




Shared DAS

Code Name "Thumper" by richardmasoner (cc) (from Flickr)

An announcement this week by VMware of their vSphere 5 Virtual Storage Appliance has brought back the concept of shared DAS (see vSphere 5 storage announcements).

Over the years, there have been a few products, such as Seanodes and Condor Storage (which may no longer exist), that have tried to make a market out of sharing DAS across a cluster of servers.

Arguably, Hadoop HDFS (see Hadoop – part 1), Amazon S3/cloud storage services and most scale out NAS systems all support similar capabilities. Such systems consist of a number of servers with direct attached storage, accessible by other servers or the Internet as one large, contiguous storage/file system address space.

Why share DAS? The simple fact is that DAS is cheap, its capacity is increasing, and it’s ubiquitous.

Shared DAS system capabilities

VMware has limited their DAS virtual storage appliance to a 3-ESX-node environment; there are possibly lots of reasons for this.  But there is no such restriction for Seanodes Exanode clusters.

On the other hand, VMware has specifically targeted SMB data centers for this facility.  In contrast, Seanodes has focused on both HPC and SMB markets for their shared internal storage which provides support for a virtual SAN on Linux, VMware ESX, and Windows Server operating systems.

Although VMware Virtual Storage Appliance and Seanodes do provide rudimentary SAN storage services, they do not supply advanced capabilities of enterprise storage such as point-in-time copies, replication, data reduction, etc.

But some of these facilities are available outside these systems. For example, VMware with vSphere 5 supports a host-based replication service and has had software-based snapshots for some time now. Similar services exist or can be purchased for Windows and presumably Linux.  Also, cloud storage providers have provided a smattering of these capabilities from the start in their offerings.


Although distributed DAS storage has the potential for high performance, it seems to me that these systems should perform worse than an equivalent amount of processing power and storage in a dedicated storage array.  But my biases might be showing.

On the other hand, Hadoop and scale-out NAS systems are capable of screaming performance when put together properly.  Recent SPECsfs2008 results for the EMC Isilon scale-out NAS system have demonstrated very high performance, and Hadoop’s claim to fame is high-performance analytics. But you have to throw a lot of nodes at the problem.


In the end, all it takes is software. Virtualizing servers, sharing DAS, implementing advanced storage features – any of these can be done in software alone.

However, service levels, high availability and fault tolerance requirements have historically necessitated a physical separation between storage and compute services. Nonetheless, if you really need screaming application performance and software based fault tolerance/high availability will suffice, then distributed DAS systems with co-located applications like Hadoop or some scale out NAS systems are the only game in town.


Personal medical record archive

MRI of my brain after surgery for Oligodendroglioma tumor by L_Family (cc) (From Flickr)

I was reading a book the other day that suggested that sometime in the near future we will all have a personal medical record archive. Such an archive would be a formal record of every visit to a healthcare provider, with every x-ray, MRI, CAT scan, doctor’s note, blood analysis, etc. that’s ever done to a person.

Such data would be our personal record of our life’s medical history usable by any future medical provider and accessible by us.

Who owns medical records?

Healthcare is unusual.  In any other discipline, like accounting, you provide information to the discipline expert and you get all the information you could possibly want back, to store, send to the IRS, or do with as you want.  If you decide to pitch it, you can pretty much request a copy (at your cost) of anything for a certain number of years after the information was created.

But in medicine, X-rays are owned and kept by the medical provider, same with MRIs, CT scans, etc., and you hardly ever get a copy.  Occasionally, if the physician deems it useful for explanatory reasons, you might get a grainy copy of an X-ray that shows a break or something, but other than that and possible therapeutic instructions, typically nothing.

Getting a doctor’s notes is another question entirely.  These are mostly text records in some sort of database, somewhere online to the medical unit.  But mainly what we get as patients is a verbal diagnosis to take in and mull over.

Personal experience with medical records

I worked for an enlightened company a while back that had its own onsite medical practice providing all sorts of healthcare to its employees.  Over time, new management decided this service was not profitable and terminated it.  As they were winding down the operation, they offered to send patient medical information to any new healthcare provider or to us.  Not having a new provider, I asked that they send it to me.

A couple of weeks later, a big brown manilla envelope was delivered.  Inside was a rather large, multi-page printout of notes taken by every medical provider I had visited throughout my tenure with this facility.  What was missing from this assemblage were the lab reports, x-rays and other ancillary data taken in conjunction with those office visits. I must say the notes were comprehensive, if somewhat laden with medical terminology, but they were all there to see.

Printouts were not very useful to me and probably wouldn’t be to any follow-on medical group caring for me. The lack of x-rays, blood work, etc. might be a serious deficiency for any follow-on treatment.  But as far as I was concerned, it was the first time any medical entity had even offered me information like this.

Making personal medical records usable, complete, and retrievable

To take this to the next level, and provide something useful for patients and follow-on healthcare, we need some sort of standardization of medical records across the healthcare industry.  Given where we are today, this need not be that difficult.  Standards for most medical data already exist, specifically:

  • DICOM or Digital Imaging and Communications in Medicine – is a standard file format used to digitally record X-rays, MRIs, CT scans and more.  Most digital medical imaging technology out there today (except for ultrasound) optionally records information in DICOM format.  There also happen to be open source DICOM viewers that anyone can use to view these sorts of files.
  • Ultrasound imaging – is typically rendered and viewed as a sort of movie and is often used for soft-tissue imaging and prenatal care.  I don’t know for sure, but I cannot find any standard like DICOM for ultrasound images.  However, if they are truly movies, perhaps standard HD movie file formats would suffice.
  • Audiograms, blood chemistry analyses, etc. – are provided by many technicians or labs and could all easily be represented as PDFs, scanned images, or JPEG/MPEG recordings.  Doctors or healthcare providers often discuss salient items from these reports that bear on the patient’s condition.  Such affiliated notes could go in an associated text file, or even a recording of the doctor discussing the results that somehow references the other artifact (“Blood chemistry analysis done on 2/14/2007 indicates …”).
  • Other doctor/healthcare provider notes – I find that every time I visit a healthcare provider these days, they either take copious notes on Wi-Fi connected laptops, record verbal notes to some voice recorder later transcribed into notes, or some combination of these.  Any such information could be provided as standard RTF or plain text files, or as MPEG recordings, and viewed as is.
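As a small illustration of how recognizable the DICOM container mentioned above is: a standard DICOM file begins with a 128-byte preamble followed by the four magic bytes “DICM”.  A minimal Python sketch (the function name is mine, not from any library) that checks for this signature:

```python
def is_dicom_file(path):
    """Return True if the file looks like a standard DICOM file.

    DICOM files start with a 128-byte preamble followed by the
    magic bytes b'DICM'; anything after that is the actual data set.
    """
    with open(path, "rb") as f:
        header = f.read(132)
    return len(header) == 132 and header[128:132] == b"DICM"
```

Actually reading the data elements inside (patient name, modality, pixel data) is best left to a dedicated library or one of the open source viewers.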

How patients can access medical data

Most voice recordings or text notes could easily be emailed to the patient.  As for DICOM images, ultrasound movies, etc., they could all be readily provided on DVDs or other removable media sent to the patient.

Another, possibly better, alternative is to have all this data uploaded to a healthcare provider’s designated URL and stored in a medical record cloud someplace, allowing patient access for viewing, downloading and/or copying.   I envision something akin to a photo sharing site, uploadable by any healthcare provider but accessible for download by any authorized user/patient.
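For downloads from such a medical record cloud to be trustworthy, each uploaded artifact could carry a checksum in a small manifest, so the patient (or a follow-on provider) can verify the file arrived intact.  A sketch of one such manifest entry, using Python’s standard library; the field names here are illustrative, not any existing standard:

```python
import hashlib
import json

def make_manifest_entry(name, data, provider):
    """Build a manifest record for one uploaded medical artifact.

    The SHA-256 digest lets the downloader verify integrity; the
    byte count is a quick sanity check for truncated transfers.
    """
    return {
        "file": name,
        "provider": provider,
        "sha256": hashlib.sha256(data).hexdigest(),
        "bytes": len(data),
    }

# Hypothetical example artifact, just to show the record shape.
entry = make_manifest_entry("xray-2007-02-14.dcm", b"...image bytes...", "Example Clinic")
print(json.dumps(entry, indent=2))
```

After downloading, the patient’s side recomputes the SHA-256 of the file and compares it against the manifest value.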

Medical information security

Any patient data stored in such a medical record cloud would need to be secured, and possibly encrypted with a healthcare-provider-supplied passcode that the patient could use for downloading/decrypting.  There are plenty of open source cryptographic tools that would suffice to encrypt this data (see GNU Privacy Guard, for instance).

As for access passwords, possibly some form of public key cryptography would suffice, but it need not be that sophisticated.  I prefer open source tools for these security mechanisms, as they would then be readily available to the patient or any follow-on medical provider needing to access and decrypt the data.
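To make the passcode idea concrete: the actual encryption could be done with GnuPG’s symmetric mode, but the common first step in any such scheme is stretching the short provider-supplied passcode into real key material.  A sketch using only Python’s standard library; the salt handling and iteration count are illustrative assumptions, not a prescription:

```python
import hashlib
import os

def derive_key(passcode, salt, iterations=200_000):
    """Derive a 256-bit key from a shared passcode.

    PBKDF2-HMAC-SHA256 (Python stdlib) stretches the passcode into
    key material; the salt would be stored alongside the encrypted
    record so the patient can re-derive the same key later.
    """
    return hashlib.pbkdf2_hmac("sha256", passcode.encode(), salt, iterations)

# A fresh random salt per record; the key then feeds the cipher.
salt = os.urandom(16)
key = derive_key("provider-supplied-passcode", salt)
```

The same passcode and salt always yield the same key, which is what lets the patient decrypt years later with nothing more than the credentials letter.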

Medical information retention period

The patient would have a certain amount of time to download these files.  I lean toward months, just to ensure it’s done in a timely fashion, but maybe it should be longer; something on the order of 7 years after a patient’s last visit might work.  This would allow the patient sufficient time to retrieve the data and supply it to any follow-on medical provider, or to store it in their own personal medical record archive.  There are plenty of cloud storage providers, I know, that would be willing to store such data at a fair, but not insubstantial, price for any period of time desired.

Medical information access credentials

All the patient would need is an email and/or possibly a letter that provides the access URL, access password and encryption passcode for the files.  Such information could even be provided in plaintext, appended to the bill cut for the visit, which is sure to find its way to the patient or some financially responsible guardian/parent.
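On the provider side, minting those two credentials is trivial with modern tooling.  A sketch using Python’s `secrets` module; the function name and token lengths are my own illustrative choices:

```python
import secrets

def issue_credentials():
    """Generate one-time access credentials for a patient record.

    token_urlsafe produces cryptographically random, URL-safe
    strings: one used as the download access password, a longer
    one as the encryption passcode. Lengths are illustrative.
    """
    return {
        "access_password": secrets.token_urlsafe(16),
        "encryption_passcode": secrets.token_urlsafe(24),
    }
```

These strings are short enough to print on a bill or mail in a letter, yet random enough that guessing them is impractical.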

How do we get there

Bootstrapping this personal medical record archive shouldn’t be that hard.  As I understand it, Electronic Medical Record (EMR) legislation in the US and elsewhere has provisions stating that any patient has a legal right to copies of any medical record a healthcare provider holds for them.  If so, all we need do is institute some additional legislation requiring healthcare providers to make those records available in a standard format, in a publicly accessible place, access controlled and encrypted via a password/passcode, downloadable by the patient, and to provide the access credentials to the patient in a standard form.  Once that is done, we have all the pieces needed to create the personal medical record archive I envision here.


While such legislation may take some time, one thing we could all do now, at least in the US, is request access to all the medical records/information that is legally ours already.  Once healthcare providers start getting inundated with requests for this data, they might figure that some easy, standardized way to provide it would make sense.  Then healthcare organizations could get together and work to finalize a better solution, and the legislation needed, to provide this in some standard way.  I would think university hospitals could lead this endeavor and show us how it could be done.

Am I missing anything here?