Big data and eMedicine combine to improve healthcare

fix_me by ~! (cc) (from Flickr)
fix_me by ~! (cc) (from Flickr)

We have talked before ePathology and data growth, but Technology Review recently reported that researchers at Stanford University have used Electronic Medical Records (EMR) from multiple medical institutions to identify a new harmful drug interaction. Apparently, they found that when patients take Paxil (a depressant) and Pravachol (a cholresterol reducer) together, the drugs interact to raise blood sugar similar to what diabetics have.

Data analytics to the rescue

The researchers started out looking for new drug interactions which could result in conditions seen by diabetics. Their initial study showed a strong signal that taking both Paxil and Pravachol could be a problem.

Their study used FDA Adverse Event Reports (AERs) data that hospitals and medical care institutions record.  Originally, the researchers at Stanford’s Biomedical Informatics group used AERs available at Stanford University School of Medicine but found that although they had a clear signal that there could be a problem, they didn’t have sufficient data to statistically prove the combined drug interaction.

They then went out to Harvard Medical School and Vanderbilt University and asked that to access their AERs to add to their data.  With the combined data, the researchers were now able to clearly see and statistically prove the adverse interactions between the two drugs.

But how did they analyze the data?

I could find no information about what tools the biomedical informatics researchers used to analyze the set of AERs they amassed,  but it wouldn’t surprise me to find out that Hadoop played a part in this activity.  It would seem to be a natural fit to use Hadoop and MapReduce to aggregate the AERs together into a semi-structured data set and reduce this data set to extract the AERs which matched their interaction profile.

Then again, it’s entirely possible that they used a standard database analytics tool to do the work.  After all, we were only talking about a 100 to 200K records or so.

Nonetheless, the Technology Review article stated that some large hospitals and medical institutions using EMR are starting to have database analysts (maybe data scientists) on staff to mine their record data and electronic information to help improve healthcare.

Although EMR was originally envisioned as a way to keep better track of individual patients, when a single patient’s data is combined with 1000s more patients one creates something entirely different, something that can be mined to extract information.  Such a data repository can be used to ask questions about healthcare inconceivable before.

—-

Digitized medical imagery (X-Rays, MRIs, & CAT scans), E-pathology and now EMR are together giving rise to a new form of electronic medicine or E-Medicine.  With everything being digitized, securely accessed and amenable to big data analytics medical care as we know is about to undergo a paradigm shift.

Big data and eMedicine combined together are about to change healthcare for the better.

e-pathology and data growth

Blue nevus (4 of 4) by euthman (cc) (From Flickr)
Blue nevus (4 of 4) by euthman (cc) (From Flickr)

I was talking with another analyst the other day by the name of John Koller of Kai Consulting who specializes in the medical space and he was talking about the rise of electronic pathology (e-pathology).  I hadn’t heard about this one.

He said that just like radiology had done in the recent past, pathology investigations are moving to make use of digital formats.

What does that mean?

The biopsies taken today for cancer and disease diagnosis which involve one more specimens of tissue examined under a microscope will now be digitized and the digital files will be inspected instead of the original slide.

Apparently microscopic examinations typically use a 1×3 inch slide that can have the whole slide devoted to some tissue matter.  To be able to do a pathological examination, one has to digitize the whole slide, under magnification at various depths within the tissue.  According to Koller, any tissue is essentially a 3D structure and pathological exams, must inspect different depths (slices) within this sample to form their diagnosis.

I was struck by the need for different slices of the same specimen. I hadn’t anticipated that but whenever I look in a microscope, I am always adjusting the focal length, showing different depths within the slide.   So it makes sense, if you want to understand the pathology of a tissue sample, multiple views (or slices) at different depths are a necessity.

So what does a slide take in storage capacity?

Koller said, an uncompressed, full slide will take about 300GB of space. However, with compression and the fact that most often the slide is not completely used, a more typical space consumption would be on the order of 3 to 5GB per specimen.

As for volume, Koller indicated that a medium hospital facility (~300 beds) typically does around 30K radiological studies a year but do about 10X that in pathological studies.  So at 300K pathological examinations done a year, we are talking about 90 to 150TB of digitized specimen images a year for a mid-sized hospital.

Why move to  e-pathology?

It can open up a whole myriad of telemedicine offerings similar to the radiological study services currently available around the globe.  Today, non-electronic pathology involves sending specimens off to a local lab and examination by medical technicians under microscope.  But with e-pathology, the specimen gets digitized (where, the hospital, the lab, ?) and then the digital files can be sent anywhere around the world, wherever someone is qualified and available to scrutinize them.

—–

At a recent analyst event we were discussing big data and aside from the analytics component and other markets, the vendor made mention of content archives are starting to explode.  Given where e-pathology is heading, I can understand why.

It’s great to be in the storage business