Graphene Flash Memory

Model of graphene structure by CORE-Materials (cc) (from Flickr)

I have been thinking about writing a post on “Is Flash Dead?” for a while now.  Well, at least since talking with IBM research a couple of weeks ago about the new memory technologies they have been working on.

But then this new Technology Review article came out  discussing recent research on Graphene Flash Memory.

Problems with NAND Flash

As we have discussed before, NAND flash memory has some serious limitations as it’s shrunk below 11nm or so. For instance, write endurance plummets, memory retention times are reduced and cell-to-cell interactions increase significantly.

These issues are not that much of a problem with today’s flash at 20nm or so. But to continue to follow Moore’s law and drop the price of NAND flash on a $/Gb basis, it will need to shrink below 16nm.  At that point or soon thereafter, current NAND flash technology will no longer be viable.

Other non-NAND based non-volatile memories

That’s why IBM and others are working on different types of non-volatile storage such as PCM (phase change memory), MRAM (magnetic RAM) , FeRAM (Ferroelectric RAM) and others.  All these have the potential to improve general reliability characteristics beyond where NAND Flash is today and where it will be tomorrow as chip geometries shrink even more.

IBM seems to be betting on MRAM or racetrack memory technology because it has near DRAM performance, extremely low power and can store far more data in the same amount of space. It sort of reminds me of delay line memory where bits were stored on a wire line and read out as they passed across a read/write circuit. Only in the case of racetrack memory, the delay line is etched in a silicon circuit indentation with the read/write head implemented at the bottom of the cleft.

Graphene as the solution

Then along comes Graphene-based Flash Memory.  Graphene can apparently be used as a substitute for the storage layer in a flash memory cell.  According to the report, the graphene stores data using less power and with better stability over time, both crucial problems for NAND flash memory as it’s shrunk below today’s geometries.  The research is being done at UCLA and is supported by Samsung, a significant manufacturer of NAND flash memory today.

Current demonstration chips are much larger than would be useful.  However, given graphene’s material characteristics, the researchers believe there should be no problem scaling it down below where NAND Flash would start exhibiting problems.  The next iteration of research will be to see if their scaling assumptions can hold when device geometry is shrunk.

The other problem is getting graphene, a new material, into current chip production.  Current materials used in chip manufacturing lines are very tightly controlled and  building hybrid graphene devices to the same level of manufacturing tolerances and control will take some effort.

So don’t look for Graphene Flash Memory to show up anytime soon. But given that 16nm chip geometries are only a couple of years out and 11nm, a couple of years beyond that, it wouldn’t surprise me to see Graphene based Flash Memory introduced in about 4 years or so.  Then again, I am no materials expert, so don’t hold me to this timeline.

 

—-

Comments?

Pure Storage surfaces

1 controller X 1 storage shelf (c) 2011 Pure Storage (from their website)

We were talking with Pure Storage last week, another SSD startup, which just emerged out of stealth mode today.  Somewhat like SolidFire, which we discussed a month or so ago, Pure Storage uses only SSDs to provide primary storage.  In this case, they are supporting an FC front end with an all-SSD backend, and implementing internal data deduplication and compression, to try to address the needs of enterprise tier 1 storage.

Pure Storage is in final beta testing with their product and plan to GA sometime around the end of the year.

Pure Storage hardware

Their system is built around MLC SSDs, which are available from many vendors, but with a strategic investment from Samsung they currently use that vendor’s drives.  As we know, MLC has write endurance limitations, but Pure Storage was designed from the ground up knowing they were going to use this technology, and they have built their IP to counteract these issues.

The system is available in one- or two-controller configurations, with an Infiniband interconnect between the controllers, a 6Gbps SAS backend, 48GB of DRAM per controller for caching purposes, and NV-RAM to protect cached data across power outages.  Each controller has 12 cores supplied by two Intel Xeon processor chips.

With the first release they are limiting the system to one or two controllers (the HA option), but their storage architecture is capable of clustering together many more, maybe even up to eight controllers using the Infiniband backend.

Each storage shelf provides 5.5TB of raw storage using 2.5″ 256GB MLC SSDs.  It looks like each controller can handle up to two storage shelves, with the HA (dual controller) option supporting four drive shelves for up to 22TB of raw storage.

Pure Storage Performance

Although these numbers are not independently verified, the company says a single controller (with one storage shelf) can do 200K sustained 4K random read IOPS, 2GB/sec of bandwidth, 140K sustained write IOPS, or 500MB/sec of write bandwidth.  A dual controller system (with two storage shelves) can achieve 300K random read IOPS, 3GB/sec of bandwidth, 180K write IOPS, or 1GB/sec of write bandwidth.  They also claim that they can do all this IO with under 1 msec latency.

One of the things they pride themselves on is consistent performance.  They have built their storage such that they can deliver this consistent performance even under load conditions.

Given the number of SSDs in their system this isn’t screaming performance, but it is certainly up there with many enterprise-class systems sporting over 1000 disks.  The random write performance is not bad considering this is MLC.  On the other hand, the sequential write bandwidth is probably their weakest spec and reflects their use of MLC flash.

Purity software

One key to Pure Storage (and SolidFire for that matter) is their use of inline data compression and deduplication. By using these techniques and basing their system storage on MLC, Pure Storage believes they can close the price gap between disk and SSD storage systems.

The problem with data reduction technologies is that not all environments can benefit from them, and both require lots of CPU power to perform well.  Pure Storage believes they have the horsepower (with 12 cores per controller) to support these services and are focusing their sales activities on those environments (VMware, Oracle, and SQL Server) which have historically proven to be good candidates for data reduction.

In addition, they perform a lot of optimizations in their backend data layout to prolong the life of MLC storage. Specifically, they use a write chunk size that matches the underlying MLC SSD’s page width so as not to waste endurance on partial-page writes.  They also migrate old data to new locations occasionally to maintain “data freshness”, which can be a problem with MLC storage if the data is not touched often enough.  There is probably other stuff as well, but essentially they are tuning their backend to optimize the endurance and performance of their SSD storage.
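To make the page-width idea concrete, here is a minimal sketch of how a backend might coalesce small writes into page-sized chunks before committing them to flash. This is purely illustrative; the 16KB page size, the buffering policy and the flush callback are my assumptions, not Pure Storage’s actual implementation.

```python
# Illustrative only: coalesce small writes into SSD-page-sized chunks so the
# flash never sees a partial-page program. The 16KB page size below is an
# assumption, not Pure Storage's actual geometry.
PAGE_SIZE = 16 * 1024  # hypothetical MLC page width in bytes

class PageAlignedWriteBuffer:
    def __init__(self, flush_fn, page_size=PAGE_SIZE):
        self.flush_fn = flush_fn        # callback that writes one full page
        self.page_size = page_size
        self.pending = bytearray()      # writes accumulated but not yet flushed

    def write(self, data: bytes):
        self.pending.extend(data)
        # Emit only complete pages; the remainder stays buffered (in practice
        # it would sit in NV-RAM so it survives a power loss).
        while len(self.pending) >= self.page_size:
            page = bytes(self.pending[:self.page_size])
            self.flush_fn(page)
            del self.pending[:self.page_size]

# Example: count how many full pages a stream of 4KB writes produces.
pages_written = []
buf = PageAlignedWriteBuffer(pages_written.append)
for _ in range(10):
    buf.write(b"x" * 4096)
print(len(pages_written))  # 2 full 16KB pages flushed, 8KB still buffered
```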

Furthermore, they have created a new RAID 3D scheme, an adaptive parity approach based on the number of available drives that protects against any dual SSD failure.  They provide triple parity: dual parity for drive failures and another parity for unrecoverable bit errors within a data payload.  In most cases, a failed drive will not induce an immediate rebuild, but rather a reconfiguration of data and parity to accommodate the failing drive, with its data rebuilt onto new drives over time.

At the moment, they don’t have snapshots or data replication but they said these capabilities are on their roadmap for future delivery.

—-

In the meantime, all-SSD storage systems seem to be coming out of the woodwork. We mentioned SolidFire, but WhipTail is another one, and I am sure there are plenty more in stealth waiting for the right moment to emerge.

I was at a conference about two months ago where I predicted that all-SSD systems would be coming out requiring little of the engineering development that went into the storage systems of yore. Based on the performance available from a single SSD, one wouldn’t need hundreds of SSDs to generate 100K IOPS or more.  Pure Storage is doing this level of IO with only 22 MLC SSDs and a high-end, but essentially off-the-shelf, controller.
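A quick back-of-the-envelope check on that claim, using the single-controller numbers quoted earlier (the 22-SSD shelf count is from the post; the per-drive figures are just division):

```python
# Rough arithmetic behind the "you don't need hundreds of SSDs" point, using
# the single-controller, single-shelf numbers cited above.
read_iops = 200_000      # sustained 4K random read IOPS, 1 controller + 1 shelf
write_iops = 140_000     # sustained write IOPS, same configuration
ssds = 22                # MLC SSDs in one storage shelf

print(f"~{read_iops / ssds:,.0f} read IOPS per SSD")    # ~9,091
print(f"~{write_iops / ssds:,.0f} write IOPS per SSD")  # ~6,364
# Both are comfortably within what a single MLC SSD can deliver, which is why
# a modest drive count plus off-the-shelf controllers can get there.
```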

Just imagine what one could do if you threw some custom hardware at it…

Comments?

IBM research introduces SyNAPSE chip

IBM, with the help of Columbia, Cornell, the University of Wisconsin (Madison) and the University of California, has created the first generation of neuromorphic chips (press release and video), which mimic the human brain’s computational architecture implemented in silicon.  The chip is a result of Project SyNAPSE (standing for Systems of Neuromorphic Adaptive Plastic Scalable Electronics).

Hardware emulating wetware

Apparently the chip supports two cores, one with 65K “learning” synapses and the other with ~256K “programmable” synapses.  I am not really sure from reading the press release, but it seems each core contains 256 neuronal computational elements.

Wikimedia commons (481px-Chemical_synapse_schema_cropped)

In contrast, the human brain contains between 100 trillion and 500 trillion synapses (wikipedia) and has ~85 billion neurons (wikipedia). Typical human neurons have 1000s of synapses.

IBM’s goal is to have a trillion-neuron processing engine with 100 trillion synapses occupy a 2-liter volume (about the size of the brain) and consume less than one kilowatt of power (about 50X the brain’s power consumption).
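For a sanity check on that power comparison (the ~20 watt figure for the human brain is a commonly cited estimate, not something from IBM’s announcement):

```python
# Rough power comparison; the ~20W figure for the brain is a commonly cited
# estimate, not from IBM's announcement.
brain_watts = 20          # approximate human brain power consumption
target_watts = 1_000      # IBM's stated goal: under one kilowatt

print(f"~{target_watts / brain_watts:.0f}x the brain's power budget")  # ~50x
```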

I want one.

IBM is calling such a system built out of neuromorphic chips a cognitive computing system.

What to do with the system

The IBM research team has demonstrated some typical AI applications such as simple navigation, machine vision, pattern recognition, associative memory and classification applications with the chip.

Given my history with von Neumann computing, it’s kind of hard for me to envision how synapses represent “programming” in the brain.  Nonetheless, wikipedia defines a synapse as a connection between any two neurons, which can take one of two forms, electrical or chemical. A chemical synapse (wikipedia) can have different levels of strength, plasticity, and receptivity.  Sounds like this might be where the programmability lies.

Just what the “learning” synapses do, how they relate to the “programmable” synapses, and how they do it is another question entirely.

Stay tuned, a new, non-von Neumann computing architecture was born today.  Two questions to ponder:

  1. I wonder if they will still call it artificial intelligence?
  2. Are we any closer to the Singularity now?

—-

Comments?

 

Is FC dead?!

SNIA Tech Center Computer Lab 2 switching hw (c) 2011 Silverton Consulting, Inc.

I was at the Pacific Crest/Mosaic annual conference cocktail hour last night, surrounded by a bunch of iSCSI/NAS storage vendors, and they made the statement that FC is dead.

Apparently, 40GbE is just around the corner and 10GbE cards have started a steep drop in price and are beginning to proliferate through the enterprise.  The vendors present felt that an affordable 40GbE that does iSCSI and/or FCoE would be the death knell for FC as we know it.

As evidence they point to Brocade’s recent quarterly results, which show their storage business in decline, down 5-6% YoY for the quarter. In contrast, Brocade’s Ethernet business is up 12-13% YoY this quarter (albeit from a low starting point).  Further confusing the picture, Brocade is starting to roll out 16Gbps FC (16GFC) while the storage market is still trying to digest the changeover to 8Gbps FC.

But do we need the bandwidth?

One question is whether we need 16GFC or even 40GbE in the enterprise today.  Most vendors point to server virtualization as a significant consumer of enterprise bandwidth.  But it’s unclear to me whether this is reality or just the next wave of technology needing to find a home.

Let’s consider for the moment what 16GFC and 40GbE can do for data transfer. If we assume ~10 bits per byte then:

  • 16GFC can provide 1.6GB/s of data transfer,
  • 40GbE can provide 4GB/s of data transfer.

Using the Storage Performance Council’s SPC-2 results, the top data transfer subsystem (IBM DS8K) is rated at 9.7GB/s, so with 40GbE it could use about 3 links, and with 16GFC it would need about 7 links to sustain this bandwidth.
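Here is that arithmetic spelled out, using the post’s ~10 bits-per-byte assumption for line-encoding overhead:

```python
# Effective payload bandwidth from link line rate, assuming ~10 bits on the
# wire per byte of data to account for encoding overhead (the post's assumption).
import math

def effective_gb_per_s(line_rate_gbit: float) -> float:
    return line_rate_gbit / 10

print(effective_gb_per_s(16))   # 16GFC -> ~1.6 GB/s
print(effective_gb_per_s(40))   # 40GbE -> ~4.0 GB/s

ds8k_gb_per_s = 9.7             # top SPC-2 data transfer result cited above
print(math.ceil(ds8k_gb_per_s / effective_gb_per_s(40)))  # ~3 40GbE links
print(math.ceil(ds8k_gb_per_s / effective_gb_per_s(16)))  # ~7 16GFC links
```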

So there’s at least one storage system out there that can utilize the extreme bandwidth that such interfaces supply.

Now as for the server side, nailing down the true need is a bit harder to do.  Using Amdahl’s IO law, which states there is 1 IO for every 50K instructions, and with Intel’s Core i7 Extreme Edition rated at 159 KMIPS, such a processor should generate about 3.2M IO/s, and at 4KB per IO this would be about 12GB/sec.  So the current crop of high-end processors seems able to consume this level of bandwidth, if present.
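And the server-side estimate worked through (the 1 IO per 50K instructions ratio and the 159 KMIPS rating are the inputs above; the rest is arithmetic):

```python
# Server-side bandwidth demand per Amdahl's IO law: 1 IO per 50K instructions.
instructions_per_io = 50_000
mips = 159_000                 # Core i7 Extreme rating, millions of instructions/sec
io_size_bytes = 4 * 1024       # 4KB per IO

ios_per_sec = mips * 1_000_000 / instructions_per_io
bandwidth = ios_per_sec * io_size_bytes / 2**30   # GB (binary) per second

print(f"~{ios_per_sec / 1e6:.1f}M IO/s")   # ~3.2M IO/s
print(f"~{bandwidth:.0f} GB/sec")          # ~12 GB/sec
```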

FC or Ethernet?

Now for the critical question: which interface does the data center use to provide that bandwidth?  The advantages of FC are diminishing over time as FCoE becomes more widely adopted, and any speed advantage that FC had should go away with the introduction of data center 40GbE.

The other benefit that Ethernet offers is a “single data center backbone” which can handle all network/storage traffic.  Many large customers are almost salivating at the possibility of getting by with a single infrastructure for everything vs. having to purchase and support separate cabling, switches and server cards to use FC.

On the other hand, having separate networks, segregated switching, and isolation between network and storage traffic can provide better security, availability, and reliability, which are hard to duplicate with a single network.

To summarize, one would have to say that there are some substantive soft benefits to having both Ethernet and FC infrastructure, but there are hard cost and operational advantages to having a single infrastructure based on 10GbE or, hopefully someday, 40GbE.

—-

So I would have to conclude that FC’s days are numbered, especially once 40GbE becomes affordable and thereby widely adopted in the data center.

Comments?

Shared DAS

Code Name "Thumper" by richardmasoner (cc) (from Flickr)

An announcement this week by VMware on their vSphere  5 Virtual Storage Appliance has brought back the concept of shared DAS (see vSphere 5 storage announcements).

Over the years, there have been a few products, such as Seanodes and Condor Storage (which may not exist now), that have tried to make a market out of sharing DAS across a cluster of servers.

Arguably, Hadoop HDFS (see Hadoop – part 1), Amazon S3/cloud storage services and most scale out NAS systems all support similar capabilities. Such systems consist of a number of servers with direct attached storage, accessible by other servers or the Internet as one large, contiguous storage/file system address space.

Why share DAS? The simple fact is that DAS is cheap, its capacity is increasing, and it’s ubiquitous.

Shared DAS system capabilities

VMware has limited their DAS virtual storage appliance to a 3 ESX node environment, possibly for lots of reasons.  But there is no such restriction for Seanodes Exanode clusters.

On the other hand, VMware has specifically targeted SMB data centers for this facility.  In contrast, Seanodes has focused on both HPC and SMB markets for their shared internal storage which provides support for a virtual SAN on Linux, VMware ESX, and Windows Server operating systems.

Although VMware Virtual Storage Appliance and Seanodes do provide rudimentary SAN storage services, they do not supply advanced capabilities of enterprise storage such as point-in-time copies, replication, data reduction, etc.

But some of these facilities are available outside their systems. For example, VMware with vSphere 5 will support a host-based replication service and has had software-based snapshots for some time now. Similar services exist or can be purchased for Windows and presumably Linux.  Also, cloud storage providers have offered a smattering of these capabilities in their offerings from the start.

Performance?

Although distributed DAS storage has the potential for high performance, it seems to me that these systems should perform worse than an equivalent amount of processing power and storage in a dedicated storage array.  But my biases might be showing.

On the other hand, Hadoop and scale-out NAS systems are capable of screaming performance when put together properly.  Recent SPECsfs2008 results for EMC’s Isilon scale-out NAS system have demonstrated very high performance, and Hadoop’s claim to fame is high-performance analytics. But you have to throw a lot of nodes at the problem.

—–

In the end, all it takes is software. Virtualizing servers, sharing DAS, implementing advanced storage features: any of these can be done in software alone.

However, service levels, high availability and fault tolerance requirements have historically necessitated a physical separation between storage and compute services. Nonetheless, if you really need screaming application performance and software based fault tolerance/high availability will suffice, then distributed DAS systems with co-located applications like Hadoop or some scale out NAS systems are the only game in town.

Comments?

Big data – part 3

Linkedin maps data visualization by luc legay (cc) (from Flickr)

I have renamed this series to “Big data” because it’s no longer just about Hadoop (see Hadoop – part 1 & Hadoop – part 2 posts).

To try to partition this space just a bit, there is unstructured data analysis and structured data analysis. Hadoop is used to analyze unstructured data (although Hadoop itself is often used to parse and impose structure on that data).
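As a tiny illustration of that parse-and-structure step, here is a hypothetical Hadoop-streaming-style mapper/reducer pair that turns unstructured web-server log lines into structured (status code, count) records; the log format and field positions are invented for the example, and the grouping that Hadoop would normally do between map and reduce is simulated locally.

```python
# Hypothetical mapper/reducer: parse unstructured log lines into structured
# (status_code, count) records. The log format here is invented for illustration.
from collections import Counter

def mapper(lines):
    """Extract the HTTP status code field from each raw log line."""
    for line in lines:
        fields = line.split()
        if len(fields) >= 9:          # skip malformed lines
            yield fields[7]           # status code position in this made-up format

def reducer(keys):
    """Count occurrences per key (Hadoop would group and shuffle these for us)."""
    return Counter(keys)

sample = [
    '10.0.0.1 - - [15/Aug/2011:10:00:00] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.2 - - [15/Aug/2011:10:00:01] "GET /missing.gif HTTP/1.1" 404 128',
    '10.0.0.1 - - [15/Aug/2011:10:00:02] "GET /index.html HTTP/1.1" 200 512',
]
print(reducer(mapper(sample)))        # Counter({'200': 2, '404': 1})
```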

On the other hand, for structured data there are a number of other options currently available. Namely:

  • EMC Greenplum – a relational database that is available as software only and now also as a hardware appliance. Greenplum supports both row- and column-oriented data structuring and has support for policy-based data placement across multiple storage tiers. There is a packaged solution that consists of Greenplum software and a Hadoop distribution running on a Greenplum appliance.
  • HP Vertica – a column-oriented, relational database that is currently available as a software-only distribution. Vertica supports aggressive data compression and provides high-throughput query performance. They were early supporters of Hadoop integration, providing Hadoop MapReduce and Pig API connectors that give Hadoop access to data in Vertica databases, along with job scheduling integration.
  • IBM Netezza – a relational database system based on a proprietary hardware analysis engine configured in a blade system. Netezza is the second oldest solution on this list (see Teradata for the oldest). Since the acquisition by IBM, Netezza provides their highest-performing solution on IBM blade hardware, but all of their systems depend on purpose-built FPGA chips designed to perform high-speed queries across relational data. Netezza has a number of partner and/or homegrown solutions that provide specialized analysis for specific verticals such as retail, telecom, finserv, and others. Also, Netezza provides tight integration with various Oracle functionality, but there doesn’t appear to be much direct integration with Hadoop on their website.
  • ParAccel – a column-based, relational database that is available as a software-only solution. ParAccel offers a number of storage deployment options, including an all in-memory database, a DAS database, or an SSD database. In addition, ParAccel offers a Blended Scan approach providing a two-tier database structure with DAS and SAN storage. There appears to be some integration with Hadoop, indicating that data stored in HDFS and structured by MapReduce can be loaded and analyzed by ParAccel.
  • Teradata – a relational database based on proprietary, purpose-built appliance hardware. Teradata recently came out with an all-SSD solution which provides very high performance for database queries. The company was started in 1979, has been very successful in the retail, telecom and finserv verticals, and offers a number of special-purpose applications supporting data analysis for these and other verticals. There appears to be some integration with Hadoop but it’s not prominent on their website.

I am probably missing a few other solutions, but these appear to be the main ones at the moment.

In any case, both Hadoop and most of its software-only, structured competition are based on a massively parallelized, shared-nothing set of Linux servers. The two hardware-based solutions listed above (Teradata and Netezza) also operate in a massively parallel processing mode to load and analyze data. Such solutions provide scale-out performance at a reasonable cost to support very large databases (PBs of data).

Now that EMC owns Greenplum and HP owns Vertica, we are likely to see more appliance-based packaging options for both of these offerings. EMC has taken the lead here and has already announced Greenplum-specific appliance packages.

—-

One lingering question about these solutions is why customers don’t use traditional database systems (Oracle, DB2, Postgres, MySQL) to do this analysis. The answer seems to lie in the fact that these traditional solutions are not massively parallelized. Thus, doing this analysis on TBs or PBs of data would take too long. Moreover, the cost to support data analysis over PBs of data with traditional database solutions would be prohibitive. For these reasons, and the fact that compute power has become so cheap nowadays, structured data analytics for large databases has migrated to these special-purpose, massively parallelized solutions.

Comments?

SolidFire supplies scale-out SSD storage for cloud service providers

SolidFire SF3010 node (c) 2011 SolidFire (from their website)

I was talking the other day with a local startup called SolidFire that has an interesting twist on SSD storage.  They are targeting cloud service providers with a scale-out, cluster-based SSD iSCSI storage system.  Apparently a portion of their team came from Lefthand (now owned by HP), another local storage company, and the rest came from Rackspace, a national cloud service provider.

The hardware

Their storage system is a scale-out cluster of storage nodes that can range from 3 to a theoretical maximum of 100 nodes (validated node count?). Each node comes equipped with two 2.4GHz, 6-core Intel processors and ten 300GB SSDs for a total of 3TB of raw storage per node.  Each node also has 8GB of non-volatile DRAM for write buffering and a 72GB read cache.

The system also uses two 10GbE links for host-to-storage IO and inter-cluster communications and supports iSCSI LUNs.  There are another two 1GbE links used for management communications.

SolidFire states that they can sustain 50K IO/sec per node. (This looks conservative from my viewpoint, but they didn’t state any specific R:W ratio or block size for this performance number.)

The software

They are targeting cloud service providers, and as such the management interface was designed from the start as a RESTful API; they also have a web GUI built on top of that API.  Cloud service providers will automate whatever they can, and having a RESTful API seems like the right choice.

QoS and data reliability

The cluster supports 100K iSCSI LUNs and each LUN can have a QoS SLA associated with it.  According to SolidFire one can specify a minimum/maximum/burst level for IOPS and a maximum or burst level for throughput at a LUN granularity.
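Since the interface is API-first, provisioning a LUN with a QoS SLA would presumably look something like the sketch below. To be clear, the endpoint path and JSON field names here are hypothetical stand-ins for illustration only, not SolidFire’s actual API.

```python
# Hypothetical sketch of API-driven provisioning with per-LUN QoS limits.
# The endpoint path and JSON field names are invented for illustration and
# are NOT SolidFire's actual API.
import json
import urllib.request

def create_lun(api_base, name, size_gb, min_iops, max_iops, burst_iops):
    payload = {
        "name": name,
        "sizeGB": size_gb,
        "qos": {"minIOPS": min_iops, "maxIOPS": max_iops, "burstIOPS": burst_iops},
    }
    req = urllib.request.Request(
        f"{api_base}/luns",                      # hypothetical endpoint
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:    # returns the created LUN record
        return json.load(resp)

# A cloud provider's automation could then carve out performance tiers per tenant:
# create_lun("https://cluster.example/api/v1", "tenant42-vol1", 500, 1000, 5000, 8000)
```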

With LUN-based QoS, one can divide cluster performance into many levels of support for a cloud provider’s multiple customers.  Given these unique QoS capabilities, it should be relatively easy for cloud providers to support multiple customers on the same storage, providing very fine-grained multi-tenancy capabilities.

This could potentially lead to system over-commitment, but presumably they have some way to detect when over-commitment is near and to prevent it from occurring.

Data reliability is supplied through replication across nodes, which they call Helix(tm) data protection.  In this way, if an SSD or node fails, it’s relatively easy to reconstruct the lost data onto another node’s SSD storage.  This is probably why the minimum number of nodes per cluster is set at 3.

Storage efficiency

Aside from the QoS capabilities, the other interesting twist from a customer perspective is that they are trying to price an all-SSD storage solution at the $/GB of normal enterprise disk storage. They believe their node with 3TB raw SSD storage supports 12TB of “effective” data storage.

They are able to do this by offering the storage efficiency features of enterprise storage in an all-SSD configuration. Specifically they provide:

  • Thin provisioned storage – which allows physical storage to be oversubscribed across multiple LUNs, since physical space is only consumed as data is actually written.
  • Data compression – which searches for redundancy within a chunk of data and compresses it out of the stored data.
  • Data deduplication – which searches across blocks and LUNs for duplicate data and eliminates the duplicates.
  • Space-efficient snapshots and cloning – which allow users to take point-in-time copies that consume little space, useful for backup and test-dev requirements.

Tape data compression gets anywhere from 2:1 to 3:1 reduction in storage space for typical data loads. Whether SolidFire’s system can reach these numbers is another question.  However, tape uses hardware compression, and the traditional problem with software data compression is that it takes lots of processing power and/or time to perform well.  As such, SolidFire has configured their node hardware to dedicate a CPU core to each physical drive (two 6-core processors for 10 SSDs in a node).

Deduplication savings are somewhat trickier to predict and ultimately depend on the data being stored in a system and the algorithm used to deduplicate it.  For user home directories, typical deduplication levels of 25-40% are readily attainable.  SolidFire stated that their deduplication algorithm is their own patented design and uses a small, fixed-block approach.
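To illustrate what a small, fixed-block dedup pass does in general (a generic sketch, not SolidFire’s patented algorithm; the 4KB block size is an assumption):

```python
# Generic fixed-block deduplication sketch: fingerprint every 4KB block and
# keep only one copy of each. Illustrative only, not SolidFire's algorithm.
import hashlib

BLOCK = 4096

def dedup_ratio(data: bytes) -> float:
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    unique = {hashlib.sha256(b).digest() for b in blocks}
    return len(blocks) / len(unique)   # e.g. 2.0 means a 2:1 reduction

# Example: ten identical 4KB blocks plus ten distinct ones -> ~1.8:1
data = (b"\x00" * BLOCK) * 10 + b"".join(bytes([i]) * BLOCK for i in range(1, 11))
print(f"{dedup_ratio(data):.2f}:1")
```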

The savings from thin provisioning ultimately depend on how much physical data is actually consumed on a storage LUN, but in typical environments it can save 10-30% of physical storage by pooling unwritten or free space across all the LUNs configured on a storage system.

Space savings from point-in-time copies like snapshots and clones depend on data change rates and how long it’s been since a copy was made.  But with space-efficient copies and a short period of existence (as used for backups or temporary copies in test-development environments), such copies should take little physical storage.
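Multiplying rough midpoints of those individual savings together gives a back-of-the-envelope feel for the overall raw-to-effective multiplier; real results depend entirely on the workload, and the factors are not truly independent:

```python
# Back-of-the-envelope combination of the savings discussed above, using rough
# midpoints of the quoted ranges. The factors overlap in practice, so treat
# this as a plausibility check, not a prediction.
compression = 2.5                   # 2:1 to 3:1 for typical data
dedup = 1 / (1 - 0.33)              # ~25-40% duplicates removed -> ~1.5:1
thin_provisioning = 1 / (1 - 0.20)  # ~10-30% unwritten space reclaimed -> 1.25:1

multiplier = compression * dedup * thin_provisioning
print(f"~{multiplier:.1f}:1 raw-to-effective")          # ~4.7:1
print(f"3TB raw -> ~{3 * multiplier:.0f}TB effective")  # ~14TB
```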

Whether all of this can create a 4:1 multiplier for raw-to-effective data storage is another question, but they also have an eScanner tool which can estimate the savings one can achieve in a given data center. Apparently the eScanner can be used by anyone to scan real customer LUNs, and it will compute how much SolidFire storage would be required to support the scanned volumes.

—–

There are a few items left on their current road map to be delivered later, namely remote replication or mirroring. But for now this looks to be a pretty complete package of iSCSI storage functionality.

SolidFire is currently signing up customers for Early Access but plan to go GA sometime around the end of the year. No pricing was disclosed at this time.

I was at SNIA’s BoD meeting the other week and stated my belief that SSDs will ultimately lead to the commoditization of storage.  By that I meant that it would be relatively easy to configure enough SSD hardware to create a 100K IO/sec or 1GB/sec system without having to manage 1000 disk drives.  Lo and behold, SolidFire comes out the next week.  Of course, I said this would happen over the next decade – so I am only off by about 9.99 years…

Comments?

e-pathology and data growth

Blue nevus (4 of 4) by euthman (cc) (From Flickr)

I was talking the other day with another analyst, John Koller of Kai Consulting, who specializes in the medical space, and he was talking about the rise of electronic pathology (e-pathology).  I hadn’t heard about this one.

He said that just like radiology had done in the recent past, pathology investigations are moving to make use of digital formats.

What does that mean?

The biopsies taken today for cancer and disease diagnosis, which involve one or more specimens of tissue examined under a microscope, will now be digitized, and the digital files will be inspected instead of the original slide.

Apparently microscopic examinations typically use a 1×3 inch slide, and the whole slide can be devoted to tissue matter.  To be able to do a pathological examination, one has to digitize the whole slide under magnification at various depths within the tissue.  According to Koller, any tissue is essentially a 3D structure, and a pathological exam must inspect different depths (slices) within the sample to form a diagnosis.

I was struck by the need for different slices of the same specimen. I hadn’t anticipated that, but whenever I look in a microscope I am always adjusting the focal length, showing different depths within the slide.  So it makes sense that, if you want to understand the pathology of a tissue sample, multiple views (or slices) at different depths are a necessity.

So how much storage capacity does a slide take?

Koller said an uncompressed, full slide will take about 300GB of space. However, with compression, and given that most often the slide is not completely used, a more typical space consumption would be on the order of 3 to 5GB per specimen.

As for volume, Koller indicated that a medium hospital facility (~300 beds) typically does around 30K radiological studies a year but about 10X that in pathological studies.  So at 300K pathological examinations a year, we are talking about 900TB to 1.5PB of digitized specimen images a year for a mid-sized hospital.
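Working that volume estimate through (using the per-specimen and study-count figures above):

```python
# Annual pathology image storage for a mid-sized (~300 bed) hospital,
# using the figures quoted above.
radiology_studies_per_year = 30_000
pathology_exams_per_year = radiology_studies_per_year * 10   # ~10x radiology
gb_per_specimen_low, gb_per_specimen_high = 3, 5             # compressed slides

low_tb = pathology_exams_per_year * gb_per_specimen_low / 1000
high_tb = pathology_exams_per_year * gb_per_specimen_high / 1000
print(f"~{low_tb:.0f} TB to {high_tb / 1000:.1f} PB per year")  # ~900 TB to 1.5 PB
```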

Why move to e-pathology?

It can open up a myriad of telemedicine offerings similar to the radiological study services currently available around the globe.  Today, non-electronic pathology involves sending specimens off to a local lab for examination by medical technicians under a microscope.  But with e-pathology, the specimen gets digitized (where? the hospital, the lab?) and then the digital files can be sent anywhere around the world, wherever someone is qualified and available to scrutinize them.

—–

At a recent analyst event we were discussing big data, and aside from the analytics component and other markets, the vendor mentioned that content archives are starting to explode.  Given where e-pathology is heading, I can understand why.

It’s great to be in the storage business.