An Exabyte-a-day

snp microarray data by mararie (cc) (from flickr)

At HPTechDay this week, Jim Pownell of the office of the CTO, HP StorageWorks Division, reported on an IDC study which said that this year the world is creating about an Exabyte of data each day. An Exabyte (XB) is 10**18 bytes, or 1000 PB, of data. That seems a bit high from my perspective.

Data creation by individuals

Population Growth and Income Level Chart by mattlemmon (cc) (from flickr)

The US Census Bureau estimates today's worldwide population at around 6.8 billion people. Given that estimate, the XB/day number says that the average person is creating about 150MB/day.
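The per-person figure can be checked with a few lines of Python. This is only a sketch of the arithmetic: the constant names are made up for illustration, and it assumes decimal units (1MB = 10**6 bytes).

```python
# Figures quoted in the post; the constant names are illustrative only.
EXABYTE_BYTES = 10**18      # one Exabyte, decimal definition
WORLD_POPULATION = 6.8e9    # worldwide population estimate cited above
BYTES_PER_MB = 10**6        # decimal megabyte (an assumption)

# Spread one Exabyte per day evenly over everyone on the planet.
per_person_mb = EXABYTE_BYTES / WORLD_POPULATION / BYTES_PER_MB
print(f"{per_person_mb:.0f} MB per person per day")
```

The result lands just under the rounded 150MB/day used in the text.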

Now I don’t know about you, but we probably create that much data during our best week. That being said, our family average over the last 3.5 years is more like 30.1MB/day. Over the last year, that average has been closer to 75.1MB/day (darn new digital camera).

If I take our 75.1MB/day as a reasonable approximation for our family, and with 2 adults in our family, this would say each adult creates ~37.6MB of data per day.

About 50% of today's worldwide population probably has no means of creating any data whatsoever. Of the remaining 50%, maybe 33% is at an age where data creation is insignificant. All this leaves about 2.3B people actively creating data at around 37.6MB/day. This would account for about 86.5PB of data creation a day.
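The estimate above can be reproduced as follows. This is a rough sketch using the post's own guesses as inputs; the variable names are hypothetical, and the final step deliberately uses the rounded 2.3B and 37.6MB figures so that it matches the text.

```python
WORLD_POPULATION = 6.8e9
SHARE_WITH_ACCESS = 0.50    # guess: half the world can create data at all
SHARE_TOO_YOUNG = 0.33      # guess: a third of those create almost nothing
PER_ADULT_MB = 37.6         # 75.1MB/day split across 2 adults, rounded

# Unrounded head count of active data creators (about 2.28 billion).
active_creators = WORLD_POPULATION * SHARE_WITH_ACCESS * (1 - SHARE_TOO_YOUNG)

# The text rounds that head count up to 2.3 billion before multiplying.
ROUNDED_CREATORS = 2.3e9
individual_pb = ROUNDED_CREATORS * PER_ADULT_MB * 1e6 / 1e15  # MB -> bytes -> PB

print(f"{active_creators / 1e9:.2f}B active creators")
print(f"{individual_pb:.1f} PB/day from individuals")
```

Skipping the intermediate rounding would give a total about 1PB lower, which makes no difference to the argument.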

Naturally, I would consider myself a power data creator, but:

  • We are not doing much with video production, which creates gobs of data.
  • Also, my wife retains camera rights and I only take the occasional photo with my cell phone. So I wouldn’t say we are heavy into photography.

Nonetheless, 37.6MB/day on average seems exceptionally high, even for us.

Data creation by companies

However, that XB a day covers corporate data generation as well as data created by individuals. Hoover’s, a US corporate database, lists about 33M companies worldwide. These are probably the biggest 33M and are no doubt creating lots of data each day.

Given the estimate above that individuals probably account for 86.5PB/day, that leaves ~913.5PB/day for the 33M companies in Hoover’s DB to create. By my calculations this would say each of these companies is generating ~27.7GB/day. No doubt there are plenty of companies out there doing this each day, but the average company generates 27.7GB a day?? I don’t think so.

Ok, my count of companies could be wildly off. Perhaps the 33M companies in Hoover’s DB represent only the top 20% of companies worldwide, which means there may be another 132M smaller companies out there, for a total of 165M companies. Now the 913.5PB/day says the average company generates ~5.5GB/day. This still seems high to me, especially considering this is an average across all 165M companies worldwide.
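Both per-company averages come from the same division, sketched below. The helper `gb_per_company` and the constant names are hypothetical, and the inputs are simply the estimates used in this post.

```python
TOTAL_PB = 1000.0        # one Exabyte per day, expressed in PB
INDIVIDUAL_PB = 86.5     # estimated share created by individuals


def gb_per_company(company_count, corporate_pb=TOTAL_PB - INDIVIDUAL_PB):
    """Average GB/day per company if companies create the remaining data."""
    return corporate_pb * 1e15 / company_count / 1e9  # PB -> bytes -> GB


hoovers_only = gb_per_company(33e6)         # companies listed in Hoover's
all_companies = gb_per_company(33e6 / 0.20) # if those are only the top 20%

print(f"{hoovers_only:.1f} GB/day per company (33M companies)")
print(f"{all_companies:.1f} GB/day per company (165M companies)")
```

Changing the 20% guess is a one-line edit, which makes it easy to see how many companies it would take before the average starts to look plausible.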

Most analysts predict data creation is growing by over 100% per year, so that XB/day number for this year will be at least 2XB/day next year.
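To see where that growth rate leads, here is a small sketch that compounds exactly 100% growth per year from 1XB/day. The function name and its defaults are assumptions for illustration, not part of the IDC study.

```python
def projected_xb_per_day(years_ahead, start_xb=1.0, annual_growth=1.0):
    """Daily data creation after compounding annual_growth (1.0 = 100%)."""
    return start_xb * (1 + annual_growth) ** years_ahead


for year in range(4):
    print(f"year +{year}: {projected_xb_per_day(year):.0f} XB/day")
```

At 100% a year the daily volume doubles every year, so every estimate above would have to double along with it.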

Of course I have been looking at a new HD video camera for my birthday…

Sony_HDR-TG5V_Vanity350