DC is back

Power sub distribution board by Tom Raftery (cc) (from Flickr)

I read an article in this month's IEEE Spectrum on providing direct current (DC) power distribution to consumers.  Apparently various groups around the world are preparing standards to provide 24-V DC and 380-V DC power distribution to homes and offices.

Why DC is good

It turns out that most things electronic today (with the possible exception of electro-magnetic motors) run off DC power.  In fact, the LED desk lamp I just purchased has a converter card in the plug adapter that converts 120-V alternating current (AC) to 24-V DC to power the lamp.

If you look at any PC, server, data storage, etc., you will find power supplies that convert AC to DC for internal electronic use.  Most data centers take in 480-V AC, which is converted to DC to charge uninterruptible power supply (UPS) batteries; the batteries discharge DC power that is converted back to AC, which is then converted once more to DC inside the server electronics.  I count three conversions there: AC to DC, DC to AC, and AC to DC.

The problem with all this AC-DC conversion is that every stage wastes energy.
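
To get a feel for how those conversions compound, here's a minimal back-of-the-envelope sketch in Python.  The per-stage efficiencies are my own illustrative assumptions, not figures from the Spectrum article:

    # Cumulative efficiency of chained power conversions (illustrative only).
    def chain_efficiency(stage_efficiencies):
        """Multiply per-stage efficiencies to get end-to-end efficiency."""
        total = 1.0
        for eff in stage_efficiencies:
            total *= eff
        return total

    # The three conversions described above: AC->DC (UPS charging),
    # DC->AC (UPS output), AC->DC (server power supply).
    stages = [0.93, 0.94, 0.90]   # assumed efficiency of each stage

    eff = chain_efficiency(stages)
    print(f"End-to-end efficiency: {eff:.1%}")        # ~79%
    print(f"Energy lost in conversions: {1 - eff:.1%}")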

The War of Currents or why we have AC power today

Edison was a major proponent of DC distribution early in the history of electric power.  But the problem with DC, or even AC for that matter, is that voltage drops over any serious line distance, which meant the DC generation stations of the time had to be located within a mile or so of consumers.

In contrast, Tesla and Westinghouse proposed distributing AC power because high-voltage AC could easily be converted to low voltage using transformers.  To see why this made a difference, read on…

It turns out that line loss depends mostly on the current being transmitted (resistive loss grows with the square of the current).  But current is only one factor in the equation that determines electrical power; the other is voltage.  Any given power level can be delivered as high current at low voltage or as low current at high voltage.

Because AC at the time could easily be converted from high to low voltage (or vice versa) with transformers, high voltage-low current AC transmission lines could be converted locally into low voltage-high current distribution.  A high voltage-low current line loses far less power, and since AC voltage was so much easier to convert, AC won the War of Currents.
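
A quick worked example makes the arithmetic concrete; the numbers below (10 kW delivered over a line with 0.5 ohms of total resistance) are purely illustrative assumptions of mine:

    # Line loss for the same delivered power at two distribution voltages.
    power = 10_000.0        # watts to deliver
    line_resistance = 0.5   # ohms of total line resistance (assumed)

    for voltage in (120.0, 12_000.0):            # low vs. high voltage distribution
        current = power / voltage                # P = V * I  =>  I = P / V
        loss = current ** 2 * line_resistance    # resistive loss = I^2 * R
        print(f"{voltage:>8.0f} V: {current:6.2f} A, line loss {loss:8.1f} W")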

Move ahead a century or so and electronics have advanced to the point where converting DC voltage levels is almost as easy as converting AC.  More to the point, today's lots of small, individual AC-to-DC and DC-to-AC converters in each appliance, server, UPS, etc. could be replaced by a few larger AC-to-DC converters at the building or household level, improving energy efficiency.

Where DC does better today

Batteries, solar panels, solid state electronics (pretty much any electronic chip, anywhere today), and LED lighting all operate on DC power alone and in most cases convert AC to DC to use today's AC power distribution.  Having 24-V or 380-V DC power in the home or office would allow these devices to operate without converters and be more efficient.  The Spectrum article states that LED lighting infrastructure could save up to 15% of the energy it requires if it were powered directly by DC rather than having to convert AC to DC.

However, the industry standards coming out of the EMerge Alliance and the European Telecommunications Standards Institute may bring one other significant benefit: a single worldwide plug/receptacle standard for DC power.  Whether this happens is anyone's guess, and given today's nationalism it may not be feasible.  But we can always hope for sanity to prevail…

Comments?

Why Open-FCoE is important

FCoE Frame Format (from Wikipedia, http://en.wikipedia.org/wiki/File:Ff.jpg)

I don’t know much about O/S drivers but I do know lots about storage interfaces. One thing that’s apparent from yesterday’s announcement from Intel is that Fibre Channel over Ethernet (FCoE) has taken another big leap forward.

Chad Sakac's chart of FC vs. Ethernet target unit shipments (meaning storage interface types, I think) clearly indicates that a transition to Ethernet is taking place in the storage industry today. Of course, Ethernet targets can be used for NFS, CIFS, object storage, iSCSI and FCoE, so this doesn't necessarily mean that FCoE is winning the game just yet.

WikiBon did a great post on FCoE market dynamics as well.

The advantage of FC, and iSCSI for that matter, is that every server, every OS, and just about every storage vendor in the world supports them. Also, there is a plethora of economical fabric switches available from multiple vendors that support multi-port switching at high bandwidth. And there are many support matrices identifying server HBAs, O/S drivers for those HBAs, and compatible storage products to ensure compatibility. So there is no real problem (other than wading through the support matrices) in implementing either of these storage protocols.

Enter Open-FCoE, the upstart

What's missing from 10GbE FCoE is a really cheap solution, one that is universally available, uses commodity parts and can be had for next to nothing. The new Open-FCoE drivers together with Intel's X520 10GbE NIC have the potential to answer that need.

But what is it? Essentially, Intel's Open-FCoE is an O/S driver for Windows and Linux paired with 10GbE NIC hardware from Intel. It's unclear whether Intel's Open-FCoE driver is a derivative of Open-FCoE.org's Linux driver or not, but either driver performs some of the specialized FCoE functions in software rather than in hardware, as is done by the CNA cards available from other vendors. Using server processing MIPS rather than ASIC processing should make FCoE adoption even cheaper in the long run.

What about performance?

The proof of this will be in benchmark results, but it could well turn out to be a non-issue, especially if there is not a lot of extra processing involved in an FCoE transaction. For example, if Open-FCoE takes, say, 2-5% of server MIPS and bandwidth to perform the added FCoE frame processing, then this might be lost in the noise for most standalone servers and would show up only minimally in storage benchmarks (which always use big, standalone servers).

Yes, but what about virtualization?

However, real-world virtualized servers are another matter. Virtualized servers generally demand more intensive I/O activity, and as one creates 5-10 VMs per ESX server, it's almost guaranteed to generate 5-10X the I/O. If each VM requires 2-5% of a standalone processor to perform Open-FCoE processing, then it could easily represent 5-7 x 2-5% on a 10-VM ESX server (assuming some optimization for virtualization; if virtualization degrades driver processing it could be much worse), which would be a serious burden.

Now these numbers are just guesses on my part, but there is some price to pay for using host server MIPS for every FCoE frame, and it does multiply for virtualized servers, that much I can guarantee you.
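
To put that guess in rough code form, here's a sketch of the multiplication; every percentage and factor in it comes from the guesses in the preceding paragraphs, not from measurements:

    # Back-of-the-envelope: software FCoE overhead per VM, multiplied
    # across a consolidated ESX host. All numbers are guesses, not data.
    per_vm_overhead = (0.02, 0.05)   # 2-5% of a standalone server's CPU
    vms_per_host = 10
    virtualization_factor = 0.6      # assume some driver optimization (~6x, not 10x)

    for pct in per_vm_overhead:
        host_overhead = pct * vms_per_host * virtualization_factor
        print(f"{pct:.0%} per VM -> roughly {host_overhead:.0%} of the host's CPU")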

But the (storage) world is better now

Nonetheless, I must applaud Intel's Open-FCoE thrust, as it can open up a whole new potential market space that today's CNAs may not be able to touch. If it does that, introducing low-end systems to the advantages of FCoE, then as those environments grow, moving to real CNAs should be a relatively painless transition. And this is where the real advantage lies: getting smaller data centers on the right path early in life will make any subsequent adoption of hardware-accelerated capabilities much easier.

But is it really open?

One problem I have with the Intel announcement is the lack of other NIC vendors jumping in. In my mind, it can't really be "open" until any 10GbE NIC can support it.

Which brings us back to Open-FCoE.org. I checked their website and could see no listing for a Windows driver, and there was no NIC compatibility list. So I am guessing their work has nothing to do with Intel's driver, at least as presently defined. Too bad.

However, when Open-FCoE is really supported by any 10GbE NIC, then economies of scale can take off and it could truly represent a low-end cost point for storage infrastructure.

It's unclear to me what Intel has in the X520 NIC that is special to supporting Open-FCoE (maybe some TOE hardware with other special sauce), but anything special needs to be defined and standardized to allow broader adoption by other vendors. Then and only then will Open-FCoE reach its full potential.

—-

So great for Intel, but it could be even better if a standardized definition of an “Open-FCoE NIC” were available, so other NIC manufacturers could readily adopt it.

Comments?

Is cloud computing/storage decentralizing IT, again?

IBM Card Sorter by Pargon (cc) (From Flickr)

Since IT began, computing services have cycled through massive phases of decentralization out to departments followed by consolidation back to the data center.  In the early years of computing, from the 50s to the 60s, the only real distributed alternative to mainframe or big-iron data processing was the sophisticated card sorter.

Consolidation-decentralization Wars

But back in the 70s the consolidation-decentralization wars were driven by the availability of mini-computers competing with mainframes for applications and users.  During the 80s, the PC emerged to become the dominant decentralizer, taking applications away from mainframes and big servers, and in the 90s it was small, off-the-shelf Linux servers and continuing use of high-powered PCs that took applications out of data center control.

In those days it seemed that most computing decentralization was driven by the ease of developing applications for these upstarts and the relatively low cost of the new platforms.

Server virtualization, the final solution

Since 2000, another force has come along to solve the consolidation quandary: server virtualization.  With server virtualization from the likes of VMware, Citrix and others, IT has once again driven massive consolidation of outlying departmental computing services, bringing them all, once again, under one roof and centralizing IT control.  Virtualization provided an optimal answer to the one issue that decentralization could never seem to address: utilization efficiency.  With most departmental servers running at 5-10% utilization, virtualization offered demonstrable cost savings when workloads were consolidated onto data center hardware.

Cloud computing/storage mutiny

But with the insurrection that is cloud computing and cloud storage, departments can once again easily acquire storage and computing resources on demand, and utilization is no longer an issue because it's a pay-only-for-what-you-use solution. They don't even need to develop their own applications, because SaaS providers can supply most of their application needs using cloud computing and cloud storage resources alone.

Virtualization was a great solution to the poor utilization of systems and storage resources. But with the pooling available from cloud computing and storage, utilization effectiveness now occurs outside the bounds of today's data center.  As such, cloud services' utilization effectiveness in $/MIP or $/GB can be roughly equivalent to any highly virtualized data center infrastructure (perhaps even better).  Thus, cloud services can provide these very same utilization enhancements, at reduced cost, to any departmental user without the need for centralized data center services.

Other decentralization issues that cloud solves

Traditionally, the other problems with departmental computing services were lack of security and the unmanageability of distributed services, both of which held back some decentralization efforts, but these are partially being addressed with cloud infrastructure today.  Insecurity continues to plague cloud computing, but some cloud storage gateways (see Cirtas Surfaces and other cloud storage gateway posts) are beginning to use encryption and other cryptographic techniques to address these issues.  How this is solved for cloud computing is another question (see Securing the cloud – Part B).

Cloud computing and storage can be just as diffuse and difficult to manage as a proliferation of PCs or small departmental Linux servers.  However, such unmanageability is a very different issue, one intrinsic to decentralization and much harder to address.  Although it's fairly easy to get a bill for any cloud service, it's unclear whether IT will be able to see all of them, let alone manage them.  Also, nothing seems able to stop a department from signing up for SalesForce.com or using Amazon EC2 to support an application they need.  The only remedy I can see is adherence to strict corporate policy and practice.  So unmanageability remains an ongoing issue for decentralized computing for some time to come.

—-

Nonetheless, it seems as if decentralization via the cloud is back, at least until the next wave of consolidation hits.  My guess for the next driver of consolidation is something that makes application development much easier and quicker to accomplish on centralized data center infrastructure – application frameworks anyone?

Comments?

Top 10 storage technologies over the last decade

Aurora's Perception or I Schrive When I See Technology by Wonderlane (cc) (from Flickr)

Some of these technologies were in development prior to 2000, some were available in other domains but not in storage, and some were in a few subsystems but had yet to become as popular as they are today.  In no particular order, here are my top 10 storage technologies for the decade:

  1. NAND based SSDs – DRAM and other solid state drive (SSD) technologies were available last century, but over the last decade NAND-flash based devices have come to dominate SSD technology and have altered the storage industry forevermore.  Today, it's nigh impossible to find enterprise-class storage that doesn't support NAND SSDs.
  2. GMR heads – Giant magnetoresistance disk heads have become commonplace over the last decade and have allowed disk drive manufacturers to double data density every 18-24 months.  Now GMR heads are starting to transition over to tape storage and will enable that technology to increase data density dramatically as well.
  3. Data deduplication – Deduplication technologies emerged over the last decade as a complement to higher-density disk drives and as a means to back up data more efficiently.  Deduplication can be found in many forms today, ranging from file and block storage systems and backup storage systems to backup-software-only solutions.  (See the sketch after this list for the core idea.)
  4. Thin provisioning – No one would argue that thin provisioning emerged last century, but it took the last decade for it to really find its place in the storage pantheon.  One almost cannot find a data-center-class storage device that does not support thin provisioning today.
  5. Scale-out storage – Last century if you wanted to get higher IOPS from a storage subsystem you could add cache or disk drives but at some point you hit a subsystem performance wall.  With scale-out storage, one can now add more processing elements to a storage system cluster without having to replace the controller to obtain more IO processing power.  The link reference talks about the use of commodity hardware to provide added performance but scale-out storage can also be done with non-commodity hardware (see Hitachi’s VSP vs. VMAX).
  6. Storage virtualization – Server virtualization has taken off as the dominant data center paradigm over the last decade, but its counterpart in storage has also become more viable.  Storage virtualization was originally used to migrate data from old subsystems to new storage, but today it can be used to manage and migrate data across PBs of physical storage, dynamically optimizing data placement for cost and/or performance.
  7. LTO tape – When IBM dominated IT in the mid-to-late last century, the tape format du jour always matched IBM's tape technology.  As this past decade dawned, IBM was no longer the dominant player and tape technology was starting to diverge into a babble of differing formats.  As a result, IBM, Quantum, and HP put their technology together and created a standard tape format, called LTO, which has become the new dominant tape format for the data center.
  8. Cloud storage – It's unclear just when over the last decade cloud storage emerged, but it seemed to be a supplement to cloud computing, which also appeared this past decade.  Storage service providers existed earlier but, due to bandwidth limitations and storage costs, didn't survive the dotcom bubble.  Over this past decade both bandwidth and storage costs have come down considerably, and cloud storage has now become a viable technological solution to many data center issues.
  9. iSCSI – SCSI has taken on many forms over the last couple of decades, but iSCSI has altered the dominant block storage paradigm from a single, pure FC-based SAN to a plurality of technologies.  Nowadays, SMB shops can have block storage without the cost and complexity of FC SANs, over the LAN networking technology they already use.
  10. FCoE – One could argue that this technology is still maturing today, but once again SCSI has opened up another way to access storage. FCoE has the potential to offer all the robustness and performance of FC SANs over data center Ethernet hardware, simplifying and unifying data center networking onto one technology.
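
To make item 3 a little more concrete, here is a minimal sketch of the core idea behind content-hash deduplication: split data into chunks, index each chunk by a hash of its content, and store each unique chunk only once.  This is purely illustrative and not how any particular product implements it:

    # Minimal content-hash deduplication sketch (illustrative only).
    import hashlib

    class DedupStore:
        def __init__(self, chunk_size=4096):
            self.chunk_size = chunk_size
            self.chunks = {}    # content hash -> chunk bytes (stored once)
            self.recipes = {}   # object name -> ordered list of chunk hashes

        def write(self, name, data):
            hashes = []
            for i in range(0, len(data), self.chunk_size):
                chunk = data[i:i + self.chunk_size]
                digest = hashlib.sha256(chunk).hexdigest()
                self.chunks.setdefault(digest, chunk)   # duplicates stored only once
                hashes.append(digest)
            self.recipes[name] = hashes

        def read(self, name):
            return b"".join(self.chunks[h] for h in self.recipes[name])

    store = DedupStore()
    store.write("backup-mon", b"A" * 8192 + b"B" * 4096)
    store.write("backup-tue", b"A" * 8192 + b"C" * 4096)   # shares 8KB with Monday
    print(len(store.chunks), "unique chunks stored for 6 chunks written")   # 3
    assert store.read("backup-mon") == b"A" * 8192 + b"B" * 4096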

No doubt others would differ on their top 10 storage technologies of the last decade, but I strove to find technologies that significantly changed data storage between 2000 and today.  These 10 seemed to me to fit the bill better than most.

Comments?

Punctuated equilibrium for business success

Finch evolution (from http://rst.gsfc.nasa.gov/Sect20/A12d.html)

Coming out of the deep recession of 2007-2009, I am struck by how closely business success during a recession resembles what ecologists call punctuated equilibrium.  As Wikipedia defines it, punctuated equilibria "… is a model for discontinuous tempos of change (in) the process of speciation and the deployment of species in geological time."

This seems to me to look just like a strategic inflection point.  That is, punctuated equilibrium is a dramatic, discontinuous change in a market or an environment that brings about great opportunity for gain or loss.  Such opportunities can significantly increase a business's market share if addressed properly.  But if handled wrong, species and/or market share can vanish with surprising speed.

Galapagos Finches

I first heard of punctuated equilibrium from the Pulitzer prize-winning book The Beak of the Finch by Jonathan Weiner, which documented a study done by two ecologists on Galapagos island finches over the course of a decade or so.  Year after year they went back and mapped out the lives and times of various species of finches on the islands.  After a while they came to the conclusion that they were not going to see any real change in the finches during their study: the successful species were holding their own and the unsuccessful species were barely hanging on.  But then something unusual occurred.

As I recall, there was a great drought on the islands, which left the more usual soft-skinned nuts the finches fed on unavailable.  During this disaster, a segment of finches that hadn't been doing all that well on the islands, but that had a more powerful beak, was able to rapidly gain population, and there was evidence that finch speciation was actually taking place.  It turns out this powerful beak, a liability in normal times, was better able to break open the harder nuts that were relatively more plentiful during the drought but which in normal times went largely unused.

Recessionary Business Success

Similar to the finches, certain business characteristics that in better times might be considered disadvantageous can reap significant gains during a recession.  Specifically,

  • Diverse product & service portfolio –  multiple products and services that appeal to different customer segments/verticals/sizes can help, because some of the businesses you sell to may be suffering while others do OK during a recession.
  • Diverse regional revenue sources – multiple revenue streams coming from first-world, developing, and third-world localities around the globe can help, because some regions will be less impacted by any economic catastrophe.
  • Cash savings – sizable savings can help a company continue to spend on the right activities, letting it emerge from recession much stronger than competitors forced to cut spending to conserve cash.
  • Marketing discipline – understanding how marketing directly influences revenue helps companies better identify and invest in those activities that maximize revenue per marketing dollar spent.
  • Development discipline – understanding how to develop products that deliver customer value helps companies better identify and invest in those activities that generate more revenue per R&D dollar.

There are probably other characteristics I have missed, but these will suffice. For example, consider cash savings: a large cash hoard is probably a poor investment when times are good.  Likewise, diverse product and regional revenue streams may be considered unfocused and distracting when money is flooding in from main product lines sold in first-world regions.  But when times are tough in most areas around the globe or in most business verticals, having diverse revenue sources that span the whole globe and/or all business segments can be the difference between life and death.

The two obvious exceptions here are marketing and development discipline.  It's hard for me to see a potential downside to doing these well.  Both obviously require time, effort and resources to excel at, but the payoffs are there in good times and bad.

I am often amazed by the differences in how companies react to adversity.  Recession is just another, more pressing example of this.  Recessions, like industry transformations, are facts of life today; failing to plan for them is a critical leadership defect that can threaten a business's long-term survival.

One iPad per Child (OipC)

OLPC XO Beta1 (from wikipedia.org)

I started thinking today that the iPad, with some modifications, could be used to provide universal computing and information services to the world's poor as a One iPad per Child (OipC).  Such a solution could easily replace the One Laptop Per Child (OLPC) that exists today with a more commercially viable product.

From my perspective, only a few additions would make the current iPad ideal for universal OipC use.  Specifically, I would suggest adding:

  • Solar battery charger – perhaps the back could be replaced with a solar panel to charge the battery.  Or maybe the front could be reconfigured to incorporate a solar charger underneath or within its touch panel screen.
  • Mesh WiFi – rather than being a standard WiFi client, it would be more useful for the OipC to support a mesh-based WiFi system.  Such a mesh could route internet request packets/data from one OipC to another until a base station is encountered, providing a broadband portal for the mesh (see the toy sketch below).
  • Open source free applications – it would be nice if more open office applications were ported to the new OipC so that free office tools could be used to create content.
  • External storage – software support for NFS or CIFS over WiFi would allow for a more sophisticated computing environment and, together with the mesh WiFi, would allow a central storage repository for all activities.
  • Camera – for photos and video interaction/feedback.

    iPad (from wikipedia.org)
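
As a toy illustration of the mesh WiFi idea in the list above: each OipC would relay traffic toward whichever node has a broadband uplink.  The sketch below is purely conceptual (real mesh protocols such as 802.11s are far more involved) and the village topology in it is made up:

    # Toy mesh-forwarding sketch: breadth-first search for the nearest
    # node with a broadband uplink. Conceptual only.
    from collections import deque

    def route_to_gateway(links, start, gateways):
        seen, queue = {start}, deque([[start]])
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in gateways:
                return path                      # first path found is a shortest one
            for neighbor in links.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(path + [neighbor])
        return None                              # no uplink reachable

    # Hypothetical village mesh: tablets A-D, with D wired to the market's uplink.
    links = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
    print(route_to_gateway(links, "A", {"D"}))   # ['A', 'B', 'C', 'D']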

There are probably other changes needed, but these will suffice for discussion purposes. With such a device and reasonable access to broadband, the world's poor could have most of the information and computing capabilities of the richest nations.  They would have access to the Internet and as such could participate in remote K-12 education as well as obtain free courseware from university internet sites.  They would have access to online news, internet calling/instant messaging, and free email services, all of which could connect them to the rest of the world.

I believe most of the OipC hardware changes could be viable additions to the current iPad, with the possible exception of the mesh WiFi.  But there might be a way to make a mesh WiFi that is software configurable with only modest hardware changes (using software-defined radio transceivers).

Using the current iPad

Of course, the present iPad could be used to support all this without change, if one were to add some external hardware/software:

  • An external solar panel charging system – multiple solar charging stations for car batteries exist today and are used in remote areas.  If one were to wire up a cigarette-lighter socket and purchase a car charger for the iPad, this would suffice as a charging station. Perhaps such a system could be centralized in remote areas and people could pay a small fee to charge their iPads.
  • A remote WiFi hot spot – there are many ways to supply WiFi hot spots to rural areas.  I heard at one time Ireland was providing broadband to rural areas by using local pubs as hot spots.  Perhaps a local market could be wired or radio-connected to support village WiFi.
  • A camera – buy a cheap digital camera and the iPad camera connection kit.  This lacks real-time video streaming but it could provide just about everything else.
  • Apps and storage – software apps could be produced by anyone.  Converting open office applications to work on an iPad doesn't appear that daunting, given the desire to do it.  External iPad storage can be provided today via cloud storage applications.  Supplying native NFS or CIFS support that other apps could use would be more difficult, but it could be provided if there were a market.

The nice thing about the iPad is that it's a monolithic, complete unit. Other than the power connection, there are minimal buttons, moving parts or external components present.  Such simplified componentry should make it more easily usable in all sorts of environments.  I'm not sure how rugged the current iPad is or how well it would hold up in rural areas without shelter, but this could easily be gauged and changes made to improve its survivability.

OipC costs

Having the mesh, solar charger, and camera all onboard the OipC would make this easier to deploy but certainly not cheaper.  The current 16GB iPad's parts and labor come in around US$260 (from livescience).  The additional parts to support the onboard camera, WiFi mesh and solar charger would drive costs up, but perhaps not significantly.  For example, adding the iPhone's 3M-pixel camera to the iPad might cost about US$10 and a 3GS transceiver (as a WiFi mesh substitute) an additional US$3 (both from theappleblog).

As for the solar panel battery charger, I have no idea, but a 10W standalone solar panel can be had from Amazon for $80.  Granted, it doesn't include all the parts needed to convert power to something the iPad can use, and it's big, 10″ by 17″.  This is not optimal and would need to be cut roughly in half (both physically and cost-wise) to better fit the OipC back or front panel.

Such a device might be a worthy successor to the OLPC at roughly double that device's price of US$150 per laptop.  Packaging all these capabilities into the OipC might bring some economies of scale that could bring its price down further.
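
Pulling those estimates together, here's a rough bill-of-materials sketch; the solar-charger line is my own guess at half the cost of the $80 stand-alone panel:

    # Rough OipC bill-of-materials sketch, using the estimates above.
    base_ipad_16gb  = 260   # parts + labor estimate (livescience)
    camera_module   = 10    # 3M-pixel iPhone camera estimate (theappleblog)
    mesh_wifi_radio = 3     # 3GS transceiver as a stand-in for a mesh radio
    solar_charger   = 40    # assumed: roughly half the $80 stand-alone panel

    total = base_ipad_16gb + camera_module + mesh_wifi_radio + solar_charger
    print(f"Estimated OipC cost: ~US${total}")        # ~US$313
    print(f"vs. OLPC at US$150: {total / 150:.1f}x")  # ~2.1x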

Can the OipC replace the OLPC?

One obvious advantage the OipC would have over the OLPC is that it is based on a commercial device.  If one were to use the iPad as it exists today, with the external hardware discussed above, it would be a purely commercial device.  As such, future applications should be more forthcoming, hardware advances should be automatically incorporated into the latest products, and a commercial market would exist to supply and support the products.  All this should result in better, more current software and hardware technology being deployed to third-world users.

Some disadvantages of the OipC vs. the OLPC include the lack of a physical keyboard, of an open source operating system with access to all open source software, and of USB ports.  Of course, all the software and courseware specifically designed for the OLPC would also not work on the OipC.  The open source O/S and the USB ports are probably the most serious omissions; the iPad has a number of external keyboard options which can be purchased if needed.

Now as to how to supply broadband to rural hot spots around the 3rd world, we must leave this for a future post…

VPLEX surfaces at EMCWorld

Pat Gelsinger introducing VPLEXes on stage at EMCWorld

At EMCWorld today, Pat Gelsinger had a pair of VPLEXes flanking him on stage, actively moving VMs from a "Boston" to a "Hopkinton" data center.  They showed a demo of moving a bunch of VMs from one to the other with all of them actively performing transaction processing.  I have written about EMC's vision in a prior blog post called Caching DaaD for Federated Data Centers.

I talked to a vSpecialist at the blogging lounge afterwards and asked him where the data actually resided for the VMs that were moved.  He said the data was synchronously replicated and actively being updated at both locations. They proceeded to long-distance teleport (VMotion) 500 VMs from Boston to Hopkinton.  After that completed, Chad Sakac powered down the "Boston" VPLEX and everything in "Hopkinton" continued to operate.  All this was done on stage, so the Boston and Hopkinton data centers were possibly both located in the convention center, but it was interesting nonetheless.

I asked the vSpecialist how they moved the IP addresses between the sites, and he said the two sites shared the same IP domain.  I am no networking expert, but I had thought that moving network addresses was the last problem to solve for long-distance VMotion.  He said Cisco had solved this with OTV (Overlay Transport Virtualization) for the Nexus 7000, which can move IP addresses from one data center to another.

1 Engine VPLEX back view

Later at the Expo, I talked with a Cisco rep who said they do this by encapsulating Layer 2 protocol messages in Layer 3 packets. Once encapsulated, the traffic can be routed over anyone's gear to the other site, and as long as there is another Nexus 7K switch at the other site within the proper IP domain shared with the VMotion target servers, it all works fine.  I didn't ask what happens if the primary Nexus 7K switch/site goes down, but my guess is that the IP address movement would cease to work. For active VM migration between two operational data centers, though, it all seems to hang together.  I asked Cisco whether OTV was a formal standard TCP/IP protocol extension and the rep didn't know, which probably means that other switch vendors won't support OTV.
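
As a rough mental model of what the rep described (and emphatically not OTV's actual wire format, which I haven't seen), the idea is simply that the entire Layer 2 frame becomes opaque payload inside a routable packet:

    # Toy sketch of carrying a Layer 2 frame inside a routable packet.
    # NOT OTV's real encapsulation; a real overlay would use proper
    # IP/UDP headers plus its own shim header.
    import struct

    def build_ethernet_frame(dst_mac, src_mac, ethertype, payload):
        return dst_mac + src_mac + struct.pack("!H", ethertype) + payload

    def encapsulate(l2_frame, tunnel_header):
        # The whole frame, MAC addresses and all, rides as payload.
        return tunnel_header + l2_frame

    vm_frame = build_ethernet_frame(
        dst_mac=bytes.fromhex("ffffffffffff"),   # broadcast within the shared domain
        src_mac=bytes.fromhex("005056000001"),
        ethertype=0x0800,
        payload=b"...VM traffic...")

    tunnel = b"BOSTON->HOPKINTON|"               # hypothetical stand-in header
    packet = encapsulate(vm_frame, tunnel)
    print(len(packet), "bytes routed between sites")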

4 Engine VPLEX back view

There was a lot of other stuff at EMCWorld today and at the Expo.

  • EMC’s Content Management & Archiving group was renamed Information Intelligence.
  • EMC's Backup Recovery Systems group was out in force on the Expo floor, with a big pavilion featuring Avamar, NetWorker and Data Domain.
  • EMC keynotes were mostly about the journey to the private cloud.  VPLEX seemed to be crucial to this journey as EMC sees it.
  • EMCWorld's show floor was impressive. Lots of major partners were there: RSA, VMware, Iomega, Atmos, VCE, Cisco, Microsoft, Brocade, Dell, CSC, STEC, Forsythe, QLogic, Emulex and many others.  Talked at length with Microsoft about SharePoint 2010. Still trying to figure that one out.
One table at bloggers lounge StorageNerve & BasRaayman in the foreground hard at work

I would say the bloggers lounge was pretty busy for most of the day.  I met a lot of bloggers there, including StorageNerve (Devang Panchigar), BasRaayman (Bas Raayman), Kiwi_Si (Simon Seagrave), DeepStorage (Howard Marks), Wikibon (Dave Vellante), and a whole bunch of others.

Well not sure what EMC has in store for day 2, but from my perspective it will be hard to beat day 1.

Full disclosure, I have written a white paper discussing VPLEX for EMC and work with EMC on a number of other projects as well.

Smart metering’s data storage appetite

European smart meter in use (from en.wikipedia.org/wiki/Smart_meter) (cc)

A couple of years back I was talking with a storage person from PG&E, and he was concerned about the storage performance aspects of installing smart meters in California.  I saw a website devoted to another California electric company installing 1.4M smart meters that send information to the electric company every 15 minutes.  Given that this must be only some small portion of California, it represents ~134M electricity-recording transactions per day, which seems entirely doable. But even at only 128 bytes per transaction, that's ~17GB a day of electric metering data ingested for this one company's service area. Naturally, the power company wants to extend smart metering to gas usage as well, which should not quite double the data load.

According to US census data, there were ~129M households in 2008.  At that same 15-minute interval, smart metering for the whole US would generate ~12B transactions a day, which at 128 bytes per transaction represents ~1.5TB/day.  Of course, that's only households and only electricity usage.

That same census website indicates there were 7.7M businesses in the US in 2007.  Smart metering these businesses at the same interval would add ~740M transactions a day, or ~95GB of data. But fifteen-minute intervals may be too long for some companies (and their power suppliers), so maybe the interval should be dropped to every minute for businesses.  At one-minute intervals, businesses would add 1.4TB of electricity metering data to the households' 1.5TB, for a total of ~3TB of data/day.
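
For anyone who wants to check my math, here's the ingest arithmetic spelled out, using the same assumptions as above (128 bytes per reading, 15-minute household intervals, 1-minute business intervals):

    # Raw smart-metering ingest arithmetic (the post rounds these figures).
    BYTES_PER_READING = 128

    def daily_ingest(meters, readings_per_day):
        transactions = meters * readings_per_day
        return transactions, transactions * BYTES_PER_READING

    homes_tx, homes_bytes = daily_ingest(129_000_000, 24 * 4)   # every 15 minutes
    biz_tx, biz_bytes     = daily_ingest(7_700_000, 24 * 60)    # every minute

    print(f"Households: {homes_tx/1e9:.1f}B readings/day, ~{homes_bytes/1e12:.2f} TB/day")
    print(f"Businesses: {biz_tx/1e9:.1f}B readings/day, ~{biz_bytes/1e12:.2f} TB/day")
    print(f"Total base load: ~{(homes_bytes + biz_bytes)/1e12:.0f} TB/day")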

Storage multiplication tables:

  • That 3TB a day must be backed up, so that's at least another 3TB a day of backup load (deduplication notwithstanding).
  • That 3TB of data must be processed offline as well as online, so that’s another 3TB a day of data copies.
  • That 3TB of data is probably considered part of the power company’s critical infrastructure and as such, must be mirrored to some other data center which is another 3TB a day of mirrored data.

So from this relatively "small" base data load of 3TB a day, we are creating an additional 9TB/day of copies.  Over the course of a year this 12TB/day generates ~4.4PB of data.  A study done by StorageTek in the late '90s showed that, on average, data was copied 6 times, so the 3 copies above may be conservative.  If those study results held true today for metering data, it would generate ~7.7PB/year.
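
And the copy-multiplication arithmetic from the list above, restating the same assumptions:

    # Copy multiplication: backup + offline processing + remote mirror.
    base_tb_per_day = 3      # household + business metering ingest
    copies = 3               # the three copies listed above

    daily_total = base_tb_per_day * (1 + copies)
    print(f"Daily total with copies: {daily_total} TB/day")                  # 12 TB/day
    print(f"Per year: ~{daily_total * 365 / 1000:.1f} PB")                   # ~4.4 PB
    # If the old StorageTek rule of thumb (6 copies on average) still held:
    print(f"At 6 copies: ~{base_tb_per_day * 7 * 365 / 1000:.1f} PB/year")   # ~7.7 PB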

To paraphrase Senator E. Dirksen: a petabyte here, a petabyte there, and pretty soon you're talking real storage.

In prior posts we discussed the 1.5PB of data generated by CERN each year, the expectations for the world to generate an exabyte (EB) a day of data in 2009, and NSA's need to capture and analyze a yottabyte (YB) a year of voice data by 2015.  Here we show how another 4-8PB of storage could be created each year just by rolling out smart electricity metering to US businesses and homes.

As more and more aspects of home and business become digitized, more data is created each day, and it all must be stored someplace – on data storage.  Other technology arenas may also benefit from this digitization of life, leisure, and the economy, but today we would contend that storage benefits most from this trend.  We must defer the discussion of why storage benefits more than other technological domains to some future post.