Data density – Page 3 – Silverton Consulting

Optical discs for Facebook cold storage

Posted on February 8, 2014 by Ray in Cloud services, Data density, Data retention, Optical storage, Storage longevity, System effectiveness

I heard last week that Facebook is implementing Blu Ray libraries for cold storage. Each BluRay disc holds ~100GB and they figure they can store 10,000 discs or ~1PB in a rack.

They bundle 12 discs in a cartridge and 36 cartridges in a magazine, placing 24 magazines in a cabinet, with BluRay drives and a robotic arm. The robot arm sits in the middle of the cabinet with the magazines/cartridges located on each side.

It’s unclear what Amazon Glacier uses for its storage but a retrieval time of 3-5 hours indicates removable media of some type. I haven’t seen anything on Windows Azure offering a similar service but Google has released Durable Reduced Availability (DRA) storage which could potentially be hosted on removable media as well. I was unable to find any access time specifications for Google DRA.

Why the interest in cold storage?

The article mentioned that Facebook is testing the new technology first on its compliance data. After that Facebook will start using it for cold photo storage. Facebook also said that it will be using different storage technologies for its cold storage repository, mentioning “bad flash” as another alternative.

BluRay supports both rewritable and WORM (write once, read many times) media. WORM discs cannot be modified, only destroyed, which would be very useful for anyone’s compliance data. The rewritable Blu Ray discs might be more effective for cold photo storage; however, the fact that people on Facebook rarely delete photos says WORM would work well here too.

100GB is a pretty small storage bucket these days but for compliance documents, such as email, invoices, contracts, etc. it’s plenty large.

Can Blu Ray optical provide data center cold storage?

Facebook didn’t discuss the specs on the robot arm that they were planning to use but with 10K discs it has a lot of work to do. Tape library robots move a single cartridge in about 11 seconds or so. If the optical robot could do as well (no information to the contrary), one robot arm could support ~7.8K disc moves, or ~4K disc swaps (two moves each), per day. But that would be enterprise class robotics and 100% duty cycle; more likely 1/2 to 1/4 of this would be considered good for an off the shelf system like this. So maybe 1000 to 2000 disc swaps per day.

If we use 22 seconds per disc swap (two disc moves), a single robot/rack could support a maximum of 100 to 200TB of data writes per day (assuming robot speed was the only bottleneck). In the video (see about 30 minutes in) the robot didn’t look all that fast as compared to a tape library robot, but maybe I am biased.
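The robot arithmetic above can be sketched in a few lines of Python. This is only a back-of-envelope model: the 11 seconds per move is the tape library figure quoted above, not a Facebook spec, and the function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def swaps_per_day(seconds_per_move=11.0, moves_per_swap=2, duty_cycle=1.0):
    """Disc swaps one robot arm can complete per day (assumed tape-class speed)."""
    return SECONDS_PER_DAY * duty_cycle / (seconds_per_move * moves_per_swap)

def write_ceiling_tb_per_day(swaps, disc_gb=100):
    """Daily write ceiling if every swap loads a blank disc that is then filled."""
    return swaps * disc_gb / 1000.0

if __name__ == "__main__":
    # 100% duty cycle, then the 1/2 and 1/4 derating guessed at in the text
    for duty in (1.0, 0.5, 0.25):
        swaps = swaps_per_day(duty_cycle=duty)
        ceiling = write_ceiling_tb_per_day(swaps)
        print(f"duty {duty:.2f}: {swaps:,.0f} swaps/day, {ceiling:,.0f} TB/day ceiling")
```

At 1/2 and 1/4 duty cycle this lands at roughly 2000 and 1000 swaps per day, which is where the 100 to 200TB/day ceiling comes from.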

Near as I can tell a 12x BluRay drive can write at ~35MB/sec (SATA drive, writing a single layer 25GB disc; we assume this can be sustained for a 4-layer or dual-sided 2-layer 100GB disc). So writing a full 100GB disc would take ~48 minutes and if you add to that the 22 seconds of disc swap time, one SATA drive running 100% flat out could maybe write 30 discs per day or ~3TB/day.

In the video, the BluRay drives appear to be located in an area above the disc magazines along each side. There appear to be two drives per column with 6 columns per side, so a maximum of 24 drives. With 24 drives, one rack could write about 72TB/day or 720 discs per day, which would fit within our 22 seconds per swap robot budget. At 72TB/day it’s going to take ~14 days to fill up a cabinet. I could be off on the drive count, as they didn’t show the whole cabinet in the video, so it’s possible they have 12 columns per side, 48 drives per cabinet and 144TB/day.
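For anyone who wants to play with the drive-side numbers, here is a minimal Python sketch of the estimate. The 35MB/sec write speed, 22 second swap, drive counts and ~1PB cabinet are the guesses from the text above (decimal units, 100% duty cycle), not published specs, and the helper names are hypothetical.

```python
SECONDS_PER_DAY = 86400

def disc_write_seconds(disc_gb=100, drive_mb_per_s=35.0):
    """Time to fill one disc at a sustained write rate (1GB = 1000MB)."""
    return disc_gb * 1000 / drive_mb_per_s

def discs_per_drive_per_day(disc_gb=100, drive_mb_per_s=35.0, swap_s=22.0):
    """Discs one drive can fill per day, including the swap time per disc."""
    return SECONDS_PER_DAY / (disc_write_seconds(disc_gb, drive_mb_per_s) + swap_s)

def cabinet_tb_per_day(drives, disc_gb=100, drive_mb_per_s=35.0, swap_s=22.0):
    """Aggregate write rate for a cabinet with the given number of drives."""
    return drives * discs_per_drive_per_day(disc_gb, drive_mb_per_s, swap_s) * disc_gb / 1000

def days_to_fill(tb_per_day, cabinet_pb=1.0):
    """Days needed to write a full cabinet at the given daily rate."""
    return cabinet_pb * 1000 / tb_per_day

def robot_busy_fraction(drives, swap_s=22.0):
    """Share of the day a single robot arm spends swapping discs for all drives."""
    return drives * discs_per_drive_per_day(swap_s=swap_s) * swap_s / SECONDS_PER_DAY

if __name__ == "__main__":
    for drives in (24, 48):
        rate = cabinet_tb_per_day(drives)
        print(f"{drives} drives: {rate:.0f} TB/day, {days_to_fill(rate):.1f} days to fill, "
              f"robot busy {robot_busy_fraction(drives):.0%} of the day")
```

Under these assumptions the robot is busy well under half the day even with 48 drives, so the drives, not the arm, look like the bottleneck.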

All this assumes 100% duty cycle on the drives which is unreasonable for an enterprise class tape drive let alone a consumer class BluRay drive. This is also write speed, I assume that read speed is the same or better. Also, I didn’t see any servers in the cabinet and I assume that something has to be reading, writing and controlling the optical library. So these other servers need to be somewhere close by, but they could easily be located in a separate rack somewhere near to the library.

So it all makes some amount of sense from a system throughput perspective. Given what we know about the drive speed, cartridge capacity and robot capabilities, it’s certainly possible that the system could sustain the disc swaps and data transfer necessary to provide data center cold storage archive.

And the software

But there’s plenty of software that has to surround an optical library to make it useful. Somehow we would want to be able to identify a file as a candidate for cold storage then have it moved to some cold storage disc(s), cataloged, and then deleted from the non-cold storage repository. Of course, we probably want 2 or more copies to be written, maybe these redundant copies should be written to different facilities or at least different cabinets. The catalog to the cold storage repository is all important and needs to be available 24X7 so this needs to be redundant/protected, updated with extreme care, and from my perspective on some sort of high-speed storage to handle archives of 3EB.

What about OpenStack? Although there have been some rumblings by Oracle and others to provide tape support in OpenStack, nothing seems to be out yet. However, it’s not much of a stretch to see removable media support in OpenStack, if some large company were to put some effort into it.

Other cold storage alternatives

In the video, Facebook says they currently have 30PB of cold storage at one facility and are already in the process of building another. They said that they should have 150PB of cold storage online shortly and that each cold storage facility is capable of holding 3EB or 3,000PB of cold storage.

A couple of years back at Hitachi in Japan, we were shown a Blu Ray optical disc library using 50GB discs. This was just a prototype but they were getting pretty serious about it then. We also saw an update of this at an analyst meeting at HDS, a year or so later. So there’s at least one storage company working on this technology.

Facebook seems to have decided they were better off developing their own approach. It’s probably more dense/space efficient and maybe even more power efficient, but to tell that would take some spec comparisons which aren’t available from Facebook or HDS just yet.

Why not magnetic tape?

I see these large storage repository sizes and wonder if Facebook might not be better off using magnetic tape. It has a much larger capacity and I believe magnetic tape (LTO or enterprise) would supply better volumetric (bytes/in**3) density than the Blu Ray cabinet they showed in the video.

Facebook said that BluRay discs had a 50 year lifetime. I believe enterprise and LTO tape vendors say their cartridges have a 30 year lifetime. And that might be one consideration driving them to optical.

The reality is that new LTO technology is coming out every 2-3 years or so, and new drives read only 2 generations back and write only the current and one prior generation. With that quick a turnover, a data center would probably have to migrate data from old to new tape technology every decade or so before old tape drives go out of warranty.

I have not seen any Blu Ray technology roadmaps so it’s hard to make a comparison, but to date, PC based Blu Ray drives typically can read and write CDs, DVDs, and current Blu Ray discs (which is probably 4 to 5 generations back). So they have a better reputation for backward compatibility over time.

Tape technology roadmaps are so quick because tape competes with disk, which doubles capacity every 18 months or so. I am sure tape drive and media vendors would be happy not to upgrade their technology so fast but then disk storage would take over more and more tape storage applications.

If Blu Ray were to become a data center storage standard, as Facebook seems to want, I believe that Blu Ray technology would fall under similar competitive pressures from both disk and tape to upgrade optical technology at a faster rate. When that happens, it would be interesting to see how quickly optical drives stop supporting the backward compatibility that they currently support.

Comments?

Photo Credit: [73/366] Grooves by Dwayne Bent [Ed. note, picture of DVD, not Blu Ray disc]

Oracle (finally) releases StorageTek VSM6

Posted on October 25, 2012 by Ray in Clustered storage, Data, Data density, Disaster Recovery, ESCON, Ethernet, FICON, Networking, R&D measures, Storage reliability, Storage virtualization, System effectiveness, Systems, Tape storage, Uncategorized

[Full disclosure: I helped develop the underlying hardware for VSM 1-3 and also way back, worked on HSC for StorageTek libraries.]

Virtual Storage Manager System 6 (VSM6) is here. Not exactly sure when VSM5 or VSM5E were released but it seems like an awful long time in Internet years. The new VSM6 migrates the platform to Solaris software and hardware while expanding capacity and improving performance.

What’s VSM?

Oracle StorageTek VSM is a virtual tape system for mainframe, System z environments. It provides a multi-tiered storage system which includes both physical disk and (optional) tape storage for long term big data requirements for z/OS applications.

VSM6 emulates up to 256 virtual IBM tape transports but actually moves data to and from VSM Virtual Tape Storage Subsystem (VTSS) disk storage and backend real tape transports housed in automated tape libraries. As VSM data ages, it can be migrated out to physical tape such as a StorageTek SL8500 Modular [Tape] Library system that is attached behind the VSM6 VTSS or system controller.

VSM6 offers a number of replication solutions for DR to keep data in multiple sites in synch and to copy data to offsite locations. In addition, real tape channel extension can be used to extend the VSM storage to span onsite and offsite repositories.

One can cluster together up to 256 VSM VTSSs into a tapeplex which is then managed under one pane of glass as a single large data repository using HSC software.

What’s new with VSM6?

The new VSM6 hardware increases volatile cache to 128GB from 32GB (in VSM5). Non-volatile cache goes up as well, now supporting up to ~440MB, up from 256MB in the previous version. Power, cooling and weight all seem to have also gone up (the wrong direction??) vis a vis VSM5.

The new VSM6 removes the ESCON option of previous generations and moves to 8 FICON and 8 GbE Virtual Library Extension (VLE) links. FICON channels are used for both host access (frontend) and real tape drive access (backend). VLE was introduced in VSM5 and offers a ZFS based commodity disk tier behind the VSM VTSS for storing data that requires longer residency on disk. Also, VSM supports a tapeless or disk-only solution for high performance requirements.

System capacity moves from 90TB (gosh that was a while ago) to now support up to 1.2PB of data. I believe much of this comes from supporting the new T10000C tape cartridge and drive (5TB uncompressed). With the ability of VSM to cluster more VSM systems to the tapeplex, system capacity can now reach over 300PB.

Somewhere along the way VSM started supporting triple redundancy for the VTSS disk storage which provides better availability than RAID6. Not sure why they thought this was important but it does deal with increasing disk failures.

Oracle stated that VSM6 supports up to 1.5GB/Sec of throughput. Presumably this is landing data on disk or transferring the data to backend tape but not both. There doesn’t appear to be any standard benchmarking for these sorts of systems so I will take their word for it.

Why would anyone want one?

Well it turns out plenty of mainframe systems use tape for a number of things such as data backup, HSM, and big data batch applications. Once you get past the sunk costs for tape transports, automation, cartridges and VSMs, VSM storage can be a pretty competitive data storage solution for the mainframe environment.

The fact that most mainframe environments grew up with tape and have long ago invested in transports, automation and new cartridges probably makes VSM6 an even better buy. But tape is also making a comeback in open systems with LTO-5 and now LTO-6 coming out and with Oracle’s 5TB T10000C cartridge and IBM’s 4TB 3592 JC cartridge.

Not to mention Linear Tape File System (LTFS) as a new tape format that provides a file system for tape data which has brought renewed interest in all sorts of tape storage applications.

Competition not standing still

EMC introduced their Disk Library for Mainframe 6000 (DLm6000) product that supports two different backends to deal with the diversity of tape use in the mainframe environment. Moreover, IBM has continuously enhanced their Virtual Tape Server, the TS7700, but I would have to say it doesn’t come close to these capacities.

Lately, when I talked with long time StorageTek tape mainframe customers they have all said the same thing: when is VSM6 coming out and when will Oracle get their act in gear and start supporting us again? Hopefully this signals a new emphasis on this market. Although who is losing and who is winning in the mainframe tape market is the subject of much debate, there is no doubt that the lack of any update to VSM has hurt Oracle’s StorageTek tape business.

Something tells me that Oracle may have fixed this problem. We hope that we start to see some more timely VSM enhancements in the future, for their sake and especially for their customers.

~~~~

Comments?

~~~~

Image credit: Interior of StorageTek tape library at NERSC (2) by Derrick Coetzee

Shingled magnetic recording disks

Posted on October 4, 2012 by Ray in Block Storage, Data, data access, Data density, Disk storage, File Storage, Storage, Storage drive, Strategic Inflection Points, System effectiveness, System quality, Systems

A couple of weeks ago I attended a day of the SNIA Storage Developers Conference (SDC) where Garth Gibson of Carnegie Mellon University Parallel Data Lab (CMU PDL) and Panasas was giving a talk on what they are up to at CMU’s storage lab. His talk at the conference was on shingled magnetic recording (SMR) disks. We have discussed this topic before in our posts on Sequential only disks?! and in Disk trends revisited. SMR may require a re-thinking of how we currently access disk storage.

Recall that shingled magnetic recording uses a write head that overwrites multiple tracks at a time (see graphic above), with one track being properly written and the adjacent (inward) tracks being overwritten. As the head moves to the next track, that track can be properly written but more adjacent (inward) tracks are overwritten, etc. In this fashion data can be written sequentially, on overlapping write passes. In contrast, read heads can be much narrower and are able to read a single track.

In my post, I assumed that this would mean that the new shingled magnetic recording disks would need to be accessed sequentially not unlike tape. Such a change would need a massive rewrite to only write data sequentially. I had suggested this could potentially work if one were to add some SSD or other NVRAM to the device to help manage the mapping of the data to the disk. Possibly that plus a very sophisticated drive controller, not unlike SSD wear leveling today, could handle mapping a physically sequentially accessed disk to a virtually randomly accessed storage protocol.

Garth’s approach to the SMR dilemma

Garth and his team of researchers are taking another tack at the problem. In his view there are multiple groups of tracks on an SMR disk (zones or bands). Each band can be either written sequentially or randomly but all bands can be read randomly. One can break up the disk to include multiple shingled bands that are sequentially written and fewer non-shingled bands that can be randomly written. Of course there would be a gap between the shingled bands in order not to overwrite adjacent bands. And there would also be gaps between the randomly written tracks in a non-shingled partition to allow for the wider track writing that occurs with the SMR write head.

His pitch at the conference dealt with some characteristics of such a multi-band disk device, such as:

How to determine the density for a device that has multiple bands of both shingled write data and randomly written data.
How big or small a shingled band should be in order to support “normal” small block and randomly accessed file IO.
How many randomly written tracks or what the capacity of the non-shingled bands would need to be to support “normal” file IO activity.

For maximum areal density one would want large shingled bands. There are other interesting considerations that were not as obvious but I won’t go into here.

SCSI protocol changes for SMR disks

The other, more interesting section of Garth’s talk was on recently proposed T10 and T13 changes to support SMR disks with both shingled and non-shingled partitions, and on what needs to be done to support SMR devices.

The SCSI protocol changes being considered to support SMR devices include:

A new write cursor for shingled write bands that indicates the next LBA to be written. The write cursor starts out at a relative band address of 0 and as each LBA is written consecutively in the band it’s incremented by one.
A write cursor can be reset (to zero), indicating that the band has been erased.
Each drive maintains the band map and current cursor position within each band and this can be requested by SCSI drivers to understand the configuration of the drive.
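To make the write cursor idea concrete, here is a toy Python model of the three behaviors listed above. It is only a sketch of my reading of the proposal: the class and method names are hypothetical and do not correspond to actual T10/T13 commands.

```python
class ShingledBand:
    """One shingled band: readable at random, writable only at the cursor."""

    def __init__(self, num_lbas):
        self.blocks = [None] * num_lbas
        self.write_cursor = 0  # relative LBA of the next block that may be written

    def write(self, lba, data):
        if self.write_cursor >= len(self.blocks):
            raise ValueError("band is full; reset it before writing again")
        if lba != self.write_cursor:
            raise ValueError(f"out-of-order write: lba {lba}, cursor {self.write_cursor}")
        self.blocks[lba] = data
        self.write_cursor += 1  # cursor advances by one per consecutive LBA written

    def read(self, lba):
        # Only blocks below the cursor hold valid data in this model.
        if not 0 <= lba < self.write_cursor:
            raise ValueError(f"lba {lba} has not been written")
        return self.blocks[lba]

    def reset(self):
        """Reset the cursor to zero, which marks the whole band as erased."""
        self.write_cursor = 0


class EmulatedSmrDrive:
    """Holds the band map that a driver could query to learn the drive layout."""

    def __init__(self, band_sizes):
        self.bands = [ShingledBand(n) for n in band_sizes]

    def report_bands(self):
        # (band number, band size in LBAs, current write cursor) for every band
        return [(i, len(b.blocks), b.write_cursor) for i, b in enumerate(self.bands)]
```

A driver built on something like this would call report_bands() once to learn the layout and then direct all writes for a band at its current cursor.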

Probably other changes are required as well but these seem sufficient to flesh out the problem.

SMR device software support

Garth and his team implemented an SMR device, emulated in software using real, randomly accessed devices. They then implemented an SMR device driver that used the proposed standards changes and finally, implemented a ShingledFS file system to use this emulated SMR disk to see how it would work. (See their report on Shingled Magnetic Recording for Big Data Applications for more information.)

The CMU team implemented a log structured file system for the ShingledFS that only wrote data to the emulated SMR disk shingled partition sequentially, except for mapping and meta-data information which was written and updated randomly in a non-shingled partition.

You may recall that a log structured file system is essentially written as a sequential stream of data (not unlike a log). But there is additional mapping required that indicates where file data is located in the log which allows for randomly accessing the file data.
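A minimal Python sketch of the log structured idea follows. This is not ShingledFS’s actual design, just an illustration with made-up names: the append-only log stands in for the sequentially written shingled partition and the index stands in for the randomly updated mapping data.

```python
class LogStructuredStore:
    """Append-only data log plus a randomly updated map of where files live."""

    def __init__(self):
        self.log = bytearray()  # only ever appended to, like a shingled band
        self.index = {}         # file name -> (offset, length) of the latest version

    def put(self, name, data):
        offset = len(self.log)
        self.log += data        # a sequential write at the end of the log
        self.index[name] = (offset, len(data))  # the small random metadata update

    def get(self, name):
        offset, length = self.index[name]
        return bytes(self.log[offset:offset + length])  # random read via the map

    def dead_bytes(self):
        """Bytes in the log that no longer belong to any current file version."""
        live = sum(length for _, length in self.index.values())
        return len(self.log) - live
```

Rewriting a file just appends a new copy and repoints the index, so stale versions pile up in the log. A real file system would need some form of cleaning to reclaim that space, which this sketch leaves out.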

In their report and at the conference, Garth presented some benchmark results for a big data application called Terasort (essentially Teragen, Terasort and Teravalidate) which seems to use Hadoop to sort a large body of data. Not sure I can replicate this information here but suffice it to say at the moment the emulated SMR device with ShingledFS did not beat a base EXT3 or FUSE using the same hardware for these applications.

Now the CMU project was done by a bunch of smart researchers but it’s still relatively new and not necessarily that optimized. Thus, there’s probably some room for improvement in the ShingledFS and maybe even the emulated SMR device and/or the SMR device driver.

At the moment Garth and his team seem to believe that SMR devices are certainly feasible and would take only modest changes to the SCSI protocols to support such devices. As for file system support there is plenty of history surrounding log structured file systems so these are certainly doable but would probably require extensive development to be implemented in the various OSs to support an SMR device. The device driver changes don’t seem to be as significant.

~~~~

It certainly looks like there are going to be SMR devices in our future. It’s just a question of whether they will ever be as widely supported as the randomly accessed disk device we know and love today. Possibly, this could all be behind a storage subsystem that makes the technology available as networked storage capacity and over time maybe SMR devices could be implemented in more standard OS device drivers and file systems. Nevertheless, to keep capacity and areal density on their current growth trajectory, SMR disks are coming; it’s just a matter of time.

Comments?

ReRAM to the rescue

Posted on May 15, 2012 by Ray in Data density, Energy efficiency, SSD storage, Strategic Inflection Points

I was at the Solid State Storage Symposium a couple of weeks ago where Robin Harris (StorageMojo) gave the keynote presentation. In his talk, Robin mentioned a new technology on the horizon which holds the promise of replacing DRAM, SRAM and NAND called resistive random access memory (ReRAM or RRAM).

If so, ReRAM will enter the technological race pitting MRAM, Graphene Flash, PCM and racetrack memory against each other as follow-ons for NAND technology. But none of these have any intention of replacing DRAM.

Problems with NAND

There are a few problems with NAND today but the main problem that affects future NAND technologies is that as devices shrink they lose endurance. For instance, today’s SLC NAND technology has an endurance of ~100K P/E (program/erase) cycles, MLC NAND can endure around 5000 P/E cycles and eMLC somewhere in between. Newly emerging TLC (three bits/cell) has even less endurance than MLC.

But that’s all at 30nm or larger. The belief is that as NAND feature size shrinks below 20nm its endurance will get much worse, perhaps orders of magnitude worse.

While MLC may be ok for enterprise storage today, much less than 5000 P/E cycles could become a problem and would require ever more sophistication in order to work around these limitations. This is why most enterprise class, MLC NAND based storage uses specialized algorithms and NAND controller functionality to support storage reliability and durability.

ReRAM solves NAND, DRAM and NvRAM problems.

Enter ReRAM. It has the potential to be faster than PCM-RAM, has smaller features than MRAM, which means more bits per square inch, and uses lower voltage than racetrack memory and NAND. The other nice thing about ReRAM is that it seems readily scalable to below 30nm feature geometries. Also, as it’s a static memory it doesn’t have to be refreshed like DRAM and thus uses less power.

In addition, it appears that ReRAM is much more flexible than NAND or DRAM and can be designed and/or tailored to support different memory requirements. Thus, one ReRAM design can be focused on standard DRAM applications while another ReRAM design can be targeted at mass storage or solid state drives (SSD).

On the negative side there are still some problems with ReRAM, namely the large “sneak parasitic current” [whatever that is] that impacts adjacent bit cells and drains power. There are a few solutions to this problem but none yet completely satisfactory.

But it’s a ways out, isn’t it?

No it’s not. BBC and Tech-On reported that Panasonic will start sampling devices soon and plan to reach volume manufacturing next year. Elpida-Sharp and HP-Hynix are also at work on ReRAM (or memristor) devices and expect to ship sometime in 2013. But for the moment it appears that Panasonic is ahead of the pack.

At first, these devices will likely emerge in low power applications but as vendors ramp up development and mass production it’s unclear where it will ultimately end up.

The allure of ReRAM technology is significant in that it holds out the promise of replacing both RAM and NAND used in consumer devices as well as IT equipment with the same single technology. If you consider that the combined current market for DRAM and NAND is over $50B, people start to notice.

~~~~

Whether ReRAM will meet all of its objectives is yet TBD. But we seldom see any one technology which has this high a potential. The one remaining question is why everybody else isn’t going after ReRAM as well, like Samsung, Toshiba and Intel-Micron.

I have to thank StorageMojo and the Solid State Storage Symposium team for bringing ReRAM to my attention.

[Update] @storagezilla (Mark Twomey) said that “… Micron’s aquisition of Elpida gives them a play there.”

Wasn’t aware of that but yes they are definitely in the hunt now.

Comments?

Image: Memristor by Luke Kilpatrick

Gamma ray optics promise nuclear waste mitigation

Posted on May 10, 2012 by Ray in Data density, System effectiveness

Scientists report (see AAAS report, Wired article or actual research) that they are now able to refract or focus gamma rays. Contrary to theory, they have discovered that gamma rays can be deflected by the nucleus of a silicon atom.

Down a bit in the article they said that the mystery mechanism deflecting gamma rays seems to be the creation of “virtual” electron and anti-electron pairs in the nucleus. The deflection is something like ~1.000000001 (presumably an index of refraction), not much yet, but the belief is that even heavier elements such as gold will refract gamma rays even better.

Gamma-ray and gamma ray bursts are typically evidence of extremely energetic explosions witnessed in distant galaxies. They are the most luminous electromagnetic events in the universe. Most gamma ray bursts are released during supernova explosions when a star violently collapses.

But what can you do with Gamma ray optics?

The possibility of gamma ray optical systems introduces a whole new way of looking at the universe. For example, the introduction of x-rays in the early 1900s created an entirely new way to see inside the human body, never before possible. It’s unclear what gamma ray optics or a G-ray machine will do for medicine or human health but it’s certain that such devices will be better able to “see” processes and objects impossible to detect today.

One item of interest was the promise that someday, gamma ray optics will be able to render harmless radioactive isotopes such as nuclear waste. Somehow a focused gamma ray beam at the proper (neutron binding energy) wavelength could be used to “evaporate” or remove neutrons from an atomic nucleus and by doing so render it less lethal. How this works on kilograms of material versus a single atom is another question.

Also, gamma ray optics could be used in the future to potentially create designer radioactive isotopes for medical diagnostics and therapy. Even higher resolution nuclear spectroscopy is envisioned by using gamma ray optics.

~~~~

I don’t know about nuclear waste, but if gamma ray optics could transmute lead into gold, we might have something. This probably means that someday, gamma ray optics will be able to store information in an atomic nucleus and that would certainly take data density out of the magnetic domain altogether.

Image: Tycho’s Star Shines in Gamma Rays

A “few exabytes-a-day” from SKA

Posted on April 5, 2012 (updated April 10, 2012) by Ray in Data density, data logistics, Networking, Optical networking, Storage, Tape storage

A number of radio telescopes, positioned close together pointed at a cloudy sky — VLA by C. G. P. Grey (cc) (from Flickr)

ArsTechnica reported today on the proposed Square Kilometer Array (SKA) radio telescope and its data requirements. IBM is collaborating with the Netherlands Institute for Radio Astronomy (ASTRON) on a project called DOME to help develop the SKA.

When completed in ~2024, the SKA will generate over an exabyte (10**18 bytes) a day of raw data. I reported in a previous post how the world was generating an exabyte-a-day, but that was way back in 2009.

What is the SKA?

The new SKA telescope will be a configuration of “millions of radio telescopes” which when combined together will create a telescope with an aperture of one square kilometer, which is no small feat. They hope that the telescope will be able to shed some light on galaxy evolution, cosmology and dark energy. But it will go beyond that to investigating “strong-field tests of gravity”, “origins and evolution of cosmic magnetism” and search for life on other planets.

But the interesting part from a storage perspective is that the SKA will be generating a “few exabytes a day” of radio telescopic data for every full day of operation. Apparently the new radio telescopes will make use of a new, more sensitive detector able to generate data of up to 10GB/second.

How much data, really?

The team projects final storage needs at between 300 and 1500 PB per year. This compares to the LHC at CERN which consumes ~15PB of storage per year.

It would seem that the immediate data download would be the few exabytes and then it would be post- or inline-processed into something more manageable and store-able. Unless they have some hellaciously fast processing, I am hard pressed to believe this could all happen inline. But then they would need at least another “few exabytes” of storage to buffer the data feed before processing.

I guess that’s why it’s still a research project. Presumably, this also says that the telescope won’t be in full operation every day of the year, at least at first.
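To put the raw feed in perspective, here is a small Python sketch converting an exabyte a day into a sustained rate and comparing a year of raw data against the 300 to 1500 PB the team expects to keep. It assumes decimal units and, unlike my guess above, a telescope running every day of the year; the function names are just for illustration.

```python
EXABYTE = 10**18
TERABYTE = 10**12
SECONDS_PER_DAY = 86400

def raw_rate_tb_per_s(eb_per_day=1.0):
    """Sustained ingest rate implied by a daily raw data volume."""
    return eb_per_day * EXABYTE / SECONDS_PER_DAY / TERABYTE

def reduction_factor(eb_per_day, stored_pb_per_year, days_operating=365):
    """How much the raw feed must shrink to fit the projected yearly storage."""
    raw_pb_per_year = eb_per_day * 1000 * days_operating
    return raw_pb_per_year / stored_pb_per_year

if __name__ == "__main__":
    print(f"1 EB/day is about {raw_rate_tb_per_s(1.0):.1f} TB/s sustained")
    for stored_pb in (300, 1500):
        factor = reduction_factor(1.0, stored_pb)
        print(f"keeping {stored_pb} PB/year means reducing the raw feed ~{factor:,.0f}x")
```

Even at a single exabyte a day, the processing would have to shrink the raw feed by a few hundred to over a thousand times to land in the projected storage range.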

The IBM-ASTRON DOME collaboration project

The joint research project was named for the structure that covers a major telescope and for a famous Swiss mountain. Focus areas for the IBM-ASTRON DOME project include:

Advanced high performance computing utilizing 3D chip stacks for better energy efficiency
Optical interconnects with nanophotonics for high-speed data transfer
Storage for both high access performance and dense/energy efficient data storage.

In this last focus area, IBM is considering the use of phase change memories (PCM) for high access performance and new generation tape for dense/efficient storage. We have discussed PCM before in a previous post as an alternative to NAND based storage today (see Graphene Flash Memory). But IBM has also been investigating MRAM based race track memory as a potential future storage technology. I would guess the advantage of PCM over MRAM might be access speed.

As for tape, IBM has already demonstrated in their labs technologies for a 35TB tape. However, storing 1500 PB would take over 40K tapes per year so they may need even higher capacities to support SKA tape data needs.
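The tape count is easy to check with a couple of lines of Python. This assumes decimal units, uncompressed data and a single copy; the 10,000 cartridge budget in the second helper’s example is purely a hypothetical number of mine.

```python
import math

def tapes_per_year(stored_pb, tape_tb=35.0):
    """Whole cartridges needed to hold one year of stored data."""
    return math.ceil(stored_pb * 1000 / tape_tb)

def tape_tb_needed(stored_pb, max_tapes):
    """Cartridge capacity required to stay within a given cartridge budget."""
    return stored_pb * 1000 / max_tapes

if __name__ == "__main__":
    for stored_pb in (300, 1500):
        print(f"{stored_pb} PB/year on 35TB tape: {tapes_per_year(stored_pb):,} cartridges")
    # Hypothetical budget of 10,000 cartridges per year
    print(f"1500 PB in 10,000 cartridges needs {tape_tb_needed(1500, 10000):.0f}TB tapes")
```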

Of course new optical interconnects will be needed to move this much data around from telescope to data center and beyond. It’s likely that the nanophotonics will play some part as an all optical network for transceivers, amplifiers, and other networking switching gear.

The 3D chip stacks have the advantage of decreasing chip IO and more dense packing of components will make efficient use of board space. But how these help with energy efficiency is another question. The team projects very high energy and cooling requirements for their exascale high performance computing complex.

If this is anything like CERN, datasets gathered onsite are initially processed then replicated for finer processing elsewhere (see 15PB a year created by CERN post). But moving PBs around the way SKA will require is way beyond today’s Internet infrastructure.

~~~~

Big science like this gives a whole new meaning to BIGData. Glad I am in the storage business. Now just what exactly is nanophotonics, MEMS based photo-electronics?

What to do with 36TB on my Mac?

Posted on March 16, 2012 by Ray in Data density, Disk storage, File Storage, Strategic Inflection Points

(Back of) Western Digital's Thunderbolt Duo (from their website)

Western Digital (WD) just released their new My Book Thunderbolt Duo the other day and it features two 2TB or two 3TB disks, and of course you can daisy chain up to 6 of these together just in case, for up to 36TB on a Mac.

I have been happy with my desktop storage which has been running about 80% full. Plus I have a 1TB time machine external drive for online backups which I use more than I care to admit. But what the heck am I going to do with 36TB?

Enter Apple TV

Well, now that the new Apple TV is out and it supports 1080p video, that problem might be solved. I am starting to think of transferring my entire DVD/BluRay collection to digital format and loading it all on iTunes. That way I could use AirPlay and Apple TV to play it to a TV.

This is where the 6 to 36TB of storage could come in handy, especially if I wasn’t interested in streaming everything off of iCloud and wanted a local iTunes repository onsite for all my videos.

Digital video for the iPad

Today, I don’t have a lot of videos on my desktop, mostly ones I wanted to view on my iPad, so they are highly compressed and only take up about 1GB per video (Handbrake encoded from DVDs).

I am thinking the new 1080p iTunes encoded videos would take up more space, at least 4-5GB per video, but would still be considerably smaller than the 9GB for a DVD and ~36GB for a BluRay high definition video.
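To put those per-video sizes in perspective, here is a minimal sketch of how many titles a fully daisy chained 36TB setup might hold at each encoding. It takes the upper end (5GB) of my 1080p guess, assumes decimal units, and ignores file system overhead; the function name is illustrative only.

```python
def titles_that_fit(capacity_tb: int, gb_per_title: int) -> int:
    """Whole titles that fit, assuming 1 TB = 1000 GB and no overhead."""
    return capacity_tb * 1000 // gb_per_title

# Sizes per title (GB) taken from the estimates in the text
sizes_gb = {
    "iPad (Handbrake)": 1,
    "1080p iTunes": 5,
    "DVD": 9,
    "BluRay": 36,
}

for label, gb in sizes_gb.items():
    print(f"{label:>16}: {titles_that_fit(36, gb):>6} titles in 36TB")
```

Even at full BluRay size, 36TB works out to about a thousand titles, which is far more than my collection.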

Given current storage I could probably handle converting my current iPad videos over to the 1080p version (if I actually owned them in hi-def) but if I wanted to put the rest of my video library on my desktop I don’t have enough space.

Bulk storage meet the Mac

Then WD came out with their new Thunderbolt Duo drives. They seem to have it all: Thunderbolt I/O at 10Gbps, with all the storage I could possibly need. Presumably the 2 or 3TB drives are 5400 or 7200 RPM SATA 3.0 drives. But they are user swappable, so they could conceivably be changed out for whatever comes out next, but probably in pairs.

Of course with SATA 3.0 they can only go 6Gbps to the disks, but it’s not a bad match to have 2 drives per single bi-directional Thunderbolt channel. Although whether 6 of these daisy chained on a single Thunderbolt cable would generate decent performance is another question. Then again, how much performance can one Mac use?
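For a feel of how the link rates compare, here is a minimal sketch using only the nominal figures above (6Gbps per SATA 3.0 drive, 10Gbps per Thunderbolt channel, 2 drives per Duo). These are interface rates, not sustained disk throughput, which for spinning drives is typically well below the SATA link rate, so treat the ratio as a worst case ceiling rather than a prediction.

```python
SATA3_GBPS = 6          # nominal SATA 3.0 link rate per drive
THUNDERBOLT_GBPS = 10   # nominal rate of one Thunderbolt channel
DRIVES_PER_DUO = 2

def oversubscription(enclosures: int) -> float:
    """Aggregate SATA link rate divided by one Thunderbolt channel's rate."""
    return enclosures * DRIVES_PER_DUO * SATA3_GBPS / THUNDERBOLT_GBPS

for n in (1, 6):
    print(f"{n} Duo(s): {oversubscription(n):.1f}x the Thunderbolt channel")
```

A single Duo is close to a match at the link level, while six chained together could only saturate the cable if every drive were streaming at full interface speed at once.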

I suppose my next steps are to upgrade my Mac to hardware that supports Thunderbolt, get Apple TV, buy a Duo drive or two and then start encoding my DVD/BluRay library.

But that’s too logical. Instead, maybe I’ll just get Apple TV and give iCloud a try, at least for a while, and save the WD Duo for the next evolution. Maybe by then WD will have come out with their 4TB drives, providing 8TB per Duo.

Comments?

12 atoms per bit vs 35 bits per electron

Posted on January 17, 2012 by Ray in Data density, R&D measures, Storage, Storage density

Shows 6 atom pairs in a row, with coloration of blue for interstitial space and yellow for external facets of the atom — from Technology Review Article

Read a story today in Technology Review on Magnetic Memory Miniaturized to Just 12 Atoms by a team at IBM Research that created a (spin) magnetic “storage device” that used 12 iron atoms to record a single bit (near absolute zero and just for a few hours). The article said it was about 100X denser than the previous magnetic storage record.

Holographic storage beats that

Wikipedia’s (soon to go dark for 24hrs) article on Memory Storage Density mentioned research at Stanford that in 2009 created an electronic quantum holographic device that stored 35 bits/electron using a sheet of copper atoms to record the letters S and U.

The Wikipedia article went on to equate 35 bits/electron to ~3 Exabytes [10**18 bytes]/in**2. (Although how Wikipedia was able to convert from bits/electron to EB/in**2 I don’t know, I’ll accept it as a given.)

Now an iron atom has 26 electrons and copper has 29 electrons. If 35 bits/electron is 3 EB/in**2 (or ~30Eb/in**2), then 1 bit per 12 iron atoms (or 12*26=312 electrons) should be 0.0032bits/electron or ~275TB/in**2 (or ~2.8Pb/in**2). Not quite to the scale of the holographic device but interesting nonetheless.
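The scaling above can be sketched in a few lines. This assumes, as I did, that areal density scales linearly with bits per electron, which is a back of the envelope assumption and not something the Wikipedia article or the IBM paper states; the constant names are just for illustration.

```python
ELECTRONS_PER_IRON_ATOM = 26
ATOMS_PER_BIT = 12               # IBM's 12 iron atom bit
HOLO_BITS_PER_ELECTRON = 35      # Stanford holographic result
HOLO_EB_PER_SQIN = 3             # Wikipedia's quoted equivalent

# 1 bit spread over 12 * 26 = 312 electrons
magnetic_bits_per_electron = 1 / (ATOMS_PER_BIT * ELECTRONS_PER_IRON_ATOM)

# How many times denser the holographic result is, per electron
ratio = HOLO_BITS_PER_ELECTRON / magnetic_bits_per_electron

# Assumed linear scaling: divide the holographic areal density by the ratio
magnetic_tb_per_sqin = HOLO_EB_PER_SQIN * 1_000_000 / ratio

print(f"{magnetic_bits_per_electron:.4f} bits/electron")
print(f"holographic is ~{ratio:,.0f}X denser")
print(f"~{magnetic_tb_per_sqin:.0f} TB/in**2 for 12 atoms per bit")
```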

What can that do for my desktop?

Given that today’s recording head/media has demonstrated ~3.3Tb/in**2 (see our Disk drive density multiplying by 6X post), the 12 atoms per bit is a significant advance for (spin) magnetic storage.

With today’s disk industry shipping 1TB/disk platters using ~0.6Tb/in**2 (see our Disk capacity growing out of sight post), these technologies, if implemented in a disk form factor, could store from 4.6PB to 50EB in a 3.5″ form factor storage device.
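Those capacities come from simple proportional scaling of today's 1TB platter. A minimal sketch, assuming capacity grows linearly with areal density and using the rounded densities from above (~2.8Pb/in**2 and ~30Eb/in**2), so the low end lands slightly above the 4.6PB figure; the function name is illustrative.

```python
def scaled_capacity_tb(base_tb: float, base_tbit_sqin: float,
                       new_tbit_sqin: float) -> float:
    """Scale a platter's capacity linearly with areal density (Tb/in**2)."""
    return base_tb * new_tbit_sqin / base_tbit_sqin

BASE_TB = 1.0          # today's 1TB platter
BASE_DENSITY = 0.6     # Tb/in**2 shipping today

magnetic_tb = scaled_capacity_tb(BASE_TB, BASE_DENSITY, 2.8e3)   # 2.8 Pb/in**2
holographic_tb = scaled_capacity_tb(BASE_TB, BASE_DENSITY, 30e6)  # 30 Eb/in**2

print(f"12 atom magnetic: ~{magnetic_tb / 1e3:.1f} PB")
print(f"holographic:      ~{holographic_tb / 1e6:.0f} EB")
```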

So there is a limit to (spin) magnetic storage, and its density is about 11000X lower than that of holographic storage. Once again holographic storage proves it can store significantly more data than magnetic storage, if only it could be commercialized. (Probably a subject to cover in a future post.)

~~~~

I don’t know about you, but a 4.6PB drive is probably more than enough storage for my lifetime and then some. But then again, those new 4K High Definition videos may take up a lot more space than my (low definition) DVD collection.

Comments?