Yottabyte – Silverton Consulting

NSA’s huge (YBs) new data center to turn on in 2013

Posted on March 20, 2012 by Ray in Data index, data protection, Data security, Distributed computing, Information economy, Storage density, Strategic Inflection Points, System effectiveness

Ran across a story in Wired about the new NSA Utah data center today which is scheduled to be operational in September of 2013.

This new data center is intended to house copies of all communications intercepted by the NSA. We have talked about this data center before and how it’s going to store YB of data (see my Yottabytes by 2015?! post).

One major problem with having a YB of communications intercepts is that you need to have multiple copies of it for protection in case of human or technical error.

Apparently, NSA has a secondary data center in San Antonio to back up its Utah facility. That’s one copy. We also wrote another post on protecting and indexing all this data (see my Protecting the Yottabyte Archive post).

NSA data centers

The Utah facility has enough fuel onsite to power and cool the data center for 3 days. They have a special power station to supply the 65MW of power needed. They have two side by side raised floor halls for servers, storage and switches, each with 25K square feet of floor space. That doesn’t include another 900K square feet of technical support and office space to secure and manage the data center.

In order to help collect and temporarily store all this information, the agency has apparently been undergoing a data center building boom, renovating and expanding its data centers throughout the states. The article discusses some of the other NSA information collection points/data centers, in Texas, Colorado, Georgia, Hawaii, Tennessee, and of course, Maryland.

New NSA super computers

In addition to the communication intercept storage, the article also talks about a special purpose, decrypting super computer that NSA has developed over the past decade which will also be housed in the Utah data center. The NSA seems to have created a super powerful computer that dwarfs today’s best Cray XT5 super computer clusters, which operate at 1.75 petaflops.

I suppose what with all the encrypted traffic now being generated, NSA would need some way to decrypt this information in order to understand it. I was under the impression that they were interested in the non-encrypted communications, but I guess NSA is even more interested in any encrypted traffic.

Decrypting old data

With all this data being stored, the thought is that the data now encrypted with unbreakable AES-128, -192 or -256 encryption will eventually become decipherable. At that time, foreign government and other secret communications will all be readable.

By storing these secret communications now, they can scan this treasure trove for patterns that eventually occur and, once found, such patterns will ultimately lead to decrypting the data. Now we know why they need YB of storage.

So NSA will at least know what was going on in the past. However, how soon they can move that up to do real time decryption of communications today is another question. But knowing the past may help in understanding what’s going on today.

~~~~

So be careful what you say today even if it’s encrypted. Someone (NSA and its peers around the world) will probably be listening in and someday soon, will understand every word that’s been said.

Comments?

Protecting the Yottabyte archive

Posted on November 11, 2009 (updated January 27, 2010) by Ray in Data index, Networking, Storage, Storage Backup, Storage density, Storage performance, Systems

blinkenlights by habi (cc) (from flickr)

In a previous post I discussed what it would take to store 1YB of data in 2015 for the National Security Agency (NSA). Due to length, that post did not discuss many other aspects of the 1YB archive such as ingest, index, data protection, etc. Thus, I will attempt to cover each of these in turn and as such, this post will cover some of the data protection aspects of the 1YB archive and its catalog/index.

RAID protecting 1YB of data

Protecting the 1YB archive will require some sort of parity protection. RAID data protection could certainly be used and may need to be extended to removable media (RAID for tape), but that would require somewhere in the neighborhood of 10-20% additional storage (RAID5 across 10 to 5 tape drives). It’s possible with Reed-Solomon encoding and using RAID6 that we could take this down to 5-10% of additional storage (RAID 6 for a 40 to a 20 wide tape drive stripe). Possibly other forms of ECC (such as turbo codes) might be usable in a RAID like configuration which would give even better reliability with less additional storage.
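The percentages above follow from dividing the number of parity drives by the stripe width. Here is a minimal sketch of that arithmetic, assuming the quoted stripe widths include the parity drives; the function name `parity_overhead` is made up for illustration.

```python
def parity_overhead(stripe_width: int, parity_drives: int) -> float:
    """Fraction of the drives in a stripe that hold parity rather than data."""
    if not 0 < parity_drives < stripe_width:
        raise ValueError("need 0 < parity_drives < stripe_width")
    return parity_drives / stripe_width


# RAID5 (one parity drive) across 10-wide and 5-wide tape stripes.
raid5 = [parity_overhead(width, 1) for width in (10, 5)]
# RAID6 (two parity drives) across 40-wide and 20-wide tape stripes.
raid6 = [parity_overhead(width, 2) for width in (40, 20)]

print("RAID5 overhead:", raid5)
print("RAID6 overhead:", raid6)
```

If the overhead is instead measured against data drives only, the figures come out slightly higher (1/9 rather than 1/10 for the 10-wide RAID5 stripe), which does not change the conclusion.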

But RAID like protection also applies to the data catalog and indexes required to access the 1YB archive of data. Ditto for the online data itself while it’s being ingested, indexed, or read back. For the remainder of this post I ignore the RAID overhead, but suffice it to say that an additional 10% of storage for parity will not change this discussion much.

Also in the original post I envisioned a multi-tier storage hierarchy but the lowest tier always held a copy of any files residing in the upper tiers. This would provide some RAID1 like redundancy for any online data. This might be pretty useful, i.e., if a file is of high interest, it could have been accessed recently and therefore resides in upper storage tiers. As such, multiple copies of interesting files could exist.

Catalog and indexes backups for 1YB archive

IMHO, RAID or other parity protection is different than data backup. Data backup is generally used as a last line of defense for hardware failure, software failure or user error (deleting the wrong data). It’s certainly possible that the lowest tier data is stored on some sort of WORM (write once read many times) media meaning it cannot be overwritten, eliminating one class of user error.

But this presumes the catalog is available and the media is locatable, which means the catalog has to be preserved/protected from user error, HW and SW failures. I wrote about whether cloud storage needs backup in a prior post and feel strongly that the 1YB archive would require backups as well.

In general, backup today is done by copying the data to some other storage and keeping that storage offsite from the original data center. At this amount of data, most likely the 2.1×10**21 bytes of catalog (see original post) and index data would be copied to some form of removable media. The catalog is most important as the other two indexes could potentially be rebuilt from the catalog and original data. Assuming we are unwilling to reindex the data, with LTO-6 tape cartridges, the catalog and index backups would take 1.3×10**9 LTO-6 cartridges (at 1.6×10**12 bytes/cartridge).

To back up this amount of data once per month would take a gaggle of tape drives. There are ~2.6×10**6 seconds/month and each LTO-6 drive can transfer 5.4×10**8 bytes/sec or 1.4×10**15 bytes/drive-month, but we need to back up 2.1×10**21 bytes of data, so we need ~1.5×10**6 tape transports. Now tapes do not operate 100% of the time because when a cartridge becomes full it has to be changed out with an empty one, but this amounts to a rounding error at these numbers.

To figure out the tape robotics needed to service 1.5×10**6 transports we could use the latest T-Finity tape library just announced by Spectra Logic. The T-Finity supports 500 tape drives and 122,000 tape cartridges, so we would need 3.0×10**3 libraries to handle the drive workload and about 1.1×10**4 libraries to store the cartridge set required, so 11,000 T-Finity libraries would suffice. Presumably, using LTO-7 these numbers could be cut in half: ~5,500 libraries, ~7.5×10**5 transports, and 6.6×10**8 cartridges.
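The tape arithmetic above can be reproduced with a short script. This is only a sketch that plugs in the figures quoted in the post (LTO-6 capacity and transfer rate, T-Finity drive and slot counts); the helper name `tape_backup_plan` and the 30-day month are assumptions made for illustration.

```python
import math

# Figures as quoted in the post.
BACKUP_BYTES = 2.1e21               # catalog plus the two indexes
CARTRIDGE_BYTES = 1.6e12            # LTO-6 capacity per cartridge
DRIVE_BYTES_PER_SEC = 5.4e8         # LTO-6 transfer rate per drive
SECONDS_PER_MONTH = 30 * 24 * 3600  # about 2.6e6, assuming a 30-day month
LIBRARY_DRIVES = 500                # drives per T-Finity library
LIBRARY_SLOTS = 122_000             # cartridges per T-Finity library


def tape_backup_plan(backup_bytes: float = BACKUP_BYTES) -> dict:
    """Size a once-a-month full backup to tape (hypothetical helper)."""
    cartridges = backup_bytes / CARTRIDGE_BYTES
    drives = backup_bytes / (DRIVE_BYTES_PER_SEC * SECONDS_PER_MONTH)
    libs_for_drives = math.ceil(drives / LIBRARY_DRIVES)
    libs_for_slots = math.ceil(cartridges / LIBRARY_SLOTS)
    return {
        "cartridges": cartridges,
        "drives": drives,
        "libraries_for_drives": libs_for_drives,
        "libraries_for_slots": libs_for_slots,
        # Whichever constraint needs more libraries sets the total.
        "libraries": max(libs_for_drives, libs_for_slots),
    }


plan = tape_backup_plan()
for name, value in plan.items():
    print(f"{name}: {value:.2e}")
```

Cartridge slots, not drive bays, turn out to be the binding constraint, which is why the library count tracks the cartridge total rather than the transport total.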

Other removable media exist, most notably the Prostor RDX. However, RDX roadmap info out to the next generation is not readily available and high-end robotics do not currently support RDX. So for the moment tape seems the only viable removable backup for the catalog and index for the 1YB archive.

Mirroring the data

Another approach to protecting the data is to mirror the catalog and index data. This involves taking the data and copying it to another online storage repository. This doubles the storage required (to 4.2×10**21 bytes of storage). Replication doesn’t easily protect from user error but is an option worthy of consideration.

Networking infrastructure needed

Whether mirroring or backing up to tape, moving this amount of data will require substantial networking infrastructure. Assume that in 2015 we have 32GFC (32 Gb/sec Fibre Channel) interfaces, each of which could potentially transfer 3.2GB/s or 3.2×10**9 bytes/sec. Over a ~2.6×10**6 second month that is ~8.3×10**15 bytes per interface, so mirroring or backing up 2.1×10**21 bytes over one month will take ~2.5×10**5 32GFC interfaces. We should probably have twice this amount of networking so that no one interface becomes a bottleneck, so 5×10**5 32GFC interfaces should work.

As for switches, the current Brocade DCX supports 768 8GFC ports and presumably similar port counts will be available in 2015 to support 32GFC. In addition, if we assume at least 2 ports per link, we will need ~650 fully populated DCX switches. This doesn’t account for multi-layer switches and other sophisticated switch topologies, but those could be accommodated with another factor of 2, or ~1,300 switches.
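As a quick check of the interface and switch counts, the division can be redone in a few lines. This is a sketch under the post's own assumptions (3.2×10**9 bytes/sec per 32GFC interface, 768 ports per switch, a month to move the data); the 30-day month and the variable names are illustrative choices.

```python
import math

BACKUP_BYTES = 2.1e21               # catalog plus indexes, from the text
FC32_BYTES_PER_SEC = 3.2e9          # assumed usable rate of one 32GFC interface
SECONDS_PER_MONTH = 30 * 24 * 3600  # about 2.6e6, assuming a 30-day month
PORTS_PER_SWITCH = 768              # Brocade DCX port count quoted in the text
HEADROOM = 2                        # doubling so no one interface is a bottleneck

# Bytes a single interface can move if it runs flat out for a month.
bytes_per_interface_month = FC32_BYTES_PER_SEC * SECONDS_PER_MONTH

interfaces = BACKUP_BYTES / bytes_per_interface_month
interfaces_with_headroom = HEADROOM * interfaces
switches = math.ceil(interfaces_with_headroom / PORTS_PER_SWITCH)
switches_multilayer = 2 * switches  # extra factor of 2 for multi-layer topologies

print(f"bytes per interface-month: {bytes_per_interface_month:.2e}")
print(f"interfaces needed:         {interfaces:.2e}")
print(f"with headroom:             {interfaces_with_headroom:.2e}")
print(f"fully populated switches:  {switches}")
print(f"with multi-layer factor:   {switches_multilayer}")
```

With these inputs the division gives roughly 2.5×10**5 interfaces before headroom. Small differences from rounded figures come from carrying the unrounded month length through the calculation.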

Hot backups require journals

This all assumes we can do catalog and index backups once per month and take the whole month to do them. Now storage today normally has to be taken offline (via snapshot or some other mechanism) to be backed up in a consistent state. While it’s not impossible to back up data that is concurrently being updated, it is more difficult. In this case, one needs to maintain a journal file of the updates going on while the data is being backed up and be able to apply the journaled changes to the data backed up.

For the moment I am not going to determine the storage requirements for the journal file required to cover the catalog transactions for a month, but this is dependent on the change rate of the catalog data. So it will necessarily be a function of the index or ingest rate of the 1YB archive to be covered in a future post.

Stay tuned, I am just having too much fun to stop.

Yottabytes by 2015!?

Posted on November 3, 2009 (updated January 27, 2010) by Ray in Data index, File Storage, Storage, Storage density, Storage performance, Storage reliability, Systems

Well, maybe an Exabyte a day was way too small for 2009. NSA is now reporting that they may be storing yottabytes (YB, 10**24) of data by 2015 somewhere in Utah. Later reports have NSA reducing this down to something closer to 1000 PB or so but YB of storage got me thinking.

This points out a couple of issues:

How is the NSA going to store all this data?
How will the NSA be able to retrieve anything in this amount of data?
The storage industry must come up with a new term that applies to 10**27 bytes of storage.

As a first stab at this I would suggest NONABYTE (nona- is Latin for nine, (y)otta- is Italian for eight). In a similar way, perhaps we could use DECEMABYTE for 10**30 and UNDECEMABYTE for 10**33. That should last us for a couple of years.

Storing a yottabyte of data is no small matter. 10 to 100 Petabytes (PB, 10**15 bytes) of data can be dealt with today with a number of storage systems both cloud and non-cloud. Many cloud providers claim PB of storage under their environment so this is entirely feasible today.

Exabytes (XB, 10**18 bytes) would seem to require an offline archive of data. Of course, somebody could conceivably build such an online storage complex (see below for how). Testing such a system might only be possible during implementation but that would not be unusual for such leading edge projects.

Zettabytes (ZB, 10**21 bytes) seem outside the realm of possibility today, being a million PB of storage. But offline archives could conceivably be built even for this amount of storage. It’s conceivable that online storage of an XB of data could be used to support offline storage of a ZB of data.

1 YB of data in perspective

Yottabytes of data seem extremely large. If a minute of standard definition digital video takes ~1 GB of storage, a yottabyte would be about 10**15 minutes of video.

A minute of MP3 audio (as in a phone conversation) takes roughly a MB of storage, so 1 YB would be about 10**18 minutes of conversation. Realize there are only ~6×10**9 people on the planet. So this is enough storage for ~100 million (10**8) minutes of conversation from everyone on the planet. Seems like a lot, but who am I to judge.

Also realize there are only ~5×10**5 minutes/year, so 10**24 bytes would be enough storage to record everything everybody said over ~333 years (10**6 bytes/minute × 6×10**9 people on earth × 5×10**5 minutes per year = 3×10**21 bytes required to store one year of everyone talking for the whole year). Also, people sleep, don’t talk 100% of their waking time, and most conversations are between two people, so this is very conservative.
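For anyone who wants to play with these numbers, here is a small sketch of the same back-of-envelope calculation. It uses the rounded inputs from the text (1 MB per minute of audio, 6×10**9 people, 5×10**5 minutes per year); the variable names are illustrative only.

```python
YOTTABYTE = 10**24              # bytes
AUDIO_BYTES_PER_MINUTE = 10**6  # roughly 1 MB per minute of MP3 audio
PEOPLE = 6 * 10**9              # world population figure used in the text
MINUTES_PER_YEAR = 5 * 10**5    # rounded; 365 * 24 * 60 is 525,600

# How many minutes of audio fit in 1 YB, in total and per person.
minutes_of_audio = YOTTABYTE // AUDIO_BYTES_PER_MINUTE
minutes_per_person = minutes_of_audio / PEOPLE

# Bytes needed if everyone on the planet talked nonstop for a year.
bytes_per_year = AUDIO_BYTES_PER_MINUTE * PEOPLE * MINUTES_PER_YEAR
years_of_everyone_talking = YOTTABYTE / bytes_per_year

print(f"minutes of audio in 1 YB: {minutes_of_audio:.1e}")
print(f"minutes per person:       {minutes_per_person:.2e}")
print(f"bytes per year, all talk: {bytes_per_year:.1e}")
print(f"years covered by 1 YB:    {years_of_everyone_talking:.0f}")
```

Swapping in the unrounded 525,600 minutes per year lowers the answer to a bit under 320 years, so the order of magnitude holds either way.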

1 YB of data at rest

How to construct such a 1 YB archive poses many challenges. One would have to consider a multi-tier/level storage hierarchy made up of both removable and online storage.

Tape or other removable media would be an obvious choice for at least the lowest tier of storage, but keeping track of 1.6×10**11 tape volumes (LTO-7 will maybe support 6.4TB, or 6.4×10**12 bytes, per cartridge) seems outside today’s capabilities.

Similar quantities of disk drives would be required to store 1 YB of data but nobody would consider storing all this online. Consider that only 5.4×10**8 disk drives were shipped in 2008 and it becomes obvious that large portions of the 1YB archive must be offline. Deduplication would help but audio and video don’t dedupe well.

But that’s nothing, try keeping track of the 10**18 to 10**20 files (assuming 10**6 bytes per file for audio down to 10**4 bytes per file for text).
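The volume and file counts are simple divisions of 1 YB by a unit size. Here is a rough sketch, taking the speculative 6.4 TB LTO-7 cartridge as 6.4×10**12 bytes and the per-file sizes from the text; all names are illustrative.

```python
YOTTABYTE = 10**24                  # bytes
LTO7_CARTRIDGE_BYTES = 64 * 10**11  # 6.4 TB, the speculative LTO-7 capacity
AUDIO_FILE_BYTES = 10**6            # assumed size of one audio file
TEXT_FILE_BYTES = 10**4             # assumed size of one text file

# Cartridges needed to hold the whole archive on tape.
cartridges = YOTTABYTE / LTO7_CARTRIDGE_BYTES

# Range of file counts, from all-audio to all-text.
audio_files = YOTTABYTE // AUDIO_FILE_BYTES
text_files = YOTTABYTE // TEXT_FILE_BYTES

print(f"LTO-7 cartridges for 1 YB: {cartridges:.2e}")
print(f"files if all audio:        {audio_files:.0e}")
print(f"files if all text:         {text_files:.0e}")
```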

I think this calls for an object store of some type. 10**6 objects are feasible today; scaling up to 10**18 through 10**20 would be a significant leap, but not outside technology available 5 (or maybe 10) years hence.

Next one must consider the catalog for such a storage complex. Let’s assume these are conversations and use the 10**18 number, and just keeping 100 bytes of metadata per file, the catalog would take 10**20 bytes of storage. Of course, 100 bytes seems pretty limiting to record all the important data about a conversation or even a text file, so 1000 bytes seems more realistic. Thus, we would need 10**21 bytes of storage just for the catalog. It seems even portions of the catalog would need to be offline to be realistically stored. This would not be optimal but would accommodate a rudimentary listing of the 10**18 element catalog as a last resort.

Searching 1 YB of data

NSA would probably want at least to search the catalog for items of interest, like a person’s name, a phone number, or maybe even time of call. Indexes take anywhere from 20 to 100% of the data being searched. Let’s say with great people working on the project they can get the catalog index down to 10% of the storage being searched. So there is yet another 10**20 bytes of data for the catalog to be searchable. Now we would want the majority of this to be online and directly accessible but even this is 100,000 PB of data. Way beyond today’s capabilities for online accessible storage.

Of course, it’s possible that the agency might want to search the contents of the conversation for items of interest such as words used. Any content index would take vastly more storage than a simple catalog index but maybe this could be shrunk down to only 100% of the catalog size or 10**21 bytes of storage. Again, 1,000,000 PB of data is unlikely to be kept online in total.

I am beginning to see how NSA and Mitre may have come up with the YB figure: 10**20 for an index, 10**21 for a catalog, and another 10**21 for a vocabulary index to 10**18 conversations. Now YB of storage is starting to make sense. If you took the 10**18 conversations down say to 10**15, with a catalog of 10**18 bytes, indexes of 10**19 bytes this might be even more realistic. But, even 10**15 conversations seems a bit much for 2015.
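The catalog and index sizes for the 10**18 conversation case can be tallied as follows. This is a sketch using the post's assumptions (1000 bytes of metadata per file, a catalog index at 10% of the catalog, a content index at 100% of the catalog); the names are made up for illustration.

```python
PB = 10**15                    # bytes in a petabyte
CONVERSATIONS = 10**18         # number of files in the archive
METADATA_BYTES_PER_FILE = 1000

catalog = CONVERSATIONS * METADATA_BYTES_PER_FILE  # 10**21 bytes
catalog_index = catalog // 10                      # 10% of the catalog
content_index = catalog                            # 100% of the catalog
total = catalog + catalog_index + content_index

for name, size in [
    ("catalog", catalog),
    ("catalog index", catalog_index),
    ("content index", content_index),
    ("total", total),
]:
    print(f"{name:14s} {size:.1e} bytes = {size // PB:,} PB")
```

The total of 2.1×10**21 bytes is the figure that any backup or mirroring scheme for the catalog and indexes has to move.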

Ingesting, indexing, and protecting 1 YB of storage all pose interesting challenges of their own which I will leave for later posts.