Archive for the ‘Systems’ Category

Describing Dedupe

Deduplication is a mechanism to reduce the amount of data stored on disk for backup, archive or even primary storage.  For any storage, data is often duplicated, and any system that eliminates storing duplicate data will utilize storage more efficiently.
Essentially, deduplication systems identify duplicate data and only store one copy of such data. [...]
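As a rough illustration of "identify duplicate data and only store one copy", here is a minimal sketch of a toy in-memory store. The class name `DedupeStore`, the fixed-size chunking and the use of SHA-256 digests as chunk identifiers are all assumptions made for this example; they are not taken from the post, and real products differ widely in how they chunk and index data.

```python
import hashlib


class DedupeStore:
    """Toy content-addressed store: each unique chunk is kept only once.

    Hypothetical illustration; fixed-size chunks and SHA-256 are assumptions.
    """

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes (one copy per unique chunk)
        self.files = {}   # file name -> ordered list of chunk digests

    def put(self, name, data):
        """Split data into fixed-size chunks and store only unseen ones."""
        digests = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # setdefault keeps the existing copy if this chunk was seen before
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name):
        """Rebuild a file by concatenating its chunks in order."""
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self):
        """Bytes actually held, counting each unique chunk once."""
        return sum(len(chunk) for chunk in self.chunks.values())
```

With a 4-byte chunk size, storing `b"AAAABBBBAAAA"` and `b"BBBBCCCC"` writes 20 logical bytes but keeps only the three unique chunks, and both files can still be read back intact.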

Google vs. National Information Exchange Model

Wouldn’t the national information exchange be better served by deferring the National Information Exchange Model (NIEM) and instead implementing some sort of Google-like search of federal, state, and municipal text data records?  Most federal, state and local data resides in sophisticated databases using their information management tools, but such tools all seem to support ways [...]

Free P2P-Cloud Storage and Computing Services?

What would happen if somebody came up with a peer-to-peer cloud (P2P-Cloud) storage or computing service?  I see this as:

Operating a little like Napster/Gnutella, where many people come together and share out their storage/computing resources.
Operating in either a centralized or decentralized fashion.
Allowing access to data/computing resources from anywhere on the internet.

Everyone joining the [...]

Backup is for (E)discovery too

There has been lots of talk in the Twitterverse and elsewhere on how “backup is used for restore and archive is for e-discovery”, but I beg to differ.
If one were to take the time to review the EDRM (Electronic Discovery Reference Model) and analyze what happens during actual e-discovery processes, one would see that [...]

Problems solved, introduced and left unsolved by cloud storage

When I first heard about cloud storage I wondered just what exactly it was trying to solve. There are many storage problems within the IT shop nowadays; cloud storage can solve a few of them but introduces more and leaves a few unsolved.
Storage problems solved by cloud storage

Dynamic capacity – storage capacity is [...]

Protecting the Yottabyte archive

In a previous post I discussed what it would take to store 1YB of data in 2015 for the National Security Agency (NSA). Due to length, that post did not discuss many other aspects of the 1YB archive such as ingest, index, data protection, etc. Thus, I will attempt to cover each of these [...]

Yottabytes by 2015!?

Well, maybe an exabyte a day was way too small for 2009. NSA is now reporting that they may be storing yottabytes (YB, 10**24 bytes) of data by 2015 somewhere in Utah. Later reports have NSA reducing this to something closer to 1000 PB or so, but YB of storage got me thinking.
This points [...]

Repositioning of tape

In my past life, I worked for a dominant tape vendor. Over the years, we had heard a number of times that tape was dead. But it never happened. BTW, it’s also not happening today.
Just a couple of weeks ago, I was at SNW and a vendor friend of mine asked if I knew [...]

Today's data and the 1000 year archive

Somewhere in my basement I have card boxes dating back to the 1970s and paper tape canisters dating back to the 1960s with BASIC, 360-assembly, COBOL and PL/1 programs on them. These could be reconstructed, if needed, by reading the Hollerith encoding and typing them out into text files. Finding a compiler/assembler/interpreter to interpret [...]

Cache appliances rise from the dead

Sometime back in the late ’80s a company I once worked with had a product called the tape accelerator, which was nothing more than a RAM cache in front of a tape device to smooth out physical tape access. The tape accelerator was a popular product for its time, until most tape subsystems started [...]
