Oracle RMAN and data deduplication – part 2

Insight01C 0011 by watz (cc) (from Flickr)

I have blogged before about the poor deduplication ratios seen when using Oracle 10g RMAN compression (see my prior post), but not everyone uses compressed backupsets. So the question naturally arises as to how well non-compressed RMAN backupsets deduplicate.

RMAN backup types

Oracle 10g RMAN supports both full and incremental backups. The main potential for deduplication comes from full backups, since each one re-sends data that is mostly unchanged from the previous full. However, 10g also supports cumulative incremental backups in addition to the more usual differential incrementals. A cumulative incremental backs up all blocks changed since the last full (level 0) backup, so successive cumulative incrementals repeat many of the changes that occur between full backups, which should also lead to higher deduplication rates.
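
To make the cumulative versus differential point concrete, here is a small Python sketch. It is a toy model, not anything RMAN does internally: the block ids, the per-day change sets and the `simulate` function are all made up for illustration, and a block only counts as a duplicate if that exact version of it was already sent in an earlier incremental.

```python
def simulate(daily_changes, cumulative):
    """Return (blocks_sent, blocks_already_seen) over a run of incrementals.

    daily_changes: list of sets of block ids modified on each day
                   since the last full (level 0) backup.
    cumulative:    True  -> each backup sends every block changed since the full
                   False -> each backup sends only that day's changed blocks
    """
    version = {}                 # block id -> current version number
    changed_since_full = set()   # everything touched since the full backup
    seen = set()                 # (block id, version) pairs already sent
    sent = dup = 0

    for changed in daily_changes:
        for block in changed:
            version[block] = version.get(block, 0) + 1
        changed_since_full |= changed

        payload = changed_since_full if cumulative else changed
        for block in payload:
            key = (block, version[block])
            sent += 1
            if key in seen:      # a dedup target would already hold this data
                dup += 1
            seen.add(key)

    return sent, dup


# Hypothetical change pattern: three days of incrementals after one full.
days = [{1, 2, 3}, {3, 4}, {5}]
print("differential:", simulate(days, cumulative=False))  # (6, 0)
print("cumulative:  ", simulate(days, cumulative=True))   # (12, 6)
```

In this made-up run the differential incrementals send 6 blocks with nothing repeated, while the cumulative incrementals send 12 blocks, 6 of which a deduplicating target has already seen. Real ratios will depend entirely on your change rate and on how the appliance chunks the stream.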

RMAN multi-threading

The other issue with RMAN backups is Oracle’s ability to multi-thread or multiplex backup data. This capability was originally designed to keep tape drives busy and streaming during a backup. The problem with file multiplexing is that blocks from one file are intermixed with blocks from other files within a single backup data stream, so the stream loses file context and may deduplicate less well. Luckily, 10g RMAN file multiplexing can be disabled by setting FILESPERSET=1, which tells Oracle to put only a single file in each backupset and hence in each data stream.
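
The effect of multiplexing can be illustrated with another toy Python sketch. Everything in it is an assumption made for the example: the 64-byte "blocks", the fixed-size chunking (real appliances often use variable-size chunking), the function names, and above all the premise that the interleaving order differs from one backup run to the next. It backs up the same unchanged files twice and measures how much of the second backup matches chunks stored from the first.

```python
import hashlib
import random

BLOCK = 64          # stand-in for a database block, in bytes (assumption)
CHUNK = 4 * BLOCK   # fixed-size dedup chunk (assumption)


def make_file(seed, nblocks):
    """Build a pretend datafile as a list of pseudo-random blocks."""
    rng = random.Random(seed)
    return [bytes(rng.getrandbits(8) for _ in range(BLOCK))
            for _ in range(nblocks)]


def one_file_per_stream(files):
    """FILESPERSET=1 style: each stream holds one file's blocks, in order."""
    return [b"".join(f) for f in files]


def multiplexed_stream(files, seed):
    """Interleave blocks from all files into one stream.

    Block order within each file is preserved, but which file supplies
    the next block is picked at random, driven by `seed`.
    """
    rng = random.Random(seed)
    cursors = [0] * len(files)
    pending = [i for i, f in enumerate(files) if f]
    out = []
    while pending:
        i = rng.choice(pending)
        out.append(files[i][cursors[i]])
        cursors[i] += 1
        if cursors[i] == len(files[i]):
            pending.remove(i)
    return [b"".join(out)]


def chunk_hashes(stream):
    """Hash each fixed-size chunk of a stream."""
    return [hashlib.sha256(stream[i:i + CHUNK]).digest()
            for i in range(0, len(stream), CHUNK)]


def duplicate_fraction(first, second):
    """Fraction of the second backup's chunks already stored by the first."""
    store = {h for s in first for h in chunk_hashes(s)}
    hashes = [h for s in second for h in chunk_hashes(s)]
    return sum(h in store for h in hashes) / len(hashes)


files = [make_file(seed, 64) for seed in (1, 2, 3)]
plain = duplicate_fraction(one_file_per_stream(files),
                           one_file_per_stream(files))
mixed = duplicate_fraction(multiplexed_stream(files, 10),
                           multiplexed_stream(files, 11))
print("one file per stream:", plain)
print("multiplexed:        ", mixed)
```

With one file per stream the second backup is byte-for-byte identical to the first, so every chunk is a duplicate. With the streams multiplexed in a different order, few if any chunks line up, even though not a single block of data has changed.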

Oracle’s use of meta-data in RMAN backups also makes them more difficult to deduplicate, but some vendors provide workarounds to increase RMAN deduplication (see Quantum DXi, EMC Data Domain and others).

---

So deduplication of RMAN backups will vary depending on vendor capabilities as well as on how the admin specifies the RMAN backups. To obtain the best deduplication of RMAN backups, follow your deduplication vendor’s best practices, use periodic full and/or cumulative incremental backups, don’t use compressed backupsets, and set FILESPERSET=1.

Comments?