
SnapMirror and Deduplication

February 16, 2012

In a recent blog, I talked about the interaction between deduplication and SnapVault.  In this post I’ll discuss SnapVault’s cousin – SnapMirror.

SnapVault was designed to make efficient D2D backup copies, but SnapMirror has a different purpose – making replication copies.  Using good old Snapshot technology, SnapMirror transfers snapshots from one storage system to another, usually from the data center to an offsite disaster recovery location.

SnapVault and SnapMirror have many similarities, but there is one important item that distinguishes these two cousins: unlike SnapVault, SnapMirror relationships are peer-based and can be reversed.  In fact, when we talk about SnapMirror pairs, we don’t use the terms primary and secondary as we do with SnapVault; instead we refer to source and destination systems.  Either of the SnapMirror systems can be a source or a destination; it just depends on the direction the snapshots are moving.  Take a look at the diagram below to get a better understanding of what I mean:

I’ve used this diagram in dozens of customer briefings to point out the subtle differences between SnapVault and SnapMirror.  First of all, notice the arrows.  SnapVault’s arrows go from left to right only, but SnapMirror’s arrows travel in both directions.  Normally, the SnapMirror source system (the one on the left) controls the flow of application data to servers and clients.  However, if the source system goes down for some reason, the SnapMirror destination system (on the right) takes control, and we call this a “Failover” event.  When we bring the source system back up and return control to it, we call this a “Failback”.  In either case, Snapshot copies are passed back and forth between the systems to ensure that both the source and destination systems are synchronized to the current point in time, using the most current SnapMirror copy.

Now, let’s talk about using deduplication with SnapMirror.  There are two types of SnapMirror replication, and deduplication behaves differently with each type.

The first type is called Qtree SnapMirror, or QSM.  As the name implies, QSM performs replication at the Qtree level.  What is a Qtree?  It’s a logical subset of a NetApp volume.  Storage admins like to use Qtrees in NAS systems when they need to administer quotas or set access permissions.  Much more info on the whys and hows of Qtrees can be found in the Data ONTAP System Administration Guide on the NOW Support site.

In the context of deduplication, QSM presents a bit of a problem.  Since replication is done at the logical level, any deduplication done at the source will be re-inflated at the destination, and will need to be re-deduplicated.  This kind of defeats the purpose of space reduction.  But there is one valuable use case – if you don’t want to dedupe the source, and only want to deduplicate the destination, QSM makes perfect sense.  Refer to the following diagram:

As the diagram shows, with QSM, only the Qtree portion of the volume is replicated and it is only deduplicated at the DR site.  To configure QSM for deduplication, just enable deduplication on the destination volume and set the deduplication schedule to “auto”.  The source volume will remain untouched and the destination volume will deduplicate automatically.  Failovers and Failbacks will work just fine, since any replication from the destination back to the source will be un-deduplicated.
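To make the “re-inflated” point concrete, here is a toy model in Python.  It is only a sketch of the idea, not how Data ONTAP implements anything: the 4 KB block size, the SHA-256 fingerprints and the function names are all my own assumptions for illustration.  It shows that a logical transfer hands the destination every logical block, and that the destination’s own deduplication run is what brings the count back down.

```python
import hashlib

BLOCK = 4096  # assumed block size for this toy model


def dedupe(blocks):
    """Keep one copy of each distinct block; return (store, pointers)."""
    store = {}      # fingerprint -> block data (what is physically stored)
    pointers = []   # one fingerprint per logical block, in order
    for block in blocks:
        fp = hashlib.sha256(block).hexdigest()
        store.setdefault(fp, block)
        pointers.append(fp)
    return store, pointers


def read_logical(store, pointers):
    """Rebuild the full logical block list by following the pointers."""
    return [store[fp] for fp in pointers]


# A hypothetical qtree with 6 logical blocks but only 2 distinct ones.
qtree_blocks = [b"A" * BLOCK, b"B" * BLOCK] * 3

# Even if the source were deduplicated down to 2 physical blocks...
src_store, src_ptrs = dedupe(qtree_blocks)

# ...a logical transfer reads the data back out in full before sending it.
received = read_logical(src_store, src_ptrs)
blocks_before_dedupe = len(received)

# The destination's own deduplication run then shrinks it again.
dst_store, dst_ptrs = dedupe(received)
blocks_after_dedupe = len(dst_store)

print(blocks_before_dedupe, blocks_after_dedupe)  # prints "6 2"
```

In this model the destination briefly holds all six blocks before its own pass reduces them to two, which is why deduplicating only the destination is the sensible arrangement for QSM.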

The second type of replication is Volume SnapMirror, or VSM, which takes a different approach.  VSM replicates entire volumes (including Qtrees) at the physical level.  What this means to deduplication is that blocks are replicated once, and any deduplication pointers are sent along with the blocks.  Because replication happens at the physical level, the destination volume “inherits” deduplication automatically.  Here’s a diagram that shows how VSM works with deduplication:

To configure VSM for deduplication, enable deduplication on both the source and destination volumes but only set the deduplication schedule at the source.  The source volume will do all the work and the destination volume will get deduplication for free.  After you have a Failover/Failback event, you might want to run a deduplication scan on the source volume (sis start -s) to pick up any duplicate blocks that might have been written to the destination during the Failover, but then again it’s probably a very small amount that won’t be worth the effort.  Your choice.
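The same kind of toy model shows why the VSM destination gets deduplication for free.  Again, this is a sketch under my own assumptions (4 KB blocks, SHA-256 fingerprints, made-up function names), not a description of Data ONTAP internals: a physical transfer ships each stored block once along with the pointer list, so the destination never needs a deduplication pass of its own.

```python
import hashlib

BLOCK = 4096  # assumed block size for this toy model


def dedupe(blocks):
    """Keep one copy of each distinct block; return (store, pointers)."""
    store = {}      # fingerprint -> block data (what is physically stored)
    pointers = []   # one fingerprint per logical block, in order
    for block in blocks:
        fp = hashlib.sha256(block).hexdigest()
        store.setdefault(fp, block)
        pointers.append(fp)
    return store, pointers


def replicate_physical(store, pointers):
    """Ship each stored block once, together with the pointer list."""
    return dict(store), list(pointers)


# A hypothetical volume with 6 logical blocks but only 2 distinct ones.
volume_blocks = [b"A" * BLOCK, b"B" * BLOCK] * 3

# The source does the deduplication work on its schedule.
src_store, src_ptrs = dedupe(volume_blocks)

# The destination receives the already-deduplicated layout as-is.
dst_store, dst_ptrs = replicate_physical(src_store, src_ptrs)
blocks_sent = len(dst_store)

# Following the pointers on the destination still yields the full volume.
restored = [dst_store[fp] for fp in dst_ptrs]

print(blocks_sent, restored == volume_blocks)  # prints "2 True"
```

Only two blocks cross the wire in this model, and the destination can still serve all six logical blocks, which is the “inheritance” described above.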

In a nutshell, that’s how deduplication and SnapMirror work together.  If you’d like to read a much more complete description, here is an excellent Technical Report that includes best practices.

Categories: netapp