Global Deduplication and Data Commits
Introduction
This KB explains what Alike is doing when it logs the following line in a backup job:
Globally deduplicating and committing new backup data to ADS (X GB). Please wait… Complete.
Backup Job Details
The short answer is that the software is globally deduplicating your backup data and permanently recording it to your Alike Data Store (ADS). This is one of the most I/O-intensive operations that Alike performs, so depending on the amount of data, the speed of your hardware, and possibly your network, this process can take some time, especially the first time a system is backed up. After your initial backups are complete, this process should take only a fraction of the time.
Update: With the release of Alike v4.1, you can speed up initial backups with the ‘Loose dedup caching’ option. This option allows Alike to use similar VMs as a basis for deduplication caching when backing up a VM for the first time. Please note that it applies to initial backups only. To learn more about when to use it, see our KB article about this option.
To dig a bit deeper into what’s going on during this phase, and why it is sometimes slow and other times fast, you need to understand how Alike’s data acquisition and deduplication works. Essentially, Alike performs two phases of deduplication to most efficiently store your backups. The first phase is local data dedupe, which occurs local to the system being protected (target side).
This first phase of dedupe leverages previously protected data of that system, as well as any internally redundant data within the VM. So, if Alike has never protected a system, it has no previous backup data to leverage for this phase, and it will treat significantly more data as needing protection. This data is transferred from the source (Q-hybrid agent/ABD/Hyper-V host) directly to your ADS share.
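To make the first phase concrete, here is a minimal Python sketch of source-side deduplication. It is an illustration only: the fixed block size, the SHA-256 hash, and the names split_blocks and local_dedupe are assumptions made for this example, not a description of how Alike is actually implemented.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, tiny block size chosen only to keep the demo readable


def split_blocks(data: bytes, size: int = BLOCK_SIZE):
    """Cut the data into fixed-size blocks (an assumed chunking scheme)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def local_dedupe(data: bytes, previous_hashes: set):
    """Return (blocks_to_send, manifest).

    A block is skipped if its hash was in the previous backup of this
    system, or if an identical block was already seen earlier in this run.
    """
    seen = set(previous_hashes)
    to_send = {}
    manifest = []  # ordered list of hashes describing the full disk
    for block in split_blocks(data):
        digest = hashlib.sha256(block).hexdigest()
        manifest.append(digest)
        if digest not in seen:
            seen.add(digest)
            to_send[digest] = block
    return to_send, manifest


# First backup: no previous data, so only internal redundancy is removed.
disk = b"AAAABBBBAAAACCCC"
first_send, first_manifest = local_dedupe(disk, set())

# Second backup: one block changed, and the previous hashes are available.
changed = b"AAAABBBBAAAADDDD"
second_send, second_manifest = local_dedupe(changed, set(first_manifest))

print(len(first_send), len(second_send))
```

In this toy run the first backup has to send every distinct block, while the second sends only the block that changed, which mirrors why initial backups move far more data than later ones.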
The second phase is Alike’s global data deduplication, which is performed by the Alike server, against the newly acquired backup data from the first phase. In the global data deduplication phase, Alike will check the new data against its global store, and only record new, unique backup data that does not yet exist in the ADS.
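The global phase can be pictured with a similar sketch. Again, this is a simplified illustration under stated assumptions: the GlobalStore class, its commit method, and the use of SHA-256 as the block key are hypothetical and are not taken from Alike itself.

```python
import hashlib


class GlobalStore:
    """Hypothetical stand-in for a global block store keyed by content hash."""

    def __init__(self):
        self.blocks = {}

    def commit(self, incoming):
        """Record only blocks not already in the store; return bytes written."""
        written = 0
        for block in incoming:
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                self.blocks[digest] = block
                written += len(block)
        return written


store = GlobalStore()

# Newly acquired data from two different VMs that share some content.
vm1_new = [b"os-block-1", b"os-block-2", b"vm1-data"]
vm2_new = [b"os-block-1", b"os-block-2", b"vm2-data"]

vm1_written = store.commit(vm1_new)  # everything is new to the store
vm2_written = store.commit(vm2_new)  # only the VM-specific block is new

print(vm1_written, vm2_written)
```

Here the second VM's commit writes only its unique block, because the shared blocks were already recorded when the first VM was committed. Checking every incoming block against the whole store is also what makes this step I/O intensive when a lot of new data arrives at once.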
Final Notes
This process can dramatically reduce the amount of data Alike needs to store in order to maintain your backup versions.