When you think about disaster recovery and business continuity, you typically think about protecting the servers and workstations that hold your data from failure. Many solutions let you back up your data, and most of them work pretty reliably.

However, when you implement some type of data protection technology, you have to store those backups somewhere. In most cases, they are kept on your premises, on a local hard drive or network-attached storage device. As you start to protect more and more systems, though, the backup data grows very quickly. Why is that?

Well, when you back up your data, each protected system has to generate at least one full copy to start. Some backup software stores just the incremental changes afterward, but in reality, you will want to keep at least one weekly copy, possibly one monthly and maybe even one yearly if you’re a regulated company. This adds up very quickly to a lot of disk space, especially if you have multiple systems to protect, and it becomes costly and inefficient over time.

Data Deduplication to the Rescue!

In most IT organizations, and even in the small business office, people use the same operating systems over and over again. For example, I own five Windows 10 desktops and I have to back them all up. In essence, I’m storing five copies of Windows 10 plus the business-related data on each of them. However, if it were possible to keep just one copy of Windows 10 and only the unique changes between those five desktops, would that not be more efficient? That’s where data deduplication comes in.

Say your systems are like this cute kid puzzels123.com jungle puzzle. In this picture, you see a lion, giraffe, elephant, and zebra. Now you buy another puzzle that’s the same, but this one also has a bear in it. Then you buy another one that adds a buffalo. It’s the same picture each time, with just slightly different content. Data deduplication compares the first puzzle to the next, and to the next, and stores only the changes, saving huge amounts of disk space.
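To make the puzzle idea concrete, here is a minimal sketch of one common way deduplication can work: split each backup into fixed-size chunks, hash every chunk, and keep each unique chunk only once. The function names, the tiny 4-byte chunk size, and the sample data are all assumptions made for illustration; real products use their own chunking and indexing schemes.

```python
import hashlib

def dedupe(backups, chunk_size=4):
    """Keep each unique fixed-size chunk once (hypothetical helper)."""
    store = {}    # chunk hash -> chunk bytes, stored a single time
    recipes = []  # per backup: ordered chunk hashes needed to rebuild it
    for data in backups:
        recipe = []
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)  # only new chunks take space
            recipe.append(digest)
        recipes.append(recipe)
    return store, recipes

def restore(store, recipe):
    """Rebuild one backup from the shared chunk store."""
    return b"".join(store[digest] for digest in recipe)

# Three "puzzles": the same picture, with one extra animal each time.
puzzles = [
    b"LIONGIRAELEPZEBR",
    b"LIONGIRAELEPZEBRBEAR",
    b"LIONGIRAELEPZEBRBEARBUFF",
]
store, recipes = dedupe(puzzles)
raw_bytes = sum(len(p) for p in puzzles)
stored_bytes = sum(len(chunk) for chunk in store.values())
print(raw_bytes, stored_bytes)
```

In this toy run the three puzzles total 60 bytes, but only six unique 4-byte chunks (24 bytes) are kept, and every puzzle can still be rebuilt in full from its recipe.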

We then measure the efficiency by what is called the deduplication ratio. According to TechTarget, the data deduplication ratio is the measurement of data’s original size versus the data’s size after removing redundancy. In short, if I store five puzzles that are almost identical, the deduplication ratio will be on the order of 4:1 or maybe 5:1.
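As a rough sketch of that definition, the ratio is simply the original size divided by the size actually stored. The numbers below are hypothetical: five puzzles of 100 units each, stored as one full copy plus four small sets of changes of 5 units each.

```python
def dedup_ratio(original_size, stored_size):
    """Original size divided by size after removing redundancy."""
    if stored_size <= 0:
        raise ValueError("stored_size must be positive")
    return original_size / stored_size

original = 5 * 100        # five near-identical puzzles (assumed sizes)
stored = 100 + 4 * 5      # one full copy plus four small deltas (assumed)
ratio = dedup_ratio(original, stored)
print(f"{ratio:.1f}:1")
```

With these assumed sizes the result lands a little above 4:1, and it approaches 5:1 as the differences between the puzzles shrink.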

Why Is It Necessary For Business Continuity and Disaster Recovery?

Again, backups have to be stored somewhere, so why not store them efficiently? Beyond that, there are two compelling reasons to implement deduplication with your backup strategy.

SCALE: Deduplication lets you scale up your backups by using the hard drive you store them on more efficiently. Say, for example, you have 5 terabytes of data; at 5:1 you are actually storing only 1 terabyte on your hard drive. Without deduplication, a 5 terabyte hard drive would be maxed out, and continuing to do more backups would require you to purchase another hard disk. With deduplication, you are storing only 1 terabyte of data, and yet all five of your systems are there, backed up safe and sound. So you will be able to drastically increase the number of backups.
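The scale argument comes down to two small calculations, sketched here with hypothetical helper names and assuming the 5:1 ratio holds steady as more backups are added, which real data will not guarantee.

```python
def physical_needed_tb(logical_tb, ratio):
    """Disk space actually consumed for a given amount of backup data."""
    return logical_tb / ratio

def logical_capacity_tb(disk_tb, ratio):
    """Amount of backup data a disk can hold at a given ratio."""
    return disk_tb * ratio

used = physical_needed_tb(5, 5)       # 5 TB of backups at 5:1
capacity = logical_capacity_tb(5, 5)  # what a 5 TB disk can hold at 5:1
print(used, capacity)
```

Under that assumption, 5 terabytes of backups occupy 1 terabyte on disk, and the same 5 terabyte drive could hold roughly 25 terabytes of backup data before it fills.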

COST: Deduplication can actually save you money! Let’s do the math. If you paid, for example, 250 U.S. dollars for a 5 terabyte hard drive and you fill it, your cost per gigabyte is $0.05, or 5 cents. Well, you might say that’s not bad, but now let’s apply deduplication to it. Assuming a 5:1 deduplication ratio and you max out your storage, the amount of data stored on that same 5 terabyte drive is around 25 terabytes. Your cost per gigabyte is now $0.01, or 1 cent. This makes much better financial sense.
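The cost arithmetic can be checked with a short sketch. It works in dollars, assumes 1 terabyte equals 1,000 gigabytes, and uses a hypothetical function name; the price, drive size, and ratio are the example figures from the paragraph above.

```python
GB_PER_TB = 1000  # decimal terabytes, an assumption for round numbers

def cost_per_gb(price_usd, disk_tb, ratio=1):
    """Dollars per gigabyte of backup data held on a full disk."""
    logical_gb = disk_tb * GB_PER_TB * ratio
    return price_usd / logical_gb

plain = cost_per_gb(250, 5)             # no deduplication
deduped = cost_per_gb(250, 5, ratio=5)  # 5:1 deduplication
print(f"${plain:.2f} per GB without, ${deduped:.2f} per GB with")
```

The drive costs the same either way; what changes is how many gigabytes of backups that purchase covers, which is why the per-gigabyte cost drops by the same factor as the deduplication ratio.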

Windows Server 2012 and 2016 come with deduplication technology built in, but there are other technologies like Dell-EMC Data Domain, ExaGrid, Neverfail Hybristor, and Quantum DXi available for larger scale and other use cases.

In a future article, we will consider how deduplication works in practice, the different approaches to it, and how to identify the use cases that work for you.
