It is interesting to me how marketing folks and technical purists bandy IT terms about, in hopes of sounding fresh and compelling to their customers. While “backup” is often thought of as passé or the bane of IT operations, “data protection” is perceived as more strategic, with other lofty terms such as “business continuity” and “information management” being thrown around as adjuncts. And that doesn’t include some of the classic debates, such as Snapshots versus Backups … or Disk vs. Tape.
A few years ago, I wrote the book “Data Protection for Virtual Data Centers.” Its first chapter described the Landscape of Data Protection as a range of methods that can be categorized by recovery time objective (RTO) and implementation layer, with a key focus on differentiating solutions aimed at Availability of data from those aimed at Protection of data.
Chapter 1, which discussed the Evolving Landscape of Data Protection, is available as a free download.
As IT organizations struggle to deliver a broader range of recovery options, often with agility that can be measured in seconds but with retention measured in years, it can be intimidating to sift through the buzzwords that all seem synonymous with “data protection.”
Instead, think of Data Protection as the umbrella term for the Spectrum of activities and initiatives that enable organizations to be agile in their protection, preservation and delivery of production data.
In most cases, the terms are arranged with close cousins adjacent to each other. The key idea is not their placement, though, so much as understanding that, just as a rainbow wouldn’t be complete without ‘green’ or ‘red’, comprehensive data protection wouldn’t be complete without all of the colors shown here (including an underpinning of deduplication).
Folks need to stop arguing the merits of one color versus another, and start thinking about what types of recoverability their organization needs:
- If you need to quickly recover to near-current points in time, utilize snapshot technology(s)
- If you need to recover data selectively (or en masse) to a range of previous timeframes, use traditional backups
- If you need to preserve data for content-specific purposes, implement archiving
- If you need data to be in more than one location, replicate it
- If that remote data needs to be accessible, leverage the replicated copies for BC/DR
- And if the data must be accessible all of the time, ensure that accessibility through high availability mechanisms
And of course, all of those mechanisms are best served when the data is deduplicated within each color and throughout the spectrum.
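The checklist above can be read as a simple lookup from recovery needs to colors. Here is a minimal Python sketch of that idea; the class, field and function names are hypothetical and exist only for illustration, and the assumption that BC/DR implies replication follows from the bullet about leveraging the replicated copies.

```python
from dataclasses import dataclass


@dataclass
class RecoveryNeeds:
    """Hypothetical yes/no answers to the checklist questions."""
    near_current_recovery: bool = False       # recover quickly to near-current points in time
    previous_timeframes: bool = False         # recover selectively or en masse to older points
    content_specific_preservation: bool = False
    multiple_locations: bool = False
    remote_copy_accessible: bool = False
    always_accessible: bool = False


def required_colors(needs: RecoveryNeeds) -> list[str]:
    """Return the data protection methods (colors) implied by the needs."""
    colors = []
    if needs.near_current_recovery:
        colors.append("snapshots")
    if needs.previous_timeframes:
        colors.append("backup")
    if needs.content_specific_preservation:
        colors.append("archiving")
    # Assumption: BC/DR uses the replicated copies, so it implies replication.
    if needs.multiple_locations or needs.remote_copy_accessible:
        colors.append("replication")
    if needs.remote_copy_accessible:
        colors.append("BC/DR")
    if needs.always_accessible:
        colors.append("high availability")
    return colors


if __name__ == "__main__":
    needs = RecoveryNeeds(near_current_recovery=True, remote_copy_accessible=True)
    print(required_colors(needs))
```

Deduplication is deliberately not in the list: it underpins every color rather than being a choice among them.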
The real keys to a contemporary data protection strategy, with all of the colors represented, are to:
- Manage the spectrum of data protection methods through an integrated or unified policy/monitoring interface, so that each color isn’t creating its own copies, on its own schedule, with its own storage.
- Reduce the number of copies (or copy fragments) across your environment through deduplication and consolidation of the data stores, so that a smaller storage footprint provides a broader range of recovery points across the various colors. And remember that ‘storage’ may appear as deduplicated disk, primary storage, long-term tape or cloud repositories, with assuredly most, if not all, of them in your rainbow.
And perhaps most importantly, don’t start your data protection strategy planning with an assessment of what your current technologies offer. Instead, start with the end goals (colors) in mind, with a business assessment of why each core workload needs whichever types of protection. And then assess your current and alternative technologies against their ability to deliver the colors in a cohesive way – and their ability to be managed, and to store data, in a unified way.
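As a rough illustration of that goals-first ordering, the sketch below starts from the colors each workload needs and only then checks what the technologies deliver. The function name, workload names and technology names are all made up for the example, and it checks coverage only, not whether the pieces are managed in a unified way.

```python
def coverage_gaps(workload_needs, technology_colors):
    """Return, per workload, the needed colors that no technology delivers.

    workload_needs:    dict of workload name -> set of needed colors
    technology_colors: dict of technology name -> set of colors it delivers
    Workloads with no gaps are left out of the result.
    """
    delivered = set().union(*technology_colors.values())
    gaps = {}
    for workload, needs in workload_needs.items():
        missing = set(needs) - delivered
        if missing:
            gaps[workload] = sorted(missing)
    return gaps


if __name__ == "__main__":
    # Step 1: business assessment of what each workload needs (hypothetical).
    workload_needs = {
        "email": {"snapshots", "backup", "archiving"},
        "order-db": {"snapshots", "backup", "replication", "BC/DR", "high availability"},
    }
    # Step 2: what the current technologies deliver (hypothetical).
    technology_colors = {
        "backup-product": {"backup"},
        "storage-array": {"snapshots", "replication"},
    }
    print(coverage_gaps(workload_needs, technology_colors))
```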
After all, have you ever seen a rainbow where the colors were mostly parallel, but somewhat overlapping, with gaps in-between some colors and other colors shooting off in different directions?!
As always, thanks for reading.