Private cloud backup needs to get better

You can back up your private cloud data with a one-size-fits-all method or by doing it manually, but neither approach is ideal. Automation might be the key.

For years, Microsoft’s model was “If we build it, someone else will back it up.” It resulted in backup vendors creating their own database agents for apps like Exchange and SharePoint and lots of third-party support disclaimers. So Microsoft created Volume Shadow Copy Service (VSS) as a framework that backup and data storage vendors could use, and things started getting better. But VSS adoption was slow at first and allowed some latitude in its implementation, so Microsoft shipped its own backup product — System Center Data Protection Manager (SCDPM) — that gave users another choice. It also taught Microsoft quite a bit about backup in the real world, and VSS improved because of it. Today, almost every backup app for Windows starts with the VSS framework and builds on it.

Why the history lesson? Because some companies are regressing to an “if we launch new virtualized services, hopefully someone else will back them up” attitude as they adopt private cloud architectures. The challenge isn’t what a private or hybrid cloud architecture should look like; it’s the disconnect between most private cloud implementations and backup applications, which simply don’t integrate.

A private cloud takes the resource-maximizing capabilities of a highly virtualized infrastructure and adds elasticity (based on load) while enabling new models for provisioning. In its most advanced forms, virtualized services and applications are assigned a service-level agreement (SLA) or quality of service, e.g., gold, silver or bronze. Depending on the service rating, the underlying infrastructure might use faster disk, provision more processors and memory, and so on. That’s fine until it comes to backup.

There are a couple of common models for private cloud backup services today:

  1. Brute force: One size fits all, and everything gets backed up. Data protection apps optimized for virtualization workloads often tout an auto-discovery feature that essentially watches the hypervisor hosts and adds any newly created virtual machines to a default backup job (sketched in code after this list). Where’s the service level in that approach? In effect, auto-discovery jobs say, “No matter how you define the importance of those virtualized resources, it’s one size fits all for backing them up.” A little oversimplified maybe, but not by much.
  2. Manual backups: Every backup is custom tailored, and everything gets the protection it needs. In this case, the storage admin has traded up from the older methodologies of server and storage provisioning to a private cloud portal: a self-service tool that abstracts away most of the details and shows, within seconds, which new virtualized services are being brought online based on business needs. The admin then opens the backup tool’s interface, navigates to the pool of host servers managed by the private cloud, explicitly identifies the newly created resources, and configures the backup and recovery policies appropriate to that data or resource. It’s a wholly disconnected experience that’s essentially three steps forward and two steps back.
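
To make the “brute force” model concrete, here’s a minimal Python sketch of how an auto-discovery job behaves. The inventory function and job structure are made-up stand-ins for whatever a real backup product exposes; the point is that the VM’s service tier never enters the decision.

```python
# A minimal, hypothetical sketch of auto-discovery: every VM found on the watched
# hypervisor hosts lands in one catch-all backup job, regardless of its service tier.

def list_vms_on_hosts():
    # Stand-in for an inventory call against the hypervisor hosts.
    return [
        {"name": "sql-prod-01", "tier": "gold"},
        {"name": "devtest-web-07", "tier": "bronze"},
    ]

def auto_discover(default_job):
    """Add any VM not already protected to the default job."""
    for vm in list_vms_on_hosts():
        if vm["name"] not in default_job["members"]:
            # Note what's ignored: vm["tier"] never influences the schedule or retention.
            default_job["members"].append(vm["name"])
    return default_job

job = auto_discover({"name": "Default-VM-Backup", "schedule": "nightly", "members": []})
print(job)  # The gold SQL server and the bronze dev/test VM get identical protection.
```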

These alternatives are relative extremes, but there aren’t a lot of solutions in the middle ground. Instead, data protection and recovery SLAs should be attributes of the provisioning methodology. With gold, silver and bronze service levels, for example (a rough mapping is sketched after this list):

  • Bronze might be backed up weekly to disk, with backups expiring after one year.
  • Silver might be backed up every four hours to disk and weekly to tape, with a three-year retention.
  • Gold could get near-continuous protection with automatic replication to a secondary site for disaster recovery.
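
In code terms, the idea is simply a lookup from provisioning tier to protection policy. This is an illustrative sketch; the field names and structure are assumptions, and in practice the mapping would live in the backup vendor’s policy engine or the portal’s service catalog.

```python
# A hypothetical mapping from service tier to data protection SLA,
# mirroring the bronze/silver/gold examples above.

BACKUP_POLICY_BY_TIER = {
    "bronze": {"schedule": "weekly to disk", "retention": "1 year"},
    "silver": {"schedule": "every 4 hours to disk, weekly to tape", "retention": "3 years"},
    "gold":   {"schedule": "near-continuous", "dr": "replicate to secondary site"},
}

def protection_policy_for(tier: str) -> dict:
    """Look up the data protection SLA that goes with a provisioning tier."""
    return BACKUP_POLICY_BY_TIER[tier.lower()]

print(protection_policy_for("Silver"))
```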

The challenge with this approach is that it requires extra levels of interaction that aren’t currently offered by most vendors or, in some cases, enabled by the private cloud management stack. Since I picked on Microsoft initially, I’ll give them a nod for the System Center portfolio, which includes not only private cloud provisioning (VMM) and monitoring (SCOM) tools, but also DPM for backup and, most importantly, System Center Orchestrator for automation.

Automation is the key. It requires private cloud portals to provide some extensibility so that backup and other infrastructure support services have visibility into the tiers of service the portal defines. In an ideal state, when service levels are defined, the corresponding data protection SLAs are defined with them, much as both Microsoft and VMware let vendors integrate additional wizards or tabs into their user interfaces.

It also requires the runbook automation logic to interact with the backup offering. For example, when the runbook for provisioning a new application is executed, backup configuration should simply be among its steps. In a runbook world, this is ideally done through an “integration pack” created by the backup vendor, but it can also be achieved through generic insertions into an automated runbook workflow (which might simply run a command line) to invoke the backup app’s functionality. The latter approach takes additional work on the part of the backup vendor and solid partnering by private cloud vendors in publishing best-practice guides.
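
As an illustration of the “generic insertion” route, here is a minimal Python sketch of a runbook step that shells out to a backup application’s command line after a VM has been provisioned. The backupcli command and its flags are hypothetical; substitute whatever tool or integration pack the backup vendor actually ships.

```python
# A hypothetical runbook activity: attach a newly provisioned VM to the backup
# policy that matches its service tier by calling the backup app's CLI.
import subprocess

def configure_backup_step(vm_name: str, service_tier: str) -> None:
    """Provisioning runbook step that configures protection for a new VM."""
    policy = f"{service_tier.lower()}-protection"  # e.g., "gold-protection"
    subprocess.run(
        ["backupcli", "protect", "--vm", vm_name, "--policy", policy],
        check=True,  # fail the runbook if backup configuration fails
    )

# Called from the provisioning runbook right after the VM is created and tagged with its tier:
# configure_backup_step("sql-prod-01", "Gold")
```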

In a perfect world, visibility into the portal enables a “best” possible solution. While not perfect, augmenting one’s private cloud automation with the backup vendor’s processes yields a “better” approach by aligning with the provisioning process. Either of those approaches beats the “good” approach used by smart IT pros who build the integration themselves on a case-by-case basis because the vendors haven’t published one. And all of these are better than the one-size-fits-all or custom-tailored approaches that many private cloud infrastructures are dealing with today.

 [Originally posted on TechTarget as a recurring columnist]
