How do you back up Big Data? Or SaaS? Who will be the next Veeam?

It seems that every time a major new IT platform is delivered, backing it up is an afterthought, often exacerbated by the fact that the platform vendor didn’t create the APIs or plumbing to enable a backup ecosystem. Each time, there is a gap where the legacy folks aren’t able to adapt quickly enough and a new vendor (or a small handful of them) starts from scratch to figure it out. And for a while, perhaps a long while, they are the de facto solution until the need becomes so great that the platform vendor creates the APIs, and then everyone feverishly tries to catch up. Sometimes they do; other times, not so much:

Veeam, while not the only virtualization-specific backup solution, is a classic example of this scenario and is typically the vendor that the legacy solutions measure themselves against for mindshare or feature innovation, in their efforts to win back those who are using a VM-specific product in combination with traditional physical backup solutions.

Before Veeam, Seagate Software’s Backup Exec was synonymous with Windows Server backups, helped by the built-in "Backup Exec lite" version that shipped with early versions of Windows.

Before that, Cheyenne Software’s ARCserve was synonymous with Novell NetWare backups; it was among the first products to protect a server’s data from within the server, instead of from the administrator’s desktop (really).

History continues to repeat itself

The challenge for platform vendors is that after the early adopters have embraced a platform (any platform), the mainstream folks will push back under the premise of “If I am going to put my eggs (data) in this basket, it better be a solid basket” (meaning that it is back-up-able). Without that assurance, the second and broader waves of adoption will ultimately be hindered. Other examples include:

Microsoft SharePoint has a slightly similar architecture to “Big Data”, with its disparate SQL databases being far more protectable than the metadata and “Farm” constructs that link everything together. Many legacy backup solutions that had robust SQL protection capabilities struggled to back up SharePoint (restore was even worse), in large part because Microsoft hadn’t developed the VSS enablers to traverse the Farm. Today, after four major releases of SharePoint, almost anyone can back it up like “just another Microsoft application.”

As mentioned earlier, VMware hypervisors were notoriously difficult to back up in their early days, with legacy backup providers being very late to solve the challenges. Instead, a completely new group of virtualization-specific backup vendors created new approaches to address the market demand for better virtualization protection. Today, with the mature VADP mechanisms provided by VMware, VM backups are reliable, but the agility of VM recovery continues to vary widely among solutions. Some virtualization protection solutions are continuing to thrive, while the legacy backup vendors are rapidly catching up in VM backup features and trying to recapture their lost market share, not only from the VM-specific vendors but also from VMware’s own VDP solution.

Salesforce.com (SFDC) is perhaps the poster child of software-as-a-service, enabling CRM solutions for organizations of all sizes. Because it is a cloud-based solution, most early adopters were focused primarily on assuring availability of the data/service across the Internet. Unfortunately, SFDC has not yet published APIs that allow traditional backup solutions to protect SFDC data. The result is that customers are moving from legacy, albeit protected, self-managed CRM systems to an arguably far superior CRM system in SFDC, but are losing the ability to protect their data once it is there. SFDC is rumored to have a rudimentary recovery capability that purportedly costs US$10,000 per event and doesn’t have an SLA for recovery. Like the disruptive platforms before it, ESG expects that mainstream demand and platform maturity will eventually make protecting and restoring SFDC data easier, but in the meantime, as with virtualization, a new group of cloud-backup solutions is among the first to try to solve what legacy vendors are slow to adapt to.

And that brings us back to … How do you back up Big Data?

To begin to answer that question, I partnered with my ESG Big Data colleague, Nik Rouda, in authoring a brief on what you should be thinking about and looking for in the future.

CLICK HERE to check out “ESG’s Data Protection for Big Data” brief.

As always, thanks for reading.

[originally posted on ESG’s Technical]
