Blog categories

My blog posts and tweets are my own, and do not necessarily represent the views of my current employer (ESG), my previous employers or any other party.

I do not do paid endorsements, so if I appear to be a fan of something, it is based on my personal experience with it.

If I am not talking about your stuff, it is either because I haven't worked with it enough or because my mom taught me "if you can't say something nice ... "

What I am looking for at VMworld 2014 … a Data Protection Perspective

For the past few years, the big data protection trend in virtual environments was simply to ensure reliable backups (and restores) of VMs. That alone hasn’t always been easy, but with the newer Data Protection APIs from VMware (VADP), that is becoming table stakes – with the real differentiation coming from the agility to restore (speed and granularity), as well as manageability and integration.

And while there is certainly still a lot of room for many vendors to improve in those areas, the industry overall needs to move past the original question of “Can I back up your VM?” and even past “How quickly can I restore your VM?”

The new questions to be answered are:

Does your data protection solution understand which VMs should be protected and how?

Is your Virtualization Administrator enabled for protection/recovery?

The answer to the latter question may in fact inform the former one, in that a Backup Administrator isn’t always the best person to determine how the VMs should be backed up – because they don’t know what is running in those VMs. The only folks who really know are the folks who provisioned those VMs in the first place, and those are typically not the backup admins … they’re the virtualization administrators.

I covered that in some detail in a TechTarget article – discussing that the provisioning process is the right place to quantify how the VM should be protected, including retention length and RPO/RTO, which would then affect how the data protection process(es) are enacted. Maybe the provisioning process links directly to the backup engine, or the snapshot engine, or replication engine, or ???
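To make the idea concrete, here is a minimal sketch of what "quantify protection at provisioning time" might look like: the provisioning workflow captures retention and RPO/RTO, and those objectives route the VM to a protection engine. All of the names and thresholds below are illustrative assumptions, not any vendor's actual API or recommended values.

```python
# Hypothetical sketch: capture protection requirements when a VM is
# provisioned, then route to a protection engine based on the objectives.
# The class, function, and threshold values are all illustrative.
from dataclasses import dataclass

@dataclass
class ProtectionPolicy:
    retention_days: int
    rpo_minutes: int   # maximum acceptable data loss
    rto_minutes: int   # maximum acceptable downtime

def choose_engine(policy: ProtectionPolicy) -> str:
    """Pick a protection mechanism from the stated RPO/RTO."""
    if policy.rpo_minutes <= 15:
        return "replication"   # near-continuous copies off-host
    if policy.rto_minutes <= 60:
        return "snapshot"      # fast local rollback
    return "backup"            # scheduled jobs, longest retention

# A tier-1 database VM provisioned with tight objectives:
print(choose_engine(ProtectionPolicy(retention_days=2555,
                                     rpo_minutes=5, rto_minutes=30)))    # replication
# A low-priority test VM:
print(choose_engine(ProtectionPolicy(retention_days=30,
                                     rpo_minutes=1440, rto_minutes=1440)))  # backup
```

The point of the sketch is only that the decision is driven by business requirements stated by whoever provisions the VM, rather than by whatever the backup infrastructure happens to default to.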

Remember, “data protection” is not synonymous with “backup” – especially as it relates to server virtualization. In fact, when ESG asked IT professionals how they protect VMs, less than 10% stated that they only used VM-centric backup mechanisms. The other 90%+ used a combination of snapshots, replication or both to protect VMs in combination with VM-centric backups, as reported in ESG’s Trends in Protecting Highly Virtualized Environments in 2013.

      VM-protection methods -- from ESG Research Report Trends in Protecting Highly Virtualized Environments 2013

Short of augmenting the VM provisioning process to include data protection, the next best answer is to enable data protection management from within the virtualization administrators’ purview – because those folks understand the business requirements of the VMs. That doesn’t always mean ensuring that your data protection (snapshot/backup/replication) tool has a vCenter plug-in, though that helps. It does mean:

Have you truly built your data protection product or service to understand highly virtualized environments?

Is the solution VM-aware (per VM or VM-group), or simply hypervisor host-centric?

Are the management UIs (standalone or plug-in) developed with the virtualization administrator in mind? Or are they backup UIs that you hope the virtualization administrator will learn?

And of course, how agile is the restore? How fast? How granular? How flexible to alternate locations (other hosts, other sites, other hypervisors, cloud services)?

Yes, it’s a long list of questions – and I expect to be very busy at VMworld 2014, trying to find the answers from the exhibiting vendors, as well as from VMware who enables them.

[Originally posted on ESG’s Technical]

What if Cloud-Backup-Storage was Free?

Not Backup-as-a-Service, but just cloud-storage that could be used to supplement a backup. Sure, there are a lot of STaaS (storage-as-a-service) folks that will give you a small amount of capacity to try their platform, knowing full well that you are going to want more and be willing to pay for it.

But there used to be a company that would give you as much storage in the cloud as you wanted – Symform. Symform was my 2011 “Coolest Disruptive Technology that most folks hadn’t heard of yet” award winner.

Essentially, Symform would make a copy of some subset of your data, encrypt it, shatter it into 64 chunks, add 32 parity chunks, and then scatter those pieces across the Internet – specifically, onto 96 remote/nameless locations. As long as any 64 of the 96 locations were accessible, your data was accessible in the cloud. The catch: you needed to offer some of your local storage to store 1/96th of other folks’ data. So, you could quite literally go purchase a 3TB USB hard drive from a local retailer, throw it in/on your server, and then get 3TB of cloud-based storage – I know, because I did it with my own environment. Sure, if you needed more capacity than you wanted to offer, you could pay for it – but the idea that I could buy cheap, slow local storage and gain durable, tertiary cloud-based storage was really interesting.
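The 64-data/32-parity layout described above is classic erasure coding, and a little back-of-the-envelope math shows why it is so durable at only 1.5x storage overhead. The per-node availability figure below is my own assumption for illustration, not a Symform number:

```python
# Durability math for a 64+32 erasure-coded layout: the data survives as
# long as any 64 of the 96 fragments are reachable. We model fragment
# locations as independent, each up with probability p (an assumption).
from math import comb

def p_recoverable(n=96, k=64, p=0.90):
    """Probability that at least k of n independent fragments are reachable."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

print(f"storage overhead: {96/64:.1f}x")      # 1.5x, vs. 2x+ for plain mirroring
print(f"P(recoverable):   {p_recoverable():.9f}")
```

Even assuming each anonymous node is only up 90% of the time, losing more than 32 of 96 fragments simultaneously is vanishingly unlikely – which is the whole appeal of scattering parity-protected chunks rather than mirroring whole copies.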

The reason that I bring up Symform is because they were just acquired by Quantum, who also offers deduplicated disk (DXi appliances, both virtual and physical), tape solutions, and another cloud offering (Q-Cloud). We’ll see if the mechanics and service model change now that they are part of a broader portfolio of data protection offerings from a commercial vendor, but it will be interesting to watch such a different and potentially exciting technology become available to a much broader set of customers and service providers. Congrats to Quantum.

As always, thanks for reading.

[Originally posted on ESG’s Technical]

A Replication Feature is not a Disaster Recovery Plan

A few years ago, I blogged “Your Replication is not my Disaster Recovery,” in which I lamented that real BC/DR is much more about people/process than it is about technology.

To be clear, I am not bashing replication technologies or the marketing folks at those vendors … because without your data, you don’t have BC/DR, you have people looking for jobs.

But that does not mean that if you have your data remotely, you have a BC/DR plan. Having “survivable data” simply gives you the IT elements necessary to either roll up your sleeves and attempt to persevere, or (preferably) to invoke a pre-prepared set of BC/DR mitigation and/or resumption activities.

BC/DR is not a “feature” or a button or a checkbox in a product, unless those elements are part of invoking the orchestrated IT resumption processes within a broader organizational set of cultural and expertise-based approaches to resuming business, not just restarting/rehosting IT.

Replication needs to be part of every Data Protection plan, to ensure agility for recovery – and often to increase the overall usability/ROI of one’s data protection infrastructure by enabling additional ways to leverage the secondary/tertiary data copies.  Replication, whether object-, application- or VM-based and whether hardware- or software-powered, is also the underpinnings of ensuring “survivable data.”  Only after you have “survivable data” can you begin real BC/DR planning.

As always, thanks for watching.

[Originally posted on ESG’s Technical]

EMC announces another step towards backupless-backups

Last week, in London, EMC made several announcements – many of which hinged on the VMAX3 platform – but the one of most interest to me was ProtectPoint, where those new VMAX machines will be able to send their backup data directly from production storage to protection storage (EMC Data Domain) without an intermediary backup server. 

I mentioned this in my blog last week as an example of how, while “backup” is evolving, those kinds of evolutions require that the roles of both the Backup Administrator (which should not be thought of as a Data Protection Manager/DPM) and the Storage Administrator (or any other workload manager that is becoming able to protect its own data) evolve as well.  Enjoy the video:

As always, thanks for watching.

[Originally posted on ESG’s Technical]

Workload-enabled Data Protection is the Future … and that is a good thing

When asked about “what is the future for datacenter data protection,” my most frequent answer is that DP becomes less about dedicated backup admins with dedicated backup infrastructure … and more about DP savvy being part of the production workload, co-managed by the DP and workload administrators.

  • In the last few years, we’ve seen a few examples of that with DBAs using Oracle RMAN to do backups that aren’t rogue, outside of corporate data protection mandates, but in concert with them – and that are stored in the same deduplicated solution as the rest of the backups (e.g., DD Boost for Oracle RMAN).
  • More recently, we are seeing more examples of VMware administrators getting similar functionality, including not only VMware’s own VDP/VDP Advanced, but also traditional backup engines that are being controlled through vCenter plug-ins to give the virtualization admin their own solution.

EMC’s announcement of ProtectPoint is another step in that evolutionary journey, enabling VMAX production storage to go directly to Data Domain protection storage, thereby giving yet another group of IT pros more direct control of their own protection/recovery destiny, while at the same time extending the agility and sphere of influence of data protection professionals.

To be clear, as workload owner enablement continues to evolve, the role of the “Data Protection Manager” (formerly known as the “backup administrator”) also evolves – but it does not and cannot go away. DPMs should be thrilled to be out of some of the mundane aspects of tactical data protection and even more elated that technology innovations like snap-to-dedupe integration, application integration, etc. create real partnerships between the workload owners and the data protection professionals. And it does need to be a partnership, because while the technical crossovers are nice, they must be coupled with shared responsibility.

If the legacy backup admin simply abdicates their role of protecting data to the workload owner, because they now have a direct UI, many backups will simply stop being done – because the tactical ability to back up and the strategic mindset of understanding business and regulatory retention requirements are very different. The “Data Protection Manager” should be just that, the role that manages or ensures that data protection occurs – regardless of whether they enact it themselves (using traditional backup tools) or enable it through integrated data protection infrastructure that is shared with the workload owners.

Some naysayers will be concerned that as the workload owners gain tools that enable their own backups, the DP admin role diminishes – but there is a wide range of behaviors that are enabled by this evolution:

  • Some workload owners will wholly take on the DP mantle, but the DP manager will still need to “inspect what they expect” so that corporate retention and BC/DR mandates still happen.
  • Some workload owners will be grateful to drive their own restore experiences, but will happily rely on the DP managers to manage the backups beforehand.
  • Some workload owners will recognize that they are so busy managing the workloads that the DP admins will continue to do the backups and restores – but now with better backups/snaps that continue to become even more workload-savvy.

And there are likely other variations of those workload owner/DP Manager partnerships beyond these. But any way that you look at it, the evolution and collaboration of workload-enhanced data protection that can be shared between the workload owner(s) and the data protection managers is a good thing that should continue.

[Originally posted on ESG’s Technical]

Data Protection Impressions from the Dell Analyst Conference 2014

I recently had the opportunity to attend the Dell Annual Analyst Conference (DAAC), where Michael Dell and the senior leadership team gave updates on their businesses and cast a very clear strategy around four core pillars:

  • Transform — to the Cloud
  • Connect — with mobility
  • Inform — through Big Data
  • Protect

Protect?! YAY!!  As a 25-year backup dude who has been waiting to see how the vRanger and NetVault products would be aligned with AppAssure and the Ocarina-accelerated deduplication appliances, I was really jazzed to see “Protect” as a core pillar of the Dell story.

But then the dialog took an interesting turn:

As always, thanks for watching.

[Originally posted on ESG’s Technical]

vBlog: Wrap-Up on Backup from EMC World 2014 – part 2, strategy

Last week, I published a video summary of the data protection product news from EMC World 2014, with the help of some of my EMC Data Protection friends. To follow that up, I asked EMC’s Rob Emsley to knit the pieces together around the Data Protection strategy from EMC:

Essentially, what I call the Data Protection Spectrum, EMC calls the Data Protection Continuum — as an overarching perspective that combines backups, archives, snapshots, and replication (for all of which EMC has product offerings).

My thanks to Rob for his insights and to EMC for a great week.

Thanks to you for watching.

[Originally posted on ESG’s Technical]

vBlog: Wrap-Up on Backup from EMC World 2014 – part 1, products

During EMC World 2014 in Las Vegas last month, I had the chance to visit with several EMC product managers on what was announced from a product perspective, as well as overall data protection strategy.  Enjoy the video:

For such a broad range of products within the EMC DP portfolio, it is impressive that while each product continues to innovate on its own, it is obvious that they are doing so in alignment with each other — and with a clear and unifying vision of meeting customers’ overall data protection needs.

As always, thanks for watching.

[Originally posted on ESG’s Technical]

vBlog: How to Ensure the Availability of the Modern Data Center

When you really boil down the core of IT — it’s to deliver the services and access to data that the business requires. That includes understanding the needs of the business, its dependencies on things like its data, and then ensuring the availability of that data.

"Availability" can be achieved in two ways = Resilience and Recoverability.

  • Resilience is the clustering/mirroring technologies that many of us think of as “traditional high availability.”
  • Recoverability is the myriad methods of rapidly restoring functionality through the use of technologies like near-instantaneous VM restoration or leveraging snapshots.
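One way to see why both paths matter is the classic steady-state availability formula: availability = MTBF / (MTBF + MTTR). Resilience (failover) and recoverability (fast restore) both work by shrinking the effective MTTR. The numbers below are illustrative assumptions, not benchmarks from any study:

```python
# Rough comparison of the two paths to availability described above.
# Both resilience (failover) and recoverability (fast VM restore) improve
# availability by shrinking effective repair time. Figures are illustrative.
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability = uptime / (uptime + repair time)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

MTBF = 2000.0  # assume one failure every ~83 days
print(f"manual rebuild (8h outage):      {availability(MTBF, 8.0):.4%}")
print(f"near-instant VM restore (15min): {availability(MTBF, 0.25):.4%}")
```

Same failure rate, very different availability – which is why a highly virtualized data center that can restore a VM in minutes can approach "traditional HA" numbers without traditional HA cost.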

The modern data center, which by definition should be highly virtualized, must also be both resilient and recoverable, in order to be dependable enough to then deliver the other promises of modern IT around agility, flexibility, etc. With that in mind, here is a short video on what folks should be looking for to fulfill the recoverability requirements of highly virtualized environments, thus helping to achieve the availability of the modern data center:

As always, thanks for watching.

[Originally posted on ESG’s Technical]

How do you back up Big Data? Or SaaS? Who will be the next Veeam?

It seems that every time a new major IT platform is delivered, backing it up is an afterthought – often exacerbated by the fact that the platform vendor didn’t create the APIs or plumbing to enable a backup ecosystem. Each time, there is a gap where the legacy folks aren’t able to adapt quickly enough, and a new vendor (or a small subset of vendors) starts from scratch to figure it out. And for a while, perhaps a long while, they are the de facto solution until the need becomes so great that the platform vendor creates the APIs, and then everyone feverishly tries to catch up. Sometimes they do; other times, not so much:

Veeam, while not the only virtualization-specific backup solution, is a classic example of this scenario and is typically the vendor that the legacy solutions measure themselves against for mindshare or feature innovation, in their efforts to win back those who are using a VM-specific product in combination with traditional physical backup solutions.

Before them, Seagate Software’s Backup Exec was synonymous with Windows Server backups, helped by the built-in "Backup Exec lite" version that shipped within early Windows.

Before them, Cheyenne Software’s ARCserve was synonymous with Novell NetWare backups; Cheyenne was among the first to protect a server’s data from within the server itself, instead of from the administrator’s desktop (really).

History continues to repeat itself

The challenge for platform vendors is that after the early adopters have embraced a platform (any platform), the mainstream folks will push back under the premise of “If I am going to put my eggs (data) in this basket, it better be a solid basket” (meaning that it is back-up-able) – the lack of which will ultimately hinder the second/broader waves of adoption. Other examples include:

Microsoft SharePoint has a slightly similar architecture to “Big Data”, with its disparate SQL databases being far more protectable than the metadata and “Farm” constructs that link everything together. Many legacy backup solutions that had robust SQL protection capabilities struggled to back up SharePoint (restore was even worse), in large part because Microsoft hadn’t developed the VSS enablers to traverse the Farm. Today, after four major releases of SharePoint, almost anyone can back it up like “just another Microsoft application.”

As mentioned earlier, VMware hypervisors were notoriously difficult to back up in their early days, with legacy backup providers being very late to solve the challenges. Instead, a completely new group of virtualization-specific backup vendors created new approaches to address the market demand for better virtualization protection. Today, with the mature VADP mechanisms provided by VMware, VM backups are now reliable, but the agility of VM recovery continues to vary widely among solutions. Some virtualization protection solutions are continuing to thrive, while the legacy backup vendors are rapidly catching up in VM backup features and trying to recapture their lost market share, not only from the VM-specific vendors but also from VMware’s own VDP solution. is perhaps the poster child of software-as-a-service, enabling CRM solutions for organizations of all sizes. And as a cloud-based solution, most early adopters were focused primarily on assuring availability of the data/service across the Internet. Unfortunately, SFDC has not yet published APIs that allow traditional backup solutions to protect SFDC data. The result is that customers are moving from legacy, albeit protected, self-managed CRM systems to an arguably far superior CRM system in SFDC, but are losing the ability to protect their data once it is there. SFDC is rumored to have a rudimentary recovery capability that is purportedly $10,000 US per event and doesn’t have an SLA for recovery. Like the disruptive platforms before it, ESG expects that mainstream demand and platform maturity will eventually make protecting and restoring SFDC data easier; but in the meantime, as with virtualization, a new group of cloud-backup solutions is among the first to try to solve what legacy vendors are slow to adapt to.

And that brings us back to … How do you back up Big Data?

To begin to answer that question, I partnered with my ESG Big Data colleague, Nik Rouda, in authoring a brief on what you should be thinking about and looking for in the future.

CLICK HERE to check out “ESG’s Data Protection for Big Data” brief.

As always, thanks for reading.

[originally posted on ESG’s Technical]