
My blog posts and tweets are my own, and do not necessarily represent the views of my current employer, my previous employers or any other party.

Many of these blogs were originally posted elsewhere on public sites, but have been re-posted here with attributions to the original location -- strictly to serve as a centralized archive of my perspectives.

I do not do paid endorsements, so if I appear to be a fan of something, it is based on my personal experience with it. If I am not talking about your stuff, it is either because I haven't worked with it enough or because my mom taught me "if you can't say something nice ... "

2017 Data Protection Predictions (Video)

We’re already into 2017, so here are three topics that really ought to be reconsidered and/or focused on in order to ensure that as you modernize production, your protection strategies are up to the tasks at hand:

Cloud – While many organizations continue to investigate where cloud will fit within their data protection strategy, it is NOT inevitable that all things go cloudy. In addition, there isn’t one kind of cloud service that applies to data protection, nor is there a de facto scenario that universally screams “use the cloud, dangit!” (other than endpoints).

Continue reading 2017 Data Protection Predictions (Video)

New Blog on Fiduciary Class Data Recovery

Check out this new blog post that I co-wrote with Mark Peters in which we lay out our ideas about a new data protection requirement: fiduciary class data recovery.


Gold standard of data protection methods goes through sea of change

Not too long ago, the gold standard for protecting organizational data involved using a disk-to-disk-to-tape process. First, a copy of production data went to secondary disk to expedite rapid recovery if needed, and then the data went to tape for long-term retention. Previously, some organizations used only tape, and a few moved to using only disk when that became cost-effective. Even then, most IT groups knew it made more sense to use a combination of the two media types to leverage what they did best: disk for recovery, tape for retention.

Fast forward a few years and the gold standard for data protection methods boiled down to employing backups, snapshotting and replication, specifically the following:

  • Backups to provide multiple previous versions over an extended timespan;
  • Snapshotting to deliver the fastest recovery from a near-current version; and
  • Replication for data survivability at an alternate location.

Some have argued that one of these data protection methods should usurp the others depending on your point of view. But the fact is the best approach has always been to use each process for what it does best, in a complementary manner with the others.

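Here we are today, and it appears the gold standard for protecting data, or at least the de facto standard at enterprise-scale organizations, has changed once again. It now centers on using multiple data protection methods per workload, namely employing the following:

  • A virtualization-specific backup product for virtual machines;
  • A database-specific backup product; and
  • A unified backup product for everything else.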

Importantly, since most of those “everything else” backup products do not protect data generated from software-as-a-service applications such as Office 365, Google Docs or Salesforce, most enterprises end up using four different types of data protection products.

Ramifications of fragmentation

Do we really need four backup products? No, but because the approach to protection is already so fragmented, many IT operations admins — and even senior IT decision makers — are often no longer able to persuade their vAdmin and database admin colleagues that a unified product would really work better.

This situation is really the fault of vendors, specifically the unified-product vendors who haven’t invested enough in marketing the economic benefits and technical proofs showing that their products support varied workloads as well as niche offerings do. Arguably, this lack of effective promotion does more to hold back unified data protection uptake than any lack of engineering to actually deliver comparable protection capabilities. Until vendors fix this messaging problem, today’s data protection-related fragmentation will continue.

In the meantime, having each administrator perform their own backups for the technological areas under their domain is a dangerous practice. Think about it: Most workload or platform admins only really care about being able to achieve 30-, 60- or 90-day rollbacks, for example. They are not worried about 5-year, 7-year or 10-year retention rules.

Corporate data must be protected to a corporate standard, however, which can include adhering to long-term retention and deletion requirements. That’s a consideration regardless of how fragmented the actual execution of protection is. So, right now, some organizational data is being under-protected and some overprotected. This situation of varying data protection methods is making organizations vulnerable.
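To make the retention mismatch concrete, here is a minimal Python sketch of an audit that compares each admin's retention setting against a corporate window. Everything in it is an assumption for illustration: the 7-year minimum and 10-year deletion deadline, the job list, and the names `BackupJob` and `classify` are hypothetical, and "over-protected" is interpreted narrowly here as keeping data past the deletion deadline.

```python
from dataclasses import dataclass

# Hypothetical corporate standard: retain at least 7 years, delete by 10 years.
CORPORATE_MIN_DAYS = 7 * 365
CORPORATE_MAX_DAYS = 10 * 365


@dataclass
class BackupJob:
    workload: str        # e.g. "vm", "database", "saas"
    owner: str           # who configured the job
    retention_days: int  # how long copies are kept


def classify(job: BackupJob) -> str:
    """Compare one job's retention against the corporate window."""
    if job.retention_days < CORPORATE_MIN_DAYS:
        return "under-protected"
    if job.retention_days > CORPORATE_MAX_DAYS:
        return "over-protected"
    return "compliant"


# Illustrative jobs: workload owners think in days, the standard thinks in years.
jobs = [
    BackupJob("vm", "vAdmin", 30),
    BackupJob("database", "DBA", 90),
    BackupJob("file-server", "backup admin", 7 * 365),
    BackupJob("saas", "IT ops", 15 * 365),
]

report = {job.workload: classify(job) for job in jobs}
for workload, status in report.items():
    print(f"{workload}: {status}")
```

The point of the sketch is only that the check is trivial once every product's settings are visible in one place; the hard part in a fragmented environment is gathering them.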

A multiplicity of gold standards

Another thing to note is that gold standards do not necessarily replace one another. Many organizations may be using two, three or four backup products per workload. But they are still supplementing those backups with snapshots and replicas. That’s actually a wise move, as no backup offering can replace the agility that comes with snapshotting or replication.

They’re also still using disk for rapid restoration and tape for long-term retention. And many are now adding cloud-based protection (disaster recovery as a service, for example) to achieve added agility.

At the end of the day, what these admins and the organizations they work for should care most about is the agility and reliability of the protection effort — regardless of the various mechanisms and media used to facilitate that protection.

We are going to have heterogeneous protection media, and we are going to have multiple data protection methods. With those realities in mind, to avoid unnecessary risk, the answer might be to have as close to a common catalog, control layer (for policy management) and console as possible. That way everyone will understand what is really going on across an environment via a single pane of glass, regardless of fragmentation behind the scenes.

 [Originally posted on TechTarget as a recurring columnist]

Wrap-Up on Backup from Microsoft Ignite (with Video)

As ESG often tries to do, here is a short summary video of ESG’s impressions from a major industry event – Microsoft Ignite, held in Atlanta over September 26-29, 2016 – from a backup perspective.

In the video, I suggested that Microsoft is a leader in Windows data protection. Certainly, this is not to disparage the many Microsoft partners that have built whole companies and product lines around data protection. And from a revenue perspective, Microsoft’s own backup offerings wouldn’t register at all. But …

  • Almost every version of Windows has shipped with a built-in backup utility to address the immediate, per-machine need for ad-hoc backups or file roll-back, with today’s “Previous Versions” functionality more closely resembling software-based snapshots than “backup” per se. That said, it has always been recognized that a more full-featured, multi-server solution is almost always warranted.
  • Of course, Microsoft has been shipping the Volume Shadow Copy Service (VSS) for over a decade, and it underpins how any backup vendor protects data within Windows systems.
  • Microsoft has been shipping its “for sale” backup offering, Data Protection Manager (within System Center), for over a decade. And though a good number of Hyper-V centric environments use DPM, the greater impact is how DPM gave/gives Microsoft insights into how to improve VSS, thereby improving all data protection offerings in market.

The point is – Microsoft is not new to “backup.” Backup hasn’t previously been a monetary focus, but Microsoft has consistently recognized it as intrinsic to a management story and to assured satisfaction with Windows Server and its application server offerings. All of that may be changing with Azure as the crown jewel of the Microsoft ecosystem and Operations Management Suite (OMS) as a cloud-based management stack.

Azure Backup is a key pillar of OMS, just like “backup” is a key pillar to many management strategies, alongside “provisioning,” “monitoring,” and “patching.”  IT Operations folks that are responsible for the latter three activities are continually wanting to do backups (as preparation for recoveries) as well … which makes Azure Backup something to watch for its own sake, and the sake of the data protection market in 2017 as many organizations are reassessing their partners during their embrace of cloud services.

[Originally blogged via ESG’s Technical]

New Veritas = New Vision

This week, the newly unencumbered Veritas (from Symantec) relaunched its premier user event – Veritas Vision. There was a palpable energy that resonated around “we’re back and ready to resume our leadership mantle,” starting with an impressive day one on the main stage:

  • Bill Coleman (CEO) started the event by making “information is everything” personal by tying medical data to a young girl with health struggles, which gave context that would resonate with everyone — which then got “real” by revealing her as Bill’s niece.
  • Ana Pinczuk (CPO) introduced us to the journey that Veritas wants to take in partnership with its customers and ecosystem – from what you already rely on Veritas for to ambitious data management and enablement, with an impressive array of announcements, almost all of which coupled Veritas’ established flagships with emerging offerings that unlock some very cool data-centric capabilities.
  • Mike Palmer (SVP, Data Insights), who is arguably the best voice on the Veritas stage in a very long time, delivered a brilliant session describing data and metadata through the movie Inside Out, tying the movie’s memory globes to data, the colors to metadata, combined with formative insights, pools of repositories and outcomes, etc.  I’ll be hard pressed to use any other analogy to describe metadata for a long time to come.  Vendors, if you want to see how to completely nail a keynote, watch the recording of Mike’s session.

And that was just the first two hours.  Along the way, the new Veritas also wanted everyone to know that they are combining two decades of storage/data protection innovation with a youthful, feisty aggressiveness against perceived legacy technologies, with EMC + Dell being the punchline of many puns and direct takedowns. That could have come across as mean or disrespectful, but was delivered with enough wit that it served to bring the room together. The competitive digs may have had arguable merit, but they clearly cast Veritas as a software-centric, data-minded contrast to hardware vendors – with a level of spunk that ought to energize its field, partners, and customer base.

As further testament to their approach for combining flagships with emerging offerings, many of the breakouts leveraged multiple, integrated Veritas products for solution-centric outcomes — which candidly is their best route to the ambitious journey that Veritas is embarking on. Glueing together their new journey through integrated solutions that are then underpinned by products (instead of jumping straight to what’s new in product X this year) will be a key to watch for as the new Veritas continues to redefine itself.  As a reminder that “Veritas” is much more than “NetBackup,” check out their current portfolio.

For further impressions on the event, check out ESG’s video coverage from the event:


Congratulations Veritas on a fresh vision (and Vision) that ought to propel you into some exciting opportunities.

[Originally blogged via ESG’s Technical]

Is Your Data Protection Strategy Suffering a Civil War?

I am a huge fan of the Marvel movies. Each of the individual hero movies has done an awesome job contributing to the greater albeit fictional universe. Each of the heroes has their unique role to play within the Avengers team. And yet, in the latest movie, which was released on Blu-Ray today, it appears as if this colorful array of heroes is divided.  They have similar goals, but what seem to be opposing methods that put them at odds with each other. Data protection can have similar contradictions.


The Spectrum of Data Protection activities can seem similar.  We often talk about the spectrum as a holistic perspective on the myriad of data protection outcomes—and the potentially diverse tools that enable those outcomes.  And yet, sometimes, the spectrum can appear opposed to itself:

  • Some in your organization are focused on “data management” (governance, retention, and compliance) which focuses on how long you can or should retain data in a cost-effective way that unlocks the value of the data.
  • Others in your organization are focused on “data availability” (assured availability and BC/DR), as part of ensuring the users’ and the business’ productivity.

Do these goals actually contradict?  No.

But … you have to start with the core of what is common: data protection, powered by a complementary approach of backup, snapshots, and replication. But as backup evolves to data protection, many come to a crossroads where that evolution only goes down one path or the other—data management or data availability.

We’ll have to wait until next year to see how the Avengers reconcile into a single team again, but you can’t afford to wait that long. Start with your core focus areas and then evolve toward the edges, as opposed to coming from the edges in.

[Originally blogged via ESG’s Technical]

Why you still need backup … and beyond

The foundation of any data protection, preservation, and availability strategy is grounded in “backup,” period. Yes, a majority of organizations supplement backups with snapshots, or replicas, or archives, as shown in what ESG refers to as the Data Protection Spectrum:


And as much as some of those other colors (approaches to data protection) can add agility or flexibility to a broader data protection strategy, make no mistake that for most organizations of any and all sizes, backup still matters!

In fact, ESG wrote a brief on the relevance of backup today, within the context of how other methods supplement backups and vice versa. ESG is now making this brief publicly available, courtesy of Commvault.

Click here to download the ESG brief

Why You Still Need Backup

Commvault believes so much in this backup-centric and yet comprehensive approach to data management, protection, and recovery that they’ve invited me to speak at their Commvault GO conference in October, in a session aptly named “Why you still need Backup … and Beyond” (session description below):

ESG research shows that for the past five years, improving data backup and recovery has consistently been one of the IT priorities most reported by organizations. However, to evolve from traditional backup to true data management is to get smarter on all of the iterations of data throughout an infrastructure, including not only the backups/snapshots/replicas/archives from data protection, but also the primary/production data and the copies of data created for non-protection purposes, like test/dev, analytics/reporting, etc. Further, the cloud offers a new way to approach data protection, disaster recovery and some of those non-protection use cases. In this session, leading industry analyst Jason Buffington discusses the trends in data protection today and market shifts that customers MUST understand in order to keep pace with the changing IT landscape.

You can click here to find out more about the sessions at the conference.

My thanks to Commvault for syndicating the brief and the chance to share ESG’s perspectives on how the realm of data protection and data management is changing, and what to look for as it does. See you in Orlando!

[Originally blogged via ESG’s Technical]

Enterprise data protection strategy requires evolution

Most organizations supplement traditional backup with some combination of snapshots, replication and archiving to achieve more comprehensive data protection. They are using what we at ESG refer to as the Spectrum of Data Protection.

Innovative data protection vendors, meanwhile, are constantly reacting to the changing IT landscape in their attempts to give their customers and prospects what they are looking for. With backup no longer enough for many organizations, data protection startups and industry dominators are stretching out and evolving in one of two tracks: data availability or data management.

Let’s take a look at these complementary yet decisively distinct branches of the data protection family tree.

[Figure: Enterprise Strategy Group data protection family tree. Data protection startups and industry dominators alike are evolving along complementary yet distinct tracks: data availability or data management.]

Data availability

The primary focus of data availability is to ensure user productivity through an infrastructure that is reactive in its recoverability across a diverse range of scenarios, delivering a wider set of recovery point objective and recovery time objective capabilities than what backup alone can do (essentially, the right three-quarters of the family tree diagram).

A significant challenge in embracing a comprehensive data protection strategy, instead of simply a backup strategy, is the myriad methods employed. They can cause drastic over-protection or under-protection. While most data still requires backups (routine, multiversion retention over an extended period of time), the agile recovery needed for heightened availability often comes from snapshots and replicas before even attempting a restore. And those activities are complemented by application- or platform-specific availability/clustering/failover mechanisms.

The key to a successful data availability strategy then becomes a heterogeneous control plane across the multiple data protection methods, and a common catalog to mitigate over-/under-protection while unlocking all of the copies available for recovery.
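As a rough illustration of what a common catalog makes possible, the Python sketch below picks the fastest-restoring copy of a dataset that still meets a recovery point objective (RPO) and recovery time objective (RTO). The `Copy` record, the `best_copy` function, the dataset name and all of the timings are hypothetical; a real catalog would hold far richer metadata.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Copy:
    dataset: str
    method: str           # "snapshot", "replica", or "backup"
    age_minutes: int      # how old the copy is (compared against the RPO)
    restore_minutes: int  # estimated time to recover from it (compared against the RTO)


def best_copy(catalog: list[Copy], dataset: str,
              rpo_minutes: int, rto_minutes: int) -> Optional[Copy]:
    """Return the fastest-restoring copy that meets both objectives, if any."""
    candidates = [
        c for c in catalog
        if c.dataset == dataset
        and c.age_minutes <= rpo_minutes
        and c.restore_minutes <= rto_minutes
    ]
    return min(candidates, key=lambda c: c.restore_minutes, default=None)


# One catalog spanning copies made by three different mechanisms.
catalog = [
    Copy("crm-db", "snapshot", age_minutes=15, restore_minutes=5),
    Copy("crm-db", "replica", age_minutes=1, restore_minutes=10),
    Copy("crm-db", "backup", age_minutes=720, restore_minutes=120),
]

choice = best_copy(catalog, "crm-db", rpo_minutes=60, rto_minutes=30)
strict = best_copy(catalog, "crm-db", rpo_minutes=5, rto_minutes=30)
unmet = best_copy(catalog, "crm-db", rpo_minutes=0, rto_minutes=30)
```

With a relaxed RPO the snapshot wins because it restores fastest; tightening the RPO to five minutes leaves only the replica; and an RPO that no copy can satisfy returns nothing, which is itself a useful signal of under-protection.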

Data management

Data management can be seen as both the reactive and proactive result of a truly mature IT infrastructure that has evolved beyond data protection. From a reactive perspective, all copies of data created through both data protection and data availability initiatives are economically unsustainable without data management.

That’s because primary production storage and secondary/tertiary protection storage are each growing faster than IT budgets. As such, organizations need to look at how they can unlock additional business value from their sprawling data protection infrastructure by leveraging otherwise dormant copies of information for reporting, test/dev enablement and analytics. Most in the industry call this copy data management, which was pioneered by startups and is now starting to be championed by industry leaders as an evolution of their broader data protection portfolios.

The proactive side of data management encompasses data protection areas such as e-discovery and compliance, archiving and, of course, backups (essentially, the left one-third of the family tree diagram). Here, organizations embrace real archival technologies instead of just long-term backups. They combine those technologies with processes and corporate culture changes to enable information governance and regulatory compliance.

For any of this to happen (data management, data availability or even just comprehensive data protection), organizations need a framework that we at ESG refer to as “The 5 Cs of Data Protection”:

Containers: Organizations should have multiple containers (repositories) for production storage and protection storage, including tape, disk and cloud.

Conduits: Enterprises will likely have multiple conduits (data movers). They frequently include not only snapshot and replication mechanisms that are often hardware based, but also multiple backup applications for general-purpose platforms and perhaps tools specifically for databases, VMs or SaaS.

Control: Because of the presumed heterogeneity of containers and conduits, organizations should seek out a single control plane (policy engine) that can ensure adequate protection across the underlying widgets without over- or under-protection.

Catalog: This needs to be a real catalog of what you have stored across those containers, regardless of which conduits created them. While some vendors might claim a rich catalog, they merely have an index of backup jobs, perhaps with enumerated file sets, instead of something that recognizes the contextual or embedded business value of the information within the data that has been stored.

Consoles: Lastly, to make sense of the whole data protection environment, most organizations need multiple consoles. Different roles get contextual insight through vCenter plug-ins, System Center packs and workload (e.g., Oracle/SQL) connectors, while a ubiquitous lens across the vastly heterogeneous arrays and backup/replication technologies gives IT operations specialists a view from which they can gain insights from their catalog and drive their control plane.
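One way to picture how the catalog and the consoles relate is a single set of catalog entries with a different filtered view per role. The Python sketch below assumes that reading; the `CatalogEntry` fields, the `ROLE_SCOPE` mapping and the sample entries are all hypothetical and are not taken from any ESG material or vendor product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    dataset: str
    workload: str   # "vm", "database", "saas", ...
    conduit: str    # the data mover that created the copy
    container: str  # "disk", "tape", or "cloud"


# Hypothetical mapping of console roles to the workloads they care about.
# None means "no filter": the IT operations view spans everything.
ROLE_SCOPE = {
    "vadmin": {"vm"},
    "dba": {"database"},
    "it-ops": None,
}


def console_view(catalog, role):
    """Return the catalog entries visible in one role's console."""
    scope = ROLE_SCOPE[role]
    if scope is None:
        return list(catalog)
    return [entry for entry in catalog if entry.workload in scope]


catalog = [
    CatalogEntry("web-01", "vm", "array snapshot", "disk"),
    CatalogEntry("web-01", "vm", "vm backup tool", "cloud"),
    CatalogEntry("orders", "database", "database backup tool", "tape"),
    CatalogEntry("mailboxes", "saas", "saas backup tool", "cloud"),
]

for role in ROLE_SCOPE:
    print(role, len(console_view(catalog, role)))
```

The design choice worth noting is that the role-specific consoles are views over one catalog rather than separate catalogs, so the workload owners and IT operations are always looking at the same underlying record of copies.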

You can read more about The 5 Cs of Data Protection on ESG’s blog.

 [Originally posted on TechTarget as a recurring columnist]

“Tape” is not a four-letter word

In the 25+ years that I have been in data protection, much of it has been spent hearing about “better” alternatives to tape as a medium. Certainly, in the earlier days, tape earned its reputation for slowness or unreliability. But nothing else in IT is the same as it was twenty years ago, so why do people presume that tape hasn’t changed?

Do I believe that most recoveries should come from disk? Absolutely. But candidly, my preferred go-to would be a disk-based snapshot/replica first, and then my backup storage disks, which would presumably be deduplicated and durable.

Do I believe in cloud as a data protection medium? Definitely. But not because it is the ultimate cost-saver for cold storage. Instead, cloud-based data protection services (cloud-storage tier or BaaS) are best when you are either dealing with non-data center data (including endpoints, ROBO servers or IaaS/SaaS native data) or when you want to do more with your data than just store it (BC/DR preparedness, test/dev/analytics). Of course, ‘cloud’ isn’t actually a medium, but a consumption model for service-delivered disk or tape, but we’ll ignore that for now.

Do I believe that tape is as relevant as it’s ever been? Yes, I really do. As data storage requirements in both production and protection continue to skyrocket, retention mandates continue to lengthen, and IT teams struggle to ‘do more with less,’ there are many organizations that need to re-discover what modern tape (not legacy stuff) really can do for their data protection and data management strategies.

Check out this video that we did in partnership with a few vendors within the LTO community:

Your organization’s broader data protection and data management strategy should almost certainly use all three mediums for what each of them is best at. Disk is a no-brainer and cloud is on everyone’s minds, but don’t forget about tape.

As always, thanks for watching.

[Originally blogged via ESG’s Technical]

The gold standard for data protection keeps evolving

Yes, of course, data protection has to evolve to keep up with how production platforms are evolving, but I would offer that the presumptive ‘gold standard’ for what is the norm for those on the front lines of proactive data protection is evolving in at least three different directions at the same time.

Here is a 3-minute video on what we are seeing and what you should be thinking about as the evolutions continue.

As always, thanks for watching.

Video transcript:

Announcer: The following is an ESG video blog.

Jason: Hi. I’m Jason Buffington. I’m a Principal Analyst at ESG covering all things data protection. The gold standard for data protection has evolved over the years. It used to be, way back in the day, it was disk-to-disk-to-tape, D2D2T. Meaning, we’d first try to recover from disk for rapid restoration or we go from tape as a longer term medium. There were a few folks out there that only used tape. There were even fewer folks out there that said you only needed disk. But most of us figured out we should use both as a better-together scenario.

Fast forward a couple of years and the gold standard changed. Now we’re talking about supplementing those longer term retention tiers of backups with snapshots for even faster recovery. And replication for data survivability and agility. Again, there’s a few folks out there that will try to convince you that one might usurp the others. But most of us have figured out that it makes sense, and cents, to have all of them in complement or in supplement to each other.

Fast forward a few more years and the gold standard continues to evolve. Now what we’re seeing is that backup is actually fragmenting. Where we’re seeing different virtualization solutions than we are for database solutions than we are for a unified solution. But the problem is the unified solutions don’t typically cover SaaS. So now we’re up to four different backup products.

Do you really need four different backup products? Probably not. But based on what we’re seeing in the industry, evidently, there’s a lot of IT operations and a lot of IT decision makers out there that haven’t been able to convince their vAdmin and their DBA colleagues that a unified solution might be superior. And that brings up two challenges. Workload owners tend to think about 30-day, 60-day, 90-day rollback in order to keep their platforms productive. Whereas a backup admin tends to think in 5-year, 10-year, 15-year retention windows. Corporate data has to be protected to a corporate standard regardless of how many different people are clicking the mouse.

The other thing to note is that one gold standard doesn’t replace the ones before it. We’re still using disk plus tape and we’re supplementing that with cloud. We’re still supplementing backups with snapshots and replication. And we’re continuing to fragment the virtualization and DBA and all the other ones we’ve talked about. And that’s gonna lead us to what ESG calls the Five Cs of data protection.

We’ve already covered the containers, meaning all those different media types you’re gonna store. We’ve already talked about the conduits, those data movers, the backups, and the snaps, and the replicas, etc. And if you’re gonna have that much heterogeneity, you better have a single control plane to make sure that everything is operating in sync with each other. Mitigating that over and under protection. You better have a catalogue that’s rich enough to tell you what you have, where it is, and how long you need to keep it. And you better have at least one console that can tell you what all is going on within the environment. And make sure they’re actually making this all actionable and insightful.

Those are the Five Cs. I’m Jason Buffington for ESG. Thanks for watching.

[Originally blogged via ESG’s Technical]
