
My blog posts and tweets are my own, and do not necessarily represent the views of my current employer (ESG), my previous employers or any other party.

I do not do paid endorsements, so if I appear to be a fan of something, it is based on my personal experience with it.

If I am not talking about your stuff, it is either because I haven't worked with it enough or because my mom taught me "if you can't say something nice ... "

“Tape” is not a four-letter word

In the 25+ years that I have been in data protection, much of it has been spent hearing about “better” alternatives to tape as a medium. Certainly, in the earlier days, tape earned its reputation for slowness and unreliability. But nothing else in IT is the same as it was twenty years ago, so why do people presume that tape hasn’t changed?

Do I believe that most recoveries should come from disk? Absolutely. But candidly, my preferred go-to would be a disk-based snapshot/replica first, and then my backup storage disks, which would presumably be deduplicated and durable.

Do I believe in cloud as a data protection medium? Definitely. But not because it is the ultimate cost-saver for cold storage. Instead, cloud-based data protection services (cloud-storage tier or BaaS) are best when you are either dealing with non-data center data (including endpoints, ROBO servers or IaaS/SaaS native data) or when you want to do more with your data than just store it (BC/DR preparedness, test/dev/analytics). Of course, ‘cloud’ isn’t actually a medium but rather a consumption model for service-delivered disk or tape; we’ll ignore that for now.

Do I believe that tape is as relevant as it’s ever been? Yes, I really do. As data storage requirements in both production and protection continue to skyrocket, retention mandates continue to lengthen, and IT teams struggle to ‘do more with less,’ many organizations need to re-discover what modern tape (not the legacy stuff) really can do for their data protection and data management strategies.

Check out this video that we did in partnership with a few vendors within the LTO community:

Your organization’s broader data protection and data management strategy should almost certainly use all three mediums for what each of them is best at. Disk is a no-brainer and cloud is on everyone’s minds, but don’t forget about tape.

As always, thanks for watching.

[Originally blogged via ESG’s Technical Optimist.com]

Enterprise data protection strategy needs archive, backup

Anyone who truly requires long-term data retention probably already understands the realities of what modern tape offers — and not the marketing fear, uncertainty and doubt from 20 years ago. Which is why, with its speed, error-protection, durability, functionality (e.g., Linear Tape File System) and economics, tape continues to earn a place in the enterprise data protection strategy of a majority of organizations today.

Meanwhile, just because an organization doesn’t buy into the fear, uncertainty and doubt (FUD) of times past regarding tape doesn’t mean that it fully understands how best to use tape or other backup and archive media, such as disk or cloud. So please stop presuming backups go to disk and archives go to tape. That may be the combination you’ve chosen, but that’s just you. Which storage medium you choose depends on the specific goals you are trying to accomplish in your enterprise data protection strategy and the IT infrastructure you’ve got in place.

Backup, archive and the data protection spectrum

Backups and archives may look similar, but they are two very different IT processes that are done for very different reasons.

A backup is a copy of a data container (volume, directory, database, whatever). It is typically optimized to retain multiple previous versions of data in preparation for restoration to an earlier version or point in time.

An archive, by contrast, is the intentional preservation of data, usually a minority subset of your overall data set. Archives are based on the business value of the data, and are often motivated by some kind of regulatory or operational mandate to retain that data for an extended period of time.
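To make that distinction concrete, here is a minimal, hypothetical sketch; the class and field names (and the retention values) are mine for illustration, not any product’s schema. A backup expresses retention as prior versions of a whole container, while an archive expresses retention of selected items as a mandated time horizon.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Backup:
    """A point-in-time copy of a whole data container (volume, directory, database)."""
    container: str            # what was copied
    taken_at: datetime        # the point in time this version represents
    versions_to_keep: int     # retention is measured in prior versions / points in time

@dataclass
class ArchiveItem:
    """Intentional preservation of selected data, based on its business value."""
    item: str                 # a specific file or record, usually a minority subset
    mandate: str              # the regulatory or operational reason for keeping it
    retain_until: datetime    # retention is measured as an extended time horizon

# A backup answers: "restore this container to how it looked at 1:00 a.m. last Tuesday."
nightly = Backup("finance-db", taken_at=datetime(2016, 5, 3, 1, 0), versions_to_keep=30)

# An archive answers: "preserve this record for seven years because a mandate says so."
contract = ArchiveItem("contracts/acme-2016.pdf", mandate="records-retention policy",
                       retain_until=datetime.now() + timedelta(days=7 * 365))
```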

At ESG, we think of each of these — backup and archive — as parts of a broader enterprise data protection strategy that also includes snapshots, replication, availability and so on. (See: “ESG Spectrum of Data Protection”).

ESG Spectrum of Data Protection


What the numbers say

For the same reasons that different folks will send their backups to disk, cloud or tape as part of their enterprise data protection strategy — an archive can also go to disk, cloud or tape. In ESG’s recent “Long Term Retention” research, we looked at several performance characteristics of modern archival products:

  • Over half (51%) of organizations move data into their archival product on at least a daily basis.
  • Two out of five (41%) organizations retrieve data on at least a daily basis, with another fourth (23%) retrieving data multiple times throughout the week.
  • Four out of five (80%) pieces of data retrieved are two years old or less.
  • Over one-third (37%) of data retrieved is less than a gigabyte, with another third (34%) less than 100 GB.
  • And two-thirds of folks expect to retrieve that data in minutes. In fact, one in five (19%) expect to retrieve the data within seconds!

Everything described here fits modern tape usage, but also fits modern deduplicated disk and some cloud products — and that is the point. Well, one of two points, actually:

  • Stop interchanging backup and archive behaviors, unless your data protection product actually does both — creating copies for restoration preparation and archives for long-term data preservation.
  • Choose the right media combinations for your backups and archives based on the agility you need (which will often point to disk) and the characteristics of what you require in long-term retention (which will likely point you to tape and/or cloud).

In fact, by the time you look at the broader data protection spectrum again and at what your business units require in cost-effective and flexible data management, you’ll likely need disk, cloud and tape to accomplish the goals of your enterprise data protection strategy — and that is okay, really.

[Originally posted on TechTarget as a recurring columnist]

Why BaaS when you can DRaaS?

The question isn’t as simple as it might seem:

  • There are lots of great reasons to embrace modern Backup as a Service (BaaS) solutions, including governance, extended data preservation, IT oversight of endpoint and ROBO backups, etc.
  • There is also one overwhelmingly great reason to utilize Disaster Recovery as a Service (DRaaS) — enhanced availability of servers.

So, the question really is, can you gain a DRaaS outcome from a BaaS solution? And honestly, it isn’t just a cloud consideration. You could just as easily ask, can you get BC/DR agility from a backup tool?

The answer is, it’s really, really hard to get BC/DR from a Backup/Restore tool, in or out of a cloud — particularly due to data flow and the need for better orchestration. Here is a short video I’ve recorded on the differences between backup and replication, and the importance of orchestration and workflow:

You may have a data protection product (or service) that offers both. If you do, it is likely doing both backups (transformed) and replicas (readily-usable), which builds good cases for:

  1. Deduplicated/optimized protection storage
  2. Orchestration/automation workflows as part of any modern data protection management framework (a minimal sketch of such a workflow follows below).
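To illustrate the second point, here is a minimal, hypothetical sketch of what such an orchestration workflow entails. The VM names, dependency map, and power_on function are placeholders of my own, not any vendor’s API; the point is simply that the boot order is declared in advance and resolved automatically at failover time, rather than someone right-clicking VMs by hand.

```python
# Hypothetical failover runbook: each VM lists the VMs that must be running before it starts.
# Loosely mirrors the multi-tier example in the video: directory services first,
# then the database pair, then middleware, then the web front ends.
RUNBOOK = {
    "ad-01":  [],                     # Active Directory has no dependencies
    "db-01":  ["ad-01"],
    "db-02":  ["ad-01"],
    "mw-01":  ["db-01", "db-02"],
    "web-01": ["mw-01", "ad-01"],
    "web-02": ["mw-01", "ad-01"],
}

def power_on(vm):
    """Placeholder for whatever replica/orchestration call actually starts the VM."""
    print(f"powering on replica of {vm}")

def run_failover(runbook):
    """Resolve declared dependencies into a safe power-on sequence (a simple topological sort)."""
    started = set()
    while len(started) < len(runbook):
        ready = [vm for vm, deps in runbook.items()
                 if vm not in started and all(d in started for d in deps)]
        if not ready:
            raise ValueError("circular dependency in runbook")
        for vm in ready:
            power_on(vm)
            started.add(vm)

run_failover(RUNBOOK)
```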

As always, thanks for watching.

Video transcript:

Woman: The following is an ESG video blog.

Jason: Hi. I’m Jason Buffington. I’m the senior analyst at ESG covering data protection. For the past two years, we’ve seen significant interest in leveraging cloud services as part of one’s data protection strategy. In fact, in ESG’s last two annual IT Spending Intentions reports, when asked about the use cases for cloud based infrastructure services, improving data backup was number one. Disaster recovery was number three.

There are lots of different ways to gain backup services, including augmenting on-prem backups with cloud storage all the way to a full-fledged backup as a service, BaaS. Similarly, for business continuity disaster recovery goals, you might utilize colo space, you might use infrastructure as a service, hybrid architecture or a full-fledged DR as a service, DRaaS.

With so many choices, it can be really confusing. I’d like to offer what I believe are the single biggest differences between them, which is data flow and orchestration, which will ultimately affect your agility and your business outcome.

From a data flow perspective, most backup technologies transform the data as part of transmitting it to the secondary repository, on-prem or cloud, which is what necessitates doing some kind of a “restore” to get it back. This transformation usually optimizes for storage but can limit the immediate usability or the recoverability of the data unless you restore it or basically un-transform it back to its original state.

In contrast to that, most BC/DR and availability technologies replicate the data in closer to their original state, which makes the data more immediately usable when needed. One method is not better than the other. Backups optimize for multiple versions, while replicas are designed for usability of typically only the most current version.

There are exceptions to the rule, but in general, the more immediately usable the data, the less transformed it is within secondary storage, and that’s a trade-off between storage efficiency and IT resiliency.

The other main differentiator is workflow orchestration or automation. It’s one thing to have copies of VMs sitting in some secondary repository some place, but availability and BC/DR are more than just powering them up. For example, say you have a multi-VM application with web front ends connecting to middleware, serviced by a pair of database servers, all of which have to be authenticated by Active Directory. You can’t just highlight those eight VMs, right-click and say, “power on.” You have to have a workflow. You have to have automation that’s defined in advance and runs when you need it.

Those same orchestration and automation mechanisms can also give you sandbox testing. You can test the ability to bring VMs online without impacting production, or you can test the recoverability or the restorability of VMs, even down to granular data within a VM, on a regular basis.

There are other differences, but I hope this starts to get you thinking. Both kinds of technologies, whether implemented within public or dedicated cloud services, in a hybrid architecture or even just between on-prem locations, provide huge value in modernizing one’s data protection capabilities. Just understand what you’re getting and be clear on what you need.

I hope this was helpful. I’m Jason Buffington for ESG. Thanks for watching.

[Originally blogged via ESG’s Technical Optimist.com]

Wrap-up on backup from EMC World 2016 — day two keynote

There are a lot of things to like about EMC World this year, especially if data protection is important to you. Kudos on the day two general session with Jeremy Burton and Guy Churchward.

Some notes from the event:

  • It is hard to imagine doing an inside-an-appliance component-level tour from main stage, but the Fantastic Voyage miniaturization schtick worked to keep us entertained and let EMC tell a story of what makes Unity unique, both as a platform and a usability experience. And both are impressive (even to a backup guy like me).
  • A special nod to Beth Phalen for the data protection topics, in which EMC briefly covered its recently announced Data Domain Virtual Edition (DD VE) and then gave a very solid demo of eCDM (enterprise Copy Data Management), which discovers and helps manage all of the myriad copies across EMC production and protection storage platforms. CDM can be a daunting concept to really understand, but the pastry example worked, and the solution is one that a lot of folks ought to be excited to explore.
  • Chad Sakac added more data protection goodness by talking about the built-in data protection within new VCE converged infrastructure and hyperconverged appliances via EMC RecoverPoint for VMs, with replication and failover at a VM level already included within VxRail systems, as well as wizard-based workflows for context-aware backups that are part of the VM provisioning experience. I’m a huge fan of built-in DP within converged infrastructure, so I was pleased to see these from main stage and done well.

As the general session continued, more execs told their parts of the story with new announcements around a faux coffee-shop. At one point, they sat on couches and hovered around a table looking at a new platform – painting a subtle picture that these execs and their myriad technologies really are ‘Friends’ (TV show style), a genuine family.

I don’t gush on keynotes often, but this one was spot-on in both execution and content, and is worth studying by others in our industry. Check out the EMC World Day Two General Session at http://EMCworld.com. As for the data protection stuff:

You can also see my guest vBlog on EMC’s site with my first (hands-on) impressions of DD VE.

In addition, check out my recent vBlog on why Copy Data Management matters to everyone.

[Originally blogged via ESG’s Technical Optimist.com]

How to evolve from “data protection” to “data management”

To evolve from backup to data protection is to embrace complementary mechanisms that enable a much broader range of preservation of data, protection of data, and the assurance of productivity through enhanced availability initiatives, as seen in ESG’s Spectrum of Data Protection.

ESG Spectrum of Data Protection

To evolve from data protection to data management is to get smarter on all of the iterations of data throughout an infrastructure, including not only the backups/snapshots/replicas/archives from data protection, but also the primary/production data and the copies of data created for non-protection purposes, like test/dev, analytics/reporting, etc.

Here’s a short video on Copy Data Management that hopefully explains the importance of CDM:

Quintessential ‘backup admins’ are in a unique position to potentially lead this evolution because:

  1. Admins have led many of the evolutions from backup to data protection.
  2. Some DP technologies already have the catalog and the controller (policy) that lend themselves to evolving from data protection to data management.
  3. They, perhaps more than most, understand the diversity of repositories of data that exist throughout an infrastructure.

To be clear, backup admins cannot achieve data management on their own — in fact, the stakeholders should include several groups across IT, non-IT business stakeholders, and other beneficiaries (such as test/dev, compliance, etc.). But if you put those folks in the room, and seek out authentic CDM technologies that can enable what has been envisioned, then you are on your way to data management.

As always, thanks for watching.

[Originally blogged via ESG’s Technical Optimist.com]

Now’s the time to virtualize backup deduplication

With so much great work being done in converged infrastructure and hyper-converged appliances, many users are getting to the point that the only boxes with blinky lights that they want to see on their floor are those converged infrastructure (CI)/hyper-converged appliance (HC) platforms, since everything else is running inside those bundles.

But one of the last “blinky” holdouts is the data protection setup being used to protect the CI/HC platforms. Even when the data protection software engine might run within a virtualized server, the protection storage (typically a storage array performing backup deduplication) is often still physical and adjacent to the rest of the infrastructure.

My recent hands-on experience with a new virtualized backup deduplication array has validated my belief that those platforms should also be virtualized. In the past, many argued that key systems, such as database platforms, needed dedicated hardware, either because of the latency added by the virtualization layer or because they demanded so many CPU/memory/I/O resources that the machine would be essentially dedicated even if it hosted just one VM.

But the innovations that have thinned the hypervisor layer and optimized I/O also negate the first argument, while the second argument of being dedicated actually favors the single-VM-on-host model. VMs are easier to protect and recover than physical servers, and they are much easier to upgrade:

* If you have a dedicated physical server that requires more resources, you must rebuild or migrate that OS, application and dataset.

* If your virtual machine requires more resources, you can add them from the host or move the VM to a new host that has more of what’s needed.

If those are the scenarios for a dedicated database server, why can’t they be the same for a dedicated backup deduplication platform? In fact, that argument should be even more compelling to those who use dedicated backup deduplication appliances, since the task of replacing the controller head of a dedupe platform has traditionally been arduous.

According to ESG research on data protection appliances, most organizations believe the tipping point between a virtual appliance and physical appliance to be around 32 TB, but if VMs continue to scale larger, that becomes less of a barrier. Moreover, some of the most exciting innovations in backup deduplication are coming from products with controllers that interconnect for both scale-out and scale-up. At that point, why not put a few virtual dedupe appliances in the same infrastructure, split across I/O boundaries, such as one dedupe-VM in each of the four nodes within a hyper-converged appliance?

The reasons for virtualizing backup deduplication go far beyond “because you can without penalty”:

* Distributed environments can run virtualized dedupe within each remote office, and then take advantage of highly efficient, compressed/deduplicated replication from the branch to a larger deduplication appliance at headquarters.

* Small and midsized organizations can finally get the economics of deduplication, without potentially complicating their otherwise consolidated environments. And they can efficiently replicate from their own SMB environment to another of their own sites or to a service provider offering cloud storage or disaster recovery as a service.

* Service providers can choose to spin up a virtual appliance per subscriber, rather than relying on multi-tenancy or making significant CapEx investments up front. Instead, they can create virtual dedupe targets on demand, with complete isolation of data and management, and then add capacity licensing (and do migration-based maintenance) transparently.

* Dedupe vendors benefit from virtual dedupe, since it’s much easier for pre-sales team members and reseller partners to spin up a VM than it is to requisition a physical demo unit for a proof of concept.

* Backup software vendors can also benefit, if they choose to ship a virtual backup engine and perhaps partner with a virtual dedupe vendor. The customer/partner installs two VMs that are known to work together, and everyone benefits.

For the vast majority of environments, disk is the right choice to recover from — and from a cost-efficiency perspective, you really have to have backup deduplication. That said, the argument that you need a physical deduplication platform is being challenged as innovations in virtual products continue to exceed expectations.

 [Originally posted on TechTarget as a recurring columnist]

Do we still need VM-specific backup tools? [video]

This is one of the big questions in 2016 (and each of the past few years as well).

Have the “traditional” unified data protection solutions caught up in reliability and agility to the degree that “point products” that only protect VMs are no longer necessary? To help answer the question, I’ve recorded a short video:

The “unified vs. specific” question is interesting, but the real answer is that we need data protection that is contextually aware of and integration-ready with multiple hypervisors, that utilizes the latest APIs to ensure reliable protection, and that delivers rapid and agile recovery without first restoring.

Combine those traits with a growing influence by vAdmins and IT operations folks over data protection, instead of a single all-powerful backup administrator, and it’s easy to see how the toolsets of many organizations continue to fragment around workload‑specific data protection initiatives.

ESG watches and comments on VM protection and recovery considerations on a very regular basis:

We recently started our most expansive coverage of the topic yet — 2016 Trends in Protecting Highly Virtualized Environments — in support of VMworld 2016 and Microsoft Ignite 2016, including which hypervisor(s), what DP methods (backup/replication/snapshots), and who (vAdmin/backup admin) are driving virtualization protection and recovery solutions. It looks like it’ll be a very interesting year.

As always, thanks for watching.

Video transcript:

Woman: The following is an ESG video blog.

Jason: Hi. I’m Jason Buffington. I’m the Senior Analyst at ESG covering data protection. I’ve been in data protection for more than 25 years and certainly one of the most impactful transformations over that time has been server virtualization. As we’ve seen in every other major platform shift, data protection is rarely an initial part of the new platform. And so it takes early innovator startups to figure out how to protect that new platform while industry-leading unified solutions tend to lag behind.

This was certainly true for VM backups where it seemed like forever before VMware delivered vSphere APIs for data protection, VADP. Until those stabilized, you really couldn’t adequately protect a highly virtualized environment with legacy approaches. But today, those APIs are current and almost everybody really can adequately protect VMs as long as you stay current with whatever your backup software is. Do we still need VM-specific data protection mechanisms, or has that need passed and we can all go back to unified solutions until the next disruptive platform shift?

In 2015, ESG asked data protection professionals whether they currently used a VM-specific data protection tool or a unified solution that protects both physical and virtual machines. Sixty-four percent of organizations use a unified solution for protecting physical and virtual machines, while 36% of us use a VM-specific solution. But that isn’t the whole story. When we dig into what they’re using and what do they anticipate using in the future, we see some really interesting shifts. Twenty-five percent, one in four organizations use a unified solution and they like it. Thirty one percent, nearly a third, use a unified solution today, but they anticipate moving to a VM-specific solution in the future. Nine percent, not quite 1 in 10, use a VM-specific solution and they plan on keeping it while another 17% use a VM-specific solution, but they’re planning on switching to a unified approach. Again, more folks changing than staying.

Another way to look at this is to say that only a third, 34%, are planning on staying with whichever method they’re currently using today, meanwhile nearly half of folks plan on switching from one approach to the other. And then there’s the 18% of organizations that are open to either. So let’s map this trend over time. When ESG asked this question in 2013, about three fourths of you used a unified solution, while 27% used a VM-specific one, about a three to one ratio. In 2015, it’s about a two to one ratio with 64% using a unified approach, 36% VM-specific. Fast forward two years: we asked folks what they anticipated doing in 2017, and it’s nearly even. Forty-two percent plan on a unified solution, 40% on VM-specific, and then there’s the 18% that’s undecided. But even if all of that undecided 18% went with a unified solution, unified would only be at 60%, which is still a lot of share. But it’s much more likely that around summer 2016, those lines will cross and more folks will be using VM-specific technologies instead of unified solutions for protecting and recovering their VMs.

2016 promises to be interesting. As mentioned earlier, this could be the year that VM-specific becomes the majority method, but the unified vendors could really shift this momentum. Here is how.

Functionality wise. With hypervisor APIs today, everyone can get a good backup. That isn’t the goal line anymore. The real question is, how fast is your restore? Followed by, how granular is that restore and where can I restore to? Like the cloud. If you don’t have good answers for those questions in 2016, then you really are the definition of legacy.

Upgrades. One of the big reasons that larger enterprises are moving towards VM-specific solutions is because their backup methods are intentionally laggards. They’re staying a version or more behind in order to have less bugs in the backup product. And as such, the older backup solution really isn’t as adequate for VM backups. So unified vendors, you got to drive those upgrades. You need better marketing, more in-place upgrade technology, better compelling features to ensure that your customers stay current.

Marketing. You don’t have to fragment your whole content strategy but if your unified product can stand toe to toe with a VM-specific product, then some percentage of your marketing should articulate and demonstrate and prove that. This is such an interesting topic.

ESG will be kicking off a new research project on the trends in protecting highly virtualized environments in early 2016, including VMware, Hyper-V and OpenStack strategies, as well as how the methods (backup, snapshots, replication) might be affected during private, hybrid, and public cloud implementations. You can take a look at our 2013 protection of private clouds report and our 2015 virtualization protection report to get an idea of what we’re looking at next. I’m Jason Buffington for ESG. Thanks for watching.

[Originally blogged via ESG’s Technical Optimist.com]

Mix and match data protection methods

There’s a debate going on about data protection methods right now, and it centers on “disk, tape or cloud.” Of course, cloud is a delivery mechanism, not a medium, and cloud providers are using disk and near-line tape, too.

But what kinds of media and architectures are organizations really using for backup? According to ESG research, the most common approach continues to be disk-to-disk-to-tape (D2D2T). In other words, backing up from primary disk; then leveraging secondary disk for efficient deduplication and fast, granular recovery; and finally employing tertiary tape for long-term retention. Twenty-six percent of the organizations surveyed by ESG said they use this data protection method.

Another 16% go a different route: disk-to-disk-to-cloud (D2D2C). They make sure they’ll have rapid/granular recovery if needed via the secondary disk, and they send data to the cloud for tertiary retention as well.

The tradeoff between the two data protection methods is typically related to regulatory compliance — the amount of data an organization must retain for long periods will likely influence its cloud vs. tape long-term retention decisions.

Still another 17% opt for disk-to-tape-to-tape — tape for on-site recovery, with some tapes going off-site for retention and BC/DR preparedness. Interestingly, it appears the second most common method of data protection in use today (at least among the respondents surveyed) is an all-tape system.

However, the remainder of the surveyed organizations went with “all-disk” data protection methods:

  • Disk to disk for fast on-premises protection but no tertiary or off-premises retention (14%)
  • Disk to disk across a WAN to another disk (11%)
  • Disk across a WAN to another disk, but making centralized backups of branch/remote offices back to the main data center (6%)
  • Disk to tape — i.e., traditional tape backups (6%)
  • Disk to cloud (4%)

A case could be made for any of these data protection methods. And that’s the point. An organization’s recovery goals, retention requirements and economic priorities all should factor into the decision.
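As a way to see how a few of the surveyed approaches differ only in which medium fills each tier, here is a minimal, hypothetical sketch; the architecture labels, tier tuples, and the meets_goals check are my own simplification of the survey categories, not part of the ESG research itself.

```python
# Hypothetical mapping of a few surveyed architectures to (recovery tier, retention tier).
# The recovery tier is what day-to-day restores come from; the retention tier is where
# long-term and/or off-site copies live (None means no tertiary copy at all).
ARCHITECTURES = {
    "D2D2T": ("disk", "tape"),    # secondary dedupe disk, tertiary tape (26%)
    "D2D2C": ("disk", "cloud"),   # secondary dedupe disk, tertiary cloud (16%)
    "D2T2T": ("tape", "tape"),    # tape on-site, tape off-site (17%)
    "D2D":   ("disk", None),      # fast on-premises protection, no tertiary copy (14%)
    "D2C":   ("cloud", None),     # backups go straight to cloud storage (4%)
}

def meets_goals(architecture, needs_fast_restore, needs_long_retention):
    """Crude illustration of the decision: fast restores favor a disk recovery tier,
    and long retention requires some tertiary copy (tape or cloud)."""
    recovery_tier, retention_tier = ARCHITECTURES[architecture]
    if needs_fast_restore and recovery_tier != "disk":
        return False
    if needs_long_retention and retention_tier is None:
        return False
    return True

print(meets_goals("D2D2T", needs_fast_restore=True, needs_long_retention=True))  # True
print(meets_goals("D2C",   needs_fast_restore=True, needs_long_retention=True))  # False
```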

I love cloud backups of remote office data, though, depending on the organization’s recovery needs, I can see sending the data back to headquarters instead. I also love cloud-based data protection for endpoint devices, as long as IT oversees the process. And I think the cloud is compelling to support BC/DR, where the organization not only uses the cloud for storage, but also leverages cloud-based compute to fail over if needed.

And I’m a fan of tape, especially for organizations in retention-regulated industries and those storing data for more than two years.

Still, disk ought to be the primary means of protection and recovery. That’s because it is difficult (if not impossible) to meet today’s SLAs without having a fast on-site copy to recover from.

ESG research tells a similar story:

  • Seventy-three percent of organizations use disk as their first tier of recovery. That number should rise somewhat in the future.
  • However, 69% of organizations use something other than disk in their overall strategy. Nearly half (49%) use tape and 20% are using the cloud in some fashion for data protection.
  • Twenty-three percent of the surveyed organizations exclusively use tape, and 4% exclusively use cloud. That tape number may recede somewhat, and the cloud number may rise — but not by much.

I expect some organizations that use tape exclusively will evolve to a disk-plus-tape model for efficiency and to help them meet tight SLAs. But those same goals will likely keep pure-cloud platforms from taking over. Disk, tape and cloud data protection methods all have their place in supporting rapid recovery, reliable retention and data agility/survivability (respectively).

If those are your goals, then perhaps all three approaches/media types should be part of your data protection strategy.

[Originally posted on TechTarget as a recurring columnist]

Cloud-Powered Data Protection — Definitions and Clarifications

We continue to see a great amount of interest in combining “data protection” and “the cloud” – but also a great deal of confusion, in that there is no such thing as “the cloud.”

It is a misnomer to discuss data protection media as having three choices (tape, disk or cloud) because cloud services are not actually a media type; they are a consumption vehicle, whereby you trade CapEx for OpEx, paying a service provider to store your data on its disk and tape instead of yours.

It is also a misnomer to call vendors whose solutions utilize cloud services “cloud vendors.” They are smart companies whose products are compatible and/or integration-ready with cloud-based services (with widely varying degrees of finesse), which is great! But they aren’t cloud vendors themselves; they enable customers to leverage cloud services as part of a broader solution.

But if we look at cloud-based services that intersect with various data protection and availability initiatives, there really are at least six “data protection plus service model” scenarios that are interesting and worth investigating. I recently defined these for ESG’s upcoming Data Protection Cloud Strategies research project, so they are offered here for your consideration:

  • Managed backup services — third-party monitoring and management of your backup solution to provide expertise and oversight, regardless of whether the backup solution is on-premises or cloud-based.

    I firmly believe that much of the dissatisfaction with your current backup solution would be alleviated if you contracted out the management and monitoring of your solution to experts in data protection. This is especially true when you have more than one data protection solution within an environment.

There are three distinct ways to protect data TO the cloud (BaaS, DRaaS, and STaaS/dp):

  • Backup-as-a-service (BaaS) — a third-party service that includes software to back up data into a cloud-based repository, typically paid for using a capacity-protected model. Along with the software/service, it may or may not also utilize an on-premises caching appliance or other onsite storage device for faster recovery, but the primary solution design is to ensure the data is stored off-site via the Internet.
  • Disaster recovery-as-a-service (DRaaS) — a cloud-based service, which may or may not utilize an on-premises appliance (e.g., for failover or network extension), that provides orchestrated, cloud-based compute, storage and networking to enable virtualized servers and services to resume functionality within a hosted cloud service, instead of within a self-managed data center.

    One key delineation between BaaS and DRaaS is that a backup service typically operates much like a backup application, using scheduled jobs that transmit and transform data into a repository for long-term retention, from which the data must be “restored.” A DR service, by contrast, will typically replicate (instead of back up) data on a recurring basis, with relatively little ‘transformation,’ thus enabling the replicated servers to boot or otherwise resume operation from the alternate location. (A minimal sketch of this difference follows below.)

  • Storage-as-a-service (STaaS/dp) — leveraging cloud-based storage as a tertiary repository and supplement to an on‑premises traditional data protection solution (STaaS/dp), so that traditional backups and recoveries occur onsite before the data is replicated to the cloud for longer-term retention and offsite protection. This is what folks who debate “tape, disk, cloud” are talking about as media choices.

The main distinction between STaaS/dp and either BaaS or DRaaS is that an outside backup/archival application interfaces with the production systems; the cloud storage is simply a supplemental repository, rather than additional self-managed tape or disk systems.
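To make the backup-versus-replication delineation above concrete, here is a minimal, hypothetical sketch. The function names are mine and zlib merely stands in for whatever deduplication/compression a real backup engine applies; the point is that the backup copy must be un-transformed before it is usable, while the replica stays in a near-original, immediately usable state.

```python
import zlib

def backup_to_repository(server_image: bytes) -> bytes:
    """BaaS-style flow: a scheduled job transforms (optimizes) data into the repository."""
    return zlib.compress(server_image)

def restore_from_repository(stored: bytes) -> bytes:
    """The stored copy is not directly usable; it must be 'restored' (un-transformed) first."""
    return zlib.decompress(stored)

def replicate(server_image: bytes) -> bytes:
    """DRaaS-style flow: the replica stays in (near) original state, so it costs more capacity
    but can boot or otherwise resume operation immediately at the alternate location."""
    return server_image

vm = b"bootable disk image of a production VM " * 100

stored = backup_to_repository(vm)     # storage-efficient, but not directly runnable
assert restore_from_repository(stored) == vm

replica = replicate(vm)               # immediately usable, at roughly original capacity
assert replica == vm
```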

In addition, there are at least two ways to protect data that is IN the cloud:

  • Software-as-a-service (SaaS) — including Office 365, Google Apps, and Salesforce, whereby cloud‑based production services are used in lieu of traditional on-premises, data center‑centric servers; presumably with native resiliency between intra-cloud points of presence, but without data retention or previous version capabilities. Key point: even if SaaS platforms are assumed to be resilient to outage, your SaaS data still has to be backed up (by you).
  • Infrastructure-as-a-service (IaaS) — utilizing compute (VMs) and storage that is running within a third-party cloud platform or within a hybrid architecture, in lieu of or in complement with self‑managed servers within a more traditional datacenter. The key data protection consideration for Hybrid or IaaS scenarios is where/how does the organization back up the cloud-based data within a hosted environment — and should that data be stored within the same cloud, a different cloud, or back to the organization’s self-managed facilities?

Let me know what you think about the definitions. And before someone asks, they are not mutually exclusive – so yes, your solution may fit more than one category; but hopefully in function, not just in marketing.

[Originally blogged via ESG’s Technical Optimist.com]

The first thing to agree on in data protection modernization

There are many fundamental debates in data protection:

  • Disk vs. tape vs. cloud
  • Backups vs. snapshots vs. replication
  • Centralized backup of ROBOs vs. autonomous backups vs. cloud-BaaS solutions
  • Unified data protection vs. workload-specific (e.g. VM/database) methods

On any given day, I could argue either side of any one of them (for fun), while adamantly insisting that these choices are neither mutually exclusive nor definitively decidable with a unilateral best choice. Candidly, every one of those choices is best resolved with “it depends,” and usually the right answer is “and, not or.”

There truly is only one argument that really does require definitive alignment and consensus when discussing data protection modernization: “What are we solving for?”

But as it turns out, even that simple question often yields starkly different answers, depending on whether you talk to the IT professionals responsible for data protection, the IT professionals responsible for production workloads, or the IT and corporate leadership executives. Here’s a video to try to clear the confusion:

Fundamentally, if those that are implementing and those that are paying for (and depending on) any aspect of IT do not agree, neither side is likely to be satisfied. And frankly, the broader organization suffers in that scenario.

In the case of data protection, if executive leaders are looking for increased reliability or speed, it is a safe bet that they don’t believe that the current solution is reliable or fast enough. But if IT implementers are instead focused on operational or functional scenarios, this can be lost. More importantly, if IT implementers bring up new functionality through purchase requests, but the new solution doesn’t also assure increased reliability, agility or speed (what the executives want to pay for), then the entire project likely fails — if it is even purchased at all.

At the end of the day, the argument, “what implementers need vs. what executives want” turns out to be just like the other great data protection debates listed above. The right answer is and, not or. But until all of the stakeholders align on that, the technology decisions are irrelevant.



Video transcript:

Woman: The following is an ESG video blog.

Jason: Hi, I’m Jason Buffington. I’m the Senior Analyst at ESG covering data protection. One aspect of IT that everybody seems to agree on is that we need better data protection. What they don’t always agree on is how. Let’s unpack that.

In ESG’s latest annual IT spending intentions report, improving data backup has actually been in the top three for the past four years running. This is not a new struggle, but something folks continue to deal with. It’s also interesting to see not only the tactical side of data protection within the top 10, but also the more strategic side within business continuity and SaaS recovery. Over those same four years, we’ve often seen improving data backup and increasing use of server virtualization either adjacent to each other or within a percentage point. And that last one is a big lesson. When you modernize production, you have to modernize protection. Virtualization’s a great example. SaaS is also a good one. But even just scaling up what you already have will likely show inadequacies in legacy approaches to data protection.

Maybe you’ll modernize protection reactively, because your legacy approach is hindering those new production systems. Maybe you’ll modernize protection proactively, because you’re smart like that and you understand that evolving production systems will assuredly need better protection than your legacy solution provided. But let’s just agree, when you modernize production, you will modernize protection. But how? That is actually the challenge, because it turns out that folks are way out of alignment on this.

When ESG looked at what were the top protection challenges that folks were struggling with, we saw the costs, complexity, and workloads as top of mind. But when ESG asked what were the data protection mandates from IT leadership, we saw something much different. Increasing reliability of backups and recoveries, increase the speed and agility of recoveries, increasing the speed and frequency of backups. I’ve never known execs to say that they want something more reliable, unless they thought what they had was unreliable. If they’re asking for more speed and agility for recoveries, it’s a safe bet they believe recoveries are not fast enough or agile enough today. Same thing with speed and frequency of backups.

But the funny thing is, those IT mandates for better data protection don’t seem to address what the implementers said they were struggling with. And there’s more. When IT professionals were asked what the top considerations for choosing a new backup vendor were, we see encryption and cloud and TCO and solution scenarios again. Those are all good things, but they don’t align with what IT leadership wants – reliability and agility. All of those cost savings and security enhancements don’t count if you can’t restore your data.

So here are three things that I hope you do after this video. Number one, plan for a hybrid approach to data protection. Data protection is more than a synonym for backup, because backup alone isn’t enough. To accommodate the kind of agility the business units need, you really ought to be complementing your backups with snapshots and replication, and probably also archiving for long-term retention and grooming off that stagnant data.

Number two, spend more time in a conference room than you do staring at a backup UI dashboard. You’ve got to talk to these other groups and get aligned. The gaps I shared earlier are just between data protection professionals and IT leadership. Add in some DBAs, some vAdmins, legal and compliance, IT operations, and a few business unit owners who rely on the data, and you hopefully get my point. Spend more time in a conference room talking and listening about data protection.

And number three, balance operational and economic goals against the business units’ IT dependency. That’s a fancy way of saying the cost of the solution should not be more than the business impact of the problem. And when you take a real look at the business units, you’ll find that not everyone has, or needs, the same SLAs. Not everyone’s data should only be on disk. In fact, if you take a fresh look you’ll see where tape fits, where cloud fits, which clouds fit and which don’t, etc.

In retrospect, I probably gave those to you in the wrong order. Start with understanding the needs of the business units and the platform owners. See how you already are going to be in the conference room more? Then, go to a different conference room with the other tech folks to figure out how you’re going to deliver the levels of IT resiliency the business users are asking for. And then, and only then, are you really ready to start looking at the technologies that you’ll need.

[Originally blogged via ESG’s Technical Optimist.com]
