UPDATED 12:51 EDT / AUGUST 14 2021

SECURITY

Rethinking data protection in the 2020s

Techniques to protect sensitive data have evolved over thousands of years, literally, but the pace of modern data protection is rapidly accelerating and presents both opportunities and threats for organizations.

In particular, the amount of data stored in the cloud, combined with hybrid work models, the clear and present threat of cybercrime, regulatory edicts and ever-expanding edge use cases, should put executives on notice that the time is now to rethink their data protection strategies.

In this Breaking Analysis, we’re going to explore the evolving world of data protection and share some data on how we see the market evolving and the competitive landscape for some of the top players.

The evolving world of data protection

Steve Kenniston, aka the Storage Alchemist, shared a story with us and it was pretty clever. Way back in 4000 B.C., the Sumerians invented the first system of writing. They used clay tokens to represent transactions. To prevent tampering with these tokens, they sealed them in clay jars to ensure that the tokens – that is, the data – would remain secure with an accurate record that was quasi-immutable and lived in a clay vault.

Since that time we’ve seen quite an evolution in data protection. Tape, of course, was the main means of data protection during most of the mainframe era and that carried into client/server computing, which really accentuated the issues around backup windows and challenges with Recovery Time Objective, Recovery Point Objective and recovery nightmares.

Then in the 2000s, data reduction made disk-based backup more popular and pushed tape into an archive, last-resort media. Data Domain Corp., then EMC Corp. and now Dell Technologies Inc. still sell many purpose-built backup appliances as a primary backup target.

The rise of virtualization brought more changes in backup and recovery strategies as a reduction in physical resources squeezed the one application that wasn’t underutilizing compute: backup. And we saw the rise of Veeam Inc., the cleverly named company that became synonymous with data protection for virtual machines.

The cloud has created new challenges related to data sovereignty, governance, latency, copy creep, expense and more.

But more recently, cybersecurity threats have elevated data protection to become a critical adjacency to information security. Cyber resilience to protect specifically against ransomware attacks is the new trend being pushed by the vendor community as organizations are urgently looking for help with this insidious threat.

Cloud and cyber as disruptors

The two major disruptors we’re going to discuss today are the rapid adoption of cloud and the escalating threats in cybercrime, especially as it relates to ransoming your data.

Every customer is using cloud, and 76% are using multiple clouds, according to a recent study by HashiCorp.

We’ve extensively covered the digital skills gap and the challenges this brings to organizations. It’s especially acute in the complicated world of cybersecurity and that is bleeding into backup, recovery and related data protection strategies.

Customers are building (or buying) abstraction layers to hide the underlying cloud complexity and essentially build out their own clouds. This is good in that it creates standards and simplifies provisioning and management. However, there is a downside in that by creating that layer, it makes things less transparent and creates other problems.

We see these challenges as fundamentally data problems that are accentuated by distributed cloud architectures. For example: ensuring fast, accurate and complete backups and recoveries; adhering to compliance and data sovereignty edicts; facilitating safe data sharing; managing copy creep; and ensuring cyber resilience while protecting privacy. These are just some of the issues these disruptors bring to organizations.

As it relates to cybersecurity, we’re all learning how remote workers are especially vulnerable and as clouds expand rapidly, data protection technologies are struggling to keep pace.

Public cloud is becoming the standard architectural construct

The chart below quantifies the worldwide revenue and growth of the big four hyperscale cloud vendors and underscores the rapid adoption of these platforms.

The so-called Big 4 will surpass $115 billion in revenue this year. That’s around 35% growth relative to 2020, when they generated a combined $86 billion. Notably, last year these four spent more than $100 billion in capital spending building out their clouds.
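As a quick sanity check on that growth figure, here is a minimal Python sketch of the arithmetic, using only the two revenue numbers cited above. The function name growth_rate is ours and purely illustrative.

```python
def growth_rate(prior: float, current: float) -> float:
    """Fractional growth from a prior-period value to a current-period value."""
    return current / prior - 1


# Combined Big 4 revenue figures quoted in the text, in billions of dollars.
prior_revenue_bn = 86.0
current_revenue_bn = 115.0

implied = growth_rate(prior_revenue_bn, current_revenue_bn)
print(f"Implied growth: {implied:.1%}")
```

The result comes out a little under 34%, which is consistent with the rounded "around 35%" cited above.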

We see this as a gift to the rest of the industry.

To date, the legacy vendor community has been defensive, but that narrative is starting to change as large tech companies such as Dell, IBM Corp., Cisco Systems Inc., Hewlett Packard Enterprise Co. and others see opportunities to build on top of this infrastructure.

Listen to how Michael Dell is thinking about this opportunity when questioned on theCUBE by John Furrier about the cloud.

Clouds are infrastructure, right? So you can have a public cloud, you can have an edge cloud, a private cloud, a telco cloud, a hybrid cloud, or multicloud, here cloud, there cloud, everywhere cloud cloud. Yes, they’ll all be there, but it’s basically infrastructure. And how do you make that as easy to consume and create the flexibility that enables… everything.

Michael nailed it in our view and it’s exactly the right message. The cloud is everywhere. You have to make it easy. And you have to admire the scope of his comments. We know this is an individual who thinks big, right? “Enables everything.” He’s basically saying that technology is at the point where it has the potential to touch virtually every industry, every person, every problem… everything.

The rise of the data protection cloud

Let’s discuss how this informs the changing world of data protection.

The digital mandate has dragged a data protection imperative along with it. A digital business is a data business and no longer can backup, recovery and data management as it relates to data protection be bolted on as an afterthought. Rather, it must be architected as a fundamental component of an overall cloud strategy.

For this segment, we’ve purposely borrowed the title of a recent book written by Snowflake Inc. Chief Executive Frank Slootman called the “Rise of the Data Cloud.” In this book, Slootman lays out his vision for building value on top of the hyperscaler gift and leveraging network effects to create new value for customers at massive scale. Snowflake is executing on this vision for database and data management and though much of this vision has yet to be delivered, we believe it’s one of the most powerful “North Stars” in the industry today.

Snowflake’s vision is an elegant and easy-to-understand application of Michael Dell’s cloud-everywhere comments: an on-prem/hybrid –> cross-cloud(s) –> edge strategy. We believe it serves as an excellent example that can be applied to the data protection market.

This vision hides the underlying complexity of the clouds and creates the same experience across all estates with automation and orchestration built into the data protection cloud. The data protection cloud provides a variety of services across any cloud (as well as on-premises), including backup and recovery for virtualized and bare-metal computing, any operating system, container data protection and a variety of other services.

It includes analytics that not only report but use machine intelligence to anticipate problems or anomalous behavior. And possibly it includes protections for personally identifiable information or PII.

The attributes of the data protection cloud are that it manages the underlying cloud primitives and exploits cloud-native technologies for performance, machine intelligence and lowest cost. It has a distributed metadata capability to track files, volumes and any organizational data, irrespective of location. And it fundamentally enables sets of services to intelligently govern data, in a federated manner, while ensuring integrity.

And it’s automated to help with the skills gap.

Connection to cyber recovery

As it relates to cyber recovery, air-gapped solutions must be part of the portfolio but managed outside the data protection cloud. In other words, the orchestration and management of the air-gapped data must also be gapped and disconnected.

This strategy is a cohort to and a complementary piece of cybersecurity regimes. But that is a complicated world and one in which technologies and processes can become messy.

The fragmented world of cybersecurity

In other words, you don’t want your data protection strategy to get lost in this mess.

This is a chart we often use to describe the complexity and sea of point products that has permeated the industry. So try to think about data protection strategy as a cohort or an overlay to your cybersecurity approach. Yes, this may create some overheads and integration challenges, which is why you’ll likely need a partner.

We see the rise of managed service providers and specialist service providers: not the public cloud providers, not your technology arms dealer, but rather managed service providers that have intimate relationships with customers, understand their business and specialize in architecting solutions to these difficult challenges.

Quantifying the data threat

A closer look at the risk factors that organizations face is a useful exercise. The chart below was shared with us by the Storage Alchemist. It’s based on a study that IBM funds with the Ponemon Institute, which is a firm that researches things such as the cost of breaches and has for years.

The chart shows the total cost of a typical breach within each dot on the Y axis and the frequency on the X axis in percentage terms. The two most frequent types of breaches are compromised credentials and phishing, which once again proves bad user behavior trumps good security every time.

The point here is that the adversary’s attack vectors are many. And specific technology companies specialize in solving these problems, often with point products, which is why the slide we showed earlier from Optiv looks so cluttered.

But this problem is top of mind today and that’s why we’ve seen the emergence of cyber recovery solutions from virtually all the major players.

Zero trust is a megatrend

Ransomware and the SolarWinds hack have made trust the No. 1 issue for chief information officers and chief information security officers.

We see major shifts in CISO spending patterns toward endpoint, identity & cloud. We see this in the Enterprise Technology Research data and in the stock price momentum of disruptors such as Okta Inc., CrowdStrike Holdings Inc. and Zscaler Inc.

Cyber resilience is top of mind and robust solutions are required. Several companies, including Dell, IBM and Veeam, along with virtually every other major player, are building cyber recovery solutions. It’s common, of course, for backup and recovery vendors to focus their solutions on the backup corpus because that is often a prime hacker target.

We believe there is an opportunity to expand the scope from just the backup corpus to all data in a more comprehensive data management strategy.

Many companies use a 3, 2, 1 or 3, 2, 1, 1 strategy: Three copies, two backups, one in the cloud and one air-gapped. This strategy can be extended to primary storage, copies, snaps, containers, data in motion and more.
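To make the rule concrete, here is a minimal Python sketch that checks an inventory of data copies against the 3, 2, 1, 1 strategy as worded above. The Copy class, the check_3211 function and the example inventory are hypothetical names of ours, not any vendor’s API, and we assume the cloud and air-gapped copies are counted among the backups rather than the primary copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of a data set (hypothetical model for illustration)."""
    name: str
    is_backup: bool           # False for the primary/production copy
    in_cloud: bool = False
    air_gapped: bool = False


def check_3211(copies):
    """Evaluate each part of the 3, 2, 1, 1 rule and return rule -> bool."""
    backups = [c for c in copies if c.is_backup]
    return {
        "three_copies": len(copies) >= 3,
        "two_backups": len(backups) >= 2,
        # Assumption: the cloud and air-gapped copies must be backups.
        "one_in_cloud": any(c.in_cloud for c in backups),
        "one_air_gapped": any(c.air_gapped for c in backups),
    }


inventory = [
    Copy("production-array", is_backup=False),
    Copy("cloud-object-store", is_backup=True, in_cloud=True),
    Copy("offline-vault", is_backup=True, air_gapped=True),
]

results = check_3211(inventory)
for rule, ok in results.items():
    print(f"{rule}: {'ok' if ok else 'MISSING'}")
print("3-2-1-1 satisfied:", all(results.values()))
```

Extending the strategy to primary storage, snaps or containers, as suggested above, would amount to running the same kind of check over a broader inventory than just the backup corpus.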

As we said earlier, many customers are increasingly looking to MSPs and specialists to help solve this problem because of skills gaps.

And the best practice is to physically and logically separate the orchestration and automation of the air-gapped solution.

Sizing up some of the major players

Let’s look at some of the ETR data on the competition. The chart below is a two-dimensional view with Net Score or spending velocity on the vertical axis and Market Share or pervasiveness in the ETR data set on the horizontal axis. Market Share is an indicator of response presence in the data, not revenue share.

This chart is a cut of the storage sector in the ETR taxonomy and isolates pure-play backup and recovery/data protection vendors. The 40% red line is our subjective view of excellence; in other words, anything over that line is considered elevated.

Note that only Rubrik Inc. is above the 40% line. Also note the red highlight is the position of Rubrik and Cohesity Inc. from the January 2020 survey.

Veeam, although it’s below the 40% mark, has been impressive and steady over the last several quarters and years.

Commvault Systems Inc. is moving steadily up: Sanjay Mirchandani is making moves and the Metallic offering appears to be driving cloud affinity within Commvault’s large customer base. The company is a good example of a legacy player evolving its strategy and staying relevant.

Veritas Technologies LLC continues to underperform relative to the other players in the ETR data set, as does Barracuda Networks Inc.

Large portfolio companies also play

The ETR taxonomy includes a total storage sector, not a backup and recovery view. So let’s add IBM and Dell to the chart, noting this comprises their entire respective storage portfolios, not just backup and recovery/data protection.

In this view we’ve also inserted the data table that shows the actual Net Score and Shared N data that inform the plot position. While Rubrik and Cohesity, for example, have smaller Ns, we feel there is enough data to track trends over time.

Veeam is impressive. Its Net Score has always been respectable in the mid-to-high 30% range over the last several quarters and years. It has solid spending momentum and a consistent presence in the data.

HPE SimpliVity has a small N but has improved its position relative to previous surveys.

The compute renaissance

We now want to emphasize something we’ve been hitting on for quite some time now, and that is the renaissance that’s coming in compute.

We’re all familiar with Moore’s Law, the doubling of transistor density every 18 to 24 months, which leads to a two-times performance boost in that timeframe. The blue line represents the x86 curve. The math averages out to around 40% per annum performance improvement as measured in trillions of operations per second. That figure is moderating for x86 and is now down to about 30% or so.
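The roughly 40% figure follows from compounding: a doubling every 24 months is the same as growing by the square root of two each year. The short Python sketch below works that out for both ends of the 18-to-24-month range. The function name annual_rate is ours, and we assume smooth, constant compounding, which is a simplification.

```python
def annual_rate(doubling_months: float) -> float:
    """Annualized growth rate implied by doubling every `doubling_months` months."""
    return 2 ** (12 / doubling_months) - 1


for months in (18, 24):
    print(f"Doubling every {months} months -> {annual_rate(months):.1%} per year")
```

A 24-month doubling works out to about 41% per year, while an 18-month doubling would imply about 59%, so the 40% average corresponds to the slower end of the range.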

The orange line represents the Arm Ltd. ecosystem improvements calculated from Apple’s custom-designed A series chips, culminating most recently in the A14, which is the basis for the M1 chip that replaced Intel Corp.’s in Apple’s laptops.

That orange line is accelerating at a pace of more than 100% per annum when you include the processing power of the central processing unit, graphics processing unit, neural processing unit and other alternative processors included in the chip.

The point is that there’s a new performance improvement curve and it’s being led by the Arm ecosystem.
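To get a feel for how quickly those two curves diverge, here is a small Python sketch that compounds the annual rates cited above. The five-year horizon and the assumption that each rate holds constant are ours, chosen only for illustration, and 100% is treated as a floor for the Arm curve since the text says "more than 100%."

```python
def cumulative_gain(annual_rate: float, years: int) -> float:
    """Total performance multiple after compounding `annual_rate` for `years`."""
    return (1 + annual_rate) ** years


YEARS = 5  # illustrative horizon, not a figure from the analysis

for label, rate in (("x86 at ~30%", 0.30), ("Arm ecosystem at 100%", 1.00)):
    print(f"{label}: {cumulative_gain(rate, YEARS):.1f}x after {YEARS} years")
```

Under those assumptions, the 30% curve compounds to roughly 3.7 times the starting performance over five years, while the 100% curve compounds to 32 times.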

Data protection must exploit future architectures

So what’s the tie to data protection? We’ll leave you with this chart below which shows Arm’s Confidential Compute Architecture.

We believe this architecture is ushering in a new era of security and data protection using the concept of realms.

Zero trust is the new mandate and what realms do is create separation of vulnerable components by creating a physical bucket to deposit code and data, away from the OS. Remember, the OS is one of the most valuable entry points for hackers because it contains privileged access. It’s also a weak link because of things such as memory leakages and other vulnerabilities. Malicious code can be placed by bad guys within data inside the OS and appear benign – even though it’s anything but.

So in this architecture, all the OS does is make application programming interface calls to the realm controller. That’s the only interaction with the data, which makes it much harder for bad actors to get access to the code and data. And it’s an end-to-end architecture, so there’s protection throughout.

The link to data protection is that backup needs to be the most trusted of applications because it’s one of the most targeted areas in a cyberattack. Realms provide an end-to-end separation of data and code from the OS and are a better architectural construct to support zero trust and confidential computing in critical use cases such as data protection/backup and other digital business applications.

Our call to action is: Backup software vendors, you can lead the charge. Arm is several years ahead at the moment in our view, so pay attention to that and use your relationships with Intel to accelerate its version of this architecture.

Or, ideally, agree on common standards for the industry and solve this problem together. Intel CEO Pat Gelsinger told us on theCUBE that if it’s the last thing he’s going to do in his life, he’s going to solve this security problem. Well, Pat, you don’t have to solve it yourself. You can’t and you know that. So while you’re going about your business saving Intel, look to partner with Arm to use these published APIs and push to collaborate and open source an architecture that addresses the cybersecurity problem.

If anyone can do it, you can.

Keep in touch

Remember we publish each week on Wikibon and SiliconANGLE. These episodes are all available as podcasts wherever you listen.

Email david.vellante@siliconangle.com, DM @dvellante on Twitter and comment on our LinkedIn posts.

Also, check out this ETR Tutorial we created, which explains the spending methodology in more detail. Note: ETR is a separate company from Wikibon and SiliconANGLE. If you would like to cite or republish any of the company’s data, or inquire about its services, please contact ETR at legal@etr.ai.

Image: TarikVision
