
CommVault Blog - Modern Data Protection
 
 
 



About the Author

Greg White is a Senior Manager of Product Marketing, focusing on Data Management and Protection. He has worked in the technology industry in brand and product marketing for the last decade, helping organizations of all sizes, verticals and geographies solve their IT challenges. Over the last six years, Greg has gained specific knowledge and experience in the enterprise application and storage areas, where he has helped businesses find solutions for their data growth, data management, and data protection problems.

Jeff Dorr is a Senior Manager currently focused on data protection product marketing for CommVault. Over the past 10 years, Jeff has led efforts in several companies to develop, introduce, and manage market leading hardware, software, and services that really help users improve their data protection environment.




 

Dedupe Smarter, Not Harder with CommVault Simpana 4th Gen Deduplication

Tuesday, May 07, 2013

Guest post by Phil Curran

It's time to rethink your data deduplication strategy. Most environments using dedupe today suffer from resource bottlenecks, scale limitations, or both. This seems to be particularly true with the rapid spread of dedupe appliances. Don't get me wrong: a dedupe appliance is a useful tool that lets customers roll out the technology rapidly and see a quick benefit. But this strategy retains a fundamental reliance on hardware that can create other problems. In particular, as dataset growth continues to accelerate, these approaches ask you to throw more and more hardware at the problem. As a result, if you're deduping this way today, you may be overpaying, underperforming, or both. Read on to see how to remedy these problems.

Let's set the stage: on average, datasets today are growing at about a 40% annual rate. In other words, they are roughly doubling every two years, with no end in sight. Other industry pundits forecast even more aggressive data growth [1]. The point is, no matter which stat you choose to believe, rapid data growth is here to stay [2].
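To make the growth arithmetic concrete, here is a minimal Python sketch of compound growth at a constant annual rate. The 40% rate comes from the paragraph above; the 100 TB starting size and the function names are illustrative assumptions, not figures from any survey.

```python
import math


def doubling_time_years(annual_growth_rate: float) -> float:
    """Years for a dataset to double at a constant compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)


def projected_size_tb(current_tb: float, annual_growth_rate: float, years: float) -> float:
    """Projected dataset size after `years` of compound growth."""
    return current_tb * (1 + annual_growth_rate) ** years


if __name__ == "__main__":
    rate = 0.40  # growth rate quoted in the post
    print(f"Doubling time at {rate:.0%}: {doubling_time_years(rate):.2f} years")
    # Hypothetical 100 TB starting point, projected over five years.
    for year in range(6):
        print(f"year {year}: {projected_size_tb(100, rate, year):.1f} TB")
```

At 40% per year the doubling time works out to just over two years, which is where the "doubling every two years" rule of thumb comes from.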

With this in mind, CommVault® Simpana® 10 delivers 4th Generation data deduplication, engineered to meet the challenges of the continued explosion in data growth. The major productivity improvement in 4th Gen Dedupe is "Parallel Deduplication" technology, which gets your infrastructure working smarter rather than harder. The fundamental premise behind parallel deduplication is to deliver massively scalable and highly resilient deduplication - via a software-centric approach - designed for the largest datasets and most demanding business applications. It does this by applying a grid-based architecture to the dedupe database (DDB) and the media agent.

With grid architecture, Simpana 10 parallel deduplication federates multiple DDBs to present a single, very large deduplication pool for use by data protection jobs (clients and subclients). Figure 1 shows an example of a 2-node parallel deduplication pool. Using this type of architecture, we can scale deduplication capacity and throughput in a near-linear fashion to support very large dedupe workloads.
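As a rough illustration of the idea of several DDBs acting as one pool, the Python sketch below routes each block signature to one of several partitions by hashing. This is a generic hash-partitioning technique and purely an assumption for illustration; the post does not describe how Simpana actually distributes signatures, and the class names (`DedupePartition`, `DedupePool`) are hypothetical.

```python
import hashlib


class DedupePartition:
    """Toy stand-in for one DDB node: remembers the block signatures it has seen."""

    def __init__(self, name: str):
        self.name = name
        self.signatures = set()

    def store_if_new(self, signature: str) -> bool:
        """Return True if the block is new to this partition, False if it is a duplicate."""
        if signature in self.signatures:
            return False
        self.signatures.add(signature)
        return True


class DedupePool:
    """Presents several partitions as one pool (hypothetical routing scheme)."""

    def __init__(self, partitions):
        self.partitions = list(partitions)

    def _route(self, signature: str) -> DedupePartition:
        # Assumption: pick a partition from the signature value, so the same
        # block always lands on the same partition.
        index = int(signature, 16) % len(self.partitions)
        return self.partitions[index]

    def write_block(self, block: bytes) -> bool:
        signature = hashlib.sha256(block).hexdigest()
        return self._route(signature).store_if_new(signature)


def demo():
    pool = DedupePool([DedupePartition("ddb-1"), DedupePartition("ddb-2")])
    blocks = [b"alpha", b"beta", b"alpha", b"gamma", b"beta"]
    unique_written = sum(pool.write_block(b) for b in blocks)
    return unique_written, len(blocks)


if __name__ == "__main__":
    unique, total = demo()
    print(f"{unique} unique blocks stored out of {total} written")
```

Because identical blocks always hash to the same partition, duplicates are still detected even though no single partition holds the whole index.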

Figure 1: Example of 2-Node Parallel Deduplication Pool Configuration

In this example (Figure 1), we have federated two deduplication nodes, each of which could individually protect up to 120 TB of front-end storage [3] at approximately 4.5 TB/hr throughput [4]. By federating the two nodes into a single dedupe pool, we can now manage deduplication of up to 240 TB of data at 9 TB/hr throughput.
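The pool figures above are simple multiples of the per-node numbers. A small Python sketch of that arithmetic follows; the per-node values are the ones quoted in this post, while the `scaling_efficiency` parameter is a hypothetical knob for modeling "near linear" rather than perfectly linear scaling.

```python
PER_NODE_CAPACITY_TB = 120           # front-end TB per node, as quoted in the post
PER_NODE_THROUGHPUT_TB_PER_HR = 4.5  # preliminary v10 throughput per node, as quoted


def pool_estimate(node_count: int, scaling_efficiency: float = 1.0):
    """Estimate pool capacity (TB) and throughput (TB/hr) for a number of nodes.

    scaling_efficiency is an illustrative assumption: 1.0 means perfectly
    linear scaling, lower values model overhead from federation.
    """
    capacity = PER_NODE_CAPACITY_TB * node_count * scaling_efficiency
    throughput = PER_NODE_THROUGHPUT_TB_PER_HR * node_count * scaling_efficiency
    return capacity, throughput


if __name__ == "__main__":
    capacity, throughput = pool_estimate(2)
    print(f"2-node pool: {capacity:.0f} TB front-end, {throughput:.1f} TB/hr")
```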

Beyond the large scale and throughput this approach delivers, we can also combine parallel deduplication with CommVault's unique GridStor® capability to deliver full load balancing and job failover options. If one node in the dedupe pool goes down, other nodes in the pool will immediately pick up the load to prevent any downtime.
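The load balancing and failover behavior can be pictured with a toy scheduler. The Python sketch below spreads jobs round-robin across whichever nodes are online; it is an illustrative assumption about the general pattern, not a description of how GridStor is implemented, and `MediaAgentNode` and `assign_jobs` are hypothetical names.

```python
class MediaAgentNode:
    """Toy stand-in for a node in the dedupe pool."""

    def __init__(self, name: str):
        self.name = name
        self.online = True


def assign_jobs(jobs, nodes):
    """Assign jobs round-robin across the nodes that are currently online."""
    online = [node for node in nodes if node.online]
    if not online:
        raise RuntimeError("no online nodes available in the pool")
    return {job: online[i % len(online)].name for i, job in enumerate(jobs)}


if __name__ == "__main__":
    nodes = [MediaAgentNode("ma-1"), MediaAgentNode("ma-2")]
    jobs = ["job-a", "job-b", "job-c", "job-d"]
    print("both nodes up:", assign_jobs(jobs, nodes))

    nodes[0].online = False  # simulate a node failure
    print("ma-1 down:    ", assign_jobs(jobs, nodes))
```

With both nodes up, the jobs alternate between them; when one is marked offline, every job is reassigned to the surviving node.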

There are a couple of caveats I want to address on parallel deduplication up front that customers should understand:

  1. As of this post, Simpana 10 supports two nodes in a parallel dedupe policy, although there is no hard limit to the number of dedupe nodes that can be federated. Customers can expect CommVault to keep raising the number of dedupe nodes supported in a single parallel dedupe policy.
  2. Parallel dedupe nodes need to be preconfigured up front in the Storage Policy. A single node cannot be converted to 2 nodes, and 2 nodes will not be convertible to 4 nodes, so there is still a need to plan ahead and architect the solution for growth.
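Since the node count has to be chosen up front, it helps to size against projected rather than current data. The Python sketch below combines the 40% growth rate and the 120 TB per-node figure quoted earlier in this post; the starting size, planning horizons, and function name are illustrative assumptions, and real sizing should follow the Books Online guidance.

```python
import math

PER_NODE_CAPACITY_TB = 120   # per-node front-end capacity quoted in the post
MAX_SUPPORTED_NODES = 2      # supported limit as of this post


def nodes_needed(current_tb: float, annual_growth_rate: float, years: int) -> int:
    """Nodes required to hold the projected front-end data after `years` of growth."""
    projected_tb = current_tb * (1 + annual_growth_rate) ** years
    return math.ceil(projected_tb / PER_NODE_CAPACITY_TB)


if __name__ == "__main__":
    # Hypothetical environment: 100 TB today, growing 40% per year.
    for horizon in (1, 2, 3):
        needed = nodes_needed(100, 0.40, horizon)
        note = "" if needed <= MAX_SUPPORTED_NODES else " (exceeds the current 2-node limit)"
        print(f"{horizon}-year horizon: {needed} node(s){note}")
```

In this hypothetical case a single node covers today's data, but a two-year horizon already calls for two nodes, which is the kind of gap the second caveat warns about.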

Creating a 2-Node Parallel Deduplication Storage Policy in CommVault Simpana 10

Finally, I want to provide a quick walkthrough of the highlights in configuring a parallel dedupe storage policy. This isn't the full step-by-step guide. For that, you can reference our Books Online: Create a new Storage Policy for Parallel Deduplication in Simpana 10:

Creating a new Storage Policy for Parallel Deduplication

Designate this Storage Policy for Parallel Deduplication (select Multiple Deduplication Databases option):

Designate this Storage Policy for Parallel Deduplication

Select Number of Deduplication Database Nodes (a.k.a. Partitions):

Select Number of Deduplication Database Nodes

And finally, finish off the Storage Policy Wizard to review your configuration:

Storage Policy Configuration Review

As you can see, the configuration process itself is fairly straightforward and wizard-driven. The trick is to size your DDB and media agent appropriately to handle the workloads. To get this sizing right, we typically recommend you work with your CommVault technical and services teams. But to give you an idea and direction for this sizing, check out the Simpana 10 Dedupe Requirements and Sizing on Books Online.

Parallel dedupe is just one of several capabilities available in Simpana 10 that help you dedupe smarter rather than harder. Here are several more I'll address in future posts:

  1. Consolidate remote and central office deduplication in a single software-based architecture. You can leverage single node dedupe policies at the remote site. Then run DASH Copy operations to the central office using a parallel dedupe policy at the central site. Combining single and multiple node dedupe gives you the flexibility to right size the capabilities at each location based on the business need.
  2. Run incremental forever backups leveraging DASH Full. This drives a much smarter backup strategy with minimal impact on production servers and the network, and helps drive better infrastructure utilization. For example, with traditional weekly full and daily incremental backups, a node can only drive 20-25 TB of VM data; with incremental forever and DASH Copy, the same node can drive 40-50 TB of VM data.
  3. Holistically manage multiple dedupe pools, based on data type, from a single console. This ensures you are creating dedupe pools with the maximum dedupe benefits to optimize resource consumption.

Phil Curran (@PhilJCurran) is Director of Product Marketing for Infrastructure Solutions at CommVault.

[1] IDC Big Data Technology and Services 2012-2016 Forecast, January 2013
[2] 2013 Spending Intentions Survey, ESG, January 2013
[3] Protecting 120 TB requires SSD for the DDB store
[4] Throughput is a preliminary v10 metric; expect this number to go up during the life of v10


 

Introducing "Ask the Educator" Series to Address Questions, Share Successes

Monday, May 06, 2013

Guest post by Robert Brower

The landscape around how Education Services and products are accessed and used is undergoing a radical evolution. Functional expertise is now as much a by-product of "Googling" as it is of professional experience and knowledge. A 20-year study, conducted from 1986 to 2006, assessed the percentage of knowledge workers must retain to complete their work; researchers determined that there was a 7x DECREASE in the value of retained knowledge for employees over that time period. The change engine in this dynamic: the Internet. Fast forward seven years into the era of BYOD and it's clear that creating highly accessible "byte-sized" access to information assets is critical for employee success.


 

Adventures in Data – A Real Story from the Road

Wednesday, April 24, 2013

Guest post by Greg White

If you have any doubts that protecting data at the edge and providing employees with self-service access to their information, on any device, makes them more productive, then this true story is for you. I know it's true because I was there. Thankfully, CommVault has fully embraced the growing trend of empowering the mobile workforce and, as the industry slogan says, we "eat our own dog food" by deploying CommVault Edge to protect the data on our employees' laptops and desktops.


 

People-Centric IT: Empowering the Benefits of Mobility and BYOD

Wednesday, March 27, 2013

Guest post by Greg White

Hardly a day goes by where I don't see an article or email with the letters "BYOD" or the word "mobility" in it. What I've found talking to customers is that there is some merit to the volume of articles on these typically intertwined topics. Many tell me they are thinking about these issues, some are getting their feet wet, and a few are already all in, so I thought I'd throw in my 2 cents from the data management perspective.


 

DR 101: Prepare for Disaster Recovery Before the Storm Clouds Appear

Tuesday, March 19, 2013

Guest post by James Brissenden

The recent run of cold, snowy weather in my home state of Massachusetts reminded me that disasters come in many shapes and sizes and they can strike at any time of the year. While blizzards do not typically fall into the natural disaster category, they do have the potential to dramatically affect your business's access to critical data. Here in North America, it is more likely that a hurricane, flood, tornado, fire, or even an earthquake will test your disaster recovery plan.


 

The content of this blog reflects the thoughts and opinions of the author, and does not represent the thoughts, opinions, plans or strategies of CommVault Systems, Inc. ("CommVault") and CommVault undertakes no obligation to update, correct or modify any statements made by the author of this blog. Any and all third party links provided by this blog are not affiliated with, nor endorsed by, CommVault.

