
About the Author
Greg White is a Senior Manager of Product Marketing, focusing on Data Management and Protection. He has worked in the technology industry in brand and product marketing for the last decade, helping organizations of all sizes, verticals and geographies solve their IT challenges. Over the last six years, Greg has gained specific knowledge and experience in the enterprise application and storage areas, where he has helped businesses find solutions for their data growth, data management, and data protection problems.
Jeff Dorr is a Senior Manager currently focused on data protection product marketing for CommVault. Over the past 10 years, Jeff has led efforts in several companies to develop, introduce, and manage market-leading hardware, software, and services that really help users improve their data protection environment.
The Modern Data Protection Experts' Archives
Tuesday, May 07, 2013
Guest post by Phil Curran
It's time to rethink your data deduplication strategy. Most environments using dedupe today suffer from resource bottlenecks, scale limitations, or a combination of both. This seems to be particularly true with the rapid spread of dedupe appliances. Don't get me wrong: a dedupe appliance is a useful tool that solves some important challenges customers may face when rapidly rolling out the technology to see a quick benefit. But this strategy retains the fundamental reliance on hardware that can create other problems. In particular, as dataset growth continues to accelerate, these approaches fundamentally ask you to throw more and more hardware at the problem. As a result, if you're deduping this way today, you may be overpaying and/or underperforming. Read on to determine how to remedy these problems.
Let's set the stage: on average, datasets today are growing at about a 40% annual rate. In other words, they are roughly doubling every two years, with no end in sight. Other industry pundits state even more aggressive data growth forecasts [1]. The point is, no matter which stat you choose to believe, rapid data growth is here to stay [2].
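To make the growth arithmetic concrete, here is a minimal Python sketch that checks the "doubling every two years" rule of thumb against a 40% annual rate. It assumes constant compound growth, and the function names are illustrative only.

```python
import math


def growth_factor(annual_growth_rate: float, years: float) -> float:
    """Size multiplier after `years` of constant compound growth."""
    return (1 + annual_growth_rate) ** years


def doubling_time_years(annual_growth_rate: float) -> float:
    """Years needed for a dataset to double at a constant compound rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)


if __name__ == "__main__":
    rate = 0.40  # 40% annual growth, the figure quoted in the post
    print(f"Growth after 2 years: {growth_factor(rate, 2):.2f}x")      # 1.96x
    print(f"Doubling time: {doubling_time_years(rate):.2f} years")     # 2.06 years
```

At 40% per year a dataset grows about 1.96x in two years, so "doubling every two years" is a close approximation rather than an exact figure.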
With this in mind, CommVault® Simpana® 10 delivers 4th Generation data deduplication, engineered to meet the challenges of the continued explosion in data growth. The major productivity improvement in 4th Gen Dedupe is "Parallel Deduplication" technology, which gets your infrastructure working smarter rather than harder. The fundamental premise behind parallel deduplication is to deliver massively scalable and highly resilient deduplication, via a software-centric approach, designed for the largest datasets and most demanding business applications. It does this by applying a grid-based architecture to the dedupe database (DDB) and the media agent.
With grid architecture, Simpana 10 parallel deduplication federates multiple DDBs to present a single, very large deduplication pool for use by data protection jobs (clients and subclients). Figure 1 below shows an example of a 2-node parallel deduplication pool. Using this type of architecture, we can scale deduplication capacity and throughput in a near-linear fashion to support very large dedupe workloads.

In this example (Figure 1), we have federated two deduplication nodes, each of which could individually protect up to 120 TB of front-end storage [3] at approximately 4.5 TB/hr throughput [4]. By federating the 2 nodes into a single dedupe pool, we can now manage deduplication of up to 240 TB of data at 9 TB/hr throughput.
Beyond the large scale and throughput this approach delivers, we can also combine the parallel deduplication approach with CommVault's unique GridStor® capability to deliver full load balancing and job failover options. If one node in the dedupe pool goes down, other nodes in the pool will immediately pick up the load to prevent any downtime.
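The pool figures in this example are simple sums of the per-node numbers. The short Python sketch below reproduces them. The `DedupeNode` type and the `scaling_efficiency` parameter are hypothetical and exist only for illustration: a value of 1.0 models perfectly linear scaling, while a slightly lower value would model the "near linear" case. It is not a sizing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DedupeNode:
    capacity_tb: float           # front-end data one node can protect
    throughput_tb_per_hr: float  # approximate throughput of one node


def pool_totals(nodes, scaling_efficiency=1.0):
    """Return (capacity_tb, throughput_tb_per_hr) for a federated pool.

    scaling_efficiency is an illustrative assumption: 1.0 means the pool
    scales perfectly linearly with the number of nodes.
    """
    capacity = sum(n.capacity_tb for n in nodes) * scaling_efficiency
    throughput = sum(n.throughput_tb_per_hr for n in nodes) * scaling_efficiency
    return capacity, throughput


if __name__ == "__main__":
    node = DedupeNode(capacity_tb=120, throughput_tb_per_hr=4.5)
    capacity, throughput = pool_totals([node, node])
    print(f"{capacity:.0f} TB at {throughput:.1f} TB/hr")  # 240 TB at 9.0 TB/hr
```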
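To illustrate the general idea of a federated pool with failover, here is a toy Python model. It is a sketch under stated assumptions, not a description of how Simpana 10 or GridStor is implemented: the `DedupePool` class, the SHA-256 block signatures, and the hash-modulo routing are all hypothetical choices made for the example.

```python
import hashlib


class DedupePool:
    """Toy model: several dedupe databases presented as one logical pool."""

    def __init__(self, node_names):
        # One signature store per node (a stand-in for a DDB partition).
        self.stores = {name: set() for name in node_names}
        self.online = {name: True for name in node_names}

    def _route(self, signature: str) -> str:
        """Pick an online node for a signature (assumed hash-modulo routing)."""
        live = sorted(name for name, up in self.online.items() if up)
        if not live:
            raise RuntimeError("no dedupe nodes online")
        return live[int(signature, 16) % len(live)]

    def write_block(self, data: bytes) -> bool:
        """Return True if the block was new and stored, False if deduplicated."""
        signature = hashlib.sha256(data).hexdigest()
        node = self._route(signature)
        if signature in self.stores[node]:
            return False
        self.stores[node].add(signature)
        return True

    def fail(self, name: str) -> None:
        """Mark a node as offline; later writes go to the surviving nodes."""
        self.online[name] = False


if __name__ == "__main__":
    pool = DedupePool(["ddb1", "ddb2"])
    print(pool.write_block(b"block A"))  # True: first time this block is seen
    print(pool.write_block(b"block A"))  # False: duplicate detected
    pool.fail("ddb1")
    pool.write_block(b"block B")         # still accepted by the surviving node
```

In this toy model, signatures held by a failed node are not visible to the survivors, so some blocks may be stored again after a failover. How a real product handles that case is outside the scope of the sketch.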
There are a couple of caveats I want to address on parallel deduplication up front that customers should understand:
Creating a 2-Node Parallel Deduplication Storage Policy in CommVault Simpana 10
Finally, I want to provide a quick walkthrough of the highlights in configuring a parallel dedupe storage policy. This isn't the full step-by-step guide. For that, you can reference our Books Online.
Create a new Storage Policy for Parallel Deduplication in Simpana 10:

Designate this Storage Policy for Parallel Deduplication (select Multiple Deduplication Databases option):

Select Number of Deduplication Database Nodes (a.k.a. Partitions):

And finally, finish off the Storage Policy Wizard to review your configuration:

As you can see, the configuration process itself is fairly straightforward and wizard driven. The trick is to be sure you have sized your DDB and media agent appropriately to handle the workloads. To help with this sizing, we typically recommend you work with your CommVault technical and services teams to get the configuration just right. But to give you an idea and direction for this sizing, check out the Simpana 10 Dedupe Requirements and Sizing on Books Online.
Parallel dedupe is just one of several capabilities available in Simpana 10 that help you dedupe smarter rather than harder. Here are several more I'll address in future posts:
Phil Curran (@PhilJCurran) is Director of Product Marketing for Infrastructure Solutions at CommVault.
[1] IDC Big Data Technology and Services 2012-2016 Forecast, January 2013
[2] 2013 Spending Intentions Survey, ESG, January 2013
[3] 120 TB requires use of SSD on the DDB store
[4] Throughput is a preliminary v10 metric; expect this number to go up during the life of v10
Monday, May 06, 2013
Guest post by Robert Brower
The landscape around how Education Services and products are accessed and used is undergoing a radical evolution. Functional expertise is now as much a by-product of "Googling" as it is of professional experience and knowledge. A 20-year study, conducted from 1986 to 2006, assessed the percentage of knowledge required by workers to complete their work; researchers determined that there was a 7x DECREASE in the value of retained knowledge for employees over that time period. The change engine in this dynamic: the Internet. Fast forward seven years into the era of BYOD and it's clear that creating highly accessible "byte-sized" access to information assets is critical for employee success.
Wednesday, April 24, 2013
Guest post by Greg White
If you have any doubts that protecting data at the edge and providing employees with self-service access to their information, on any device, make them more productive, then this true story is for you. I know it's true because I was there. Thankfully, CommVault has fully embraced the growing trend of empowering the mobile workforce and, as the industry slogan says, we "eat our own dog food" by deploying CommVault Edge™ to protect the data on our employees' laptops and desktops.
Wednesday, March 27, 2013
Guest post by Greg White
Hardly a day goes by where I don't see an article or email with the letters "BYOD" or the word "mobility" in it. What I've found talking to customers is that there is some merit to the volume of articles on these typically intertwined topics. Many tell me they are thinking about these issues, some are getting their feet wet, and a few are already all in, so I thought I'd throw in my 2 cents from the data management perspective.
Tuesday, March 19, 2013
Guest post by James Brissenden
The recent run of cold, snowy weather in my home state of Massachusetts reminded me that disasters come in many shapes and sizes and they can strike at any time of the year. While blizzards do not typically fall into the natural disaster category, they do have the potential to dramatically affect your business's access to critical data. Here in North America, it is more likely that a hurricane, flood, tornado, fire, or even an earthquake will test your disaster recovery plan.
The content of this blog reflects the thoughts and opinions of the author, and does not represent the thoughts, opinions, plans or strategies of CommVault Systems, Inc. ("CommVault") and CommVault undertakes no obligation to update, correct or modify any statements made by the author of this blog. Any and all third party links provided by this blog are not affiliated with, nor endorsed by, CommVault.