CommVault Blog - CommVault on Cloud Computing
 
 
 



  About the Author

As the Senior Director of Product Marketing and Business Development Jeff is focused on building CommVault's cloud solutions and partner ecosystem to extend customer value using Simpana software.

Prior to joining CommVault, Jeff spent six years at Dell where he worked in a variety of roles including outbound marketing, PowerVault product management and analyst relations.

Jeff has his bachelor's degree in mechanical engineering and an MBA from the University of Texas at Austin.




 

Moving Large Volumes of Data to the Cloud – Part 1

Wednesday, May 30, 2012

Guest post by Jeanna James

More and more I find myself on customer calls where a client wants to move 50TB or so of data to the cloud. In theory, moving data to the cloud should be straightforward, but in reality there are important issues to consider first. Reality, of course, consists of lots of ones, zeros, and some basic math. I'm no math whiz, so I decided to write this blog to help others like myself calculate the cloud equation. When talking with these clients, I ask the following questions:

  • How much bandwidth do you have?
  • How much data do you have?
  • Do you ever want to restore your data?
  • How long is acceptable for the restore process to take place?
  • Does your cloud provider allow you to seed the cloud (i.e. an option whereby a disk drive or other appliance is sent to customers, who back up their data locally and then ship the drive/appliance back to the provider)?
  • Does your cloud provider have servers in the cloud so you could potentially restore data directly from the cloud storage?

Let's walk through two examples of customers planning to move 50TB of data to the cloud. One customer has a T1 line and the other is blessed with a fiber optic OC3 network. When I tell someone that writing data into the cloud can take months, I'm not joking. I'm also not good at doing spontaneous math over the phone or on the whiteboard. I should also confess right now that this is a three-part blog, and in this part we're only going to cover one solution to the problem I'm about to expose.

Example 1: Customer with T1

  • 50TB of data over 1.544 Mbps pipe (otherwise known as a T1 line) = 79,124 hours or approximately 3,297 days
  • First backup with typical 57% dedupe, 50TB of data over a T1 = 33,549 hours or approximately 1,398 days
  • Subsequent backup with typical ongoing 90% dedupe ratio, 50TB of data over a T1 = 8,387 hours or approximately 349 days. This represents an estimated 10% daily change rate with a 90% dedupe ratio. Obviously, in this example the daily changes wouldn't make it to the cloud provider in a reasonable time.
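
For anyone who wants to redo this math for their own numbers, here is a minimal Python sketch of the raw transfer calculation. It assumes a "TB" of 2^40 bytes and a link running at its full rated speed with no overhead, because those assumptions reproduce the 79,124-hour figure above; the function name and the `reduction` parameter are mine, not part of any CommVault tool.

```python
TIB = 2 ** 40  # bytes per terabyte, binary convention (assumption that fits the raw figure above)


def transfer_hours(data_bytes, link_mbps, reduction=0.0):
    """Hours needed to push data_bytes through a link of link_mbps megabits per second.

    reduction is the fraction of data removed before it hits the wire
    (for example 0.57 for a 57% reduction). Assumes the whole link is available.
    """
    bits_on_wire = data_bytes * 8 * (1.0 - reduction)
    seconds = bits_on_wire / (link_mbps * 1_000_000)
    return seconds / 3600


raw = transfer_hours(50 * TIB, 1.544)   # 50TB over a T1, no data reduction
print(round(raw), round(raw / 24))      # 79124 3297
```

Passing `reduction=0.57` gives roughly 34,000 hours, close to but not exactly the 33,549 quoted above, so treat the sketch as a planning estimate rather than a reconstruction of the exact figures in this post.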

Example 2: Customer with OC3

  • 50TB of data over OC3 = 788 hours or approximately 32 days
  • First backup with 57% dedupe, 50TB of data over OC3 = 334 hours or approximately 14 days
  • Subsequent backup with 90% dedupe ratio, 50TB of data over OC3 = 84 hours or 3.5 days. Again, this represents an estimated 10% daily change rate with a 90% dedupe ratio. While closer, the cloud provider still wouldn't receive daily data within a 24-hour window. If the window for delivering backup or archive data is less than 24 hours, the challenge becomes even bigger.
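
The same numbers can be turned around to ask how much bandwidth a given window would need. The sketch below simply scales a known transfer time linearly to a target window. The 155 Mbps rate for OC3 is my assumption (it is consistent with the 788-hour figure above), and the function name is hypothetical.

```python
def required_mbps(link_mbps, transfer_hours, window_hours=24.0):
    """Link speed needed to finish in window_hours.

    Scales linearly from a transfer that takes transfer_hours on a link of
    link_mbps, so it inherits whatever assumptions produced that figure.
    """
    return link_mbps * transfer_hours / window_hours


# The 84-hour ongoing backup over OC3 (assumed 155 Mbps), squeezed into 24 hours
print(required_mbps(155, 84))  # 542.5
```

In other words, under these assumptions the OC3 customer would need about 3.5 times the bandwidth to land each day's data inside a 24-hour window.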

Now for more fun facts about the above calculations – they assume a pristine environment with minimal WAN overhead and no other network load. What, you use those lines for Internet, email, and IP phones? Well, you can see where that is going to be a problem.

Below is an eye chart that breaks down 10TB of data based on these same assumptions.

Type      Effective Bandwidth (Mbps)   Effective Throughput incl. Protocol (MB/s)   Time to Transfer 10 TB
T-1       1.54                         0.17                                         1.82 Years
T-3       45                           5.05                                         23 Days
OC-1      52                           5.85                                         20 Days
OC-3      156                          18                                           6 Days
OC-12     622                          72                                           2 Days
OC-24     1,244                        143                                          1 Day
OC-48     2,488                        287                                          9.69 Hours
OC-192    9,952                        1,146                                        2.42 Hours
OC-255    13,210                       1,522                                        1.83 Hours
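
If your line isn't in the chart, the last column is easy to recompute from the effective MB/s figure. This sketch assumes decimal units (1 TB = 1,000,000 MB), which is what the chart's times appear to use; the names are illustrative only.

```python
MB_PER_TB = 1_000_000  # decimal units; an assumption inferred from the chart's figures


def hours_to_transfer(terabytes, mb_per_second):
    """Hours to move `terabytes` of data at a sustained rate of mb_per_second."""
    seconds = terabytes * MB_PER_TB / mb_per_second
    return seconds / 3600


# Effective MB/s values taken from the chart above
for name, rate in [("OC-48", 287), ("OC-192", 1146), ("OC-255", 1522)]:
    print(name, round(hours_to_transfer(10, rate), 2))
```

For these three rows the loop prints 9.68, 2.42 and 1.83 hours, which matches the chart to within rounding of the MB/s column.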

So why am I, a senior manager of cloud business development, being a Debbie Downer and pointing out these cloud-based challenges? Writing data into the cloud is just one step. Getting the data back, especially if it is backup or archive data, is even more critical if the cloud is going to be the storage target for your company in the event of a disaster. This drives home the importance of other questions I ask about the cloud providers our customers are considering: Does your cloud provider allow you to seed the cloud? Does your cloud provider have servers in the cloud so you could potentially restore data directly from the cloud storage? If you have a disaster, will your cloud provider ship hard drives with your data back to your site for recovery? The answers to these questions are critical.

With the advent of massive, 50TB cloud solutions, how are customers overcoming the above math challenge? Today we'll cover one potential solution to this problem. In this example, the customer is working with a cloud provider who: 1) allows customers to seed data in the cloud and 2) bases their compute cloud on VMware in order to take full advantage of physical-to-virtual (P2V) technologies. There are several advantages to this type of cloud solution.

First, seeding the cloud gives customers the ability to ship media to the cloud provider and then merely send changes over the wire. This significantly reduces the bandwidth required by leveraging integrated data reduction technologies such as source-side deduplication and compression.
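
To make "merely send changes over the wire" a little more concrete, here is a generic sketch of source-side deduplication with compression. It is not how Simpana implements it; the fixed 4 KB chunk size, the SHA-256 fingerprints, and the function name are all assumptions chosen for illustration.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; real products may chunk differently


def chunks_to_send(data, known_hashes):
    """Return (fingerprint, compressed_chunk) pairs the target has not seen yet.

    known_hashes is the set of fingerprints already stored at the target,
    for example after seeding. It is updated in place as new chunks are queued.
    """
    to_send = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        if fingerprint not in known_hashes:
            known_hashes.add(fingerprint)
            to_send.append((fingerprint, zlib.compress(chunk)))
    return to_send


seen = set()
seed = b"A" * CHUNK_SIZE + b"B" * CHUNK_SIZE
print(len(chunks_to_send(seed, seen)))     # 2: nothing is known yet, so both chunks go
changed = b"A" * CHUNK_SIZE + b"C" * CHUNK_SIZE
print(len(chunks_to_send(changed, seen)))  # 1: only the changed chunk crosses the wire
```

The first call stands in for the seeded copy; after that, only chunks with unseen fingerprints are compressed and queued, which is where the bandwidth savings come from.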

Second, in the event of a disaster, the customer has the ability to perform recoveries into the compute side of the service provider's cloud. Not only can they perform these recoveries, but because the service provider is running VMware, they can leverage P2V technologies (in CommVault's case, our Virtualize Me functionality) to automatically recover a physical server into a VM. In addition, organizations should ask their cloud provider if they can access tools to multi-stream data to the cloud and maximize bandwidth, especially for those customers who have a dedicated 1Mbps pipe. CommVault provides this feature in Simpana 9. This is one scenario that can substantially speed things up for an organization seeking to move large volumes of backup or archive data into the cloud. I've heard from customers who use modern data management features such as dedupe, compression and multi-streaming to cut the time it takes them to move large volumes to the cloud from months to days or even hours.

Planning such an implementation is critical to the success of the project both for getting data into the cloud and for testing restore objectives. In my next post we'll look at shipping hardware and when physically shipping a device might be the best option for a customer.

Learn more by watching our Virtualize Me demo on YouTube.

Jeanna James is senior manager of cloud business development at CommVault.



 

The content of this blog reflects the thoughts and opinions of the author, and does not represent the thoughts, opinions, plans or strategies of CommVault Systems, Inc. ("CommVault") and CommVault undertakes no obligation to update, correct or modify any statements made by the author of this blog. Any and all third party links provided by this blog are not affiliated with, nor endorsed by, CommVault.

 
