

Oceanspace VTL3500 White Paper

Issue 1.0
Date 2010-05-18

Huawei Symantec Technologies CO., LTD.


Copyright © Huawei Symantec Technologies Co., Ltd. 2009. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without
prior written consent of Huawei Symantec Technologies Co., Ltd.

Trademarks and Permissions

and other Huawei Symantec trademarks are trademarks of Huawei Symantec Technologies
Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their
respective holders.

Notice
The purchased products, services and features are stipulated by the commercial contract made
between Huawei Symantec and the customer. All or partial products, services and features
described in this document may not be within the purchased scope or the usage scope. Unless
otherwise agreed by the contract, all statements, information, and recommendations in this
document are provided “AS IS” without warranties, guarantees or representations of any kind, either
express or implied.

The information in this document is subject to change without notice. Every effort has been made in
the preparation of this document to ensure accuracy of the contents, but all statements, information,
and recommendations in this document do not constitute the warranty of any kind, express or
implied.

Huawei Symantec Technologies Co., Ltd.


Address: Building 1
The West Zone Science Park of UESTC, No.88, Tianchen Road
Chengdu, 611731
P.R.China

Website: http://www.huaweisymantec.com

Email: support@huaweisymantec.com
VTL3500 Product
Description

Contents

1 Executive Summary

2 Introduction
2.1 Background of the VTL Technology
2.2 De-duplication

3 Solution
3.1 Advantages of the VTL3500
3.2 Powerful Virtualization Capability
3.3 On-Demand Capacity Expansion
3.4 IP Replication
3.5 Tape Caching
3.6 Tape Encryption
3.7 De-duplication

4 Experience
4.1 Powerful Virtualization Capability
4.2 On-Demand Capacity Expansion
4.3 Application Scenario

5 Conclusion
6 Acronyms and Abbreviations

Huawei Symantec Proprietary and Confidential


1.0 (2010-05-18) Copyright © Huawei Symantec Technologies Co., Ltd. Page 3 of 25
1 Executive Summary

As data volumes grow rapidly and market competition intensifies, customers place
higher demands on the reliability and performance of data backup and recovery.
Traditional physical tape libraries can no longer meet these demands. With the
development of virtual storage technology and the emergence of SATA hard disks, the
virtual tape library (VTL) has become a mature and cost-effective type of data backup
device. A VTL uses disk arrays as the storage device and, through built-in
virtualization software, presents the disks as mainstream tape libraries. The VTL
combines the high reliability, high performance, and ease of management of disk
devices with the mature media management of tape devices. As a result, the VTL has
attracted growing attention.
Because SATA disks offer advantages in cost and performance, more and more users
adopt disk-to-disk (D2D) backup to build fast and reliable backup systems. As data
volumes soar, however, the capacity of the disk backup device tends to become
insufficient, and large amounts of duplicate data consume much of that capacity.
De-duplication technology emerged in response and has become popular in recent
years. De-duplication can greatly reduce the amount of data that needs to be stored.
It can also dramatically decrease the amount of data replicated between remote nodes,
which reduces bandwidth usage.
This document introduces some key technologies of the VTL and analyzes their value
for customers. These technologies include virtualization, on-demand capacity
expansion, multi-stream backup, FC/IP SAN backup, remote IP replication, tape
caching, and de-duplication.

2 Introduction

2.1 Background of the VTL Technology


2.1.1 Deficiencies of the Physical Tape Library
As informatization advances and data grows explosively, more and more users recognize
the importance of data protection and purchase tape libraries and data backup software
to build their own data backup systems. With tape libraries, users can manage media
comprehensively and use the backup software to automate backup. Tapes are easy to
preserve offline, and can be removed from the physical tape library and transported to
another site for remote disaster recovery. Users now find, however, that while the
automated data backup system brings convenience, it also poses new problems that
threaten the practicality of existing data backup solutions.

Reliability
Figure 2-1 shows the analysis of backup failures by IDC.

Figure 2-1 Analysis of backup failures

"What are the most common causes of a backup failure?"
Percent of all users (multiple responses accepted), N = 222

Cause               Percent of users
Media Failure       59%
Hardware Failure    53%
Human Error         47%
Software Failure    40%
Network Failure     32%
Other               3%
Don't Know          3%

A tape library consists of mechanical parts. The tape drive boasts hundreds of
thousands of hours of operating life, but it often becomes faulty within one or two
years in practical use. The robot of the tape library has a high failure probability. A
large proportion of the users of low-end and mid-range tape libraries suffer at least
one backup failure caused by a tape library fault. The tape library is vulnerable to
failures caused by the external environment, such as dust and moisture. Because so
many components are combined, the overall system availability is degraded.
The tape library is not fault tolerant. A single failure of the tape drive, tape slot,
robot, controller, barcode scanning system, or tape loading and ejecting device can
make the whole tape library run abnormally and can even bring down the whole backup
system. The low availability raises the maintenance cost. According to statistics, in
2002 the average yearly maintenance cost of tape libraries accounted for 10% to 15% of
the procurement cost. What frustrates users more is that tape libraries must be
repaired by professionals, and the long repair period disrupts daily operations. That
compels users to purchase multiple tape drives, which are the most expensive parts of
a tape library. As a result, users' total cost of ownership (TCO) increases.
To improve the reliability of tape-based storage, many users replicate tapes to keep
dual backups of data. This time-consuming and labor-intensive method brings extra
operating costs. Backup is not an objective in itself; it only counts when it can
ensure data recovery. The reliability of the backup media determines the reliability
of the backup data. Tapes are exposed to the air and vulnerable to electromagnetism,
dust, moisture, magnetic particles, sticking, and mold. Users sometimes find that
tapes are already damaged when they start data recovery.

Performance
As service requirements grow, each system requires a shorter backup window. The
performance bottleneck of tape devices lies in data reading and writing and also in
tape loading, which sometimes takes more time than the reading and writing itself. If
data on multiple tapes needs to be recovered, a complete system recovery takes a long
time. To back up more data in a shorter time, users need to install more tape drives
in their tape libraries. That means higher expenses, higher fault probabilities, and
further investment whenever the tape technology is updated. Moreover, the design of
the tape library limits the number of tape drives that can be added.
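
The relationship between backup window, drive throughput, and tape loading time can
be illustrated with a rough calculation. The following is a minimal sketch, not a
sizing tool: the function name, its parameters, and the 1 GB = 1024 MB convention are
assumptions made for illustration, and no figure in it comes from a real tape drive.

```python
import math


def drives_needed(data_gb, window_hours, drive_mb_per_s,
                  tapes_per_drive=0, load_seconds=0.0):
    """Estimate the minimum number of tape drives needed to meet a backup window.

    All inputs are hypothetical planning values supplied by the caller.
    """
    window_s = window_hours * 3600.0
    # Time each drive loses to loading and unloading tapes within the window.
    usable_s = window_s - tapes_per_drive * load_seconds
    if usable_s <= 0:
        raise ValueError("tape loading alone exceeds the backup window")
    # Amount of data one drive can stream in the remaining time (1 GB = 1024 MB).
    per_drive_gb = drive_mb_per_s * usable_s / 1024.0
    return math.ceil(data_gb / per_drive_gb)
```

Under this model, shortening the window or adding tape loads reduces the usable
streaming time per drive, so the drive count rises, which matches the cost pressure
described above.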

Scalability
On the one hand, the data amount keeps increasing; on the other hand, the expansion
space of the tape library is limited. If the user purchases a large tape library
(with over 200 slots, for example), the procurement cost is very high even if a
relatively low configuration is chosen.

Return on Investment
As the data amount increases, each system requires a shorter backup window. With
current backup systems, data backup and recovery take more and more time.
Consequently, users must increase the performance and capacity of their existing tape
libraries. The results, however, are higher hardware costs, more difficult media
management, higher software costs, higher fault probabilities, and higher maintenance
costs. Moreover, the return on investment falls because of the low utilization of
tapes and tape libraries, the high maintenance costs of tape libraries, and the short
lifecycle of tape drive technologies.


Eventually, users find that the investment in data protection exceeds expectations
while the return falls far short of them, and that the backup system itself increases
the workload of maintaining the whole storage system. This has become a common
problem for many organizations.

2.1.2 Disk-based Backup


Faced with the preceding problems, some users and consultants have turned their
attention to disk-based backup. As SATA disks become popular, large-capacity disks
offer a low price and high performance.
Against this background, backup solutions based on disk arrays have emerged. They are
implemented in the following ways:
• Employing disk arrays with standard FC, SAS, or iSCSI interfaces and connecting
high-capacity, low-cost SATA disks directly to the backup server
• Using NAS space for backup
• Adopting mainstream backup software that supports disk-based backup
This type of backup solution uses disks, formatted into file systems, as the storage
device. It solves many problems found in the tape-based solution:
• Eliminating the reliability limitations of the tape library and media
• Avoiding the effect of tape loading and unloading on performance (the sequential
read/write performance equals or exceeds that of mid-range tape libraries)
• Greatly increasing the utilization of storage space
• Simplifying maintenance and reducing the maintenance cost (disk arrays are common
and can be easily managed by administrators without specialist knowledge)
Theoretically, the investment is low, because the user only needs to purchase one
storage array. In practice, however, the user finds that the backup solution based on
disk arrays is not perfect. It is disadvantageous in the following aspects.
• Sharing
If the user implements LAN-free backup in a multi-server environment, the complexity
and cost of configuration increase.
Generally speaking, the backup software can identify and use a disk array only after
a file system is set up on it. Moreover, most file systems cannot be shared by
multiple servers, whereas a tape library can.
That is to say, if the user wants multiple servers to share the same storage array
over a SAN, as they would share a tape library, the user must set up multiple logical
devices in the storage array and assign one logical device to each backup server.
The user consequently faces a series of management problems:
− How to determine the number of disks assigned to each server?
− How to expand the capacity online when the allocated capacity is insufficient?


− How to reduce the capacity online when the allocated capacity is excessive?
Must this function be realized through expensive volume management software? Some
types of backup software support a backup storage pool, but they can only share data
within the same platform, not across platforms. Moreover, this function requires
additional data sharing options and supports a limited number of platforms. The
function still needs to be improved.
• Security
This type of storage device is simply based on disk arrays and appears as a file
system on the server. The file system can be operated on by any tool and accessed by
any user. One unintentional rm -r or del * command can destroy all backups. In short,
the backups are as vulnerable as any other files in the file system. That means many
risks:
− Will data be lost through administrator error or malicious deletion by others?
− Will data be copied by others and recovered on another computer, leaking
confidential information?
− Will viruses make the backup data unusable for data recovery?
• Performance
First, the file system itself may be a performance bottleneck, especially when it
processes multiple tasks and processes.
Second, the file system cannot solve the problem of disk fragmentation. Fragmentation
degrades the performance of the file system and is hard to resolve when a large
amount of data is processed.
• Function
Backup management software is designed specifically for tape libraries. Currently,
most types of backup software support disk arrays as the backup device, but the
functions differ from those available with tape. These differences can cause some
serious problems:
− The backup policy of the existing backup environment must be changed. Seamless
integration is not possible.
− Hardware data compression is not available with disk-based backup, so the backup
performance and storage space cannot be optimized effectively.
− The data backups saved on disk arrays cannot be copied through removable media for
remote data storage. Therefore, the flexibility advantages of tape, such as offline
storage, data migration, and remote disaster recovery, are lost.
According to the preceding analysis, using disk arrays as the backup device solves
some problems found in tape libraries, but it also brings new problems that are more
difficult to overcome.
In practice, applications that use disk arrays as the backup device are largely
restricted to using disks as a cache for tape-based backup. This function is
supported by mainstream backup software, such as the Disk Staging of VERITAS
NetBackup and the Disk Backup Option of Legato NetWorker. That is to say, the backup
operation is performed on disks within the backup window, and the data is then
migrated from the disks to tapes in the background. This solution also suffers from
the preceding problems. Users must still rely on tape libraries to implement data
storage. This solution is only a supplement to the disk-based backup and is used for
accelerating backup and recovery.
• Management
Disk-based backup is one of the functions of the backup software. Backup software
from different vendors implements disk-based backup in different ways, and no
universal standard exists. As a result, in an environment with multiple backup
systems, users cannot centralize backup management or protect their investment.

2.1.3 VTL Function Provided by the Backup Software


At present, some types of backup software provide a VTL function, such as the Virtual
Disk Library of BakBone NetVault. A VTL software module is installed on the backup
server and virtualizes part of the server's storage space into a tape library.
This solution is cheap and easy to implement. It provides the basic VTL function and
partly solves the performance problem of tape-based backup. Some users have started
to adopt it.
This solution, however, has some obvious disadvantages in areas such as sharing,
management of LAN-free backup, security, and high consumption of backup server
resources. In short, it can only be considered a supplement to the disk-based backup
method and is mainly used as a cache for tape-based backup. It cannot work
independently of tape libraries.
According to the preceding analysis, various problems arise when physical tape
libraries, disk arrays, or the VTL function of backup software are used to back up
data. The VTL technology can solve these problems effectively.

2.2 De-duplication
As the Internet develops, large organizations, governments, and financial institutions
operate ever-growing data centers. The increasing demand for storage space drives up
the storage cost. IT personnel must deal with three top issues: saving energy,
reducing power consumption, and lowering the system cost. As a popular technology in
the storage field, de-duplication addresses these issues.
De-duplication is designed to reduce the space occupied by duplicate data and thus
lower costs and energy consumption. When adopting the de-duplication technology, the
user must consider the following factors:
• Effect of de-duplication on the backup performance
• De-duplication ratio
• Efficiency of remote replication
• Total benefits
• Scalability
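
To make the de-duplication ratio concrete, the following is a minimal sketch of
block-level de-duplication. It assumes fixed-size blocks identified by SHA-256
digests; this document does not state how the VTL3500 splits or fingerprints data, so
the block size, the hash choice, and all function names are illustrative assumptions
only.

```python
import hashlib


def deduplicate(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and keep one copy of each unique block."""
    store = {}    # digest -> block contents (unique blocks only)
    recipe = []   # ordered digests needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the unique blocks and the recipe."""
    return b"".join(store[digest] for digest in recipe)


def dedup_ratio(data: bytes, store) -> float:
    """Ratio of the original size to the size actually stored."""
    stored = sum(len(block) for block in store.values())
    return len(data) / stored if stored else 0.0
```

In this sketch only the blocks in `store` would need to be kept or sent to a remote
site, which is why a higher ratio also improves the efficiency of remote replication.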


3 Solution

The Oceanspace VTL3500 virtual tape library (hereinafter referred to as the VTL3500)
is a backup solution developed by Huawei Symantec Technologies Co., Ltd.
(hereinafter referred to as Huawei Symantec) for the low-end market. Through
software, the VTL3500 virtualizes SATA disk arrays so that they appear as physical
tape libraries. The VTL3500 provides high performance and supports seamless
deployment. Moreover, the VTL3500 supports de-duplication and integrated backup
software to reduce users' investment in the IT infrastructure.

3.1 Advantages of the VTL3500


3.1.1 VTL3500 vs. Physical Tape Library
By using the virtualization technology, the VTL3500 emulates the parts of a physical
tape library. The robot, drives, tapes, and slots of the physical tape library exist
only logically and do not need to be maintained manually. This avoids the inherent
mechanical deficiencies, such as tape positioning and tape errors, and the short
service life that results from tapes being exposed to the air and vulnerable to
electromagnetism, dust, moisture, magnetic particles, sticking, and mold. The costs
of managing the media and maintaining the device decrease greatly, and the
reliability of backup data increases. The VTL3500 stores backup data on high-speed
disks protected by high-reliability RAID technologies. It improves the performance of
backup and recovery, greatly shortens backup and recovery times, provides high
scalability, and increases the return on investment.

3.1.2 VTL3500 vs. Disk-based Backup


After the VTL3500 creates VTLs and assigns them to the backup servers, the backup
servers recognize them as physical devices and share them with each other. As for the
use and allocation of storage space, even when the libraries are shared between
multiple servers, the user can create new tapes that multiple backup servers use
according to the sharing mechanism specified by the backup software. Therefore, the
user does not need to worry about how to allocate the proper space to different
backup servers.
With disk-based backup, the backup data is saved in the file system and can be
accessed by any user or virus. Disk-based backup cannot prevent human
misoperations, malicious destruction, and virus attacks. The VTL3500 emulates the
data read/write method of physical tape and stores the backup data on the raw device.
Thus, users cannot operate on the backup data directly and viruses cannot destroy it.
The VTL3500 solves the security problems found in disk-based backup.
In disk-based backup, a file system must be created on the storage unit. Data is read
and written first through the I/O interface of the file system and then through the
I/O interface of the raw device that the file system invokes. During data transfer,
the overhead of invoking the two interfaces degrades the system performance. In
addition, the file system itself may be a performance bottleneck. The VTL3500
transfers data by reading and writing the raw device directly. This method fully
utilizes the high speed of the raw device and increases the transfer efficiency.

3.1.3 VTL3500 vs. VTL Module of the Backup Software


As an extension of disk-based backup, the VTL module of the backup software cannot
solve the management and security problems found in LAN-free backup. This module
consumes a large amount of server resources and may even degrade the backup
performance. In addition, it cannot work independently of physical tape libraries to
meet the backup requirements of complicated environments. The VTL3500 has independent
hardware and functional components. It fully emulates physical tape libraries and
works as a backup device independent of them. While the VTL3500 provides effective
data backup and recovery, it occupies hardly any server resources.

3.2 Powerful Virtualization Capability


The VTL3500 can virtualize 16/64/128 tape libraries/tape drives and emulate more than
60 types of tape libraries and tape drives from mainstream vendors such as HP, IBM,
and Quantum. The backup servers treat the VTL3500 the same as physical tape
libraries. Therefore, the VTL3500 can be seamlessly deployed into an existing backup
system that is based on physical tape libraries.

3.3 On-Demand Capacity Expansion


At the level of individual virtual tapes, the VTL3500 uses the Capacity-on-Demand
technology. The user can set a small initial capacity for the virtual tapes. As more
data is written to a virtual tape, the VTL3500 automatically allocates more space to
it. In physical tape libraries, media management wastes space (50% or more of the
total space) because a large number of tapes cannot be fully written. Compared with
physical tape libraries and disks, the VTL3500 dramatically increases the utilization
of storage space.
At the system level, the VTL3500 manages disks in the usual way. New disks can be
easily added to expand the capacity. Therefore, the user does not need to purchase a
high initial configuration, as is necessary with tape libraries, and can add new
disks incrementally as the data amount grows. Thus, the initial procurement cost is
much lower than that of tape libraries. The routine maintenance cost is also much
lower, because disks are free of the various mechanical faults found in tape
libraries.
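
The Capacity-on-Demand behavior described above can be sketched as a virtual tape
whose allocated space grows only as data is written. This is an illustrative model
under stated assumptions, not the VTL3500 implementation: the class name, the
fixed-increment growth rule, and the default sizes are all hypothetical.

```python
class VirtualTape:
    """Toy model of a virtual tape that allocates disk space on demand."""

    def __init__(self, max_capacity_mb, initial_mb=5, increment_mb=5):
        self.max_capacity_mb = max_capacity_mb   # nominal size seen by the backup software
        self.increment_mb = increment_mb         # hypothetical growth step
        self.allocated_mb = min(initial_mb, max_capacity_mb)
        self.used_mb = 0

    def write(self, size_mb):
        """Record a write and grow the allocation just enough to hold it."""
        if self.used_mb + size_mb > self.max_capacity_mb:
            raise ValueError("virtual tape is full")
        self.used_mb += size_mb
        # Grow in fixed increments until the allocation covers the data written.
        while self.allocated_mb < self.used_mb:
            self.allocated_mb = min(self.allocated_mb + self.increment_mb,
                                    self.max_capacity_mb)
```

A partly written tape in this model consumes only its allocated space rather than its
full nominal capacity, which is the source of the utilization gain over physical
tapes.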


3.4 IP Replication
Replication is a common technology used for disaster recovery. Data replication
refers to using data replication software to copy data from one medium onto another
and thereby generate a data copy.
Traditional disaster recovery generally relies on physical transportation. The backup
software copies data onto tapes in a physical tape library, and the tapes are
transported to a remote place for preservation. During transportation, tapes may get
lost or damaged, so the effect of disaster recovery cannot be ensured.
With IP replication, the local VTL3500 copies data on virtual tapes to the remote
VTL3500 over an IP network. In this way, the VTL3500 utilizes the convenience and
high speed of the network and saves the transportation cost. The local VTL3500
encrypts the tape data before transfer, and the remote VTL3500 decrypts the data
after receiving it. As a result, data security during transfer is ensured.
The VTL3500 provides four options for IP replication:
• Remote Copy
• Auto Replication
• IP Replication
• Replication upon De-duplication
Among the four options, three are automatic and one is manual. Table 3-1 lists the
four options of IP replication.

Table 3-1 Four options of IP replication

Option                           Type       Description
Auto Replication                 Automatic  When a virtual tape is exported from the VTL, the
                                            system automatically copies the data on the virtual
                                            tape to another VTL3500.
Remote Copy                      Manual     The data on a virtual tape is copied to another VTL
                                            as required.
IP Replication                   Automatic  At the specified interval and according to the
                                            user-defined policy, the changed data on the primary
                                            virtual tape is copied to the same or another VTL.
Replication upon De-duplication  Automatic  When the de-duplication function is enabled, the
                                            de-duplication policy is integrated with the
                                            replication policy. The changed data is copied to
                                            another VTL3500 according to the replication policy.

These four options differ mainly in the replication triggering mechanism.

• Auto Replication is triggered by the backup software. If the VTL is set to Auto
Replication, the replication of a virtual tape is triggered when the VTL receives
the eject command from the backup software. (For a physical tape library, the eject
command from the backup software means to eject the tape from the physical
tape library; for a virtual tape library, this command means to move the virtual tape
into the virtual vault.)
• Remote Copy is triggered manually. The user can copy the data on the selected
virtual tape to the VTL3500 in the disaster recovery center. The VTL3500 in the
disaster recovery center then allocates space equal to that of the source tape to
the target tape and sets the same barcode. When the copy is complete, the system
automatically promotes the target tape to the virtual vault of the remote VTL3500
for future use. Through the Remote Copy function, the whole virtual tape can be
copied to the remote VTL3500 without creating a new virtual tape there first. Before
the copy, no virtual tape in the remote VTL3500 may have the same name as a virtual
tape in the local VTL3500.
• IP Replication is triggered based on the policy.
The policy can be:
− Data increment-based replication.
The VTL3500 can identify the amount of data backed up to the tape each time. If the
data increment exceeds the preset threshold, the replication is automatically
triggered after the copy.
− Time point-based replication.
The user can specify the time point for the first replication and the replication
interval for each virtual tape. The data on the virtual tape is then copied
according to the specified schedule.
The remote virtual tape that adopts IP Replication must be promoted manually before
use.
• Replication upon De-duplication is automatically triggered based on the policy.
The triggering condition can be a specific date or time point, or the completion of
the backup operation. The local VTL3500 transfers the de-duplicated data to the
remote VTL3500 over an IP network. After de-duplication, only the unique data
blocks, rather than the complete data, are transferred during IP replication. The
bandwidth usage decreases and the transfer efficiency increases. As a result, remote
data-level disaster recovery can be implemented at low cost, with easy deployment
and high efficiency.
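
The two policy-based triggers of IP Replication (data increment and time point) can
be sketched as a single decision function. This is a simplified illustration under
stated assumptions: the `ReplicationPolicy` fields and the `should_replicate` function
are hypothetical names, and the real product may evaluate its policies differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ReplicationPolicy:
    """Hypothetical policy record for one virtual tape."""
    increment_threshold_mb: Optional[int] = None  # data increment-based trigger
    first_run: Optional[datetime] = None          # time point of the first replication
    interval: Optional[timedelta] = None          # replication interval


def should_replicate(policy, new_data_mb, last_run, now):
    """Return True if either configured trigger fires."""
    # Data increment-based: fire when newly backed-up data exceeds the threshold.
    if (policy.increment_threshold_mb is not None
            and new_data_mb > policy.increment_threshold_mb):
        return True
    # Time point-based: fire at the first time point, then once per interval.
    if policy.first_run is not None and now >= policy.first_run:
        if last_run is None:
            return True
        if policy.interval is not None and now - last_run >= policy.interval:
            return True
    return False
```

Auto Replication and Remote Copy are deliberately left out of the sketch, because
they are driven by the eject command and by the user respectively rather than by a
policy.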
Remote IP replication supports the following scenarios:
• One VTL3500 copies data to the remote VTL3500.

Figure 3-1 Networking of one-to-one remote disaster recovery

• Multiple VTL3500s copy data to the remote VTL3500.


Figure 3-2 Networking of many-to-one remote disaster recovery

3.5 Tape Caching


Tape Caching is an advanced function of the VTL3500. This function uses the
high-speed VTL3500 as a cache for the physical tape library. The backup data is
written to the VTL3500 first. After the backup operation is complete, the VTL3500
migrates the backup data to the physical tape library according to the preset policy.
In this way, a hierarchical storage architecture is formed.
The VTL3500 shortens the backup window and recovers data quickly, whereas physical
tape libraries are suitable for large-capacity offline data. Therefore, the VTL3500
can be combined with physical tape libraries to implement hierarchical storage.
The principles of hierarchical storage include:
• The data that needs to be archived for a long time is stored on the physical tape
libraries.
• The frequently used data is stored in the VTL.
• The VTL takes over the physical tape libraries.
Physical tape libraries have a slow backup speed, and disks are ill-suited to holding
seldom-accessed data for a long time. Hierarchical storage offsets the shortcomings
of both.
Figure 3-3 shows the networking of the hierarchical storage.


Figure 3-3 Networking of the hierarchical storage

Data can be recovered directly from the VTL or physical tape library. To fully utilize
the high-speed cache, the VTL3500 provides various migration triggering policies and
space reclaiming policies.

3.5.1 Data Migration Policies


Tape Caching provides two policies for triggering data migration between the
VTL3500 and the physical tape library: 1) time-based migration; 2) intelligent
migration. Table 3-2 and Table 3-3 list the two policies.

Table 3-2 Time-based migration policy

Policy Name                   Description
Certain time point each day   Migration is performed in a one-day cycle. The VTL3500
                              starts data migration at the specified time point each day.
Certain time point each week  Migration is performed in a one-week cycle. The VTL3500
                              starts data migration at the specified time point each day
                              from Monday to Saturday.

Table 3-3 Intelligent migration policy

Policy Name                   Description
And/Or                        Conjunction or disjunction of the intelligent policy options.
                              "And" means that migration is triggered only when all
                              conditions are met; "Or" means that migration is triggered
                              when any condition is met.
Data storage period           Migration is triggered when the backup data has been stored
                              on the VTL3500 for a specified period.
Watermark                     Migration is triggered when the usage of the disk space of
                              the VTL3500 reaches 90%.
After backup (tape space      Migration is triggered after each backup. "Tape space used
used out)                     out" is an additional option for "After backup". If both
                              options are chosen, the VTL3500 checks the usage of a
                              virtual tape when the tape is ejected from the tape drive.
                              If the space of the tape is used up, migration is triggered.
Postponed to a certain time   Migration is postponed to a specific time point after a
point                         trigger condition is met. This policy must be used together
                              with the preceding three policies: when the condition of any
                              of them is met, migration can be postponed to a specific
                              time point.

The time-based migration policy and the intelligent migration policy cannot be used
simultaneously. Within the time-based migration policy, "Certain time point each day"
and "Certain time point each week" cannot be used at the same time; the user can
select only one of them as the condition for triggering migration. Multiple options
of the intelligent policy can be chosen simultaneously and combined to meet
different migration requirements.
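
The way the intelligent policy options in Table 3-3 combine can be illustrated with a minimal Python sketch. It is not the VTL3500 implementation: the class name, field names, and function name are hypothetical, and only the And/Or semantics and the 90% watermark come from the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IntelligentPolicy:
    """Hypothetical model of the intelligent migration options in Table 3-3."""
    combine_with_and: bool = False                # True = "And", False = "Or"
    storage_period_hours: Optional[float] = None  # "Data storage period"; None = not selected
    use_watermark: bool = False                   # "Watermark" option
    after_backup: bool = False                    # "After backup" option
    watermark_percent: float = 90.0               # threshold stated in Table 3-3


def migration_triggered(policy, data_age_hours, disk_usage_percent, backup_finished):
    """Return True when the selected conditions call for migration."""
    conditions = []
    if policy.storage_period_hours is not None:
        conditions.append(data_age_hours >= policy.storage_period_hours)
    if policy.use_watermark:
        conditions.append(disk_usage_percent >= policy.watermark_percent)
    if policy.after_backup:
        conditions.append(backup_finished)
    if not conditions:
        return False  # no option selected, so nothing can trigger
    # "And" requires every selected condition; "Or" requires at least one.
    return all(conditions) if policy.combine_with_and else any(conditions)
```

For example, with both "Data storage period" (24 hours) and "Watermark" selected, data that is 2 hours old on a disk at 92% usage triggers migration under "Or" but not under "And".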

3.5.2 Space Reclamation Policy


To fully utilize the cache, the VTL3500 provides two space reclamation policies to
ensure the space utilization: 1) intelligent reclamation; 2) reclamation upon
de-duplication. Table 3-4 lists the reclamation methods.

Table 3-4 Reclamation methods

Policy Name                   Description
Intelligent reclamation       The space occupied by the virtual tapes that the VTL3500
                              uses as the cache is reclaimed. That is, the data on these
                              virtual tapes is deleted and only the indexes to the
                              physical tapes are retained.
Reclamation upon              Duplicate data is deleted through the de-duplication
de-duplication                algorithm to release the storage space of the VTL3500.

Table 3-5 lists the methods of triggering space reclamation.

Table 3-5 Methods of triggering space reclamation

Policy Name                   Description
Immediate reclamation         After the migration is complete, the space originally
                              occupied by the migrated data is reclaimed.
Watermark                     When the remaining disk space accounts for less than 10% of
                              the total space, the space originally occupied by the
                              migrated data is reclaimed. This trigger method is available
                              only under intelligent reclamation.
Storage period                When the backup data has been stored on the VTL3500 for a
                              specified period, the space occupied by the backup data is
                              reclaimed.

Users do not need to worry about data loss, because the VTL3500 reclaims only the
space originally occupied by migrated data. Space occupied by other data is not
reclaimed, which ensures data security and consistency.
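
The watermark trigger and the rule that only migrated data is reclaimed can be sketched as follows. This is an illustrative sketch only: the function name and the dictionary keys `name` and `migrated` are hypothetical, and the 10% threshold is the one stated in Table 3-5.

```python
def tapes_to_reclaim(tapes, free_percent, watermark_free_percent=10.0):
    """Select virtual tapes whose cache space may be reclaimed.

    tapes: list of dicts with the hypothetical keys 'name' and 'migrated'.
    free_percent: remaining disk space as a percentage of the total space.
    """
    if free_percent >= watermark_free_percent:
        return []  # watermark not reached, so nothing is reclaimed
    # Only tapes already migrated to physical tape are eligible.
    return [tape["name"] for tape in tapes if tape["migrated"]]


tapes = [
    {"name": "VT0001", "migrated": True},
    {"name": "VT0002", "migrated": False},
]
print(tapes_to_reclaim(tapes, free_percent=8.0))   # ['VT0001']
print(tapes_to_reclaim(tapes, free_percent=35.0))  # []
```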

3.6 Tape Encryption


To ensure the security of the data stored on tapes, the VTL3500 encrypts tapes when
data is transferred to physical tape libraries.

Figure 3-4 Tape encryption

The tape encryption function of the VTL3500 uses the 128-bit Advanced Encryption
Standard (AES) encryption algorithm. The user can create one or more tape keys to
encrypt the data exported to physical tapes and decrypt the data imported to virtual
tapes. The data on the tape library is inaccessible unless the correct key is used
to decrypt it. Moreover, the user can set a password for each key. Only when the
correct password is provided can the key name, password, and password hint be
changed, or the key be deleted or exported.
When data is exported to a physical tape library or replicated over IP, the user can
employ a created key to encrypt the data, thus ensuring the security of the tape
data. Even if tapes are lost or stolen, or data packets are intercepted in transit,
the data remains inaccessible without the correct key.

3.7 De-duplication
The VTL3500 saves only one copy of the backup data in the Single Instance
Repository (SIR). Redundant parts of the original data are replaced by an index to
the single instance. This index is used to read and recover the data.

3.7.1 Types of De-duplication


According to where it happens, de-duplication can be divided into front-end deletion
and back-end deletion.
Front-end deletion means that duplicate data is deleted on the backup servers. This
type of deletion reduces the amount of transferred data and the bandwidth usage.
It has, however, the following disadvantages:
• Front-end deletion occupies the CPU resources of backup servers and
degrades the backup performance. Many users consider it unacceptable
because it lengthens the backup window.
• Front-end deletion has a low deletion ratio of duplicate data. De-duplication
on a backup server can delete only the duplicate data on that server. That
is to say, front-end deletion cannot remove data duplicated across
servers.
• To implement front-end deletion, the user needs to replace the existing
backup software and reconfigure the clients. As a result, the investment in the
existing backup software is wasted and the current applications are affected.
In contrast to front-end deletion, back-end deletion happens on the storage
device. This type of deletion cannot reduce the amount of data transferred between
backup servers and storage devices, but it solves the preceding three problems
of front-end deletion. Back-end deletion has no impact on backup
servers, provides a higher deletion ratio of duplicate data, and requires no change to
the existing backup network, which protects users' investment. According to when it
happens, back-end deletion can be divided into in-line deletion and post-processing
deletion.
• In-line deletion means that de-duplication works the instant data reaches the
storage device. The de-duplicated data is then written to the storage
media.
• Post-processing deletion means that de-duplication happens on the storage
device after the backup operation is complete.
In-line deletion degrades the backup performance, whereas post-processing deletion
prolongs the time that the storage device spends processing the data.
The VTL3500 adopts back-end post-processing de-duplication, which does
not affect the backup performance. The VTL3500 provides a 20:1 deletion ratio of
duplicate data and increases the utilization of the storage space. During remote
replication, the VTL3500 transfers the de-duplicated data and thereby reduces the
bandwidth usage. The VTL3500 provides a raw capacity of up to 24 TB. By
enabling the de-duplication function, the user can obtain an effective capacity
equivalent to 480 TB. As a result, the return on investment increases greatly.
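
The 480 TB figure follows from multiplying the 24 TB raw capacity by the 20:1 ratio. The short Python sketch below reproduces that arithmetic; the function name is hypothetical, and the ratio is taken as given from the text rather than measured.

```python
def effective_capacity_tb(raw_capacity_tb, dedup_ratio):
    """Logical backup data that fits in raw_capacity_tb at the given de-duplication ratio."""
    return raw_capacity_tb * dedup_ratio


# Figures from the text: 24 TB raw capacity and a 20:1 deletion ratio.
print(effective_capacity_tb(24, 20))  # 480
```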

3.7.2 Procedure for De-duplication


When the conditions for triggering de-duplication are met, the VTL3500 scans the tape
data, which includes the metadata and the backed-up file data. The file data is the
object of processing. The VTL3500 partitions the file data into data blocks of the same
size according to a specific algorithm, and performs de-duplication in the following
steps.

Figure 3-5 Initializing the tape data

Figure 3-6 Processing procedure

Step 1 Read the data blocks and calculate the index value (content identity) of each data
block.
Step 2 Compare the index value with all the values in the existing index table.
1) If the index value of the data block already exists in the index table, a data
block with the same content already exists in the SIR. In this case, the VTL3500
deletes the data block and replaces it with a link to the SIR.
2) If the index value of the data block does not exist in the index table, the VTL3500
saves the index value into the index table, saves the data block into the SIR, and
generates a link that specifies the location of the data block in the SIR.
Step 3 Repeat the preceding steps until all the data blocks are processed.
----End

After all the data blocks are processed, the file data is extracted and added into the SIR.
The file data zone on the tape only saves the links to the locations of the data blocks in
the SIR. When the data needs to be accessed, the data blocks can be quickly read
according to these links.

Figure 3-7 Tape data after de-duplication

4 Experience

The VTL3500 is a low-end VTL designed for small and medium businesses (SMBs).
To meet their data protection requirements, the VTL3500 provides a series of
technologies and functions that help SMBs solve problems in data backup and
disaster recovery. The VTL3500 increases the return on investment and reduces the
total cost of ownership (TCO) of the IT infrastructure.

4.1 Powerful Virtualization Capability


With its powerful virtualization capability, the VTL3500 can take over the physical tape
libraries and tape drives of most mainstream tape library suppliers, thus realizing
hierarchical storage. The user obtains both the performance of disk-based backup
and the long-term archiving capability of tapes.
The VTL3500 can virtualize multiple tape libraries and tape drives at no
extra cost. The user can assign each server a specific tape library that has its own tape
drives, thus simplifying management and improving backup.
The VTL3500 can be seamlessly deployed in the existing backup system without
any change to the existing backup policies and configurations. The backup
servers manage the VTL3500 in the same way as they manage physical tape libraries.

4.2 On-Demand Capacity Expansion


The on-demand capacity expansion function of the VTL3500 automatically
allocates storage space to increase the utilization of the disk space. With physical
tape libraries, media management wastes space (50% or more of the total
space) because a large number of tapes cannot be fully written.
The VTL3500 expands capacity through the addition of disks.
Therefore, the user does not need to purchase a large disk configuration up front, as
is required with tape libraries, and can add new disks incrementally as the data
amount grows. Thus, the initial procurement cost is much lower than that of tape
libraries. The routine maintenance cost is also much lower, because disks are free of
the various mechanical faults found in tape libraries. The VTL3500 uses SATA disks to
provide a high capacity-to-price ratio without degrading reliability or performance.
For most users, the VTL3500 needs a lower investment than physical tape libraries.

4.3 Application Scenario


The VTL3500 has the following four typical application scenarios.

4.3.1 Integrated and Economical Backup

The data on heterogeneous hosts is backed up to the VTL3500 over an FC/IP SAN.
The performance of multi-host concurrent backup can reach up to 1.44 TB/h. The
duplicate data deletion ratio (20:1) and compression ratio (2:1) of the VTL3500
increase the utilization of the storage space and meet the requirements of
ever-increasing backup data.
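
To relate the 1.44 TB/h figure to a backup window, the following Python sketch converts it to a per-second rate and estimates the backup time for a given data amount. The function names and the 7.2 TB example are hypothetical, decimal units (1 TB = 1,000,000 MB) are assumed, and 1.44 TB/h is the stated peak, so the result is a best case.

```python
def throughput_mb_per_second(throughput_tb_per_hour=1.44):
    """Convert TB/h to MB/s using decimal units (1 TB = 1,000,000 MB)."""
    return throughput_tb_per_hour * 1_000_000 / 3600


def backup_hours(data_tb, throughput_tb_per_hour=1.44):
    """Best-case time in hours to back up data_tb at the stated peak rate."""
    return data_tb / throughput_tb_per_hour


print(round(throughput_mb_per_second()))  # about 400 MB/s
print(round(backup_hours(7.2), 2))        # about 5 hours for a hypothetical 7.2 TB
```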

4.3.2 Hierarchical Backup

The production data is backed up to the VTL3500 via backup servers. Through the
auto archiving/tape caching function, the user can export the data to physical tape
libraries, thus implementing data archiving. When the data needs to be recovered, the
user can read the backup data directly from the VTL3500 to achieve a high recovery
performance. The user can also read the archived data on the tapes of the physical
tape libraries to implement data recovery.

4.3.3 Remote Disaster Recovery

The data center and the remote disaster recovery center are each deployed with
one VTL3500. Through remote replication, data is copied to the disaster recovery
center over the wide area network (WAN). The remote replication function of the
VTL3500 supports incremental replication. In addition, it supports replication of
de-duplicated data to reduce the bandwidth usage. The VTL3500 can encrypt
the data for remote replication to ensure the security of data transfer.

4.3.4 Distributed Backup

For a multi-branch organization, one or more VTL3500s can be deployed according to
the data amount of each node. For a branch that has a small data amount, data can be
backed up to the data center over the WAN. For a branch that has a large data amount,
data can be first backed up to the local VTL3500 to achieve a high backup and
recovery performance. Then, the data is copied to the data center through remote
replication to back up and manage data in a centralized manner. The data center can be
deployed with multiple VTL3500s that comprise a storage pool. The VTL3500 has a
unified management interface, through which the resources in the storage pool can be
centrally managed.

5 Conclusion

The VTL technology is indispensable to the storage market, for this technology
provides a high backup/recovery performance and can be combined with physical tape
libraries to implement the hierarchical storage. The de-duplication technology is an
emerging storage technology, which can help solve the problem of soaring costs due to
explosive data growth.
The VTL3500 developed by Huawei Symantec inherits the advantages of physical
tape libraries and disk arrays. While eliminating the hardware deficiencies
of tape libraries, the VTL3500 provides a higher backup/recovery performance than
disk arrays, thus meeting the requirements of various backup windows.
• The powerful virtualization capability meets users' requirements for sharing
backup devices.
• The on-demand capacity expansion and high duplicate data deletion ratio improve
the utilization of the storage space, thus increasing the return on investment and
reducing the TCO.
• The tape caching function can be used to easily deploy a hierarchical storage
system and to implement automatic backup and hierarchical storage. At the same
time, the tape encryption technology eliminates the risks of data leakage and
safeguards the archived data.
• Combined with the de-duplication technology and tape encryption technology,
the IP replication function can be used to realize replication with low bandwidth
usage. The user can construct a reliable and safe remote data-level disaster
recovery system with low investment.

6 Acronyms and Abbreviations

Table 6-1 List of Acronyms and Abbreviations

Abbreviation Full Spelling


VTL Virtual Tape Library
SIR Single Instance Repository
