Issue 1.0
Date 2010-05-18
and other Huawei Symantec trademarks are trademarks of Huawei Symantec Technologies
Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their
respective holders.
Notice
The purchased products, services and features are stipulated by the commercial contract made
between Huawei Symantec and the customer. All or partial products, services and features
described in this document may not be within the purchased scope or the usage scope. Unless
otherwise agreed by the contract, all statements, information, and recommendations in this
document are provided “AS IS” without warranties, guarantees or representations of any kind, either
express or implied.
The information in this document is subject to change without notice. Every effort has been made in
the preparation of this document to ensure accuracy of the contents, but all statements, information,
and recommendations in this document do not constitute the warranty of any kind, express or
implied.
Email: support@huaweisymantec.com
VTL3500 Product
Description
Contents
3 Solution
3.1 Advantages of the VTL3500
3.2 Powerful Virtualization Capability
3.3 On-Demand Capacity Expansion
3.4 IP Replication
3.5 Tape Caching
3.6 Tape Encryption
3.7 De-duplication
4 Experience
4.1 Powerful Virtualization Capability
4.2 On-Demand Capacity Expansion
4.3 Application Scenario
5 Conclusion
6 Acronyms and Abbreviations
As data volumes grow rapidly and market competition intensifies, customers
place higher requirements on the reliability and performance of data backup and
recovery. Traditional physical tape library technology can no longer meet these
requirements. With the development of virtual storage technology and the
emergence of SATA hard disks, the virtual tape library (VTL) has become a
mature and cost-effective data backup device. The VTL uses disk arrays as the
storage device and, through built-in virtualization software, presents the
existing hard disks as a mainstream tape library. The VTL combines the high
reliability, high performance, and ease of management of disk devices with the
mature media management of tape devices. As a result, the VTL has attracted more
and more attention.
Because SATA disks offer advantages in cost and performance, more and more
users adopt disk-to-disk (D2D) backup to build a fast and reliable backup system.
The capacity of the disk backup device, however, tends to become insufficient as
the data volume soars, and large amounts of duplicate data consume much of that
capacity. De-duplication technology emerged in response and has attracted wide
attention in recent years. De-duplication can greatly reduce the amount of data that
needs to be stored. In addition, it can dramatically decrease the amount of data
replicated between remote nodes, thus reducing bandwidth usage.
This document introduces some key technologies of the VTL and analyzes their
value for customers. These technologies include virtualization, on-demand capacity
expansion, multi-stream backup, FC/IP SAN backup, remote IP replication, tape
caching, and de-duplication.
2 Introduction
Reliability
Figure 2-1 shows the analysis of backup failures by IDC.
[Figure 2-1: Analysis of backup failures (IDC). Surviving chart labels: Other 3%, Don't Know 3%.]
A tape library consists of mechanical parts. The tape drive boasts hundreds of
thousands of hours of operating life, but it often becomes faulty within one or two
Huawei Symantec Proprietary and Confidential
1.0 (2010-05-18) Copyright © Huawei Symantec Technologies Co., Ltd. Page 5 of 25
years in practical use. The robot of the tape library has a high probability of
failure. A large proportion of the users of low-end and mid-range tape libraries
suffer at least one backup failure caused by a tape library fault. The tape library is
vulnerable to failures caused by the external environment, such as dust and
moisture. The combination of components degrades the overall system availability.
The tape library is not fault tolerant. A single failure of the tape drive, tape slot,
robot, controller, barcode scanning system, or tape loading and ejecting device can
cause the whole tape library to run abnormally or even bring down the whole
backup system. The low availability raises the maintenance cost. According to
statistics, in 2002 the average yearly maintenance cost of tape libraries accounted
for 10% to 15% of the procurement cost. What troubles users more is that tape
libraries must be repaired by professionals, and the long repair period disrupts daily
operations. That compels users to purchase multiple tape drives, which are the most
expensive parts of a tape library. As a result, users' total cost of ownership (TCO)
increases.
To improve the reliability of tape-based storage, many users adopt tape
replication to keep two backup copies of their data. This time- and labor-consuming
method brings extra operation costs. In essence, backup itself is not the objective.
Backup only counts when it can ensure data recovery. The reliability of the backup
media determines the reliability of the backup data. Tapes are exposed to the air and
vulnerable to electromagnetic interference, dust, moisture, magnetic particles,
sticking, and mold. Users sometimes find the tapes damaged before starting data
recovery.
Performance
As service requirements grow, each system requires shorter backup windows. The
performance bottleneck of tape devices lies not only in data reading and writing but
also in tape loading, which sometimes takes more time than the reading and writing
itself. If the data on multiple tapes needs to be recovered, a complete system
recovery takes a long time and recovery performance is very low. If users want to
back up more data in a shorter time, they need to install more tape drives in their
tape libraries. That means higher expenses, higher fault probabilities, and further
investment whenever the tape technology is updated. In fact, due to the design
limitations of the tape library, the number of tape drives that can be added is
limited.
Scalability
The data volume keeps growing, whereas the room for expanding a tape library is
limited. If the user purchases a large tape library (with over 200 slots, for
example), the procurement cost is very high even if a relatively low configuration
is chosen.
Return on Investment
As the data volume increases, each system requires shorter backup windows. With
the current backup systems, data backup and recovery take more and more time.
Consequently, users are required to increase the performance and capacity of the
existing tape libraries. The results, however, are higher hardware costs, more difficult
media management, higher software costs, higher fault probabilities, and higher
maintenance costs. Moreover, the return on investment is reduced because of the low
utilization of tapes and tape libraries, the high maintenance costs of tape libraries, and
the short lifecycle of tape drive technologies.
Eventually, users find that the investment in data protection is higher than expected
and the return is far lower than expected, and that the backup system itself increases
the workload of maintaining the whole storage system. That has become a common
problem for many organizations.
storage. This solution is only a supplement to the disk-based backup and used for
accelerating backup and recovery.
Management
Disk-based backup is one of the functions of the backup software. Backup
software from different vendors implements disk-based backup in different
ways, and no universal standard exists. As a result, in an environment with
multiple backup systems, users cannot manage backups centrally or protect
their investment.
2.2 De-duplication
As the Internet develops, large organizations, governments, and financial institutions
operate ever larger data centers. The increasing requirement for storage space
drives up the storage cost. IT personnel must deal with three top issues: saving
energy, reducing power consumption, and lowering the system cost. As a prominent
technology in the storage field, de-duplication helps address these problems.
De-duplication is designed to reduce the space occupied by duplicate data and thus
lower costs and energy consumption. When adopting de-duplication technology,
the user must consider the following factors:
Effect of de-duplication on the backup performance
De-duplication ratio
Efficiency of remote replication
Total benefits
Scalability
3 Solution
The Oceanspace VTL3500 virtual tape library (hereinafter referred to as the VTL3500)
is a backup solution developed by Huawei Symantec Technologies Co., Ltd.
(hereinafter referred to as Huawei Symantec) for the low-end market. Through its
software, the VTL3500 makes SATA disk arrays appear as a physical tape library. The
VTL3500 provides high performance and supports seamless deployment. Moreover,
the VTL3500 supports de-duplication and integrated backup software to reduce users'
investment in the IT infrastructure.
misoperations, malicious destruction, and virus attacks. The VTL3500 simulates the
data read/write method of physical tape and stores the backup data on the raw device.
Thus, users cannot operate on the backup data directly and viruses cannot destroy the
data. The VTL3500 solves the security problems found in disk-based backup.
In the disk-based backup, a file system needs to be created in the storage unit. Data is
read and written first through the I/O interface of the file system and then through the
invoked I/O interface of the raw device. During data transfer, the overhead of invoking
the two interfaces degrades the system performance. In addition, the file system itself
may be a performance bottleneck. The VTL3500 transfers data by reading and
writing the raw device directly. This method fully utilizes the high speed of the
raw device and increases the transfer efficiency.
3.4 IP Replication
Replication is a common technology used for disaster recovery. Data replication refers
to copying data from one medium onto another medium and generating a data copy by
using the data replication software.
Traditional disaster recovery generally relies on physical transportation. The backup
software copies data to a physical tape library, and the tapes are then transported to
a remote site for preservation. During the transportation, tapes may get lost or
damaged; thus, the effect of disaster recovery cannot be ensured.
Over an IP network, the local VTL3500 copies data on virtual tapes to the remote
VTL3500. Through this method, the VTL3500 utilizes the convenience and high speed
of the network to save the transportation cost. The local VTL3500 encrypts the tape
data by using the encryption algorithm before data transfer. Then the remote VTL3500
decrypts the data after receiving it. As a result, the data security during transfer is
ensured.
The VTL3500 provides four options for the IP replication:
Remote Copy
Automatic Replication
IP Replication
Replication upon De-duplication.
Among the four options, three support automatic replication and one supports manual
replication. Table 3-1 lists the four options of IP replication.
tape library; for a virtual tape library, this command means to put the virtual tape
into the virtual vault).
Remote Copy is triggered manually. The user can copy the data on the selected
virtual tape to the VTL3500 in the disaster recovery center. The VTL3500 in the
disaster recovery center then allocates space equal to that of the source tape to
the target tape and sets the same barcode. When the copy is complete, the system
automatically promotes the tape to the virtual vault of the remote VTL3500 for
future use. Through the Remote Copy function, the whole virtual tape can be
copied to the remote VTL3500 without creating a new virtual tape in the remote
VTL3500 first. Before the copy, no virtual tape in the remote VTL3500 may share
a name with a virtual tape in the local VTL3500.
IP Replication is triggered based on the policy.
The policy can be:
− Data increment-based replication policy.
The VTL3500 can identify the amount of the data backed up to the tape each
time. If the data increment exceeds the pre-set threshold, the replication is
automatically triggered after the copy.
− Time point-based replication.
The user can specify the time point for the first replication and the replication
interval for each virtual tape. Then, the data on the virtual tape will be copied
according to the specified time point. The remote virtual tape that adopts IP
Replication must be promoted manually before use.
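The two IP Replication policies above can be sketched in a few lines of Python. This is only an illustration of the triggering logic as described here, not the VTL3500 implementation; the class names, field names, and the use of bytes as the unit are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class IncrementPolicy:
    """Hypothetical model of the data increment-based policy."""
    threshold_bytes: int

    def should_replicate(self, bytes_since_last_replication: int) -> bool:
        # Replication is triggered only when the increment exceeds the threshold.
        return bytes_since_last_replication > self.threshold_bytes


@dataclass
class TimePointPolicy:
    """Hypothetical model of the time point-based policy."""
    first_run: datetime   # time point of the first replication
    interval: timedelta   # replication interval for this virtual tape

    def next_run(self, now: datetime) -> datetime:
        """Return the first scheduled time point that is not earlier than `now`."""
        if now <= self.first_run:
            return self.first_run
        elapsed = now - self.first_run
        # Ceiling division: whole intervals needed to reach or pass `now`.
        periods = -(-elapsed // self.interval)
        return self.first_run + periods * self.interval
```

For example, a policy with a first run at 01:00 and a 24-hour interval, queried at 02:00 on the same day, yields 01:00 on the following day.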
Replication upon De-duplication is automatically triggered based on the policy.
The trigger can be a specific date or time point, or the completion of the backup
operation. The local VTL3500 transfers the de-duplicated data to the remote
VTL3500 over an IP network. Because only de-duplicated data blocks, rather
than the full data, are transferred during the IP replication, bandwidth usage
decreases and transfer efficiency increases. As a result, remote data-level
disaster recovery can be implemented at low cost, with easy deployment and
high efficiency.
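To give a rough idea of why replicating after de-duplication saves bandwidth, the following Python sketch counts how many bytes would cross the network when only blocks unknown to the remote side are sent. It is a simplified model, not the VTL3500 protocol: the function names, the dictionary standing in for the remote repository, and the choice of SHA-256 as the block identity are all assumptions.

```python
import hashlib


def replicate_deduplicated(blocks, remote_store):
    """Send an ordered list of block digests plus only the blocks the remote lacks.

    `remote_store` (hypothetical) maps digest -> block and stands in for the
    remote repository. Returns (recipe, bytes_sent).
    """
    recipe = []
    bytes_sent = 0
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()  # assumed content identity
        recipe.append(digest)
        if digest not in remote_store:
            remote_store[digest] = block            # only new content is transferred
            bytes_sent += len(block)
    return recipe, bytes_sent


def restore(recipe, remote_store):
    """Rebuild the original data on the remote side from the digest list."""
    return b"".join(remote_store[digest] for digest in recipe)
```

In this model, replicating the same data a second time transfers no block payload at all, because every digest is already present remotely.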
The remote IP replication has the following scenarios:
One VTL3500 copies data to the remote VTL3500.
Data can be recovered directly from the VTL or physical tape library. To fully utilize
the high-speed cache, the VTL3500 provides various migration triggering policies and
space reclaiming policies.
The time-based migration policy and the intelligent migration policy cannot be used
simultaneously. Within the time-based migration policy, "Certain time point each day"
and "Certain time point each week" cannot be used at the same time; the user can
select only one of them as the migration trigger. Multiple options of the
intelligent policy can be chosen simultaneously and combined to meet different
migration requirements.
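The combination rules above can be expressed as a small validation routine. The Python sketch below is illustrative only: the class, its field names, and the time-string formats are hypothetical, and the intelligent policy options are represented by placeholder strings because their names are not listed here.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class MigrationPolicy:
    """Hypothetical container for tape caching migration triggers."""
    daily_at: Optional[str] = None    # "Certain time point each day", e.g. "02:00"
    weekly_at: Optional[str] = None   # "Certain time point each week", e.g. "Sun 02:00"
    # Intelligent policy options; any combination is allowed (names are placeholders).
    intelligent: FrozenSet[str] = field(default_factory=frozenset)

    def validate(self) -> None:
        """Raise ValueError if the selection violates the rules described above."""
        if self.daily_at is not None and self.weekly_at is not None:
            raise ValueError("choose either a daily or a weekly time point, not both")
        time_based = self.daily_at is not None or self.weekly_at is not None
        if time_based and self.intelligent:
            raise ValueError("time-based and intelligent policies are mutually exclusive")
```

Whether an empty policy (no trigger at all) is acceptable is not stated in this document; the sketch simply lets it pass.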
Users do not need to worry about data loss. The VTL3500 only reclaims the space
originally occupied by the migrated data. The space occupied by the other data will not
be reclaimed. Thus, the data security and consistency are ensured.
The tape encryption function of the VTL3500 uses the 128-bit Advanced Encryption
Standard (AES) encryption algorithm. The user can create one or more tape keys to
encrypt the data exported to physical tapes and decrypt the data imported to virtual
tapes. The data on the tape library is inaccessible unless the correct key has been used
to decrypt the data. Moreover, the user can set a password for each key. Only when
the correct password is provided can the key name, password, or password hint be
changed, or the key be deleted or exported.
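The password protection of tape keys might be modeled as follows. This Python sketch covers only the password gate around key management operations and does not perform the AES encryption itself; the class and method names, the use of PBKDF2 for storing the password, and the iteration count are assumptions rather than details of the VTL3500.

```python
import hashlib
import hmac
import secrets


class TapeKey:
    """Hypothetical tape key whose management operations require a password."""

    def __init__(self, name: str, password: str, hint: str = ""):
        self.name = name
        self.hint = hint
        self._key = secrets.token_bytes(16)    # 128-bit key material
        self._salt = secrets.token_bytes(16)
        self._pw_hash = self._hash(password)

    def _hash(self, password: str) -> bytes:
        # Assumption: the password is stored as a salted PBKDF2 hash.
        return hashlib.pbkdf2_hmac("sha256", password.encode(), self._salt, 100_000)

    def _check(self, password: str) -> None:
        if not hmac.compare_digest(self._pw_hash, self._hash(password)):
            raise PermissionError("wrong password")

    def rename(self, password: str, new_name: str) -> None:
        self._check(password)
        self.name = new_name

    def export(self, password: str) -> bytes:
        self._check(password)
        return self._key
```

Changing the password or hint and deleting the key would follow the same pattern: verify the password first, then apply the change.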
When data is being exported to a physical tape library or during the IP replication, the
user can employ a created key to encrypt the data, thus ensuring the security of the
tape data. Even if tapes are lost or stolen, or data packets are intercepted during
transfer, the user does not need to worry about data security. If the correct key is
not used, the data on the tapes is totally inaccessible.
3.7 De-duplication
The VTL3500 only saves one copy of the backup data in the Single Instance
Repository (SIR). The redundant part of the original data is replaced by the index to
the single instance. This index can be used to read and recover data.
Step 1 Read the data blocks and calculate the index value (content identity) of each data
block.
Step 2 Compare the index value with all the values in the original index table.
1) If the index value of the data block already exists in the index table, a data block
with the same content already exists in the SIR. In this case, the VTL3500 deletes
this data block and replaces it with a link to the SIR.
2) If the index value of the data block does not exist in the index table, the VTL3500
saves this index value into the index table, saves the data block into the SIR, and
generates a link that specifies the location of the data block in the SIR.
Step 3 Repeat the preceding steps until all the data blocks are processed.
----End
After all the data blocks are processed, the file data is extracted and added into the SIR.
The file data zone on the tape only saves the links to the locations of the data blocks in
the SIR. When the data needs to be accessed, the data blocks can be quickly read
according to these links.
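The steps above can be sketched in Python as follows. This is a minimal model under stated assumptions, not the VTL3500 implementation: the class and function names are hypothetical, SHA-256 is assumed as the content identity, and data is split into fixed-size 4096-byte blocks, none of which is specified in this document.

```python
import hashlib


class SingleInstanceRepository:
    """Hypothetical in-memory stand-in for the SIR and its index table."""

    def __init__(self):
        self.index = {}    # index table: content identity -> location in `blocks`
        self.blocks = []   # unique data blocks held by the repository

    def store(self, block: bytes) -> int:
        """Return a link (location in the SIR) for `block`, storing it only if new."""
        identity = hashlib.sha256(block).digest()   # Step 1: calculate the index value
        location = self.index.get(identity)         # Step 2: look it up in the index table
        if location is None:                        # case 2: content not seen before
            location = len(self.blocks)
            self.blocks.append(block)
            self.index[identity] = location
        return location                             # case 1: reuse the existing link

    def read(self, links) -> bytes:
        """Rebuild the original data by following the links."""
        return b"".join(self.blocks[location] for location in links)


def deduplicate(data: bytes, sir: SingleInstanceRepository, block_size: int = 4096):
    """Step 3: process every block and return the list of links kept on the tape."""
    return [sir.store(data[i:i + block_size]) for i in range(0, len(data), block_size)]
```

With four blocks of which three are identical, the repository in this model holds two blocks and the tape keeps four links, which is enough to reconstruct the data exactly.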
4 Experience
The VTL3500 is a low-end VTL designed for small and medium businesses (SMBs).
According to their requirements on data protection, the VTL3500 provides a series of
technologies and functions to help SMBs solve the problems in data backup and
disaster recovery. The VTL3500 increases the return on investment and reduces the
TCO of the IT infrastructure.
The data on heterogeneous hosts is backed up to the VTL3500 over an FC/IP SAN.
The performance of multi-host concurrent backup can reach up to 1.44 TB/h. The
de-duplication ratio (20:1) and compression ratio (2:1) of the VTL3500 increase
the utilization of the storage space and meet the requirements of ever-increasing
backup data.
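To put these figures in perspective, the short Python sketch below converts them into a backup window and a physical capacity estimate. The function names are hypothetical, decimal units (1 TB = 1,000,000 MB) are assumed, and treating the de-duplication and compression ratios as multiplicative is an assumption that this document does not confirm. The quoted figures are best-case values, so real results depend on the data.

```python
def backup_hours(data_tb: float, throughput_tb_per_h: float = 1.44) -> float:
    """Hours needed to back up `data_tb` at the quoted peak throughput."""
    return data_tb / throughput_tb_per_h


def throughput_mb_per_s(throughput_tb_per_h: float = 1.44) -> float:
    """Convert TB/h to MB/s using decimal units."""
    return throughput_tb_per_h * 1_000_000 / 3600


def physical_tb(logical_tb: float, dedup_ratio: float = 20.0,
                compression_ratio: float = 2.0) -> float:
    """Physical space for `logical_tb` of backup data.

    Assumption: the two ratios combine multiplicatively (20:1 x 2:1 = 40:1).
    """
    return logical_tb / (dedup_ratio * compression_ratio)
```

Under these assumptions, 1.44 TB/h corresponds to about 400 MB/s, 7.2 TB of data needs about 5 hours at peak rate, and 40 TB of logical backup data occupies about 1 TB of disk.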
The production data is backed up to the VTL3500 via backup servers. Through the
auto archiving/tape caching function, the user can export the data to physical tape
libraries, thus implementing data archiving. When the data needs to be recovered, the
user can read the backup data directly from the VTL3500 to achieve a high recovery
performance. The user can also read the archived data on the tapes of the physical
tape libraries to implement data recovery.
The data center and the remote disaster recovery center are each deployed with
one VTL3500. Through remote replication, data is copied to the disaster recovery
center over the wide area network (WAN). The remote replication function of the
VTL3500 supports incremental replication. In addition, it supports replication
after de-duplication to reduce bandwidth usage. The VTL3500 can encrypt
the data for remote replication to ensure the security of data transfer.
5 Conclusion
VTL technology is indispensable to the storage market because it provides high
backup/recovery performance and can be combined with physical tape libraries
to implement hierarchical storage. De-duplication is an emerging storage
technology that can help solve the problem of soaring costs caused by explosive
data growth.
The VTL3500 developed by Huawei Symantec inherits the advantages of physical
tape libraries and disk arrays. While eliminating the hardware deficiencies of
tape libraries, the VTL3500 provides higher backup/recovery performance than
disk arrays, thus meeting the requirements for various backup windows.
The powerful virtualization capability meets users' requirements for sharing
backup devices.
The on-demand capacity expansion and high de-duplication ratio improve
the utilization of the storage space, thus increasing the return on investment and
reducing the TCO.
The tape caching function can be used to easily deploy the hierarchical storage
system, and implement automatic backup and hierarchical storage. At the same
time, the tape encryption technology eliminates the risks of data leakage and
safeguards the archived data.
Combined with the de-duplication technology and tape encryption technology,
the IP replication function can be used to realize replication with low bandwidth
usage. The user can build a reliable and safe remote data-level disaster
recovery system with low investment.