Authors:
Version: 1.0 (18.11.2015)
Contents
1. Introduction
2.2.5 How to validate the HADR setup and the role assignment
5. Summary
1. Introduction
IBM Spectrum Protect (formerly known as Tivoli Storage Manager / TSM) [1] is designed to protect
business-critical data and applications that require continuous availability and disaster protection.
Highly available Spectrum Protect infrastructures, in conjunction with the data deduplication and
replication features of IBM ProtecTIER [2], minimize RTO and RPO while maximizing space efficiency.
IBM's ProtecTIER solution offers data deduplication and replication features that allow backup data to be
replicated efficiently to offsite locations, without the need to move physical tapes. In addition,
ProtecTIER provides the following benefits:
Deduplication performance of up to 2,500 MB/s for backup, and even higher performance for restore
operations
In Spectrum Protect and ProtecTIER backup environments, ProtecTIER deduplicates and replicates the
backup data, while Spectrum Protect manages and replicates the backup catalog (metadata), which is
stored in an integrated IBM DB2 database [3].
Combining the IP-based replication features of ProtecTIER and Spectrum Protect makes it possible to
design a flexible data protection environment with multi-site redundancy.
This article describes the setup of a multi-site redundant backup environment using Spectrum Protect
together with ProtecTIER. It is based on the experience we gained during a customer implementation and
various tests in the ESCC Mainz Storage Systems Lab.
We will give you a short introduction to the DB2 HADR feature and to the ProtecTIER solution. Further on,
we'll explain how a multi-site redundant backup environment based on Spectrum Protect together with
ProtecTIER is designed.
The following list describes why you should consider using DB2 HADR in your Spectrum Protect
environment:
HADR is a standard feature of DB2, which has been included with TSM since version 6.x, so it is
ready to use.
Using HADR only for the DB2 bundled with TSM requires no additional licenses.
HADR communication is managed by the database, using standard TCP/IP networks, so there
are no special requirements regarding disk subsystems or other HW or SW.
HADR is easy to set up and manage. Only a few commands are required to configure HADR on
an existing TSM instance.
HADR allows cluster features to be implemented at the application layer, with no need for operating
system cluster support.
Starting with DB2 v10.1, up to three HADR standby databases can be set up for a primary database. This
feature is available with Spectrum Protect (TSM) v7.1, which contains DB2 v10.5.
One system needs to be designated as the principal standby, while additional standby systems can be
added as auxiliary standbys.
All HADR sync modes are supported on the principal standby, but the synchronization mode of the
auxiliary standbys is always SUPERASYNC.
SYNC: Log write on primary requires replication to the persistent storage on the standby.
NEARSYNC: Log write on primary requires replication to the memory on the standby.
ASYNC: Log write on primary requires a successful send to standby (receive is not guaranteed).
SUPERASYNC: Log write on primary does not wait for replication to the standby.
The following diagram shows an example of a DB2 HADR multiple standby environment:
To allow log updates from the primary DB to be applied to the standby DB, the standby database needs
to be initialized. This is done by restoring an offline backup on the target DB:
1. Back up the DB on the primary TSM server host:
Back up the DB2 database to a shared medium (db2 backup db tsmdb1 to /nfsdir/hadr)
Note: At this point, do not start the TSM server application again!
3. Continue to configure HADR for both DBs, still without starting TSM.
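The initialization described above can be sketched as a small dry-run script that only prints the DB2 commands instead of executing them. The backup command is the one given in step 1. The restore command and the helper name print_init_commands are assumptions added for illustration, so check them against the DB2 documentation for your version before use.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands, does not execute them.
# print_init_commands is a hypothetical helper, not part of DB2 or TSM.
# Usage: print_init_commands DB_NAME BACKUP_DIR
print_init_commands() {
    db_name="$1"      # e.g. tsmdb1
    backup_dir="$2"   # shared directory reachable from all hosts

    echo "# On the primary host (TSM halted, database offline):"
    echo "db2 backup db ${db_name} to ${backup_dir}"
    echo "# On each standby host (assumed restore step):"
    echo "db2 restore db ${db_name} from ${backup_dir}"
}

print_init_commands tsmdb1 /nfsdir/hadr
```

Printing the commands first makes it easy to review them per host before anything is run against a production TSM instance.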
2.2.2
The following table shows the DB2 HADR parameters that need to be configured properly in a multi-target
environment:
Parameter          Description
hadr_local_host    Hostname of the local HADR DB
hadr_local_svc     TCP port (or service name) of the local HADR service
hadr_target_list   Defines a list of all databases (host:port) that participate in a HADR multiple
                   standby environment. It starts with the principal standby, followed by all
                   auxiliary standbys (assuming the local system will become the primary DB)
hadr_remote_host   Hostname of the HADR remote DB/peer (which system will be my principal standby
                   if I become the primary DB?)
hadr_remote_inst   DB2 instance name of the HADR remote DB
hadr_remote_svc    TCP port (or service name) of the remote HADR service
hadr_syncmode      Log shipping synchronization mode between primary and principal standby
                   (auxiliary standbys always use SUPERASYNC)
hadr_timeout       Time in seconds that HADR waits for a response from its peer before it
                   considers the connection failed
TSM_SERVER_A (PRIMARY)
TSM_SERVER_B (PRINCIPAL STANDBY)
TSM_SERVER_C (AUXILIARY STANDBY)
All systems will use TCP port 60111 for the remote HADR service.
The primary server will use a replication type of SYNC to the principal standby.
Configuration of TSM_SERVER_A:
su - tsminst1
db2 update db cfg for tsmdb1 using hadr_local_host TSM_SERVER_A
db2 update db cfg for tsmdb1 using hadr_local_svc 60111
db2 update db cfg for tsmdb1 using hadr_target_list "TSM_SERVER_B:60111|TSM_SERVER_C:60111"
db2 update db cfg for tsmdb1 using hadr_remote_host TSM_SERVER_B
db2 update db cfg for tsmdb1 using hadr_remote_inst tsminst1
db2 update db cfg for tsmdb1 using hadr_remote_svc 60111
db2 update db cfg for tsmdb1 using hadr_syncmode SYNC
db2 update db cfg for tsmdb1 using hadr_timeout 120
Configuration of TSM_SERVER_B:
su - tsminst1
db2 update db cfg for tsmdb1 using hadr_local_host TSM_SERVER_B
db2 update db cfg for tsmdb1 using hadr_local_svc 60111
db2 update db cfg for tsmdb1 using hadr_target_list "TSM_SERVER_A:60111|TSM_SERVER_C:60111"
db2 update db cfg for tsmdb1 using hadr_remote_host TSM_SERVER_A
db2 update db cfg for tsmdb1 using hadr_remote_inst tsminst1
db2 update db cfg for tsmdb1 using hadr_remote_svc 60111
db2 update db cfg for tsmdb1 using hadr_syncmode SYNC
db2 update db cfg for tsmdb1 using hadr_timeout 120
Configuration of TSM_SERVER_C:
su - tsminst1
db2 update db cfg for tsmdb1 using hadr_local_host TSM_SERVER_C
db2 update db cfg for tsmdb1 using hadr_local_svc 60111
db2 update db cfg for tsmdb1 using hadr_target_list "TSM_SERVER_A:60111|TSM_SERVER_B:60111"
db2 update db cfg for tsmdb1 using hadr_remote_host TSM_SERVER_A
db2 update db cfg for tsmdb1 using hadr_remote_inst tsminst1
db2 update db cfg for tsmdb1 using hadr_remote_svc 60111
db2 update db cfg for tsmdb1 using hadr_syncmode SUPERASYNC
db2 update db cfg for tsmdb1 using hadr_timeout 120
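The three configurations above follow one pattern: the target list contains every other host, and the remote host is the first entry of that list. A minimal dry-run sketch of this pattern follows. The helper name print_hadr_config and its argument order are assumptions for illustration. It only prints the commands and never calls DB2.

```shell
#!/bin/sh
# Dry-run sketch: prints the HADR configuration commands for one host.
# print_hadr_config is a hypothetical helper; it never calls DB2 itself.
# Usage: print_hadr_config LOCAL_HOST SYNCMODE HOST [HOST ...]
print_hadr_config() {
    local_host="$1"
    syncmode="$2"
    shift 2

    port=60111          # HADR service port used on all systems
    target_list=""      # all other hosts, joined with "|"
    remote_host=""      # first other host in the list

    for host in "$@"; do
        if [ "$host" = "$local_host" ]; then
            continue
        fi
        if [ -z "$remote_host" ]; then
            remote_host="$host"
        fi
        target_list="${target_list:+${target_list}|}${host}:${port}"
    done

    cfg="db2 update db cfg for tsmdb1 using"
    echo "$cfg hadr_local_host $local_host"
    echo "$cfg hadr_local_svc $port"
    echo "$cfg hadr_target_list \"$target_list\""
    echo "$cfg hadr_remote_host $remote_host"
    echo "$cfg hadr_remote_inst tsminst1"
    echo "$cfg hadr_remote_svc $port"
    echo "$cfg hadr_syncmode $syncmode"
    echo "$cfg hadr_timeout 120"
}

# Example: commands for the auxiliary standby
print_hadr_config TSM_SERVER_C SUPERASYNC TSM_SERVER_A TSM_SERVER_B TSM_SERVER_C
```

Passing the same host list in the same order for every system keeps the generated target lists consistent with each other, which is what the manual configuration above also does.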
The following tasks have to be performed to start all involved HADR databases, standby(s) first,
primary last:
1. Start the database on the principal standby:
su - tsminst1
db2 start hadr on db tsmdb1 as standby
2. Start the database on the auxiliary standby(s):
su - tsminst1
db2 start hadr on db tsmdb1 as standby
3. Start the database on the primary:
su - tsminst1
db2 start hadr on db tsmdb1 as primary
4. Start TSM
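The start order above (all standbys first, primary last) can be captured in a short dry-run sketch. The helper name print_start_sequence and the "host: command" output format are assumptions for illustration. The script prints what to run on which host and executes nothing.

```shell
#!/bin/sh
# Dry-run sketch: prints the HADR start commands in the required order.
# print_start_sequence is a hypothetical helper; it does not call DB2.
# Usage: print_start_sequence PRIMARY STANDBY [STANDBY ...]
print_start_sequence() {
    primary="$1"
    shift

    # Standbys are started first, in the order given (principal first).
    for standby in "$@"; do
        echo "${standby}: db2 start hadr on db tsmdb1 as standby"
    done

    # The primary is started last.
    echo "${primary}: db2 start hadr on db tsmdb1 as primary"
}

print_start_sequence TSM_SERVER_A TSM_SERVER_B TSM_SERVER_C
```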
2.2.5 How to validate the HADR setup and the role assignment
Executing the db2pd command on the primary system provides an at-a-glance view of all peers:
[tsminst1@TSM_Server_A ~]$ db2pd -db tsmdb1 -hadr
Database Member 0 -- Database TSMDB1 -- Active -- Up 8 days 23:41:07 -- Date 2015-04-04-16.56.03.593986

HADR_ROLE = PRIMARY
REPLAY_TYPE = PHYSICAL
HADR_SYNCMODE = SYNC
STANDBY_ID = 1
HADR_STATE = PEER
PRIMARY_MEMBER_HOST = TSM_SERVER_A
PRIMARY_INSTANCE = tsminst1
STANDBY_MEMBER_HOST = TSM_SERVER_B
STANDBY_INSTANCE = tsminst1
HADR_CONNECT_STATUS = CONNECTED
HADR_CONNECT_STATUS_TIME = 03/26/2015 16:15:03.336112 (1427382903)
HEARTBEAT_INTERVAL(seconds) = 30
HADR_TIMEOUT(seconds) = 120
PRIMARY_LOG_FILE,PAGE,POS = S0000068.LOG, 3577, 36381110307
STANDBY_LOG_FILE,PAGE,POS = S0000068.LOG, 3577, 36381110307
STANDBY_REPLAY_LOG_FILE,PAGE,POS = S0000068.LOG, 3577, 36381110307
PRIMARY_LOG_TIME = 04/04/2015 16:51:51.000000 (1428159111)
STANDBY_LOG_TIME = 04/04/2015 16:51:51.000000 (1428159111)
STANDBY_REPLAY_LOG_TIME = 04/04/2015 16:51:51.000000 (1428159111)

HADR_ROLE = PRIMARY
REPLAY_TYPE = PHYSICAL
HADR_SYNCMODE = SUPERASYNC
STANDBY_ID = 2
HADR_STATE = REMOTE_CATCHUP
PRIMARY_MEMBER_HOST = TSM_SERVER_A
PRIMARY_INSTANCE = tsminst1
STANDBY_MEMBER_HOST = TSM_SERVER_C
STANDBY_INSTANCE = tsminst1
HADR_CONNECT_STATUS = CONNECTED
HADR_CONNECT_STATUS_TIME = 03/26/2015 16:15:06.459274 (1427382906)
HEARTBEAT_INTERVAL(seconds) = 30
HADR_TIMEOUT(seconds) = 120
PRIMARY_LOG_FILE,PAGE,POS = S0000068.LOG, 3577, 36381110307
STANDBY_LOG_FILE,PAGE,POS = S0000068.LOG, 3577, 36381110307
STANDBY_REPLAY_LOG_FILE,PAGE,POS = S0000068.LOG, 3577, 36381110307
PRIMARY_LOG_TIME = 04/04/2015 16:51:51.000000 (1428159111)
STANDBY_LOG_TIME = 04/04/2015 16:51:51.000000 (1428159111)
STANDBY_REPLAY_LOG_TIME = 04/04/2015 16:51:51.000000 (1428159111)
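For routine checks, the db2pd output can be condensed to one line per standby. The following is a minimal sketch that assumes the output consists of "NAME = VALUE" lines with the field names shown in this section. The helper name summarize_hadr is hypothetical, and the example feeds it a shortened sample instead of querying a live system.

```shell
#!/bin/sh
# Sketch: condense "db2pd -db <db> -hadr" output (read from stdin) into one
# line per standby: host, sync mode, HADR state, connect status.
# summarize_hadr is a hypothetical helper and assumes "NAME = VALUE" lines.
summarize_hadr() {
    awk -F' = ' '
        { name = $1; gsub(/^[ \t]+|[ \t]+$/, "", name) }   # trim the field name
        name == "HADR_SYNCMODE"       { mode  = $2 }
        name == "HADR_STATE"          { state = $2 }
        name == "STANDBY_MEMBER_HOST" { host  = $2 }
        name == "HADR_CONNECT_STATUS" { print host, mode, state, $2 }
    '
}

# Example with a shortened sample instead of a live system
summarize_hadr <<'EOF'
HADR_SYNCMODE = SYNC
HADR_STATE = PEER
STANDBY_MEMBER_HOST = TSM_SERVER_B
HADR_CONNECT_STATUS = CONNECTED
HADR_SYNCMODE = SUPERASYNC
HADR_STATE = REMOTE_CATCHUP
STANDBY_MEMBER_HOST = TSM_SERVER_C
HADR_CONNECT_STATUS = CONNECTED
EOF
```

On a real primary, the same function could be fed with the output of db2pd -db tsmdb1 -hadr through a pipe. The sketch relies on HADR_CONNECT_STATUS appearing after the other three fields within each standby block, as it does in the output above.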
HADR takes care of the TSM server database, but what about the storage pool data?
Ensure that the TSM client backup data is also accessible at the failover location. This can be
achieved e.g. by using a shared file system, a copy pool, or other data replication
techniques (e.g. ProtecTIER).
Ensure that the TSM clients can access the TSM instance on the failover host, e.g. by using
service IP addresses or DNS changes.
Properly size the LAN / WAN connections from the TSM clients to the TSM failover system.
A DB2 HADR standby database can take over the primary role, e.g. if the primary system fails or if
maintenance is planned. Accordingly, there are two failover types:
Graceful Failover:
If the primary and standby systems are both available, they can switch roles. This is used e.g.
for maintenance purposes.
Forced Failover:
This method is used to bring up a standby system in the primary role after the primary system
has failed.
2.3.1
Run the following commands to fail over gracefully from a primary server to a standby system:
On the primary server (skip these steps for a forced failover):
1. Halt the TSM application (this also stops DB2)
2. Restart DB2 in the standby role:
su - tsminst1
db2start
db2 start hadr on db tsmdb1 as standby
3. On one of the standby servers (preferably the principal standby), execute the takeover command
and validate that it was successful:
su - tsminst1
db2 takeover hadr on db tsmdb1 by force
db2pd -db tsmdb1 -hadr
4. Start the TSM application
To fail back, perform the steps above in the opposite direction.
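The difference between the two failover types is only whether the steps on the old primary are performed. The following dry-run sketch makes that explicit. The helper name print_failover_steps and the labels "old primary" and "new primary" are assumptions for illustration. Nothing is executed, and the commands are the ones listed in the steps above.

```shell
#!/bin/sh
# Dry-run sketch: prints the failover steps for the chosen failover type.
# print_failover_steps is a hypothetical helper; it does not call DB2 or TSM.
# Usage: print_failover_steps graceful|forced
print_failover_steps() {
    mode="$1"
    case "$mode" in
        graceful)
            # Old primary is still available and is demoted to standby first.
            echo "old primary: halt the TSM application"
            echo "old primary: db2start"
            echo "old primary: db2 start hadr on db tsmdb1 as standby"
            ;;
        forced)
            # The failed primary is skipped.
            :
            ;;
        *)
            echo "unknown failover type: $mode" >&2
            return 1
            ;;
    esac

    echo "new primary: db2 takeover hadr on db tsmdb1 by force"
    echo "new primary: db2pd -db tsmdb1 -hadr"
    echo "new primary: start the TSM application"
}

print_failover_steps graceful
```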
A Spectrum Protect server (HADR primary) has two HADR standby servers. The principal
standby is located in a second data center at the main location and acts as a failover system, e.g. for
hardware maintenance. The auxiliary standby acts as a failover system for DR purposes.
Perform a DB2 HADR takeover from DC1 to DC2, monitor peering, and finally start TSM in DC2
Delete all drives and all paths to the VTL library in TSM (e.g. by using perform libaction)
Re-define the library path and all drives using the proper device names on the failover host
Enable DR mode for the ProtecTIER in DC2 to stop incoming replication traffic from DC1
Use the PT GUI to move the replicated cartridges to the prepared VTL partition in DC2
Check in the libvolumes to the re-defined library in TSM (first check in scratch, then private)
All replicated tape cartridges are read-only, which allows data to be restored
To perform new backups on the failover site, create new virtual tape cartridges (read/write)
5. Summary
DB2 HADR offers a straightforward way to replicate a Spectrum Protect server database (the metadata) to
one or more (standby) target sites. Combined with the native IP replication feature of the IBM ProtecTIER
VTL system, it is possible to build easy-to-use, efficient, highly available, high-capacity and
high-performance backup solutions that also provide disaster protection.
Appendix C: References
[1] IBM Spectrum Protect (TSM) Home page:
http://www-03.ibm.com/software/products/en/tivoli-storage-manager-family
Disclaimer
The information contained in this documentation is provided for informational purposes only. While efforts
were made to verify the completeness and accuracy of the information provided, it is provided as is
without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising
out of the use of, or otherwise related to, this documentation or any other documentation. Nothing
contained in this documentation is intended to, nor shall have the effect of, creating any warranties or
representations from IBM (or its suppliers or licensors), or altering the terms and conditions of the
applicable license agreement governing the use of IBM software.
The Techdocs information, tools and documentation ("Materials") are being provided to IBM Business
Partners to assist them with customer installations. Such Materials are provided by IBM on an "as-is"
basis. IBM makes no representations or warranties regarding these Materials and does not provide any
guarantee or assurance that the use of such Materials will result in a successful customer installation.
These Materials may only be used by authorized IBM Business Partners for installation of IBM products
and otherwise in compliance with the IBM Business Partner Agreement.
Trademarks
The following terms are trademarks or registered trademarks of the IBM Corporation in the United States
or other countries or both: IBM, ProtecTIER, System Storage, Spectrum Protect and Tivoli.
Linux is a registered trademark of Linus Torvalds.
Other company, product, and service names may be trademarks or service marks of others.