

IBM Virtualization Engine TS7700 Series Best Practices Understanding, Monitoring and Tuning the TS7700 Performance Version 1.5

Jim Fisher, fisherja@us.ibm.com, IBM Advanced Technical Support - Americas
Carl Bauske, cabauske@us.ibm.com, IBM Advanced Technical Support - Americas

Copyright IBM Corporation, 2012



1 Introduction
1.1 Summary of Changes
1.2 Reader Comments Please!
1.3 Related Documents
2 Understanding Data Flow through the TS7700
2.1 Tasks Performed by the TS7700 Components
2.1.1 Tasks Performed by the CPU
2.1.2 Tasks Performed by the TS7700 Cache
2.1.3 Tasks Performed by the TS7740 Tape Drives
2.2 Performance is Shared Amongst the Mounted Virtual Drives
2.3 How Copies Are Prioritized
2.4 Data Movement through Cache
2.4.1 TS7700 Single Cluster
2.4.2 TS7700 Two-Cluster Grid - Balanced Mode
2.4.3 TS7700 Two-Cluster Grid - Preferred Mode
2.4.4 TS7700 Three-Cluster Grid - HA and DR Mode
2.4.5 TS7700 Four Cluster Grid Considerations
2.4.6 TS7700 Hybrid Grid Considerations
2.4.7 Cluster Families and Cooperative Replication
2.4.8 Retain Copy Mode
3 Understanding Throttling in the TS7700
3.1 What are the Throttling Types?
3.1.1 Host Write Throttle
3.1.2 Copy Throttle
3.1.3 Deferred Copy Throttle
3.2 What Causes Host Write and Copy Throttle to be turned on?
3.3 What Causes Deferred Copy Throttle to be turned on?
3.4 How are Pre-Migrate Tasks Managed?
3.5 Immediate Copy set to Immediate Deferred
3.6 Synchronous Mode Copy set to Synchronous Deferred
4 What Should I Monitor?
4.1 Stakeholders
4.2 Storage Management
4.3 Storage Administration
4.4 Operations
4.5 Plotting Cache Throughput from VEHSTATS
4.5.1 Interpreting Cache Throughput
5 Tuning the TS7700
5.1 Power5 (V06/VEA) versus Power7 (V07/VEB) Performance Considerations
5.2 Deferred Copy Throttle (DCT) Value and Threshold
5.2.1 Deferred Copy Throttle (DCT) Value
5.3 Preferred Pre-migration and Pre-migration Throttling Thresholds
5.3.1 Preferred Pre-migration Threshold
5.3.2 Pre-migration Throttling Threshold
5.3.3 Disabling Host Write Throttle due to Immediate Copy
5.4 Making Your Cache Deeper



5.5 Back-End Drives
5.6 Grid Links
5.6.1 Provide Sufficient Bandwidth
5.6.2 Grid Link Performance Monitoring
5.7 Reclaim Operations
5.7.1 Reclaim Threshold
5.7.2 Inhibit Reclaim Schedule
5.7.3 Adjusting the Maximum Number of Reclaim Tasks
5.8 Limiting Number of Pre-Migration Drives (max drives)
5.9 Avoid Copy Export during Heavy Production Periods
References
Disclaimers



1 Introduction
The IBM Virtualization Engine TS7700 Series is the latest in the line of tape virtualization products that has revolutionized the way mainframe customers utilize their tape resources. Tape virtualization subsystems have become an essential part of most mainframe customers' operations, and massive amounts of key customer data are placed under the control of the subsystem. The IBM TS7700 Virtualization Engine, with its virtual tape drives, disk cache, and integrated hierarchical storage management, is designed to perform its tasks with no customer involvement once it has been configured.

The TS7700 has a set of parameters used to regulate the performance of the subsystem; by altering the configuration and these parameters, each customer can effectively customize the performance of each TS7700 subsystem. Performance varies between standalone, 2, 3, 4, 5, and 6-cluster configurations, so make sure this is factored into any planned change of your configuration.

This document will help you understand the inner workings of the TS7700 so that you can make educated adjustments to the subsystem to achieve peak performance. It starts by describing the flow of data through the subsystem. Next, the various throttles used to regulate the subsystem are described. Performance monitoring is then discussed, finishing with how and when to tune the TS7700.

1.1 Summary of Changes


Version 1.0 - April 15, 2009
- Initial release

Version 1.1 - May 2009
- Updated description of Performance is Shared Amongst the Mounted Virtual Drives
- Clarified that DCT will be reported in VEHSTATS with a future level of Release 1.5
- Corrected minor typos, etc.

Version 1.2 - September 2009
- Modified diagrams to correctly show flow of data for cross-cluster mounts
- Added description of the Deferred Copy Throttle (DCT) threshold tuning knob, including recommendations
- Added description of how to plot cache throughput from VEHSTATS
- Added description of how PG0 and PG1 affect immediate and deferred copy priorities
- Miscellaneous updates and corrections

Version 1.3 - September 29, 2009
- Revised recommendations for Grid Link Performance Monitoring
- Added considerations for four-cluster grids and hybrid grids
- Added discussion of Retain Copy Mode
- Added discussion of Cluster Families
- Added discussion of when immediate copies are changed to immediate-deferred
- Added discussion of disabling Host Write Throttle due to immediate copies taking too long

Version 1.4 - December 2009



- Added discussion of setting the maximum number of reclaim tasks with R1.6

Version 1.5 - August 2012
- Fixed typo in section 2.3; changed Management Class to Storage Class relative to PG0 and PG1
- Major updates to reflect the Power7 engine (VEB and VEA) and Release R2.0 and beyond
- Updated to include Synchronous Mode Copy

1.2 Reader Comments Please!


Please send any corrections or additions to the authors. If you have any monitoring recommendations or tuning techniques, please send them along so we can share them.

1.3 Related Documents


- White papers, presentations, and tools available on Techdocs (search for TS7700)
  o http://www-03.ibm.com/support/techdocs/atsmastr.nsf/Web/TechDocs
- TS7700 related publications and InfoCenter
  o http://publib.boulder.ibm.com/infocenter/ts7700/cust/index.jsp
- TS7700 Redbooks (search for TS7700)
  o http://www.redbooks.ibm.com/



2 Understanding Data Flow through the TS7700


This chapter provides an understanding of data flow through the TS7700, both the TS7720 and TS7740. This knowledge is the first step in analyzing your configuration and determining how to tune the subsystem.

2.1 Tasks Performed by the TS7700 Components


The figure below illustrates the major components of the TS7700. For the TS7720, the tape drives are not present, but the other components remain. Various tuning points are available that can be used to favor certain tasks over others; these tuning knobs are discussed in later chapters. Let's first understand each component's responsibility.

Figure 1 - Tasks Performed by TS7700 Components

2.1.1 Tasks Performed by the CPU


The CPU is involved in all aspects of the TS7700 functions. When evaluating performance, it is important to remember that the CPU is shared by all of these functions. The original TS7700s were powered by a Power5 engine; since May 2011, the TS7700s have been powered by a Power7 engine, which has significantly more CPU power. This means the CPU does not become a performance bottleneck as soon with the Power7 engines.

- Operating System - This is the overhead used by the AIX operating system.
- Host Read/Write from/to cache - This is the compressed host data read from and written to the Host Bus Adapters (HBAs). This data passes through buffers in the CPU memory. The data rate can be limited by the number of Performance Increments installed. Also, the Host Write Throttle affects the read/write rate. This throttle is discussed in other sections.
- Copy data to other clusters - This is data copied from this cluster's cache to other clusters as synchronous (S), immediate (RUN or R), or deferred (D) copies. Deferred copies are regulated by the Deferred Copy Throttle (DCT), also known as the Deferred Copy Read Throttle. Synchronous and immediate copies are not throttled by the originating cluster.
- Copy data from other clusters - This is data copied into this cluster's cache from other clusters as synchronous (S), immediate (RUN or R), or deferred (D) copies. Immediate and deferred copies are regulated by the Copy Throttle. Synchronous copies are not regulated by the Copy Throttle.
- Remote write and read - Remote mounts include both logical volumes accessed from another cluster (machine B) through this cluster's (machine A) cache, and this cluster (machine A) accessing a logical volume in another cluster's (machine B) cache. A remote mount for write that originates from another cluster is regulated with the Host Write Throttle. This also applies to a synchronous mode copy, which is essentially a remote write. Remote mounts do not pass data through the cache of the cluster whose virtual device was allocated for the mount.
- Copy Export (TS7740 only) - When a Copy Export operation occurs, data to be exported that resides only in cache is pre-migrated to physical tape. Also, each physical tape to be exported is mounted and a snapshot of the TS7700 database is written to it. This can be regulated by requesting the Copy Export outside your heavy production period.
- Pre-migrate (TS7740 only) - Pre-migration is when logical volumes that reside only in cache are written to physical tape. Pre-migration includes both primary and secondary copies. Pre-migration is regulated by several algorithms, which are discussed in other chapters:
  o Idle Pre-Migration
  o Fast Host Write Pre-Migration
  o Somewhat Busy Pre-Migration Ramp Up
  o Preferred Pre-Migration - the threshold can be set by the customer.
  o Limit on the number of pre-migrate drives on a per-pool basis - this can be set by the customer.
- Recall (TS7740 only) - A recall transfers a logical volume from a physical tape volume to the cache to satisfy a mount.
- Reclaim (TS7740 only) - Reclaim involves transferring logical volumes from one physical tape to another. This data passes through the CPU's memory; however, it does not pass through the disk cache. Reclaim is controlled by the Reclaim Threshold, the Inhibit Reclaim Schedule, the maximum number of reclaim tasks (set using host console request RCLMMAX), and the number of available back-end drives.
- Management Interface - The Management Interface (MI) is a task that consumes CPU power. The MI is used to configure, operate, and monitor the TS7700.

2.1.2 Tasks Performed by the TS7700 Cache


The TS7700 cache is the focal point for most data transfers, everything except reclaim activity. When evaluating performance, remember that the cache has a finite size and I/O bandwidth to satisfy a variety of tasks. Equally important to remember is that all data moved within the TS7700 is compressed host data; the Host Bus Adapters compress and decompress the data as it is written and read by the host. The cache is used by:
- Host read/write
- Pre-migrate
- Recall
- Copies to/from other clusters
- Remote write and read

2.1.3 Tasks Performed by the TS7740 Tape Drives


The TS7740 physical back-end tape drives read and write data for a variety of reasons, and the back-end drives are shared amongst these tasks. Balancing their usage is controlled by default algorithms that can be adjusted somewhat by the user. It is important to monitor the use of the physical drives to ensure there are enough back-end drives to handle the workload. The back-end drives are used for:
- Copy Export
- Pre-Migrate
- Recall
- Reclaim

2.2 Performance is Shared Amongst the Mounted Virtual Drives


The host I/O rate is shared by the active mounted virtual drives. An active virtual drive is one where a logical volume is mounted and the host is actively reading or writing to the volume. A virtual device may be mounted, but if it is not being read from or written to, it is not using any host bandwidth. If there are 20 mounted and active virtual drives, each one will receive, on average, 5% of the current host I/O bandwidth. If there are 40 mounted and active virtual drives, each one will receive, on average, 2.5% of the current host I/O bandwidth. The host I/O bandwidth depends on several factors, including how much of the TS7700's CPU capacity and cache throughput is being used by housekeeping tasks (pre-migrate, immediate and deferred copies, reclaim, and copy export) and how much data the host is attempting to drive into and out of the TS7700.
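As a rough illustration of this sharing (simple arithmetic, not a TS7700 formula; the 400 MB/s figure is only an example), the average share per drive is the current host bandwidth divided by the number of active virtual drives:

```python
def average_share_per_drive(host_mb_per_sec: float, active_drives: int) -> float:
    """Average host bandwidth each active virtual drive receives (MB/s)."""
    return host_mb_per_sec / active_drives if active_drives else 0.0

# Example: the same host bandwidth spread across 20 or 40 active virtual drives.
print(average_share_per_drive(400, 20))  # 20.0 MB/s each (5% of the total)
print(average_share_per_drive(400, 40))  # 10.0 MB/s each (2.5% of the total)
```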

2.3 How Copies Are Prioritized


In a multi-cluster grid, copies of volumes are made from one cluster to another. Copies are made in either a synchronous, immediate, or deferred manner. There are actually five classifications of copies: Synchronous, Synchronous-Deferred, Immediate, Immediate-Deferred, and Deferred. Synchronous-deferred copies are volumes that were originally defined as synchronous but were changed to deferred copies because the target cluster could not be reached. Immediate-deferred copies are volumes that were originally defined as immediate copies but were changed to deferred copies; an immediate copy can be changed to an immediate-deferred copy if the cluster that is to receive the copy is not online or the copy has taken too long. The TS7700 processes synchronous-deferred copies as top priority, followed by immediate, immediate-deferred, and finally deferred.

Volumes in a source TS7740 cluster are defined as either PG0 or PG1 volumes via the Storage Class. Within the copy types (synchronous-deferred, immediate, immediate-deferred, and deferred), the TS7740 gives priority to PG0 volumes over PG1 volumes. The reason for prioritizing PG0 over PG1 is to allow the PG0 volumes to be removed from cache as soon as possible (after they are pre-migrated to tape), thus allowing more PG1 volumes to reside in the TS7740 cache. The order in which copies are processed is as follows:
1. Synchronous-Deferred PG0
2. Synchronous-Deferred PG1
3. Immediate PG0
4. Immediate PG1
5. Immediate-Deferred PG0
6. Immediate-Deferred PG1
7. Deferred PG0
8. Deferred PG1

There isn't a similar copy priority scheme for copies to other clusters that originate in a TS7720; copies from a TS7720 do not use a priority scheme based on the Storage Class construct.
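The ordering above behaves like a simple two-level sort key. The sketch below is illustrative only (it is not TS7700 internals) and assumes hypothetical task records of (copy type, preference group):

```python
# Rank copy tasks the way the list above describes: copy type first,
# then PG0 ahead of PG1 within each type (TS7740 source cluster).
COPY_TYPE_RANK = {"sync-deferred": 0, "immediate": 1, "immediate-deferred": 2, "deferred": 3}
PREF_GROUP_RANK = {"PG0": 0, "PG1": 1}

def copy_priority(task):
    copy_type, pref_group = task
    return (COPY_TYPE_RANK[copy_type], PREF_GROUP_RANK[pref_group])

queue = [("deferred", "PG1"), ("immediate", "PG0"), ("sync-deferred", "PG1"),
         ("immediate-deferred", "PG0"), ("deferred", "PG0"), ("immediate", "PG1")]
for task in sorted(queue, key=copy_priority):
    print(task)   # sync-deferred PG1, immediate PG0, immediate PG1, ...
```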


2.4 Data Movement through Cache
This chapter describes the movement of data through the TS7700 cache in various configurations. This is to help you understand the various pieces of data movement that have to share the resources of the TS7700. The discussion does not include TS7740 reclaim activity since the data transferred from one tape drive to another does not pass through the cache.

2.4.1 TS7700 Single Cluster


A single-cluster TS7700 is a simple configuration to understand since there are no communications with a remote cluster to worry about. For both the TS7720 and TS7740, the following data is passed through the subsystem:
- Uncompressed host data is compressed by the Host Bus Adapters (HBAs) and the compressed data is written to cache.
- Compressed data is read from the cache and decompressed by the HBA.

Figure 2 - TS7720 Single Cluster Data Flow

For the TS7740, we add back-end tape drives for recalls and pre-migrates.
- If a read is requested and the logical volume does not exist in the cache, a stacked physical tape is mounted and the logical volume is read into cache. The host then reads the logical volume from the TS7740 cache.
- Host data will be written from cache to the physical stacked volumes in a process called pre-migrate.





Figure 3 - TS7740 Single Cluster Data Flow




2.4.2 TS7700 Two-Cluster Grid - Balanced Mode
Let's look at a two-cluster, balanced mode grid to explain the data movement through the subsystem. Balanced mode means the host has virtual devices in both clusters varied online. For a scratch mount, host allocation can select a virtual device from either cluster. For both the TS7720 and TS7740, the following data is moved through the subsystem:
- Local write with no remote copy (Copy Consistency Point (CCP) of RN) includes writing the compressed host data to cache.
- Local write with remote copy (CCP of DD, RR, or SS) includes writing the compressed host data to cache and to the grid.
  o For a CCP of SS, the logical volume data is written to both clusters at the same time. When a tape sync event occurs, either explicit or implicit, the sync event isn't complete until all data written up to that point has been written to non-volatile storage in both clusters.
  o For a CCP of RR, the copy is immediate and must complete before Rewind/UNload (RUN) is complete. Copies are placed in the immediate copy queue.
  o For a CCP of DD, the copy is deferred; the completion of the RUN is not tied to the completion of the copy operation. Copies are placed in the deferred copy queue.
- Remote write with no local copy (CCP of NR) includes writing compressed host data to the grid.
- Local read with a local cache hit. Here the compressed host data is read from the local cache.
- Local read with a remote cache hit. Here the compressed host data is read from the remote cache via the grid link.
- Synchronous, immediate, and deferred copies from the remote cluster. Here compressed host data is received on the grid link and copied into the local cache.
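To make the CCP notation concrete, here is a small, hypothetical helper (not a TS7700 API) that spells out what each position of a two-cluster CCP string such as "RN", "DD", "RR", or "SS" means for the cluster at that position:

```python
CCP_MEANING = {
    "S": "synchronous: data written to this cluster during host write (sync points honored)",
    "R": "RUN: a valid copy must exist here before Rewind/Unload completes (immediate copy)",
    "D": "deferred: a copy is queued and made after Rewind/Unload completes",
    "N": "no copy: this cluster does not receive a copy",
}

def describe_ccp(ccp: str) -> None:
    """Print the copy action implied for each cluster position of a CCP string."""
    for cluster, code in enumerate(ccp):
        print(f"cluster {cluster}: {CCP_MEANING[code]}")

describe_ccp("RN")  # local RUN consistency, no remote copy
describe_ccp("DD")  # deferred copies at both consistency points
```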





Figure 4 - TS7720 Two-Cluster Grid - Balanced Mode Data Flow

For the TS7740, we add back-end tape drives for recalls and pre-migrates.
- A write with copy or no copy to another cluster includes the pre-migrate process.
- A read with a local cache miss will result in one of the following:
  o A remote mount without recall
  o A recall into local cache from a local stacked volume
  o A remote mount requiring a recall from a remote stacked volume
- Host data will be written from cache to the physical stacked volumes in a process called pre-migrate. This includes data written as a result of a local mount, volumes copied from other clusters, and a remote mount to this cluster for write (not shown).





Figure 5 - TS7740 Two-Cluster Grid - Balanced Mode Data Flow




2.4.3 TS7700 Two-Cluster Grid - Preferred Mode
Let's look specifically at a two-cluster, preferred mode grid to explain the data movement through the subsystem. Preferred mode means the host has virtual devices in just one cluster varied online. Host allocation will select a virtual device only from the cluster with varied-on virtual devices. For both the TS7720 and TS7740, the following data is moved through the subsystem:
- Local write with no remote copy (CCP of RN) includes writing the compressed host data to cache.
- Local write with remote copy (CCP of RD, RR, or SS) includes writing the compressed host data to cache and to the grid.
  o For a CCP of SS, the logical volume data is written to both clusters at the same time. When a tape sync event occurs, either explicit or implicit, the sync event isn't complete until all data written up to that point has been written to non-volatile storage in both clusters.
  o For a CCP of RR, the copy is immediate and must complete before Rewind/UNload (RUN) is complete. Copies are placed in the immediate copy queue.
  o For a CCP of RD, the copy is deferred; the completion of the RUN is not tied to the completion of the copy operation. Copies are placed in the deferred copy queue.
- Remote write with no local copy (CCP of NR) includes writing compressed host data to the grid (not shown in the diagram below).
- Local read with a local cache hit. Here the compressed host data is read from the local cache.
- Local read with a remote cache hit. Here the compressed host data is read from the remote cache via the grid link.

Figure 6 - TS7720 Two-Cluster Grid - Preferred Mode Data Flow




For the TS7740, we add back-end tape drives for recalls and pre-migrates.
- A write with no copy and a write with copy also include the pre-migrate process.
- A read with a local cache miss will result in one of the following:
  o A remote mount without recall
  o A recall into local cache from a local stacked volume
  o A remote mount requiring a recall from a remote stacked volume
- Host data will be written from cache to the physical stacked volumes in a process called pre-migrate. This includes data written as a result of a local mount for write.

Figure 7 - TS7740 Two-Cluster Grid - Preferred Mode Data Flow




2.4.4 TS7700 Three-Cluster Grid - HA and DR Mode
Let's look specifically at a three-cluster grid configured in balanced mode between two production High Availability (HA) clusters and a remote cluster with virtual devices varied offline, and examine the data movement through the subsystem. Balanced mode means the host has virtual devices in both production clusters varied online; host allocation can select a virtual device from either cluster. For both the TS7720 and TS7740, the following data is moved through the subsystem:
- Local write with no remote copy (CCP of RNN) includes writing the compressed host data to cache.
- Local write with HA copy (CCP of RDN, RRN, or SSN) includes writing the compressed host data to cache and to the grid.
  o For a CCP of SSN, the logical volume data is written to both HA clusters at the same time. When a tape sync event occurs, either explicit or implicit, the sync event isn't complete until all data written up to that point has been written to non-volatile storage in both clusters.
  o For a CCP of RRN, the copy to the other HA cluster is immediate and must complete before Rewind/UNload (RUN) is complete. Copies are placed in the immediate copy queue.
  o For a CCP of DDN, the copy to the other HA cluster is deferred; the completion of the RUN is not tied to the completion of the copy operation. Copies are placed in the deferred copy queue.
- Local write with HA and remote copy (CCP of RRD, RDD/DDD, or SSD) includes writing the compressed host data to cache and to the grid.
  o For a CCP of SSD, the logical volume data is written to both HA clusters at the same time. When a tape sync event occurs, either explicit or implicit, the sync event isn't complete until all data written up to that point has been written to non-volatile storage in both clusters. The copy to the remote cluster is deferred.
  o For a CCP of RRD, the copy to the HA cluster is immediate and the copy to the remote cluster is deferred. The immediate copy will be sourced from the mounting cluster. The remote copy will be sourced from either of the two HA clusters; grid link performance and other factors are used to determine which cluster the remote cluster will source the deferred copy from.
  o For a CCP of RDD/DDD, the copies to the other HA cluster and the remote cluster are deferred. Each cluster may source the copy from either of the other two clusters. Which cluster has a valid copy, grid link performance, and other factors determine which cluster each of the two clusters will source its deferred copy from.
- Remote write with no local copy (CCP of NRN or NNR) includes writing compressed host data to the grid.
- Local read with a local cache hit. Here the compressed host data is read from the local cache.
- Local read with a remote cache hit. Here the compressed host data is read from one of the other two clusters' caches via the grid link.
- Immediate and deferred copies from the other HA cluster. Here compressed host data is received on the grid link and copied into the local cache.

Figure 8 - TS7720 Three-Cluster Grid - HA and DR Mode Data Flow

For the TS7740, we add back-end tape drives for recalls and pre-migrates.
- A write with no copy and a write with copy also include the pre-migrate process.
- A read with a local cache miss will result in one of the following:
  o A remote mount without recall
  o A recall into local cache from a local stacked volume
  o A remote mount requiring a recall from a remote stacked volume
- Host data will be written from cache to the physical stacked volumes in a process called pre-migrate. This includes data written as a result of a local mount for write, volumes copied from other clusters, and a mount for write from the HA cluster (not shown).





Figure 9 - TS7740 Three-Cluster Grid - HA and DR Mode Data Flow

2.4.5 TS7700 Four Cluster Grid Considerations


With a four-cluster grid, the copies to and from the fourth cluster (cluster 3) must also be considered. The three-cluster grid figure from the previous section can be used to understand the data flow; just add the data flow to and from cluster 3, if applicable.

2.4.5.1 Two, Two-Clusters in One!


In one possible four-cluster grid there are two local clusters (clusters 0 and 1) and two remote clusters (2 and 3). A popular use of the Copy Consistency Points (CCPs) is to have data written by a host to cluster 0 replicated to cluster 2, and data written by a host to cluster 1 replicated to cluster 3. With this configuration there are two copies of data in the grid. All data is accessible by all clusters, either locally or via the grid. This means that all data is available when one of the clusters is not available.





Figure 10 - Two, Two Clusters in One!

With Device Allocation Assist, the host will allocate a virtual device for a private mount on the best cluster. The best cluster is typically the cluster that contains the logical volume in its cache.

2.4.6 TS7700 Hybrid Grid Considerations


Hybrid grids open up many possible configurations. There are six basic combinations (one 2-way, two 3-way, and three 4-way possibilities). The permutations for 5- and 6-cluster grids are not covered here. The same performance considerations apply to hybrid and homogeneous grids. An interesting hybrid grid, illustrated below, is one where two or three TS7720 clusters are attached to the production hosts, with a single, perhaps remote, TS7740 DR cluster. The TS7720 clusters do not replicate to each other, but all of the TS7720 clusters replicate to the single TS7740. This has the advantage of a large front-end cache presented by the TS7720 clusters, and provides a deep back end for archiving and disaster recovery. However, the replication traffic from all of the TS7720 clusters travels across the same grid network. It is essential that adequate network bandwidth be provided to handle the traffic to the TS7740. The network also needs enough bandwidth to retrieve logical volumes that reside only in the TS7740 cluster.



Figure 11 - Hybrid Grid - Three TS7720 Production Clusters, One TS7740 DR Cluster

2.4.7 Cluster Families and Cooperative Replication


Prior to release 1.6 and cluster families, only the Copy Consistency Points could be used to direct which clusters get a copy of data and when they get the copy. Decisions about where to source a volume from were left to each cluster in the grid, so two copies of the same data might be transmitted across the grid links for two distant clusters. With the introduction of cluster families in release 1.6, you can make the copying of data to other clusters more efficient by influencing where a copy of data is sourced. This becomes very important with 3- to 6-cluster grids where the clusters may be geographically separated. For example, when two clusters are at one site, the other two are at a remote site, and the two remote clusters need a copy of the data, cluster families ensure that only one copy of the data is sent across the long grid links. Also, when deciding where to source a volume, a cluster will give higher priority to a cluster in its family over a cluster in another family.

A cluster family establishes a special relationship between clusters. Typically, families are grouped by geographic proximity to optimize the use of grid bandwidth. Family members are given higher weight when deciding which cluster to prefer for TVC selection.

The example below illustrates how cooperative replication occurs with cluster families. Cooperative replication is used for deferred copies only. When a cluster needs to pull a copy of a volume, it will prefer a cluster within its family. The example uses CCPs of RRDD. With cooperative replication, one of the family B clusters at the DR site will pull a copy from one of the clusters in production family A. The second cluster in family B will wait for the other cluster in family B to finish getting its copy and will then pull it from its family member. This way the volume travels only once across the long grid distance.





Figure 12 - Cluster Families

Cooperative replication includes another layer of consistency. A family is considered consistent when just one member of the family has a copy of a volume. Since only one copy is required to be transferred to a family, the family is consistent after that one copy is complete. Since a family member will prefer to get its copy from another family member instead of getting the volume across the long grid link, the copy time is typically much shorter for the family member. And since each family member is pulling a copy of a different volume, all volumes reach a consistent copy within the family more quickly.

With cooperative replication, a family will prefer retrieving a new volume that the family doesn't yet have a copy of over copying a volume within the family. When there are fewer than 20 new copies to be made from other families, the family clusters will copy amongst themselves. This means second copies of volumes within a family are deferred in preference to new volume copies into the family. When a copy within a family has been queued for 12 hours or more, it is given equal priority with copies from other families. This prevents family copies from stagnating in the copy queue.

Without families, a source cluster attempts to keep the volume in its cache until all clusters needing a copy have gotten their copy. With families, a cluster's responsibility to keep the volume in cache is released once all families needing a copy have it. This allows PG0 volumes in the source cluster to be removed from cache sooner.

Refer to the IBM Virtualization Engine TS7700 Series Best Practices - Hybrid Grid white paper on Techdocs for more details concerning Cluster Families.
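A minimal sketch of the queue preferences just described, assuming hypothetical task records with a within-family flag and a queue timestamp (this is illustrative only, not the TS7700 scheduler, and it does not model the fewer-than-20-new-copies rule):

```python
from datetime import datetime, timedelta

def family_copy_sort_key(task, now):
    """Sort key for a cluster's deferred copy queue: lower sorts first.

    task: dict with 'within_family' (the source is in this cluster's family)
    and 'queued_at' (datetime the copy was queued).
    """
    aged_out = now - task["queued_at"] >= timedelta(hours=12)
    # Copies that bring a volume new to the family come first; within-family
    # copies are deferred unless they have waited 12 hours or more, at which
    # point they are given equal priority.
    rank = 0 if (not task["within_family"] or aged_out) else 1
    return (rank, task["queued_at"])

now = datetime.now()
queue = [
    {"vol": "A00001", "within_family": True,  "queued_at": now - timedelta(hours=1)},
    {"vol": "A00002", "within_family": False, "queued_at": now - timedelta(minutes=5)},
    {"vol": "A00003", "within_family": True,  "queued_at": now - timedelta(hours=13)},
]
for task in sorted(queue, key=lambda t: family_copy_sort_key(t, now)):
    print(task["vol"])   # A00003 (aged within-family), A00002 (new to family), A00001
```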




2.4.8 Retain Copy Mode
Retain Copy Mode is an optional setting where a volume's existing Copy Consistency Points are honored instead of applying the CCPs defined at the mounting cluster. This applies to private volume mounts for reads or write appends. It is used to prevent more copies of a volume in the grid than desired.

First, let's set the stage. The example shown below is a four-cluster grid where cluster 0 replicates to cluster 2, and cluster 1 replicates to cluster 3. The desired result is that only two copies of the data remain in the grid after the volume is accessed. Later, the host wants to mount the volume written to cluster 0. Device Allocation Assist is used to determine the best cluster to request the mount from: it asks the grid which cluster to allocate a virtual drive from, and the host then attempts to allocate a device from the best cluster, in this case cluster 0.


Figure 13 - Four Cluster Grid with Device Allocation Assist

As of this writing, JES3 does not support Device Allocation Assist, so 50% of the time the host will allocate to the cluster that doesn't have a copy in its cache. Without Retain Copy Mode, three or four copies of a volume will exist in the grid after the dismount instead of the desired two copies. In the case where host allocation picks the cluster that doesn't have the volume in cache, one or two additional copies are created on clusters 1 and 3, since the CCPs of the mounting cluster indicate copies should be made to clusters 1 and 3. For a read operation, four copies remain. For a write append, three copies are created. This is illustrated below.





Figure 14 - Four-Cluster Grid without Device Allocation Assist, Retain Copy Mode Disabled

With the Retain Copy Mode option set, the original CCPs of a volume are honored instead of applying the CCPs of the mounting cluster. A mount of a volume on the cluster that does not have a copy in its cache will result in a cross-cluster (remote) mount instead. The cross-cluster mount uses the cache of the cluster that contains the volume, and the CCPs of the original mount are used. In this case, the result is that clusters 0 and 2 will have the copies and clusters 1 and 3 will not. This is illustrated below (a small sketch of this choice follows Figure 15).


Figure 15 - Four-Cluster Grid without Device Allocation Assist, Retain Copy Mode Enabled
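The effect can be summarized with a small, hypothetical helper (illustrative only, not a TS7700 interface) that shows which CCP string governs a private mount:

```python
def effective_ccps(volume_ccps: str, mounting_cluster_ccps: str,
                   retain_copy_mode: bool) -> str:
    """Return the Copy Consistency Points applied when a private volume is mounted."""
    # With Retain Copy Mode, the volume's existing CCPs are honored;
    # without it, the mounting cluster's CCPs are applied.
    return volume_ccps if retain_copy_mode else mounting_cluster_ccps

# Volume originally written under "DNDN"; host allocation lands on cluster 1 ("NDND"):
print(effective_ccps("DNDN", "NDND", retain_copy_mode=False))  # NDND -> extra copies made
print(effective_ccps("DNDN", "NDND", retain_copy_mode=True))   # DNDN -> two copies retained
```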

Another example of the need for Retain Copy Mode is when one of the production clusters is not available. All allocations are made to the remaining production cluster. When the volume exists only in clusters 0 and 2, the mount to cluster 1 will result in three or four copies. This applies to both JES2 and JES3 without Retain Copy Mode enabled.


Figure 16 - Four-Cluster Grid - One Production Cluster Down, Retain Copy Mode Disabled

The example below shows Retain Copy Mode enabled with one of the production clusters down. In the scenario where the cluster containing the volume to be mounted is down, the host will allocate a device on the other cluster, in this case cluster 1. A cross-cluster mount using cluster 2's cache occurs, and the original two copies remain. If the volume is appended to, it is changed on cluster 2 only. Cluster 0 will get a copy of the altered volume when it rejoins the grid.





Figure 17 - Four-Cluster Grid - One Production Cluster Down, Retain Copy Mode Enabled

Refer to the IBM Virtualization Engine TS7700 Series Best Practices - Hybrid Grid white paper on Techdocs for more details concerning Retain Copy Mode.



3 Understanding Throttling in the TS7700

This section examines the variety of throttles used by the TS7700 to control the flow of data through the subsystem. The discussion describes the throttling types and how they are triggered. Throttling, in general, is used to encourage or enforce the priorities of the various tasks and functions running within the TS7700. The subsystem has a limited set of resources (CPU, cache bandwidth, cache size, channel bandwidth, grid network bandwidth, physical tape drives, and so forth) that are shared by all the tasks moving data. The TS7700 uses a variety of explicit throttling methods to give the higher-priority tasks more of the resources. The resources themselves will implicitly throttle items such as host bandwidth when a resource is used to 100%. The following is a list of the normally running tasks that move data:
- Immediate copies
- Recalls
- Copy Export
- Host I/O (this includes Sync Mode Copy writes)
- Reclaims
- Pre-migration
- Deferred copies

There are special-case tasks that can occur, based on the state of the subsystem, that will consume resources and be granted a higher priority. Here are some examples:
- Panic Reclaim - The TS7740 detects that the number of empty physical volumes has dropped below the minimum value, and reclaims need to be done immediately to increase the count.
- Cache Fills with Copy Data - To protect against having un-copied volumes removed from cache, the TS7740 throttles data coming into the cache.
- Cache Overfills - If no more data can be placed into the cache before data is removed, then other tasks trying to add to the cache are heavily throttled.

3.1 What are the Throttling Types?


3.1.1 Host Write Throttle
The host write throttle is applied to limit the amount of data written into cache from a host. It throttles incoming host writes from the channel as well as grid host I/O due to mounts from other clusters. If a cluster is applying host write throttle, a Sync Mode Copy write into the cluster from another cluster is also slowed down, because a sync mode copy is a host write even though it may come from another cluster. It is triggered by:
o Full Cache (the cache is full of data that still needs to be copied)
o Immediate Copy (a large amount of immediate copy data needs to be moved; outgoing immediate copies are predicted to take longer than the MIH time)
o Pre-Migrate (a large amount of un-pre-migrated data is in cache)
o Free Space (the cache is nearly full of any data)




Figure 18 - Host Write Throttle




3.1.2 Copy Throttle
The copy throttle is applied to limit the amount of data written into cache from other clusters' copy data. It throttles incoming copies, both immediate and deferred, from other clusters. It is triggered by:
o Full Cache
o Pre-Migrate

Figure 19 - Copy Throttle




3.1.3 Deferred Copy Throttle
The deferred copy throttle slows outgoing deferred copies to other clusters. This throttle is applied to maximize the host I/O rate at the expense of outgoing deferred copies. It is also known as the Deferred Copy Read Throttle, because other clusters are reading the data from this cluster.

Figure 20 - Deferred Copy Throttle (Deferred Copy Read Throttle)


3.2 What Causes Host Write and Copy Throttle to be turned on?
- Full Cache - The cache is full of data that needs to be copied to another cluster.
  o The amount of data to be copied to another cluster is > 95% of the cache size AND the TS7700 has been up more than 24 hours.
  o This is reported as Write Throttle and Copy Throttle in VEHSTATS.
- Immediate Copy - Immediate copies to other clusters, where this cluster is the source, are taking too long or are predicted to take too long (a worked sketch of the ramps follows this list).
  o The TS7700 evaluates the need for this throttle every two minutes.
  o The depth of the immediate copy queue is examined, as well as the amount of time copies have been in the queue, to determine if the throttle should be applied.
    - Looking at the age of the oldest immediate copy in the queue: if the oldest is 10-30 minutes old, the throttle is set between 0.00166 seconds and 2 seconds, on a linear ramp from 10 to 30 minutes. The maximum throttle (2 seconds) is applied immediately if an immediate copy has been in the queue for 30 minutes or longer.
    - Looking at the quantity of data, calculate how long the transfer will take: if more than 35 minutes, set the throttle to the maximum (2 seconds); if 5 to 35 minutes, the throttle is set between 0.001111 seconds and 2 seconds, on a linear ramp from 5 to 35 minutes.
  o This is reported as Write Throttle in VEHSTATS.
  o Note: The time required for a 4000 MB immediate copy is 5 times longer than for an 800 MB immediate copy.
  o Host Write Throttle due to immediate copies taking too long can be turned off using the Host Console Request. Refer to Section 5.3.3 - Disabling Host Write Throttle due to Immediate Copy for more details.
- Pre-Migrate - The amount of data to be pre-migrated is above the threshold (default 2000 GB).
  o This is reported as Write Throttle and Copy Throttle in VEHSTATS.
  o These throttle values will be equal if Pre-Migrate is the sole reason for throttling.
- Free Space - Invoked when the cache is nearly full of any data.
  o Used to make sure there is enough cache to handle the currently mounted volumes.
  o This is reported as Write Throttle in VEHSTATS.
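The two immediate-copy ramps can be sketched as follows. This is a hedged illustration of the numbers quoted above, not the product algorithm; how the two criteria are combined is not stated here, so taking the larger delay is an assumption.

```python
def immediate_copy_write_throttle(oldest_copy_age_min: float,
                                  predicted_transfer_min: float) -> float:
    """Approximate host write throttle delay (seconds) due to immediate copies."""
    def linear_ramp(value, low, high, min_delay, max_delay=2.0):
        if value < low:
            return 0.0
        if value >= high:
            return max_delay
        return min_delay + (value - low) / (high - low) * (max_delay - min_delay)

    age_delay = linear_ramp(oldest_copy_age_min, 10, 30, 0.00166)      # queue-age ramp
    size_delay = linear_ramp(predicted_transfer_min, 5, 35, 0.001111)  # transfer-time ramp
    return max(age_delay, size_delay)   # assumption: the stronger throttle wins

# Oldest queued immediate copy is 20 minutes old; predicted transfer time is small:
print(round(immediate_copy_write_throttle(20, 2), 3))   # ~1.001 seconds
```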


3.3 What Causes Deferred Copy Throttle to be turned on?
CPU usage and the compressed host throughput are evaluated every 30 seconds. DCT is invoked when CPU usage is > 85% OR compressed host throughput is > 100 MB/sec. The 100 MB/sec threshold is the default and can be changed by the customer via the Host Console Request. DCT remains in effect for the subsequent 30-second interval, after which it is reevaluated. The default DCT value is 125 ms; this severely slows deferred copy activity (125 ms is added between each 32K block of data sent for a volume). Both the DCT value and the DCT threshold can be set using the Host Console Request: use the SETTING, THROTTLE, DCOPYT keywords for the value and the SETTING, THROTTLE, DCTAVGTD keywords for the threshold. These settings are discussed in detail in the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User's Guide, which is available on Techdocs.
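A minimal sketch of this evaluation cycle, using the defaults quoted above (illustrative only; DCOPYT and DCTAVGTD in the parameter names refer to the Host Console Request settings this section describes):

```python
def deferred_copy_throttle_ms(cpu_usage_pct: float, compressed_host_mb_s: float,
                              dcopyt_ms: float = 125.0,
                              dctavgtd_mb_s: float = 100.0) -> float:
    """Delay inserted between 32K blocks of outgoing deferred copies for the
    next 30-second interval (0.0 means no deferred copy throttling)."""
    if cpu_usage_pct > 85.0 or compressed_host_mb_s > dctavgtd_mb_s:
        return dcopyt_ms
    return 0.0

# With the 125 ms default, the added delay alone limits each deferred-copy
# stream to roughly 32 KB per 125 ms, i.e. about 0.25 MB/s, ignoring the
# actual transfer time of each block.
print(deferred_copy_throttle_ms(cpu_usage_pct=90, compressed_host_mb_s=80))   # 125.0
print(deferred_copy_throttle_ms(cpu_usage_pct=50, compressed_host_mb_s=60))   # 0.0
```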


3.4 How are Pre-Migrate Tasks Managed?

Figure 21 - Managing Pre-Migrate Tasks

The TS7740 uses a variety of criteria to manage the number of pre-migration tasks. It looks at these criteria every 5 seconds to determine if one more pre-migration task should be added. Adding a pre-migration task is based on these and other factors:
- Host compressed write rate
- CPU activity
- How much data needs to be pre-migrated per pool
- How much data needs to be pre-migrated in total

A pre-migration task will not preempt a recall, reclaim, or copy export task. There are four different algorithms working in concert to determine if another pre-migration task should be started. General details are described below (a simplified sketch follows this list); the actual algorithm has several nuances not described here.
- Idle Pre-Migration
  o If the CPU is idle more than 5%, then a pre-migrate task is started, if appropriate.
  o The number of tasks is limited to six or the maximum pre-migration drives defined by pool properties, whichever is less.
- Fast Host Write Pre-Migration Mode
  o Compressed host write rate is > 30 MB/sec AND CPU idle is < 1%.
  o Pre-migration tasks are limited to two (for all pools), lowered to one or zero if the mode continues.
- Pre-Migration Ramp Up
  o Compressed host write rate is < 30 MB/s AND CPU idle is < 5%.
  o Indicates the TS7740 is somewhat busy.
  o Limited to the available back-end drives minus one; also limited by the maximum number of pre-migrate drives setting in pool properties.
- Preferred Pre-Migration
  o The amount of un-pre-migrated data exceeds the Preferred Pre-Migration threshold. The default threshold is 1600 GB.
  o Limited to the available back-end drives minus one; also limited by the maximum number of pre-migrate drives setting in pool properties.
  o Preferred Pre-Migration takes precedence over the other three algorithms.
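The sketch below condenses the four modes into one function. It is only a rough illustration of the rules listed above, assuming the stated defaults; the real algorithm has additional nuances (for example, the gradual lowering of tasks in Fast Host Write mode is not modeled).

```python
def premigration_task_limit(cpu_idle_pct: float, compressed_write_mb_s: float,
                            unpremigrated_gb: float, available_backend_drives: int,
                            pool_max_premig_drives: int,
                            preferred_premig_threshold_gb: float = 1600.0) -> int:
    """Approximate cap on concurrent pre-migration tasks for the next evaluation."""
    drive_cap = min(available_backend_drives - 1, pool_max_premig_drives)
    if unpremigrated_gb > preferred_premig_threshold_gb:
        return drive_cap                          # Preferred Pre-Migration wins
    if compressed_write_mb_s > 30 and cpu_idle_pct < 1:
        return 2                                  # Fast Host Write mode
    if compressed_write_mb_s < 30 and cpu_idle_pct < 5:
        return drive_cap                          # Ramp Up: somewhat busy
    if cpu_idle_pct > 5:
        return min(6, pool_max_premig_drives)     # Idle Pre-Migration
    return 0                                      # other combinations: no new tasks here

print(premigration_task_limit(cpu_idle_pct=10, compressed_write_mb_s=5,
                              unpremigrated_gb=200, available_backend_drives=8,
                              pool_max_premig_drives=6))   # 6
```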

3.5 Immediate Copy set to Immediate Deferred


The goal of an immediate copy is to complete one or more RUN consistency point copies of a logical volume prior to surfacing status of the RUN command to the mounting host. If one or more of these copies cannot complete, the replication state of the targeted volume enters the immediate-deferred state. The volume remains in the immediate-deferred state until all of the requested RUN consistency points contain a valid copy. An immediate-deferred volume replicates with a priority greater than standard deferred copies, but lower than non-deferred immediate copies.

There are numerous reasons why a volume may enter the immediate-deferred state. For example, the copy may not complete within 40 minutes, or one or more clusters targeted to receive an immediate copy may not be available. Independent of why, the host application or job associated with the volume is not aware that its previously written data has entered the immediate-deferred state.

With Release 1.6 the reason why a volume moves to the immediate-deferred state is contained in the ERA 35 sense data. The codes are broken down into Unexpected and Expected reasons. New failure content is introduced into the CCW(RUN) ERA35 sense data:
- Byte 14 FSM Error - If set to 0x1C (Immediate Copy Failure), the additional new fields are populated.
- Byte 18 Bits 0:3 Copies Expected - How many RUN copies were expected for this volume.
- Byte 18 Bits 4:7 Copies Completed - How many RUN copies were actually verified as successful before surfacing SNS.
- Byte 19 Immediate Copy Reason Code
  o Unexpected (0x00 to 0x7F): reasons which are based on unexpected failures.
    0x01 A valid source to copy was unavailable.
    0x02 A cluster targeted for a RUN copy is not available (unexpected outage).
    0x03 40 minutes have passed and one or more copies have timed out.
    0x04 Downgraded to immediate-deferred due to the health/state of RUN target clusters.
    0x05 Unknown reason.
  o Expected (0x80 to 0xFF): reasons which are based on configuration or due to planned outages.
    0x80 One or more RUN target clusters are out of physical back-end scratch.
    0x81 One or more RUN target TS7720 clusters are low on available cache (95%+ full).
    0x82 One or more RUN target clusters are in service-prep or service.
    0x83 One or more clusters have copies explicitly disabled via the Library Request operation.
    0x84 The volume cannot be reconciled and is currently Hot against peer clusters.

The additional data contained within the CCW(RUN) ERA35 sense data can be used within a z/OS custom user exit in order to act on a job moving to the immediate-deferred state; a minimal decoding sketch follows below. Since the requesting application that results in the mount has already received successful status prior to the issuing of the CCW(RUN), it cannot act on the failed status. However, future jobs can be suspended or other custom operator actions can be taken using the information provided within the sense data.
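The minimal sketch below, in Python purely for illustration, decodes the fields described above from a raw ERA35 sense buffer. The byte offsets, the 0x1C FSM error code, and the reason-code text come from this section; the function name, input format, and dictionary structure are assumptions, and a real z/OS user exit would implement the equivalent logic in its own language.

UNEXPECTED_REASONS = {
    0x01: "A valid source to copy was unavailable",
    0x02: "Cluster targeted for a RUN copy is not available (unexpected outage)",
    0x03: "40 minutes passed and one or more copies timed out",
    0x04: "Downgraded due to health/state of RUN target clusters",
    0x05: "Unknown reason",
}
EXPECTED_REASONS = {
    0x80: "One or more RUN target clusters are out of physical back-end scratch",
    0x81: "One or more RUN target TS7720 clusters are low on available cache (95%+ full)",
    0x82: "One or more RUN target clusters are in service-prep or service",
    0x83: "Copies explicitly disabled via the Library Request operation",
    0x84: "Volume cannot be reconciled and is currently Hot against peer clusters",
}

def decode_immediate_deferred(sense):
    """Decode the immediate-copy fields of ERA35 sense data (a bytes object)."""
    if sense[14] != 0x1C:                      # byte 14 FSM error: 0x1C = Immediate Copy Failure
        return None                            # the fields below are only populated in that case
    reason = sense[19]
    return {
        "copies_expected": sense[18] >> 4,     # byte 18 bits 0:3 (high nibble, IBM bit numbering)
        "copies_completed": sense[18] & 0x0F,  # byte 18 bits 4:7 (low nibble)
        "reason_code": reason,
        "reason_class": "unexpected" if reason < 0x80 else "expected",
        "reason_text": UNEXPECTED_REASONS.get(reason)
                       or EXPECTED_REASONS.get(reason, "undocumented code"),
    }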

3.6 Synchronous Mode Copy set to Synchronous Deferred.


The default behavior of Synchronous Mode Copy (SMC) is to fail a write operation if both clusters with the "S" copy policy are not available or become unavailable during write operations. You can enable the Synchronous Deferred on Write Failure (SDWF) option to permit update operations to continue to any valid consistency point in the grid. If there is a write failure, the failed "S" locations are set to a state of "synchronous-deferred". After the volume is closed, any synchronous-deferred locations are updated to an equivalent consistency point through asynchronous replication. If the SDWF option is not checked (the default) and a write failure occurs at either of the "S" locations, then host operations fail and only content up to the last successful sync point should be viewed as valid.

For example, imagine a three-cluster grid with a copy policy of SSD: a synchronous copy to clusters 0 and 1 and a deferred copy to cluster 2, with the host connected to clusters 0 and 1. With the SDWF option disabled, both clusters 0 and 1 must be available for write operations; if either one becomes unavailable, write operations fail. With the option enabled, write operations continue if either cluster 0 or 1 becomes unavailable, and the second "S" copy becomes a synchronous-deferred copy. In the same example, if the host were attached only to cluster 2 and the option were enabled, write operations would continue even if both clusters 0 and 1 became unavailable; both "S" copies would become synchronous-deferred copies.

A synchronous-deferred volume replicates with a priority greater than immediate-deferred and standard deferred copies. Refer to the IBM Virtualization Engine TS7700 Series Best Practices - Synchronous Mode Copy white paper on Techdocs for detailed information concerning Synchronous Mode Copy.


4 What Should I Monitor?


The following charts have been collected from many TS7700 users who are the most successful at monitoring their tape virtualization subsystems. Each section shows the suggested reporting for 4 different tiers of users.

4.1 Stakeholders
The Stakeholders section provides information that helps to justify the cost of the subsystem(s). These reports can help to eliminate surprises for stakeholders who may believe that mainframe tape utilization is stagnant or insignificant. This tier of reporting should not be left out. The reporting should be provided at regular intervals and in a consistent format.
Information | Source Report | Reporting Interval | Tracking | Observation
Virtual Mounts per Week | VEHSTATS | Monthly | Monthly Trending | Bang for the buck - General Awareness
Megabytes Transferred per Week | VEHSTATS | Monthly | Monthly Trending | Bang for the buck - General Awareness
Virtual Volumes Managed | VEHSTATS | Monthly | Monthly Trending | Bang for the buck - General Awareness
Megabytes Stored | VEHSTATS | Monthly | Monthly Trending | Bang for the buck - General Awareness
Ratio of Virtual to Physical Volumes | VEHSTATS - must be calculated | Monthly | Monthly Trending | Bang for the buck - General Awareness
Tape Volume Cache (TVC) Utilization | VEHSTATS or Management Interface | Monthly | Monthly Trending | If all my data must fit into the TS7720 TVC, am I running out of room?


4.2 Storage Management
The Storage Management section is designed to provide the storage management teams with key data for the purposes of capacity planning and subsystem health. Familiarity with these data can prevent emergency capacity issues, and also serve to familiarize the team with growth trends. Many of these fields can be used to indicate compliance with Service Level Agreements. Many can be useful in tuning the subsystem.
Information | Source Report | Reporting Interval | Tracking | Observation
Virtual Mounts per Day | VEHSTATS | Daily | Rolling Weekly Trending | Increase over time indicates increased workload
Megabytes Transferred per Day | VEHSTATS | Daily | Rolling Weekly Trending | Increase over time indicates increased workload
Virtual Volumes Managed | VEHSTATS | Daily | Rolling Weekly Trending | Capacity Planning - General Awareness
Megabytes Stored | VEHSTATS | Daily | Rolling Weekly Trending | General Awareness
Daily Throttle Indicator | VEHSTATS | Daily | Rolling Weekly Trending | Increase over time indicates increased workload
Cache Hit Percentage | VEHSTATS | Daily | Rolling Weekly Trending | Gauges Performance
Physical Scratch Count | VEHSTATS | Weekly | Rolling Weekly Trending | Capacity Planning - General Awareness
Available Slot Count | D SMS,LIB(ALL),DETAIL | Weekly | Rolling Weekly Trending | Capacity Planning - General Awareness
Tape Volume Cache (TVC) Utilization | VEHSTATS or Management Interface | Weekly | Rolling Weekly Trending | If all my data must fit into the TS7720 TVC, am I running out of room?


4.3 Storage Administration
The Storage Administration section is designed to provide the administration team with key data for the purposes of capacity planning and subsystem health. Familiarity with these data can prevent emergency capacity issues, and also serve to familiarize the team with growth trends. Many of these fields can be used to indicate compliance with Service Level Agreements. Many can be useful in tuning the subsystem.
Information | Source Report | Reporting Interval | Tracking | Observation
Virtual Mounts per Day | VEHSTATS | Daily | Rolling Weekly Trending | Increase over time indicates increased workload
Megabytes Transferred per Day | VEHSTATS | Daily | Rolling Weekly Trending | Increase over time indicates increased workload
Virtual Volumes Managed | VEHSTATS | Daily | Rolling Weekly Trending | Capacity Planning - Maximum 1 million per Grid
Megabytes Stored | VEHSTATS | Daily | Rolling Weekly Trending | Capacity Planning - General Awareness
Backend Drive Utilization | VEHSTATS | Daily | Rolling Weekly Trending | Check for periods of 100%
Daily Throttle Indicators | VEHSTATS | Daily | Rolling Weekly Trending | Key Performance Indicator
Average Virtual Mount Time | VEHSTATS | Daily | Rolling Weekly Trending | Key Performance Indicator
Cache Hit Percentage | VEHSTATS | Daily | Rolling Weekly Trending | Key Performance Indicator
Physical Scratch Count | VEHSTATS | Daily | Rolling Weekly Trending | Capacity Planning - General Awareness
Available Slot Count | D SMS,LIB(ALL),DETAIL | Daily | Rolling Weekly Trending | Capacity Planning - General Awareness
Available Virtual Scratch | D SMS,LIB(ALL),DETAIL | Daily | Rolling Weekly Trending | Drive Insert
Data Distribution | BVIRPOOL Job | Weekly | Watch for Healthy Distribution | Use for Reclaim Tuning
Times In Cache | VEHSTATS | Weekly | Rolling Weekly Trending | Preference Group Tuning Indicator
Tape Volume Cache (TVC) Utilization | VEHSTATS or Management Interface | Weekly | Rolling Weekly Trending | If all my data must fit into the TS7720 TVC, am I running out of room?


4.4 Operations
This section is for operations. By checking and signing off on these conditions and values each shift, the operators become familiar with extracting data from the subsystem, and also become familiar with what is normal, making it vastly easier for them to recognize abnormal conditions. The temptation to automate this is strong, but there is a downside to 100% automation. The operators who provided this list feel more involved with the subsystem and much more aware of the conditions inside it. Familiarity with OAM, MVS and Library Request commands comes from daily usage.
Information | Source Report | Reporting Interval | Tracking | Observation
All Virtual Drives Online | LI DD,libname | Each Shift | Display each composite library and each system | Report or act on any missing drives
VE Health Check | Management Interface | Each Shift | Display each composite library | Report any offline or degraded status
Library Online/Operational | D SMS,LIB(ALL),DETAIL | Each Shift | Display each composite library and each system | Verifies availability to systems
Exits Enabled | D SMS,OAM | Each Shift | Display for each system | Report any disabled exits
Virtual Scratch Volumes | D SMS,LIB(ALL),DETAIL | Each Shift | Display each composite library | Report each shift
Physical Scratch Tapes | D SMS,LIB(ALL),DETAIL | Each Shift | Display each composite library | Report each shift
Interventions | D SMS,LIB(ALL),DETAIL | Each Shift | Display each composite library | Report or act on any interventions
Check Grid Link Status | LI REQ,libname,STATUS,GRIDLINK | Each Shift | Display each composite library | Report any errors or elevated Retransmit %
Check the Number of Volumes on the Deferred Copy Queue | MI ==> Logical Volumes ==> Incoming Copy Queue | Each Shift | Display for each cluster in the grid | Report and watch for gradual or sudden increases
Copy Queue Depths | Management Interface | Each Shift | Display for each system | Report if queue depth is higher than usual
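As an illustration only, a shift check built from the host commands listed in the table above could be issued as follows (libname is a placeholder for your library name; the Management Interface items are checked from the browser rather than the console):

D SMS,OAM
D SMS,LIB(ALL),DETAIL
LI DD,libname
LI REQ,libname,STATUS,GRIDLINK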


4.5 Plotting Cache Throughput from VEHSTATS
When evaluating performance, a graph that reveals a lot of information in a small space is the cache throughput chart for a cluster. There are performance tools available on Techdocs that will take 24 hours of 15-minute VEHSTATS data, seven days of 1-hour VEHSTATS data, or 90 days of daily summary data and create a set of charts for you. Refer to the following Techdocs pages for the performance tools and for a class replay with detailed information on how to use them.
Tools - http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/PRS4717
Class replay - http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/PRS4872

The 24-hour, 15-minute data spreadsheets include the cache throughput chart. The cache throughput chart has two major components: the uncompressed host IO line and a stacked bar chart showing the cache throughput. The stacked bars include the following components; all values are in MiB/s.
- Compressed host write (hunter green)
- Compressed host read (lime green)
- Compressed grid copy out to other clusters (cyan)
- Compressed grid copy in from other clusters (light blue)
- Compressed pre-migration to tape (yellow)
- Compressed read from physical tape (dark blue)
- Compressed remote reads from this cluster (orange)
- Compressed remote writes to this cluster (burnt orange)

The figure below illustrates the TVC Throughput chart for a TS7740.
[Figure 22 - TVC Throughput - TS7740: 24-hour cache throughput chart for Cluster 3, stacking the compressed component rates listed above (MiB/s) with the uncompressed Host IO line overlaid; the time axis runs from 12:00 through 11:00 the next day.]


The figure below illustrates the TVC Throughput chart for a TS7720. Notice that the physical tape related values, pre-migration and read from physical tape, are missing.
[Figure 23 - TVC Throughput - TS7720: the corresponding 24-hour cache throughput chart for Cluster 0, without the pre-migration and recall components.]

4.5.1 Interpreting Cache Throughput


The TS7700 cache has a finite bandwidth. Refer to the TS7700 Performance White Paper for the expected cache throughput rates of the various cache models (CC6, CC7, etc.). The TVC bandwidth is shared between the host IO (compressed), copy activity, pre-migration activity, and remote write/read activity. The TS7700 balances these tasks using various thresholds and controls in an effort to prefer host IO. The two major housekeeping tasks at work are the pre-migration of data from cache to tape and deferred copies to other clusters; the TS7740 will delay these housekeeping tasks in order to favor host IO. The following sections describe these two housekeeping tasks.

4.5.1.1 Fast Host Write Pre-Migration Algorithm


The control of pre-migration tasks is discussed in an earlier section. The Fast Host Write algorithm limits the number of pre-migration tasks to two, one, or zero. This occurs when the compressed host write rate is greater than 30 MB/sec and the CPU is >99% busy. The graph extract below illustrates this algorithm in effect. During the 17:00 to 18:30 interval, the amount of pre-migration activity is zero. After this period of intense host activity and CPU usage, the pre-migrate tasks are allowed to start back up.

[Figure 24 - Fast Host Write Pre-Migration: cache throughput chart extract in which pre-migration activity is zero during the 17:00 to 18:30 interval of intense host activity and CPU usage, and ramps back up afterwards.]

4.5.1.2 Application of Deferred Copy Throttle


The following two figures illustrate the use of Deferred Copy Throttle (DCT). In the first figure, notice the amount of data being copied out is small. This is due to the DCT being applied since the compressed host IO is above the DCT threshold which is set to the default of 100MB/s. The second figure shows the compressed host IO dropping below the 100MB/s threshold and, as a result, the rate of deferred copies to other clusters is increased substantially.


Figure 25 - DCT Being Applied

Figure 26 - DCT is Turned Off


5 Tuning the TS7700

This section discusses the various adjustments available to the customer to tune their TS7700.

5.1 Power5 (V06/VEA) versus Power7 (V07/VEB) Performance Considerations


The 3957-V06 and VEA use a Power5 engine. The 3957-V07 and VEB use a more powerful Power7 engine. The difference in power means the performance tuning considerations vary for each one. With the Power5, the CPU is typically in the 95%-100% usage range during periods of high host IO. The Power7 engine typically doesn't saturate the CPU.

The CC8/CX8 and CS8/XS7 Tape Volume Cache (TVC) models have a higher throughput than the CC7/CX7, CC6/CX6, and CS7/XS7 TVCs. This means the older models of TVC become the bottleneck sooner than the latest models. The older models have a throughput of between 300 MB/s and 400 MB/s, whereas the new TVC models have up to 600 MB/s throughput, so it takes more workload for the new TVC models to become a bottleneck. The CC8/CX8 and CS8/XS7 TVCs are only supported by the Power7 engines.

With the combination of the powerful Power7 engine and the high-throughput TVC, tuning recommendations can be different when compared to the Power5 and older TVC models. The Power7 can also be paired with the older CC7/CX7 and CS7/XS7 TVCs. There is a benefit from the more powerful CPU, but the TVC becomes a bottleneck at a lower workload as compared to the Power7 with the latest TVC models.

The following sections discuss various tuning items and, where appropriate, describe how to tune based on the type of engine and TVC model. The recommendations for the Power5 based and Power7 with the older TVC models will be distinguished from recommendations for the Power7 with new TVC models.

5.2 Deferred Copy Throttle (DCT) Value and Threshold


The DCT is used to regulate outgoing deferred copies to other clusters in order to prefer host throughput. For some customers host throughput is more important than the deferred copies, but for others, deferred copies are just as important. Adjusting the DCT value and threshold allows you to tune the performance of the deferred copies.

The 3957-V06 and VEA use a Power5 engine; the 3957-V07 and VEB use a more powerful Power7 engine. The Power5 engine, and the Power7 engine with the older TVC models, need to apply DCT more often than the Power7 engine with the latest TVC in order to balance host IO and outgoing deferred copies. For the Power7 based engines, the need to apply the DCT should be based on the TVC throughput. With the CC8/CS8 based TVC, cache throughputs of up to 600 MB/s can be achieved, and the Power7 engine has ample CPU power such that the TVC throughput will typically be a bottleneck before the CPU. Several customers with the Power7 engine and the CC8/CS8 TVC have been able to turn off DCT and still maintain the same host IO rate.



5.2.1 Deferred Copy Throttle (DCT) Value
When the DCT threshold is reached, the TS7700 adds a delay to each block of deferred copy data sent across the grid links from a cluster. The larger the delay, the slower the overall copy rate becomes.

The performance of the grid links is also affected by the latency of the connection, which has a large influence on the maximum grid throughput. For example, with a one-way latency of 20-25 ms on a 2x1 Gb grid link with 20 copy tasks on the receiving cluster, the maximum grid bandwidth will be approximately 140 MB/s. Increasing the number of copy tasks on the receiving cluster should increase the grid bandwidth closer to 200 MB/s.

The default DCT is 125 ms. The effect on host throughput as the DCT is lowered is not linear. Field experience shows the knee of the curve is at approximately 30 ms. As the DCT value is lowered towards 30 ms, the host throughput is affected somewhat and deferred copy performance improves somewhat. At and below 30 ms, the host throughput is affected more significantly, as is deferred copy performance.

Below is a chart showing the deferred copy rate with varying amounts of DCT. This is actual data from a two-cluster grid with 20-25 ms of latency between clusters, 2x1 Gb grid links, and 20 copy tasks. At 125 ms the copy rate is 5 MB/s. At 40 ms the copy rate increases to 15 MB/s. By lowering the DCT to 15 ms the copy rate increases to 40 MB/s. From this point, small decreases in the DCT produce dramatic changes in the copy rate. The host IO rate can also be impacted, especially with the Power5 engine.

Figure 27 - Effect of DCT on Grid Copy Rate

If the DCT needs to be adjusted from the default value, the initial recommended DCT value is between 30 ms and 40 ms. Favor a value closer to 30 ms if the customer is more concerned with deferred copy performance, or closer to 40 ms if the customer is concerned about sacrificing host throughput. After adjusting the DCT, monitor the host throughput and the deferred copy queue to see whether the desired balance of host throughput and deferred copy performance has been achieved. Lowering the DCT will improve deferred copy performance at the expense of host throughput.

Below is a chart showing the deferred copy queue depth with DCT set to 125 ms and then at 40 ms. The effect on the copy queue can be clearly seen: the pair of humps on the left is with DCT set to 125 ms, and the right pair of humps is with DCT set to 40 ms.


Figure 28 - Effect of DCT on Deferred Copy Queue

5.3 Preferred Pre-migration and Pre-migration Throttling Thresholds


These two thresholds are triggered by the amount of un-pre-migrated data in the cache. The thresholds default to 1600 GB and 2000 GB, respectively, and can be modified by the customer using the Host Console Request.

The amount of un-pre-migrated data is available with the Host Console Request CACHE request. In the example below there is 750 GB of data yet to be pre-migrated (the PMIGR column):

LI REQ,lib_name,CACHE
TAPE VOLUME CACHE STATE V1
INSTALLED/ENABLED GBS 6000/ 3000
PARTITION ALLOC  USED   PG0   PG1 PMIGR  COPY  PMT CPYT
        0  2000  1880     0  1880   750     0   14   14

The TS7700 historical statistics, which are available via BVIR and VEHSTATS, show the amount of un-pre-migrated data at the end of each reporting interval. This value is also available on the TS7700 Management Interface as a point-in-time statistic.

Two host warning messages, low and high, can be configured for the TS7700 using the Host Console Request function. The setting of these warning limits is discussed in detail in the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User's Guide which is available on Techdocs. Use the SETTING, ALERT, RESDHIGH and SETTING, ALERT, RESDLOW keywords.
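For example, assuming the limits are specified in GB as the final keyword (an illustrative sketch; verify the syntax against the User's Guide for your code level), the following commands would request a low warning at 1000 GB and a high warning at 1500 GB of un-pre-migrated data:

LI REQ,libname,SETTING,ALERT,RESDLOW,1000
LI REQ,libname,SETTING,ALERT,RESDHIGH,1500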

5.3.1 Preferred Pre-migration Threshold


When this threshold is crossed and the number of pre-migration processes increases, the host throughput will tend to decrease from the peak I/O rate. Lowering this value shortens the peak throughput period, but it also lengthens the time before pre-migration throttling may occur, because data is pre-migrated sooner; the intent is to avoid ever reaching the Pre-migration Throttling Threshold. Refer to Section 3.4 for details on how the pre-migration tasks are added.

You may want to lower this threshold to provide a larger gap between it and the Pre-migration Throttling Threshold. Do this if you want the gap to be larger but don't want to raise the Pre-migration Throttling Threshold.



This threshold can be raised, along with the Pre-migration Throttling Threshold, to defer pre-migration until after a peak period. This can improve the host IO rate because the pre-migration tasks are not ramped up as soon as they would be with a lower threshold. This trades an increased amount of un-pre-migrated data in cache for a higher host IO rate during heavy production periods. The Preferred Pre-migration Threshold is set using the Host Console Request function. The setting of this threshold is discussed in detail in the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User's Guide which is available on Techdocs. Use the SETTING, CACHE, PMPRIOR keywords.
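As a sketch, lowering the Preferred Pre-migration Threshold from the 1600 GB default to 1200 GB might be requested as follows. The value is illustrative only and the exact syntax should be confirmed in the User's Guide:

LI REQ,libname,SETTING,CACHE,PMPRIOR,1200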

5.3.2 Pre-migration Throttling Threshold


When this threshold is crossed, the Host Write Throttle and Copy Throttle are both invoked. The purpose is to slow incoming data so that the amount of un-pre-migrated data can be reduced and does not rise further above this threshold. Refer to Section 3.2 for details on the Host Write and Copy Throttles.

You may want to adjust this threshold if there are periods where the amount of data entering the subsystem increases and the existing threshold is being crossed for a short time. Raising the threshold will avoid the application of the throttles and keep host and copy throughput higher. However, the exposure is that there will be more un-pre-migrated data in cache, and that extra data will take longer to be pre-migrated. The customer needs to determine the balance. The Pre-migration Throttling Threshold is set using the Host Console Request function. The setting of this threshold is discussed in detail in the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User's Guide which is available on Techdocs. Use the SETTING, CACHE, PMTHLVL keywords.
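For example, raising the Pre-migration Throttling Threshold from the 2000 GB default to 2400 GB might be requested as shown below. Keep the value above the PMPRIOR setting; the command is illustrative until verified against the User's Guide:

LI REQ,libname,SETTING,CACHE,PMTHLVL,2400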

5.3.3 Disabling Host Write Throttle due to Immediate Copy


As discussed in Section 3.2 - What Causes Host Write and Copy Throttle to be turned on? - Host Write Throttle can be turned on due to immediate copies taking too long to copy to other clusters in the grid. Host Write Throttling is applied for various reasons, including when the oldest copy in the queue is 10 or more minutes old. The TS7700 will change an immediate copy to immediate-deferred if the immediate copy has not started after 40 minutes in the immediate copy queue; this avoids triggering the 45-minute MIH in the host. When a copy is changed to immediate-deferred, the Rewind/Unload is completed and the immediate copy becomes a high priority deferred copy. Refer to Section 3.5 - Immediate Copy set to Immediate Deferred - for more information.

You can decide to turn off Host Write Throttling due to immediate copies taking too long if it is acceptable for the immediate copies to take longer. However, you will want to avoid the 40-minute limit where the immediate copies are changed to immediate-deferred. In grids where a large portion of the copies are immediate, better overall performance has been seen when the Host Write Throttle due to immediate copies is turned off. You are trading a longer time to complete an immediate copy for more host IO.



The enabling and disabling of the host write throttle due to immediate copies is discussed in detail in the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User's Guide which is available on techdocs. Use the SETTING, THROTTLE, ICOPYT keywords.
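As an illustration, and assuming ICOPYT accepts ENABLE and DISABLE as its value keyword (verify this in the User's Guide), the throttle could be turned off and later restored with:

LI REQ,libname,SETTING,THROTTLE,ICOPYT,DISABLE
LI REQ,libname,SETTING,THROTTLE,ICOPYT,ENABLE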

5.4 Making Your Cache Deeper


A deeper cache will improve the likelihood of a volume being in cache for a recall. A cache hit for a recall improves performance when compared to a cache miss, which requires a recall from physical tape or access across the grid from another cluster. The TS7700 statistics provide a cache hit ratio for read mounts that can be monitored to make sure the cache-hit rate is not too low; generally you want to keep the cache-hit ratio above 80%. Your cache can be made deeper in several ways:
- Add more cache.
- Utilize the Storage Class constructs.
  o For TS7740s, set the construct to utilize Preference Group 0 (PG0) for data unlikely to be recalled. PG0 volumes are removed from cache soon after they are pre-migrated to physical tape; they are actively removed and do not wait for the cache to fill before being removed. This leaves more room for the PG1 volumes, which remain in cache as long as possible and are therefore available for recalls. Many customers have effectively made their cache deeper by examining their jobs, identifying which ones are most likely not to be recalled, and using Storage Class to assign those jobs to PG0.
  o For TS7720s, set the Storage Class construct to use Prefer Remove for volumes you don't expect to mount, Pinned for those you know you will be mounting, and Prefer Keep for the others. Prefer Keep is the default Storage Class action.

5.5 Back-End Drives


Ensuring there are enough back-end drives is very important. If there are insufficient back-end drives, the performance of the TS7740 will suffer. Below are general guidelines for the number of back-end drives based on the host throughput configured (the performance increments installed) for the TS7740. The lower number of drives in each range is for scenarios where there are few recalls, whereas the upper number is for scenarios where there are numerous recalls. Remember, these are guidelines, not rules.

Throughput         Back-end Drives
100 MB/sec         4-6
200 MB/sec         5-8
300 MB/sec         7-10
400 MB/sec         9-12
500 MB/sec         10-14
600-1000 MB/sec    12-16

Installing the correct number of back-end drives is important, but it is also important that the drives be available for use. Available means they are operational and may be idle or in use. The Host Console Request function can be used to set up warning messages for when the number of available drives drops. The setting of the Available Physical Drive Low and High Warning levels is discussed in detail in the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User's Guide which is available on Techdocs. Use the SETTING, ALERT, PDRVLOW and SETTING, ALERT, PDRVCRIT keywords.
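For example, assuming the drive counts are given as the final keyword (illustrative values only; verify the syntax in the User's Guide), a warning at four available drives and a critical alert at two could be requested with:

LI REQ,libname,SETTING,ALERT,PDRVLOW,4
LI REQ,libname,SETTING,ALERT,PDRVCRIT,2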

5.6 Grid Links


This section discusses the grid link performance and the setting of the performance warning threshold.

5.6.1 Provide Sufficient Bandwidth


The customer network between the TS7700s must have sufficient bandwidth to account for the total replication traffic. For customers sharing network switches among multiple TS7700 paths or with other network traffic, the sum total of bandwidth on that network needs to be sufficient to account for all of the network traffic. The TS7700 uses the TCP/IP protocol for moving data between each cluster. In addition to the bandwidth, there are other key factors which impact the throughput which the TS7700 can achieve. Some of the factors which directly affect performance are:
- Latency between the TS7700s
- Network efficiency (packet loss, packet sequencing, and bit error rates)
- Network switch capabilities
- Flow control to pace the data from the TS7700s
- ISL capabilities - flow control, buffering, performance

The TS7700s attempt to drive the grid network links at their full rate, which may be much higher than the network infrastructure is able to handle. The TS7700 supports IP flow control frames so the network can pace the rate at which the TS7700 attempts to drive it. The best performance is achieved when the TS7700 is able to match the capabilities of the underlying network, resulting in fewer dropped packets. (When the system attempts to give the network more data than it can handle, the network begins to throw away the packets it cannot handle, which causes TCP to stop, resynchronize, and resend data, resulting in much less efficient use of the network.) To maximize network throughput, care should be taken to ensure that the underlying network:
- Has sufficient bandwidth to account for all network traffic expected to be driven through the system - eliminate network contention.
- Can support flow control between the TS7700s and the switches - this allows the switch to pace the TS7700 to the WAN's capability.
  o Flow control between the switches is also a potential factor, to ensure that they themselves are able to pace with each other's rate.
- Has switches whose performance is capable of handling the data rates expected from all of the network traffic.

In short, latency between the sites is the primary factor. However, packet loss due to bit error rates, or due to the network not being capable of handling the full link rate, causes TCP to resend data, which multiplies the effect of the latency.



5.6.2 Grid Link Performance Monitoring
The TS7700 generates a host message when it detects that grid performance is degraded. If the degraded condition persists, a Call Home is generated. The performance of the grid links is monitored periodically, and if one link is performing worse than the other links by more than an SSR-settable value, a warning message is generated and sent to the host. The purpose of this warning is to alert the customer if an abnormal grid performance difference exists. The value needs to be adjusted so that warning messages are not generated due to normal variations of the grid performance. The default value is different for different code levels. For example, a setting of 75% means that if one link's performance is 25% lower than the other link, a warning message is generated.

The grid link performance is available with the Host Console Request function and on the TS7700 Management Interface. The monitoring of the grid link performance using the Host Console Request is discussed in detail in the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User's Guide which is available on Techdocs. Use the STATUS, GRIDLINK keywords.

The Grid Link Degraded threshold also includes two other values that can be set by the SSR:
- Number of Degraded Iterations - the number of consecutive 5-minute intervals link degradation was detected before reporting an attention message.
- Generate a Call Home Iterations - the number of consecutive 5-minute intervals link degradation was detected before generating a Call Home.

The default values are 60% for the threshold, 9 iterations before an attention message is generated, and 12 iterations before a Call Home is generated. You should use the default values unless you are receiving intermittent warnings and support indicates the values should be changed. If you receive intermittent warnings, have the SSR change the threshold and iteration values to the recommended values. For some customers, the default and recommended values still need to be adjusted. For example, one customer's clusters in a two-cluster grid are 2000 miles apart with a round trip latency of approximately 45 ms; the normal variation seen by this customer is 20% to 40%, so they have set the threshold value at 25% and the iterations to 12 and 15.
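For reference, the point-in-time grid link data discussed above can be displayed from the host with the command shown in the Operations table earlier in this paper:

LI REQ,libname,STATUS,GRIDLINK

The output fields and their meanings should be verified against the Host Command Line Request User's Guide for your code level.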


5.7 Reclaim Operations
Reclaim operations consume two back-end drives per reclaim task as well as CPU MIPS. If needed, the TS7740 will allocate pairs of idle drives for reclaim operations, being sure to leave one drive available for recall. Reclaim operations affect host performance, especially during peak workload periods. You can tune your reclaim tasks using both the Reclaim Threshold and the Inhibit Reclaim Schedule.

5.7.1 Reclaim Threshold


The reclaim threshold directly affects how much data is moved during each reclaim operation. Customers tend to raise this threshold too high because they want to store more data on their stacked volumes. This results in reclaim operations having to move larger amounts of data and consuming back-end drive resources that are needed for recalls and pre-migration. Once a reclaim task is started, it does not free up its back-end drives until the source volume being reclaimed is empty.

The table below shows the amount of data that needs to be moved depending upon the stacked tape capacity and the reclaim threshold. When the threshold is reduced from 40% to 20%, only half of the data needs to be reclaimed, thus cutting the time and resources needed for reclaim in half.

                      Reclaim Threshold
Cartridge Capacity    10%       20%       30%       40%
300 GB                30 GB     60 GB     90 GB     120 GB
500 GB                50 GB     100 GB    150 GB    200 GB
640 GB                64 GB     128 GB    192 GB    256 GB
700 GB                70 GB     140 GB    210 GB    280 GB
1000 GB               100 GB    200 GB    300 GB    400 GB
4000 GB               400 GB    800 GB    1200 GB   1600 GB

5.7.2 Inhibit Reclaim Schedule


The Inhibit Reclaim Schedule should be used to inhibit reclaims during the customers busy periods. This will leave back-end drives available for recalls and pre-migrates. We recommend you start the inhibit time 60 minutes prior to the heavy workload period. This will allow any started reclaim tasks to complete before the heavy workload period.

5.7.3 Adjusting the Maximum Number of Reclaim Tasks


Reclaim operations consume back-end drives, two per task, and consume CPU cycles. This is why the Inhibit Reclaim Schedule should be used to turn off reclaim operations during heavy production periods. When reclaim operations are not inhibited, it may still be desirable to limit the number of reclaim tasks; perhaps there is moderate host IO during the uninhibited period and reclaim is consuming too many back-end drives and/or CPU cycles.

The Host Library Request command can be used to limit the number of reclaim tasks in the TS7740, using the second keyword RECLAIM along with the third keyword RCLMMAX. This applies only to the TS7740, and the Inhibit Reclaim Schedule is still honored. The limit is turned off by setting the value to -1; the minimum value, other than -1, is 1. The setting of the maximum number of reclaim tasks is discussed in detail in the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User's Guide which is available on Techdocs. Use the SETTING, RECLAIM, RCLMMAX keywords.

The maximum number of reclaim tasks is also limited by the TS7740 based on the number of available back-end drives, as shown in the table below.

Number of Available Drives    Maximum Number of Reclaim Tasks
3                             1
4                             1
5                             1
6                             2
7                             2
8                             3
9                             3
10                            4
11                            4
12                            5
13                            5
14                            6
15                            6
16                            7
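As an illustration, limiting the TS7740 to four concurrent reclaim tasks, and later removing the limit, might be requested as follows (the value 4 is an example; verify the exact syntax in the User's Guide):

LI REQ,libname,SETTING,RECLAIM,RCLMMAX,4
LI REQ,libname,SETTING,RECLAIM,RCLMMAX,-1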


5.8 Limiting Number of Pre-Migration Drives (max drives)
Each storage pool allows you to define the maximum number of back-end drives to be used for pre-migration tasks. There are several triggers that cause the TS7740 to ramp up the number of pre-migration tasks, as discussed in Section 3.4. If a ramp-up of pre-migration tasks occurs and is then followed by the need for more than one recall, a recall may have to wait for a pre-migration task to complete before a back-end drive frees up. A single pre-migration task will move up to 30 GB at a time, so having to wait for a back-end drive will delay a logical mount that requires a recall. If this ramp-up is causing too many back-end drives to be used for pre-migration tasks, you can limit the number of pre-migration drives in the pool properties panel.

For a Power5 engine (3957-V06) the maximum number of pre-migration drives per pool should not be set to more than 4; additional drives will not increase the copy rate to the drives. For a Power7 engine (3957-V07), pre-migration can benefit from having 8 to 10 drives available.

For Copy Export pools it is advisable that the maximum number of pre-migration drives be set appropriately. If the customer is exporting a small amount of data each day (one or two cartridges' worth of data), limit the pre-migration drives to two; if more data is being exported, set the maximum to four. This limits the number of partially filled export volumes.

To size the setting, look at the MB/GB written to a pool, compute MB/sec, compute the maximum and average rates, and from these compute the number of pre-migration drives per pool, basing the number of drives on approximately 100 MB/sec per drive. A worked example follows.
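As an illustrative calculation of the sizing approach just described (the numbers are invented for the example; use your own VEHSTATS data): suppose a pool receives 1,440 GB of compressed data during its busiest 4-hour window. That is 1,440,000 MB / 14,400 seconds = 100 MB/sec on average. If the peak 15-minute intervals run at roughly twice that rate, about 200 MB/sec, then at approximately 100 MB/sec per drive the pool would warrant a maximum of about two pre-migration drives.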

5.9 Avoid Copy Export during Heavy Production Periods


Since a Copy Export operation requires each physical volume to be exported to be mounted, it is best to perform the operation during a slower workload time.


References:
None

Disclaimers:
Copyright 2008, 2012 by International Business Machines Corporation. No part of this document may be reproduced or transmitted in any form without written permission from IBM Corporation.

Product data has been reviewed for accuracy as of the date of initial publication. Product data is subject to change without notice. This information could include technical inaccuracies or typographical errors. IBM may make improvements and/or changes in the product(s) and/or program(s) at any time without notice. References in this document to IBM products, programs, or services do not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business. Any reference to an IBM Program Product in this document is not intended to state or imply that only that program product may be used. Any functionally equivalent program, that does not infringe IBM's intellectual property rights, may be used instead. It is the user's responsibility to evaluate and verify the operation of any non-IBM product, program or service.

THE INFORMATION PROVIDED IN THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IBM EXPRESSLY DISCLAIMS ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON INFRINGEMENT. IBM shall have no responsibility to update this information. IBM products are warranted according to the terms and conditions of the agreements (e.g., IBM Customer Agreement, Statement of Limited Warranty, International Program License Agreement, etc.) under which they are provided. IBM is not responsible for the performance or interoperability of any non-IBM products discussed herein. The customer is responsible for the implementation of these techniques in its environment. Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. Unless otherwise noted, IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents or copyrights. Inquiries regarding patent or copyright licenses should be made, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.

Trademarks
The following are trademarks or registered trademarks of International Business Machines in the United States, other countries, or both: IBM, TotalStorage, DFSMS/MVS, S/390, z/OS, and zSeries. Other company, product, or service names may be the trademarks or service marks of others.


End of Document

