Application Performance and Tuning

V3.1.0.
DB2 UDB for z/OS Application Performance and Tuning

(Course Code CF96)
Student Notebook
ERC 3.2
IBM Certified Course Material
Trademarks
IBM is a registered trademark of International Business Machines Corporation.
The following are trademarks of International Business Machines Corporation in the United States, or other countries, or both: CICS, DB2, IMS, MVS, OS/390, z/OS.
Windows is a trademark of Microsoft Corporation in the United States, other countries, or both. Other company, product and service names may be trademarks or service marks of others.
June 2005 Edition

The information contained in this document has not been submitted to any formal IBM test and is distributed on an "as is" basis without any warranty either express or implied. The use of this information or the implementation of any of these techniques is a customer responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. While each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will result elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.
Copyright International Business Machines Corporation 2000, 2005. All rights reserved. This document may not be reproduced in whole or in part without the prior written permission of IBM.
Note to U.S. Government Users: Documentation related to restricted rights. Use, duplication or disclosure is subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp.
V3.1.0.1
Contents
Trademarks . . . xi
Course Description . . . xiii
Agenda . . . xvii

Unit 1. Application Performance Issues and Management Methods . . . 1-1
Unit Objectives . . . 1-2
Why Performance Disappointments? . . . 1-3
Users Complaining . . . 1-4
DBA Checks EXPLAIN . . . 1-5
DBA Adds LNAME to X3 . . . 1-6
Users Keep Complaining . . . 1-7
Accounting Trace Output . . . 1-8
DBA Meets Application Developer . . . 1-9
DBA Improves Index (Again) . . . 1-10
Who Should Detect Problems? . . . 1-11
When Should Problems Be Detected? . . . 1-12
Before Writing Program . . . 1-14
A Touch . . . 1-15
Why Did Optimizer Not Choose X2? . . . 1-16
X2 Would Prevent Sort But... . . . 1-17
The Message . . . 1-18
Unit Summary . . . 1-19

Unit 2. Towards Better Indexes . . . 2-1
Unit Objectives . . . 2-2
2.1 DB2 Index Structure and Basic Access Paths . . . 2-3
Index . . . 2-4
Clustering Index . . . 2-6
Basic Access Path Classification . . . 2-7
Matching Index Scan, Nonclustering Index . . . 2-8
Matching Index Scan, Clustering Index . . . 2-9
Nonmatching Index Scan . . . 2-10
Index-Only Access . . . 2-11
Matching versus Screening . . . 2-12
Predicting Matching Columns - Basic Rules . . . 2-13
Predicting Matching Columns - Exceptions . . . 2-14
Remember Unit 1 Example? . . . 2-15
Evaluating an Access Path . . . 2-16
Very Quick Upper Bound Estimate (VQUBE) . . . 2-17
Sequential Prefetch . . . 2-18
Recommended Mental Image . . . 2-19
Buffer Pool Hits . . . 2-20
2.2 Index Design - Part One . . . 2-21
DB Version 0 . . . 2-22
DB Version 1 . . . 2-23
Recommended Approach . . . 2-24
Components of Response Time . . . 2-25
Alarm Limits . . . 2-27
Alarm Limit Exceeded . . . 2-28
Case 1 - Primary Key= . . . 2-29
Case 2 - Matching Clustered Index Scan . . . 2-30
Case 3 - Matching Nonclustered Index Scan . . . 2-31
Case 4 - Nonmatching Index Scan . . . 2-32
Case 5 - Table Scan . . . 2-33
DB2 for z/OS Index . . . 2-34
Disk Space Estimate for Indexes . . . 2-36
Inserts . . . 2-38
Primary, Alternate and Foreign Key Indexes . . . 2-40
Why Avoid Sorts . . . 2-41
When Will Touches Take Place? . . . 2-42
When Will Touches Take Place?... . . . 2-43
2.3 Lab 1: Improve Indexes For Customer / Order Application . . . 2-45
Lab 1: Improve Indexes For Customer / Order Application . . . 2-46
Lab 1: Current Indexes . . . 2-47
Lab 1: Using One Cursor - Left Outer Join . . . 2-48
Lab 1: Instructions . . . 2-49
Lab 1: Worksheet 1 . . . 2-50
Lab 1: Worksheet 2 . . . 2-51
Lab 1: Worksheet 3 . . . 2-52
2.4 Index Design - Part Two . . . 2-53
Inadequate Indexing Detected - What Next? . . . 2-54
Start With Three Stars . . . 2-55
Three Stars, Perfect Index . . . 2-56
Three-Star Index . . . 2-57
Deriving Best Possible Index . . . 2-58
Candidate 1 . . . 2-59
Candidate 2 . . . 2-61
IN-List Predicates . . . 2-63
Cost of Index . . . 2-64
Add Columns to Existing Index . . . 2-65
Add New Index . . . 2-66
Too Many Indexes? . . . 2-67
Change Row Order . . . 2-68
Index Design Example . . . 2-69
Recommended Approach . . . 2-70
VQUBE for Candidates 1 and 2 . . . 2-71
Decision . . . 2-72
2.5 Lab 2: Poorly Performing Application Already In Production . . . 2-73
Lab 2: Poorly Performing Application Already In Production . . . 2-74
Lab 2: Accounting Trace . . . 2-75
Lab 2: EXPLAIN Output - Part One . . . 2-77
Lab 2: EXPLAIN Output - Part Two . . . 2-78
Lab 2: EXPLAIN Information Summary . . . 2-79
Lab 2: Initial Observations . . . 2-80
Lab 2: Instructions . . . 2-81
Lab 2: Design Candidate 1 Index Worksheet . . . 2-82
Lab 2: Design Candidate 2 Index Worksheet . . . 2-83
2.6 Advanced Access Paths . . . 2-85
Asynchronous Read (Prefetch) . . . 2-86
List Prefetch . . . 2-87
List Prefetch - Good News . . . 2-88
List Prefetch - Bad News . . . 2-89
Solution: OPTIMIZE FOR N ROWS . . . 2-90
IN-list Predicates and List Prefetch . . . 2-91
Multiple Index Access . . . 2-92
Pitfalls with Multiple Index Access . . . 2-93
One-Fetch Index Scan . . . 2-94
2.7 Lab 3: Multiple Index Access . . . 2-95
Lab 3: Multiple Index Access . . . 2-96
Lab 3: Current Indexes . . . 2-97
Lab 3: Instructions . . . 2-98
Lab 3: Design Candidate 1 Index Worksheet . . . 2-99
Unit Summary . . . 2-100

Unit 3. Towards Better Tables . . . 3-1
Unit Objectives . . . 3-2
Performance Issues in Table Design . . . 3-3
Clustering . . . 3-5
Denormalization 1: Copy from Parent to Dependent . . . 3-6
Denormalization 2: Summary Tables and Columns . . . 3-7
Unit Summary . . . 3-8

Unit 4. Learning to Live with Optimizer . . . 4-1
Unit Objectives . . . 4-2
4.1 Dangerous Predicates . . . 4-3
Cost-Based Optimizer . . . 4-4
Predicate Too Difficult for Optimizer . . . 4-5
Disappointed with Matching Columns? . . . 4-6
A Nonindexable Predicate . . . 4-7
Other Nonindexable Predicates . . . 4-8
Do Not Ban Nonindexable Predicates . . . 4-9
WHERE PRED1 OR PRED2 . . . 4-10
Boolean Term Or Non-Boolean Term? . . . 4-11
Safe versus Dangerous Predicates . . . 4-12
Browsing . . . 4-13
4.2 Lab 4: Browsing Application . . . 4-15
Lab 4: Browsing Application Description . . . 4-16
Lab 4: Browsing SQL Currently In Use . . . 4-17
Lab 4: Instructions (1 of 2) . . . 4-18
Lab 4: Instructions (2 of 2) . . . 4-19
4.3 Optimizer and Filter Factors . . . 4-21
Definition of Filter Factor . . . 4-22
Reality versus Optimizer's Estimate . . . 4-24
Optimizer's Filter Factor Formulae . . . 4-26
Default Filter Factors for Range Predicates . . . 4-27
Correlated Columns . . . 4-28
How to Help Optimizer with Filter Factor Problems . . . 4-29
Filter Factor - Example . . . 4-30
Slow SQL Statement . . . 4-31
Current Indexes (in Addition to Primary Key Index) . . . 4-32
Average Filter Factors (Actual versus Optimizer's Estimate) . . . 4-33
VQUBEs with Average Filter Factors (Actual versus Optimizer's Estimate) . . . 4-34
How To Help the Optimizer . . . 4-35
Learn To Live with Optimizer . . . 4-36
4.4 Join Issues . . . 4-37
3 Join Methods, 2 Join Types . . . 4-38
Merge Scan Join . . . 4-39
Nested Loop Join . . . 4-40
How to Estimate Joins . . . 4-41
Join Example . . . 4-42
But Optimizer Chose ORDER . . . 4-43
Optimal Indexes for Joins and Subqueries . . . 4-44
Optimal Indexes for Joins: Example . . . 4-45
How to Predict Best Table Order . . . 4-46
Join Pitfall . . . 4-47
4.5 Lab 5: Joins . . . 4-49
Lab 5: Joins . . . 4-50
Lab 5: ACCOUNT Table and CUST Table . . . 4-51
Lab 5: Instructions (1 of 2) . . . 4-52
Lab 5: Instructions (2 of 2) . . . 4-53
Lab 5: Design Candidate 1 Index Worksheet . . . 4-54
Lab 5: Design Candidate 2 Index Worksheet . . . 4-55
4.6 Subquery Issues . . . 4-57
Two Types of Subquery . . . 4-58
Noncorrelated Subquery (Single Value) . . . 4-59
Noncorrelated Subquery (Multiple Values) . . . 4-60
Correlated Subquery . . . 4-61
EXPLAIN and Subquery . . . 4-62
4.7 Lab 6: Different Implementations of the Same Transaction . . . 4-63
Lab 6: Description . . . 4-64
Lab 6: Available Indexes . . . 4-65
Lab 6: At A Glance . . . 4-66
Lab 6: Ideal Access Path (1 of 2) . . . 4-67
Lab 6: Ideal Access Path (2 of 2) . . . 4-68
Lab 6: PGM 1 - One Cursor and One Singleton Select Worksheet . . . 4-69
Lab 6: PGM 2 - Join Worksheet . . . 4-70
Lab 6: PGM 3 - Correlated Subquery Worksheet . . . 4-71
Lab 6: PGM 4 - Noncorrelated Subquery Worksheet . . . 4-72
Lab 6: PGM 5 - One Cursor and Two Singleton Selects Worksheet . . . 4-73
4.8 Union Issues . . . 4-75
UNION . . . 4-76
4.9 Lab 7: UNION . . . 4-77
Lab 7: UNION . . . 4-78
Lab 7: Current Table and Indexes . . . 4-79
Two Issues . . . 4-80
Unit Summary . . . 4-81

Unit 5. Unpredictable Transactions . . . 5-1
Unit Objectives . . . 5-2
5.1 Optional Input Fields . . . 5-3
Many Criteria, Only a Few Selected . . . 5-4
Best Solution . . . 5-5
One Cursor, One Access Path . . . 5-6
Without REOPT(ALWAYS) . . . 5-7
5.2 Star Join . . . 5-9
Star Schema . . . 5-10
Star Join . . . 5-11
Table Order Crucial . . . 5-12
Two Alternatives . . . 5-13
Fact Table: Important Points . . . 5-14
Unit Summary . . . 5-15

Unit 6. Massive Batch . . . 6-1
Unit Objectives . . . 6-2
6.1 Massive Batch . . . 6-3
Batch Job Performance Issues . . . 6-4
Buffer Pools . . . 6-5
How Long Do Pages Stay in Buffer Pool? . . . 6-6
How to Measure MUPA . . . 6-7
Random Disk I/O . . . 6-8
(TR) = Buffer Pool Hit . . . 6-9
Closer to Lower Bound or Upper Bound? . . . 6-10
X1: TR = 10,000 or 1,000,000? . . . 6-11
Table Even Worse . . . 6-12
Reduce Random Disk I/O . . . 6-13
Surprises Possible . . . 6-14
Complicated? Unpredictable? . . . 6-15
CPU Queuing Time . . . 6-16
Reduce Number of Touches . . . 6-17
Parallelism . . . 6-18
6.2 Lab 8: Improve Batch Performance . . . 6-19
Lab 8: Batch Application Description . . . 6-20
Lab 8: Theoretical Worst Case Estimate . . . 6-21
Lab 8: Theoretical Best Case Estimate . . . 6-22
Lab 8: Worst versus Best . . . 6-23
Lab 8: Index X6 - A Closer Look . . . 6-24
Lab 8: Refinements Of Worst And Best Estimates . . . 6-25
Lab 8: Instructions . . . 6-26
Lab 8: Worksheet . . . 6-27
6.3 Massive Delete . . . 6-29
Massive Delete . . . 6-30
Unit Summary . . . 6-31

Unit 7. Worried about CPU Time? . . . 7-1
Unit Objectives . . . 7-2
Rough CPU Time Estimate (z990) . . . 7-3
Lab 8 Base Case POLICY . . . 7-5
Lab 8 Base Case CUST . . . 7-6
Lab 8 Base Case CODE . . . 7-7
Lab 8 Base Case Summary . . . 7-8
Unit Summary . . . 7-9

Unit 8. Avoiding Locking Problems . . . 8-1
Unit Objectives . . . 8-2
Three Strategies . . . 8-3
Three Questions . . . 8-5
Three Serious Recommendations . . . 8-7
Assumptions . . . 8-8
With Those Assumptions...Lock . . . 8-9
Lock Avoidance . . . 8-10
Three Levels . . . 8-11
Unlock . . . 8-12
What Is the Problem? . . . 8-13
Example (Page Locking) . . . 8-14
Example...Wrong Results . . . 8-15
Serious Recommendation No.3 Ignored . . . 8-16
Example: Unnecessary Waiting . . . 8-17
Another Problem . . . 8-18
Lock Too Weak (and Too Short) . . . 8-19
The Problem . . . 8-20
Solution 1 . . . 8-21
Solution 2 . . . 8-22
Lock Wait Too Long? . . . 8-23
Shorter X Lock Duration: Intermediate Commit . . . 8-24
Shorter X Lock Duration: Manual Prefetch . . . 8-25
Example - Unnecessary Waiting . . . 8-26
Unnecessary Waiting - Base Case . . . 8-27
viii DB2 UDB for z/OS Application Performance Copyright IBM Corp. 2000, 2005
V3.1.0.1
Student Notebook
TOC
Unnecessary Waiting - Solution 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Unnecessary Waiting - Solution 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Unnecessary Waiting - Solution 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Unnecessary Waiting - Solution 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Unnecessary Waiting - Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Unnecessary Waiting - Summary... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Who is Afraid of WITH UR? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Many Pages Locked Too Long . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Commit Overhead . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Prevent Long Lock Waits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hot Pages? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Deadlocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Analyzing Long Lock Waits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Responsible for Lock Waits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Unit Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
8-28 8-29 8-30 8-31 8-32 8-33 8-34 8-35 8-36 8-37 8-38 8-39 8-40 8-41 8-42 8-43
Unit 9. Monitoring Application Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-1 Unit Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-2 DB2 Trace Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-3 Accounting Trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-4 Reading an Accounting Trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-5 Accounting Traces and VQUBE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-11 Analyzing an Accounting Trace (1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-12 Analyzing an Accounting Trace (2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-13 Most Useful Accounting Reports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-15 Unit Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-17

Trademarks
The reader should recognize that the following terms, which appear in the content of this training document, are official trademarks of IBM or other companies:
IBM is a registered trademark of International Business Machines Corporation.
The following are trademarks of International Business Machines Corporation in the United States, or other countries, or both: CICS, MVS, DB2, OS/390, IMS, z/OS
Windows is a trademark of Microsoft Corporation in the United States, other countries, or both. Other company, product and service names may be trademarks or service marks of others.

Course Description
DB2 UDB for z/OS Application Performance and Tuning
Duration: 5 days
Purpose
This course is designed to teach the students how to prevent application performance problems and to improve the performance of existing applications.
Audience
DB2 for z/OS application developers.
Prerequisites
Familiarity with DB2 for z/OS application programming.
Objectives
After completing this course, you should be able to:
- Design better indexes
- Determine how to live with the optimizer (avoid pitfalls, help when necessary)
- Avoid locking problems
- Use accounting traces to find significant performance problems in an operational application

Contents
- Overview of application performance issues and performance management methods
- Towards better indexes
  - From data model to database version 0
  - Detecting inadequate indexing with VQUBE (very quick upper bound estimate)
  - Three-star index: deriving the best possible index for a SELECT
  - Estimating the cost of an index
  - Restrictions and limitations
- Towards better tables
  - Clustering
  - Denormalization
- Learning to live with the optimizer
  - Predicting index matching and screening
  - Indexable predicates
  - Boolean term predicates
  - REOPT(ALWAYS) and the alternatives
  - Join issues
  - Subquery issues
  - Union issues
- Unpredictable transactions
  - Unpredictable predicates
  - Many criteria, few provided
  - Star join
  - Indexes enabling index-only access versus materialized query tables
- Massive batch
  - Problem 1: random disk I/O
  - Estimating and minimizing disk I/O time
  - Manual and automatic parallelism
  - Massive deletes
- Worried about CPU time?
  - Worksheet for rough CPU time estimates
- Preventing long lock waits
  - Lock life cycle
  - Recommendations
- Tuning operational applications
  - Analyzing slow transactions with accounting traces
  - Detecting inadequate indexing
  - Detecting optimizer problems
  - Detecting long lock waits
  - Detecting tables which should be denormalized

Agenda
Day 1
- Welcome
- Application performance issues and management methods
- DB2 index structure and basic access paths
- Index design - part one
Day 2
- Index design - part one (cont.)
- Lab 1 (Improve indexes for customer / order application)
- Lab 1 Review
- Index design - part two
- Machine Exercise 1
- Machine Exercise 1 Review
- Index design - part two (cont.)
- Lab 2 (Poorly performing application already in production)
- Lab 2 Review
- Advanced access paths
Day 3
- Lab 3 (Multiple index access)
- Lab 3 Review
- Towards better tables
- Dangerous predicates
- Machine Exercise 2
- Machine Exercise 2 Review
- Dangerous predicates (cont.)
- Lab 4 (Browsing application)
- Lab 4 Review
- Optimizer and filter factors
- Machine Exercise 3
- Machine Exercise 3 Review
- Join issues
Day 4
- Join issues (cont.)
- Lab 5 (Joins)
- Lab 5 Review
- Subquery issues
- Lab 6 (Different implementations of the same transaction)
- Lab 6 Review
- Union issues
- Lab 7 (Union)
- Lab 7 Review
- Machine Exercise 4
- Machine Exercise 4 Review
- Unpredictable transactions
- Massive batch
- Lab 8 (Improve batch performance)
Day 5
- Lab 8 Review
- Massive delete
- Worried about CPU time?
- Avoiding locking problems
- Monitoring application performance
Unit 1. Application Performance Issues and Management Methods

What This Unit Is About
This unit describes common DB2 application performance problems, different approaches to detect them, and different solutions.
What You Should Be Able to Do

After completing this unit, you should be able to:
- Describe common DB2 application performance problems
- Evaluate different approaches for detecting the problems
- Describe different solutions
Unit Objectives
After completing this unit, you should be able to:
- Describe common DB2 application performance problems
- Evaluate different approaches for detecting the problems
- Describe different solutions
Copyright IBM Corporation 2005
Figure 1-1. Unit Objectives
CF963.2
Notes:
Why Performance Disappointments?
Figure 1-2. Why Performance Disappointments?
Notes:
What are the most common reasons for long response times or slow batch jobs in DB2 applications?
Users Complaining
Input: FNAME = MIKKO, CITY = MILAN
Result: LNAME, CUSTNO (for example JONES, KOSKI)
Response time sometimes very long
One million customers
Result typically less than ten rows
Figure 1-3. Users Complaining
Notes:
A new medium-size application was taken into production. Many users are now complaining about widely varying response times. This is surprising because the application appeared very fast in user training sessions a couple of weeks ago. Fairly large tables were used in those sessions. The programs and the setup were identical to those now in production.
DBA Checks EXPLAIN
Table CUST: 1,000,000 rows
Indexes: X1 (CUSTNO; P, C), X2 (LNAME, FNAME), X3 (CITY)
SELECT LNAME, CUSTNO
FROM CUST
WHERE FNAME = :FNAME AND CITY = :CITY
ORDER BY LNAME
OPTIMIZE FOR n ROWS
EXPLAIN: Index X3, MC = 1, SORT = Y, INDEXONLY = N
Whole result materialized at OPEN CURSOR
Figure 1-4. DBA Checks EXPLAIN
Notes:
The database administrator checks the EXPLAIN for one transaction which the users have been complaining about: the one shown on the previous visual. She finds something suspicious: DB2 performs a sort. This means, of course, that DB2 must materialize the whole result when the cursor is opened. This can take a long time if the result consists of many rows.
The following abbreviations are used on this visual:
MC = Matching columns
For the indexes:
P = Primary key index
C = Clustering index
DBA Adds LNAME to X3
Table CUST: 1,000,000 rows
Indexes: X1 (CUSTNO; P, C), X2 (LNAME, FNAME), X3 (CITY, LNAME)
SELECT LNAME, CUSTNO
FROM CUST
WHERE FNAME = :FNAME AND CITY = :CITY
ORDER BY LNAME
OPTIMIZE FOR n ROWS
EXPLAIN: Index X3, MC = 1, SORT = N, INDEXONLY = N
Result materialized row by row at FETCH
Figure 1-5. DBA Adds LNAME to X3
Notes:
To prevent the sort, the DBA adds LNAME to index X3. The cursor contains WHERE CITY = :CITY and ORDER BY LNAME. Now the first transaction will materialize only what is needed to build the first screen; the disk I/Os take place at FETCH time. The EXPLAIN with the new index confirms this: the optimizer sees that the result will be in the requested order without a sort; SORT=N.
Users Keep Complaining
DBA analyzes an accounting trace
Selection criteria:
- PLANNAME = XXX
- Local response time above a threshold (in seconds)
Figure 1-6. Users Keep Complaining
Notes:
However, the users are not impressed. They say that the simple transaction which the DBA thought she had fixed is still very slow, even when the result is just a couple of rows. One user is quite aggressive, claiming that the system is getting slower and slower. She says that she once had to wait several minutes for a response, with a customer on the phone. The DBA suspects a system problem. Perhaps the new application has overloaded the hardware. To prove this, she starts an accounting trace to catch some slow occurrences of this transaction and analyzes the output. In this class, the term local response time is used to represent the elapsed time of a program, from create thread to terminate thread. The local response time is the class 1 elapsed time according to the accounting trace terminology. We will come back to this subject in unit 9. Instead of using the plan name as a selection criterion, the CICS transaction code, the IMS PSB name and many other selection criteria could be used.
Accounting Trace Output
Diagram: LOCAL RESPONSE TIME (in seconds) split into SQL and NON-SQL time. SQL time is broken down into LOCK WAIT, CPU TIME, SYNCHRONOUS READ (number of reads and average ms), WAIT FOR PREFETCH and OTHER.
Figure 1-7. Accounting Trace Output
Notes:
This diagram shows the most important numbers from the accounting trace. It tells much more than EXPLAIN. The numbers are measurements, not predictions. The slowest transaction spent 516 seconds executing SQL calls. This time is broken down to five components. Wait for prefetch is waiting for asynchronous reads (sequential prefetch, dynamic prefetch and list prefetch) to complete. Synchronous reads are normally random: the program is suspended because a page must be read from disk. The accounting trace also shows the number of synchronous reads. It does not look like a system problem. The slowest transaction is doing 50,000 synchronous reads with an average duration of 10ms. Why so many synchronous reads when there is no sort in the access path? Puzzled, the DBA calls an application developer familiar with the new application and arranges a meeting.
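The arithmetic behind this conclusion can be sketched in a few lines. The figures are the ones quoted in the notes above; the variable names are illustrative only and do not correspond to accounting trace field names.

```python
# Figures quoted in the notes for the slowest transaction.
sql_time_s = 516          # time spent executing SQL calls, in seconds
sync_reads = 50_000       # number of synchronous reads
avg_sync_read_ms = 10     # average duration of one synchronous read

# Total time the program was suspended waiting for synchronous reads.
sync_read_time_s = sync_reads * avg_sync_read_ms / 1000

# Share of the SQL time that the synchronous reads account for.
share = sync_read_time_s / sql_time_s

print(f"synchronous read time: {sync_read_time_s:.0f} s")
print(f"share of SQL time: {share:.0%}")
```

About 500 of the 516 seconds are spent waiting for random reads at a normal 10ms each, which points at the access path rather than at an overloaded system.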
DBA Meets Application Developer
Worst case identified:
- Big city (CITY = MILAN): high filter factor, 10%
- Rare first name (FNAME = MIKKO): very low filter factor
DB2 must read 100,000 index rows and 100,000 table rows
Figure 1-8. DBA Meets Application Developer
Notes:
The application developer knows that 10% of the customers come from one big city, and he estimates that as many as 10,000 customers (1%) could have the same first name. The input with the biggest result, then, would produce 1000 result rows (0.1%) and cause up to 100,000 synchronous reads to the table if there were a sort in the access path. Now, with no sort, the number of synchronous reads per transaction should be much less: 2000 (2% of 100,000) if the biggest result takes 50 screens. So, why 50,000 synchronous reads? Suddenly the application developer sees an explanation: if the user enters a big city and a rare first name, DB2 must check all 100,000 table rows related to that city before the first (and only) screen is built. That could explain the 50,000 synchronous reads. The DBA agrees.
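The numbers in this discussion follow from the filter factors. The sketch below reproduces them; the variable names are illustrative, and the screen size of 20 rows is taken from the 50 screens mentioned above for a 1000-row result.

```python
customers = 1_000_000
ff_city_worst = 0.10      # 10% of the customers live in one big city
ff_fname_worst = 0.01     # up to 1% of the customers share a first name
rows_per_screen = 20      # 1000 result rows take 50 screens

# Rows DB2 must consider for the big city.
city_rows = round(customers * ff_city_worst)

# Biggest possible result: big city AND common first name.
biggest_result = round(customers * ff_city_worst * ff_fname_worst)
screens = biggest_result // rows_per_screen

# Without a sort, one transaction builds one screen, so it reads
# about 1/50 (2%) of the table rows for the city.
reads_per_screen = city_rows // screens

print(city_rows, biggest_result, screens, reads_per_screen)
```

With a rare first name there is at most one screen, so the whole set of `city_rows` table rows is checked in a single transaction.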
DBA Improves Index (Again)
Index X3: CITY, FNAME, LNAME, CUSTNO
EXPLAIN: Index X3, MC = 2, SORT = N, INDEXONLY = Y
Accounting trace for worst case with final index: LOCAL RESPONSE TIME (in ms) split into SQL and NON-SQL time; SQL time broken down into LOCK WAIT, CPU TIME, SYNCHRONOUS READ, WAIT FOR PREFETCH and OTHER.
Figure 1-9. DBA Improves Index (Again)
Notes:
Adding FNAME after the current two columns would eliminate almost all synchronous reads against the table; DB2 would read a table row only when CITY and FNAME are right. However, in the case of a big city and a rare name, DB2 would still have to read up to 100,000 index entries. To avoid this, the DBA decides to add FNAME between CITY and LNAME. There is still no need to sort. The customers who live in Milan and have Mikko as first name are now next to each other in X3 in LNAME sequence. DB2 needs to scan only 20 index entries to find the first 20 of them. The DBA notes that the only reason to access the table is now CUSTNO. She decides to add this short and non-volatile column to the index. The bind with the new index produces the expected EXPLAIN. A measurement after the second index change shows excellent response times. The users are finally happy with this transaction, but many other transactions are still very slow. The DBA is exhausted.
Who Should Detect Problems?
DBA? Application Developer?
MUST UNDERSTAND:
- DB2: index design, table implementation, optimizer pitfalls, locking, VQUBE (quick estimate), EXPLAIN, accounting trace
- Application: user input, filter factors
Figure 1-10. Who Should Detect Problems?
Notes:
There are three DBAs and almost 100 application developers in this company. The DBAs are busy with day-to-day database administration. They handle serious performance incidents, but no longer have much time for design reviews or regular monitoring. The DBAs are also less and less familiar with the applications. Many application developers have learned to use EXPLAIN and accounting traces, mostly after analyzing performance problems with a DBA. Some of them would like to learn more about the optimizer, while others feel that performance is not their concern; they already have so many things and products to worry about.
When Should Problems Be Detected?
- Before writing program?
- EXPLAIN in test?
- Monitor (accounting trace) in test?
- EXPLAIN in production?
- Monitor (accounting trace) in production?
- After users complain?
Figure 1-11. When Should Problems Be Detected?
Notes:
Currently, in this company, some access path problems are found when new programs are moved into production. The DBAs routinely check the EXPLAINs and demand an explanation for each table scan and non-matching index scan. This does not seem to be enough, because many problems are not detected until the users complain.
Starting from the bottom of the list, it is easy to think of better procedures which would catch performance problems earlier or even prevent them.
1. Regular exception monitoring with accounting traces would show all slow transactions.
2. EXPLAIN could be analyzed in more detail. Somebody familiar with the SQL calls could check whether the expected index is used in the expected way (number of matching columns), whether joins access the tables in a reasonable order, and so on. All sorts should be checked. All non index-only accesses should be checked.
3. If the test databases are fairly realistic (as in this case), an accounting exception trace would catch many slow transactions as well as locking bottlenecks during user training. The users are not likely to report a response time of a few seconds, but it would stand out in an accounting exception trace.
4. When new programs are bound with fairly realistic test databases, many access path problems (inadequate indexes or optimizer problems) can be caught simply by checking the EXPLAIN.
5. Most access path problems can be detected and prevented as soon as the specifications for a program are fixed. All that is needed is a very rough estimate with the current indexes: can acceptable performance be achieved with these indexes (and these tables)?
Before Writing Program
Index X3 (CITY): 100,000 touches, 1 random
Table CUST (1,000,000 rows): 100,000 touches, all random
ASSUME: up to 10 ms per random touch, up to 0.02 ms per sequential touch
=> LOCAL RESPONSE TIME UP TO 1000 s
Figure 1-12. Before Writing Program
Notes:
Estimating is really simple with the VQUBE (very quick upper bound estimate) method presented in this course. All you need to do is to count the touches (index and table rows read by DB2) and determine how many of these are random. You must be familiar with the application, however. You must have at least a rough idea of the size of the result (or actually the filter factor of each predicate). You must be able to recognize the worst input for each index candidate. In our case, for instance, you must know that 10% of customers have the most common value in CITY. Then you find out very quickly that index X3 is not adequate with the worst input. Index X3 on the visual is the original index with only one column (CITY). 100,001 random touches (100,000 for the table, 1 for the index) multiplied by 10ms give an upper bound estimate for the local response time, 1000s. The estimate for sequential touches (99,999 x 0.02ms = 2s, 0.2% of the estimate for random touches) can be ignored, as these are only estimates.
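The estimate described above can be written as a small helper. This is only a sketch: the function name `vqube_seconds` and the constant names are made up for illustration, while the 10ms and 0.02ms figures are the assumptions stated on the visual.

```python
RANDOM_TOUCH_MS = 10.0       # assumed upper bound per random touch
SEQUENTIAL_TOUCH_MS = 0.02   # assumed upper bound per sequential touch

def vqube_seconds(tr, ts):
    """Upper bound estimate of the local response time, in seconds.

    tr: number of random touches, ts: number of sequential touches.
    """
    return (tr * RANDOM_TOUCH_MS + ts * SEQUENTIAL_TOUCH_MS) / 1000

# Worst input with the original index X3 (CITY only):
# 1 random index touch plus 100,000 random table touches.
random_part = vqube_seconds(tr=100_001, ts=0)

# The remaining 99,999 index touches are sequential.
sequential_part = vqube_seconds(tr=0, ts=99_999)

print(f"random touches:     {random_part:.0f} s")
print(f"sequential touches: {sequential_part:.0f} s")
```

The sequential part is so small next to the random part that leaving it out does not change the conclusion: index X3 is inadequate for the worst input.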
A Touch
DB2 examines an index row or a table row = ONE TOUCH
TR = RANDOM TOUCH
TS = SEQUENTIAL TOUCH: the row is on the same page as the previous row or on the next page
Figure 1-13. A Touch
Notes:
Index row means an index key and one pointer called RID (record ID) on the leaf page (the lowest level of the index). If a table contains one million rows, all indexes pointing to the table have one million index rows, both unique and non-unique indexes.
Why Did Optimizer Not Choose X2?
Index X2 (LNAME, FNAME), 1,000,000 rows
- Better than X3 when result big (max filter factors)
- Worse than X3 with average filter factors
STATIC SQL: ONE ACCESS PATH based on average filter factors
Figure 1-14. Why Did Optimizer Not Choose X2?
Notes:
Now, let's return to the case. Can we blame the optimizer? Would it not have been better to choose X2 if the key of X3 is CITY alone? Index X2 does prevent sort. This makes it better than the original X3 if the result is very large, like 1000 rows. Then, with the presented cursor (and max 20 fetches), DB2 needs to scan only 2% of the customers (20 rows per screen divided by 1000 rows) to build the first screen. This means 20,000 sequential touches to X2 and, assuming a worst case filter factor of 1% for FNAME = :FNAME, 200 random touches to CUST (thanks to index screening for column FNAME, discussed later). Local response time = 20,000 x 0.02ms + 200 x 10ms = 2.4s; not good but much better than the 1000s with X3.
X2 Would Prevent Sort But...
Table CUST: 1,000,000 rows
Indexes: X1 (CUSTNO; P, C), X2 (LNAME, FNAME), X3 (CITY)
SELECT LNAME, CUSTNO
FROM CUST
WHERE FNAME = :FNAME AND CITY = :CITY
ORDER BY LNAME
OPTIMIZE FOR n ROWS
EXPLAIN: Index X2, MC = 0, SORT = N, INDEXONLY = N
DB2 may scan whole X2
Figure 1-15. X2 Would Prevent Sort But...
Notes:
With average input, however, X2 is worse than X3. When the result is only one screen, DB2 must check every X2 row to build the response: 1,000,000 sequential touches to the index; local response time = 20s (plus one random touch to the table for every row with the right FNAME).
When choosing the access path for a static SQL call with host variables in the WHERE clause (but without BIND parameter REOPT(ALWAYS)), the optimizer estimates the elapsed time of the alternatives assuming average filter factors. If there are 1000 different cities in the CUST table, for instance, the assumed filter factor is 1/1000. If you want to minimize the response time with the worst input, you should bind with REOPT(ALWAYS) or design an index that performs well with any input.
Binding every package with REOPT(ALWAYS) would be overkill. Estimating the costs of alternative access paths at each execution increases CPU time by at least a few milliseconds per cursor or SQL statement. Often, especially with well-designed indexes, REOPT(ALWAYS) does not change the access path.
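The comparison between X2 and the original X3 can be sketched with the same 10ms and 0.02ms assumptions used earlier in this unit. The helper name `vqube_seconds` is made up for illustration, and the average case for X3 assumes the example of 1000 different cities (about 1000 customers per city), which is an illustration rather than a measured value.

```python
RANDOM_TOUCH_MS = 10.0       # assumed upper bound per random touch
SEQUENTIAL_TOUCH_MS = 0.02   # assumed upper bound per sequential touch

def vqube_seconds(tr, ts):
    """Upper bound estimate of the local response time, in seconds."""
    return (tr * RANDOM_TOUCH_MS + ts * SEQUENTIAL_TOUCH_MS) / 1000

customers = 1_000_000

# Worst input: big city, common first name, 1000 result rows.
x3_worst = vqube_seconds(tr=100_001, ts=0)    # original X3 (CITY)
x2_worst = vqube_seconds(tr=200, ts=20_000)   # X2, first screen only

# Result of one screen only: X2 is scanned from end to end
# (table touches for rows with the right FNAME are left out here).
x2_one_screen = vqube_seconds(tr=0, ts=customers)

# Average input for X3, assuming 1000 cities of about 1000 customers each:
# 1 random + 999 sequential index touches, 1000 random table touches.
x3_average = vqube_seconds(tr=1 + 1000, ts=999)

print(f"worst input:   X3 {x3_worst:.0f} s, X2 {x2_worst:.1f} s")
print(f"average input: X3 {x3_average:.0f} s, X2 at least {x2_one_screen:.0f} s")
```

Under these assumptions X2 wins only with the worst input, which is consistent with an optimizer that picks one access path based on average filter factors.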
The Message
Application tuning very slow:
- if not estimate-based
- if centralized
Figure 1-16. The Message
Notes:
Index design is not automatic; the optimizer is not perfect; lock waits sometimes need attention; some tables need denormalizing. All these problems are application-related. In the ideal world, each application developer is aware of these issues and attacks them early in the lifecycle of an application program, using, for instance, VQUBE, EXPLAIN and accounting traces. In the real world, not every application developer can get the education and experience to become self-sufficient in DB2 application performance. A realistic approach is to designate enough application DBAs (semi-DBAs). Over time, the application DBAs (sometimes called 50/50 people: 50% DBAs, 50% application developers) can train the application developers to check the EXPLAINs and accounting traces of the programs they have written, and even to do a VQUBE before coding. If tuning is not based on estimates, it is trial-and-error. Many insignificant problems may be fixed before the big one.
Unit Summary
Key points:
- Estimate early (VQUBE, worst input)
- EXPLAIN early
- Run accounting exception trace early
Figure 1-17. Unit Summary
Notes:
Unit 2. Towards Better Indexes

What This Unit Is About
This unit deals with detecting inadequate indexes at application design time using VQUBE (very quick upper bound estimate). A three-star algorithm is proposed for designing the best possible index for a given SELECT.
What You Should Be Able to Do
After completing this unit, you should be able to:
- Detect inadequate indexing with VQUBE as soon as program specifications are completed
- Design the best possible index for a single-table SELECT
- Evaluate the cost of an index
How You Will Check Your Progress

Accountability: Labs 1, 2, and 3
References
SC18-7413 DB2 UDB for z/OS Version 8 Administration Guide
Unit Objectives
After completing this unit, you should be able to:
- Detect inadequate indexing with VQUBE as soon as program specifications are completed
- Design the best possible index for a single-table SELECT
- Evaluate the cost of an index
Copyright IBM Corporation 2005
Notes:
2.1 DB2 Index Structure and Basic Access Paths

After completing this topic, you should be able to:
- Perform basic access path classification
- Differentiate between a matching index scan with a nonclustering index, a matching index scan with a clustering index, and a nonmatching index scan
- Identify how to recognize index-only access and describe its benefits
- Differentiate between index matching and index screening
- Describe how you can predict matching columns
- Evaluate the cost of a query based on random and sequential touches
- Use the very quick upper bound estimate (VQUBE) analysis to detect slow access paths early
Index
Diagram: index tree with a Root Page (keys 13, 45, 86), Nonleaf Pages (keys 13, 19, 33, 45, 62, 75, 86) and Leaf Pages whose entries point to rows on the data pages of the table. Table: 1,000,000 rows, 50,000 pages.
Figure 2-2. Index
Notes:
When you create an index, DB2 will scan the table and collect the address (RID) and key value from each table row. Let us assume you create a CUSTNO index for an INVOICE table with one million rows. After the first step, DB2 has a file of one million rows, each containing a CUSTNO value and a pointer (RID). Next, DB2 sorts these records according to CUSTNO. Then, it builds a set of leaf pages which contain one million index entries in CUSTNO sequence. Depending on CREATE INDEX specifications, DB2 may leave a percentage of free space on each leaf page and every Nth leaf page may be left empty; for example a specification of PCTFREE 25 FREEPAGE 8 would leave at least 25% free space on each leaf page and every 8th leaf page would be left empty. The size of index pages is always 4K, so typically 50 to 200 index entries fit on one page. In our example, with a short key (CUSTNO) the number of leaf pages might be 1,000,000 / 200 = 5000. When the leaf pages are complete, DB2 creates nonleaf pages which enable it to find the first index entry with a given key value very quickly, even when a table has billions of rows. Each nonleaf page points to a set of pages (typically 100 to 300) on the next lower level.
Each entry on a nonleaf index page contains enough information of the highest key value on the referred page for selecting the right lower-level page. DB2 keeps adding levels until a level has only one page. This page, the starting point when searching for a key value, is called the root page. The number of nonleaf pages is typically less than 1% of the number of leaf pages. If our CUSTNO index has 5000 leaf pages (with ample free space), the next level can have 5000 / 250 = 20 pages, and the third level is the root page. When estimating the number of disk I/Os, it is normally assumed that nonleaf pages stay in the buffer pools in real storage. This means, of course, that the buffer pool(s) containing the indexes must be larger than the sum of all nonleaf pages, probably at least twice as large. If this is true in our case, finding the first invoice with a given CUSTNO takes two disk I/Os (one leaf page, one data page). The next few invoices would probably need one I/O each: same leaf page, different data page. When a row is added to the INVOICE table, DB2 has to add an entry to the leaf page of the CUSTNO index. It finds the right leaf page using the nonleaf pages (hopefully in buffer pool), reads the leaf page (one synchronous read), and then adds the new entry (CUSTNO + RID if CUSTNO does not yet exist, or the RID only if there is already at least one row with the same CUSTNO) at the right place according to the CUSTNO sequence. The entries with the same key value are sorted in RID sequence. Normally, there is room for the newcomer in the leaf page. Then, adding an index entry requires only one synchronous read. If the leaf page is full, DB2 splits the page and puts the new leaf page as close as possible to the original page. After the split, the leaf pages are no longer physically in key order, but a chain of pointers always connects the leaf pages in correct sequence. This is why an ORDER BY often does not cause a sort. 
How many pages must DB2 read if you want to determine the total amount of all 1000 invoices to one customer?
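The page counts used in these notes for the CUSTNO index can be reproduced with a short sketch. The function name `index_levels` is made up for illustration; the 200 entries per leaf page and the 250 pointers per nonleaf page are the example values from the notes, not fixed DB2 properties.

```python
import math

def index_levels(index_rows, entries_per_leaf, fanout):
    """Return the number of pages on each index level, leaf level first."""
    levels = [math.ceil(index_rows / entries_per_leaf)]
    # Keep adding levels until a level has only one page (the root page).
    while levels[-1] > 1:
        levels.append(math.ceil(levels[-1] / fanout))
    return levels

levels = index_levels(1_000_000, entries_per_leaf=200, fanout=250)
nonleaf_share = sum(levels[1:]) / levels[0]

print(levels)                    # [5000, 20, 1]
print(f"{nonleaf_share:.2%}")    # nonleaf pages relative to leaf pages
```

The 21 nonleaf pages are well under 1% of the 5000 leaf pages, which is why it is reasonable to assume that they stay in the buffer pool.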
Clustering Index
Diagram: the same index tree (Root Page, Nonleaf Pages, Leaf Pages), with the table rows stored in the order of the index entries. Table: 1,000,000 rows, 50,000 pages.
Figure 2-3. Clustering Index
Notes:
You can (and should) define a clustering index for each table. The clustering index is created using the CLUSTER keyword of CREATE INDEX. The clustering index itself is no different from any other index, but it affects the physical order of table rows in two ways:
- When a table row is inserted, DB2 tries to place the new row in the home page defined by the clustering index. If there is not enough room, DB2 tries the pages close to the home page.
- When a table is reorganized, DB2 restores perfect clustering as shown on the visual.
Now, how many pages must be read to see all 1000 invoices to one customer?
Basic Access Path Classification
Index used?
- Yes: Index scan
- No: Table scan
Index-only access?
- Yes: Without table reference
- No: With table reference
Read all leaf pages?
- Yes: Nonmatching index scan
- No: Matching index scan
MC = number of matching columns
Figure 2-4. Basic Access Path Classification
Notes:
Index scan does not mean that DB2 scans the whole index; it simply means that an index is somehow involved in an access path.
Three advanced access paths will be discussed at the end of this unit:
- List prefetch
- Multiple index access
- One-fetch index scan
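The three questions on the visual can be written as a small decision function. This is a sketch only: the function and parameter names are hypothetical, and the inputs stand for what you would read from EXPLAIN (whether an index is used, the index-only indicator and the number of matching columns). The advanced access paths listed above are not covered.

```python
def classify_access_path(index_used, index_only=False, matchcols=0):
    """Classify a basic access path following the three questions on the visual."""
    if not index_used:
        return "table scan"
    # MC > 0: only part of the leaf pages is read; MC = 0: all leaf pages are read.
    scan = "matching index scan" if matchcols > 0 else "nonmatching index scan"
    table = "without table reference" if index_only else "with table reference"
    return f"{scan}, {table}"

print(classify_access_path(index_used=False))
print(classify_access_path(index_used=True, matchcols=1))
print(classify_access_path(index_used=True, index_only=True, matchcols=2))
```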
Matching Index Scan, Nonclustering Index
Diagram: index tree (Root Page, Nonleaf Pages, Leaf Pages) with pointers from the matching leaf entries to the table rows.
Index touches: TR = 1, TS = M
Table touches: TR = up to M
EXPLAIN: MATCHCOLS > 0
Figure 2-5. Matching Index Scan, Nonclustering Index
Notes:
TR is the number of random touches, TS is the number of sequential touches. These values are the input for very quick upper bound estimate (VQUBE). M is the number of matching index entries. They relate to a key value or a key range. Why up to M random touches (instead of M)? This is due to index screening and will be discussed later in this unit.
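The touch counts on the visual translate directly into a VQUBE estimate. The sketch below assumes the 10ms and 0.02ms upper bounds introduced in unit 1; the function name and the sample value M = 1000 are illustrative only.

```python
RANDOM_TOUCH_MS = 10.0       # assumed upper bound per random touch
SEQUENTIAL_TOUCH_MS = 0.02   # assumed upper bound per sequential touch

def nonclustering_scan_touches(m, r=None):
    """Return (TR, TS) for a matching scan with a nonclustering index.

    m: number of matching index entries
    r: table rows left after index screening (defaults to m, the upper bound)
    """
    table_rows = m if r is None else r
    tr = 1 + table_rows   # 1 random index touch + one random touch per table row
    ts = m                # index entries scanned sequentially (TS = M on the visual)
    return tr, ts

tr, ts = nonclustering_scan_touches(m=1000)
estimate_s = (tr * RANDOM_TOUCH_MS + ts * SEQUENTIAL_TOUCH_MS) / 1000
print(tr, ts, f"{estimate_s:.2f} s")
```

Nearly all of the estimate comes from the random table touches, which is why index screening (a smaller `r`) matters so much for this access path.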
Matching Index Scan, Clustering Index
Diagram: index tree (Root Page, Nonleaf Pages, Leaf Pages) with the matching leaf entries pointing to adjacent table rows.
Index touches: TR = 1, TS = M
Table touches: TR = 1, TS = M, or TR = R (if index screening)
EXPLAIN: MATCHCOLS > 0
Figure 2-6. Matching Index Scan, Clustering Index
Notes:
This is a very efficient access path if the table is reorganized often enough to enable DB2 to place new rows in their home pages. This is what we have to assume when making estimates. It is then the responsibility of a DBA to keep the tables in good shape, at least those from which several rows are read with a clustered index scan. Mislocated rows cause random touches.
Index screening reduces the number of table touches. However, as the table touches are not sequential (some rows are skipped), the elapsed time per table touch is more than 0.02ms. To be on the safe side (upper bound) and to enable quick estimates, the skip sequential touches are considered random in VQUBE: TR = R, where R is the number of rows left after index screening. Of course, this leads to pessimistic estimates when few rows are skipped. The actual time per table touch for a clustered index scan with index screening is between 0.02ms and 10ms.
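The two cases on the visual, and the range within which the real table access time lies when index screening applies, can be sketched as follows. The function name and the sample values (M = 1000, R = 100) are illustrative; the 0.02ms and 10ms figures are the per-touch bounds quoted in the notes.

```python
RANDOM_TOUCH_MS = 10.0       # assumed upper bound per random touch
SEQUENTIAL_TOUCH_MS = 0.02   # assumed upper bound per sequential touch

def clustered_scan_touches(m, r=None):
    """Return (TR, TS) for a matching scan with the clustering index.

    m: number of matching index entries
    r: rows left after index screening, or None if there is no screening
    """
    if r is None:
        # Index: TR = 1, TS = M. Table: TR = 1, TS = M.
        return 2, 2 * m
    # With screening, skip sequential table touches are counted as random.
    return 1 + r, m

no_screening = clustered_scan_touches(m=1000)
with_screening = clustered_scan_touches(m=1000, r=100)

# Real table access time with screening lies between these two bounds.
r = 100
table_low_s = r * SEQUENTIAL_TOUCH_MS / 1000
table_high_s = r * RANDOM_TOUCH_MS / 1000
print(no_screening, with_screening, table_low_s, table_high_s)
```

The wide gap between the two bounds shows how pessimistic the VQUBE convention can be when only a few rows are skipped.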
Nonmatching Index Scan
Diagram: index tree (Root Page, Nonleaf Pages, Leaf Pages); all leaf pages are read.
Index touches: TR = 1, TS = T-1
Table touches: TR = R
EXPLAIN: MATCHCOLS = 0
Figure 2-7. Nonmatching Index Scan
Notes:
T is the number of rows in the table; R is the number of rows left after index screening. In theory, the number of table touches is up to T (TS with clustering index, otherwise TR), but nonmatching index scan with table reference only makes sense if there is significant index screening. Clustering does not make a big difference, since the qualifying rows are not close to each other and table touches can be considered random.
Index-Only Access
[Figure: index tree (root page, nonleaf pages, leaf pages); the table is not touched]
Index, MC = 0: TR = 1, TS = T - 1
Index, MC > 0: TR = 1, TS = M
EXPLAIN: INDEXONLY = Y
Figure 2-8. Index-Only Access
Notes:
This is a very nice access path. Very fast (only one TR if no leaf pages have been split), easy to predict (TS = M). No wonder indexes enabling index-only access have become so popular. To reduce the impact of leaf page splits on sequential processing, you should leave every fourth or eighth leaf page empty when you reorganize an index in which leaf page splits are likely to occur. An index-only access path may have any number of matching columns. The worst case (MC=0) may be acceptable if the index is not very large.
Matching versus Screening
Index Matching
- Defines the range of index rows to be touched (from the first to the last of the M matching index rows)
Index Screening
- Predicate evaluated in index, table row touched only if predicate true (and if access path not index-only)
Example (index LNAME, FNAME on table CUST):
SELECT ... FROM CUST WHERE LNAME LIKE 'M%' AND FNAME LIKE 'S%'
- LNAME LIKE 'M%': matching
- FNAME LIKE 'S%': screening
Basic Recommendation
- All predicates in WHERE clause should be supported by one index (matching or screening)
Figure 2-9. Matching versus Screening
Notes:
Matching reduces the number of index and table touches; screening reduces the number of table touches. Matching support for a predicate is better than screening support, but screening support is better than no support. A predicate with matching support is called a matching predicate, and the related column is called a matching column; likewise with screening. The first N columns of an index can be matching columns. Any index column after these can be a screening column. If you follow the basic recommendation, the number of table touches will never be higher than the number of result rows. If the largest result table is 1000 rows (example from unit 1), the worst case is 1000 random touches to the table: 1000 x 10ms = 10s. The sequential index touches may still be a problem (VQUBE: 1,000,000 x 0.02ms = 20s if MC=0).
Predicting Matching Columns - Basic Rules

Look at the index columns from leading to trailing. For each column, look at the SQL statement:
- If there is no predicate for a column, this column is not a matching column and all the following columns in the index are not matching columns.
- If there are predicates for a column, at least one predicate must be indexable and Boolean term; otherwise, the column is not a matching column and all the following columns in the index are not matching columns.
- If the predicate for a column is a range predicate, all the following columns in the index are not matching columns.
Figure 2-10. Predicting Matching Columns - Basic Rules
Notes:
This is one of the most important visuals in this course. You should memorize it or pin it on your wall. Matching columns do not refer to an index alone or to an SQL statement alone. One SQL statement together with one index has a certain number of matching columns. Indexable and Boolean term will be discussed in unit 4. BETWEEN, LIKE, >, >=, <, <=, ¬>, ¬< are range predicates.
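The basic rules can be sketched as a small Python function. This is only an illustration under simplifying assumptions: the function name and the predicate labels ('equal', 'range', 'other') are hypothetical, each column's predicates are assumed to have been classified by the caller already (indexable and Boolean term or not), and the IN-list exceptions are not modeled.

```python
from typing import Dict, List

def predict_matching_columns(index_columns: List[str], predicates: Dict[str, str]) -> int:
    """Walk the index columns from leading to trailing and count matching columns.

    predicates maps a column name to 'equal', 'range' or 'other', where
    'other' means no indexable Boolean-term predicate exists for the column.
    Columns without any predicate are simply absent from the dictionary.
    """
    matching = 0
    for column in index_columns:
        kind = predicates.get(column)
        if kind not in ("equal", "range"):
            break  # this column and all following columns are not matching
        matching += 1
        if kind == "range":
            break  # the range column matches, the following columns do not
    return matching

# Index (LNAME, FNAME) with LNAME LIKE 'M%' AND FNAME LIKE 'S%': both are range predicates.
like_like = predict_matching_columns(["LNAME", "FNAME"], {"LNAME": "range", "FNAME": "range"})
# Index (LNAME, FNAME) with LNAME = ... AND FNAME LIKE 'M%'.
equal_like = predict_matching_columns(["LNAME", "FNAME"], {"LNAME": "equal", "FNAME": "range"})
# Hypothetical index (A, B, C) with predicates on A and C only.
gap = predict_matching_columns(["A", "B", "C"], {"A": "equal", "C": "equal"})
print(like_like, equal_like, gap)
```

In the first call FNAME can still be a screening column; the function only counts matching columns.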
Predicting Matching Columns - Exceptions
At most one IN-list predicate can be a matching predicate on an index. For multiple index access and index access with list prefetch, IN-list predicates cannot be used as matching predicates.
Figure 2-11. Predicting Matching Columns - Exceptions
Notes:
Note that the column referred to in the IN-list need not be the last index column. With WHERE A= AND B IN ( ) and index (B,A), MC=2 is possible. Multiple index access and list prefetch will be discussed later in this unit.
Remember Unit 1 Example?

More user-friendly - also optimizer-friendly?
SELECT LNAME, CUSTNO
FROM CUST
WHERE FNAME LIKE :FNAME
AND CITY LIKE :CITY
ORDER BY LNAME
OPTIMIZE FOR 20 ROWS
Example input (in predicate order): 'BON%', 'TAM%'
Index: CITY, FNAME, LNAME, CUSTNO
MC =
TS =
Figure 2-12. Remember Unit 1 Example?
Notes:
Users probably do not want to type the whole city name and the whole first name, only the first few characters. This implies range predicates (LIKE or BETWEEN) instead of equal predicates. With the final index from unit 1, what is now the number of matching columns and the number of sequential touches when the user enters the first characters of the biggest city? 100,000 customers (10%) live in qualifying cities. To improve performance in cases where the input is complete (selected from a list, for instance), the program can choose another cursor (WHERE CITY = ...).
Evaluating an Access Path
Matching columns only a starting point
- Nonmatching index scan may be acceptable
- Matching index scan may be very slow
TR = Number of random touches
- Typically 10ms (mostly I/O time)
TS = Number of sequential touches
- Roughly 0.02ms (I/O time overlapped with CPU time)
Figure 2-13. Evaluating an Access Path
Notes:
Your EXPLAIN tool may report that an access path is MATCHING INDEX SCAN (2/4) when MC=2 and the index has four columns. The number of columns in the index is not very relevant. The number of predicates is more interesting. If there are two predicates and MC=2, the access path is probably not bad. However, you cannot really evaluate an access path (fast enough/too slow) until you know TR and TS with the worst input.
Very Quick Upper Bound Estimate (VQUBE)
LRT = TR x 10ms + TS x 0.02ms
- LRT = Local response time
- TR = Number of random touches
- TS = Number of sequential touches
Purpose: Detect slow access paths early (and with minimal effort)
Actual local response time often much less, seldom more
Figure 2-14. Very Quick Upper Bound Estimate (VQUBE)
Notes:
There are many assumptions behind this simple formula:
- A random disk I/O is assumed to take 10ms. This implies moderate disk load (less than 35%). 10ms may be very pessimistic when the random touches are not really random but skip sequential, or if there are many disk cache hits.
- CPU time per touch with sequential processing is assumed to be 0.02ms. This requires a z990 processor (more than 400 MIPS per processor). The CPU time per row is much less than 0.02ms when many rows are touched to find a qualifying row.
- Queuing times, including lock waits, are supposed to be insignificant.
CPU time estimates, important for capacity planning, will be discussed in unit 7.
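The formula is simple enough to keep as a one-line helper. The sketch below is illustrative only: the function name is hypothetical, the default unit costs are the 10ms and 0.02ms quoted on the visual, and the sample touch counts are made up.

```python
def vqube_ms(tr: int, ts: int, random_ms: float = 10.0, sequential_ms: float = 0.02) -> float:
    """Very quick upper bound estimate: LRT = TR x 10ms + TS x 0.02ms (in milliseconds)."""
    return tr * random_ms + ts * sequential_ms

# Hypothetical access path: 2 random touches and 200 sequential touches.
example_lrt_ms = vqube_ms(tr=2, ts=200)
print(f"LRT upper bound: {example_lrt_ms:.0f}ms")
```

Keeping the unit costs as parameters makes it easy to redo an estimate for a site where the assumptions listed above do not hold.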
Sequential Prefetch
- Read many (typically 32) pages at a time
- I/O time per page less than 1ms per 4K page
- I/O time overlapped with CPU time
[Figure: timeline with an I/O row and a CPU row; each circled number (1, 2, 3) = 32 pages]
Figure 2-15. Sequential Prefetch
Notes:
The circled 1, 2 and 3 represent a set of 32 pages each. When the first set of 32 pages is in the buffer pool, the program starts to process the rows on these pages. Meanwhile, DB2 is reading (prefetching) the next 32 pages from the disk subsystem. If CPU processing is faster than I/O, the program has to wait for the prefetch to complete before starting to process set 2.
Recommended Mental Image
Index (LNAME, FNAME); each index row also holds a RID:
LNAME      FNAME
ANDERSEN   HANS
ANDERSEN   NILS
ANDERSSON  HAKAN
ANDERSSON  HAKAN
ANDERSSON  MARIA
ANDERSSON  META
ANDERSSON  TAPIO
ANDERSSON  VILLE
ANTTILA    KALLE
SELECT... FROM... WHERE LNAME = 'ANDERSSON' AND FNAME LIKE 'M%'
INDEX: MC = 2, TR = 1, TS = 2
Figure 2-16. Recommended Mental Image
Notes:
When counting index touches, you should remember these assumptions:
- Ignore nonleaf pages; they are supposed to stay in the buffer pool.
- Assume that DB2 goes directly to the first qualifying index row with matching index scan; the time for the search in a leaf page is insignificant.
- Assume that the index rows are in key sequence; leaf page splits are ignored.
- Assume N index rows when N pointers relate to one key value (see HAKAN ANDERSSON on the visual).
- 'Not found' is one index touch.
The pointers (RID, Record ID) point to a table row. They consist of two parts: page number (three or four bytes) and row number within the page (one byte).
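The counting assumptions can be tried out on the index rows from the visual. The following Python sketch is a mental-image aid, not a model of DB2 internals: the function name is hypothetical, RIDs are omitted, and it assumes that the first index row past the matching range counts as a touch, which is how the visual arrives at TS = 2.

```python
from bisect import bisect_left

# Index rows in key sequence (LNAME, FNAME), as on the visual.
INDEX_ROWS = [
    ("ANDERSEN", "HANS"), ("ANDERSEN", "NILS"),
    ("ANDERSSON", "HAKAN"), ("ANDERSSON", "HAKAN"),
    ("ANDERSSON", "MARIA"), ("ANDERSSON", "META"),
    ("ANDERSSON", "TAPIO"), ("ANDERSSON", "VILLE"),
    ("ANTTILA", "KALLE"),
]

def count_index_touches(rows, lname, fname_prefix):
    """Return (TR, TS) for LNAME = lname AND FNAME LIKE 'fname_prefix%' with MC = 2."""
    # DB2 is assumed to go directly to the first qualifying index row.
    position = bisect_left(rows, (lname, fname_prefix))
    touches = 0
    while position < len(rows):
        touches += 1  # every index row looked at is one touch
        row_lname, row_fname = rows[position]
        if row_lname != lname or not row_fname.startswith(fname_prefix):
            break  # the first row past the matching range ends the scan
        position += 1
    touches = max(touches, 1)  # 'not found' is still one index touch
    return 1, touches - 1  # the first touch is random, the rest are sequential

print(count_index_touches(INDEX_ROWS, "ANDERSSON", "M"))
```

For the statement on the visual the scan touches MARIA, META and TAPIO, which gives TR = 1 and TS = 2.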
Buffer Pool Hits
VQUBE: Only nonleaf index pages assumed to be in buffer pool when transaction starts
Need a less pessimistic estimate? Assume 0.02ms for cheap random touches
- if transaction touches same page several times
- if leaf or table page very popular
Example: 100 SELECTs touching table T (10 pages): TR = 10, (TR) = 90
Figure 2-17. Buffer Pool Hits
Notes:
The buffer pools, typically a few GB today, should reside in the real storage of the CPU. Roughly speaking, they contain the recently referenced index and table pages. A one-gigabyte buffer pool contains 250,000 4K pages. To be on the safe side (upper bound) and easy to use, the basic VQUBE assumes no buffer pool hits for leaf and table pages. In the two cases listed on the visual, this is very pessimistic. When a random touch finds a row in the buffer pool, the elapsed time is less than 0.02ms. These cheap random touches are represented by (TR). Table and index pages which are referenced at least once a minute tend to stay in the buffer pool. In the example on the visual, if table T is referenced very frequently (say, once a second), it is likely to stay in the buffer pool all day long: all touches to it are cheap. Otherwise, the first touch to each page will bring that page to the buffer pool, and it will stay in the buffer pool until the end of the transaction.
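The effect of cheap random touches on the estimate can be illustrated with the numbers from the visual. This is a sketch only: the function and variable names are hypothetical, and cheap touches are charged the 0.02ms upper value quoted on the visual.

```python
RANDOM_MS = 10.0  # random touch that needs a disk read
CHEAP_MS = 0.02   # random touch that finds its page in the buffer pool, written (TR)

def lrt_with_cheap_touches_ms(tr: int, cheap_tr: int) -> float:
    """Upper bound for the elapsed time of tr expensive and cheap_tr cheap random touches."""
    return tr * RANDOM_MS + cheap_tr * CHEAP_MS

# 100 SELECTs against a 10-page table.
basic_ms = lrt_with_cheap_touches_ms(tr=100, cheap_tr=0)    # basic VQUBE: no buffer pool hits
refined_ms = lrt_with_cheap_touches_ms(tr=10, cheap_tr=90)  # TR = 10, (TR) = 90

# Number of 4K pages in one gigabyte (the notes round this to 250,000).
pages_per_gb = 1024**3 // 4096
print(basic_ms, refined_ms, pages_per_gb)
```

Under these assumptions the refined estimate is roughly a tenth of the basic one, because only the first touch to each of the 10 pages is charged 10ms.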
2.2 Index Design - Part One

After completing this topic, you should be able to:
- Describe a recommended approach to implement a database from design through implementation, taking into consideration application implications to the performance of the database
- List performance components that contribute to the response time perceived by the application user
- Determine acceptable worst input and average response times for applications
- Identify potential solutions when applications are not achieving the response time requirements specified
- Given a database implementation and application requirement, determine whether the current database design is efficient enough for the applications
- Identify how DB2 for z/OS type 2 indexes can be exploited to improve performance
- Estimate disk space requirements for indexes
- Consider the impact of leaf page splits on access via an index
- Describe techniques that can be employed to minimize the requirement for DB2 to perform an index page split
- Identify index considerations with respect to foreign key definitions
DB Version 0
Table      Rows       Indexes
ORDER      1,000,000  P,C (ORDERNO)
ITEM       10,000     P,C (ITEMNO)
ORDERITEM  1,500,000  P (ORDERNO, ITEMNO); C (ITEMNO)
P = Primary index, C = Clustering index
Derived mechanically from data model except clustering
Figure 2-18. DB Version 0
Notes:
When you design a new database, the natural starting point is a data model. Good entities maximize the flexibility of the database; you should be able to add attributes and new entities without changing existing programs. Database version 0 can be derived from the data model without any application knowledge. Entities become tables; relations become foreign keys. Indexes are created for each primary key, alternate key and foreign key. A primary or alternate key index may also serve as a foreign key index, as index ORDERNO,ITEMNO on the visual. The only non-trivial decision at this stage is choosing the clustering for each table. Application knowledge helps: which should be faster, accessing the order items of an order or those relating to an item? Clustering is relatively easy to change, but an initial decision must be made before any estimating is possible: a random touch is 500 times more expensive than a sequential touch.
DB Version 1
Table      Rows       Indexes
ORDER      1,000,000  P,C (ORDERNO); index on ORDERDATE added
ITEM       10,000     P,C (ITEMNO)
ORDERITEM  1,500,000  P (ORDERNO, ITEMNO); C (ITEMNO)
Performance of transaction 1 satisfactory
Figure 2-19. DB Version 1
Notes:
When the specifications for the first program (PGM1) are fixed, DB version 0 should be evaluated: will PGM1 be fast enough with these indexes and these tables? If PGM1 needs all orders with a given orderdate, DB version 0 would imply 1,000,000 sequential touches (20s). An index with only ORDERDATE may or may not be sufficient. If there are up to 1000 orders per day, a nonclustered index scan means up to 1000 random touches to the table (10s). An index enabling index-only access is then required. By definition, DB version 1 is good enough for PGM1: you can write a program that satisfies the performance requirements.
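The three alternatives for PGM1 can be compared with a quick calculation. This is a sketch under the VQUBE assumptions: the helper name is hypothetical, the touch counts follow the notes above (1000 orders per day as the worst case), and the index-only figures use TR = 1, TS = M from the index-only visual.

```python
def vqube_seconds(tr: int, ts: int) -> float:
    """LRT = TR x 10ms + TS x 0.02ms, converted to seconds."""
    return (tr * 10.0 + ts * 0.02) / 1000.0

# DB version 0: no suitable index, 1,000,000 sequential touches.
scan_s = vqube_seconds(tr=0, ts=1_000_000)
# Index with only ORDERDATE: 1 TR + 1000 TS in the index, up to 1000 TR in the table.
nonclustered_s = vqube_seconds(tr=1 + 1000, ts=1000)
# Index enabling index-only access: 1 TR + 1000 TS, no table touches.
index_only_s = vqube_seconds(tr=1, ts=1000)

print(f"{scan_s:.2f}s {nonclustered_s:.2f}s {index_only_s:.2f}s")
```

The random table touches dominate the second alternative, which is why removing them with index-only access has such a large effect.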
Recommended Approach
[Figure: timeline] DATA MODEL -> DB VERSION 0 -> SPEC1 / DB V1 / PGM 1 -> SPEC2 / DB V2 / PGM 2 -> SPEC3 / DB V3 / PGM 3 -> SPEC4 / DB V4 / PGM 4 -> PRODUCTION
At each step: E = Estimate performance (VQUBE), C = Check performance (EXPLAIN, traces)
Figure 2-20. Recommended Approach
Notes:
The performance of the next program is estimated with DB version 1 and so on. If all programs (transactions as well as batch) are estimated correctly, then the indexes and the tables enable good performance from the first production day. To detect inefficient programming and optimizer-related problems early, the access paths should be checked (EXPLAIN, measurements with accounting traces) as soon as realistic test tables are available.
Components of Response Time
RESPONSE TIME
- LINE
  - TRANSFER
  - WAIT
- LOCAL RESPONSE TIME
  - DISK I/O
    - TABLE and INDEX
    - OTHERS (LOGGING,...)
  - CPU
    - SERVICE
    - QUEUING
  - OTHER WAITS (LOCKING,...)
Figure 2-21. Components of Response Time
Notes:
In this course we limit ourselves to the local response time (LRT). Line time can be significant, even today, if each SQL call results in an interaction between the client and the server. As processors get faster while the time for a random disk I/O remains roughly the same from one year to another (it is hard to speed up disk rotation or arm movement), disk I/O time tends to be the biggest component. The table and index I/O time in the diagram means synchronous reads and the non-overlapped part of asynchronous reads (wait for prefetch). It includes volume and drive queuing. Thanks to large real storage, other I/Os are normally insignificant today. Program and package load should happen only when the system is started or after maintenance. The synchronous log writes are normally very fast (less than 10ms per commit point). All other I/Os are ignored in VQUBE. CPU (service) time may be the dominant component if processing is sequential or if most of the needed pages are in the buffer pools or disk caches.
CPU queuing time tends to be insignificant today for high-priority transactions. It is ignored in VQUBE. Waiting for locks is the most common contributor to OTHER WAITS, ignored in VQUBE. VQUBE predicts local response time to be up to TR x 10ms + TS x 0.02ms. A rough upper bound estimate for the SQL-related CPU (service) time is (TR+TS) x 0.02ms. The components of the response time are important when monitoring performance. If the measured time is more than VQUBE, the difference may be due to one of the factors ignored in VQUBE. Basically, the application developers should ensure that all programs have an acceptable local response time according to VQUBE; the DB2 specialists should ensure that measured local response times do not significantly exceed the VQUBE local response times.
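The two upper bounds mentioned above can be computed side by side. The sketch below is illustrative: the function names and the sample touch counts are hypothetical; only the two formulas come from the notes.

```python
def vqube_lrt_ms(tr: int, ts: int) -> float:
    """Local response time upper bound: TR x 10ms + TS x 0.02ms."""
    return tr * 10.0 + ts * 0.02

def sql_cpu_upper_bound_ms(tr: int, ts: int) -> float:
    """Rough upper bound for SQL-related CPU (service) time: (TR + TS) x 0.02ms."""
    return (tr + ts) * 0.02

# Hypothetical transaction with 100 random and 5000 sequential touches.
lrt_ms = vqube_lrt_ms(tr=100, ts=5000)
cpu_ms = sql_cpu_upper_bound_ms(tr=100, ts=5000)
print(f"LRT <= {lrt_ms:.0f}ms, of which CPU <= {cpu_ms:.0f}ms")
```

Comparing a measured time with both figures gives a first hint whether an overrun comes from I/O, CPU, or one of the components VQUBE ignores.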
Alarm Limits
Alarm limits:
Workload                  Average input  Worst input
Operational transactions  0.5s           5s           (local response time)
Batch                     Commit interval: 5s
Data warehouse queries    -              -
ESTIMATE (VQUBE): either value exceeded?
- YES: see next page
- NO: OK
Figure 2-22. Alarm Limits
Notes:
Two alarm limits should be used to define satisfactory performance. The visual shows typical alarm limits for CICS and IMS transactions. The five-second limit does not relate to the unluckiest transaction with a lot of queuing; it relates to the average response with the worst input. The estimate for the worst input is the most important, but the users would not be happy if the average response time of a transaction type was three seconds. Therefore, the VQUBE should be done for the average input as well. It is difficult to define any alarm limits for data warehouse (ad-hoc) queries and batch jobs. For every batch job, however, the elapsed time between two commit points should be estimated. If the worst estimate exceeds five seconds, lock durations should be analyzed.
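The check against the two alarm limits is easy to automate once the VQUBE estimates exist. This is a minimal sketch: the function name is hypothetical and the limits are the typical values shown on the visual, which a site may well set differently.

```python
AVERAGE_INPUT_LIMIT_S = 0.5  # typical alarm limit for the average input
WORST_INPUT_LIMIT_S = 5.0    # typical alarm limit for the worst input

def alarm_limit_exceeded(average_lrt_s: float, worst_lrt_s: float) -> bool:
    """True if either VQUBE estimate exceeds its alarm limit."""
    return average_lrt_s > AVERAGE_INPUT_LIMIT_S or worst_lrt_s > WORST_INPUT_LIMIT_S

# A transaction averaging three seconds raises the alarm even if its worst case is under 5s.
print(alarm_limit_exceeded(average_lrt_s=3.0, worst_lrt_s=4.0))
print(alarm_limit_exceeded(average_lrt_s=0.2, worst_lrt_s=4.0))
```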
Alarm Limit Exceeded
If estimate above limit:
- Improve indexing
- Improve SQL statements
- Denormalize tables
- Reduce lock durations
- Negotiate with users
Figure 2-23. Alarm Limit Exceeded
Notes:
Index improvement is the most common medicine. With triggers, denormalizing tables no longer poses an integrity risk; it is a performance tradeoff, just like adding an index. An example of denormalization is adding ITEMNAME to the ORDERITEM table. When ITEMNAME is updated in the ITEM table, a trigger would update the related rows in the ORDERITEM table. Users may accept a different output sequence or drop a total field when they see the difference in response time.
Case 1 - Primary Key =
Search criteria: ORDERNO =, ITEMNO =
Columns selected: UNIT_PRICE, QUANTORD
EXPECTED RESULT = 1 ROW
ORDERITEM: 1,500,000 rows
Indexes: X1 (P,C): ORDERNO, ITEMNO; X2 (U): ITEMNO, ORDERNO, BACKORDER
Worksheet: MC, INDEX (TR, TS), TABLE (TR, TS), LRT
Figure 2-24. Case 1 - Primary Key=
Notes:
This is not DB version 0; two columns have already been added to the foreign key index, clustering has been changed and ITEMNAME has been added to the ORDERITEM table. Five new transactions are now specified. Is the current database efficient enough for these? If not, what would you change?
Case 2 - Matching Clustered Index Scan

Search criteria: ORDERNO =
Sequence by ITEMNO
Columns selected: ORDERNO, ITEMNO, QUANTORD
EXPECTED RESULT = 100 ROWS (max)
ORDERITEM: 1,500,000 rows
Indexes: X1 (P,C): ORDERNO, ITEMNO; X2 (U): ITEMNO, ORDERNO, BACKORDER
MC X1 2
INDEX TR TS 1 -
TABLE TR TS 1 -
LRT 20ms
Figure 2-25. Case 2 - Matching Clustered Index Scan
Notes:
Case 3 - Matching Nonclustered Index Scan

Search criteria: ITEMNO =
Columns selected: ORDERNO, UNIT_PRICE, QUANTORD
EXPECTED RESULT = 1000 ROWS (max)
ORDERITEM: 1,500,000 rows
Indexes: X1 (P,C): ORDERNO, ITEMNO; X2: ITEMNO, ORDERNO, BACKORDER
Worksheet: MC, INDEX (TR, TS), TABLE (TR, TS), LRT
Figure 2-26. Case 3 - Matching Nonclustered Index Scan
Notes:
Case 4 - Nonmatching Index Scan

Search criteria: BACKORDER = 1
Sequence by ITEMNO
Columns selected: ITEMNO, ORDERNO, QUANTSHIP
ORDERITEM: 1,500,000 rows
Indexes: X1 (P,C): ORDERNO, ITEMNO; X2
Worksheet: MC, INDEX (TR, TS), TABLE (TR, TS), LRT
Figure 2-27. Case 4 - Nonmatching Index Scan
Notes:
BACKORDER has two possible values: 0=normal, 1=delivery problem.
Case 5 - Table Scan

Search criteria: ITEMNAME LIKE 'ABC%'
Columns selected: ITEMNO, QUANTORD
ORDERITEM: 1,500,000 rows
Indexes: X1 (P,C): ORDERNO, ITEMNO; X2
Worksheet: MC, INDEX (TR, TS), TABLE (TR, TS), LRT
Figure 2-28. Case 5 - Table Scan
Notes:
Column ITEMNAME has already been added to ORDERITEM table to make another transaction faster.
DB2 for z/OS Index
- Max 64 columns and 2000 bytes per key (prior to Version 8: 254 bytes)
- No nonkey index columns
- No index entry suppression
- Points to one table
- ASC/DESC by column
- DPSI (data partitioned secondary index)
  - One TR per partition if partitioning key not used as search criteria
  - ORDER BY / GROUP BY always results in a sort
Figure 2-29. DB2 for z/OS Index
Notes:
Our discussion so far has been fairly product-independent. Let us now review some specifics of the current DB2 for z/OS implementation, the type 2 index.
The second bullet means that all columns listed in CREATE INDEX make up the key of the index and determine the location in the sequence chain. When any of the indexed columns is updated in the table, DB2 first removes the old index row and then inserts the new index row at the position determined by the new key value. There is no facility like DDATA in an IMS database.
The third bullet also points out a difference compared to IMS. There is no sparse indexing in DB2. In case 4 we would have liked to create an index which has rows only for the exceptions (BACKORDER=1). Such an index would be smaller and cheaper to maintain. With triggers you can now build an index-like table which has one row for each exception.
DPSI (data partitioned secondary index) is a special index type in DB2 for z/OS introduced in Version 8. A DPSI can be defined only on a partitioned table space. A DPSI is divided into partitions (the same number of partitions as the underlying partitioned table space). Each DPSI partition contains all key values and RIDs of the corresponding table partition. The index key sequence is maintained only within each partition. The same key value could appear in many DPSI partitions.
Let us assume that the ORDER table is partitioned by ORDERNO. If a DPSI is defined on CUSTNO, a SELECT looking for all orders with CUSTNO = 17 will have to access all partitions of the DPSI, as CUSTNO 17 could appear in each partition. This means one TR for each DPSI partition (instead of only one TR if the index had not been a DPSI). This could make a big difference in local response time if there is a high number of partitions. If table ORDER has 100 partitions, a non-DPSI index on CUSTNO would give 1 TR and 100 TS (assuming 100 qualifying rows), LRT = 12ms. A DPSI index on CUSTNO would give 100 TR and 100 TS, LRT = 1s.
Touching all DPSI partitions can be avoided only if the partitioning key is also referenced in the WHERE clause and if there is no host variable in its predicate (or REOPT(ALWAYS) is specified at BIND time), as the optimizer is then able to find out which partitions contain qualifying rows.
Another problem with DPSI is that ORDER BY or GROUP BY always results in a sort, as the key sequence is no longer correct over all DPSI partitions.
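The DPSI comparison above is a direct application of the VQUBE formula. A small Python sketch (the helper and variable names are hypothetical; the touch counts are those given in the notes):

```python
def vqube_ms(tr: int, ts: int) -> float:
    """LRT = TR x 10ms + TS x 0.02ms, in milliseconds."""
    return tr * 10.0 + ts * 0.02

PARTITIONS = 100       # partitions of table ORDER
QUALIFYING_ROWS = 100  # index rows with the requested CUSTNO

# Non-DPSI index on CUSTNO: one random touch, then sequential touches.
non_dpsi_ms = vqube_ms(tr=1, ts=QUALIFYING_ROWS)
# DPSI on CUSTNO: one random touch per DPSI partition.
dpsi_ms = vqube_ms(tr=PARTITIONS, ts=QUALIFYING_ROWS)

print(f"non-DPSI: {non_dpsi_ms:.0f}ms, DPSI: {dpsi_ms:.0f}ms")
```

The index touches alone differ by a factor of more than 80 here, and the gap grows with the number of partitions.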
Disk Space Estimate for Indexes
1.5 x NROWS x (KEY+8) for unique indexes
- Nonunique indexes may be much smaller
- 1.5 includes free space and nonleaf pages
- DB2 does not compress indexes
Figure 2-30. Disk Space Estimate for Indexes
Notes:
NROWS is the number of table rows. KEY is the combined length of the columns copied to the index. Add 1 to KEY for each nullable column. For an index defined as PADDED, varchar columns are stored with their maximum length, without the length field. For an index defined as NON PADDED, the length of a varchar column is its average length plus 2 for the length field. The overhead (8) is the sum of 5 (RID length, could be 4 for smaller objects), 1 (the delete flag), and 2 (the pointer at the bottom of the index page referring to this key). Nonunique indexes can be very small because the key value is stored only once per leaf page. There is an additional 2 bytes per key to store the number of RIDs per key. The delete flag is repeated for each RID.
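The estimate for a unique index can be sketched as follows. The function names are hypothetical, and the example key (two 4-byte columns, neither nullable, on a 1,500,000-row table) is an assumption chosen for illustration, not a definition taken from this course.

```python
from typing import List

def key_length(column_lengths: List[int], nullable_columns: int = 0) -> int:
    """KEY: combined length of the indexed columns, plus 1 for each nullable column."""
    return sum(column_lengths) + nullable_columns

def unique_index_bytes(nrows: int, key: int) -> float:
    """Disk space estimate for a unique index: 1.5 x NROWS x (KEY + 8)."""
    return 1.5 * nrows * (key + 8)

# Assumed example: two 4-byte key columns, not nullable, 1,500,000 table rows.
example_key = key_length([4, 4])
example_bytes = unique_index_bytes(1_500_000, example_key)
print(f"KEY = {example_key}, about {example_bytes / 1_000_000:.0f} MB")
```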
DB2 indexes cannot be compressed, but DB2 will truncate keys (from right to left) in the nonleaf pages (including the root page) if the truncated value is still enough to define the range of keys in the pages in the next lower level. As the number of nonleaf pages in an index is roughly 1% of the number of leaf pages, key truncation does not significantly reduce the index size, but the number of index levels may decrease (CPU saving for index probes) and, due to the lower number of nonleaf pages, there may be a higher hit ratio in buffer pools and disk caches for the nonleaf pages.
Inserts
Leaf page split if page full and insert not to end

normally one extra TR
Figure 2-31. Inserts
Notes:
Leaf page split is fast, but index scans with many sequential index touches become slower after the splits, especially if the other halves go to the end of the index. The random touches caused by leaf page splits may be cheap if the leaf page containing the other half is close to the original leaf page. Then it may be already in the buffer pool because of sequential prefetch. The DBAs or semi-DBAs should define enough free space per leaf page (the recommendation is 2 x predicted random insert rate before the next reorg) to keep the number of leaf page splits low. Values as high as 50% are reasonable with current disks. In addition, every 4th or 8th leaf page should be left empty if leaf page splits will occur. An index could, with ever-increasing keys, need no free space or empty pages. Indexes with a hot spot (many inserts to the beginning or somewhere in the middle) need special treatment.
If nobody has time to tailor free space/reorg frequency per index, a standard setup (like 25% free, every 8th leaf page empty, a weekly reorg of every index with at least one page split) could be adequate, but better performance will be achieved if those familiar with the application (semi-DBAs?) classify the indexes according to insert pattern and frequency, and then monitor the leaf page splits.
Primary, Alternate and Foreign Key Indexes

Primary key index (P)
- Index key must be primary key
- If another column added, uniqueness of primary key not guaranteed
==> same for alternate keys
Foreign key index
- Index key should start with foreign key
- If foreign key = A,B:
  - Index A,B,C OK
  - Index A,C,B not used for DB2 referential integrity checking
Figure 2-32. Primary, Alternate and Foreign Key Indexes
Notes:
As mentioned, all columns added to a DB2 for z/OS index become part of a key. If the index is unique, DB2 only enforces the uniqueness of the whole index key. Therefore, no columns should be added to primary key indexes or alternate key indexes. Alternate key is one or more columns which must be unique per table. Example: In a customer table, CUSTNO may be the primary key, and social security number may be an alternate key. If DB2 referential integrity is used, slow deletes are often caused by a 'foreign key index' whose key does not start with the foreign key columns. DB2 will quietly use a table scan every time it needs to check if a row to be deleted has any dependants. These table scans are not shown in EXPLAIN.
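A design review can test the foreign key rule mechanically. The Python sketch below is a simplified illustration: the function name is hypothetical, and it assumes that the leading index columns must equal the foreign key columns in the same order, which is a stricter reading than the visual strictly states.

```python
from typing import Sequence

def index_starts_with_foreign_key(index_columns: Sequence[str],
                                  foreign_key_columns: Sequence[str]) -> bool:
    """True if the index key starts with the foreign key columns (same order assumed)."""
    n = len(foreign_key_columns)
    return list(index_columns[:n]) == list(foreign_key_columns)

# The two examples from the visual, with foreign key = A,B.
print(index_starts_with_foreign_key(["A", "B", "C"], ["A", "B"]))
print(index_starts_with_foreign_key(["A", "C", "B"], ["A", "B"]))
```

Because the table scans used for the dependent-row check are not shown in EXPLAIN, a check like this on the catalog definitions is one way to find candidate indexes before the slow deletes appear.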
Why Avoid Sorts?
DB2 sort is fast today
- VQUBE: 0.002ms per sorted row
A sort in access path forces DB2 to materialize whole result at OPEN CURSOR
- Extra touches if whole result not fetched
- Real storage needed to store materialized result (important for large sorts)
Figure 2-33. Why Avoid Sorts
Notes:
An ORDER BY will not cause a sort if DB2 uses an index in which the matching index rows are in the requested order (and if the optimizer decides not to use list prefetch; more about that later). DB2 sort time is ignored in VQUBE, because the CPU time is insignificant compared to the time required to retrieve the rows to be sorted. The formula shows the CPU time for a medium-size sort (say, 1,000,000 rows). Small sorts will consume less CPU time per row. Large sorts may need disk I/O. However, if the whole result is not fetched, it is very important that DB2 materializes the result FETCH by FETCH and not at OPEN CURSOR. This is why all sorts should be investigated in every EXPLAIN review. Furthermore, when estimating a SELECT with ORDER BY or GROUP BY, you should check whether DB2 needs to do a sort and count the touches accordingly.
When Will Touches Take Place?

CURSOR X:
SELECT CUSTNO, ORDERNO, ...
FROM ORDER
WHERE CUSTNO = :HV
ORDER BY CUSTNO, ORDERDATE
OPTIMIZE FOR 1 ROW
ROW SORT
- OPEN CURSOR: all qualifying rows read from ORDER to workfile and sorted (20,000 touches)
- FETCH: first result row read from workfile
NO ROW SORT
- OPEN CURSOR: no touches
- FETCH: first row read from ORDER via (CUSTNO, ORDERDATE) index (2 touches)
Figure 2-34. When Will Touches Take Place?
Notes:
This is one of the very important visuals in this course. You may want to pin it up in your cafeteria. OPTIMIZE FOR N ROWS tells the optimizer how many FETCHes the program typically issues; the optimizer then tries to find the fastest access path for that case. If OPTIMIZE FOR N ROWS is omitted, the optimizer assumes that all result rows are fetched.
Two questions about the cursor:
- What happens if CUSTNO is dropped from ORDER BY? Nothing. (Still no sort)
- What happens if OPTIMIZE FOR 1 ROW is omitted? That is dangerous. Without OPTIMIZE FOR 1 ROW, the optimizer cannot know that the program issues only one fetch. Then it looks for the fastest way to retrieve all orders with a given CUSTNO. It might choose table scan and sort, or matching index access with list prefetch (to be discussed later in this unit) and sort; both are very slow ways to find the oldest order per customer.
When Will Touches Take Place?...
Index (CUSTNO, ORDERDATE) on ORDER (1,000,000 rows):
CUSTNO  ORDERDATE
...     ...
76      4.1.2000
77      5.7.2000
77      6.7.2000
77      6.7.2000
...     ...
77      8.9.2000
78      2.7.2000
78      5.7.2000
...     ...
10,000 index rows for CUSTNO 77
Figure 2-35. When Will Touches Take Place?...
Notes:
The prerequisite for avoiding the sort is an index which corresponds to the ORDER BY. With the correct index shown on the visual, DB2 is able to create the one-row result (the oldest order of customer number 77) with two touches. In some cases, OPTIMIZE FOR N ROWS or FETCH FIRST N ROWS ONLY is needed to avoid an unwanted sort.
2.3 Lab 1: Improve Indexes For Customer / Order Application
Lab 1: Improve Indexes For Customer / Order Application

[Figure: program PGM takes CUSTZIP = XXXXX as input and lists each CUSTOMER line followed by that customer's ORDER lines]
What the Application Does
For the CUSTZIP=XXXXX entered by the user:
- It displays CUSTNO, CUSTLASTNAME and CUSTFIRSTNAME for all customers who live in that particular area
- This customer information is sorted by CUSTNO
- Customer information is displayed even if there are no orders
For each customer:
- It displays ORDERNO, TOTAL$_ITEMS and ORDERDATE for all orders of this customer
- This order information is sorted by ORDERDATE
Screen layout:
- One screen = 20 data lines
- Customer data = 1 line per customer
- Order data = 1 line per order
Figure 2-36. Lab 1: Improve Indexes For Customer / Order Application
Notes:
The user enters a ZIP code and wants to see all customers living in the area corresponding to this ZIP code. For those customers having orders, the orders should also be displayed. The customers should be displayed in CUSTNO sequence. The orders for one customer should be displayed in ORDERDATE sequence.
Lab 1: Current Indexes

Table  Rows       Pages   Indexes
CUST   50,000     1500    X1 (P,C): CUSTNO; X2: CUSTZIP
ORDER  1,000,000  20,000  X3 (P,C): ORDERNO; X4 (U): CUSTNO, ORDERNO; X5: TOTAL$_ITEMS
Customers per CUSTZIP: average = 50, max = 1000
Orders per customer: average = 20, max = 200
Worksheet: MC, INDEX (TR, TS), TABLE (TR, TS), LRT
Figure 2-37. Lab 1: Current Indexes
Notes:
The average number of orders per customer (20) is the ratio of the numbers of rows in the two tables (1,000,000 / 50,000). The other values are derived from the RUNSTATS statistics. This will be covered in unit 4.
Lab 1: Using One Cursor - Left Outer Join

DECLARE...
SELECT C.CUSTNO, CUSTLASTNAME, CUSTFIRSTNAME,
       ORDERNO, TOTAL$_ITEMS, ORDERDATE
FROM CUST C LEFT OUTER JOIN ORDER O
     ON C.CUSTNO = O.CUSTNO
WHERE CUSTZIP=
ORDER BY C.CUSTNO, ORDERDATE
OPTIMIZE FOR 20 ROWS

OPEN
FETCH
CLOSE

The left outer join will:
- Read all qualifying customers (50) for the CUSTZIP
- Sort by CUSTNO
- For each customer, read and sort all orders (50 x 20)

             MC   INDEX TR   INDEX TS   TABLE TR   TABLE TS   LRT
X2, CUST     1    1          50         50         -          0.5s
X4, ORDER    1    50         50 x 20    50 x 20    -          10.5s
Total                                                         11s

LRT = (1,101 x 10ms) + (1,050 x 0.02ms) = 11s

Figure 2-38. Lab 1: Using One Cursor - Left Outer Join
Notes:
As we need to access 2 tables and include customers with no orders, a left outer join may seem, at first, the best approach. But a VQUBE soon shows us this is not acceptable.
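The LRT arithmetic on the visual can be checked with a few lines of Python. This is a minimal sketch, assuming the VQUBE unit costs used in the figure (10ms per random touch, 0.02ms per sequential touch); the function and variable names are illustrative and are not part of the course material.

```python
# VQUBE sketch: LRT = TR x 10ms + TS x 0.02ms (unit costs taken from the figure).
TR_MS = 10.0   # cost of one random touch, in milliseconds
TS_MS = 0.02   # cost of one sequential touch, in milliseconds


def vqube_lrt_seconds(tr: int, ts: int) -> float:
    """Estimated local response time in seconds for tr random and ts sequential touches."""
    return (tr * TR_MS + ts * TS_MS) / 1000.0


# Averages from the Current Indexes visual (variable names are illustrative).
customers = 50
orders_per_customer = 20
orders = customers * orders_per_customer

# X2, CUST: index TR = 1, index TS = 50, table TR = 50
cust_step = vqube_lrt_seconds(tr=1 + customers, ts=customers)
# X4, ORDER: index TR = 50, index TS = 50 x 20, table TR = 50 x 20
order_step = vqube_lrt_seconds(tr=customers + orders, ts=orders)
total = cust_step + order_step

print(f"X2, CUST : {cust_step:.2f}s")
print(f"X4, ORDER: {order_step:.2f}s")
print(f"Total    : {total:.2f}s")
```

The table touches to ORDER dominate the estimate, which is why the lab asks for a better index on ORDER.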
Lab 1: Instructions
Assume the program reads data necessary to fill the first screen only
- Lab 4 will show how to get data for follow-on screens
Assume a clever program:
- One that does no unnecessary work
- Predicates are easy enough for the optimizer
- It uses OPTIMIZE FOR N ROWS or FETCH FIRST N ROWS ONLY
- It uses an appropriate number of cursors
What You Have to Do
1. Code the first cursor, that is, an SQL statement to read the required columns and rows in the correct sequence from CUST for a given CUSTZIP. Do the VQUBE and estimate the LRT. Decide what contributes most to the LRT
2. Code the second cursor, that is, an SQL statement to read the required columns and rows in the correct sequence from ORDER for a given customer. Do the VQUBE and estimate the LRT. Decide what contributes most to the LRT
3. Improve the index used by the first cursor to give an acceptable LRT
4. Improve the index used by the second cursor to give an acceptable LRT. Or add a new index if that would be better
Figure 2-39. Lab 1: Instructions
Notes:
The lab instructions are guidelines only. You are encouraged to approach the problem in your own way if you prefer.
Lab 1: Worksheet 1
CUST indexes: X1 (P,C) CUSTNO; X2 CUSTZIP
Customers per CUSTZIP: average = 50, max = 1000

Code first cursor here:
DECLARE ... SELECT ...

VQUBE worksheet columns: MC | INDEX TR, TS | TABLE TR, TS | LRT
Figure 2-40. Lab 1: Worksheet 1
Notes:
Lab 1: Worksheet 2
ORDER (1,000,000 rows, 20,000 pages) indexes: X3 (P,C) ORDERNO; X4 (U) CUSTNO, ORDERNO; X5 TOTAL$_ITEMS
Orders per customer: average = 20, max = 200

Code second cursor here:
DECLARE ... SELECT ...

VQUBE worksheet columns: MC | INDEX TR, TS | TABLE TR, TS | LRT
Notes:
Lab 1: Worksheet 3
For improved indexes
Tables: CUST; ORDER (1,000,000 rows, 20,000 pages)
VQUBE worksheet columns: MC | INDEX TR, TS | TABLE TR, TS | LRT
Notes:
2.4 Index Design - Part Two

After completing this topic, you should be able to:
- Describe the steps to take to make improvements to the database design, given that inadequate indexes exist in the database
- Identify the top three characteristics that you should try to achieve with your index definition
- Choose the best possible index for your application situation
- Consider the costs implied by implementing indexes in your database design
Inadequate Indexing Detected - What Next?

Traditional Approach
1. Find biggest component in VQUBE, reduce with better index
2. Redo VQUBE
3. Repeat if necessary
Recommended Approach
1. Design best possible index for slow SELECT, do VQUBE
2. Estimate index cost
3. Reduce index cost if necessary, redo VQUBE
Figure 2-43. Inadequate Indexing Detected - What Next?
Notes:
The first approach minimizes index costs, given response time requirements. The second approach minimizes response times, given index cost limits. The development in disk technology favors the second approach: the disks are denser than they used to be (and cheaper per megabyte), but not much faster. Now it is almost always a good tradeoff to spend disk space to reduce disk I/Os (and CPU time).
Start with Three Stars
* ** *** ** *
Figure 2-44. Start With Three Stars
Notes:
As we have seen, there are numerous alternative indexes for even simple SQL calls. The recommended approach starts from the best possible index; only if that index is too expensive do you look for the second-best alternative.
Three Stars, Perfect Index
Interesting index rows:
* As close to each other as possible: optimal matching columns
* In right sequence: no sort
* With enough columns: no table access (index-only)
Figure 2-45. Three Stars, Perfect Index
Notes:
You have already seen two three-star indexes: the final solution of the example in unit 1 and lab 1.
Three-Star Index
- Good starting point when slow SELECT found
- Sometimes not possible
  - Key length (more than 2000 bytes)
  - Number of columns (more than 64 columns)
  - Stars 1 and 2 in conflict
- In rare cases too expensive
Figure 2-46. Three-Star Index
Notes:
You should design a three-star index as a starting point whenever you detect a slow SELECT, by estimate or by measurement. For reasons listed on the visual, it is sometimes impossible to create a three-star index. With the procedure on the next pages, you can easily derive the best possible index, even in those cases.
Deriving Best Possible Index
Candidate 1
Interesting index rows close to each other
Candidate 2
No sort
Figure 2-47. Deriving Best Possible Index
Notes:
The best possible index is candidate 1 or candidate 2. In many cases, candidate 1 has three stars and there is no need to derive candidate 2.
Candidate 1
1 Start with columns in equal predicates and IS NULL predicates (indexable, Boolean term), in any order
2 Add the column in the most selective range predicate (indexable, Boolean term)
3 Add the remaining columns in the statement (start with ORDER BY or GROUP BY columns, excluding the columns from steps 1 and 2, to avoid the sort if possible)
Figure 2-48. Candidate 1
Notes:
1. The order of the columns with equal predicates does not matter as far as our SELECT is concerned, but there may be a difference in maintenance cost. For WHERE A= AND B= indexes A,B and B,A are equivalent. If you already have index A but no index B, you would obviously choose A,B to avoid a new index. WHERE A IS NULL is also an equal predicate.
2. The most selective range predicate is the one with the lowest filter factor when the user enters the worst input. Filter factor is the number of qualifying rows divided by the number of table rows. The filter factor of predicate SEX = 'F' is roughly 0.5 in table POPULATION.
3. The order of the last columns (not in ORDER BY or GROUP BY) is irrelevant to our SELECT. To reduce maintenance I/O, you should put the most volatile columns at the end.
If the number of columns exceeds 64, or if the key length exceeds 2000 bytes, all columns added only for index-only access should be removed from the index.
Let us apply this algorithm to the cursor in unit 1:
1. Start with CITY, FNAME because of X3
2. No range predicates
3. Add LNAME (in ORDER BY) and CUSTNO
This is, of course, the same index that the DBA designed with common sense.
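The three steps for candidate 1 can be written down as a small routine. This is a minimal sketch, assuming the caller has already classified the columns of the statement (equal predicates, range predicates ordered by selectivity, ORDER BY columns, other columns); the function name and its parameters are hypothetical.

```python
def candidate_1(equal_cols, range_cols, order_cols, other_cols):
    """Return index columns in candidate 1 order.

    range_cols must list the range-predicate columns, most selective first.
    """
    index_cols = []

    def add(cols):
        # Append each column once, keeping the first position it was given.
        for col in cols:
            if col not in index_cols:
                index_cols.append(col)

    add(equal_cols)       # step 1: equal and IS NULL predicates, any order
    add(range_cols[:1])   # step 2: the most selective range predicate
    add(order_cols)       # step 3: ORDER BY / GROUP BY columns first
    add(range_cols[1:])   # step 3: then the other columns of the statement
    add(other_cols)
    return index_cols


# Unit 1 cursor, classified as in the notes above.
unit1_index = candidate_1(
    equal_cols=["CITY", "FNAME"],
    range_cols=[],
    order_cols=["LNAME"],
    other_cols=["CUSTNO"],
)
print(", ".join(unit1_index))
```

The sketch does not check the limits on key length or number of columns; those still have to be verified by hand.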
Candidate 2
Derive if candidate 1 does not prevent sort
1 Start with columns in equal predicates and IS NULL predicates (indexable, Boolean term), in any order
2 Add columns from ORDER BY or GROUP BY, excluding the columns from step 1
3 Add the remaining columns in the statement, in any order
Figure 2-49. Candidate 2
Notes:
If the number of columns exceeds 64, or if the key length exceeds 2000 bytes, all columns added only for index-only access should be removed from the index.
In the unit 1 example with equal predicates (figure 1-4), candidate 1 (CITY,FNAME,LNAME,CUSTNO) gets three stars. It is the perfect index. Candidate 2 is not needed.
Candidate 1 for the LIKE cursor (figure 2-12) is FNAME,LNAME,CITY,CUSTNO. It gets only two stars because DB2 must do a sort for the ORDER BY. Candidate 2 is needed:
1. No equal predicates
2. Start with LNAME
3. Add CUSTNO,FNAME,CITY (CITY more volatile than FNAME)
Now use VQUBE to determine which candidate is faster. Assume the worst input.
Candidate 1 has one matching column but does not prevent sort. In the worst case, the filter factor of FNAME LIKE is 1% or slightly more. TS=10,000 and LRT=0.2s.
Candidate 2 has no matching columns, but it prevents sort. The worst case filter factor for CITY LIKE ... AND FNAME LIKE ... is 0. Then, TS=1,000,000 and LRT=20s.
Candidate 1 is the best possible index for the cursor with LIKEs.
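The candidate 2 steps can be sketched in the same way as candidate 1. The following is only an illustration, assuming the columns have been classified by hand; the function name and parameters are hypothetical, and the input reproduces the LIKE cursor discussed in the notes.

```python
def candidate_2(equal_cols, order_cols, other_cols):
    """Return index columns in candidate 2 order (no-sort variant)."""
    index_cols = []
    # Step 1: equal / IS NULL predicate columns.
    # Step 2: ORDER BY / GROUP BY columns not already present.
    # Step 3: all remaining columns, in any order.
    for col in list(equal_cols) + list(order_cols) + list(other_cols):
        if col not in index_cols:
            index_cols.append(col)
    return index_cols


# LIKE cursor: no equal predicates, ORDER BY LNAME, CITY placed last (most volatile).
like_cursor_index = candidate_2(
    equal_cols=[],
    order_cols=["LNAME"],
    other_cols=["CUSTNO", "FNAME", "CITY"],
)
print(", ".join(like_cursor_index))
```

The routine only orders the columns; whether candidate 1 or candidate 2 is faster still has to be decided with a VQUBE for the worst input.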
IN-List Predicates
Only one matching IN-list predicate
- Candidate 1: include the most selective IN-list column anywhere in step 1, all other IN-list columns in step 3
- To get matching for the second, third,... IN-list column, replace these IN-lists by multiple cursors or UNION ALL
If list prefetch, no matching IN-list predicates
- To get list prefetch and matching, replace IN-lists by multiple cursors or UNION ALL
Figure 2-50. IN-List Predicates
Notes:
The column in the most selective IN-list predicate may be in any position in the first column group of candidate 1. Use the normal guidelines. When the access path is index-only, there is no list prefetch.
Cost of Index
- Disk space
- Maintenance
  - SQL statements: INSERT, UPDATE, DELETE
  - Utilities: LOAD, REORG
- Locking no longer an issue with type 2 indexes
- If maintenance costs too high, first drop all the columns needed for index-only access
Figure 2-51. Cost of Index
Notes:
Our three-star index for equal predicates (CITY,FNAME,LNAME,CUSTNO) is obviously not too expensive. The disk space could be (figure 2-30): 1.5 x 1,000,000 x 100 bytes = 150MB. The 100 bytes are a guess about the length of the index key (sum of the lengths of the 4 columns, including NULL indicators, plus 8). The original nonunique index CITY was one order of magnitude smaller, so the increase in disk space requirement is almost 150MB. The columns added to X3 are not frequently updated. If we needed to save disk space, column CUSTNO could be dropped from the index. In this case, the index would be perhaps 10% smaller, and the response time would go up by 200ms (20 x 10ms), due to table access. This does not seem like a good tradeoff. We will discuss maintenance costs in the next visuals.
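The disk space estimate and the tradeoff for dropping CUSTNO can be reproduced as follows. This is a minimal sketch, assuming the factor 1.5 and the guessed 100-byte key length quoted in the notes; the function and variable names are illustrative.

```python
def index_size_mb(rows: int, key_bytes: int, space_factor: float = 1.5) -> float:
    """Rough index size in MB: space_factor x rows x key length (1MB = 1,000,000 bytes)."""
    return space_factor * rows * key_bytes / 1_000_000


three_star_mb = index_size_mb(rows=1_000_000, key_bytes=100)

# Dropping CUSTNO removes index-only access: 20 result rows need one table touch each.
extra_table_touches = 20
extra_ms_without_custno = extra_table_touches * 10   # 10ms per random touch

print(f"Three-star index: about {three_star_mb:.0f}MB")
print(f"Penalty without CUSTNO: {extra_ms_without_custno}ms")
```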
Add Columns to Existing Index
- INSERT and DELETE not affected
- UPDATE added column becomes slower
  - One TR (10ms) if index row stays on same leaf page
  - Otherwise two TRs (20ms)
Figure 2-52. Add Columns to Existing Index
Notes:
Writes (table and index pages) are almost always asynchronous in DB2. Only the synchronous reads contribute to response time. Some books warn about indexing volatile columns. Before deciding not to index a column, the update cost should be quantified with the numbers on the visual. How many milliseconds are added to the updating transaction? On the other hand, adding a volatile column to many indexes may slow down updates of the column noticeably. If a column is copied to ten indexes, updating the ten copies adds 100 to 200ms (depending on the position of the column in the index key) to the response time. It is assumed that nonleaf pages stay in buffer pool or disk cache.
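To quantify the update cost of a volatile column, the numbers on the visual can be multiplied out. The sketch below assumes 10ms per random touch and one or two touches per index copy, as stated above; the function name is hypothetical.

```python
TR_MS = 10  # one random touch, in milliseconds


def update_cost_range_ms(index_copies: int) -> tuple:
    """Return (best, worst) added response time when a column is updated.

    Best case: the index row stays on the same leaf page (one TR per index).
    Worst case: the index row moves to another leaf page (two TRs per index).
    """
    return index_copies * TR_MS, index_copies * 2 * TR_MS


best_ms, worst_ms = update_cost_range_ms(index_copies=10)
print(f"Column copied to 10 indexes: {best_ms} to {worst_ms}ms per update")
```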
Add New Index
- INSERT and DELETE will become slower
  - One TR (10ms)
- UPDATE column will become slower
  - One or two TRs (10 or 20ms)
Figure 2-53. Add New Index
Notes:
Adding an index is normally more expensive than adding columns to an existing index. The required performance of inserts and deletes may set a limit to the number of indexes a table tolerates, as the next example shows. In addition, you must consider the updates of columns in the new index. It is assumed that nonleaf pages stay in buffer pool or disk cache.
Too Many Indexes?
ORDERITEM indexes: X1 (P,C) ORDERNO, ITEMNO; X2 .... X10

Add 20 ORDERITEM rows (same ORDERNO)
                    Touches                 LRT
ORDERITEM           TR = 1, TS = 19         10ms
Index X1            TR = 1, TS = 19         10ms
Indexes X2 ...X10   TR = 9 x 20 = 180       1800ms
Total                                       LRT = 1.8s
Figure 2-54. Too Many Indexes?
Notes:
Transactions which insert or delete several rows may determine the acceptable number of indexes per table. An elegant but still fairly expensive solution is to create a special index buffer pool for these indexes, assuming that this is a critical transaction which must have shorter response times. If the average size of these indexes is 500MB, the dedicated index buffer pool should be 5GB. If you pay 10 euros/dollars per real storage MB per month (your rate may be lower), the monthly bill for this pool is 50,000 euros/dollars. But then, all index touches are cheap: 200 x 0.02ms = 4ms. A more economical but less effective solution is to add at least 5GB to disk cache. If the insert rate to this table is high, the leaf pages of indexes X1 to X10 would tend to stay in the disk cache, and the average I/O time per leaf page would be less than one millisecond.
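The estimate on the visual and the cost of the dedicated buffer pool can be recomputed as follows. This is a rough sketch using only the numbers given above (20 inserted rows, 10 indexes of 500MB each, 10 euros/dollars per MB per month); the variable names are illustrative.

```python
TR_MS = 10.0   # random touch, milliseconds
TS_MS = 0.02   # sequential touch, milliseconds

rows_added = 20
other_indexes = 9                      # X2 ... X10

# Table and clustering index X1: new rows are adjacent, so 1 TR + 19 TS each.
table_ms = 1 * TR_MS + (rows_added - 1) * TS_MS
x1_ms = 1 * TR_MS + (rows_added - 1) * TS_MS
# Every other index needs one random touch per inserted row.
others_ms = other_indexes * rows_added * TR_MS
lrt_ms = table_ms + x1_ms + others_ms

# Dedicated index buffer pool, sized and priced as in the notes.
pool_mb = 10 * 500
monthly_bill = pool_mb * 10
pooled_touches_ms = 10 * rows_added * TS_MS   # 200 cheap index touches

print(f"LRT with disk reads      : {lrt_ms / 1000:.1f}s")
print(f"Buffer pool size         : {pool_mb}MB, monthly bill {monthly_bill}")
print(f"Index touches from pool  : {pooled_touches_ms:.0f}ms")
```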
Change Row Order

Changes that affect row order:
- Clustering
- (A,B,C) -> (A,C,B)
- (A,B,C) -> (A,B,D,C)
DANGEROUS! A SELECT MAY BECOME SIGNIFICANTLY SLOWER
Figure 2-55. Change Row Order
Notes:
Any change affecting the physical order of index or table rows is risky.
Index Design Example

Measured local response time sometimes >5s with current indexes

DECLARE CB CURSOR FOR
SELECT ORDERNO, TOTAL
FROM ORDER
WHERE TOTAL > :TOTAL
ORDER BY ORDERNO
OPTIMIZE FOR 20 ROWS

OPEN CB
FETCH CB (MAX 20)
CLOSE CB

ORDER: 1,000,000 rows
Current indexes: ORDERNO (P,C); CUSTNO
Figure 2-56. Index Design Example
Notes:
A CICS program shows a measured local response time (using accounting traces) which is often more than 5 seconds, and most of the time is spent waiting for prefetch (asynchronous read time). Fortunately, the program is very simple: there is only one cursor. The number of executed SQL calls is never more than 22. The EXPLAIN shows a table scan for this cursor. Somebody should have noticed this already in a pre-production EXPLAIN review, but better late than never.
- Design best possible index
- VQUBE for three filter factors (TOTAL > :TOTAL)
  - Filter factor = 0
  - Filter factor = 0.1% (largest reasonable result)
  - Filter factor = 100%
- Make decision
Figure 2-57. Recommended Approach
Notes:
The user may enter anything in input field TOTAL. The assumed worst reasonable input is a value which produces 1000 result rows. The filter factor for predicate TOTAL > :TOTAL is then 0.1%.
VQUBEs for Candidates 1 and 2

Candidate 1: TOTAL, ORDERNO (MC=1, SORT=Y, INDEXONLY=Y)
FF      TR   TS          LRT
0%      1    0           0.01S
0.1%    1    1000        0.03S
100%    1    1,000,000   20S
TS = FF X 1,000,000

Candidate 2: ORDERNO, TOTAL (MC=0, SORT=N, INDEXONLY=Y)
FF      TR   TS          LRT
0%      1    1,000,000   20S
0.1%    1    20,000      0.41S
100%    1    19          0.01S
TS = 20 / FF (20 = NSCREENS), at most 1,000,000

FF = Filter Factor
Figure 2-58. VQUBE for Candidates 1 and 2
Notes:
Shading relates to touched index rows, not to qualifying index rows. If all orders are big, the whole candidate 1 must be scanned. If there are no big orders, the whole candidate 2 must be scanned. The number of touches to candidate 2 can be expressed as a function of filter factor: TS = 20/FF. There are a maximum of 20 lines per screen (OPTIMIZE FOR 20 ROWS), and it takes 1/FF touches to find one qualifying row. Assumption: no correlation between ORDERNO and TOTAL.
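The two VQUBE tables can be regenerated from the touch formulas given above. This is a minimal sketch, assuming 10ms per random touch, 0.02ms per sequential touch and no correlation between ORDERNO and TOTAL; the function names are illustrative. For FF = 100% the sketch counts 20 sequential touches for candidate 2, whereas the visual counts the first touch as the TR and shows 19; the difference is negligible.

```python
TABLE_ROWS = 1_000_000
ROWS_PER_SCREEN = 20      # OPTIMIZE FOR 20 ROWS


def lrt_seconds(tr: int, ts: int) -> float:
    """VQUBE estimate: 10ms per random touch, 0.02ms per sequential touch."""
    return (tr * 10 + ts * 0.02) / 1000


def candidate_1_ts(ff: float) -> int:
    """(TOTAL, ORDERNO): every qualifying index row is touched, then sorted."""
    return round(ff * TABLE_ROWS)


def candidate_2_ts(ff: float) -> int:
    """(ORDERNO, TOTAL): scan until 20 qualifying rows are found, 1/FF touches each."""
    if ff == 0:
        return TABLE_ROWS          # nothing qualifies: the whole index is scanned
    return min(TABLE_ROWS, round(ROWS_PER_SCREEN / ff))


for ff in (0.0, 0.001, 1.0):
    ts1, ts2 = candidate_1_ts(ff), candidate_2_ts(ff)
    print(f"FF={ff:7.1%}  candidate 1: TS={ts1:>9,} LRT={lrt_seconds(1, ts1):6.2f}s"
          f"  candidate 2: TS={ts2:>9,} LRT={lrt_seconds(1, ts2):6.2f}s")
```

The crossover between the two candidates depends only on the filter factor, which is why the decision hinges on what counts as the worst reasonable input.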
Decision
Choose candidate 1 (TOTAL, ORDERNO)
Do not open cursor if filter factor > 0.1%
Local response time less than 0.1s
Figure 2-59. Decision
Notes:
Candidate 1 is clearly better. It gives excellent performance with any reasonable input. If any input must be accepted, you could create both indexes and let DB2 choose the index every time according to input: dynamic SQL or BIND option REOPT(ALWAYS).
2.5 Lab 2: Poorly Performing Application Already In Production

Lab 2 is about monitoring and tuning an application already in production but giving poor performance. We use accounting trace information to identify the source of the problem and we explore various options for improving the DB2 implementation. This lab is based on a real situation.
Lab 2: Poorly Performing Application Already In Production

11 transactions with local response time > 5s during four hours
Longest local response time = 144s

LOCAL RESPONSE TIME 144s
- SQL 143s
  - LOCK WAIT 1s
  - CPU TIME 7s
  - SYNCHRONOUS READ 134s (Number: 5895, AVG: 22.7ms)
  - WAIT FOR PREFETCH 0s
  - OTHER 1s
- NON-SQL 1s
Figure 2-60. Lab 2: Poorly Performing Application Already In Production
Notes:
These numbers were observed at a large installation. Over a 4-hour monitoring period, 11 transactions were found to have an unacceptable local response time of more than 5s. The worst case was found to be 144s. For this worst case, accounting trace information was used to build up the bubble chart, which showed where time was being spent. The problem appeared to be the large amount of time (134s) spent on synchronous reads, caused by the large number of synchronous reads (5895).
Lab 2: Accounting Trace

                             TABLE          WORKFILE       INDEX
                             BUFFER POOL    BUFFER POOL    BUFFER POOL
GETPAGES                     15,319         11             117
SYNCHRONOUS READS            5888           0              7
PAGES READ ASYNCHRONOUSLY    0              0              160

SQL DML: SELECT 1, OPEN 1, FETCH 18, CLOSE 1, DML-ALL 21

CONCLUSIONS:
- Index: TR = 1, TS = 15,000 (5 sequential prefetches in index)
- Table: TR = 6000, (TR) = 9000, TS = 0
- Workfile: suggests sort; + SORT (1000... 2000 rows?)

Figure 2-61. Lab 2: Accounting Trace
Notes:
The accounting trace shows further values for the worst of these 11 transactions (local response time = 144s).
A GETPAGE request is issued internally by DB2 when it needs to read a table or index page. The GETPAGE request will be satisfied from a buffer pool or from the disk subsystem. Reads from the disk subsystem can be:
- Synchronous (random)
- Asynchronous (using skip sequential or sequential prefetch processing)
GETPAGE requests and disk subsystem reads are reported by buffer pool. This installation had 4 buffer pools defined:
- One for the catalog / directory (not shown on the visual)
- One for application tables
- One for application indexes
- One for the workfiles
Buffer pool hits are the difference between the number of DB2's GETPAGE requests and the number of pages read from the disk subsystem.
Accessing application tables
DB2 made approx. 15,000 GETPAGE requests for table pages.
- Approx. 6,000 of these requests resulted in synchronous reads to the disk subsystem.
- Approx. 9,000 (15,000 - 6,000) of these requests were cheap random touches satisfied from the buffer pool (buffer pool hits). The table seems to be an active one because for much of the time the requested page was already in the buffer pool.
But the application made only 21 SQL calls, resulting in a huge number (15,000) of random table touches. Very few SQL calls produce thousands of random table touches. It looks like the transaction needs a better index.
Accessing application indexes
DB2 made 117 GETPAGE requests for index pages. These caused some initial synchronous activity (could be nonleaf pages or leaf pages physically misplaced due to leaf page splits) followed by 5 prefetch requests bringing 160 (5 x 32) pages from the disk subsystem into the buffer pool.
Accessing workfiles
The 11 GETPAGEs to workfiles suggest a DB2 sort. We shall see that the SQL query contained an ORDER BY.
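The figures from the accounting trace can be cross-checked with a few lines of arithmetic. This is only a sketch using the counters shown on the visual; the dictionary layout and variable names are illustrative and do not correspond to any accounting trace format.

```python
# Counters per buffer pool, copied from the accounting trace visual.
getpages = {"table": 15_319, "workfile": 11, "index": 117}
sync_reads = {"table": 5_888, "workfile": 0, "index": 7}
async_pages = {"table": 0, "workfile": 0, "index": 160}

# Total synchronous reads and the time they account for.
total_sync_reads = sum(sync_reads.values())
avg_sync_read_ms = 22.7
sync_read_seconds = total_sync_reads * avg_sync_read_ms / 1000

# Buffer pool hits for the table pool: GETPAGEs minus pages read from disk.
table_hits = getpages["table"] - sync_reads["table"] - async_pages["table"]

# Random table touches per SQL call.
sql_calls = 21
table_touches_per_call = getpages["table"] / sql_calls

print(f"Synchronous reads: {total_sync_reads} -> {sync_read_seconds:.0f}s")
print(f"Table buffer pool hits: {table_hits}")
print(f"Table GETPAGEs per SQL call: {table_touches_per_call:.0f}")
```

The totals agree with the bubble chart: the synchronous reads alone explain about 134 of the 144 seconds.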
Lab 2: EXPLAIN Output - Part One

DBRM/PACK  STMT  TYP
DA         1457  P    MATCHING INDEX SCAN(2/4)-DATA PAGES
DA         1457  P    ADDITIONAL SORT FOR ORDER BY
DA         1565  P    MATCHING INDEX SCAN(2/4)-DATA PAGES
DA         1565  P    ADDITIONAL SORT FOR ORDER BY
=========================================================
STATEMENT NUMBER : 1457
DECLARE KURSOR1 CURSOR FOR
SELECT CUSTNO, TYPE, SUBTYPE, DATE1, BO, CUSTNAME, ELNO, ETNO, DATE2, STATUS
FROM RSTATUS
WHERE TYPE = :TYPE
  AND BO = :BO
  AND DATE1 < :DATE1A
  AND CUSTNAME >= :LO
  AND CUSTNAME <= :HI
  AND STATUS < 400
ORDER BY CUSTNAME, BO, TYPE, SUBTYPE
OPTIMIZE FOR 18 ROWS

Annotations on the visual: MC=2; Not index-only; Sort; 18 rows per screen
Figure 2-62. Lab 2: EXPLAIN Output - Part One
Notes:
The report is a DB2 PM batch EXPLAIN report. The EXPLAIN output shows that package DA has 2 SQL statements, namely, 1457 and 1565. The two cursors are fairly similar. 1457 produces the first screen. 1565 produces the second and subsequent screens, if any. In this lab we investigate only the first cursor.
KURSOR1 statement 1457
- The number of matching columns is the first number in the parentheses (2/4).
- There is a sort in the access path for ORDER BY.
- Data pages are accessed so the access is not index-only.
Lab 2: EXPLAIN Output - Part Two

INDEX: RSTATUS_BO   (Chosen Index)
STATSTIME: 1996-11-03-17.57.22
FULL KEY CARD: 992897   1ST KEY CARD: 363
PAGES: 7871   SPACE: 36000K   LEVELS: 3
INDEX TYPE: 2   PGSIZE: 4096   UNIQUE: NO   BFPOOL: BP1
ERRULE: NO   CLRULE: NO
CLUSTERRATIO: 48%   CLUSTERING: N   CLUSTERED: N
DB.NAME:   IXSPACE:

KEY NO.  COLUMN NAME  COL.TYPE  LNG  NULL  CARD.  ORDER  LOW2KEY   HIGH2KEY
1        BO           CHAR      5    NO    363    ASC.   C'00010'  C'98160'
2        TYPE         CHAR      2    NO    -1     ASC.   N/A       N/A
3        SUBTYPE      CHAR      2    NO    -1     ASC.   N/A       N/A
4*       CUSTNAME     CHAR      8    NO    -1     ASC.   X'..      X'..
MARKED (*) COLUMN HAS FIELD PROCEDURE: CFINSORT

TABLE: RSTATUS   (Table has 1,500,000 rows approx)
STATSTIME: 1996-11-03-17.57.22
ROWS: 1530103   COLUMNS: 24   ROWLENGTH: 135
% PAGES: 93   ACT.PAGES: 26769
DBASE ID: 490   TABLE ID: 15
TB STATUS: X   AUDITING: NONE
EDIT PROC.:   VALIDPROC.:   TABCREATOR.:

Figure 2-63. Lab 2: EXPLAIN Output - Part Two
Notes:
The table RSTATUS has only two indexes:
- The primary key index (not shown on this EXPLAIN output)
- Index RSTATUS_BO, chosen for this query
Lab 2: EXPLAIN Information Summary

SELECT CUSTNO, TYPE, SUBTYPE, DATE1, BO, CUSTNAME, ELNO, ETNO, DATE2, STATUS
FROM RSTATUS
WHERE TYPE = :TYPE
  AND BO = :BO
  AND DATE1 < :DATE1A
  AND CUSTNAME >= :LO
  AND CUSTNAME <= :HI
  AND STATUS < 400
ORDER BY CUSTNAME, BO, TYPE, SUBTYPE
OPTIMIZE FOR 18 ROWS

RSTATUS: 1,500,000 rows
Indexes: P,C; RSTATUS_BO (BO, TYPE, SUBTYPE, CUSTNAME)

RSTATUS_BO: 0 star
- MC=2 (should be 3)
- SORT=Y
- INDEXONLY=N
Figure 2-64. Lab 2: EXPLAIN Information Summary
Notes:
This visual summarizes the information from the two previous visuals. The number of rows in table RSTATUS (1,500,000) is the rounded value from EXPLAIN (1,530,103).
Lab 2: Initial Observations

1. Observation 1
Index RSTATUS_BO is:
BO,TYPE,SUBTYPE,CUSTNAME
ORDER BY is:
CUSTNAME,BO,TYPE,SUBTYPE
Index does not prevent SORT
2. Observation 2
SRs to table
Clusterratio 48%
No index-only access
3. Observation 3
No index support for DATE1 and STATUS
No index matching or index screening
4. Observation 4
No index matching support for CUSTNAME
Only index screening
Figure 2-65. Lab 2: Initial Observations
Notes:
We can now consolidate what the accounting trace and EXPLAIN information have told us. It is now clear that what we need is a better index.
Lab 2: Instructions
1 Design candidates 1 and 2
Assume following filter factors for worst input:
  STATUS < 400        FF = 2%
  CUSTNAME ......     FF = 100%   (optional input, often omitted)
  DATE1 < :DATE1A     FF = 100%
  BO = :BO            FF = 3%
  TYPE = :TYPE        FF = 75%
2 For the 2 candidates:
  - Using VQUBE, estimate local response time assuming above filter factors
  - Estimate costs (disk space, INSERT / UPDATE / DELETE overheads)
3 Assume table RSTATUS does not tolerate an additional index. Design the best affordable index
4 Which candidate would you choose if you did not know the filter factors?
Notes:
The worst input filter factors are given by RUNSTATS statistics (most frequently occurring values for BO and TYPE, additional statistics for STATUS), explained in unit 4, and by application knowledge. The transaction is looking for open applications of a certain type in a branch office. CUSTNAME is an optional input field. If the user does not enter any value, all rows qualify for this predicate and therefore FF = 100%. DATE1 < :DATE1A only filters out applications which arrived today, so the filter factor is close to 100%. STATUS is updated whenever an application is processed.
Lab 2: Design Candidate 1 Index Worksheet

1. Start with columns in equal predicates and IS NULL predicates (indexable, Boolean term), in any order
2. Add the column in the most selective range predicate (indexable, Boolean term)
3. Add the remaining columns in the statement (start with ORDER BY or GROUP BY columns, excluding the columns from steps 1 and 2, to avoid the sort if possible)
Figure 2-67. Lab 2: Design Candidate 1 Index Worksheet
Notes:

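Derive if candidate 1 does not prevent sort
1. Start with columns in equal predicates and IS NULL predicates (indexable, Boolean term), in any order
2. Add columns from ORDER BY or GROUP BY, excluding the columns from step 1
3. Add the remaining columns in the statement, in any order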
Notes:
2.6 Advanced Access Paths

After completing this topic, you should be able to:
- Describe the basic principles of the three kinds of prefetch
- Identify the implications of the three kinds of prefetch on index design
- Given a query that actually fetches a small number of rows from a large result set, identify two potential solutions to communicate this fact to the optimizer so that it can make a more informed decision
- Identify benefits and pitfalls that may occur with multiple index access
Asynchronous Read (Prefetch)
Prefetch
- SEQUENTIAL
  - Decided at BIND: EXPLAIN: PREFETCH=S
  - Decided at EXECUTE: 'dynamic prefetch', EXPLAIN: PREFETCH=D (only if expected by optimizer)
- SKIP SEQUENTIAL
  - 'list prefetch', EXPLAIN: PREFETCH=L
Figure 2-69. Asynchronous Read (Prefetch)
Notes:
Prefetch reduces I/O time per page and overlaps it with CPU time. It is useful to know the basic principles of the three kinds of prefetch when designing indexes.
1. If, at BIND time, DB2 notices that sequential prefetch is efficient for reading leaf or table pages, it turns on sequential prefetch, which is reported by EXPLAIN. Only the first page is read synchronously. After that, DB2 typically reads 32 pages with one I/O trying to stay ahead of the program. The time per 4K page is 0.15ms with current disks. That is why the cost per sequential touch is only 0.02ms in VQUBE.
2. If sequential prefetch is not turned on at BIND time, DB2 monitors the access pattern of each SQL statement to each page set (index or table). If the access is sequential or almost sequential, dynamic prefetch is turned on. Eight pages are read synchronously before checking the pattern, otherwise performance is the same as with classical sequential prefetch. Dynamic prefetch is reported by accounting trace (by buffer pool), and in EXPLAIN under certain conditions.
3. When the optimizer sees at bind time that skip sequential processing would be efficient, it decides to use list prefetch. This decision is reported by EXPLAIN. List prefetch is presented on the following pages.
List Prefetch
Faster nonclustered index access
- Read qualifying RIDs from index (using an index-only matching index scan)
- Sort RIDs by table page number
- Prefetch up to 32 table pages at a time
I/O time per page less than 10ms
[Timeline diagram: I/O blocks 1, 2, 3 (32 pages each) alongside CPU processing of blocks 1, 2; the gaps are marked WFP]
WFP = wait for prefetch

Figure 2-70. List Prefetch
Notes:
By sorting the pointers before accessing the table, list prefetch converts random access to skip sequential. If list prefetch reads every other page from a table, the average wait time per page may be 2ms. If list prefetch reads three pages from a large table, the average wait time per page may be 10ms, as with synchronous read. To be on the safe side, VQUBE assumes 10ms per random touch even with list prefetch. If you need a less pessimistic estimate, assume 1ms per table touch if more than 1% of table rows are read. An example: figure 1-4, biggest city, filter factor of CITY = :CITY 10%. With list prefetch (a very likely choice because SORT=Y), a realistic estimate for table touches is 100,000 x 1ms = 100s. This is why index CITY, LNAME (figure 1-5) will result in a longer response time with the worst input (biggest city, rare first name): SORT=N, no list prefetch, 100,000 x 10ms = 1000s. The optimizer's decisions are based on the average case. The wait time between the end of processing of block N and the availability for processing of block N+1 is called wait for prefetch (WFP) in this course.
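The two estimates in the example can be reproduced with the rule of thumb given above. This is a minimal sketch, assuming 1ms per table touch with list prefetch when more than 1% of the table rows are read and 10ms otherwise; the function name and its parameters are hypothetical.

```python
def table_touch_ms(fraction_of_rows_read: float, list_prefetch: bool) -> float:
    """Less pessimistic cost per table touch, following the rule of thumb in the notes."""
    if list_prefetch and fraction_of_rows_read > 0.01:
        return 1.0    # skip sequential read of a dense set of pages
    return 10.0       # random touch, the safe VQUBE assumption


table_touches = 100_000    # biggest city, figure 1-4
filter_factor = 0.10       # CITY = :CITY for the biggest city

with_list_prefetch_s = table_touches * table_touch_ms(filter_factor, True) / 1000
without_list_prefetch_s = table_touches * table_touch_ms(filter_factor, False) / 1000

print(f"With list prefetch   : {with_list_prefetch_s:.0f}s")
print(f"Without list prefetch: {without_list_prefetch_s:.0f}s")
```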
List Prefetch - Good News
CUST indexes: CUSTNO (P,C); CUSTZIP

SELECT CUSTNO, CUSTLASTNAME...
FROM CUST
WHERE CUSTZIP = :CUSTZIP
ORDER BY CUSTNO

+ Random touches to CUST become skip sequential
  I/O time per table page may be 5ms instead of 10ms
- RIDs must be sorted
  Very fast (VQUBE: 0.0001ms/RID)
Local response time significantly shorter with list prefetch
Figure 2-71. List Prefetch - Good News
Notes:
The CPU time for RID sort is insignificant. List prefetch may fail if DB2 finds a surprisingly large number of RIDs at execution time. DB2 will then change the access path to table scan. For example, 90% of the index rows may qualify when the most common value is moved to the host variable in WHERE COL = :hv. An index enabling index-only access may be a good solution in such a case.
List Prefetch - Bad News
CUST indexes: CUSTNO (P,C); CUSTZIP, CUSTNO

SELECT CUSTNO, CUSTLASTNAME...
FROM CUST
WHERE CUSTZIP = :CUSTZIP
ORDER BY CUSTNO

- List prefetch with ORDER BY results in row sort, which implies
  - result materialization at OPEN CURSOR
  - many unnecessary index and table touches if whole result not FETCHed
Local response time significantly longer with list prefetch
Figure 2-72. List Prefetch - Bad News
Notes:
Some transactions became slower when list prefetch was added to DB2 (Version 2 Release 2). To enable the optimizer to weigh the shorter I/O time against the number of I/Os, OPTIMIZE FOR N ROWS was implemented in the next release. FETCH FIRST N ROWS ONLY has the same effect on the optimizer as OPTIMIZE FOR N ROWS.
Solution: OPTIMIZE FOR N ROWS

- FETCH FIRST N ROWS ONLY has the same effect on the optimizer
- Optimizer finds fastest access path for N FETCHes (list prefetch / no list prefetch)
- If OPTIMIZE FOR N ROWS is omitted, optimizer assumes whole result FETCHed
Important points:
- OPTIMIZE FOR N ROWS does not always prevent list prefetch
- OPTIMIZE FOR N ROWS does not always prevent result materialization at OPEN CURSOR
Figure 2-73. Solution: OPTIMIZE FOR N ROWS
Notes:
OPTIMIZE FOR N ROWS affects the cost estimates of the optimizer. It is a good standard to add it to SELECT whenever the whole result is not FETCHed. The more the optimizer knows about the application, the more likely it is to choose the best access path.
IN-list Predicates and List Prefetch
- IN-list predicates are never matching predicates with list prefetch

SELECT CUSTNO, ...
FROM CUST
WHERE CUSTZIP IN (111,222)

[Diagram: index CUSTZIP on table CUST, shown twice. Without list prefetch: MC = 1, only the index slices for 111 and 222 are read. With list prefetch: MC = 0.]
Figure 2-74. IN-list Predicates and List Prefetch
Notes:
You have to replace the cursor on the visual with two cursors or UNION ALL (with equal predicates) if you want list prefetch. An easier and more effective solution is to add columns to the index to get index-only access.
Multiple Index Access

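WHERE CUSTNO BETWEEN 10000 AND 20000 AND CUSTZIP = 99000

[Diagram, steps 1 to 6: RIDs are collected from index CUSTNO and from index CUSTZIP, each RID list is sorted, the two lists are intersected (step 5), and table CUST is read with list prefetch (step 6)]

WHERE CUSTNO BETWEEN 10000 AND 20000 OR CUSTZIP = 99000 ===> step 5 : union instead of intersect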
Figure 2-75. Multiple Index Access
Notes:
Multiple index access is advanced list prefetch: the pointers are collected from several indexes or from several parts of the same index. In step 5 the pointer sets are compared to implement AND or OR. Compared to single index access, it may eliminate many table touches. Multiple index access may use the same index several times. For instance, WHERE CUSTNO < 100 OR CUSTNO > 20,000 could access the CUSTNO index twice, once for each predicate, and process the two RID lists as shown on the visual.
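The RID processing in step 5 can be pictured with ordinary set operations. The sketch below is purely illustrative: the RID values are made up, a RID is modelled as a (page number, slot) pair, and the code says nothing about how DB2 implements the RID pool internally.

```python
# Hypothetical RID lists (table page number, slot) collected from two indexes.
custno_rids = {(3, 1), (3, 2), (7, 4), (9, 1)}    # CUSTNO BETWEEN 10000 AND 20000
custzip_rids = {(3, 2), (5, 6), (9, 1)}           # CUSTZIP = 99000

# Step 5 for AND: intersect. Step 5 for OR: union.
and_rids = sorted(custno_rids & custzip_rids)
or_rids = sorted(custno_rids | custzip_rids)

# Step 6: list prefetch reads each qualifying table page once, in page order.
and_pages = sorted({page for page, _slot in and_rids})
or_pages = sorted({page for page, _slot in or_rids})

print("AND:", and_rids, "-> table pages", and_pages)
print("OR :", or_rids, "-> table pages", or_pages)
```

With AND, only rows that satisfy both predicates cause table touches, which is the saving compared to single index access.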
Pitfalls with Multiple Index Access
- No index-only
- Always list prefetch, so ORDER BY causes a sort
- IN-list predicate is not a matching predicate
Figure 2-76. Pitfalls with Multiple Index Access
Notes:
Multiple index access always results in table touches because the RIDs point to the table; DB2 cannot get back to the leaf pages.
One-Fetch Index Scan

SELECT MAX (ORDERNO) FROM ORDER
ORDERNO
ORDER
1,000,000 rows
EXPLAIN: ACCESSTYPE = I1 TR = 1 TS = 0
Figure 2-77. One-Fetch Index Scan
Notes:
Certain restrictions apply. One-fetch index scan is possible only if all of the following conditions are true:
- There is only one table in the query
- There is only one column function (either MIN or MAX)
- Either no predicate or all predicates are matching predicates for the index
- There is no GROUP BY
- Column functions are on
  - The first index column if there are no predicates
  - The last matching column of the index if the last matching predicate is a range predicate
  - Next index column (after the last matching column) if all matching predicates are equal predicates
The following query is OK (I1) with index C1,C2,C3: SELECT MAX(C2) FROM T WHERE C1=5 AND C2 BETWEEN 5 AND 10
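Why one touch is enough for this query can be illustrated with a sorted list standing in for the index. The sketch below is an analogy, not DB2's implementation: the entries are invented, and Python's bisect module plays the role of the index probe.

```python
import bisect

# Hypothetical index on (C1, C2, C3); entries are kept in key order.
index_entries = sorted([
    (4, 9, 1), (5, 3, 2), (5, 6, 1), (5, 9, 7), (5, 12, 4), (6, 1, 1),
])

def max_c2(entries, c1, c2_low, c2_high):
    """MAX(C2) for C1 = c1 AND C2 BETWEEN c2_low AND c2_high with one probe."""
    # Position just past the last entry whose (C1, C2) is <= (c1, c2_high).
    pos = bisect.bisect_right(entries, (c1, c2_high, float("inf")))
    if pos == 0:
        return None
    # The entry just before that position is the only candidate to inspect.
    found_c1, found_c2, _ = entries[pos - 1]
    if found_c1 == c1 and c2_low <= found_c2 <= c2_high:
        return found_c2
    return None

print(max_c2(index_entries, 5, 5, 10))
```

Because the entries are in key order, the highest qualifying C2 sits at the upper end of the matching range, so no scan of the other entries is needed.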
2.7 Lab 3: Multiple Index Access
Lab 3: Multiple Index Access

SELECT ORDERNO, CUSTNO, TOTAL$_ITEMS
FROM ORDER
WHERE ORDERDATE = '7.1.2004' AND TOTAL$_ITEMS > 100
Assumptions:
- 1% of orders with ORDERDATE = '7.1.2004'
- 5% of orders with TOTAL$_ITEMS > 100
- 0.05% of orders with ORDERDATE = '7.1.2004' AND TOTAL$_ITEMS > 100
Figure 2-78. Lab 3: Multiple Index Access
Lab 3: Current Indexes
X1 (P,C): ORDERNO
X2 (U): CUSTNO, ORDERNO
X3: ORDERDATE
X4: TOTAL$_ITEMS
Table ORDER: 1,000,000 rows
Figure 2-79. Lab 3: Current Indexes
Lab 3: Instructions
1. Do a VQUBE for multiple index access with current indexes
   Multiple index access consists of:
   - Separate index-only accesses to indexes X3 and X4
     - Only RIDs are extracted
     - VQUBE ignores time for RID list sorts and intersection
   - Access to table ORDER
     - Uses list prefetch
     - VQUBE assumes a very pessimistic 10ms per TR
2. You can achieve single index access by adding either TOTAL$_ITEMS to X3 or ORDERDATE to X4. Which is better? Do the VQUBE for your preferred case
3. Design a 3 star index using the candidate 1 procedure and do the VQUBE for this case
Notes:
With current implementation:
- No index with all columns from WHERE clause
- Single matching index scan with MC=2 not possible
- Single matching index scan with MC=1 using X3 would give:
  - TS=10,000 on X3
  - TR=10,000 on table ORDER
  - Local response time = 100s
- Better access path is multiple index access using X3 and X4
  - Avoids most of the 10,000 TRs to table
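The arithmetic behind these figures can be written down as a short Python sketch. It rests on assumptions: 10ms per TR is the value given in the lab instructions, while 0.02ms per TS is not stated in this lab and is inferred from other estimates in this course (for example TR=1, TS=19 giving 10.4ms). The function and variable names are illustrative only.

```python
# VQUBE sketch. MS_PER_TR comes from the lab text; MS_PER_TS is an assumption
# inferred from estimates quoted elsewhere in the course.
MS_PER_TR = 10.0
MS_PER_TS = 0.02

def vqube_seconds(tr, ts):
    """Estimated local response time in seconds for tr random and ts sequential touches."""
    return (tr * MS_PER_TR + ts * MS_PER_TS) / 1000.0

ROWS = 1_000_000
FF_ORDERDATE = 0.01    # 1% of orders
FF_TOTAL = 0.05        # 5% of orders
FF_BOTH = 0.0005       # 0.05% of orders

# Single matching index scan on X3 (MC=1): every qualifying index entry
# leads to a random touch on table ORDER.
ts_x3 = round(ROWS * FF_ORDERDATE)
single = vqube_seconds(tr=ts_x3, ts=ts_x3)

# Multiple index access: sequential touches on X3 and X4, then random table
# touches only for rows that satisfy both predicates. RID sorts and the
# intersection are ignored, as the lab instructs.
ts_x4 = round(ROWS * FF_TOTAL)
tr_table = round(ROWS * FF_BOTH)
multiple = vqube_seconds(tr=tr_table, ts=ts_x3 + ts_x4)

print(f"single index scan on X3: {single:.1f}s")
print(f"multiple index access:   {multiple:.1f}s")
```

Under these assumptions the table touches dominate both estimates, which is why cutting them from 10,000 to 500 matters far more than the extra 50,000 sequential touches on X4.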
Unit Summary
Key points:
- If predicted or measured local response time too long, find slow SELECTs and design best possible indexes for them
- Number of indexes per table depends only on required INSERT/DELETE/UPDATE performance
- Indexes enabling index-only access are almost always good for performance
Unit 3. Towards Better Tables

This unit is about the performance tradeoffs in table design.

After completing this unit, you should be able to:
- Evaluate clustering alternatives
- Consider the tradeoffs in two kinds of denormalization
- Describe why tables for optional attributes are often not good for performance
Unit Objectives
After completing this unit, you should be able to:
- Evaluate clustering alternatives
- Consider the tradeoffs in two kinds of denormalization
- Describe why tables for optional attributes are often not good for performance
Performance Issues in Table Design
One or many tables? Clustering Denormalization
[Visual: tables A and B, both with key CUSTNO (C), where B holds the optional B attributes, compared with a single table A + B with optional columns]
Figure 3-2. Performance Issues in Table Design
Notes:
There are several ways to model the reality. Consequently, there are several analytically correct table designs for an application. Two proposals for table design may be equally flexible, but one may perform better than the other. The difference between table designs can be determined only by estimates. Critical programs should be estimated early, because many table changes are difficult to implement after programming has started. Generally, of course, a design with fewer tables and rows performs better, other things being equal. The number of rows relates to the number of touches, and the number of tables relates to the number of random touches. The decision on the visual is an important one. If B is an optional attribute of the customer entity (the relation between entities A and B, if B is seen as an entity, would be 1 to C, one to conditional), should we create a table for B, with CUSTNO as the primary key? The design with two tables may save some disk space (probably not much if the tables are compressed), while the design with one table is faster.
With IMS databases, a design with two segment types was common. It was efficient because the normal physical implementation interleaved A and B segments in one data set and connected them with pointers. This is not so with DB2.
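Clustering
Often determined by large batch jobs (avoid random touches)
Standard solution for conflicts: index-only
[Visual: tables CUST and POLICY, both clustered (C) on CUSTNO; tables ORDER (clustered on ORDERNO) and ITEM (clustered on ITEMNO) share the dependent table ORDERITEM, which is clustered on ORDERNO and has a second index on ITEMNO plus many columns]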
Figure 3-3. Clustering
Notes:
Clustering often has a dramatic effect on the performance of large batch jobs which process tables that are bigger than the buffer pools. If two tables like ORDER and ITEM have a common dependent table (ORDERITEM), only one parent can be clustered like the dependent. An index enabling index-only access (ITEMNO, many columns) is often a good solution; the rows in this index are clustered as the rows in the other parent table.
Denormalization 1: Copy from Parent to Dependent
- One table accessed instead of two
- Random touches minimized
- Maintenance with trigger
- Update parent: overhead may be high
- Longer X locks
[Visual: tables CUST and ACCOUNT; add CUSTNAME to ACCOUNT]
Make VQUBE for UPDATE CUSTNAME
- assume ... accounts
- assume three indexes with CUSTNAME pointing to ACCOUNT
Figure 3-4. Denormalization 1: Copy from Parent to Dependent
Notes:
When performance is not adequate even with the best possible indexes, denormalization (adding redundant table columns) should be considered. From a performance point of view, there are two kinds of denormalization. Adding CUSTNAME to ACCOUNT table is an example of type 1. SELECTs that need CUSTNAME in addition to ACCOUNT columns are faster, but UPDATE CUSTNAME takes longer and some data may be locked for a long time.
Denormalization 2: Summary Tables and Columns
Example 1: ORDER and ORDERITEM
- Add TOTAL$_ITEMS to ORDER
Example 2: ACCOUNT
- TOTAL_BALANCE: sum of account balances (new table; could be implemented as a materialized query table)
Maintenance overhead less than with denormalization 1
Summary row often X locked, may become a bottleneck
Figure 3-5. Denormalization 2: Summary Tables and Columns
Notes:
Type 2 denormalization may create additional lock waits because the summary row is often X locked. UPDATE BALANCE is not dramatically slower (there is only one extra row to update), but the summary row may become a bottleneck because of the exclusive lock, which is held until commit. If the queries to summary data do not need up-to-date data, the summary columns could be updated periodically. As with indexes, perhaps we tend to overemphasize the overhead of maintenance. Triggers make denormalization safe. Denormalized tables are often a good tradeoff. Materialized query tables (MQT) can be used to implement denormalized tables under certain conditions. The only advantage of MQTs is that the optimizer is aware of them and will transform a query written to access the base table(s) into an equivalent query using the MQTs. For this transformation to occur, many conditions must be met.
Unit Summary
Key points:
- Clustering often critical for massive batch
- Consider denormalizing if performance not adequate with best possible indexes
- Separate tables for optional data often bad for performance
Unit 4. Learning to Live with Optimizer

This unit is about preventing and fixing optimizer-related problems.

After completing this unit, you should be able to:
- Describe the limitations related to dangerous predicates
- Identify situations when the optimizer needs help with filter factor estimates
- Avoid the pitfalls with joins, subqueries, and unions

Accountability: Labs 4, 5, 6, and 7
Unit Objectives
After completing this unit, you should be able to:
- Describe the limitations related to dangerous predicates
- Identify situations when the optimizer needs help with filter factor estimates
- Avoid the pitfalls with joins, subqueries, and unions
4.1 Dangerous Predicates

After completing this topic, you should be able to:
- Recognize predicates that can cause the optimizer to miscalculate filter factors
- Determine predicates that can cause problems with the access path selected
- Identify common nonindexable predicates
- Differentiate between stage 1 and stage 2 predicates
Cost-Based Optimizer
Access path     I/O TIME   CPU TIME   COST
MIS, X1         XXX        XXX        XXX
MIS, X1, LP     XXX        XXX        XXX
MIS, X2         XXX        XXX        XXX
MIS, X2, LP     XXX        XXX        XXX
MIA, X1 + X2    XXX        XXX        XXX
Table scan      XXX        XXX        XXX
One of the alternatives is marked as having the lowest COST
MIS = Matching index scan, LP = List prefetch, MIA = Multiple index access
Figure 4-2. Cost-Based Optimizer
Notes:
The optimizer sees many reasonable alternative access paths for a query and estimates the cost for each. The cost relates to local response time in VQUBE but the formula is much more sophisticated.
Predicate Too Difficult for Optimizer
- Nonindexable
- Non-Boolean term
- Stage 2
- Filter factor
Figure 4-3. Predicate Too Difficult for Optimizer
Notes:
If the optimizer does not choose the best access path, the reason is often in the WHERE clause. The first three points relate to queries for which the optimizer does not see the best access path. Filter factor problems are different. The optimizer sees the best access path but overestimates its relative cost, or underestimates the cost of another alternative.
Disappointed with Matching Columns?
Administration Guide:
Look at the index columns from leading to trailing. For each index column, if there is at least one indexable Boolean term predicate on that column, it is a match column.
Figure 4-4. Disappointed with Matching Columns?
Notes:
Two desirable properties for a predicate: indexable and Boolean term. If you write a nonindexable or non-Boolean term predicate in your WHERE clause, the number of matching columns may be lower than you expect.
A Nonindexable Predicate
SELECT ....... FROM ORDER WHERE TOTAL$_ITEMS NOT BETWEEN 20 AND 90 ORDER BY TOTAL$_ITEMS
Index on TOTAL$_ITEMS: MC = 0, SORT = N
Better alternatives:
- UNION ALL: index on TOTAL$_ITEMS, MC = 1 (2X), SORT = Y
- 2 CURSORS: index on TOTAL$_ITEMS, MC = 1 (2X), SORT = N
Figure 4-5. A Nonindexable Predicate
Notes:
The optimizer cannot choose a matching index scan, because NOT BETWEEN is a nonindexable predicate. It must choose between nonmatching index scan and table scan. UNION ALL is better because both SELECTs have an indexable predicate. However, ORDER BY in UNION or UNION ALL always causes a sort. Therefore, two cursors is the best alternative. WHERE TOTAL$_ITEMS < 20 OR TOTAL$_ITEMS > 90 is not a good solution because of the OR. DB2 would probably choose multiple index access: SORT=Y, INDEXONLY=N. The issues related to OR will be discussed later in this unit.
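That the two-cursor rewrite returns the same rows in the same order can be checked with a small experiment. The sketch below uses Python's built-in sqlite3 module purely as a stand-in, so it says nothing about DB2 access paths, matching columns, or sorts; the index name X_TOTAL and the sample rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "ORDER" (ORDERNO INTEGER PRIMARY KEY, "TOTAL$_ITEMS" INTEGER)')
conn.execute('CREATE INDEX X_TOTAL ON "ORDER" ("TOTAL$_ITEMS")')
conn.executemany('INSERT INTO "ORDER" VALUES (?, ?)',
                 [(1, 5), (2, 20), (3, 55), (4, 90), (5, 91), (6, 300), (7, 19)])

# Original cursor: one SELECT with NOT BETWEEN.
original = conn.execute(
    'SELECT ORDERNO, "TOTAL$_ITEMS" FROM "ORDER" '
    'WHERE "TOTAL$_ITEMS" NOT BETWEEN 20 AND 90 '
    'ORDER BY "TOTAL$_ITEMS"').fetchall()

# Rewrite: two cursors, each with one simple range predicate. Reading the
# low range first and the high range second keeps the overall order.
low = conn.execute(
    'SELECT ORDERNO, "TOTAL$_ITEMS" FROM "ORDER" '
    'WHERE "TOTAL$_ITEMS" < 20 ORDER BY "TOTAL$_ITEMS"').fetchall()
high = conn.execute(
    'SELECT ORDERNO, "TOTAL$_ITEMS" FROM "ORDER" '
    'WHERE "TOTAL$_ITEMS" > 90 ORDER BY "TOTAL$_ITEMS"').fetchall()

two_cursors = low + high
print(two_cursors == original)
conn.close()
```

The program has to open the second cursor only when the first one is exhausted, which is the price paid for avoiding both the nonindexable predicate and the sort.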
Other Nonindexable Predicates
Comparisons with different data types
- with some exceptions
Scalar functions
Arithmetic expressions
- with many exceptions
BETWEEN COL1 AND COL2
Figure 4-6. Other Nonindexable Predicates
Notes:
See the complete list in the Administration Guide of the DB2 version you are using. The list gets more complicated version by version as more predicates are made indexable.
Do Not Ban Nonindexable Predicates
Table T: 10,000,000 rows, Result = 10 rows
CURSOR1: SELECT... FROM T WHERE COL1 = 2 x COL2
  OPEN CURSOR1
  FETCH CURSOR1 (x10)
CURSOR2: SELECT... FROM T
  OPEN CURSOR2
  FETCH CURSOR2 (x10M), then check COL1 = 2 x COL2 in the program
The difference: 9,999,990 FETCHes
If CPU cost of FETCH is 10us, = 100s CPU time
Figure 4-7. Do Not Ban Nonindexable Predicates
Notes:
Banning all nonindexable predicates is an unwise standard. If you can make a nonindexable predicate indexable, you should do it, but leaving out a nonindexable predicate increases the number of executed SQL calls; CPU time goes up.
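The size of the penalty on the visual follows from simple arithmetic, sketched here in Python. The 10 microseconds per FETCH is the assumption stated on the visual, not a measured value, and the variable names are illustrative.

```python
rows_in_table = 10_000_000
rows_in_result = 10
cpu_per_fetch_us = 10      # assumed CPU cost of one FETCH, in microseconds

# FETCHes that exist only because the predicate was moved into the program
extra_fetches = rows_in_table - rows_in_result
extra_cpu_seconds = extra_fetches * cpu_per_fetch_us / 1_000_000

print(extra_fetches)
print(round(extra_cpu_seconds, 1))
```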
WHERE PRED1 OR PRED2
Three Cases
- Can be converted to IN-list
  WHERE COLX = :A OR COLX = :B ==> optimizer: COLX IN (:A, :B)
  All access paths possible
- Not like IN-list but both predicates indexable
  WHERE COLX = :A OR COLX > :B
  Only multiple index access, nonmatching index scan or table scan possible
- At least one nonindexable predicate
  WHERE COLX <> :A OR COLY = :B
  Only nonmatching index scan or table scan possible
Figure 4-8. WHERE PRED1 OR PRED2
Notes:
When two predicates are combined with an OR, the access path chosen by the optimizer may be non-optimal. Anybody writing an OR in the WHERE clause should be aware of the current limitations in access path selection. This visual shows how the optimizer handles WHERE PRED1 OR PRED2. More complex cases must be analyzed with the concept of Boolean term predicates.
Boolean Term or Non-Boolean Term?
A predicate is Boolean term if a row can be rejected whenever the predicate is evaluated false, without looking at the other predicates in the WHERE clause.
Example 1: PRED1 AND (PRED2 OR PRED3)
Example 2: PRED1 OR (PRED2 AND PRED3)
Figure 4-9. Boolean Term Or Non-Boolean Term?
Notes:
No predicate is non-Boolean term as such; if you have no OR in a WHERE clause, all predicates are Boolean term. Non-Boolean term predicates may cause matching columns disappointments. Remember the important sentence: For each column, if there is at least one indexable Boolean term predicate on that column, it is a match column.
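The definition can be tested mechanically with a truth table. The following Python sketch is an illustration only: it treats each simple predicate as plain true or false (SQL's unknown value for NULLs is ignored), and the function name is invented.

```python
from itertools import product

def boolean_term_flags(expression, n_predicates):
    """Report, for each simple predicate, whether it is Boolean term.

    expression takes a tuple of truth values, one per simple predicate.
    A predicate is Boolean term if the whole WHERE clause is false in every
    combination in which that predicate is false.
    """
    flags = []
    for i in range(n_predicates):
        rejects_row = all(
            not expression(values)
            for values in product([False, True], repeat=n_predicates)
            if not values[i]
        )
        flags.append(rejects_row)
    return flags

# Example 1: PRED1 AND (PRED2 OR PRED3)
example1 = boolean_term_flags(lambda p: p[0] and (p[1] or p[2]), 3)
# Example 2: PRED1 OR (PRED2 AND PRED3)
example2 = boolean_term_flags(lambda p: p[0] or (p[1] and p[2]), 3)

print(example1)
print(example2)
```

In Example 1 only PRED1 is Boolean term, so only PRED1 can contribute a match column; in Example 2 no predicate is.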
Safe versus Dangerous Predicates
WHERE Predicate 1 AND Predicate 2 OR Predicate 3
Predicate
- Indexable (and Stage 1)
- Nonindexable
  - Stage 1: No matching
  - Stage 2: No matching, no screening
Figure 4-10. Safe versus Dangerous Predicates
Notes:
A simple predicate (like Predicate 1, Predicate 2, Predicate 3; the combination is called a compound predicate) is one of these:
- Indexable (and stage 1)
- Nonindexable and stage 1
- Nonindexable and stage 2
Stage 2 predicates are evaluated by a component which understands all DB2 predicates but uses more CPU time than the component which is only able to evaluate stage 1 predicates. In addition, the stage 2 component is not able to do index screening. An example of a stage 2 predicate is WHERE CURRENT DATE BETWEEN COL1 AND COL2. Even with an index containing COL1 and COL2, DB2 reads the table row to evaluate the predicate: no index screening. Remember figure 2-9 (matching versus screening)? To enable matching, a predicate must be indexable and Boolean term. To enable screening, a predicate must be stage 1.
Browsing
Simple approach
Read whole result in one transaction, store result somewhere
Performance may be acceptable if result always small
Recommended approach
Fetch one screen per transaction
Important to prevent result materialization at OPEN CURSOR (no sort!) and to ensure high number of matching columns
Figure 4-11. Browsing
Notes:
The simple approach is convenient but risky: if the result can sometimes consist of many screens (say, more than ten), response time may be unacceptable. If the user interface has a scrolling bar, it may be necessary to send more than one screen at a time to the workstation. A maximum number of lines like 300 should be set, and the access path should probably be index-only. The recommended approach requires careful predicate analysis. The next transaction should start index scan exactly at the point where the current one exits.
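The idea of restarting the index scan exactly where the previous transaction stopped can be illustrated without SQL. In this Python sketch a sorted list stands in for an index on (CUSTNAME, CUSTNO), bisect stands in for the index probe, and the screen size of 2, the sample entries, and the function name are all assumptions made for the illustration.

```python
import bisect

# Hypothetical index on (CUSTNAME, CUSTNO), kept in key order.
index_entries = sorted([
    ("SMITH", 10), ("SMITH", 20), ("SMITHERS", 15), ("SMITHSON", 99), ("SMYTHE", 3),
])
SCREEN_SIZE = 2   # small value for the illustration

def next_screen(entries, prev_key, high_name):
    """Return the next screen after prev_key and the key to save for the next call."""
    start = bisect.bisect_right(entries, prev_key)   # first entry after the saved key
    screen = []
    for name, custno in entries[start:start + SCREEN_SIZE]:
        if name > high_name:                         # past the requested name range
            break
        screen.append((name, custno))
    new_prev = screen[-1] if screen else prev_key
    return screen, new_prev

high = "SMIT" + "\uffff" * 4     # 'SMIT' padded with a high character
prev = ("SMIT", 0)               # low values: start before the first match

first, prev = next_screen(index_entries, prev, high)
second, prev = next_screen(index_entries, prev, high)
third, prev = next_screen(index_entries, prev, high)
print(first)
print(second)
print(third)
```

Each call touches only the entries it returns plus at most one more, regardless of how many entries precede the saved key; that is the property the recommended approach needs from its access path.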
4.2 Lab 4: Browsing Application
Lab 4: Browsing Application Description

1. User enters first few characters of CUSTNAME, say, 'SMIT'
   Result 0.......1000 customers
2. Program moves:
   'SMIT' padded with hex 00 to :PREVNAME
   'SMIT' padded with hex FF to :HIGH
   Low values (hex '00') to :PREVNO
3. Program FETCHs first 20 rows and displays 1st screen
4. One line per customer: CUSTNO, CUSTNAME, CITY
   Sorted by CUSTNAME, CUSTNO
   Max 20 lines per screen
5. Program moves:
   CUSTNAME from the 20th row to :PREVNAME
   CUSTNO from the 20th row to :PREVNO
6. Saves :PREVNAME and :PREVNO for next transaction

Browsing program uses recommended approach to FETCH 1 screen per transaction

Table CUST: 1,000,000 rows
X1 (P,C): CUSTNO
X2 (U, 3 star index): CUSTNAME, CUSTNO, CITY
Sample rows (CUSTNO, CUSTNAME): 10 SMITH ....., 20 SMITH ....., 15 SMITHERS .., 99 SMITHSON ..

VQUBE: TR=1 TS=19 Estimated LRT 10.4ms
But surprisingly response times are sometimes very long!!

Figure 4-12. Lab 4: Browsing Application Description
Lab 4: Browsing SQL Currently In Use

DECLARE BR CURSOR FOR
  SELECT CUSTNO, CUSTNAME, CITY
  FROM CUST
  WHERE (CUSTNAME = :PREVNAME AND CUSTNO > :PREVNO)
     OR (CUSTNAME > :PREVNAME AND CUSTNAME <= :HIGH)
  ORDER BY CUSTNAME, CUSTNO
  OPTIMIZE FOR 20 ROWS
OPEN BR
FETCH BR (max 20 times)
CLOSE BR
Save :PREVNAME, :PREVNO

Figure 4-13. Lab 4: Browsing SQL Currently In Use
Lab 4: Instructions (1 of 2)
1. What are the predicates in the SELECT statement intended to achieve?
2. Classify each of the 4 (simple) predicates in the SELECT:
   SELECT CUSTNO, CUSTNAME, CITY
   FROM CUST
   WHERE (CUSTNAME = :PREVNAME AND CUSTNO > :PREVNO)
      OR (CUSTNAME > :PREVNAME AND CUSTNAME <= :HIGH)
   ORDER BY CUSTNAME, CUSTNO
   OPTIMIZE FOR 20 ROWS
   Are these predicates:
   a. Indexable or nonindexable?
   b. Stage 1 or stage 2?
   c. Boolean term or non-Boolean term?
3. What is it that makes the predicates in this SELECT 'dangerous'?
4. Which access path is going to be ruled out?
5. Which possible access paths may be chosen?
Figure 4-14. Lab 4: Instructions (1 of 2)
6. Do a VQUBE and estimate the local response time for these possible access paths:
   a. Nonmatching index scan
   b. Multiple index access
   c. Table scan
7. Is it an index problem?
   a. Can the indexes be improved?
8. Is it a filter factor problem?
   a. Would REOPT(ALWAYS) help?
9. Is it an SQL problem?
   a. How would you rewrite the browsing SELECT for cursor repositioning to improve performance?
4.3 Optimizer and Filter Factors

After completing this topic, you should be able to:
- Define filter factor
- Identify sources of information for the optimizer's calculation of filter factor
- Consider the implication of default filter factors
- Describe the impact of correlated columns used in the WHERE clause
- Use techniques to overcome filter factor miscalculations
Definition of Filter Factor
Filter factor = (number of rows in result table) / (number of rows in source table)
WHERE FNAME = :FNAME AND CITY = :CITY
Filter factor of FNAME = :FNAME: 500 / 1,000,000 = 1/2000
Filter factor of CITY = :CITY: 2000 / 1,000,000 = 1/500
Filter factor of the compound predicate: 1/2000 x 1/500 = 1/1,000,000
Average result = 1 row (if no correlation)
Figure 4-16. Definition of Filter Factor
Notes:
When you estimate the elapsed time of a cursor, you must make an assumption about the size of the result table. So must the optimizer. The filter factor of a predicate is between 0 and 1. Normally, the filter factor depends on the contents of the table: when a female customer is added to a customer table, the filter factor of SEX = 'F' goes up. Some predicates, like COLX = COLX, are always true (filter factor=1), while others, like 0=1, are always false (filter factor=0). Predicates like these are sometimes used to influence the estimates of the optimizer. A simple predicate, like FNAME = :FNAME, has a filter factor and so does a compound predicate, like the one on the visual. The compound predicate filter factor is not always the product of the filter factors of the ANDed simple predicates. In our example, if each city has a unique set of first names, the filter factor of the compound predicate is 1/2000. This is not uncommon. Think, for instance, of WHERE MANUFACTURER = 'HONDA' AND MODEL = 'ACCORD'.
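The effect of correlation on the size of the result can be made concrete with a few lines of Python. The sketch assumes a table of 1,000,000 rows, a filter factor of 1/2000 for FNAME = :FNAME and 1/500 for CITY = :CITY, and compares the uncorrelated case with the extreme case described above in which each city has its own set of first names.

```python
from fractions import Fraction

ROWS = 1_000_000
ff_fname = Fraction(1, 2000)   # FNAME = :FNAME
ff_city = Fraction(1, 500)     # CITY = :CITY

# No correlation: the compound filter factor is the product.
ff_independent = ff_fname * ff_city

# Extreme correlation: the first name already determines the city, so
# CITY = :CITY filters nothing beyond what FNAME = :FNAME already did.
ff_correlated = ff_fname

print(ff_independent, ROWS * ff_independent)
print(ff_correlated, ROWS * ff_correlated)
```

An estimate of 1 row against an actual 500 rows is the kind of gap that can make an access path look far cheaper than it is.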
Like you, the optimizer must also think about when the result will be materialized: at OPEN CURSOR or FETCH by FETCH. When a cursor contains OPTIMIZE FOR N ROWS or FETCH FIRST N ROWS ONLY, the optimizer knows it will need to materialize only N rows if there is no workfile or temporary table (no sort...) in the access path.
Reality versus Optimizer's Estimate
Filter factor
- Actual: SELECT COUNT(*) ... WHERE Predicate ==> X%
- Optimizer's estimate: RUNSTATS ==> CATALOG ==> Y%
Figure 4-17. Reality versus Optimizer's Estimate
Notes:
If you are familiar with the application, you have an idea of the filter factors. You can measure the filter factor with a SELECT COUNT(*) if the predicate refers to a value, like SEX = 'F'. For SEX = :SEX, you must find the cardinality (the number of different values) of column SEX to determine the average filter factor. The optimizer never issues SELECTs. Its filter factor estimates are based on statistics collected by the RUNSTATS utility. The optimizer knows, for instance, that the cardinality of SEX is 2. Obviously, if X and Y are far from each other, the optimizer may choose a wrong access path, no matter how good the cost formula is. RUNSTATS reads application tables and indexes, and stores statistics in the catalog, mainly these:
Per table
- Number of rows (column CARDF in catalog table SYSIBM.SYSTABLES)
- Number of pages (NPAGESF in SYSIBM.SYSTABLES)
Per index
- Number of leaf pages (NLEAF in SYSIBM.SYSINDEXES)
- Clusterratio: Percentage of table rows in the same order as the index, 100% for clustering index after table reorganization (CLUSTERRATIOF in SYSIBM.SYSINDEXES)
- Number of different index key values (FULLKEYCARDF in SYSIBM.SYSINDEXES)
Per column
- Number of different values (cardinality, COLCARDF in SYSIBM.SYSCOLUMNS)
  Automatic for first index column (FIRSTKEYCARDF in SYSIBM.SYSINDEXES), optional for other columns
- Second lowest and second highest value (LOW2KEY and HIGH2KEY in SYSIBM.SYSCOLUMNS)
  First 2000 bytes
  Automatic for first index column, optional for other columns
- Most frequently occurring values and least frequently occurring values with their frequency (COLVALUE and FREQUENCYF in SYSIBM.SYSCOLDIST)
  Automatic for first index column
Per group of columns (N columns concatenated), optional
- Number of different values (CARDF in SYSIBM.SYSCOLDIST)
- Most frequently occurring values and least frequently occurring values with their frequency (COLVALUE and FREQUENCYF in SYSIBM.SYSCOLDIST)
Optimizer's Filter Factor Formulae
Predicate type                  Filter factor                                       Default filter factor
COL = value                     1/COLCARDF                                          0.04
COL IS NULL                     1/COLCARDF                                          0.04
COL op value                    (H2 - value)/(H2 - L2) or (value - L2)/(H2 - L2)    see Figure 4-19
COL BETWEEN value1 AND value2   (value2 - value1)/(H2 - L2)                         see Figure 4-19
COL LIKE 'char%'                similar to BETWEEN char||00 and char||FF            see Figure 4-19
COL IN (list)                   list size x (1/COLCARDF)                            list size x 0.04
. . .
op is any of the operators: >, >=, <, <=, ¬>, ¬<
H2 = second highest value for COL (HIGH2KEY in SYSIBM.SYSCOLUMNS)
L2 = second lowest value for COL (LOW2KEY in SYSIBM.SYSCOLUMNS)
Figure 4-18. Optimizer's Filter Factor Formulae
Notes:
This chart shows the optimizer's filter factor formulae for some common predicates. They are not surprising. The only interesting column is default filter factor. These are used not only when RUNSTATS is forgotten (not likely), but also when a range predicate refers to a host variable, like BALANCE > :BALANCE. Most and least frequently occurring values are used when available and when possible. For instance, the estimate for SEX = 'F' is 99% if the optimizer knows that 99% of rows in NURSE table have value 'F' in column SEX. With a host variable, the estimate is 1/COLCARDF.
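The formulae on the chart translate directly into code. The Python sketch below is a simplified illustration, not the optimizer's implementation: the function names are invented, the clamping of results to the range 0 to 1 is an added safeguard, and the BALANCE statistics in the example are hypothetical.

```python
def clamp(x):
    """Keep an estimate inside the valid filter factor range 0..1 (added safeguard)."""
    return min(1.0, max(0.0, x))

def ff_equal(colcardf):
    """COL = value: 1/COLCARDF."""
    return 1.0 / colcardf

def ff_greater(value, l2, h2):
    """COL > value: (H2 - value)/(H2 - L2)."""
    return clamp((h2 - value) / (h2 - l2))

def ff_less(value, l2, h2):
    """COL < value: (value - L2)/(H2 - L2)."""
    return clamp((value - l2) / (h2 - l2))

def ff_between(value1, value2, l2, h2):
    """COL BETWEEN value1 AND value2: (value2 - value1)/(H2 - L2)."""
    return clamp((value2 - value1) / (h2 - l2))

def ff_in_list(list_size, colcardf):
    """COL IN (list): list size x (1/COLCARDF)."""
    return clamp(list_size / colcardf)

# Hypothetical column BALANCE with LOW2KEY = 0 and HIGH2KEY = 10,000
print(ff_equal(2))                        # for example SEX = :SEX, COLCARDF = 2
print(ff_greater(9000, 0, 10000))         # BALANCE > 9000
print(ff_between(2000, 3000, 0, 10000))   # BALANCE BETWEEN 2000 AND 3000
print(ff_in_list(2, 5))                   # two values out of five
```

The range formulae are linear interpolation between L2 and H2, so they implicitly assume that values are spread evenly between the second lowest and second highest value.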
Default Filter Factors for Range Predicates
If COLCARDF is ...   ... then filter factor is
                     >, >=, <, <=     BETWEEN, LIKE
>= 100,000,000       1/10,000         3/100,000
>= 10,000,000        1/3000           1/10,000
>= 1,000,000         1/1000           3/10,000
>= 100,000           1/300            1/1000
>= 10,000            1/100            3/1000
>= 1000              1/30             1/100
>= 100               1/10             3/100
>= 2                 1/3              1/10
= 1                  1                1
<= 0                 1/3              1/10
Figure 4-19. Default Filter Factors for Range Predicates
Notes:
Which defaults would you use if you wrote an optimizer?
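Looking up a default is a matter of finding the first cardinality threshold that applies. The Python sketch below encodes the default values for range predicates, with the plain range operators in one column and BETWEEN/LIKE in the other; the function and constant names are illustrative, and the column assignment follows the worked example in this unit (COLCARDF of 2 million giving 1/1000 for >=, and COLCARDF of -1 giving 1/10 for BETWEEN).

```python
from fractions import Fraction as F

# (lower bound of COLCARDF, default for >, >=, <, <=, default for BETWEEN/LIKE)
DEFAULTS = [
    (100_000_000, F(1, 10_000), F(3, 100_000)),
    (10_000_000,  F(1, 3000),   F(1, 10_000)),
    (1_000_000,   F(1, 1000),   F(3, 10_000)),
    (100_000,     F(1, 300),    F(1, 1000)),
    (10_000,      F(1, 100),    F(3, 1000)),
    (1000,        F(1, 30),     F(1, 100)),
    (100,         F(1, 10),     F(3, 100)),
    (2,           F(1, 3),      F(1, 10)),
    (1,           F(1, 1),      F(1, 1)),
]
NO_STATISTICS = (F(1, 3), F(1, 10))   # COLCARDF <= 0, for example -1

def default_ff(colcardf, between_or_like=False):
    """Default filter factor for a range predicate on a column with this COLCARDF."""
    for lower_bound, ff_op, ff_between in DEFAULTS:
        if colcardf >= lower_bound:
            return ff_between if between_or_like else ff_op
    ff_op, ff_between = NO_STATISTICS
    return ff_between if between_or_like else ff_op

print(default_ff(2_000_000))                  # a >= predicate, COLCARDF = 2M
print(default_ff(-1, between_or_like=True))   # a BETWEEN predicate, no statistics
```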
Correlated Columns
WHERE FNAME = :FNAME AND CITY = :CITY
Filter factor of FNAME = :FNAME: 1/2000
Filter factor of CITY = :CITY: 1/500
Filter factor of compound predicate may be different from 1/2000 x 1/500
To get a more accurate filter factor for a compound predicate, collect the cardinality and the most and least frequently occurring values for the concatenation of FNAME and CITY.
RUNSTATS ... TABLE(...) COLGROUP(FNAME,CITY) FREQVAL COUNT xx BOTH
or
RUNSTATS ... INDEX ... KEYCARD FREQVAL NUMCOLS 2 COUNT xx BOTH (if an index starting with FNAME and CITY exists)
Figure 4-20. Correlated Columns
Notes:
xx indicates the number of most and least frequently occurring values that RUNSTATS will collect. The most and least frequently occurring values will only be used by the optimizer if the predicate does not contain a host variable or if dynamic SQL or BIND REOPT(ALWAYS) is used.
How to Help Optimizer with Filter Factor Problems

- BIND ... REOPT(ALWAYS) or dynamic SQL (CPU overhead)
- Update catalog tables (dangerous)
- Add redundant predicates
  - AND COLX BETWEEN :LO AND :HI (to reduce estimated cost of an alternative)
  - OR 0=1 (to make a predicate non-Boolean term)
  - add 1 or CONCAT empty string (to make a predicate nonindexable)
- Optimization hints
  - Update PLAN_TABLE, BIND ... OPTHINT(...)
Figure 4-21. How to Help Optimizer with Filter Factor Problems
Notes:
1. The most elegant solution is to use actual values for the filter factor estimates (instead of the defaults). The overhead is difficult to predict. For simple SQL statements it may be a few milliseconds of CPU time.
2. This is hard to manage and therefore dangerous. A harmless-looking example is updating the number of levels in an index when DB2 chooses an index with fewer levels although another index would give index-only access.
3. Redundant predicates were the standard solution before optimization hints became available. They are easier to manage than alternative 2, but not always possible. The redundant predicates may lose their expected effect when the optimizer is improved.
4. Optimization hints are the long-awaited veto option. The idea is to mark the wanted access path in PLAN_TABLE (the output of EXPLAIN) and then feed it back to the optimizer with a new BIND option, OPTHINT. Programs are not affected, but QUERYNO should be added to keep the hint active when program maintenance changes statement numbers.
Filter Factor - Example

Accounting trace output:
LOCAL RESPONSE TIME: 5min 12s
- NON-SQL: 1s
- SQL: 5min 11s
  - CPU TIME: 15s
  - SYNCHRONOUS READ: 4min 38s (AVG per page: 32.8ms)
  - LOCK WAIT, WAIT FOR PREFETCH, OTHER
Getpages (tables): 20,429
Getpages (indexes): 362
SR (tables): 8349
SR (indexes): 130
Seq. prefetch requests: 246
SQL calls: 27
Figure 4-22. Filter Factor - Example
Notes:
A huge number of pages is read from the disk subsystem (some synchronously, some with prefetch) by 27 SQL calls.
Slow SQL Statement
SELECT (36 columns)
FROM LETTER
WHERE BO = :BO
  AND CNAME BETWEEN :LO AND :HI
  AND PICKED IN (' ', 'L')
  AND CNO >= :CNOPREV
  AND LNO > :LNOPREV
ORDER BY CNAME
OPTIMIZE FOR 23 ROWS
Figure 4-23. Slow SQL Statement
Notes:
Several statements in this program were SELECT COUNTs whose access paths were clustered index scans with data reference. They caused a large number of sequential touches. A few columns had to be added to the current index to eliminate table touches. For this statement, the optimizer had chosen index (PICKED,CNO), which seems strange. Index (BO,CNAME) is the clustering index and, furthermore, it would prevent the sort for ORDER BY CNAME.
Current Indexes (in Addition to Primary Key Index)

Index (PICKED, CNO), 24,000 leaf pages: 0 star
- MC=2 (should be 3)
- SORT=Y
- INDEXONLY=N
Index (BO, CNAME), clustering (C), 23,000 leaf pages: 1 star
- MC=2 (should be 3)
- SORT=N
- INDEXONLY=N
Table LETTER: 8,400,000 rows, 660,000 pages
Figure 4-24. Current Indexes (in Addition to Primary Key Index)
Notes:
Table LETTER has three indexes. Only the two shown on the visual are relevant for our SQL statement.
Average Filter Factors (Actual versus Optimizer's Estimate)

Predicate             Actual     Optimizer's Estimate
PICKED IN (' ','L')   1/4000     1/4000
CNO >= :CNOPREV       often 1    1/1000
CNAME BETWEEN ...     often 1    1/10
BO = :BO              1/622      1/622
Optimizer thinks PICKED, CNO is very selective
COLCARDF(CNO) = 2M
COLCARDF(CNAME) = -1: shows that RUNSTATS statistics have never been collected for this column
Figure 4-25. Average Filter Factors (Actual versus Optimizer's Estimate)
Notes:
The cardinality (COLCARDF in SYSIBM.SYSCOLUMNS) of BO is 622. PICKED has only five different values, all known to the optimizer (most or least frequently occurring values). Values ' ' and 'L' are rare (and the optimizer knows it). The actual filter factors for the two other predicates are often 1, as these are optional input fields. As these two predicates are range predicates containing host variables, the optimizer must use default filter factor values (see figure 4-19). The only input to the optimizer in this case is the cardinality of the columns (COLCARDF in SYSIBM.SYSCOLUMNS), shown on the visual. According to figure 4-19, a cardinality of 2 million leads to a default filter factor of 1/1000. For column CNAME, the catalog shows a cardinality of -1, which means that RUNSTATS statistics have never been collected for this column. According to figure 4-19 again, the default filter factor in this case is 1/10. The filter factor for LNO > :LNOPREV is of no interest for our example, as LNO is not present in any index.
VQUBEs with Average Filter Factors (Actual versus Optimizer's Estimate)

Index used: PICKED, CNO
- Actual VQUBE
  TS (index): 8,400,000 / (4000x1) = 2100
  TR (table): 8,400,000 / (4000x1) = 2100
  LRT = 21s
- Optimizer's "VQUBE"
  TS (index): 8,400,000 / (4000x1000) = 2.1
  TR (table): 8,400,000 / (4000x1000) = 2.1
  LRT = 21ms
Index used: BO, CNAME
- Actual VQUBE
  TS (index): 8,400,000 / (622x1) = 13,505
  TS (table): 8,400,000 / (622x1) = 13,505
  LRT = 0.54s
- Optimizer's "VQUBE"
  TS (index): 8,400,000 / (622x10) = 1350
  TS (table): 8,400,000 / (622x10) = 1350
  LRT = 54ms
Figure 4-26. VQUBEs with Average Filter Factors (Actual versus Optimizer's Estimate)
Notes:
8,400,000 is the number of rows in table LETTER. The touches on table LETTER are random for index PICKED, CNO, as this is not the clustering index. They are sequential for index BO, CNAME, as this is the clustering index. The first and only TR on the indexes and, for index BO, CNAME, on table LETTER has been ignored, as these 10ms do not change the estimates noticeably. The actual estimates show that index BO, CNAME is, by far, the better index (0.54s versus 21s). But the optimizer estimates show that index PICKED, CNO is the better one (21ms versus 54ms). So, the optimizer will use this index. The main reason for the optimizer's bad estimates is the huge difference between the actual filter factor and the estimated filter factor for column CNAME. The local response time measured with the accounting trace (5min 12s) is far higher than the estimated 21s, because our estimates are based on average filter factors. The measured values were worst case values.
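The four estimates can be reproduced with a short Python sketch. It assumes 10ms per random touch and 0.02ms per sequential touch; the second value is not stated on this visual and is inferred from the quoted results (for example 2 x 13,505 sequential touches giving 0.54s). Function names are illustrative.

```python
MS_PER_TR = 10.0    # random touch
MS_PER_TS = 0.02    # sequential touch (assumption inferred from the quoted results)
ROWS = 8_400_000

def touches(*filter_factor_denominators):
    """Rows left after applying filter factors given as 1/denominator."""
    n = ROWS
    for d in filter_factor_denominators:
        n /= d
    return n

def lrt_ms(tr, ts):
    """Local response time estimate in milliseconds."""
    return tr * MS_PER_TR + ts * MS_PER_TS

# Index (PICKED, CNO): sequential index touches, random table touches.
actual_picked = lrt_ms(tr=touches(4000, 1), ts=touches(4000, 1))
optimizer_picked = lrt_ms(tr=touches(4000, 1000), ts=touches(4000, 1000))

# Index (BO, CNAME): clustering index, so the table touches are sequential too.
actual_bo = lrt_ms(tr=0, ts=2 * touches(622, 1))
optimizer_bo = lrt_ms(tr=0, ts=2 * touches(622, 10))

print(f"PICKED, CNO: actual {actual_picked / 1000:.0f}s, optimizer {optimizer_picked:.0f}ms")
print(f"BO, CNAME:   actual {actual_bo / 1000:.2f}s, optimizer {optimizer_bo:.0f}ms")
```

The ranking of the two indexes flips between the two sets of filter factors, which is the whole problem: the cost comparison is only as good as the filter factor estimates fed into it.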
How To Help the Optimizer

BIND ... REOPT(ALWAYS) or dynamic SQL
==> much better filter factor estimates (will be close to actual if RUNSTATS statistics are up to date)
Update COLCARDF for CNO to 1
==> optimizer's filter factor estimate now 1
Make CNO >= :CNOPREV nonindexable, for example: CNO || '' >= :CNOPREV
==> access path via PICKED,CNO: MC = 1
Optimization hints
Create the best possible index for this SQL statement
Figure 4-27. How To Help the Optimizer
Notes:
The inconsistent use of RUNSTATS (cardinality for CNO was collected, cardinality for CNAME was not collected) contributed to the wrong index choice. Fixing that could be enough to make the optimizer choose the better index.
Learn to Live with Optimizer

Writing SQL
  Understand nonindexable and non-Boolean term predicates
EXPLAIN
  Check: index used, matching columns, sort, index-only
  Best access path chosen? (VQUBE, actual filter factors)
  If not, check predicates (nonindexable, non-Boolean term?)
  If OK, analyze filter factors (VQUBE, estimated filter factors)
Figure 4-28. Learn To Live with Optimizer
Notes:
Anyone who writes SQL in a professional role should understand the concepts of nonindexable and non-Boolean term, and also the pitfalls discussed later in this unit. The application developer should do EXPLAIN as soon as a realistic test database is available. This will reveal simple errors early. This applies also to SQL generated by a tool.
4.4 Join Issues

After completing this topic, you should be able to:
Differentiate between the join methods and join types available to DB2
Identify how to select optimal indexes for joins and subqueries
3 Join Methods, 2 Join Types

Methods (EXPLAIN: METHOD)
  Nested loop
  Merge scan
  Hybrid
Types (EXPLAIN: JOIN_TYPE)
  Inner join: all three methods can be used
  Outer join
    Full: always merge scan
    Right or left: never hybrid
Figure 4-29. 3 Join Methods, 2 Join Types
Notes:
Nested loop is the most common join method. Merge scan may be faster than nested loop if a join predicate index is missing or if the result table is large. Hybrid join is essentially nested loop with list prefetch on the inner table.
Merge Scan Join
OUTER TABLE (scanned via index, in join column order): A A C D D G G G ...
INNER TABLE (scanned via index, in join column order): B C E E E G G H ...
Both ordered sets are merged into the RESULT TABLE.
TWO ORDERED SETS DEVELOPED FOR MERGE PASS
INDEX OR RDS SORT MAY BE USED ON EITHER TABLE
ONE MERGE PASS ONLY
Figure 4-30. Merge Scan Join
Notes:
Merge scan finds the qualifying rows from both tables, sorts by join column if necessary, and then merges the two row sets. The inner table is always materialized in a workfile, even if there is no sort. Otherwise, there is no difference between the outer and the inner table.
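The merge pass can be illustrated outside DB2. The following Python sketch is an assumption-laden illustration of the general merge join technique, not DB2's implementation; it uses the join column values shown on the visual, and the function name is made up.

```python
def merge_scan_join(outer, inner):
    """Join two lists that are already sorted by the join column."""
    result = []
    i = j = 0
    while i < len(outer) and j < len(inner):
        if outer[i] < inner[j]:
            i += 1                      # no partner for this outer value
        elif outer[i] > inner[j]:
            j += 1                      # no partner for this inner value
        else:
            key = outer[i]
            # Find the run of equal join values on each side.
            i_end = i
            while i_end < len(outer) and outer[i_end] == key:
                i_end += 1
            j_end = j
            while j_end < len(inner) and inner[j_end] == key:
                j_end += 1
            # Every outer row of the run matches every inner row of the run.
            for o in range(i, i_end):
                for n in range(j, j_end):
                    result.append((outer[o], inner[n]))
            i, j = i_end, j_end
    return result


outer_table = list("AACDDGGG")   # join column values from the visual
inner_table = list("BCEEEGGH")
pairs = merge_scan_join(outer_table, inner_table)
print(len(pairs), pairs[0])
```

Each input is read once from start to end, which is why a single merge pass is enough once both sets are in join column order.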
Nested Loop Join
OUTER TABLE: one SCAN (via index)
INNER TABLE: repeated SCANs (via index), one per qualifying outer row
SINGLE SCAN OF OUTER TABLE
REPETITIVE SCANS OF INNER TABLE
INDEX MAY BE USED TO ACCESS EITHER TABLE
Matching rows go to the RESULT TABLE.
Figure 4-31. Nested Loop Join
Notes:
When the optimizer chooses nested loop, DB2 first finds one qualifying row from one table (the outer table), and then the related rows from the other table. The optimizer chooses the outer table based on the cost estimates of the alternatives. Nested loop is the most common join method in transactions. Nested loop is efficient when the result is small, the indexes good, and the optimizer chooses the best table order. The choice about outer and inner table is important. Basically, fewer accesses to the inner table will give better performance if the needed indexes are available. This is why the better outer table is the one with the fewest qualifying rows in most cases.
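Why the table order matters can be seen in a small simulation. The Python sketch below is an illustration only: the data, the dictionary standing in for an index on the join column, and all names are assumptions made for the example. It joins the same two row sets in both orders and counts the accesses to the inner table.

```python
from collections import defaultdict


def nested_loop_join(outer_rows, inner_rows, key):
    """Return (joined rows, number of accesses to the inner table)."""
    # Stand-in for an index on the inner table's join column.
    inner_index = defaultdict(list)
    for row in inner_rows:
        inner_index[row[key]].append(row)

    joined, probes = [], 0
    for outer in outer_rows:            # single scan of the outer table
        probes += 1                     # one inner access per outer row
        for inner in inner_index.get(outer[key], []):
            joined.append({**inner, **outer})
    return joined, probes


# Made-up test data: 10 customers, 20 orders.
cust = [{"custno": n, "zip": 100 + n % 5} for n in range(10)]
orders = [{"custno": n % 10, "orderno": n, "year": 2000 + n % 4}
          for n in range(20)]

# Apply the local predicates first; only qualifying rows take part.
qualifying_cust = [c for c in cust if c["zip"] == 100]
qualifying_orders = [o for o in orders if o["year"] >= 2001]

rows_a, probes_a = nested_loop_join(qualifying_cust, qualifying_orders, "custno")
rows_b, probes_b = nested_loop_join(qualifying_orders, qualifying_cust, "custno")
print(probes_a, probes_b, len(rows_a))
```

Both orders return the same rows, but the order that starts with the smaller qualifying set needs far fewer accesses to the inner table.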
How to Estimate Joins
VQUBE:
If full outer join, the method will always be merge scan.
If left or right outer join, assume nested loop. The outer table will be the left or right table respectively.
If inner join, assume nested loop. Assume the outer table to be the one with the fewest qualifying rows.
For all cases, count TRs and TSs as with simple selects.
Figure 4-32. How to Estimate Joins
Notes:
If the result is large and nested loop slow, assume merge scan. The number of qualifying rows is the number of rows left when the local predicates to that table have been applied. The rule of thumb for inner joins predicts the table order correctly in most cases, but not always. The optimizer does not use a simple rule like this; it estimates the cost of each alternative. In VQUBE, a join and a program with several cursors seem equally fast, because the number of SQL calls is not taken into account. Actually, a join consumes less CPU time if the access paths are identical.
Join Example
CUST (1000 rows): X1 (P,C) CUSTNO; X2 CUSTZIP
ORDER (2000 rows): X3 ORDERNO; X4 CUSTNO; X5 ORDERDATE

SELECT C.CUSTNO, CUSTLASTNAME, CUSTZIP, ORDERNO, TOTAL$_ITEMS, ORDERDATE
FROM CUST C, ORDER O
WHERE C.CUSTNO = O.CUSTNO
AND CUSTZIP BETWEEN :HV1 AND :HV2 (5%)
AND ORDERDATE BETWEEN :HV3 AND :HV4 (90%)

NESTED LOOP worksheet
OUTER TABLE = CUST: 1st table (index + table) + 2nd table (index + table) = TOUCHES
OUTER TABLE = ORDER: 1st table (index + table) + 2nd table (index + table) = TOUCHES
Figure 4-33. Join Example
Notes:
When both tables in a two-table join have a local predicate, it is not obvious which table should be the outer one. You can use the number of qualifying rows rule of thumb or, for a more reliable prediction, VQUBE. The relationship between tables CUST and ORDER is one-to-many. On average, there are two ORDER rows per one CUST row. Assuming nested loop, which table would you choose as the outer table?
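The number of qualifying rows rule of thumb takes only a few lines to apply. The Python sketch below uses the row counts and filter factors from the visual (CUST: 1000 rows, 5%; ORDER: 2000 rows, 90%); the variable names are made up for the illustration, and a full VQUBE would still be needed for a reliable prediction.

```python
tables = {
    # name: (rows, filter factor of the local predicate in percent)
    "CUST": (1000, 5),
    "ORDER": (2000, 90),
}

# Rows left after the local predicate of each table has been applied.
qualifying = {name: rows * pct // 100 for name, (rows, pct) in tables.items()}

# Rule of thumb: the table with the fewest qualifying rows is the outer table.
outer = min(qualifying, key=qualifying.get)

# In a nested loop join, the inner table is accessed once per outer row.
inner_accesses = qualifying[outer]
print(qualifying, outer, inner_accesses)
```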
But Optimizer Chose ORDER
Filter factor            Optimizer's estimate    Actual
CUSTZIP BETWEEN ..                               5%
ORDERDATE BETWEEN ..                             90%
Assume: COLCARDF (CUSTZIP) = 50, COLCARDF (ORDERDATE) = 500

Figure 4-34. But Optimizer Chose ORDER
Notes:
A common problem: The optimizer's estimates for the filter factors of the range predicates with host variables (without REOPT(ALWAYS)) are not very good; the optimizer cannot know at bind time what will be moved to the host variables at execution time.
Optimal Indexes for Joins and Subqueries
Table access order affects index requirements
Indexes influence table access order decision
(Table A, Table B)
1. Assume best indexes for all alternatives
2. Find the best alternative
3. Design best indexes for that alternative
Figure 4-35. Optimal Indexes for Joins and Subqueries
Notes:
The number of qualifying rows rule of thumb assumes the best possible indexes.
Optimal Indexes for Joins: Example
SELECT ...
FROM CUST, ACC
WHERE CX BETWEEN ...
AND AX BETWEEN ...
AND CUST.CUSTNO = ACC.CUSTNO

CUST: X1 CUSTNO; X2 CX, ...; X3 CUSTNO, ...
ACC: X4 ACCNO; X5 CUSTNO, ...; X6 AX, ...

Table order CUST, ACC: X2 and X5 important
Table order ACC, CUST: X6 and X3 important
Figure 4-36. Optimal Indexes for Joins: Example
Notes:
Assume we currently have only the primary key indexes (X1 and X4) and the foreign key index (X5). In the first case (CUST is the outer table), we would add index X2 and enough columns to X5 to get index-only access. In the second case, we would add indexes X6 and X3. X1 is a primary key index, so no columns should be added to it.
How to Predict the Best Table Order
Nested Loop Join with no ORDER BY: The table with the lowest number of qualifying rows should probably be the outermost table
EXPLAIN: PLANNO
Figure 4-37. How to Predict Best Table Order
Notes:
If ORDER BY refers to only one table, that table should be the outermost table. If the above considerations conflict, you should do a VQUBE to predict the best table order, or maybe create indexes for both or all alternatives and check PLANNO (table access number, 1 refers to the outermost table) in EXPLAIN output. If ORDER BY refers to more than one table, sort cannot be avoided.
Join Pitfall
ORDER BY referring to inner table in nested loop join results in SORT

Even with perfect indexes
Figure 4-38. Join Pitfall
Notes:
Anyone writing SQL professionally should know this. If the sort is not acceptable, the join must be replaced by two or more cursors.
4.5 Lab 5: Joins

This is an important lab. Joins are often slow because of inadequate indexing.
Lab 5: Joins
For customers whose names begin with a specific string of characters, say, 'SMIT' we are looking for large account balances, say, those greater than 20
SELECT CUSTNAME, CUST.CUSTNO, ACCNO, BALANCE
FROM ACCOUNT, CUST
WHERE ACCOUNT.CUSTNO = CUST.CUSTNO
AND CUSTNAME LIKE :CN      FF = 1%
AND BALANCE > :BAL         FF = 0.5%
ORDER BY CUSTNAME, CUST.CUSTNO
Figure 4-39. Lab 5: Joins
Notes:
LIKE :CN is indexable if the content of the host variable does not start with a special character (% or _) and if column CUSTNAME does not have a fieldproc. If you use fieldprocs to, say, sort national characters, replace LIKE by BETWEEN.
Lab 5: ACCOUNT Table and CUST Table

ACCOUNT (3,000,000 rows)
  Columns: ACCNO (Primary key), CUSTNO (Foreign key), BALANCE, ....
  Indexes: X1 (P) ACCNO; X2 (C) CUSTNO
CUST (1,000,000 rows)
  Columns: CUSTNO (Primary key), CUSTNAME, ....
  Indexes: X3 (P,C) CUSTNO
Currently, the tables have only the basic, recommended indexes:
ACCOUNT has a primary key index X1 on ACCNO and a foreign key index X2 on CUSTNO
CUST has a primary key index X3 on CUSTNO
Figure 4-40. Lab 5: ACCOUNT Table and CUST Table
Notes:
The tables are normalized. There are no redundant columns.
1. Assume a nested loop join
   a. How many qualifying rows will there be from the ACCOUNT table?
   b. How many qualifying rows will there be from the CUST table?
   c. In which sequence would you access the 2 tables? Hint: Assume the table with the fewer qualifying rows to be the outer table
2. Assume CUST to be the outer table and improve the performance of the JOIN as follows:
   a. Add a suitable index to CUST and
   b. Improve an existing index on ACCOUNT
3. Do a VQUBE for the JOIN with the improved indexes:
   a. For the total result set
   b. For the first screen
4. Repeat question 2 but assume ACCOUNT to be the outer table and improve the performance of the JOIN as follows:
   a. Add a suitable index to ACCOUNT and
   b. Add a suitable index to CUST
5. Do a VQUBE for the JOIN with the improved indexes
6. Is the performance of the JOIN sufficient with these improved indexes? If not, what can you do?
7. Denormalize the ACCOUNT table by adding CUSTNAME and amend the query.
8. For this denormalized table and the amended query, design the best possible index and do the VQUBE
2. Add columns from ORDER BY or GROUP BY, excluding the columns from step 1
3. Add the remaining columns in the statement
4.6 Subquery Issues

After completing this topic, you should be able to:
Differentiate between correlated and noncorrelated subqueries
Describe implications of noncorrelated subqueries that return a single value versus those that return multiple values
Two Types of Subquery
NONCORRELATED SUBQUERY
  no link between outer query and subquery
  1 SUBQUERY (result saved in WORKFILE)
  2 OUTER QUERY (repetitive workfile scans)
CORRELATED SUBQUERY
  outer query and subquery linked by a correlation value
  1 OUTER QUERY
  2 SUBQUERY (repetitive subquery executions)

Figure 4-45. Two Types of Subquery
Notes:
This is another very important visual. Anyone writing subqueries professionally should understand and remember the difference between correlated and noncorrelated subqueries. When you write a join, the optimizer may choose the join method and the table order in many cases. This is not the case with subqueries. That is why the optimizer sometimes converts your subquery into a join before choosing the access path.
Noncorrelated Subquery (Single Value)
All orderitems with QUANTORD greater than average

SELECT ORDERNO
FROM ORDERITEM
WHERE QUANTORD > (SELECT AVG(QUANTORD) FROM ORDERITEM)
1. Execute subquery once - result = single value
2. Execute outer query
VQUBE = subquery + outer query
Figure 4-46. Noncorrelated Subquery (Single Value)
Notes:
If the workfile consists of a single row, its processing cost can be ignored.
Noncorrelated Subquery (Multiple Values)

All customers who do not have any orders
SELECT CUSTNO, CUSTLASTNAME
FROM CUST
WHERE CUSTNO NOT IN (SELECT CUSTNO FROM ORDER)
1. Execute subquery
2. Save result in workfile (sorted, duplicates removed) and build sparse index
3. Execute outer query
4. For every row from outer query, scan workfile through sparse index and apply IN/ALL/ANY predicate (scan stops as soon as predicate evaluation (true/false) is known)
VQUBE: For each scan of workfile, assume TS=100
Figure 4-47. Noncorrelated Subquery (Multiple Values)
Notes:
The sparse index built by DB2 for the workfile is a special one-level index with a fixed number of entries. Each index entry contains the highest value in one part of the workfile. The proposed 100 sequential touches is a safe estimate for accessing the workfile via the sparse index.
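The idea of a one-level sparse index over a sorted workfile can be sketched as follows. This Python example is a simplified illustration based only on the description above, not DB2's actual implementation; the number of index entries, the test data, and the function names are assumptions.

```python
import bisect


def build_sparse_index(workfile, parts):
    """One entry per part: (highest value in the part, start, end)."""
    size = -(-len(workfile) // parts)          # ceiling division
    index = []
    for start in range(0, len(workfile), size):
        end = min(start + size, len(workfile))
        index.append((workfile[end - 1], start, end))
    return index


def in_workfile(value, workfile, index):
    """Return (found, number of workfile rows touched)."""
    highs = [high for high, _, _ in index]
    pos = bisect.bisect_left(highs, value)     # first part that can hold value
    if pos == len(index):
        return False, 0                        # value is above every entry
    _, start, end = index[pos]
    touched = 0
    for row in workfile[start:end]:
        touched += 1
        if row >= value:                       # sorted: outcome is now known
            return row == value, touched
    return False, touched


# Sorted, duplicate-free workfile of 1000 hypothetical CUSTNO values.
workfile = list(range(0, 2000, 2))
index = build_sparse_index(workfile, parts=10)

print(in_workfile(1234, workfile, index))
print(in_workfile(1235, workfile, index))
```

In this sketch a lookup never touches more than one part of the workfile (100 rows here), which is in line with the safe estimate of TS=100 per workfile scan.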
Correlated Subquery
All customers having at least one order bigger than a given limit
SELECT CUSTNO, CUSTLASTNAME, CUSTFIRSTNAME
FROM CUST X
WHERE EXISTS (SELECT 'X' FROM ORDER WHERE CUSTNO = X.CUSTNO AND TOTAL$_ITEMS > :HV)
1. Execute outer query
2. For every qualifying row, execute subquery (EXISTS stops as soon as successful)
VQUBE = outer query + N x subquery
N = number of qualifying rows from outer query
Figure 4-48. Correlated Subquery
Notes:
No workfile, no new VQUBE rules. Often the same query can be written as a correlated or noncorrelated subquery. Sometimes the former is faster, sometimes the latter. It seems, however, that with good indexes the correlated subquery is more often the faster alternative.
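The VQUBE formulas for the three subquery cases can be written down as small functions. The Python sketch below is an illustration only: the function names are made up, the sequential touch cost of 0.02ms is the value that fits the LRT figures in this unit, and the sample numbers in the last lines are invented to show the calculation, not measurements.

```python
TS_MS = 0.02  # assumed cost of one sequential touch


def noncorrelated_single_ms(subquery_ms, outer_ms):
    """Subquery executed once, then the outer query."""
    return subquery_ms + outer_ms


def noncorrelated_multi_ms(subquery_ms, outer_ms, outer_rows, ts_per_scan=100):
    """Adds one workfile scan (assume TS=100) per row from the outer query."""
    return subquery_ms + outer_ms + outer_rows * ts_per_scan * TS_MS


def correlated_ms(outer_ms, subquery_ms, qualifying_outer_rows):
    """Outer query plus N subquery executions."""
    return outer_ms + qualifying_outer_rows * subquery_ms


# Invented example: outer query 20ms, 50 qualifying outer rows,
# 10ms per correlated subquery execution, 200ms to build the workfile.
print(correlated_ms(20, 10, 50))
print(noncorrelated_multi_ms(200, 20, 50))
```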
EXPLAIN and Subquery
Execution order not shown in EXPLAIN
Check SQL statement or EXPLAIN: Correlated or noncorrelated?

EXPLAIN : QBLOCK_TYPE
Figure 4-49. EXPLAIN and Subquery
Notes:
EXPLAIN does not show the execution sequence of a subquery. The execution sequence is not the same as the order of rows in the PLAN_TABLE. EXPLAIN shows the type of a subquery (QBLOCK_TYPE is CORSUB or NCOSUB). Then apply the very important visual (figure 4-45): noncorrelated starts from the bottom, correlated from the top.
4.7 Lab 6: Different Implementations of the Same Transaction
Lab 6: Description
Program
  Reads: CUSTZIP, ORDERDATE
  Tables: ORDER, CUST
  Shows: CUSTNO, CUSTPHONE, CUSTLASTNAME
  Implementations:
  1. 1 cursor and 1 singleton select
  2. Join
  3. Correlated subquery
  4. Noncorrelated subquery
  5. 1 cursor and 2 singleton selects
Input
  CUSTZIP (in host variable :HVCUSTZIP)
  ORDERDATE (in host variable :HVORDERDATE)
Requirement
  Show CUSTNO, CUSTPHONE and CUSTLASTNAME for all the customers in one area (in other words, one CUSTZIP) who have orders older than a given date
Assumptions
  50 customers on average per CUSTZIP
  20 orders on average per customer
  10% of customers have at least one old order (average 4 old orders)
Figure 4-50. Lab 6: Description
Lab 6: Available Indexes
CUST: X1 (P,C) CUSTNO; X2 CUSTZIP, CUSTNO
ORDER (1,000,000 rows, 20,000 pages): X3 (P,C) ORDERNO; X4 CUSTNO, ORDERDATE; X5 ORDERDATE, CUSTNO
Figure 4-51. Lab 6: Available Indexes
Lab 6: At A Glance
One CUSTZIP
  1,000 CUSTZIPs
  50 customers on average per CUSTZIP
  20 orders per customer
  10% of customers have at least one old order
  Customers with old orders have on average 4 old orders
(Diagram: the customers of one CUSTZIP, each with 20 orders; for some customers the 20 orders include 4 old orders)
WHERE CUSTZIP = :HVCUSTZIP
  50,000 customers
  50 customers per CUSTZIP
  Filter factor = 50 / 50,000 = 0.001
WHERE ORDERDATE < :HVORDERDATE
  10% of customers have on average 4 old orders
  5000 customers have 20,000 old orders
  1,000,000 orders in total
  Filter factor = 20,000 / 1,000,000 = 0.02
Figure 4-52. Lab 6: At A Glance
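The two filter factors on this visual follow directly from the lab assumptions. The Python sketch below repeats the arithmetic with exact fractions; the variable names are made up for the illustration.

```python
from fractions import Fraction

# Lab 6 assumptions as stated on the visual.
custzips = 1_000
customers_per_zip = 50
orders_per_customer = 20
share_with_old_orders = Fraction(10, 100)
old_orders_per_such_customer = 4

customers = custzips * customers_per_zip
orders = customers * orders_per_customer
customers_with_old_orders = customers * share_with_old_orders
old_orders = customers_with_old_orders * old_orders_per_such_customer

# Filter factor = qualifying rows / all rows.
ff_custzip = Fraction(customers_per_zip, customers)
ff_orderdate = Fraction(old_orders, orders)
print(float(ff_custzip), float(ff_orderdate))
```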
Lab 6: Ideal Access Path (1 of 2)

Ideally, we would want to access the CUST table only when we know a customer has an old order.
Mark up the diagram with the ideal access path and do the VQUBE.
CUST: X1 (P,C) CUSTNO; X2 CUSTZIP, CUSTNO
ORDER (1,000,000 rows, 20,000 pages): X3 (P,C) ORDERNO; X4 CUSTNO, ORDERDATE; X5 ORDERDATE, CUSTNO
Worksheet columns: MC | INDEX TR | INDEX TS | TABLE TR | TABLE TS | LRT
Figure 4-53. Lab 6: Ideal Access Path (1 of 2)
Lab 6: Ideal Access Path (2 of 2)

This would be the ideal access path, but the optimizer is not able to generate an access path like this.
With multiple index access, indexes must point to the same table.
CUST: X1 (P,C) CUSTNO; X2 CUSTZIP, CUSTNO
ORDER (1,000,000 rows, 20,000 pages): X3 (P,C) ORDERNO; X4 CUSTNO, ORDERDATE; X5 ORDERDATE, CUSTNO

         MC   INDEX TR   INDEX TS   TABLE TR   TABLE TS   LRT
X2       1    1          50         -          -          0.011s
X4       2    50         -          -          -          0.500s
CUST     -    -          -          5          -          0.050s
Total                                                      0.561s

For each of the following 5 implementations, do the VQUBE and count SQL statements.
Which comes closest to the ideal case?
Figure 4-54. Lab 6: Ideal Access Path (2 of 2)
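The LRT column of the ideal access path can be recomputed from the touch counts. The Python sketch below is an illustration under the assumption of 10ms per random touch and 0.02ms per sequential touch (the values that reproduce the 0.011s, 0.500s and 0.050s shown on the visual); the names are made up.

```python
TR_MS = 10.0   # assumed cost of one random touch
TS_MS = 0.02   # assumed cost of one sequential touch

# (step, index TR, index TS, table TR, table TS), as on the visual
steps = [
    ("X2", 1, 50, 0, 0),    # one CUSTZIP slice: 50 customers
    ("X4", 50, 0, 0, 0),    # one index probe per customer
    ("CUST", 0, 0, 5, 0),   # only the customers with an old order
]


def lrt_ms(index_tr, index_ts, table_tr, table_ts):
    """Local response time of one step in milliseconds."""
    return (index_tr + table_tr) * TR_MS + (index_ts + table_ts) * TS_MS


per_step = {name: lrt_ms(*touches) for name, *touches in steps}
total_ms = sum(per_step.values())
print(per_step, total_ms)
```

The same function can be reused for the five worksheets that follow, so that each implementation can be compared against the ideal total.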
Lab 6: PGM 1 - One Cursor and One Singleton Select Worksheet

CURSOR X:
SELECT CUSTNO, CUSTPHONE, CUSTLASTNAME
FROM CUST
WHERE CUSTZIP = :HVCUSTZIP

SQL Y:
SELECT 1
FROM ORDER
WHERE CUSTNO = :HVCUSTNO
AND ORDERDATE < :HVORDERDATE
FETCH FIRST ROW ONLY

Program logic:
OPEN X
FETCH X
MOVE CUSTNO TO :HVCUSTNO
execute Y
If SQLCODE = 0 add customer to result

CUST: X1 (P,C) CUSTNO; X2 CUSTZIP, CUSTNO
ORDER (1,000,000 rows, 20,000 pages): X3 (P,C) ORDERNO; X4 CUSTNO, ORDERDATE; X5 ORDERDATE, CUSTNO
Worksheet columns: MC | INDEX TR | INDEX TS | TABLE TR | TABLE TS | LRT
Figure 4-55. Lab 6: PGM 1 - One Cursor and One Singleton Select Worksheet
Lab 6: PGM 2 - Join Worksheet

SELECT DISTINCT CUST.CUSTNO, CUSTPHONE, CUSTLASTNAME
FROM CUST, ORDER
WHERE CUST.CUSTNO = ORDER.CUSTNO
AND CUSTZIP = :HVCUSTZIP
AND ORDERDATE < :HVORDERDATE

Program logic:
OPEN X
FETCH X
CLOSE X

CUST: X1 (P,C) CUSTNO; X2 CUSTZIP, CUSTNO
ORDER (1,000,000 rows, 20,000 pages): X3 (P,C) ORDERNO; X4 CUSTNO, ORDERDATE; X5 ORDERDATE, CUSTNO
Worksheet columns: MC | INDEX TR | INDEX TS | TABLE TR | TABLE TS | LRT

Why the DISTINCT?
What are the filter factors of the local predicates?
Which table will be the outer one assuming a nested loop join?
Figure 4-56. Lab 6: PGM 2 - Join Worksheet
Lab 6: PGM 3 - Correlated Subquery Worksheet

SELECT CUSTNO, CUSTPHONE, CUSTLASTNAME
FROM CUST X
WHERE CUSTZIP = :HVCUSTZIP
AND EXISTS (SELECT 'X' FROM ORDER
            WHERE CUSTNO = X.CUSTNO
            AND ORDERDATE < :HVORDERDATE)

Program logic:
OPEN X
FETCH X
CLOSE X

CUST: X1 (P,C) CUSTNO; X2 CUSTZIP, CUSTNO
ORDER (1,000,000 rows, 20,000 pages): X3 (P,C) ORDERNO; X4 CUSTNO, ORDERDATE; X5 ORDERDATE, CUSTNO
Worksheet columns: MC | INDEX TR | INDEX TS | TABLE TR | TABLE TS | LRT
Figure 4-57. Lab 6: PGM 3 - Correlated Subquery Worksheet
Lab 6: PGM 4 - Noncorrelated Subquery Worksheet

SELECT CUSTNO, CUSTPHONE, CUSTLASTNAME
FROM CUST
WHERE CUSTZIP = :HVCUSTZIP
AND CUSTNO IN (SELECT CUSTNO FROM ORDER
               WHERE ORDERDATE < :HVORDERDATE)

Program logic:
OPEN X
FETCH X
CLOSE X

CUST: X1 (P,C) CUSTNO; X2 CUSTZIP, CUSTNO
ORDER (1,000,000 rows, 20,000 pages): X3 (P,C) ORDERNO; X4 CUSTNO, ORDERDATE; X5 ORDERDATE, CUSTNO
Workfile (CUSTNOs)
Worksheet columns: MC | INDEX TR | INDEX TS | TABLE TR | TABLE TS | LRT
Figure 4-58. Lab 6: PGM 4 - Noncorrelated Subquery Worksheet
Lab 6: PGM 5 - One Cursor and Two Singleton Selects Worksheet

CURSOR X:
SELECT CUSTNO
FROM CUST
WHERE CUSTZIP = :HVCUSTZIP

SQL Y:
SELECT 1
FROM ORDER
WHERE CUSTNO = :HVCUSTNO
AND ORDERDATE < :HVORDERDATE
FETCH FIRST ROW ONLY

Program logic:
OPEN X
FETCH X INTO :HVCUSTNO
execute Y
IF SQLCODE = 0 THEN
  SELECT CUSTNO, CUSTPHONE, CUSTLASTNAME FROM CUST WHERE CUSTNO = :HVCUSTNO
  add customer to result

CUST: X1 (P,C) CUSTNO; X2 (U) CUSTZIP, CUSTNO
ORDER (1,000,000 rows, 20,000 pages): X3 (P,C) ORDERNO; X4 CUSTNO, ORDERDATE; X5 ORDERDATE, CUSTNO
Worksheet columns: MC | INDEX TR | INDEX TS | TABLE TR | TABLE TS | LRT
Figure 4-59. Lab 6: PGM 5 - One Cursor and Two Singleton Selects Worksheet
4.8 Union Issues

After completing this topic, you should be able to: Avoid three significant performance pitfalls related to UNION operations
UNION
SELECT ... FROM ... WHERE ...
UNION (or UNION ALL)
SELECT ... FROM ... WHERE ...
ORDER BY ...

UNION
  Sort to eliminate duplicates (PITFALL 1)
UNION ALL
  No sorting (duplicates allowed)
Both Cases
  One select at a time (table may be scanned several times) (PITFALL 2)
  ORDER BY always results in an additional sort (PITFALL 3)
Figure 4-60. UNION
Notes:
UNION is a simple operation, but there are three significant performance pitfalls. The third one is a common cause for disappointments. If both UNION (without ALL) and ORDER BY are specified, DB2 will do only one sort if the sort requirements for both clauses can be merged.
4.9 Lab 7: UNION
Lab 7: UNION
Find all small and large orders

SELECT ORDERNO, ORDERDATE, TOTAL$_ITEMS
FROM ORDER
WHERE TOTAL$_ITEMS < :HVSMALL
UNION
SELECT ORDERNO, ORDERDATE, TOTAL$_ITEMS
FROM ORDER
WHERE TOTAL$_ITEMS > :HVLARGE
ORDER BY TOTAL$_ITEMS

Assumptions:
2% of rows qualify for the first predicate
1% of rows qualify for the second predicate

To do:
1. VQUBE
2. Improve SQL, indexes, or both
Figure 4-61. Lab 7: UNION
Notes:
Why would anyone write a complicated cursor like this instead of a single SELECT with OR? Of course, to avoid non-Boolean term predicates.
Lab 7: Current Table and Indexes
ORDER (100,000 rows, 2000 pages): X1 (P,C) ORDERNO; X2 CUSTNO, ORDERNO; X3 TOTAL$_ITEMS
Worksheet columns: MC | INDEX TR | INDEX TS | TABLE TR | TABLE TS | LRT
LRT values shown on the worksheet: 0.2s, 20s, total 20.2s
Figure 4-62. Lab 7: Current Table and Indexes
Two Issues
Optimizer does not always see the best alternative
  Nonindexable
  Non-Boolean term
  Subqueries
  ORDER BY in complex statement
Optimizer's estimates not always accurate enough
  Filter factors
  Buffer pool hit ratios
Figure 4-63. Two Issues
Notes:
The optimizer gets smarter and smarter but it will never be perfect; these two issues will not go away. This is the price we have to pay for the flexibility of relational databases. Compared to non-relational databases without an optimizer, relational databases are very forgiving: many unplanned changes can be made to the physical structure (like indexes) without touching the application programs.
Unit Summary
Key points:
Nonindexable predicates
Stage 2 predicates
Non-Boolean term predicates
Actual filter factor versus optimizer's estimate
Joins, subqueries, unions
Unit 5. Unpredictable Transactions

This unit is about the access path issues (index design, optimizer) caused by optional input fields and star joins.

After completing this unit, you should be able to:
Design good cursors and indexes for a transaction with optional input fields
Describe the problems the index designer and the optimizer face with star joins
Unit Objectives
After completing this unit, you should be able to:
Design good cursors and indexes for a transaction with optional input fields
Describe the problems the index designer and the optimizer face with star joins
5.1 Optional Input Fields
Many Criteria, Only a Few Selected
Input fields: A B C D
Figure 5-2. Many Criteria, Only a Few Selected
Notes:
The user may enter only one field or any combination. The table is a million-row table. Efficient indexing is required because a table scan takes too long.
Best Solution
WHERE A BETWEEN :A1 AND :A2
AND B BETWEEN :B1 AND :B2
AND C BETWEEN :C1 AND :C2
AND D BETWEEN :D1 AND :D2
Program moves low value to :A1 and high value to :A2 if user leaves input field A blank
BIND ... REOPT(ALWAYS) or dynamic SQL
Figure 5-3. Best Solution
Notes:
This cursor produces the correct result for any input, but the same access path is used every time, assuming that the SQL is static and bind option REOPT(ALWAYS) is not used. Which access path would the optimizer choose? If there was an index for each input field, the optimizer would choose either a matching index scan (MC=1) via the index with the highest cardinality (the assumed filter factor for that index would be low), or a multiple index access. In both cases, the access path would have one million touches (assuming the table has one million rows) whenever the input did not match the chosen access path. The response time would often be too long. REOPT(ALWAYS) or dynamic SQL enables DB2 to choose the access path according to the input. The optimizer sees which predicates do no filtering (filter factor=1), and it is able to derive a fairly good filter factor estimate for the others, based on LOW2KEY, HIGH2KEY, and the least or most frequently occurring values. The problem is now reduced to designing adequate indexes for any input. If you want to avoid the overhead of access path selection at each execution, you must write a cursor for every index, and choose the right cursor for each input in the application program.
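How a program might fill the host variables of the BETWEEN predicates can be sketched as follows. This Python example is an illustration only: the host variable names (A1, A2, and so on), the low and high values for character columns, and the function names are assumptions made for the example.

```python
# Hypothetical extreme values for character columns; a real program would
# use the lowest and highest values valid for each column's data type.
LOW_VALUE = ""
HIGH_VALUE = "\uffff"


def between_bounds(user_input):
    """Return the (low, high) pair for one BETWEEN predicate."""
    if user_input is None or user_input.strip() == "":
        # Field left blank: the predicate must not filter anything.
        return LOW_VALUE, HIGH_VALUE
    # Field entered: same value twice, so BETWEEN acts like an equal predicate.
    return user_input, user_input


def host_variables(screen_input):
    """Map input fields A..D to the eight host variables of the cursor."""
    values = {}
    for field in ("A", "B", "C", "D"):
        low, high = between_bounds(screen_input.get(field))
        values[field + "1"] = low
        values[field + "2"] = high
    return values


hv = host_variables({"C": "X", "D": " "})
print(hv)
```

With static SQL and no REOPT(ALWAYS), the optimizer never sees these values at bind time, which is why the access path stays the same for every input.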
One Cursor, One Access Path
MC = 1
Indexes on table T: A,B,C,D; B,A,C,D; C,A,B,D; D,A,B,C
Static SQL without REOPT(ALWAYS)
Optimizer chooses access path at bind time
Default filter factors for range predicates
Figure 5-4. One Cursor, One Access Path
Notes:
If you choose the best solution (the cursor on the previous visual), but without REOPT(ALWAYS), creating several indexes for the query is wishful thinking. The same index (say, the shaded one) will be used every time until the next bind or rebind. If the user enters data in fields C and D, this would be a fairly good access path, especially if filter factor of (C BETWEEN) < filter factor of (D BETWEEN):
Matching columns = 1
Number of index touches = filter factor of (C BETWEEN) multiplied by number of rows in table
Index screening for D BETWEEN
Number of table touches = number of qualifying rows
Without REOPT(ALWAYS), the optimizer may sometimes choose the wrong index (like D,A,B,C in this case), but that would only increase the number of sequential index touches. An index starting with shoe size is not likely to be selected if there are two predicates, because the assumed filter factor for a BETWEEN is 10% if the cardinality of the column is between 2 and 100 (see figure 4-19). REOPT(ALWAYS) enables the optimizer to choose the index according to user input.
Without REOPT(ALWAYS)
Indexes on table T: A,B,C,D; B,A,C,D; C,A,B,D; D,A,B,C
Write four cursors
Ensure each cursor uses different index
Choose OPEN CURSOR in program based on user input
Matching columns = 1, but filter factor never 1
Figure 5-5. Without REOPT(ALWAYS)
Notes:
Without REOPT(ALWAYS), you face two non-trivial problems:
1. You must ensure that each cursor uses the intended index. An optimization hint is one way to accomplish this.
2. You have to analyze user input in the application program and open the appropriate cursor.
If each index contains all the search fields, the number of random table touches is equal to the number of result rows. If that number is too high, add columns to the indexes to get index-only access. If matching columns = 1 results in too many index touches and you do not use REOPT(ALWAYS), add cursors with equal predicates. With REOPT(ALWAYS), DB2 treats a BETWEEN :valuex AND :valuex like an equal predicate.
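The second problem, choosing the cursor in the application program, might look like the following sketch. It is written in Python for illustration only; the cursor names and the typical filter factors per field are invented, and a real program would use whatever knowledge it has about the selectivity of each input field.

```python
# One cursor per index; the cursor names are hypothetical.
CURSOR_FOR_LEADING_COLUMN = {
    "A": "CURSOR_ABCD",
    "B": "CURSOR_BACD",
    "C": "CURSOR_CABD",
    "D": "CURSOR_DABC",
}

# Assumed typical filter factors per field, for illustration only.
TYPICAL_FILTER_FACTOR = {"A": 0.01, "B": 0.05, "C": 0.001, "D": 0.10}


def choose_cursor(screen_input):
    """Pick the cursor whose index starts with the most selective entered field."""
    entered = [f for f, v in screen_input.items() if v and v.strip()]
    if not entered:
        raise ValueError("at least one input field is required")
    best = min(entered, key=TYPICAL_FILTER_FACTOR.__getitem__)
    return CURSOR_FOR_LEADING_COLUMN[best]


print(choose_cursor({"A": "", "B": "x", "D": "y"}))
print(choose_cursor({"C": "1", "A": "2"}))
```

Because a cursor is opened only when its leading field was entered, the filter factor of the matching predicate is never 1.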
5.2 Star Join
Star Schema
(Diagram: one FACT table in the center, surrounded by six dimension tables, DIM)
Figure 5-6. Star Schema
Notes:
This schema is common in data warehouse applications. The user may see the data as an n-dimensional cube. The fact table is normally much larger than any of the dimension tables. A typical query refers to a few attributes in different dimension tables, and asks for sums or averages from fact rows related to these attributes. The basic problem is the same as in the previous scenario (unpredictable user input), but the performance problems are much more difficult, for two reasons:
A typical query may need to read millions of fact rows.
Most queries are joins referring to several dimension tables which do not have any common columns.
Star Join
SELECT SUM(...)
FROM ITEM, STORE, SALES
WHERE ITEM.ITEMNO = SALES.ITEMNO
AND STORE.STORENO = SALES.STORENO
AND ITEMGROUP ...
AND STOREZIP ...
GROUP BY ...
Tables: ITEM (... rows), STORE (... rows), SALES (... rows)
Figure 5-7. Star Join
Notes:
This is a very simple star join, yet it can be very slow.
Table Order Crucial
Normally important to access all dimension tables before fact table
Optimizer may choose wrong table order because of incorrect filter factor estimates
Figure 5-8. Table Order Crucial
Notes:
In most cases, an access path reading fact rows before evaluating all dimension table predicates would be very slow. With dynamic SQL or REOPT(ALWAYS), the optimizer is likely to choose the correct table order. If it does not, you must help the optimizer. One alternative is to write or generate several cursors: first process the dimension cursors, then the fact cursor.
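Two Alternatives
ITEM (... rows): index (P) ITEMNO; index ITEMGROUP
STORE (... rows): index (P) STORENO; index STOREZIP
SALES (... rows): index (P,C) TIME, ITEMNO, STORENO; index ITEMNO, STORENO; index STORENO
TR: ... (first alternative), ... (second alternative)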

Figure 5-9. Two Alternatives
Notes:
The first access path (the solid line) touches 10,000,000 SALES rows. This would take 10,000,000 x 10ms = 28 hours, according to VQUBE. The second access path touches only 100,000 SALES rows. VQUBE local response time = 17 minutes. How does DB2 join the two dimension tables (ITEM and STORE) in the second access path? After all, these tables do not have any common columns. DB2 does a Cartesian join: it builds all combinations of the 100 qualifying STORENOs and the 1000 qualifying ITEMNOs. The result is a workfile with 100,000 rows. This workfile is then compared against the ITEMNO,STORENO index of the SALES table.
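The Cartesian join and the two VQUBE estimates can be reproduced with a few lines. The Python sketch below is an illustration only: the key values are placeholders, the cost of 10ms per random touch is the value used in the notes above, and the names are made up.

```python
from itertools import product

TR_MS = 10.0  # cost of one random touch, as used in the notes

qualifying_itemnos = range(1000)   # placeholder values for 1000 ITEMNOs
qualifying_storenos = range(100)   # placeholder values for 100 STORENOs

# Cartesian join of the two dimension tables:
# every qualifying ITEMNO combined with every qualifying STORENO.
workfile = list(product(qualifying_itemnos, qualifying_storenos))


def random_touch_seconds(touches):
    """Elapsed time in seconds for the given number of random touches."""
    return touches * TR_MS / 1000


first_path_hours = random_touch_seconds(10_000_000) / 3600
second_path_minutes = random_touch_seconds(len(workfile)) / 60
print(len(workfile), round(first_path_hours), round(second_path_minutes))
```

The workfile has one row per ITEMNO, STORENO combination, and each of them is looked up in the ITEMNO,STORENO index of SALES.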
Fact Table: Important Points
Denormalize dimension tables into fact table
Define many indexes to fact table, allowing index-only access (permutations of dimension table keys)
Consider difference of [ITEMNO ITEMNO STORENO] and [ITEMNO STORENO] in the example
Each index allowing index-only access may be larger than the table (indexes not compressed)
Alternative: Materialized query tables
Figure 5-10. Fact Table: Important Points
Notes:
The ITEMNO,STORENO index is a relatively good index for our example. DB2 does not need to scan the whole index, only 1000 slices relating to the qualifying ITEMNOs, or actually 100,000 subslices relating to the qualifying ITEMNO,STORENO combinations. 100,000 random index touches (100,000 x 10ms = 17 minutes) would be a very pessimistic estimate for the index, because the access is skip sequential. The 100,000 table rows, however, are not close to each other, so 100,000 x 10ms = 17 minutes is a realistic estimate for the table touches. To eliminate the table touches, the fact table columns should be copied to the index.
Unit Summary
Key points:
Special attention needed when user has many options
Many indexes allowing index-only access
Optimizer may need help
Unit 6. Massive Batch

In this unit we discuss the performance problems caused by massive batch jobs, and how to make batch jobs run faster.

After completing this unit, you should be able to:
- Detect potential performance problems with massive batch jobs early
- Make batch jobs run faster

Accountability: Lab 8
Unit Objectives
After completing this unit, you should be able to:
- Detect potential performance problems with massive batch jobs early
- Make batch jobs run faster
6.1 Massive Batch

After completing this topic, you should be able to:
- Detect potential performance problems with massive batch jobs early
- Recommend design changes to reduce random disk I/O and improve batch performance
- Identify design changes required to implement parallelism in a massive batch program
Batch Job Performance Issues
Random disk I/O
- Same page may be read several times from disk
CPU queuing time
- Low CPU priority
CPU time
- Millions of touches
Figure 6-2. Batch Job Performance Issues
Notes:
A batch job can be called massive if it executes more than one million SQL calls, and at least one of the tables is large compared to the buffer pools. A batch job with one million SQL calls processing at least one large table may finish in a few minutes or it may run for several hours. If the elapsed time is surprisingly long, the largest component is probably one of the three listed above, most often the first one.
Buffer Pools
[Diagram: the application program (APPL PGM) exchanges rows and columns with DB2; DB2 moves pages between DASD and the buffer pool]
Figure 6-3. Buffer Pools
Notes:
The size of the buffer pools plays a critical role in database performance. Many disk I/Os are avoided because the requested page is already in the buffer pool. Transaction response times deteriorate if nonleaf index pages do not stay in the buffer pool. Batch jobs are even more sensitive to buffer pool size and load: they may read each page of a table several times if the touches are random and the table is too large for the buffer pool.
How Long Do Pages Stay in Buffer Pool?
Assume:
- 5 pages read per transaction
- 50 transactions per second
- Buffer pool 100 MB (= 25,000 pages)

Read rate from DASD into the buffer pool: 250 pages/s
Buffer pool: 25,000 pages

Maximum unreferenced pool age (MUPA)
= 25,000 pages / 250 pages/s = 100s (roughly)
Figure 6-4. How Long Do Pages Stay in Buffer Pool?
Notes:
DB2 starts with an empty buffer pool. In this example, when 25,000 database pages have been read from disk, the buffer pool is full. With the assumed I/O rate this would happen after 100 seconds. When the buffer pool is full, the next page read from disk will overlay a page in the buffer pool. Roughly speaking, DB2 will overlay the least recently used page in the buffer pool. How long will the newly arrived page stay in the buffer pool if no program touches it? This time is called MUPA (Maximum unreferenced pool age). The simple formula on the visual suggests 100 seconds. This is somewhat optimistic because some popular pages stay in buffer pool forever (or at least as long as DB2 is up) and reduce the effective size of the buffer pool. MUPA will be less than 100 seconds. If MUPA is 30 seconds, pages that are referenced at least once in 30 seconds will stay in the buffer pool.
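The simple formula on the visual can be written as a minimal sketch. The function name is hypothetical, and the result is the optimistic value discussed above: it ignores popular pages that stay resident and shrink the effective pool.

```python
def estimate_mupa_seconds(pool_pages: int, pages_read_per_second: float) -> float:
    """Optimistic MUPA: time needed to fill the whole pool once from disk."""
    return pool_pages / pages_read_per_second


pages_per_transaction = 5
transactions_per_second = 50
pool_pages = 25_000  # the 100 MB buffer pool from the visual

read_rate = pages_per_transaction * transactions_per_second
mupa_seconds = estimate_mupa_seconds(pool_pages, read_rate)

print(read_rate)     # 250
print(mupa_seconds)  # 100.0
```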
How To Measure MUPA

Implement some single row tables, each in its own tablespace, without indexes:
T10, T30, T100, T300, T1000, T3000
Write a batch program which issues SELECTs at regular intervals:
- once in 10 seconds to T10
- once in 30 seconds to T30
- and so on
Run the program for 1 hour during peak periods with I/O activity trace turned on for this program

Table   T10   T30   T100   T300   T1000   T3000
SR      1     1     1      10     4       2

MUPA = 100s
Figure 6-5. How to Measure MUPA
Notes:
MUPA can be measured with a simple program. While MUPA varies widely according to the load, it is good to know the range in your installation: A few seconds? A few minutes? A few hours? A few minutes is typical with current hardware. There may be more than one buffer pool. If this is true, each buffer pool may (and should) have a different MUPA. Therefore, the program shown on the visual must be run several times; the first time with all tables allocated to the first buffer pool, the second time with all tables allocated to the second buffer pool, and so on. Some buffer pool tools give the MUPAs of the different buffer pools. If your installation has such a tool, there is, of course, no need to measure the MUPA as explained on this visual.
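One possible way to read the results of such a probe run is sketched below. It assumes that a probe table whose page was read from disk only once stayed in the buffer pool for the whole run, and it takes the longest such probe interval as the MUPA estimate. The function name and this interpretation of the SR row are assumptions, not part of the course material.

```python
from typing import Dict, Optional


def estimate_mupa(sync_reads_by_interval: Dict[int, int]) -> Optional[int]:
    """Longest probe interval (seconds) whose page was read from disk only once."""
    resident = [interval
                for interval, reads in sync_reads_by_interval.items()
                if reads <= 1]
    return max(resident) if resident else None


# SR values per probe table from the visual: T10 ... T3000
sync_reads = {10: 1, 30: 1, 100: 1, 300: 10, 1000: 4, 3000: 2}
mupa_seconds = estimate_mupa(sync_reads)
print(mupa_seconds)  # 100
```

The true MUPA lies somewhere between the last resident interval (100s) and the first interval that caused rereads (300s); more probe tables would narrow the range.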
Random Disk I/O
PROGRAM: 1,000,000 random touches
X1 (CUSTNO): 10,000 leaf pages
CUST: 2,000,000 rows, 50,000 pages
BUFFER POOLS

How many random disk I/Os?
- Assuming only nonleaf pages of X1 in buffer pool when program starts
Figure 6-6. Random Disk I/O
Notes:
This is the big question: how many random disk I/Os? A batch program issues one million SELECTs with random CUSTNOs. DB2 does a matching index scan one million times. With a big buffer pool (and long MUPA), every leaf page and table page is read once from disk: 60,000 random I/Os (10,000 for the leaf pages, 50,000 for the table pages), each taking perhaps 10ms, total I/O time 600s. In the worst case (small buffer pool, short MUPA), there are no buffer pool hits: each SELECT causes two random I/Os. 2,000,000 x 10ms = 20,000s.
(TR) = Buffer Pool Hit
Lower bound (long MUPA)

            INDEX                        TABLE
            TR          (TR)      TS     TR          (TR)      TS     LRT
X1, CUST    10,000      990,000   -      50,000      950,000   -      640s

Upper bound (short MUPA)

            INDEX                        TABLE
            TR          (TR)      TS     TR          (TR)      TS     LRT
X1, CUST    1,000,000   -         -      1,000,000   -         -      20,000s
Figure 6-7. (TR) = Buffer Pool Hit
Notes:
The total elapsed time of the batch job with one million SELECTs is between 640s and 20,000s, according to VQUBE.
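Both bounds can be reproduced with a small sketch. It assumes the VQUBE coefficients used in this unit: 10ms for a random touch that causes a disk read (TR) and 0.02ms for a buffer pool hit (TR). The function name is illustrative.

```python
TR_MS = 10.0     # random touch that causes a disk I/O
CHEAP_MS = 0.02  # buffer pool hit, written (TR) on the visuals


def vqube_seconds(tr: int, cheap: int) -> float:
    """Local response time in seconds for the given touch counts."""
    return (tr * TR_MS + cheap * CHEAP_MS) / 1000


# Lower bound: every leaf page and table page is read from disk exactly once
lower_s = vqube_seconds(tr=10_000 + 50_000, cheap=990_000 + 950_000)
# Upper bound: no buffer pool hits, two random I/Os per SELECT
upper_s = vqube_seconds(tr=1_000_000 + 1_000_000, cheap=0)

print(round(lower_s, -1))  # 640.0
print(upper_s)             # 20000.0
```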
Closer to Lower Bound or Upper Bound?

By page set (table space or index space):
- Page set size versus buffer pool size
- Time between references to same page versus MUPA

Example: 1,000,000 random touches
Average time between touches (one cycle) between 0.64ms (lower bound) and 20ms (upper bound)

X1 (CUSTNO): 10,000 leaf pages
INDEX BUFFER POOL: 200,000 pages (800MB), MUPA: 60 ... 600 seconds
Figure 6-8. Closer to Lower Bound or Upper Bound?
Notes:
If an estimate such as "between half an hour and six hours" is not adequate, you have to analyze the access pattern of each page set that has a high number of random touches. If the page set is larger than the corresponding buffer pool, there is no hope: the actual number of random disk I/Os may be close to the upper bound. Otherwise, the first step is to find the minimum and maximum average time between 2 touches to the same object. In our example, there are 1,000,000 touches to both X1 and CUST. As the local response time is between 640s (lower bound) and 20,000s (upper bound), the minimum average time between 2 touches to the same object is 640s / 1,000,000 = 0.64ms. The maximum average time between 2 touches to the same object is 20,000s / 1,000,000 = 20ms.
X1: TR = 10,000 or 1,000,000?
10,000 leaf pages fit in index buffer pool
Average time between touches to the same leaf page:
10,000 x (0.64ms ... 20ms) = 6.4s ... 200s
MUPA = 60s ... 600s
TR will be up to 1,000,000 - very sensitive to buffer pool load
Figure 6-9. X1: TR = 10,000 or 1,000,000?
Notes:
The MUPA (60s ... 600s) has been measured as explained on visual 6-5 for the index buffer pool. The average time between touches to the same leaf page is 10,000 times higher than the time needed for one cycle (0.64ms ... 20ms), as there are 10,000 leaf pages in index X1. The average time between touches to the same leaf page and the MUPA overlap. Therefore, depending on other programs running at the same time and using the same buffer pool, the number of index I/Os (and the local response time) varies widely from one run to another.
Table Even Worse
50,000 table pages do not fit in table space buffer pool
- Table space buffer pool 25,000 pages
Average time between touches to a table page:
50,000 x (0.64ms ... 20ms) = 32s ... 1000s
MUPA = 10s ... 60s
TR will be close to 1,000,000
Figure 6-10. Table Even Worse
Notes:
The average time between touches to the same table page is longer than the measured MUPA for the table space buffer pool. Each touch will lead to a disk I/O. A much larger table space buffer pool is needed to prevent multiple I/Os per table page.
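The reasoning applied to X1 and CUST can be summarized in a small sketch. It is only an illustration of the two checks described in this topic (page set size versus buffer pool size, then time between touches versus MUPA); the function name and the verdict strings are assumptions.

```python
def rereads_expected(pages, pool_pages, cycle_ms, mupa_s):
    """Return (min_s, max_s, verdict) for touches to the same page.

    cycle_ms and mupa_s are (low, high) ranges.
    """
    min_s = pages * cycle_ms[0] / 1000
    max_s = pages * cycle_ms[1] / 1000
    if pages > pool_pages or min_s > mupa_s[1]:
        verdict = "reread from disk on most touches"
    elif max_s < mupa_s[0]:
        verdict = "stays in the buffer pool"
    else:
        verdict = "borderline, depends on buffer pool load"
    return min_s, max_s, verdict


cycle_ms = (0.64, 20.0)  # one cycle: lower and upper bound
x1 = rereads_expected(10_000, 200_000, cycle_ms, (60, 600))
cust = rereads_expected(50_000, 25_000, cycle_ms, (10, 60))

print(x1[2])    # borderline, depends on buffer pool load
print(cust[2])  # reread from disk on most touches
```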
Reduce Random Disk I/O

1. Index-only
2. Consistent clustering
   - VQUBE: 2,000,000 x 0.02ms = 40s
   - X1 (CUSTNO, clustering) and CUST: 1,000,000 touches each, TR = 1
3. Sort before access
   - Add VQUBE for sort: NROWS x 0.002ms = 1,000,000 x 0.002ms = 2s
Bigger buffer pools (longer MUPA)
Denormalize

Figure 6-11. Reduce Random Disk I/O
Notes:
This is the most important visual in this unit. The number of random I/Os in a massive batch job is difficult to predict but easy to reduce. When processing is sequential, a batch job does not need a large buffer pool, no matter how large the tables and indexes. Roughly 100 pages per page set is enough for efficient sequential prefetch. Index-only is the most efficient solution if the random touches are only to the table(s). The estimated sort time (2s) is CPU time only. The I/O time can be longer, but probably not much longer if the sort workfile buffer pool is large.
Surprises Possible
Time per TR less than 10ms
- List prefetch
Many cheap touches although buffer pool small compared to page set
- Access not totally random
Surprising random touches with clustered index scan
- Table not reorganized frequently enough
Figure 6-12. Surprises Possible
Notes:
Even the optimizer has difficulties in predicting the number of random I/Os.
Complicated? Unpredictable?
Yes! Yes!
Therefore, avoid random touches to large indexes and tables in batch jobs.
Figure 6-13. Complicated? Unpredictable?
Notes:
The message is clear:
1. Minimize random touches in massive batch jobs by careful table design, careful index design, and careful program design.
2. Keep large tables and indexes well-organized.
CPU Queuing Time
CPU queuing time = A x CPU time

- More processors, faster processors, higher priority
- Reduce number of touches
- Reduce number of SQL calls
- Reduce number of locks

[Chart: breakdown of LOCAL RESPONSE TIME into SQL and NON-SQL, with the components SYNCHR. READ, WAIT FOR PREFETCH, LOCK WAIT, CPU TIME and OTHER (includes CPU queuing). Values on the chart: 0min, 70min, 3min, 225min, 220min, 5min, 7min, 140min.]
Figure 6-14. CPU Queuing Time
Notes:
If the accounting trace shows this breakdown for the elapsed time, CPU queuing is the biggest component. Given the number of processors and the system load, CPU queuing time is proportional to CPU time, so if CPU time is reduced by 50%, CPU queuing time also drops by 50%.
Reduce Number of Touches

Eliminate unnecessary work
over-generalized service modules SELECT application logic
Denormalize tables
Read small tables only once
Index-only
Figure 6-15. Reduce Number of Touches
Notes:
It is amazing how many programs do unnecessary work. A generalized service module may, for instance, access tables that are not needed at all by the requesting module. Before spending a lot of time changing application programs, it is wise to quantify the expected saving in CPU time. Two methods for estimating CPU time will be discussed in the next unit. Some changes save seconds; some save hours.
Parallelism
Parallelism (I/O and CPU) may radically reduce total elapsed time
- BIND ... DEGREE(ANY)
Often not automatic: must clone program manually

[Diagram: the work split into pieces P1, P2 and P3, processed in parallel]
Figure 6-16. Parallelism
Notes:
Parallelism is the final solution to massive batch, but it is seldom automatic. Normally the application must divide the work into roughly equal pieces according to the main table. Each program clone then processes its own piece and the related rows in other tables.
6.2 Lab 8: Improve Batch Performance
Lab 8: Batch Application Description

Batch program

CURSOR P: SELECT CUSTNO, CODENO, ... FROM POLICY
OPEN P
Repeated x10M:
  FETCH P
  SELECT ... FROM CUST WHERE CUSTNO=:CUSTNO
  SELECT ... FROM CODE WHERE CODENO=:CODENO

Current tables and indexes

Table    Rows         Pages       Indexes
POLICY   10,000,000   1,000,000   X1 (PNO), X2 (PDATE), X3 (CUSTNO), X4 (CODENO)
CUST     2,000,000    200,000     X5 (CUSTNO), P,C, 10,000 leaf pages
CODE     1000         20          X6 (CODENO), P,C, 4 leaf pages

Index markers on the visual for POLICY: X1 C X2 F X3 F X4

Access paths
- POLICY table: table scan, SORT=N
- Other tables: matching index scan, MC=1, SORT=N, INDEXONLY=N

Symptoms
- Program barely finishes in a weekend
- Many synchronous reads
- 30,000,000 SQL calls
- 50,000,000 touches

Buffer Pools

Buffer Pool     Size    MUPA
Appl. indexes   800MB   60-600 sec
Appl. tables    100MB   10-60 sec
Others          100MB   ?
Figure 6-17. Lab 8: Batch Application Description
Notes:
A fairly massive batch job: 30,000,000 SQL calls. Maybe a three-table join would have been a better idea, but let us try easier changes first. The program is running very slowly now; it barely finishes during a weekend. The biggest component is synchronous read. No estimate was done when the program was designed, but better late than never. Assume the following MUPAs:
1. Application indexes: 1 to 10 minutes
2. Application tables: 10 to 60 seconds
The singleton SELECTs have proper predicates (WHERE CUSTNO = :CUSTNO and WHERE CODENO = :CODENO), so the total number of touches is 50,000,000. The POLICY table is accessed with a table scan; the other tables with matching index scan (MC=1, INDEXONLY=N).
Lab 8: Theoretical Worst Case Estimate

Here is a VQUBE with the 'worst case' assumptions below
Upper Bound

            INDEX                TABLE
            TR     (TR)   TS     TR     (TR)   TS     LRT
POLICY                           1             10M    200s
X5, CUST    10M                  10M                  200,000s
X6, CODE    10M                  10M                  200,000s
                                                      400,200s = 111h

Assumptions
No buffer pool hits
- Buffer pools are very small
- Many concurrent programs
Each index and table GETPAGE results in page being read from disk
- Short MUPA
- No cheap random touches
Figure 6-18. Lab 8: Theoretical Worst Case Estimate
Notes:
The theoretical worst case is no buffer hits. This might happen if the buffer pools are very small and/or if there are many concurrent programs. No cheap random touches: Local response time = 40,000,001 x 10ms + 10,000,000 x 0.02ms = 400,200 s
Lab 8: Theoretical Best Case Estimate

Here is a VQUBE with the 'best case' assumptions below
Lower Bound

            INDEX                        TABLE
            TR       (TR)        TS      TR        (TR)        TS     LRT
POLICY                                   1                     10M    200s
X5, CUST    10,000   9,990,000           200,000   9,800,000          2500s
X6, CODE    4        9,999,996           20        9,999,980          400s
                                                                      3100s < 1h

Assumptions
Initially, each index and table page has to be read in from disk once
Thereafter, no reread from disk
- All pages read by batch job stay in buffer pools for the duration of the job
- Buffer pools are significantly larger than CUST and CODE and their indexes
- Long MUPA
- Many cheap random touches
Absolutely best case is where all pages are already resident in buffer pools
Figure 6-19. Lab 8: Theoretical Best Case Estimate
Notes:
The theoretical best case is no reread from disk: all pages read by our batch job stay in the buffer pools for the duration of the job. This requires buffer pools that are significantly larger than CUST and CODE and their indexes. Of course, the absolutely best case is one where all pages are resident in the buffer pools. Many cheap random touches. Local response time = 210,025 x 10ms + 39,789,976 x 0.02ms + 10,000,000 x 0.02ms = 3100s
Lab 8: Worst versus Best

[Diagram, shown once per estimate: POLICY -> X5 (CUSTNO) -> CUST -> X6 (CODENO) -> CODE]

Worst case estimate
- LRT for 10,000,000 iterations is 400,200s (111h approximately)
- Elapsed time for 1 cycle is 40ms

Best case estimate
- LRT for 10,000,000 iterations is 3100s (less than 1h)
- Elapsed time for 1 cycle is 0.3ms

Knowing the cycle time, we can find the average time between references to a specific page in each index and table
We can then compare this time to see if it is within the MUPA of the corresponding buffer pool
Figure 6-20. Lab 8: Worst versus Best
Notes:
So, local response time is between 3100s and 400,000s. The elapsed time for one cycle (= processing one POLICY row and the associated CUST row and CODE row) is between 0.3ms (3100s / 10,000,000) and 40ms (400,000s / 10,000,000). Knowing this, you can find the average time between references to a page in each index and table. Then, compare it against the MUPA of the corresponding buffer pool.
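The two bounds and the corresponding cycle times follow directly from the touch counts in the notes for the worst and best case. The sketch below simply repeats that arithmetic with the VQUBE coefficients used in this unit (10ms per TR, 0.02ms per (TR) or TS); the variable names are illustrative.

```python
TR_MS = 10.0     # random touch that causes a disk I/O
CHEAP_MS = 0.02  # buffer pool hit (TR) or sequential touch TS
ITERATIONS = 10_000_000  # one cycle per POLICY row

# Worst case: 40,000,001 TR plus 10,000,000 TS
worst_s = (40_000_001 * TR_MS + 10_000_000 * CHEAP_MS) / 1000
# Best case: 210,025 TR, 39,789,976 (TR) and 10,000,000 TS
best_s = (210_025 * TR_MS + (39_789_976 + 10_000_000) * CHEAP_MS) / 1000

worst_cycle_ms = worst_s * 1000 / ITERATIONS
best_cycle_ms = best_s * 1000 / ITERATIONS

print(round(worst_s), round(worst_s / 3600))  # 400200 111
print(round(best_s, -2))                      # 3100.0
print(round(worst_cycle_ms), round(best_cycle_ms, 1))  # 40 0.3
```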
Lab 8: Index X6 - A Closer Look

Each leaf page of X6 is touched once in every 4 cycles on average
MUPA of index buffer pool is 60-600s

[Diagram, shown once per estimate: POLICY -> X5 (CUSTNO) -> CUST -> X6 (CODENO, 4 leaf pages) -> CODE]

Worst case estimate (X6: TR=10M)
- Average time between references to a specific leaf page is 4 x 40ms = 160ms
- Within MUPA of index buffer pool
- Therefore, X6 leaf pages stay in buffer pool once read

Best case estimate (X6: TR=4, (TR)=9,999,996)
- Average time between references to a specific leaf page is 4 x 0.3ms = 1.2ms
- Within MUPA of index buffer pool
- Therefore, X6 leaf pages stay in buffer pool once read

The better estimate for X6 is TR=4, (TR)=9,999,996

Figure 6-21. Lab 8: Index X6 - A Closer Look
Notes:
Let us start with X6. Each of the four leaf pages will be touched once in four cycles, on average. The average time between references to a page is therefore four cycles = 4 x (0.3 to 40ms) = 1.2 to 160ms, much less than the MUPA of the index buffer pool (60 to 600s). Thus, the leaf pages of X6 stay in buffer pool once read (no surprise). TR = 4, (TR) = 9,999,996.
Lab 8: Refinements Of Worst And Best Estimates

We can now return to our earlier VQUBEs and make adjustments to take into account buffering for index X6 The lower bound estimate for index X6 already assumes buffering
Upper Bound (LRT column as printed on the visual)

            INDEX                             TABLE
            TR             (TR)        TS     TR     (TR)   TS     LRT
POLICY                                        1             10M    200s
X5, CUST    10M                               10M                  200,000s
X6, CODE    4 (was 10M)    9,999,996          10M                  200,000s
                                                                   400,200s = 111h

Lower Bound

            INDEX                        TABLE
            TR       (TR)        TS      TR        (TR)        TS     LRT
POLICY                                   1                     10M    200s
X5, CUST    10,000   9,990,000           200,000   9,800,000          2500s
X6, CODE    4        9,999,996           20        9,999,980          400s
                                                                      3100s < 1h
Figure 6-22. Lab 8: Refinements Of Worst And Best Estimates
Lab 8: Instructions
- In a similar way, analyze CODE, X5, CUST and POLICY
- Decide where buffering will occur, won't occur, will be borderline or won't be important
- Adjust the upper / lower bound TR and (TR) estimates and local response times accordingly
- What design changes to the implementation could reduce the TRs and (TR)s?
Lab 8: Worksheet
Upper Bound (short MUPA)

            INDEX                TABLE
            TR     (TR)   TS     TR     (TR)   TS     LRT
POLICY
X5, CUST
X6, CODE

Lower Bound (long MUPA)

            INDEX                TABLE
            TR     (TR)   TS     TR     (TR)   TS     LRT
POLICY
X5, CUST
X6, CODE
Figure 6-24. Lab 8: Worksheet
6.3 Massive Delete

After completing this topic, you should be able to: Consider the implications of batch applications that must delete massive numbers of rows
Massive Delete
DELETE FROM ORDER WHERE ORDERDATE < :HV

- One million old rows have to go.
- ORDERDATE (key of X1) ever-increasing, so old rows at beginning of table.
- How long does it take?
- Can you make it faster?

ORDER: 100,000,000 rows, 2,000,000 pages; the delete removes 1% of the rows
Indexes: X1 (ORDERDATE, marked C) ... X5 (CUSTNO)
Assume each index has 500,000 leaf pages

Buffer pools:
- Application indexes 200,000 pages
- Application tables 25,000 pages

Figure 6-25. Massive Delete
Unit Summary
Key points:
- Minimize TR
- Minimize (TR)
- Minimize TS
- Parallelize
Unit 7. Worried about CPU Time?

This unit is about predicting CPU time.

After completing this unit, you should be able to: Predict CPU time with a rough formula
Unit Objectives
After completing this unit, you should be able to: Predict CPU time with a rough formula
Rough CPU Time Estimate (z990)

Worksheet rows, each a count multiplied by a coefficient in us:
- SQL calls
- GETPAGEs
- Pages read (rand)
- Pages read (seq)
- Lock requests
- Row processing
- Rows sorted
Total
us = microsecond (z990)
Figure 7-2. Rough CPU Time Estimate (z990)
Notes:
The first step towards a CPU time estimate is VQUBE: The SQL-related CPU time is likely to be less than 0.02ms per touch (z990). The next level, much more accurate, is this worksheet.
GETPAGEs include nonleaf pages. A matching index scan, index-only, with a three-level index requires three GETPAGEs to retrieve the first index row.
Lock request means LOCK and UNLOCK. Scanning a table with 10,000 pages requires 10,000 lock requests with page locking if lock avoidance always fails.
Row processing could be applying residual predicates (nonmatching predicates), evaluating built-in scalar functions, and so on.
The suggested coefficients assume no data sharing. If you need CPU time estimates of maximum accuracy, use EXPLAIN. It takes into account the number and type of predicates, for instance.
This worksheet (and the alternatives) estimates only the CPU time for processing the SQL call in DB2. The cost of sending an SQL call from CICS to DB2 is not included. This overhead should be measured and added to the worksheet.
Lab 8 Base Case POLICY
Worksheet rows (count in M x coefficient in us = CPU time in s): SQL calls, GETPAGEs, Pages read (rand), Pages read (seq), Lock requests, Row processing, Rows sorted, Total
us = microsecond (z990)
Figure 7-3. Lab 8 Base Case POLICY
Notes:
The CPU time for sequential processing is fairly low and predictable. Page locking is assumed. With row locking, the lock-related CPU time would be 10M x 2 us = 20s if lock avoidance always fails. With uncommitted read (UR) the number of lock requests is zero.
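The lock-related part of this estimate can be sketched as follows. The coefficient of 2 us per lock request is taken from the row-locking figure above (10M x 2 us = 20s); the page-locking case assumes one lock request per POLICY page (1,000,000 pages in Lab 8) and that lock avoidance always fails. The function name is illustrative.

```python
LOCK_REQUEST_US = 2  # CPU microseconds per lock request (LOCK and UNLOCK)


def lock_cpu_seconds(lock_requests: int) -> float:
    """CPU time in seconds spent on the given number of lock requests."""
    return lock_requests * LOCK_REQUEST_US / 1_000_000


row_locking_s = lock_cpu_seconds(10_000_000)  # one request per POLICY row
page_locking_s = lock_cpu_seconds(1_000_000)  # one request per POLICY page
uncommitted_read_s = lock_cpu_seconds(0)      # UR: no lock requests

print(row_locking_s, page_locking_s, uncommitted_read_s)  # 20.0 2.0 0.0
```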
Lab 8 Base Case CUST
Worksheet rows (count in M x coefficient in us = CPU time in s): SQL calls, GETPAGEs, Pages read (rand), Pages read (seq), Lock requests, Row processing, Rows sorted, Total
Pages read (rand) and Total are given as "up to" values
us = microsecond (z990)
Figure 7-4. Lab 8 Base Case CUST
Notes:
Random processing is more expensive and unpredictable. The buffer pool hit ratio plays an important role. X5 is assumed to be a 3-level index. Each access needs 3 GETPAGEs.
Lab 8 Base Case CODE
Worksheet rows (count in M x coefficient in us = CPU time in s): SQL calls, GETPAGEs, Pages read (rand), Pages read (seq), Lock requests, Row processing, Rows sorted, Total
us = microsecond (z990)
Figure 7-5. Lab 8 Base Case CODE
Notes:
This estimate illustrates the CPU cost of a small table which stays in the buffer pool. X6 is a 2-level index. (1-level indexes no longer exist.)
Lab 8 Base Case Summary
Worksheet CPU time in s for POLICY, CUST (an "up to" value), CODE and TOTAL (an "up to" value)
VQUBE CPU time: number of touches (M) x ms per touch, in s
Figure 7-6. Lab 8 Base Case Summary
Notes:
As this example shows, VQUBE overestimates the CPU time when processing is sequential, and when leaf or table pages stay in the buffer pool. Note that this worksheet estimate is very sensitive to the number of random I/Os.
Unit Summary
Key points:
- Number of SQL calls
- Number of GETPAGEs
- Number of I/Os
- Number of lock requests
Unit 8. Avoiding Locking Problems

This unit is about avoiding two kinds of locking problems: long lock waits and wrong results.

After completing this unit, you should be able to: Avoid lock durations that are too long or locks that are too strong Prevent wrong results caused by lock durations that are too short or locks that are too weak

Accountability: Case studies
Unit Objectives
After completing this unit, you should be able to:
- Avoid lock durations that are too long or locks that are too strong
- Prevent wrong results caused by lock durations that are too short or locks that are too weak
Three Strategies

[Diagram: three timelines of a two-screen conversation (INPUT, RESP, INPUT, RESP), showing exclusive locks (X) and commit points (C). The strategies are labeled Dangerous, Default and Sometimes needed.]
Figure 8-2. Three Strategies
Notes:
X = exclusive lock
C = commit point
An exclusive lock is taken when a page or row is modified. It is released at commit point. When a page or row is X locked, other programs are not allowed to modify it, or even read it, unless they are willing to see uncommitted data (SELECT WITH UR).
A commit point marks the end of a unit of recovery. If a program is unable to terminate normally, DB2 backs out the modifications the program has made since its last commit point.
The visual shows three ways to implement a two-screen update. The two screens are related. The user does not want any partial updates in the database: all or nothing.
The first strategy is convenient for the programmer, because rows updated in the first transaction stay locked until the end of the conversation. It is dangerous, however, to include user think time in lock duration. This approach is recommended only for personal applications (one user).
The second strategy is the normal one. It is the default in IMS and the standard in CICS (pseudo-conversational). Lock durations are short if the response times are short. The application must handle the possibility that data is updated by another user between transactions, as well as the backout of the first-screen updates if the second transaction fails.
If the local response time of a transaction may exceed five seconds, intermediate commits (third strategy) should be considered, in order to keep lock durations below five seconds. Intermediate commit points are created with EXEC CICS SYNCPOINT in CICS and with program-to-program switch in IMS. The application must handle incomplete updates at screen level. DB2 backs out only the updates since the last commit point.
Three Questions

1. Are long lock waits (> 5s) possible if all transactions are fairly fast (CPU + I/O < 5s)?
2. Are long lock waits possible if all pages are equally popular?
3. Can read-only transactions cause long lock waits?
   SELECT FOR UPDATE
Figure 8-3. Three Questions
Notes:
1. Possible but quite unlikely. If all transactions finish in less than five seconds and if they create a commit point when they write a response to the user (extremely important), they cannot hold any lock for more than five seconds. It is possible, however, that three fairly slow transactions (CPU time plus I/O time four seconds) are entered almost simultaneously. If they all need to update the same page early in the program, the first one will lock the page for four seconds. The second one will take eight seconds (lock wait 4s) and the third one will take twelve seconds (lock wait 8s). This scenario is, of course, unlikely. Therefore, the first step towards preventing long lock waits is good access paths: estimated local response time less than five seconds even with the worst input.
2. Possible but extremely unlikely. A typical database has millions of table pages. If access to the pages is totally random, the likelihood that two concurrent transactions will need the same page is very small. The second step towards preventing long lock waits is to avoid hot pages. Pages in a small active table will, of course, be accessed more often than the average page. Row locking is a good option for these tables. If a single row is hot, you may need to change something fundamental in your application.
3. Yes. Without uncommitted read, a SELECT or a FETCH may take an S lock which stops updaters. When the cursor does not have FOR UPDATE (and a read-only transaction should not), the S lock is unlikely but still possible. It is important, therefore, that read-only programs (transactions and batch) also respect the five-second limit. A commit point should be created when five seconds have elapsed if other mechanisms do not release the locks quickly enough.
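The queue of three slow transactions described in answer 1 can be traced with a minimal sketch. It assumes the transactions arrive at the same moment, need the same page at the start of their work, and are served in arrival order; the function name is hypothetical.

```python
def serialized_lock_waits(work_seconds):
    """Return (lock_wait, response_time) per transaction, in arrival order."""
    results = []
    clock = 0.0
    for work in work_seconds:
        lock_wait = clock  # wait until the previous holder commits
        clock += work      # then hold the lock for our own CPU + I/O time
        results.append((lock_wait, clock))
    return results


print(serialized_lock_waits([4, 4, 4]))
# [(0.0, 4.0), (4.0, 8.0), (8.0, 12.0)]
```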
Three Serious Recommendations

1. NO LONG LOCKS: No commit interval > 5s
2. NO HOT OBJECTS: No table, page or row locked more than ...% of the time (peak hour)
3. INTEGRITY: Always UPDATE / DELETE with WHERE CURRENT OF
Figure 8-4. Three Serious Recommendations
Notes:
If you follow these recommendations, you can sleep well; at least you should not have locking nightmares.
Assumptions

BIND ... CURRENTDATA(NO)
- Enables lock avoidance
No workfile/temporary table
- No sort
- No view materialization
- No table expression materialization
- No scrollable cursor
- No merge scan join
Figure 8-5. Assumptions
Notes:
Lock avoidance (discussed later) should always be enabled. It is also assumed that indexes are designed to avoid sorts whenever possible.
With Those Assumptions...Lock

SQL call                  Lock requested
SELECT                    S if lock avoidance fails and isolation level not UR
FETCH, no FOR UPDATE      S if lock avoidance fails and isolation level not UR
FETCH, FOR UPDATE         U
INSERT, UPDATE, DELETE    X
Figure 8-6. With Those Assumptions...Lock
Notes:
A program requests a lock when one of these SQL calls is executed.
Lock Avoidance

DB2 knows the row is committed without acquiring an S lock

[Diagram: TABLE PAGE X with the timestamp of last update in the page header]

If timestamp of last update is older than start time of any open unit of recovery updating the table space containing page X, all rows on the page are committed

- Applies only to S locks
- Applies only to isolation level CS
- Needs BIND ... CURRENTDATA(NO)
Figure 8-7. Lock Avoidance
Notes:
Lock avoidance is used for read-only cursors (no FOR UPDATE) defined with isolation level CS if the plan or package containing the cursor is bound with CURRENTDATA(NO). Lock avoidance applies also to singleton selects (with isolation level CS and CURRENTDATA(NO)). Lock avoidance does not mean 100% lock avoidance. If the timestamp of the last update from the page header is not older than the start time of all units of recovery updating the table space containing the page, lock avoidance fails and DB2 asks for an S lock as it would have done without lock avoidance. Under normal conditions, lock avoidance fails for less than 1% of the requests on average.
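The timestamp test can be expressed as a short sketch. It models only the rule stated in these notes (page timestamp compared with the start times of the open units of recovery that update the table space) and is not a description of DB2 internals; the function and variable names are hypothetical.

```python
from datetime import datetime, timedelta


def lock_avoidance_succeeds(page_last_update, open_ur_start_times):
    """True if the page was last updated before every open unit of recovery
    updating the table space started, so all rows on it are committed."""
    return all(page_last_update < start for start in open_ur_start_times)


page_stamp = datetime(2005, 6, 1, 12, 0, 0)
older_ur = page_stamp - timedelta(seconds=5)  # started before the page update
newer_ur = page_stamp + timedelta(seconds=5)  # started after the page update

print(lock_avoidance_succeeds(page_stamp, [newer_ur]))            # True
print(lock_avoidance_succeeds(page_stamp, [older_ur, newer_ur]))  # False
```

In the second call the page may contain uncommitted changes made by the older unit of recovery, so an S lock would be requested.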
Three Levels

S = SHARE, U = UPDATE, X = EXCLUSIVE

              Holding
Requesting    S       U       X
S             OK      OK      WAIT
U             OK      WAIT    WAIT
X             WAIT    WAIT    WAIT

OK = lock granted
WAIT = requester suspended until holder releases lock
Figure 8-8. Three Levels
Notes:
A read-only program (without FOR UPDATE) takes only S locks, if any. A program requesting an S lock has to wait when the object is X locked because reading uncommitted data is not acceptable except with ISOLATION UR. It is more difficult to understand why a program requesting an X lock has to wait when the object is S locked. Actually, this is not necessary if all programmers respect serious recommendation number 3. This is why DB2 now has an option to avoid the S lock in most cases (CURRENTDATA(NO) enables lock avoidance).
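The compatibility rules for the three lock levels fit in a small lookup, sketched below. Only the standard S, U and X rules are modeled (S is compatible with S and U, U only with S, X with nothing); the names are illustrative.

```python
from typing import List

# Pairs (requested, held) that are compatible; every other pair must wait.
COMPATIBLE = {("S", "S"), ("S", "U"), ("U", "S")}


def lock_request(requested: str, held: List[str]) -> str:
    """Return "OK" if the lock can be granted now, otherwise "WAIT"."""
    if all((requested, h) in COMPATIBLE for h in held):
        return "OK"
    return "WAIT"


print(lock_request("S", ["S", "U"]))  # OK
print(lock_request("U", ["U"]))       # WAIT
print(lock_request("X", ["S"]))       # WAIT
```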
Unlock

S and U locks are released at:
- Commit point
- SINGLETON SELECT: Done, next instruction (isolation level CS only)
- CURSOR: End of result (SQLCODE 100), CLOSE CURSOR
- CURSOR: DB2 moves to another page or row (isolation level CS only)

X locks are released at:
- Commit point
Figure 8-9. Unlock
Notes:
A commit point releases all locks, but S and U locks are often released before commit point. This is why a long-running read-only program may not need any intermediate commit points.
What Is the Problem?

Lock duration too short: WRONG RESULTS
or
Lock duration too long: UNNECESSARY WAITING
Figure 8-10. What Is the Problem?
CF963.2
Notes:
Do I really need to know all these details? Is DB2 locking not automatic? With X locks, DB2 automatically prevents lost updates, but the application developer affects the duration and level of locks in many ways.
Example (Page Locking)

[Visual: lock timeline for the statement sequence OPEN ITEM, FETCH ITEM, FETCH ITEM, CLOSE ITEM, COMMIT, showing S locks on table pages TPx and TPy.]

Assumptions: isolation level CS; no FOR UPDATE in cursor; no index page locks.
(S) = S lock taken if lock avoidance fails.

Figure 8-11. Example (Page Locking)
Notes:
This is how pages are locked and unlocked by a read-only program with the assumed options. The S locks can be too short or too long. If the table has row locking, the diagram is the same, but the locked objects are rows (TRx and TRy instead of TPx and TPy). If a workfile or temporary table is created, the sequence of events is the same, but all locking and unlocking takes place at OPEN ITEM. When OPEN ITEM is completed, the program fetches from its workfile or temporary table and the permanent table is not locked by this program; other programs are free to update pages TPx and TPy.
Example...Wrong Results

Normally rows are committed
If isolation level CS:
  No S lock taken in most cases, due to lock avoidance
  Fetched row not locked after FETCH
  May be updated or deleted by another program
  Dangerous if the program which fetched the row updates with WHERE Primary_Key = ..., assuming the row is not changed since FETCH
Solution: Serious recommendation No. 3

Figure 8-12. Example...Wrong Results
Notes:
These surprises are avoided if programs inform DB2 about updating intent with FOR UPDATE.
Serious Recommendation No. 3 Ignored

Transaction A                   Transaction B
FETCH ITEM                      FETCH ITEM
UPDATE ITEM                     UPDATE ITEM
  SET BALANCE = :BAL              SET BALANCE = :BAL
  WHERE ITEMNO = :ITEMNO          WHERE ITEMNO = :ITEMNO

Figure 8-13. Serious Recommendation No.3 Ignored
Notes:
When lock avoidance is successful, a row is not locked between FETCH and UPDATE. This can lead to logical errors. However, turning off lock avoidance (CURRENTDATA(YES)) does not fix the problem because several programs may hold an S lock on the same object. The FETCH must take a U lock.
Example: Unnecessary Waiting

When lock avoidance fails:
  S lock held at least until next FETCH against same CURSOR
  Elapsed time can be very long (other SQL calls)
  Requester of X lock must wait
Many solutions:
  Program design, index tuning, row locking, WITH UR

Figure 8-14. Example: Unnecessary Waiting
Notes:
Long S locks with ISOLATION CS cause unnecessary waiting. According to serious recommendation number 1, no lock should stay alive for more than five seconds. Of course, we would like lock durations to be significantly shorter. Anything (like one million FETCHes to another cursor) may happen between the two FETCH ITEMs in our example. If the time between two FETCHes cannot be reduced to an acceptable level, the cursor should be closed immediately after FETCH.
Another Problem

Lock too weak: WRONG RESULTS
or
Lock too strong: UNNECESSARY WAITING

Figure 8-15. Another Problem
Notes:
Because the level of the lock (S/U/X) is determined by the SQL call, the programmer can influence it.
Lock Too Weak (and Too Short)

Tables: NEXTORDERNO; ORDER with index P,U on ORDERNO.

PROGRAM A, run by USER 1 and by USER 2:
  SELECT ORDERNO INTO :HVORDNO FROM NEXTORDERNO
  UPDATE NEXTORDERNO SET ORDERNO = ORDERNO + 1
  INSERT INTO ORDER VALUES (:HVORDNO, ...)
  COMMIT

Figure 8-16. Lock Too Weak (and Too Short)
Notes:
The users are confused. Sometimes the order they are trying to enter is not accepted by the system. If they try again, everything works normally.
The Problem

[Visual: lock timeline for USER 1 and USER 2, each running SELECT ORDERNO, UPDATE NEXTORDERNO, INSERT ORDER, COMMIT. Both SELECTs complete before the first UPDATE, the second UPDATE is suspended behind the X lock of the first, and the second INSERT ORDER fails because of duplicate keys.]

Legend: NEX = only page/row of table NEXTORDERNO; HV = content of host variable :HVORDNO; SUS = suspension.

Figure 8-17. The Problem
Notes:
As table NEXTORDERNO has only one row (and therefore one page), the locking diagram is the same for lock size ROW or PAGE. The locking diagram reveals the problem: the lock taken by SELECT ORDERNO INTO :HVORDNO FROM NEXTORDERNO is too weak and too short. User 2 may read ORDERNO before user 1 has incremented it. The duplicate value is detected when user 2 tries to insert the row into ORDER. The quick fix was to make the program retry a few times from SELECT ORDERNO if the INSERT is not successful.
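The interleaving described above can be replayed with a small deterministic simulation. This is only a sketch: it models the single NEXTORDERNO row as a Python integer, ignores locking entirely, and uses an arbitrary starting value of 100; the function name run and the step names are illustrative.

```python
def run(schedule):
    """Replay a fixed interleaving and return the ORDERNO each user will insert.

    schedule is a list of (user, step) pairs; step is 'SELECT' or 'UPDATE'.
    No locking is modelled, which mirrors a SELECT whose lock is too weak
    and too short to keep the other user away from the row.
    """
    next_orderno = 100   # the single row of NEXTORDERNO (arbitrary start value)
    host_var = {}        # content of :HVORDNO per user
    for user, step in schedule:
        if step == "SELECT":
            host_var[user] = next_orderno
        elif step == "UPDATE":
            next_orderno += 1
    return host_var


# Unlucky timing: user 2 reads ORDERNO before user 1 has incremented it.
bad = run([(1, "SELECT"), (2, "SELECT"), (1, "UPDATE"), (2, "UPDATE")])
print(bad)  # {1: 100, 2: 100} -> the second INSERT hits a duplicate key

# Serialized access, which is what a lock held from the first access
# until COMMIT would enforce.
good = run([(1, "SELECT"), (1, "UPDATE"), (2, "SELECT"), (2, "UPDATE")])
print(good)  # {1: 100, 2: 101}
```

The retry loop mentioned as the quick fix only hides the first schedule; it does not prevent it.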
Solution 1

USER 1 and USER 2:
  DECLARE C CURSOR FOR
    SELECT ORDERNO FROM NEXTORDERNO FOR UPDATE
  OPEN C
  FETCH C INTO :HVORDNO
  UPDATE NEXTORDERNO ... WHERE CURRENT OF C
  INSERT ORDER
  CLOSE C
  COMMIT

[Visual: lock timeline. The FETCH of USER 1 takes a U lock on NEXTORDERNO, which becomes an X lock at UPDATE; the FETCH of USER 2 is suspended until USER 1 releases the lock.]

Legend: NEX = only page/row of table NEXTORDERNO; HV = content of host variable :HVORDNO; SUS = suspension.

Figure 8-18. Solution 1
Notes:
The author of the confusing program did not follow serious recommendation number 3. FOR UPDATE (which requires a cursor) prevents duplicate ORDERNO values. Now the lock on ORDERNO is long enough and strong enough.
Solution 2

USER 1 and USER 2:
  UPDATE NEXTORDERNO
  SELECT ORDERNO INTO :HVORDNO
  INSERT ORDER
  COMMIT

[Visual: lock timeline. The UPDATE of USER 1 takes an X lock on NEXTORDERNO; the UPDATE of USER 2 is suspended until USER 1 releases the lock and then takes its own X lock.]

Legend: NEX = only page/row of table NEXTORDERNO; HV = content of host variable :HVORDNO; SUS = suspension.

Figure 8-19. Solution 2
Notes:
This solution is more convenient: no cursor.
Lock Wait Too Long?

[Visual: tables ORDER and NEXTORDERNO with their X locks. The visual states how long INSERT ORDER takes (in TR and ms), how long NEXTORDERNO is therefore X locked, the assumed number of insert transactions per second, the share of the time NEXTORDERNO is locked, and the resulting average lock wait.]

Figure 8-20. Lock Wait Too Long?
Notes:
In both cases the only row in NEXTORDERNO is locked for a fairly long time. It may become a hot row if the insert rate is high. With the assumptions on the visual, the row (or the page) is locked 50% of the time. According to queuing theory, the average lock wait will be 100ms. The formula, assuming exponential distributions for service time and interarrival time, is Q = u/(1-u) x S, where Q is average queuing time, u is utilization, and S is average service time. It would be impossible to do more than ten insert transactions per second with the current design. When the transaction rate approaches 10 tr/s, u approaches 1. Then, every insert transaction has a very long local response time. The lock on NEXTORDERNO must be made shorter, or the primary key of ORDER must be changed (several key sets or timestamp; not easy to do when the application is already implemented).
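The formula quoted above is easy to evaluate for different insert rates. The sketch below assumes a lock duration of 100 ms and a rate of 5 inserts per second; both figures are inferred from the 50% utilization, the 100 ms average wait, and the ceiling of ten transactions per second given in the notes, and the function name is illustrative.

```python
def average_lock_wait_ms(arrivals_per_second, lock_duration_ms):
    """Q = u / (1 - u) x S, the formula quoted in the notes.

    u is the utilization of the row (fraction of the time it is locked),
    S the average lock duration (service time) in milliseconds.
    """
    u = arrivals_per_second * lock_duration_ms / 1000.0
    if u >= 1.0:
        raise ValueError("utilization >= 1: the queue grows without bound")
    return u / (1.0 - u) * lock_duration_ms


# 5 inserts per second, row X locked for 100 ms each: u = 0.5
print(average_lock_wait_ms(5, 100))  # 100.0

# The wait grows sharply as the rate approaches 10 inserts per second.
for rate in (8, 9, 9.9):
    print(rate, round(average_lock_wait_ms(rate, 100)))
```

Shortening the lock duration helps twice: it lowers the service time S and, at the same arrival rate, the utilization u.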
Shorter X Lock Duration: Intermediate Commit

  OPEN C
  FETCH C
  UPDATE C
  COMMIT
  INSERT ORDER
  COMMIT

Lock duration on NEXTORDERNO is now roughly two cheap random touches plus a commit point.
[Visual also recomputes the share of the time NEXTORDERNO is locked and the average lock wait for this shorter duration.]
Holes in ORDERNO sequence possible.

Figure 8-21. Shorter X Lock Duration: Intermediate Commit
Notes:
It is easy to add an intermediate commit point to the program. The overhead is not significant: less than 1ms of CPU time, less than 10ms of synchronous log write time. However, if the insert program abends between the two commit points, the insert to ORDER is backed out but the update of NEXTORDERNO is not. If holes in ORDERNO are accepted, defining the ORDERNO column AS IDENTITY is a more efficient solution. No OPEN, FETCH, CLOSE, COMMIT, no NEXTORDERNO table, no serialization.
Shorter X Lock Duration: Manual Prefetch

  SELECTs WITH UR    (read pages to be updated to buffer pool before taking locks)
  OPEN C
  FETCH C
  UPDATE C
  INSERT ORDER
  COMMIT

Lock duration on NEXTORDERNO is roughly a number of cheap random touches plus a commit point.
[Visual also recomputes the share of the time NEXTORDERNO is locked and the average lock wait for this duration.]
No holes in ORDERNO sequence.

Figure 8-22. Shorter X Lock Duration: Manual Prefetch
Notes:
The idea of this solution is to remove all I/Os from the duration of the X lock of the hot row. This can be done with redundant SELECTs (WITH UR) which bring all the pages that will be updated to the buffer pool before taking any X locks.
Example - Unnecessary Waiting

  DECLARE ITEM CURSOR FOR
    SELECT ITEMNO FROM ITEM
    ORDER BY ITEMNO
  DECLARE ORDERITEM CURSOR FOR
    SELECT ORDERNO FROM ORDERITEM
    WHERE ITEMNO = :HV
    ORDER BY ORDERNO

  OPEN ITEM
  FETCH ITEM
  OPEN ORDERITEM
  FETCH ORDERITEM
  CLOSE ORDERITEM
  CLOSE ITEM

Table ITEM: index P,C on ITEMNO.
Table ORDERITEM: index U,C on ITEMNO, ORDERNO.
[Visual also gives the number of rows and pages of both tables, the elapsed time of the program, the number of order items that relate to the bestselling item, and the requirement for the maximum lock duration.]

Figure 8-23. Example - Unnecessary Waiting
Notes:
A simple read-only program like this may cause long lock waits, and even timeouts, for updating programs. The programmer did not respect serious recommendation number 1: no commit interval > 5s. Adding intermediate commit points solves the problem and is easy to implement, but there are many other ways to reduce the duration of S locks.
Unnecessary Waiting - Base Case

Program     ISOLATION   LOCKSIZE   Materialization   CURRENTDATA
Original    CS          PAGE       No                NO

[Visual: lock timeline for OPEN ITEM followed by successive FETCH ITEM calls until no row is left (NONE), showing the avoidable S locks on the two table pages.]

TP1 = first page of table ITEM; TP2 = second page of table ITEM; (S) = avoidable S lock.

Figure 8-24. Unnecessary Waiting - Base Case
Notes:
If lock avoidance fails, the page is locked until the cursor position moves to the next page or a commit point is created. Lock duration can be several minutes. Transactions wanting to update the ITEM table would experience long lock waits and timeouts.
Unnecessary Waiting - Solution 1

Solution 1: Change the program
  Close the ITEM cursor after each FETCH; reopen after all order items to the item have been fetched
  Add WHERE ITEMNO > :HV and OPTIMIZE FOR 1 ROW to the ITEM cursor

  DECLARE ITEM CURSOR FOR
    SELECT ITEMNO FROM ITEM
    WHERE ITEMNO > :HV
    ORDER BY ITEMNO
    OPTIMIZE FOR 1 ROW
  DECLARE ORDERITEM CURSOR FOR
    SELECT ORDERNO FROM ORDERITEM
    WHERE ITEMNO = :HV
    ORDER BY ORDERNO

  do until eof
    OPEN ITEM
    FETCH ITEM
    CLOSE ITEM
    OPEN ORDERITEM
    do until eof
      FETCH ORDERITEM
    end
    CLOSE ORDERITEM
  end

Program      ISOLATION   LOCKSIZE   Materialization   CURRENTDATA
Solution 1   CS          PAGE       No                NO

[Visual: lock timeline of repeated OPEN ITEM, FETCH ITEM, CLOSE ITEM groups; a page lock on ITEM is held only until CLOSE ITEM.]

Figure 8-25. Unnecessary Waiting - Solution 1
Notes:
Closing the ITEM cursor immediately after each FETCH makes the S lock very short (worst case: 2 TR, 20ms).
Program      ISOLATION   LOCKSIZE   Materialization   CURRENTDATA
Solution 2   CS          PAGE       Yes               NO

  DECLARE ITEM INSENSITIVE SCROLL CURSOR FOR
    SELECT ITEMNO FROM ITEM
    ORDER BY ITEMNO

[Visual: lock timeline for OPEN ITEM followed by successive FETCH ITEM calls until no row is left (NONE); the avoidable S locks on the two table pages are shown against OPEN ITEM.]
Notes:
The most elegant way to force a materialization at OPEN CURSOR is to define the cursor as scrollable.
Program     ISOLATION   LOCKSIZE   Materialization   CURRENTDATA
Original    CS          ROW        No                NO

[Visual: lock timeline for OPEN ITEM followed by successive FETCH ITEM calls until no row is left (NONE), with LOCKSIZE ROW.]
Notes:
Row locking reduces lock durations significantly, but in the worst case when lock avoidance fails for the most popular row that row is S locked for 6 seconds.
Program      ISOLATION   LOCKSIZE   Materialization   CURRENTDATA
Solution 4   UR          PAGE       No                NO

  DECLARE ITEM CURSOR FOR
    SELECT ITEMNO FROM ITEM
    ORDER BY ITEMNO
    WITH UR

[Visual: lock timeline for OPEN ITEM followed by successive FETCH ITEM calls until no row is left (NONE), with isolation level UR.]
Notes:
Uncommitted read eliminates all lock waits. SELECT WITH UR does not cause any lock waits, and also it does not wait if a page is X locked.
Unnecessary Waiting - Summary

                                  Reader may cause a      Updater may cause a
                                  long wait to updater    long wait to reader
Original                          YES (up to 60 s)        YES
Solution 1: CLOSE CURSOR          NO                      YES
Solution 2: Materialization       NO                      YES
Solution 3: Row-level locking     YES (up to 6 s)         YES
Solution 4: Isolation level UR    NO                      NO

Figure 8-29. Unnecessary Waiting - Summary
Notes:
The 60 seconds lock duration in the original program is the worst-case assumption. If almost all ORDERITEMs relate to ITEMs in one of the two ITEM table pages, that page would be locked for almost the whole duration of the program.
Unnecessary Waiting - Summary...

Program      ISOLATION   LOCKSIZE   Materialization   CURRENTDATA
Original     CS          PAGE       No                NO
Solution 2   CS          PAGE       Yes               NO
Original     CS          ROW        No                NO
Solution 4   UR          PAGE       No                NO

[Visual: the lock timelines of these four variants side by side, for OPEN ITEM and the following FETCH ITEM calls.]

Figure 8-30. Unnecessary Waiting - Summary...
Who is Afraid of WITH UR?

Reader does not cause lock waits to updater
Updater does not cause lock waits to reader
Less overhead
Result may be logically inconsistent (as with CS)
  A moving row may be counted twice or not at all
Result may contain data that is later rolled back

Figure 8-31. Who is Afraid of WITH UR?
Notes:
WITH UR seems safe in many report programs and queries. What are the risks in the previous example, for instance? The most obvious risk is seeing data that is later rolled back. A program which updates the ITEM table could do something totally crazy (because of a bug), detect it in a reasonability check after an updating call, and issue a ROLLBACK. A normal SELECT would never see the corrupted data, but a SELECT WITH UR could.
Many Pages Locked Too Long

  DELETE FROM ORDER WHERE ORDERDATE < :HV

Table ORDER: index P on ORDERNO; index U,C on ORDERDATE, ORDERNO.
Table ORDERITEM (deleted by CASCADE): index on ORDERNO, ITEMNO.
[Visual also gives the number of rows and pages of both tables, the number of ORDER rows deleted, the elapsed time in minutes, and the requirement for the maximum lock duration.]

Figure 8-32. Many Pages Locked Too Long
Notes:
This is a very convenient way to delete old rows. The number of IRLM entries would not be a big problem with page locking in this case (the old rows are next to each other in the first table pages), but if a DELETE takes 15 minutes, some table pages are X locked for 15 minutes. Serious recommendation number 1 (no commit interval > 5s) is not respected. The 15 minutes elapsed time assumes there are several indexes on both tables (not shown on the visual). Let us assume that it takes four seconds to delete the biggest order and all its dependants.
Solution

Solution: Add cursor

  DECLARE ORDER CURSOR WITH HOLD FOR
    SELECT ... FROM ORDER
    WHERE ORDERDATE < :HV
    FOR UPDATE
  OPEN ORDER
  do until eof
    FETCH ORDER
    DELETE FROM ORDER WHERE CURRENT OF ORDER
    COMMIT
  end
  CLOSE ORDER

[Visual: lock timeline for OPEN ORDER followed by repeated FETCH ORDER, DELETE ORDER, COMMIT groups; each FETCH takes a U lock on the table page, which becomes an X lock at DELETE.]

Figure 8-33. Solution
Notes:
DB2 releases all locks at commit point, even those related to cursors WITH HOLD, if ZPARM RELCURHL is set to YES (recommended). This program keeps an ORDER page locked only for the time it takes to delete one ORDER row and its dependants (max 4s).
Commit Overhead

VQUBE: cost in ms per COMMIT
In the proposed solution, the commit points add (number of commits x ms per COMMIT) to the elapsed time
Consider timer-controlled commit:
  IF time_since_last_commit > 1 s THEN COMMIT
The commit threshold plus the maximum elapsed time for one DELETE FROM ORDER then gives the longest interval between commit points

Figure 8-34. Commit Overhead
Notes:
The commit points add about 10% to the elapsed time if a COMMIT is issued after each DELETE. Committing only if more than one second has elapsed since last commit would reduce the number of commits from 10,000 to about 1000.
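The effect of a timer-controlled commit can be estimated with a small simulation. This is a sketch under stated assumptions: every DELETE is taken to last the same 90 ms (15 minutes spread evenly over 10,000 orders), the time spent in COMMIT itself is ignored, and the function name is illustrative.

```python
def count_commits(rows, delete_ms, commit_threshold_ms=None):
    """Count the COMMITs issued while deleting `rows` orders one by one.

    delete_ms: assumed elapsed time of one DELETE, including dependants.
    commit_threshold_ms: None means COMMIT after every DELETE; otherwise
    COMMIT only when more than this many ms have passed since the last one.
    Time is simulated; the cost of COMMIT itself is ignored.
    """
    commits = 0
    since_last_commit = 0
    for _ in range(rows):
        since_last_commit += delete_ms  # DELETE ... WHERE CURRENT OF cursor
        if commit_threshold_ms is None or since_last_commit > commit_threshold_ms:
            commits += 1                # COMMIT
            since_last_commit = 0
    if since_last_commit > 0:
        commits += 1                    # final COMMIT for the remaining rows
    return commits


print(count_commits(10_000, 90))         # 10000: COMMIT after each DELETE
print(count_commits(10_000, 90, 1_000))  # 834: COMMIT at most about once per second
```

With uniform 90 ms deletes the simulation commits after every twelfth row, which is in line with the reduction to about 1000 commits mentioned above. In a real program the longest interval between commits is the threshold plus the duration of the slowest single DELETE.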
Prevent Long Lock Waits

VQUBE
  Transactions: local response time (worst input) < 5 s
  Batch jobs: longest commit interval < 5 s
Find hot pages early
  Lock request rate x average lock duration > ...
  Note: INSERT does not wait if home page locked (another page chosen)

Figure 8-35. Prevent Long Lock Waits
Notes:
If local response time can be longer than five seconds, intermediate commits or other ways to release locks should be considered. Long commit intervals may be acceptable in read-only programs if other mechanisms (like CLOSE CURSOR) make lock durations short enough. When using the hot page formula, you should remember that an INSERT never waits for an X lock. However, the last page of a table is often a problem if inserts go to the end and the newly arrived rows are often read or updated: the SELECTs (without UR) and UPDATEs will have to wait until the X lock is released.
Hot Pages?

[Visual: two example pages, each with its request rates and lock durations.]
POPULAR PAGE
  SELECTs per second, with average S lock duration
  UPDATEs per second, with average X lock duration
LAST PAGE
  INSERTs per second, with average X lock duration
  SELECTs per second, with average S lock duration

Figure 8-36. Hot Pages?
Notes:
Do these pages cause significant lock waits to readers or updaters?
Deadlocks

Rare if:
  No hot pages
  No long lock durations
  Always SELECT FOR UPDATE

[Visual: lock diagrams contrasting programs that read with SELECT (S locks, then X) and programs that read with SELECT FOR UPDATE (U locks, then X).]

Figure 8-37. Deadlocks
Notes:
FOR UPDATE does not only prevent wrong results; it also reduces the number of deadlocks. If your application is close to all three objectives, you will not see many deadlocks. If, in addition, you are able to access tables and rows in a consistent sequence, deadlocks will be very rare indeed.
Analyzing Long Lock Waits

Statistical approach essential
  A collision may occur from time to time
Accounting trace shows victims
  Check which tables accessed by waiting packages
Lock suspension trace shows everything
  Object
  Culprit and victims
  Low overhead but voluminous output (data reduction program needed)

Figure 8-38. Analyzing Long Lock Waits
Notes:
Accounting trace classes 3 and 8 are needed to show lock waits. Performance trace class 6 (IFCIDs 44 and 45) is needed for the lock suspension trace. The accounting trace reveals long lock waits with minimal effort. If the package suffering from long lock waits does not process a large number of tables, the problem is normally found without any more detailed traces.
Responsible for Lock Waits

APPLICATION DEVELOPER
  Hold no lock for more than 5 s
  Always FOR UPDATE if reading with update intent
  Weakest possible isolation level (WITH UR/CS/RS/RR)
  BIND CURRENTDATA(NO)
  Try to access rows in consistent sequence
DATABASE SPECIALIST
  Locksize ROW or PAGE
  No lock escalation (page/row to table)
  Deadlock detection frequency: ... s

Figure 8-39. Responsible for Lock Waits
Notes:
Most locking problems are application-related. Traditionally, application developers do not know enough about DB2 locking.
Unit Summary

Key points:
  No long lock durations (less than 5 s)
  Weakest possible isolation level
  FOR UPDATE and WHERE CURRENT OF
Unit 9. Monitoring Application Performance

This unit is about monitoring application performance with accounting traces.

After completing this unit, you should be able to:
  Identify how traces work
  Define what an accounting trace is
  List the most important counters in an accounting trace
  Compare VQUBE and accounting traces
  Analyze an accounting trace
  Describe the most useful accounting reports
References
SC18-7413   DB2 UDB for z/OS Version 8 Administration Guide
SC18-7978   DB2 Performance Expert for z/OS Version 2 / DB2 Performance Monitor for z/OS Version 8 Report Reference
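DB2 Trace Overview

[Visual: DB2 writes records to a trace file. The trace file is processed by a performance monitor or a user-written program, which is not part of the DB2 product. The output can be loaded into DB2 tables with the DB2 LOAD utility.]

Figure 9-2. DB2 Trace Overview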
Notes:
DB2 writes information about its own activity, if requested. This information is written as records to a sequential file (on the visual, this file is called the trace file). There are two major problems when dealing with trace records. First, the format of the trace records is not user-friendly (variable record length, roughly 300 different record types, called IFCIDs, most information in binary format, and so on). Second, the volume of the produced trace records may be high, so a selection and/or reduction (grouping) program is needed. This program could be user-written, or an existing software product could be purchased and used for these purposes. DB2 itself does not contain any program to process trace files; it only produces them. IBM's products to process trace files are called DB2 Performance Monitor and DB2 Performance Expert. The output could be listings, online panels, or files loadable in DB2 tables using the DB2 LOAD utility.
Accounting Trace

One record for each program execution
  Output volume may be high
  Low CPU overhead
Must be activated
  START TRACE(ACCTG) CLASS(...)
  or
  Through ZPARM setting

Figure 9-3. Accounting Trace
Notes:
Traces are subdivided into trace types. The one of interest for application tuning is called accounting trace. As with all trace types, accounting traces must be activated, by using the START TRACE command, or by setting the corresponding ZPARMs (the corresponding traces will then be started automatically at START DB2). Some customers run accounting traces on a 24-hour basis, while others run accounting traces during peak hours only. Accounting writes one record for each program execution. Therefore, the output volume may be high. For instance, if one million CICS transactions are executed during one day, and if accounting trace was active during the whole day, at least one million records will be written to the trace file.
Reading an Accounting Trace

Course terminology      Accounting terminology
LOCAL RESPONSE TIME     Class 1 elapsed time
SQL                     Class 2 elapsed time
NON-SQL                 Class 1 elapsed time - Class 2 elapsed time
LOCK WAIT               Class 3 lock/latch suspension
CPU TIME                Class 2 CPU time
SYNCHR. READ            Class 3 synchronous I/O suspension (with number of events and average)
WAIT FOR PREFETCH       Class 3 other read I/O suspension
OTHER                   Calculated from the values above

[Visual also shows the millisecond value of each component for the sample trace.]

Figure 9-4. Reading an Accounting Trace
Notes:
The terminology we have used so far in this course is not the official accounting terminology. The visual shows the relationship between the terminology we have used so far and the accounting terminology. The next four pages show an accounting trace (formatted by DB2 Performance Expert). The layout may be different when using another formatting program, but the content must be the same as it is the content of one accounting trace record generated by DB2. The time values have six digits after the decimal point, which represent microseconds. Pages 1 to 3 show information at the thread/plan level; page 4 shows information at the package/DBRM level.

OTHER is calculated as:

  OTHER = SQL - (LOCK WAIT + CPU TIME + SYNCHRONOUS READ + WAIT FOR PREFETCH)

To calculate the average time per synchronous read, the number of synchronous reads from column EVENTS is used (2 in our example).
The class 1 elapsed time (local response time) does not include activities performed before the thread is created or after the thread is terminated. For instance, for a client/server application, the time spent to send the request from the client to DB2 for z/OS and the time spent to send the response back to the client are not included in the class 1 elapsed time. Class 3 other read I/O suspension is the wait time for prefetch operations to complete. This includes sequential prefetch, list prefetch, and dynamic prefetch. With today's (2005) hardware, the wait time for sequential prefetch and dynamic prefetch is 0 in most cases, and very close to 0 for the rest. Therefore, it is safe to assume that a class 3 other read I/O suspension value much higher than zero is almost always related to list prefetch.
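The calculations described in these notes can be checked against the sample accounting trace reproduced on the following pages. The sketch below simply copies the thread/plan-level values (in seconds) from that trace; the variable names are illustrative.

```python
# Values in seconds, copied from the sample accounting trace (thread/plan level).
class1_elapsed = 0.226066   # LOCAL RESPONSE TIME: APPL(CL.1) ELAPSED TIME
class2_elapsed = 0.182471   # SQL: DB2 (CL.2) ELAPSED TIME
class2_cpu = 0.005636       # CPU TIME: DB2 (CL.2) CPU TIME
lock_latch = 0.000000       # LOCK WAIT: class 3 LOCK/LATCH(DB2+IRLM)
sync_io = 0.003759          # SYNCHRONOUS READ: class 3 SYNCHRON. I/O
sync_io_events = 2          # EVENTS column for SYNCHRON. I/O
other_read_io = 0.038999    # WAIT FOR PREFETCH: class 3 OTHER READ I/O

non_sql = class1_elapsed - class2_elapsed
other = class2_elapsed - (lock_latch + class2_cpu + sync_io + other_read_io)
avg_sync_read = sync_io / sync_io_events

print(f"NON-SQL       {non_sql * 1000:9.3f} ms")
print(f"OTHER         {other * 1000:9.3f} ms")
print(f"AVG SYNC READ {avg_sync_read * 1000:9.4f} ms")
```

In this trace OTHER comes to about 134 ms, which is almost entirely the SER.TASK SWTCH suspension (0.129848 s) plus the NOT ACCOUNT. time (0.004230 s).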
LOCATION: EDUCDBP8
GROUP: N/P
MEMBER: N/P
SUBSYSTEM: DBP8
DB2 VERSION: V8

DB2 PERFORMANCE EXPERT (V2)   ACCOUNTING TRACE - LONG   PAGE: 1-1
REQUESTED FROM: NOT SPECIFIED   TO: NOT SPECIFIED   ACTUAL FROM: 02/06/05 08:31:34.27

---- IDENTIFICATION ----
ACCT TSTAMP: 02/06/05 08:31:34.27   PLANNAME: DSNESPCS   WLM SCL: 'BLANK'   CICS NET: N/A
BEGIN TIME : 02/06/05 08:31:34.05   PROD ID : N/P   CICS LUN: N/A
END TIME   : 02/06/05 08:31:34.27   PROD VER: N/P   LUW NET: DEIBMA4O   CICS INS: N/A
REQUESTER  : EDUCDBP8   CORRNAME: CHCF960   LUW LUN: A4OASBP8
MAINPACK   : DSNESM68   CORRNMBR: 'BLANK'   LUW INS: BC87DD3D2335   ENDUSER : 'BLANK'
PRIMAUTH   : CHCF960   CONNTYPE: TSO   LUW SEQ: 1   TRANSACT: 'BLANK'
ORIGAUTH   : CHCF960   CONNECT : TSO   WSNAME : 'BLANK'

MVS ACCOUNTING DATA   : CH058250
ACCOUNTING TOKEN(CHAR): N/A
ACCOUNTING TOKEN(HEX) : N/A

ELAPSED TIME DISTRIBUTION
APPL    19%
DB2      4%
SUSP    76%

CLASS 2 TIME DISTRIBUTION
CPU      3%
NOTACC   2%
SUSP    95%

TIMES/EVENTS    APPL(CL.1)   DB2 (CL.2)
ELAPSED TIME    0.226066     0.182471
 NONNESTED      0.226066     0.182471
 STORED PROC    0.000000     0.000000
 UDF            0.000000     0.000000
 TRIGGER        0.000000     0.000000
CPU TIME        0.016805     0.005636
 AGENT          0.016805     0.005636
  NONNESTED     0.016805     0.005636
  STORED PRC    0.000000     0.000000
  UDF           0.000000     0.000000
  TRIGGER       0.000000     0.000000
 PAR.TASKS      0.000000     0.000000
SUSPEND TIME    0.000000     0.172605
 AGENT          N/A          0.172605
 PAR.TASKS      N/A          0.000000
 STORED PROC    0.000000     N/A
 UDF            0.000000     N/A
NOT ACCOUNT.    N/A          0.004230
DB2 ENT/EXIT    N/A          134
EN/EX-STPROC    N/A          0
EN/EX-UDF       N/A          0
DCAPT.DESCR.    N/A          N/A
LOG EXTRACT.    N/A          N/A
The IFI (CL.5) column contains only N/P and N/A values.

CLASS 3 SUSPENSIONS      ELAPSED TIME   EVENTS
LOCK/LATCH(DB2+IRLM)     0.000000       0
SYNCHRON. I/O            0.003759       2
 DATABASE I/O            0.003759       2
 LOG WRITE I/O           0.000000       0
OTHER READ I/O           0.038999       2
OTHER WRTE I/O           0.000000       0
SER.TASK SWTCH           0.129848       5
 UPDATE COMMIT           0.000069       1
 OPEN/CLOSE              0.128173       2
 SYSLGRNG REC            0.001605       2
 EXT/DEL/DEF             0.000000       0
 OTHER SERVICE           0.000000       0
ARC.LOG(QUIES)           0.000000       0
ARC.LOG READ             0.000000       0
DRAIN LOCK               0.000000       0
CLAIM RELEASE            0.000000       0
PAGE LATCH               0.000000       0
NOTIFY MSGS              0.000000       0
GLOBAL CONTENTION        0.000000       0
COMMIT PH1 WRITE I/O     0.000000       0
ASYNCH CF REQUESTS       0.000000       0
TOTAL CLASS 3            0.172605       9

HIGHLIGHTS
THREAD TYPE   : ALLIED
TERM.CONDITION: NORMAL
INVOKE REASON : DEALLOC
COMMITS       : 2
ROLLBACK      : 0
SVPT REQUESTS : 0
SVPT RELEASE  : 0
SVPT ROLLBACK : 0
INCREM.BINDS  : 0
UPDATE/COMMIT : 0.00
SYNCH I/O AVG.: 0.001879
PROGRAMS      : 0
MAX CASCADE   : 0
PARALLELISM   : NO
ROLLUP PLAN   : NO

GLOBAL CONTENTION L-LOCKS   ELAPSED TIME   EVENTS
L-LOCKS                     0.000000       0
 PARENT (DB,TS,TAB,PART)    0.000000       0
 CHILD (PAGE,ROW)           0.000000       0
 OTHER                      0.000000       0

GLOBAL CONTENTION P-LOCKS   ELAPSED TIME   EVENTS
P-LOCKS                     0.000000       0
 PAGESET/PARTITION          0.000000       0
 PAGE                       0.000000       0
 OTHER                      0.000000       0
DB2 PERFORMANCE EXPERT (V2)   ACCOUNTING TRACE - LONG   PAGE: 1-2
REQUESTED FROM: NOT SPECIFIED   TO: NOT SPECIFIED   ACTUAL FROM: 02/06/05 08:31:34.27

SQL DML     TOTAL
SELECT      0
INSERT      0
UPDATE      0
DELETE      0
DESCRIBE    0
DESC.TBL    0
PREPARE     1
OPEN        1
FETCH       61
CLOSE       1
DML-ALL     64

SQL DCL     TOTAL
LOCK TABLE  0
GRANT       0
REVOKE      0
SET SQLID   0
SET H.VAR.  0
SET DEGREE  0
SET RULES   0
SET PATH    0
SET PREC.   0
CONNECT 1   0
CONNECT 2   0
SET CONNEC  0
RELEASE     0
CALL        0
ASSOC LOC.  0
ALLOC CUR.  0
HOLD LOC.   0
FREE LOC.   0
DCL-ALL     0

SQL DDL      CREATE   DROP   ALTER
TABLE        0        0      0
CRT TTABLE   0        N/A    N/A
DCL TTABLE   0        N/A    N/A
AUX TABLE    0        N/A    N/A
INDEX        0        0      0
TABLESPACE   0        0      0
DATABASE     0        0      0
STOGROUP     0        0      0
SYNONYM      0        0      N/A
VIEW         0        0      N/A
ALIAS        0        0      N/A
PACKAGE      N/A      0      N/A
PROCEDURE    0        0      0
FUNCTION     0        0      0
TRIGGER      0        0      N/A
DIST TYPE    0        0      N/A
SEQUENCE     0        0      0
TOTAL        0        0      0
RENAME TBL   0
COMMENT ON   0
LABEL ON     0

LOCKING               TOTAL
TIMEOUTS              0
DEADLOCKS             0
ESCAL.(SHAR)          0
ESCAL.(EXCL)          0
MAX PG/ROW LCK HELD   1
LOCK REQUEST          51
UNLOCK REQST          49
QUERY REQST           0
CHANGE REQST          0
OTHER REQST           0
LOCK SUSPENS.         0
IRLM LATCH SUSPENS.   0
OTHER SUSPENS.        0
TOTAL SUSPENS.        0

DATA SHARING   TOTAL
GLB CONT (%)   N/P
L-LOCKS (%)    N/P
P-LOCK REQ     N/P
P-UNLOCK REQ   N/P
P-CHANGE REQ   N/P
LOCK - XES     N/P
UNLOCK-XES     N/P
CHANGE-XES     N/P
SUSP - IRLM    N/P
SUSP - XES     N/P
SUSP - FALSE   N/P
INCOMP.LOCK    N/P
NOTIFY SENT    N/P

RID LIST          TOTAL
USED              1
FAIL-NO STORAGE   0
FAIL-LIMIT EXC.   0

ROWID        TOTAL
DIR ACCESS   0
INDEX USED   0
TS SCAN      0

STORED PROC.   TOTAL
CALL STMTS     0
ABENDED        0
TIMED OUT      0
REJECTED       0

UDF         TOTAL
EXECUTED    0
ABENDED     0
TIMED OUT   0
REJECTED    0

TRIGGERS       TOTAL
STMT TRIGGER   0
ROW TRIGGER    0
SQL ERROR      0
DATA CAPTURE   TOTAL
IFI CALLS      N/P
REC.CAPTURED   N/P
LOG REC.READ   N/P
ROWS RETURN    N/P
RECORDS RET.   N/P
DATA DES.RET   N/P
TABLES RET.    N/P
DESCRIBES      N/P

SERVICE UNITS   CLASS 1   CLASS 2
CPU             173       58
 AGENT          173       58
  NONNESTED     173       58
  STORED PRC    0         0
  UDF           0         0
  TRIGGER       0         0
 PAR.TASKS      0         0

DRAIN/CLAIM    TOTAL
DRAIN REQST    0
DRAIN FAILED   0
CLAIM REQST    5
CLAIM FAILED   0

LOGGING             TOTAL
LOG RECS WRITTEN    0
TOT BYTES WRITTEN   0

MISCELLANEOUS     TOTAL
MAX STOR VALUES   0

QUERY PARALLEL.        TOTAL
MAXIMUM MEMBERS        N/P
MAXIMUM DEGREE         0
GROUPS EXECUTED        0
RAN AS PLANNED         0
RAN REDUCED            0
ONE DB2 COOR=N         0
ONE DB2 ISOLAT         0
ONE DB2 DCL TTABLE     0
SEQ - CURSOR           0
SEQ - NO ESA           0
SEQ - NO BUF           0
SEQ - ENCL.SER         0
MEMB SKIPPED(%)        0
DISABLED BY RLF        NO
REFORM PARAL-CONFIG    0
REFORM PARAL-NO BUF    0

DYNAMIC SQL STMT       TOTAL
REOPTIMIZATION         0
NOT FOUND IN CACHE     0
FOUND IN CACHE         1
IMPLICIT PREPARES      0
PREPARES AVOIDED       0
CACHE_LIMIT_EXCEEDED   0
PREP_STMT_PURGED       0

---- RESOURCE LIMIT FACILITY ----
TYPE: N/P   TABLE ID: N/P   SERV.UNITS: N/P   CPU SECONDS: 0.000000   MAX CPU SEC: N/P

BP0 BPOOL ACTIVITY    TOTAL
BPOOL HIT RATIO (%)   10
GETPAGES              65
GETPAGES-FAILED       0
BUFFER UPDATES        0
SYNCHRONOUS WRITE     0
SYNCHRONOUS READ      2
SEQ. PREFETCH REQS    0
LIST PREFETCH REQS    2
DYN. PREFETCH REQS    0
PAGES READ ASYNCHR.   56
DSNESM68            VALUE
TYPE                PACKAGE
LOCATION            EDUCDBP8
COLLECTION ID       DSNESPCS
PROGRAM NAME        DSNESM68
CONSISTENCY TOKEN   149EEA901A79FE48
ACTIVITY TYPE       NONNESTED
ACTIVITY NAME       'BLANK'
SCHEMA NAME         'BLANK'
SQL STATEMENTS      65
SUCC AUTH CHECK     NO

DSNESM68             TIMES
ELAPSED TIME - CL7   0.182460
CPU TIME             0.005626
 AGENT               0.005626
 PAR.TASKS           0.000000
SUSPENSION-CL8       0.172605
 AGENT               0.172605
 PAR.TASKS           0.000000
NOT ACCOUNTED        0.004228
CPU SERVICE UNITS    57
 AGENT               57
 PAR.TASKS           0
DB2 ENTRY/EXIT       N/P

DSNESM68             TIME       EVENTS   TIME/EVENT
LOCK/LATCH           0.000000   0        N/C
SYNCHRONOUS I/O      0.003759   2        0.001879
OTHER READ I/O       0.038999   2        0.019499
OTHER WRITE I/O      0.000000   0        N/C
SERV.TASK SWITCH     0.129848   5        0.025970
ARCH.LOG(QUIESCE)    0.000000   0        N/C
ARCHIVE LOG READ     0.000000   0        N/C
DRAIN LOCK           0.000000   0        N/C
CLAIM RELEASE        0.000000   0        N/C
PAGE LATCH           0.000000   0        N/C
NOTIFY MESSAGES      0.000000   0        N/C
GLOBAL CONTENTION    0.000000   0        N/C
TOTAL CL8 SUSPENS.   0.172605   9        0.019178

DSNESM68              TOTAL
BPOOL HIT RATIO (%)   10
GETPAGES              65
GETPAGES-FAILED       0
BUFFER UPDATES        0
SYNCHRONOUS WRITE     0
SYNCHRONOUS READ      2
SEQ. PREFETCH REQS    0
LIST PREFETCH REQS    2
DYN. PREFETCH REQS    0
PAGES READ ASYNCHR.   56

ACCOUNTING TRACE COMPLETE
[Visual: Accounting Traces and VQUBE. Recoverable labels: class elapsed times; class 3 lock/latch suspension; class 2 CPU time; class 3 synchronous I/O suspension; class 3 other read I/O suspension (sequential prefetch, including dynamic prefetch; list prefetch); VQUBE terms TR, TS, SP, and LP, each multiplied by a time in ms. The numeric values were lost in extraction.]
Figure 9-5. Accounting Traces and VQUBE
Notes:
VQUBE takes into account only the CPU time and the I/O wait time related to the execution of SQL statements. It therefore ignores lock waits, the OTHER bullet, and everything that happens between SQL statements.

The CPU estimate is based on z990 processors. If the accounting trace was generated on processors with a different MIPS rate, the touch value (0.02ms) must be corrected accordingly. If, for instance, the processor runs at 50% of the speed of a z990, the touch value becomes 0.04ms. Also remember that VQUBE is an upper bound, so the values measured with accounting traces are, in most cases, lower than the VQUBE estimate.

For TR, VQUBE ignores list prefetch, buffer pool hits, and disk cache hits. Here too, the measured values are, in most cases, lower than the VQUBE estimate.
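The touch value correction can be sketched in a few lines of Python. This is only an illustration: the function name and the `relative_speed` parameter are hypothetical, and the inverse-proportional scaling is an assumption generalized from the single 50% example in the notes.

```python
# Touch cost on a z990 as given in the notes, in milliseconds.
Z990_TOUCH_MS = 0.02


def adjusted_touch_ms(relative_speed: float) -> float:
    """Return the touch cost for a processor running at `relative_speed`
    times the speed of a z990 (0.5 means half as fast).

    Assumption: the touch cost scales inversely with processor speed.
    """
    if relative_speed <= 0:
        raise ValueError("relative_speed must be positive")
    return Z990_TOUCH_MS / relative_speed


# The example from the notes: a processor at 50% of z990 speed.
half_speed_touch = adjusted_touch_ms(0.5)
print(f"touch value at 50% of z990 speed: {half_speed_touch:.2f} ms")
```

With `relative_speed` set to 1.0 the function returns the unmodified z990 value, so the same call can be used regardless of the processor the trace came from.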
[Visual: Analyzing an Accounting Trace. Question: major contributor to class 1 elapsed time? If it is class 2 elapsed time, see the next page. Otherwise, most of the elapsed time is spent outside DB2: CICS, IMS/DC, access to other databases, file processing, program instructions.]
Figure 9-6. Analyzing an Accounting Trace (1)
Notes:
The first thing to look at is the ratio between the SQL time (class 2 elapsed time) and the non-SQL time (class 1 elapsed time minus class 2 elapsed time). If most of the time is spent on non-SQL activities, the cause of the bad performance should be investigated outside DB2. The visual shows some of the most common contributors outside DB2, but the list is, of course, not complete.
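As a rough illustration of this first check, the following Python sketch splits class 1 elapsed time into its SQL and non-SQL parts. The function name, the dictionary keys, and the sample values are hypothetical, and reading "most of the time" as "more than half" is an assumption.

```python
def split_elapsed(class1_elapsed: float, class2_elapsed: float) -> dict:
    """Split class 1 elapsed time (seconds) into SQL and non-SQL time.

    class2_elapsed is the SQL time; the remainder of class 1 is non-SQL.
    """
    if class1_elapsed <= 0:
        raise ValueError("class 1 elapsed time must be positive")
    if not 0 <= class2_elapsed <= class1_elapsed:
        raise ValueError("class 2 elapsed time must lie within class 1")
    non_sql = class1_elapsed - class2_elapsed
    return {
        "sql_seconds": class2_elapsed,
        "non_sql_seconds": non_sql,
        "sql_share": class2_elapsed / class1_elapsed,
        # Assumption: "most of the time" means more than half.
        "look_outside_db2": non_sql > class2_elapsed,
    }


# Hypothetical transaction: 4 seconds in total, 1 second of it in SQL.
example = split_elapsed(class1_elapsed=4.0, class2_elapsed=1.0)
print(example)
```

In this made-up case three quarters of the elapsed time is non-SQL time, so the investigation would start outside DB2.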
[Visual: Analyzing an Accounting Trace. Question: major contributors to class 2 elapsed time? OTHER: not application related. Class 3 I/O suspension with a high average synchronous I/O suspension time per I/O (over 10 ms): YES, not application related. Class 3 lock/latch suspension occurring less than once a day for a given lock situation: YES, ignore it; NO, find culprit, reduce commit interval, enable lock avoidance, check isolation level. Class 2 CPU time, or I/O suspension without a high average (NO): missing matching columns, non-indexable predicates, non-Boolean term predicates, wrong subquery type, avoidable sort, missing denormalization, not index-only.]
Figure 9-7. Analyzing an Accounting Trace (2)
Notes:
If the major contributor to class 1 elapsed time is class 2 elapsed time, the next step is to find out whether the performance problem is application-related, system-related, or both.

If the major contributors to class 2 elapsed time are the OTHER counters, the problem is system-related (CPU queuing too high, excessive z/OS paging, VSAM problems, ...). The solution to these problems is not within the scope of this course (see course CG88 for this).

If the major contributor to class 2 elapsed time is class 3 synchronous I/O suspension time, and the average per synchronous I/O (class 3 synchronous I/O suspension elapsed time divided by class 3 synchronous I/O suspension events) is high (over 10ms), the problem is also system-related, because the I/O queuing time must be very high. Remember that the class 3 synchronous I/O suspension time is a mix of disk cache hits (less than 1ms) and physical drive I/Os. With today's (2005) disk subsystems, a drive I/O without queuing takes roughly 7ms on average. Everything over 7ms on average is queuing and should be reduced by I/O tuning (also covered in course CG88).

An example: Assume that accounting measurements show an average per synchronous I/O of 15ms and that the disk cache hit ratio is 50% (this can be measured using disk subsystem monitoring tools). This means that every other I/O is a disk cache hit. The measured 15ms is therefore the weighted average of disk cache hits (1ms) and real drive I/Os, so one drive I/O takes 2 x 15ms - 1ms = 29ms, or roughly 30ms, on average. Of this, some 22ms to 23ms is I/O queuing. A queuing time of over 300% of the drive I/O time (7ms) is obviously a serious I/O problem that should be solved.

High class 3 lock/latch suspension times are a little more tricky. If a lock situation causing high class 3 lock/latch suspension times happens once a day, it must be considered normal and should not lead to any corrective action. For example, if a popular CICS transaction updates a popular row and it takes 1 second between this update and the end of the transaction (the commit point), it could happen that, from time to time, 10 users start this transaction at nearly the same time. Obviously, the last user will have to wait for roughly 10 seconds, the second last for 9 seconds, and so on. If this situation happens once a day, it would be a waste of time and money to do something to shorten this wait time. If it happens once per minute, corrective action should be taken. See unit 8 for details about this.

All other situations (high class 2 CPU time, or high class 3 I/O suspension time without a high average per I/O, or both) are application-related, and the solutions are those covered in units 2 through 6 of this course. The list on the visual repeats the major reasons for bad performance but is, of course, not complete.
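The I/O arithmetic in these notes can be reproduced with a short Python sketch. The function and constant names are hypothetical; the 1ms cache hit time, the 7ms unqueued drive I/O time, and the 10ms threshold are the figures quoted in the notes. The weighted-average formula assumes that every synchronous I/O is either a disk cache hit or a drive I/O.

```python
CACHE_HIT_MS = 1.0          # disk cache hit time used in the notes' example
DRIVE_IO_MS = 7.0           # drive I/O without queuing (2005 figure)
HIGH_AVG_SYNC_IO_MS = 10.0  # above this, the notes blame I/O queuing


def avg_sync_io_ms(suspension_seconds: float, events: int) -> float:
    """Average class 3 synchronous I/O suspension per event, in ms."""
    if events <= 0:
        raise ValueError("events must be positive")
    return suspension_seconds / events * 1000.0


def drive_io_ms(avg_ms: float, cache_hit_ratio: float) -> float:
    """Solve avg = ratio * cache_hit + (1 - ratio) * drive for drive."""
    if not 0 <= cache_hit_ratio < 1:
        raise ValueError("cache_hit_ratio must be in [0, 1)")
    return (avg_ms - cache_hit_ratio * CACHE_HIT_MS) / (1 - cache_hit_ratio)


def queuing_ms(avg_ms: float, cache_hit_ratio: float) -> float:
    """Part of the average drive I/O time attributed to queuing."""
    return drive_io_ms(avg_ms, cache_hit_ratio) - DRIVE_IO_MS


# Worked example from the notes: 15 ms average, 50% disk cache hits.
drive = drive_io_ms(15.0, 0.5)   # 29 ms, "roughly 30 ms" in the notes
queue = queuing_ms(15.0, 0.5)    # 22 ms
print(f"drive I/O {drive:.0f} ms, queuing {queue:.0f} ms "
      f"({queue / DRIVE_IO_MS:.0%} of an unqueued drive I/O)")

# Sample trace earlier in this unit: 0.003759 s over 2 synchronous I/Os.
sample_avg = avg_sync_io_ms(0.003759, 2)
print(f"sample average {sample_avg:.2f} ms, "
      f"high: {sample_avg > HIGH_AVG_SYNC_IO_MS}")
```

The exact results (29ms per drive I/O and 22ms of queuing) agree with the rounded figures in the notes, and the sample trace shown earlier in this unit stays well below the 10ms threshold.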
[Visual: Most Useful Accounting Reports. (1) Averages by transaction, plan, or package. (2) Long-running transactions: class 1 elapsed time greater than 5 seconds. (3) Top CPU consumers.]
Figure 9-8. Most Useful Accounting Reports
Notes:
If you know the name of the program or programs causing performance problems (if, for instance, users complained about the bad performance of some online transaction, as in unit 1), it is easy to print the corresponding accounting trace records using a performance monitor. The search criteria are the package name and, for online transactions, a from/to time limit to reduce the output volume and, optionally, the user ID of the executor.

If you want to find the poorly performing programs in your production system without knowing the program names in advance, the visual shows three report types that are very useful for finding them. You cannot print everything that is in an accounting trace file, because the volume may be very high, and a report of many thousands of pages is useless. Nobody will look at it.

Of course, for the second report, the class 1 elapsed time limit should be adapted to your installation. If 50% of your transactions have a class 1 elapsed time greater than 5 seconds, the output volume is again too high to be useful. In this case, the limit should be increased to, let's say, 10 seconds (or even more) to get a reasonable output volume.
How these reports are produced depends on the performance monitor used and therefore cannot be shown here. Unfortunately, the user interfaces of the different performance monitors vary widely.
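Independently of any particular monitor, the selection logic behind the three report types can be sketched in Python. Everything here is hypothetical: the `AccountingRecord` fields, the package names, and the sample values are made up for illustration, and a real performance monitor produces these reports with its own commands. Ranking CPU consumers by total class 2 CPU time per package is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AccountingRecord:
    """Hypothetical, heavily reduced accounting trace record."""
    package: str
    class1_elapsed: float  # seconds
    class2_cpu: float      # seconds


def averages_by_package(records):
    """Report 1: average class 1 elapsed and class 2 CPU per package."""
    totals = defaultdict(lambda: [0, 0.0, 0.0])
    for r in records:
        entry = totals[r.package]
        entry[0] += 1
        entry[1] += r.class1_elapsed
        entry[2] += r.class2_cpu
    return {p: (e / n, c / n) for p, (n, e, c) in totals.items()}


def long_running(records, limit_seconds=5.0):
    """Report 2: records whose class 1 elapsed time exceeds the limit."""
    return [r for r in records if r.class1_elapsed > limit_seconds]


def top_cpu_consumers(records, n=3):
    """Report 3: packages ranked by total class 2 CPU time."""
    cpu = defaultdict(float)
    for r in records:
        cpu[r.package] += r.class2_cpu
    return sorted(cpu.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Made-up sample data.
records = [
    AccountingRecord("PGMA", 0.5, 0.01),
    AccountingRecord("PGMA", 7.5, 0.03),
    AccountingRecord("PGMB", 12.0, 2.0),
    AccountingRecord("PGMC", 0.25, 0.5),
]

print(averages_by_package(records))
print([r.package for r in long_running(records)])
print(top_cpu_consumers(records, n=2))
```

Raising `limit_seconds` from 5 to 10 is the adjustment the notes suggest when the second report would otherwise return too many transactions.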
Unit Summary
Key point: Use accounting traces to locate performance problems.