
IBM Americas Advanced Technical Support

IBM SAP Technical Brief

Tuning SAP / Oracle / pSeries

Mark Gordon

IBM Solutions Advanced Technical Support

Version: 1.1
Date: November 14, 2003

© 2003 International Business Machines, Inc.



1. ACKNOWLEDGEMENTS
2. DISCLAIMERS
3. COPYRIGHTS
4. FEEDBACK
5. VERSION UPDATES
6. INTRODUCTION
7. ANALYZING A PROBLEM WITH A SPECIFIC PROGRAM OR TRANSACTION
7.1. COMPONENTS OF SAP RESPONSE TIME
7.2. MAJORITY OF TIME IS CPU ON APPLICATION SERVER
7.2.1. Summary of Majority of time is CPU on application server
7.3. MAJORITY OF TIME IS DATABASE REQUEST TIME
7.3.1. Slow per-row SQL performance
7.3.1.1. Summary of Slow per-row SQL performance
7.3.2. Program performs too many SQL operations
7.3.2.1. Summary of Program performs too many SQL operations
7.4. MAJORITY OF TIME IS NOT CPU OR DB REQUEST TIME
8. SYSTEM PERFORMANCE PROBLEMS
8.1. PERFORM SQL CACHE ANALYSIS
8.1.1. Indicator of inefficient SQL access
8.1.2. SQL Bget ratios that are not problems
8.1.3. Causes of inefficient SQL access
8.1.4. Reasons that the predicates do not match indexes
8.1.5. Actions when predicates do not match indexes
8.1.6. High-impact SQL caused by incorrect use of SAP data model
8.1.6.1. Summary of incorrect use of data model
8.1.7. High-impact SQL when optimizer takes wrong access path
8.1.7.1. Summary of wrong choice of access path
8.1.8. High-impact SQL is a symptom of another problem
8.1.8.1. Summary of symptom of another problem
8.2. CREATE CANDIDATE LIST FROM ST03N
9. SYSTEM HEALTH CHECK
9.1. CPU ACTIVITY
9.2. I/O ACTIVITY
9.2.1. Good I/O performance in Oracle
9.2.2. Symptom of I/O hotspot on disk
9.2.3. May be AIX constraint or disk performance problem
9.3. ORACLE REVIEW
9.3.1. Database Hit Rate
9.3.2. Database delays
9.4. SAP BUFFER SETTINGS
9.5. SAP BUFFERED TABLE STATISTICS
9.5.1. Not buffered but could be buffered
9.6. EVALUATE MEMORY IN AIX
9.6.1. AIX paging
9.6.2. Evaluating increasing memory for Oracle or SAP
10. FOUR GUIDELINES FOR AVOIDING PERFORMANCE PROBLEMS
10.1. USE THE SAP DATA MODEL
10.2. USE ARRAY OPERATIONS ON THE DATABASE
10.3. CHECK WHETHER THE DATABASE CALL CAN BE AVOIDED
10.4. WRITE ABAP PROGRAMS THAT ARE LINE-ITEM SCALABLE
11. APPENDIX 1: SUMMARY OF PERFORMANCE MONITORING TOOLS
11.1. SAP
11.1.1. DB02 Transaction
11.1.2. SE11 Transaction
11.1.3. SE30 Transaction
11.1.4. SM12 Transaction
11.1.5. SM50 Transaction
11.1.6. SM51 Transaction
11.1.7. SM66 Transaction
11.1.8. STAT Transaction
11.1.9. STAD Transaction
11.1.10. ST02 Transaction
11.1.11. ST03N Transaction
11.1.12. ST04 Transaction
11.1.13. ST05 Transaction
11.1.14. ST06 Transaction
11.1.15. ST10 Transaction
11.1.16. RSINCL00 Program
11.1.17. SQLR Transaction
11.1.18. RSTRC000 Program
11.2. AIX
11.2.1. nmon
11.2.2. ptx
11.2.3. iostat
11.2.4. vmstat
11.3. ORACLE
11.3.1. STATSPACK
11.3.2. Trace and tkprof
12. APPENDIX 2: REFERENCE MATERIALS
12.1. ORACLE MANUALS
12.2. IBM MANUALS


1. Acknowledgements
Thank you to Walter Orb of IBM Germany, who was for several years the pSeries SAP performance lead
for IBM Americas Solutions ATS, and who showed me many of the processes used in this paper.

Thank you to Marty Carangelo, Dale Martin, and Ralf Schmidt-Dannert, for their contributions on Oracle
and pSeries performance.

Thank you to Phil Hardy and Damir Rubic, who reviewed the paper and offered many improvements.

2. Disclaimers
IBM has not formally reviewed this paper. While effort has been made to verify the information, this
paper may contain errors. IBM makes no warranties or representations with respect to the content hereof
and specifically disclaims any implied warranties of merchantability or fitness for any particular purpose.
IBM assumes no responsibility for any errors that may appear in this document. The information
contained in this document is subject to change without any notice. IBM reserves the right to make any
such changes without obligation to notify any person of such revision or changes. IBM makes no
commitment to keep the information contained herein up to date.

The paper contains examples from systems ranging from SAP 4.6, Oracle 8.1.7, and AIX 4.3 up to
SAP 6.20, Oracle 9.2, and AIX 5.2.

The processes and guidelines in this paper are the compilation of experiences analyzing performance on a
variety of SAP systems. Your results may vary in applying them to your system.

Examples have been edited to clarify points for the paper.

3. Copyrights
SAP and R/3 are copyrights of SAP A.G.
RS/6000, pSeries, and AIX are copyrights of IBM Corporation.
Oracle is a copyright of Oracle Corporation.
OraPerf.com is a copyright of OraPerf.com

4. Feedback
Please send comments or suggestions for changes to gordonmr@us.ibm.com.

5. Version Updates
• Version 1.0 – initial version
• Version 1.1 – add SM04 Sessions


6. Introduction
There are two intended audiences for this paper – Oracle DBAs and SAP BASIS administrators. Either
may be doing performance analysis on an SAP system with Oracle on AIX. The goal of this paper is to
provide each audience with material that is useful and new: an SAP BASIS administrator experienced with
other databases should find the Oracle-specific tuning tools and techniques helpful, while the
experienced Oracle administrator is presented with SAP-specific tuning tools and techniques.

This paper covers the two most common types of performance problems – database performance and
inefficient ABAP coding. While there are other causes of problems in SAP (e.g. network performance,
external RFC interfaces, SAP instance configuration, SAP sort), database and ABAP performance are
the most common and generally have the biggest impact. To provide the most benefit in the
smallest paper, these other issues are not covered here.

This paper has a process-based approach, where different goals are pursued via different processes and
tools.
• To fix a problem reported for a specific program, we will perform elapsed time analyses of
programs, determine where time is spent, and optimize these long running parts. This includes
interpretation of ST03N and STAT/STAD records, and using ST05 and SE30. The paper will
demonstrate how to use the SAP stats to obtain database performance statistics, identify I/O
bottlenecks and SAP problems, etc. The benefit of this approach is that it is focused on an area
that has been identified as a business problem.
• To check for inefficient use of DB resources and improve overall database server
performance, we will use ST04 statement cache analysis. The value of this approach is that it
offers a very big potential payoff in reducing resource usage and increasing system efficiency.
The disadvantage is that one may be finding and solving problems that no end-user cares about.
For example, if we can improve the elapsed time of a batch job from 2 hours to 10 minutes, but the
job runs at 2:00 AM, and nobody needs the output until 8:00 AM, it may not really be a problem.
Even if it is not a business problem, it may still be beneficial to address a problem of this type as
part of optimizing resource consumption, in order to reduce the computing resources required to
support business requirements.
• To do a system health check, review AIX paging and CPU usage, Oracle I/O statistics, ST10 and
ST02 buffering. The operating environment needs to be running well for good performance, but
problems in these areas can be symptoms of other problems. For example, inefficient SQL can
cause high CPU usage or high I/O activity. Therefore, a health check should be done together with
analysis of SQL and ABAP problems.

This paper has many examples, and it describes what is good or bad in each example. There are not
always specific rules given on what is good or bad, such as “Database request time” over 40% of “elapsed
time” is bad and under 40% is good. Rather, this paper tries to focus on an opportunity-based approach,
such as:
• Look for where a program (or the SAP and database system) spends time.
• Ask “If I fix a problem in this area, will people notice and care that it has been fixed?”

© 2003 International Business Machines, Inc.

Page 5
IBM Americas Advanced Technical Support
It will discuss how to estimate the impact of solving a problem. System wide performance analysis (such
as a statement of cache analysis, or ST03 analysis) will generally turn up several candidates. By
estimating the impact of fixing these problems, one can decide which to address first.

When doing this analysis, it is important to identify and track specific issues. Often, a performance issue
may not have enough impact to merit a new index, or an ABAP change. In this case, we want to track that
we have analyzed it, and chosen not to do anything, so that we don’t waste time discovering it again next
year.

This paper refers to a number of SAP Notes. An OSS userid, or userid that allows access to
service.sap.com, is a prerequisite for anyone doing performance analysis on an SAP system, whether the
person is an Oracle DBA, AIX administrator, or SAP BASIS Administrator.

This paper breaks performance management into three parts, which are discussed in Sections 7, 8, and 9:
• Analyzing a problem with a specific program or transaction
• System performance problems
• System Health Check.

Since AIX-level symptoms such as paging, excessive CPU use, and high I/O rates can be symptoms of
application and SQL problems, one needs to start with reviewing the SAP and Oracle indicators, before
taking action based on the AIX level indicators.


7. Analyzing a problem with a specific program or transaction


If a problem has been identified, then the first step is to determine where most of the response time is
spent. Based on this first step, there are different tools that are used to monitor the ABAP, database,
RFCs, etc.

7.1. Components of SAP response time


SAP note 364625 describes the different components of SAP dialog step response time. This paper is
focused on programs that have high database request time, or high CPU time. These are the two most
frequent causes of performance problems.

Figure 1: STAD transaction display of time components


7.2. Majority of time is CPU on application server


As an example, assume that we have been asked to investigate the performance of VA05. It runs over
an hour for some users. We check ST03N.

Figure 2: ST03N VA05


Average times are somewhat slow (“Ø Response” is 3,933 ms per dialog step), with CPU on the
application server about twice the length of database request time. Average times can hide problems
with a few long running dialog steps.

Check the ST03N “Hit lists”, to see what the elapsed time components are when it runs for a long
time. The longest running dialog steps are saved in the ST03N hit lists.

Figure 3: ST03N VA05 top time


There are some VA05 programs in the top time list, and the CPU time is over four times as long as DB
request time. High CPU time on the application server can point to problems in the ABAP code.

Since CPU time is the majority of elapsed time, use SE30 to trace the ABAP runtime. We arrange for
an end-user (ID MOO in the examples below) to run VA05 so we can trace it.

Run SE30 and select the TMP variant.

Figure 4: SE30 TMP variant selected

Then press “Change” to change the TMP variant options.

Figure 5: SE30 gather stats on internal tables


Enable statistics for internal tables. By default these are off, but they are the most common source of
high CPU use in ABAP programs.

In the “Duration/Type” tab, select “by call”.

Figure 6: SE30 Aggregation level


Select Aggregation level by call, so that different calls on the same table are reported separately.
(Aggregation level “None” will cause the trace to grow very big very quickly.)

Press “Save” on the screen in Figure 6, then the green arrow to go back to the SE30 main screen shown in
Figure 4. Then press “Enable/Disable” on the SE30 main screen to show the list of work processes
in Figure 7.


Figure 7: SE30 start/end measurement


Select the process to be traced, and press start.

Figure 8: SE30 active

After tracing for a while, press “End measurement”, and green arrow back to the main SE30 screen
shown in Figure 4. Select the trace file, and press “analyze”.

Figure 9: SE30 analysis


The program spent over 90% of elapsed time processing ABAP on the application server.

Press “Hit list” and then sort the list by “Net” to see where the time was spent.

Figure 10: SE30 sorted by Net time


Here, the majority of time is spent on a “READ TABLE” ABAP statement. Calculate the time per call
as (544,981,426 microseconds/14,526 READ TABLE calls). Each READ TABLE takes over 35 ms,
which is very long.

READ TABLE can be very fast, depending on the size of the table and the options used in defining the
table. Note that the “Read Table IT_2396” statement takes about 30 microseconds per call (461,269
microseconds/14,526 calls).
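
The per-call arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the calculation, using the net times and call counts quoted from Figure 10; the variable and function names are our own and are not part of SE30.

```python
# Net times and call counts quoted in the text for Figure 10.
SLOW_NET_US = 544_981_426  # slow READ TABLE, net microseconds
FAST_NET_US = 461_269      # READ TABLE IT_2396, net microseconds
CALLS = 14_526             # number of calls for each statement


def per_call_us(net_us: int, calls: int) -> float:
    """Return the average net time per call in microseconds."""
    return net_us / calls


slow_us = per_call_us(SLOW_NET_US, CALLS)
fast_us = per_call_us(FAST_NET_US, CALLS)

print(f"slow READ TABLE: {slow_us / 1000:.1f} ms per call")
print(f"fast READ TABLE: {fast_us:.1f} microseconds per call")
print(f"slow/fast ratio: {slow_us / fast_us:.0f}x")
```

With the same number of calls, the slow statement costs more than a thousand times as much per call as the fast one, which is why it dominates the hit list.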

Select the slow “Read Table” statement and press “Display source code” to see the ABAP.

Figure 11: SE30 source code


Goto > attributes – to display the owner of the program.

Figure 12: SE30 program attributes


This is SAP code. We will open an OSS message against it.

Slow reads of internal tables are a very common cause of performance and scalability problems.
In the SE30 transaction, press the “Tips and tricks” button to see various suggestions on ways to
improve ABAP programming. “Tips and tricks” contains several examples of common problems with
internal tables.

Figure 13: SE30 tips and tricks

One can review “Tips and tricks” for suggestions on ways to improve ABAP programs. In this case,
the problem appears to be that the ABAP is doing a linear search of the internal table.

Figure 14: SE30 binary search


When a large internal table is read without the ‘BINARY SEARCH’ option, the program becomes
much slower as the number of items processed grows and the table grows. The slow internal table
read access in Figure 14 took 203 microseconds. In our trace of the program shown in Figure 10, we
saw that internal table read accesses took over 35 milliseconds – about two orders of magnitude longer.
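
To illustrate why the search method matters, here is a small Python analogy (not ABAP) that counts how many entries each method inspects. The list stands in for an internal table, and both function names are hypothetical. As with BINARY SEARCH on READ TABLE, the binary version assumes the table is sorted by the search key.

```python
def linear_read(table, key):
    """Scan entries in order; return (index or None, entries inspected)."""
    probes = 0
    for i, entry in enumerate(table):
        probes += 1
        if entry == key:
            return i, probes
    return None, probes


def binary_read(sorted_table, key):
    """Binary search on a sorted list; return (index or None, entries inspected)."""
    lo, hi = 0, len(sorted_table)
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_table[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_table) and sorted_table[lo] == key
    return (lo if found else None), probes


table = list(range(100_000))  # already sorted, as a binary search requires
key = table[-1]               # worst case for the linear scan

idx_linear, linear_probes = linear_read(table, key)
idx_binary, binary_probes = binary_read(table, key)
print(f"linear: {linear_probes} entries inspected, binary: {binary_probes}")
```

The linear scan inspects every entry in the worst case, so its cost grows with the table, while the binary search needs only a number of probes proportional to the logarithm of the table size.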

7.2.1. Summary of Majority of time is CPU on application server


Check the characteristics of the program. If a majority of time is spent in CPU time on the
application server, use SE30 to trace the program.

This is an SAP program, so we would open an OSS message to SAP. If the program were a
custom program, then we would send this to the development team to investigate ways to speed up
processing the internal tables, such as “BINARY SEARCH” option on READ TABLE.


7.3. Majority of time is Database Request time


7.3.1. Slow per-row SQL performance
We have been asked to investigate the performance of the MIRO transaction. As a first step,
check the response time characteristics of MIRO using ST03N.

Figure 15: MIRO ST03N times


Figure 15 shows that the average MIRO dialog step runs about 1 second (“Ø Response” column),
and about 2/3 of the elapsed time (“Ø DB” column) is database request time. Next, select the
“Database” tab to see if individual SQL calls are slow.


Figure 16: MIRO ST03N Database


In Figure 16, note that the average sequential read time is rather long -- nearly 35 ms. This can be
normal for array fetch operations which retrieve many rows, but may also be a symptom of
inefficient SQL. In general, average sequential read times in ST03N should be less than 10-20
ms.

If average sequential read times are long, check the STAT/STAD records for the transaction, to
determine whether many rows are retrieved with each sequential read. If there are many rows per
read, use the time per row when evaluating performance.
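
As a sketch of this normalization, the following Python snippet divides an average sequential read time by the average rows returned per read. The 35 ms figure is taken from Figure 16; the row counts are hypothetical, and the function name is our own.

```python
def time_per_row_ms(avg_read_ms: float, avg_rows_per_read: float) -> float:
    """Average sequential read time divided by rows returned per read.

    Reads that return less than one row on average are charged as one row,
    so an empty result set still shows its full per-call cost.
    """
    return avg_read_ms / max(avg_rows_per_read, 1.0)


# Hypothetical cases: the same 35 ms average read time, different row counts.
for rows in (0, 1, 50, 500):
    print(f"{rows:>4} rows/read -> {time_per_row_ms(35.0, rows):.2f} ms per row")
```

The same 35 ms average looks very different depending on the rows returned: well under a millisecond per row for a 50-row array fetch, but the full 35 ms when a call returns one row or none.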

Since DB request time is the majority of time, and individual SQL calls may be long, trace the
transaction with ST05. Here is the output of an ST05 SQL trace.

Figure 17: MIRO ST05 trace


In Figure 17, note that there are some slow SELECTs on RBKP – each takes over 300 ms and
returns no rows. (The duration is in microseconds.) In general, a statement that matches the
indexes well will take a few ms per row. Press the “ABAP display” button, which is the sheet of
paper button in Figure 17, in order to see the ABAP source code.


Figure 18: MIRO ABAP display


The cursor will be positioned at the source line in the ABAP. In Figure 18, the ABAP is SAP
dynamic SQL, which has created the SQL statement in the ‘code’ variable at runtime.


Figure 19: MIRO program attributes


After pressing the “Attributes” tab in Figure 18, the program attributes in Figure 19 show that the
owner of this code is SAP.

Review the access path used by Oracle -- press the Explain SQL button in Figure 17.

Figure 20: MIRO RBKP table scan


“TABLE ACCESS FULL” in Figure 20 means that no index on the table is used. Get the local
predicates (the WHERE “column operator value” clauses) from the statement in Figure 20. We
will compare them to the columns in the indexes, to see if the predicates reference indexed
columns.
• MANDT =
• LIFNR =
• WAERS =
• RMWWR =
• BUKRS =
• BLDAT =
• RBSTAT =
• XBLNR =

Drill into the table name in Figure 20, to display the indexes and indexed columns.

Figure 21: RBKP has three indexes


In the index display in Figure 21, Oracle will apply predicates to the index columns starting from
the top column of an index. With the predicates in the statement, MANDT is the only matching
column in these indexes, since USNAM and ERFNAM, which are adjacent to MANDT, are not in
the predicates. Thus RBSTAT, which is the third column in the RBKP~3 and RBKP~4 indexes,
cannot be processed as a matching column.

So, the problem seems to be that the predicates in the SQL do not match the indexes on the table.
Since Figure 19 showed that the program is SAP code, we would:
• First, check the SAP data dictionary definitions
• Second, search SAP notes for this problem
• Third, open an OSS message

Check the indexes already defined in the data dictionary using: SE11 > display > indexes.

Figure 22: RBKP DDIC has 5 secondary indexes


The data dictionary contains three indexes (1, 2 and 4) that were not in the list of indexes in Figure
21. These are SAP indexes that by default are not enabled on the DB, but that can be created on
the database if they are required for the business processes being used.

Select an index in the list in Figure 22, and press “Choose”.

Figure 23: RBKP index in data dictionary


Comparing the columns in this index to the predicates in Figure 20, the predicates match MANDT,
XBLNR, LIFNR, and BUKRS. It looks like creating the RBKP~2 index will solve the problem.
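
The comparison of predicates against index columns can be sketched in Python as a leading-column check. The predicate list is taken from the statement above. The index column orders are assumptions for illustration only, since the actual definitions appear in figures that are not reproduced here, and the index labels are hypothetical.

```python
def matching_columns(index_columns, equality_predicates):
    """Return the leading index columns that are covered by equality predicates.

    Matching stops at the first index column that has no predicate.
    """
    matched = []
    for column in index_columns:
        if column not in equality_predicates:
            break
        matched.append(column)
    return matched


# Equality predicates from the traced statement.
PREDICATES = {"MANDT", "LIFNR", "WAERS", "RMWWR", "BUKRS", "BLDAT", "RBSTAT", "XBLNR"}

# Assumed column orders (hypothetical, for illustration only).
INDEXES = {
    "existing index A": ["MANDT", "USNAM", "RBSTAT"],
    "existing index B": ["MANDT", "ERFNAM", "RBSTAT"],
    "candidate index": ["MANDT", "XBLNR", "LIFNR", "BUKRS"],
}

for name, columns in INDEXES.items():
    matched = matching_columns(columns, PREDICATES)
    print(f"{name}: {len(matched)} matching column(s): {', '.join(matched)}")
```

Under these assumptions the existing indexes match only on MANDT, because the second column has no predicate, while the candidate index matches on all four of its columns.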

If the index is created, and does not solve the problem, then one would go to service.sap.com in
order to search SAP notes, and then open an OSS message if no solution was found.

7.3.1.1. Summary of Slow per-row SQL performance


The transaction was identified as a candidate to review. We checked the runtime
characteristics using ST03N, and found that the majority of time was DB time, and that per-call
DB times were rather slow. The transaction was traced with ST05, and slow SQL was found.
The slow SQL statement was initiated from an SAP ABAP program. We checked the data
dictionary, and found that there were indexes on RBKP that were defined in the data
dictionary, but not defined on the database. The fix is to create the index using the SAP definition.

Since the program is an SAP program, if the index had not been found in the data dictionary,
we would search OSS for notes about this problem, and if none were found, would open an
OSS message to SAP.

7.3.2. Program performs too many SQL operations


A different variety of inefficient SQL is when a program uses single row selects instead of array
selects. SAP has a construct, FOR ALL ENTRIES (FAE), that uses an internal table in an ABAP program to
specify the key values to be fetched from a table in Oracle. The FAE can be executed as an IN list
(if there is only one variable column in the internal table) or as an OR (if there are two or more
variable columns in the internal table).
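
The two shapes of WHERE clause described above can be sketched as follows. This is a simplified illustration in Python, not the code SAP uses to generate SQL; the function name and the column names COL_A and COL_B are hypothetical.

```python
def fae_where(columns, n_rows):
    """Build an illustrative WHERE clause for n_rows key rows.

    One variable column produces an IN list; two or more produce OR-ed groups.
    """
    if n_rows < 1:
        raise ValueError("at least one key row is required")
    if len(columns) == 1:
        placeholders = ", ".join("?" for _ in range(n_rows))
        return f"{columns[0]} IN ({placeholders})"
    group = "(" + " AND ".join(f"{c} = ?" for c in columns) + ")"
    return " OR ".join(group for _ in range(n_rows))


print(fae_where(["COL_A"], 3))
print(fae_where(["COL_A", "COL_B"], 2))
```

Either way, one statement carries several key values to the database instead of one statement per key.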

In this example, we have been asked to review the performance of a program that is slow. In
checking the stats for the program, we see that database request time (“Ø DB”) is long, but the
average time for each sequential read is short, just 2.5 ms. (You may also see low average read
times with long running programs if the counters for database time fill or wrap. See SAP note
99584 for details.)

Figure 24: ZMM program - lots of fast DB accesses


Average CPU time (“Ø CPU”) is about ¼ of database request time, so we need to look at the
database accesses with ST05 to investigate.

We trace the transaction with ST05, and then list the trace.

Figure 25: ZMM ST05


The program behavior is interesting. It repeatedly selects rows from the same table, EBKN.

Next, summarize the trace (goto > summary)

Figure 26: ST05 summarized trace

Choose the starting point and end point in the summary, and then press the summary button to
compress the trace by table and operation.
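
Conceptually, compressing the trace groups the individual trace lines by table and operation and totals their calls, durations, and rows. The Python sketch below shows that grouping on hypothetical trace records; the record layout and the values are assumptions for illustration and are not the ST05 file format.

```python
from collections import defaultdict

# Hypothetical trace records: (table, operation, duration in microseconds, rows).
trace = [("EBAN", "FETCH", 9_000, 40)] + [("EBKN", "FETCH", 2_500, 1)] * 20


def summarize(records):
    """Total calls, duration, and rows per (table, operation), slowest first."""
    totals = defaultdict(lambda: {"calls": 0, "duration_us": 0, "rows": 0})
    for table, operation, duration_us, rows in records:
        entry = totals[(table, operation)]
        entry["calls"] += 1
        entry["duration_us"] += duration_us
        entry["rows"] += rows
    return sorted(totals.items(), key=lambda item: item[1]["duration_us"], reverse=True)


summary = summarize(trace)
for (table, operation), t in summary:
    print(f"{table:<6} {operation:<6} calls={t['calls']:>3} "
          f"time={t['duration_us']:>6} us rows={t['rows']:>3}")
```

Sorting by total duration puts the table that consumes the most database time at the top, even when each individual call is fast.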

Figure 27: ZMM program compressed trace


Now we can see that almost all time in the database is single row fetches of EBKN. From the
ST05 trace in Figure 25, display the ABAP source.


Figure 28: ZMM program source


The program is looping through an internal table (LOOP AT irep_req). For each entry in the table,
the program selects a single row from EBKN. If it is possible to change the program to do a
single FOR ALL ENTRIES select on EBKN from the internal table, the program would probably
run much faster, since reading N rows at once from Oracle is generally faster than reading 1 row N
times.
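
The difference between the two access patterns can be sketched with Python and an in-memory SQLite database. This is an analogy rather than ABAP or Oracle: the table name ebkn_demo, its columns, and the block size of 50 are all assumptions for illustration. Because an in-memory database has no network round trip, the sketch counts statements sent rather than measuring time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ebkn_demo (banfn TEXT PRIMARY KEY, kostl TEXT)")
conn.executemany(
    "INSERT INTO ebkn_demo VALUES (?, ?)",
    [(f"{n:010d}", f"CC{n % 7}") for n in range(1000)],
)

# Stand-in for the internal table of keys the program loops over.
keys = [f"{n:010d}" for n in range(0, 1000, 10)]


def fetch_one_by_one(conn, keys):
    """One SELECT per key, like a single-row select inside a loop."""
    rows, statements = [], 0
    for key in keys:
        statements += 1
        row = conn.execute(
            "SELECT banfn, kostl FROM ebkn_demo WHERE banfn = ?", (key,)
        ).fetchone()
        if row:
            rows.append(row)
    return rows, statements


def fetch_as_array(conn, keys, block_size=50):
    """One SELECT per block of keys, using an IN list."""
    rows, statements = [], 0
    for start in range(0, len(keys), block_size):
        block = keys[start:start + block_size]
        placeholders = ", ".join("?" for _ in block)
        statements += 1
        rows.extend(conn.execute(
            f"SELECT banfn, kostl FROM ebkn_demo WHERE banfn IN ({placeholders})",
            block,
        ).fetchall())
    return rows, statements


single_rows, single_statements = fetch_one_by_one(conn, keys)
array_rows, array_statements = fetch_as_array(conn, keys)
print(f"single-row: {single_statements} statements, array: {array_statements} statements")
```

Both functions return the same rows, but the array version sends far fewer statements. On a real system each statement also costs a round trip between the application server and the database server, which is where most of the saving comes from.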

Since the program is custom code (Zxx), we send it back to the ABAP team, to review whether the
program could be restructured to use SAP array fetch operations.

7.3.2.1. Summary of Program performs too many SQL operations


ST03N statistics showed long database time and relatively small CPU time, but per-call
sequential DB request times were good (2.5 ms). Since the STAT/STAD statistics may have
wrapped or filled for such a long job, we used ST05 to trace the SQL and also confirm the SQL
response times. The trace showed that the same table was repeatedly accessed. The ABAP
program is looping through an internal table, fetching one row at a time from the database.
The program needs to be re-evaluated, to determine if the single-row access to the database can
be converted into an array operation, such as FOR ALL ENTRIES.


7.4. Majority of time is not CPU or DB request time


There are a number of SAP response time categories that can also indicate performance problems that
are not covered in this paper. SQL and ABAP, which are covered in this paper, are the source of the
vast majority of performance problems.

The paper “Tuning SAP / DB2 / zSeries” on IBM Techdocs (www.ibm.com/support/techdocs)
contains detailed explanations about additional SAP sources of program delays (such as ENQ delay,
SAP sort, local I/O, commit work and wait, RFC, GUI, network, etc.), their symptoms and possible
fixes.

8. System performance problems


8.1. Perform SQL Cache Analysis
In order to find inefficient SQL that may have a system-wide impact, because of using excessive CPU
or doing excessive I/O, review the SQL cache (ST04 >detail analysis > SQL request).

8.1.1. Indicator of inefficient SQL access


The primary indicator shown in ST04 of inefficient SQL access is when both ‘Bgets/exec’ and
‘Bgets/row’ are high. A Bget (buffer get) occurs when Oracle references a page in Oracle
database memory. A high ratio is a sign that Oracle must search a large amount of data to find the
results. Searching many pages will consume extra CPU and may cause extra I/O operations.
Normally, it might take 5-20 buffer gets to retrieve a single row, and it can be much less when
performing array operations that retrieve many rows with a single call to the DB.

Since the ST04 SQL cache does not have per-statement CPU, elapsed time, or delay statistics, one
cannot easily use the SQL cache to find locking or other delay problems, or to estimate the time
impact of the inefficient SQL. In order to determine the runtime of inefficient SQL statements,
one must trace them with ST05.

With Oracle 9i, the Oracle V$SQL view contains CPU and elapsed time per statement. One can
use the information in V$SQL to augment the information displayed in the SAP ST04 SQL cache
when analyzing problems such as locking or I/O constraints that cause long statement elapsed
time.


Figure 29: ST04 SQL cache - big problem example


In Figure 29, the last statement in the list looks like it is very inefficient. The SQL cache
was reset, and the statements were then evaluated since the reset. There were 497,407,252 buffer gets over the
interval, and one statement accounts for 105,576,852 of them, about 21% of the total. Both
Bgets/exec and Bgets/row are over 95,000, which is much higher than the 5-20 Bgets/row that one
might see if the SQL were efficiently accessing the data.
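
The share of total buffer gets for one statement is a simple ratio. A minimal sketch using the two totals quoted above (the variable names are illustrative):

```python
total_bgets = 497_407_252      # all buffer gets since the reset (from the text)
statement_bgets = 105_576_852  # buffer gets of the last statement (from the text)

# Fraction of all buffer gets caused by this one statement.
share = statement_bgets / total_bgets
print(f"{share:.1%}")
```

A single statement responsible for roughly a fifth of all buffer gets is a strong candidate for analysis.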

Figure 30: ST04 SQL cache - moderately inefficient SQL


Even when Bgets/row is lower than in the extreme example shown in Figure 29, a statement that is
executed frequently enough can still have a significant impact on the system.
The statements circled in Figure 30 each account for 2-3% of buffer gets during the interval. By
addressing and fixing several of these problems, one can make a measurable decrease in CPU
usage on the system.

The higher ‘Bgets/row’ and ‘Bgets/exec’ are, the more likely it is that the SQL is inefficient, and
Oracle is doing extra work to retrieve the data. If a statement is in the top 20 in the SQL cache
ordered by buffer gets (or disk reads) during a peak time, and Bgets/exec > 50 and
Bgets/row > 50, the statement should be examined to determine if it can be improved.
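
The screening rule above can be sketched as a small filter over SQL cache data that has been downloaded from ST04. The record layout and the `sql_id` labels are hypothetical; only the top-20 cutoff and the threshold of 50 come from the text.

```python
from dataclasses import dataclass

@dataclass
class CacheEntry:
    sql_id: str          # hypothetical label for the statement
    executions: int
    buffer_gets: int
    rows_processed: int

def candidates(entries, top_n=20, threshold=50):
    """Statements in the top N by buffer gets with both ratios above threshold."""
    top = sorted(entries, key=lambda e: e.buffer_gets, reverse=True)[:top_n]
    picked = []
    for e in top:
        # max(..., 1) avoids division by zero when nothing was executed or returned.
        per_exec = e.buffer_gets / max(e.executions, 1)
        per_row = e.buffer_gets / max(e.rows_processed, 1)
        if per_exec > threshold and per_row > threshold:
            picked.append(e.sql_id)
    return picked

sample = [
    CacheEntry("A", 1_000, 95_000_000, 1_000),   # high per exec and per row
    CacheEntry("B", 10, 50_000, 40_000),         # array fetch: low per row
    CacheEntry("C", 100_000, 900_000, 100_000),  # efficient single-row access
]
print(candidates(sample))
```

Only statement "A" is flagged in this sample, since it is the only one where both ratios are high.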

8.1.2. SQL Bget ratios that are not problems


• When Bgets/exec is high and Bgets/row is low, SAP is doing array operations, which affect
many rows per database call. This is generally an efficient way to access the database.
• When Bgets/exec is low and Bgets/row is high, this shows that the SQL seldom retrieves
rows. If “Proc rows” is 0, check whether the table contains data, and if the statement is
needed in the program.
• When Bgets/exec is low, and Bgets/row is low, then the data has been retrieved efficiently.
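
The combinations of ratios described in Sections 8.1.1 and 8.1.2 can be summarized in a small helper. This is a sketch only: the text does not define a cutoff for "high" in this section, so the value of 50 is borrowed from Section 8.1.1 as an assumption, and the function name and return strings are illustrative.

```python
def interpret(bgets_per_exec: float, bgets_per_row: float, high: float = 50) -> str:
    """Classify a statement by its two buffer-get ratios."""
    exec_high = bgets_per_exec > high
    row_high = bgets_per_row > high
    if exec_high and row_high:
        return "examine: possibly inefficient access"
    if exec_high:
        return "array operation: many rows per call"
    if row_high:
        return "seldom returns rows: check table contents and need for statement"
    return "efficient"

print(interpret(95_000, 95_000))
print(interpret(5_000, 1.25))
```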

8.1.3. Causes of inefficient SQL access


There are two main causes of inefficient SQL access:
• The local predicates do not match the available indexes, and
• Oracle chooses the wrong access path.

Local predicates are the WHERE “COLUMN operator value” clauses in the SQL. If the
predicates don’t contain columns that are in indexes, then Oracle generally cannot access the data
efficiently.


Figure 31: Sample explain


In Figure 31, the local predicates are:
• MANDT =
• VGBEL =
• ROWNUM <=

ROWNUM <= is a predicate that is inserted by Oracle to stop fetching rows as soon as the ROWNUM
count has been reached, before the complete result set has been retrieved. MANDT is
usually automatically inserted into the SQL by SAP. The program specified VGBEL=.

One can drill into the table or indexes shown in Figure 31, to see the indexes on the table.


Figure 32: Display indexes from explain


Now that we have the predicates and index columns, we can compare them to determine if the predicate
columns are contained in indexes. Comparing Figure 31 and Figure 32, we see that the only
indexed predicate column is MANDT. This means that Oracle must look at all the rows of the
table, in order to find the rows matching the ‘VGBEL =’ predicate.
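
The comparison of predicates against indexes can be sketched as a lookup. The index name and its columns below are hypothetical placeholders, since the real definitions are only visible in Figure 32; the predicate columns are the ones listed above.

```python
# Predicate columns from the explain; ROWNUM is a row limit, not a table column.
predicates = ["MANDT", "VGBEL"]

# Hypothetical index definition standing in for the one shown in Figure 32.
indexes = {"TAB~0": ["MANDT", "DOCNR", "ITEM"]}

def indexed_predicates(predicates, indexes):
    """Map each index name to the predicate columns that it contains."""
    return {
        name: [col for col in predicates if col in cols]
        for name, cols in indexes.items()
    }

print(indexed_predicates(predicates, indexes))
```

With these assumed columns, only MANDT is found in the index, which matches the situation described above.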

8.1.4. Reasons that the predicates do not match indexes


There are two basic scenarios:
• The program is not using the SAP data model correctly, and
• The business process may require a new index on the table

Many kinds of information are redundantly stored in SAP. For instance, the transaction data for a
billing document may contain columns with the document number for the delivery or sales order
being billed. But just because the information is present does not mean that this is the right way to
retrieve it. Billing documents, to continue the example above, are indexed to support lookup by
billing document number. If a program tries to access the billing tables using the delivery column,
in order to find the associated billing document, the column with the delivery number will not be
indexed, and access will be slow. Usually, the solution is not to create an index, but to use the
SAP data model correctly.

There is an example of incorrect use of the SAP data model below in section 8.1.6.

SAP has several SAP notes that describe wrong and right ways to use the SAP data model to
retrieve application information:
• MM – SAP note 191492
• SD – SAP note 185530
• PP – SAP note 187906

8.1.5. Actions when predicates do not match indexes


It can occur that specific business processes require new indexes on tables. Here are some
guidelines on how to approach the problem when you find a program where the predicates do not
match the indexes:
ABAP Creator   Table Creator   Action when predicates do not match indexes
SAP            SAP             • Check data dictionary for standard indexes which are not active in DB
                               • Check OSS notes
                               • Open OSS message to SAP
Custom         Custom          • Rewrite program if possible
                               • Evaluate table access patterns, whether table can be buffered
                               • Create index
Custom         SAP             • Check data dictionary for standard indexes that are not active in DB
                               • Review use of data model (will generally fix problem)
                               • Change program
                               • Create index (should very seldom be necessary)


8.1.6. High-impact SQL caused by incorrect use of SAP data model


Sort the SQL cache by buffer gets, to bring the high impact statements to the top.

Figure 33: VBRP high impact statement


The first statement in the list is very, very inefficient – it performs over 100,000 Buffer gets per
execution, and retrieves one row per execution.


Explain the statement to see what access path is used.

Figure 34: VBRP explain


Note the predicates, which we will need later.
• MANDT =
• VGBEL =
• ROWNUM <=

The statement is doing an index range scan, so it is using the index, but we cannot tell yet how
many index columns are matched in the SQL.

Drill in on the table VBRP to check the indexes to see which columns from the predicates are in
the indexes.


Figure 35: VBRP table and index


The only predicate column that matches is MANDT. Since there is generally only one productive
MANDT in each system, this does no filtering on the index. Oracle checks the index, and then has
to go to the table and look at every row that satisfied the MANDT= condition to check the value of
VGBEL.

Use the “Display call point” button in Figure 33 to look at the source code.

Figure 36: Program accessing VBRP


Goto > attributes to see the owner of the program.

Figure 37: Program accessing VBRP not from SAP


So, we see that the owner of the program is not SAP. From the table in Section 8.1.5, we know
that the first action is to examine if the program is using the data model correctly.

SAP note 185530 describes several incorrect and correct lookups for SD. This problem is
contained in the note:

Figure 38: SAP note with examples of using the SD data model

The action for this problem is to send it back to the developer.

In this proposed fix, for SD documents there is a table VBFA (document flow) that contains the
predecessors and successors for sales documents. Given a sales order number, one can find all the
subsequent documents related to the sales order, or given an invoice number, one can find all the
predecessor documents that led to that invoice document.

8.1.6.1. Summary of incorrect use of data model


Statement is found in ST04 SQL cache with very high Bgets/exec and high Bgets/row. Check
the predicates and indexes, only one predicate (MANDT=) matches an index. Check the
owner of the program. It is not SAP, so we suspect that the programmer is looking in the
wrong place for the information. Check SAP note 185530, since the problem here is in use of
the SD tables, and find the proposed fix. Send the program back to developer to re-work.

8.1.7. High-impact SQL when optimizer takes wrong access path


While the Oracle cost-based optimizer generally does a good job of determining the best way to
access tables based on the predicates in the SQL, it can go wrong. Here the SQL cache has been
sorted by Disk reads, to find statements that run a long time and cause lots of I/O.

Figure 39: ST04 SQL cache for BDCPV example


The statement does over 500,000 buffer gets per execution, and returns less than one row for each
execution. This is very, very inefficient. Explain the statement, to see what it is doing.


Figure 40: BDCPV explain


Get the predicates, which will be needed later:
• MANDT =
• MESTYPE =
• PROCESS =
• CRETIME <=
• CDOBJCL =

Figure 40 shows that BDCPV is a view on BDCP and BDCPS. BDCP is accessed using
BDCP~POS, and BDCPS is accessed using BDCPS~1, and then the two result sets are sort merged
together.

The ST04 statistics tell us that this cannot be a good way to access the data, since each execution
takes over 500,000 buffer gets, and returns on average less than one row. If it were an efficient
way to access the data, it might take 20-30 buffer gets per execution.

One can use “SE11 > display > utilities > database object > display” to display the definition of the
view on BDCPV, in order to determine which tables contain which predicate columns.

Figure 41: Display definition of BDCPV view

In Figure 41, the columns MANDT, MESTYPE, and PROCESS are in BDCPS, and CRETIME
and CDOBJCL are in BDCP.

Check the indexes used, to see how well the predicates and indexes match.

Figure 42: BDCP indexes


The index being used (BDCP~POS) matches MANDT and CRETIME. There is another index
(BDCP~1) containing CRETIME and CDOBJCL, which might have been used, but it does not
contain MANDT, which must be checked to satisfy the join conditions. We must check further to
determine what the best access path is.

Figure 43: BDCPS index columns


The index being used on BDCPS (BDCPS~1) contains the columns MANDT, MESTYPE,
PROCESS, and CPIDENT.

Going back to our list of predicates above and comparing them with the explain, we have
• MANDT (in BDCPS~1 and BDCP~POS)
• MESTYPE (in BDCPS~1)
• PROCESS (in BDCPS~1)
• CRETIME (in BDCP~POS)
• CDOBJCL (evaluated in BDCP table, not in index)

Next, we want to determine where the filtering takes place -- whether the predicates on BDCPS
eliminate the most rows, or predicates on BDCP. Usually, the table where the filtering takes place
will be chosen as the first (aka driving) table in a join. In order to find the filtering predicates, we
have to trace the program with ST05, and get sample values of the variables, since ST04 does not
show the values.

Trace the program with ST05, list the trace, and press the ‘replace vars’ button.

Figure 44: ST05 replaced variables

Next, use SE16 to search each table using its predicates. First, check BDCP using the predicates
and values on CRETIME and CDOBJCL from Figure 44.

Figure 45: SE16 BDCP


Number of entries:

Figure 46: BDCP number of entries


There are over 6000 candidate rows in BDCP after filtering by CDOBJCL and CRETIME and
MANDT (which is implicitly added by SE16).


Figure 47: SE16 BDCPS

Figure 48: BDCPS number of entries


But after filtering BDCPS by its predicates PROCESS and MESTYPE and MANDT, there is only
one candidate row left.

We now know that the filtering takes place on BDCPS in index BDCPS~1, which contains
MANDT, MESTYPE and PROCESS. This index also contains CPIDENT (see Figure 43), which
can be used to do a primary key unique index lookup into BDCP (see Figure 42).

So, we want BDCPS to be the outer table (the first one referenced) and then have BDCP accessed
using its primary index BDCP~0.
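
The reasoning for picking the outer table can be sketched as choosing the table whose predicates leave the fewest candidate rows. The counts are taken from the SE16 checks above ("over 6000" is represented by 6000 as a lower bound); the function name is illustrative, and a real optimizer weighs many more factors than this.

```python
# Candidate rows left after applying each table's own predicates (from SE16).
candidate_rows = {"BDCP": 6000, "BDCPS": 1}

def driving_table(candidate_rows):
    """Pick the table with the fewest candidate rows as the outer (driving) table."""
    return min(candidate_rows, key=candidate_rows.get)

outer = driving_table(candidate_rows)
print(outer, "drives the join, with", candidate_rows[outer], "lookup(s) into the other table")
```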

When we explain the statement again, this time using the Oracle RULE hint, this is the access path
that Oracle chooses -- BDCPS~1 is accessed first, then BDCP~0.

Figure 49: BDCPV access path with RULE hint

Oracle has a number of hints that can be used to influence the access path selection. See the
Oracle manuals referenced below in Section 12.1.

See also SAP notes 129385 and 130480 regarding use of hints.

Check the program source, and see that it is SAP code.

Figure 50: BDCPV source code


Having determined that the program is SAP code, and that the database is taking a bad access path,
we open a message to OSS.

8.1.7.1. Summary of wrong choice of access path


Statement is found in SQL cache with many buffer gets per execution, many buffer gets per
row, and many disk reads per execution. Explain the statement, and we see that predicates are
being applied in both indexes, which looks reasonable. Check the predicates separately on the
two tables, to see where the filtering is done, and find that all the filtering is done in BDCPS,
so a primary key lookup into BDCP would be more efficient than the range scan that is being
done on BDCP. Check the source code, and see that it is SAP. Open OSS message.


8.1.8. High-impact SQL is a symptom of another problem


Here, the SQL cache has been sorted by buffer gets, to highlight high impact statements.

Figure 51: A616 in SQL cache


The highlighted statement performs over 200 Bgets/exec, and on average retrieves less than one
row per execution. Since the statement is executed so frequently, it is one of the top statements in
total Buffer gets.


Figure 52: A616 explain


Explain shows that index A616~0 is being used to access the table.

The statement predicates are:


• MANDT =
• KAPPL =
• KSCHL =
• DATBI >=
• DATAB <=
• KUNNR =
• ROWNUM <=

Drill into the A616 table in Figure 52 to display the indexes.


Figure 53: A616 index


Comparing the index columns with the predicates, MANDT, KAPPL, and KSCHL match. Since
KNUMA_AG is not in the SQL, KUNNR and DATBI cannot be matched in the index, but they
can still be evaluated during the index range scan.
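
The effect of a gap in the index columns can be sketched as follows. The column order of A616~0 is an assumption made for illustration (the actual order is in Figure 53), and the sketch is simplified: it ignores the fact that a range predicate on a leading column would also end the matched prefix.

```python
# Assumed column order of index A616~0; see Figure 53 for the real definition.
index_cols = ["MANDT", "KAPPL", "KSCHL", "KNUMA_AG", "KUNNR", "DATBI"]
predicate_cols = ["MANDT", "KAPPL", "KSCHL", "DATBI", "DATAB", "KUNNR"]

def split_predicates(index_cols, predicate_cols):
    """Return (matched, filtered_in_index).

    matched: leading index columns covered by predicates without a gap.
    filtered_in_index: predicate columns found in the index after the gap,
    which can only be checked while scanning the index range.
    """
    matched = []
    for col in index_cols:
        if col not in predicate_cols:
            break  # first index column without a predicate ends the match
        matched.append(col)
    rest = index_cols[len(matched):]
    filtered_in_index = [c for c in rest if c in predicate_cols]
    return matched, filtered_in_index

print(split_predicates(index_cols, predicate_cols))
```

Under the assumed column order, three columns are matched and two more are present in the index behind the gap, which agrees with the five predicates in the index mentioned in the text.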

In Figure 51, press the button ‘Display call point in ABAP program’.

Figure 54: A616 ABAP source


The program is SAP code, not in the customer Z* or Y* namespace. While the problem is not
exactly that the predicates do not match indexes (five predicates are present in the index, though not all
with matching index access), we check the A616 table DDIC settings. We would check for
undefined indexes as shown in Section 7.3.1, and also use SE13 to check the technical settings.


Figure 55: A616 technical settings

The table is configured as buffered on the application server, but there must be some sort of
problem, since the SQL cache statistics in Figure 51 show over 200,000 SQL calls. When a table
is buffered on the application server, there should be very few calls to the database server to
retrieve rows from the table.

It may be that the table is frequently changed and invalidated and then re-loaded or that the table
does not fit in the application server buffer. These can be checked in Figure 57 below. The table
may be read by the ABAP in a way that bypasses buffering on the application server. See SAP
note 47239 regarding buffered table behavior.

So, check the table buffers using ST02.

Figure 56: ST02 buffers with swaps


There are swaps, which means that the buffer has filled, and may not be able to hold all the objects
referenced by the programs running on the application server. Drill in on the ‘Generic key’ line,
then press the ‘Buffered objects’ button to see the next screen.

Figure 57: ST02 buffered objects


See SAP note 3501 for information on the meaning of the buffered status. Error means that there
was an error loading the table into the buffer. In practice this means that the buffer did not have
enough space to hold the table.


There are very few changes on the table, so the problem is not that the table is being invalidated
and re-loaded.

So, we have now determined that the frequent calls to A616 are not a problem with inefficient
SQL, but are caused by something else – that the generic buffers on the application servers are not
large enough.

8.1.8.1. Summary of symptom of another problem


There is an SQL statement that is rather high in Bgets/exec and that is one of the highest
statements in total Buffer gets, which means that it is having a high impact on database
resource use. Look at the program source, and it is SAP code. So, we check the table
attributes and indexes. The table technical settings show that the table should be buffered.
Check the buffers on the application server, and find that the generic buffer is too small.
The frequent calls to A616 are a symptom of another problem, the small generic buffer size.

There are other situations where the SQL can be a symptom of another problem, as when
buffered tables are accessed from ABAP in a way that bypasses the SAP buffering (SAP note
47239).

8.2. Create candidate list from ST03N


Run ST03N, sort by total database time.

Figure 58: ST03N by DB time


Transactions which have the longest DB time, and thus probably the largest impact on the database
server, are at the top of the list. Next, click on the “Database” tab, to see the per-call times.

Figure 59: ST03N by DB time database tab


Check the average times for sequential read (‘Ø Seq read time’ column). If this is long, it usually
means one of two things:
• Each read retrieves many rows (this is not a problem, if the program needs the data), or
• The SQL access is inefficient, and it takes a long time for a few rows (this is a problem).

At this point, we cannot tell which is occurring, but one can trace the programs with ST05, as shown
in Section 7.3, to determine whether there is a problem or not.

9. System Health Check


9.1. CPU activity
As discussed above, SQL problems can contribute to high CPU use on the DB server, and ABAP
coding problems can contribute to high CPU use on the application server. If the system has high
CPU utilization, then first search for SQL and application problems that may be causing high CPU
use.

SAP ST06 transaction has statistics on recent use, and on historical CPU use. Since ST06 has
statistics based on hourly averages, it will not show CPU constraints until they are very severe. It is
generally better to monitor CPU use using an AIX based tool, and gather statistics on 5 or 10 minute
intervals, in order to be able to calculate average and peak activity.

One can use nmon, sar, ptx, or other tools to gather and report on the utilization.
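
To illustrate why hourly averages can hide a CPU constraint, the sketch below summarizes one hour of 5-minute samples. The sample values are hypothetical, and the 95% cutoff for a "busy" interval is an assumption chosen for the example.

```python
# Hypothetical CPU utilization (%) from twelve 5-minute intervals in one hour.
samples = [35, 40, 38, 100, 100, 98, 42, 36, 40, 39, 41, 37]

hourly_average = sum(samples) / len(samples)
peak = max(samples)
# Count intervals at or above an assumed "saturated" cutoff of 95%.
busy_intervals = sum(1 for s in samples if s >= 95)

print(f"average {hourly_average:.1f}%, peak {peak}%, busy intervals {busy_intervals}")
```

The hourly average is close to 54%, which looks healthy, while the server was saturated for about 15 minutes of that hour.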

While it may be acceptable for an application server running batch work to run at 100% utilization, if
an application server supporting dialog or critical interfaces is frequently hitting 100% utilization, it
can have a significant impact on response time and end-user satisfaction.

If the DB server has high CPU activity:


• Evaluate the impact of inefficient SQL and applications as in Section 8.1.
• If there is a SAP application instance on the DB server, investigate whether it can be moved
off.
• If no work can be moved off, evaluate moving some batch jobs to run outside of peak time.
• Acquire more CPU power for the DB server

If one application server has high CPU activity


• Evaluate the impact of inefficient ABAP code on excessive CPU use as in Section 7.2.
• Evaluate changes in the SAP login groups to balance the workload differently
• Change the mix of batch/update/dialog to reduce the amount of work on the server
• Acquire more CPU power

9.2. I/O activity


I/O activity should be monitored at the Oracle level from SAP using ST04, or as an alternative, one
can use STATSPACK to monitor it at the Oracle level directly from Oracle. Oracle STATSPACK
has the benefit that it can be configured to automatically gather statistics, which can be used for
historical reporting and comparison. ST04 file statistics are gathered by logging in to view the current
statistics, or can be manually reset for an interval.

ST04 and STATSPACK both report average read times, which are more useful than the read rates
available with most AIX monitors.

Another reason for starting with the Oracle view of I/O activity is that there are several different
configuration problems that can cause the symptom of slow I/O at the Oracle level, though physical
I/O at the AIX level is fast. The actual problem is serialization on a resource (e.g. AIO server
processes, filesystem buffers, or filesystem i-nodes) in AIX after the I/O is issued by Oracle, but
before AIX sends the I/O to the disk.

Rules of thumb (ROTs) for sites using Oracle database on JFS filesystems


• Configure AIO max servers to be (oracle data files * 1.25), to prevent shortages of AIO servers
• Set vmtune/vmo numfsbufs to at least 300, to prevent shortage of fsbufs
• Limit the size of each datafile to 2 GB, to reduce i-node serialization problems

ROTs for sites using Oracle database on JFS2 filesystems


• Configure AIO max servers to be (oracle data files * 1.25), to prevent shortages of AIO servers
• Set vmtune/vmo numfsbufs to at least 300, to prevent shortage of fsbufs
• Mount the Oracle database filesystems with the cio option, to avoid i-node serialization
problems
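
The AIO sizing rule above is simple arithmetic. A minimal sketch, assuming the result is rounded up to a whole number of servers (the rounding is an assumption, as is the function name):

```python
import math

def aio_max_servers(oracle_data_files: int) -> int:
    """AIO max servers per the rule of thumb: data files * 1.25, rounded up."""
    return math.ceil(oracle_data_files * 1.25)

print(aio_max_servers(200))  # 250
print(aio_max_servers(10))   # 13
```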

There are also queue sizes on disk adapters and hdisks that can be checked and set in the ODM via
SMIT. In our experience, the defaults work fine with SAP.

ST04 > detail analysis menu > file system requests can be used to view read and write activity and
times, as viewed by Oracle. With AIX AIO, the write times are often unreliable, so it is generally best
to focus on the read times as an indicator of performance problems.

9.2.1. Good I/O performance in Oracle


In the following section of an Oracle v$filestat report (ST04 > detail analysis menu > file system
requests) note that the average I/O response times are in the low single digits, and they are not widely
skewed. Here, the I/O response times are good. (Though there could still be performance
problems due to low hit-rate in Oracle, or other factors.)

Figure 60: ST04 file system requests with good response times


9.2.2. Symptom of I/O hotspot on disk


Figure 61 is an ST04 “file system request” report (Oracle v$filestat) that has been downloaded to
Excel.

Figure 61: ST04 file system requests


It has widely varying I/O response times on the most frequently read datafiles. When there is skew
in the response time of read-intensive files, it is a common symptom of an I/O constraint at the
physical disk level, where the data files are not spread across enough disks. (A disk may be called
a vpath, hdiskpower, LUN, etc – it is the AIX representation of the physical location where the
data is stored).
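
One way to put a number on the skew is to compare the slowest and fastest average read times among the most frequently read datafiles. The sketch below works on data downloaded from ST04; the file names and values are hypothetical, and the text does not prescribe a particular ratio as a limit.

```python
# (datafile, number of reads, average read time in ms): hypothetical values.
files = [
    ("data_1", 900_000, 4.0),
    ("data_2", 850_000, 28.0),
    ("data_3", 700_000, 5.0),
    ("data_4", 1_000, 60.0),   # rarely read, so ignored below
]

def read_time_skew(files, top_n=5):
    """Ratio of slowest to fastest average read time among the busiest files."""
    busiest = sorted(files, key=lambda f: f[1], reverse=True)[:top_n]
    times = [avg_ms for _, _, avg_ms in busiest if avg_ms > 0]
    return max(times) / min(times) if times else 1.0

print(read_time_skew(files, top_n=3))  # 7.0
```

Restricting the comparison to the busiest files keeps rarely read datafiles, whose averages are based on few samples, from dominating the result.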

In order to prevent this from happening, we recommend using a database layout where each LV is
configured to reside on many hdisks. In SMIT, this would be done by creating the LV on several
hdisks, and specifying “Range of physical volumes maximum”. See the paper “Configuring the
Enterprise Storage Server (ESS) for Oracle OLTP Applications”, which is document number
WP100319 at http://www.ibm.com/support/techdocs, for more details about configuring a “stripe
and spread” layout for the database.

The solution for I/O hotspots is to analyze the I/O at the physical level in the storage system, and if
overloaded spots are found, to spread the datafile across more disks:
• Determine name(s) of file(s) with slow response time – ST04 or STATSPACK
• Convert filename to filesystem – using ls or find commands
• Convert filesystem to LV – using df or lsfs
• List PVs (or vpaths) in the LV – using lslv -p
• If ESS, list LUNs for the vpaths – using lsess
• Use ESS expert to check I/O activity for clusters, ranks, LUNs

9.2.3. May be AIX constraint or disk performance problem


Since Oracle I/O statistics contain the time for the I/O to go through AIX and then be issued to
disk, there are configuration problems that can masquerade as I/O bottlenecks. The ROTs for
avoiding the most common configuration problems were listed in Section 9.2.
The AIX serialization problem mentioned here is not common, but can occur on a large system
when a specific workload is being processed in parallel, so that many jobs are accessing and
changing the same tables and same Oracle datafiles. This might occur when doing parallel IDOC
processing, or running any business process (delivery due, billing, forecasting, etc) via many
parallel batch jobs.

Figure 62: ST04 file system requests slow on high write files
The average I/O times are good on some files, but the files that have the most write activity have
very slow I/O times. This could occur for several reasons, such as AIX i-node contention, or write
cache filling on the disk system. I-node serialization occurs on JFS structured datafiles. When the
file is being written, it is locked to serialize access and preserve integrity. This serialization blocks
readers and writers from using the datafile.

When there are slow Oracle I/O times on frequently changed datafiles, the actions to resolve the
problem might be:
• Examine disk statistics (e.g. ESS expert) for signs of disk or cache overload related to high
write activity, if none found, then
• Open a perfpmr to confirm whether AIX i-node serialization is a problem
• If i-node serialization is a problem, then
o Move data to JFS2 filesystems with AIX 5.2, and mount with cio option, which gets rid
of i-node serialization, or
o If JFS2 is not an option, and datafiles are currently larger than 2 GB, rebuild the slow
tablespaces with a max 2 GB datafile size, or
o If JFS2 is not an option, convert to raw LV structured DB.

9.3. Oracle Review


There are a couple of quick checks one can do, as part of the system health check. See SAP note
618868 for more information on Oracle performance FAQs.

9.3.1. Database Hit Rate


ST04 can be used to review the hit rate in the database. Since SAP often executes many SQL
operations per dialog step, the goal should generally be to have a hit rate in the high 90s. A 96%
hit rate is two times the misses of a 98% hit rate. A 94% hit rate is three times the misses of a
98% hit rate. The impact of cache misses quickly grows as hit rate decreases.
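
The comparison of hit rates is easier to see when expressed as miss rates. A minimal sketch of the arithmetic in the paragraph above (the function name is illustrative):

```python
def relative_misses(hit_rate: float, baseline: float = 98.0) -> float:
    """Cache misses at hit_rate as a multiple of the misses at the baseline rate.

    Both rates are given in percent.
    """
    return (100.0 - hit_rate) / (100.0 - baseline)

print(relative_misses(96.0))  # 2.0
print(relative_misses(94.0))  # 3.0
```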

Since inefficient SQL can cause the symptom of low hit rates, the SQL cache analysis process in
Section 8.1 should be the first action, when a low hit rate is seen. After the SQL analysis, one can
review adding more memory to Oracle (in order to increase hit rate) by using the process in
Section 9.6.2.

Figure 63: ST04 database overview


9.3.2. Database delays


The Oracle v$system_event view can be used to review the causes of delay in Oracle. These delay
statistics are recorded in STATSPACK. STATSPACK reports can be uploaded to the web site
www.oraperf.com for an analysis.

As with hit-rates, many Oracle delays (buffer busy waits, enqueue, etc) can be symptoms of SQL
or application design problems, and may not be fixed with database parameters.

Normally, db file sequential read is the largest source of delay. See SAP note 619188 for more
information on Oracle wait events.
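
As a rough sketch of how the delay statistics might be ranked once they have been extracted from v$system_event, the following orders events by their share of total wait time. The sample numbers are invented for illustration, and the short list of idle events to exclude is an assumption, not a complete list.

```python
from typing import NamedTuple


class WaitEvent(NamedTuple):
    event: str
    total_waits: int
    time_waited: int  # hundredths of a second, as in v$system_event


# Assumption: a few well-known idle events that say nothing about delay.
IDLE_EVENTS = {
    "SQL*Net message from client",
    "rdbms ipc message",
    "pmon timer",
    "smon timer",
}


def rank_waits(events):
    """Return (event, share of non-idle wait time), largest first."""
    busy = [e for e in events if e.event not in IDLE_EVENTS]
    total = sum(e.time_waited for e in busy)
    ranked = sorted(busy, key=lambda e: e.time_waited, reverse=True)
    return [(e.event, e.time_waited / total if total else 0.0) for e in ranked]


# Invented sample rows, not taken from a real system.
sample = [
    WaitEvent("SQL*Net message from client", 900_000, 50_000_000),
    WaitEvent("db file sequential read", 400_000, 320_000),
    WaitEvent("buffer busy waits", 12_000, 40_000),
    WaitEvent("enqueue", 300, 25_000),
    WaitEvent("db file scattered read", 20_000, 15_000),
]

for name, share in rank_waits(sample):
    print(f"{name:28s} {share:6.1%}")
```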

Figure 64: ST04 - Oracle v$system_event delay statistics

While a workload is running, you can use “ST04 > detail analysis > oracle sessions”, to display
wait events for the Oracle processes. Associating an Oracle wait event with its SQL statement can
make it easier to determine the cause of the delay.

Figure 65: ST04 Oracle sessions


9.4. SAP buffer settings


When SAP buffers fill and swap, it can cause performance problems for transactions and batch jobs.
One example of this was shown in Section 51, where the generic buffer area was filled. In addition to
this, another common problem is when the program buffer fills, causing swaps and impacting response
time. One can check ST03N, to see the impact of program swaps overall.

Figure 66: ST02 swaps


ST02 shows swaps in the program, CUA, and screen buffers. Check ST03N to evaluate the system-wide impact.

Figure 67: ST03N overview


Figure 67 shows ST03N data that has been downloaded to Excel. Note that load and generate time is
about 5% of average dialog response time. However, this impact will not be evenly distributed. When
the programs and screens referenced by a program are in SAP buffers, the program will run quickly.
If they are not present, then it might take several seconds longer than normal.

If the application servers are memory constrained, then one may choose to live with this impact. If
one wants to be able to have more reliable transaction response times, then one would increase the size
of the buffers that are swapping.

9.5. SAP buffered table statistics


There are two quick checks that can be done:
• Tables that are not buffered, but which should be, and
• Tables which are buffered, but whose settings should be changed.

9.5.1. Not buffered but could be buffered


The most common problem with buffered tables is tables that could be buffered, but are not.
Generally, these are custom tables. Most SAP tables have the appropriate settings already
enabled.

Figure 68 is a segment of an ST10 report for not buffered tables, ordered by calls to the database.
Impact on the database is more a function of the buffer gets per row (as shown in Section 8.1) and
not calls, but calls or rows are the only orders available with ST10.

Figure 68: ST10 not buffered tables

There are three tables in this list that may be buffering candidates:
• YMSESPIV02
• OICDC
• YVGMDRCADDR

Since OICDC is a SAP standard table (it does not start with Y or Z), one can check in SE13 if
buffering is allowed but not currently enabled. If it is allowed but not enabled, we can turn
buffering on.

The two Y* tables are custom tables, so we must check:

• How large is the table?
• Will it be changed frequently? (From the ST10 statistics, it appears not.)
• Can the application tolerate a short period where the buffered versions are different, while
a change to the table is being propagated?

Since the tables are all read with direct read, if the table is reasonably small (e.g. <5 MB), not
changed much, and the application can tolerate inconsistency, then they could be fully buffered.
If the tables are very large, then they might be single record buffered, to save buffer space by
buffering only the referenced rows.

SELECT SINGLE (SAP direct read) can be read from either the generic or single record buffer.
SAP SELECT can only be read from the generic buffer.
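
The checks above can be written down as a small decision helper. This is a minimal sketch, not an SAP rule: the function name, the threshold for "not changed much", and the treatment of every table above the small-table limit as a single record candidate are all assumptions made for illustration.

```python
def suggest_buffering(
    size_mb: float,
    changes_per_day: int,
    tolerates_stale_reads: bool,
    read_with_select_single: bool,
    small_table_mb: float = 5.0,     # "reasonably small" limit from the text
    max_changes_per_day: int = 10,   # hypothetical "not changed much" limit
) -> str:
    """Suggest a buffering type for a table that is currently not buffered."""
    if not tolerates_stale_reads or changes_per_day > max_changes_per_day:
        return "not buffered"
    if size_mb < small_table_mb:
        return "fully buffered"
    # Larger tables: only rows read by SELECT SINGLE can use the
    # single record buffer.
    if read_with_select_single:
        return "single record buffered"
    return "not buffered"


print(suggest_buffering(2.0, 1, True, True))
print(suggest_buffering(800.0, 1, True, True))
print(suggest_buffering(2.0, 5000, True, True))
```

The result is only a starting point for review; the final decision still depends on how the application uses the table.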

9.6. Evaluate memory in AIX


9.6.1. AIX paging
The standard tools (nmon, ptx, sar) can be used for monitoring paging. The page space page-ins
and page-outs (paging to disk) are the important indicators to check.

If you encounter AIX paging on a system with an SAP instance, consider paging first as a
symptom of an application problem, and approach it from the application statistics in SAP.
Since SAP memory use can vary greatly, as programs allocate and free SAP memory, ABAP
problems such as the slow internal table access described in Section 7.2 can contribute to paging
problems. If a report processing many line items runs quickly, while it is running it will acquire
more memory, and then release it. If the program runs too long because of inefficient ABAP
coding, then the program will need the memory for longer than it should. When several programs
do this, it can cause paging. If the programs were coded efficiently, then they would quickly
finish, and this would reduce the likelihood that they would all be running simultaneously.

After having found the programs and users with large memory requirements, one can consider
running the reports in batch on a server specially configured for large memory requirements, or the
huge reports can be automated to run at night, when there may be less demand for memory.

Check for running programs using large amounts of memory with ‘SM04 > goto > memory’:

Figure 69: SM04 > goto > memory


If a user is running several transactions, only one of the transactions will be displayed in SM04.
Choose a user, and press “Sessions” to see the names of the transactions.

Figure 70: SM04 Sessions


One can then examine these transactions further to determine which use the memory. Use table
TSTC to see transaction name and code. VA05 is “List of Sales Orders”.

ST03N has historical statistics on memory usage.

Figure 71: ST03N memory use statistics

9.6.2. Evaluating increasing memory for Oracle or SAP


If you have evaluated SAP or Oracle memory use, and would like to increase the buffer sizes in
Oracle or SAP, how do you determine whether there is memory available, or whether adding
memory will cause paging? The free page information on commands such as vmstat does not
always reflect how much memory is really available to be added to SAP or Oracle buffers.

If vmstat shows free memory, then there are truly free memory pages that can be added to SAP or
Oracle buffers. Check vmstat (or ST06) over the course of several days, and if it consistently
shows free memory, then use the minimum free pages to determine the amount of memory that is
available to add to Oracle or SAP. Do not try to allocate all free memory to SAP or Oracle.


Figure 72: ST06 displays free memory

There are situations, such as with a database server or NFS server, where AIX may show
almost no free pages, but there is still a lot of memory that can be added to SAP or Oracle.

AIX maps file pages into available physical memory, in order to help performance. There are
AIX parameters on the vmtune (AIX 5 vmo) command that set the limits of memory-mapped files.
On a database server, first check that SAP note 78498 has been implemented to establish limits
on real memory used for memory-mapped files.

Check the amount of real memory that is being used for memory-mapped files. This can be done
with the command svmon -G.

Figure 73: svmon -G output

The total of AIX memory mapped pages is the sum of “pers” and “clnt” columns, for both “pin”
and “inuse” rows. In this example, 234 + 1,585,658 = 1,585,892 pages.

Check the memory-mapped file page limits with the vmtune (vmo) command:

Figure 74: vmtune maxperm display


Note in Figure 74 that vmtune has been used to limit maxperm to 12%, or 473,836 pages. Figure
73 showed that actual perm storage in memory was 1,585,892 pages. When strict_maxperm is 0,
AIX can expand perm storage use past the maxperm setting. 1,585,892 - 473,836 = 1,112,056
pages, or roughly 4 GB with 4 KB pages, that could potentially be given to Oracle or SAP.

Rather than having Oracle have a cache miss and do I/O that AIX fulfills from memory-mapped
file cache, it is usually more efficient to give the memory to Oracle. Likewise, if additional SAP
buffer memory can reduce database calls, then it is generally good to give the memory to SAP.

Since SAP and Oracle memory use can vary widely from hour to hour, run the svmon command
periodically for several days or a week, to determine the minimum number of (persistent + client)
pages over the period. This minimum (pers+clnt pin+inuse) should be compared with vmtune
(vmo) maxperm, to determine if there is available memory.
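
The comparison can be sketched as follows, assuming the (pers + clnt) totals have already been collected from periodic svmon -G runs. The first sample is the total from Figure 73; the other samples are hypothetical, and the 4 KB page size is an assumption consistent with the 4 GB estimate above.

```python
PAGE_SIZE_BYTES = 4096  # assumption: 4 KB pages


def pages_above_maxperm(file_page_samples, maxperm_pages):
    """Use the minimum observed (pers + clnt) total, so the estimate holds
    even at the point where file cache use was lowest."""
    if not file_page_samples:
        raise ValueError("need at least one svmon sample")
    return max(0, min(file_page_samples) - maxperm_pages)


def pages_to_gib(pages):
    return pages * PAGE_SIZE_BYTES / 2**30


# First value is from Figure 73; the rest are hypothetical samples.
samples = [1_585_892, 1_702_440, 1_611_300]
maxperm = 473_836  # from Figure 74

candidate = pages_above_maxperm(samples, maxperm)
print(f"{candidate:,} pages, about {pages_to_gib(candidate):.1f} GiB")
```

The result is an upper bound on what might be moved, not a recommended allocation.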

Make changes gradually, and don’t over-allocate the memory in Oracle and SAP buffers, as that
can cause AIX paging.

Don’t add memory to Oracle or SAP just because AIX shows that there are available pages. If
Oracle hit rates are low, or SAP buffers need to be increased to support the workload, then
increasing the size of the Oracle and SAP buffers is reasonable.

10. Four guidelines for avoiding performance problems


As seen in the examples above, there are a few general rules for avoiding performance problems.

10.1. Use the SAP data model


If you’re evaluating the performance of custom code, and it runs slowly because the predicates don’t
match the indexes on SAP tables, the odds are very good that the program is looking for the data in the
wrong place. Most SAP business documents (e.g. sales order, purchase order, delivery note, etc) can
be found using the document number with a standard SAP table and standard SAP index.

In addition to the example in Section 8.1.6, see SAP notes 185530, 187906, and 191492 for examples
of incorrect and correct use of the SAP data model.

The symptom of this problem is in the ST04 SQL cache - high buffer gets per exec and high buffer
gets per row. This happens when the predicates on the SQL do not match the index columns on the
table and Oracle has to read many blocks of data to retrieve the result.

10.2. Use array operations on the database


If the program builds internal tables that contain keys for selects on other tables, evaluate whether an
array operation such as FOR ALL ENTRIES can be used to perform array selects, rather than using
LOOP AT with individual database calls.

The symptom of this problem is seen in ST05 traces, where a program makes frequent calls to a table,
and each call accesses few rows.
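
The difference in call counts can be illustrated outside ABAP with a small in-memory database. This is an analogy only: the table `items`, the helper names, and the block size are invented for the sketch, and it does not reproduce how the SAP database interface implements FOR ALL ENTRIES.

```python
import sqlite3


def fetch_one_by_one(conn, keys):
    """One database call per key, like a single select inside LOOP AT."""
    rows = []
    for key in keys:
        rows.extend(
            conn.execute("SELECT id, descr FROM items WHERE id = ?", (key,)).fetchall()
        )
    return rows, len(keys)


def fetch_as_array(conn, keys, block_size=5):
    """One database call per block of keys, in the spirit of an array select."""
    rows, calls = [], 0
    unique_keys = sorted(set(keys))
    for start in range(0, len(unique_keys), block_size):
        block = unique_keys[start:start + block_size]
        placeholders = ",".join("?" * len(block))
        sql = f"SELECT id, descr FROM items WHERE id IN ({placeholders})"
        rows.extend(conn.execute(sql, block).fetchall())
        calls += 1
    return rows, calls


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, descr TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)", [(i, f"item {i}") for i in range(1, 101)]
)

keys = list(range(1, 21))
single_rows, single_calls = fetch_one_by_one(conn, keys)
array_rows, array_calls = fetch_as_array(conn, keys)
print(f"single selects: {single_calls} calls, array select: {array_calls} calls")
```

Both approaches return the same rows; the array version simply makes fewer round trips to the database.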

10.3. Check whether the database call can be avoided


There are several versions of this problem:
• A table is set as buffered, but the application server generic or single record buffers are too
small to hold the table rows. The cure for this is to increase the size of the SAP buffers.
• A table is not set to be buffered, but should be. In this case, the table is usually read-only, and
the application can tolerate a small interval when the data is not in synch on all the application
servers. In this case, the table attributes should be changed to buffered.
• A program is repeatedly fetching the same information from the database. This problem can
be detected by using the ‘duplicate selects’ function in ST05. The program needs to be
examined to see how it can be changed so that it does not have to repeatedly go back to the
database for the same information.
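
The idea behind the duplicate selects check in the last case can be sketched as a simple count of identical statements with identical bind values. The trace format here (a list of statement and bind value pairs) and the sample statements are hypothetical, not the ST05 trace format.

```python
from collections import Counter


def duplicate_selects(trace):
    """trace: iterable of (statement, bind_values) pairs.
    Returns {(statement, bind_values): count} for entries seen more than once."""
    counts = Counter((stmt, tuple(binds)) for stmt, binds in trace)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical trace entries for illustration.
trace = [
    ("SELECT * FROM T001 WHERE BUKRS = :A0", ["1000"]),
    ("SELECT * FROM MARA WHERE MATNR = :A0", ["M-01"]),
    ("SELECT * FROM T001 WHERE BUKRS = :A0", ["1000"]),
    ("SELECT * FROM T001 WHERE BUKRS = :A0", ["1000"]),
    ("SELECT * FROM T001 WHERE BUKRS = :A0", ["2000"]),
]

for (stmt, binds), n in duplicate_selects(trace).items():
    print(f"{n}x {stmt} with {binds}")
```

Statements that repeat with the same values are candidates for reading once and keeping the result in the program.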

10.4. Write ABAP programs that are line-item scalable


If the program will process many lines in a report:
• Use BINARY SEARCH on the “READ TABLE from itab” statements
• Evaluate whether internal tables need to be defined as SORTED

The symptom of this problem is high CPU use for a program, where CPU use does not scale linearly
with additional line items. For example, a 100 line report takes 1 second of CPU, but a 1000 line
report takes more than 10 seconds of CPU, and a 10,000 line report takes much more than 100 seconds
of CPU. These scalability problems get worse as the report lines increase.
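
The scaling effect can be shown by counting lookup steps rather than measuring CPU. This is a sketch in Python, not ABAP: the helper names are invented, and the step counts only approximate the relative cost of a linear read versus a binary search on a sorted table.

```python
def linear_lookup_steps(table, key):
    """Steps taken by a sequential scan to find key."""
    for steps, value in enumerate(table, start=1):
        if value == key:
            return steps
    return len(table)


def binary_lookup_steps(sorted_table, key):
    """Steps taken by a binary search on a sorted table."""
    low, high, steps = 0, len(sorted_table), 0
    while low < high:
        steps += 1
        mid = (low + high) // 2
        if sorted_table[mid] < key:
            low = mid + 1
        else:
            high = mid
    return steps


def total_steps(n, lookup):
    """Look up every one of n line items once, as a report might."""
    table = list(range(n))
    return sum(lookup(table, key) for key in table)


for n in (100, 1000):
    print(
        f"{n} lines: linear {total_steps(n, linear_lookup_steps):,} steps, "
        f"binary {total_steps(n, binary_lookup_steps):,} steps"
    )
```

With ten times the line items, the linear version does roughly a hundred times the work, while the binary search version grows only slightly faster than the line count.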


11. Appendix 1: summary of performance monitoring tools


A quick summary of key tools and their main functions in performance monitoring follows.

11.1. SAP
11.1.1. DB02 Transaction
DB02 is used to display information about tables and indexes in the database, such as space usage
trends, size of individual tables, etc.:
• Display all indexes defined on tables (DB02 > detail analysis)
• Check column cardinality (DB02 > detail analysis > enter table name > select table >
press table columns)

11.1.2. SE11 Transaction


SE11 is used to gather information about data dictionary and database definition of tables, indexes,
and views:
• Display table columns and indexes (SE11 > enter table name > display > extras >
database objects > check)
• Display secondary indexes defined in data dictionary (SE11 > enter table name > display
> indexes). There may be data dictionary indexes that are not active on the database.
• Display view definitions

11.1.3. SE30 Transaction


When STAT or ST03 shows that most of a program’s elapsed time is CPU, SE30 is used to
investigate where CPU time is spent in an ABAP program.

11.1.4. SM12 Transaction


SM12 > extras > statistics can be used to view lock statistics:
• High percentages of rejects can point to a concurrency problem (multiple programs trying
to enqueue the same SAP object) that may be solved via SAP settings such as “late
exclusive material block” in the OMJI transaction. There are different SAP settings for
different parts of the business processes, so these changes would be implemented with SAP
functional experts.
• High counts of error can point to a problem where the enqueue table is too small. Compare
“peak util” with “granule arguments” and “granule entries” to check for the table filling.

11.1.5. SM50 Transaction


SM50 is an overview of activity on an SAP instance. If many processes are doing the same thing
(e.g. access same table, ENQ, CPIC, etc), this can point to where further investigation is required.


11.1.6. SM51 Transaction


SM51 can be used to check instance queues (goto > queue information)
• If there are queues for DIA, UPD, UP2, etc, there will be “wait for work process” in STAT
and ST03.
• If there are queues for ENQ, there is a problem with enqueue performance.

11.1.7. SM66 Transaction


Gives an overview of running programs on an SAP system. If many processes are doing the same
thing (e.g. access same table, ENQ, CPIC, etc), this can point to where further investigation is
required.

11.1.8. STAT Transaction


Displays STAT records for a single SAP instance.

11.1.9. STAD Transaction


STAD is used to display STAT records for an interval from all instances on an SAP system.

11.1.10. ST02 Transaction


ST02 is used to monitor the activity in SAP managed buffer areas, such as program buffer, generic
buffer, roll, and EM.

11.1.11. ST03N Transaction


ST03 is not a tool for solving performance problems. It is a tool that is mainly useful for tracking
historical activity. One can monitor average response times for individual transactions and for the
system as a whole, and get counts of dialog steps to use for trend analysis.

There are a few limited ways that it might be used in performance monitoring:
• As a filter for inefficient programs. Use the ST03 “transaction” profile, sort the list by elapsed
time, and look for transactions which use very little CPU relative to elapsed time, e.g. 10% or
less of elapsed time is CPU on the application server. These may have problems such as
inefficient database access, slow RFC calls, etc.
• As a filter for problems that occur at a certain time of the day. Run ST03, and select “dialog”
process display. Use the ST03 “times” profile, press the right arrow to go to the screen that
displays average direct read, sequential read, and change times. Look for hours of the day
when the average time goes up. This could point to a time when there is an I/O constraint, or
CPU constraint on the DB server.
• Use as a filter for database performance problems, in very limited circumstances. If average
“sequential read” times are over 10 ms for dialog, and commit time is over 25-30 ms, there
may be some sort of database performance problem. Check SQL cache with ST04, look for
I/O constraints and other database problems.
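
The first filter above can be sketched against a downloaded transaction profile. The record layout (dictionaries with `tcode`, `elapsed_ms`, and `cpu_ms`) and the sample values are assumptions for illustration; ST03 does not export data in this exact form.

```python
def low_cpu_transactions(profile, max_cpu_share=0.10):
    """Return transactions whose CPU time is at most max_cpu_share of
    elapsed time, sorted by elapsed time, largest first."""
    flagged = [
        t for t in profile
        if t["elapsed_ms"] > 0 and t["cpu_ms"] / t["elapsed_ms"] <= max_cpu_share
    ]
    return sorted(flagged, key=lambda t: t["elapsed_ms"], reverse=True)


# Invented sample rows; ZMM01 is a hypothetical custom transaction.
profile = [
    {"tcode": "VA05", "elapsed_ms": 9_000, "cpu_ms": 8_100},
    {"tcode": "MIRO", "elapsed_ms": 12_000, "cpu_ms": 600},
    {"tcode": "ZMM01", "elapsed_ms": 30_000, "cpu_ms": 1_500},
    {"tcode": "SM50", "elapsed_ms": 0, "cpu_ms": 0},
]

for t in low_cpu_transactions(profile):
    share = t["cpu_ms"] / t["elapsed_ms"]
    print(f"{t['tcode']}: {share:.0%} of elapsed time is CPU")
```

Transactions that pass this filter are then examined with ST05 or the ST04 SQL cache, since the filter only indicates where to look.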


11.1.12. ST04 Transaction


ST04 has many functions. The most important for performance are viewing the SQL cache,
checking Oracle delays using v$system_event, monitoring database buffer activity, and
monitoring Oracle shadow processes.

11.1.13. ST05 Transaction


ST05 is one of the most important tools for SAP performance. Among its functions are:
• Trace calls to database server to check for inefficient SQL when program is known.
• Compress and save SQL traces for regression testing and comparisons.
• Trace RFC, enqueue, and locally buffered table calls.

11.1.14. ST06 Transaction


Display AIX-level stats for the application server – paging, CPU usage, and disk activity.

11.1.15. ST10 Transaction


ST10 is used to monitor table activity, and table buffering in SAP on the application server:
• Check for tables that are candidates for buffering in SAP
• Check for incorrectly buffered tables

11.1.16. RSINCL00 Program


Expand ABAP source and include files, with cross-reference of table accesses. This is useful
when examining ABAP source, as it gives an overview of the whole program.

11.1.17. SQLR Transaction


Merge an ST05 trace with STAT records, to determine which dialog step executed which
statements. This is useful when tracing a transaction made up of many dialog steps, to join the
trace to the dialog step that issued the problematic SQL.

Depending on the SAP version, this may be available as a transaction SQLR, program
/SQLR/0001, or program SQLR0001.

11.1.18. RSTRC000 Program


Lock a user mode into a work process. This is useful for doing traces (such as OS level or Oracle
level traces) where one needs to know the PID to establish the trace. Once the user is locked into
a work process, one can determine the PID of the work process and of the Oracle shadow process, and
use them to establish the trace.

11.2. AIX
11.2.1. nmon
Can be used to record and report many different AIX indicators. It is very useful for tracking and
trending CPU, I/O activity, or paging. The I/O activity is reported as rates, not rate and response
time, so one cannot use it to determine I/O delay. Oracle statistics should be used for I/O delays.

11.2.2. ptx
Can be used to record and report many different AIX indicators. It is very useful for tracking and
trending CPU, I/O activity, or paging activity. The I/O activity is reported as rates, not rate and
response time, so one cannot use it to determine I/O delay. Oracle statistics should be used for I/O
delays.

11.2.3. iostat
Shows I/O rates, but not response times. No easy way to record and play back. Not too useful in
diagnosing problems.

11.2.4. vmstat
Shows CPU use, and paging activity. No easy way to record and play back.

Some caveats on vmstat:


• fre is not really free memory at an application level, since AIX maps file pages into
memory, and will tend to fill real memory with anything that has been touched by a
program. If you are checking for available memory on a DB or application server, in
order to increase the size of Oracle or SAP buffers, see the process in section 9.6.2.
• In the page section, pi is most important, the rest of the page indicators are much less
important. Pi and po are real page activity to disk. Page-ins cause delay to a program that
is waiting on the page fault. Page-outs may just be AIX moving infrequently referenced
pages out to disk.

11.3. Oracle
The most important Oracle performance tool, SQL cache analysis, is in SAP ST04 transaction.

11.3.1. STATSPACK
STATSPACK can be used to monitor trend information, such as I/O activity, Oracle delays (the
v$system_event view), and CPU use. Since it cannot link SQL statements to their SAP program
source line, as can be done with ST04 SQL cache, the STATSPACK list of high impact SQL is of
limited use compared to ST04 SQL cache. STATSPACK shows the program name, but one then
has to search for the statement in the program.

STATSPACK setup is described in Oracle metalink DOCID 149124.1.

11.3.2. Trace and tkprof


Since Oracle does not show delay statistics (the v$system_event waits) in the SQL cache, one
needs to use Oracle trace to evaluate the causes of delay for a specific SQL statement. This is
seldom needed, since locking problems can be seen at runtime with SM50 and SM66, and I/O
counts and statement hit-rate can be seen with ST04 SQL cache.

Lock the user running the test in an SAP work process as discussed in section 11.1.18, then trace
the PID of the user’s Oracle process and run tkprof on the trace.

SAP note 654176 has information on using TKPROF. Oracle metalink also describes the process
for using trace and tkprof in DOCID 142898.1.


12. Appendix 2: Reference Materials


12.1. Oracle Manuals
• 9i Database Performance Tuning Guide and Reference – A96533-02

12.2. IBM manuals


• AIX 5L Performance Tools Handbook – SG24-6039


Figure 1: STAD transaction display of time components
Figure 2: ST03N VA05
Figure 3: ST03N VA05 top time
Figure 4: SE30 TMP variant selected
Figure 5: SE30 gather stats on internal tables
Figure 6: SE30 Aggregation level
Figure 7: SE30 start/end measurement
Figure 8: SE30 active
Figure 9: SE30 analysis
Figure 10: SE30 sorted by Net time
Figure 11: SE30 source code
Figure 12: SE30 program attributes
Figure 13: SE30 tips and tricks
Figure 14: SE30 binary search
Figure 15: MIRO ST03N times
Figure 16: MIRO ST03N Database
Figure 17: MIRO ST05 trace
Figure 18: MIRO ABAP display
Figure 19: MIGO program attributes
Figure 20: MIRO RBKP table scan
Figure 21: RBKP has three indexes
Figure 22: RBKP DDIC has 5 secondary indexes
Figure 23: RBKP index in data dictionary
Figure 24: ZMM program - lots of fast DB accesses
Figure 25: ZMM ST05
Figure 26: ST05 summarized trace
Figure 27: ZMM program compressed trace
Figure 28: ZMM program source
Figure 29: ST04 SQL cache - big problem example
Figure 30: ST04 SQL cache - moderately inefficient SQL
Figure 31: Sample explain
Figure 32: Display indexes from explain
Figure 33: VBRP high impact statement
Figure 34: VBRP explain
Figure 35: VBRP table and index
Figure 36: Program accessing VBRP
Figure 37: Program accessing VBRP not from SAP
Figure 38: SAP note with examples of using the SD data model
Figure 39: ST04 SQL cache for BDCPV example
Figure 40: BDCPV explain
Figure 41: Display definition of BDCPV view
Figure 42: BDCP indexes
Figure 43: BDCPS index columns
Figure 44: ST05 replaced variables
Figure 45: SE16 BDCP
Figure 46: BDCP number of entries
Figure 47: SE16 BDCPS
Figure 48: BDCPS number of entries
Figure 49: BDCPV access path with RULE hint
Figure 50: BDCPV source code
Figure 51: A616 in SQL cache
Figure 52: A616 explain
Figure 53: A616 index
Figure 54: A616 ABAP source
Figure 55: A616 technical settings
Figure 56: ST02 buffers with swaps
Figure 57: ST02 buffered objects
Figure 58: ST03N by DB time
Figure 59: ST03N by DB time database tab
Figure 60: ST04 file system requests with good response times
Figure 61: ST04 file system requests
Figure 62: ST04 file system requests slow on high write files
Figure 63: ST04 database overview
Figure 64: ST04 - Oracle v$system_event delay statistics
Figure 65: ST04 Oracle sessions
Figure 66: ST02 swaps
Figure 67: ST03N overview
Figure 68: ST10 not buffered tables
Figure 69: SM04 > goto > memory
Figure 70: SM04 Sessions
Figure 71: ST03N memory use statistics
Figure 72: ST06 displays free memory
Figure 73: svmon -G output
Figure 74: vmtune maxperm display
