Mark Gordon
Version: 1.1
Date: November 14, 2003
IBM Americas Advanced Technical Support
© 2003 International Business Machines, Inc.
1. ACKNOWLEDGEMENTS
2. DISCLAIMERS
3. COPYRIGHTS
4. FEEDBACK
5. VERSION UPDATES
6. INTRODUCTION
7. ANALYZING A PROBLEM WITH A SPECIFIC PROGRAM OR TRANSACTION
7.1. COMPONENTS OF SAP RESPONSE TIME
7.2. MAJORITY OF TIME IS CPU ON APPLICATION SERVER
7.2.1. Summary of Majority of time is CPU on application server
7.3. MAJORITY OF TIME IS DATABASE REQUEST TIME
7.3.1. Slow per-row SQL performance
7.3.1.1. Summary of Slow per-row SQL performance
7.3.2. Program performs too many SQL operations
7.3.2.1. Summary of Program performs too many SQL operations
7.4. MAJORITY OF TIME IS NOT CPU OR DB REQUEST TIME
8. SYSTEM PERFORMANCE PROBLEMS
8.1. PERFORM SQL CACHE ANALYSIS
8.1.1. Indicator of inefficient SQL access
8.1.2. SQL Bget ratios that are not problems
8.1.3. Causes of inefficient SQL access
8.1.4. Reasons that the predicates do not match indexes
8.1.5. Actions when predicates do not match indexes
8.1.6. High-impact SQL caused by incorrect use of SAP data model
8.1.6.1. Summary of incorrect use of data model
8.1.7. High-impact SQL when optimizer takes wrong access path
8.1.7.1. Summary of wrong choice of access path
8.1.8. High-impact SQL is a symptom of another problem
8.1.8.1. Summary of symptom of another problem
8.2. CREATE CANDIDATE LIST FROM ST03N
9. SYSTEM HEALTH CHECK
9.1. CPU ACTIVITY
9.2. I/O ACTIVITY
9.2.1. Good I/O performance in Oracle
9.2.2. Symptom of I/O hotspot on disk
9.2.3. May be AIX constraint or disk performance problem
9.3. ORACLE REVIEW
9.3.1. Database Hit Rate
9.3.2. Database delays
9.4. SAP BUFFER SETTINGS
9.5. SAP BUFFERED TABLE STATISTICS
9.5.1. Not buffered but could be buffered
9.6. EVALUATE MEMORY IN AIX
9.6.1. AIX paging
9.6.2. Evaluating increasing memory for Oracle or SAP
10. FOUR GUIDELINES FOR AVOIDING PERFORMANCE PROBLEMS
10.1. USE THE SAP DATA MODEL
10.2. USE ARRAY OPERATIONS ON THE DATABASE
10.3. CHECK WHETHER THE DATABASE CALL CAN BE AVOIDED
10.4. WRITE ABAP PROGRAMS THAT ARE LINE-ITEM SCALABLE
11. APPENDIX 1: SUMMARY OF PERFORMANCE MONITORING TOOLS
11.1. SAP
11.1.1. DB02 Transaction
11.1.2. SE11 Transaction
11.1.3. SE30 Transaction
11.1.4. SM12 Transaction
11.1.5. SM50 Transaction
11.1.6. SM51 Transaction
11.1.7. SM66 Transaction
11.1.8. STAT Transaction
11.1.9. STAD Transaction
11.1.10. ST02 Transaction
11.1.11. ST03N Transaction
11.1.12. ST04 Transaction
11.1.13. ST05 Transaction
11.1.14. ST06 Transaction
11.1.15. ST10 Transaction
11.1.16. RSINCL00 Program
11.1.17. SQLR Transaction
11.1.18. RSTRC000 Program
11.2. AIX
11.2.1. nmon
11.2.2. ptx
11.2.3. iostat
11.2.4. vmstat
11.3. ORACLE
11.3.1. STATSPACK
11.3.2. Trace and tkprof
12. APPENDIX 2: REFERENCE MATERIALS
12.1. ORACLE MANUALS
12.2. IBM MANUALS
1. Acknowledgements
Thank you to Walter Orb of IBM Germany, who was for several years the pSeries SAP performance lead
for IBM Americas Solutions ATS, and who showed me many of the processes used in this paper.
Thank you to Marty Carangelo, Dale Martin, and Ralf Schmidt-Dannert, for their contributions on Oracle
and pSeries performance.
Thank you to Phil Hardy and Damir Rubic, who reviewed the paper and offered many improvements.
2. Disclaimers
IBM has not formally reviewed this paper. While effort has been made to verify the information, this
paper may contain errors. IBM makes no warranties or representations with respect to the content hereof
and specifically disclaims any implied warranties of merchantability or fitness for any particular purpose.
IBM assumes no responsibility for any errors that may appear in this document. The information
contained in this document is subject to change without any notice. IBM reserves the right to make any
such changes without obligation to notify any person of such revision or changes. IBM makes no
commitment to keep the information contained herein up to date.
The paper contains examples from systems ranging from SAP 4.6, Oracle 8.1.7, and AIX 4.3 up to
SAP 6.20, Oracle 9.2, and AIX 5.2.
The processes and guidelines in this paper are the compilation of experiences analyzing performance on a
variety of SAP systems. Your results may vary in applying them to your system.
3. Copyrights
SAP and R/3 are copyrights of SAP A.G.
RS/6000, pSeries, and AIX are copyrights of IBM Corporation.
Oracle is a copyright of Oracle Corporation.
OraPerf.com is a copyright of OraPerf.com
4. Feedback
Please send comments or suggestions for changes to gordonmr@us.ibm.com.
5. Version Updates
• Version 1.0 – initial version
• Version 1.1 – add SM04 Sessions
6. Introduction
There are two intended audiences for this paper – Oracle DBAs and SAP BASIS administrators. Either
may be doing performance analysis on an SAP system with Oracle on AIX. The goal of this paper is to
provide each audience with material that is useful and new: an SAP Basis administrator experienced with
other databases should find the Oracle-specific tuning tools and techniques helpful, while the
experienced Oracle administrator is presented with SAP-specific tuning tools and techniques.
This paper covers the two most common types of performance problems – database performance and
inefficient ABAP coding. While there are other causes of problems in SAP (e.g. network performance,
external RFC interfaces, SAP instance configuration, SAP sort, etc.), database and ABAP performance are
the most common and generally have the biggest impact. In order to provide the most benefit in the
smallest paper, these other issues are not included in this paper.
This paper has a process-based approach, where different goals are pursued via different processes and
tools.
• To fix a problem reported for a specific program, we will perform elapsed time analyses of
programs, determine where time is spent, and optimize these long running parts. This includes
interpretation of ST03N and STAT/STAD records, and using ST05 and SE30. The paper will
demonstrate how to use the SAP stats to obtain database performance statistics, identify I/O
bottlenecks and SAP problems, etc. The benefit of this approach is that it is focused on an area
that has been identified as a business problem.
• To check for inefficient use of DB resources and improve overall database server
performance, we will use ST04 statement cache analysis. The value of this approach is that it
offers a very big potential payoff in reducing resource usage and increasing system efficiency.
The disadvantage is that one may be finding and solving problems that no end-user cares about.
For example, if we can improve the elapsed time of a batch job from 2 hours to 10 minutes, but the
job runs at 2:00 AM, and nobody needs the output until 8:00 AM, it may not really be a problem.
Even if it is not a business problem, it may still be beneficial to address a problem of this type as
part of optimizing resource consumption, in order to reduce the computing resources required to
support business requirements.
• To do a system health check, review AIX paging and CPU usage, Oracle I/O statistics, ST10 and
ST02 buffering. The operating environment needs to be running well for good performance, but
problems in these areas can be symptoms of other problems. For example, inefficient SQL can
cause high CPU usage or high I/O activity. Therefore, a health check should be done together with
analysis of SQL and ABAP problems.
This paper has many examples, and it describes what is good or bad in each example. There are not
always specific rules given on what is good or bad, such as “Database request time” over 40% of “elapsed
time” is bad and under 40% is good. Rather, this paper tries to focus on an opportunity-based approach,
such as:
• Look for where a program (or the SAP and database system) spends time.
• Ask “If I fix a problem in this area, will people notice and care that it has been fixed?”
This paper also discusses how to estimate the impact of solving a problem. System-wide performance
analysis (such as statement cache analysis, or ST03 analysis) will generally turn up several candidates. By
estimating the impact of fixing these problems, one can decide which to address first.
When doing this analysis, it is important to identify and track specific issues. Often, a performance issue
may not have enough impact to merit a new index, or an ABAP change. In this case, we want to track that
we have analyzed it, and chosen not to do anything, so that we don’t waste time discovering it again next
year.
This paper refers to a number of SAP Notes. An OSS userid, or userid that allows access to
service.sap.com, is a prerequisite for anyone doing performance analysis on an SAP system, whether the
person is an Oracle DBA, AIX administrator, or SAP BASIS Administrator.
This paper breaks performance management into three parts, which are discussed in Sections 7, 8, and 9:
• Analyzing a problem with a specific program or transaction
• System performance problems
• System Health Check.
Since AIX-level symptoms such as paging, excessive CPU use, and high I/O rates can be symptoms of
application and SQL problems, one needs to start with reviewing the SAP and Oracle indicators, before
taking action based on the AIX level indicators.
Check the ST03N “Hit lists”, to see what the elapsed time components are when it runs for a long
time. The longest running dialog steps are saved in the ST03N hit lists.
Since CPU time is the majority of elapsed time, use SE30 to trace the ABAP runtime. We arrange for
an end-user (ID MOO in the examples below) to run VA05 so we can trace it.
Run SE30 and select the TMP variant.
Then press “Change” to change the TMP variant options.
In the “Duration/Type” tab, select “by call”.
Press “Save” on the screen in Figure 6, then use the green arrow to go back to the SE30 main screen
shown in Figure 4. Then press “Enable/Disable” on the SE30 main screen to show the list of work
processes in Figure 7.
After tracing for a while, press “End measurement”, and use the green arrow to go back to the main
SE30 screen shown in Figure 4. Select the trace file, and press “Analyze”.
Press “Hit list” and then sort the list by “Net” to see where the time was spent.
READ TABLE can be very fast, depending on the size of the table and the options used in defining the
table. Note that the “Read Table IT_2396” statement takes about 32 microseconds per call (461269
microseconds/14526 calls).
Select the slow “Read Table” statement and press “Display source code” to see the ABAP.
Problems reading internal tables are a very common cause of performance and scalability issues.
In the SE30 transaction, press the “Tips and tricks” button to see various suggestions on ways to
improve ABAP programming. “Tips and tricks” contains several examples of common problems with
internal tables.
One can review “Tips and tricks” for suggestions on ways to improve ABAP programs. In this case,
the problem appears to be that the ABAP is doing a linear search of the internal table.
This is an SAP program, so we would open an OSS message to SAP. If the program were a
custom program, then we would send this to the development team to investigate ways to speed up
processing the internal tables, such as the “BINARY SEARCH” option on READ TABLE.
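To illustrate why a binary search helps, here is a minimal sketch in Python rather than ABAP. The table contents and function names are hypothetical. The row count is borrowed from the call count in the example above purely for scale. The sketch only shows the difference between scanning a table row by row and searching a table that is sorted by the lookup key.

```python
from bisect import bisect_left


def linear_read(table, key):
    """Scan rows in order until the key matches; return (row, comparisons)."""
    comparisons = 0
    for row in table:
        comparisons += 1
        if row[0] == key:
            return row, comparisons
    return None, comparisons


def binary_read(sorted_keys, table, key):
    """Look up key in a table that is sorted by key; return the row or None."""
    i = bisect_left(sorted_keys, key)
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return table[i]
    return None


# Hypothetical internal table: 14,526 rows of (key, value), sorted by key.
table = [(k, f"item-{k}") for k in range(14526)]
sorted_keys = [row[0] for row in table]

# Worst case for the linear scan: the key is in the last row.
row, comparisons = linear_read(table, 14525)
print(row, comparisons)

# The binary search needs on the order of log2(14526), about 14, probes.
print(binary_read(sorted_keys, table, 14525))
```

The binary search is only valid when the table is sorted by the search key, so the cost of sorting (or of keeping the table sorted) has to be weighed against the number of lookups.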
If average sequential read times are long, check the STAT/STAD records for the transaction, to
determine whether many rows are retrieved with each sequential read. If there are many rows per
read, use the time per row when evaluating performance.
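As a rough illustration of why the per-row time matters, the following Python sketch uses made-up numbers. They are not taken from the STAT/STAD records in this paper. A read that looks slow per call can be fast per row when each call returns many rows.

```python
def ms_per_row(total_read_ms, rows):
    """Average database time per row retrieved, in milliseconds."""
    if rows <= 0:
        raise ValueError("rows must be positive")
    return total_read_ms / rows


# Hypothetical counters for one dialog step (assumed values, for illustration).
sequential_reads = 40      # number of sequential read calls
total_read_ms = 2000.0     # total time spent in those calls
rows_retrieved = 8000      # rows returned by those calls

per_read = total_read_ms / sequential_reads          # 50 ms per call looks slow
per_row = ms_per_row(total_read_ms, rows_retrieved)  # 0.25 ms per row is fast

print(f"{per_read} ms per read, {per_row} ms per row")
```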
Since DB request time is the majority of time, and individual SQL calls may be long, trace the
transaction with ST05. Here is the output of an ST05 SQL trace.
Review the access path used by Oracle -- press the Explain SQL button in Figure 17.
“TABLE ACCESS FULL” in Figure 20 means that no index on the table is used. Get the local
predicates (the WHERE “column operator value” clauses) from the statement in Figure 20. We
will compare them to the columns in the indexes, to see if the predicates reference indexed
columns.
• MANDT =
• LIFNR =
• WAERS =
• RMWWR =
• BUKRS =
• BLDAT =
• RBSTAT =
• XBLNR =
Drill into the table name in Figure 20, to display the indexes and indexed columns.
the predicates. Thus RBSTAT, which is the third column in the RBKP~3 and RBKP~4 indexes,
cannot be processed as a matching column.
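The matching rule can be sketched in a few lines of Python. The predicate columns are the ones listed above. The two index layouts are hypothetical (they are not the actual RBKP index definitions, and COL_B is a made-up column name). The point is only that matching stops at the first index column that has no equality predicate, so later index columns cannot be used as matching columns even if they appear in the WHERE clause.

```python
def matching_columns(index_columns, equality_predicates):
    """Return the leading index columns that have an equality predicate.

    Matching stops at the first index column without a predicate.
    """
    matched = []
    for column in index_columns:
        if column not in equality_predicates:
            break
        matched.append(column)
    return matched


# Predicate columns taken from the statement discussed in the text.
predicates = {"MANDT", "LIFNR", "WAERS", "RMWWR",
              "BUKRS", "BLDAT", "RBSTAT", "XBLNR"}

# Hypothetical index layouts, for illustration only.
index_with_gap = ("MANDT", "COL_B", "RBSTAT")   # COL_B is not in the predicates
index_full_match = ("MANDT", "LIFNR", "BLDAT")  # every column has a predicate

print(matching_columns(index_with_gap, predicates))
print(matching_columns(index_full_match, predicates))
```

In the first layout only MANDT matches, even though RBSTAT is in the WHERE clause, because the second index column has no predicate.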
So, the problem seems to be that the predicates in the SQL do not match the indexes on the table.
Since Figure 19 showed that the program is SAP code, we would:
• First, check the SAP data dictionary definitions
• Second, search SAP notes for this problem
• Third, open an OSS message
Check the indexes already defined in the data dictionary using: SE11 > display > indexes.
Select the index in the list in Figure 22, and press ‘Choose’.
If the index is created, and does not solve the problem, then one would go to service.sap.com in
order to search SAP notes, and then open an OSS message if no solution was found.
Since the program is an SAP program, if the index had not been found in the data dictionary,
we would search OSS for notes about this problem, and if none were found, would open an
OSS message to SAP.
In this example, we have been asked to review the performance of a program that is slow. In
checking the stats for the program, we see that database request time (“0 DB”) is long, but the
average time for each sequential read is short, just 2.5 ms. (You may also see low average read
times with long running programs if the counters for database time fill or wrap. See SAP note
99584 for details.)
We trace the transaction with ST05, and then list the trace.
Choose the starting point and end point in the summary, and then press the summary button to
compress the trace by table and operation.
Since the program is custom code (Zxx), we send it back to the ABAP team, to review whether the
program could be restructured to use SAP array fetch operations.
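The difference between row-at-a-time access and an array fetch can be sketched as follows. This is an illustration only. It uses Python with an in-memory SQLite database as a stand-in for ABAP and Oracle, and the table and column names are hypothetical. In ABAP the corresponding change would use constructs such as SELECT ... INTO TABLE.

```python
import sqlite3

# In-memory stand-in database with a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (doc_no INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(n, n * 10) for n in range(1, 501)])

wanted = list(range(1, 101))  # keys the program needs


def fetch_row_by_row(conn, keys):
    """One SELECT per key: len(keys) calls to the database."""
    rows, calls = [], 0
    for key in keys:
        cursor = conn.execute(
            "SELECT doc_no, amount FROM items WHERE doc_no = ?", (key,))
        rows.extend(cursor.fetchall())
        calls += 1
    return rows, calls


def fetch_as_array(conn, keys):
    """One SELECT for the whole key list: a single call to the database."""
    placeholders = ",".join("?" * len(keys))
    sql = ("SELECT doc_no, amount FROM items "
           f"WHERE doc_no IN ({placeholders}) ORDER BY doc_no")
    return conn.execute(sql, keys).fetchall(), 1


rows_single, calls_single = fetch_row_by_row(conn, wanted)
rows_array, calls_array = fetch_as_array(conn, wanted)
print(calls_single, calls_array, rows_single == rows_array)
```

Both functions return the same rows. The array version replaces 100 calls with one, which is the kind of reduction in SQL operations that the ST05 summary makes visible.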
Since the ST04 SQL cache does not have per-statement CPU, elapsed time, or delay statistics, one
cannot easily use the SQL cache to find locking or other delay problems, or to estimate the time
impact of the inefficient SQL. In order to determine the runtime of inefficient SQL statements,
one must trace them with ST05.
With Oracle 9i, the Oracle V$SQL view contains CPU and elapsed time per statement. One can
use the information in V$SQL to augment the information displayed in the SAP ST04 SQL cache
when analyzing problems such as locking or I/O constraints that cause long statement elapsed
time.
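As a sketch of how the V$SQL timing columns might be used, the Python function below derives per-execution figures from rows that have already been fetched. It assumes the V$SQL columns EXECUTIONS, CPU_TIME, and ELAPSED_TIME, with both times in microseconds. Verify the column names and units against the Oracle reference for your release. The sample row is invented for illustration.

```python
def per_execution(row):
    """Derive per-execution timings (ms) from one V$SQL-style row.

    Assumes CPU_TIME and ELAPSED_TIME are cumulative microseconds.
    """
    executions = max(row["EXECUTIONS"], 1)  # avoid division by zero
    elapsed_ms = row["ELAPSED_TIME"] / 1000 / executions
    cpu_ms = row["CPU_TIME"] / 1000 / executions
    return {
        "elapsed_ms": elapsed_ms,
        "cpu_ms": cpu_ms,
        # Time not spent on CPU: waiting on I/O, locks, or other delays.
        "non_cpu_ms": elapsed_ms - cpu_ms,
    }


# Invented sample row, for illustration only.
sample = {"EXECUTIONS": 200, "ELAPSED_TIME": 90_000_000, "CPU_TIME": 4_000_000}
timings = per_execution(sample)
print(timings)
```

A statement whose elapsed time per execution is far larger than its CPU time per execution is a candidate for the locking or I/O problems mentioned above.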
The higher ‘Bgets/row’ and ‘Bgets/exec’ are, the more likely it is that the SQL is inefficient, and
Oracle is doing extra work to retrieve the data. If a statement is in the top 20 in the SQL cache
ordered by buffer gets (or disk reads) during a peak time, and both Bgets/exec > 50 and
Bgets/row > 50, the statement should be examined to determine if it can be improved.
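The screening rule above can be expressed as a short Python sketch. The field names and the three sample statements are hypothetical. In practice the numbers would come from the ST04 SQL cache display during a peak period.

```python
def review_candidates(statements, top_n=20, threshold=50):
    """Apply the screening rule: top N by buffer gets, both ratios > threshold."""
    top = sorted(statements, key=lambda s: s["buffer_gets"], reverse=True)[:top_n]
    flagged = []
    for stmt in top:
        execs, rows = stmt["executions"], stmt["rows_processed"]
        # A statement that returned no rows has an unbounded Bgets/row ratio.
        bgets_per_exec = stmt["buffer_gets"] / execs if execs else float("inf")
        bgets_per_row = stmt["buffer_gets"] / rows if rows else float("inf")
        if bgets_per_exec > threshold and bgets_per_row > threshold:
            flagged.append(stmt["name"])
    return flagged


# Hypothetical SQL cache entries, for illustration only.
statements = [
    {"name": "S1", "buffer_gets": 5_000_000, "executions": 10, "rows_processed": 8},
    {"name": "S2", "buffer_gets": 3_000_000, "executions": 1_000_000,
     "rows_processed": 1_000_000},
    {"name": "S3", "buffer_gets": 900_000, "executions": 100,
     "rows_processed": 900_000},
]

print(review_candidates(statements))
```

S2 is executed very often but is cheap per execution, and S3 reads many buffers per execution but returns many rows, so only S1 is flagged for review.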
Local predicates are the WHERE “COLUMN operator value” clauses in the SQL. If the
predicates don’t contain columns that are in indexes, then Oracle generally cannot access the data
efficiently.
ROWNUM <= is a predicate that is inserted by Oracle to stop fetching rows as soon as the ROWNUM
count has been reached, before the complete result set has been retrieved. MANDT is
usually automatically inserted into the SQL by SAP. The program specified VGBEL=.
One can drill into the table or indexes shown in Figure 31, to see the indexes on the table.
Many kinds of information are redundantly stored in SAP. For instance, the transaction data for a
billing document may contain columns with the document number for the delivery or sales order
being billed. But just because the information is present does not mean that it is the right way to
retrieve it. Billing documents, to continue the example above, are indexed to support lookup by
billing document number. If a program tries to access the billing tables using the delivery column,
in order to find the associated billing document, the column with the delivery number will not be
indexed, and access will be slow. Usually, the solution is not to create an index, but to use the
SAP data model correctly.
There is an example of incorrect use of the SAP data model below in section 8.1.6.
SAP has several SAP notes that describe wrong and right ways to use the SAP data model to
retrieve application information:
• MM – SAP note 191492
• SD – SAP note 185530
• PP – SAP note 187906
The statement is doing an index range scan, so it is using the index, but we cannot tell yet how
many index columns are matched in the SQL.
Drill in on the table VBRP to check the indexes to see which columns from the predicates are in
the indexes.
Use the “Display call point” button in Figure 33 to look at the source code.
SAP note 185530 describes several incorrect and correct lookups for SD. This problem is
contained in the note:
Figure 38: SAP note with examples of using the SD data model
The action for this problem is to send it back to the developer.
In this proposed fix, for SD documents there is a table VBFA (document flow) that contains the
predecessors and successors for sales documents. Given a sales order number, one can find all the
subsequent documents related to the sales order, or given an invoice number, one can find all the
predecessor documents that led to that invoice document.
Figure 40 shows that BDCPV is a view on BDCP and BDCPS. BDCP is accessed using
BDCP~POS, and BDCPS is accessed using BDCPS~1, and then the two result sets are sort merged
together.
The ST04 statistics tell us that this cannot be a good way to access the data, since each execution
takes over 500,000 buffer gets, and returns on average less than one row. If it were an efficient
way to access the data, it might take 20-30 buffer gets per execution.
One can use “SE11 > display > utilities > database object > display” to display the definition of the
view on BDCPV, in order to determine which tables contain which predicate columns.
In Figure 41, the columns MANDT, MESTYPE, and PROCESS are in BDCPS, and CRETIME
and CDOBJCL are in BDCP.
Check the indexes used, to see how well the predicates and indexes match.
contain MANDT, which must be checked to satisfy the join conditions. We must check further to
determine what the best access path is.
Going back to our list of predicates above and comparing them with the explain, we have:
• MANDT (in BDCPS~1 and BDCP~POS)
• MESTYPE (in BDCPS~1)
• PROCESS (in BDCPS~1)
• CRETIME (in BDCP~POS)
• CDOBJCL (evaluated in BDCP table, not in index)
Next, we want to determine where the filtering takes place -- whether the predicates on BDCPS
eliminate the most rows, or predicates on BDCP. Usually, the table where the filtering takes place
will be chosen as the first (aka driving) table in a join. In order to find the filtering predicates, we
have to trace the program with ST05, and get sample values of the variables, since ST04 does not
show the values.
Trace the program with ST05, list the trace, and press the ‘replace vars’ button.
Next, use SE16 to search each table using its predicates. First, check BDCP using the predicates
and values on CRETIME and CDOBJCL from Figure 44.
We now know that the filtering takes place on BDCPS in index BDCPS~1, which contains
MANDT, MESTYPE and PROCESS. This index also contains CPIDENT (see Figure 43), which
can be used to do a primary key unique index lookup into BDCP (see Figure 42).
So, we want BDCPS to be the outer table (the first one referenced) and then have BDCP accessed
using its primary index BDCP~0.
When we explain the statement again, this time using the Oracle RULE hint, this is the access path
that Oracle chooses -- BDCPS~1 is accessed first, then BDCP~0.
Oracle has a number of hints that can be used to influence the access path selection. See the
Oracle manuals referenced below in Section 12.1.
See also SAP notes 129385 and 130480 regarding use of hints.
Check the program source, and see that it is SAP code.
In Figure 51, press the button ‘Display call point in ABAP program’.
The table is configured as buffered on the application server, but there must be some sort of
problem, since the SQL cache statistics in Figure 51 show over 200,000 SQL calls. When a table
is buffered on the application server, there should be very few calls to the database server to
retrieve rows from the table.
It may be that the table is frequently changed and invalidated and then re-loaded or that the table
does not fit in the application server buffer. These can be checked in Figure 57 below. The table
may be read by the ABAP in a way that bypasses buffering on the application server. See SAP
note 47239 regarding buffered table behavior.
So, check the table buffers using ST02.
There are very few changes on the table, so the problem is not that the table is being invalidated
and re-loaded.
So, we have now determined that the frequent calls to A616 are not a problem with inefficient
SQL, but are caused by something else – that the generic buffers on the application servers are not
large enough.
There are other situations where the SQL can be a symptom of another problem, as when
buffered tables are accessed from ABAP in a way that bypasses the SAP buffering (SAP note
47239).
At this point, we cannot tell which is occurring, but one can trace the programs with ST05, as shown
in Section 7.3, to determine whether there is a problem or not.
The SAP ST06 transaction has statistics on recent and historical CPU use. Since ST06 has
statistics based on hourly averages, it will not show CPU constraints until they are very severe. It is
generally better to monitor CPU use using an AIX based tool, and gather statistics on 5 or 10 minute
intervals, in order to be able to calculate average and peak activity.
One can use nmon, sar, ptx, or other tools to gather and report on the utilization.
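To illustrate why short sampling intervals matter, here is a minimal Python sketch. It assumes that CPU utilization samples (percent busy) have already been collected at 5-minute intervals with one of the tools above; the sample values and the function name summarize_cpu are hypothetical and are not taken from any of those tools.

```python
from statistics import mean

def summarize_cpu(samples_pct):
    """Return (average, peak) CPU utilization for a list of interval samples."""
    if not samples_pct:
        raise ValueError("no samples")
    return mean(samples_pct), max(samples_pct)

# Hypothetical 5-minute samples covering one hour (12 intervals).
hour = [35, 40, 38, 42, 100, 100, 98, 45, 40, 37, 36, 39]

avg, peak = summarize_cpu(hour)
print(f"hourly average: {avg:.1f}%  peak 5-minute interval: {peak}%")
```

In this made-up hour the average is only about 54%, although the server was saturated for roughly 15 minutes. An hourly average alone would hide that constraint.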
While it may be acceptable for an application server running batch work to run at 100% utilization, if
an application server supporting dialog or critical interfaces is frequently hitting 100% utilization, it
can have a significant impact on response time and end-user satisfaction.
ST04 and STATSPACK both report average read times, which are more useful than the read rates
available with most AIX monitors.
Another reason for starting with the Oracle view of I/O activity is that there are several different
configuration problems that can cause the symptom of slow I/O at the Oracle level, though physical
I/O at the AIX level is fast. The actual problem is serialization on a resource (e.g. AIO server
processes, filesystem buffers, or filesystem i-nodes) in AIX after the I/O is issued by Oracle, but
before AIX sends the I/O to the disk.
There are also queue sizes on disk adapters and hdisks that can be checked and set in the ODM via
SMIT. In our experience, the defaults work fine with SAP.
ST04 > detail analysis menu > file system requests can be used to view read and write activity and
times, as viewed by Oracle. With AIX AIO, the write times are often unreliable, so it is generally best
to focus on the read times as an indicator of performance problems.
Figure 60: ST04 file system requests with good response times
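The read-time check described above can be sketched in Python as follows. This is only an illustration: it assumes the per-file read counts and total read times have been copied out of ST04 by hand, the file names are hypothetical, and the 20 ms threshold is an arbitrary example value, not a recommendation from this paper.

```python
def slow_read_files(file_stats, threshold_ms):
    """Return (filename, avg_ms) for files whose average read time
    exceeds threshold_ms, slowest first.

    file_stats: iterable of (filename, reads, total_read_time_ms).
    """
    result = []
    for name, reads, read_time_ms in file_stats:
        if reads == 0:
            continue  # no reads, so no meaningful average
        avg_ms = read_time_ms / reads
        if avg_ms > threshold_ms:
            result.append((name, avg_ms))
    return sorted(result, key=lambda item: item[1], reverse=True)

# Hypothetical data: (file, reads, total read time in ms).
stats = [
    ("/oracle/SID/sapdata1/btabd_1/btabd.data1", 200_000, 1_400_000),
    ("/oracle/SID/sapdata2/stabd_1/stabd.data1", 50_000, 1_750_000),
    ("/oracle/SID/sapdata3/temp_1/temp.data1", 0, 0),
]

for name, avg_ms in slow_read_files(stats, threshold_ms=20):
    print(f"{name}: {avg_ms:.1f} ms per read")
```

Write times are deliberately ignored here, since they are often unreliable with AIX AIO.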
In order to prevent this from happening, we recommend using a database layout where each LV is
configured to reside on many hdisks. In SMIT, this would be done by creating the LV on several
hdisks, and specifying “Range of physical volumes maximum”. See the paper “Configuring the
Enterprise Storage Server (ESS) for Oracle OLTP Applications”, which is document number
WP100319 at http://www.ibm.com/support/techdocs, for more details about configuring a “stripe
and spread” layout for the database.
The solution for I/O hotspots is to analyze the I/O at the physical level in the storage system, and if
overloaded spots are found, to spread the datafile across more disks:
• Determine name(s) of file(s) with slow response time – ST04 or STATSPACK
• Convert filename to filesystem – using ls or find commands
• Convert filesystem to LV – using df or lsfs
• List PVs (or vpaths) in the LV – using lslv –p
• If ESS, list LUNs for the vpaths – using lsess
• Use ESS expert to check I/O activity for clusters, ranks, LUNs
Figure 62: ST04 file system requests slow on high write files
The average I/O times are good on some files, but the files that have the most write activity have
very slow I/O times. This could occur for several reasons, such as AIX i-node contention, or write
cache filling on the disk system. I-node serialization occurs on JFS structured datafiles. When the
file is being written, it is locked to serialize access and preserve integrity. This serialization blocks
readers and writers from using the datafile.
When there are slow Oracle I/O times on frequently changed datafiles, the actions to resolve the
problem might be:
• Examine disk statistics (e.g. ESS expert) for signs of disk or cache overload related to high
write activity. If none are found, then
• Open a perfpmr to confirm whether AIX i-node serialization is a problem
• If i-node serialization is a problem, then
o Move data to JFS2 filesystems with AIX 5.2, and mount with cio option, which gets rid
of i-node serialization, or
o If JFS2 is not an option, and datafiles are currently larger than 2 GB, rebuild the slow
tablespaces with a max 2 GB datafile size, or
o If JFS2 is not an option, convert to raw LV structured DB.
Since inefficient SQL can cause the symptom of low hit rates, the SQL cache analysis process in
Section 8.1 should be the first action, when a low hit rate is seen. After the SQL analysis, one can
review adding more memory to Oracle (in order to increase hit rate) by using the process in
Section 9.6.2.
As with hit rates, many Oracle delays (buffer busy waits, enqueue, etc.) can be symptoms of SQL
or application design problems, and may not be fixed with database parameters.
Normally, db file sequential read is the largest source of delay. See SAP note 619188 for more
information on Oracle wait events.
While a workload is running, you can use “ST04 > detail analysis > oracle sessions”, to display
wait events for the Oracle processes. Associating an Oracle wait event with its SQL statement can
make it easier to determine the cause of the delay.
Figure 67 shows ST03N data that has been downloaded to Excel. Note that load and generate time is
about 5% of average dialog response time. However, this impact will not be evenly distributed. When
the programs and screens referenced by a program are in SAP buffers, the program will run quickly.
If they are not present, then it might take several seconds longer than normal.
If the application servers are memory constrained, then one may choose to live with this impact. If
one wants to be able to have more reliable transaction response times, then one would increase the size
of the buffers that are swapping.
Figure 68 is a segment of an ST10 report for not buffered tables, ordered by calls to the database.
Impact on the database is more a function of the buffer gets per row (as shown in Section 8.1) and
not calls, but calls or rows are the only orders available with ST10.
There are three tables in this list that may be buffering candidates:
• YMSESPIV02
• OICDC
• YVGMDRCADDR
Since OICDC is an SAP standard table (it does not start with Y or Z), one can check in SE13 if
buffering is allowed but not currently enabled. If it is allowed but not enabled, we can turn
buffering on.
Since the tables are all read with direct read, if a table is reasonably small (e.g. <5 MB), not
changed much, and the application can tolerate inconsistency, then it could be fully buffered.
If a table is very large, then it might be single record buffered, to save buffer space by
buffering only the referenced rows.
SELECT SINGLE (SAP direct read) can be read from either the generic or single record buffer.
SAP SELECT can only be read from the generic buffer.
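The buffering guidelines above can be summarized as a small decision function. The Python sketch below is illustrative only: the function name, the parameter names, and the returned strings are hypothetical, and treating every table of 5 MB or more as "very large" is a simplifying assumption, since the text gives no upper boundary.

```python
def suggest_buffering(size_mb, rarely_changed, tolerates_inconsistency,
                      read_by_select_single, small_limit_mb=5):
    """Suggest a buffering type for a table, following the rules of thumb above.

    small_limit_mb mirrors the "<5 MB" example in the text; anything at or
    above it is treated as large (an assumption made for this sketch).
    """
    if not (rarely_changed and tolerates_inconsistency):
        return "leave unbuffered"
    if size_mb < small_limit_mb:
        return "fully buffered"
    if read_by_select_single:
        # Only SELECT SINGLE can be served from the single record buffer.
        return "single record buffered"
    return "review manually"

print(suggest_buffering(3, True, True, True))
print(suggest_buffering(800, True, True, True))
print(suggest_buffering(800, True, True, False))
```

The last case returns "review manually" because a large table read with SELECT rather than SELECT SINGLE cannot use the single record buffer, so a different approach would have to be evaluated.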
If you encounter AIX paging on a system with an SAP instance, consider paging first as a
symptom of an application problem, and approach it from the application statistics in SAP.
Since SAP memory use can vary greatly, as programs allocate and free SAP memory, ABAP
problems such as the slow internal table access described in Section 7.2 can contribute to paging
problems. A report processing many line items acquires memory while it is running, and then
releases it; if it runs quickly, the memory is held only briefly. If the program runs too long because of inefficient ABAP
coding, then the program will need the memory for longer than it should. When several programs
do this, it can cause paging. If the programs were coded efficiently, then they would quickly
finish, and this would reduce the likelihood that they would all be running simultaneously.
After having found the programs and users with large memory requirements, one can consider
running the reports in batch on a server specially configured for large memory requirements, or the
huge reports can be automated to run at night, when there may be less demand for memory.
Check for running programs using large amounts of memory with ‘SM04 > goto > memory’:
ST03N has historical statistics on memory usage.
If vmtune shows free memory, then there are truly free memory pages that can be added to SAP or
Oracle buffers. Check vmstat (or ST06) over the course of several days, and if it consistently
shows free memory, then use the minimum free pages to determine the amount of memory that is
available to add to Oracle or SAP. Do not try to allocate all free memory to SAP or Oracle.
There are situations, such as with a database server or NFS server, where AIX may show
almost no free pages, but there is still a lot of memory that can be added to SAP or Oracle.
AIX maps file pages into available physical memory, in order to help performance. There are
AIX parameters on the vmtune (AIX 5 vmo) command that set the limits of memory-mapped files.
On a database server, first check that SAP note 78498 has been implemented to establish limits
on real memory used for memory-mapped files.
Check the amount of real memory that is being used for memory-mapped files. This can be done
with the command svmon –G.
The total of AIX memory mapped pages is the sum of “pers” and “clnt” columns, for both “pin”
and “inuse” rows. In this example, 234 + 1,585,658 = 1,585,892.
Check the memory-mapped file page limits with the vmtune (vmo) command:
Rather than having Oracle have a cache miss and do I/O that AIX fulfills from memory-mapped
file cache, it is usually more efficient to give the memory to Oracle. Likewise, if additional SAP
buffer memory can reduce database calls, then it is generally good to give the memory to SAP.
Since SAP and Oracle memory use can vary widely from hour to hour, run the svmon command
periodically for several days or a week, to determine the minimum number of (persistent + client)
pages over the period. This minimum (pers+clnt pin+inuse) should be compared with vmtune
(vmo) maxperm, to determine if there is available memory.
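The bookkeeping for the periodic svmon samples can be sketched in Python as follows. This assumes the "pers" and "clnt" values for the "pin" and "inuse" rows of each svmon -G run have already been extracted into dictionaries; the data layout and function names are hypothetical. The first sample reuses the two numbers from the example above, with their assignment to rows and columns assumed; the second sample is invented.

```python
def file_pages(sample):
    """Sum the pers and clnt columns over the pin and inuse rows of one sample."""
    return sum(sample[row][col]
               for row in ("pin", "inuse")
               for col in ("pers", "clnt"))

def min_file_pages(samples):
    """Smallest memory-mapped page total seen over the sampling period."""
    return min(file_pages(s) for s in samples)

# Hypothetical samples, one per svmon -G run.
samples = [
    {"pin": {"pers": 234, "clnt": 0}, "inuse": {"pers": 1_585_658, "clnt": 0}},
    {"pin": {"pers": 240, "clnt": 0}, "inuse": {"pers": 1_602_110, "clnt": 0}},
]

print("minimum pers+clnt pages over the period:", min_file_pages(samples))
```

The resulting minimum is the figure to set against the maxperm value reported by vmtune (vmo).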
Make changes gradually, and don’t over-allocate the memory in Oracle and SAP buffers, as that
can cause AIX paging.
Don’t add memory to Oracle or SAP just because AIX shows that there are available pages. If
Oracle hit rates are low, or SAP buffers need to be increased to support the workload, then
increasing the size of the Oracle and SAP buffers is reasonable.
In addition to the example in Section 8.1.6, see SAP notes 185530, 187906, and 191492 for examples
of incorrect and correct use of the SAP data model.
The symptom of this problem is in the ST04 SQL cache - high buffer gets per exec and high buffer
gets per row. This happens when the predicates on the SQL do not match the index columns on the
table and Oracle has to read many blocks of data to retrieve the result.
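As a small worked illustration of these two ratios, the Python sketch below derives buffer gets per execution and per row from the totals of one SQL cache entry. The function name and the numbers are hypothetical; they are not taken from any figure in this paper.

```python
def gets_ratios(buffer_gets, executions, rows_processed):
    """Return (gets per execution, gets per row); None where the divisor is zero."""
    per_exec = buffer_gets / executions if executions else None
    per_row = buffer_gets / rows_processed if rows_processed else None
    return per_exec, per_row

# Hypothetical statement: 12 million buffer gets, 4,000 executions, 8,000 rows.
per_exec, per_row = gets_ratios(12_000_000, 4_000, 8_000)
print(f"buffer gets per exec: {per_exec:.0f}, per row: {per_row:.0f}")
```

Here each returned row costs 1,500 buffer gets, which would fit the pattern of predicates that do not match the index columns.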
The symptom of this problem is seen in ST05 traces, where a program makes frequent calls to a table,
and each call accesses few rows.
The symptom of this problem is high CPU use for a program, where CPU use does not scale with
additional line items – e.g. a 100 line report takes 1 second CPU, but a 1000 line item report takes
more than 10 seconds CPU, and a 10,000 line report takes much more than 100 seconds CPU. These
scalability problems get worse as the report lines increase.
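One way to quantify this symptom is to estimate how CPU time grows with the number of line items. The Python sketch below assumes CPU seconds have been measured for runs of different sizes; the measurements and the function name are hypothetical, and fitting a power law between two points is a simplification chosen for this example.

```python
import math

def scaling_exponent(n1, cpu1, n2, cpu2):
    """Estimate k in cpu ~ n**k from two measurements.

    k close to 1 means CPU grows linearly with line items;
    k clearly above 1 means the program does not scale.
    """
    return math.log(cpu2 / cpu1) / math.log(n2 / n1)

# Hypothetical measurements: (line items, CPU seconds).
runs = [(100, 1.0), (1_000, 15.0), (10_000, 400.0)]

for (n1, c1), (n2, c2) in zip(runs, runs[1:]):
    k = scaling_exponent(n1, c1, n2, c2)
    print(f"{n1} -> {n2} line items: exponent {k:.2f}")
```

With these made-up numbers both exponents are above 1 and the second is larger than the first, which corresponds to a problem that gets worse as the report grows.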
11.1. SAP
11.1.1. DB02 Transaction
DB02 is used to display information about tables and indexes in the database, such as space usage
trends, size of individual tables, etc.:
• Display all indexes defined on tables (DB02 > detail analysis)
• Check column cardinality (DB02 > detail analysis > enter table name > select table >
press table columns)
There are a few limited ways that it might be used in performance monitoring:
• As a filter for inefficient programs. Use the ST03 “transaction” profile, sort the list by elapsed
time, and look for transactions which use very little CPU relative to elapsed time, e.g. 10% or
less of elapsed time is CPU on the application server. These may have problems such as
inefficient database access, slow RFC calls, etc.
• As a filter for problems that occur at a certain time of the day. Run ST03, and select “dialog”
process display. Use the ST03 “times” profile, press the right arrow to go to the screen that
displays average direct read, sequential read, and change times. Look for hours of the day
when the average time goes up. This could point to a time when there is an I/O constraint, or
CPU constraint on the DB server.
• Use as a filter for database performance problems, in very limited circumstances. If average
“sequential read” times are over 10 ms for dialog, and commit time is over 25-30 ms, there
may be some sort of database performance problem. Check SQL cache with ST04, look for
I/O constraints and other database problems.
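The first filter in the list above can be sketched in Python. This assumes the ST03 transaction profile has been downloaded and reduced to rows with elapsed and CPU seconds; the field names, transaction codes, and values are hypothetical.

```python
def low_cpu_transactions(profile, max_cpu_ratio=0.10):
    """Return transactions whose CPU time is at most max_cpu_ratio of
    elapsed time, sorted by elapsed time, largest first."""
    hits = [
        row for row in profile
        if row["elapsed_s"] > 0
        and row["cpu_s"] / row["elapsed_s"] <= max_cpu_ratio
    ]
    return sorted(hits, key=lambda row: row["elapsed_s"], reverse=True)

# Hypothetical rows from a downloaded transaction profile.
profile = [
    {"tcode": "ZIFACE2", "elapsed_s": 900.0, "cpu_s": 80.0},
    {"tcode": "VA01", "elapsed_s": 3200.0, "cpu_s": 1500.0},
    {"tcode": "ZREPORT1", "elapsed_s": 5400.0, "cpu_s": 310.0},
]

for row in low_cpu_transactions(profile):
    share = 100 * row["cpu_s"] / row["elapsed_s"]
    print(f'{row["tcode"]}: {row["elapsed_s"]:.0f} s elapsed, {share:.0f}% CPU')
```

The transactions that pass the filter are only candidates; they still need to be checked for inefficient database access, slow RFC calls, and similar causes.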
Depending on the SAP version, this may be available as a transaction SQLR, program
/SQLR/0001, or program SQLR0001.
11.2. AIX
11.2.1. nmon
Can be used to record and report many different AIX indicators. It is very useful for tracking and
trending CPU, I/O activity, or paging. The I/O activity is reported as rates, not rate and response
time, so one cannot use it to determine I/O delay. Oracle statistics should be used for I/O delays.
11.2.2. ptx
Can be used to record and report many different AIX indicators. It is very useful for tracking and
trending CPU, I/O activity, or paging activity. The I/O activity is reported as rates, not rate and
response time, so one cannot use it to determine I/O delay. Oracle statistics should be used for I/O
delays.
11.2.3. iostat
Shows I/O rates, but not response times. No easy way to record and play back. Not too useful in
diagnosing problems.
11.2.4. vmstat
Shows CPU use, and paging activity. No easy way to record and play back.
11.3. Oracle
The most important Oracle performance tool, SQL cache analysis, is in SAP ST04 transaction.
11.3.1. STATSPACK
STATSPACK can be used to monitor trend information, such as I/O activity, Oracle delays (the
v$system_event view), and CPU use. Since it cannot link SQL statements to their SAP program
source line, as can be done with ST04 SQL cache, the STATSPACK list of high impact SQL is of
limited use compared to ST04 SQL cache. STATSPACK shows the program name, but one then
has to search for the statement in the program.
Lock the user running the test in an SAP work process as discussed in Section 11.1.18, then trace
the PID of the user’s Oracle process, and run tkprof on the trace.
SAP note 654176 has information on using TKPROF. Oracle MetaLink also describes the process
for using trace and tkprof in DOCID 142898.1.
Figure 45: SE16 BDCP 47
Figure 46: BDCP number of entries 47
Figure 47: SE16 BDCPS 48
Figure 48: BDCPS number of entries 48
Figure 49: BDCPV access path with RULE hint 49
Figure 50: BDCPV source code 50
Figure 51: A616 in SQL cache 51
Figure 52: A616 explain 52
Figure 53: A616 index 53
Figure 54: A616 ABAP source 54
Figure 55: A616 technical settings 55
Figure 56: ST02 buffers with swaps 56
Figure 57: ST02 buffered objects 56
Figure 58: ST03N by DB time 57
Figure 59: ST03N by DB time database tab 58
Figure 60: ST04 file system requests with good response times 60
Figure 61: ST04 file system requests 61
Figure 62: ST04 file system requests slow on high write files 62
Figure 63: ST04 database overview 63
Figure 64: ST04 - Oracle v$system_event delay statistics 64
Figure 65: ST04 Oracle sessions 64
Figure 66: ST02 swaps 65
Figure 67: ST03N overview 65
Figure 68: ST10 not buffered tables 66
Figure 69: SM04 > goto > memory 68
Figure 70: SM04 Sessions 68
Figure 71: ST03N memory use statistics 69
Figure 72: ST06 displays free memory 70
Figure 73: svmon -G output 70
Figure 74: vmtune maxperm display 71