Tuning SQL Queries For Performance

1. Introduction
SQL (Structured Query Language) is the heart of Oracle. Contrary to the usual notion that an
SQL statement is correct as long as it returns the expected results, an SQL statement is correct
only if it produces the right result in the shortest time possible without impeding the performance
of any other system resource.
1.1 Select the most efficient Table Name sequence

One of the most important ways to tune SQL statements is to make sure your SELECT
statement references the tables in the most efficient sequence. The sequence of conditions in
your WHERE clause has higher priority for the rule-based optimizer than the sequence of tables
in the FROM clause. If two index paths over two tables have different rule-based rankings, the
table with the lowest numeric ranking becomes the driving table. Only when the two tables have
equal query path rankings does the FROM sequence come into play.
In the query SELECT COUNT(*) FROM EMP, DEPT, regardless of the sequence in which you
specify the table names, the optimizer tries to reorder table processing based on what is most
efficient. It takes into account such factors as the indexes defined on the tables. If you are
running the rule-based optimizer and it cannot make an intelligent decision, ORACLE simply
executes the statement in the order in which the tables are parsed. Because the parser processes
the tables from right to left, the table name you specify last is usually processed first.
If an SQL statement referencing multiple tables is taking longer than is acceptable,
examine the effect of table sequences on your retrievals.
1.2 Driving Table

The goal of every SQL query and update statement is to minimize the number of database
blocks that must be physically read. If you specify more than one table in the FROM clause of a
SELECT statement, one of them must act as the driving table, and choosing it correctly can make
an enormous difference in performance. When ORACLE processes multiple tables, it uses an
internal sort/merge procedure to join them. First it scans the first table, next it scans the
second table, then it merges the rows retrieved from the second table with those retrieved from
the first. Performance is better if the table with the larger number of rows is named first in the
FROM clause, so that the smaller table, named last, is processed first.
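As a sketch of the ordering described above, assume that EMP is the large table and DEPT the small one (the relative sizes are an assumption made for illustration). Under the rule-based optimizer, with no index ranking to separate the two tables, the table named last in the FROM clause is the one processed first:

```sql
-- Assumption: EMP holds many rows, DEPT holds few.
-- EMP is named first; DEPT, named last, is parsed first and drives the join.
SELECT COUNT(*)
  FROM EMP, DEPT;

-- Reversed order: the large EMP table is now processed first,
-- which is the ordering the guideline above advises against.
SELECT COUNT(*)
  FROM DEPT, EMP;
```

Both statements return the same count; only the processing order differs, so timing the two forms against your own data is the way to confirm the effect.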
1.3 Select the most efficient WHERE clause sequence

The way you specify conditions in the WHERE clause of your SELECT statements has a
major impact on the performance of your SQL. In the absence of any other information, the
ORACLE optimizer uses these conditions to determine the best retrieval path for the database.
If you specify the most efficient conditions early in your WHERE clause, the rule-based optimizer
will be more effective in selecting the most efficient path from the available paths with equal
optimizer rankings.
1.4 Use ROWID whenever possible

Access by ROWID is the single fastest method of record retrieval. A ROWID is an encoded
key representing the physical location of the record within an ORACLE database block. Use
ROWID whenever possible to get the best performance out of your retrievals.
You can improve performance by selecting a record before updating or deleting it and
including ROWID in the initial select list. This allows ORACLE to perform a much more efficient
second record access.
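A minimal sketch of this select-then-update pattern follows. The bind variables :emp_rowid and :new_name are hypothetical placeholders that the calling program would supply; the FOR UPDATE clause is added here on the assumption that the row should stay locked between the two statements.

```sql
-- Step 1: read the row and capture its ROWID.
-- FOR UPDATE locks the row so it cannot change before the update.
SELECT ROWID, NAME
  FROM EMP
 WHERE EMP_NO = 1234
   FOR UPDATE;

-- Step 2: update the same row directly by the ROWID saved from step 1
-- (:emp_rowid and :new_name are hypothetical bind variables).
UPDATE EMP
   SET NAME = :new_name
 WHERE ROWID = :emp_rowid;
```

The second statement does not need to search the index on EMP_NO again, which is the saving the paragraph above describes.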
1.5 Reduce the number of trips to the database

Every time a SQL statement is executed, ORACLE needs to perform many internal
processing steps. The statement needs to be parsed, indexes evaluated, variables bound, and
data blocks read. The more you can reduce the number of database accesses, the more
overhead you can save. Reducing the physical number of trips to the database is particularly
beneficial in client-server configurations where the database may need to be accessed over the
network.
One simple way to increase the number of rows fetched per database access, and thus
reduce the number of physical calls needed, is to raise the ARRAYSIZE parameter in SQL*Plus,
SQL*Forms, and Pro*C. A setting of 200 is suggested.
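In SQL*Plus, the suggested value can be applied with the SET command. This is a small sketch for an interactive session; the final query is only a placeholder to show where the larger fetch size takes effect.

```sql
-- Display the fetch size currently in effect for this session
SHOW ARRAYSIZE

-- Fetch up to 200 rows per round trip, as suggested above
SET ARRAYSIZE 200

-- Queries issued from now on are fetched in batches of up to 200 rows
SELECT EMP_NO, NAME FROM EMP;
```

The setting lasts only for the current SQL*Plus session, so it is usually placed in a login script if it should apply every time.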
1.6 Combine simple, unrelated database accesses

If you are running a number of simple database queries, you can improve performance by
combining them into a single query, even if they are not related.
E.g., the three queries
SELECT NAME FROM EMP WHERE EMP_NO = 1234;
SELECT NAME FROM DPT WHERE DPT_NO = 10;
SELECT NAME FROM CAT WHERE CAT_TYPE = 'RD';
can be combined into a single query as
SELECT E.NAME, D.NAME, C.NAME
FROM CAT C, DPT D, EMP E, DUAL X
WHERE NVL('X', X.DUMMY) = NVL('X', E.ROWID(+))
AND NVL('X', X.DUMMY) = NVL('X', D.ROWID(+))
AND NVL('X', X.DUMMY) = NVL('X', C.ROWID(+))
AND E.EMP_NO(+) = 1234
AND D.DPT_NO(+) = 10
AND C.CAT_TYPE(+) = 'RD';
To combine all these separate queries into one SQL statement, you must perform an outer
join on each table with a table that is always valid (one that returns at least one row). The
easiest way is to set up a dummy outer join with the system table DUAL. The (+) on each filter
condition keeps the result row even when one of the lookups finds no match. This type of
processing gives you the best performance payoff on machines connected to busy networks.
Every time a SQL statement is executed, the RDBMS kernel is visited a number of times: at least
once to parse the statement, once to bind the variables, and once to retrieve the selected rows.
With this simple example, you reduce network overhead by two-thirds.
1.7 Use COUNT(INDEX_COLUMN) instead of COUNT(*)

Contrary to popular belief, COUNT(*) is faster than COUNT(1). If the rows are being
returned via an index, counting an indexed column, e.g. COUNT(EMP_NO), is faster still. Across
different machines, COUNT(*) consistently runs 15% to 20% faster than COUNT(1), and
COUNT(INDEX_COLUMN) is a further 5% faster.
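The two forms compared above look like this. The sketch assumes that EMP_NO is indexed and declared NOT NULL (for example, as the primary key); that assumption matters because COUNT(column) ignores rows where the column is NULL, so the two counts agree only when the column has no NULLs.

```sql
-- Counts every row in EMP
SELECT COUNT(*) FROM EMP;

-- Counts rows whose EMP_NO is not NULL.
-- Equal to COUNT(*) only under the assumption that EMP_NO is NOT NULL.
SELECT COUNT(EMP_NO) FROM EMP;
```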
1.8 Use WHERE in place of HAVING

In general, avoid including a HAVING clause in SELECT statements. The HAVING clause
filters selected rows only after all rows have been fetched and grouped, which can include
sorting, summing, and so on. Restricting rows via the WHERE clause rather than the HAVING
clause helps reduce these overheads.
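A short sketch of the rewrite, assuming a SAL column on EMP (an assumption made for illustration; the other names appear elsewhere in this document):

```sql
-- Less efficient: every department is grouped and averaged,
-- then all groups except department 10 are thrown away.
SELECT DEPT_NO, AVG(SAL)
  FROM EMP
 GROUP BY DEPT_NO
HAVING DEPT_NO = 10;

-- Preferred: only the rows for department 10 are fetched and grouped.
SELECT DEPT_NO, AVG(SAL)
  FROM EMP
 WHERE DEPT_NO = 10
 GROUP BY DEPT_NO;
```

HAVING is still required when the condition tests an aggregate, such as HAVING AVG(SAL) > 3000, because that value does not exist until the rows have been grouped.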
1.9 Tune Views

Views are effectively SELECT statements and can be tuned just as any other SELECT
statements. At all costs, avoid specifying views under views or views within SQL sub-query
clauses. These statements tend to confuse the optimizer, resulting in full table scans.
1.10 Minimize Table lookups in a query
To improve performance, minimize the number of table lookups in queries, particularly if
your statements include sub-query SELECTs and multi-column UPDATEs.
E.g., instead of specifying
SELECT TABLE_NAME
FROM TABLES
WHERE TABLE_NAME = (SELECT TABLE_NAME FROM TAB_COLUMNS
WHERE VERSION = 604)
AND DB_VERSION = (SELECT DB_VERSION FROM TAB_COLUMNS
WHERE VERSION = 604);
specify the following:
SELECT TABLE_NAME
FROM TABLES
WHERE (TABLE_NAME, DB_VERSION) = (SELECT TABLE_NAME, DB_VERSION
FROM TAB_COLUMNS WHERE VERSION = 604);
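The same idea applies to the multi-column UPDATEs mentioned above. The following is a sketch only: the columns EMP_CAT and SAL_RANGE and the lookup table EMP_CATEGORIES (with columns CATEGORY and SAL_RANGE) are hypothetical names introduced for illustration.

```sql
-- Two lookups of EMP_CATEGORIES, one per column being set
UPDATE EMP
   SET EMP_CAT   = (SELECT MAX(CATEGORY)  FROM EMP_CATEGORIES),
       SAL_RANGE = (SELECT MAX(SAL_RANGE) FROM EMP_CATEGORIES)
 WHERE DEPT_NO = 20;

-- One lookup of EMP_CATEGORIES sets both columns at once
UPDATE EMP
   SET (EMP_CAT, SAL_RANGE) = (SELECT MAX(CATEGORY), MAX(SAL_RANGE)
                                 FROM EMP_CATEGORIES)
 WHERE DEPT_NO = 20;
```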
1.11 Use Table Aliases

Use table aliases and prefix all column names with their aliases whenever more than one
table is involved in a query. This reduces parse time and prevents syntax errors when
ambiguously named columns are added later on.
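For instance, using the EMP and DEPT tables that appear elsewhere in this document (the exact column set is an assumption), a fully qualified query might look like this:

```sql
-- Every column carries its table alias, so adding a NAME column
-- to DEPT later cannot make this statement ambiguous.
SELECT E.EMP_NO, E.NAME, D.DEPT_NO
  FROM EMP E, DEPT D
 WHERE E.DEPT_NO = D.DEPT_NO
   AND D.DEPT_CAT = 'A';
```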
1.12 Use NOT EXISTS in place of NOT IN

In sub-query statements such as the following, the NOT IN clause causes an internal
sort/merge:
SELECT ... FROM EMP WHERE DEPT_NO NOT IN (SELECT DEPT_NO FROM DEPT
WHERE DEPT_CAT = 'A');
To improve performance, replace this code with:
SELECT ... FROM EMP E WHERE NOT EXISTS (SELECT 'X' FROM DEPT WHERE DEPT_NO
= E.DEPT_NO AND DEPT_CAT = 'A');
The two forms return the same rows only if DEPT_NO contains no NULLs in either table.
1.13 Use joins in place of EXISTS

In general, join tables rather than specifying sub-queries for them, such as the following:
SELECT ... FROM EMP E WHERE EXISTS (SELECT 'X' FROM DEPT WHERE DEPT_NO =
E.DEPT_NO AND DEPT_CAT = 'A');
To improve performance, specify:
SELECT ... FROM DEPT D, EMP E WHERE E.DEPT_NO = D.DEPT_NO AND D.DEPT_CAT =
'A';
1.14 Use EXISTS in place of DISTINCT

Avoid joins that require the DISTINCT qualifier on the SELECT list when you submit
queries used to determine information at the owner end of a one-to-many relationship. An
example of such a query is:
SELECT DISTINCT D.DEPT_CODE, D.DEPT_NAME FROM DEPT D, EMP E WHERE
D.DEPT_CODE = E.DEPT_CODE;
EXISTS is a faster alternative because the RDBMS kernel realizes that once the sub-query
has been satisfied, the search for that row can stop:
SELECT DEPT_CODE, DEPT_NAME FROM DEPT D WHERE EXISTS (SELECT 'X'
FROM EMP E WHERE E.DEPT_CODE = D.DEPT_CODE);
1.15 Which one is faster: Indexed Retrieval or Full-table scan?

Full table scans can be efficient because they require little disk head movement. The disk
starts reading at one point and continues reading contiguous data blocks. Indexed retrievals are
usually more efficient, as you would expect. But because indexes return records in a logical
sequence, not in the order in which they are physically located on the disk, indexed retrievals
may result in a lot of disk head movement, perhaps retrieving only one record per read. To a
large extent, the choice between an indexed retrieval and a full table scan depends on the size of
the table and the pattern of access to it. If large portions of a large table are being processed,
a serial search can actually be faster. If the rows being accessed in index order are randomly
dispersed throughout the table, processing them in sequence might be quite slow. In addition to
the disk head movement required to retrieve the records, every read of a row requires an
additional read of the index.
ORACLE Corporation advises that for tables with fewer than eight data blocks, a full table
scan is more efficient than an indexed retrieval. For large tables, an indexed retrieval is usually
faster.
Whether a full table scan beats an indexed retrieval depends directly on how many rows
of the table fit into a single ORACLE block. ORACLE blocks are read, written, and cached in
the SGA as entire blocks. The more rows a block contains, the fewer physical reads are needed
to scan the entire table. The more dispersed the consecutive indexed rows are throughout the
table, and the fewer rows an ORACLE block can hold, the lower the likelihood that the next row
is already in the SGA cache. If the only columns referenced are the indexed columns or
pseudo-columns, an index read is always the most efficient.
1.16 Avoid calculations on indexed columns

The optimizer does not use an index if the indexed column is part of a function or
calculation in the WHERE clause; it performs a full table scan instead. Avoid doing calculations
on indexed columns.
Use:
SELECT ... FROM DEPT WHERE SAL > 250000/12;
instead of:
SELECT ... FROM DEPT WHERE SAL * 12 > 250000;
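The same rule covers functions as well as arithmetic. As a further sketch, assume the NAME column of EMP is indexed (an assumption made for illustration); a prefix test written with SUBSTR can then be moved off the column by using LIKE:

```sql
-- The function wraps the indexed column, so the index on NAME is not used
SELECT EMP_NO, NAME
  FROM EMP
 WHERE SUBSTR(NAME, 1, 3) = 'SMI';

-- The column stands alone on the left-hand side, so the index can be used
SELECT EMP_NO, NAME
  FROM EMP
 WHERE NAME LIKE 'SMI%';
```

Both statements select names that begin with the three characters SMI; only the second leaves the indexed column untouched.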
1.17 Include additional columns in a concatenated index

In some cases, you can gain performance benefits by including additional columns in a
concatenated index. Doing so may allow queries to be satisfied without a physical read of the
actual table. Although most of the overhead of record retrieval is incurred in locating the address
of the record, you can still save a substantial amount of overhead by avoiding a physical read of
the record. Because indexes return records in an ordered sequence, actually having to retrieve
the records can also require extensive head movement on the disk.
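A minimal sketch of such an index, using the EMP lookup from section 1.6. The index name EMP_NO_NAME_IDX is hypothetical.

```sql
-- NAME is added after the key column EMP_NO so that it is stored in the index
-- (EMP_NO_NAME_IDX is a hypothetical index name).
CREATE INDEX EMP_NO_NAME_IDX ON EMP (EMP_NO, NAME);

-- Both the filter column and the selected column are in the index,
-- so this query can be answered without reading the EMP table itself.
SELECT NAME
  FROM EMP
 WHERE EMP_NO = 1234;
```

The trade-off is a larger index and slightly more work on every insert or update of NAME, so the extra column is worth adding only for queries that run often.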
1.18 Avoid using NOT on Indexed columns

Avoid using NOT when testing indexed columns. The NOT operator has the same effect on
indexes that functions do. When ORACLE encounters a NOT, it chooses not to use the index
and performs a full table scan instead.
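One way to apply this is to restate the negated test as a positive range. The sketch below assumes that DEPT_NO on EMP is indexed and never negative, so that "not equal to zero" and "greater than zero" select the same rows; both assumptions are for illustration only.

```sql
-- Negated test on the indexed column: a full table scan is performed
SELECT EMP_NO, NAME
  FROM EMP
 WHERE DEPT_NO != 0;

-- Positive range test: the index on DEPT_NO can be used.
-- Equivalent only under the assumption that DEPT_NO is never negative.
SELECT EMP_NO, NAME
  FROM EMP
 WHERE DEPT_NO > 0;
```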
1.19 Use UNION in place of OR

Always consider UNION instead of OR in the WHERE clause. Using OR on indexed columns
causes the optimizer to perform a full table scan rather than an indexed retrieval. Choosing
UNION over OR is effective only if both columns are indexed; if either is not indexed, you may
actually increase overhead by not choosing OR. If you do use OR, be sure to put the most
specific index first in the OR's predicate list, and the index that passes the most records last in
the list.
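A sketch of the rewrite, assuming that both DEPT_NO and NAME on EMP are indexed and that EMP_NO uniquely identifies a row (both are assumptions for illustration):

```sql
-- OR across two indexed columns
SELECT EMP_NO, NAME
  FROM EMP
 WHERE DEPT_NO = 10
    OR NAME = 'SMITH';

-- UNION of two single-condition queries, each able to use its own index
SELECT EMP_NO, NAME
  FROM EMP
 WHERE DEPT_NO = 10
UNION
SELECT EMP_NO, NAME
  FROM EMP
 WHERE NAME = 'SMITH';
```

UNION removes duplicate rows, so an employee who satisfies both conditions is still returned once. That matches the OR form here because the unique EMP_NO is in the select list.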
1.20 Use TRUNCATE for full table delete
If you need to delete all the rows in a table, do not use DELETE, because DELETE logs every
row it removes and can take a long time. To perform the same task much faster, use TRUNCATE
TABLE instead, which does not log individual row deletions. In ORACLE, TRUNCATE is a DDL
statement: it commits implicitly and cannot be rolled back. In SQL Server, the command also
resets the seed of any IDENTITY column back to its original value; ORACLE sequences are not
affected by TRUNCATE.
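The two statements side by side, as a sketch against a hypothetical scratch table EMP_STAGE (the table name is an assumption; do not run this against a table whose contents you need):

```sql
-- Row-by-row delete: every deleted row is logged, and the work can be undone
DELETE FROM EMP_STAGE;
ROLLBACK;

-- Truncate: removes all rows without logging each one.
-- In ORACLE this is DDL, so it commits implicitly and cannot be rolled back.
TRUNCATE TABLE EMP_STAGE;
```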
2. References
Oracle Performance Tuning by Corrigan, Peter
