SQL Server Performance: Query Tuning vs. Process Tuning
In the different projects where I perform a performance audit or am involved in
performance tuning, I often find that while queries and stored procedures function
correctly, they are not efficient in processing larger data sets. Many OLAP databases are
set up like OLTP databases, handling single records instead of a record set as a whole.
In tuning a query with proper WHERE clauses, indexes, etc., you can often achieve a
performance gain. However, when tuning the complete process, or handling record sets
as a whole, the performance gain can be many times greater.
Optimize for Changes or Selections
Tuning is a process of getting the optimal speed when working through the mutations
using the least amount of resources possible, while still keeping optimal performance in
selections of data. Adding an index to a table will speed up selections, but slow down
any mutations (inserts, updates, or deletes) in the table. The choice in defining the
optimal balance between these two depends not only on the queries being executed on
the database, but also on their frequency and priority.
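
The trade-off can be seen in a minimal T-SQL sketch. The temporary table #OrderDemo, its columns, and the index name are hypothetical and exist only for illustration; they do not come from the databases discussed in this article.

```sql
-- Hypothetical demo table (illustrative names only)
CREATE TABLE #OrderDemo (
    OrderID    INT IDENTITY(1,1) PRIMARY KEY,
    CustomerID INT NOT NULL,
    OrderDate  DATETIME NOT NULL,
    Amount     DECIMAL(10,2) NOT NULL
);

-- The index helps selections that filter on CustomerID ...
CREATE NONCLUSTERED INDEX IX_OrderDemo_CustomerID
    ON #OrderDemo (CustomerID);

-- ... but every mutation now has to maintain the index as well as the table
INSERT INTO #OrderDemo (CustomerID, OrderDate, Amount)
VALUES (42, GETDATE(), 19.95);

-- A selection that can use the index
SELECT OrderID, OrderDate, Amount
FROM #OrderDemo
WHERE CustomerID = 42;

DROP TABLE #OrderDemo;
```

Whether the extra index pays off depends on how often, and with what priority, each kind of statement runs.
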
If, for instance, all mutations are handled in off-hours, the tuning is often best if these
can be completed within these hours, but the maximum performance is set up for the
selections during office hours. During these hours, people are waiting for results of a
selection.
Use of Resources
The performance is determined by the limitations of the available resources. The specific
hardware available, type of resource needed to perform the requested query and the
concurrent use of the server and its resources determine the time needed. Often one of
the resources determines the major part of the query cost. When performing on-the-fly
calculations, the processor is a key issue. When the amount of data increases, memory
and disk I/O are a large influence. When tuning, the biggest gain can be reached by
addressing these resources first. The query execution plan gives insight into the use of
the resources.
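
Besides the execution plan, SQL Server can report I/O and CPU figures per statement, which helps show which resource dominates. The sketch below uses the SET STATISTICS IO and SET STATISTICS TIME session options. The query against the sys.objects catalog view is only a stand-in for the statement being tuned, and it assumes SQL Server 2005 or later.

```sql
SET STATISTICS IO ON;    -- report logical and physical reads per table
SET STATISTICS TIME ON;  -- report CPU time and elapsed time per statement

-- Stand-in query; replace it with the statement under investigation
SELECT name, create_date
FROM sys.objects
WHERE type = 'U';

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The figures are returned as messages next to the query results, so they can be compared before and after a change.
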
Query Tuning
In query tuning, the main focus is to execute the existing queries with the best
performance possible. By creating indexes, retrieving only the necessary columns and
rows with correct WHERE clauses, using indexed views, using pre-calculated values,
spreading tables over multiple disks, etc., a given query's speed can be increased
tremendously. However, there is a limit to the extent this can be achieved. After this,
extra resources like more memory or faster disks can be added.
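
Two of these measures can be sketched briefly: retrieving only the needed columns, and writing a WHERE clause that lets an index be used. The table #SalesDemo and its index are hypothetical. The second query is the commonly recommended form, since it compares the bare column against a range, whereas wrapping the column in a function such as YEAR() typically prevents an index seek.

```sql
-- Hypothetical demo table (illustrative names only)
CREATE TABLE #SalesDemo (
    SaleID   INT IDENTITY(1,1) PRIMARY KEY,
    SaleDate DATETIME NOT NULL,
    Amount   DECIMAL(10,2) NOT NULL,
    Notes    VARCHAR(500) NULL
);
CREATE NONCLUSTERED INDEX IX_SalesDemo_SaleDate ON #SalesDemo (SaleDate);

-- Less efficient: all columns, and a function applied to the indexed column
SELECT *
FROM #SalesDemo
WHERE YEAR(SaleDate) = 2006;

-- Usually better: only the needed columns, and a range on the bare column
SELECT SaleID, SaleDate
FROM #SalesDemo
WHERE SaleDate >= '20060101' AND SaleDate < '20070101';

DROP TABLE #SalesDemo;
```
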
Process Tuning
By altering the complete process of handling the dataflow, the target is to use as few
connections as possible, and to limit the number and complexity of query executions.
This may require the data model to be altered. Because the process is always very
specific to the given situation and often influences many aspects of the database, there
are no general guidelines to lead through this tuning. However, by identifying the area
where the largest amount of time or resources is used in handling the data, a critical
look at the current situation can lead to new methods. Most of the time, there is more
than one method of handling the dataflow.
Below are some illustrative examples.
Connections and Executions
Making a connection to a database takes time. Executing even a simple query takes
time to compile and execute. This overhead is partly dependent on the table and its
content, but always takes some time. For instance, the following code creates a table
with one field. Without inserting any data, I query the table repeatedly and note the time:
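CREATE TABLE tbl_Test (TestID CHAR(1))
-- The variable declarations and the AGAIN label are needed for the batch to run
DECLARE @Time1 DATETIME, @Time2 DATETIME, @i INT
SELECT @Time1 = GETDATE(), @i = 0
AGAIN:
-- First empty selection, timed against @Time1
SELECT * FROM tbl_Test
SELECT @Time2 = GETDATE(), @i = @i + 1
PRINT 'TIME IS ' + CONVERT(VARCHAR(12), @Time2, 14) + ', i = ' +
CONVERT(VARCHAR(10), @i) + ', TIMEDIFF = ' + CONVERT(VARCHAR(10),
DATEDIFF(ms, @Time1, @Time2))
-- Second empty selection, timed against @Time2
-- (the PRINT still shows @Time2, the timestamp taken before this selection)
SELECT * FROM tbl_Test
SELECT @Time1 = GETDATE(), @i = @i + 1
PRINT 'TIME IS ' + CONVERT(VARCHAR(12), @Time2, 14) + ', i = ' +
CONVERT(VARCHAR(10), @i) + ', TIMEDIFF = ' + CONVERT(VARCHAR(10),
DATEDIFF(ms, @Time2, @Time1))
IF @i < 1000 GOTO AGAIN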
This will produce 1000 empty selections. The messages tell me the time difference
between the previous selection and the current one. The first 160 or so selections are
executed within the same millisecond. However, even in this small selection, after 160
selections there is some overhead that can be measured.
TIME IS 14:54:29:430, i = 158, TIMEDIFF = 0
(0 row(s) affected)
TIME IS 14:54:29:430, i = 159, TIMEDIFF = 0
(0 row(s) affected)
TIME IS 14:54:29:430, i = 160, TIMEDIFF = 16
(0 row(s) affected)
TIME IS 14:54:29:447, i = 161, TIMEDIFF = 0
(0 row(s) affected)
TIME IS 14:54:29:447, i = 162, TIMEDIFF = 0
As the table definitions grow more complex and the number of records in them
increases, this overhead will show up sooner and cost more time.
Record Set Size
The selection speed of different record sets is not linear in the number of rows. Because
many steps have to be taken for any selection, getting extra records out of the
database often takes hardly any more time. In a typical database, I have more than 10
million records in a table. By making selections of 20,000, 50,000, 100,000, and 150,000
records, I calculated the execution speed in rows per second. These are some of the results:
Rows        Rows / Second
20,000      476
51,987      456
20,000      377
51,987      702
50,000      704
133,276     1,293
50,000      694
133,276     1,211
100,000     1,369
282,818     2,643
100,000     1,388
282,818     2,525
150,000     2,027
421,581     3,798
150,000     2,027
421,581     3,603
20,000      408
51,987      577
20,000      400
51,987      742
50,000      735
133,276     1,402
50,000      735
133,276     1,373
100,000     1,449
282,818     2,525
100,000     1,470
282,818     2,459
150,000     2,173
421,581     4,093
150,000     2,142
421,581     4,053
This test indicates that one selection of 100,000 records is about three times as fast per
record as four selections of 20,000 records each. So if possible, get all the information
you need in one selection instead of going back to the database many times.
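
The effect can be tried on any server with a small sketch like the one below. It is not the test behind the table above: #RowDemo is a hypothetical table filled with 20,000 generated rows, and the script assumes SQL Server 2005 or later (ROW_NUMBER and sys.all_columns). It reads the same rows once with one execution per row and once as a single set, printing the elapsed milliseconds for each.

```sql
SET NOCOUNT ON;

-- Hypothetical demo table with 20,000 generated rows
CREATE TABLE #RowDemo (ID INT PRIMARY KEY, Payload CHAR(50) NOT NULL);

INSERT INTO #RowDemo (ID, Payload)
SELECT TOP (20000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), 'x'
FROM sys.all_columns AS a CROSS JOIN sys.all_columns AS b;

DECLARE @Start DATETIME, @i INT, @Payload CHAR(50);

-- Many small selections: one execution per row
SELECT @Start = GETDATE(), @i = 1;
WHILE @i <= 20000
BEGIN
    SELECT @Payload = Payload FROM #RowDemo WHERE ID = @i;
    SET @i = @i + 1;
END;
PRINT 'Row by row: ' + CONVERT(VARCHAR(10), DATEDIFF(ms, @Start, GETDATE())) + ' ms';

-- One selection that reads the same rows as a set
SET @Start = GETDATE();
SELECT @Payload = Payload FROM #RowDemo;
PRINT 'One set:    ' + CONVERT(VARCHAR(10), DATEDIFF(ms, @Start, GETDATE())) + ' ms';

DROP TABLE #RowDemo;
```

The absolute numbers depend on the hardware and on what else the server is doing, so only the ratio between the two timings is of interest.
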
A well-designed relational OLAP database gets all mutations via source files. In a file, a
complete record for an entity is given in a single line. The definition of the records is as
follows:
The first 50 characters identify the file and its origin with fixed field lengths.
After this, one or more categories are listed. These correspond to one or more tables in
our database. The first four characters identify the category. The next three characters
identify the length of the category content. After this, the next category starts.
Within a category, one or more elements are listed. These correspond to fields in the
database. The first four characters identify the element, the next three characters identify
the length of the content.
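
To make the layout concrete, here is a small sketch that builds one made-up record following the description above (50-character header, then a four-character identifier and a three-character length for each category and element) and slices it with SUBSTRING. The record content, the identifiers '0001', '0110', and '0120', and the variable names are all hypothetical.

```sql
-- Build a hypothetical record: header + one category holding two elements
DECLARE @Record VARCHAR(200), @Body VARCHAR(200), @CatContent VARCHAR(200);
DECLARE @CatLen INT, @ElemLen INT;

SET @Record = REPLICATE('H', 50)        -- 50 fixed characters of file information
            + '0001' + '020'            -- category id (4) + content length (3)
            + '0110' + '003' + 'abc'    -- element 0110, length 3, content
            + '0120' + '003' + 'xyz';   -- element 0120, length 3, content

-- Drop the 50-character header
SET @Body = SUBSTRING(@Record, 51, LEN(@Record) - 50);

-- Category: id in characters 1-4, length in 5-7, content from character 8
SET @CatLen     = CONVERT(INT, SUBSTRING(@Body, 5, 3));
SET @CatContent = SUBSTRING(@Body, 8, @CatLen);

-- First element inside the category, sliced the same way
SET @ElemLen = CONVERT(INT, SUBSTRING(@CatContent, 5, 3));

SELECT SUBSTRING(@Body, 1, 4)                AS CategoryID,      -- '0001'
       @CatLen                               AS CategoryLength,  -- 20
       SUBSTRING(@CatContent, 1, 4)          AS ElementID,       -- '0110'
       SUBSTRING(@CatContent, 8, @ElemLen)   AS ElementContent;  -- 'abc'
```
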
Because the number of categories, as well as the number of elements, varies, and they
have to be linked to a single entity, the chosen method was to parse the file in a .NET
application to split it into a relational model for each record. From here, for each record
an INSERT with VALUES was given for each table in the database. Loading files of a
million records (as was common) with an average of nine tables leads to 9 million
connections / executions to the database.
Instead, I set up a file import table, where the complete record was bulk loaded into the
database. I added an identity field to uniquely identify a record / entity. From here, I
parsed the identifying fields for the file, wrote them to a separate table, and removed this
part of the record body.
Next, I inserted all first categories into a separate table, removed this category from the
body, and repeated the step until all categories were split. I repeated this step for each
element in a category.
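
The staging step could look roughly like the sketch below. It assumes BULK INSERT is used for the load (the article only says the records were bulk loaded), and the file path, the VARCHAR(8000) width, and the line ending are hypothetical. The table and column names match the code that follows.

```sql
-- One row per line of the source file; the whole record lands in one column
CREATE TABLE dbo.tbl_RecordImport (
    RecordBody VARCHAR(8000) NOT NULL
);

-- Hypothetical path; the file is read line by line into RecordBody
BULK INSERT dbo.tbl_RecordImport
FROM 'C:\import\source_file.txt'
WITH (ROWTERMINATOR = '\n', TABLOCK);

-- Add the identity field afterwards so each record gets a unique RecordID
ALTER TABLE dbo.tbl_RecordImport
ADD RecordID INT IDENTITY(1,1) NOT NULL;
```

Loading first and adding the identity column afterwards keeps the file layout and the table layout identical during the load, so no format file is needed in this sketch.
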
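I used this code for it:
DECLARE @CategoryCount INT, @Rows INT
-- Copy the fixed-width file information (further columns are elided, as in
-- the original listing)
INSERT INTO dbo.tbl_RecordInfo (RecordID, IDField,
Status /* , ... */)
SELECT RecordID, SUBSTRING (RecordBody, 1, 10), SUBSTRING
(RecordBody, 11, 1) /* , .... */
FROM dbo.tbl_RecordImport
--REMOVE RECORD INFO FROM BODY
UPDATE dbo.tbl_RecordImport
SET RecordBody = SUBSTRING(RecordBody, 51, LEN(RecordBody) -
50)
SET @CategoryCount = 0
CATEGORY_LOOP:
SET @CategoryCount = @CategoryCount + 1
-- In this listing a category header is 5 characters: a 2-character number
-- followed by a 3-character length
INSERT INTO dbo.tbl_RecordCategory (RecordID, SortOrder,
CatNumber, CatBody)
SELECT RecordID, @CategoryCount, SUBSTRING (RecordBody, 1,
2), SUBSTRING (RecordBody, 6, CONVERT(INT,
SUBSTRING(RecordBody, 3, 3)))
FROM dbo.tbl_RecordImport
-- Remove records whose last category was just copied.
-- NOTE: 10 is the constant in the original listing; with a 5-character
-- header and nothing after the last category it would be 5
DELETE FROM dbo.tbl_RecordImport
WHERE (LEN(RecordBody) = CONVERT(INT, SUBSTRING(RecordBody,
3, 3)) + 10)
-- Strip the copied category from the remaining records (a length beyond
-- the end of the string is harmless: SUBSTRING stops at the end)
UPDATE dbo.tbl_RecordImport
SET RecordBody = SUBSTRING(RecordBody, CONVERT(INT,
SUBSTRING(RecordBody, 3, 3)) + 6, LEN(RecordBody) -
CONVERT(INT, SUBSTRING(RecordBody, 3, 3)) + 5)
SET @Rows = @@ROWCOUNT
IF @Rows > 0 GOTO CATEGORY_LOOP
ELEMENT_LOOP:
-- An element header is 7 characters: a 4-character number followed by a
-- 3-character length
INSERT INTO dbo.tbl_Element (CatID, ElementNumber,
ElementContent)
SELECT CatID, SUBSTRING (CatBody, 1, 4), SUBSTRING (CatBody,
8, CONVERT(INT, SUBSTRING (CatBody, 5, 3)))
FROM dbo.tbl_RecordCategory
WHERE CatBody IS NOT NULL
-- Mark categories whose last element was just copied as finished
UPDATE dbo.tbl_RecordCategory
SET CatBody = NULL
WHERE LEN(CatBody) = CONVERT(INT, SUBSTRING(CatBody, 5, 3))
+ 7
-- Strip the copied element from the unfinished categories; the
-- IS NOT NULL filters let @@ROWCOUNT reach 0 so the loop ends
UPDATE dbo.tbl_RecordCategory
SET CatBody = SUBSTRING (CatBody, CONVERT(INT,
SUBSTRING(CatBody, 5, 3)) + 8, LEN(CatBody) - CONVERT(INT,
SUBSTRING(CatBody, 5, 3)) + 7 )
WHERE CatBody IS NOT NULL
SET @Rows = @@ROWCOUNT
IF @Rows > 0 GOTO ELEMENT_LOOP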
This code splits the elements into readable pieces for the database. In order to populate
the tables in the database, the following view gives the results in a table-like record set:
-- Body of the view; the CREATE VIEW header is not part of the listing.
-- Column references follow the tables above; the Dutch aliases are kept
-- (BerichtID = record ID, Categorie = category, Volgorde = sort order)
SELECT
C.RecordID AS 'BerichtID',
C.CatNumber AS 'Categorie',
C.SortOrder AS 'Volgorde',
MAX(CASE E.ElementNumber WHEN '0110' THEN E.ElementContent
ELSE NULL END) AS 'E0110',
MAX(CASE E.ElementNumber WHEN '0120' THEN E.ElementContent
ELSE NULL END) AS 'E0120',
MAX(CASE E.ElementNumber WHEN '0130' THEN E.ElementContent
ELSE NULL END) AS 'E0130'
FROM dbo.tbl_RecordCategory C
LEFT JOIN dbo.tbl_Element E ON C.CatID = E.CatID
WHERE C.CatNumber = '01'
GROUP BY C.CatNumber, C.SortOrder, C.RecordID
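
The MAX(CASE ...) construction turns one row per element into one column per element. A minimal sketch on made-up data shows the effect; the temporary table #ElementDemo and its three rows are hypothetical and only mimic the shape of the element table.

```sql
-- Hypothetical sample data: one row per element
CREATE TABLE #ElementDemo (
    CatID          INT         NOT NULL,
    ElementNumber  CHAR(4)     NOT NULL,
    ElementContent VARCHAR(50) NOT NULL
);
INSERT INTO #ElementDemo VALUES (1, '0110', 'abc');
INSERT INTO #ElementDemo VALUES (1, '0120', 'xyz');
INSERT INTO #ElementDemo VALUES (2, '0110', 'def');

-- One output row per category, one column per element number
SELECT CatID,
       MAX(CASE ElementNumber WHEN '0110' THEN ElementContent END) AS E0110,
       MAX(CASE ElementNumber WHEN '0120' THEN ElementContent END) AS E0120
FROM #ElementDemo
GROUP BY CatID
ORDER BY CatID;
-- CatID 1 -> E0110 = 'abc', E0120 = 'xyz'
-- CatID 2 -> E0110 = 'def', E0120 = NULL

DROP TABLE #ElementDemo;
```

An element that does not occur in a category simply comes back as NULL in its column.
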
The whole process loops through all categories and each element in them. This is about
20 loops each, hence 40 database executions. A file of 100,000 records is completely
handled in the database in about two minutes. The alternative, handling each record
separately, takes close to an hour. By changing the process, the performance is 25 times
faster.
As illustrated, there can be a great performance boost by altering the data process.
There is often more than one way to insert or manipulate the data in an OLAP database.
By trying more than one method, insight can be gained on the options, flexibility, and
processing speed. When working with large datasets, the objective is to handle as many
records as possible in one set. This may result in a significant performance gain.
About the Author
Nils Bevaart has been a DBA for over seven years and has had hands-on experience in
server management, database design, data warehousing, and performance tuning.
Starting in 2006, he set up a company specializing in data processing, covering issues of
effective maintenance, dataflow, database architecture, consultancy, and training.