ORION: ORACLE I/O NUMBERS CALIBRATION TOOL
Orion is currently only available as beta, unsupported software. Oracle will accept no liability for problems arising from its use, nor does
Oracle warrant its fitness for any purpose. Its use is purely discretionary, at the sole risk of the user.

OVERVIEW
Orion is a tool for predicting the performance of an Oracle database without having to install Oracle or create a
database. Unlike other I/O calibration tools, Orion is expressly designed for simulating Oracle database I/O
workloads using the same I/O software stack as Oracle. It can also simulate the effect of striping performed by ASM.
The following types of I/O workloads are supported:
1. Small Random I/O. OLTP applications typically generate random reads and writes whose size is equivalent to
the database block size, typically 8 KB. Such applications typically care about the throughput in I/Os Per Second
(IOPS) and about the average latency (I/O turn-around time) per request. These parameters translate to the
transaction rate and transaction turn-around time at the application layer.
Orion can simulate a random I/O workload with a given percentage of reads vs. writes, a given I/O size, and a
given number of outstanding I/Os. The I/Os are distributed across all disks.

2. Large Sequential I/O. Data warehousing applications, data loads, backups, and restores generate sequential read
and write streams composed of multiple outstanding 1 MB I/Os. Such applications process large amounts
of data, like a whole table or a whole database, and they typically care about the overall data throughput in
MegaBytes Per Second (MBPS).
Orion can simulate a given number of sequential read or write streams of a given I/O size with a given number of
outstanding I/Os. Orion can optionally simulate ASM striping when testing sequential streams.

3. Large Random I/O. A sequential stream typically accesses the disks concurrently with other database traffic.
With striping, a sequential stream is spread across many disks. Consequently, at the disk level, multiple sequential
streams are seen as random 1 MB I/Os, which we also call Multi-User Sequential I/O.

4. Mixed Workloads. Orion can simulate 2 simultaneous workloads: Small Random I/O and either Large Sequential
I/O or Large Random I/O. This enables you to simulate, for example, an OLTP workload of 8 KB random
reads and writes with a backup workload of 4 sequential read streams of 1 MB I/Os.

For each type of workload, Orion can run tests at different levels of I/O load to measure performance metrics like
MBPS, IOPS and I/O latency. Load is expressed in terms of the number of outstanding asynchronous I/Os.
Internally, for each such load level, the Orion software keeps issuing I/O requests as fast as they complete to
maintain the I/O load at that level. For random workloads (large and small), the load level is the number of
outstanding I/Os. For large sequential workloads, the load level is a combination of the number of sequential streams
and the number of outstanding I/Os per stream. Testing a given workload at a range of load levels helps the user
understand how performance is affected by load.

TEST TARGETS
Orion can, in theory, be used to test any disk-based character device that supports asynchronous I/O. It has been
tested on the following types of targets:
1. DAS (direct-attached) storage: Orion can be used to test the performance of one or more local disks or
volumes on the local host.
2. SAN (storage-area network) storage: Orion can be run on any host that has all or parts of the SAN storage
mapped as character devices. The devices can correspond to striped or un-striped volumes exported by the
storage array(s), or individual disks.
3. NAS (network-attached) storage: Orion has not been extensively tested on NAS, although it may successfully run
on NAS on certain operating system platforms. In general, the performance results on NAS storage are
dependent on the I/O patterns with which the data files have been created and updated. Therefore, you
should initialize the data files appropriately before running Orion.

ORION FOR STORAGE VENDORS
Storage vendors can use Orion to understand how Oracle databases will perform on their storage arrays. Vendors can
also use Orion to formulate a recommendation on how to best configure their storage arrays for Oracle databases.

ORION FOR ORACLE ADMINISTRATORS
Oracle administrators can use Orion to evaluate and compare different storage arrays, based on their expected
workloads. They can also use Orion to determine the optimal number of network connections, storage arrays, storage
array controllers, and disks for their peak workloads. Appendix A describes ways of characterizing a database's
current I/O workload to infer its IOPS and MBPS requirements.

GETTING STARTED WITH ORION

1. Download Orion. Select the version of Orion that corresponds to the operating system of your host server.
Orion is currently available for Linux/x86, Solaris/SPARC and Windows/x86.
2. Install Orion:
a. Linux/Solaris: Unzip the Orion binary into a directory of your choice and add it to your PATH, if
preferred.
$ gunzip orion10.2_linux.gz
b. Windows: Run the installer; it will let you choose the directory where Orion and its DLLs will be
installed.
C:\temp> orion10.2_windows.msi

3. Select a test name, a unique identifier for this calibration run. We will use the example test name "mytest"
through the rest of this document.
4. Create a file named mytest.lun. In the file, list the raw volumes or files to test. Put one volume name per line.
Do not put comments or anything else in this file. For example:

/dev/raw/raw1
/dev/raw/raw2
/dev/raw/raw3
/dev/raw/raw4
/dev/raw/raw5
/dev/raw/raw6
/dev/raw/raw7
/dev/raw/raw8

5. Verify that all volumes are accessible using "dd" or other equivalent data copy tools. A typical sanity check is to
do something like this (example: Linux):

$ dd if=/dev/raw/raw1 of=/dev/null bs=32k count=1024

Depending on your platform, the data copy tool used and its interface may be different from the above.
6. Verify that your platform has the necessary libraries installed to do asynchronous I/O. The Orion test is
completely dependent on asynchronous I/O. On Linux and Solaris, the library libaio needs to be in one of the
standard lib directories or accessible through the shell environment's library path variable (usually
LD_LIBRARY_PATH or LIBPATH, depending on your shell). Windows has built-in asynchronous I/O
libraries, so this issue doesn't apply.
7. As a first test, we suggest a "simple" test run. The simple test measures the performance of small random reads at
different loads and then (separately) large random reads at different loads. These results should give an idea as to
how the I/O performance differs with I/O type and load. Such a test can be done with this short command line:

$ ./orion -run simple -testname mytest -num_disks 8
ORION: ORacle IO Numbers -- Version 10.2.0.1.0
Test will take approximately 30 minutes
Larger caches may take longer

The I/O load levels generated by Orion take into account the number of disk spindles being tested (provided by
you in num_disks). Keep in mind that the number of spindles may or may not be related to the number of volumes
specified in the mytest.lun file, depending on how these volumes are mapped.
8. The results of these tests will be recorded in the output files shown below. The file mytest_summary.txt is a good
starting point for verifying your input parameters and analyzing the output. The files mytest_*.csv contain
comma-separated values for several I/O performance measures (discussed in detail below).
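
Steps 3 and 4 above can be scripted. The following Python sketch writes a testname.lun file with one volume name per line and rejects entries that would break the "no comments or anything else" rule. The helper name write_lun_file and its checks are illustrative assumptions; they are not part of Orion.

```python
import os
import tempfile


def write_lun_file(test_name, volumes, directory="."):
    """Write <test_name>.lun with one volume path per line and return its path.

    Hypothetical helper: the guide only requires that the file exists and
    contains nothing but volume or file names, one per line.
    """
    cleaned = []
    for volume in volumes:
        name = volume.strip()
        # The guide forbids comments and anything other than volume names,
        # so blank entries and '#'-style comment lines are rejected here.
        if not name or name.startswith("#"):
            raise ValueError(f"not a volume name: {volume!r}")
        cleaned.append(name)
    if not cleaned:
        raise ValueError("at least one volume is required")
    path = os.path.join(directory, f"{test_name}.lun")
    with open(path, "w") as handle:
        handle.write("\n".join(cleaned) + "\n")
    return path


if __name__ == "__main__":
    # Demonstration in a temporary directory. The raw device names are the
    # ones from the example above; they are not opened or checked here.
    with tempfile.TemporaryDirectory() as tmp:
        lun_path = write_lun_file(
            "mytest", [f"/dev/raw/raw{i}" for i in range(1, 9)], tmp
        )
        with open(lun_path) as handle:
            print(handle.read(), end="")
```

The sketch only produces the file; checking that each volume is actually readable (step 5) is still best done with dd or an equivalent tool.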

OUTPUT FILES
Orion creates several output files. Continuing the "mytest" example, these files are explained below:
1. mytest_summary.txt: This file contains:
a. Input parameters
b. Maximum throughput observed for the Large Random/Sequential workload
c. Maximum I/O rate observed for the Small Random workload
d. Minimum latency observed for the Small Random workload
2. mytest_mbps.csv: Comma-separated value file containing the data transfer rate (MBPS) results for the Large
Random/Sequential workload. This and all other CSV files contain a two-dimensional table. Each row in the
table corresponds to a large I/O load level and each column corresponds to a small I/O load level. Thus, the
column headings are the number of outstanding small I/Os and the row headings are the number of outstanding
large I/Os (for random large I/O tests) or the number of sequential streams (for sequential large I/O tests).
The first few data points of the Orion MBPS output CSV file for "mytest" are shown below. The simple mytest
command line given earlier does not test combinations of large and small I/Os. Hence, the MBPS file has just
one column corresponding to 0 outstanding small I/Os. In the example below, at a load level of 8 outstanding
large reads and no small I/Os, we get a data throughput of 103.06 MBPS.

Large/Small, 0
1, 19.18
2, 37.59
4, 65.53
6, 87.03
8, 103.06
10, 109.67
. . . . .
. . . . .
The following chart illustrates the data transfer rate measured at different large I/O load levels. This chart can be
generated by loading mytest_mbps.csv into Microsoft Excel or OpenOffice and graphing the data points using the
application. Orion does not directly generate such graphs. The x-axis corresponds to the number of outstanding large
reads and the y-axis corresponds to the throughput observed.



3. mytest_iops.csv: Comma-separated value file containing the I/O throughput (in IOPS) results for the Small
Random workload. As in the MBPS file, the column headings are the number of outstanding small I/Os and
the row headings are the number of outstanding large I/Os (when testing large random) or the number of
sequential streams (for large sequential).
In the general case, this CSV file contains a two-dimensional table. However, for the simple mytest command line
given earlier, we aren't testing combinations of large and small I/Os. Hence, the IOPS results file just has one
row with 0 large I/Os. In the example below, with 12 outstanding small reads and no large I/Os, we get a
throughput of 951 IOPS.


Large/Small, 1, 2, 3, 6, 9, 12 . . . .
0, 105, 208, 309, 569, 782, 951 . . . .


The graph below (generated by loading mytest_iops.csv into Excel and charting the data) illustrates the IOPS
throughput seen at different small I/O load levels.
[Chart data for the MBPS chart described earlier: throughput (MBPS) versus number of outstanding large reads]
Large, MBPS
1, 19.18
2, 37.59
4, 65.53
6, 87.03
8, 103.06
10, 109.67
12, 117.7
14, 127.06
16, 130.55
20, 138.42
24, 141.21
28, 145.5

4. mytest_lat.csv: Comma-separated value file containing the latency results for the Small Random workload. As
in the MBPS and IOPS files, the column headings are the number of outstanding small I/Os and the row
headings are the number of outstanding large I/Os (when testing large random I/Os) or the number of sequential
streams.
In the general case, this CSV file contains a two-dimensional table. However, for the simple mytest command line
given earlier, we don't test combinations of large and small I/Os. Hence, the latency results file just has one row
with 0 large I/Os, as shown below. In the example below, at a sustained load level of 12 outstanding small reads
and no large I/Os, we get an I/O turn-around latency of 21.25 milliseconds.

Large/Small, 1, 2, 3, 6, 9, 12 . . . .
0, 14.22, 14.69, 15.09, 16.98, 18.91, 21.25 . . . .

The following graph (generated by loading mytest_lat.csv into Excel and charting the data) illustrates the small
I/O latency at different small I/O load levels for mytest.


[Chart data: small I/O latency (milliseconds) versus number of outstanding small I/Os]
Small, Latency (ms)
1, 14.22
2, 14.69
3, 15.09
6, 16.98
9, 18.91
12, 21.25
18, 25.92
24, 30.2
30, 34.8
36, 39.04
48, 47.68
60, 56.3
69, 62.16


5. mytest_trace.txt: Contains extended, unprocessed test output.
[Chart data for the IOPS chart described earlier: throughput (IOPS) versus number of outstanding small I/Os]
Small, IOPS
1, 105
2, 208
3, 309
6, 569
9, 782
12, 951
18, 1189
24, 1369
30, 1521
36, 1638
48, 1815
60, 1941
69, 2013

Except for mytest_trace.txt, if any of these files already exist, they will be overwritten. Each run of Orion appends to
mytest_trace.txt (or the equivalent trace file name for your test name).
NOTE: IF ANY ERROR OCCURS DURING THE TEST, IT WILL BE PRINTED TO STANDARD
OUTPUT.
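
Instead of a spreadsheet, the CSV matrices can also be read programmatically. The Python sketch below parses a table laid out like the examples above (first row: small I/O load levels; first column: large I/O load levels) into a dictionary. The function name parse_orion_matrix is hypothetical, and the sketch assumes the real files follow exactly the layout shown in this document; files produced by your Orion version may need extra handling.

```python
import csv
import io

# Sample copied from the mytest_mbps.csv excerpt shown earlier.
SAMPLE_MBPS_CSV = """Large/Small, 0
1, 19.18
2, 37.59
4, 65.53
6, 87.03
8, 103.06
10, 109.67
"""


def parse_orion_matrix(text):
    """Return {(large_load, small_load): value} from an Orion-style CSV table."""
    rows = [
        [cell.strip() for cell in row]
        for row in csv.reader(io.StringIO(text))
        if row and any(cell.strip() for cell in row)
    ]
    # First row: corner label followed by the small I/O load levels.
    header = rows[0][1:]
    results = {}
    for row in rows[1:]:
        large_load = int(row[0])
        for small_load, cell in zip(header, row[1:]):
            # Skip empty cells, for example those left by trailing commas.
            if small_load and cell:
                results[(large_load, int(small_load))] = float(cell)
    return results


if __name__ == "__main__":
    table = parse_orion_matrix(SAMPLE_MBPS_CSV)
    best = max(table, key=table.get)
    print("peak MBPS", table[best], "at (large, small) =", best)
```

With the sample above, the entry for 8 outstanding large reads and 0 small I/Os is 103.06, matching the value quoted in the text.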

INPUT PARAMETERS
Orion can be used to test any of the workloads described in the overview using its various command-line options. In
this section, we describe these options and the output files.

MANDATORY INPUT PARAMETERS
run: Test run level. This option provides simple command lines at the simple and normal run levels and allows
complex commands to be specified at the advanced level. If not set as -run advanced, then setting any other
non-mandatory parameter (besides -cache_size or -verbose) will result in an error.
simple: Generates the Small Random I/O and the Large Random I/O workloads for a range of load levels.
In this option, small and large I/Os are tested in isolation. The only optional parameters that can be specified
at this run level are -cache_size and -verbose. This parameter corresponds to the following invocation of
Orion:
%> ./orion -run advanced -testname mytest \
-num_disks NUM_DISKS \
-size_small 8 -size_large 1024 -type rand \
-simulate concat -write 0 -duration 60 \
-matrix basic
normal: Same as simple, but also generates combinations of the small random I/O and large random
I/O workloads for a range of loads. The only optional parameters that can be specified at this run level are
-cache_size and -verbose. This parameter corresponds to the following invocation of Orion:
%> ./orion -run advanced -testname mytest \
-num_disks NUM_DISKS \
-size_small 8 -size_large 1024 -type rand \
-simulate concat -write 0 -duration 60 \
-matrix detailed
advanced: Indicates that the test parameters will be specified by the user. Any of the optional parameters
can be specified at this run level.
testname: Identifier for the test. The input file containing the disk or file names must be named testname.lun.
The output files will be named with the prefix testname_.
num_disks: Actual number of physical disks used by the test. Used to generate a range for the load.

OPTIONAL INPUT PARAMETERS
help: Prints Orion help information. All other options are ignored when help is specified.
size_small: Size of the I/Os (in KB) for the Small Random I/O workload. (Default is 8).
size_large: Size of the I/Os (in KB) for the Large Random or Sequential I/O workload. (Default is 1024).
type: Type of the Large I/O workload. (Default is rand):
rand: Large Random I/O workload.
seq: Large Sequential I/O workload.
num_streamIO: Number of outstanding I/Os per sequential stream. Only valid for -type seq. (Default is 1).

simulate: Data layout to simulate for the Large Sequential I/O workload. (See note 1.)
concat: A virtual volume is simulated by serially chaining the specified LUNs. A sequential test over this
virtual volume will go from some point to the end of one LUN, followed by the beginning to end of the next
LUN, etc.
raid0: A virtual volume is simulated by striping across the specified LUNs. The stripe depth is 1M by default
(to match the Oracle ASM stripe depth) and can be changed by the -stripe parameter.
write: Percentage of I/Os that are writes, the rest being reads. This parameter applies to both the Large and Small
I/O workloads. For Large Sequential I/Os, each stream is either read-only or write-only; the parameter specifies the
percentage of streams that are write-only. The data written to disk is garbage and unrelated to any existing data on
the disk. WARNING: WRITE TESTS WILL OBLITERATE ALL DATA ON THE SPECIFIED LUNS.
cache_size: Size of the storage array's read or write cache (in MB). For Large Sequential I/O workloads, Orion will
warm the cache by doing random large I/Os before each data point. It uses the cache size to determine the duration
for this cache warming operation. If not specified, warming will occur for a default amount of time. If set to 0, no
cache warming will be done. (Default is not specified, which means warming for a default amount of time).
duration: Duration to test each data point in seconds. (Default is 60).
matrix: Type of mixed workloads to test over a range of loads. (Default is detailed).
basic: No mixed workload. The Small Random and Large Random/Sequential workloads will be tested
separately.
detailed: Small Random and Large Random/Sequential workloads will be tested in combination.
point: A single data point with S outstanding Small Random I/Os and L outstanding Large Random I/Os
or sequential streams. S is set by the num_small parameter. L is set by the num_large parameter.
col: Large Random/Sequential workloads only.
row: Small Random workloads only.
max: Same as detailed, but only tests the workload at the maximum load, specified by the num_small
and num_large parameters.
num_small: Maximum number of outstanding I/Os for the Small Random I/O workload. Can only be specified
when matrix is col, point, or max.
num_large: Maximum number of outstanding I/Os for the Large Random I/O workload or number of concurrent
large I/Os per stream. Can only be specified when matrix is row, point, or max.
verbose: Prints progress and status information to standard output.
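
The restrictions between options described above can be checked before launching a long test. The following Python sketch encodes only the rules stated in this section (which options need -run advanced, when num_small and num_large may be given, and that num_streamIO needs -type seq). The function check_orion_options is a hypothetical helper, not an Orion interface; option names are passed without the leading dash, and the mandatory testname and num_disks parameters are not checked.

```python
# Optional parameters the guide allows at the simple and normal run levels.
ALLOWED_WITHOUT_ADVANCED = {"cache_size", "verbose"}


def check_orion_options(run, matrix=None, **options):
    """Return a list of rule violations for a proposed Orion option set.

    Hypothetical helper: it only reflects the restrictions written in
    this guide and does not know what the orion binary really accepts.
    """
    problems = []
    if run not in ("simple", "normal", "advanced"):
        problems.append(f"unknown run level: {run}")
        return problems
    if run != "advanced":
        extras = set(options) - ALLOWED_WITHOUT_ADVANCED
        if matrix is not None:
            extras.add("matrix")
        for name in sorted(extras):
            problems.append(f"-{name} requires -run advanced")
        return problems
    # The guide gives 'detailed' as the default matrix type.
    effective_matrix = matrix or "detailed"
    if "num_small" in options and effective_matrix not in ("col", "point", "max"):
        problems.append("-num_small needs -matrix col, point, or max")
    if "num_large" in options and effective_matrix not in ("row", "point", "max"):
        problems.append("-num_large needs -matrix row, point, or max")
    if "num_streamIO" in options and options.get("type", "rand") != "seq":
        problems.append("-num_streamIO is only valid with -type seq")
    return problems


if __name__ == "__main__":
    # A sequential write test with -matrix col and -num_small 0 passes.
    print(check_orion_options("advanced", matrix="col", type="seq",
                              write=100, num_small=0))
    # -duration at the simple run level is reported as a violation.
    print(check_orion_options("simple", duration=120))
```

An empty list means no rule from this section was violated; it does not guarantee that Orion will accept the command line.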


Note 1: The offsets for I/Os are determined as follows:
For Small Random and Large Random workloads:
The LUNs are concatenated into a single virtual LUN (VLUN) and random offsets are chosen within the VLUN.
For Large Sequential workloads:
With striping (-simulate raid0): The LUNs are used to create a single striped VLUN. With no concurrent Small
Random workload, the sequential streams start at fixed offsets within the striped VLUN. For n streams, stream i will
start at offset VLUNsize * (i + 1) / (n + 1), except when n is 1, in which case the single stream will start at offset 0.
With a concurrent Small Random workload, streams start at random offsets within the striped VLUN.
Without striping (-simulate CONCAT): The LUNs are concatenated into a single VLUN. The streams start at
random offsets within the single VLUN.


COMMAND-LINE EXAMPLES

For a preliminary run to understand your storage performance with read-only, small and large random I/O
workloads:
$ orion -run simple -testname mytest -num_disks 8
Similar to the above, but with a mixed small and large random I/O workload:
$ orion -run normal -testname mytest -num_disks 12
To generate combinations of 32KB and 1MB reads to random locations:
$ orion -run advanced -testname mytest -num_disks 6 -size_small 32 \
-size_large 1024 -type rand -matrix detailed
To generate multiple sequential 1MB write streams, simulating 1MB RAID-0 stripes:
$ orion -run advanced -testname mytest -num_disks 15 -simulate raid0 \
-stripe 1024 -write 100 -type seq -matrix col -num_small 0

TROUBLE-SHOOTING
If you are getting an I/O error on one or more of the volumes specified in the testname.lun file:
o Verify that you can access it in the same mode as the test (read or write) using a file copy program such as
dd.
o Verify that your host operating system version is capable of doing asynchronous I/O.
o On Linux and Solaris, the library libaio needs to be in one of the standard lib directories or accessible
through the shell environment's library path variable (usually LD_LIBRARY_PATH or LIBPATH,
depending on your shell).
If you are running on NAS storage:
o The file system must be properly mounted for Orion to run. Please consult your Oracle Installation
Guide for directions (for example, Appendix B, "Using NAS Devices", in the Database Installation Guide
for Linux x86).
o The mytest.lun file should contain one or more paths of existing files. Orion will not work on directories
or mount points. The file has to be large enough for a meaningful test. The size of this file should
represent the eventual expected size of your datafiles (say, after a few years of use).
o You may see poor performance doing asynchronous I/O over NFS on Linux (including 2.6 kernels).
o If you're doing read tests and the reads are hitting blocks of the file that were not initialized or previously
written to, some smart NAS systems may "fake" the read by returning you zeroed-out blocks. The
workaround is to write all blocks (using a tool like dd) before doing the read test.
If you are running Orion on Windows:
o Testing on raw partitions requires temporarily mapping the partitions to drive letters and specifying these
drive letters (like \\.\x:) in the test.lun file.
If you are running an Orion 32-bit Linux/x86 binary on an x86_64 machine:
o Please copy a 32-bit libaio.so file from a 32-bit machine running the same Linux version.
If you are testing with a lot of disks (num_disks greater than around 30):
o You should use the -duration option (see the optional parameters section for more details) to specify a
long duration (like 120 seconds or more) for each data point. Since Orion tries to keep all spindles
running at a particular load level, each data point requires a ramp-up time, which implies a longer
duration for the test.
o You may get the following error message, instructing you to increase the duration time, but we suggest
doing it proactively:

Specify a longer -duration value.

A duration (in seconds) of 2x the number of spindles seems to be a good rule of thumb. Depending on your disk
technology, your platform may need more or less time.
If you get an error about libraries being used by Orion:
o Linux/Solaris: See I/O error troubleshooting above.
o NT-Only: Do not move or remove the Oracle libraries included in the distribution. These need to be in
the same directory as orion.exe.
If you are seeing performance numbers that are "unbelievably good":
o You may have a large read and/or write cache somewhere between the Orion program and the disk
spindles. Typically, the storage array controller has the biggest effect. Find out the size of this cache and
use the -cache_size advanced option to specify it to Orion (see the optional parameters section for more
details).
o The total size of your volumes may be really small compared to one or more caches along the way. Try
to turn off the cache. This is needed if the other volumes sharing your storage will see significant I/O
activity in a production environment (and end up using large parts of the shared cache).
If Orion is reporting a long estimated run time:
o The run time increases when -num_disks is high. Orion internally uses a linear formula to determine
how long it will take to saturate the given number of disks.
o The -cache_size parameter affects the run time, even when it's not specified. Orion does cache warming
for two minutes per data point by default. If you have turned off your cache (which we recommend for
read caches where possible), specify -cache_size 0.
o The run time increases when a long -duration value is specified, as expected.



APPENDIX A: CHARACTERIZING THE DATABASE I/O LOAD (see note 2)

To properly configure your database storage, you must understand your database's performance requirements. In
particular, you must answer the following questions:
1. Will the I/O requests be primarily single-block or multi-block?
Databases issue multi-block I/Os when performing the following types of operations: parallel queries, queries
on large tables that require table scans, direct data loads, backups, and restores. In general, DSS and data
warehouse environments issue large amounts of multi-block I/Os whereas OLTP databases primarily issue
single-block I/Os.
2. What is your average and peak I/Os per second (IOPS) requirement? What percentage of this traffic
are writes?
3. What is your average and peak throughput (in MBPS) requirement? What percentage of this traffic
are writes?
If your database's I/O requests are primarily single-block, then you should focus on ensuring that the storage can
accommodate your I/O request rate (IOPS). If they are primarily multi-block, then you should focus on the storage's
throughput capacity (MBPS).
For an existing 10gR2 database, you can characterize your I/O traffic by looking at the database statistics in the
V$SYSSTAT view, as referenced by the names given below. These statistics are cumulative values that should be
sampled during both the typical and peak periods.
single-block reads: 'physical read total IO requests' - 'physical read total multi block requests'
multi-block reads: 'physical read total multi block requests'
bytes read: 'physical read total bytes'
single-block writes: 'physical write total IO requests' - 'physical write total multi block requests'
multi-block writes: 'physical write total multi block requests'
bytes written: 'physical write total bytes'
For an existing pre-10gR2 database, the I/O statistics are specified in multiple views. The bulk of the I/O traffic is
described as follows:
single-block data file reads: V$FILESTAT.SINGLEBLKRDS specifies the number of such I/O requests.
multi-block data file reads: V$FILESTAT.PHYRDS - V$FILESTAT.SINGLEBLKRDS specifies the
number of such I/O requests.
single-block data file writes: V$FILESTAT.PHYWRTS specifies the number of such I/O requests.
multi-block data file writes: V$FILESTAT.PHYBLKWRT specifies the number of DBWR writes plus the
number of direct I/O blocks written. Unfortunately, there isn't a way to derive the number of multi-block
I/O requests, but in general, direct I/Os are multi-block I/Os. The number of direct I/O blocks written is
V$FILESTAT.PHYBLKWRT - V$FILESTAT.PHYWRTS.
redo log writes: in V$SYSSTAT, 'redo blocks written' specifies the number of blocks written and 'redo
writes' specifies the number of I/O requests.
backup I/Os: in V$BACKUP_ASYNC_IO and V$BACKUP_SYNC_IO, the IO_COUNT field specifies the
number of I/O requests and the TOTAL_BYTES field specifies the number of bytes read or written. Note
that each row of this view corresponds to a data file, the aggregate over all data files, or the output backup
piece.

Note 2: This section is taken from the Oracle white paper "Best Practices for a Low-Cost Storage Grid for Oracle Databases". We
recommend that you reference this paper for the latest version of this information.
flashback log I/Os: In V$FLASHBACK_DATABASE_STAT, FLASHBACK_DATA, DB_DATA, and
REDO_DATA show the number of bytes read or written from the flashback logs, data files, and redo logs,
respectively, in the given time interval. In V$SYSSTAT, the 'flashback log writes' statistic specifies the
number of write I/O requests to the flashback log.
Using this data, you can estimate the read and write IOPS requirement from the number of physical reads and writes
issued in a given duration of time, both at peak and normal loads. You can estimate the read and write MBPS
requirement from the number of bytes read and written per second, again at both peak and normal loads. The
number of writes compared to the number of reads indicates the "write percentage" to be specified to Orion. For
example, if your applications typically do 600 writes and 200 reads in a given duration, your write percentage would be
75 (-write 75). Please read the warnings in the write option description before doing write tests.
Note that these statistics do not include all I/O traffic, such as control file and archiver-generated I/Os.
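
The arithmetic described above is simple enough to script. The Python sketch below computes the write percentage from read and write counts, and an average request rate from two samples of a cumulative counter such as those in V$SYSSTAT. The function names are illustrative, and the counter values in the demonstration are made-up numbers, not measurements.

```python
def write_percentage(reads, writes):
    """Percentage of I/O requests that are writes, rounded to a whole number."""
    total = reads + writes
    if total <= 0:
        raise ValueError("no I/O requests in the sampled interval")
    return round(100 * writes / total)


def io_rate(start_count, end_count, interval_seconds):
    """Average requests per second between two samples of a cumulative counter."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return (end_count - start_count) / interval_seconds


if __name__ == "__main__":
    # The example from the text: 600 writes and 200 reads give -write 75.
    print("write percentage:", write_percentage(reads=200, writes=600))
    # Hypothetical counter samples taken 600 seconds apart.
    print("average IOPS:", io_rate(1_000_000, 1_360_000, 600))
```

Sampling the counters separately during typical and peak periods, as recommended above, gives one rate for each period.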

APPENDIX B: NOTES FOR DATA WAREHOUSING

I/O performance should always be a key consideration for data warehouse designers and administrators. The typical
workload in a data warehouse is especially I/O intensive, with operations such as queries over large volumes of data,
large data loads, index builds, and creation of materialized views. The underlying I/O system for a data warehouse
should be designed to meet these heavy requirements.
Storage configurations for a data warehouse should be chosen based on the I/O bandwidth that they can provide, and
not necessarily on their overall storage capacity. The capacity of individual disk drives is growing faster than the I/O
throughput rates provided by those disks, leading to a situation in which a small number of disks can store a large
volume of data, but cannot provide the same I/O throughput as a larger number of small disks. You get maximal I/O
bandwidth by having multiple disks and channels contribute to the heavy database operations. Striping data files across
devices (ideally, all devices) is a way to achieve this. Implement a large stripe size (1 MB) in order to ensure that the
time to position the disk is a small percentage of the time to transfer the data.
The workload for a data warehouse is typically characterized by sequential I/O throughput, issued by multiple
processes. You can simulate this type of workload with Orion. Depending on the type of system you plan to build,
you should run multiple different I/O simulations. For example:
Daily workload when end-users and/or other applications query the system: read-only workload with possibly
many individual parallel I/Os.
The data load, when end-users may or may not access the system: write workload with possibly parallel reads
(by the load programs and/or by end-users).
Index and materialized view builds, when end-users may or may not access the system: read/write workload.
Backups: read workload with likely few other processes, but a possibly high degree of parallelism.
Use the following options in the advanced user mode of Orion to simulate the different data warehouse-like workload
types:
run: use 'advanced' in order to simulate only large sequential I/Os.
size_large: the I/O size in KB for large sequential loads. Set this parameter to a multiple of the operating system
I/O size. For a data warehouse, you should plan for single I/O requests that are as large as possible. On
most platforms, this size is 1 MB.
type: use 'seq' to simulate a large sequential I/O workload.
num_streamIO: increase this parameter in order to simulate parallel execution for individual operations.
Specify a degree of parallelism that you plan to use for your database operations. A good starting point for the
degree of parallelism is the number of CPUs on your system multiplied by the number of parallel threads per
CPU.
simulate: use 'CONCAT' if the devices are already striped as they are presented to the database, e.g. if
striping takes place on the hardware level or through a volume manager. Use 'raid0' for devices that have yet
to be striped, for example by Oracle Automatic Storage Manager. The stripe size for 'raid0' defaults to 1 MB.
write: specify the percentage of I/Os that you want to be writes. Take your data load programs into account
and the access to the system during the load window in order to set the percentage. WARNING: WRITE
TESTS WILL OBLITERATE ALL DATA ON THE LUNS.
matrix: use 'point' to simulate an individual sequential workload, or use 'col' to simulate an increasing number
of large sequential workloads.
num_large: specify the maximum number of large I/Os.
In a clustered environment, you will have to invoke Orion in parallel on all nodes in order to simulate a clustered
workload. The following example is an Orion run for a typical data warehouse workload. To get started,
a file orion14.lun was created, with the following contents:

/dev/vx/rdsk/asm_vol1_1500m
/dev/vx/rdsk/asm_vol2_1500m
/dev/vx/rdsk/asm_vol3_1500m
/dev/vx/rdsk/asm_vol4_1500m

These 4 disks are individual volumes that are not striped on the hardware level, nor on a volume manager level. ASM
would be used to stripe data files across these 4 volumes. The following command invokes Orion:

./orion -run advanced \
-testname orion14 \
-matrix point \
-num_small 0 \
-num_large 4 \
-size_large 1024 \
-num_disks 4 \
-type seq \
-num_streamIO 8 \
-simulate raid0 \
-cache_size 0 \
-verbose

The run simulates 4 parallel sessions (-num_large 4) running a statement with a degree of parallelism of 8
(-num_streamIO 8). Orion simulates RAID 0 striping. The internal disks in this case do not have cache. As a result of
running this command, the file orion14_summary.txt is generated with the following contents:

ORION VERSION 10.2

Command line:
-run advanced -testname orion14 -matrix point -num_large 4 -size_large 1024
-num_disks 4 -type seq -num_streamIO 8 -simulate raid0 -cache_size 0 -verbose

This maps to this test:
Test: orion14
Small IO size: 8 KB
Large IO size: 1024 KB
IO Types: Small Random IOs, Large Sequential Streams
Number of Concurrent IOs Per Stream: 8
Force streams to separate disks: No
Simulated Array Type: RAID 0
Stripe Depth: 1024 KB
Write: 0%
Cache Size: 0 MB
Duration for each Data Point: 60 seconds
Small Columns:, 0
Large Columns:, 4
Total Data Points: 1

Name: /dev/vx/rdsk/asm_vol1_1500m Size: 1572864000
Name: /dev/vx/rdsk/asm_vol2_1500m Size: 1573912576
Name: /dev/vx/rdsk/asm_vol3_1500m Size: 1573912576
Name: /dev/vx/rdsk/asm_vol4_1500m Size: 1573912576
4 FILEs found.

Maximum Large MBPS=57.30 @ Small=0 and Large=4

In other words, the maximum throughput for this specific case with that workload is 57.30 MBps. In ideal conditions,
Oracle will be able to achieve up to about 95% of that number. For this particular case, having 4 parallel sessions
running the following statement would approach the same throughput:

select /*+ NO_MERGE(sales) */ count(*) from
(select /*+ FULL(s) PARALLEL (s,8) */ * from all_sales s) sales

Note that the Oracle database has several optimizations that would make a statement like this run faster in case it
wasn't forced to read the entire table.
In a well-balanced data warehouse hardware configuration, there is sufficient I/O bandwidth to feed the CPUs. As a
starting point, you can use the basic rule that every GHz of CPU power can drive at least 100 MBps. I.e., for a single
server configuration with four 3 GHz CPUs, your storage configuration should at least be able to provide 4 * 3 * 100
= 1200 MBps throughput. This number should be multiplied by the number of nodes in a RAC configuration.
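
The sizing rule above can be written as a one-line calculation. The Python sketch below is only a restatement of that rule of thumb; the function name required_mbps and its parameter names are illustrative, and the 100 MBps per GHz figure is the starting point quoted in the text, not a measured value.

```python
def required_mbps(cpus_per_node, ghz_per_cpu, nodes=1, mbps_per_ghz=100):
    """Minimum storage throughput (MBps) suggested by the rule of thumb.

    Assumes every GHz of CPU power can drive mbps_per_ghz MBps and that
    the per-node figure scales linearly with the number of RAC nodes.
    """
    return cpus_per_node * ghz_per_cpu * mbps_per_ghz * nodes


if __name__ == "__main__":
    # The example from the text: one server with four 3 GHz CPUs.
    print(required_mbps(cpus_per_node=4, ghz_per_cpu=3))
    # The same server type in a hypothetical two-node RAC configuration.
    print(required_mbps(cpus_per_node=4, ghz_per_cpu=3, nodes=2))
```

Comparing this target with the "Maximum Large MBPS" line in the Orion summary file shows whether the tested storage is likely to keep the CPUs fed.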
