Jccoracle Blogspot in

10th March 2013
After presenting Exadata Storage Indexes: Inside and Out at the 2013 Hotsos
Symposium, one of the presentation attendees posed a question about how
Storage Indexes are used in situations when uncommitted updates happen on
tables that have subsequent Smart Scan queries executed against them. In other
words, will Storage Indexes be used when consistent reads require block access,
as opposed to a 100% cell offload scenario? In this post I'll show some
examples that hopefully illustrate this behavior.
First, the details of the test environment:
Exadata X2-2 Quarter Rack
Oracle RDBMS version 11.2.0.3 with the January 2013 QFSDP patches
applied against it
Storage Server image version 11.2.3.2.1.130109
I used a simple test table, MYOBJ_TEST4, a 20+ million row table consisting of 10
columns with data ordered by a column called COL1. COL1 is a numeric data type
with 1,000 distinct values ranging from 1 to 1,000:
SQL> select column_name, num_distinct
  2  from dba_tab_columns
  3  where table_name='MYOBJ_TEST4' and column_name='COL1';
COLUMN_NAME                    NUM_DISTINCT
------------------------------ ------------
COL1                                   1000
Elapsed: 00:00:00.14
SQL> select num_rows from dba_tables where table_name='MYOBJ_TEST4';
  NUM_ROWS
----------
  20447000
Elapsed: 00:00:00.02
SQL>
DML, Consistent Reads, and
Storage Indexes on Exadata
Page 1 of 113 John Clarke's Blog
2/6/2014 http://jccoracle.blogspot.in/
To establish a baseline, I ran a query against this table with a narrow range predicate on COL1:
SESS1 SQL> SELECT count(col1)
  2  FROM d14.myobj_test4
  3  where col1 between 1 and 10;
COUNT(COL1)
-----------
     204470
Elapsed: 00:00:00.09
SESS1 SQL> @si_mystat.sql
SESS1 SQL> set echo off
SESS1 SQL> select stat.name,
  2  sess.value value
  3  from v$mystat sess,
  4  v$statname stat
  5  where stat.statistic# = sess.statistic#
  6  and stat.name in
  7  ('cell physical IO bytes eligible for predicate offload',
  8  'cell physical IO interconnect bytes',
  9  'cell physical IO bytes saved by storage index',
 10  'cell physical IO interconnect bytes returned by smart scan',
 11  'consistent gets','db block gets',
 12  'physical reads','physical reads direct')
 13  order by 1
 14  /
Statistic                                                   Value
----------------------------------------------------------- -------------
cell physical IO bytes eligible for predicate offload       1,805,156,352
cell physical IO bytes saved by storage index               1,786,642,432
cell physical IO interconnect bytes                         2,496,624
cell physical IO interconnect bytes returned by smart scan  2,488,432
consistent gets                                             220,365
db block gets                                               0
physical reads                                              220,357
From the statistics above, 1,805,156,352 bytes were eligible for predicate offload, and we saved nearly all of this (1,786,642,432 bytes) with Storage Indexes.
Additionally, 2,496,624 bytes were transmitted over the storage interconnect.
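To put these numbers in perspective, the share of offload-eligible I/O that the Storage Indexes eliminated can be worked out directly from the two statistics. Below is a minimal shell sketch; the function name `pct_saved` is my own invention for illustration and is not part of any Oracle tooling.

```shell
# Hypothetical helper: percentage of offload-eligible bytes saved by Storage Indexes.
# $1 = value of 'cell physical IO bytes saved by storage index'
# $2 = value of 'cell physical IO bytes eligible for predicate offload'
pct_saved() {
    awk -v saved="$1" -v eligible="$2" \
        'BEGIN { printf "%.2f\n", 100 * saved / eligible }'
}

# Values taken from the baseline run above; prints 98.97
pct_saved 1786642432 1805156352
```

In other words, roughly 99% of the I/O that was eligible for offload never had to be read from disk.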
For our first test, we'll update these rows, commit, and checkpoint:
SESS2 SQL> update d14.myobj_test4
  2  set col1=col1
204470 rows updated.
Elapsed: 00:00:18.73
SESS2 SQL> commit;
Commit complete.
Elapsed: 00:00:00.00
SESS2 SQL> alter system switch logfile;
System altered.
Elapsed: 00:00:00.03
SESS2 SQL> alter system checkpoint;
System altered.
Elapsed: 00:00:00.13
SESS2 SQL>
If we now re-run our test query in a different session, the statistics look like below:
COUNT(COL1)
-----------
     204470
Elapsed: 00:00:00.08
',
 13  order by 1
 14  /
----------------------------------------------------------- -------------
cell physical IO bytes eligible for predicate offload       1,805,156,352
cell physical IO bytes saved by storage index               1,786,642,432
db block gets                                               0
physical reads direct                                       220,356
8 rows selected.
Elapsed: 00:00:00.00
The statistics above should look familiar; in fact, they are identical to the
numbers from before the update. We updated this column with the same values,
and as expected, cellsrv did the work we would expect and populated the region
index values with the same high/low combinations across each storage region in
which MYOBJ_TEST4 is stored. What this test shows is that Storage Indexes
"work" with DML; region index values are not wiped out or otherwise invalidated,
but maintained.
What happens if we update some of these rows and not commit? Let's try it:
SESS2 SQL> select sysdate from dual;
SYSDATE
------------------
09-MAR-13 23:59:24
Elapsed: 00:00:00.00
  2  set col1=col1
Elapsed: 00:00:18.54
SESS2 SQL>
Now we'll execute our same test query from a different session:
COUNT(COL1)
-----------
     204470
Elapsed: 00:00:01.94
',
 13  order by 1
 14  /
----------------------------------------------------------- -------------
6,352
2,432
32
76
db block gets                                               0
8 rows selected.
From the statistics above, note the following:
The number of bytes eligible for predicate offload was the same as the
previous tests, which is expected
The number of bytes saved by Storage Indexes was also the same; we'll
come back to this towards the end of the post, but for now we can ascertain
that at the point-in-time that we conducted the test, the data residing in the
storage region's region indexes was the same as before the update
The number of bytes transmitted over the interconnect was higher compared
to the previous test. This is showing that in order to satisfy Oracle's
consistent read mechanisms, at least some of the data was required to be
shipped as blocks, not rows/columns (i.e., cellsrv became a block server,
not a row/column server)
The number of consistent gets was double the value as compared to before.
This is expected, since CR reads were required to fetch the read-consistent
image of blocks
Now, in our original session, let's commit and throw in a checkpoint for good
measure:
Commit complete.
Elapsed: 00:00:00.00
System altered.
Elapsed: 00:00:02.23
In our original session we'll execute our test query:
COUNT(COL1)
-----------
Elapsed: 00:00:00.10
',
 13  order by 1
 14  /
----------------------------------------------------------- -------------
352
432
db block gets                                               0
8 rows selected.
Above, we see that our statistics are "back to normal". Again, we updated COL1
and set it equal to itself, so we'd expect to see the same values across our
statistics.
What happens if the update changes the values that our test query filters on? Below, I'll update the table and set COL1=COL1+10 to leave no rows that
match our test query's predicate condition, without committing:
SYSDATE
------------------
10-MAR-13 00:05:53
Elapsed: 00:00:00.00
  2  set col1=col1+10
Elapsed: 00:00:18.12
SESS2 SQL>
From our first session, let's run our query:
COUNT(COL1)
-----------
     204470
Elapsed: 00:00:02.12
',
 13  order by 1
 14  /
----------------------------------------------------------- -------------
,352
,432
0
8
db block gets                                               0
Very similar to our previous update-without-commit scenario, we see higher values
for bytes transmitted over the interconnect, the same Smart Scan/Storage Index
savings, and a higher value for consistent gets. If we now commit from our
second session:
Commit complete.
Elapsed: 00:00:00.01
System altered.
Elapsed: 00:00:03.33
SESS2 SQL> alter system switch logfile;
System altered.
Elapsed: 00:00:00.03
SESS2 SQL>
... and re-execute our query, our statistics look like this:
COUNT(COL1)
-----------
          0
Elapsed: 00:00:00.10
SQL> select stat.name,
',
 13  order by 1
 14  /
----------------------------------------------------------- -------------
352
432
db block gets                                               0
8 rows selected.
From the above test, we see 0 rows returned, the same number of bytes eligible
for predicate offload, but also the same values for Storage Index savings.
Additionally, the number of bytes transmitted over the interconnect is higher than
we would expect based on the zero-row query output.
This happens because cellsrv hasn't yet "warmed up" the region index values with
the data that should (eventually) exist on disk. To confirm this, we simply ran our
test query a couple more times and after about a half minute or so, our statistics
show the values we would expect:
COUNT(COL1)
-----------
          0
Elapsed: 00:00:00.06
',
 13  order by 1
 14  /
----------------------------------------------------------- -------------
,352
,696
cell physical IO interconnect bytes                         25,944
cell physical IO interconnect bytes returned by smart scan  1,368
db block gets                                               0
8 rows selected.
Above, we see that our new statistic for cell physical IO bytes saved by
storage index is higher than in previous tests and nearly equal to the number
of bytes eligible for predicate offload, which is reasonable considering that our query
now returns no rows. The number of bytes sent over the
interconnect is also much lower than in the previous tests.
This demonstrates an interesting thing about Storage Indexes - they don't seem to
be "updated" immediately as data is either updated or fetched via SELECT, but
rather, can take a bit of time to properly stage and update. Eventually, the region
index data will reflect what's on disk and when this is true, your Storage Index
savings will be what they should be.
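One way to watch this lag for yourself is to re-run the test query and the statistics query at fixed intervals and compare the Storage Index savings from run to run. The following is only a sketch: `rerun_with_pause` is a hypothetical helper of mine, and the connect string and script name in the usage comment are placeholders rather than anything used in the tests above.

```shell
# Hypothetical helper: run a command a fixed number of times, pausing between runs.
# $1 = number of runs, $2 = pause in seconds, remaining arguments = command to run
rerun_with_pause() {
    local runs=$1 pause_seconds=$2 i=1
    shift 2
    while [ "$i" -le "$runs" ]; do
        echo "--- run $i at $(date '+%H:%M:%S')"
        "$@"
        # No need to pause after the final run
        [ "$i" -lt "$runs" ] && sleep "$pause_seconds"
        i=$((i + 1))
    done
}

# Example (placeholder credentials and script name):
#   rerun_with_pause 6 10 sqlplus -s d14/password @query_and_stats.sql
```

Because v$mystat reports statistics for the current session only, each sqlplus invocation starts with fresh counters, which makes the runs directly comparable.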
So what happens when updated blocks trickle to disk before a commit? In other
words, let's say we update some rows, DBWR gets these written to disk, and
subsequent queries occur? Let's update a number of rows in our table, not do a
commit, but flush our buffer cache and checkpoint. Before doing this, let's run a
test query:
SESS1 SQL> alter system flush buffer_cache;
System altered.
Elapsed: 00:00:00.25
COUNT(COL1)
-----------
     408940
Elapsed: 00:00:00.13
',
 13  order by 1
 14  /
----------------------------------------------------------- -------------
,352
,168
db block gets                                               0
8 rows selected.
Elapsed: 00:00:00.00
SQL>
Since we retrieved more data than in previous tests, we would expect a smaller
Storage Index savings and more bytes shipped over the interconnect, which is
what we see. Now let's update these rows, not commit, but flush our buffer cache
and do a checkpoint:
System altered.
Elapsed: 00:00:00.17
SYSDATE
------------------
10-MAR-13 01:16:50
Elapsed: 00:00:00.00
  2  set col1=col1+10
Elapsed: 00:00:19.61
System altered.
Elapsed: 00:00:00.29
System altered.
Elapsed: 00:00:03.33
SESS2 SQL>
Now, if we run our test query (several times, after a 5 minute period to ensure that
region indexes have a chance of "updating"), we see this:
COUNT(COL1)
-----------
     408940
Elapsed: 00:00:04.06
',
 13  order by 1
 14  /
----------------------------------------------------------- -------------
,352
,168
2
0
db block gets                                               0
8 rows selected.
Elapsed: 00:00:00.00
SESS1 SQL>
Above, we see more consistent gets and more bytes shipped over the
interconnect for the same CR reasons as before, but interestingly our bytes saved
by Storage Indexes remained 1,769,095,168. This, I believe, tells us that region
index values will not get updated until data is actually committed to disk, not just
written to disk in an uncommitted state (which a checkpoint will do).
Conclusion
As documented, Exadata Storage Indexes are maintained and updated
automatically by Oracle and Storage Index values will reflect the high/low values
per storage region for queries that access data using WHERE clauses and Smart
Scan. When rows in a table are updated, it appears as if these region index
values are updated upon commit and at least on the software versions used in this
test, your Storage Index I/O savings will reflect the values of the last committed
state of the data. While consistent reads will typically lead to less efficient Smart
Scans, it looks as if the Storage Index I/O savings are changed at the time of
commits (or rather, shortly thereafter).
Posted 10th March 2013 by John Clarke
In this post I'm going to walk through the process we followed to apply Oracle's
January 13th QFSDP (Quarterly Full Stack Download Patch), 16038715, on an
Exadata X2-2 Quarter Rack.
Download and Preparation
The first step is to download the patch 16038715 from MOS.
After downloading it, we staged it to our first compute node, cm01dbm01. For
frame of reference, our compute nodes are cm01dbm01 and cm01dbm02 and the
storage cells are cm01cel01, cm01cel02, and cm01cel03. After unzipping the
QFSDP patch, the directory structure looked like most other QFSDP patches:
[root@cm01dbm01 16038715]# ls
Database Infrastructure README.html README.txt SystemsManagement
[root@cm01dbm01 16038715]#
After downloading and unzipping, I typically start reading through all the various
README files to outline exactly what needs to be done on each of the nodes,
Homes, etc. After digesting this information, our plan, based on the versions and
configuration we're running, looks like this (I'm not going to regurgitate all the
patch README instructions, just show the list of steps):
1. First, I'll apply the InfiniBand patches if the switches are at a prior version
2. Next, I'll apply the 11.2.3.2.1 Exadata storage server patch on the storage
cells
3. After this, we'll use yum to apply the latest firmware/package/OS patches on
the compute nodes. (These used to be called "minimal packs", but this
terminology became obsolete with 11.2.3.1.0.)
4. Next, I like to apply the PDU patches if applicable
5. Then we would apply the OPatch patch to both the RDBMS and GI homes,
which is included under the Database directory in the QFSDP patch
6. After this, we'll patch both the GI and RDBMS homes with the QDPE patch
included in the QFSDP
7. Finally we (may) apply the Enterprise Manager 12c Cloud Control patches
Patching the InfiniBand Switches
The first step, as is typical, is to ensure that the image version on the InfiniBand
switches is the proper version. For purposes of the January QFSDP patch, the
version we're looking for is 1.3.3-2. I logged into each switch and ran the
"version" command to validate the current version:
Applying the January 2013
[root@cm01sw-ib2 ~]# version
SUN DCS 36p version: 1.3.3-2
SP board info:
Manufacturing Date: 2010.08.21
Serial Number: "NCD4V1753"
Hardware Revision: 0x0005
Firmware Revision: 0x0000
BIOS version: SUN0R100
BIOS date: 06/22/2010
[root@cm01sw-ib2 ~]#
As you can see, we're at the proper version so no InfiniBand switch patching is
required.
Patching the Exadata Storage Cells to 11.2.3.2.1
The first thing to do is establish ILOM serial console access on the storage cells.
Below, I'm showing this for one of the three cells:
Macintosh-8:~ jclarke$ ssh root@cm01cel01-ilom
Password:
Oracle(R) Integrated Lights Out Manager
Version 3.0.16.10 r65138
Copyright (c) 2011, Oracle and/or its affiliates. All rights reserved.
-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y
Serial console started. To stop, type ESC (
cm01cel01.centroid.com login: root
Password:
Last login: Sat Feb 2 12:04:02 from 172.16.150.10
[root@cm01cel01 ~]#
Next, we'll run ipconf -verify on all cells:
[root@cm01cel01 ~]# /opt/oracle.cellos/ipconf -verify
Verifying of Exadata configuration file /opt/oracle.cellos/cell.conf
Done. Configuration file /opt/oracle.cellos/cell.conf passed all verification checks
[root@cm01cel01 ~]#
After this, per the patch README, I added the following ciphers to the SSH
configuration (we're going to run patchmgr from cm01dbm01):
[root@cm01dbm01 ~]# grep Ciphers /etc/ssh/ssh_config
# Ciphers aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,arcfour,aes192-cbc,aes256-cbc,aes128-ctr,aes192-ctr,aes265-ctr
[root@cm01dbm01 ~]#
To execute patchmgr successfully we need to establish SSH equivalence. I'm
using the ~/cell_group file in root's home directory, and executed the below
to validate SSH equivalence for root. If the following prompts you for a
password, you'd want to follow the instructions in the patch README to set it up:
[root@cm01dbm01 ~]# dcli -g ./cell_group -l root 'hostname -i'
cm01cel01: 172.16.1.12
cm01cel02: 172.16.1.13
cm01cel03: 172.16.1.14
[root@cm01dbm01 ~]#
After this, I validated free space in the root file-system on each storage cell, as
patchmgr will transfer and extract patch contents temporarily.
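The free-space check can be as simple as running df against the root file-system. Here is a rough sketch; `has_free_kb` is a hypothetical helper of my own, and the threshold you pass it is an assumption you'd take from the patch README rather than a value from this post.

```shell
# Hypothetical helper: succeed if the file-system holding $1 has at least $2 KB free.
has_free_kb() {
    local mount_point=$1 required_kb=$2 available_kb
    # df -P forces one line per file-system; column 4 is the available space in KB
    available_kb=$(df -Pk "$mount_point" | awk 'NR==2 {print $4}')
    [ "$available_kb" -ge "$required_kb" ]
}

# To eyeball the same thing on every cell from the compute node, using the
# dcli form shown earlier:
#   dcli -g ~/cell_group -l root 'df -h /'
```

Calling `has_free_kb / 1048576`, for instance, returns success only when at least 1 GB is free in the root file-system of the host it runs on.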
Next, I set the disk_repair_time attribute to 7.2 hours for each ASM disk
group. This is most likely not going to be necessary, but I'm doing it anyway just
to save any potential difficulties downstream:
SQL> alter diskgroup data_cm01 set attribute 'disk_repair_time'='7.2h';
Diskgroup altered.
SQL>
(I then did the same thing for all ASM disk groups that were mounted)
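With more than a couple of disk groups, it can be handy to generate the statements rather than type them. The sketch below just prints the SQL for whatever disk group names you pass in; `repair_time_sql` is a hypothetical helper, and any disk group name other than data_cm01 is a placeholder, not one from this environment.

```shell
# Hypothetical helper: print one ALTER DISKGROUP statement per disk group name.
# $1 = repair time (e.g. 7.2h), remaining arguments = disk group names
repair_time_sql() {
    local repair_time=$1 dg
    shift
    for dg in "$@"; do
        printf "alter diskgroup %s set attribute 'disk_repair_time'='%s';\n" \
            "$dg" "$repair_time"
    done
}

# data_cm01 is from this post; reco_cm01 is a placeholder name
repair_time_sql 7.2h data_cm01 reco_cm01
```

The output can be reviewed and then pasted into a SQL*Plus session connected to the ASM instance.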
Now it's time to apply the patch. I changed directories to the
16038715/Infrastructure/ExadataStorageServer/11.2.3.2.1 directory,
unzipped p14522699_112321_Linux-x86-64.zip, and then changed
directories to patch_11.2.3.2.1.130109:
[root@cm01dbm01 patch_11.2.3.2.1.130109]# pwd
/u01/stg/16038715/Infrastructure/ExadataStorageServer/11.2.3.2.1/patc
[root@cm01dbm01 patch_11.2.3.2.1.130109]# ls
11.2.3.2.1.130109.iso 11.2.3.2.1.130109.patch.tar dcli dostep.sh etc patchmgr README.html
[root@cm01dbm01 patch_11.2.3.2.1.130109]#
Next, I ran patchmgr with the "-cleanup" flag to clean up any previous patch
attempts:
[root@cm01dbm01 patch_11.2.3.2.1.130109]# ./patchmgr -cells ~/cell_group -cleanup
Linux cm01dbm01.centroid.com 2.6.32-400.1.1.el5uek #1 SMP Mon Jun 25 20:25:08 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
2013-02-02 14:46:16 :DONE: Cleanup
[root@cm01dbm01 patch_11.2.3.2.1.130109]#
After this, I ran patchmgr with the prerequisite check, using the -rolling
option. The output looked like this:
oup -patch_check_prereq -rolling
20:25:08 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
2013-02-02 14:47:03 :Working: DO: Check cells have ssh equivalence for root user. Up to 10 seconds per cell ...
2013-02-02 14:47:05 :SUCCESS: DONE: Check cells have ssh equivalence for root user.
2013-02-02 14:47:05 :Working: DO: Check space and state of Cell services on target cells. Up to 1 minute ...
2013-02-02 14:47:24 :SUCCESS: DONE: Check space and state of Cell services on target cells.
2013-02-02 14:47:24 :Working: DO: Copy, extract prerequisite check archive to cells. If required start md11 mismatched partner size correction. Up to 40 minutes ...
2013-02-02 14:47:38 Wait correction of degraded md11 due to md partner size mismatch. Up to 30 minutes.
2013-02-02 14:47:39 :SUCCESS: DONE: Copy, extract prerequisite check archive to cells. If required start md11 mismatched partner size correction.
2013-02-02 14:47:39 :Working: DO: Check prerequisites on all cells. Up to 2 minutes ...
ls.
[root@cm01dbm01 patch_11.2.3.2.1.130109]#
Now, we'll apply the patch. The first several lines will look like this:
oup -patch -rolling
20:25:08 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
NOTE Cells will reboot during the patch or rollback process.
NOTE For non-rolling patch or rollback, ensure all ASM instances using
NOTE the cells are shut down for the duration of the patch or rollback.
NOTE For rolling patch or rollback, ensure all ASM instances using
NOTE the cells are up for the duration of the patch or rollback.
WARNING Do not start more than one instance of patchmgr.
WARNING Do not interrupt the patchmgr session.
WARNING Do not alter state of ASM instances during patch or rollback.
WARNING Do not resize the screen. It may disturb the screen layout.
WARNING Do not reboot cells or alter cell services during patch or rollback.
WARNING Do not open log files in editor in write mode or try to alter them.
NOTE All time estimates are approximate. Timestamps on the left are real.
NOTE You may interrupt this patchmgr run in next 60 seconds with control-c.
2013-02-02 14:52:03 :Working: DO: Check cells have ssh equivalence for root user. Up to 10 seconds per cell ...
2013-02-02 14:52:06 :SUCCESS: DONE: Check cells have ssh equivalence for root user.
2013-02-02 14:52:06 :Working: DO: Check space and state of Cell services on target cells. Up to 1 minute ...
... Output omitted
2013-02-02 15:07:18 2 of 5 :SUCCESS: DONE: Waiting to finish pre-reboot patch actions.
inal status on cells. Cells will reboot.
2013-02-02 15:07:18 3 Do cm01cel01 :Working: Cell will reboot. Up to
2013-02-02 15:07:22 3 Done cm01cel01 :SUCCESS: Finalize patch on cell.
2013-02-02 15:07:22 4 Do cm01cel01 :Working: Wait for cell to reboot and come online. At least 35 minutes and at most 120 minutes per cell ...
2013-02-02 15:07:22 cm01cel01 Wait for patch finalization and reboot
||||| Minutes left 115
2013-02-02 16:26:29 4 Done cm01cel01 :SUCCESS: Wait for cell to reboot and come online.
2013-02-02 16:26:29 5 Do cm01cel01 :Working: DO: Check the state of patch on cell. Up to 5 minutes ...
2013-02-02 16:26:36 5 Done cm01cel01 :SUCCESS: Check the state of patch on cell.
2013-02-02 16:26:36 3 Do cm01cel02 :Working: Cell will reboot. Up to 5 minutes ...
2013-02-02 16:26:39 3 Done cm01cel02 :SUCCESS: Finalize patch on cell.
2013-02-02 16:26:39 4 Do cm01cel02 :Working: Wait for cell to reboot and come online. At least 35 minutes and at most 120 minutes per cell ...
After the patch was applied to each storage cell, we ran imageinfo and imagehistory
to validate the current image. Here, we're looking for 11.2.3.2.1.130109:
[root@cm01cel01 ~]# imageinfo
Kernel version: 2.6.32-400.11.1.el5uek #1 SMP Thu Nov 22 03:29:09 PST 2012 x86_64
Cell version: OSS_11.2.3.2.1_LINUX.X64_130109
Cell rpm version: cell-11.2.3.2.1_LINUX.X64_130109-1
Active image version: 11.2.3.2.1.130109
Active image activated: 2013-02-02 16:21:12 -0500
Active image status: success
Active system partition on device: /dev/md5
Active software partition on device: /dev/md7
In partition rollback: Impossible
Cell boot usb partition: /dev/sdm1
Cell boot usb version: 11.2.3.2.1.130109
Inactive image version: 11.2.3.1.1.120607
Inactive image status: success
Inactive system partition on device: /dev/md6
Inactive software partition on device: /dev/md8
Boot area has rollback archive for the version: 11.2.3.1.1.120607
Rollback to the inactive partitions: Possible
[root@cm01cel01 ~]#
[root@cm01cel01 ~]# imagehistory
Version : 11.2.2.4.2.111221
Image activation date : 2012-09-04 13:46:01 -0400
Imaging mode : fresh
Imaging status : success
Version : 11.2.3.1.1.120607
Imaging mode : out of partition upgrade
Version : 11.2.3.2.1.130109
Imaging mode : out of partition upgrade
[root@cm01cel01 ~]#
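If you'd rather not scan the imageinfo output by eye on every cell, the active version can be pulled out with a one-line filter. This is only a sketch that assumes the output format shown above; `active_image_version` is a hypothetical name of my own.

```shell
# Hypothetical helper: read imageinfo output on stdin and print only the
# active image version (the value after "Active image version: ").
active_image_version() {
    awk -F': ' '/^Active image version:/ {print $2}'
}

# Usage on a storage cell:
#   imageinfo | active_image_version
```

The pattern is anchored to the full label, so the neighbouring "Active image activated" and "Active image status" lines are ignored.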
After patching is complete on all of the cells, we ran the patch cleanup:
oup -cleanup
20:25:08 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
2013-02-02 19:41:40 :DONE: Cleanup
[root@cm01dbm01 patch_11.2.3.2.1.130109]#
Patching the Compute Nodes to 11.2.3.2.1
For this release, the concept of the "minimal pack" no longer applies and Oracle
requires using the Unbreakable Linux Network (ULN) for distribution of the patch
updates. This being the case, the first applicable step for 11.2.3.2.1 is to validate
the current image version and kernel on the compute nodes:
[root@cm01dbm01 ~]# uname -r
2.6.32-400.1.1.el5uek
Kernel version: 2.6.32-400.1.1.el5uek #1 SMP Mon Jun 25 20:25:08 EDT 2012 x86_64
Image version: 11.2.3.2.0.120713
Image activated: 2012-10-11 03:21:52 -0400
Image status: success
System partition on device: /dev/mapper/VGExaDb-LVDbSys1
[root@cm01dbm01 ~]#
From the above, we can see we're on 11.2.3.2.0.120713, so the instructions
are to prepare and populate the yum repositories with the
exadata_dbserver_11.2.3.2.1_x86_64_base channel.
On our compute nodes, we had previously used the ULN to patch our compute
nodes and registered our compute node hosts, so we'll follow the steps in MOS
note 1473002.1 as well as the 11.2.3.2.1 patch README and subscribe to the
exadata_dbserver_11.2.3.2.1_x86_64, as shown below:
[http://2.bp.blogspot.com/-qRB8f3-
zFFo/URCYZn1ITzI/AAAAAAAAAB4/LV90HUkMtp8/s1600/112321blog_1.png]
each compute node:
[root@cm01dbm01 16038715]# sh 167283.sh
## BEGIN PROCESSING el5_x86_64_addons ##
Channel Dir: /var/www/html/yum/EnterpriseLinux/EL5/addons/x86_64
Fetching all package list for channel: el5_x86_64_addons...
########################################
Fetching package list for channel: el5_x86_64_addons...
########################################
When 167283.sh completed, we updated the /etc/yum.repos.d/Exadata-computenode.repo
file so that it looks like this:
[root@cm01dbm01 x86_64]# cat /etc/yum.repos.d/Exadata-computenode.repo
[exadata_dbserver_11.2.3.2.1_x86_64_base]
name=Oracle Exadata DB server 11.2 Linux $releasever - $basearch - latest
baseurl=http://cm01dbm01.centroid.com/yum/unknown/EXADATA/dbserver/11.2.3.2.1/base/x86_64/
gpgcheck=1
enabled=0
[root@cm01dbm01 x86_64]#
Next, I updated/validated yum.conf to ensure that the exclusion line only
excluded up2date since I was running a UEK kernel already:
[root@cm01dbm01 x86_64]# grep exclude /etc/yum.conf
exclude=up2date
[root@cm01dbm01 x86_64]#
Then I disabled CRS and stopped the cluster on the first compute node:
[root@cm01dbm01 ~]# /u01/app/11.2.0.3/grid/bin/crsctl disable crs
CRS-4621: Oracle High Availability Services autostart is disabled.
[root@cm01dbm01 ~]# /u01/app/11.2.0.3/grid/bin/crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'cm01dbm01'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'cm01dbm01'
CRS-2673: Attempting to stop 'ora.oc4j' on 'cm01dbm01'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN2.lsnr' on 'cm01dbm01'
CRS-2673: Attempting to stop 'ora.visx.db' on 'cm01dbm01'
CRS-2673: Attempting to stop 'ora.edw.db' on 'cm01dbm01'
CRS-2673: Attempting to stop 'ora.dwprd.db' on 'cm01dbm01'
After this, we ran a "yum clean all":
[root@cm01dbm01 ~]# yum clean all
Cleaning up Everything
[root@cm01dbm01 ~]#
After this, we validated our yum repository using "yum
--enablerepo=<channel name as mentioned in the patch
README> repolist":
[root@cm01dbm01 yum.repos.d]# yum --enablerepo=exadata_dbserver_11.2.3.2.1_x86_64_base repolist
exadata_dbserver_11.2.3.2.1_x86_64_base                    | 1.9 kB 00:00
exadata_dbserver_11.2.3.2.1_x86_64_base/primary_db         | 1.1 MB 00:00
Excluding Packages in global exclude list
Finished
repo id                                  repo name                                                 status
exadata_dbserver_11.2.3.2.1_x86_64_base  Oracle Exadata DB server 11.2 Linux 5 - x86_64 - latest   474+1
repolist: 474
[root@cm01dbm01 yum.repos.d]#
After this, the patch README states to remove a few RPMs:
[root@cm01dbm01 yum.repos.d]# rpm -e libcxgb3-static.x86_64
error: package libcxgb3-static.x86_64 is not installed
[root@cm01dbm01 yum.repos.d]#
Next (finally) I applied the updates using the
exadata_dbserver_11.2.3.2.1_x86_64_base repository:
[root@cm01dbm01 11.2.3.2.1]# yum --enablerepo=exadata_dbserver_11.2.3.2.1_x86_64_base update
exadata_dbserver_11.2.3.2.1_x86_64_base                    | 1.9 kB 00:00
Excluding Packages in global exclude list
Finished
Setting up Update Process
Resolving Dependencies
--> Running transaction check
---> Package OpenIPMI.x86_64 0:2.0.16-13.el5_8 set to be updated
---> Package OpenIPMI-libs.x86_64 0:2.0.16-13.el5_8 set to be updated
---> Package bind-libs.x86_64 30:9.3.6-20.P1.el5_8.5 set to be updated
---> Package bind-utils.x86_64 30:9.3.6-20.P1.el5_8.5 set to be updated
---> Package cyrus-sasl.x86_64 0:2.1.22-7.el5_8.1 set to be updated
---> Package cyrus-sasl-lib.x86_64 0:2.1.22-7.el5_8.1 set to be updated
---> Package device-mapper-multipath.x86_64 0:0.4.9-46.0.5.el5 set to be updated
---> Package device-mapper-multipath-libs.x86_64 0:0.4.9-46.0.5.el5 set to be updated
---> Package exadata-applyconfig.x86_64 0:11.2.3.2.1.130109-1 set to be updated
---> Package exadata-asr.x86_64 0:11.2.3.2.1.130109-1 set to be updated
---> Package exadata-base.x86_64 0:11.2.3.2.1.130109-1 set to be updated
---> Package exadata-commonnode.x86_64 0:11.2.3.2.1.130109-1 set to be updated
tzdata.x86_64 0:2012f-1.el5
tzdata-java.x86_64 0:2012f-1.el5
yum.noarch 0:3.2.22-39.0.6.el5
Complete!
[root@cm01dbm01 11.2.3.2.1]#
Remote broadcast message (Mon Feb 4 12:27:29 2013):
Exadata post install steps started.
It may take up to 2 minutes.
The db node will be rebooted upon successful completion.
Remote broadcast message (Mon Feb 4 12:27:45 2013):
Exadata post install steps completed.
Initiate reboot in 10 seconds to apply the changes.
Broadcast message from root (Mon Feb 4 12:27:55 2013):
The system is going down for reboot NOW!
Connection to cm01dbm01 closed by remote host.
At the completion of the update, the README instructed me to remove a few
packages, so I did this:
[root@cm01dbm01 ~]# rpm -qa | grep 'ofa-\|^kernel-' | grep -v 'uek\|^kernel-2\.6\.18' | xargs yum -y remove
Setting up Remove Process
Resolving Dependencies
--> Running transaction check
---> Package kernel-debuginfo.x86_64 0:2.6.18-308.24.1.0.1.el5 set to be erased
---> Package kernel-debuginfo-common.x86_64 0:2.6.18-308.24.1.0.1.el5 set to be erased
---> Package kernel-devel.x86_64 0:2.6.18-308.24.1.0.1.el5 set to be erased
---> Package kernel-doc.noarch 0:2.6.18-308.24.1.0.1.el5 set to be erased
---> Package ofa-2.6.18-238.12.2.0.2.el5.x86_64 0:1.5.1-4.0.53 set to be erased
--> Finished Dependency Resolution
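Before piping a package list into yum -y remove, it can be useful to preview what the two grep filters actually select. The sketch below is not taken from the README: it wraps the same two patterns (rewritten as extended regular expressions) in a hypothetical function, select_removable, and feeds it a made-up package list instead of live rpm -qa output.

```shell
# Hypothetical helper: reads package names (one per line, in the form
# printed by "rpm -qa") on stdin and prints the ones the removal pipeline
# would pass on to yum.
# First grep keeps OFA packages and names starting with "kernel-";
# second grep drops UEK packages and the base 2.6.18 kernel itself.
select_removable() {
  grep -E 'ofa-|^kernel-' | grep -Ev 'uek|^kernel-2\.6\.18'
}

# Illustrative input only; this is sample data, not a real inventory.
printf '%s\n' \
  'kernel-2.6.18-308.24.1.0.1.el5' \
  'kernel-uek-2.6.32-400.11.1.el5uek' \
  'kernel-devel-2.6.18-308.24.1.0.1.el5' \
  'ofa-2.6.18-238.12.2.0.2.el5-1.5.1-4.0.53' \
  'bash-3.2-32.el5' |
  select_removable
```

With this sample input only the kernel-devel and ofa-2.6.18 entries are printed, which matches the kind of packages yum reports as "set to be erased" above.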
After this I ran a yum clean all:
[root@cm01dbm01 ~]# yum clean all
Cleaning up Everything
At this point, an imageinfo command shows we're running the updated and
patched version:
[root@cm01dbm01 ~]# imageinfo
Kernel version: 2.6.32-400.11.1.el5uek #1 SMP Thu Nov 22 03:29:09 PST 2012 x86_64
Image version: 11.2.3.2.1.130109
Image activated: 2013-02-04 12:27:43 -0500
Image status: success
System partition on device: /dev/mapper/VGExaDb-LVDbSys1
[root@cm01dbm01 ~]#
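If you patch several nodes, you may want to check the image version and status in a script rather than by eye. The following is a minimal sketch under that assumption: imageinfo_field is a hypothetical helper that pulls one "Label: value" field out of imageinfo-style text, and the sample text is copied from the output above rather than produced by a live call.

```shell
# Hypothetical helper: print the value of one "Label: value" field from
# imageinfo output supplied on stdin. $1 is the label without the colon.
imageinfo_field() {
  awk -v label="$1" '
    index($0, label ":") == 1 {
      sub(/^[^:]*:[ \t]*/, "")   # strip the label and the whitespace after it
      print
      exit
    }'
}

# Sample text only (taken from the run shown earlier).
sample='Image version: 11.2.3.2.1.130109
Image activated: 2013-02-04 12:27:43 -0500
Image status: success'

printf '%s\n' "$sample" | imageinfo_field 'Image version'
printf '%s\n' "$sample" | imageinfo_field 'Image status'
```

On a real node you would pipe the imageinfo command itself into the function and compare the two values against the version you expect and the string success.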
The last couple of steps are to relink the GI and RDBMS home binaries. See
below:
[root@cm01dbm01 ~]# /u01/app/11.2.0.3/grid/crs/install/rootcrs.pl -unlock
Using configuration parameter file: /u01/app/11.2.0.3/grid/crs/install/crsconfig_params
CRS-4544: Unable to connect to OHAS
CRS-4000: Command Stop failed, or completed with errors.
Successfully unlock /u01/app/11.2.0.3/grid
[root@cm01dbm01 ~]#
[root@cm01dbm01 ~]# su - oracle
The Oracle base has been set to /u01/app/oracle
[oracle@cm01dbm01 ~]$ echo $ORACLE_HOME
/u01/app/oracle/product/11.2.0.3/dbhome_1
[oracle@cm01dbm01 ~]$ which relink
/u01/app/oracle/product/11.2.0.3/dbhome_1/bin/relink
[oracle@cm01dbm01 ~]$ relink all
writing relink log to: /u01/app/oracle/product/11.2.0.3/dbhome_1/install/relink.log
[oracle@cm01dbm01 ~]$
[oracle@cm01dbm01 ~]$ make -C $ORACLE_HOME/rdbms/lib -f ins_rdbms.mk ipc_rds ioracle
make: Entering directory `/u01/app/oracle/product/11.2.0.3/dbhome_1/rdbms/lib'
rm -f /u01/app/oracle/product/11.2.0.3/dbhome_1/lib/libskgxp11.so
cp /u01/app/oracle/product/11.2.0.3/dbhome_1/lib//libskgxpr.so /u01/a
chmod 755 /u01/app/oracle/product/11.2.0.3/dbhome_1/bin
[root@cm01dbm01 ~]# su - grid
The Oracle base has been set to /u01/app/grid
[grid@cm01dbm01 ~]$
[grid@cm01dbm01 ~]$ which relink
/u01/app/11.2.0.3/grid/bin/relink
[grid@cm01dbm01 ~]$ relink all
writing relink log to: /u01/app/11.2.0.3/grid/install/relink.log
[grid@cm01dbm01 ~]$
[grid@cm01dbm01 ~]$ make -C $ORACLE_HOME/rdbms/lib -f ins_rdbms.mk ipc_rds ioracle
make: Entering directory `/u01/app/11.2.0.3/grid/rdbms/lib'
rm -f /u01/app/11.2.0.3/grid/lib/libskgxp11.so
cp /u01/app/11.2.0.3/grid/lib//libskgxpr.so /u01/app/11.2.0.3/grid/lib/libskgxp11.so
chmod 755 /u01/app/11.2.0.3/grid/bin
[root@cm01dbm01 ~]# /u01/app/11.2.0.3/grid/crs/install/rootcrs.pl -patch
CRS-4123: Oracle High Availability Services has been started.
[root@cm01dbm01 ~]#
[root@cm01dbm01 ~]# shutdown -r 0
Broadcast message from root (pts/0) (Mon Feb 4 13:36:55 2013):
The system is going down for reboot NOW!
[root@cm01dbm01 ~]#
After this was complete and the server was back up, we repeated all of the above
steps on the other node, cm01dbm02.
At this point, the storage cells and compute nodes are patched with the latest
QFSDP. The next section covers how to patch the Oracle and Grid
Infrastructure binaries on the compute nodes.
Patching the Compute Node Binaries
In this section we're going to patch our two compute nodes with the Oracle
Quarterly Database Patch For Exadata (JAN 2013 - 11.2.0.3.14) for Bug
and to my knowledge there is always a QDPE patch inside a QFSDP patch (so
far).
(The README we followed was in
16038715/Database/11.2.0.3.14_QDPE_Jan2013/README.txt.)
(Below, we're going to patch with OPatch and not use OPlan.)
The first step is to install the latest OPatch software, included with the QFSDP
patch, in the RDBMS and GI homes on both servers.
[oracle@cm01dbm01 11.2.0.3.0]$ unzip p6880880_112000_Linux-x86-64.zip -d $ORACLE_HOME
Archive: p6880880_112000_Linux-x86-64.zip
replace /u01/app/oracle/product/11.2.0.3/dbhome_1/OPatch/oplan/README.html? [y]es, [n]o, [A]ll, [N]one, [r]ename: A
... Output omitted - did for both RDBMS and GI homes on both nodes
Once unzipped, we checked the OPatch version from each home on each node:
[grid@cm01dbm01 11.2.0.3.0]$ /u01/app/11.2.0.3/grid/OPatch/opatch version
OPatch Version: 11.2.0.3.3
OPatch succeeded.
[grid@cm01dbm01 11.2.0.3.0]$
[oracle@cm01dbm01 ~]$ /u01/app/oracle/product/11.2.0.3/dbhome_1/OPatch/opatch version
OPatch Version: 11.2.0.3.3
OPatch succeeded.
[oracle@cm01dbm01 ~]$
[grid@cm01dbm02 ~]$ /u01/app/11.2.0.3/grid/OPatch/opatch version
OPatch Version: 11.2.0.3.3
OPatch succeeded.
[grid@cm01dbm02 ~]$
[oracle@cm01dbm02 ~]$ /u01/app/oracle/product/11.2.0.3/dbhome_1/OPatch/opatch version
OPatch Version: 11.2.0.3.3
OPatch succeeded.
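With four homes to check, the version comparison is easy to script. The sketch below assumes the "OPatch Version: x.y.z" output format shown above; opatch_version and version_ge are hypothetical helper names, and min_version is a placeholder, not a value taken from the README.

```shell
# Hypothetical helper: extract the version number from "opatch version"
# output read on stdin.
opatch_version() {
  awk -F': *' '/^OPatch Version/ { print $2; exit }'
}

# Hypothetical helper: succeed if dotted version $1 >= $2, comparing the
# dot-separated fields numerically from left to right.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, "."); m = split(b, y, ".")
    max = (n > m) ? n : m
    for (i = 1; i <= max; i++) {
      xi = (i <= n) ? x[i] + 0 : 0
      yi = (i <= m) ? y[i] + 0 : 0
      if (xi > yi) exit 0
      if (xi < yi) exit 1
    }
    exit 0
  }'
}

min_version=11.2.0.3.0   # placeholder; use the minimum your README states
found=$(printf 'OPatch Version: 11.2.0.3.3\n\nOPatch succeeded.\n' | opatch_version)
if version_ge "$found" "$min_version"; then
  echo "OPatch $found is new enough"
else
  echo "OPatch $found is older than $min_version"
fi
```

In practice you would replace the printf with the real opatch version call for each home on each node.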
After this, we created our response file for OPatch; below, I'm only showing doing
both nodes:
OCM Installation Response Generator 10.3.4.0.0 - Production
Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
Provide your email address to be informed of security issues, install and
initiate Oracle Configuration Manager. Easier for you if you use your My
Oracle Support Email address/User Name.
Visit http://www.oracle.com/support/policies.html for details.
Email address/User Name: john.clarke@centroid.com
Provide your My Oracle Support password to receive security updates via your My Oracle Support account.
Password (optional):
The OCM configuration response file (ocm.rsp) was successfully created.
[grid@cm01dbm01 ~]$ ls ocm.rsp
ocm.rsp
[grid@cm01dbm01 ~]$ ls -l ocm.rsp
-rw-r--r-- 1 grid oinstall 4614 Feb 4 15:29 ocm.rsp
[grid@cm01dbm01 ~]$ locate ocm.rsp
/opt/oracle/patches/ocm/ocm.rsp
/opt/oracle.SupportTools/onecommand/onecommand-default-ocm.rsp
/u01/app/11.2.0/grid/OPatch/ocm/bin/ocm.rsp
/u01/app/11.2.0.3/grid/OPatch/ocm/bin/ocm.rsp
/u01/app/oracle/product/11.2.0/dbhome_1/OPatch/ocm/bin/ocm.rsp
/u01/app/oracle/product/11.2.0.3/dbhome_1/OPatch/ocm/bin/ocm.rsp
[grid@cm01dbm01 ~]$ cp ocm.rsp /u01/app/11.2.0.3/grid/OPatch/ocm/bin/ocm.rsp
[grid@cm01dbm01 ~]$
Next, we validated the inventory for each GI and RDBMS home across the
compute nodes. Below I'm showing the GI inventory on cm01dbm02:
[grid@cm01dbm02 ~]$ /u01/app/11.2.0.3/grid/OPatch/opatch lsinventory -detail -oh $ORACLE_HOME
Oracle Interim Patch Installer version 11.2.0.3.3
Copyright (c) 2012, Oracle Corporation. All rights reserved.
Oracle Home : /u01/app/11.2.0.3/grid
Central Inventory : /u01/app/oraInventory
from : /u01/app/11.2.0.3/grid/oraInst.loc
OPatch version : 11.2.0.3.3
Log file location : /u01/app/11.2.0.3/grid/cfgtoollogs/opatch/opatch2013-02-04_15-34-43PM_1.log
Lsinventory Output file location : /u01/app/11.2.0.3/grid/cfgtoollogs/opatch/lsinv/lsinventory2013-02-04_15-34-43PM.txt
--------------------------------------------------------------------------------
Installed Top-level Products (1):
Oracle Grid Infrastructure 11.2.0.3.0
There are 1 products installed in this Oracle Home.
Installed Products (88):
Agent Required Support Files 10.2.0.4.3
Assistant Common Files 11.2.0.3.0
Next, we'll unzip our patch as "oracle", confirm that the primary group is
oinstall, and set permissions so that we can execute the patch:
[oracle@cm01dbm01 ~]$ id
uid=1001(oracle) gid=1001(oinstall) groups=101(fuse),1001(oinstall),1002(dba),1004(asmdba),1005(asmoper),1006(asmadmin)
[oracle@cm01dbm01 ~]$ id -gn
oinstall
[oracle@cm01dbm01 11.2.0.3.14_QDPE_Jan2013]$ pwd
/u01/stg/16038715/Database/11.2.0.3.14_QDPE_Jan2013
[oracle@cm01dbm01 11.2.0.3.14_QDPE_Jan2013]$ unzip p15835102_112030_Linux-x86-64.zip
[root@cm01dbm01 Database]# chown -R oracle:oinstall 11*
[root@cm01dbm01 Database]# pwd
/u01/stg/16038715/Database
[root@cm01dbm01 Database]#
After this, we need to run the patch conflict check for both the GI and RDBMS
homes. For the GI home on cm01dbm01, it looks like this:
[grid@cm01dbm01 15835102]$ /u01/app/11.2.0.3/grid/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir \
02
PREREQ session
OPatch version : 11.2.0.3.3
OUI version : 11.2.0.3.0
013-02-04_15-42-44PM_1.log
Invoking prereq "checkconflictagainstohwithdetail"
Prereq "checkConflictAgainstOHWithDetail" passed.
OPatch succeeded.
q CheckConflictAgainstOHWithDetail -phBaseDir /u01/stg/16038715/Database/11.2.0.3.14_QDPE_Jan2013/15835102/15876003
PREREQ session
OPatch version : 11.2.0.3.3
OUI version : 11.2.0.3.0
013-02-04_15-43-05PM_1.log
OPatch succeeded.
q CheckConflictAgainstOHWithDetail -phBaseDir /u01/stg/16038715/Database/11.2.0.3.14_QDPE_Jan2013/15835102/14307915
PREREQ session
OPatch version : 11.2.0.3.3
OUI version : 11.2.0.3.0
OPatch succeeded.
[grid@cm01dbm01 15835102]$
We repeated the same on cm01dbm02 as the "grid" owner.
For the prereq check for the RDBMS home on cm01dbm01, it looks like this (and
yes, we did the same thing on the other node):
[oracle@cm01dbm01 ~]$ $ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir /u01/stg/16038715/Database/11.2.0.3.14_QDPE_Jan2013/15835102/15835102
PREREQ session
Oracle Home : /u01/app/oracle/product/11.2.0.3/dbhome_1
from : /u01/app/oracle/product/11.2.0.3/dbhome_1/oraInst.loc
OPatch version : 11.2.0.3.3
OUI version : 11.2.0.3.0
Log file location : /u01/app/oracle/product/11.2.0.3/dbhome_1/cfgtoollogs/opatch/opatch2013-02-04_15-46-11PM_1.log
OPatch succeeded.
[oracle@cm01dbm01 ~]$ $ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir /u01/stg/16038715/Database/11.2.0.3.14_QDPE_Jan2013/15835102/15876003/custom/server/15876003
PREREQ session
OPatch version : 11.2.0.3.3
OUI version : 11.2.0.3.0
After this, we needed to check system space using the CheckSystemSpace
argument, across both nodes and both homes. Below is what it looks like for the
GI home on cm01dbm01:
[grid@cm01dbm01 ~]$ $ORACLE_HOME/OPatch/opatch prereq CheckSystemSpace -phBaseDir /u01/stg/16038715/Database/11.2.0.3.14_QDPE_Jan2013/15835102/15835102
PREREQ session
OPatch version : 11.2.0.3.3
OUI version : 11.2.0.3.0
013-02-04_15-48-36PM_1.log
Invoking prereq "checksystemspace"
Prereq "checkSystemSpace" passed.
OPatch succeeded.
[grid@cm01dbm01 ~]$ $ORACLE_HOME/OPatch/opatch prereq CheckSystemSpac
102/15876003
PREREQ session
OPatch version : 11.2.0.3.3
OUI version : 11.2.0.3.0
013-02-04_15-48-44PM_1.log
OPatch succeeded.
102/14307915
PREREQ session
OPatch version : 11.2.0.3.3
OUI version : 11.2.0.3.0
013-02-04_15-48-52PM_1.log
OPatch succeeded.
[grid@cm01dbm01 ~]$
For the RDBMS home:
[oracle@cm01dbm01 ~]$ $ORACLE_HOME/OPatch/opatch prereq CheckSystemSpace -phBaseDir /u01/stg/16038715/Database/11.2.0.3.14_QDPE_Jan2013/15835102/15835102
PREREQ session
OPatch version : 11.2.0.3.3
OUI version : 11.2.0.3.0
OPatch succeeded.
[oracle@cm01dbm01 ~]$ $ORACLE_HOME/OPatch/opatch prereq CheckSystemSpace -phBaseDir /u01/stg/16038715/Database/11.2.0.3.14_QDPE_Jan2013/15835102/15876003/custom/server/15876003
PREREQ session
OUI version : 11.2.0.3.0
OPatch succeeded.
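Both prerequisite checks are run once per sub-patch directory and per home, so the full command list is repetitive. As a rough sketch, the hypothetical print_prereq_cmds function below only prints the commands (a dry run) so they can be reviewed before being run as the home's owner; the check names, base directory, and sub-patch paths are the ones used in this post, while the function and variable names are assumptions made for illustration.

```shell
# Hypothetical dry-run helper: print (do not execute) one "opatch prereq"
# command per sub-patch directory.
#   $1 = Oracle home, $2 = prereq check name, $3 = patch base directory,
#   remaining arguments = sub-patch paths relative to the base directory.
print_prereq_cmds() {
  p_home=$1
  p_check=$2
  p_base=$3
  shift 3
  for p_sub in "$@"; do
    printf '%s/OPatch/opatch prereq %s -phBaseDir %s/%s\n' \
      "$p_home" "$p_check" "$p_base" "$p_sub"
  done
}

stage=/u01/stg/16038715/Database/11.2.0.3.14_QDPE_Jan2013/15835102
for chk in CheckConflictAgainstOHWithDetail CheckSystemSpace; do
  # GI home: the three sub-patches checked above.
  print_prereq_cmds /u01/app/11.2.0.3/grid "$chk" "$stage" \
    15835102 15876003 14307915
  # RDBMS home: the two sub-patch paths checked above.
  print_prereq_cmds /u01/app/oracle/product/11.2.0.3/dbhome_1 "$chk" "$stage" \
    15835102 15876003/custom/server/15876003
done
```

Printing rather than executing keeps the sketch safe to run anywhere; the GI commands would be run as grid and the RDBMS commands as oracle, on each node.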
Finally, we're at a point where we can actually apply the patch. We'll start by
patching the GI home on cm01dbm02 (be sure to unmount DBFS file systems and
stop dbconsole, if you use these and have them running).
I decided to patch the GI home alone instead of patching both the GI and RDBMS
homes together - this is just a matter of preference. I logged in as root and did
this:
[root@cm01dbm02 11.2.0.3.14_QDPE_Jan2013]# export PATH=$PATH:/u01/app/11.2.0.3/grid/OPatch/
[root@cm01dbm02 11.2.0.3.14_QDPE_Jan2013]# which opatch
/u01/app/11.2.0.3/grid/OPatch/opatch
[root@cm01dbm02 11.2.0.3.14_QDPE_Jan2013]# opatch auto /u01/stg/16038715/Database/11.2.0.3.14_QDPE_Jan2013/15835102/ -oh /u01/app/11.2.0.3/grid/
Executing /u01/app/11.2.0.3/grid/perl/bin/perl /u01/app/11.2.0.3/grid/OPatch/crs/patch11203.pl -patchdir /u01/stg/16038715/Database/11.2.0.3.14_QDPE_Jan2013 -patchn 15835102 -oh /u01/app/11.2.0.3/grid/ -paramfile /u01/app/11.2.0.3/grid/crs/install/crsconfig_params
/u01/app/11.2.0.3/grid/crs/install/crsconfig_params
/u01/app/11.2.0.3/grid/crs/install/s_crsconfig_defs
This is the main log file: /u01/app/11.2.0.3/grid/cfgtoollogs/opatchauto2013-02-04_15-53-48.log
This file will show your detected configuration and all the steps that opatchauto attempted to do on your system: /u01/app/11.2.0.3/grid/cfgtoollogs/opatchauto2013-02-04_15-53-48.report.log
2013-02-04 15:53:48: Starting Clusterware Patch Setup
OPatch is bundled with OCM, Enter the absolute OCM response file path:
/u01/app/11.2.0.3/grid/OPatch/ocm/bin/ocm.rsp
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'cm01dbm02'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'cm01dbm02'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'cm01dbm02'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'cm01dbm02'
CRS-2673: Attempting to stop 'ora.visx.db' on 'cm01dbm02'
CRS-2673: Attempting to stop 'ora.edw.db' on 'cm01dbm02'
CRS-2673: Attempting to stop 'ora.dwprd.db' on 'cm01dbm02'
CRS-2673: Attempting to stop 'ora.visy.db' on 'cm01dbm02'
CRS-4133: Oracle High Availability Services has been stopped.
Successfully unlock /u01/app/11.2.0.3/grid
patch /u01/stg/16038715/Database/11.2.0.3.14_QDPE_Jan2013/15835102/15835102 apply successful for home /u01/app/11.2.0.3/grid
CRS-4123: Oracle High Availability Services has been started.
[root@cm01dbm02 11.2.0.3.14_QDPE_Jan2013]#
After this I repeated the same steps on the other node, cm01dbm01.
Next, I patched the RDBMS homes on both nodes. When you use opatch auto, it
should take care of shutting down instances and so forth...
[root@cm01dbm02 ~]# export PATH=/u01/app/oracle/product/11.2.0.3/dbhome_1/OPatch/:$PATH
[root@cm01dbm02 ~]# which opatch
/u01/app/oracle/product/11.2.0.3/dbhome_1/OPatch/opatch
[root@cm01dbm02 ~]# opatch auto /u01/stg/16038715/Database/11.2.0.3.14_QDPE_Jan2013/15835102/ -oh /u01/app/oracle/product/11.2.0.3/dbhome_1/
Executing /u01/app/11.2.0.3/grid/perl/bin/perl /u01/app/oracle/product/11.2.0.3/dbhome_1/OPatch/crs/patch11203.pl -patchdir /u01/stg/16038715/Database/11.2.0.3.14_QDPE_Jan2013 -patchn 15835102 -oh /u01/app/oracle/product/11.2.0.3/dbhome_1/ -paramfile /u01/app/11.2.0.3/grid/crs/install/crsconfig_params
/u01/app/11.2.0.3/grid/crs/install/crsconfig_params
/u01/app/11.2.0.3/grid/crs/install/s_crsconfig_defs
This is the main log file: /u01/app/oracle/product/11.2.0.3/dbhome_1/
This file will show your detected configuration and all the steps that opatchauto attempted to do on your system: /u01/app/oracle/product/11.2.0.3/dbhome_1/cfgtoollogs/opatchauto2013-02-04_16-25-37.report.log
2013-02-04 16:25:37: Starting Clusterware Patch Setup
OPatch is bundled with OCM, Enter the absolute OCM response file path:
/u01/app/oracle/product/11.2.0.3/dbhome_1/OPatch/ocm/bin/ocm.rsp
835102 apply successful for home /u01/app/oracle/product/11.2.0.3/dbhome_1
876003/custom/server/15876003 apply successful for home /u01/app/oracle/product/11.2.0.3/dbhome_1
307915 apply successful for home /u01/app/oracle/product/11.2.0.3/dbhome_1
[root@cm01dbm02 ~]#
When complete, the software will be patched with everything included in
QFSDP/QDPE_Jan2013/ (patch 15835102), so we need to update the registry for
each database:
[oracle@cm01dbm01 ~]$ cd $ORACLE_HOME
[oracle@cm01dbm01 dbhome_1]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Mon Feb 4 16:47:55 2013
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> @rdbms/admin/catbundle.sql exa apply
... Lots of output omitted
After checking the log files, we confirmed that the bundle had applied successfully.
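One simple way to check a bundle log is to scan it for lines that start with an Oracle or SQL*Plus error code. The sketch below is an assumption about how you might do that, not the exact check we ran: scan_bundle_log is a hypothetical helper, the log text is made up for illustration, and on a real system you would feed it whichever log file the catbundle run produced.

```shell
# Hypothetical helper: print any lines from a log (read on stdin) that begin
# with an ORA- or SP2- error code. Returns 1 if any were found, 0 otherwise.
scan_bundle_log() {
  if grep -E '^(ORA|SP2)-[0-9]+'; then
    return 1      # at least one error line was printed
  fi
  return 0        # no error lines found
}

# Made-up log text, for illustration only.
printf '%s\n' \
  'PL/SQL procedure successfully completed.' \
  'ORA-00942: table or view does not exist' |
  scan_bundle_log || echo "errors found, review the log"
```

A clean log prints nothing and returns success; some ORA- messages in bundle logs can be benign, so treat a hit as a prompt to read the log rather than as proof of failure.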
We already had the correct PDU firmware applied (1.04) so we took no actions for
this task.
Patching Systems Management Software
For the Enterprise Manager Cloud Control and Agent patches, the README for
16038715 recommends upgrading 12c to 12.1.0.2. Our current version is
12.1.0.1.0:
[oracle@emcc ~]$ /u01/app/MiddleWare/oms/bin/emctl status oms -details
Oracle Enterprise Manager Cloud Control 12c Release 12.1.0.1.0
Copyright (c) 1996, 2012 Oracle Corporation. All rights reserved.
Enter Enterprise Manager Root (SYSMAN) Password :
Console Server Host : emcc.centroid.com
HTTP Console Port : 7790
HTTPS Console Port : 7803
HTTP Upload Port : 4890
HTTPS Upload Port : 4904
OMS is not configured with SLB or virtual hostname
Agent Upload is locked.
OMS Console is locked.
Active CA ID: 1
Console URL: https://emcc.centroid.com:7803/em
Upload URL: https://emcc.centroid.com:4904/empbs/upload
WLS Domain Information
Domain Name : GCDomain
Admin Server Host: emcc.centroid.com
Managed Server Information
Managed Server Instance Name: EMGC_OMS1
Managed Server Instance Host: emcc.centroid.com
[oracle@emcc ~]$
Per MOS note 1494890.1, however, EM 12.1.0.2.0 does not support monitoring
Oracle E-Business Suite targets, so we elected not to upgrade our 12c
deployment.
Since most of the patches in 16038715 are for OMS 12.1.0.2.0 and 12.1.0.2.0
agents, I decided to call it a day and skip the Systems Management patches.
Conclusion
Exadata's Quarterly Full Stack Download Patches, or QFSDPs, provide a
comprehensive set of Oracle-tested patches for each tier on your Exadata
Database Machine. They include storage server patches as well as the QDPE
(Quarterly Database Patch for Exadata) patches, along with InfiniBand switch
firmware and PDU firmware patches.
As has been the case for a while, the patches tend to apply relatively cleanly and
are well-tested - it's simply a matter of following the instructions in the various
README files, along with having a good understanding of which utilities
(patchmgr, OPatch, yum, etc.) to use for each type of patch.
Posted 5th February 2013 by John Clarke
25th January 2013
Oracle ASM uses Exadata grid disks for its ASM disks in ASM disk groups. In this
post I'm going to show examples of Exadata's automatic disk management
functionality, also known as Automatic Exadata Storage Management.
Specifically, I'll try to show how Oracle ASM handles different types of grid disk
state changes, how ASM copes with a dropped grid disk, and what Oracle ASM does
when a grid disk is added after being dropped. Additionally, I'll show you how to
trace Oracle's automatic disk management modules in both your ASM instance and
storage cells.
First, I'll log in to a storage cell as root or celladmin and list my Exadata grid
disks via CellCLI:
CellCLI> list griddisk where name=SDATA_CD_09_cm01cel01 attributes name,status,asmDiskGroupName,asmDiskName,asmModeStatus
SDATA_CD_09_cm01cel01 active SDATA_CM01 SDATA_CD_09_CM01CEL01 ONLINE
CellCLI>
Automatic Disk Management on
Exadata
SDATA_CM01 ASM disk group and is currently ONLINE to Oracle ASM, as
indicated by the asmModeStatus attribute. This grid disk is the disk we will be
testing with in the next few examples.
Now, I'll manually deactivate the grid disk:
CellCLI> alter griddisk SDATA_CD_09_cm01cel01 inactive
GridDisk SDATA_CD_09_cm01cel01 successfully altered
CellCLI>
When this happens, the following messages are displayed in our ASM instance's
alert log:
1 SQL> /* Exadata Auto Mgmt: OFFLINE ASM Disk due to griddisk inactivate */
2 alter diskgroup SDATA_CM01 offline disk SDATA_CD_09_CM01CEL01
3
4 NOTE: DRTimer CD Create: for disk group 3 disks:
5 26
6 NOTE: process _xdmg_+asm1 (29727) initiating offline of disk 26.3916053578 (SDATA_CD_09_CM01CEL01) with mask 0x7e in group 3
7 NOTE: initiating PST update: grp = 3, dsk = 26/0xe96a3c4a, mask = 0x6a, op = clear
8 Fri Sep 14 18:44:09 2012
9 GMON updating disk modes for group 3 at 116 for pid 23, osid 29727
10 NOTE: PST update grp = 3 completed successfully
11 NOTE: initiating PST update: grp = 3, dsk = 26/0xe96a3c4a, mask = 0x7e, op = clear
13 NOTE: cache closing disk 26 of grp 3: SDATA_CD_09_CM01CEL01
15 NOTE: DRTimer CD Destroy: for diskgroup 3
16 SUCCESS: /* Exadata Auto Mgmt: OFFLINE ASM Disk due to griddisk inactivate */
18
19 NOTE: Exadata Auto Management: OS PID: 29727 SUCCESS Operation ID: 0
ion of one or more griddisks on cell o/192.168.10.3/
21 SQL : /* Exadata Auto Mgmt: OFFLINE ASM Disk due to griddisk inactivate */
23
24 Fri Sep 14 18:44:49 2012
25 WARNING: Disk 26 (SDATA_CD_09_CM01CEL01) in group 3 will be dropped in: (12960) secs on ASM inst 1
You can see from the above output that the following operations took place:
In line 1, an Exadata Auto Mgmt comment is written to the alert log,
indicating that a series of automatic disk management operations will take
place
In line 2, the alert log shows the ASM instance offlining the disk
In line 6, we can see that the xdmg process (PID 29727) is the process that
actually performs the ASM disk offline operation
Lines 7 through 15 are standard Oracle ASM Partnership and Status Table (PST)
updates
Line 16 indicates that the ASM disk offline operation succeeded
Lines 17 through 22 provide the actual SQL statements used to offline the
ASM disks
Line 25 displays a warning that the ASM disk will be automatically dropped in
12,960 seconds, which is the value specified by the ASM disk group's
disk_repair_time attribute (3.6 hours = 12,960 seconds)
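To make the 12,960 second figure concrete, the sketch below pulls the countdown out of the alert log warning and converts it to hours. It is only an illustration: drop_countdown_secs and secs_to_hours are hypothetical helper names, and the log line is the warning from line 25 above.

```shell
# Hypothetical helper: extract the drop countdown (in seconds) from an ASM
# alert log warning supplied on stdin.
drop_countdown_secs() {
  sed -n 's/.*will be dropped in: (\([0-9][0-9]*\)) secs.*/\1/p'
}

# Hypothetical helper: convert seconds ($1) to hours, one decimal place.
secs_to_hours() {
  awk -v s="$1" 'BEGIN { printf "%.1f\n", s / 3600 }'
}

line='WARNING: Disk 26 (SDATA_CD_09_CM01CEL01) in group 3 will be dropped in: (12960) secs on ASM inst 1'
secs=$(printf '%s\n' "$line" | drop_countdown_secs)
echo "$secs seconds = $(secs_to_hours "$secs") hours"
```

Since 12,960 / 3,600 = 3.6, the countdown in the warning corresponds to a disk_repair_time of 3.6 hours.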
Next, I'll log in to the ASM instance via SQL*Plus as SYSDBA or SYSASM and query
the ASM disk status:
SQL> select a.name,b.path,b.state,b.mode_status,b.failgroup
from v$asm_diskgroup a, v$asm_disk b
where a.group_number=b.group_number
and a.name='SDATA_CM01'
and b.failgroup='CM01CEL01'
order by 2,1
/
k Status Failgroup
-------------------------------------------------------------------------------------
INE CM01CEL01
SDATA_CM01 o/192.168.10.3/SDATA_CD_01_cm01cel01 NORMAL ONLINE CM01CEL01
INE CM01CEL01
INE CM01CEL01
INE CM01CEL01
INE CM01CEL01
INE CM01CEL01
INE CM01CEL01
INE CM01CEL01
INE CM01CEL01
INE CM01CEL01
SDATA_CM01 NORMAL OFFLINE CM01CEL01
12 rows selected.
SQL>
Notice from the above output that the disk path for our offlined grid disk is missing
in the last row of the output and that there is a gap between
o/192.168.10.3/SDATA_CD_08_cm01cel01 and
o/192.168.10.3/SDATA_CD_10_cm01cel01 in the second column.
Additionally, the disk status (v$asm_disk.mode_status) of our inactivated grid
disk reports a status of OFFLINE.
This is an indication that the ASM disk has been taken offline.
Now, let's activate the grid disk using the CellCLI command below:
CellCLI> alter griddisk SDATA_CD_09_cm01cel01 active
GridDisk SDATA_CD_09_cm01cel01 successfully altered
CellCLI>
After performing this activity, I'll check my ASM disk status using the same query:
SQL> select a.name,b.path,b.state,b.mode_status,b.failgroup
from v$asm_diskgroup a, v$asm_disk b
where a.group_number=b.group_number
and a.name='SDATA_CM01'
and b.failgroup='CM01CEL01'
order by 2,1
/
Disk Group Disk State Disk Status Failgroup
-------------------------------------------------------------------------
SDATA_CM01 o/192.168.10.3/SDATA_CD_00_cm01cel01 NORMAL ONLINE CM01CEL01
NE CM01CEL01
NE CM01CEL01
NE CM01CEL01
NE CM01CEL01
NE CM01CEL01
NE CM01CEL01
NE CM01CEL01
NE CM01CEL01
SDATA_CM01 o/192.168.10.3/SDATA_CD_09_cm01cel01 NORMAL SYNCING CM01CEL01
NE CM01CEL01
NE CM01CEL01
Notice that the ASM disk status from v$asm_disk.mode_status shows a status
of SYNCING; this occurs in Oracle 11g as part of Oracle's ASM Fast Mirror
Resync feature, which tracks the extents that change
between the time an ASM disk was taken offline and brought online, prior to
dropping an ASM disk, and synchronizes modified extents.
After a brief period of time, whose duration depends on the extent change
rate in the ASM disk group, Oracle automatically brings the ASM disk online and
all values for v$asm_disk.mode_status will show ONLINE.
ASM Fast Mirror Resync is not specific to Exadata; it is a standard Oracle 11g
feature. Prior to 11g, disks were dropped very quickly after OFFLINE operations,
so once a disk was replaced or repaired, adding it back to the ASM disk group
caused an ASM rebalance operation.
If you examine the ASM instance's alert log after activating the grid disk, you will
see the SDATA_CD_09_CM01CEL01 ASM disk being brought online in lines 13
and 14 below:
1 NOTE: Found o/192.168.10.3/SDATA_CD_09_cm01cel01 for disk SDATA_CD_09_CM01CEL01
2 WARNING: ignoring disk in deep discovery
3 SUCCESS: validated disks for 3/0xb65aca7a (SDATA_CM01)
4 GMON querying group 3 at 122 for pid 41, osid 7081
5 NOTE: initiating PST update: grp = 3, dsk = 26/0x0, mask = 0x19, op = assign
8 NOTE: membership refresh pending for group 3/0xb65aca7a (SDATA_CM01)
9 GMON querying group 3 at 124 for pid 19, osid 29719
10 NOTE: cache opening disk 26 of grp 3: SDATA_CD_09_CM01CEL01 path:o/192.168.10.3/SDATA_CD_09_cm01cel01
11 SUCCESS: refreshed membership for 3/0xb65aca7a (SDATA_CM01)
12 NOTE: initiating PST update: grp = 3, dsk = 26/0x0, mask = 0x5d, op = assign
13 SUCCESS: /* Exadata Auto Mgmt: ONLINE ASM Disk */
14 alter diskgroup SDATA_CM01 online disk SDATA_CD_09_CM01CEL01
15 nowait
18 NOTE: initiating PST update: grp = 3, dsk = 26/0x0, mask = 0x7d, op = assign
01)
22 NOTE: initiating PST update: grp = 3, dsk = 26/0x0, mask = 0x7f, op = assign
25 NOTE: reset timers for disk: 26
26 NOTE: completed online of disk group 3 disks
27 SDATA_CD_09_CM01CEL01 (26)
This test case demonstrates Exadata's automatic disk management features in
action, and shows you how Oracle ASM automatically adapts to grid disk state
changes when disks are inactivated and activated. These events occur when an
Exadata DMA manually inactivates or activates a grid disk, as we demonstrated
above, during storage cell patches, or due to transient disk failures.
If you experience a physical disk failure, the Exadata storage server will perform
several tasks.
First, the physical disk status will change to critical.
The cell disk status will change to not present.
Each grid disk on the cell disk will be inactivated, dropped with a FORCE
option, and flagged with a status of not present.
When these activities occur, the Oracle ASM instance will capture the status of the
grid disk and OFFLINE the grid disk from its ASM disk group. Oracle ASM will
then forcibly drop the ASM disk from its disk group.
You can simulate a physical disk failure by simply removing a disk from your
storage server. You can check your grid disk status via CellCLI using the
command below for the cell disk that you have removed:
CellCLI> list griddisk where cellDisk=CD_05_cm01cel01 attributes name,status
DATA_CD_05_cm01cel01 not present
DBFS_DG_CD_05_cm01cel01 not present
RECO_CD_05_cm01cel01 not present
SDATA_CD_05_cm01cel01 not present
SRECO_CD_05_cm01cel01 not present
CellCLI>
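If you want to check this from a script rather than by eye, here is a minimal Python sketch that parses the name/status pairs from the CellCLI output shown above. The function name parse_griddisk_status is my own invention, and the sample text is simply the listing above pasted in; how you capture the output from CellCLI is left out on purpose.

```python
def parse_griddisk_status(output):
    """Map grid disk name -> status from 'list griddisk ... attributes name,status' output."""
    statuses = {}
    for line in output.splitlines():
        stripped = line.strip()
        # Skip blank lines and the CellCLI prompt/command lines.
        if not stripped or stripped.startswith("CellCLI>"):
            continue
        parts = stripped.split(None, 1)  # first token is the name, the rest is the status
        if len(parts) == 2:
            statuses[parts[0]] = parts[1].strip()
    return statuses


# Sample text copied from the listing above (not captured live from a cell).
sample = """\
CellCLI> list griddisk where cellDisk=CD_05_cm01cel01 attributes name,status
         DATA_CD_05_cm01cel01     not present
         DBFS_DG_CD_05_cm01cel01  not present
         RECO_CD_05_cm01cel01     not present
         SDATA_CD_05_cm01cel01    not present
         SRECO_CD_05_cm01cel01    not present
CellCLI>
"""

statuses = parse_griddisk_status(sample)
missing = sorted(name for name, status in statuses.items() if status == "not present")
print(missing)
```

Keeping the status as free text (rather than splitting on every space) matters here because "not present" is a two-word status.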
Your ASM instance's alert log will show messages similar to the below:
NOTE: process _user8507_+asm1 (8507) initiating offline of disk 35.3965712224 (RECO_CD_05_CM01CEL01) with mask 0x7e in group 3
NOTE: checking PST: grp = 3
Tue Jul 05 21:47:54 2011
GMON checking disk modes for group 3 at 13 for pid 30, osid 8507
NOTE: checking PST for grp 3 done.
WARNING: Disk 35 (RECO_CD_05_CM01CEL01) in group 3 mode 0x7f is now being offlined
WARNING: Disk 35 (RECO_CD_05_CM01CEL01) in group 3 in mode 0x7f is now being taken offline on ASM inst 1
NOTE: initiating PST update: grp = 3, dsk = 35/0xec5ff760, mode = 0x6a, op = 4
GMON updating disk modes for group 3 at 14 for pid 30, osid 8507
NOTE: PST update grp = 3 completed successfully
NOTE: initiating PST update: grp = 3, dsk = 35/0xec5ff760, mode = 0x7e, op = 4
NOTE: cache closing disk 35 of grp 3: RECO_CD_05_CM01CEL01
NOTE: PST update grp = 3 completed successfully
NOTE: process _user8507_+asm1 (8507) initiating offline of disk 33.3965712156 (DATA_CD_05_CM01CEL01) with mask 0x7e in group 1
NOTE: checking PST: grp = 1
GMON checking disk modes for group 1 at 16 for pid 30, osid 8507
NOTE: checking PST for grp 1 done.
WARNING: Disk 33 (DATA_CD_05_CM01CEL01) in group 1 mode 0x7f is now being offlined
WARNING: Disk 33 (DATA_CD_05_CM01CEL01) in group 1 in mode 0x7f is now being taken offline on ASM inst 1
NOTE: initiating PST update: grp = 1, dsk = 33/0xec5ff71c, mode = 0x6a, op = 4
... output omitted for brevity
WARNING: Disk 33 (DATA_CD_05_CM01CEL01) in group 1 will be dropped in: (12960) secs on ASM inst 1
WARNING: Disk 35 (RECO_CD_05_CM01CEL01) in group 3 will be dropped in: (12960) secs on ASM inst 1
Tue Jul 05 21:48:00 2011
SQL> /* Exadata Auto Mgmt: Proactive DROP ASM Disk */
alter diskgroup DATA_CM01 drop
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=1
GMON updating for reconfig
The alert log messages above are similar to what you would see with automatic
disk management for situations in which a grid disk was inactivated on a storage
cell. The main difference lies in the last few lines, highlighted in bold. When a
physical disk fails, your ASM instance will proactively drop the ASM disk.
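The "will be dropped in: (12960) secs" warnings are worth pulling out of the alert log, since they tell you how long you have before ASM drops the disk. Below is a rough Python sketch that extracts them and converts the countdown to hours. The regular expression is my own and assumes each warning sits on a single, unwrapped line; the function name pending_drops is hypothetical.

```python
import re

# Matches warnings such as:
# WARNING: Disk 33 (DATA_CD_05_CM01CEL01) in group 1 will be dropped in: (12960) secs on ASM inst 1
DROP_RE = re.compile(
    r"WARNING: Disk (\d+) \((\S+)\) in group (\d+) will be dropped in: \((\d+)\) secs"
)


def pending_drops(alert_log_text):
    """Return (disk name, group number, hours until drop) for each drop warning."""
    drops = []
    for match in DROP_RE.finditer(alert_log_text):
        _disk_no, name, group, secs = match.groups()
        drops.append((name, int(group), round(int(secs) / 3600, 2)))
    return drops


# Two warning lines copied from the alert log excerpt above.
sample = (
    "WARNING: Disk 33 (DATA_CD_05_CM01CEL01) in group 1 will be dropped in: (12960) secs on ASM inst 1\n"
    "WARNING: Disk 35 (RECO_CD_05_CM01CEL01) in group 3 will be dropped in: (12960) secs on ASM inst 1\n"
)

drops = pending_drops(sample)
for name, group, hours in drops:
    print(f"{name} (group {group}) will be dropped in {hours} hours")
```

12960 seconds works out to 3.6 hours.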
Let's switch gears to automatic disk management tracing. Both your Oracle ASM
instance and your storage cells provide the ability to trace automatic disk
management features. To enable automatic disk management module tracing in
your ASM instance, connect to SQL*Plus as SYSASM and execute the following
command:
[grid@cm01dbm01 ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.3.0 Production on Sat Sep 15 00:40:43 2012
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> alter system set events='trace[KXDAM] memory highest, disk highest'
  2  /
System altered.
SQL>
This command will trace the Kernel Execution, Disk Auto Management layer, or
KXDAM. After a grid disk state change, your ASM instance's XDMG trace file will
contain messages such as these:
[grid@cm01dbm01 trace]$ vi ./+ASM1_xdmg_26192.trc
output omitted
kxdam_is_dg_mounted: Diskgroup SRECO_CM01 is mounted (1241916869)
1
kxdam_offline_disk: Processing request to OFFLINE disk SRECO_CD_11_CM
Event Id: 2
kxdam_cell_cleanup_actions: Cell o/192.168.10.3/ (0xad5bf9a0) for grid disk SRECO_CD_11_cm01cel01 (0xad5b5bf8)
kxdam_is_disk_present: Operation ID: 0
SQL: /* Exadata Auto Mgmt: Is Disk Known */
select count(disk_number) from v$asm_disk_stat
where
name='SRECO_CD_11_CM01CEL01'
and
group_number in
(
select group_number from v$asm_diskgroup_stat
where
name='SRECO_CM01'
and
state='MOUNTED'
)
kxdam_validate_row_count: query_row_count: 1 - expected: 1
output omitted
The contents of your XDMG trace file will show verbose information about your ASM
automatic disk management activities. The information is similar to what is
contained in your ASM instance's alert log and, as such, I generally favor
examining the alert log over enabling KXDAM tracing.
To enable tracing on your storage cells, launch CellCLI and run the following
command:
CellCLI> alter cell events='trace[cellsrv.cellsrv_events_layer] memory=highest, disk=highest'
Cell cm01cel01 successfully altered
CellCLI>
After a grid disk state change, your
$ADR_BASE/diag/asm/cell/[host]/trace/alert.log will contain entries resembling this:
Griddisk SDATA_CD_09_cm01cel01 status altered successfully
Sat Sep 15 02:26:47 2012
notifying ASM to offline 1 griddisks
NOTE: Initiating ASM Instance operation: ASM OFFLINE disk on 1 disks
Published 1 grid disk events ASM OFFLINE disk on DG SRECO_CM01 to:
ClientHostName = cm01dbm02.centroid.com, ClientPID = 31505
Sat Sep 15 02:26:47 2012
Published 1 grid disk events ASM OFFLINE disk on DG SRECO_CM01 to:
ClientHostName = cm01dbm01.centroid.com, ClientPID = 26192
Sat Sep 15 02:38:15 2012
This information can be useful to correlate ASM instance alert log entries with
storage cell messages.
Summary of How It Works
Exadata's automatic disk management modules provide a mechanism for Oracle
ASM to automatically adapt to grid disk state changes and reduce Exadata DMA
intervention during maintenance procedures and in the event of a disk failure.
Below I'll summarize what happens and what actions may be required under a
number of scenarios:
When a DMA manually inactivates a grid disk ...
The grid disk status changes to inactive
The ASM disk is offlined and, after the disk_repair_time interval, dropped
from the disk group
The DMA should activate the grid disk before disk_repair_time expires
When a DMA manually activates a grid disk ...
The grid disk status changes to active
The disk is synchronized with the disk group via ASM Fast Mirror Resync and
brought online
When a disk experiences a transient failure ...
The grid disk status changes to inactive
The ASM disk is offlined and, after the disk_repair_time interval, dropped
from the disk group
If the failure condition does not automatically correct itself, the disk will be
dropped from ASM after disk_repair_time
Grid disks are inactivated during patching and reactivated after patching
ASM disks are offlined automatically when patches inactivate grid disks
After the patch is complete and grid disks are activated, ASM will resync and
online the disks
No DMA intervention is required unless the patch is expected to exceed
disk_repair_time; in this case, we recommend increasing the value for
this attribute
When a cell experiences a physical disk failure ...
Grid disks are dropped with FORCE option and marked not present
Cell disks are marked not present
Physical disk is marked critical
ASM disks are offlined and subsequently dropped
DMA should replace the disk
When a DMA replaces a failed physical disk ...
Cell disks and grid disks are automatically recreated
ASM disks are automatically added to the disk group and synchronized if
replaced within the disk_repair_time window
If not, an ASM disk group rebalance operation occurs
No DMA intervention is typically required unless you want to rebalance with a
higher limit/power
When a physical disk status changes to predictive failure ...
Grid disk status changes to predictive failure
ASM disks are automatically offlined and dropped
DMA should replace the disk; when the disk is replaced, it will automatically be added
to the ASM disk group and either synchronized or rebalanced depending on
the outage duration
When a DMA manually drops grid disk with FORCE option while status is active ...
Grid disk will be dropped
ASM will not offline or drop ASM disks; I/O errors will be generated in the ASM
instance's alert log and ASM will not recognize the state change
DMA should recreate the grid disk and manually add this ASM disk to the disk
group
When a DMA manually drops grid disk without FORCE option while status
is active ...
Grid disk will be dropped
ASM will offline and drop the disk from its disk group
DMA should recreate the grid disk and manually add this ASM disk to the disk
group
There are several software components that work together to facilitate automatic
disk management. On the Exadata cells, the Cell Services software, cellsrv,
processes Management Server (MS) notifications and handles ASM disk queries
about the status of grid disks. The MS process monitors the status of the Exadata
storage components for events such as hardware failures and alerts, and notifies
cellsrv via ioctl system calls. Each ASM instance polls every cellsrv process
on each cell via ioctl system calls to determine if any action is required for ASM
disks as a result of a state change.
Within Oracle ASM, two processes are used to implement automatic disk
management features, both of which are controlled by the Oracle 11gR2 Grid
Infrastructure diskmon process on Exadata:
Exadata Automation Manager, or XDMG, initiates automation tasks related to
Exadata storage. It monitors all available cells for state changes and looks for
events such as inaccessible disks or disk state changes
Exadata Automation Worker, or XDWK, performs the automation tasks initiated
by XDMG. This process is started when XDMG instructs it to perform an
OFFLINE, DROP, or ADD action on an ASM disk and is stopped after a
five-minute period of inactivity
In your ASM instance, you can disable automatic disk management by setting the
_auto_manage_exadata_disks initialization parameter to false. This
parameter is static and requires an ASM instance restart to take effect.
Some of the important configuration, log, and trace files that are relevant to
automatic disk management functionality are provided below:
$OSSCONF/cell_disk_config.xml on storage cells contains information
about configured objects, such as cell disks, grid disks, etc.
$OSSCONF/griddisk.owner.dat on storage cells contains ASM disk
name, ASM disk group name, ASM failgroup name, Cluster identifier, and a
Requires DROP/ADD flag indicator
MS log and trace files in the $ADR_BASE/diag/asm/cell/[host]/trace
directory on cell nodes contain log and tracing information relevant to
automatic disk management
The alert.log file in the $ADR_BASE/diag/asm/cell/[host]/trace
directory on cell servers is your cellsrv alert log
The ASM instance alert log
in $ORACLE_BASE/diag/asm/+asm/<instance>/trace contains ASM-
The XDMG and XDWK trace files
in $ORACLE_BASE/diag/asm/+asm/<instance>/trace on compute
nodes contain trace information for XDMG and XDWK processes
One topic worth additional discussion is Oracle 11g's ASM Fast Mirror Resync
functionality. When a disk is offlined in an ASM disk group in 11g, any extent
changes to the disk group's files will be queued in ASM up until the disk group's
disk_repair_time interval is surpassed. If the disk is replaced, repaired, or
activated within the disk_repair_time window, ASM will perform an efficient
resynchronization operation prior to bringing the disk online. If the
disk_repair_time window is exceeded before the situation is rectified, ASM
must perform a rebalance operation.
Note: The default value for disk_repair_time is 3.6 hours; if you are patching
or performing maintenance on your storage cells, you may consider increasing this
time to limit the performance penalty after the disks are brought online.
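As a quick sanity check before a maintenance window, you can compare your planned outage against disk_repair_time. The Python sketch below is only an illustration: the helper names are hypothetical, and it assumes the attribute value is written with an "h" (hours) or "m" (minutes) suffix, as in "3.6h".

```python
def repair_time_seconds(value):
    """Convert a disk_repair_time string such as '3.6h' or '216m' to seconds."""
    value = value.strip().lower()
    number, unit = float(value[:-1]), value[-1]
    if unit == "h":
        return int(round(number * 3600))
    if unit == "m":
        return int(round(number * 60))
    raise ValueError("expected a value ending in 'h' or 'm', got %r" % value)


def outage_exceeds_repair_time(planned_outage_hours, disk_repair_time="3.6h"):
    """True if the planned outage would outlast the disk_repair_time window."""
    return planned_outage_hours * 3600 >= repair_time_seconds(disk_repair_time)


default_secs = repair_time_seconds("3.6h")
print(default_secs)                      # the default window, in seconds
print(outage_exceeds_repair_time(2))     # a 2-hour patch fits inside the default window
print(outage_exceeds_repair_time(5))     # a 5-hour patch does not
print(outage_exceeds_repair_time(5, "8h"))
```

The default of 3.6 hours is the same 12960 seconds that ASM reports in its "will be dropped in" alert log warnings.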
Posted 25th January 2013 by John Clarke
24th January 2013
Exadata's cellsrv_statedump
As most Oracle DBA and DMA types know, I/O on Exadata happens on the
storage cells. As SQL statements are issued from your Exadata databases,
messages are encapsulated as iDB messages and sent to the storage cells.
When the storage cells ingest these messages, they determine which extents are
required to satisfy the SQL statements, issue I/O calls, and return the data over
iDB back to the compute node databases. Oh, and along the way, they decide
whether they'll use any Exadata storage server software features (based on the
nature of the metadata in the iDB message), act as a column/row server (Smart
Scan) or block server (traditional), and so forth.
Exadata's Cell Services software, or cellsrv, is the thing that makes all of this
happen. cellsrv is a multi-threaded process whose job is twofold:
Service I/O requests
Implement Exadata storage server software features, such as Smart Scan,
The purpose of this blog post is to talk about generating and interpreting a
cellsrv state dump. The cellsrv state dump is a means to translate the
operations of a cell's cellsrv processes into a human-readable trace file.
Similar to traditional Oracle systemstate, hanganalyze, or other types of
dumps, the cellsrv state dump shows you all sorts of interesting information
about what cellsrv and its process threads are doing.
Generating a State Dump
First, log in to a storage cell and launch CellCLI:
[root@cm01cel01 si_p]# cellcli
CellCLI: Release 11.2.3.1.1 - Production on Wed Jan 23 16:31:53 EST 2013
Cell Efficiency Ratio: 65,642
CellCLI>
Then, issue the following command:
CellCLI> alter cell events = "immediate cellsrv.cellsrv_statedump(0,0)"
Dump sequence #2 has been written to /opt/oracle/cell11.2.3.1.1_LINUX.X64_120607/log/diag/asm/cell/cm01cel01/trace/svtrc_24797_82.trc
Cell cm01cel01 successfully altered
CellCLI>
Above, this was the second time I'd issued a cellsrv_statedump, which is why
it says "Dump sequence #2 ...". The output of the CellCLI command shows that
we've written a trace file called svtrc_24797_82.trc. The cell basically sticks
the contents of the cellsrv_statedump trace inside one of the cellsrv
process threads, in this case, thread 82.
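If you script your state dumps, you'll want to pick the trace file path out of the CellCLI response. Here's a small Python sketch that does so. The function name is hypothetical, and reading the two numbers in svtrc_24797_82.trc as the cellsrv process ID and thread number is my interpretation based on this one example, not a documented naming rule.

```python
import re

DUMP_RE = re.compile(
    r"Dump sequence #(\d+) has been written to (\S*/svtrc_(\d+)_(\d+)\.trc)"
)


def parse_statedump_message(message):
    """Extract the dump sequence, trace path, and the two numbers in the file name."""
    match = DUMP_RE.search(message)
    if match is None:
        return None
    sequence, path, pid, thread = match.groups()
    return {
        "sequence": int(sequence),
        "path": path,
        "pid": int(pid),        # assumed to be the cellsrv process ID
        "thread": int(thread),  # assumed to be the thread that wrote the dump
    }


# Message copied from the CellCLI output above, joined onto one line.
message = (
    "Dump sequence #2 has been written to "
    "/opt/oracle/cell11.2.3.1.1_LINUX.X64_120607/log/diag/asm/cell/"
    "cm01cel01/trace/svtrc_24797_82.trc"
)

info = parse_statedump_message(message)
print(info["sequence"], info["pid"], info["thread"])
print(info["path"])
```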
Disclaimer
A lot of the information in this post is based on assumptions and some hopefully
educated guesswork. I didn't have any part in the design of cellsrv or the
mechanics of how the contents of the cellsrv_statedump trace file were written, so
some of what I say below may not be true. Feedback welcome!
Reading a cellsrv_statedump ("Pre-Summary")
The cellsrv_statedump trace file contains a great deal of information and
cell. Some of this information can be useful in diagnosing problems, while most of
it probably isn't worthwhile spending time on other than from a purely academic
perspective. One thing that I find interesting when looking at the
cellsrv_statedump trace files is that Oracle has formatted them in a way that
allows you to make some pretty educated assumptions about how the software is
designed and how cellsrv operates.
A cellsrv_statedump trace file is organized into these sections (at least on an X2-2
running recent patches):
Cell server process state, including process ID information, thread
information, memory utilization/configuration, etc.
Cell initialization parameters
Fixed table allocation information for various cell structures
Quarantine Manager information
cellsrv thread process information
cellsrv scheduling log information
IO distribution statistics
I/O latency statistics and histograms
Global Storage Index statistics
GridDisk specific Storage Index statistics
GridDisk owner information
cellsrv job queue, cache usage, and block read/write statistics
Fixed table allocations for "replacement" structures
Flash Cache statistics
Block I/O resource information and I/O-related histogram information
Flash Logging statistics
Flash Logging stats on a per-GridDisk perspective
Redo write statistics, with respect to "disk first" vs. "flash first"
InfiniBand/libcell/iDB-related "receive" statistics (i.e., SKGXP
"RemoteReceivePort")
InfiniBand/libcell/iDB-related timer records
InfiniBand/libcell/iDB-related "send" statistics (i.e., SKGXP
"RemoteSendPort")
I/O operations by I/O type/reason, with their statistics
IORM state dump information
Elapsed time per I/O type information
Mutex and mutex group information
Trace information for individual physical I/O requests
A summary of reads and writes by I/O reason (i.e., redo log writes, control file
writes, etc.)
As mentioned, there is a ton of data in these trace files. Some of the parts I find
useful are:
The IORM state dump and its "child" sections. Not only do these confirm your
IORM plans, but they have a nice summary of I/O and I/O waits on a per-cell
disk (or per-device) perspective, which can be helpful to understand where
your busy disks are (without having to weed through a bunch of list
metriccurrent | metrichistory output)
I/O operations by type/reason, combined with smartIO sections and
associated database objects can give you a good idea of what types of
segments (from which databases) are using Smart Scan and Storage Indexes
the most
Any section that provides timing and histograms for I/Os of various sizes, I/O
types, etc can be helpful to determine if you have latency issues and identify
whether your storage grid is having a hard time keeping up with the demand.
Of course, there are a number of other ways to get this information; the
cellsrv_statedump can serve as another data point in your performance
analysis
The very bottom of the trace file will contain a summary of I/O by I/O reason,
which can be helpful in understanding your overall workload
Reading a cellsrv_statedump: The Details
At the top of a cellsrv_statedump, the first section shows the process state of
the storage cell. In this section, you'll find information about the cellsrv process ID,
number of cellsrv threads, memory utilization and sizing, and a number of other
bits of information:
Dumping contents of Cell Server
Dumping: ossp_proc_state
hostname: cm01cel01.centroid.com, numips: 1
ip_ossp_proc_state: 192.168.10.3
pid 24797, thread_id: 0x2aaaacdbfc68, numthreads: 110
cache_saddr: 0x2aaca2e00000, cache_len: 929849344
hugepage_addr: 0x2aaaaec00000, hugepage_len: 8390705152
park: 0, dump_stack: 0, shutdown: 0, shutdown_timer: 0
dbglock_inited: 0, exception_master: 4294967295, exception_count: 0
crash_on_error: 23432584, random_seed: 0
os_boot_complete: 24797, cell_is_little_endian: 1
state_dump_lock: 257, disable_full_thread_stack: 0
thr_state_dump: (nil), stack_dump_cb: 0x13a10760, stackbuf: 0x59f6bc
stackbufsize: 1024, stackdumpinprogress: 0x13a10f40
skip_mallocstackdump: 0
numthrs: 0, thread0_dump_sysstate: 0, total_osmem_sga: 570427416
total_osmem_pga: 10388376, total_osmem_fixed: 14716463504
total_allocmem_sga: 453858472
total_allocmem_pga: 2139224, cell_max_memory: 23440916480
sga_lowmem_threshold: 1073741824, mem_threshold_flags: 0
sga_lowmem_threshold_failures 0 nomem_threshold_failures 0
Memory type: sga Memory usage 570427416 bytes
Memory type: pga Memory usage 10388376 bytes
Memory type: cache Memory usage 9320554496 bytes
Memory type: storidx Memory usage 910497232 bytes
Memory type: flash Memory usage 3201758144 bytes
Memory type: heapsummary Memory usage 18022400 bytes
Memory type: codetext Memory usage 78643200 bytes
Memory type: malloc Memory usage 33554432 bytes
Memory type: stack Memory usage 1153433600 bytes
network_heap: 0x2aaaade76968 start_network_heap_size 67718856
osd_xor_supported: 1, osd_crc_supported: 1
default_trace_size_limit: 18411482, user_changed_trace_limit 0
enable_hang_after_qusc 0
cellsrv binary md5 checksum: e788a9e0fc81a55468ab0812d2737e3a
enable_hang_after_qusc 0 netheapgrow_latch 0x2aaaade768b0
errstack_latch_acquire_failures 0
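The "Memory type" lines are easier to digest once they're sorted and converted out of bytes. The following Python sketch does that for the lines above. The regular expression and the function name memory_by_type are mine, and I'm assuming the lines look exactly like the ones in this dump.

```python
import re

MEM_RE = re.compile(r"Memory type:\s*(\w+)\s+Memory usage\s+(\d+)\s+bytes")


def memory_by_type(dump_text):
    """Return {memory type: bytes} for every 'Memory type' line in the dump."""
    return {name: int(size) for name, size in MEM_RE.findall(dump_text)}


# Lines copied from the process state section above.
sample = """\
Memory type: sga Memory usage 570427416 bytes
Memory type: pga Memory usage 10388376 bytes
Memory type: cache Memory usage 9320554496 bytes
Memory type: storidx Memory usage 910497232 bytes
Memory type: flash Memory usage 3201758144 bytes
Memory type: heapsummary Memory usage 18022400 bytes
Memory type: codetext Memory usage 78643200 bytes
Memory type: malloc Memory usage 33554432 bytes
Memory type: stack Memory usage 1153433600 bytes
"""

usage = memory_by_type(sample)
largest = max(usage, key=usage.get)

# Print the memory types from largest to smallest, in GiB.
for name in sorted(usage, key=usage.get, reverse=True):
    print(f"{name:12s} {usage[name] / 2**30:8.2f} GiB")
```

In this dump the cache is by far the largest consumer, followed by flash and stack.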
Below this, you'll find a listing of your cell initialization parameters, as set in
cellinit.ora or otherwise defaulted. This is a good place to determine
whether you have any non-standard settings or to get a better understanding of
the sorts of cell configuration parameters that are used by cellsrv:
Dumping configuration parameter values
Unable to lookup value for parameter local_ipaddresses
ipaddress1 = 192.168.10.3/22 (default = NULL)
Unable to lookup value for parameter ipaddress2
version = 0.0 (default = )
_cell_max_pll_pred_writes = 36
_cell_pred_writes_autotune_enabled = TRUE
_cell_pred_reads_autotune_enabled = TRUE
_cell_max_flash_largeios = 48
_cell_num_threads_in_short_wait = 40
Next, you'll see fixed table allocation definitions for a number of different
cellsrv job structures and their related mutex information. Below we can see
the details for part of the PredCacheGet job/structure and some of its mutex
information:
============================= FIXED SIZE ALLOCATOR =============================
Comment : PredCacheGet Job Fixed Size
Mutexes : 23
allocationsMustClear : 1
sizeOfEachAllocation : 952
initialCountRequested : 0
allocationsHWM : 92
numFreeObjects : 92
======================
ALLOCATION TABLE MAP
======================
Mutex AllocationHWM FreeElementsNOW
0 4 4
1 4 4
2 3 3
3 4 4
4 4 4
.. Output omitted
If you search the trace file for lines starting with "Comment" you can see each of
the job structures maintained on the cell and used by cellsrv:
[root@cm01cel01 si_p]# grep "^Comment" /opt/oracle/cell11.2.3.1.1_LINUX.X64_120607/log/diag/asm/cell/cm01cel01/trace/svtrc_24797_82.trc | grep "Fixed Size"
Comment : SKGXP BID Fixed Size
Comment : Cache Get Job Fixed Size
Comment : Cache Put Job Fixed Size
Comment : PredCacheGet Job Fixed Size
Comment : JobIOContext Fixed Size
Comment : PredDiskRead Job Fixed Size
Comment : PredFilter Job Fixed Size
Comment : PredMapElem Fixed Size
Comment : PredDestBufferCtl Fixed Size
Comment : OpenDisk Job Fixed Size
Comment : CloseDisk Job Fixed Size
Comment : RemoteSendPort Fixed Size
Comment : RemoteOpenInfo Fixed Size
Comment : FCPersistmdwritejob
Comment : RemoteListenerRequest Fixed Size
Next, you should see any information related to Quarantines and Quarantine
Manager, if applicable:
Dumping Quarantine Manager state
Num current hashed 0 hwm 0 total hashed 0
isInit 1 offloadDisabled 0 rpmver OSS_11.2.3.1.1_LINUX.X64_120607
qmStateFilePath_QM /opt/oracle/cell11.2.3.1.1_LINUX.X64_120607/cellsrv/deploy/config/qmstate.ora
threadStateFilePathPrefix_QM /opt/oracle/cell11.2.3.1.1_LINUX.X64_120607/cellsrv/deploy/config/.qmstatethr
offload threshold 3 db threshold 3 sql quarantine disabled 0
disk region quarantine disabled 0 numDiskRegionsDequarantined_QM 0
numFailuresToMonitorDiskRegions 0numFailuresInvalidateDiskRegions_QM 0curNumSimRailRoadCrashes_QM 0
entityID_QM 0
MyRemovedQMObjectsList: hwm=0 size=0 total=0
Doesn't have monitored entity
After this, you'll see information about each cellsrv thread. Below we
see 110 threads and information about each thread's wait state, time, wait objects,
etc.:
[root@cm01cel01 si_p]# grep "^UserThread" /var/log/oracle/diag/asm/cell/cm01cel01/trace/svtrc_24797_82.trc|grep threadID|wc -l
111
[root@cm01cel01 si_p]#
Dumping thread information: START
UserThread: 0x2aaaacfa21e0 threadID: 0 pthreadID: 1098852672
status: 2 waitstateName: waiting_for_system_work
waitStartTime: Wed Jan 23 16:32:10 2013 waitDuration(msec): 174
waitObjName: (nil) [-NA-] waitObjPtr: (nil) waitLocation: -NA-
CurrentHolder[pthreadID: 0 Loc: -NA-]
HolderAtWaitEntry[pthreadID: 0 Loc: -NA-]
memAllocFailureOK: 0 closeTraceFileFd: 0
dumpingDiagInfo: 0
UserThread: 0x2aaaacfa29d8 threadID: 1 pthreadID: 1085987136
CurJob: 0x2aade18fdd98 curJobName: Remote Listener
status: 2 waitstateName: waiting_for_connect
waitStartTime: Wed Jan 23 16:32:10 2013 waitDuration(msec): 468
waitObjName: (nil) [-NA-] waitObjPtr: (nil) waitLocation: -NA-
CurrentHolder[pthreadID: 0 Loc: -NA-]
HolderAtWaitEntry[pthreadID: 0 Loc: -NA-]
memAllocFailureOK: 0 closeTraceFileFd: 0
dumpingDiagInfo: 0
... Lines omitted
Dumping thread information: END
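With a hundred-plus threads in the dump, a tally of wait states gives you a quicker picture than paging through each entry. This is a minimal Python sketch of that idea; the function name wait_state_counts is hypothetical, and it assumes the waitstateName field appears once per thread exactly as in the entries above.

```python
import re
from collections import Counter


def wait_state_counts(dump_text):
    """Count cellsrv threads by the wait state reported in the thread section."""
    return Counter(re.findall(r"waitstateName:\s*(\S+)", dump_text))


# Two thread entries abbreviated from the dump above.
sample = """\
UserThread: 0x2aaaacfa21e0 threadID: 0 pthreadID: 1098852672
status: 2 waitstateName: waiting_for_system_work
waitStartTime: Wed Jan 23 16:32:10 2013 waitDuration(msec): 174
UserThread: 0x2aaaacfa29d8 threadID: 1 pthreadID: 1085987136
CurJob: 0x2aade18fdd98 curJobName: Remote Listener
status: 2 waitstateName: waiting_for_connect
waitStartTime: Wed Jan 23 16:32:10 2013 waitDuration(msec): 468
"""

counts = wait_state_counts(sample)
for state, n in counts.most_common():
    print(f"{n:4d}  {state}")
```

Run against a full trace file, this kind of tally would show at a glance whether most threads are idle or stacked up behind a particular wait.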
Next, you will find information about cellsrv's scheduling operations and see
which types of I/O calls/functions are occurring. Below, I am showing the
first few lines, representing a handful of different I/O types. This section
can be quite lengthy depending on how active your cells are:
Dumping scheduling log
Wed Jan 23 16:32:00 2013.748589: JobType=PredicateCacheGet jobAddr=0x2aade33fcf58 resType=7 threadID=0x1f sourceID=748377076 refid=981
2aade33fcf58 resType=4 threadID=0x1f sourceID=748377076 refid=981
2aae0bfdce38 resType=7 threadID=0x4c sourceID=748377052 refid=1012
2aae0bfdce38 resType=4 threadID=0x4c sourceID=748377052 refid=1012
Wed Jan 23 16:32:00 2013.748762: JobType=NetworkRead jobAddr=0x2aade2236b08 resType=6 threadID=0x55 sourceID=4294967295 refid=65535
Wed Jan 23 16:32:00 2013.749340: JobType=NetworkRead jobAddr=0x2aad21b45760 resType=6 threadID=0x1d sourceID=4294967295 refid=65535
2aae0bfd59b8 resType=7 threadID=0x50 sourceID=748377060 refid=928
2aae0bfd59b8 resType=4 threadID=0x50 sourceID=748377060 refid=928
2aade38337f8 resType=7 threadID=0x54 sourceID=748377071 refid=1042
Wed Jan 23 16:32:00 2013.750291: JobType=PredicateDiskRead jobAddr=0x2aae0a327738 resType=7 threadID=0x54 sourceID=748377073 refid=65535
2aade38b79b0 resType=7 threadID=0x54 sourceID=748377052 refid=65535
2aade382aae0 resType=7 threadID=0x54 sourceID=748377071 refid=65535
Wed Jan 23 16:32:00 2013.751730: JobType=NetworkRead jobAddr=0x2aade2236b08 resType=6 threadID=0x55 sourceID=4294967295 refid=65535
Wed Jan 23 16:32:00 2013.751746: JobType=PredicateCachePut jobAddr=0x2aade38f0fd8 resType=7 threadID=0x44 sourceID=748377080 refid=1012
2aae0a295ca0 resType=7 threadID=0x1a sourceID=748377080 refid=65535
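Because the scheduling log can run to thousands of lines, I find it easier to summarize it by job type. The Python sketch below counts JobType values and converts the hexadecimal threadID field to decimal. The function name summarize_scheduling_log is made up, and the code assumes each log entry sits on a single line with the key=value layout shown above.

```python
import re
from collections import Counter

ENTRY_RE = re.compile(r"JobType=(\w+).*?threadID=0x([0-9a-fA-F]+)")


def summarize_scheduling_log(log_text):
    """Return (job type counts, set of decimal thread IDs) for the scheduling log."""
    job_types = Counter()
    threads = set()
    for line in log_text.splitlines():
        match = ENTRY_RE.search(line)
        if match:
            job_types[match.group(1)] += 1
            threads.add(int(match.group(2), 16))  # threadID is printed in hex
    return job_types, threads


# Four complete entries copied from the scheduling log above.
sample = """\
Wed Jan 23 16:32:00 2013.748589: JobType=PredicateCacheGet jobAddr=0x2aade33fcf58 resType=7 threadID=0x1f sourceID=748377076 refid=981
Wed Jan 23 16:32:00 2013.748762: JobType=NetworkRead jobAddr=0x2aade2236b08 resType=6 threadID=0x55 sourceID=4294967295 refid=65535
Wed Jan 23 16:32:00 2013.749340: JobType=NetworkRead jobAddr=0x2aad21b45760 resType=6 threadID=0x1d sourceID=4294967295 refid=65535
Wed Jan 23 16:32:00 2013.750291: JobType=PredicateDiskRead jobAddr=0x2aae0a327738 resType=7 threadID=0x54 sourceID=748377073 refid=65535
"""

job_types, threads = summarize_scheduling_log(sample)
print(job_types.most_common())
print(sorted(threads))
```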
After this comes some of the fun stuff. The next section shows information about
your I/O distribution, both across all disks and for each individual cell disk. Below
I'm showing the cumulative statistics as well as for one of the cell disks:
Cumulative IO-size distribution stats for ALL cdisks
IO length (bytes): Num read IOs: Num write IOs:
[512 - 1023) 67824 21712
[1024 - 2047) 0 6499
[2048 - 4095) 68 23106
[4096 - 8191) 1657 48811
[8192 - 16383) 26761 22504
[16384 - 32767) 92522 133428
[32768 - 65535) 23364 29962
[65536 - 131071) 10568 5922
[131072 - 262143) 90109 7362
[262144 - 524287) 1697 78
[524288 - 1048575) 770 43
[1048576 - 2097151) 23620 289
CellDisk 0x2aaaae131960: name = CD_00_cm01cel01, diskHandle = 46912551893424, pendingReads = 0, pendingWrites = 0, pendingReadSumSizes = 0, pendingWriteSumSizes = 0, ioErrs = 0, corruptions = 0, myCdiskFlags=0x10, cdState=NORMAL CDPoorPerfType=CD_GOOD (1)
Free Segment List:
IO length (bytes): Num read IOs: Num write IOs:
[512 - 1023) 8 275
[1024 - 2047) 0 18
[2048 - 4095) 0 1864
[8192 - 16383) 51 1119
[16384 - 32767) 8 8443
[32768 - 65535) 1509 563
[65536 - 131071) 1257 529
[131072 - 262143) 9365 658
[262144 - 524287) 69 2
[524288 - 1048575) 32 0
[1048576 - 2097151) 2023 16
From this information, you can see the breakdown of read and write I/Os for
different I/O sizes. There are several interesting bits of data above, one of which
is the CDPoorPerfType flag, which in our case shows CD_GOOD (meaning that
performance statistics are "good" for this cell disk).
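To turn the I/O-size histogram into something you can reason about, such as what share of reads are large I/Os, you can parse the bucket lines directly. Here's a Python sketch using the cumulative numbers above. The function name parse_io_size_histogram and the 1 MB cut-off are my own choices for illustration.

```python
import re

BUCKET_RE = re.compile(r"\[\s*(\d+)\s*-\s*(\d+)\)\s+(\d+)\s+(\d+)")


def parse_io_size_histogram(text):
    """Return a list of (low_bytes, high_bytes, read_ios, write_ios) tuples."""
    return [tuple(int(x) for x in m.groups()) for m in BUCKET_RE.finditer(text)]


# Cumulative distribution for all cell disks, copied from the dump above.
sample = """\
[512 - 1023) 67824 21712
[1024 - 2047) 0 6499
[2048 - 4095) 68 23106
[4096 - 8191) 1657 48811
[8192 - 16383) 26761 22504
[16384 - 32767) 92522 133428
[32768 - 65535) 23364 29962
[65536 - 131071) 10568 5922
[131072 - 262143) 90109 7362
[262144 - 524287) 1697 78
[524288 - 1048575) 770 43
[1048576 - 2097151) 23620 289
"""

rows = parse_io_size_histogram(sample)
total_reads = sum(reads for _, _, reads, _ in rows)
large_reads = sum(reads for low, _, reads, _ in rows if low >= 1048576)
print(f"total reads: {total_reads}")
print(f"reads of 1 MB or more: {100.0 * large_reads / total_reads:.1f}%")
```

The same parser works for the per-cell-disk sections, since they use the same bucket layout.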
Below this, for each cell disk, we see a section showing the average I/O latency:
Average IO-latency distribution stats for CDisk CD_00_cm01cel01
Number of Reads len-latency distribution
Ij len(B) \ IO lat(us) ||     [32 |     [64 |    [128 |    [256 |    [512 |   [1024 |   [2048 |   [4096 |   [8192 |  [16384 |  [32768 |  [65536 | [131072 | [262144 |
                       ||     63) |    127) |    255) |    511) |   1023) |   2047) |   4095) |   8191) |  16383) |  32767) |  65535) | 131071) | 262143) | 524287) |
-----------------------||---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
[512, 1023)            ||       0 |       0 |       4 |       0 |       0 |       0 |       2 |       1 |       1 |       0 |       0 |       0 |       0 |       0 |
[4096, 8191)           ||       2 |      29 |      11 |       8 |       1 |       0 |       4 |      11 |       6 |       0 |       0 |       0 |       0 |       0 |
[8192, 16383)          ||       0 |      12 |       1 |       0 |       1 |       3 |      14 |      18 |       2 |       0 |       0 |       0 |       0 |       0 |
You'll see latencies for both reads and writes for each cell disk, as well as flash
disks, so these sections are quite lengthy. A summary of each of the types of
information you'll see for each disk is below:
[root@cm01cel01 si_p]# grep "^Average IO-latency distribution stats for" /opt/oracle/cell11.2.3.1.1_LINUX.X64_120607/log/diag/asm/cell/cm01
> sort -u
Average IO-latency distribution stats for CDisk
[root@cm01cel01 si_p]#
[root@cm01cel01 si_p]# grep distribution /opt/oracle/cell11.2.3.1.1_LINUX.X64_120607/log/diag/asm/cell/cm01cel01/trace/svtrc_24797_82.trc|sort -u|grep -v "for CDisk"
Average Latency of Reads (percentage of writes) iosize-pendingIOCount distribution
Average Latency of Reads (percentage of writes) iosize-PendingIOSizes distribution
Average Latency of Writes (percentage of writes) iosize-pendingIOCount distribution
Average Latency of Writes (percentage of writes) iosize-PendingIOSizes distribution
Cumulative IO-size distribution stats for ALL cdisks
Number of Reads len-latency distribution
Number of Writes iosize-latency distribution
[root@cm01cel01 si_p]#
Below this, you'll find a section about your Storage Indexes; specifically, global
Storage Index statistics:
------ StorageIdx Global Stats: -----
diagMode_StorageIdx 23736772 numIOSaved 23741856 numIOidxLookedupButNotFilter 0 numBytesidxLookedupButNotFilter 0
numCantFiltIOObjMismatch 0 numCantFiltIOIOAcross1MBRgn 0numCantFiltIOLoMSCN 0
numCantFiltIOLoQSCN 7733 numCantFiltIONoSupp 0numCantFiltIONoTurboScanPreds 0
numCantFiltIOUnclusCol 0 numCantFiltIOMisc 0numCantFiltIOIO2UninitRgn 0
numCantFiltIOIO2invalidRgn 0numCantFiltIOIOoutsideRgn 24242
numCantFiltIOPredNotSelective 75923numCantFiltIONoInterestingPred_SI 40
numIdxNotBuiltTooSmallReadIOs 0numIdxNotBuiltOverlappingWrs 0
numIdxNotBuiltNoClusterCol 83208 numIdxNotBuiltComputeErr 0numIdxNotBuiltUnalignedWrite 0
numIdxNotBuiltUnalignedRead 0 numIdxNotBuiltObjMismatch 9583numIdxNotBuiltNoFreeRidx 0
numIdxNotBuiltIOUndefined 3numIdxNotBuiltIOLessThIOThresRd 0numIdxNotBuiltIOLessThIOThresWr 0numIdxNotBuiltNoColStoredinRIDX 909
BuiltChainedRowHeadP 0
numIdxNotBuiltChainedRowMisc 0numIdxNotBuiltNoDataBlock 0numIdxNotBuiltEHCCBlk 0
numIdxNotBuiltProbProcEHCCBlk 0numIdxNotBuiltEncrupted 0
numIdxNotBuiltStaleBuffer 0
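To see quickly which of these counters are non-zero, the name/value pairs can be pulled out with a small script. The following is only a sketch in Python: the helper names (`parse_counters`, `nonzero`) are made up for illustration, and the sample text is a few lines modeled on the global statistics above. It assumes each counter is a name followed by a single space and an integer, and it tolerates values that run straight into the next counter name.

```python
import re

# Sample lines modeled on the global Storage Index statistics shown above.
SAMPLE = """\
numCantFiltIOLoQSCN 7733 numCantFiltIONoSupp 0numCantFiltIONoTurboScanPreds 0
numCantFiltIOIO2invalidRgn 0numCantFiltIOIOoutsideRgn 24242
numCantFiltIOPredNotSelective 75923numCantFiltIONoInterestingPred_SI 40
numIdxNotBuiltNoClusterCol 83208 numIdxNotBuiltComputeErr 0numIdxNotBuiltUnalignedWrite 0
"""

# A counter is a name, one space, then an integer. Some values are followed
# immediately by the next name (e.g. "0numCantFilt..."), so the text cannot
# simply be split on whitespace.
COUNTER = re.compile(r"([A-Za-z_]\w*) (\d+)")


def parse_counters(text):
    """Return {counter_name: value} for every counter found in text."""
    return {name: int(value) for name, value in COUNTER.findall(text)}


def nonzero(counters, prefix):
    """Return (name, value) pairs with the given prefix and value > 0, largest first."""
    hits = [(n, v) for n, v in counters.items() if n.startswith(prefix) and v > 0]
    return sorted(hits, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    stats = parse_counters(SAMPLE)
    for name, value in nonzero(stats, "numCantFilt"):
        print(f"{value:>10}  {name}")
```

Run against a full trace file, the same two helpers could be pointed at the "numIdxNotBuilt" prefix to list the reasons a Storage Index was not built.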
For each GridDisk, you'll then see a list of "local", grid-disk specific Storage Index
statistics:
GridDisk 0x2aace733ad68: name = RECO_CD_02_cm01cel01, number = 2039251028, size = 140509184, numIssued = 180, pendingIOs = 0, state = 402, numOpens=6, numSegments = 2, numExtents = 4288, numIOErrs = 0 numCorruptions = 0 diskNumber=2039251028 referenceCnt = 0 myGdiskFlags = 0x0 gdiskOwnerInfo = 0x2b4251170b58
Allocation Map:
[off: 145104M, sz: 22864M] [off: 480592M, sz: 45744M]
ACL 0x2aace733b208: numEntries = 0
Num current hashed 0 hwm 0 total hashed 0
StorageIdx Stats: -----
isUsable_StorageIdx 1 numIOSaved 0 numIOidxLookedupButNotFilter 0 numBytesidxLookedupButNotFilter 0
numCantFiltIOObjMismatch 0 numCantFiltIOIOAcross1MBRgn 0numCantFiltIOLoMSCN 0
numCantFiltIOLoQSCN 0 numCantFiltIONoSupp 0numCantFiltIONoTurboScanPreds 0
numCantFiltIOUnclusCol 0 numCantFiltIOMisc 0numCantFiltIOIO2UninitRgn 0
numCantFiltIOIO2invalidRgn 0numCantFiltIOIOoutsideRgn 0
numCantFiltIOPredNotSelective 0numCantFiltIONoInterestingPred_SI 0
numIdxNotBuiltTooSmallReadIOs 0numIdxNotBuiltOverlappingWrs 0
numIdxNotBuiltNoClusterCol 0 numIdxNotBuiltComputeErr 0numIdxNotBuiltUnalignedWrite 0
numIdxNotBuiltUnalignedRead 0 numIdxNotBuiltObjMismatch 0numIdxNotBuiltNoFreeRidx 0
numIdxNotBuiltIOUndefined 0numIdxNotBuiltIOLessThIOThresRd 0numIdxNotBuiltIOLessThIOThresWr 0numIdxNotBuiltNoColStoredinRIDX 0
numIdxNotBtiltCantProcessCol 0numIdxNotBuiltLibMismatch 0numIdxNotBuiltChainedRowHeadP 0
ltEHCCBlk 0
numIdxNotBuiltProbProcEHCCBlk 0numIdxNotBuiltEncrupted 0
numUnConvFailures_SI 0numIdxNotBuiltConvFailures 0
DataMover: error=0 numMovedExtents=0 numNewRelocatedExtents=0 dmFlags=0x0 movingExtent.startSectorAddr=0 countIosOnMovingExtent=0 waitingIos_list: sz=0 hwm=0 inProgressIos_list: sz=0 hwm=2
Next, your trace file will contain Grid Disk owner information for each of your grid
disks:
Dumping GridDiskOwner ...
GridDiskOwner:
Grid disk name(DBFS_DG_CD_06_cm01cel01)
Guid: bc944f16-5eba-49eb-98e9-74bbc8cf6b1c
Reid: cid=638715b73dd64f04bf5e08392ce70dc1,icin=243005881,nmn=0,lnid=0,gid=0,gin=0,gmn=0,umemid=0,opid=0,opsn=0,lvl=cluster hdr=0xfece0100
ASM Disk Name: DBFS_DG_CD_06_CM01CEL01
ASM DiskGroup Name: DBFS_DG
ASM FailGroup Name: CM01CEL01
Is new replacement: 0
Flags: 0
RefCnt: 1
Proactive Drop Opcode: 0
... done dumping OwnerInfo.
Next come some sections showing cellsrv job queue, cache usage, and block
read/write statistics. In this context, cache refers to O/S memory on the cell:
Cellsrv Job Queue: BufWaitObj queue in Cache
BufWaitObj List for block size 512 hwm=0 size=0 total=0
BufWaitObj List for block size 67108864 hwm=0 size=0 total=0
Cellsrv Job Queue: Outstanding queue in cache
hwm=355 size=0 total=575324
Cellsrv Job Queue: Completed queue in cache
hwm=67 size=0 total=312818
Cellsrv Job Queue: High priority completed queue in cache
hwm=20 size=0 total=262506
Cache usage statistics:
Block size: 512 Buffer pool size: 5000 Current usage count: 601
Freelist size 3799 (does not include reserved buffers)
Block write reserve usage statistics:
Block size: 512: Resv Size 300: Resv fail: 301 Stats: hwm=300 size=300 total=30401
0 size=34 total=30180
Block size: 67108864: Resv Size 0: Resv fail: 0 Stats: hwm=0 size=0 total=0
Block read reserve usage statistics:
Block size: 512: Resv Size 300: Resv fail: 0 Stats: hwm=300 size=300 total=300
size=300 total=300
... Output omitted
Block size: 67108864: Resv Size 0: Resv fail: 0 Stats: hwm=0 size=0 total=0
Under this comes a section showing the number of buffers used for different I/O
operations, which can be useful to show the I/O type per I/O size:
Cache usage statistics by jobs:
Buffers of size 512 used:
by Cacheget jobs - 0 by CachePut jobs - 0 by Opendisk jobs - 0 by Closedisk jobs - 0
by NetworkRead jobs - 600 by Predicatediskread jobs - 0 by Predicatediskwrite jobs - 0
by PredicateCacheGet jobs - 0 by PredidicateCachePut jobs - 0 by PredicateFilter jobs - 0
by PredicateTracking - 0 by ProcessIoctlPgsz - 1 Unidentified jobs - 0
by Cacheget jobs - 0 by CachePut jobs - 0 by Opendisk jobs - 0 b
jobs - 0
by PredicateTracking - 0 by ProcessIoctlPgsz - 0 Unidentified jobs - 0
Below this, similar to the fixed table job structures at the top of the trace file, there
is a section showing statistics for cache "replacement" structures, or dynamically
updated memory structures whose contents change based on activity. As before, this
section shows the allocation table map and the number of mutexes, but it also
shows a section below it about the type of mmap operations per object type:
============================= FIXED SIZE ALLOCATOR =============================
Comment : cache replacement Q_32k
Mutexes : 23
allocationsMustClear : 0
sizeOfEachAllocation : 248
initialCountRequested : 5000
allocationsHWM : 5000
numFreeObjects : 3800
======================
ALLOCATION TABLE MAP
======================
Mutex AllocationHWM FreeElementsNOW
0 217 166
1 217 155
2 217 177
Dumping the mmap alloc map
numEntries 67 bufferFull 0 bootStrapFullCount 0
1: addr 0x2b424f074000 size 9011200 type ALLOCATE comment HEAP SUMMARY AREA: SGA HEAP
2: addr 0x2b424f90c000 size 9011200 type ALLOCATE comment HEAP SUMMARY AREA: SUBHEAP
ap memory
4: addr 0x2aacdf6c1000 size 5034120 type ALLOCATE comment Storage Index: SRECO_CD_00_cm01cel01
ex: RECO_CD_00_cm01cel01
6: addr 0x2aace0454000 size 19347464 type ALLOCATE comment Storage Index: SDATA_CD_00_cm01cel01
7: addr 0x2aace16c8000 size 38969352 type ALLOCATE comment Storage Index: DATA_CD_00_cm01cel01
... output omitted
After this comes a section about your Flash Cache and its statistics. There is
quite a bit more information below this related to Flash Cache, but I'll leave it out
for brevity:
FlashCache state=OPERATIONAL numFlashCacheStoreGridDisks=16 ReadVerifLevel=crc
NumTrackedIOs=0 OutstPopulateJobQueue_size=0 CompletedPopulateJobQueue_size=0
DeferredJobQueue_sizes: onStateLock=0 onLowResources=0
numFCChunkWrites=55664 numFCChunkReads=124528
numFCLargeReadsQualified=0 numFCLargeReadRejections=0
numBytesAvoidedDbBlkChksumCalc=0 numDbBlkChksumCorrups=0
numReadPinErrPageMaxUse=0 numReadPinErrNoFreePage=0
numWritePinErrPageMaxUse=0 numWritePinErrNoFreePage=0
numWritePinErrPageMdSync=0
Next in the trace file you'll see a list of your disk devices in a "Printing block IO
resource" section. These are the devices for your cell disks and operating system
disks:
Printing block IO resource
BlockIOFile 0x2aaaadfdb9b0: name = /dev/sda3
BlockIOFile 0x2aaaacbb27e0: name = /dev/sdaa
BlockIOFile 0x2aaaae6846c0: name = /dev/sdab
BlockIOFile 0x2aace4418230: name = /dev/sdac
BlockIOFile 0x2aace4074580: name = /dev/sdb3
BlockIOFile 0x2aace3bf5e40: name = /dev/sdc
BlockIOFile 0x2aace6bffc48: name = /dev/sdd
BlockIOFile 0x2aacec504a58: name = /dev/sde
After this, the trace file contains a list of your I/O histograms for reap and
queue problems; the output below shows that we had no more than 2 outstanding
I/Os across our disks:
Printing block IO reap histogram:
ReapCount 1 - 514276 - percentage: 91%
Printing block IO outstanding IO histogram:
OutstandingCount 0 - 322920 - percentage: 28%
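To check a statement such as "no more than 2 outstanding I/Os" against a full trace file, you could parse every histogram row and look for the highest bucket that recorded samples. The sketch below is in Python; `parse_histograms` and `max_bucket` are hypothetical helper names, and the row layout (`<name>Count <bucket> - <samples> - percentage: <n>%`) is assumed from the two rows shown above.

```python
import re

# The two histogram rows from the excerpt above.
SAMPLE = """\
Printing block IO reap histogram:
ReapCount 1 - 514276 - percentage: 91%
Printing block IO outstanding IO histogram:
OutstandingCount 0 - 322920 - percentage: 28%
"""

# <name>Count <bucket> - <samples> - percentage: <pct>%
BUCKET = re.compile(
    r"^(\w+Count)[ \t]+(\d+)[ \t]+-[ \t]+(\d+)[ \t]+-[ \t]+percentage:[ \t]+(\d+)%",
    re.M,
)


def parse_histograms(text):
    """Return {histogram_name: [(bucket, samples, percent), ...]}."""
    result = {}
    for name, bucket, samples, pct in BUCKET.findall(text):
        result.setdefault(name, []).append((int(bucket), int(samples), int(pct)))
    return result


def max_bucket(rows):
    """Highest bucket value that recorded at least one sample."""
    return max(bucket for bucket, samples, _pct in rows if samples > 0)


if __name__ == "__main__":
    for name, rows in parse_histograms(SAMPLE).items():
        print(name, "highest populated bucket:", max_bucket(rows))
```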
Below this section you'll find Flash Logging statistics and information, which
can also be found by using a CellCLI list metriccurrent command. On an
X2-2 cell, you'll see 16 sections for different FlashLog "stores", which correspond
to the 16 different flash partitions:
Flash Log:
# of active flash log stores = 16
redo write latency threshold = 500000 microseconds
# of buffer allocation failures = 13
# of FC conflicts = 0
# of read collisions = 0
# of write collisions = 0
overall max flash_first latency = 870 microseconds
disabled databases = <NONE>
statistics:
FL_IO_W = 31808
FL_IO_W_SKIP_LARGE = 0
FL_IO_W_SKIP_BUSY = 0
FL_IO_DB_BY_W = 125535744
FL_IO_FL_BY_W = 196796416
FL_FLASH_IO_ERRS = 0
FL_DISK_IO_ERRS = 0
FL_BY_KEEP = 0
FL_FLASH_FIRST = 329
FL_DISK_FIRST = 31479
FL_FLASH_ONLY_OUTLIERS = 0
FL_ACTUAL_OUTLIERS = 0
FlashLog Store #0 (0x2aad21e68658): cm01cel01_FLASHLOG (disk #668803516), cdisk = FD_01_cm01cel01
state = active
start time = Wed Jan 23 13:50:55 2013
up time = 0 day(s), 2 hour(s), 41 minute(s), 15 second(s)
size = 32MB
current offset = 12288
# of pending writes = 0
total # of writes = 1985
# of write errors = 0
# of read errors = 0
# of corruptions = 0
# of times flash write finished first = 22
# of outliers = 0
# of bytes written = 12869632
# of wraps = 0 (0 seconds per wrap)
active table (1024 entries): [0x2aad21e58638, 0x2aad21e68638)
# of seconds since last checkpoint = 240
# of checkpoints = 341
# of checkpointed active region entries = 1986 (5 entries per checkpoint)
# of times active table was full = 0
# of times active region was full = 0
max flash_first latency = 180 microseconds
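The global Flash Log counters above lend themselves to a quick sanity check: in this dump, FL_FLASH_FIRST (329) plus FL_DISK_FIRST (31479) equals FL_IO_W (31808), so the share of redo writes that completed on flash first works out to roughly 1%. Here is a minimal Python sketch of that arithmetic; the function names are invented for illustration, and the `NAME = value` line layout is assumed from the excerpt.

```python
import re

# Three of the global Flash Log counters from the excerpt above.
SAMPLE = """\
FL_IO_W = 31808
FL_FLASH_FIRST = 329
FL_DISK_FIRST = 31479
"""

# "FL_<NAME> = <integer>", with optional whitespace around the equals sign.
STAT = re.compile(r"^[ \t]*(FL_\w+)[ \t]*=[ \t]*(\d+)", re.M)


def parse_flash_log_stats(text):
    """Return {counter_name: value} for every FL_* counter in text."""
    return {name: int(value) for name, value in STAT.findall(text)}


def flash_first_pct(stats):
    """Percentage of redo writes where the flash write completed first."""
    total = stats["FL_FLASH_FIRST"] + stats["FL_DISK_FIRST"]
    return 100.0 * stats["FL_FLASH_FIRST"] / total if total else 0.0


if __name__ == "__main__":
    stats = parse_flash_log_stats(SAMPLE)
    print(f"flash first: {flash_first_pct(stats):.2f}% of {stats['FL_IO_W']} redo writes")
```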
Below this there is a section showing Flash Logging statistics on a per-grid-disk
basis, which means that as redo writes are targeted to Grid Disks
and are satisfied in Flash first, the stats will show up here:
FlashLogGDiskState (0x2aade3abdbe8):
grid_disk=RECO_CD_07_cm01cel01 sync_write_errors=0 flags=()
Active_Writes list:
overlap_checks=88 quick_overlap_checks=88 quick_locked_overlap_checks=0 entries_checked_for_overlap=0
additions=51 entries_checked_for_additions=0
Pending_Writes list:
overlap_checks=129 quick_overlap_checks=129 quick_locked_overlap_checks=0 entries_checked_for_overlap=0
Saved_Redo list:
overlap_checks=129 quick_overlap_checks=129 quick_locked_overlap_
Next, there is a nice little histogram showing you the redo write length vs. redo
write latency. Oracle's done some nice instrumentation here showing you the
balance between "flash first" and "disk first" for redo writes:
Redo write length vs. redo write latency histogram:
1st value: # of times that the flash write finished in the given time
2nd value: # of times that the disk write finished in the given time
3rd value: # of times that the fastest write finished in the given time
---------------------||-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
IO len(B)/IO lat(us) ||       (0, |    (1000, |    (2000, |    (4000, |    (8000, |   (16000, |   (32000, |   (64000, |  (128000, |  (256000, |  (512000, | (1024000, |
                     ||     1000] |     2000] |     4000] |     8000] |    16000] |    32000] |    64000] |   128000] |   256000] |   512000] |  1024000] |  .......] |
---------------------||-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
(0,4096]             ||     30385 |         4 |         2 |         1 |         4 |         4 |         0 |         0 |         0 |         0 |         0 |         0 |
                     ||     30127 |        29 |        34 |        55 |        87 |        62 |         4 |         2 |         0 |         0 |         0 |         0 |
                     ||     30400 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |
(4096,8192]          ||       450 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |
                     ||       447 |         0 |         1 |         0 |         2 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |
                     ||       450 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |
(8192,16384]         ||       211 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |         0 |
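To put numbers on the "flash first" vs. "disk first" balance, you can work out what share of writes in a row finished within a given latency. The sketch below is in Python and hard-codes the three value rows for the (0, 4096] byte bucket from the histogram above; the helper name `pct_within` and the list names are made up for illustration.

```python
# Latency bucket upper bounds in microseconds, taken from the histogram header.
# None stands for the final, open-ended bucket.
BUCKET_UPPER_US = [1000, 2000, 4000, 8000, 16000, 32000, 64000,
                   128000, 256000, 512000, 1024000, None]

# Counts for the (0, 4096] byte row of the excerpt above.
FLASH = [30385, 4, 2, 1, 4, 4, 0, 0, 0, 0, 0, 0]          # 1st value: flash write
DISK = [30127, 29, 34, 55, 87, 62, 4, 2, 0, 0, 0, 0]      # 2nd value: disk write
FASTEST = [30400, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]        # 3rd value: fastest write


def pct_within(counts, limit_us):
    """Percentage of writes in buckets whose upper bound is <= limit_us."""
    total = sum(counts)
    if total == 0:
        return 0.0
    fast = sum(c for c, upper in zip(counts, BUCKET_UPPER_US)
               if upper is not None and upper <= limit_us)
    return 100.0 * fast / total


if __name__ == "__main__":
    for label, counts in (("flash", FLASH), ("disk", DISK), ("fastest", FASTEST)):
        print(f"{label:8s} <= 1 ms: {pct_within(counts, 1000):.2f}% of {sum(counts)} writes")
```

For this row each of the three series sums to 30400 writes, and the fastest of the two writes always landed in the first (sub-millisecond) bucket.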
The next set of sections deals with InfiniBand network-related statistics for I/O
requests transmitted via iDB and picked up for processing for each cell disk. You
will see 12 sections that have the string "RemoteReceivePort", one for each
disk, followed by a number of subsections containing a great deal of information. In
the context of Exadata, the storage cells are InfiniBand packet "receivers" and the
compute nodes are "senders" - there will be a set of RemoteSendPort sections for
each compute node later in the trace file.
The format of these sections flows like this:
RemoteReceivePort
Printing remote receive port 0x2aad21b66e68
Printing skgxp misc stats
Printing skgxp port stats
Dumping SKGXP ctx: (SKGXP is a libcell thing ...)
The data will look something like this:
RemoteReceivePort 0x2aad21b66e68: size = 0, outstandingReceives = 150, msg hdr allocation failures 0
Printing remote receive port 0x2aad21b66e68
Printing skgxp misc stats
ctxptr - 0x2aad21b51a18
ospid - 24797
... Omitted
Printing skgxp port stats
pt - 0x2aad21e68b30
pid - 24797
... Omitted
Dumping SKGXP ctx: 0x2aad21b51de8
SKGXP:[2aad21b51de8.573]{ctx}: SKGXPCTX: 0x0x2aad21b51de8 ctx Exadata RemoteReceivePort
SKGXP:[2aad21b51de8.574]{ctx}: Compatible: 0 0.0 0
SKGXP:[2aad21b51de8.575]{ctx}: SKGXP Versions: 3.3.4.2.3.c0.4(3.2.4.1)
SKGXP:[2aad21b51de8.576]{ctx}: DBG flags1: 0
... Omitted
SKGXP:[2aad21b51de8.583]{ctx}:
SKGXP:[2aad21b51de8.584]{ctx}: WAIT HISTORY
Type Return Code
SKGXP:[2aad21b51de8.586]{ctx}: (ms) prev wait (ms) before
-------------------
SKGXP:[2aad21b51de8.588]{ctx}: 102 0 0 NORMAL TIMEDOUT
... Omitted
SKGXP:[2aad21b51de8.648]{ctx}: Port queue
SKGXP:[2aad21b51de8.649]{ctx}: SKGXPT 0x2aad21b51bf8 port no = 245542138
SKGXP:[2aad21b51de8.650]{ctx}: flags=1000028 nreqs=150 free_rbufs=150 msgsz=216 min_frag_sz_ach=216
SKGXP:[2aad21b51de8.651]{ctx}: OS Level Port
SKGXP:[2aad21b51de8.652]{ctx}: SSKGXPT 0x2aad21b51c58 flags 0x0 sockno 164 IP 192.168.10.3 RDS 14052 lerr 0
SKGXP:[2aad21b51de8.653]{ctx}: OS Level Port ID
SKGXP:[2aad21b51de8.654]{ctx}: SKGXPGPID Internet address 192.168.10.3 RDS port number 14052
SKGXP:[2aad21b51de8.655]{ctx}: Pending receive requests
SKGXP:[2aad21b51de8.656]{ctx}: SKGXPRQH 0x2aade2898410 request type RCV status INPROGRESS flags 0x1 pt[0x2aad21b51bf8] cnh[(nil)] rpc[(nil)] bid[0x2aade2cd5f18] mcph[(nil)] expt[0] peer pid (uninit) 0@0.0.0.0
SKGXP:[2aad21b51de8.657]{ctx}: seg: fset[0x9] ffix[0x9] user[(1):152] ddp[(nil):0] fragno[0] tot[1] fragsz=216 seqn[0,0]
SKGXP:[2aad21b51de8.658]{ctx}: SKGXPRQH 0x2aade2898848 request type RCV status INPROGRESS flags 0x1 pt[0x2aad21b51bf8] cnh[(nil)] rpc[(nil)] bid[0x2b4250fdefe8] mcph[(nil)] expt[0] peer pid (uninit) 0@0.0.0.0
SKGXP:[2aad21b51de8.659]{ctx}: seg: fset[0x9] ffix[0x9] user[(1):152] ddp[(nil):0] fragno[0] tot[1] fragsz=0 seqn[0,0]
... Omitted
SKGXP:[2aad21b51de8.857]{ctx}: There are 99 pending receive requests on this port
SKGXP:[2aad21b51de8.858]{ctx}: Accept handles with Pending Acks
SKGXP:[2aad21b51de8.859]{ctx}: No pending acks to deliver on this port
SKGXP:[2aad21b51de8.860]{ctx}: Accept handles with Pending RDMAs
SKGXP:[2aad21b51de8.862]{ctx}: Region queue
SKGXP:[2aad21b51de8.863]{ctx}: SKGXPRGNSTATE: 0x2aad21b53320
SKGXP:[2aad21b51de8.864]{ctx}: rgns mapped: 0 numshared: 0 max: 0
SKGXP:[2aad21b51de8.866]{ctx}: shared ports open: 0 limits (1, 16)
SKGXP:[2aad21b51de8.868]{ctx}: space - used: 0 max possible: 0 sock limit: 0
SKGXP:[2aad21b51de8.869]{ctx}: rgnarr[(nil)(0)] rgnpts[0]=(nil)
SKGXP:[2aad21b51de8.870]{ctx}: REGION ARRAY
SKGXP:[2aad21b51de8.871]{ctx}: SHARED PORT ARRAY
SKGXP:[2aad21b51de8.873]{ctx}: Dumping Connection Handle Table
SKGXP:[2aad21b51de8.874]{ctx}: hdl sconno aconno admno RmtPid state seq# msgs rtrans credits rtt lastack id ip
SKGXP:[2aad21b51de8.876]{ctx}: Dumping Accept Handle Table
SKGXP:[2aad21b51de8.877]{ctx}: hdl aconno sconno admno RmtPid state seq# msgs rtrans credits acks id ip
SKGXP:[2aad21b51de8.878]{ctx}: ACH Table Bucket: 94
SKGXP:[2aad21b51de8.879]{ctx}: 0x00002aacdb3ee3f8 0x5c067ac6 0x77a814c6 0x126b9004 21386 40 32764 1 0 2 0 0xea2acfa 192.168.10.1
SKGXP:[2aad21b51de8.880]{ctx}: ACH Table Bucket: 101
SKGXP:[2aad21b51de8.881]{ctx}: 0x00002aacdb329b50 0x5c067acd 0x1e37bcb5 0x5494de46 23224 40 32790 27 0 2 0 0xea2acfa 192.168.10.1
... Omitted
Cellsrv Job Queue: Network receive port
Message header free list:
hwm=300 size=117 total=26563
numMsgHdrAllocCalls 26446 numMsgHdrRecachingCalls 174 numMessageHeaders 300
Printing receiveport message hdr state array
Index[3] - value: 0
Index[4] - value: 0
Index[5] - value: 0
... Omitted
After this lengthy set of (arguably useless) information, you'll find sections that
dump timer information for various InfiniBand network communications; they look
like this. You'll see sections of communication flows whose sources are your
compute nodes, with information on the O/S process on the compute server:
Dumping Timer Records in Bucket
Seqno BeginTime (usec) EndTime (usec) SourceLoc SourceCode Client cookie
125 1358974345437886 0 FILE: RemoteListener.cpp LINE: 1389 myListener.getNetworkDirectory()->delete
126 1358974345437890 0 FILE: NetworkDirectory.cpp LINE: 728 portToDelete->deletePort()
126 1358974345437890 1358974345437933 FILE: NetworkDirectory.cpp LINE: 728 portToDelete->deletePort()
125 1358974345437886 1358974345437934 FILE: RemoteListener.cpp LINE: 1389 myListener.getNetworkDirectory()->delete
ner.cpp LINE: 1531 handleDisconnect_RemoteListener()
ner.cpp LINE: 1626 processOneRequest() host cm01dbm01.centroid.com, pid 19812, msgid 6, port 17973, fd 210
LINE: 1626 processOneRequest()
LINE: 1531 handleDisconnect_RemoteListener()
LINE: 1389 myListener.getNetworkDirectory()->delete
130 1358974348033598 0 FILE: NetworkDirectory.cpp LINE: 728 portToDelete->deletePort()
130 1358974348033598 1358974348033632 FILE: NetworkDirectory.cpp LINE: 728 portToDelete->deletePort()
ner.cpp LINE: 1389 myListener.getNetworkDirectory()->delete
ner.cpp LINE: 1531 handleDisconnect_RemoteListener()
ner.cpp LINE: 1626 processOneRequest() host cm01dbm02.centroid.com, pid 29025, msgid 6, port 11696, fd 210
... Lots of output omitted
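Because each record carries a begin and an end timestamp in microseconds, the elapsed time of a timed action is simply EndTime minus BeginTime. The following Python sketch computes it for the complete records above. It rests on two assumptions that the trace itself does not state: that each record sits on one line in the real file, and that an EndTime of 0 marks a record written before the action finished. The name `elapsed_usec` is invented for illustration.

```python
import re

# Complete records modeled on the timer records above, one per line.
SAMPLE = """\
125 1358974345437886 0 FILE: RemoteListener.cpp LINE: 1389 myListener.getNetworkDirectory()->delete
126 1358974345437890 0 FILE: NetworkDirectory.cpp LINE: 728 portToDelete->deletePort()
126 1358974345437890 1358974345437933 FILE: NetworkDirectory.cpp LINE: 728 portToDelete->deletePort()
125 1358974345437886 1358974345437934 FILE: RemoteListener.cpp LINE: 1389 myListener.getNetworkDirectory()->delete
"""

# seqno, begin (usec), end (usec), source file, source line
RECORD = re.compile(
    r"^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]+FILE:[ \t]*(\S+)[ \t]+LINE:[ \t]*(\d+)",
    re.M,
)


def elapsed_usec(text):
    """Return [(seqno, source_file, line, elapsed_usec)] for completed records.

    Assumption: an EndTime of 0 means the record was written before the
    timed action finished, so those rows are skipped.
    """
    rows = []
    for seqno, begin, end, src, line in RECORD.findall(text):
        if int(end) == 0:
            continue
        rows.append((int(seqno), src, int(line), int(end) - int(begin)))
    return rows


if __name__ == "__main__":
    for seqno, src, line, usec in elapsed_usec(SAMPLE):
        print(f"seq {seqno}: {src}:{line} took {usec} usec")
```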
After this you'll find sections showing elapsed time statistics for "RemoteListener",
which in this case indicates inbound network communications from the cells.
Here, you can look for latency anomalies:
======================================================
Dumping Elapsed Time statistics for RemoteListener
------------------------------------------------------
Total Time Statistics (Samples: 2183):
<= 1usec: 2125
> 2usec && <= 4usec: 0
There is also a more granular breakdown of "Phase-specific" remote listener
timing statistics for the following phases:
[root@cm01cel01 ~]# grep "Phase Name" /var/log/oracle/diag/asm/cell/cm01cel01/trace/svtrc_24797_82.trc|more
Phase Name: Poll_to_process
Phase Name: Process_to_LOGIN1_complete
Phase Name: Process_to_CONNECT_complete
Phase Name: Process_to_GETANTINFO_complete
Below this section is a list of "RemoteSendPort" statistics, which, similar to the
RemoteReceivePort statistics, show libcell/InfiniBand network statistics
from each compute node. The overall format is similar to the
RemoteReceivePort sections, with a main section like this:
Dumping SKGXP ctx: 0x2aaaab3a4ed0
SKGXP:[2aaaab3a4ed0.101]{ctx}: SKGXPCTX: 0x0x2aaaab3a4ed0 ctx Exadata SendPort
SKGXP:[2aaaab3a4ed0.102]{ctx}: Compatible: 0 0.0 0
SKGXP:[2aaaab3a4ed0.103]{ctx}: SKGXP Versions: 3.3.4.2.3.c0.4(3.2.4.1)
SKGXP:[2aaaab3a4ed0.104]{ctx}: DBG flags1: 0
SKGXP:[2aaaab3a4ed0.105]{ctx}: DBG post_type: 0
SKGXP:[2aaaab3a4ed0.106]{ctx}: DBG post_sig: 0
SKGXP:[2aaaab3a4ed0.107]{ctx}: DBG post_send_thresh: 0
SKGXP:[2aaaab3a4ed0.108]{ctx}: DBG post_timing: 0
SKGXP:[2aaaab3a4ed0.109]{ctx}: DBG post_trace_all: 0
SKGXP:[2aaaab3a4ed0.110]{ctx}: DBG dev_poll: 0
SKGXP:[2aaaab3a4ed0.111]{ctx}:
SKGXP:[2aaaab3a4ed0.112]{ctx}: WAIT HISTORY
SKGXP:[2aaaab3a4ed0.113]{ctx}: Wait Time Time since Fast reaps Wait Type Return Code
SKGXP:[2aaaab3a4ed0.114]{ctx}: (ms) prev wait (ms) before
SKGXP:[2aaaab3a4ed0.115]{ctx}: ------------------------------------------------------
SKGXP:[2aaaab3a4ed0.117]{ctx}: 0 0 3 NORMAL SUCC
SKGXP:[2aaaab3a4ed0.118]{ctx}: 0 0 3 NORMAL SUCC
SKGXP:[2aaaab3a4ed0.132]{ctx}: wait delta 1 sec (1674 msec) ctx ts 0x914f1f last ts 0x914f1f
SKGXP:[2aaaab3a4ed0.133]{ctx}: user cpu time since last wait 0 sec 0 ticks
SKGXP:[2aaaab3a4ed0.134]{ctx}: system cpu time since last wait 0 sec 0 ticks
SKGXP:[2aaaab3a4ed0.135]{ctx}: locked 2
SKGXP:[2aaaab3a4ed0.136]{ctx}: blocked 0
SKGXP:[2aaaab3a4ed0.137]{ctx}: timed wait receives 0
Dumping security info: dbName='edw' Flags= OPEN_OPEN
RemoteSendPort 0x2aaaac80d228: containFdWithValidReid 1 mySkgxpCtx = 0x2aaaac80d768, isANT = Yes clientIdentifier = -NA-, myDestination = 748376314, rpcMcpyInprog_RemoteSendPort = 0
clientHostName = cm01dbm01.centroid.com, clientPid 22573, sourceEndianIdentifier = 16909060, refCounter=0, inprogJobCount = 0, markedForDeletion = 0, markedToFence = 0 numFreeRequestHandlesAlloc = 0, numReqhandlesFromPool = 0
numFdsWithValidReid = 36 freePrivateQSize 10 freeGlobalQSize 0 numGlobalAllocs 0
fenceCapability 1, lastSendTime Wed Jan 23 16:25:56 2013
Printing remote open info for the sendport
myRemoteOpenId 0 diskNumber 1176031092 refCount 0 pendingWriteCnt 0 markedToDelete 0
The "Dumping security info: dbName" line shows information for a specific call
from a specific database, and clues us in that each "section" of information in this
RemoteSendPort detail shows us individual I/O requests from each of our
databases. If you start looking through a cellsrv_statedump trace file on an
active system, this set of statistics and trace information will consume a lot of
real estate in your trace file.
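Since these sections take up so much of the file, it can help to summarize them by database and client host before reading any of them in detail. The Python sketch below counts RemoteSendPort sections per (dbName, clientHostName) pair. It assumes, as in the excerpt above, that the "Dumping security info" line comes before the clientHostName line of the same section; the function name is hypothetical, and the second record in the sample is invented purely so the counter has more than one key.

```python
import re
from collections import Counter

# First record: taken from the excerpt above.
# Second record: invented for illustration (address, pid, and db name are made up).
SAMPLE = """\
Dumping security info: dbName='edw' Flags= OPEN_OPEN
RemoteSendPort 0x2aaaac80d228: containFdWithValidReid 1
clientHostName = cm01dbm01.centroid.com, clientPid 22573, sourceEndianIdentifier = 16909060
Dumping security info: dbName='visx' Flags= OPEN_OPEN
RemoteSendPort 0x0000000000000000: containFdWithValidReid 1
clientHostName = cm01dbm02.centroid.com, clientPid 11111, sourceEndianIdentifier = 16909060
"""

DB = re.compile(r"Dumping security info: dbName='([^']*)'")
HOST = re.compile(r"clientHostName = ([^,\s]+)")


def ports_by_db_and_host(text):
    """Count RemoteSendPort sections per (dbName, clientHostName) pair."""
    counts = Counter()
    current_db = None
    for line in text.splitlines():
        m = DB.search(line)
        if m:
            current_db = m.group(1)
            continue
        m = HOST.search(line)
        if m and current_db is not None:
            counts[(current_db, m.group(1))] += 1
            current_db = None  # wait for the next "security info" line
    return counts


if __name__ == "__main__":
    for (db, host), n in ports_by_db_and_host(SAMPLE).most_common():
        print(f"{n:5d}  {db}  {host}")
```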
When all the network send/receive statistics are done you'll find a section that
summarizes things:
Cellsrv Job Queue: Network send outstanding queue
numPolls_Resource 478606 numSuccessfulPolls_Resource 474670
numUnsuccessfulPolls_Resource 5200 numEmptyPolls_Resource 128
Printing network send trips histogram
Num network send trips 2: NumJobs 44
After all of this, you'll see a number of sections that present information about the
types of I/O being requested and completed. You can examine the various gets
and misses columns in these sections to get an idea of how the cell is handling its
workload:
Printing predicate IO resource
PredicateIO 0x2aaaaccbf050 cell level stats
Number of active predicate disks: 0 Hwm active predicate disks: 64
Number of completed predicate disks: 130 Number of predicate disk uses: 902
curFindJobIteration 687504
numPolls_Resource 104836936 numEmptyPolls_Resource 104090676
numSuccessfulPolls_Resource 0 numUnsuccessfulPolls_Resource 628320
There are also a few histograms that show your predicate and I/O stats, which can
be informational as well:
Total-map-elements-per-predicate-disk-use histogram
< 1 elements: 130
< 2 elements: 18
...
Avg-map-elements-per-cwrite histogram
< 1 elements: 130
< 2 elements: 18
...
IO-request-sizes histogram
...
< 8 KB-numIOs: 0
< 16 KB-numIOs: 39
< 32 KB-numIOs: 6
< 64 KB-numIOs: 2389
...
Predicatecacheget-per-predicate-disk-use histogram
< 1 oss_cread calls: 130
< 2 oss_cread calls: 178
...
< 1 metadata size: 0 KB
< 2 metadata size: 52 KB
...
After a number of rows that dump "predicate disk slots" and "predicate disks free
queue" statistics, you'll see sections that have the string
"Printing /box/cell event disk object", one for each compute node:
Printing /box/cell event disk object..
RemoteSendPort = 0x2b4250e23148, RemoteOpenInfo = 0x2aaaabbcba80 , ClientHostName = cm01dbm02.centroid.com, lastSeenEventId = 0, subJoinEventId = 0, lastOffset = 0, subModVers=1, subMajVers=1, subMinVers=2, subModName=auto online
Printing /box/cell event disk object..
RemoteSendPort = 0x2aaaabacab88, RemoteOpenInfo = 0x2aaaac0c62b0 , ClientHostName = cm01dbm01.centroid.com, lastSeenEventId = 0, subJoinEventId = 0, lastOffset = 0, subModVers=1, subMajVers=1, subMinVers=2, subModName=auto onlin
This and the lines below show events and messages for a variety of cellsrv
scheduled tasks, including "Remote Listener", "IORM self tuning", "IO hang
detection", and more. If you have an IORM plan enabled, you'll see information
like this as well:
ResourceManager 0x2aaaade755b8: curplns = 4, shdplns = 0
ResourcePlan 0x2aade33c5420: tps = 1, full = 1, typ = 3, nsubs = 0, csub = 0, dbid = 0
TopNode 0x2aade33c5458: nents = 4, plid = 0, plnm = , lat = 10
Ent[0]: sub = 0, id = 0, cat p0 0 p1 100 p2 0 p3 0 p4 0 p5 0 p6 0 p7 0 p8 0
p7 0 p8 0
p7 0 p8 0
p7 0 p8 0
0, csub = 0, dbid = 0
TopNode 0x2aade37a8190: nents = 5, plid = 0, plnm = , lat = 10
p7 0 p8 0
p7 0 p8 0
p7 0 p8 0
Ent[3]: sub = 0, id = 0, cat p0 0 p1 5 p2 0 p3 0 p4 0 p5 0 p6 0 p7 0 p8 0
p7 0 p8 0
Below this comes one of the more useful sections of the trace file, the IORM state
dump. This section shows you details about your IORM plan, whether you've got
one enabled or not. The first part of it looks like this:
-------------------------------------------------------------------------
IORM state dump
-------------------------------------------------------------------------
Time: 01-23-2013 16:32:12.181755000
IORM Enabled: Active Plan, Throttling
Current IORM Plans
This indicates that we have an IORM plan active. Next comes a section listing the
category plan; it shouldn't be a surprise that it's listed next, since Category
IORM is evaluated first:
Category Plan
number of categories: 8
id 0: CAT_HIGH
id 1: CAT_MEDIUM
id 2: CAT_LOW
id 3: OTHER
id 4: _ORACLE_BG_CATEGORY_
id 5: _ORACLE_MEDPRIBG_CATEGORY_
id 6: _ORACLE_LOWPRIBG_CATEGORY_
id 7: _ASM_
other category index: 3
ASM category index: 7
medium-priority background category index: 5
low-priority background category index: 6
map of database and consumer group indicies to category index
database 0 <EDW>, consumer group 0 <CG_WH2> maps to category 1 <CAT_MEDIUM>
_LOW>
database 0 <EDW>, consumer group 2 <OTHER_GROUPS> maps to category 3 <OTHER>
_HIGH>
Next comes the Inter-Database IORM Plan section:
Inter-Database Plan
number of databases: 5
id 0: EDW
id 1: VISX
id 2: DWPRD
id 3: VISY
id 4: OTHER
other database index: 4
map of database id to database index
id 4075123336 <VISX> maps to index 1 <VISX> has FlashCache=on, FlashLog=on, Limit=0
id 2273376219 <EDW> maps to index 0 <EDW> has FlashCache=on, FlashLog=on, Limit=0
id 849012303 <DWPRD> maps to index 2 <DWPRD> has FlashCache=on, FlashLog=on, Limit=0
id 3369927204 <VISY> maps to index 3 <VISY> has FlashCache=on, FlashLog=on, Limit=0
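The "map of database id to database index" lines are regular enough to load into a lookup table, which is handy when later sections refer to databases only by index (dbidx). This is a minimal Python sketch, assuming the line layout shown in the excerpt above; `parse_interdb_map` and the dictionary keys are names chosen for illustration, not anything defined by cellsrv.

```python
import re

# The four mapping lines from the Inter-Database Plan excerpt above.
SAMPLE = """\
id 4075123336 <VISX> maps to index 1 <VISX> has FlashCache=on, FlashLog=on, Limit=0
id 2273376219 <EDW> maps to index 0 <EDW> has FlashCache=on, FlashLog=on, Limit=0
id 849012303 <DWPRD> maps to index 2 <DWPRD> has FlashCache=on, FlashLog=on, Limit=0
id 3369927204 <VISY> maps to index 3 <VISY> has FlashCache=on, FlashLog=on, Limit=0
"""

ENTRY = re.compile(
    r"id (\d+) <(\w+)> maps to index (\d+) <\w+> has "
    r"FlashCache=(\w+), FlashLog=(\w+), Limit=(\d+)"
)


def parse_interdb_map(text):
    """Return {db_name: {...}} describing each database in the inter-database plan."""
    plan = {}
    for dbid, name, index, fcache, flog, limit in ENTRY.findall(text):
        plan[name] = {
            "dbid": int(dbid),
            "index": int(index),
            "flash_cache": fcache == "on",
            "flash_log": flog == "on",
            "limit": int(limit),
        }
    return plan


if __name__ == "__main__":
    plan = parse_interdb_map(SAMPLE)
    for name in sorted(plan, key=lambda n: plan[n]["index"]):
        print(plan[name]["index"], name, plan[name]["dbid"])
```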
And below this, the Intra-Database (think DBRM!) section:
Intra-Database Plan
database index: 1
database name: VISX
plan name: VISX_PLAN
number of consumer groups: 8
other consumer group index: 2
background consumer group index: 4
medium-priority background consumer group index: 5
low-priority background consumer group index: 6
low-priority foreground consumer group index: 7
map of consumer group id to index:
CG_REPORTING maps to index 0
CG_SHIPPING maps to index 1
OTHER_GROUPS maps to index 2
CG_FINANCE maps to index 3
_ORACLE_BACKGROUND_GROUP_ maps to index 4
_ORACLE_MEDPRIBG_GROUP_ maps to index 5
_ORACLE_LOWPRIBG_GROUP_ maps to index 6
_ORACLE_LOWPRIFG_GROUP_ maps to index 7
The previous three sections are a good way to understand your IORM
implementation (aside from doing a CellCLI list iormplan detail ...),
and below them comes an even more interesting section that shows your IORM
statistics per cell disk:
******** IORM STATS ******** Wed Jan 23 16:32:12 2013
IORM stats for disk=/dev/sda3
IORM stats for disk=/dev/sdd
Heap stats: Inuse=2005KB Total=2158KB
--------- IORM Workload State & Characterization ---------
IORM: Solo Workload
Solo workload (no db or cg): 0 transitions
#served=15 bitmap=0 #queued=0 adtime=0ms asmrdtime=0ms #cumulserved=37358 #pending=0 #lpending=0
#max_conc_io=5 write_cache_hit_rate=86% iocost=55834574849ms
catidx=0 bitmap=0 CAT_HIGH
catidx=1 bitmap=0 CAT_MEDIUM
catidx=2 bitmap=0 CAT_LOW
catidx=3 bitmap=0 OTHER
catidx=4 bitmap=0 _ORACLE_BG_CATEGORY_
SIO: #served=13 #queued=0 Util=0% aqtime=0ms adtime=0ms
dbidx=0 bitmap=0 EDW
cgidx=0 bitmap=0 cgname=CG_WH2
cgidx=2 bitmap=0 cgname=OTHER_GROUPS
cgidx=4 bitmap=0 cgname=_ORACLE_BACKGROUND_GROUP_
SIO: #served=4 #queued=0 Util=0% aqtime=0ms adtime=0ms
#concios=55, #fragios=0 #starvedios=0 #maxcapwaits=0
cgidx=5 bitmap=0 cgname=_ORACLE_MEDPRIBG_GROUP_
cgidx=6 bitmap=0 cgname=_ORACLE_LOWPRIBG_GROUP_
cgidx=7 bitmap=0 cgname=_ORACLE_LOWPRIFG_GROUP_
dbidx=1 bitmap=0 VISX
In this section you will find information about cell disk service time, queue time,
and a number of other sections; look for the string "SIO" to find the I/O statistics
you're interested in for each type of consumer.
After a few sections that provide statistics about "Skgxp Host Stats", there are
a number of sections under "Dumping TIMEDACTIONS" that show threshold
information for various metrics:
Latency warning threshold table for IO:
IO reason = Latency threshold in milliseconds
UNKNOWN = 2000
RedoLog Write = 500
RedoLog Read = 2000
ArchLog Read = 2000
MediaRecovery Write = 2000
Mirror Read = 2000
Resilvering Write = 2000
... Omitted
Below this, there are time statistics for each type of I/O operation, each with a
variable number of "phases" with timing histograms.
[root@cm01cel01 ~]# grep "Dumping Elapsed Time statistics for" /opt/oracle/cell11.2.3.1.1_LINUX.X64_120607/log/diag/asm/cell/cm01cel01/trace/svtrc_24797_82.trc
Dumping Elapsed Time statistics for RemoteListener
Dumping Elapsed Time statistics for PredicateMapElement
Dumping Elapsed Time statistics for PredicateCachePut
Dumping Elapsed Time statistics for CacheGet
Dumping Elapsed Time statistics for CachePut
Dumping Elapsed Time statistics for CachePutLogWrite
Dumping Elapsed Time statistics for RemoteListener
Dumping Elapsed Time statistics for OpenDisk
Dumping Elapsed Time statistics for CloseDisk
[root@cm01cel01 ~]#
If we look at, say, the PredicateCacheGet statistic, we see something like this:
======================================================
Dumping Elapsed Time statistics for PredicateCacheGet
------------------------------------------------------
Total Time Statistics (Samples: 30827):
<= 1usec: 25571
> 1usec && <= 2usec: 0
> 2usec && <= 4usec: 0
> 4usec && <= 8usec: 0
> 8usec && <= 16usec: 0
> 16usec && <= 32usec: 0
> 32usec && <= 64usec: 0
> 64usec && <= 128usec: 0
> 128usec && <= 256usec: 0
> 256usec && <= 512usec: 0
> 512usec && <= 1msec: 0
> 1msec && <= 2msec: 0
> 2msec && <= 4msec: 0
> 4msec && <= 8msec: 2299
> 8msec && <= 16msec: 453
Phase Name: Wait_for_Filtered_Result
<= 1usec: 28235
> 1usec && <= 2usec: 0
> 2usec && <= 4usec: 0
> 4usec && <= 8usec: 0
> 8usec && <= 16usec: 0
> 16usec && <= 32usec: 0
> 32usec && <= 64usec: 0
> 64usec && <= 128usec: 0
> 128usec && <= 256usec: 0
> 256usec && <= 512usec: 0
> 1msec && <= 2msec: 0
> 2msec && <= 4msec: 0
> 4msec && <= 8msec: 855
> 8msec && <= 16msec: 211
Next we see sections about event counts and waits, with timing information, for
various mutex groups:
*** Statistics for all mutex groups ***
Name                 Count   Waits  Total Wait Time (usecs)  Avg. Wait (usecs)  No-wait Successes  No-wait Failures
AdminSendPort            1    9664                        0                0.0                  0                 0
FSA: Cache Get Job      23  446671                        0                0.0                  0                 0
FSA: Cache Put Job      23  424437                        0                0.0                  0                 0
...
Below this, we see detail about each individual mutex per mutex group:
*** Statistics for all mutexes ***
Name                Address         Waits  Total Wait Time (usecs)  Avg. Wait (usecs)  No-wait Successes  No-wait Failures
AdminSendPort       0x2aade1c1c7e8   9664                        0                0.0                  0                 0
FSA: Cache Get Job  0x2b4250fc8bf8  18965                        0                0.0                  0                 0
FSA: Cache Get Job  0x2b4250fc8b58  19661                        0                0.0                  0                 0
FSA: Cache Get Job  0x2b4250fc8ab8  19645                        0                0.0                  0                 0
...
There are then similar statistics for reader/writer lock groups and locks:
*** Statistics for all reader/writer lock groups ***
Name Count Write-Waits Total Wr-Wait Time (usecs)
it (usecs) Write-No-Wait Successes Write-No-Wait Failures Read-No-Wait Successes Read-No-Wait Failures
RWLockGroups 38 193713 0 0.0 34 0 0.0 0 0 0 0
MutexGroups 189 208717 0 0.0 370 0 0.0 0 0 0 0
antmaster monitorQ 1 2 0 0.0 5507 0 0.0 0 0 0 0
antmaster nopathQ 1 0 0 0.0 1 0 0.0 0 0 0 0
Cache Outstanding Jobs 1 1085915 0 0.0 0 0 0.0 0 0 0 0
Celldisk 28 0
...
*** Statistics for all reader/writer locks ***
Name Address Write-Waits Total Wr-Wait Time (usecs) Avg. Wr-Wait (usecs) Read-Waits Total Rd-Wait Time (usecs) Avg. Rd-Wait (usecs) Write-No-Wait Successes Write-No-Wait Failures Read-No-Wait Successes Read-No-Wait Failures
TopSQLCPU HT list lock 0x139a24e0 128 0 0.0 1 0 0.0 0 0 0 0
SkgxpStats HT list lock 0x139a23f0 101 0 0.0 1 0 0.0 0 0 0 0
SafeFile list lock 0x139a2300 7 0 0.0 1 0 0.0 0 0 0 0
QuarantineManager HT list lock 0x139a2200 2047 0 0.0 0 0 0.0 0 0 0 0
predIODisks list lock 0x139a2110 1 0 0.0 1 0 0.0 0 0 0 0
0.0 1 0 0.0 0 0 0 0
0.0 0 0 0.0 0 0 0 0
GridDisk OwnerList list lock 0x139a1e10 1 0 0.0 1 0 0.0 0 0
...
The next large section contains trace information for individual I/O operations.
This section starts with the string "Trace Bucket Dump Begin: default
trace bucket", and you'll probably see a number of lines like this, each
indicating a block I/O operation with before and after offsets (bef and aft) as well
as an "incr", which I believe maps to the size of the I/O being serviced (I could
be wrong here):
-------------------------------------------------------------------------------
Trace Bucket Dump Begin: default trace bucket
TIME(*=approx):SEQ:DATA
-------------------------------------------------------------------------------
2013-01-23 16:32:03.713068 :005DEC30: Decr curoff for buf 0x2aaaade2bab8: bef = 99768, incr = 6912, aft = 92856 mySize = 1048576
2013-01-23 16:32:03.713080 :005DEC3B: allocateFromFilteredBuffer: size: 8192 numelements 0
2013-01-23 16:32:03.713080 :005DEC3C: Incr curoff for buf 0x2aaaade2
2013-01-23 16:32:03.713134 :005DEC73: resizeFilteredBuffer: currentOffset: 101048 size7264
2013-01-23 16:32:03.713135 :005DEC74: Decr curoff for buf 0x2aaaade2
2013-01-23 16:32:03.713159 :005DEC86: allocateFromFilteredBuffer: size: 8192 numelements 0
When cellsrv decides to perform a Smart Scan, you'd see entries like these for
a predicate mapping operation:
2013-01-23 16:32:03.716210 :005DF705: Used smart IO mode. Amount consumed 1015808. offloadException 0
. Amount consumed 1015808
2013-01-23 16:32:03.716211 :005DF707: Incr curoff for buf 0x2aaaadd5
2013-01-23 16:32:03.716212 :005DF709: map element members, numelements = 1 completedMapElemFlag 0x1
2013-01-23 16:32:03.716216 :005DF70D:
oss_predicate_map_element 0x0x2aaca1b00080
postFilter error 0 flags 0 block_id 37654404 disk 9 len 1015808
offset 17904467968 buf_off 8192 data_len 144024 cache_ver 12
req_ver 0 req_id 58 stblk 0 nblks 0
2013-01-23 16:32:03.716221 :005DF712: Releasing map element 0x2aae0a15b2f8. buffer (nil) diskNumber 142
2013-01-23 16:32:03.716222 :005DF713: Releasing map element 0x2aae099a6418. buffer 0x2aaaadd52b78 diskNumber 142
2013-01-23 16:32:03.716224 :005DF714: Freeing IO buffer: 0x2aaaadd52b78
2013-01-23 16:32:03.716226 :005DF719: Predicate filter job done. disk 142 filter 0x2aae0c8bda90. numfilters 1
2013-01-23 16:32:03.773212 :005E8ABB: Allocated dest buffer from cache: 0x2aaaadd04bb8
2013-01-23 16:32:03.773215 :005E8AC0: Reusing dest buffer ctl 0x2aae0da52bc0 destination buffer 0x2aaaadd04bb8
2013-01-23 16:32:03.773216 :005E8AC4: Reusing predicate filter job 0x2aae0da52c70 for disk 157. Total filter jobs running 0
2013-01-23 16:32:03.773218 :005E8AC6: Predicate filter 0x2aae0da52c70 process started. disk 157 destination buffer 0x2aaaadd04bb8
2013-01-23 16:32:03.773219 :005E8AC8: Incr curoff for buf 0x2aaaadd04bb8: bef = 0, incr = 8192, aft = 8192 mySize = 1048576
2013-01-23 16:32:03.773222 :005E8ACC: Processing completed buffer 0x2aaaadc7dae8. disk offset 18113249280 buf offset 8192 rdba 38414610 gridDiskNumber 2396841556 dbid: 2273376219 tableSpaceNum 9 dataObjectNum 20315 spaceInCompletedBuffer 901120
2013-01-23 16:32:03.773227 :005E8AD2:
Above, you can see that we're entering "smart IO" mode as well as the block ID,
disk number, and size of the request. Additionally, the "Processing completed
buffer" line shows the grid disk number, DBID, tablespace number (from
SYS.TS$.TS#) and data object number (SYS.OBJ$.DATAOBJ#).
With these values, you can probably write a quick little script to summarize your
combinations of DATAOBJ# and TS# to arrive at some counts by object (per
database) and see what's doing Smart Scans. Also, to find the database
associated with the objects, a good place to look is in the
IORM section of the cellsrv_statedump trace file, as the "dbId" is not the
same as the one you'll find in V$DATABASE.
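The "quick little script" mentioned above could look something like the following. This is only a minimal Python sketch, assuming each trace entry sits on one physical line in the real file and that the fields are spelled as in the "Processing completed buffer" entry shown earlier; the function name count_smart_scan_objects is made up for illustration.

```python
import re
from collections import Counter

# Field layout copied from the "Processing completed buffer" trace entry.
OBJ_RE = re.compile(r"dbid:\s*(\d+)\s+tableSpaceNum\s*(\d+)\s+dataObjectNum\s*(\d+)")

def count_smart_scan_objects(lines):
    """Count completed-buffer entries per (dbid, ts#, dataobj#) combination."""
    counts = Counter()
    for line in lines:
        if "Processing completed buffer" not in line:
            continue
        m = OBJ_RE.search(line)
        if m:
            counts[tuple(int(g) for g in m.groups())] += 1
    return counts

sample = [
    "2013-01-23 16:32:03.773222 :005E8ACC: Processing completed buffer 0x2aaaadc7dae8. "
    "disk offset 18113249280 buf offset 8192 rdba 38414610 gridDiskNumber 2396841556 "
    "dbid: 2273376219 tableSpaceNum 9 dataObjectNum 20315 spaceInCompletedBuffer 901120",
    "2013-01-23 16:32:03.713080 :005DEC3B: allocateFromFilteredBuffer: size: 8192 numelements 0",
]

for (dbid, ts, dataobj), n in count_smart_scan_objects(sample).most_common():
    print(f"dbid={dbid} ts#={ts} dataobj#={dataobj} buffers={n}")
```

To run it against a real trace, pass an open file object instead of the sample list, since the function only needs an iterable of lines.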
If Storage Indexes were in use, you would also see a few sections like below:
2012-12-26 20:50:33.907791 :0010F513: Starting pred filter SI compute
2012-12-26 20:50:33.907794 :0010F514: StorageIdx::getStoridx diskID=-1219314244 gdiskp=0x2aacecc37690 storidx=0x2aacecc353c0
2012-12-26 20:50:33.908355 :0010F531: Ending pred filter SI compute
2012-12-26 20:50:33.908358 :0010F532: Releasing map element 0x2b8792d250a8. buffer (nil) diskNumber 126
2012-12-26 20:50:33.908359 :0010F533: Releasing map element 0x2b8792cf4588. buffer 0x2aaaaddb56d8 diskNumber 126
2012-12-26 20:50:34.005590 :00113CED: Predicate disk read 0x2aade3a24598 process started. disk 133
2012-12-26 20:50:34.005593 :00113CEF: StorageIdx::getStoridx diskID=780371708 gdiskp=0x2aacfaed7528 storidx=0x2aacfaed51d0
2012-12-26 20:50:34.005598 :00113CF4: Predicate read IO on dsk 780371708 (pdisk 133) at off 218731905024 sz 1048576 filtered by storage index
2012-12-26 20:50:34.005600 :00113CF7: StorageIdx::getStoridx diskID=
2012-12-26 20:50:34.005602 :00113CFA: Predicate read IO on dsk 780371
ex
2012-12-26 20:50:34.005603 :00113CFC: StorageIdx::getStoridx diskID=
2012-12-26 20:50:34.005605 :00113CFE: Predicate read IO on dsk 780371
ex
2012-12-26 20:50:34.005606 :00113D00: StorageIdx::getStoridx diskID=
2012-12-26 20:50:34.005607 :00113D03: Predicate read IO on dsk 780371
ex
2012-12-26 20:50:34.005609 :00113D05: StorageIdx::getStoridx diskID=
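The "filtered by storage index" entries are the ones that represent I/O that was skipped, so adding up their sz values gives a rough per-disk picture of what Storage Indexes saved during the dump window. Below is a minimal Python sketch of that idea, assuming one entry per physical line and the wording shown in the complete entry above; bytes_skipped_per_disk is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Wording taken from the one complete "filtered by storage index" entry above.
SI_RE = re.compile(
    r"Predicate read IO on dsk (-?\d+) \(pdisk (\d+)\) at off (\d+) sz (\d+) "
    r"filtered by storage index"
)

def bytes_skipped_per_disk(lines):
    """Sum the sz field of storage-index-filtered reads, keyed by pdisk number."""
    skipped = defaultdict(int)
    for line in lines:
        m = SI_RE.search(line)
        if m:
            skipped[int(m.group(2))] += int(m.group(4))
    return dict(skipped)

sample = [
    "2012-12-26 20:50:34.005593 :00113CEF: StorageIdx::getStoridx diskID=780371708 "
    "gdiskp=0x2aacfaed7528 storidx=0x2aacfaed51d0",
    "2012-12-26 20:50:34.005598 :00113CF4: Predicate read IO on dsk 780371708 "
    "(pdisk 133) at off 218731905024 sz 1048576 filtered by storage index",
]

for pdisk, nbytes in bytes_skipped_per_disk(sample).items():
    print(f"pdisk {pdisk}: {nbytes / 1048576:.0f} MB filtered by storage index")
```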
Finally, the last section will show you counters for various I/Os into an "I/O Reason
Table", like below:
Cache::dumpReasons I/O Reason Table
Cache::dumpReasons Reason              Reads  Writes
Cache::dumpReasons UNKNOWN             68483   19145
Cache::dumpReasons RedoLog Write           0   31901
Cache::dumpReasons RedoLog Read          539       0
Cache::dumpReasons Resilvering Write       0      10
Cache::dumpReasons ControlFile Read    89064       0
Cache::dumpReasons ControlFile Write       0  109743
Cache::dumpReasons ASM DiskHeader IO     852     116
Cache::dumpReasons BufferCache Read    43739       0
...
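Because the reason names contain spaces, the easiest way to read this table programmatically is to treat the last two tokens of each row as the read and write counters. Here is a minimal Python sketch under that assumption; parse_reason_table is an illustrative name, and the rows are a subset of the ones shown above.

```python
def parse_reason_table(lines):
    """Map each I/O reason to a (reads, writes) tuple."""
    prefix = "Cache::dumpReasons"
    table = {}
    for line in lines:
        if not line.startswith(prefix):
            continue
        parts = line[len(prefix):].split()
        # Data rows end in two integers; the title and header rows do not.
        if len(parts) >= 3 and parts[-1].isdigit() and parts[-2].isdigit():
            table[" ".join(parts[:-2])] = (int(parts[-2]), int(parts[-1]))
    return table

sample = [
    "Cache::dumpReasons I/O Reason Table",
    "Cache::dumpReasons Reason Reads Writes",
    "Cache::dumpReasons UNKNOWN 68483 19145",
    "Cache::dumpReasons RedoLog Write 0 31901",
    "Cache::dumpReasons ControlFile Write 0 109743",
]

table = parse_reason_table(sample)
top_writer = max(table, key=lambda reason: table[reason][1])
print(top_writer, table[top_writer])
```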
Conclusion
A cellsrv_statedump trace file can be a good way to understand the
inner workings of cellsrv, paint a picture of how Exadata works,
become familiar with some of the scope of what cellsrv does, and so forth.
Generating a state dump isn't something that you would typically need to do, but if
you've got the time to weed through thousands of lines in a trace file, I think it's
worth the digging and research.
21st January 2013
Have you ever wanted to know what's going on with Storage Indexes in your
Exadata environment? In this post I'll show you how to trace Storage Indexes and
how to understand the contents of the trace files.
Two Methods
There are a couple of methods to enable Storage Index tracing:
Do an alter [system|session] set
"_kcfis_storageidx_diag_mode"=2
Set _cell_storage_index_diag_mode = 2 in your storage cell's
cellinit.ora file
I prefer doing an "alter session" to limit the scope of what I'm tracing, since
this will generate a large number of trace files on your storage cell (don't worry
though, Oracle limits the number and quantity of Storage Index tracing
information/files with its _cell_si_max_num_diag_mode_dumps cell
initialization parameter).
Setting the parameter to 2 will enable Storage Index diagnostics/tracing. Setting
to 1 will revert your system/session to not trace, and the default of 0 also means
that diagnostics is off.
Where Are My Trace Files?
Storage Indexes are implemented on the Exadata storage cells; specifically,
they're delivered via Cell Services. The trace files will be located in each storage
cell's /var/log/oracle/diag/asm/cell/<node>/trace directory, or a few
directories down from $ADR_BASE. If you're looking for the trace files on your
compute nodes, you won't find them.
The Trace Files
In the directory above, you'll see a number of files that start with svtrc* - these
are the trace files, and you'll find one for each cellsrv process thread. You may
not actually find Storage Index trace information in each file, depending on how
well your data is balanced across your disks and how much I/O your workload
performed, but in any event these are the trace files that will contain all sorts of
interesting bits about the behavior of Storage Indexes when
_kcfis_storageidx_diag_mode is set to 2.
If you take a look at the contents of your trace directory, you'll see something like
this:
Tracing Storage Indexes with
_kcfis_storageidx_diag_mode
[root@cm01cel01 trace]# ls -l s*trc | head -10
-rw-r----- 1 root celladmin 8677 Jan 18 17:45 svtrc_29709_0.trc
[root@cm01cel01 trace]#
In the output above, each file is named svtrc_29709_[x].trc, where 29709
represents the operating system PID of cellsrv and "x" represents the cellsrv
process thread. You'll typically see about a hundred different trace files, each
corresponding to a specific cellsrv thread.
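If you collect trace files from more than one cellsrv incarnation, the naming convention makes it easy to group them. The following is a small Python sketch of that grouping; it only assumes the svtrc_<pid>_<thread>.trc pattern described above, and threads_by_pid is a made-up helper name.

```python
import re
from collections import defaultdict

# svtrc_<cellsrv PID>_<thread number>.trc, as described above.
NAME_RE = re.compile(r"^svtrc_(\d+)_(\d+)\.trc$")

def threads_by_pid(filenames):
    """Group trace file names by cellsrv PID, returning sorted thread numbers."""
    grouped = defaultdict(list)
    for name in filenames:
        m = NAME_RE.match(name)
        if m:
            grouped[int(m.group(1))].append(int(m.group(2)))
    return {pid: sorted(threads) for pid, threads in grouped.items()}

print(threads_by_pid(["svtrc_29709_0.trc", "svtrc_24797_82.trc", "alert.log"]))
```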
Understanding Storage Index Trace File Contents
If you take a peek inside one of these trace files, you'll notice sections that look
like this:
1: 2012-12-25 22:35:40.592751*: RIDX (0x2aad17d21f48) for SQLID 826b8usjkvsba filter 0
2: 2012-12-25 22:35:40.592751*: RIDX (0x2aad17d21f48) : st 2 validBitMap 0 tabn 0 id {29606 12 2273376219}
3: 2012-12-25 22:35:40.592751*: RIDX: strt 0 end 2048 offset 219071643648 size 1048576 rgnIdx 208923 RgnOffset 0 scn: 0x0000.0666d22c hist: 0x92
4: 2012-12-25 22:35:40.592751*: RIDX validation history: 0:FullRead 1:FullRead 2:FullRead 3:Undef 4:Undef 5:Undef 6:Undef 7:Undef 8:Undef 9:Undef
5: 2012-12-25 22:35:40.592751*: Col id [1] numFilt 3 flg 2:
6: 2012-12-25 22:35:40.592751*: lo: c2 a 33 0 0 0 0 0
7: 2012-12-25 22:35:40.592751*: hi: c2 a 34 0 0 0 0 0
9: 2012-12-25 22:35:40.592751*: lo: c1 8 0 0 0 0 0 0
10: 2012-12-25 22:35:40.592751*: hi: c3 5 64 64 0 0 0 0
12: 2012-12-25 22:35:40.592751*: lo: 42 43 4e 5f 42 43 4e 53
13: 2012-12-25 22:35:40.592751*: hi: 78 64 62 2d 6c 6f 67 31
8usjkvsba filter 0
15: 2012-12-25 22:35:40.848675*: RIDX (0x2aad17d1b698) : st 2 validBi
16: 2012-12-25 22:35:40.848675*: RIDX: strt 0 end 2048 offset 218488635392 size 1048576 rgnIdx 208367 RgnOffset 0 scn: 0x0000.06674cc3 hist: 0x92
17: 2012-12-25 22:35:40.848675*: RIDX validation history: 0:FullRead 1:FullRead 2:FullRead 3:Undef 4:Undef 5:Undef 6:Undef 7:Undef 8:Undef 9:Undef
19: 2012-12-25 22:35:40.848675*: lo: c2 2 43 0 0 0 0 0
20: 2012-12-25 22:35:40.848675*: hi: c2 2 43 0 0 0 0 0
22: 2012-12-25 22:35:40.848675*: lo: c1 8 0 0 0 0 0 0
23: 2012-12-25 22:35:40.848675*: hi: c3 5 64 61 0 0 0 0
25: 2012-12-25 22:35:40.848675*: lo: 42 43 4e 5f 42 43 4e 53
26: 2012-12-25 22:35:40.848675*: hi: 61 57 5f 54 52 61 43 4b
In the output above, we have a section of a trace file that shows information about
two different region indexes. Let's stop for a second and explain this:
A Brief Primer (Digression)
On Exadata, your database will be stored on ASM disk groups comprised of
ASM disks
Each ASM disk group has an attribute associated with it called an allocation
unit size, or au_size
Each AU is broken down into what are called storage regions. A storage
region is 1MB in size and generally speaking, represents the unit of work
under which cellsrv issues physical I/Os
When queries execute with direct path reads (serial or parallel) and qualify for
Smart Scan, when the query has a predicate, and when a number of other
conditions apply, cellsrv will populate a memory structure called a region
index with the high and low values for the column(s) in the WHERE clause as
data is accessed
The sum of the region indexes across your storage regions across your
storage cells corresponds to your "Storage Indexes". This may or may not be
exactly how Oracle phrases things, but this is how I like to look at it
Back to Understanding Storage Index Trace File Contents ...
index sections. Basically, lines that start with RIDX indicate the beginning of a
region index and the data for this region index continues up until the
next RIDX entry. In the output above, 2012-12-25 22:35:40.592751 corresponds to
rgnIdx 208923 and 2012-12-25 22:35:40.848675 maps to rgnIdx 208367.
Let's look at the first line on the first region index:
2012-12-25 22:35:40.592751*: RIDX (0x2aad17d21f48) for SQLID 826b8usjkvsba filter 0
This line shows the SQL ID for the SQL statement issued when accessing the region index,
or Storage Index. You can map this sql_id back to a cursor from views in your database
like V$SQL, V$SQLSTAT, ASH and AWR views. At this point though, you don't know
which database this cursor is coming from, which is where the information from Line 2
comes in handy:
2012-12-25 22:35:40.592751*: RIDX (0x2aad17d21f48) : st 2 validBitMap 0 tabn 0 id {29606 12 2273376219}
Above, we see a set of curly brackets to the right of "tabn 0 id"; here's what it means:
The first number corresponds to a DATAOBJ# from SYS.OBJ$ in your source
database
The second number is a TS# from SYS.TS$
The third number is your dbId as the Exadata storage grid sees things. This doesn't
appear to be the same as the DBID from V$DATABASE, but you can find your
source database easily enough by doing something like this:
[oracle@cm01dbm01 si_p]$ dcli -c cm01cel01 cellcli -e list flashcachecontent attributes dbUniqueName,dbId where dbid=2273376219 | tail -10
cm01cel01: EDW 2273376219
cm01cel01: EDW 2273376219
cm01cel01: EDW 2273376219
cm01cel01: EDW 2273376219
cm01cel01: EDW 2273376219
cm01cel01: EDW 2273376219
cm01cel01: EDW 2273376219
cm01cel01: EDW 2273376219
cm01cel01: EDW 2273376219
cm01cel01: EDW 2273376219
[oracle@cm01dbm01 si_p]$
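Putting the two pieces together, the id triple from the RIDX line and the dcli output can be combined to label each region index with a database name. What follows is a minimal Python sketch, assuming the line shapes shown above; parse_ridx_id and dbid_to_name are illustrative helper names, not part of any Oracle tooling.

```python
import re

ID_RE = re.compile(r"id \{(\d+) (\d+) (\d+)\}")

def parse_ridx_id(line):
    """Split the curly-bracket triple into DATAOBJ#, TS# and the cell-side dbId."""
    m = ID_RE.search(line)
    if m is None:
        return None
    dataobj, ts, dbid = (int(g) for g in m.groups())
    return {"dataobj#": dataobj, "ts#": ts, "cell_dbid": dbid}

def dbid_to_name(dcli_lines):
    """Build {dbId: dbUniqueName} from lines such as 'cm01cel01: EDW 2273376219'."""
    mapping = {}
    for line in dcli_lines:
        parts = line.split()
        if len(parts) == 3 and parts[2].isdigit():
            mapping[int(parts[2])] = parts[1]
    return mapping

ridx = parse_ridx_id(
    "2012-12-25 22:35:40.592751*: RIDX (0x2aad17d21f48) : st 2 validBitMap 0 "
    "tabn 0 id {29606 12 2273376219}"
)
names = dbid_to_name(["cm01cel01: EDW 2273376219"])
print(names[ridx["cell_dbid"]], ridx["dataobj#"], ridx["ts#"])
```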
The next two lines show us some other interesting pieces of information:
2012-12-25 22:35:40.592751*: RIDX: strt 0 end 2048 offset 219071643648 size 1048576 rgnIdx 208923 RgnOffset 0 scn: 0x0000.0666d22c hist: 0x92
2012-12-25 22:35:40.592751*: RIDX validation history: 0:FullRead 1:FullRead 2:FullRead 3:Undef 4:Undef 5:Undef 6:Undef 7:Undef 8:Undef 9:Undef
Here:
strt 0 and end 2048 tells us that our region indexes are 2KB in size
size 1048576 tells us that our storage region is 1MB in size
rgnIdx 208923 shows us the identifier of this specific region index
scn: 0x0000.0666d22c is our SCN
hist: and the RIDX validation history on the next line provide
information about the type of I/O operation against the region index (FullRead,
PartialRead, PartialWrite, etc) as well as show information about the
different cursor/predicate conditions that accessed the same storage region. For
example, if you update some data (and checkpoint), you'll see information like this
prior to seeing a PartialWrite in your validation history, and if you run queries
with different predicate conditions accessing the same region, you'll see something like
the second set of output:
regionLastChangeScn 0x0937.2d04b6a0 minActSCN 0x0937.2d04a953
2013-01-22 12:02:37.188680*: RIDX (0x2aad05fa40a0) for SQLID 60bq0bcx2p1g8 filter 0
2013-01-22 12:02:37.188680*: RIDX (0x2aad05fa40a0) : st 3 validBitMap 0 tabn 0 id {32024 9 2273376219}
size 1048576 rgnIdx 40122 RgnOffset 0 scn: 0x0937.2d04b6a0 hist: 0x136db6db
2013-01-22 12:02:37.188680*: RIDX validation history: 0:PartialWrite 1:PartialWrite 2:PartialWrite 3:PartialWrite 4:PartialWrite 5:PartialWrite 6:PartialWrite 7:PartialWrite 8:PartialWrite 9:FullRead
def 2:Undef 3:Undef 4:Undef 5:Undef 6:Undef 7:Undef 8:Undef 9:Undef
2013-01-22 12:13:53.444934*: Col id [1] numFilt 4 flg 2:
2013-01-22 12:13:53.444934*: hi: 57 59 0 0 0 0 0 0
2013-01-22 12:13:53.444934*: lo: c3 2 2a 52 0 0 0 0
2013-01-22 12:13:53.444934*: hi: c5 5 33 58 2b 57 0 0
llRead 2:Undef 3:Undef 4:Undef 5:Undef 6:Undef 7:Undef 8:Undef 9:Undef
2013-01-22 12:20:35.762422*: lo: 41 4b 0 0 0 0 0 0
2013-01-22 12:20:35.762422*: hi: 57 59 0 0 0 0 0 0
2013-01-22 12:20:35.762422*: lo: 41 61 63 68 65 6e 0 0
2013-01-22 12:20:35.762422*: hi: c3 98 73 74 65 72 67 c3
2013-01-22 12:20:35.762422*: lo: c3 2 2a 52 0 0 0 0
2013-01-22 12:20:35.762422*: hi: c5 5 33 58 2b 57 0 0
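Two patterns can be read off the RIDX samples in this post, and a few lines of Python make them easy to check. Neither is documented anywhere that this post cites, so treat both as assumptions that merely fit the data shown here: rgnIdx equals the offset divided by the 1MB region size, and hist looks like the validation history packed three bits per slot (0 for Undef, 2 for FullRead, 3 for PartialWrite). The function names and the code table are illustrative.

```python
REGION_SIZE = 1048576  # the "size" field on the RIDX line (1MB storage region)

# Codes inferred from the samples in this post; an assumption, not a documented format.
HIST_CODES = {0: "Undef", 2: "FullRead", 3: "PartialWrite"}

def region_index(offset, region_size=REGION_SIZE):
    """Region index implied by a byte offset, assuming fixed-size regions."""
    return offset // region_size

def decode_hist(hist, slots=10):
    """Unpack the hist value into one name per validation-history slot."""
    names = []
    for slot in range(slots):
        code = (hist >> (3 * slot)) & 0b111
        names.append(HIST_CODES.get(code, f"code{code}"))
    return names

print(region_index(219071643648))   # offset from the first RIDX sample
print(decode_hist(0x92))            # hist from the first RIDX sample
print(decode_hist(0x136db6db))      # hist from the PartialWrite sample
```

If the assumptions hold, the first call reproduces rgnIdx 208923 and the decoded lists line up with the "RIDX validation history" text printed next to each hist value.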
Below this, we see sections for each column contained in the region index, which is a
reflection of how many different columns had a WHERE clause used for this object.
2012-12-25 22:35:40.592751*: lo: c2 a 33 0 0 0 0 0
2012-12-25 22:35:40.592751*: hi: c2 a 34 0 0 0 0 0
2012-12-25 22:35:40.592751*: lo: c1 8 0 0 0 0 0 0
2012-12-25 22:35:40.592751*: hi: c3 5 64 64 0 0 0 0
2012-12-25 22:35:40.592751*: lo: 42 43 4e 5f 42 43 4e 53
2012-12-25 22:35:40.592751*: hi: 78 64 62 2d 6c 6f 67 31
Above we have three different columns, column 1, 2, and 4. These correspond to the
column position from DBA_TAB_COLUMNS for the specific object whose data resides in
the storage region. The numFilt indicates how many Storage Index filtering operations
have occurred for each column, and flg tells us whether the column contains null values
in the storage region (2 means it doesn't, 1 or 3 means it may or does).
Finally, the lines with the lo and hi text in them show us the low and high values for the
columns, represented in hex. You'll also note that only 8 bytes are stored per column per
region index.
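The hex values become much more readable once decoded. The trace does not say which data type a column has, so the sketch below offers one helper for character data and one for positive NUMBER values, using Oracle's internal NUMBER layout (an exponent byte offset by 0xc1, followed by base-100 digits each stored plus one). It is a minimal Python sketch: treating trailing zero bytes as padding is an assumption, negative numbers and zero are not handled, and the helper names are made up.

```python
def parse_bytes(field):
    """Turn a trace value such as 'c2 a 33 0 0 0 0 0' into a bytes object."""
    return bytes(int(tok, 16) for tok in field.split())

def as_text(raw):
    """Decode character data, dropping what is assumed to be zero padding."""
    return raw.rstrip(b"\x00").decode("ascii", errors="replace")

def as_positive_number(raw):
    """Decode the leading bytes of a positive Oracle NUMBER."""
    raw = raw.rstrip(b"\x00")
    if not raw or raw[0] <= 0x80:
        raise ValueError("only positive NUMBER values are handled in this sketch")
    exponent = raw[0] - 0xC1  # 0xc1 means the first base-100 digit is the units pair
    value = 0
    for position, byte in enumerate(raw[1:]):
        value += (byte - 1) * 100 ** (exponent - position)
    return value

print(as_positive_number(parse_bytes("c2 a 33 0 0 0 0 0")))
print(as_positive_number(parse_bytes("c3 5 64 64 0 0 0 0")))
print(as_text(parse_bytes("42 43 4e 5f 42 43 4e 53")))
```

Since only the first 8 bytes are kept, long strings and high-precision numbers come back truncated, which is worth remembering when comparing decoded values with the table data.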
Some Interesting Things to Note
Oracle documents that Exadata maintains storage indexes for up to 8 columns in
a table, which means that if you have an application that accesses a table using
more than 8 columns in your predicates (across the entire workload), each with
Smart Scan, not all of your columns/queries will benefit from Storage Indexes.
cellsrv dynamically updates these region indexes based on your workload, and
one of the interesting things to note is that each region index on the same object
can contain different combinations of columns. Like the high and low values
tracked, it's very data-dependent in design. The output below shows this:
2012-12-25 22:35:42.216672*: RIDX (0x2aacf1e078b0) for SQLID 826b8usjkvsba filter 0
2012-12-25 22:35:42.216672*: RIDX (0x2aacf1e078b0) : st 2 validBitMap 0 tabn 0 id {29606 12 2273376219}
4 size 1048576 rgnIdx 208674 RgnOffset 0 scn: 0x0000.0666d1ae hist: 0x92492492
llRead 2:FullRead 3:FullRead 4:FullRead 5:FullRead 6:FullRead 7:FullRead 8:FullRead 9:FullRead
2012-12-25 22:35:42.216672*: lo: c1 37 0 0 0 0 0 0
2012-12-25 22:35:42.216672*: hi: c1 37 0 0 0 0 0 0
2012-12-25 22:35:42.216672*: lo: c1 7 0 0 0 0 0 0
2012-12-25 22:35:42.216672*: hi: c3 5 64 51 0 0 0 0
2012-12-25 22:35:42.216672*: lo: 4d 47 4d 54 5f 50 52 4f
2012-12-25 22:35:42.216672*: hi: 54 53 5f 50 49 54 52 5f
2012-12-25 22:35:42.216672*: lo: 4d 47 4d 54 5f 50 52 4f
2012-12-25 22:35:42.216672*: hi: 54 53 5f 50 49 54 52 5f
2012-12-25 22:35:42.216672*: lo: 4d 47 4d 54 5f 50 52 4f
2012-12-25 22:35:42.216672*: hi: 54 53 5f 50 49 54 52 5f
2012-12-25 22:35:42.216672*: lo: 4d 47 4d 54 5f 50 52 4f
2012-12-25 22:35:42.216672*: hi: 54 53 5f 50 49 54 52 5f
2012-12-25 22:35:42.216672*: lo: 4d 47 4d 54 5f 50 52 4f
2012-12-25 22:35:42.216672*: lo: 4d 47 4d 54 5f 50 52 4f
2012-12-25 22:35:42.216672*: hi: 54 53 5f 50 49 54 52 5f
2012-12-25 22:35:48.460125*: RIDX (0x2aacfd011138) for SQLID 826b8usjkvsba filter 0
2012-12-25 22:35:48.460125*: RIDX (0x2aacfd011138) : st 2 validBitMap 0 tabn 0 id {29606 12 2273376219}
2 size 1048576 rgnIdx 208592 RgnOffset 0 scn: 0x0000.06673e34 hist: 0x92492492
llRead 2:FullRead 3:FullRead 4:FullRead 5:FullRead 6:FullRead 7:FullRead 8:FullRead 9:FullRead
2012-12-25 22:35:48.460125*: lo: 53 59 53 54 45 4d 5f 50
2012-12-25 22:35:48.460125*: hi: 57 4d 5f 45 56 45 4e 54
2012-12-25 22:35:48.460125*: lo: 53 59 53 54 45 4d 5f 50
2012-12-25 22:35:48.460125*: hi: 57 61 52 4e 49 4e 47 5f
2012-12-25 22:35:48.460125*: lo: 53 59 53 54 45 4d 5f 50
2012-12-25 22:35:48.460125*: hi: 57 4d 5f 45 56 45 4e 54
2012-12-25 22:35:48.460125*: lo: 53 59 53 54 45 4d 5f 50
2012-12-25 22:35:48.460125*: hi: 57 4d 5f 45 56 45 4e 54
2012-12-25 22:35:48.460125*: lo: 53 59 53 54 45 4d 5f 50
2012-12-25 22:35:48.460125*: hi: 57 4d 5f 45 56 45 4e 54
2012-12-25 22:35:48.460125*: lo: 53 59 53 54 45 4d 5f 50
2012-12-25 22:35:48.460125*: hi: 57 4d 5f 45 56 45 4e 54
2012-12-25 22:35:48.460125*: lo: 53 59 53 54 45 4d 5f 50
2012-12-25 22:35:48.460125*: hi: 57 4d 5f 45 56 45 4e 54
2012-12-25 22:35:48.460125*: lo: 53 59 53 54 45 4d 5f 50
2012-12-25 22:35:48.460125*: hi: 57 4d 5f 45 56 45 4e 54
Another interesting thing to note is that with Storage Index tracing, in my opinion
documentation states. One of the reasons for this is that Oracle limits trace file
information by the _cell_si_max_num_diag_mode_dumps cell initialization
long term performance impact, etc). Tracing enables you to see the dynamic
nature of how region index values are maintained for specific tables, but you need
to be cautious about how you scope your tests.
Conclusion
Enabling Storage Index tracing can be achieved by setting the
_kcfis_storageidx_diag_mode initialization parameter for a session (or system) on
the database tier and may be useful to understand how cellsrv builds and
maintains storage indexes. Once you get the feel for how they work, simply
querying your cell physical IO bytes saved by storage index statistic is
probably a one-stop shopping place for anything you need to measure with
respect to Storage Indexes.
Posted 21st January 2013 by John Clarke
20th January 2013
Those familiar with Exadata are aware that there are a number of important
software features that run on the Exadata storage cells whose job is to make
things fast.
You've undoubtedly heard about things like Smart Scan, cell offload, Smart Flash
Cache, Storage Indexes, and so forth. In this post I'm going to show you a couple
of SQL*Plus scripts that enable you to display statistics for a variety of Exadata-
specific performance metrics. Specifically, the script will report:
The amount of data being requested, in MB
The amount of data eligible for cell offload, in MB
The "Smart Scan efficiency", or the percentage of I/O saved via Smart Scan
relative to the total amount of I/O required
The amount of data transmitted over the InfiniBand interconnect, in MB
The throughput, in MB/second, of data transmitted over the interconnect
The amount of MB saved via Storage Indexes
The amount of data, in MB, satisfied from Smart Flash Cache
Displaying Exadata-specific
performance statistics for a
session
The amount of data processed on the storage cells; in other words, the
amount of physical I/O required by cellsrv process threads to satisfy the request
The I/O bandwidth, in MB/second, processed on the storage cells
The point of this blog is to hopefully provide a few scripts that will enable you to
measure what's happening for specific sessions in/on your Exadata Database
Machine.
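Of the metrics listed above, "Smart Scan efficiency" is the one that is a derived ratio rather than a raw counter. In the script it works out to 100 * (requested - interconnect) / requested, and it is reported as zero when nothing was eligible for offload. Here is a minimal Python sketch of that arithmetic; the function name, the extra guard for a zero request size, and the sample numbers are illustrative assumptions.

```python
def smart_scan_efficiency(mb_requested, interconnect_mb, offload_eligible_mb):
    """Percentage of requested I/O that never crossed the interconnect."""
    # Mirrors the script's CASE: no offload-eligible I/O means 0 efficiency.
    # The mb_requested check is an added guard against division by zero.
    if offload_eligible_mb == 0 or mb_requested == 0:
        return 0.0
    return round(100 * (mb_requested - interconnect_mb) / mb_requested, 2)

# Hypothetical session: 1000 MB requested, 50 MB returned over InfiniBand.
print(smart_scan_efficiency(1000, 50, 1000))
```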
Measuring statistics for your session
If you're doing some benchmarking for a program or SQL statement on Exadata,
simply execute your query or program and use the code below to measure what's
happening:
SQL> select count(*) from d14.census_raw
  2  where state_abbr='NY';
COUNT(*)
----------
7747830
SQL>
SQL> select * from
(select phys_reads+phys_writes+redo_size mb_requested,
   offload_eligible mb_eligible_offload,
   interconnect_bytes interconnect_mb,
   storageindex_bytes storageindex_mb_saved, flashcache_hits flashcache_mb,
   round(((case
     when offload_eligible=0 then 0
     when offload_eligible> 0 then
      (100*(((phys_reads+phys_writes+redo_size)-interconnect_bytes)/
        (phys_reads+phys_writes+redo_size)))
    end)),2) smartscan_efficiency,
   interconnect_bytes/dbt interconnect_mbps,
   (phys_reads+phys_writes+redo_size)-(storageindex_bytes+flashcache_hits) cell_mb_processed,
   ((phys_reads+phys_writes+redo_size)-(storageindex_bytes+flashcache_hits))/dbt cell_mbps
 from (
  select * from (
   select name,mb,dbt from (
    select stats.name,tm.dbt dbt,
     (case
       when stats.name='physical reads' then (stats.value * dbbs.value)/1024/1024
Classic
search
Send feedback
when st at s. name=' physi cal wr i t es' t hen
asm. asm_r edundancy*( ( st at s. val ue * dbbs. val ue) / 1024/ 10
when st at s. name=' r edo si ze' t hen asm. asm_r edundancy*( ( st at s
. val ue * 512) / 1024/ 1024)
when st at s. name l i ke ' cel l physi %' t hen st at s. val ue/ 1024/ 10
24
when st at s. name l i ke ' cel l %f l ash%' t hen ( st at s. val ue * dbbs
. val ue) / 1024/ 1024
el se st at s. val ue
end) mb
f r om(
sel ect b. name,
val ue
f r om v$myst at a,
v$st at name b
wher e a. st at i st i c# = b. st at i st i c#
and b. name i n
( ' cel l physi cal I O byt es el i gi bl e f or pr edi cat e of f l oad' ,
' cel l physi cal I O i nt er connect byt es' ,
' cel l physi cal I O i nt er connect byt es r et ur ned by smar t
scan' ,
' cel l f l ash cache r ead hi t s' , ' cel l physi cal I O byt es s
aved by st or age i ndex' ,
' physi cal r eads' ,
' physi cal wr i t es' ,
' r edo si ze' )
) st at s,
( sel ect val ue f r omv$par amet er wher e name=' db_bl ock_si ze' )
dbbs,
( sel ect decode( max( t ype) , ' NORMAL' , 2, ' HI GH' , 3, 2) asm_r edunda
ncy
f r omv$asm_di skgr oup ) asm,
( sel ect b. val ue/ 100 dbt
f r omv$myst at b, v$st at name a
wher e a. st at i st i c#=b. st at i st i c#
and a. name=' DB t i me' ) t m
) ) pi vot ( sum( mb) f or ( name)
i n ( ' cel l physi cal I O byt es el i gi bl e f or pr edi cat e of f l oad'
as of f l oad_el i gi bl e,
' cel l physi cal I O i nt er connect byt es' as i
nt er connect _byt es,
Classic
search
Send feedback
' cel l physi cal I O i nt er connect byt es r et ur ned by smar t scan
' as smar t scan_r et ur ned,
cache_hi t s,
' cel l physi cal I O byt es saved by st or age i ndex'
as st or agei ndex_byt es,
' physi cal r eads' as phys_r e
ads,
' physi cal wr i t es' as phys_w
r i t es,
' r edo si ze' as r edo_si ze) )
) )
unpi vot
( st at val f or st at t ype i n
( mb_r equest ed as ' MB Request ed' ,
mb_el i gi bl e_of f l oad as ' MB El i gi bl e f or Of f l oad' ,
smar t scan_ef f i ci ency as ' Smar t Scan Ef f i ci ency' ,
i nt er connect _mb as ' I nt er connect MB' ,
i nt er connect _mbps as ' I nt er connect MBPS' ,
st or agei ndex_mb_saved as ' St or age I ndex MB Saved' ,
f l ashcache_mb as ' Fl ash Cache MB r ead' ,
cel l _mb_pr ocessed as ' Cel l MB Pr ocessed' ,
cel l _mbps as ' Cel l MBPS' ) )
/
Statistic                      Statistic Value
------------------------------ ---------------
MB Requested                          11300.75
MB Eligible for Offload               11300.59
Smart Scan Efficiency                    93.47
Interconnect MB                         737.72
Interconnect MBPS                        26.14
Storage Index MB Saved                   87.26
Flash Cache MB read                        .02
Cell MB Processed                     11213.48
Cell MBPS                               397.36
9 rows selected.
SQL>
Above, I'm doing a couple of SQL things to make my life easier. First, I'm pivoting
the results of the query from V$MYSTAT into columns; I like doing this because it
makes doing computations with SQL easier, in my opinion. It's a matter of personal
preference; the alternative would be a separate lookup for
each stat, maybe some UNIONs, maybe some lengthy DECODEs, and so forth.
After the results are determined, I'm using UNPIVOT to turn these back into rows so
the output is easier to read.
Now for the important stuff - here's how the results were tabulated:
MB Requested represents the sum of:
  (physical_reads) X (db_block_size)
  (physical_writes) X (db_block_size) X (ASM disk group redundancy)
  (redo_size) X (512 bytes) X (ASM disk group redundancy)
MB Eligible for Offload represents the value from the
cell physical IO bytes eligible for predicate offload statistic
Smart Scan Efficiency represents the total MB requested minus the
interconnect MB, divided by the total MB requested and expressed as a
percentage. In other words, it's basically physical I/O minus actual
interconnect I/O, divided by physical I/O
Interconnect MB represents the value from the cell physical IO
interconnect bytes statistic
Interconnect MBPS uses the actual DB time statistic from V$MYSTAT
and divides the cell physical IO interconnect bytes statistic by DB time
Storage Index MB Saved represents the value of the cell physical
IO bytes saved by storage index statistic
Flash Cache MB read represents the number of flash cache hits, or
the cell flash cache read hits statistic
Cell MB Processed is the I/O requested (see the first bullet) minus the I/O
saved by storage indexes (cell physical IO bytes saved by storage
index) and Smart Flash Cache (cell flash cache read hits)
Cell MBPS represents the Cell MB Processed value divided by DB time
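As a sanity check on the formulas above, here is a minimal Python sketch that recomputes the derived rows from the base values in the sample output. The function name derive_exadata_stats is made up for illustration, and the DB time of 28.22 seconds is an assumption: the script does not print DB time, so I back-computed it from the Interconnect MBPS row. Unlike the SQL, the sketch does not special-case a session with zero offload-eligible bytes.

```python
def derive_exadata_stats(mb_requested, interconnect_mb, storage_index_mb,
                         flash_cache_mb, db_time_secs):
    """Recompute the derived rows of the report from its base values (all in MB)."""
    # Smart Scan efficiency: share of requested I/O that never crossed the interconnect
    efficiency = 100 * (mb_requested - interconnect_mb) / mb_requested
    # Cell MB processed: requested I/O minus what storage indexes and flash cache absorbed
    cell_mb = mb_requested - (storage_index_mb + flash_cache_mb)
    return {
        "smartscan_efficiency": round(efficiency, 2),
        "interconnect_mbps": round(interconnect_mb / db_time_secs, 2),
        "cell_mb_processed": round(cell_mb, 2),
        "cell_mbps": round(cell_mb / db_time_secs, 2),
    }


# Base values come from the sample output; db_time_secs is an inferred assumption.
stats = derive_exadata_stats(11300.75, 737.72, 87.26, 0.02, 28.22)
print(stats)
```

The recomputed Cell MB Processed can differ from the report in the last decimal place because the report subtracts unrounded values.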
A couple of notes about the above computations. First, I am using DB time as
the metric to arrive at the throughput rates for both the InfiniBand interconnect and
cell server bandwidth. I acknowledge that this isn't truly accurate; for example, if
a large portion of DB time were consumed parsing, or the session were busy
spinning on the compute node(s), it would artificially decrease both the interconnect
bandwidth and cell disk bandwidth computations. In other words, these rates are
approximations that may understate the actual bandwidth.
Second, I'm assuming that a cell flash cache read hit represents an I/O of one
database block (db_block_size) in
size. I think this is the lower bound at which I/O can be serviced against flash
storage, but need to validate this.
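To make the DB time caveat concrete, the sketch below computes the same MB/second figure twice: once against total DB time and once against only the part of DB time spent on I/O. The 28.22 seconds of DB time is an inferred assumption and the 8 seconds of non-I/O time is a made-up number, both purely for illustration.

```python
def mb_per_second(mb, seconds):
    """Throughput in MB/second; returns 0.0 when no time has elapsed."""
    return mb / seconds if seconds > 0 else 0.0


interconnect_mb = 737.72  # Interconnect MB from the sample output
db_time_secs = 28.22      # assumed total DB time for the session
non_io_secs = 8.0         # hypothetical time spent parsing or spinning on CPU

rate_on_db_time = mb_per_second(interconnect_mb, db_time_secs)
rate_on_io_time = mb_per_second(interconnect_mb, db_time_secs - non_io_secs)
print(f"{rate_on_db_time:.2f} MB/s using all of DB time")
print(f"{rate_on_io_time:.2f} MB/s using I/O time only")
```

The larger the non-I/O share of DB time, the further the reported rate falls below what the interconnect actually sustained.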
Measuring statistics for someone else's session
To capture the same type of detail for another session, run something like the script below.
It basically does the same thing as the previous script but uses V$SESSTAT
instead:
select * from
  (select phys_reads+phys_writes+redo_size mb_requested,
          offload_eligible mb_eligible_offload,
          interconnect_bytes interconnect_mb,
          storageindex_bytes storageindex_mb_saved,
          flashcache_hits flashcache_mb,
          round(((case
            when offload_eligible=0 then 0
            when offload_eligible>0 then
              (100*(((phys_reads+phys_writes+redo_size)-interconnect_bytes)/
                    (phys_reads+phys_writes+redo_size)))
          end)),2) smartscan_efficiency,
          interconnect_bytes/dbt interconnect_mbps,
          (phys_reads+phys_writes+redo_size)-(storageindex_bytes+flashcache_hits) cell_mb_processed,
          ((phys_reads+phys_writes+redo_size)-(storageindex_bytes+flashcache_hits))/dbt cell_mbps
   from (
    select * from (
     select name,mb,dbt from (
      select stats.name,tm.dbt dbt,
        (case
          when stats.name='physical reads' then (stats.value * dbbs.value)/1024/1024
          when stats.name='physical writes' then
            asm.asm_redundancy*((stats.value * dbbs.value)/1024/1024)
          when stats.name='redo size' then asm.asm_redundancy*((stats.value * 512)/1024/1024)
          when stats.name like 'cell physi%' then stats.value/1024/1024
          when stats.name like 'cell%flash%' then (stats.value * dbbs.value)/1024/1024
          else stats.value
        end) mb
      from (
        select b.name,
               value
        from   v$sesstat a,
               v$statname b
        where  a.statistic# = b.statistic#
        and    a.sid='&&sid'
        and    b.name in
          ('cell physical IO bytes eligible for predicate offload',
           'cell physical IO interconnect bytes',
           'cell physical IO interconnect bytes returned by smart scan',
           'cell flash cache read hits','cell physical IO bytes saved by storage index',
           'physical reads',
           'physical writes',
           'redo size')
      ) stats,
      (select value from v$parameter where name='db_block_size') dbbs,
      (select decode(max(type),'NORMAL',2,'HIGH',3,2) asm_redundancy
       from v$asm_diskgroup ) asm,
      (select b.value/100 dbt
       from   v$sesstat b, v$statname a
       where  a.statistic#=b.statistic#
       and    b.sid='&&sid'
       and    a.name='DB time') tm
    )) pivot (sum(mb) for (name)
       in ('cell physical IO bytes eligible for predicate offload' as offload_eligible,
           'cell physical IO interconnect bytes' as interconnect_bytes,
           'cell physical IO interconnect bytes returned by smart scan' as smartscan_returned,
           'cell flash cache read hits' as flashcache_hits,
           'cell physical IO bytes saved by storage index' as storageindex_bytes,
           'physical reads' as phys_reads,
           'physical writes' as phys_writes,
           'redo size' as redo_size))
  ))
  unpivot
  (statval for stattype in
    (mb_requested as 'MB Requested',
     mb_eligible_offload as 'MB Eligible for Offload',
     smartscan_efficiency as 'Smart Scan Efficiency',
     interconnect_mb as 'Interconnect MB',
     interconnect_mbps as 'Interconnect MBPS',
     storageindex_mb_saved as 'Storage Index MB Saved',
     flashcache_mb as 'Flash Cache MB read',
     cell_mb_processed as 'Cell MB Processed',
     cell_mbps as 'Cell MBPS'))
/
undefine sid
What about a Broader Analysis?
You can pretty easily adapt the scripts above to examine AWR, ASH, or
V$SYSSTAT data; the thing you'll need to take into consideration is the time
element (which I'm using DB time for) to gather reasonably accurate MB/second
values.
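System-wide counters such as those in V$SYSSTAT are cumulative since instance startup, so one reasonable way to handle the time element is to sample a counter twice and divide the delta by the wall-clock interval. The Python sketch below shows only that arithmetic; the function name and the two sample values are hypothetical, not taken from a real system.

```python
def rate_between_snapshots(begin_bytes, end_bytes, elapsed_secs):
    """MB/second for a cumulative byte counter sampled at two points in time."""
    if elapsed_secs <= 0:
        raise ValueError("elapsed_secs must be positive")
    delta_mb = (end_bytes - begin_bytes) / 1024 / 1024
    return delta_mb / elapsed_secs


# Hypothetical samples of 'cell physical IO interconnect bytes' taken 60 seconds apart
begin_value = 5_000_000_000
end_value = 8_145_728_000
interconnect_rate = rate_between_snapshots(begin_value, end_value, 60)
print(round(interconnect_rate, 2))
```

The same delta-over-interval approach applies to AWR data, where the interval is the time between two snapshots.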
Conclusion
While the scripts above may have some holes in the MB/second logic and the use
of DB time to drive rate calculations, you may find them helpful to benchmark
problematic sessions, programs, etc., and get a pretty decent idea of the
macroscopic Exadata performance nature of a particular problem.
20th January 2013
Data Clustering and Storage Indexes on Exadata
In this post I'm going to demonstrate the importance of how data is ordered in your
Oracle tables with respect to Exadata's Storage Index benefit.
As a brief primer, Storage Indexes are one of Exadata's storage cell software
features implemented via cell services (cellsrv) that work to deliver "extreme
performance". Exadata uses storage indexes to skip I/O requests to specific
storage regions based on query predicate
conditions. Storage Indexes are essentially a collection of "region indexes",
one maintained for each 1 MB storage region residing on your ASM disks. As data
is accessed via Smart Scan, cellsrv tracks the high and low values for the
columns used in the query predicates and populates these values in a region
index. For subsequent queries that access the data in these storage regions,
cellsrv will examine the high and low boundaries in the region index and if the
predicate conditions lie outside the high/low range, cellsrv will bypass
performing a physical I/O.
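The high/low check described above can be sketched in a few lines of Python. This is a toy model for intuition only, not how cellsrv is implemented; the RegionIndex class, the can_skip function, and the three sample regions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RegionIndex:
    """Low and high values observed for one column in one storage region."""
    low: int
    high: int


def can_skip(region, predicate_low, predicate_high):
    """True when the predicate range cannot overlap the region's low/high range."""
    return predicate_high < region.low or predicate_low > region.high


# Three hypothetical storage regions and a predicate of "col BETWEEN 40 AND 50"
regions = [RegionIndex(1, 30), RegionIndex(25, 60), RegionIndex(70, 99)]
skipped = [can_skip(r, 40, 50) for r in regions]
print(skipped)  # [True, False, True]
```

Only the middle region has to be read, because it is the only one whose low/high range can contain values between 40 and 50.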
To measure the benefit of Storage Indexes for your queries, use the cell
physical IO bytes saved by storage index statistic. This statistic
represents the number of bytes skipped as a result of Storage Indexes. In this
post I'll be using the simple script below to measure the impact of Storage Indexes
for a couple of queries:
Script: si_mystat.sql
set echo off
set lines 200
col name format a70 head 'Statistic'
col value format 999,999,999,999,999.90 head 'Value (MB)'
col stat format a20
set echo on
select stat.name,
       sess.value/1024/1024 value
from   v$mystat sess,
       v$statname stat
where  stat.statistic# = sess.statistic#
and    stat.name in
  ('cell physical IO bytes eligible for predicate offload',
   'cell physical IO interconnect bytes',
   'cell physical IO bytes saved by storage index',
   'cell physical IO interconnect bytes returned by smart scan')
order by 1;
Without Well-Ordered Data
In the test below I'm going to execute a query against a test table called
MYOBJ_TEST1 and query on the column COL5. The table's characteristics look
like this:
SQL> desc d14.myobj_test1
 Name                           Null?    Type
 ------------------------------ -------- ----------------
 COL1                                    NUMBER
 COL2                                    NUMBER
 COL4                                    VARCHAR2(128)
 COL5                                    VARCHAR2(30)
 COL6                                    VARCHAR2(19)
 COL7                                    VARCHAR2(19)
 COL8                                    VARCHAR2(7)
 COL9                                    VARCHAR2(1)
 COL10                                   VARCHAR2(1)
SQL>
SQL> select num_rows,blocks from dba_tables
  2  where table_name='MYOBJ_TEST1'
  3  and owner='D14';
  NUM_ROWS     BLOCKS
---------- ----------
  20447000     221050
SQL>
SQL> select column_name,num_distinct
  2  from dba_tab_columns
  3  where table_name='MYOBJ_TEST1'
  4  and owner='D14';
COLUMN_NAME                    NUM_DISTINCT
------------------------------ ------------
COL1                                   1000
COL2                                  50536
COL3                                     13
COL4                                  15726
COL5                                    424
COL6                                     41
COL7                                    469
COL8                                      2
COL9                                      2
COL10                                     2
10 rows selected.
SQL>
Let's execute a test query against this table:
SQL> alter system flush buffer_cache;
System altered.
Elapsed: 00:00:00.12
SQL> set echo off
PL/SQL procedure successfully completed.
Elapsed: 00:00:13.07
Connected.
SQL> SELECT count(*)
  2  FROM d14.myobj_test1
  3  where col5='H0114';
  COUNT(*)
----------
      3000
Elapsed: 00:00:00.44
SQL> @si_mystat.sql
SQL> set echo off
  2  sess.value/1024/1024 value
 10  'cell physical IO interconnect bytes returned by smart scan')
 11  order by 1
 12  /
Statistic                                                              Value (MB)
---------------------------------------------------------------------- ----------
cell physical IO bytes eligible for predicate offload                    1,721.50
cell physical IO bytes saved by storage index                              765.88
cell physical IO interconnect bytes                                           .43
cell physical IO interconnect bytes returned by smart scan                    .40
Elapsed: 00:00:00.00
SQL>
Here's what this means:
First, we're flushing our buffer cache to avoid buffered reads of any sort. This
is not strictly required, since our query will perform serial
direct reads, but I like to do this anyway
We're then executing a query that retrieved 3,000 rows from our 20+ million
row table, and see that it completed in under 1 second (0.44 seconds). This
performance can be explained by a combination of Smart Scan and Storage
Indexes
We see that about 1,721 MB of data was eligible for offload and, based on the
value of the cell physical IO bytes saved by storage index
statistic, we saved about 765 MB of I/O via Storage Indexes
With Well-Ordered Data
What would happen if the data in our table were ordered based on the COL5
column? Let's create a copy of this table and order it (below, I'm importing some
previously exported statistics to save some time):
SQL> drop table d14.myobj_test2
  2  /
Table dropped.
Elapsed: 00:00:00.14
SQL> create table d14.myobj_test2
  2  tablespace tbs_test
  3  nologging
  4  as select * from d14.myobj_test1
  5  order by col5
  6  /
Table created.
Elapsed: 00:00:44.55
SQL> begin
  2  dbms_stats.import_table_stats(ownname=>'D14',tabname=>'MYOBJ_TEST2',
  3  stattab=>'SI_STATTAB',statown=>'SYS');
  4  end;
  5  /
Elapsed: 00:00:00.16
SQL>
Now let's run the same query against the newly ordered table:
SQL> alter system flush buffer_cache;
System altered.
Elapsed: 00:00:00.11
SQL> set echo off
Priming our storage indexes ...
Elapsed: 00:00:18.73
Connected.
SQL> SELECT count(*)
  2  FROM d14.myobj_test2
  3  where col5='H0114';
  COUNT(*)
----------
      3000
Elapsed: 00:00:00.06
SQL> @si_mystat.sql
SQL> set echo off
  2  sess.value/1024/1024 value
 10  'cell physical IO interconnect bytes returned by smart scan')
 11  order by 1
 12  /
Statistic                                                              Value (MB)
---------------------------------------------------------------------- ----------
cell physical IO bytes eligible for predicate offload                    1,721.50
cell physical IO bytes saved by storage index                            1,718.61
cell physical IO interconnect bytes                                           .06
cell physical IO interconnect bytes returned by smart scan                    .04
SQL>
As we can see:
The number of bytes eligible for offload was the same as in the previous
test
Our query executed in 0.06 seconds, as compared to 0.44 seconds
We saved about 1,718 MB of I/O as a result of Storage Indexes, nearly all
of the I/O requested
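Expressed as a share of the offload-eligible I/O, the two runs compare as follows. This is a small Python sketch that only redoes the arithmetic on the figures from the two outputs above; the helper name pct_saved is made up for illustration.

```python
def pct_saved(saved_mb, eligible_mb):
    """Percentage of offload-eligible I/O skipped thanks to Storage Indexes."""
    return round(100 * saved_mb / eligible_mb, 2) if eligible_mb else 0.0


unordered_pct = pct_saved(765.88, 1721.50)   # MYOBJ_TEST1, data not ordered by COL5
ordered_pct = pct_saved(1718.61, 1721.50)    # MYOBJ_TEST2, data ordered by COL5
print(unordered_pct, ordered_pct)
```

Roughly 44% of the eligible I/O was skipped on the unordered table versus more than 99% on the ordered copy.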
Conclusion
The ordering of data in your tables plays a significant role in how well Storage
Indexes are used and how much they can potentially improve performance. If you
think about how they work, if the region indexes on each storage region contain
high and low values that are "close together", queries with predicates that lie
outside these ranges can very effectively save you a great deal of physical I/O
requests.
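To see why "close together" high and low values matter, here is a small Python simulation of the idea. It is a toy model under loud assumptions: a "region" is simply 1,000 consecutive rows standing in for a 1 MB storage region, the data is synthetic with 424 distinct values (mirroring COL5's NUM_DISTINCT), and a fully shuffled table is a worst case that real tables rarely reach.

```python
import random


def regions_skipped(values, target, region_size):
    """Count fixed-size regions whose min/max range excludes the target value."""
    skipped = 0
    for start in range(0, len(values), region_size):
        region = values[start:start + region_size]
        if target < min(region) or target > max(region):
            skipped += 1
    return skipped


random.seed(42)  # fixed seed so the run is repeatable
data = [i % 424 for i in range(100_000)]   # synthetic column with 424 distinct values
shuffled = random.sample(data, len(data))  # rows in random order
ordered = sorted(data)                     # rows ordered by the column

total_regions = len(data) // 1_000
skipped_shuffled = regions_skipped(shuffled, 200, 1_000)
skipped_ordered = regions_skipped(ordered, 200, 1_000)
print("regions:", total_regions)
print("skipped when shuffled:", skipped_shuffled)
print("skipped when ordered: ", skipped_ordered)
```

With ordered data, nearly every region has a narrow low/high range that excludes the target value and can be skipped; with shuffled data, almost every region spans the full range of values and must be read.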
Something to think about as you're designing your applications on Exadata and
specifically, how data is inserted into your tables/segments. You may not have a
great deal of control of how this works depending on your application and
application usage, but hopefully this will get you thinking about the impact of
Storage Indexes and the order of the data in your tables.