
Latch internals

RUOUG Seminar, December 6, 2012


Andrey Nikolaev
Oracle Database Performance Expert
RDTEX, Russia

RDTEX | Latch internals | RUOUG Seminar

WHO AM I

Andrey.Nikolaev@rdtex.ru
RDTEX, First Line Support Center
http://andreynikolaev.wordpress.com
"Latch, mutex and beyond"
Specialize in Oracle performance tuning
Over 20 years of Oracle-related experience as a research
scientist, developer, DBA, performance consultant, and lecturer
Occasionally presented at conferences
RUOUG member


EPISODE OF LATCH CONTENTION


Oracle instance hangs due to heavy "cache buffers chains" latch contention


WHAT WE ALL KNEW ABOUT ORACLE LATCHES

Latches spin up to the instance parameter _SPIN_COUNT,
which defaults to 2000.
This number can be tuned dynamically.
Latch waits use increasing exponential backoff.
Latch waits are very short.
We should tune the application, not _SPIN_COUNT.


CONTEMPORARY LATCHES SINCE 9.2

I was surprised to find that:
The exclusive latch spins 10 times more.
Exponential backoff disappeared more than a decade ago,
in Oracle 9.2.
All latches except one now use wait posting.
A "latch free" wait may be infinite.
Sometimes it is worth tuning _SPIN_COUNT.
This parameter is static for most latches.


GREAT SOURCES OF LATCH INFO

Oracle8i Internal Services for Waits, Latches, Locks, and
Memory by Steve Adams [1], 1999.
Founded a new era of Oracle performance tuning.
However, Oracle latch technology has evolved greatly during
the decade since.

Systematic Latch Contention Troubleshooting approach by
Tanel Poder [2].
The latchprofx.sql script revolutionized latch tuning.

Oracle Core: Essential Internals for DBAs and Developers
by Jonathan Lewis [3], 2011.


REVIEW OF SERIALIZATION MECHANISMS IN ORACLE

"Latches are simple, low-level serialization mechanisms that coordinate
multiuser access to shared data structures, objects, and files."

"A mutual exclusion object (mutex) is a low-level mechanism that
prevents an object in memory from aging out or from being corrupted."

"Internal locks are higher-level, more complex mechanisms."

Oracle Database Concepts 11.2

KGX mutexes appeared in the latest Oracle versions inside the Library Cache.

                  Locks            Latches             Mutexes
Access            Several modes    Types and modes     Operations
Acquisition       FIFO             SIRO (spin) + FIFO  SIRO
Spinlock based    No               Yes                 Yes
Timescale         > milliseconds   microseconds        sub-microseconds
Lifecycle         Dynamic          Static              Dynamic

HOW SPINLOCK WORKS

[Figure: timeline of a spinlock location. CPU 1 holds the spinlock, then frees
it; CPU 2 gets it after the holding time (S) and holds it in turn; CPU 3
misses, spins by polling the spinlock location up to the spin count, then
sleeps and releases the CPU.]

Oracle latches:
Use an atomic hardware instruction for the immediate get.
If missed, the process spins by polling the latch location during the spin get.
The number of spin cycles is bounded by the spin count.
If the spin get does not succeed, the process acquiring the latch sleeps.

CLASSIC SPINLOCKS

Wiki: "a spinlock waits in a loop repeatedly checking until the lock
becomes available".
Introduced by Edsger Dijkstra [4] in 1965. Has been thoroughly
investigated since that time [5].
Many sophisticated spinlock realizations have been proposed and evaluated
(TS, TTS, Delay, MCS, Anderson, ...).
Two general types:
System spinlocks. Kernel OS threads cannot wait. Major metrics:
atomic operation frequency, shared bus utilization.
User spinlocks. Oracle latch and mutex. Average lock holding time
~10 us. It is more efficient to poll a lock than to pre-empt the thread
with a 1 ms context switch. Metrics: CPU and elapsed times.

THE NEED FOR SPIN

[Figure: time to complete a CBC latch contention testcase vs. number of
threads, with no spinning (_spin_count=0) and with _spin_count=2000.]

SPINLOCK REALIZATIONS

Spinlock   Pseudocode                        Problems                   Used by
TS         while(Test_and_Set(lock));        Bus saturation             pre-11.2 mutex
TTS        while(lock||Test_and_Set(lock));  Invalidation storms on     Oracle latch
                                             release (open door,
                                             thundering herds)
Delay      adjustable delay                  Slower under contention    Latch and 11.2 mutex
Anderson,                                    CPU and memory overhead,   Widely used in Java,
MCS, queues, etc.                            preemption issues          Linux kernel, etc.
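The TS and TTS pseudocode above can be sketched in portable C11 atomics. This is an illustrative sketch, not Oracle's internal code; all names are invented for the example:

```c
#include <stdatomic.h>
#include <assert.h>

typedef atomic_int spinlock_t;

/* TS: every probe is an atomic read-modify-write, hammering the bus */
static void ts_lock(spinlock_t *l) {
    while (atomic_exchange(l, 1))
        ;                                   /* spin on the atomic itself */
}

/* TTS: spin on a plain read (stays in the local cache line) and attempt
 * the atomic exchange only when the lock looks free */
static void tts_lock(spinlock_t *l) {
    for (;;) {
        while (atomic_load_explicit(l, memory_order_relaxed))
            ;                               /* cheap local polling */
        if (!atomic_exchange(l, 1))
            return;                         /* won the race */
    }
}

static void spin_unlock(spinlock_t *l) {
    atomic_store(l, 0);
}
```

Under contention, TTS keeps most probes inside each CPU's cache, which is why the table attributes bus saturation to TS but only release-time invalidation storms to TTS.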

THE TOOLS

DTRACE. SOLARIS DYNAMIC TRACING FRAMEWORK

DTrace allows us to investigate how Oracle latches perform in real time:
Create triggers on any event inside Solaris and any function call inside Oracle:
provider:module:function:name
pid1910:oracle:kslgetl:entry
Write trigger body actions. May read and change any address location.
Can count the latch spins, trace the latch waits, perform experiments.
Measure times and distributions with microsecond precision.
sqlplus /nolog @demo_00


ORADEBUG. INTERNAL ORACLE DEBUGGER


oradebug call. Allows us to invoke any internal Oracle function manually.
SQL> oradebug call kslgetl 0x5001A020 1 2 3
Function returned 1
oradebug peek. Examines contents of any memory location.
SQL> oradebug peek 200222A0 24
[200222A0, 200222B8) = 00000016 00000001 000001D0 00000007
oradebug poke. Modifies memory.
oradebug watch. Watch a region of memory.


LATCH CONTENTION TESTCASES

Exclusive and shared latches behave differently.
Contention for the shared "cache buffers chains" latch arises when many
sessions scan the same block concurrently and _fastpin_enable=0.

create table latch_contention as select * from dual;

declare
  a varchar2(1);
begin
  for i in 1..1000000 loop
    select dummy into a from latch_contention;
  end loop;
end;
/

To induce contention for the exclusive "row cache objects" latch protecting
the dc_users row cache, just add several dynamic RLS policies:
exec dbms_rls.add_policy('system','latch_contention','rls_1',
exec dbms_rls.add_policy('system','latch_contention','rls_2',

These dummy policies generate '1=1' predicates.
Many other contention scenarios are possible.
sqlplus /nolog @demo_0_1

ROW CACHE LATCH CONTENTION TESTCASE

[Figure: elapsed and CPU times vs. number of threads, with the latch wait
component marked. Oracle Database Appliance (24 SMT).]

HOW ORACLE REQUESTS THE LATCH

DTRACE REVEALS LATCH INTERFACE ROUTINES

Oracle calls the following functions to acquire the latch:
kslgetl(laddr, wait, why, where)         - get exclusive latch
kslg2c(l1, l2, trc, why, where)          - get two exclusive child latches
kslgetsl(laddr, wait, why, where, mode)  - get shared latch
                                           (11g: ksl_get_shared_latch())
kslg2cs(l1, l2, mode, trc, why, where)   - get two shared child latches
kslgpl(laddr, comment, why, where)       - get parent and all child latches
kslfre(laddr)                            - free the latch

Oracle gives us the possibility to do the same by oradebug call.
Use 32-bit ports: 64-bit Oracle corrupts the arguments of oradebug call
in 10.2.0.4 and above!

sqlplus /nolog @demo_01


WHERE AND WHY ORACLE GETS THE LATCH

To request the latch, Oracle kernel routines need:
laddr - address of the latch in the SGA
wait - flag for no-wait (0) or wait (1) latch acquisition
where - integer code for the location from which the latch is acquired.
Oracle lists possible latch where values in x$ksllw.indx.
why - context of why the latch is being acquired at this where. It
contains the DBA for CBC latches (Tanel Poder), the SGA chunk address
for the shared pool latch, the session address, etc. The meaning of why
for some where values can be found in x$ksllw.ksllwlbl.
mode - requested state for shared latches:
8 - SHARED mode, 16 - EXCLUSIVE mode


LIST OF LATCH WHERE LOCATIONS

X$KSLLW - list of latch [W]here locations. Part of v$latch_misses:
ksllwnam - kernel function name (kghupr1, ksqdel, etc.)
ksllwlbl - description of the why context
Join this code on indx to:
x$ksuprlat.ksulawhr (v$latchholder) - where value used by the latch holder
x$ksllt.kslltwhr (v$latch) - where used by the last successful latch
acquisition
x$ksupr.ksllawer (v$process) - where used by a process that is waiting
for the latch
x$kslwsc.indx (v$latch_misses) - latch sleep statistics by where


V$LATCH_MISSES

Based on X$KSLWSC - No[W]ait and [S]leep [C]ount statistics by
context. The documentation is rather obscure.
The following counters are incremented on each latch sleep:
sleep_count (where the latch was held)
wtr_slp_count (where the latch get was requested)
nwfail_count is incremented only for the session whose SID equals
the _LATCH_MISS_STAT_SID parameter.
longhold_count is incremented for shared latches only, when
someone held the latch at this where for the entire duration of
the sleep.


A LATCH IS HELD BY A PROCESS, NOT A SESSION

Process fixed array: v$process -> x$ksupr

struct ksupr {
  ...
  struct kslla {
    ksllt *ksllalat[14];
    ...
  }
}

List of all latches: v$latch -> x$ksllt

struct ksllt { ... }

Each process has an array of references to the latches it is holding.


KSLLA IS HIDDEN IN V$PROCESS

The process latching info is the kslla structure embedded in the
process state object.
Some kslla fields are externalized to SQL.
kslla contains an array of latches held by the process at each level and
information about the state of any current attempt to acquire a
latch. Not externalized directly.
When you query v$latchholder/x$ksuprlat, Oracle scans
v$process/x$ksupr in order to find all the processes that are
holding latches.
Tanel Poder uses high-frequency sampling of v$latchholder to
systematically troubleshoot latch contention.


X$KSUPR.KSLLA FIELDS INSTRUMENT THE LATCH GET

ksllalaq - address of the latch being acquired. Populated during
immediate get (and spin before 11g)
ksllawat - latch being waited for. This is v$process.latchwait
ksllawhy - why for the latch being waited for
ksllawere - where for the latch being waited for
ksllalow - bit array of levels of currently held latches
ksllaspn - latch this process is spinning on. v$process.latchspin.
Not populated since 8.1
ksllaps% - inter-process post statistics
The generic latch free and the 27 latch-specific "latch free: %" wait
events are associated with sleeps only. Spin is invisible to the Oracle
Wait Interface.


THE LATCH STRUCTURE KSLLT

struct ksllt {
  <Latch>
  where and why
  level, latch#, class, other attributes
  statistics
  latch wait list header
  ...
}


LATCH SIZE BY VERSION

x$ksmfsv - list of all fixed SGA variables:
SELECT DISTINCT ksmfssiz
  FROM x$ksmfsv
 WHERE ksmfstyp = 'ksllt';

Version             *nix 32bit   *nix 64bit   Windows 32bit
7.3.4               92                        120
8.0.6               104                       104
8.1.7               104          144          104
9.0.1                            200          160
9.2.0               196          240          200
10.1.0                           256          208
10.2.0 - 11.2.0.3   100          160          104

The latch structure was bigger in 10.1 due to additional latch statistics.


ORACLE LATCH IS NOT JUST A SINGLE MEMORY LOCATION

Before 11g, the value of the first latch byte (word for shared latches)
was used to determine the latch state:
0x00 - latch is free
0xFF - exclusive latch is busy. Was 0x01 in Oracle 7
0x01, 0x02, ... - shared latch held by 1, 2, ... processes simultaneously
0x20000000 | pid - shared latch held exclusively

In 11g the first latch word shows the pid of the latch holder:
0x00 - latch is free
0x12 - Oracle process with pid 18 holds the exclusive latch


BUSY ORACLE 11G LATCH IN MEMORY

11.2 exclusive latch (32-bit platform):
SQL> oradebug peek 200222A0 24
[200222A0, 200222B8) = 00000016 00000001 000001D0 00000007
                           ^pid     gets    latch#    level#

11.2 shared latch, EXCLUSIVE mode:
SQL> oradebug peek 0x&laddr 24
[2000C814, 2000C82C) = 2000001F 0000007C 0000007E 00000006
                           ^pid     gets    latch#    level#

cont. sqlplus /nolog @demo_01


LATCH FIXED TABLES

Before Oracle 11g all latches were externalized to SQL in
v$latch (x$ksllt).
In 11g parent (fixed SGA) latches are externalized in
v$latch_parent (x$kslltr_parent).
In 11g child (allocated on startup) latches are externalized in
v$latch_children (x$kslltr_children).
Oracle 11g v$latch (x$kslltr) merges these data. Beware: select
from x$kslltr acquires all parent latches and their children.
Only part of the latch structure is visible in the x$ksllt fixed tables:
kslltwhr, kslltwhy - where and why the latch was acquired last time
kslltwgt, ... - statistics
kslltlvl, ... - attributes
kslltwtt - latch wait time


CUMULATIVE V$LATCH STATISTIC COUNTERS

Statistic          x$ksllt    When and how it increments
GETS               kslltwgt   ++ after a wait-mode latch get
MISSES             kslltwff   ++ after a wait get if it was missed
SLEEPS             kslltwsl   + number_of_sleeps during the get
SPIN_GETS          ksllthst0  ++ if the get was missed but did not sleep
WAIT_TIME          kslltwtt   + wait_time after the latch get
IMMEDIATE_GETS     kslltngt   ++ after a nowait-mode latch get;
                              not incremented on miss
IMMEDIATE_MISSES   kslltnfa   ++ if a nowait-mode get was missed


MANY STATISTICS WERE LOST IN 10.2

In 10.2 Oracle made many useful latch statistics obsolete.
The latch size was decreased, and the following fields now always
contain 0 in v$latch:
v$latch.waiters_woken
v$latch.waits_holding_latch
v$latch.sleep1 .. sleep10
and many undocumented statistics.
This was mentioned in the 10.2 Reference.


X$KSLLD - LATCH DESCRIPTORS

Flags and attributes describing each latch are stored in the kslldt array
of kslld structures.
[K]ernel [S]ervice [L]atch [L]ock [D]escriptor.
Some, but not all, kslld fields are externalized via x$kslld and
v$latchname.
Located in the PGA.


LATCH ATTRIBUTES

Each latch has at least the following attributes in kslldt:
Name - latch name as it appears in V$ views
SHR - is the latch Exclusive or Shared?
PAR - is the latch Solitary or a Parent for a family of child latches?
G2C - can two child latches be simultaneously requested in wait mode
using kslg2c()?
LNG - is wait posting used for this latch? Obsolete since Oracle 9.2
UFS - the latch is Ultrafast: it does not increment miss statistics when
STATISTICS_LEVEL=BASIC. 10.2 and above
Level - 0-14. Used to prevent latch deadlocks
Class - 0-7. Spin and wait class assigned to the latch. 9.2 and above


LATCH ATTRIBUTES BY ORACLE VERSION

Oracle     Number of latches  PAR  G2C  SHARED
7.3.4.0    53                 14
8.0.6.3    80                 21
8.1.7.4    152                48        19
9.2.0.8    242                79   37   19
10.2.0.2   385                114  55   47
10.2.0.3   388                117  58   48
10.2.0.4   394                117  59   50
11.1.0.6   496                145  67   81
11.1.0.7   502                145  67   83
11.2.0.1   535                149  70   86
11.2.0.2   551                153  72   91
11.2.0.3   553                154  72   93


LATCH LEVELS

Every latch has a level in the range 0-14 to avoid deadlocks.
A process can request latch X in wait mode after obtaining latch Y if and
only if level X > level Y.
At the same level: a process can request a second G2C child X in wait
mode after obtaining child Y if and only if the child number of X < child
number of Y.
If these rules are broken, the Oracle process raises:
ORA-600[504] - trying to obtain another latch at an incompatible level. Note 28104.1.
ORA-600[525] - trying to obtain another latch at the same level. Note 138888.1.
ORA-600[526] - latch child order violation. Note 138889.1.
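The level rule can be expressed as a tiny predicate over a bit array of held levels, the same idea as the ksllalow bit array in x$ksupr. This is an illustrative sketch; the function name and encoding are assumptions, not Oracle's code:

```c
#include <stdbool.h>
#include <assert.h>

/* held_levels: bit i set means the process holds some latch at level i.
 * A new wait-mode request is legal only if every held level is strictly
 * below the requested level. */
static bool may_request(unsigned held_levels, int new_level) {
    /* any bit at position >= new_level violates the rising-level rule */
    return (held_levels >> new_level) == 0;
}
```

A violation of this predicate corresponds to the ORA-600[504]/[525] checks above.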


PARENT AND CHILD LATCHES

A latch can be a parent, a child, or a solitary latch.
Both parent and child latches share the same latch name.
Parent latches are static objects in the fixed SGA. See x$ksmfsv and ksms.s.
Child latches are allocated during instance startup.
The parent latch can be gotten independently.
But it may act as a master latch when acquired in a special mode in kslgpl()
during deadlock detection or select from v$latch/x$kslltr and some x$'s:

(latch info) wait_event=0 bits=10
holding 5000a5b8 Parent+children enqueue hash chains level=4
Location from where latch is held: ksqcmi: kslgpl:
Context saved from call: 0
state=busy


LATCH TREES

The rising-level rule leads to trees of processes waiting for and
holding latches:

ospid: 28067 sid: 1677 pid: 61
 holding: 3800729f0 'shared pool' (156) level=7 child=1 whr=1602 kghupr1
  waiter: ospid: 129 sid: 72 pid: 45
   holding: a154b7120 'library cache' (157) level=5 child=17 kglupc: child
    waiter: ospid: 18255 sid: 65 pid: 930
    waiter: ospid: 6690 sid: 554 pid: 1654
    waiter: ospid: 4685 sid: 879 pid: 1034
  waiter: ospid: 29749 sid: 180 pid: 155
   holding: a154b7db8 'library cache' (157) level=5 child=4 kglupc: child
    waiter: ospid: 13104 sid: 281 pid: 220
    waiter: ospid: 24089 sid: 565 pid: 636
    waiter: ospid: 25002 sid: 621 pid: 1481
    waiter: ospid: 16930 sid: 1046 pid: 783

Direct SGA access program output for a 9.2.0.6 instance with a too-small
shared pool.


LATCH CLASSES

Classes with different spin and wait policies.
Up to 8 classes. By default all latches belong to class 0.
Exception: the process allocation latch belongs to class 2.
A latch can be assigned to a class by its number. For example:
_LATCH_CLASS_6='100 12 1000 2000 3000 4000'
_LATCH_CLASSES='4:6'
Will be discussed later.

SQL> select * from x$ksllclass;

INDX   SPIN  YIELD  WAITTIME  SLEEP0  SLEEP1  SLEEP2  SLEEP3  SLEEP4 ...
   0  20000      0         1    8000    8000    8000    8000    8000
   6    100     12      1000    2000    3000    4000    4000    4000


LATCH ACQUISITION IN NO-WAIT MODE

kslgetl(laddr,0,why,where)       - exclusive latch get
kslgetsl(laddr,0,why,where,mode) - shared latch get

One attempt to get the latch, no spin and no sleep:
sskgslgf() - uses atomic XCHG/LDSTUB instructions for exclusive latches
skgslocas() - uses atomic CMPXCHG (CAS) for shared latches

IMMEDIATE_... statistics are updated.
Used when many candidate latches are available, to bypass the
latch level rule, or as part of CBC chain examination.

sqlplus /nolog @demo_03.sql
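A no-wait get is a single atomic attempt. As a hedged sketch, a CMPXCHG-style get of a shared latch in X mode (using the pre-11g 0x20000000 | pid encoding from a later slide) might look like this; the function name is an invented analogue of skgslocas(), not the real routine:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <assert.h>

#define X_MODE_BIT 0x20000000u   /* "held exclusively" flag in the latch word */

/* One CAS, no spin, no sleep: succeeds only if the latch word is 0 (free).
 * On success the word becomes X_MODE_BIT | pid of the holder. */
static bool shared_latch_try_x(atomic_uint *latch, unsigned pid) {
    unsigned expected = 0;
    return atomic_compare_exchange_strong(latch, &expected, X_MODE_BIT | pid);
}
```

A failed attempt simply increments IMMEDIATE_MISSES and lets the caller move on to another candidate latch.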

LATCH WAITS


WAITING FOR THE LATCH

[Figure: Process A on CPU 1 holds a latch in the SGA; Process B on CPU 2
waits for it (spins and sleeps).]


LATCH ACQUISITION IN WAIT MODE. OFFICIAL VERSION

"If a required latch is busy, the process requesting it spins
(counts up to a large number (_LATCH_SPIN_COUNT)), tries
again and if still not available, spins again.
This loop is repeated up to _SPIN_COUNT maximum number of times.
If after that entire loop the latch is still not available, the process
must yield the CPU and go to sleep.
A sleep corresponds to a wait for the latch free wait event.
Initially, a process sleeps for 0.01 s. This time is doubled in every
subsequent sleep up to a maximum determined by the parameter
_MAX_EXPONENTIAL_SLEEP (0.2s or 2s)."

I still have not found any version that used this algorithm.
Maybe it was Oracle 6 or earlier.


LATCH ACQUISITION IN WAIT MODE. ANOTHER OFFICIAL VERSION, USED IN 7.3-8.1

Latch get with wait (kslgetl(laddress,1,...)):
One fast get, no spin (sskgslgf())
Spin get: check the latch up to _SPIN_COUNT times
Sleep on the latch free event with exponential backoff
Repeat
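The 7.3-8.1 loop above can be sketched in C. All names are illustrative analogues (latch_try stands in for the sskgslgf()-style fast get, backoff_sleep for the timed wait), not Oracle's code:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <assert.h>

typedef atomic_int latch_t;

static int sleep_count;                      /* counts 'latch free' waits */

static bool latch_try(latch_t *l) {          /* single atomic attempt */
    return atomic_exchange(l, 1) == 0;
}

static void backoff_sleep(int nwait) {       /* stand-in for the timed sleep */
    (void)nwait;                             /* real code grows the timeout */
    sleep_count++;
}

static void latch_get_wait(latch_t *l, int spin_count) {
    if (latch_try(l))                        /* one fast get, no spin */
        return;
    for (int nwait = 1; ; nwait++) {
        for (int i = 0; i < spin_count; i++) /* spin get */
            if (latch_try(l))
                return;
        backoff_sleep(nwait);                /* sleep, then repeat */
    }
}
```

Each backoff_sleep() call corresponds to one 'latch free' wait event in the 10046 trace on the next slide.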


8I LATCH SPIN CODE FLOW USING DTRACE

sqlplus /nolog @demo_04_8.sql 29

kslgetl(0x200058F8,1,2,3)         - KSL GET exclusive Latch# 29
 kslges(0x200058F8, ...)          - wait get of exclusive latch
  skgsltst(0x200058F8)            ... call repeated 2000 times = _SPIN_COUNT
  pollsys(...,timeout=10 ms,...)  - Sleep 1
  skgsltst(0x200058F8)            ... call repeated 2000 times
  pollsys(...,timeout=10 ms,...)  - Sleep 2
  skgsltst(0x200058F8)            ... call repeated 2000 times
  pollsys(...,timeout=10 ms,...)  - Sleep 3
  skgsltst(0x200058F8)            ... call repeated 2000 times
  pollsys(...,timeout=30 ms,...)  - Sleep 4

Event 10046 trace:
WAIT #0: nam='latch free' ela= 0 p1=536893688 p2=29 p3=0
WAIT #0: nam='latch free' ela= 0 p1=536893688 p2=29 p3=1
WAIT #0: nam='latch free' ela= 0 p1=536893688 p2=29 p3=2


POTENTIAL PROBLEMS WITH EXPONENTIAL BACKOFF

0.01-0.01-0.01-0.03-0.03-0.07-0.07-0.15-0.23-0.39-0.39-0.71-0.71-
1.35-1.35-2.0-2.0-2.0-2.0... sec

timeout ~ 2^[(N_wait+1)/2] - 1 csec

Most waits were for nothing - the latch was already free.
Latch utilization cannot be high.
Unnecessary spins provoke CPU thrashing.


9.2-11G EXCLUSIVE LATCH SPIN FLOW USING DTRACE

sqlplus /nolog @demo_05.sql 45

kslgetl(0x50006318, 1)
 -> sskgslgf(0x50006318)= 0           - immediate latch get
 -> kslges(0x50006318, ...)           - wait latch get
   -> skgslsgts(...,0x50006318, ...)  - spin latch get
     -> sskgslspin(0x50006318)        ... repeated 20000 cycles = 10*_SPIN_COUNT!
   -> kskthbwt(0x0)
   -> kslwlmod()                      - set up wait list
   -> sskgslgf(0x50006318)= 0         - immediate latch get
   -> skgpwwait                       - sleep latch get
     semop(11, {17,-1,0}, 1)          - semop infinite wait until posted!


RELIABLE LATCH WAITS

A hidden latch wait revolution that we missed: the latch relies on wait
posting.
In Oracle 9.2-11.2, all latches in the default class 0 use latch wait
posting. The process waits for the latch without any timeout.
An exclusive latch spins 20000 cycles by default.
Only the first read is atomic; further reads poll the CPU cache. The latch
is a TTS spinlock.
Latches assigned to a non-default class wait until timeout.
Version 9.0 was transient and used a mix of backoff and wait posting.
This is why the _LATCH_WAIT_POSTING parameter disappeared in 9i.
The _LATCH_CLASSES and _LATCH_CLASS_X parameters
determine the latch wait and spin.


IMAGINE THE TIME MICROSCOPE

Reality -> one-million-times zoomed time:
1 us  -> 1 sec
1 ms  -> 17 min
1 sec -> 11.5 days
Light speed (300000 km/sec) -> sonic speed (300 m/sec)

CPU tick (2 GHz)                                    -> 0.0005 sec
Avg cache buffers chains latch holding time         -> ~1-2 sec
Max spin time for shared latch (~2*2000*5 ticks)    -> ~10 sec
Avg library cache latch holding time (10-20 us)     -> ~10-20 sec
Max spin time for exclusive latch (~20000*5 ticks)  -> ~50 sec
Interval between CBC latch gets                     -> ~10 sec
OS context switch time (10 us - 1 ms)               -> 10 sec - 17 min


LATCH WAITS UNDER THE TIME MICROSCOPE

Old 8i latch algorithm:
Spin for the latch during 5 seconds.
Go to sleep for 2 hours 46 min in the hope that the congestion will dissolve.
Spin again during 5 seconds.
Sleep again for 2 hours 46 min.
...
Spin again for 5 seconds.
Sleep again for 23 days (2 sec).

New 9.2 latch algorithm:
Spin for a shared latch during 10 seconds (50 for an exclusive latch).
If the spin did not succeed, go to sleep until posted.
According to v$event_histogram, the majority of latch waits now take
less than 17 min (1 ms).


_SPIN_COUNT IS EFFECTIVELY STATIC FOR EXCLUSIVE LATCHES

An Oracle process spins for an exclusive latch up to x$ksllclass[0].spin
cycles. This number depends on the _SPIN_COUNT and
_LATCH_CLASS_0 parameters.
If _SPIN_COUNT was not set during instance startup, the process will
spin up to 20000 cycles. The default value of _SPIN_COUNT is only 2000.
If the instance started with the _SPIN_COUNT parameter, Oracle sets
x$ksllclass.spin to _SPIN_COUNT unless otherwise specified by the
_LATCH_CLASS_0 parameter.
The _SPIN_COUNT parameter can be changed dynamically, but this
change has no effect for exclusive latches.


TIMED LATCH WAIT POSTING

If the wakeup post is lost in the OS, waiters will sleep infinitely. A
common problem in early Linux 2.6.9 kernels.
This can lead to an instance hang - the process will never be woken up.
_ENABLE_RELIABLE_LATCH_WAITS = FALSE changes the semop()
system call to semtimedop() with a 0.3 sec timeout.
Introduced in 9.2.0.6 for Bug 3632400 "LMS may hang in kslges()
waiting for a post". Increased robustness to OS posting problems.
Generic. AIX ports always use timed latch wait posting.
My testcases did not reveal any measurable performance overhead of
timed latch wait posting on Solaris and Linux.


FINE GRAIN LATCH TUNING

A latch can be assigned to one of eight classes having different spin and
wait policies. Standard class 0 latches use wait posting.
_latch_class_X = "Spin Yield Waittime Sleep0 Sleep1 ... Sleep7"
A nonstandard-class latch loops up to Spin cycles, then yields the CPU.
This is repeated Yield times. Then the process sleeps for SleepX
microseconds using the pollsys() (not semtimedop()) system call:
If Yield != 0, repeat Yield times:
  Loop up to Spin cycles
  Yield the CPU using yield() (or sched_yield())
Sleep for SleepX usecs
Then spin again
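The spin-yield-sleep loop above can be sketched as follows. This is an instrumented toy, not Oracle's code: cpu_yield() and micro_sleep() stand in for yield()/sched_yield() and pollsys(), and the max_epochs bound exists only to keep the sketch finite:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <assert.h>

typedef atomic_int latch_t;

static int yield_count, sleep_count;                  /* instrumentation */

static bool latch_try(latch_t *l) { return atomic_exchange(l, 1) == 0; }
static void cpu_yield(void)       { yield_count++; }  /* yield() stand-in */
static void micro_sleep(int us)   { (void)us; sleep_count++; } /* pollsys() */

static bool class_get(latch_t *l, int spin, int yield,
                      const int sleep_us[8], int max_epochs) {
    for (int epoch = 0; epoch < max_epochs; epoch++) {
        for (int y = 0; y < (yield > 0 ? yield : 1); y++) {
            for (int i = 0; i < spin; i++)            /* loop up to Spin cycles */
                if (latch_try(l))
                    return true;
            if (yield > 0)
                cpu_yield();                          /* then yield the CPU */
        }
        micro_sleep(sleep_us[epoch < 8 ? epoch : 7]); /* SLEEP0..SLEEP7,
                                                         then reuse SLEEP7 */
    }
    return false;
}
```

With a busy latch and Yield=2, each epoch costs two spin rounds, two yields, and one sleep, matching the trace on the next slide.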


TRACES OF NONDEFAULT CLASS LATCH

_latch_class_6="100 2 3 10000 20000 30000 40000 50000 60000 70000 80000"
sqlplus /nolog @demo_04.sql 46

kslges(0x50006434, ...)          - wait get of exclusive latch
 sskgslspin(0x50006434)          ... repeated 100 times
 yield()                         - Yield 1
 sskgslspin(0x50006434)          ... repeated 100 times
 yield()                         - Yield 2
 sskgslspin(0x50006434)          ... repeated 100 times
 pollsys(...,timeout=10 ms,...)  - Sleep 1. End of spin and yield loop

Event 10046 trace:
WAIT #0: nam='latch free' ela= 10886 address= number=46 tries=1 ...
WAIT #0: nam='latch free' ela= 19922 address= number=46 tries=2 ...
WAIT #0: nam='latch free' ela= 30482 address= number=46 tries=3 ...
WAIT #0: nam='latch free' ela= 39937 address= number=46 tries=4 ...
...


PERFORMANCE OF LATCH CLASSES

["Row cache objects" latch contention testcase on ODA (24 SMT).]


IS THE _SPIN_COUNT REALLY STATIC?

Well-known Guy Harrison experiments [5] with Oracle 11g
demonstrated that dynamic _SPIN_COUNT tuning influenced latch
waits, CPU and throughput.
The experiments with dynamic change of _SPIN_COUNT used
cache buffers chains latch contention.
This latch became shared in Oracle 9.2.
An exclusive latch uses the static SPIN value from x$ksllclass
(_LATCH_CLASS_X). By default 20000.
Shared latch spin can be tuned by the _SPIN_COUNT value. By
default 2000. However, this is not trivial.


SHARED LATCHES BEHAVE LIKE ENQUEUES

An Oracle realization of read-write spinlocks. Appeared in 8.0.
S and X shared-latch modes are incompatible.
If the latch is held in S mode, an X-mode waiter blocks all further requests.
This blocking is achieved by a special 0x40000000 bit in the latch
value. This bit is a flag that some incompatible latch acquisition
is in progress.
Queued processes are woken up one by one on latch release.
X-mode latch gets effectively serialize the shared latch.
See my blog for a description of the experiments.
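The blocking behavior can be sketched with the 0x40000000 bit: an S-mode getter increments the holder count only while no incompatible (X) acquisition is pending. Purely illustrative; the function name and word layout are assumptions consistent with the slide, not the real implementation:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <assert.h>

#define BLOCKING_BIT 0x40000000u  /* "incompatible acquisition in progress" */

/* S-mode attempt: one more shared holder, unless an X waiter set the flag */
static bool shared_try_s(atomic_uint *latch) {
    unsigned v = atomic_load(latch);
    while (!(v & BLOCKING_BIT)) {
        if (atomic_compare_exchange_weak(latch, &v, v + 1))
            return true;                /* holder count incremented */
        /* CAS failure reloaded v; re-check the blocking bit */
    }
    return false;                       /* blocked by a pending X request */
}
```

Once the bit is set, new S getters fail and fall into the wait list, which is how a single X waiter serializes the shared latch.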


TRACE OF SHARED LATCH WAIT

An Oracle process spins for a shared latch twice. If the first spin up to
_SPIN_COUNT was unsuccessful, the process puts itself into the
latch wait list and then spins again:

kslgetsl(0x50009BAC,1,2,3,16)  - KSL GET Shared Latch in X mode
 kslgess(0x50009BAC, ...)      - shared latch wait get
  kslskgs(0x50009BAC, ...)
   sskgslspin(0x50009BAC)      ... repeated 2000 times
  kslwlmod(...)
  kslskgs(0x50009BAC, ...)
   sskgslspin(0x50009BAC)      ... repeated 2000 times
  skgpwwait(...)
   semop(27,...)               - sleep until posted

sqlplus /nolog @demo_05.sql 102


SHARED LATCH ACQUISITION

Shared latch spin in Oracle 9.2-11g is governed by the _SPIN_COUNT
value and can be dynamically tuned.
An X-mode shared latch get spins by default up to 4000 cycles.
An S-mode get does not spin at all (or spins in an unknown way).

                 S mode get    X mode get
Held in S mode   Compatible    2*_SPIN_COUNT
Held in X mode   No spin       2*_SPIN_COUNT
Blocking mode    No spin       2*_SPIN_COUNT


LATCH RELEASE

Free the latch: kslfre(laddr).
The Oracle process releases the latch nonatomically.
Then it sets up a memory barrier and performs an atomic operation on
an address individual to each process.
This requires less bus invalidation and ensures propagation of the latch
release to other local caches.
Not a fair policy: spinners on the local CPU board have the preference.
Then the process posts the first process in the list of waiters.

LATCH CONTENTION

LATCH CONTENTION

Occurs when a latch is required by several processes at the same time.
Mostly due to bugs or abnormal application behavior.
As utilization increases, latch waits and spins also increase.
Performance degradation can be extremely sharp.
To diagnose latch contention we need to investigate the latch
statistics [8].


DIFFERENTIAL (POINT IN TIME) LATCH STATISTICS

Latch requests arrival rate:   lambda = gets / time
Latch miss ratio:              rho    = misses / gets
Latch sleeps ratio:            kappa  = sleeps / misses
Latch wait time per second:    W      = wait_time / time
Latch spin efficiency:         sigma  = spin_gets / misses

Should be calculated for each child latch separately.
V$LATCH averaging distorts statistics.
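These ratios are straightforward to compute from two cumulative snapshots. A hedged sketch; the struct fields mirror v$latch column names, but the types and function are invented for the example:

```c
#include <assert.h>

typedef struct {
    double gets, misses, sleeps, spin_gets, wait_time;  /* cumulative counters */
} latch_snap;

typedef struct {
    double lambda, rho, kappa, sigma, W;
} latch_stats;

/* Differential statistics over the interval between snapshots a and b */
static latch_stats diff_stats(latch_snap a, latch_snap b, double seconds) {
    double gets   = b.gets   - a.gets;
    double misses = b.misses - a.misses;
    latch_stats s;
    s.lambda = gets / seconds;                        /* arrival rate */
    s.rho    = misses / gets;                         /* miss ratio */
    s.kappa  = (b.sleeps - a.sleeps) / misses;        /* sleeps per miss */
    s.sigma  = (b.spin_gets - a.spin_gets) / misses;  /* spin efficiency */
    s.W      = (b.wait_time - a.wait_time) / seconds; /* wait time per second */
    return s;
}
```

As the slide warns, this should be run against v$latch_children rows, not the averaged v$latch parent row.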


SAMPLED AVERAGES

By sampling x$ksupr we can estimate:
Length of the latch wait queue (Lw):
SELECT COUNT(1) FROM X$KSUPR WHERE KSLLAWAT=LADDRESS;
Number of spinning processes in 10g (Ns):
SELECT COUNT(1) FROM X$KSUPR WHERE KSLLALAQ=LADDRESS;
Sampling of v$latchholder estimates latch utilization.

Using DTrace we can measure:
Latch acquisition and holding times
Actual spin count


DERIVED LATCH STATISTICS

Latch utilization (PASTA):   U = latch_holding_time / time ~ rho
Average holding time:        S = U / lambda
                               = "Pct_Get_Miss" * "Snap_Time" / (100 * "Get_Requests")
Length of latch wait list:   L = W
Recurrent sleeps ratio:      kappa + sigma - 1
Latch acquisition time:      Taq = (1 / lambda) * (Ns + W)


LATCH STATISTICS EXAMPLE

Latch statistics for 0x380007358 "session allocation":
Requests rate:           lambda = 1350 Hz
Miss/get:                rho    = .022
Sampled utilization:     U      = .013
Slps/miss:               kappa  = .28
Wait_time/sec:           W      = .021
Sampled queue length:    Lw     = .017
Spin_gets/miss:          sigma  = .72
Sampled spinning procs:  Ns     = .013
Secondary sleeps ratio   = .002
Avg holding time         = 16.3 usec
Sleeping time            = 15.9 usec
Acquisition time         = 25.8 usec

Latch acquisition time distribution measured by DTrace (ns):

         --------- Distribution ---------
  2048  |
  4096  |@@@@@@
  8192  |@@@@@@@@
 16384  |@@@@@@@@@@@@@@@@@@@@@@@
 32768  |@@@
 65536  |

Average acquisition time = 21 usec

sqlplus /nolog @demo_06.sql


LATCH CONTENTION DIAGNOSTICS IN 9.2-11G

Should be suspected if latch wait events are observed in the Top 5
Timed Events AWR section.
Look for the latch with the highest W.
Symptoms of contention for the latch:
W > 1 sec/sec
Utilization > 10%
Acquisition (or sleeping) time significantly greater than holding time
The latchprofx.sql script invented by Tanel Poder [4] greatly simplifies
diagnostics.
The script and v$latch_misses reveal where the contention arises.


LATCH LEVELS AND LATCH CONTENTION

A process holding a latch can only acquire a latch with a higher level.
Contention for a high-level latch frequently exacerbates contention
for lower-level latches.
Library cache latch contention without many cursor versions is likely
a consequence of moderate shared pool latch contention.
Shared pool resizes, clearances and ORA-4031 dumps can induce
library cache latch storms.
Use the latch_tree.sql script to confirm this scenario.


BEWARE OF CERTAIN V$/X$


Frequent scans of some x$ tables induce latch contention:
x$ktcxb (v$transaction) - Transaction allocation latch
x$ktadm (v$lock, dba_jobs_running) "DML lock allocation"
x$kqlfxpl (v$sql_plan) "library cache" latch in 10g
x$ksmsp "shared pool" latch
x$kslltr (v$latch) all parent latches in 11g.


TREATING THE LATCH CONTENTION


The right method: tune the application and reduce the latch
demand. Tune the SQL, bind variables, schema, etc.
Many brilliant books exist on this topic. Out of scope for this
presentation.
Install the latest Oracle PSU and OS recommended patches. Several
latch related examples:
Oracle bug 7627743: higher latch waits in 11g. Processes in 11.1
always spun up to the maximum spin count value.
Linux 2.6.9 kernel bug 149933. "Missing wakeup in ipc/sem". Fixed
in Red Hat EL 4 Update 4
HP-UX 11.31. Should install scheduler cumulative patch
PHKL_38397 or later

WHEN WE NEED TO TUNE SPIN COUNT:


The right method of tuning the application may be too expensive and
require a complete application rewrite.
Nowadays CPU power is cheap. We may already have enough free
CPU resources.
Oracle does not explicitly forbid _SPIN_COUNT tuning. However, a change
of an undocumented parameter should be discussed with Support. For
example, MOS recommends _SPIN_COUNT=5000 for 10g Streams.
Processes spin up to 20000 cycles for an exclusive latch, up to 4000
cycles for a shared latch, and infinitely for a pre-11.2.0.2 mutex. Tuning
may find more optimal values for your application.
Under CPU thrashing, decreasing the number of spins can reduce CPU
consumption.


TUNING THE SPIN COUNT EFFICIENTLY


Beware of side effects. You should have enough free CPU.
First, diagnose the root cause of the latch contention. Measure the
latch statistics, install relevant patches.
Spin count tuning will only be effective if the latch holding time S
is in its normal microseconds range.
The number of spinning processes should remain less than the number
of CPUs. Beware of CPU thrashing. Analyze AWR and latch statistics
before and after each change.
Spin count adjustment is different for exclusive and shared latches.
Nonstandard latch classes may decrease CPU consumption.
It is a common myth that CPU time will rise indefinitely as we
increase the spin count.

SPIN COUNT ADJUSTMENT


Shared latches:
Dynamically by the _SPIN_COUNT parameter.
A good starting point is a multiple of the default value 2000.
Setting the _SPIN_COUNT parameter in the initialization file should be
accompanied by _LATCH_CLASS_0="20000". Otherwise spin for
exclusive latches will be greatly affected by the next instance restart.
Exclusive latches:
Statically by the _LATCH_CLASS_0 parameter. Requires an instance restart.
A good starting point is a multiple of the default value 20000.
It may be preferable to increase the number of "yields" for class 0 latches.
Spin count tuning probes the tail of the latch holding time distribution. See my
blog for the related mathematics.
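Assuming the defaults above (2000 for shared latches, "20000" for exclusive class 0 latches), a hypothetical initialization-file fragment illustrating the point about setting both parameters together; the doubled shared-latch value is a starting point to be validated against AWR, not a recommendation:

```
# Shared latches: dynamic, default 2000
_spin_count = 4000

# Exclusive (class 0) latches: static, default "20000", requires restart.
# Set it explicitly so that the _spin_count change above does not also
# redefine exclusive latch spin on the next instance restart.
_latch_class_0 = "20000"
```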

SPIN SCALING

My previous work [8] proposed approximate scaling rules to estimate the
effect of _SPIN_COUNT tuning depending on K = "sleeps ratio" = Avg(Slps/Miss):

If spin is inefficient, K ~ 0.1, then:
Doubling the spin count will reduce the "sleeps ratio" by 2 and double
the spin CPU consumption.

If the "sleeps ratio" for an exclusive latch is 10%, then an increase of
the spin count to 40000 may result in a 10 times decrease of "latch free"
wait events and only a 10% increase of CPU consumption.
Your mileage may vary.

If the latch holding time distribution has an exponential tail, you have
enough CPU resources and spin is efficient, K << 0.1, then:
Doubling the spin count will square the sleeps ratio coefficient.
This will multiply the spin CPU consumption by (1 + K).

If the spin is already efficient, it is worth increasing the spin count.
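The two rules of thumb on this slide can be put into a numeric sketch. This models the slide's approximations, not measured behavior; K is the sleeps ratio and cpu a relative spin CPU cost:

```python
def double_spin_inefficient(K, cpu):
    """Inefficient spin: doubling the spin count halves the sleeps
    ratio and doubles the spin CPU consumption."""
    return K / 2.0, cpu * 2.0

def double_spin_efficient(K, cpu):
    """Exponential holding-time tail, spare CPU: doubling the spin
    count squares the sleeps ratio and multiplies CPU by (1 + K)."""
    return K * K, cpu * (1.0 + K)

# Slide example: K = 10% with an exponential tail, spin 20000 -> 40000:
# sleeps ratio drops ~10x (to ~1%), spin CPU grows only ~10%.
print(double_spin_efficient(0.1, 1.0))
```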

CACHE BUFFER CHAINS LATCH CONTENTION TESTCASE

Latch wait and CPU times vs. _SPIN_COUNT



LONG LATCH HOLDING TIME: CPU THRASHING


Latch contention can cause CPU starvation. Processes contending
for a latch also contend for CPU.
Once CPU starves, the OS run queue length rises and the load average
exceeds the number of CPUs. Some OSes may shrink the time
quantum. Latch holders will not receive enough time to release the
latch.
Due to priority decay, latch acquirers may preempt latch holders.
This leads to priority inversion. The throughput falls.
Transition to this metastable state is more likely if the workload of
your system approaches ~100% CPU.
Due to preemption, the latch holding time S will rise to the CPU
scheduling timescale [7].

CPU THRASHING
If CPUs are running at 100% utilization for other reasons, latch
contention can be just a symptom of the CPU starvation itself.
All wait times will be inflated by the wait for CPU. However, the latch
holding time should remain in its normal range.
CPU thrashing is unlikely on Solaris due to latch preemption
control.
To prevent the priority inversion use:
FX priority class on Solaris
HPUX_SCHED_NOAGE on HPUX
SCHED_BATCH policy on Linux
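As an illustration of the Linux item above, a hedged Python sketch that tries to move the calling process under the SCHED_BATCH policy; os.SCHED_BATCH exists only on Linux, and the function degrades to a no-op elsewhere:

```python
import os

def set_batch_policy(pid=0):
    """Try to switch pid (0 = calling process) to SCHED_BATCH with
    priority 0; return True on success, False if unsupported/denied."""
    if not hasattr(os, "SCHED_BATCH"):   # non-Linux platform
        return False
    try:
        os.sched_setscheduler(pid, os.SCHED_BATCH, os.sched_param(0))
        return True
    except OSError:                      # includes PermissionError
        return False
```

In practice this would be applied to Oracle shadow processes at startup (or via chrt), so that spinning latch acquirers cannot out-prioritize latch holders whose priority has decayed.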


LATCH SMP SCALABILITY:

If latch utilization is U1 in a single CPU environment, then in an
N CPU server the latch utilization will be approximately N × U1.
This can be problematic:

If a single CPU system held latches for only 1% of the time,
a 48 CPU server with the same per-CPU load will hold latches for ~50% of
the time.
A 128 CPU cores server will suffer huge latch (and mutex) contention.
This is also known as "software lockout". It may substantially affect
contemporary multi-core servers.
NUMA should overcome these intrinsic spinlock scalability restrictions.
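The naive "software lockout" scaling above can be sketched as a toy model: with the same per-CPU load, total latch utilization grows linearly with N until it saturates at 100%:

```python
def scaled_utilization(u1, ncpu):
    """Approximate latch utilization on ncpu CPUs given per-CPU
    utilization u1, capped at 100% (the lockout regime)."""
    return min(1.0, ncpu * u1)

# 1% per-CPU holding time: fine on 1 CPU, ~48% on 48 CPUs,
# fully saturated (heavy contention) on 128 CPUs.
for n in (1, 48, 128):
    print(n, "CPUs:", scaled_utilization(0.01, n))
```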


BIBLIOGRAPHY

1. Steve Adams. Oracle8i Internal Services for Waits, Latches, Locks, and
   Memory. 1999.
2. Tanel Poder blog, http://blog.tanelpoder.com
3. Jonathan Lewis. Oracle Core: Essential Internals for DBAs and
   Developers. 2011.
4. Edsger Dijkstra. Solution of a Problem in Concurrent Programming
   Control. CACM, 1965.
5. M. Herlihy, N. Shavit. The Art of Multiprocessor Programming. 2008.
6. B. Sinharoy, et al. Improving Software MP Efficiency for Shared
   Memory Systems. Proc. of 29th Hawaii Conference on System Sciences,
   1996.
7. Guy Harrison blog. Using _spin_count to reduce latch contention in 11g.
   http://guyharrison.squarespace.com/, 2008.
8. R. Johnson, et al. A New Look at the Roles of Spinning and Blocking.
   Proc. of 5th Workshop on Data Management on New Hardware, 2009.
9. A. Nikolaev. Exploring Oracle RDBMS latches using Solaris DTrace.
   Proc. of MEDIAS 2011 Conf., http://arxiv.org/abs/1111.0594v1, 2011.

Q/A?
Questions?
Comments?


ACKNOWLEDGEMENTS
Thanks to RDTEX Technical Support Centre Director S.P.
Misiura for years of encouragement and support of my
investigations.
Thanks to my colleagues for discussions.
Thanks to all our customers for participating in latch
troubleshooting.


THANK YOU!

Andrey Nikolaev
http://andreynikolaev.wordpress.com
Andrey.Nikolaev@rdtex.ru
RDTEX, Moscow, Russia
www.rdtex.ru


IN MEMORY OF MY FRIEND OF 30 YEARS:

Andrey Kriushin
Chairman of Russian Oracle User Group (RuOUG)
Died of heart attack 02.08.2011
