
Latch internals

RUOUG Seminar, December 6, 2012


Andrey Nikolaev
Oracle Database Performance Expert
RDTEX, Russia

RDTEX | Latch internals | RUOUG Seminar

WHO AM I

Andrey.Nikolaev@rdtex.ru
RDTEX, First Line Support Center
http://andreynikolaev.wordpress.com
"Latch, mutex and beyond"
Specialize in Oracle performance tuning
Over 20 years of Oracle-related experience as a research
scientist, developer, DBA, performance consultant, and lecturer
Occasionally presented at conferences
RUOUG member


EPISODE OF LATCH CONTENTION


Oracle instance hangs due to heavy "cache buffers chains" latch contention


WHAT WE ALL KNEW ABOUT ORACLE LATCHES

Latches spin up to the instance parameter _SPIN_COUNT,
which defaults to 2000.
This number can be tuned dynamically.
Latch waits use increasing exponential backoff.
Latch waits are very short.
We should tune the application, not _SPIN_COUNT.


CONTEMPORARY LATCHES SINCE 9.2

I was surprised to find that:
The exclusive latch spins 10 times more.
Exponential backoff disappeared more than a decade ago,
in Oracle 9.2.
All latches except one now use wait posting.
A "latch free" wait may be infinite.
Sometimes it is worth tuning _SPIN_COUNT.
This parameter is static for most latches.


GREAT SOURCES OF LATCH INFO

Oracle8i Internal Services for Waits, Latches, Locks, and
Memory by Steve Adams [1], 1999.
Founded a new era of Oracle performance tuning.
However, Oracle latch technology has evolved greatly during
the decade since.

Systematic Latch Contention Troubleshooting approach by
Tanel Poder [2].
The latchprofx.sql script revolutionized latch tuning.

Oracle Core: Essential Internals for DBAs and Developers
by Jonathan Lewis [3], 2011.


REVIEW OF SERIALIZATION MECHANISMS IN ORACLE

"Latches are simple, low-level serialization mechanisms that coordinate
multiuser access to shared data structures, objects, and files."

"A mutual exclusion object (mutex) is a low-level mechanism that
prevents an object in memory from aging out or from being corrupted."

"Internal locks are higher-level, more complex mechanisms."

Oracle Database Concepts 11.2

KGX mutexes appeared in the latest Oracle versions inside the Library Cache.

                  Locks            Latches             Mutexes
Access            Several modes    Types and modes     Operations
Acquisition       FIFO             SIRO (spin) + FIFO  SIRO
Spinlock based    No               Yes                 Yes
Timescale         > milliseconds   microseconds        sub-microseconds
Lifecycle         Dynamic          Static              Dynamic

HOW SPINLOCK WORKS

[Figure: timeline of a spinlock location. CPU 1 holds the spinlock, then frees
it; CPU 2 gets it after the holding time (S) and holds it in turn; CPU 3
misses, spins by polling the spinlock location up to the spin count, then
sleeps and releases the CPU.]

Oracle latches:
Use an atomic hardware instruction for the immediate get.
If missed, the process spins by polling the latch location during the spin get.
The number of spin cycles is bounded by the spin count.
If the spin get does not succeed, the process acquiring the latch sleeps.

CLASSIC SPINLOCKS

Wiki: "a spinlock waits in a loop repeatedly checking until the lock
becomes available".
Introduced by Edsger Dijkstra [4] in 1965. Has been thoroughly
investigated since that time [5].
Many sophisticated spinlock realizations have been proposed and evaluated
(TS, TTS, Delay, MCS, Anderson, ...).
Two general types:
System spinlocks. Kernel OS threads cannot wait. Major metrics:
atomic operation frequency, shared bus utilization.
User spinlocks. Oracle latch and mutex. Average lock holding time
~10 us. It is more efficient to poll a lock than to pre-empt the thread
with a 1 ms context switch. Metrics: CPU and elapsed times.

THE NEED FOR SPIN

[Figure: time to complete a CBC latch contention testcase vs. number of
threads, with no spinning (_spin_count=0) and with _spin_count=2000.]

SPINLOCK REALIZATIONS

Spinlock   Pseudocode                        Problems                   Used by
TS         while(Test_and_Set(lock));        Bus saturation             pre-11.2 mutex
TTS        while(lock||Test_and_Set(lock));  Invalidation storms on     Oracle latch
                                             release (open door,
                                             thundering herds)
Delay      adjustable delay                  Slower under contention    Latch and 11.2 mutex
Anderson,                                    CPU and memory overhead,   Widely used in Java,
MCS, queues, etc.                            preemption issues          Linux kernel, etc.
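The TS and TTS pseudocode above can be sketched in portable C11 atomics. This is an illustrative sketch, not Oracle's internal code; all names are invented for the example:

```c
#include <stdatomic.h>
#include <assert.h>

typedef atomic_int spinlock_t;

/* TS: every probe is an atomic read-modify-write, hammering the bus */
static void ts_lock(spinlock_t *l) {
    while (atomic_exchange(l, 1))
        ;                                   /* spin on the atomic itself */
}

/* TTS: spin on a plain read (stays in the local cache line) and attempt
 * the atomic exchange only when the lock looks free */
static void tts_lock(spinlock_t *l) {
    for (;;) {
        while (atomic_load_explicit(l, memory_order_relaxed))
            ;                               /* cheap local polling */
        if (!atomic_exchange(l, 1))
            return;                         /* won the race */
    }
}

static void spin_unlock(spinlock_t *l) {
    atomic_store(l, 0);
}
```

Under contention, TTS keeps most probes inside each CPU's cache, which is why the table attributes bus saturation to TS but only release-time invalidation storms to TTS.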

THE TOOLS

DTRACE. SOLARIS DYNAMIC TRACING FRAMEWORK

DTrace allows us to investigate how Oracle latches perform in real time:
Create triggers on any event inside Solaris and any function call inside Oracle:
provider:module:function:name
pid1910:oracle:kslgetl:entry
Write trigger body actions. May read and change any address location.
Can count the latch spins, trace the latch waits, perform experiments.
Measure times and distributions with microsecond precision.
sqlplus /nolog @demo_00


ORADEBUG. INTERNAL ORACLE DEBUGGER


oradebug call. Allows us to invoke any internal Oracle function manually.
SQL> oradebug call kslgetl 0x5001A020 1 2 3
Function returned 1
oradebug peek. Examines contents of any memory location.
SQL> oradebug peek 200222A0 24
[200222A0, 200222B8) = 00000016 00000001 000001D0 00000007
oradebug poke. Modifies memory.
oradebug watch. Watch a region of memory.


LATCH CONTENTION TESTCASES

Exclusive and shared latches behave differently.
Contention for the shared "cache buffers chains" latch arises when many
sessions scan the same block concurrently and _fastpin_enable=0.

create table latch_contention as select * from dual;

declare
  a varchar2(1);
begin
  for i in 1..1000000 loop
    select dummy into a from latch_contention;
  end loop;
end;
/

To induce contention for the exclusive "row cache objects" latch protecting
the dc_users row cache, just add several dynamic RLS policies:
exec dbms_rls.add_policy('system','latch_contention','rls_1',
exec dbms_rls.add_policy('system','latch_contention','rls_2',

These dummy policies generate '1=1' predicates.
Many other contention scenarios are possible.
sqlplus /nolog @demo_0_1

ROW CACHE LATCH CONTENTION TESTCASE

[Figure: elapsed and CPU times vs. number of threads, with the latch wait
component marked. Oracle Database Appliance (24 SMT).]

HOW ORACLE REQUESTS THE LATCH

DTRACE REVEALS LATCH INTERFACE ROUTINES

Oracle calls the following functions to acquire the latch:
kslgetl(laddr, wait, why, where)         - get exclusive latch
kslg2c(l1, l2, trc, why, where)          - get two exclusive child latches
kslgetsl(laddr, wait, why, where, mode)  - get shared latch
                                           (11g: ksl_get_shared_latch())
kslg2cs(l1, l2, mode, trc, why, where)   - get two shared child latches
kslgpl(laddr, comment, why, where)       - get parent and all child latches
kslfre(laddr)                            - free the latch

Oracle gives us the possibility to do the same by oradebug call.
Use 32-bit ports: 64-bit Oracle corrupts the arguments of oradebug call
in 10.2.0.4 and above!

sqlplus /nolog @demo_01


WHERE AND WHY ORACLE GETS THE LATCH

To request the latch, Oracle kernel routines need:
laddr - address of the latch in the SGA
wait - flag for no-wait (0) or wait (1) latch acquisition
where - integer code for the location from which the latch is acquired.
Oracle lists possible latch where values in x$ksllw.indx.
why - context of why the latch is being acquired at this where. It
contains the DBA for CBC latches (Tanel Poder), the SGA chunk address
for the shared pool latch, the session address, etc. The meaning of why
for some where values can be found in x$ksllw.ksllwlbl.
mode - requested state for shared latches:
8 - SHARED mode, 16 - EXCLUSIVE mode


LIST OF LATCH WHERE LOCATIONS

X$KSLLW - list of latch [W]here locations. Part of v$latch_misses:
ksllwnam - kernel function name (kghupr1, ksqdel, etc.)
ksllwlbl - description of the why context
Join this code on indx to:
x$ksuprlat.ksulawhr (v$latchholder) - where value used by the latch holder
x$ksllt.kslltwhr (v$latch) - where used by the last successful latch
acquisition
x$ksupr.ksllawer (v$process) - where used by a process that is waiting
for the latch
x$kslwsc.indx (v$latch_misses) - latch sleep statistics by where


V$LATCH_MISSES

Based on X$KSLWSC - No[W]ait and [S]leep [C]ount statistics by
context. The documentation is rather obscure.
The following counters are incremented on each latch sleep:
sleep_count (where the latch was held)
wtr_slp_count (where the latch get was requested)
nwfail_count is incremented only for the session whose SID equals
the _LATCH_MISS_STAT_SID parameter.
longhold_count is incremented for shared latches only, when
someone held the latch at this where for the entire duration of
the sleep.


A LATCH IS HELD BY A PROCESS, NOT A SESSION

Process fixed array: v$process -> x$ksupr

struct ksupr {
  ...
  struct kslla {
    ksllt *ksllalat[14];
    ...
  }
}

List of all latches: v$latch -> x$ksllt

struct ksllt { ... }

Each process has an array of references to the latches it is holding.


KSLLA IS HIDDEN IN V$PROCESS

The process latching info is the kslla structure embedded in the
process state object.
Some kslla fields are externalized to SQL.
kslla contains an array of latches held by the process at each level and
information about the state of any current attempt to acquire a
latch. Not externalized directly.
When you query v$latchholder/x$ksuprlat, Oracle scans
v$process/x$ksupr in order to find all the processes that are
holding latches.
Tanel Poder uses high-frequency sampling of v$latchholder to
systematically troubleshoot latch contention.


X$KSUPR.KSLLA FIELDS INSTRUMENT THE LATCH GET

ksllalaq - address of the latch being acquired. Populated during
immediate get (and spin before 11g)
ksllawat - latch being waited for. This is v$process.latchwait
ksllawhy - why for the latch being waited for
ksllawere - where for the latch being waited for
ksllalow - bit array of levels of currently held latches
ksllaspn - latch this process is spinning on. v$process.latchspin.
Not populated since 8.1
ksllaps% - inter-process post statistics
The generic latch free and the 27 latch-specific "latch free: %" wait
events are associated with sleeps only. Spin is invisible to the Oracle
Wait Interface.


THE LATCH STRUCTURE KSLLT

struct ksllt {
  <Latch>
  where and why
  level, latch#, class, other attributes
  statistics
  latch wait list header
  ...
}


LATCH SIZE BY VERSION

x$ksmfsv - list of all fixed SGA variables:
SELECT DISTINCT ksmfssiz
  FROM x$ksmfsv
 WHERE ksmfstyp = 'ksllt';

Version             *nix 32bit   *nix 64bit   Windows 32bit
7.3.4               92                        120
8.0.6               104                       104
8.1.7               104          144          104
9.0.1                            200          160
9.2.0               196          240          200
10.1.0                           256          208
10.2.0 - 11.2.0.3   100          160          104

The latch structure was bigger in 10.1 due to additional latch statistics.


ORACLE LATCH IS NOT JUST A SINGLE MEMORY LOCATION

Before 11g, the value of the first latch byte (word for shared latches)
was used to determine the latch state:
0x00 - latch is free
0xFF - exclusive latch is busy. Was 0x01 in Oracle 7
0x01, 0x02, ... - shared latch held by 1, 2, ... processes simultaneously
0x20000000 | pid - shared latch held exclusively

In 11g the first latch word shows the pid of the latch holder:
0x00 - latch is free
0x12 - Oracle process with pid 18 holds the exclusive latch


BUSY ORACLE 11G LATCH IN MEMORY

11.2 exclusive latch (32-bit platform):
SQL> oradebug peek 200222A0 24
[200222A0, 200222B8) = 00000016 00000001 000001D0 00000007
                           ^pid     gets    latch#    level#

11.2 shared latch, EXCLUSIVE mode:
SQL> oradebug peek 0x&laddr 24
[2000C814, 2000C82C) = 2000001F 0000007C 0000007E 00000006
                           ^pid     gets    latch#    level#

cont. sqlplus /nolog @demo_01


LATCH FIXED TABLES

Before Oracle 11g all latches were externalized to SQL in
v$latch (x$ksllt).
In 11g parent (fixed SGA) latches are externalized in
v$latch_parent (x$kslltr_parent).
In 11g child (allocated on startup) latches are externalized in
v$latch_children (x$kslltr_children).
Oracle 11g v$latch (x$kslltr) merges these data. Beware: select
from x$kslltr acquires all parent latches and their children.
Only part of the latch structure is visible in the x$ksllt fixed tables:
kslltwhr, kslltwhy - where and why the latch was acquired last time
kslltwgt, ... - statistics
kslltlvl, ... - attributes
kslltwtt - latch wait time


CUMULATIVE V$LATCH STATISTIC COUNTERS

Statistic          x$ksllt    When and how it increments
GETS               kslltwgt   ++ after a wait-mode latch get
MISSES             kslltwff   ++ after a wait get if it was missed
SLEEPS             kslltwsl   + number_of_sleeps during the get
SPIN_GETS          ksllthst0  ++ if the get was missed but did not sleep
WAIT_TIME          kslltwtt   + wait_time after the latch get
IMMEDIATE_GETS     kslltngt   ++ after a nowait-mode latch get;
                              not incremented on miss
IMMEDIATE_MISSES   kslltnfa   ++ if a nowait-mode get was missed


MANY STATISTICS WERE LOST IN 10.2

In 10.2 Oracle made many useful latch statistics obsolete.
The latch size was decreased, and the following fields now always
contain 0 in v$latch:
v$latch.waiters_woken
v$latch.waits_holding_latch
v$latch.sleep1 .. sleep10
and many undocumented statistics.
This was mentioned in the 10.2 Reference.


X$KSLLD - LATCH DESCRIPTORS

Flags and attributes describing each latch are stored in the kslldt array
of kslld structures.
[K]ernel [S]ervice [L]atch [L]ock [D]escriptor.
Some, but not all, kslld fields are externalized via x$kslld and
v$latchname.
Located in the PGA.


LATCH ATTRIBUTES

Each latch has at least the following attributes in kslldt:
Name - latch name as it appears in V$ views
SHR - is the latch Exclusive or Shared?
PAR - is the latch Solitary or a Parent for a family of child latches?
G2C - can two child latches be simultaneously requested in wait mode
using kslg2c()?
LNG - is wait posting used for this latch? Obsolete since Oracle 9.2
UFS - the latch is Ultrafast: it does not increment miss statistics when
STATISTICS_LEVEL=BASIC. 10.2 and above
Level - 0-14. Used to prevent latch deadlocks
Class - 0-7. Spin and wait class assigned to the latch. 9.2 and above


LATCH ATTRIBUTES BY ORACLE VERSION

Oracle     Number of latches  PAR  G2C  SHARED
7.3.4.0    53                 14
8.0.6.3    80                 21
8.1.7.4    152                48        19
9.2.0.8    242                79   37   19
10.2.0.2   385                114  55   47
10.2.0.3   388                117  58   48
10.2.0.4   394                117  59   50
11.1.0.6   496                145  67   81
11.1.0.7   502                145  67   83
11.2.0.1   535                149  70   86
11.2.0.2   551                153  72   91
11.2.0.3   553                154  72   93


LATCH LEVELS

Every latch has a level in the range 0-14 to avoid deadlocks.
A process can request latch X in wait mode after obtaining latch Y if and
only if level X > level Y.
At the same level: a process can request a second G2C child X in wait
mode after obtaining child Y if and only if the child number of X < child
number of Y.
If these rules are broken, the Oracle process raises:
ORA-600[504] - trying to obtain another latch at an incompatible level. Note 28104.1.
ORA-600[525] - trying to obtain another latch at the same level. Note 138888.1.
ORA-600[526] - latch child order violation. Note 138889.1.
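The level rule can be expressed as a tiny predicate over a bit array of held levels, the same idea as the ksllalow bit array in x$ksupr. This is an illustrative sketch; the function name and encoding are assumptions, not Oracle's code:

```c
#include <stdbool.h>
#include <assert.h>

/* held_levels: bit i set means the process holds some latch at level i.
 * A new wait-mode request is legal only if every held level is strictly
 * below the requested level. */
static bool may_request(unsigned held_levels, int new_level) {
    /* any bit at position >= new_level violates the rising-level rule */
    return (held_levels >> new_level) == 0;
}
```

A violation of this predicate corresponds to the ORA-600[504]/[525] checks above.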


PARENT AND CHILD LATCHES

A latch can be a parent, a child, or a solitary latch.
Both parent and child latches share the same latch name.
Parent latches are static objects in the fixed SGA. See x$ksmfsv and ksms.s.
Child latches are allocated during instance startup.
The parent latch can be gotten independently.
But it may act as a master latch when acquired in a special mode in kslgpl()
during deadlock detection or select from v$latch/x$kslltr and some x$'s:

(latch info) wait_event=0 bits=10
holding 5000a5b8 Parent+children enqueue hash chains level=4
Location from where latch is held: ksqcmi: kslgpl:
Context saved from call: 0
state=busy


LATCH TREES

The rising-level rule leads to trees of processes waiting for and
holding latches:

ospid: 28067 sid: 1677 pid: 61
 holding: 3800729f0 'shared pool' (156) level=7 child=1 whr=1602 kghupr1
  waiter: ospid: 129 sid: 72 pid: 45
   holding: a154b7120 'library cache' (157) level=5 child=17 kglupc: child
    waiter: ospid: 18255 sid: 65 pid: 930
    waiter: ospid: 6690 sid: 554 pid: 1654
    waiter: ospid: 4685 sid: 879 pid: 1034
  waiter: ospid: 29749 sid: 180 pid: 155
   holding: a154b7db8 'library cache' (157) level=5 child=4 kglupc: child
    waiter: ospid: 13104 sid: 281 pid: 220
    waiter: ospid: 24089 sid: 565 pid: 636
    waiter: ospid: 25002 sid: 621 pid: 1481
    waiter: ospid: 16930 sid: 1046 pid: 783

Direct SGA access program output for a 9.2.0.6 instance with a too-small
shared pool.


LATCH CLASSES

Classes with different spin and wait policies.
Up to 8 classes. By default all latches belong to class 0.
Exception: the process allocation latch belongs to class 2.
A latch can be assigned to a class by its number. For example:
_LATCH_CLASS_6='100 12 1000 2000 3000 4000'
_LATCH_CLASSES='4:6'
Will be discussed later.

SQL> select * from x$ksllclass;

INDX   SPIN  YIELD  WAITTIME  SLEEP0  SLEEP1  SLEEP2  SLEEP3  SLEEP4 ...
   0  20000      0         1    8000    8000    8000    8000    8000
   6    100     12      1000    2000    3000    4000    4000    4000


LATCH ACQUISITION IN NO-WAIT MODE

kslgetl(laddr,0,why,where)       - exclusive latch get
kslgetsl(laddr,0,why,where,mode) - shared latch get

One attempt to get the latch, no spin and no sleep:
sskgslgf() - uses atomic XCHG/LDSTUB instructions for exclusive latches
skgslocas() - uses atomic CMPXCHG (CAS) for shared latches

IMMEDIATE_... statistics are updated.
Used when many candidate latches are available, to bypass the
latch level rule, or as part of CBC chain examination.

sqlplus /nolog @demo_03.sql
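A no-wait get is a single atomic attempt. As a hedged sketch, a CMPXCHG-style get of a shared latch in X mode (using the pre-11g 0x20000000 | pid encoding from a later slide) might look like this; the function name is an invented analogue of skgslocas(), not the real routine:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <assert.h>

#define X_MODE_BIT 0x20000000u   /* "held exclusively" flag in the latch word */

/* One CAS, no spin, no sleep: succeeds only if the latch word is 0 (free).
 * On success the word becomes X_MODE_BIT | pid of the holder. */
static bool shared_latch_try_x(atomic_uint *latch, unsigned pid) {
    unsigned expected = 0;
    return atomic_compare_exchange_strong(latch, &expected, X_MODE_BIT | pid);
}
```

A failed attempt simply increments IMMEDIATE_MISSES and lets the caller move on to another candidate latch.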

LATCH WAITS


WAITING FOR THE LATCH

[Figure: Process A on CPU 1 holds a latch in the SGA; Process B on CPU 2
waits for it (spins and sleeps).]


LATCH ACQUISITION IN WAIT MODE. OFFICIAL VERSION

"If a required latch is busy, the process requesting it spins
(counts up to a large number (_LATCH_SPIN_COUNT)), tries
again and if still not available, spins again.
This loop is repeated up to _SPIN_COUNT maximum number of times.
If after that entire loop the latch is still not available, the process
must yield the CPU and go to sleep.
A sleep corresponds to a wait for the latch free wait event.
Initially, a process sleeps for 0.01 s. This time is doubled in every
subsequent sleep up to a maximum determined by the parameter
_MAX_EXPONENTIAL_SLEEP (0.2s or 2s)."

I still have not found any version that used this algorithm.
Maybe it was Oracle 6 or earlier.


LATCH ACQUISITION IN WAIT MODE. ANOTHER OFFICIAL VERSION, USED IN 7.3-8.1

Latch get with wait (kslgetl(laddress,1,...)):
One fast get, no spin (sskgslgf())
Spin get: check the latch up to _SPIN_COUNT times
Sleep on the latch free event with exponential backoff
Repeat
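The 7.3-8.1 loop above can be sketched in C. All names are illustrative analogues (latch_try stands in for the sskgslgf()-style fast get, backoff_sleep for the timed wait), not Oracle's code:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <assert.h>

typedef atomic_int latch_t;

static int sleep_count;                      /* counts 'latch free' waits */

static bool latch_try(latch_t *l) {          /* single atomic attempt */
    return atomic_exchange(l, 1) == 0;
}

static void backoff_sleep(int nwait) {       /* stand-in for the timed sleep */
    (void)nwait;                             /* real code grows the timeout */
    sleep_count++;
}

static void latch_get_wait(latch_t *l, int spin_count) {
    if (latch_try(l))                        /* one fast get, no spin */
        return;
    for (int nwait = 1; ; nwait++) {
        for (int i = 0; i < spin_count; i++) /* spin get */
            if (latch_try(l))
                return;
        backoff_sleep(nwait);                /* sleep, then repeat */
    }
}
```

Each backoff_sleep() call corresponds to one 'latch free' wait event in the 10046 trace on the next slide.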


8I LATCH SPIN CODE FLOW USING DTRACE

sqlplus /nolog @demo_04_8.sql 29

kslgetl(0x200058F8,1,2,3)         - KSL GET exclusive Latch# 29
 kslges(0x200058F8, ...)          - wait get of exclusive latch
  skgsltst(0x200058F8)            ... call repeated 2000 times = _SPIN_COUNT
  pollsys(...,timeout=10 ms,...)  - Sleep 1
  skgsltst(0x200058F8)            ... call repeated 2000 times
  pollsys(...,timeout=10 ms,...)  - Sleep 2
  skgsltst(0x200058F8)            ... call repeated 2000 times
  pollsys(...,timeout=10 ms,...)  - Sleep 3
  skgsltst(0x200058F8)            ... call repeated 2000 times
  pollsys(...,timeout=30 ms,...)  - Sleep 4

Event 10046 trace:
WAIT #0: nam='latch free' ela= 0 p1=536893688 p2=29 p3=0
WAIT #0: nam='latch free' ela= 0 p1=536893688 p2=29 p3=1
WAIT #0: nam='latch free' ela= 0 p1=536893688 p2=29 p3=2


POTENTIAL PROBLEMS WITH EXPONENTIAL BACKOFF

0.01-0.01-0.01-0.03-0.03-0.07-0.07-0.15-0.23-0.39-0.39-0.71-0.71-
1.35-1.35-2.0-2.0-2.0-2.0... sec

timeout ~ 2^[(N_wait+1)/2] - 1 csec

Most waits were for nothing - the latch was already free.
Latch utilization cannot be high.
Unnecessary spins provoke CPU thrashing.


9.2-11G EXCLUSIVE LATCH SPIN FLOW USING DTRACE

sqlplus /nolog @demo_05.sql 45

kslgetl(0x50006318, 1)
 -> sskgslgf(0x50006318)= 0           - immediate latch get
 -> kslges(0x50006318, ...)           - wait latch get
   -> skgslsgts(...,0x50006318, ...)  - spin latch get
     -> sskgslspin(0x50006318)        ... repeated 20000 cycles = 10*_SPIN_COUNT!
   -> kskthbwt(0x0)
   -> kslwlmod()                      - set up wait list
   -> sskgslgf(0x50006318)= 0         - immediate latch get
   -> skgpwwait                       - sleep latch get
     semop(11, {17,-1,0}, 1)          - semop infinite wait until posted!


RELIABLE LATCH WAITS

A hidden latch wait revolution that we missed: the latch relies on wait
posting.
In Oracle 9.2-11.2, all latches in the default class 0 use latch wait
posting. The process waits for the latch without any timeout.
An exclusive latch spins 20000 cycles by default.
Only the first read is atomic; further reads poll the CPU cache. The latch
is a TTS spinlock.
Latches assigned to a non-default class wait until timeout.
Version 9.0 was transient and used a mix of backoff and wait posting.
This is why the _LATCH_WAIT_POSTING parameter disappeared in 9i.
The _LATCH_CLASSES and _LATCH_CLASS_X parameters
determine the latch wait and spin.


IMAGINE THE TIME MICROSCOPE

Reality -> one-million-times zoomed time:
1 us  -> 1 sec
1 ms  -> 17 min
1 sec -> 11.5 days
Light speed (300000 km/sec) -> sonic speed (300 m/sec)

CPU tick (2 GHz)                                    -> 0.0005 sec
Avg cache buffers chains latch holding time         -> ~1-2 sec
Max spin time for shared latch (~2*2000*5 ticks)    -> ~10 sec
Avg library cache latch holding time (10-20 us)     -> ~10-20 sec
Max spin time for exclusive latch (~20000*5 ticks)  -> ~50 sec
Interval between CBC latch gets                     -> ~10 sec
OS context switch time (10 us - 1 ms)               -> 10 sec - 17 min


LATCH WAITS UNDER THE TIME MICROSCOPE

Old 8i latch algorithm:
Spin for the latch during 5 seconds.
Go to sleep for 2 hours 46 min in the hope that the congestion will dissolve.
Spin again during 5 seconds.
Sleep again for 2 hours 46 min.
...
Spin again for 5 seconds.
Sleep again for 23 days (2 sec).

New 9.2 latch algorithm:
Spin for a shared latch during 10 seconds (50 for an exclusive latch).
If the spin did not succeed, go to sleep until posted.
According to v$event_histogram, the majority of latch waits now take
less than 17 min (1 ms).


_SPIN_COUNT IS EFFECTIVELY STATIC FOR EXCLUSIVE LATCHES

An Oracle process spins for an exclusive latch up to x$ksllclass[0].spin
cycles. This number depends on the _SPIN_COUNT and
_LATCH_CLASS_0 parameters.
If _SPIN_COUNT was not set during instance startup, the process will
spin up to 20000 cycles. The default value of _SPIN_COUNT is only 2000.
If the instance started with the _SPIN_COUNT parameter, Oracle sets
x$ksllclass.spin to _SPIN_COUNT unless otherwise specified by the
_LATCH_CLASS_0 parameter.
The _SPIN_COUNT parameter can be changed dynamically, but this
change has no effect for exclusive latches.


TIMED LATCH WAIT POSTING

If the wakeup post is lost in the OS, waiters will sleep infinitely. A
common problem in early Linux 2.6.9 kernels.
This can lead to an instance hang - the process will never be woken up.
_ENABLE_RELIABLE_LATCH_WAITS = FALSE changes the semop()
system call to semtimedop() with a 0.3 sec timeout.
Introduced in 9.2.0.6 for Bug 3632400 "LMS may hang in kslges()
waiting for a post". Increased robustness to OS posting problems.
Generic. AIX ports always use timed latch wait posting.
My testcases did not reveal any measurable performance overhead of
timed latch wait posting on Solaris and Linux.


FINE GRAIN LATCH TUNING

A latch can be assigned to one of eight classes having different spin and
wait policies. Standard class 0 latches use wait posting.
_latch_class_X = "Spin Yield Waittime Sleep0 Sleep1 ... Sleep7"
A nonstandard-class latch loops up to Spin cycles, then yields the CPU.
This is repeated Yield times. Then the process sleeps for SleepX
microseconds using the pollsys() (not semtimedop()) system call:
If Yield != 0, repeat Yield times:
  Loop up to Spin cycles
  Yield the CPU using yield() (or sched_yield())
Sleep for SleepX usecs
Then spin again
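The spin-yield-sleep loop above can be sketched as follows. This is an instrumented toy, not Oracle's code: cpu_yield() and micro_sleep() stand in for yield()/sched_yield() and pollsys(), and the max_epochs bound exists only to keep the sketch finite:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <assert.h>

typedef atomic_int latch_t;

static int yield_count, sleep_count;                  /* instrumentation */

static bool latch_try(latch_t *l) { return atomic_exchange(l, 1) == 0; }
static void cpu_yield(void)       { yield_count++; }  /* yield() stand-in */
static void micro_sleep(int us)   { (void)us; sleep_count++; } /* pollsys() */

static bool class_get(latch_t *l, int spin, int yield,
                      const int sleep_us[8], int max_epochs) {
    for (int epoch = 0; epoch < max_epochs; epoch++) {
        for (int y = 0; y < (yield > 0 ? yield : 1); y++) {
            for (int i = 0; i < spin; i++)            /* loop up to Spin cycles */
                if (latch_try(l))
                    return true;
            if (yield > 0)
                cpu_yield();                          /* then yield the CPU */
        }
        micro_sleep(sleep_us[epoch < 8 ? epoch : 7]); /* SLEEP0..SLEEP7,
                                                         then reuse SLEEP7 */
    }
    return false;
}
```

With a busy latch and Yield=2, each epoch costs two spin rounds, two yields, and one sleep, matching the trace on the next slide.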


TRACES OF NONDEFAULT CLASS LATCH

_latch_class_6="100 2 3 10000 20000 30000 40000 50000 60000 70000 80000"
sqlplus /nolog @demo_04.sql 46

kslges(0x50006434, ...)          - wait get of exclusive latch
 sskgslspin(0x50006434)          ... repeated 100 times
 yield()                         - Yield 1
 sskgslspin(0x50006434)          ... repeated 100 times
 yield()                         - Yield 2
 sskgslspin(0x50006434)          ... repeated 100 times
 pollsys(...,timeout=10 ms,...)  - Sleep 1. End of spin and yield loop

Event 10046 trace:
WAIT #0: nam='latch free' ela= 10886 address= number=46 tries=1 ...
WAIT #0: nam='latch free' ela= 19922 address= number=46 tries=2 ...
WAIT #0: nam='latch free' ela= 30482 address= number=46 tries=3 ...
WAIT #0: nam='latch free' ela= 39937 address= number=46 tries=4 ...
...


PERFORMANCE OF LATCH CLASSES

["Row cache objects" latch contention testcase on ODA (24 SMT).]


IS THE _SPIN_COUNT REALLY STATIC?

Well-known Guy Harrison experiments [5] with Oracle 11g
demonstrated that dynamic _SPIN_COUNT tuning influenced latch
waits, CPU and throughput.
The experiments with dynamic change of _SPIN_COUNT used
cache buffers chains latch contention.
This latch became shared in Oracle 9.2.
An exclusive latch uses the static SPIN value from x$ksllclass
(_LATCH_CLASS_X). By default 20000.
Shared latch spin can be tuned by the _SPIN_COUNT value. By
default 2000. However, this is not trivial.


SHARED LATCHES BEHAVE LIKE ENQUEUES

An Oracle realization of read-write spinlocks. Appeared in 8.0.
S and X shared-latch modes are incompatible.
If the latch is held in S mode, an X-mode waiter blocks all further requests.
This blocking is achieved by a special 0x40000000 bit in the latch
value. This bit is a flag that some incompatible latch acquisition
is in progress.
Queued processes are woken up one by one on latch release.
X-mode latch gets effectively serialize the shared latch.
See my blog for a description of the experiments.
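The blocking behavior can be sketched with the 0x40000000 bit: an S-mode getter increments the holder count only while no incompatible (X) acquisition is pending. Purely illustrative; the function name and word layout are assumptions consistent with the slide, not the real implementation:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <assert.h>

#define BLOCKING_BIT 0x40000000u  /* "incompatible acquisition in progress" */

/* S-mode attempt: one more shared holder, unless an X waiter set the flag */
static bool shared_try_s(atomic_uint *latch) {
    unsigned v = atomic_load(latch);
    while (!(v & BLOCKING_BIT)) {
        if (atomic_compare_exchange_weak(latch, &v, v + 1))
            return true;                /* holder count incremented */
        /* CAS failure reloaded v; re-check the blocking bit */
    }
    return false;                       /* blocked by a pending X request */
}
```

Once the bit is set, new S getters fail and fall into the wait list, which is how a single X waiter serializes the shared latch.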


TRACE OF SHARED LATCH WAIT

An Oracle process spins for a shared latch twice. If the first spin up to
_SPIN_COUNT was unsuccessful, the process puts itself into the
latch wait list and then spins again:

kslgetsl(0x50009BAC,1,2,3,16)  - KSL GET Shared Latch in X mode
 kslgess(0x50009BAC, ...)      - shared latch wait get
  kslskgs(0x50009BAC, ...)
   sskgslspin(0x50009BAC)      ... repeated 2000 times
  kslwlmod(...)
  kslskgs(0x50009BAC, ...)
   sskgslspin(0x50009BAC)      ... repeated 2000 times
  skgpwwait(...)
   semop(27,...)               - sleep until posted

sqlplus /nolog @demo_05.sql 102


SHARED LATCH ACQUISITION

Shared latch spin in Oracle 9.2-11g is governed by the _SPIN_COUNT
value and can be dynamically tuned.
An X-mode shared latch get spins by default up to 4000 cycles.
An S-mode get does not spin at all (or spins in an unknown way).

                 S mode get    X mode get
Held in S mode   Compatible    2*_SPIN_COUNT
Held in X mode   No spin       2*_SPIN_COUNT
Blocking mode    No spin       2*_SPIN_COUNT


LATCH RELEASE

Free the latch: kslfre(laddr).
The Oracle process releases the latch nonatomically.
Then it sets up a memory barrier and performs an atomic operation on
an address individual to each process.
This requires less bus invalidation and ensures propagation of the latch
release to other local caches.
Not a fair policy: spinners on the local CPU board have the preference.
Then the process posts the first process in the list of waiters.

LATCH CONTENTION

LATCH CONTENTION

Occurs when a latch is required by several processes at the same time.
Mostly due to bugs or abnormal application behavior.
As utilization increases, latch waits and spins also increase.
Performance degradation can be extremely sharp.
To diagnose latch contention we need to investigate the latch
statistics [8].


DIFFERENTIAL (POINT IN TIME) LATCH STATISTICS

Latch requests arrival rate:   lambda = gets / time
Latch miss ratio:              rho    = misses / gets
Latch sleeps ratio:            kappa  = sleeps / misses
Latch wait time per second:    W      = wait_time / time
Latch spin efficiency:         sigma  = spin_gets / misses

Should be calculated for each child latch separately.
V$LATCH averaging distorts statistics.
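These ratios are straightforward to compute from two cumulative snapshots. A hedged sketch; the struct fields mirror v$latch column names, but the types and function are invented for the example:

```c
#include <assert.h>

typedef struct {
    double gets, misses, sleeps, spin_gets, wait_time;  /* cumulative counters */
} latch_snap;

typedef struct {
    double lambda, rho, kappa, sigma, W;
} latch_stats;

/* Differential statistics over the interval between snapshots a and b */
static latch_stats diff_stats(latch_snap a, latch_snap b, double seconds) {
    double gets   = b.gets   - a.gets;
    double misses = b.misses - a.misses;
    latch_stats s;
    s.lambda = gets / seconds;                        /* arrival rate */
    s.rho    = misses / gets;                         /* miss ratio */
    s.kappa  = (b.sleeps - a.sleeps) / misses;        /* sleeps per miss */
    s.sigma  = (b.spin_gets - a.spin_gets) / misses;  /* spin efficiency */
    s.W      = (b.wait_time - a.wait_time) / seconds; /* wait time per second */
    return s;
}
```

As the slide warns, this should be run against v$latch_children rows, not the averaged v$latch parent row.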


SAMPLED AVERAGES

By sampling x$ksupr we can estimate:
Length of the latch wait queue (Lw):
SELECT COUNT(1) FROM X$KSUPR WHERE KSLLAWAT=LADDRESS;
Number of spinning processes in 10g (Ns):
SELECT COUNT(1) FROM X$KSUPR WHERE KSLLALAQ=LADDRESS;
Sampling of v$latchholder estimates latch utilization.

Using DTrace we can measure:
Latch acquisition and holding times
Actual spin count


DERIVED LATCH STATISTICS

Latch utilization (PASTA):   U = latch_holding_time / time ~ rho
Average holding time:        S = U / lambda
                               = "Pct_Get_Miss" * "Snap_Time" / (100 * "Get_Requests")
Length of latch wait list:   L = W
Recurrent sleeps ratio:      kappa + sigma - 1
Latch acquisition time:      Taq = (1 / lambda) * (Ns + W)


LATCH STATISTICS EXAMPLE

Latch statistics for 0x380007358 "session allocation":
Requests rate:           lambda = 1350 Hz
Miss/get:                rho    = .022
Sampled utilization:     U      = .013
Slps/miss:               kappa  = .28
Wait_time/sec:           W      = .021
Sampled queue length:    Lw     = .017
Spin_gets/miss:          sigma  = .72
Sampled spinning procs:  Ns     = .013
Secondary sleeps ratio   = .002
Avg holding time         = 16.3 usec
Sleeping time            = 15.9 usec
Acquisition time         = 25.8 usec

Latch acquisition time distribution measured by DTrace (ns):

         --------- Distribution ---------
  2048  |
  4096  |@@@@@@
  8192  |@@@@@@@@
 16384  |@@@@@@@@@@@@@@@@@@@@@@@
 32768  |@@@
 65536  |

Average acquisition time = 21 usec

sqlplus /nolog @demo_06.sql


LATCH CONTENTION DIAGNOSTICS IN 9.2-11G

Should be suspected if latch wait events are observed in the Top 5
Timed Events AWR section.
Look for the latch with the highest W.
Symptoms of contention for the latch:
W > 1 sec/sec
Utilization > 10%
Acquisition (or sleeping) time significantly greater than holding time
The latchprofx.sql script invented by Tanel Poder [4] greatly simplifies
diagnostics.
The script and v$latch_misses reveal where the contention arises.


LATCH LEVELS AND LATCH CONTENTION

A process holding a latch can only acquire a latch with a higher level.
Contention for a high-level latch frequently exacerbates contention
for lower-level latches.
Library cache latch contention without many cursor versions is likely
a consequence of moderate shared pool latch contention.
Shared pool resizes, clearances and ORA-4031 dumps can induce
library cache latch storms.
Use the latch_tree.sql script to confirm this scenario.


BEWARE OF CERTAIN V$/X$


Frequent scans of some x$ tables induce latch contention:
x$ktcxb (v$transaction) - Transaction allocation latch
x$ktadm (v$lock, dba_jobs_running) "DML lock allocation"
x$kqlfxpl (v$sql_plan) "library cache" latch in 10g
x$ksmsp "shared pool" latch
x$kslltr (v$latch) all parent latches in 11g.


TREATING THE LATCH CONTENTION


The right method: tune the application and reduce the latch
demand. Tune the SQL, bind variables, schema, etc.
Many brilliant books exist on this topic. Out of scope for this
presentation.
Install the latest Oracle PSU and OS recommended patches. Several
latch related examples:
Oracle bug 7627743: higher latch waits in 11g. Processes in 11.1
always spun up to the maximum spin count value.
Linux 2.6.9 kernel bug 149933. "Missing wakeup in ipc/sem". Fixed
in Red Hat EL 4 Update 4
HP-UX 11.31. Should install scheduler cumulative patch
PHKL_38397 or later

WHEN WE NEED TO TUNE SPIN COUNT:


The right method of tuning the application may be too expensive and
require a complete application rewrite.
Nowadays CPU power is cheap. We may already have enough free
CPU resources.
Oracle does not explicitly forbid _SPIN_COUNT tuning. However, a change
of an undocumented parameter should be discussed with Support. For
example, MOS recommends _SPIN_COUNT=5000 for 10g Streams.
Processes spin up to 20000 cycles for an exclusive latch, up to 4000
cycles for a shared latch, and infinitely for a pre-11.2.0.2 mutex. Tuning
may find more optimal values for your application.
Under CPU thrashing, decreasing the number of spins can reduce CPU
consumption.


TUNING THE SPIN COUNT EFFICIENTLY


Beware of side effects. You should have enough free CPU.
First, diagnose the root cause of the latch contention. Measure the
latch statistics, install relevant patches.
Spin count tuning will only be effective if the latch holding time S
is in its normal microseconds range.
The number of spinning processes should remain less than the number
of CPUs. Beware of CPU thrashing. Analyze AWR and latch statistics
before and after each change.
Spin count adjustment is different for exclusive and shared latches.
Nonstandard latch classes may decrease CPU consumption.
It is a common myth that CPU time will rise indefinitely as we
increase the spin count.

SPIN COUNT ADJUSTMENT


Shared latches:
Dynamically by the _SPIN_COUNT parameter.
A good starting point is a multiple of the default value 2000.
Setting the _SPIN_COUNT parameter in the initialization file should be
accompanied by _LATCH_CLASS_0="20000". Otherwise spin for
exclusive latches will be greatly affected by the next instance restart.
Exclusive latches:
Statically by the _LATCH_CLASS_0 parameter. Requires an instance restart.
A good starting point is a multiple of the default value 20000.
It may be preferable to increase the number of "yields" for class 0 latches.
Spin count tuning probes the tail of the latch holding time distribution. See my
blog for the related mathematics.
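Assuming the defaults above (2000 for shared latches, "20000" for exclusive class 0 latches), a hypothetical initialization-file fragment illustrating the point about setting both parameters together; the doubled shared-latch value is a starting point to be validated against AWR, not a recommendation:

```
# Shared latches: dynamic, default 2000
_spin_count = 4000

# Exclusive (class 0) latches: static, default "20000", requires restart.
# Set it explicitly so that the _spin_count change above does not also
# redefine exclusive latch spin on the next instance restart.
_latch_class_0 = "20000"
```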

SPIN SCALING

My previous work [8] proposed approximate scaling rules to estimate the
effect of _SPIN_COUNT tuning depending on K = "sleeps ratio" = Avg(Slps/Miss):

If spin is inefficient, K ~ 0.1, then:
Doubling the spin count will reduce the "sleeps ratio" by 2 and double
the spin CPU consumption.

If the "sleeps ratio" for an exclusive latch is 10%, then an increase of
the spin count to 40000 may result in a 10 times decrease of "latch free"
wait events and only a 10% increase of CPU consumption.
Your mileage may vary.

If the latch holding time distribution has an exponential tail, you have
enough CPU resources and spin is efficient, K << 0.1, then:
Doubling the spin count will square the sleeps ratio coefficient.
This will multiply the spin CPU consumption by (1 + K).

If the spin is already efficient, it is worth increasing the spin count.
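The two rules of thumb on this slide can be put into a numeric sketch. This models the slide's approximations, not measured behavior; K is the sleeps ratio and cpu a relative spin CPU cost:

```python
def double_spin_inefficient(K, cpu):
    """Inefficient spin: doubling the spin count halves the sleeps
    ratio and doubles the spin CPU consumption."""
    return K / 2.0, cpu * 2.0

def double_spin_efficient(K, cpu):
    """Exponential holding-time tail, spare CPU: doubling the spin
    count squares the sleeps ratio and multiplies CPU by (1 + K)."""
    return K * K, cpu * (1.0 + K)

# Slide example: K = 10% with an exponential tail, spin 20000 -> 40000:
# sleeps ratio drops ~10x (to ~1%), spin CPU grows only ~10%.
print(double_spin_efficient(0.1, 1.0))
```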

CACHE BUFFER CHAINS LATCH CONTENTION TESTCASE

Latch wait and CPU times vs. _SPIN_COUNT



LONG LATCH HOLDING TIME: CPU THRASHING


Latch contention can cause CPU starvation. Processes contending
for a latch also contend for CPU.
Once CPU starves, the OS run queue length rises and the load average
exceeds the number of CPUs. Some OSes may shrink the time
quantum. Latch holders will not receive enough time to release the
latch.
Due to priority decay, latch acquirers may preempt latch holders.
This leads to priority inversion. The throughput falls.
Transition to this metastable state is more likely if the workload of
your system approaches ~100% CPU.
Due to preemption, the latch holding time S will rise to the CPU
scheduling timescale [7].

CPU THRASHING
If CPUs are running at 100% utilization for other reasons, latch
contention can be just a symptom of the CPU starvation itself.
All wait times will be inflated by the wait for CPU. However, the latch
holding time should remain in its normal range.
CPU thrashing is unlikely on Solaris due to latch preemption
control.
To prevent the priority inversion use:
FX priority class on Solaris
HPUX_SCHED_NOAGE on HPUX
SCHED_BATCH policy on Linux
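As an illustration of the Linux item above, a hedged Python sketch that tries to move the calling process under the SCHED_BATCH policy; os.SCHED_BATCH exists only on Linux, and the function degrades to a no-op elsewhere:

```python
import os

def set_batch_policy(pid=0):
    """Try to switch pid (0 = calling process) to SCHED_BATCH with
    priority 0; return True on success, False if unsupported/denied."""
    if not hasattr(os, "SCHED_BATCH"):   # non-Linux platform
        return False
    try:
        os.sched_setscheduler(pid, os.SCHED_BATCH, os.sched_param(0))
        return True
    except OSError:                      # includes PermissionError
        return False
```

In practice this would be applied to Oracle shadow processes at startup (or via chrt), so that spinning latch acquirers cannot out-prioritize latch holders whose priority has decayed.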


LATCH SMP SCALABILITY:

If latch utilization is U1 in a single CPU environment, then in an
N CPU server the latch utilization will be approximately N × U1.
This can be problematic:

If a single CPU system held latches for only 1% of the time,
a 48 CPU server with the same per-CPU load will hold latches for ~50% of
the time.
A 128 CPU cores server will suffer huge latch (and mutex) contention.
This is also known as "software lockout". It may substantially affect
contemporary multi-core servers.
NUMA should overcome these intrinsic spinlock scalability restrictions.
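The naive "software lockout" scaling above can be sketched as a toy model: with the same per-CPU load, total latch utilization grows linearly with N until it saturates at 100%:

```python
def scaled_utilization(u1, ncpu):
    """Approximate latch utilization on ncpu CPUs given per-CPU
    utilization u1, capped at 100% (the lockout regime)."""
    return min(1.0, ncpu * u1)

# 1% per-CPU holding time: fine on 1 CPU, ~48% on 48 CPUs,
# fully saturated (heavy contention) on 128 CPUs.
for n in (1, 48, 128):
    print(n, "CPUs:", scaled_utilization(0.01, n))
```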


BIBLIOGRAPHY

1. Steve Adams. Oracle8i Internal Services for Waits, Latches, Locks, and
   Memory. 1999.
2. Tanel Poder blog, http://blog.tanelpoder.com
3. Jonathan Lewis. Oracle Core: Essential Internals for DBAs and
   Developers. 2011.
4. Edsger Dijkstra. Solution of a Problem in Concurrent Programming
   Control. CACM, 1965.
5. M. Herlihy, N. Shavit. The Art of Multiprocessor Programming. 2008.
6. B. Sinharoy, et al. Improving Software MP Efficiency for Shared
   Memory Systems. Proc. of 29th Hawaii Conference on System Sciences,
   1996.
7. Guy Harrison blog. Using _spin_count to reduce latch contention in 11g.
   http://guyharrison.squarespace.com/, 2008.
8. R. Johnson, et al. A New Look at the Roles of Spinning and Blocking.
   Proc. of 5th Workshop on Data Management on New Hardware, 2009.
9. A. Nikolaev. Exploring Oracle RDBMS latches using Solaris DTrace.
   Proc. of MEDIAS 2011 Conf., http://arxiv.org/abs/1111.0594v1, 2011.

Q/A?
Questions?
Comments?


ACKNOWLEDGEMENTS
Thanks to RDTEX Technical Support Centre Director S.P.
Misiura for years of encouragement and support of my
investigations.
Thanks to my colleagues for discussions.
Thanks to all our customers for participating in latch
troubleshooting.


THANK YOU!

Andrey Nikolaev
http://andreynikolaev.wordpress.com
Andrey.Nikolaev@rdtex.ru
RDTEX, Moscow, Russia
www.rdtex.ru


IN MEMORY OF MY FRIEND OF 30 YEARS:

Andrey Kriushin
Chairman of Russian Oracle User Group (RuOUG)
Died of heart attack 02.08.2011
