DTrace
Dynamic Tracing
  Observing production systems
    Safety
    Zero overhead if observation is not activated
    Minimal overhead if observation is activated
    No special debug/release builds
  Merging and correlating data from multiple sources
    Total observability
    Global view of the system state
Crash Dump Analysis - MFF UK - DTrace

Terminology
Probe
  A place in code or an event which can be observed
  If a probe is activated and the code is executed (or the event happens), the probe is fired
    A special script written in the D language is executed
Provider
  Registers probes to the DTrace infrastructure
  Does the dirty work of activation, tracing and inactivation
Consumer
  Consumes and postprocesses the data from fired probes

Overview
[Figure: DTrace architecture. User space: D script; consumers dtrace(1M), lockstat(1M), plockstat(1M), intrstat(1M); libdtrace(3LIB) with the D compiler; user-space provider. Communication device: dtrace(3D). Kernel: DTrace with the D virtual machine. Providers: syscall, fbt, sdt, sysinfo, fasttrap, pid, usdt.]

DTrace history
31st January 2005
  Official part of Solaris 10
  Released as open source (CDDL)
    First piece of OpenSolaris to be released
27th October 2007
  Ported to Mac OS X 10.5 (Leopard)
2nd September 2008
  Ported to FreeBSD 7.1 (released 6th January 2009)
21st February 2010
  Ported to NetBSD (only for i386, not enabled by default)

DTrace history (2)
Linux
  Cannot be directly integrated (CDDL vs. GPL)
  Beta releases (since 2008)
    Standalone kernel module with no modifications to core sources
    Only some providers (fbt, syscall, usdt)
    Development snapshots available regularly
  SystemTap
    Linux native analogy
    A script in the SystemTap language is converted to the C source code of a kernel module
    Loaded and executed natively in the running kernel
    Embedded C enabled in guru mode

DTrace history (3)
QNX
  Port in progress
3rd party software with DTrace probes
  Apache
  MySQL
  PostgreSQL
  X.Org
  Firefox
  Oracle JVM
  Perl, Ruby, PHP

D language
probe /predicate/ {
    actions
}
Describes what is executed if a probe fires
  Similar to C or AWK
  Without dangerous constructs (branching, loops, etc.)
Many of the fields can be absent
  Default predicate/action

D probes
probe /predicate/ {
    actions
}
A pattern consisting of fields split by colons
  provider:module:function:name
Fields can be omitted (the remaining ones are read from right to left)
  foo:bar
    matches function foo and name bar in all modules provided by all providers
Fields can be empty (interpreted as "any")
  syscall:::
    matches all probes provided by the syscall provider

D probes (2)
probe /predicate/ {
    actions
}
Shell pattern matching
  Wildcard characters *, ?, []
  Can be escaped by \
  syscall::*lwp*:entry
    matches all probes provided by the syscall provider, in any module, in all functions (syscalls) containing the string lwp, and matching syscall entry points
Special probes
  BEGIN, END, ERROR
  Implemented by the dtrace provider

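The matching rules on the last two slides can be sketched outside DTrace. The following Python model is only an illustration under stated assumptions: the helper names `parse_description` and `matches` and the sample probe tuples are made up here, Python's `fnmatchcase` stands in for DTrace's own matcher, and backslash escaping is not modelled.

```python
from fnmatch import fnmatchcase

def parse_description(text):
    """Split a probe description into (provider, module, function, name).

    Fields are read from right to left, so omitted leading fields
    become empty strings; an empty field matches anything.
    """
    fields = text.split(":")
    if len(fields) > 4:
        raise ValueError("too many fields: " + text)
    return tuple([""] * (4 - len(fields)) + fields)

def matches(description, probe):
    """Return True if the 4-tuple `probe` matches the description."""
    for pattern, value in zip(parse_description(description), probe):
        if pattern and not fnmatchcase(value, pattern):
            return False
    return True

# Hypothetical probe tuples, made up for this illustration.
probes = [
    ("syscall", "", "lwp_park", "entry"),
    ("syscall", "", "lwp_park", "return"),
    ("syscall", "", "read", "entry"),
    ("fbt", "ufs", "ufs_mount", "entry"),
]

selected = [p for p in probes if matches("syscall::*lwp*:entry", p)]
print(selected)
```

With these sample tuples, only the `lwp_park` entry probe is selected, while `syscall:::` would match the first three tuples.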
D probes (3)
probe /predicate/ {
    actions
}
Displaying all configured probes
  dtrace -l

D predicates
probe /predicate/ {
    actions
}
Boolean expression guarding the actions
  Any expression which evaluates to an integer or a pointer
    Zero is considered false, non-zero true
  Any D operators, variables and constants
Can be absent
  Implicitly true

D actions
probe /predicate/ {
    actions
}
List of statements
  Separated by semicolons
  No branching, no loops
Default action if empty
  Usually the probe name is printed out

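How predicates and actions interact when a probe fires can be sketched with a small Python model. This is an illustration, not DTrace's implementation: the `fire` helper, the context dictionary and its keys are hypothetical, and the default action is simplified to returning the probe name.

```python
def fire(clauses, context):
    """Run every clause whose predicate is non-zero for this firing.

    `clauses` is a list of (predicate, actions) pairs; `context` is a
    dictionary standing in for the built-in variables of the probe.
    """
    output = []
    for predicate, actions in clauses:
        # An absent predicate is implicitly true; zero means false.
        if predicate is None or predicate(context):
            # An empty action list falls back to a default action,
            # modelled here as reporting the probe name.
            for action in (actions or [lambda ctx: ctx["probename"]]):
                output.append(action(context))
    return output

clauses = [
    # Clause with a predicate and one action.
    (lambda ctx: ctx["pid"] == 42,
     [lambda ctx: "pid 42 called " + ctx["probefunc"]]),
    # Clause with neither predicate nor actions.
    (None, []),
]

hit = fire(clauses, {"pid": 42, "probefunc": "read", "probename": "entry"})
miss = fire(clauses, {"pid": 7, "probefunc": "read", "probename": "entry"})
print(hit)
print(miss)
```

Since D has no branching, filtering is expressed by attaching several clauses with different predicates to the same probe, which is what the list of pairs models.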
D types
Basic data types reflect the C language
  Integer types and aliases
    (unsigned/signed) char, short, int, long, long long
    int8_t, int16_t, int32_t, int64_t, intptr_t, uint8_t, uint16_t, uint32_t, uint64_t, uintptr_t
  Floating point types
    float, double, long double
    Values can be assigned, but no floating point arithmetic is implemented in DTrace

D types (2)
Derived and special data types
  Pointers
    C-like pointers to other data types (including pointer arithmetic)
      int *value; void *ptr;
    Constant NULL is zero
    DTrace enforces weak pointer safety
      Invalid memory accesses are fully handled
      However, this does not provide reference safety as in Java

D types (2)
Scalar arrays
  C-like arrays of basic data types
  Similar to pointers, but can be assigned as a whole
    int values[5][6];
Strings
  Special type descriptor string (instead of char *)
    Can be assigned as a whole by value (char * copies the reference)
  Represented as NULL-terminated character arrays
  Internal strings are always allocated as bounded
    Cannot exceed the predefined maximum length (256 bytes)

D types (3)
Composed data types
  Structures
    Records of several other types
    Type declared in a similar way as in C
    Variables must be declared explicitly
    Members are accessed via the . and -> operators

struct callinfo {
    uint64_t ts;
    uint64_t calls;
};

struct callinfo info[string];

syscall::read:entry,
syscall::write:entry {
    info[probefunc].ts = timestamp;
    info[probefunc].calls++;
}

END {
    printf("read %d %d\n",
        info["read"].ts,
        info["read"].calls);
    printf("write %d %d\n",
        info["write"].ts,
        info["write"].calls);
}

D types (4)
Unions, bit-fields, enumerations, typedefs
  All similar as in C
Inlines
  Typed constants
    inline string desc = "something";

enum typeinfo {
    CHAR_ARRAY = 0,
    INT,
    UINT,
    LONG
};

struct info {
    enum typeinfo disc;
    union {
        char c[4];
        int32_t i32;
        uint32_t u32;
        long l;
    } value;
    int a : 3;
    int b : 4;
};

typedef struct info info_t;

DTrace operators
Arithmetic
  + - * / %
Relational
  < <= > >= == !=
  Works also on strings (lexical comparison)
Logical
  && || ^^ !
  Short-circuit evaluation
Bitwise
  & | ^ << >> ~
Assignment
  = += -= *= /= %= &= |= ^= <<= >>=
  Return values as in C
Increment and decrement
  ++ --

DTrace operators (2)
Conditional expression
  condition ? true_expression : false_expression
  Replacement for branching (which is absent in D)
Addressing, member access and sizes
  & * . -> sizeof(type/expr) offsetof(type, member)
Kernel variables access
  ` (backtick prefix before the kernel symbol name)
Type casting
  (int) x, (int *) NULL, (string) expression, stringof(expr)

DTrace variables
Scalar variables
  Simple global variables
  Storing fixed-size data (integers, pointers, fixed-size composite types, strings with a fixed-size upper bound)
  Do not have to be declared (but can be), duck typing

BEGIN {
    /* Implicitly declare an int variable */
    value = 1234;
}

DTrace variables (2)
Associative arrays
  Global arrays of scalar values indexed by a key
  Key signature is a list of scalar expression values
    Integers, strings or even a tuple of scalar types
    Each array can have a different (but fixed) key signature
    Declared implicitly by assignment or explicitly
      values[123, "key"] = 456;
  All values also have a fixed type
    But each array can have a different value type
    Declared implicitly by assignment or explicitly
      int values[unsigned int, string];

DTrace variables (3)
Thread-local variables
  Scalar variables or associative arrays specific to a given thread
  Identified by the special identifier self
  If no value has been assigned to a thread-local variable in the given thread, the variable is considered zero-filled
  Assigning zero to a thread-local variable deallocates it

syscall::read:entry {
    /* Mark this thread */
    self->tag = 1;
}

/* Explicit declaration */
self int tag;

syscall::read:entry {
    self->tag = 1;
}

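The zero-fill and deallocation rules for thread-local variables can be illustrated with a toy Python model. The class `ThreadLocals` and its methods are hypothetical names for this sketch; real DTrace keeps this storage in the kernel and keys it by the running thread automatically.

```python
class ThreadLocals:
    """Toy model of D thread-local storage (self->name)."""

    def __init__(self):
        # (thread id, variable name) -> value
        self._store = {}

    def get(self, tid, name):
        # A variable never assigned in this thread reads as zero.
        return self._store.get((tid, name), 0)

    def set(self, tid, name, value):
        if value == 0:
            # Assigning zero releases the storage for this thread.
            self._store.pop((tid, name), None)
        else:
            self._store[(tid, name)] = value

    def allocated(self):
        """Number of thread-local slots currently holding storage."""
        return len(self._store)


locals_ = ThreadLocals()
locals_.set(1, "tag", 1)            # like self->tag = 1 in thread 1
print(locals_.get(1, "tag"))        # value seen by thread 1
print(locals_.get(2, "tag"))        # thread 2 never assigned it
locals_.set(1, "tag", 0)            # like self->tag = 0
print(locals_.allocated())
```

The model suggests why scripts commonly reset `self->` variables to zero in the matching return probe: otherwise storage stays allocated for every thread that ever set them.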
DTrace variables (4)
Clause-local variables
  Scalar variables or associative arrays specific to a given probe clause
  Identified by the special identifier this
  They are not initialized to zero
  The value is kept for multiple clauses associated with the same probe

syscall::read:entry {
    this->value = 1;
}

/* Explicit declaration */
this int value;

syscall::read:entry {
    this->value = 1;
}

DTrace aggregations
Variables for storing statistical data
  Storing values of aggregative data computation
  For aggregating functions f(...) which satisfy the following property
    $f(f(x_0) \cup f(x_1) \cup \dots \cup f(x_n)) = f(x_0 \cup x_1 \cup \dots \cup x_n)$
Aggregations are declared in a similar way as associative arrays

@values[123, "key"] = aggfunc(args);
@_[123, "key"] = aggfunc(args);    /* Simple variable */
@[123, "key"] = aggfunc(args);     /* ditto */

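The property above says that partial results computed on separate pieces of the data can be combined into the same answer as a single pass over all the data. The Python sketch below checks this for a few functions; the `AGGREGATORS` table, the helper names and the sample numbers are assumptions made for this illustration, and the merge step is modelled explicitly (for example, partial counts are added rather than counted again).

```python
from functools import reduce
from statistics import median

# Each aggregating function is modelled as a pair:
# (reduce one chunk to a partial result, merge two partial results).
AGGREGATORS = {
    "count": (len, lambda a, b: a + b),
    "sum": (sum, lambda a, b: a + b),
    "min": (min, min),
    "max": (max, max),
}

def aggregate(name, chunks):
    """Aggregate each chunk separately, then merge the partial results."""
    make_partial, merge = AGGREGATORS[name]
    return reduce(merge, [make_partial(chunk) for chunk in chunks])

def average(chunks):
    """An average is kept as a (sum, count) pair and divided at the end."""
    return aggregate("sum", chunks) / aggregate("count", chunks)

# Hypothetical per-CPU samples, made up for this illustration.
per_cpu = [[3, 9, 4], [7], [1, 8]]
everything = [x for chunk in per_cpu for x in chunk]

for name in AGGREGATORS:
    print(name, aggregate(name, per_cpu), aggregate(name, [everything]))
print("avg", average(per_cpu), average([everything]))

# Counter-example: the median of per-chunk medians differs from the
# median of all values, so a median cannot be merged this way.
print("median", median([median(c) for c in per_cpu]), median(everything))
```

Because only small partial results need to be kept and merged, this kind of function can be evaluated without storing every traced value.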
DTrace aggregations (2)
Aggregation functions
  count()
  sum(scalar)
  avg(scalar)
  min(scalar)
  max(scalar)
  lquantize(scalar, lower_bound, upper_bound, step)
    Linear frequency distribution
  quantize(scalar)
    Power-of-two frequency distribution

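To make the two distribution functions concrete, here is a rough Python sketch of how a value could be assigned to a bucket. The function names and sample latencies are hypothetical, negative values are left out of the power-of-two case, and the underflow and overflow buckets are represented as strings purely for readability.

```python
from collections import Counter

def quantize_bucket(value):
    """Lower bound of the power-of-two bucket for a non-negative value."""
    if value < 0:
        raise ValueError("negative values are not modelled in this sketch")
    if value == 0:
        return 0
    # Largest power of two that is less than or equal to value.
    return 1 << (value.bit_length() - 1)

def lquantize_bucket(value, lower_bound, upper_bound, step):
    """Lower bound of the linear bucket, or an out-of-range label."""
    if value < lower_bound:
        return "< %d" % lower_bound
    if value >= upper_bound:
        return ">= %d" % upper_bound
    return lower_bound + ((value - lower_bound) // step) * step

# Hypothetical latencies, made up for this illustration.
samples = [0, 1, 5, 8, 37, 99, 100, 1023]

power_of_two = Counter(quantize_bucket(v) for v in samples)
linear = Counter(lquantize_bucket(v, 0, 100, 10) for v in samples)
print(sorted(power_of_two.items()))
print(linear)
```

The power-of-two form needs no parameters and covers a wide range of magnitudes; the linear form gives even resolution when the interesting range is known in advance.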
DTrace aggregations (3)
By default, aggregations are printed out in END

syscall:::entry {
    @counts[probefunc] = count();
}

# dtrace -s counts.d
dtrace: script 'counts.d' matched 235 probes
^C
  resolvepath          8
  lwp_park            10
  gtime               12
  lwp_sigmask         16
  stat64              46
  pollsys             93
  p_online           256
  ioctl             1695

DTrace built-in variables
Global variables defined by DTrace
  Contain various state-dependent values
  int64_t arg0, arg1, ..., arg9
    Input arguments for the current probe
  args[]
    Typed arguments to the current probe (e.g. the syscall arguments with the appropriate types)
  uintptr_t caller
    Instruction pointer of the code just before firing the probe
  kthread_t *curthread
    Current thread kernel structure

DTrace built-in variables (2)
  string cwd
    Current working directory
  string execname
    Name which was used to execute the current process
  pid_t pid, tid_t tid
    Current PID, TID
  string probeprov, probemod, probefunc, probename
    Current probe provider, module, function and name

Using action statements
DTrace records output to a trace buffer
Most of the action statements produce some sort of output to the trace buffer
  trace(expr)
    Output the value of an expression
  tracemem(address, bytes)
    Copy the given number of bytes from the given address to the buffer
  printf(format, ...)
    Output formatted strings (format options covered later)
    Safety checks

Using action statements (2)
  printa(aggregation)
  printa(format, aggregation)
    Start processing aggregation data
    Parallel to other execution (output can be delayed)
  stack()
  stack(frames)
    Output a kernel stack trace
  ustack()
  ustack(frames)
    Output a user-space stack trace
    Addresses are not looked up by the kernel, but by the user-space consumer (later)

Using action statements (3)
  ustack(frames, string_size)
    Output a user-space stack trace with symbol lookup (in kernel)
    The kernel allocates string_size bytes for the output of the symbol lookup
    The probe provider must annotate the user-space stack with run-time symbol annotations to make the lookup possible
      Currently only the JVM (1.5 or newer) supports this
  jstack()
  jstack(frames)
  jstack(frames, string_size)
    Alias for ustack() with a non-zero default string_size

printf() formatting
Conversion formats
  %a                  Pointer as kernel symbol name
  %c                  ASCII character
  %C                  Printable ASCII or escape
  %d, %i, %o, %u, %x
  %e                  Float as [-]d.ddde±dd
  %f                  Float as [-]ddd.ddd
  %p                  Hexadecimal pointer
  %s                  ASCII string
  %S                  ASCII string or escape

Subroutines
Special actions which alter the state of DTrace
  But do not produce any output to the trace buffer
  Are completely safe
  Usually manipulate the local memory storage of DTrace
    *alloca(size)
      Allocate size bytes of scratch memory
      The memory is released after the current clause ends
    bcopy(*src, *dest, size)
      Copy size bytes from outside scratch memory to scratch memory

Subroutines (2)
  *copyin(addr, size)
    Copy size bytes from the user memory of the current process to scratch memory
  *copyinstr(addr)
    Copy a NULL-terminated string from the user memory of the current process to scratch memory
  mutex_owned(*mutex)
    Tell whether a kernel mutex is currently locked or not
  *mutex_owner(*mutex)
    Return the pointer to the kthread_t of the thread which owns the given mutex (or NULL)
  mutex_type_adaptive(*mutex)

Subroutines (3)
  strlen(string)
    Return the length of a NULL-terminated string
  strjoin(*str, *str)
    Concatenate two NULL-terminated strings
  basename(*str)
    Return the basename of a given filename
  dirname(*str)
  cleanpath(*str)
    Return a file system path without elements such as ../
  rand()
    Return a (weak) pseudo-random number

Destructive actions
Changing the state of the system
  In a deterministic way
  But it can still be dangerous in a production environment
  Need to be explicitly enabled using dtrace -w
    stop()
      Stop the current process (e.g. to dump the core or attach mdb)
    raise(signal)
      Send a signal to the current process
    panic()

Destructive actions (2)
  copyout(*buffer, addr, bytes)
    Store the given number of bytes from a buffer to the given address
    Page faults are detected and avoided
  copyoutstr(string, addr, maxlen)
    Store at most maxlen bytes from a NULL-terminated string to the given address
  system(program, ...)
    Execute a program as it would be executed by a shell (program is actually a printf() format specifier)
  breakpoint()
    Induce a kernel breakpoint (if a kernel debugger is loaded, it is executed)

Destructive actions (3)
  chill(nanoseconds)
    Spin actively for a given number of nanoseconds
    Useful for analyzing timing bugs
  exit(status)
    Exit the tracing session and return the given status to the consumer

Speculative tracing
Predicates are good for filtering out unimportant probes before they are fired
But how can unimportant probes be filtered out only some time after they have fired?
  Sometimes you can tell that you are interested in the data from probe n only after probe n+k (k > 0) has fired
Solution: speculatively record all the data, but decide later whether to commit it or not

Speculative tracing (2)
  speculation()
    Create a new speculative buffer and return its ID
    By default, the number of speculative buffers is limited to 1
  speculate(id)
    The rest of the clause will be recorded to the speculative buffer given by id
    This must be the first data processing action in a clause
    Disallowed actions: aggregating, destructive
  commit(id)
    Commit the speculative buffer given by id to the trace buffer

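The life cycle of a speculative buffer can be sketched with a toy Python model. This is only an illustration: the `Tracer` class and the recorded strings are hypothetical, returning 0 when no buffer is free is an assumption of this model, and `discard` is included as the counterpart of `commit` even though it is not listed on the slide.

```python
class Tracer:
    """Toy model of speculative buffers next to the main trace buffer."""

    def __init__(self, nspec=1):
        self.trace_buffer = []
        self._nspec = nspec            # limit on live speculations
        self._speculations = {}        # id -> list of pending records
        self._next_id = 1

    def speculation(self):
        """Create a speculative buffer and return its ID (0 if none free)."""
        if len(self._speculations) >= self._nspec:
            return 0
        spec_id = self._next_id
        self._next_id += 1
        self._speculations[spec_id] = []
        return spec_id

    def speculate(self, spec_id, record):
        """Record data into the speculative buffer instead of the trace."""
        self._speculations[spec_id].append(record)

    def commit(self, spec_id):
        """Move the speculative records into the trace buffer."""
        self.trace_buffer.extend(self._speculations.pop(spec_id))

    def discard(self, spec_id):
        """Throw the speculative records away."""
        self._speculations.pop(spec_id)


tracer = Tracer()

# First call: it turns out to fail, so its records are committed.
spec = tracer.speculation()
tracer.speculate(spec, "open:entry /etc/passwd")
tracer.speculate(spec, "open:return -1")
tracer.commit(spec)

# Second call: it succeeds, so its records are dropped.
spec = tracer.speculation()
tracer.speculate(spec, "open:entry /etc/motd")
tracer.discard(spec)

print(tracer.trace_buffer)
```

The decision to keep or drop the data is made in a later probe, which is exactly the case a predicate cannot handle.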
Provider: syscall
Tracing of kernel system calls
  Probes for entry and exit points of a syscall
    Access to (typed) arguments
    Access to the return value (on exit)
    Access to kernel errno
    Access to kernel variables
  Internally uses the original syscall tracing mechanism

Provider: fbt
Function boundary tracing
  Probes on the function entry point and (all) exit points of almost all kernel functions
    Inlined and leaf functions cannot be traced
  In entry
    All typed function arguments can be accessed via args[]
  In return
    Offset of the return instruction is stored in arg0
    Typed return value is stored in args[1]

Provider: fbt (2)
How does it work?

                    uninstrumented       instrumented
ufs_mount:          pushq %rbp           int   $0x3
ufs_mount+1:        movq  %rsp,%rbp      movq  %rsp,%rbp
ufs_mount+4:        subq  $0x88,%rsp     subq  $0x88,%rsp
ufs_mount+0xb:      pushq %rbx           pushq %rbx
...
ufs_mount+0x3f3:    popq  %rbx           popq  %rbx
ufs_mount+0x3f4:    movq  %rbp,%rsp      movq  %rbp,%rsp
ufs_mount+0x3f7:    popq  %rbp           popq  %rbp
ufs_mount+0x3f8:    ret                  int   $0x3

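The listing shows the first and last instruction of the function being replaced by a breakpoint. A minimal Python sketch of that patching idea follows, assuming a byte array stands in for kernel text; the `PatchedCode` class is hypothetical and the function body is shortened to four instructions.

```python
INT3 = 0xCC  # x86 one-byte breakpoint instruction (int $0x3)

class PatchedCode:
    """Toy model of enabling and disabling probes by patching code bytes."""

    def __init__(self, code):
        self.code = bytearray(code)
        self._saved = {}  # offset -> original first byte of the instruction

    def enable(self, offset):
        """Overwrite the byte at offset with a breakpoint, saving the original."""
        if offset not in self._saved:
            self._saved[offset] = self.code[offset]
            self.code[offset] = INT3

    def disable(self, offset):
        """Restore the original byte, leaving the code as it was."""
        if offset in self._saved:
            self.code[offset] = self._saved.pop(offset)


# pushq %rbp; movq %rsp,%rbp; popq %rbp; ret
original = bytes([0x55, 0x48, 0x89, 0xE5, 0x5D, 0xC3])
function = PatchedCode(original)

function.enable(0)   # entry probe replaces pushq %rbp
function.enable(5)   # return probe replaces ret
print(function.code.hex())

function.disable(0)
function.disable(5)
print(function.code.hex())
```

In this model, disabling every probe restores the original bytes exactly, which illustrates how inactive probes can cost nothing at run time.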
Provider: sdt
Static kernel probes
  Probes declared at arbitrary places in the kernel code (via a macro)
  Currently just a few of them are actually defined
    interrupt-start
    interrupt-complete
      arg0 contains a pointer to the dev_info structure

Provider: sdt (2)
How does it work?

                              uninstrumented     instrumented
squeue_enter_chain+0x1af:     xorl %eax,%eax     xor  %eax,%eax
squeue_enter_chain+0x1b1:     nop                nop
squeue_enter_chain+0x1b2:     nop                nop
squeue_enter_chain+0x1b3:     nop                lock nop
squeue_enter_chain+0x1b4:     nop
squeue_enter_chain+0x1b5:     nop                nop
squeue_enter_chain+0x1b6:     movb %bl,%bh       movb %bl,%bh

Provider: proc
Probes corresponding to the process and thread life cycle
  Creating a process (using fork() and friends)
  Executing a binary
  Exiting a process
  Creating a thread, destroying a thread
  Receiving signals

Provider: sched
Kernel scheduler abstraction probes
  Changing of priorities
  Thread being scheduled
  Thread being preempted
  Thread going to sleep
  Thread waking up

Provider: io
Input/output subsystem probes
  Starting an I/O request
  Finishing an I/O request
  Waiting for a device

Provider: pid
Tracing user-space functions
  Does not enforce serialization
    The traced process is never stopped
  Boundary probes similar to fbt
    Function entry and return
    Arguments in arg0, arg1, ... arg9 are raw unfiltered int64_t values
  Arbitrary function offset
  User-space symbol information is required to support symbolic function names
    On Solaris, standard shared libraries contain symbol information

Other providers
Many other providers exist
  Application specific providers (X.Org, PostgreSQL, Firefox, etc.)
    Via DTrace total observability you can correlate information such as which SQL transaction is generating a particular I/O load in the kernel
  VM based providers (JVM, PHP, Perl, Ruby)
  More kernel providers
    Memory management provider (vminfo)
    Network stack provider (mib)
    Profiling provider (profile)
      Interval based probes

DTrace and mdb
Accessing DTrace data from a crash dump
  Analyzing DTrace state
    Display trace buffers, consumers, etc.

> ::dtrace_state
    ADDR MINOR     PROC NAME              FILE
ccaba400     2        - <anonymous>          -
ccab9d80     3 d1d6d7e0 intrstat      cda37078
cbfb56c0     4 d71377f0 dtrace        ceb51bd0
ccabb100     5 d713b0c0 lockstat      ceb51b60
d7ac97c0     6 d713b7e8 dtrace        ceb51ab8

DTrace and mdb (2)
Displaying the contents of a trace buffer

> ccaba400::dtrace
CPU     ID                    FUNCTION:NAME
  0    344                resolvepath:entry   init
  0     16                      close:entry   init
  0    202                      xstat:entry   init
  0    202                      xstat:entry   init
  0     14                       open:entry   init
  0    206                     fxstat:entry   init
  0    186                       mmap:entry   init
  0    186                       mmap:entry   init
  0    186                       mmap:entry   init
  0    190                     munmap:entry   init
  0    344                resolvepath:entry   init
  0    216                    memcntl:entry   init
  0     16                      close:entry   init
  0    202                      xstat:entry   init
...

DTrace and mdb (3)
Interpreting the results
  The output of ::dtrace is the same as the output of the dtrace utility
  The order is always oldest to youngest within each CPU
  The CPU buffers are displayed in numerical order (you can use ::dtrace -c cpu to show only a specific CPU)
  Only in-kernel data which has not yet been processed by a user-space consumer can be displayed
  To keep as much data as possible in the kernel buffer, the following dtrace options can be used
    dtrace -s ... -b 64k -x bufpolicy=ring

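The effect of the `bufpolicy=ring` option mentioned above can be pictured with a small Python model: once the buffer is full, each new record pushes out the oldest one, so the buffer always holds the most recent data. The `RingBuffer` class is hypothetical, and it counts records rather than bytes, which is a simplification of a buffer sized with `-b 64k`.

```python
from collections import deque

class RingBuffer:
    """Toy model of a fixed-size trace buffer with a ring policy."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest item when a new one
        # is appended to a full buffer.
        self._records = deque(maxlen=capacity)

    def record(self, item):
        self._records.append(item)

    def snapshot(self):
        """Return the retained records, oldest to youngest."""
        return list(self._records)


buf = RingBuffer(capacity=3)
for event in ["open", "read", "read", "write", "close"]:
    buf.record(event)
print(buf.snapshot())
```

For crash dump analysis this is usually the desired behaviour, because the records closest to the crash are the ones that survive.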
Resources
Richard McDougall, Jim Mauro, Brendan Gregg: Solaris Performance and Tools: DTrace and MDB Techniques for Solaris 10 and OpenSolaris
Solaris Dynamic Tracing Guide
  http://docs.sun.com/app/docs/doc/8176223