
Apache internals and debugging

Ryan Matteson matty91@gmail.com http://prefetch.net

Presentation overview
An introduction to the Apache source code layout
An introduction to key concepts such as modules, buckets, brigades, hooks, filters, and MPMs
An introduction to several tools that can be used to perform post-mortem analysis
An introduction to several tools that can be used to debug transient server problems

Apache layout

What is Apache?
Flexible and extensible open-source web server
Supports HTTP versions 0.9, 1.0, and 1.1
Ships with a variety of modules that can be used to secure communications and transform content
Supports virtual hosting
Supports static and dynamic content delivery
The list goes on and on

How big is Apache?


150k+ lines of C code*:
$ find . -name \*.c -exec egrep -v '(^[ ]+\*|^$|\/\*|\*\/)' {} \; | wc -l
155884

480+ source code files:


$ find . -name \*.c -ls | wc -l
488

240+ header files:


$ find . -name \*.h -ls | wc -l
247
* Several source code files are used for experimentation, testing, and debugging purposes

How is Apache organized?


Apache is broken up into several pieces
The server core
The Apache portable runtime (APR)
The Apache portable runtime utilities (APR-util)
Support infrastructure
Numerous modules

How is the Apache source code organized?


The server core source code resides in $SRCROOT/server
The server core header files reside in $SRCROOT/include
The portable runtime resides in $SRCROOT/srclib/apr
The portable runtime utilities reside in $SRCROOT/srclib/apr-util
Support infrastructure resides in $SRCROOT/support
Modules reside in $SRCROOT/modules

Apache terminology

Resource pools
Pools are used to simplify memory management
Pools are created and destroyed automatically by the server core
Pools have a lifetime associated with them (e.g., a request pool is created when a request arrives and destroyed after the request is processed)
apr_pools.h provides a thorough description of the pool API and the underlying allocator (a minimal sketch follows)
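A minimal sketch of the pool API, not from the original slides; the helper name scratch_work and the parent pool argument are illustrative:

#include "apr_pools.h"
#include "apr_strings.h"

/* Illustrative helper: allocate from a short-lived sub-pool of "parent",
 * then release everything at once when the pool is destroyed. */
static void scratch_work(apr_pool_t *parent)
{
    apr_pool_t *scratch;
    char *buf;

    apr_pool_create(&scratch, parent);     /* child pool, nested lifetime */
    buf = apr_palloc(scratch, 256);        /* no matching free() needed */
    apr_cpystrn(buf, "temporary data", 256);
    apr_pool_destroy(scratch);             /* frees buf and the pool itself */
}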

Multi-processing modules (MPMs)


MPMs are responsible for accepting network requests and dispatching those requests to children (processes or threads)
Two main MPMs are in use today:
Prefork MPM utilizes processes to handle requests
Worker MPM utilizes a threaded multi-process model to process requests

mpm_common.h contains code common to all MPMs; prefork.c and worker.c contain the prefork and worker MPM implementations (a small ap_mpm_query() sketch follows)
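A hedged sketch, not in the original deck, of how a module can ask the running MPM about its concurrency model via ap_mpm_query() instead of assuming prefork or worker; the helper name is illustrative:

#include "httpd.h"
#include "ap_mpm.h"

/* Returns non-zero if the loaded MPM uses threads (e.g., worker),
 * zero for a purely process-based MPM such as prefork. */
static int mpm_uses_threads(void)
{
    int threaded = 0;

    if (ap_mpm_query(AP_MPMQ_IS_THREADED, &threaded) != APR_SUCCESS)
        return 0;   /* assume processes if the query fails */
    return threaded;
}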

Modules and hooks


Apache uses modules to isolate functionality and to extend the core server's capabilities
Modules utilize hooks and filters to tie into the request processing flow (a minimal module sketch follows the example below)
The SHOW_HOOKS environment variable can be used to watch hook processing order:
$ export SHOW_HOOKS=1 && httpd -X
Registering hooks for core.c
  Hooked create_connection
  Hooked pre_connection
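For context, a hedged sketch of the hook mechanism; hello_module and the "hello" handler name are illustrative, not part of the stock distribution. The register_hooks callback is where a module ties itself into the request cycle:

#include <string.h>
#include "httpd.h"
#include "http_config.h"
#include "http_protocol.h"

/* Content handler invoked through the "handler" hook. */
static int hello_handler(request_rec *r)
{
    if (!r->handler || strcmp(r->handler, "hello") != 0)
        return DECLINED;                 /* let other handlers run */
    ap_set_content_type(r, "text/plain");
    ap_rputs("hello from a hook\n", r);
    return OK;
}

static void hello_register_hooks(apr_pool_t *p)
{
    ap_hook_handler(hello_handler, NULL, NULL, APR_HOOK_MIDDLE);
}

module AP_MODULE_DECLARE_DATA hello_module = {
    STANDARD20_MODULE_STUFF,
    NULL, NULL, NULL, NULL, NULL,        /* no per-dir or server config */
    hello_register_hooks                 /* register hooks on load */
};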

Buckets and brigades


Buckets are abstractions used by Apache to store data as it's read from or written to the network
The bucket API provides a rich set of functions to modify bucket contents
Brigades are chains of buckets used to efficiently pass data between filters
The brigade API provides a rich set of functions to modify the members of a brigade
apr_buckets.h provides a thorough description of buckets and brigades (a short brigade-building sketch follows)
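A rough sketch, with assumed names not taken from the slides, of how a brigade is typically assembled: one data bucket followed by an EOS bucket, the shape most output filters see:

#include "apr_buckets.h"

/* Build a brigade containing one heap bucket and an EOS marker.
 * "pool" and "alloc" would normally come from r->pool and
 * c->bucket_alloc respectively. */
static apr_bucket_brigade *make_brigade(apr_pool_t *pool,
                                        apr_bucket_alloc_t *alloc)
{
    apr_bucket_brigade *bb = apr_brigade_create(pool, alloc);
    apr_bucket *b = apr_bucket_heap_create("hello", 5, NULL, alloc);

    APR_BRIGADE_INSERT_TAIL(bb, b);                            /* data bucket */
    APR_BRIGADE_INSERT_TAIL(bb, apr_bucket_eos_create(alloc)); /* end of stream */
    return bb;
}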

Filters
Filters are used to read and transform data as it's read from and written to the network (a minimal output filter sketch appears below)
Filter chains allow the output from one filter to become the input to another filter
Two types of filters:
Input filters (e.g., mod_deflate, mod_ssl)
Output filters (e.g., mod_ssl, mod_headers)

util_filter.h provides a thorough description of input and output filter chains
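A hedged sketch of a do-nothing output filter; the PASSTHRU name is illustrative. A real filter such as mod_deflate would rewrite the buckets before handing the brigade to the next filter in the chain:

#include "httpd.h"
#include "util_filter.h"

/* Pass-through output filter: inspect or transform the buckets in bb,
 * then forward the brigade down the filter chain. */
static apr_status_t passthru_out_filter(ap_filter_t *f, apr_bucket_brigade *bb)
{
    return ap_pass_brigade(f->next, bb);
}

static void passthru_register_hooks(apr_pool_t *p)
{
    ap_register_output_filter("PASSTHRU", passthru_out_filter, NULL,
                              AP_FTYPE_CONTENT_SET);
}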

Debugging

Why debug?
Because software breaks
Because administrators get frustrated with messages similar to the following:
[Sat Feb 04 13:00:27 2006] [notice] child pid 8581 exit signal Segmentation fault (11), possible coredump in /var/tmp/apache2

And most importantly, because debugging can be fun and rewarding

Types of debugging
Post-mortem debugging
Transient problem debugging
Reverse engineering

Post-mortem debugging

What is post-mortem debugging?


Post-mortem debugging is the art of finding out why a system failed given a set of determinants (e.g., core file, audit trail, logfile messages)
Software post-mortem analysis typically relies on custom instrumented binaries, logfile messages, and core file analysis
Several tools can help with performing Apache post-mortem analysis:
Apache maintainer mode
mod_backtrace
GDB and the Apache GDB macros

Maintainer mode
Builds Apache with debugging support and custom instrumentation
Enabled at configure time:
$ ./configure --enable-maintainer-mode \
    --enable-exception-hook \
    --enable-mods-shared=most \
    --enable-deflate=shared \
    --prefix=/opt/apache2

mod_backtrace
Writes a stack trace to a logfile each time a critical signal (e.g., SIGSEGV) is received
Utilizes the Apache exception hook (sketched below)
Can help with diagnosing problems on platforms that don't reliably create core files
Supports Linux and FreeBSD; a Solaris patch is available on my website
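A hedged sketch of the mechanism mod_backtrace builds on: when Apache is configured with --enable-exception-hook, a module can register a fatal_exception hook that runs after a crash signal. The names log_crash and crashlog_register_hooks are illustrative:

#include "httpd.h"
#include "http_config.h"
#include "http_log.h"
#include "ap_mpm.h"

#if AP_ENABLE_EXCEPTION_HOOK
/* Runs after the child takes a fatal signal; mod_backtrace walks the
 * stack here, while this sketch only logs the signal and pid. */
static int log_crash(ap_exception_info_t *ei)
{
    ap_log_error(APLOG_MARK, APLOG_CRIT, 0, NULL,
                 "child pid %ld caught fatal signal %d",
                 (long)ei->pid, ei->sig);
    return OK;
}
#endif

static void crashlog_register_hooks(apr_pool_t *p)
{
#if AP_ENABLE_EXCEPTION_HOOK
    ap_hook_fatal_exception(log_crash, NULL, NULL, APR_HOOK_MIDDLE);
#endif
}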

Building and installing mod_backtrace


Enable exception hook
$ ./configure --enable-exception-hook

Compile mod_backtrace module


$ apxs -ci mod_backtrace.c

Enable module
LoadModule backtrace_module modules/mod_backtrace.so
EnableExceptionHook On

Watch Apache go splat


Check the error_log when Apache misbehaves
$ kill -SIGSEGV `pgrep httpd | tail -1`
$ tail -100 error_log
[Sat Feb 4 20:36:05 2006] pid 23514 mod_backtrace backtrace for sig 11 (thread "pid" 23514)
[Sat Feb 4 20:36:05 2006] pid 23514 mod_backtrace main() is at 326b0
/var/tmp/apache2/modules/mod_backtrace.so:bt_exception_hook+0x108
/var/tmp/apache2/bin/httpd:ap_run_fatal_exception+0x34
/var/tmp/apache2/bin/httpd:0x2a09c
/lib/libc.so.1:0xbfec8
/lib/libc.so.1:0xb4ff4
/lib/libc.so.1:_so_accept+0x8 [ Signal 11 (SEGV)]
/var/tmp/apache2/bin/httpd:unixd_accept+0x10
/var/tmp/apache2/bin/httpd:0x1c2ac
/var/tmp/apache2/bin/httpd:0x1c574
/var/tmp/apache2/bin/httpd:0x1c644
/var/tmp/apache2/bin/httpd:ap_mpm_run+0x76c
/var/tmp/apache2/bin/httpd:main+0x63c
/var/tmp/apache2/bin/httpd:_start+0x5c
[Sat Feb 4 20:36:05 2006] pid 23514 mod_backtrace end of backtrace

GDB macros
Apache comes with numerous GDB macros to print brigades, buckets, strings, filters, memnodes, tables, and process and server records
Macros are located in $SRCROOT/.gdbinit
Can be sourced using the gdb source command

Using GDB macros


$ httpd -X
$ gdb -q /usr/apache2/bin/httpd
(gdb) source apachemacros
(gdb) show user
User command dump_bucket:
  dump_bucket_ex $arg0 0
...
(gdb) info function ap_pass_brigade
All functions matching regular expression "ap_pass_brigade":
File util_filter.c:
apr_status_t ap_pass_brigade(ap_filter_t *, apr_bucket_brigade *);
(gdb) break ap_pass_brigade

Using GDB macros (cont.)


(gdb) attach 975
(gdb) continue
(gdb) backtrace 4
#0  ap_pass_brigade (next=0x129d18, bb=0x139168) at util_filter.c:489
#1  0x000291d4 in ap_http_header_filter (f=0x138568, b=0x139168) at http_protocol.c:1766
#2  0x0003ad5c in ap_pass_brigade (next=0x138568, bb=0x139168) at util_filter.c:512
#3  0x0003d444 in ap_content_length_filter (f=0x138550, b=0x139168) at protocol.c:1248
(gdb) next
(gdb) dump_brigade bb
dump of brigade 0x139168
   | type (address)        | length | data addr  | contents          | rc
--------------------------------------------------------------------------------
 0 | FILE (0x0012d918)     | 2326   | 0x0012da58 | [**unprintable**] | 1
 1 | EOS  (0x0012daa8)     | 0      | 0x00000000 |                   | n/a
end of brigade
(gdb) detach

Debugging transient problems

What is transient debugging?


Transient debugging is the art of correlating unacceptable behaviors to specific application and system components
Several utilities can help with debugging transient problems:
Chaosreader
Curl
DTrace
Ethereal
Firefox Live HTTP Headers

Curl
Versatile command line utility that can be used to debug web-based problems
Curl contains several advanced options to print protocol headers and connection errors
Invaluable utility for locating misbehaving servers and applications

Curl example
$ curl -v --user-agent "CURL DEBUG (`date`)" -H "X-foo: yikes" http://daemons.net
About to connect() to daemons.net port 80
Trying 66.148.84.65... * connected
Connected to daemons.net (66.148.84.65) port 80
> GET / HTTP/1.1
User-Agent: CURL DEBUG (Sat Feb 4 23:02:36 EST 2006)
Host: daemons.net
Pragma: no-cache
Accept: */*
X-foo: yikes
< HTTP/1.1 200 OK
< Date: Sun, 05 Feb 2006 04:04:13 GMT
< Server: Apache
< Last-Modified: Sun, 20 Jun 2004 14:39:21 GMT
< ETag: "5c186-912-c108d840"
< Accept-Ranges: bytes
< Content-Length: 2322
< Content-Type: text/html

DTrace
Dynamic tracing facility introduced in Solaris 10
Can dynamically instrument applications and the Solaris kernel down to the instruction level
Utilizes 30k+ probes distributed throughout the Solaris kernel
Designed to be used on production systems
No overhead when probes aren't enabled

DTrace script organization


DTrace scripts contain one or more probes, an optional predicate, and an optional action to perform (the default action is trace()):
provider:module:function:name
/ predicate /
{
    action();
}

DTrace example #1
Counting system calls made by Apache processes
$ dtrace -n 'syscall:::entry /execname == "httpd"/ { @calls[probefunc] = count(); }'

DTrace example #2
Determining WHO called writev
$ dtrace -n 'syscall::writev:entry / execname == "httpd" / { ustack(); }'

DTrace example #3
Tracing execution flow per request
#pragma D option flowindent

pid$target::ap_process_request:entry
{
    self->trace = 1;
}

pid$target::ap_process_request:return
{
    self->trace = 0;
}

pid$target:::entry,
pid$target:::return
/ self->trace /
{}

DTrace example #4
Tracing execution flow into the kernel
#pragma D option flowindent

pid$target::ap_read_request:entry
{
    self->trace = 1;
}

pid$target::ap_read_request:return
{
    self->trace = 0;
}

pid$target:::entry,
pid$target:::return,
fbt:::entry,
fbt:::return
/ self->trace /
{}

DTrace example #5
Watching Logical Apache I/O operations
syscall::write:entry
/ execname == "httpd" /
{
    printf("Apache wrote (%s) to fd %d (%s)\n",
        probefunc, arg0, fds[arg0].fi_pathname);
}

syscall::read:entry
/ execname == "httpd" /
{
    printf("Apache read (%s) from fd %d (%s)\n",
        probefunc, arg0, fds[arg0].fi_pathname);
}

DTrace example #6
Measuring read size
$ dtrace -n 'syscall::read:return / execname == "httpd" && errno == 0 / { @distribution["Average read size"] = quantize(arg1); }'

DTrace example #7
Measuring request processing time
pid$target::ap_read_request:entry
{
    self->ts = timestamp;
}

pid$target::ap_read_request:return
{
    self->method = arg1 == 0 ? "Unknown" :
        copyinstr(*(uintptr_t *)copyin(arg1 + 68, 4));
    self->uuri = arg1 == 0 ? "Unknown" :
        copyinstr(*(uintptr_t *)copyin(arg1 + 200, 4));
}

pid$target::ap_process_request:return
{
    /* timestamp is in nanoseconds, so divide by 1000 for microseconds */
    printf("Processed %s %s in %d microseconds\n",
        self->method, self->uuri, (timestamp - self->ts) / 1000);
    self->uuri = 0;
    self->ts = 0;
}

Conclusion
Debugging is cool! Debugging is great! Now it's time for me to escape! :-)

Questions?

References
Apache: http://apache.org
Chaosreader: http://users.tpg.com.au/bdgcvb/chaosreader.html
Curl: http://curl.haxx.se/
DTrace: http://opensolaris.org/os/community/dtrace/
Debugging Web Applications: http://www.samag.com/articles/2006/0603/
Ethereal: http://www.ethereal.com/
mod_backtrace: http://people.apache.org/~trawick/exception_hook.htm
Observing I/O Behavior with the DTraceToolkit: http://www.samag.com/documents/s=9915/sam0512a/0512a.htm
