mirror of
https://github.com/brendangregg/perf-tools.git
synced 2025-11-30 23:16:03 +07:00
uprobe for user-level tracing
This commit is contained in:
@@ -33,6 +33,7 @@ Using ftrace:
|
|||||||
- kernel/[funcslower](kernel/funcslower): trace kernel functions slower than a threshold. [Examples](examples/funcslower_example.txt).
|
- kernel/[funcslower](kernel/funcslower): trace kernel functions slower than a threshold. [Examples](examples/funcslower_example.txt).
|
||||||
- kernel/[funcgraph](kernel/funcgraph): trace a graph of kernel function calls, showing children and times. [Examples](examples/funcgraph_example.txt).
|
- kernel/[funcgraph](kernel/funcgraph): trace a graph of kernel function calls, showing children and times. [Examples](examples/funcgraph_example.txt).
|
||||||
- kernel/[kprobe](kernel/kprobe): dynamically trace a kernel function call or its return, with variables. [Examples](examples/kprobe_example.txt).
|
- kernel/[kprobe](kernel/kprobe): dynamically trace a kernel function call or its return, with variables. [Examples](examples/kprobe_example.txt).
|
||||||
|
- user/[uprobe](kernel/uprobe): dynamically trace a user-level function call or its return, with variables. [Examples](examples/uprobe_example.txt).
|
||||||
- tools/[reset-ftrace](tools/reset-ftrace): reset ftrace state if needed. [Examples](examples/reset-ftrace_example.txt).
|
- tools/[reset-ftrace](tools/reset-ftrace): reset ftrace state if needed. [Examples](examples/reset-ftrace_example.txt).
|
||||||
|
|
||||||
Using perf_events:
|
Using perf_events:
|
||||||
|
|||||||
1
bin/uprobe
Symbolic link
1
bin/uprobe
Symbolic link
@@ -0,0 +1 @@
|
|||||||
|
../user/uprobe
|
||||||
239
examples/uprobe_example.txt
Normal file
239
examples/uprobe_example.txt
Normal file
@@ -0,0 +1,239 @@
|
|||||||
|
Demonstrations of uprobe, the Linux ftrace version.
|
||||||
|
|
||||||
|
Trace the readline() function from all processes named "bash":
|
||||||
|
|
||||||
|
# ./uprobe p:bash:readline
|
||||||
|
Tracing uprobe readline (p:readline /bin/bash:0x8db60). Ctrl-C to end.
|
||||||
|
bash-11886 [003] d... 19601233.618462: readline: (0x48db60)
|
||||||
|
bash-11886 [003] d... 19601235.152067: readline: (0x48db60)
|
||||||
|
bash-11915 [003] d... 19601238.976244: readline: (0x48db60)
|
||||||
|
^C
|
||||||
|
Ending tracing...
|
||||||
|
|
||||||
|
readline() is the bash shell's function for reading interactive input, and
|
||||||
|
a line is printed each time I entered commands in separate bash shells.
|
||||||
|
The line contains default ftrace columns: the process name, "-", and PID;
|
||||||
|
the CPU, flags, a timestamp (in units of seconds), the probe name, then
|
||||||
|
other arguments. These columns are documented in the kernel source, under
|
||||||
|
Documentation/trace/ftrace.txt.
|
||||||
|
|
||||||
|
The first line of output is informational, and shows what uprobe is really
|
||||||
|
doing: it turned "bash" into "/bin/bash", using a $PATH lookup (via which(1)).
|
||||||
|
It then turned the "readline" symbol into 0x8db60, using objdump(1) for
|
||||||
|
symbol lookups.
|
||||||
|
|
||||||
|
Note that this traces _all_ bash processes simultaneously.
|
||||||
|
|
||||||
|
|
||||||
|
Tracing PID 11886 only:
|
||||||
|
|
||||||
|
# ./uprobe -p 11886 p:bash:readline
|
||||||
|
Tracing uprobe readline (p:readline /bin/bash:0x8db60). Ctrl-C to end.
|
||||||
|
bash-11886 [002] d... 19601657.753893: readline: (0x48db60)
|
||||||
|
bash-11886 [002] d... 19601658.246613: readline: (0x48db60)
|
||||||
|
bash-11886 [002] d... 19601658.386666: readline: (0x48db60)
|
||||||
|
bash-11886 [002] d... 19601661.415952: readline: (0x48db60)
|
||||||
|
^C
|
||||||
|
Ending tracing...
|
||||||
|
|
||||||
|
This may be important if you are tracing shared library functions, and only care
|
||||||
|
about one target process.
|
||||||
|
|
||||||
|
|
||||||
|
You can specify the full path to a binary to trace:
|
||||||
|
|
||||||
|
# ./uprobe p:/bin/bash:readline
|
||||||
|
Tracing uprobe readline (p:readline /bin/bash:0x8db60). Ctrl-C to end.
|
||||||
|
bash-11886 [002] d... 19601746.902461: readline: (0x48db60)
|
||||||
|
bash-11886 [002] d... 19601749.543485: readline: (0x48db60)
|
||||||
|
bash-11886 [001] d... 19601749.702369: readline: (0x48db60)
|
||||||
|
^C
|
||||||
|
Ending tracing...
|
||||||
|
|
||||||
|
This might be useful if uprobe picked the wrong binary to trace, as shown by
|
||||||
|
the informational line, and you wanted to specify it directly. It is also useful
|
||||||
|
for tracing binaries not in the $PATH, which uprobe can't otherwise find.
|
||||||
|
|
||||||
|
|
||||||
|
Use -l to list symbols available to trace; eg, searching for functions
|
||||||
|
containing "readline" in bash:
|
||||||
|
|
||||||
|
# ./uprobe -l bash | grep readline
|
||||||
|
initialize_readline
|
||||||
|
pcomp_set_readline_variables
|
||||||
|
posix_readline_initialize
|
||||||
|
readline
|
||||||
|
readline_internal_char
|
||||||
|
readline_internal_setup
|
||||||
|
readline_internal_teardown
|
||||||
|
|
||||||
|
|
||||||
|
Tracing the return of readline() with return value as a string:
|
||||||
|
|
||||||
|
# ./uprobe 'r:bash:readline +0($retval):string'
|
||||||
|
Tracing uprobe readline (r:readline /bin/bash:0x8db60 +0($retval):string). Ctrl-C to end.
|
||||||
|
bash-11886 [003] d... 19601837.001935: readline: (0x41e876 <- 0x48db60) arg1="ls -l"
|
||||||
|
bash-11886 [002] d... 19601851.008409: readline: (0x41e876 <- 0x48db60) arg1="echo "hello world""
|
||||||
|
bash-11886 [002] d... 19601854.099730: readline: (0x41e876 <- 0x48db60) arg1="df -h"
|
||||||
|
bash-11886 [002] d... 19601858.805740: readline: (0x41e876 <- 0x48db60) arg1="cd .."
|
||||||
|
bash-11886 [003] d... 19601898.378753: readline: (0x41e876 <- 0x48db60) arg1="foo bar"
|
||||||
|
^C
|
||||||
|
Ending tracing...
|
||||||
|
|
||||||
|
Now I can see the commands entered. Note that this traces what bash reads in,
|
||||||
|
even if the command eventually fails. Eg, the last command "foo bar" didn't
|
||||||
|
work (No command 'foo' found).
|
||||||
|
|
||||||
|
Note that this invocation now uses "r:" at the start of the probe description,
|
||||||
|
instead of "p:". r is for return probes, p for entry probes.
|
||||||
|
|
||||||
|
|
||||||
|
Tracing sleep() calls in all running libc shared libraries:
|
||||||
|
|
||||||
|
# ./uprobe p:libc:sleep
|
||||||
|
Tracing uprobe sleep (p:sleep /lib/x86_64-linux-gnu/libc-2.15.so:0xbf130). Ctrl-C to end.
|
||||||
|
svscan-2134 [000] d... 19602402.959904: sleep: (0x7f2dba562130)
|
||||||
|
cron-923 [000] d... 19602404.640507: sleep: (0x7f3e26d9e130)
|
||||||
|
cron-923 [002] d... 19602404.655232: sleep: (0x7f3e26d9e130)
|
||||||
|
cron-923 [002] d... 19602405.189271: sleep: (0x7f3e26d9e130)
|
||||||
|
svscan-2134 [000] d... 19602407.959947: sleep: (0x7f2dba562130)
|
||||||
|
[...]
|
||||||
|
|
||||||
|
This shows different programs calling sleep -- likely threads waiting for work.
|
||||||
|
|
||||||
|
I ran a "sleep 1" command in a bash shell, which wasn't seen above: probably
|
||||||
|
using a different sleep library call, which I'd need to trace separately.
|
||||||
|
|
||||||
|
|
||||||
|
Including headers (-H):
|
||||||
|
|
||||||
|
# ./uprobe -H p:libc:sleep
|
||||||
|
Tracing uprobe sleep (p:sleep /lib/x86_64-linux-gnu/libc-2.15.so:0xbf130). Ctrl-C to end.
|
||||||
|
# tracer: nop
|
||||||
|
#
|
||||||
|
# entries-in-buffer/entries-written: 0/0 #P:4
|
||||||
|
#
|
||||||
|
# _-----=> irqs-off
|
||||||
|
# / _----=> need-resched
|
||||||
|
# | / _---=> hardirq/softirq
|
||||||
|
# || / _--=> preempt-depth
|
||||||
|
# ||| / delay
|
||||||
|
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
|
||||||
|
# | | | |||| | |
|
||||||
|
svscan-2134 [000] d... 19603052.976770: sleep: (0x7f2dba562130)
|
||||||
|
svscan-2134 [002] d... 19603057.976927: sleep: (0x7f2dba562130)
|
||||||
|
[...]
|
||||||
|
|
||||||
|
These are documented in Documentation/trace/ftrace.txt.
|
||||||
|
|
||||||
|
|
||||||
|
Tracing sleep() with its argument (seconds):
|
||||||
|
|
||||||
|
# ./uprobe 'p:libc:sleep %di'
|
||||||
|
Tracing uprobe sleep (p:sleep /lib/x86_64-linux-gnu/libc-2.15.so:0xbf130 %di). Ctrl-C to end.
|
||||||
|
svscan-2134 [002] d... 19602517.962925: sleep: (0x7f2dba562130) arg1=0x5
|
||||||
|
svscan-2134 [002] d... 19602522.963082: sleep: (0x7f2dba562130) arg1=0x5
|
||||||
|
cron-923 [002] d... 19602524.187733: sleep: (0x7f3e26d9e130) arg1=0x3c
|
||||||
|
svscan-2134 [002] d... 19602527.963267: sleep: (0x7f2dba562130) arg1=0x5
|
||||||
|
[...]
|
||||||
|
|
||||||
|
So svcan was sleeping for 5 seconds, and cron for 60 seconds (0x3c = 60).
|
||||||
|
|
||||||
|
The argument is specified by its register, %di. This is platform dependent: %di
|
||||||
|
may only be meaningful on x86. If you're on a different architecture (eg, ARM),
|
||||||
|
you will probably need to use something else.
|
||||||
|
|
||||||
|
If working with registers is not for you, then consider tracing this using
|
||||||
|
perf_events with debuginfo installed: in which case you can use the variable
|
||||||
|
names. Or consider a different tracer.
|
||||||
|
|
||||||
|
|
||||||
|
Here is an example of the optional filter expression, to only trace the return
|
||||||
|
of fopen() when it failed and returned NULL (0):
|
||||||
|
|
||||||
|
# ./uprobe 'r:libc:fopen file=$retval' 'file == 0'
|
||||||
|
Tracing uprobe fopen (r:fopen /lib/x86_64-linux-gnu/libc-2.15.so:0x6e540 file=$retval). Ctrl-C to end.
|
||||||
|
prog1-23982 [000] d... 19602894.346872: fopen: (0x40051e <- 0x7f637867f540) file=0x0
|
||||||
|
^C
|
||||||
|
Ending tracing...
|
||||||
|
|
||||||
|
The argument $retval was given a vanity name "file", which was then tested in
|
||||||
|
the filter expression "file == 0".
|
||||||
|
|
||||||
|
|
||||||
|
Tracing func_abc() in my test program, and including user-level stacks:
|
||||||
|
|
||||||
|
# ./uprobe -s p:/root/func_abc:func_c
|
||||||
|
Tracing uprobe func_c (p:func_c /root/func_abc:0x4f4). Ctrl-C to end.
|
||||||
|
func_abc-25394 [000] d... 19603250.054040: func_c: (0x4004f4)
|
||||||
|
func_abc-25394 [000] d... 19603250.054056: <user stack trace>
|
||||||
|
=> <00000000004004f4>
|
||||||
|
=> <0000000000400527>
|
||||||
|
=> <0000000000400537>
|
||||||
|
=> <00007fca9f0e376d>
|
||||||
|
func_abc-25394 [000] d... 19603251.054250: func_c: (0x4004f4)
|
||||||
|
func_abc-25394 [000] d... 19603251.054266: <user stack trace>
|
||||||
|
=> <00000000004004f4>
|
||||||
|
=> <0000000000400527>
|
||||||
|
=> <0000000000400537>
|
||||||
|
=> <00007fca9f0e376d>
|
||||||
|
^C
|
||||||
|
Ending tracing...
|
||||||
|
|
||||||
|
The output has the raw hex addresses. If this is too much of a nuisance, then
|
||||||
|
try tracing this using perf_events which should automate the translation.
|
||||||
|
|
||||||
|
It can get worse, eg:
|
||||||
|
|
||||||
|
l# ./uprobe -s p:bash:readline
|
||||||
|
Tracing uprobe readline (p:readline /bin/bash:0x8db60). Ctrl-C to end.
|
||||||
|
bash-11886 [002] d... 19603434.397818: readline: (0x48db60)
|
||||||
|
bash-11886 [002] d... 19603434.397832: <user stack trace>
|
||||||
|
=> <000000000048db60>
|
||||||
|
bash-11886 [002] d... 19603434.592500: readline: (0x48db60)
|
||||||
|
bash-11886 [002] d... 19603434.592510: <user stack trace>
|
||||||
|
=> <000000000048db60>
|
||||||
|
^C
|
||||||
|
Ending tracing...
|
||||||
|
|
||||||
|
Here the stack trace is missing (0x48db60 is the traced function, transposed
|
||||||
|
from the base load address). This is due to compiler optimizations. It can be
|
||||||
|
fixed by recompiling with -fno-omit-frame-pointer, or, using perf_events and
|
||||||
|
a different method of stack walking.
|
||||||
|
|
||||||
|
|
||||||
|
Use -h to print the USAGE message:
|
||||||
|
|
||||||
|
# ./uprobe -h
|
||||||
|
USAGE: uprobe [-FhHsv] [-d secs] [-p PID] {-l target |
|
||||||
|
uprobe_definition [filter]}
|
||||||
|
-F # force. trace despite warnings.
|
||||||
|
-d seconds # trace duration, and use buffers
|
||||||
|
-l target # list functions from this executable
|
||||||
|
-p PID # PID to match on I/O issue
|
||||||
|
-v # view format file (don't trace)
|
||||||
|
-H # include column headers
|
||||||
|
-s # show user stack traces
|
||||||
|
-h # this usage message
|
||||||
|
|
||||||
|
Note that these examples may need modification to match your kernel
|
||||||
|
version's function names and platform's register usage.
|
||||||
|
eg,
|
||||||
|
# trace readline() calls in all running "bash" executables:
|
||||||
|
uprobe p:bash:readline
|
||||||
|
# trace readline() with explicit executable path:
|
||||||
|
uprobe p:/bin/bash:readline
|
||||||
|
# trace the return of readline() with return value as a string:
|
||||||
|
uprobe 'r:bash:readline +0($retval):string'
|
||||||
|
# trace strcpy() calls in all running libc shared libraries:
|
||||||
|
uprobe p:libc:sleep
|
||||||
|
# trace strcpy() with register %di (x86):
|
||||||
|
uprobe 'p:libc:sleep %di'
|
||||||
|
# trace this address (use caution, can cause failures):
|
||||||
|
uprobe p:libc:0xbf130
|
||||||
|
# trace gettimeofday() for PID 1182 only:
|
||||||
|
uprobe -p 1182 p:libc:gettimeofday
|
||||||
|
# trace the return of fopen() only when it returns NULL:
|
||||||
|
uprobe 'r:libc:fopen file=$retval' 'file == 0'
|
||||||
|
|
||||||
|
See the man page and example file for more info.
|
||||||
160
man/man8/uprobe.8
Normal file
160
man/man8/uprobe.8
Normal file
@@ -0,0 +1,160 @@
|
|||||||
|
.TH uprobe 8 "2014-07-20" "USER COMMANDS"
|
||||||
|
.SH NAME
|
||||||
|
uprobe \- trace a given uprobe definition. User-level dynamic tracing. Uses Linux ftrace. EXPERIMENTAL.
|
||||||
|
.SH SYNOPSIS
|
||||||
|
.B uprobe
|
||||||
|
[\-FhHsv] [\-d secs] [\-p PID] uprobe_definition [filter]
|
||||||
|
.SH DESCRIPTION
|
||||||
|
This will create, trace, then destroy a given uprobe definition. See
|
||||||
|
Documentation/trace/uprobetracer.txt in the Linux kernel source for the
|
||||||
|
syntax of a uprobe definition, and "uprobe -h" for examples. With this tool,
|
||||||
|
the probe alias is optional (it will default to something meaningful).
|
||||||
|
|
||||||
|
WARNING: This uses dynamic tracing of user-level functions, using some
|
||||||
|
relatively new kernel code. I have seen this cause target processes to fail,
|
||||||
|
either entering endless spin loops or crashing on illegal instructions. I
|
||||||
|
believe newer kernels (post 4.0) are relatively safer, but use caution. Test
|
||||||
|
in a lab environment, and know what you are doing, before use. Also consider
|
||||||
|
other (more developed) user-level tracers (perf_events, LTTng, etc.).
|
||||||
|
|
||||||
|
Also beware of widespread tracing that interferes with the operation of the
|
||||||
|
system, eg, tracing libc:malloc, which by-default will trace _all_ processes.
|
||||||
|
|
||||||
|
I wrote this because I kept testing different custom uprobes at the command
|
||||||
|
line, and wanted a way to automate the steps. For generic user-level
|
||||||
|
tracing, use perf_events directly.
|
||||||
|
|
||||||
|
Since this uses ftrace, only the root user can use this tool.
|
||||||
|
.SH REQUIREMENTS
|
||||||
|
REQUIREMENTS: FTRACE and UPROBE CONFIG, which you may already have on recent
|
||||||
|
kernel versions, file(1), ldconfig(8), objdump(1), and some version of awk.
|
||||||
|
Also, currently only executes on Linux 4.0+ (see WARNING) unless -F is used.
|
||||||
|
.SH OPTIONS
|
||||||
|
.TP
|
||||||
|
\-F
|
||||||
|
Force. Trace despite kernel version warnings. Use on older kernels may expose
|
||||||
|
you to (since fixed) bugs, which can lock up or crash target processes, which
|
||||||
|
could also lock up the entire system. Test in a lab environment before use,
|
||||||
|
and consider other more developed user-level tracers (perf_events, LTTng,
|
||||||
|
etc.).
|
||||||
|
.TP
|
||||||
|
\-d seconds
|
||||||
|
Set the duration of tracing, in seconds. Trace output will be buffered and
|
||||||
|
printed at the end. This also reduces overheads by buffering in-kernel,
|
||||||
|
instead of printing events as they occur.
|
||||||
|
|
||||||
|
The ftrace buffer has a fixed size per-CPU (see
|
||||||
|
/sys/kernel/debug/tracing/buffer_size_kb). If you think events are missing,
|
||||||
|
try increasing that size.
|
||||||
|
.TP
|
||||||
|
\-h
|
||||||
|
Print usage message.
|
||||||
|
.TP
|
||||||
|
\-H
|
||||||
|
Print column headers.
|
||||||
|
.TP
|
||||||
|
\-s
|
||||||
|
Print user-level stack traces after each event. These are currently printed
|
||||||
|
in hex, and need post-processing to see user-level symbols (eg, addr2line;
|
||||||
|
I should automate that).
|
||||||
|
.TP
|
||||||
|
\-v
|
||||||
|
Show the uprobe format file only (do not trace), identifying possible variables
|
||||||
|
for use in a custom filter.
|
||||||
|
.TP
|
||||||
|
\-p PID
|
||||||
|
Only trace user-level functions when this process ID is on-CPU.
|
||||||
|
.TP
|
||||||
|
uprobe_definition
|
||||||
|
A full uprobe definition, as documented by Documentation/trace/uprobetracer.txt
|
||||||
|
in the Linux kernel source. Note that the probe alias name is optional with
|
||||||
|
uprobe(8), and if not specified, it will default to something meaningful.
|
||||||
|
See the EXAMPLES section.
|
||||||
|
.TP
|
||||||
|
filter
|
||||||
|
An ftrace filter definition.
|
||||||
|
.SH EXAMPLES
|
||||||
|
These examples may need modification to match your target software function
|
||||||
|
names and platform's register usage. If using platform specific registers
|
||||||
|
becomes too painful in practice, consider a debuginfo-based tracer,
|
||||||
|
which can trace variables names instead (eg, perf_events).
|
||||||
|
.TP
|
||||||
|
trace readline() calls in all running "bash" executables:
|
||||||
|
#
|
||||||
|
.B uprobe p:bash:readline
|
||||||
|
.TP
|
||||||
|
trace readline() with explicit executable path:
|
||||||
|
#
|
||||||
|
.B uprobe p:/bin/bash:readline
|
||||||
|
.TP
|
||||||
|
trace the return of readline() with return value as a string:
|
||||||
|
#
|
||||||
|
.B uprobe 'r:bash:readline +0($retval):string'
|
||||||
|
.TP
|
||||||
|
trace sleep() calls in all running libc shared libraries:
|
||||||
|
#
|
||||||
|
.B uprobe p:libc:sleep
|
||||||
|
.TP
|
||||||
|
trace sleep() with register %di (x86):
|
||||||
|
#
|
||||||
|
.B uprobe 'p:libc:sleep %di'
|
||||||
|
.TP
|
||||||
|
trace this address (use caution, can cause failures):
|
||||||
|
#
|
||||||
|
.B uprobe p:libc:0xbf130
|
||||||
|
.TP
|
||||||
|
trace gettimeofday() for PID 1182 only:
|
||||||
|
#
|
||||||
|
.B uprobe -p 1182 p:libc:gettimeofday
|
||||||
|
.TP
|
||||||
|
trace the return of fopen() only when it returns NULL:
|
||||||
|
#
|
||||||
|
.B uprobe 'r:libc:fopen file=$retval' 'file == 0'
|
||||||
|
.SH FIELDS
|
||||||
|
The output format depends on the kernel version, and headings can be printed
|
||||||
|
using \-H. The format is the same as the ftrace function trace format, described
|
||||||
|
in the kernel source under Documentation/trace/ftrace.txt.
|
||||||
|
|
||||||
|
Typical fields are:
|
||||||
|
.TP
|
||||||
|
TASK-PID
|
||||||
|
The process name (which could include dashes), a dash, and the process ID.
|
||||||
|
.TP
|
||||||
|
CPU#
|
||||||
|
The CPU ID, in brackets.
|
||||||
|
.TP
|
||||||
|
||||
|
||||||
|
Kernel state flags. For example, on Linux 3.16 these are for irqs-off,
|
||||||
|
need-resched, hardirq/softirq, and preempt-depth.
|
||||||
|
.TP
|
||||||
|
TIMESTAMP
|
||||||
|
Time of event, in seconds.
|
||||||
|
.TP
|
||||||
|
FUNCTION
|
||||||
|
User-level function name.
|
||||||
|
.SH OVERHEAD
|
||||||
|
This can generate a lot of trace data quickly, depending on the
|
||||||
|
frequency of the traced events. Such data will cause performance overheads.
|
||||||
|
This also works without buffering by default, printing function events
|
||||||
|
as they happen (uses trace_pipe), context switching and consuming CPU to do
|
||||||
|
so. If needed, you can try the "\-d secs" option, which buffers events
|
||||||
|
instead, reducing overhead. If you think the buffer option is losing events,
|
||||||
|
try increasing the buffer size (buffer_size_kb).
|
||||||
|
|
||||||
|
If you find a use for uprobe(8) where the overhead is prohibitive, consider
|
||||||
|
the same enabling using perf_events where overhead should be reduced.
|
||||||
|
.SH SOURCE
|
||||||
|
This is from the perf-tools collection:
|
||||||
|
.IP
|
||||||
|
https://github.com/brendangregg/perf-tools
|
||||||
|
.PP
|
||||||
|
Also look under the examples directory for a text file containing example
|
||||||
|
usage, output, and commentary for this tool.
|
||||||
|
.SH OS
|
||||||
|
Linux
|
||||||
|
.SH STABILITY
|
||||||
|
Unstable - in development.
|
||||||
|
.SH AUTHOR
|
||||||
|
Brendan Gregg
|
||||||
|
.SH SEE ALSO
|
||||||
|
kprobe(8)
|
||||||
373
user/uprobe
Executable file
373
user/uprobe
Executable file
@@ -0,0 +1,373 @@
|
|||||||
|
#!/bin/bash
|
||||||
|
#
|
||||||
|
# uprobe - trace a given uprobe definition. Kernel dynamic tracing.
|
||||||
|
# Written using Linux ftrace. Experimental.
|
||||||
|
#
|
||||||
|
# This will create, trace, then destroy a given uprobe definition. See
|
||||||
|
# Documentation/trace/uprobetrace.txt in the Linux kernel source for the
|
||||||
|
# syntax of a uprobe definition, and "uprobe -h" for examples. With this tool,
|
||||||
|
# the probe alias is optional (it will default to something meaningful).
|
||||||
|
#
|
||||||
|
# USAGE: ./uprobe [-FhHsv] [-d secs] [-p pid] uprobe_definition [filter]
|
||||||
|
#
|
||||||
|
# Run "uprobe -h" for full usage.
|
||||||
|
#
|
||||||
|
# WARNING: This uses dynamic tracing of user-level functions, using some
|
||||||
|
# relatively new kernel code. I have seen this cause target processes to fail,
|
||||||
|
# either entering endless spin loops or crashing on illegal instructions. I
|
||||||
|
# believe newer kernels (post 4.0) are relatively safer, but use caution. Test
|
||||||
|
# in a lab environment, and know what you are doing, before use.
|
||||||
|
#
|
||||||
|
# Also beware of widespread tracing that interferes with the operation of the
|
||||||
|
# system, eg, tracing libc:malloc, which by-default will trace _all_ processes.
|
||||||
|
# Test in a lab environment before use.
|
||||||
|
#
|
||||||
|
# I wrote this because I kept testing different custom uprobes at the command
|
||||||
|
# line, and wanted a way to automate the steps. For generic user-level
|
||||||
|
# tracing, use perf_events directly.
|
||||||
|
#
|
||||||
|
# REQUIREMENTS: FTRACE and UPROBE CONFIG, which you may already have on recent
|
||||||
|
# kernel versions, file(1), ldconfig(8), objdump(1), and some version of awk.
|
||||||
|
# Also, currently only executes on Linux 4.0+ (see WARNING) unless -F is used.
|
||||||
|
#
|
||||||
|
# From perf-tools: https://github.com/brendangregg/perf-tools
|
||||||
|
#
|
||||||
|
# See the uprobe(8) man page (in perf-tools) for more info.
|
||||||
|
#
|
||||||
|
# COPYRIGHT: Copyright (c) 2015 Brendan Gregg.
|
||||||
|
#
|
||||||
|
# This program is free software; you can redistribute it and/or
|
||||||
|
# modify it under the terms of the GNU General Public License
|
||||||
|
# as published by the Free Software Foundation; either version 2
|
||||||
|
# of the License, or (at your option) any later version.
|
||||||
|
#
|
||||||
|
# This program is distributed in the hope that it will be useful,
|
||||||
|
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||||
|
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||||
|
# GNU General Public License for more details.
|
||||||
|
#
|
||||||
|
# You should have received a copy of the GNU General Public License
|
||||||
|
# along with this program; if not, write to the Free Software Foundation,
|
||||||
|
# Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
|
||||||
|
#
|
||||||
|
# (http://www.gnu.org/copyleft/gpl.html)
|
||||||
|
#
|
||||||
|
# 27-Jul-2015 Brendan Gregg Created this.
|
||||||
|
|
||||||
|
### default variables
|
||||||
|
tracing=/sys/kernel/debug/tracing
|
||||||
|
flock=/var/tmp/.ftrace-lock; wroteflock=0
|
||||||
|
opt_duration=0; duration=; opt_pid=0; pid=; opt_filter=0; filter=
|
||||||
|
opt_view=0; opt_headers=0; opt_stack=0; dmesg=2; debug=0; opt_force=0
|
||||||
|
opt_list=0; target=
|
||||||
|
PATH=$PATH:/usr/bin:/sbin # ensure we find objdump, ldconfig
|
||||||
|
trap ':' INT QUIT TERM PIPE HUP # sends execution to end tracing section
|
||||||
|
|
||||||
|
function usage {
|
||||||
|
cat <<-END >&2
|
||||||
|
USAGE: uprobe [-FhHsv] [-d secs] [-p PID] {-l target |
|
||||||
|
uprobe_definition [filter]}
|
||||||
|
-F # force. trace despite warnings.
|
||||||
|
-d seconds # trace duration, and use buffers
|
||||||
|
-l target # list functions from this executable
|
||||||
|
-p PID # PID to match on I/O issue
|
||||||
|
-v # view format file (don't trace)
|
||||||
|
-H # include column headers
|
||||||
|
-s # show user stack traces
|
||||||
|
-h # this usage message
|
||||||
|
|
||||||
|
Note that these examples may need modification to match your kernel
|
||||||
|
version's function names and platform's register usage.
|
||||||
|
eg,
|
||||||
|
# trace readline() calls in all running "bash" executables:
|
||||||
|
uprobe p:bash:readline
|
||||||
|
# trace readline() with explicit executable path:
|
||||||
|
uprobe p:/bin/bash:readline
|
||||||
|
# trace the return of readline() with return value as a string:
|
||||||
|
uprobe 'r:bash:readline +0(\$retval):string'
|
||||||
|
# trace sleep() calls in all running libc shared libraries:
|
||||||
|
uprobe p:libc:sleep
|
||||||
|
# trace sleep() with register %di (x86):
|
||||||
|
uprobe 'p:libc:sleep %di'
|
||||||
|
# trace this address (use caution, can cause failures):
|
||||||
|
uprobe p:libc:0xbf130
|
||||||
|
# trace gettimeofday() for PID 1182 only:
|
||||||
|
uprobe -p 1182 p:libc:gettimeofday
|
||||||
|
# trace the return of fopen() only when it returns NULL:
|
||||||
|
uprobe 'r:libc:fopen file=\$retval' 'file == 0'
|
||||||
|
|
||||||
|
See the man page and example file for more info.
|
||||||
|
END
|
||||||
|
exit
|
||||||
|
}
|
||||||
|
|
||||||
|
function warn {
|
||||||
|
if ! eval "$@"; then
|
||||||
|
echo >&2 "WARNING: command failed \"$@\""
|
||||||
|
fi
|
||||||
|
}
|
||||||
|
|
||||||
|
function end {
|
||||||
|
# disable tracing
|
||||||
|
echo 2>/dev/null
|
||||||
|
echo "Ending tracing..." 2>/dev/null
|
||||||
|
cd $tracing
|
||||||
|
warn "echo 0 > events/uprobes/$uname/enable"
|
||||||
|
if (( opt_filter )); then
|
||||||
|
warn "echo 0 > events/uprobes/$uname/filter"
|
||||||
|
fi
|
||||||
|
warn "echo -:$uname >> uprobe_events"
|
||||||
|
(( opt_stack )) && warn "echo 0 > options/userstacktrace"
|
||||||
|
warn "echo > trace"
|
||||||
|
(( wroteflock )) && warn "rm $flock"
|
||||||
|
}
|
||||||
|
|
||||||
|
function die {
|
||||||
|
echo >&2 "$@"
|
||||||
|
exit 1
|
||||||
|
}
|
||||||
|
|
||||||
|
function edie {
|
||||||
|
# die with a quiet end()
|
||||||
|
echo >&2 "$@"
|
||||||
|
exec >/dev/null 2>&1
|
||||||
|
end
|
||||||
|
exit 1
|
||||||
|
}
|
||||||
|
|
||||||
|
function set_path {
|
||||||
|
name=$1
|
||||||
|
|
||||||
|
path=$(which $name)
|
||||||
|
if [[ "$path" == "" ]]; then
|
||||||
|
path=$(ldconfig -v 2>/dev/null | awk -v lib=$name '
|
||||||
|
$1 ~ /:/ { sub(/:/, "", $1); path = $1 }
|
||||||
|
{ sub(/\..*/, "", $1); }
|
||||||
|
$1 == lib { print path "/" $3 }')
|
||||||
|
if [[ "$path" == "" ]]; then
|
||||||
|
die "ERROR: segment \"$name\" ambiguous.
|
||||||
|
program or library? Try a full path."
|
||||||
|
fi
|
||||||
|
fi
|
||||||
|
|
||||||
|
if [[ ! -x $path ]]; then
|
||||||
|
die "ERROR: resolved \"$name\" to \"$path\", but file missing"
|
||||||
|
fi
|
||||||
|
}
|
||||||
|
|
||||||
|
function set_addr {
|
||||||
|
path=$1
|
||||||
|
name=$2
|
||||||
|
sym=$3
|
||||||
|
|
||||||
|
[[ "$path" == "" ]] && die "ERROR: missing symbol path."
|
||||||
|
[[ "$sym" == "" ]] && die "ERROR: missing symbol for $path"
|
||||||
|
|
||||||
|
addr=$(objdump -tT $path | awk -v sym=$sym '
|
||||||
|
$NF == sym { print $1; exit }')
|
||||||
|
[[ "$addr" == "" ]] && die "ERROR: missing symbol \"$sym\" in $path"
|
||||||
|
(( 0x$addr == 0 )) && die "ERROR: failed resolving \"$sym\" in $path." \
|
||||||
|
"Maybe it exists in a different target (eg, library)?"
|
||||||
|
addr=0x$( printf "%x" 0x$addr ) # strip leading zeros
|
||||||
|
|
||||||
|
type=$(file $path)
|
||||||
|
if [[ "$type" != *shared?object* ]]; then
|
||||||
|
# subtract the base mapping address. see Documentation/trace/
|
||||||
|
# uprobetracer.txt to understand the following steps.
|
||||||
|
pid=$(pgrep -nx $name)
|
||||||
|
[[ "$pid" == "" ]] && die "ERROR: no PID found for \"$name\"."
|
||||||
|
range=$(grep 'r-xp.*'$path /proc/$pid/maps)
|
||||||
|
[[ "$range" == "" ]] && \
|
||||||
|
die "ERROR: no addr range found for PID $pid and segment" \
|
||||||
|
"$path. Searched for r-xp in /proc/$pid/maps."
|
||||||
|
base=$range
|
||||||
|
base=0x${base%%-*}
|
||||||
|
addr=$(( addr - base ))
|
||||||
|
if (( addr < 0 )); then
|
||||||
|
echo "WARNING: problems removing base addr from $sym." \
|
||||||
|
"Trying untransposed addr."
|
||||||
|
addr=$(( addr + base ))
|
||||||
|
fi
|
||||||
|
addr=0x$( printf "%x" $addr)
|
||||||
|
fi
|
||||||
|
}
|
||||||
|
|
||||||
|
### process options
|
||||||
|
while getopts Fd:hHl:p:sv opt
|
||||||
|
do
|
||||||
|
case $opt in
|
||||||
|
F) opt_force=1 ;;
|
||||||
|
d) opt_duration=1; duration=$OPTARG ;;
|
||||||
|
p) opt_pid=1; pid=$OPTARG ;;
|
||||||
|
l) opt_list=1; target=$OPTARG ;;
|
||||||
|
H) opt_headers=1 ;;
|
||||||
|
s) opt_stack=1 ;;
|
||||||
|
v) opt_view=1 ;;
|
||||||
|
h|?) usage ;;
|
||||||
|
esac
|
||||||
|
done
|
||||||
|
shift $(( $OPTIND - 1 ))
|
||||||
|
uprobe=$1
|
||||||
|
shift
|
||||||
|
if (( $# )); then
|
||||||
|
opt_filter=1
|
||||||
|
filter=$1
|
||||||
|
fi
|
||||||
|
|
||||||
|
### handle listing
|
||||||
|
[[ "$opt_list" == 1 && "$uprobe" != "" ]] && die "ERROR: -l takes a target only"
|
||||||
|
if (( opt_list )); then
|
||||||
|
if [[ "$target" != */* ]]; then
|
||||||
|
set_path $target
|
||||||
|
target=$path
|
||||||
|
fi
|
||||||
|
objdump -tT $target | awk '$4 == ".text" { print $NF }' | sort
|
||||||
|
exit
|
||||||
|
fi
|
||||||
|
|
||||||
|
### check kernel version
|
||||||
|
ver=$(uname -r)
|
||||||
|
maj=${ver%%.*}
|
||||||
|
if (( opt_force == 0 && $maj < 4 )); then
|
||||||
|
cat <<-END >&2
|
||||||
|
ERROR: Kernel version >= 4.0 preferred (you have $ver). Aborting.
|
||||||
|
|
||||||
|
Background: uprobes were first added in 3.5. I've tested them on 3.13,
|
||||||
|
and found them unsafe, as they can crash or lock up processes, which can
|
||||||
|
effectively lock up the system. On 4.0, uprobes seem much safer (but may
|
||||||
|
not be 100%). They may be safer (safe?) on other 3.x kernels, but I
|
||||||
|
don't know. You can use -F to force tracing, but you've been warned.
|
||||||
|
END
|
||||||
|
exit
|
||||||
|
fi
|
||||||
|
|
||||||
|
### option logic
|
||||||
|
[[ "$uprobe" == "" ]] && usage
|
||||||
|
(( opt_pid && opt_filter )) && die "ERROR: use either -p or a filter."
|
||||||
|
(( opt_duration && opt_view )) && die "ERROR: use either -d or -v."
|
||||||
|
if (( opt_pid )); then
|
||||||
|
# convert to filter
|
||||||
|
opt_filter=1
|
||||||
|
filter="common_pid == $pid"
|
||||||
|
fi
|
||||||
|
if [[ "$uprobe" != p:* && "$uprobe" != r:* ]]; then
|
||||||
|
echo >&2 "ERROR: invalid uprobe definition (should start with p: or r:)"
|
||||||
|
usage
|
||||||
|
fi
|
||||||
|
#
|
||||||
|
# Parse the following:
|
||||||
|
# p:bash:readline
|
||||||
|
# p:my bash:readline
|
||||||
|
# p:bash:readline %si
|
||||||
|
# r:bash:readline $ret
|
||||||
|
# p:my bash:readline %si
|
||||||
|
# p:bash:readline si=%si
|
||||||
|
# p:my bash:readline si=%si
|
||||||
|
# r:bash:readline cmd=+0($retval):string
|
||||||
|
# ... and all of the above with /bin/bash instead of bash
|
||||||
|
# ... and all of the above with libc:sleep instead of ...
|
||||||
|
# ... and all of the above with /lib/x86_64-linux-gnu/libc.so.6:sleep ...
|
||||||
|
# ... and all of the above with symbol addresses
|
||||||
|
# ... and examples from USAGE message
|
||||||
|
# The following code is not as complicated as it looks.
|
||||||
|
#
|
||||||
|
utype=${uprobe%%:*}
|
||||||
|
urest="${uprobe#*:} "
|
||||||
|
set -- $urest
|
||||||
|
if [[ $1 == *:* ]]; then
|
||||||
|
uname=; probe=$1; shift; uargs="$@"
|
||||||
|
else
|
||||||
|
uname=$1; probe=$2; shift 2; uargs="$@"
|
||||||
|
fi
|
||||||
|
path=$probe; path=${path%%:*}
|
||||||
|
addr=$probe; addr=${addr##*:}
|
||||||
|
|
||||||
|
# set seg and fix path (eg, seg=bash, path=/bin/bash)
|
||||||
|
if [[ $path == */* ]]; then
|
||||||
|
seg=${path##*/}
|
||||||
|
seg=${seg%%.*}
|
||||||
|
else
|
||||||
|
seg=$path
|
||||||
|
# determine path, eg, given "zsh" or "libc"
|
||||||
|
set_path $path
|
||||||
|
fi
|
||||||
|
|
||||||
|
# fix uname and addr (eg, uname=readline, addr=0x8db60)
|
||||||
|
if [[ "$addr" == 0x* ]]; then
|
||||||
|
# symbol unknown; default to seg+addr
|
||||||
|
[[ "$uname" == "" ]] && uname=${seg}_$addr
|
||||||
|
else
|
||||||
|
[[ "$uname" == "" ]] && uname=$addr
|
||||||
|
set_addr $path $seg $addr
|
||||||
|
fi
|
||||||
|
|
||||||
|
# construct uprobe
|
||||||
|
uprobe="$utype:$uname $path:$addr"
|
||||||
|
[[ "$uargs" != "" ]] && uprobe="$uprobe $uargs"
|
||||||
|
|
||||||
|
if (( debug )); then
|
||||||
|
echo "uname: \"$uname\", uprobe: \"$uprobe\""
|
||||||
|
fi
|
||||||
|
|
||||||
|
### check permissions
|
||||||
|
cd $tracing || die "ERROR: accessing tracing. Root user? Kernel has FTRACE?
|
||||||
|
debugfs mounted? (mount -t debugfs debugfs /sys/kernel/debug)"
|
||||||
|
|
||||||
|
if (( !opt_view )); then
|
||||||
|
if (( opt_duration )); then
|
||||||
|
echo "Tracing uprobe $uname for $duration seconds (buffered)..."
|
||||||
|
else
|
||||||
|
echo "Tracing uprobe $uname ($uprobe). Ctrl-C to end."
|
||||||
|
fi
|
||||||
|
fi
|
||||||
|
|
||||||
|
### ftrace lock
|
||||||
|
[[ -e $flock ]] && die "ERROR: ftrace may be in use by PID $(cat $flock) $flock"
|
||||||
|
echo $$ > $flock || die "ERROR: unable to write $flock."
|
||||||
|
wroteflock=1
|
||||||
|
|
||||||
|
### setup and begin tracing
|
||||||
|
echo nop > current_tracer
|
||||||
|
if ! echo "$uprobe" >> uprobe_events; then
|
||||||
|
echo >&2 "ERROR: adding uprobe \"$uprobe\"."
|
||||||
|
if (( dmesg )); then
|
||||||
|
echo >&2 "Last $dmesg dmesg entries (might contain reason):"
|
||||||
|
dmesg | tail -$dmesg | sed 's/^/ /'
|
||||||
|
fi
|
||||||
|
edie "Exiting."
|
||||||
|
fi
|
||||||
|
if (( opt_view )); then
|
||||||
|
cat events/uprobes/$uname/format
|
||||||
|
edie ""
|
||||||
|
fi
|
||||||
|
if (( opt_filter )); then
|
||||||
|
if ! echo "$filter" > events/uprobes/$uname/filter; then
|
||||||
|
edie "ERROR: setting filter or -p. Exiting."
|
||||||
|
fi
|
||||||
|
fi
|
||||||
|
if (( opt_stack )); then
|
||||||
|
if ! echo 1 > options/userstacktrace; then
|
||||||
|
edie "ERROR: enabling stack traces (-s). Exiting"
|
||||||
|
fi
|
||||||
|
fi
|
||||||
|
if ! echo 1 > events/uprobes/$uname/enable; then
|
||||||
|
edie "ERROR: enabling uprobe $uname. Exiting."
|
||||||
|
fi
|
||||||
|
|
||||||
|
### print trace buffer
|
||||||
|
warn "echo > trace"
|
||||||
|
if (( opt_duration )); then
|
||||||
|
sleep $duration
|
||||||
|
if (( opt_headers )); then
|
||||||
|
cat trace
|
||||||
|
else
|
||||||
|
grep -v '^#' trace
|
||||||
|
fi
|
||||||
|
else
|
||||||
|
# trace_pipe lack headers, so fetch them from trace
|
||||||
|
(( opt_headers )) && cat trace
|
||||||
|
cat trace_pipe
|
||||||
|
fi
|
||||||
|
|
||||||
|
### end tracing
|
||||||
|
end
|
||||||
Reference in New Issue
Block a user