mirror of
https://github.com/brendangregg/perf-tools.git
synced 2025-11-30 23:16:03 +07:00
uprobe for user-level tracing
This commit is contained in:
@@ -33,6 +33,7 @@ Using ftrace:
|
||||
- kernel/[funcslower](kernel/funcslower): trace kernel functions slower than a threshold. [Examples](examples/funcslower_example.txt).
|
||||
- kernel/[funcgraph](kernel/funcgraph): trace a graph of kernel function calls, showing children and times. [Examples](examples/funcgraph_example.txt).
|
||||
- kernel/[kprobe](kernel/kprobe): dynamically trace a kernel function call or its return, with variables. [Examples](examples/kprobe_example.txt).
|
||||
- user/[uprobe](kernel/uprobe): dynamically trace a user-level function call or its return, with variables. [Examples](examples/uprobe_example.txt).
|
||||
- tools/[reset-ftrace](tools/reset-ftrace): reset ftrace state if needed. [Examples](examples/reset-ftrace_example.txt).
|
||||
|
||||
Using perf_events:
|
||||
|
||||
1
bin/uprobe
Symbolic link
1
bin/uprobe
Symbolic link
@@ -0,0 +1 @@
|
||||
../user/uprobe
|
||||
239
examples/uprobe_example.txt
Normal file
239
examples/uprobe_example.txt
Normal file
@@ -0,0 +1,239 @@
|
||||
Demonstrations of uprobe, the Linux ftrace version.
|
||||
|
||||
Trace the readline() function from all processes named "bash":
|
||||
|
||||
# ./uprobe p:bash:readline
|
||||
Tracing uprobe readline (p:readline /bin/bash:0x8db60). Ctrl-C to end.
|
||||
bash-11886 [003] d... 19601233.618462: readline: (0x48db60)
|
||||
bash-11886 [003] d... 19601235.152067: readline: (0x48db60)
|
||||
bash-11915 [003] d... 19601238.976244: readline: (0x48db60)
|
||||
^C
|
||||
Ending tracing...
|
||||
|
||||
readline() is the bash shell's function for reading interactive input, and
|
||||
a line is printed each time I entered commands in separate bash shells.
|
||||
The line contains default ftrace columns: the process name, "-", and PID;
|
||||
the CPU, flags, a timestamp (in units of seconds), the probe name, then
|
||||
other arguments. These columns are documented in the kernel source, under
|
||||
Documentation/trace/ftrace.txt.
|
||||
|
||||
The first line of output is informational, and shows what uprobe is really
|
||||
doing: it turned "bash" into "/bin/bash", using a $PATH lookup (via which(1)).
|
||||
It then turned the "readline" symbol into 0x8db60, using objdump(1) for
|
||||
symbol lookups.
|
||||
|
||||
Note that this traces _all_ bash processes simultaneously.
|
||||
|
||||
|
||||
Tracing PID 11886 only:
|
||||
|
||||
# ./uprobe -p 11886 p:bash:readline
|
||||
Tracing uprobe readline (p:readline /bin/bash:0x8db60). Ctrl-C to end.
|
||||
bash-11886 [002] d... 19601657.753893: readline: (0x48db60)
|
||||
bash-11886 [002] d... 19601658.246613: readline: (0x48db60)
|
||||
bash-11886 [002] d... 19601658.386666: readline: (0x48db60)
|
||||
bash-11886 [002] d... 19601661.415952: readline: (0x48db60)
|
||||
^C
|
||||
Ending tracing...
|
||||
|
||||
This may be important if you are tracing shared library functions, and only care
|
||||
about one target process.
|
||||
|
||||
|
||||
You can specify the full path to a binary to trace:
|
||||
|
||||
# ./uprobe p:/bin/bash:readline
|
||||
Tracing uprobe readline (p:readline /bin/bash:0x8db60). Ctrl-C to end.
|
||||
bash-11886 [002] d... 19601746.902461: readline: (0x48db60)
|
||||
bash-11886 [002] d... 19601749.543485: readline: (0x48db60)
|
||||
bash-11886 [001] d... 19601749.702369: readline: (0x48db60)
|
||||
^C
|
||||
Ending tracing...
|
||||
|
||||
This might be useful if uprobe picked the wrong binary to trace, as shown by
|
||||
the informational line, and you wanted to specify it directly. It is also useful
|
||||
for tracing binaries not in the $PATH, which uprobe can't otherwise find.
|
||||
|
||||
|
||||
Use -l to list symbols available to trace; eg, searching for functions
|
||||
containing "readline" in bash:
|
||||
|
||||
# ./uprobe -l bash | grep readline
|
||||
initialize_readline
|
||||
pcomp_set_readline_variables
|
||||
posix_readline_initialize
|
||||
readline
|
||||
readline_internal_char
|
||||
readline_internal_setup
|
||||
readline_internal_teardown
|
||||
|
||||
|
||||
Tracing the return of readline() with return value as a string:
|
||||
|
||||
# ./uprobe 'r:bash:readline +0($retval):string'
|
||||
Tracing uprobe readline (r:readline /bin/bash:0x8db60 +0($retval):string). Ctrl-C to end.
|
||||
bash-11886 [003] d... 19601837.001935: readline: (0x41e876 <- 0x48db60) arg1="ls -l"
|
||||
bash-11886 [002] d... 19601851.008409: readline: (0x41e876 <- 0x48db60) arg1="echo "hello world""
|
||||
bash-11886 [002] d... 19601854.099730: readline: (0x41e876 <- 0x48db60) arg1="df -h"
|
||||
bash-11886 [002] d... 19601858.805740: readline: (0x41e876 <- 0x48db60) arg1="cd .."
|
||||
bash-11886 [003] d... 19601898.378753: readline: (0x41e876 <- 0x48db60) arg1="foo bar"
|
||||
^C
|
||||
Ending tracing...
|
||||
|
||||
Now I can see the commands entered. Note that this traces what bash reads in,
|
||||
even if the command eventually fails. Eg, the last command "foo bar" didn't
|
||||
work (No command 'foo' found).
|
||||
|
||||
Note that this invocation now uses "r:" at the start of the probe description,
|
||||
instead of "p:". r is for return probes, p for entry probes.
|
||||
|
||||
|
||||
Tracing sleep() calls in all running libc shared libraries:
|
||||
|
||||
# ./uprobe p:libc:sleep
|
||||
Tracing uprobe sleep (p:sleep /lib/x86_64-linux-gnu/libc-2.15.so:0xbf130). Ctrl-C to end.
|
||||
svscan-2134 [000] d... 19602402.959904: sleep: (0x7f2dba562130)
|
||||
cron-923 [000] d... 19602404.640507: sleep: (0x7f3e26d9e130)
|
||||
cron-923 [002] d... 19602404.655232: sleep: (0x7f3e26d9e130)
|
||||
cron-923 [002] d... 19602405.189271: sleep: (0x7f3e26d9e130)
|
||||
svscan-2134 [000] d... 19602407.959947: sleep: (0x7f2dba562130)
|
||||
[...]
|
||||
|
||||
This shows different programs calling sleep -- likely threads waiting for work.
|
||||
|
||||
I ran a "sleep 1" command in a bash shell, which wasn't seen above: probably
|
||||
using a different sleep library call, which I'd need to trace separately.
|
||||
|
||||
|
||||
Including headers (-H):
|
||||
|
||||
# ./uprobe -H p:libc:sleep
|
||||
Tracing uprobe sleep (p:sleep /lib/x86_64-linux-gnu/libc-2.15.so:0xbf130). Ctrl-C to end.
|
||||
# tracer: nop
|
||||
#
|
||||
# entries-in-buffer/entries-written: 0/0 #P:4
|
||||
#
|
||||
# _-----=> irqs-off
|
||||
# / _----=> need-resched
|
||||
# | / _---=> hardirq/softirq
|
||||
# || / _--=> preempt-depth
|
||||
# ||| / delay
|
||||
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
|
||||
# | | | |||| | |
|
||||
svscan-2134 [000] d... 19603052.976770: sleep: (0x7f2dba562130)
|
||||
svscan-2134 [002] d... 19603057.976927: sleep: (0x7f2dba562130)
|
||||
[...]
|
||||
|
||||
These are documented in Documentation/trace/ftrace.txt.
|
||||
|
||||
|
||||
Tracing sleep() with its argument (seconds):
|
||||
|
||||
# ./uprobe 'p:libc:sleep %di'
|
||||
Tracing uprobe sleep (p:sleep /lib/x86_64-linux-gnu/libc-2.15.so:0xbf130 %di). Ctrl-C to end.
|
||||
svscan-2134 [002] d... 19602517.962925: sleep: (0x7f2dba562130) arg1=0x5
|
||||
svscan-2134 [002] d... 19602522.963082: sleep: (0x7f2dba562130) arg1=0x5
|
||||
cron-923 [002] d... 19602524.187733: sleep: (0x7f3e26d9e130) arg1=0x3c
|
||||
svscan-2134 [002] d... 19602527.963267: sleep: (0x7f2dba562130) arg1=0x5
|
||||
[...]
|
||||
|
||||
So svcan was sleeping for 5 seconds, and cron for 60 seconds (0x3c = 60).
|
||||
|
||||
The argument is specified by its register, %di. This is platform dependent: %di
|
||||
may only be meaningful on x86. If you're on a different architecture (eg, ARM),
|
||||
you will probably need to use something else.
|
||||
|
||||
If working with registers is not for you, then consider tracing this using
|
||||
perf_events with debuginfo installed: in which case you can use the variable
|
||||
names. Or consider a different tracer.
|
||||
|
||||
|
||||
Here is an example of the optional filter expression, to only trace the return
|
||||
of fopen() when it failed and returned NULL (0):
|
||||
|
||||
# ./uprobe 'r:libc:fopen file=$retval' 'file == 0'
|
||||
Tracing uprobe fopen (r:fopen /lib/x86_64-linux-gnu/libc-2.15.so:0x6e540 file=$retval). Ctrl-C to end.
|
||||
prog1-23982 [000] d... 19602894.346872: fopen: (0x40051e <- 0x7f637867f540) file=0x0
|
||||
^C
|
||||
Ending tracing...
|
||||
|
||||
The argument $retval was given a vanity name "file", which was then tested in
|
||||
the filter expression "file == 0".
|
||||
|
||||
|
||||
Tracing func_abc() in my test program, and including user-level stacks:
|
||||
|
||||
# ./uprobe -s p:/root/func_abc:func_c
|
||||
Tracing uprobe func_c (p:func_c /root/func_abc:0x4f4). Ctrl-C to end.
|
||||
func_abc-25394 [000] d... 19603250.054040: func_c: (0x4004f4)
|
||||
func_abc-25394 [000] d... 19603250.054056: <user stack trace>
|
||||
=> <00000000004004f4>
|
||||
=> <0000000000400527>
|
||||
=> <0000000000400537>
|
||||
=> <00007fca9f0e376d>
|
||||
func_abc-25394 [000] d... 19603251.054250: func_c: (0x4004f4)
|
||||
func_abc-25394 [000] d... 19603251.054266: <user stack trace>
|
||||
=> <00000000004004f4>
|
||||
=> <0000000000400527>
|
||||
=> <0000000000400537>
|
||||
=> <00007fca9f0e376d>
|
||||
^C
|
||||
Ending tracing...
|
||||
|
||||
The output has the raw hex addresses. If this is too much of a nuisance, then
|
||||
try tracing this using perf_events which should automate the translation.
|
||||
|
||||
It can get worse, eg:
|
||||
|
||||
l# ./uprobe -s p:bash:readline
|
||||
Tracing uprobe readline (p:readline /bin/bash:0x8db60). Ctrl-C to end.
|
||||
bash-11886 [002] d... 19603434.397818: readline: (0x48db60)
|
||||
bash-11886 [002] d... 19603434.397832: <user stack trace>
|
||||
=> <000000000048db60>
|
||||
bash-11886 [002] d... 19603434.592500: readline: (0x48db60)
|
||||
bash-11886 [002] d... 19603434.592510: <user stack trace>
|
||||
=> <000000000048db60>
|
||||
^C
|
||||
Ending tracing...
|
||||
|
||||
Here the stack trace is missing (0x48db60 is the traced function, transposed
|
||||
from the base load address). This is due to compiler optimizations. It can be
|
||||
fixed by recompiling with -fno-omit-frame-pointer, or, using perf_events and
|
||||
a different method of stack walking.
|
||||
|
||||
|
||||
Use -h to print the USAGE message:
|
||||
|
||||
# ./uprobe -h
|
||||
USAGE: uprobe [-FhHsv] [-d secs] [-p PID] {-l target |
|
||||
uprobe_definition [filter]}
|
||||
-F # force. trace despite warnings.
|
||||
-d seconds # trace duration, and use buffers
|
||||
-l target # list functions from this executable
|
||||
-p PID # PID to match on I/O issue
|
||||
-v # view format file (don't trace)
|
||||
-H # include column headers
|
||||
-s # show user stack traces
|
||||
-h # this usage message
|
||||
|
||||
Note that these examples may need modification to match your kernel
|
||||
version's function names and platform's register usage.
|
||||
eg,
|
||||
# trace readline() calls in all running "bash" executables:
|
||||
uprobe p:bash:readline
|
||||
# trace readline() with explicit executable path:
|
||||
uprobe p:/bin/bash:readline
|
||||
# trace the return of readline() with return value as a string:
|
||||
uprobe 'r:bash:readline +0($retval):string'
|
||||
# trace strcpy() calls in all running libc shared libraries:
|
||||
uprobe p:libc:sleep
|
||||
# trace strcpy() with register %di (x86):
|
||||
uprobe 'p:libc:sleep %di'
|
||||
# trace this address (use caution, can cause failures):
|
||||
uprobe p:libc:0xbf130
|
||||
# trace gettimeofday() for PID 1182 only:
|
||||
uprobe -p 1182 p:libc:gettimeofday
|
||||
# trace the return of fopen() only when it returns NULL:
|
||||
uprobe 'r:libc:fopen file=$retval' 'file == 0'
|
||||
|
||||
See the man page and example file for more info.
|
||||
160
man/man8/uprobe.8
Normal file
160
man/man8/uprobe.8
Normal file
@@ -0,0 +1,160 @@
|
||||
.TH uprobe 8 "2014-07-20" "USER COMMANDS"
|
||||
.SH NAME
|
||||
uprobe \- trace a given uprobe definition. User-level dynamic tracing. Uses Linux ftrace. EXPERIMENTAL.
|
||||
.SH SYNOPSIS
|
||||
.B uprobe
|
||||
[\-FhHsv] [\-d secs] [\-p PID] uprobe_definition [filter]
|
||||
.SH DESCRIPTION
|
||||
This will create, trace, then destroy a given uprobe definition. See
|
||||
Documentation/trace/uprobetracer.txt in the Linux kernel source for the
|
||||
syntax of a uprobe definition, and "uprobe -h" for examples. With this tool,
|
||||
the probe alias is optional (it will default to something meaningful).
|
||||
|
||||
WARNING: This uses dynamic tracing of user-level functions, using some
|
||||
relatively new kernel code. I have seen this cause target processes to fail,
|
||||
either entering endless spin loops or crashing on illegal instructions. I
|
||||
believe newer kernels (post 4.0) are relatively safer, but use caution. Test
|
||||
in a lab environment, and know what you are doing, before use. Also consider
|
||||
other (more developed) user-level tracers (perf_events, LTTng, etc.).
|
||||
|
||||
Also beware of widespread tracing that interferes with the operation of the
|
||||
system, eg, tracing libc:malloc, which by-default will trace _all_ processes.
|
||||
|
||||
I wrote this because I kept testing different custom uprobes at the command
|
||||
line, and wanted a way to automate the steps. For generic user-level
|
||||
tracing, use perf_events directly.
|
||||
|
||||
Since this uses ftrace, only the root user can use this tool.
|
||||
.SH REQUIREMENTS
|
||||
REQUIREMENTS: FTRACE and UPROBE CONFIG, which you may already have on recent
|
||||
kernel versions, file(1), ldconfig(8), objdump(1), and some version of awk.
|
||||
Also, currently only executes on Linux 4.0+ (see WARNING) unless -F is used.
|
||||
.SH OPTIONS
|
||||
.TP
|
||||
\-F
|
||||
Force. Trace despite kernel version warnings. Use on older kernels may expose
|
||||
you to (since fixed) bugs, which can lock up or crash target processes, which
|
||||
could also lock up the entire system. Test in a lab environment before use,
|
||||
and consider other more developed user-level tracers (perf_events, LTTng,
|
||||
etc.).
|
||||
.TP
|
||||
\-d seconds
|
||||
Set the duration of tracing, in seconds. Trace output will be buffered and
|
||||
printed at the end. This also reduces overheads by buffering in-kernel,
|
||||
instead of printing events as they occur.
|
||||
|
||||
The ftrace buffer has a fixed size per-CPU (see
|
||||
/sys/kernel/debug/tracing/buffer_size_kb). If you think events are missing,
|
||||
try increasing that size.
|
||||
.TP
|
||||
\-h
|
||||
Print usage message.
|
||||
.TP
|
||||
\-H
|
||||
Print column headers.
|
||||
.TP
|
||||
\-s
|
||||
Print user-level stack traces after each event. These are currently printed
|
||||
in hex, and need post-processing to see user-level symbols (eg, addr2line;
|
||||
I should automate that).
|
||||
.TP
|
||||
\-v
|
||||
Show the uprobe format file only (do not trace), identifying possible variables
|
||||
for use in a custom filter.
|
||||
.TP
|
||||
\-p PID
|
||||
Only trace user-level functions when this process ID is on-CPU.
|
||||
.TP
|
||||
uprobe_definition
|
||||
A full uprobe definition, as documented by Documentation/trace/uprobetracer.txt
|
||||
in the Linux kernel source. Note that the probe alias name is optional with
|
||||
uprobe(8), and if not specified, it will default to something meaningful.
|
||||
See the EXAMPLES section.
|
||||
.TP
|
||||
filter
|
||||
An ftrace filter definition.
|
||||
.SH EXAMPLES
|
||||
These examples may need modification to match your target software function
|
||||
names and platform's register usage. If using platform specific registers
|
||||
becomes too painful in practice, consider a debuginfo-based tracer,
|
||||
which can trace variables names instead (eg, perf_events).
|
||||
.TP
|
||||
trace readline() calls in all running "bash" executables:
|
||||
#
|
||||
.B uprobe p:bash:readline
|
||||
.TP
|
||||
trace readline() with explicit executable path:
|
||||
#
|
||||
.B uprobe p:/bin/bash:readline
|
||||
.TP
|
||||
trace the return of readline() with return value as a string:
|
||||
#
|
||||
.B uprobe 'r:bash:readline +0($retval):string'
|
||||
.TP
|
||||
trace sleep() calls in all running libc shared libraries:
|
||||
#
|
||||
.B uprobe p:libc:sleep
|
||||
.TP
|
||||
trace sleep() with register %di (x86):
|
||||
#
|
||||
.B uprobe 'p:libc:sleep %di'
|
||||
.TP
|
||||
trace this address (use caution, can cause failures):
|
||||
#
|
||||
.B uprobe p:libc:0xbf130
|
||||
.TP
|
||||
trace gettimeofday() for PID 1182 only:
|
||||
#
|
||||
.B uprobe -p 1182 p:libc:gettimeofday
|
||||
.TP
|
||||
trace the return of fopen() only when it returns NULL:
|
||||
#
|
||||
.B uprobe 'r:libc:fopen file=$retval' 'file == 0'
|
||||
.SH FIELDS
|
||||
The output format depends on the kernel version, and headings can be printed
|
||||
using \-H. The format is the same as the ftrace function trace format, described
|
||||
in the kernel source under Documentation/trace/ftrace.txt.
|
||||
|
||||
Typical fields are:
|
||||
.TP
|
||||
TASK-PID
|
||||
The process name (which could include dashes), a dash, and the process ID.
|
||||
.TP
|
||||
CPU#
|
||||
The CPU ID, in brackets.
|
||||
.TP
|
||||
||||
|
||||
Kernel state flags. For example, on Linux 3.16 these are for irqs-off,
|
||||
need-resched, hardirq/softirq, and preempt-depth.
|
||||
.TP
|
||||
TIMESTAMP
|
||||
Time of event, in seconds.
|
||||
.TP
|
||||
FUNCTION
|
||||
User-level function name.
|
||||
.SH OVERHEAD
|
||||
This can generate a lot of trace data quickly, depending on the
|
||||
frequency of the traced events. Such data will cause performance overheads.
|
||||
This also works without buffering by default, printing function events
|
||||
as they happen (uses trace_pipe), context switching and consuming CPU to do
|
||||
so. If needed, you can try the "\-d secs" option, which buffers events
|
||||
instead, reducing overhead. If you think the buffer option is losing events,
|
||||
try increasing the buffer size (buffer_size_kb).
|
||||
|
||||
If you find a use for uprobe(8) where the overhead is prohibitive, consider
|
||||
the same enabling using perf_events where overhead should be reduced.
|
||||
.SH SOURCE
|
||||
This is from the perf-tools collection:
|
||||
.IP
|
||||
https://github.com/brendangregg/perf-tools
|
||||
.PP
|
||||
Also look under the examples directory for a text file containing example
|
||||
usage, output, and commentary for this tool.
|
||||
.SH OS
|
||||
Linux
|
||||
.SH STABILITY
|
||||
Unstable - in development.
|
||||
.SH AUTHOR
|
||||
Brendan Gregg
|
||||
.SH SEE ALSO
|
||||
kprobe(8)
|
||||
373
user/uprobe
Executable file
373
user/uprobe
Executable file
@@ -0,0 +1,373 @@
|
||||
#!/bin/bash
|
||||
#
|
||||
# uprobe - trace a given uprobe definition. Kernel dynamic tracing.
|
||||
# Written using Linux ftrace. Experimental.
|
||||
#
|
||||
# This will create, trace, then destroy a given uprobe definition. See
|
||||
# Documentation/trace/uprobetrace.txt in the Linux kernel source for the
|
||||
# syntax of a uprobe definition, and "uprobe -h" for examples. With this tool,
|
||||
# the probe alias is optional (it will default to something meaningful).
|
||||
#
|
||||
# USAGE: ./uprobe [-FhHsv] [-d secs] [-p pid] uprobe_definition [filter]
|
||||
#
|
||||
# Run "uprobe -h" for full usage.
|
||||
#
|
||||
# WARNING: This uses dynamic tracing of user-level functions, using some
|
||||
# relatively new kernel code. I have seen this cause target processes to fail,
|
||||
# either entering endless spin loops or crashing on illegal instructions. I
|
||||
# believe newer kernels (post 4.0) are relatively safer, but use caution. Test
|
||||
# in a lab environment, and know what you are doing, before use.
|
||||
#
|
||||
# Also beware of widespread tracing that interferes with the operation of the
|
||||
# system, eg, tracing libc:malloc, which by-default will trace _all_ processes.
|
||||
# Test in a lab environment before use.
|
||||
#
|
||||
# I wrote this because I kept testing different custom uprobes at the command
|
||||
# line, and wanted a way to automate the steps. For generic user-level
|
||||
# tracing, use perf_events directly.
|
||||
#
|
||||
# REQUIREMENTS: FTRACE and UPROBE CONFIG, which you may already have on recent
|
||||
# kernel versions, file(1), ldconfig(8), objdump(1), and some version of awk.
|
||||
# Also, currently only executes on Linux 4.0+ (see WARNING) unless -F is used.
|
||||
#
|
||||
# From perf-tools: https://github.com/brendangregg/perf-tools
|
||||
#
|
||||
# See the uprobe(8) man page (in perf-tools) for more info.
|
||||
#
|
||||
# COPYRIGHT: Copyright (c) 2015 Brendan Gregg.
|
||||
#
|
||||
# This program is free software; you can redistribute it and/or
|
||||
# modify it under the terms of the GNU General Public License
|
||||
# as published by the Free Software Foundation; either version 2
|
||||
# of the License, or (at your option) any later version.
|
||||
#
|
||||
# This program is distributed in the hope that it will be useful,
|
||||
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
# GNU General Public License for more details.
|
||||
#
|
||||
# You should have received a copy of the GNU General Public License
|
||||
# along with this program; if not, write to the Free Software Foundation,
|
||||
# Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
|
||||
#
|
||||
# (http://www.gnu.org/copyleft/gpl.html)
|
||||
#
|
||||
# 27-Jul-2015 Brendan Gregg Created this.
|
||||
|
||||
### default variables
|
||||
tracing=/sys/kernel/debug/tracing
|
||||
flock=/var/tmp/.ftrace-lock; wroteflock=0
|
||||
opt_duration=0; duration=; opt_pid=0; pid=; opt_filter=0; filter=
|
||||
opt_view=0; opt_headers=0; opt_stack=0; dmesg=2; debug=0; opt_force=0
|
||||
opt_list=0; target=
|
||||
PATH=$PATH:/usr/bin:/sbin # ensure we find objdump, ldconfig
|
||||
trap ':' INT QUIT TERM PIPE HUP # sends execution to end tracing section
|
||||
|
||||
function usage {
|
||||
cat <<-END >&2
|
||||
USAGE: uprobe [-FhHsv] [-d secs] [-p PID] {-l target |
|
||||
uprobe_definition [filter]}
|
||||
-F # force. trace despite warnings.
|
||||
-d seconds # trace duration, and use buffers
|
||||
-l target # list functions from this executable
|
||||
-p PID # PID to match on I/O issue
|
||||
-v # view format file (don't trace)
|
||||
-H # include column headers
|
||||
-s # show user stack traces
|
||||
-h # this usage message
|
||||
|
||||
Note that these examples may need modification to match your kernel
|
||||
version's function names and platform's register usage.
|
||||
eg,
|
||||
# trace readline() calls in all running "bash" executables:
|
||||
uprobe p:bash:readline
|
||||
# trace readline() with explicit executable path:
|
||||
uprobe p:/bin/bash:readline
|
||||
# trace the return of readline() with return value as a string:
|
||||
uprobe 'r:bash:readline +0(\$retval):string'
|
||||
# trace sleep() calls in all running libc shared libraries:
|
||||
uprobe p:libc:sleep
|
||||
# trace sleep() with register %di (x86):
|
||||
uprobe 'p:libc:sleep %di'
|
||||
# trace this address (use caution, can cause failures):
|
||||
uprobe p:libc:0xbf130
|
||||
# trace gettimeofday() for PID 1182 only:
|
||||
uprobe -p 1182 p:libc:gettimeofday
|
||||
# trace the return of fopen() only when it returns NULL:
|
||||
uprobe 'r:libc:fopen file=\$retval' 'file == 0'
|
||||
|
||||
See the man page and example file for more info.
|
||||
END
|
||||
exit
|
||||
}
|
||||
|
||||
function warn {
|
||||
if ! eval "$@"; then
|
||||
echo >&2 "WARNING: command failed \"$@\""
|
||||
fi
|
||||
}
|
||||
|
||||
function end {
|
||||
# disable tracing
|
||||
echo 2>/dev/null
|
||||
echo "Ending tracing..." 2>/dev/null
|
||||
cd $tracing
|
||||
warn "echo 0 > events/uprobes/$uname/enable"
|
||||
if (( opt_filter )); then
|
||||
warn "echo 0 > events/uprobes/$uname/filter"
|
||||
fi
|
||||
warn "echo -:$uname >> uprobe_events"
|
||||
(( opt_stack )) && warn "echo 0 > options/userstacktrace"
|
||||
warn "echo > trace"
|
||||
(( wroteflock )) && warn "rm $flock"
|
||||
}
|
||||
|
||||
function die {
|
||||
echo >&2 "$@"
|
||||
exit 1
|
||||
}
|
||||
|
||||
function edie {
|
||||
# die with a quiet end()
|
||||
echo >&2 "$@"
|
||||
exec >/dev/null 2>&1
|
||||
end
|
||||
exit 1
|
||||
}
|
||||
|
||||
function set_path {
|
||||
name=$1
|
||||
|
||||
path=$(which $name)
|
||||
if [[ "$path" == "" ]]; then
|
||||
path=$(ldconfig -v 2>/dev/null | awk -v lib=$name '
|
||||
$1 ~ /:/ { sub(/:/, "", $1); path = $1 }
|
||||
{ sub(/\..*/, "", $1); }
|
||||
$1 == lib { print path "/" $3 }')
|
||||
if [[ "$path" == "" ]]; then
|
||||
die "ERROR: segment \"$name\" ambiguous.
|
||||
program or library? Try a full path."
|
||||
fi
|
||||
fi
|
||||
|
||||
if [[ ! -x $path ]]; then
|
||||
die "ERROR: resolved \"$name\" to \"$path\", but file missing"
|
||||
fi
|
||||
}
|
||||
|
||||
function set_addr {
|
||||
path=$1
|
||||
name=$2
|
||||
sym=$3
|
||||
|
||||
[[ "$path" == "" ]] && die "ERROR: missing symbol path."
|
||||
[[ "$sym" == "" ]] && die "ERROR: missing symbol for $path"
|
||||
|
||||
addr=$(objdump -tT $path | awk -v sym=$sym '
|
||||
$NF == sym { print $1; exit }')
|
||||
[[ "$addr" == "" ]] && die "ERROR: missing symbol \"$sym\" in $path"
|
||||
(( 0x$addr == 0 )) && die "ERROR: failed resolving \"$sym\" in $path." \
|
||||
"Maybe it exists in a different target (eg, library)?"
|
||||
addr=0x$( printf "%x" 0x$addr ) # strip leading zeros
|
||||
|
||||
type=$(file $path)
|
||||
if [[ "$type" != *shared?object* ]]; then
|
||||
# subtract the base mapping address. see Documentation/trace/
|
||||
# uprobetracer.txt to understand the following steps.
|
||||
pid=$(pgrep -nx $name)
|
||||
[[ "$pid" == "" ]] && die "ERROR: no PID found for \"$name\"."
|
||||
range=$(grep 'r-xp.*'$path /proc/$pid/maps)
|
||||
[[ "$range" == "" ]] && \
|
||||
die "ERROR: no addr range found for PID $pid and segment" \
|
||||
"$path. Searched for r-xp in /proc/$pid/maps."
|
||||
base=$range
|
||||
base=0x${base%%-*}
|
||||
addr=$(( addr - base ))
|
||||
if (( addr < 0 )); then
|
||||
echo "WARNING: problems removing base addr from $sym." \
|
||||
"Trying untransposed addr."
|
||||
addr=$(( addr + base ))
|
||||
fi
|
||||
addr=0x$( printf "%x" $addr)
|
||||
fi
|
||||
}
|
||||
|
||||
### process options
|
||||
while getopts Fd:hHl:p:sv opt
|
||||
do
|
||||
case $opt in
|
||||
F) opt_force=1 ;;
|
||||
d) opt_duration=1; duration=$OPTARG ;;
|
||||
p) opt_pid=1; pid=$OPTARG ;;
|
||||
l) opt_list=1; target=$OPTARG ;;
|
||||
H) opt_headers=1 ;;
|
||||
s) opt_stack=1 ;;
|
||||
v) opt_view=1 ;;
|
||||
h|?) usage ;;
|
||||
esac
|
||||
done
|
||||
shift $(( $OPTIND - 1 ))
|
||||
uprobe=$1
|
||||
shift
|
||||
if (( $# )); then
|
||||
opt_filter=1
|
||||
filter=$1
|
||||
fi
|
||||
|
||||
### handle listing
|
||||
[[ "$opt_list" == 1 && "$uprobe" != "" ]] && die "ERROR: -l takes a target only"
|
||||
if (( opt_list )); then
|
||||
if [[ "$target" != */* ]]; then
|
||||
set_path $target
|
||||
target=$path
|
||||
fi
|
||||
objdump -tT $target | awk '$4 == ".text" { print $NF }' | sort
|
||||
exit
|
||||
fi
|
||||
|
||||
### check kernel version
|
||||
ver=$(uname -r)
|
||||
maj=${ver%%.*}
|
||||
if (( opt_force == 0 && $maj < 4 )); then
|
||||
cat <<-END >&2
|
||||
ERROR: Kernel version >= 4.0 preferred (you have $ver). Aborting.
|
||||
|
||||
Background: uprobes were first added in 3.5. I've tested them on 3.13,
|
||||
and found them unsafe, as they can crash or lock up processes, which can
|
||||
effectively lock up the system. On 4.0, uprobes seem much safer (but may
|
||||
not be 100%). They may be safer (safe?) on other 3.x kernels, but I
|
||||
don't know. You can use -F to force tracing, but you've been warned.
|
||||
END
|
||||
exit
|
||||
fi
|
||||
|
||||
### option logic
|
||||
[[ "$uprobe" == "" ]] && usage
|
||||
(( opt_pid && opt_filter )) && die "ERROR: use either -p or a filter."
|
||||
(( opt_duration && opt_view )) && die "ERROR: use either -d or -v."
|
||||
if (( opt_pid )); then
|
||||
# convert to filter
|
||||
opt_filter=1
|
||||
filter="common_pid == $pid"
|
||||
fi
|
||||
if [[ "$uprobe" != p:* && "$uprobe" != r:* ]]; then
|
||||
echo >&2 "ERROR: invalid uprobe definition (should start with p: or r:)"
|
||||
usage
|
||||
fi
|
||||
#
|
||||
# Parse the following:
|
||||
# p:bash:readline
|
||||
# p:my bash:readline
|
||||
# p:bash:readline %si
|
||||
# r:bash:readline $ret
|
||||
# p:my bash:readline %si
|
||||
# p:bash:readline si=%si
|
||||
# p:my bash:readline si=%si
|
||||
# r:bash:readline cmd=+0($retval):string
|
||||
# ... and all of the above with /bin/bash instead of bash
|
||||
# ... and all of the above with libc:sleep instead of ...
|
||||
# ... and all of the above with /lib/x86_64-linux-gnu/libc.so.6:sleep ...
|
||||
# ... and all of the above with symbol addresses
|
||||
# ... and examples from USAGE message
|
||||
# The following code is not as complicated as it looks.
|
||||
#
|
||||
utype=${uprobe%%:*}
|
||||
urest="${uprobe#*:} "
|
||||
set -- $urest
|
||||
if [[ $1 == *:* ]]; then
|
||||
uname=; probe=$1; shift; uargs="$@"
|
||||
else
|
||||
uname=$1; probe=$2; shift 2; uargs="$@"
|
||||
fi
|
||||
path=$probe; path=${path%%:*}
|
||||
addr=$probe; addr=${addr##*:}
|
||||
|
||||
# set seg and fix path (eg, seg=bash, path=/bin/bash)
|
||||
if [[ $path == */* ]]; then
|
||||
seg=${path##*/}
|
||||
seg=${seg%%.*}
|
||||
else
|
||||
seg=$path
|
||||
# determine path, eg, given "zsh" or "libc"
|
||||
set_path $path
|
||||
fi
|
||||
|
||||
# fix uname and addr (eg, uname=readline, addr=0x8db60)
|
||||
if [[ "$addr" == 0x* ]]; then
|
||||
# symbol unknown; default to seg+addr
|
||||
[[ "$uname" == "" ]] && uname=${seg}_$addr
|
||||
else
|
||||
[[ "$uname" == "" ]] && uname=$addr
|
||||
set_addr $path $seg $addr
|
||||
fi
|
||||
|
||||
# construct uprobe
|
||||
uprobe="$utype:$uname $path:$addr"
|
||||
[[ "$uargs" != "" ]] && uprobe="$uprobe $uargs"
|
||||
|
||||
if (( debug )); then
|
||||
echo "uname: \"$uname\", uprobe: \"$uprobe\""
|
||||
fi
|
||||
|
||||
### check permissions
|
||||
cd $tracing || die "ERROR: accessing tracing. Root user? Kernel has FTRACE?
|
||||
debugfs mounted? (mount -t debugfs debugfs /sys/kernel/debug)"
|
||||
|
||||
if (( !opt_view )); then
|
||||
if (( opt_duration )); then
|
||||
echo "Tracing uprobe $uname for $duration seconds (buffered)..."
|
||||
else
|
||||
echo "Tracing uprobe $uname ($uprobe). Ctrl-C to end."
|
||||
fi
|
||||
fi
|
||||
|
||||
### ftrace lock
|
||||
[[ -e $flock ]] && die "ERROR: ftrace may be in use by PID $(cat $flock) $flock"
|
||||
echo $$ > $flock || die "ERROR: unable to write $flock."
|
||||
wroteflock=1
|
||||
|
||||
### setup and begin tracing
|
||||
echo nop > current_tracer
|
||||
if ! echo "$uprobe" >> uprobe_events; then
|
||||
echo >&2 "ERROR: adding uprobe \"$uprobe\"."
|
||||
if (( dmesg )); then
|
||||
echo >&2 "Last $dmesg dmesg entries (might contain reason):"
|
||||
dmesg | tail -$dmesg | sed 's/^/ /'
|
||||
fi
|
||||
edie "Exiting."
|
||||
fi
|
||||
if (( opt_view )); then
|
||||
cat events/uprobes/$uname/format
|
||||
edie ""
|
||||
fi
|
||||
if (( opt_filter )); then
|
||||
if ! echo "$filter" > events/uprobes/$uname/filter; then
|
||||
edie "ERROR: setting filter or -p. Exiting."
|
||||
fi
|
||||
fi
|
||||
if (( opt_stack )); then
|
||||
if ! echo 1 > options/userstacktrace; then
|
||||
edie "ERROR: enabling stack traces (-s). Exiting"
|
||||
fi
|
||||
fi
|
||||
if ! echo 1 > events/uprobes/$uname/enable; then
|
||||
edie "ERROR: enabling uprobe $uname. Exiting."
|
||||
fi
|
||||
|
||||
### print trace buffer
|
||||
warn "echo > trace"
|
||||
if (( opt_duration )); then
|
||||
sleep $duration
|
||||
if (( opt_headers )); then
|
||||
cat trace
|
||||
else
|
||||
grep -v '^#' trace
|
||||
fi
|
||||
else
|
||||
# trace_pipe lack headers, so fetch them from trace
|
||||
(( opt_headers )) && cat trace
|
||||
cat trace_pipe
|
||||
fi
|
||||
|
||||
### end tracing
|
||||
end
|
||||
Reference in New Issue
Block a user