mirror of
https://github.com/brendangregg/perf-tools.git
synced 2025-11-30 23:16:03 +07:00
cachestat for page cache statistics
This commit is contained in:
@@ -25,6 +25,7 @@ Using ftrace:
|
||||
- [execsnoop](execsnoop): trace process exec() with command line argument details. [Examples](examples/execsnoop_example.txt).
|
||||
- [opensnoop](opensnoop): trace open() syscalls showing filenames. [Examples](examples/opensnoop_example.txt).
|
||||
- [killsnoop](killsnoop): trace kill() signals showing process and signal details. [Examples](examples/killsnoop_example.txt).
|
||||
- fs/[cachestat](fs/cachestat): basic cache hit/miss statistics for the Linux page cache. [Examples](examples/cachestat_example.txt).
|
||||
- net/[tcpretrans](net/tcpretrans): show TCP retransmits, with address and other details. [Examples](examples/tcpretrans_example.txt).
|
||||
- system/[tpoint](system/tpoint): trace a given tracepoint. [Examples](examples/tpoint_example.txt).
|
||||
- kernel/[funccount](kernel/funccount): count kernel function calls, matching a string with wildcards. [Examples](examples/funccount_example.txt).
|
||||
|
||||
1
bin/cachestat
Symbolic link
1
bin/cachestat
Symbolic link
@@ -0,0 +1 @@
|
||||
../fs/cachestat
|
||||
58
examples/cachestat_example.txt
Normal file
58
examples/cachestat_example.txt
Normal file
@@ -0,0 +1,58 @@
|
||||
Demonstrations of cachestat, the Linux ftrace version.
|
||||
|
||||
|
||||
Here is some sample output showing file system cache statistics, followed by
|
||||
the workload that caused it:
|
||||
|
||||
# ./cachestat -t
|
||||
Counting cache functions... Output every 1 seconds.
|
||||
TIME HITS MISSES DIRTIES RATIO BUFFERS_MB CACHE_MB
|
||||
08:28:57 415 0 0 100.0% 1 191
|
||||
08:28:58 411 0 0 100.0% 1 191
|
||||
08:28:59 362 97 0 78.9% 0 8
|
||||
08:29:00 411 0 0 100.0% 0 9
|
||||
08:29:01 775 20489 0 3.6% 0 89
|
||||
08:29:02 411 0 0 100.0% 0 89
|
||||
08:29:03 6069 0 0 100.0% 0 89
|
||||
08:29:04 15249 0 0 100.0% 0 89
|
||||
08:29:05 411 0 0 100.0% 0 89
|
||||
08:29:06 411 0 0 100.0% 0 89
|
||||
08:29:07 411 0 3 100.0% 0 89
|
||||
[...]
|
||||
|
||||
I used the -t option to include the TIME column, to make describing the output
|
||||
easier.
|
||||
|
||||
The workload was:
|
||||
|
||||
# echo 1 > /proc/sys/vm/drop_caches; sleep 2; cksum 80m; sleep 2; cksum 80m
|
||||
|
||||
At 8:28:58, the page cache was dropped by the first command, which can be seen
|
||||
by the drop in size for "CACHE_MB" (page cache size) from 191 Mbytes to 8.
|
||||
After a 2 second sleep, a cksum command was issued at 8:29:01, for an 80 Mbyte
|
||||
file (called "80m"), which caused a total of ~20,400 misses ("MISSES" column),
|
||||
and the page cache size to grow by 80 Mbytes. The hit ratio during this dropped
|
||||
to 3.6%. Finally, after another 2 second sleep, at 8:29:03 the cksum command
|
||||
was run a second time, this time hitting entirely from cache.
|
||||
|
||||
Instrumenting all file system cache accesses does cost some overhead, and this
|
||||
tool might slow your target system by 2% or so. Test before use if this is a
|
||||
concern.
|
||||
|
||||
This tool also uses dynamic tracing, and is tied to Linux kernel implementation
|
||||
details. If it doesn't work for you, it probably needs fixing.
|
||||
|
||||
|
||||
Use -h to print the USAGE message:
|
||||
|
||||
# ./cachestat -h
|
||||
USAGE: cachestat [-Dht] [interval]
|
||||
-D # print debug counters
|
||||
-h # this usage message
|
||||
-t # include timestamp
|
||||
interval # output interval in secs (default 1)
|
||||
eg,
|
||||
cachestat # show stats every second
|
||||
cachestat 5 # show stats every 5 seconds
|
||||
|
||||
See the man page and example file for more info.
|
||||
167
fs/cachestat
Executable file
167
fs/cachestat
Executable file
@@ -0,0 +1,167 @@
|
||||
#!/bin/bash
|
||||
#
|
||||
# cachestat - show Linux page cache hit/miss statistics.
|
||||
# Uses Linux ftrace.
|
||||
#
|
||||
# This is a proof of concept using Linux ftrace capabilities on older kernels,
|
||||
# and works by using function profiling for in-kernel counters. Specifically,
|
||||
# four kernel functions are traced:
|
||||
#
|
||||
# mark_page_accessed() for measuring cache accesses
|
||||
# mark_buffer_dirty() for measuring cache writes
|
||||
# add_to_page_cache_lru() for measuring page additions
|
||||
# account_page_dirtied() for measuring page dirties
|
||||
#
|
||||
# It is possible that these functions have been renamed (or are different
|
||||
# logically) for your kernel version, and this script will not work as-is.
|
||||
# This script was written on Linux 3.13. This script is a sandcastle: the
|
||||
# kernel may wash some away, and you'll need to rebuild.
|
||||
#
|
||||
# USAGE: cachestat [-Dht] [interval]
|
||||
# eg,
|
||||
# cachestat 5 # show stats every 5 seconds
|
||||
#
|
||||
# Run "cachestat -h" for full usage.
|
||||
#
|
||||
# WARNING: This uses dynamic tracing of kernel functions, and could cause
|
||||
# kernel panics or freezes. Test, and know what you are doing, before use.
|
||||
# It also traces cache activity, which can be frequent, and cost some overhead.
|
||||
# The statistics should be treated as best-effort: there may be some error
|
||||
# margin depending on unusual workload types.
|
||||
#
|
||||
# REQUIREMENTS: CONFIG_FUNCTION_PROFILER, awk.
|
||||
#
|
||||
# From perf-tools: https://github.com/brendangregg/perf-tools
|
||||
#
|
||||
# COPYRIGHT: Copyright (c) 2014 Brendan Gregg.
|
||||
#
|
||||
# This program is free software; you can redistribute it and/or
|
||||
# modify it under the terms of the GNU General Public License
|
||||
# as published by the Free Software Foundation; either version 2
|
||||
# of the License, or (at your option) any later version.
|
||||
#
|
||||
# This program is distributed in the hope that it will be useful,
|
||||
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
# GNU General Public License for more details.
|
||||
#
|
||||
# You should have received a copy of the GNU General Public License
|
||||
# along with this program; if not, write to the Free Software Foundation,
|
||||
# Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
|
||||
#
|
||||
# (http://www.gnu.org/copyleft/gpl.html)
|
||||
#
|
||||
# 28-Dec-2014 Brendan Gregg Created this.
|
||||
|
||||
### default variables
|
||||
tracing=/sys/kernel/debug/tracing
|
||||
interval=1; opt_timestamp=0; opt_debug=0
|
||||
trap 'quit=1' INT QUIT TERM PIPE HUP # sends execution to end tracing section
|
||||
|
||||
function usage {
|
||||
cat <<-END >&2
|
||||
USAGE: cachestat [-Dht] [interval]
|
||||
-D # print debug counters
|
||||
-h # this usage message
|
||||
-t # include timestamp
|
||||
interval # output interval in secs (default 1)
|
||||
eg,
|
||||
cachestat # show stats every second
|
||||
cachestat 5 # show stats every 5 seconds
|
||||
|
||||
See the man page and example file for more info.
|
||||
END
|
||||
exit
|
||||
}
|
||||
|
||||
function warn {
|
||||
if ! eval "$@"; then
|
||||
echo >&2 "WARNING: command failed \"$@\""
|
||||
fi
|
||||
}
|
||||
|
||||
function die {
|
||||
echo >&2 "$@"
|
||||
exit 1
|
||||
}
|
||||
|
||||
### process options
|
||||
while getopts Dht opt
|
||||
do
|
||||
case $opt in
|
||||
D) opt_debug=1 ;;
|
||||
t) opt_timestamp=1 ;;
|
||||
h|?) usage ;;
|
||||
esac
|
||||
done
|
||||
shift $(( $OPTIND - 1 ))
|
||||
|
||||
### option logic
|
||||
if (( $# )); then
|
||||
interval=$1
|
||||
fi
|
||||
echo "Counting cache functions... Output every $interval seconds."
|
||||
|
||||
### check permissions
|
||||
cd $tracing || die "ERROR: accessing tracing. Root user? Kernel has FTRACE?
|
||||
debugfs mounted? (mount -t debugfs debugfs /sys/kernel/debug)"
|
||||
|
||||
### enable tracing
|
||||
sysctl -q kernel.ftrace_enabled=1 # doesn't set exit status
|
||||
printf "mark_page_accessed\nmark_buffer_dirty\nadd_to_page_cache_lru\naccount_page_dirtied\n" > set_ftrace_filter || \
|
||||
die "ERROR: tracing these four kernel functions: mark_page_accessed,"\
|
||||
"mark_buffer_dirty, add_to_page_cache_lru and account_page_dirtied (unknown kernel version?). Exiting."
|
||||
warn "echo nop > current_tracer"
|
||||
if ! echo 1 > function_profile_enabled; then
|
||||
echo > set_ftrace_filter
|
||||
die "ERROR: enabling function profiling. Have CONFIG_FUNCTION_PROFILER? Exiting."
|
||||
fi
|
||||
|
||||
(( opt_timestamp )) && printf "%-8s " TIME
|
||||
printf "%8s %8s %8s %8s %12s %10s" HITS MISSES DIRTIES RATIO "BUFFERS_MB" "CACHE_MB"
|
||||
(( opt_debug )) && printf " DEBUG"
|
||||
echo
|
||||
|
||||
### summarize
|
||||
quit=0; secs=0
|
||||
while (( !quit && (!opt_duration || secs < duration) )); do
|
||||
(( secs += interval ))
|
||||
echo 0 > function_profile_enabled
|
||||
echo 1 > function_profile_enabled
|
||||
sleep $interval
|
||||
|
||||
(( opt_timestamp )) && printf "%(%H:%M:%S)T " -1
|
||||
|
||||
# cat both meminfo and trace stats, and let awk pick them apart
|
||||
cat /proc/meminfo trace_stat/function* | awk -v debug=$opt_debug '
|
||||
# match meminfo stats:
|
||||
$1 == "Buffers:" && $3 == "kB" { buffers_mb = $2 / 1024 }
|
||||
$1 == "Cached:" && $3 == "kB" { cached_mb = $2 / 1024 }
|
||||
# identify and save trace counts:
|
||||
$2 ~ /[0-9]/ && $3 != "kB" { a[$1] += $2 }
|
||||
END {
|
||||
mpa = a["mark_page_accessed"]
|
||||
mbd = a["mark_buffer_dirty"]
|
||||
apcl = a["add_to_page_cache_lru"]
|
||||
apd = a["account_page_dirtied"]
|
||||
|
||||
total = mpa - mbd
|
||||
misses = apcl - apd
|
||||
if (misses < 0)
|
||||
misses = 0
|
||||
hits = total - misses
|
||||
|
||||
ratio = 100 * hits / total
|
||||
printf "%8d %8d %8d %7.1f%% %12.0f %10.0f", hits, misses, mbd,
|
||||
ratio, buffers_mb, cached_mb
|
||||
if (debug)
|
||||
printf " (%d %d %d %d)", mpa, mbd, apcl, apd
|
||||
printf "\n"
|
||||
}'
|
||||
done
|
||||
|
||||
### end tracing
|
||||
echo 2>/dev/null
|
||||
echo "Ending tracing..." 2>/dev/null
|
||||
warn "echo 0 > function_profile_enabled"
|
||||
warn "echo > set_ftrace_filter"
|
||||
111
man/man8/cachestat.8
Normal file
111
man/man8/cachestat.8
Normal file
@@ -0,0 +1,111 @@
|
||||
.TH cachestat 8 "2014-12-28" "USER COMMANDS"
|
||||
.SH NAME
|
||||
cachestat \- Measure page cache hits/misses. Uses Linux ftrace.
|
||||
.SH SYNOPSIS
|
||||
.B cachestat
|
||||
[\-Dht] [interval]
|
||||
.SH DESCRIPTION
|
||||
This tool provides basic cache hit/miss statistics for the Linux page cache.
|
||||
|
||||
Its current implementation uses Linux ftrace dynamic function profiling to
|
||||
create custom in-kernel counters, which is a workaround until such counters
|
||||
can be built-in to the kernel. Specifically, four kernel functions are counted:
|
||||
.IP
|
||||
mark_page_accessed() for measuring cache accesses
|
||||
.IP
|
||||
mark_buffer_dirty() for measuring cache writes
|
||||
.IP
|
||||
add_to_page_cache_lru() for measuring page additions
|
||||
.IP
|
||||
account_page_dirtied() for measuring page dirties
|
||||
.PP
|
||||
It is possible that these functions have been renamed (or are different
|
||||
logically) for your kernel version, and this script will not work as-is.
|
||||
This was written for a Linux 3.13 kernel, and tested on a few others versions.
|
||||
This script is a sandcastle: the kernel may wash some away, and you'll
|
||||
need to rebuild.
|
||||
|
||||
This program's implementation can be improved in the future when other
|
||||
kernel capabilities are made available. If you need a more reliable tool now,
|
||||
then consider other tracing alternatives (eg, SystemTap). This tool is really
|
||||
a proof of concept to see what ftrace can currently do.
|
||||
|
||||
WARNING: This uses dynamic tracing of kernel functions, and could cause
|
||||
kernel panics or freezes. Test, and know what you are doing, before use.
|
||||
It also traces cache activity, which can be frequent, and cost some overhead.
|
||||
The statistics should be treated as best-effort: there may be some error
|
||||
margin depending on unusual workload types.
|
||||
|
||||
Since this uses ftrace, only the root user can use this tool.
|
||||
.SH REQUIREMENTS
|
||||
CONFIG_FUNCTION_PROFILER, which you may already have enabled and available on
|
||||
recent kernels, and awk.
|
||||
.SH OPTIONS
|
||||
.TP
|
||||
\-D
|
||||
Include extra fields for debug purposes (see script).
|
||||
.TP
|
||||
\-h
|
||||
Print usage message.
|
||||
.TP
|
||||
\-t
|
||||
Include timestamps in units of seconds.
|
||||
.TP
|
||||
interval
|
||||
Output interval in seconds. Default is 1.
|
||||
.SH EXAMPLES
|
||||
.TP
|
||||
Show per-second page cache statistics:
|
||||
#
|
||||
.B cachestat
|
||||
.SH FIELDS
|
||||
.TP
|
||||
TIME
|
||||
Time, in HH:MM:SS.
|
||||
.TP
|
||||
HITS
|
||||
Number of page cache hits (reads). Each hit is for one memory page (the size
|
||||
depends on your processor architecture; commonly 4 Kbytes). Since this tool
|
||||
outputs at a timed interval, this field indicates the cache hit rate.
|
||||
.TP
|
||||
MISSES
|
||||
Number of page cache misses (reads from storage I/O). Each miss is for one
|
||||
memory page. Cache misses should be causing disk I/O. Run iostat(1) for
|
||||
correlation (although the miss count and size by the time disk I/O is issued
|
||||
can differ due to I/O subsystem merging).
|
||||
.TP
|
||||
DIRTIES
|
||||
Number of times a page in the page cache was written to and thus "dirtied".
|
||||
The same page may be counted multiple times per interval, if it is written
|
||||
to multiple times. This field gives an indication of how much cache churn there
|
||||
is, caused by applications writing data.
|
||||
.TP
|
||||
RATIO
|
||||
The ratio of cache hits to total cache accesses (hits + misses), as a
|
||||
percentage.
|
||||
.TP
|
||||
BUFFERS_MB
|
||||
Size of the buffer cache, for disk I/O. From /proc/meminfo.
|
||||
.TP
|
||||
CACHED_MB
|
||||
Size of the page cache, for file system I/O. From /proc/meminfo.
|
||||
.SH OVERHEAD
|
||||
This tool currently uses ftrace function profiling, which provides efficient
|
||||
in-kernel counters. However, the functions profiled are executed frequently,
|
||||
so the overheads can add up. Test and measure before use. My own testing
|
||||
showed around a 2% loss in application performance while this tool was running.
|
||||
.SH SOURCE
|
||||
This is from the perf-tools collection.
|
||||
.IP
|
||||
https://github.com/brendangregg/perf-tools
|
||||
.PP
|
||||
Also look under the examples directory for a text file containing example
|
||||
usage, output, and commentary for this tool.
|
||||
.SH OS
|
||||
Linux
|
||||
.SH STABILITY
|
||||
Unstable - in development.
|
||||
.SH AUTHOR
|
||||
Brendan Gregg
|
||||
.SH SEE ALSO
|
||||
iostat(1), iosnoop(8)
|
||||
Reference in New Issue
Block a user