mirror of
https://github.com/brendangregg/perf-tools.git
synced 2025-11-30 23:16:03 +07:00
cachestat for page cache statistics
@@ -25,6 +25,7 @@ Using ftrace:
- [execsnoop](execsnoop): trace process exec() with command line argument details. [Examples](examples/execsnoop_example.txt).
- [opensnoop](opensnoop): trace open() syscalls showing filenames. [Examples](examples/opensnoop_example.txt).
- [killsnoop](killsnoop): trace kill() signals showing process and signal details. [Examples](examples/killsnoop_example.txt).
- fs/[cachestat](fs/cachestat): basic cache hit/miss statistics for the Linux page cache. [Examples](examples/cachestat_example.txt).
- net/[tcpretrans](net/tcpretrans): show TCP retransmits, with address and other details. [Examples](examples/tcpretrans_example.txt).
- system/[tpoint](system/tpoint): trace a given tracepoint. [Examples](examples/tpoint_example.txt).
- kernel/[funccount](kernel/funccount): count kernel function calls, matching a string with wildcards. [Examples](examples/funccount_example.txt).
1
bin/cachestat
Symbolic link
@@ -0,0 +1 @@
../fs/cachestat
58
examples/cachestat_example.txt
Normal file
@@ -0,0 +1,58 @@
Demonstrations of cachestat, the Linux ftrace version.


Here is some sample output showing file system cache statistics, followed by
the workload that caused it:

# ./cachestat -t
Counting cache functions... Output every 1 seconds.
TIME         HITS   MISSES  DIRTIES    RATIO   BUFFERS_MB   CACHE_MB
08:28:57      415        0        0   100.0%            1        191
08:28:58      411        0        0   100.0%            1        191
08:28:59      362       97        0    78.9%            0          8
08:29:00      411        0        0   100.0%            0          9
08:29:01      775    20489        0     3.6%            0         89
08:29:02      411        0        0   100.0%            0         89
08:29:03     6069        0        0   100.0%            0         89
08:29:04    15249        0        0   100.0%            0         89
08:29:05      411        0        0   100.0%            0         89
08:29:06      411        0        0   100.0%            0         89
08:29:07      411        0        3   100.0%            0         89
[...]

I used the -t option to include the TIME column, to make describing the output
easier.

The workload was:

# echo 1 > /proc/sys/vm/drop_caches; sleep 2; cksum 80m; sleep 2; cksum 80m

At 8:28:58, the page cache was dropped by the first command, which can be seen
in the 08:28:59 row as a drop in "CACHE_MB" (page cache size) from 191 Mbytes
to 8. After a 2 second sleep, a cksum command was issued at 8:29:01, for an
80 Mbyte file (called "80m"), which caused about 20,500 misses ("MISSES"
column), and the page cache size to grow by 80 Mbytes. The hit ratio during
this dropped to 3.6%. Finally, after another 2 second sleep, at 8:29:03 the
cksum command was run a second time, this time hitting entirely from cache.

Instrumenting all file system cache accesses does cost some overhead, and this
tool might slow your target system by 2% or so. Test before use if this is a
concern.

This tool also uses dynamic tracing, and is tied to Linux kernel implementation
details. If it doesn't work for you, it probably needs fixing.


Use -h to print the USAGE message:

# ./cachestat -h
USAGE: cachestat [-Dht] [interval]
                 -D              # print debug counters
                 -h              # this usage message
                 -t              # include timestamp
                 interval        # output interval in secs (default 1)
  eg,
       cachestat                 # show stats every second
       cachestat 5               # show stats every 5 seconds

See the man page and example file for more info.
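The miss count and hit ratios quoted in the example above can be sanity-checked with a little arithmetic. The following is a minimal sketch, not part of the commit: `pages_for_mb` and `hit_ratio` are hypothetical helper names, and the 4 Kbyte page size is an assumption (the man page only says it is commonly 4 Kbytes).

```shell
#!/bin/bash
# Hypothetical helpers (not part of cachestat) for checking the sample output.

# Pages needed to hold a file of $1 Mbytes, assuming a page size of $2 Kbytes.
pages_for_mb() {
    echo $(( $1 * 1024 / $2 ))
}

# Hit ratio as a percentage, from a HITS count ($1) and a MISSES count ($2).
hit_ratio() {
    awk -v hits="$1" -v misses="$2" 'BEGIN {
        total = hits + misses
        if (total > 0)
            printf "%.1f\n", 100 * hits / total
        else
            print "n/a"
    }'
}

pages_for_mb 80 4       # 80 Mbyte file, 4 Kbyte pages (assumed): prints 20480
hit_ratio 775 20489     # the 08:29:01 row: prints 3.6
hit_ratio 362 97        # the 08:28:59 row: prints 78.9
```

An 80 Mbyte file read cold needs 20480 pages of that size, which is close to the 20489 misses reported in the 08:29:01 row.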
167
fs/cachestat
Executable file
@@ -0,0 +1,167 @@
#!/bin/bash
#
# cachestat - show Linux page cache hit/miss statistics.
#             Uses Linux ftrace.
#
# This is a proof of concept using Linux ftrace capabilities on older kernels,
# and works by using function profiling for in-kernel counters. Specifically,
# four kernel functions are traced:
#
#       mark_page_accessed() for measuring cache accesses
#       mark_buffer_dirty() for measuring cache writes
#       add_to_page_cache_lru() for measuring page additions
#       account_page_dirtied() for measuring page dirties
#
# It is possible that these functions have been renamed (or are different
# logically) for your kernel version, and this script will not work as-is.
# This script was written on Linux 3.13. This script is a sandcastle: the
# kernel may wash some away, and you'll need to rebuild.
#
# USAGE: cachestat [-Dht] [interval]
#    eg,
#        cachestat 5    # show stats every 5 seconds
#
# Run "cachestat -h" for full usage.
#
# WARNING: This uses dynamic tracing of kernel functions, and could cause
# kernel panics or freezes. Test, and know what you are doing, before use.
# It also traces cache activity, which can be frequent, and cost some overhead.
# The statistics should be treated as best-effort: there may be some error
# margin depending on unusual workload types.
#
# REQUIREMENTS: CONFIG_FUNCTION_PROFILER, awk.
#
# From perf-tools: https://github.com/brendangregg/perf-tools
#
# COPYRIGHT: Copyright (c) 2014 Brendan Gregg.
#
#  This program is free software; you can redistribute it and/or
#  modify it under the terms of the GNU General Public License
#  as published by the Free Software Foundation; either version 2
#  of the License, or (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#  along with this program; if not, write to the Free Software Foundation,
#  Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
#
#  (http://www.gnu.org/copyleft/gpl.html)
#
# 28-Dec-2014   Brendan Gregg   Created this.

### default variables
tracing=/sys/kernel/debug/tracing
interval=1; opt_timestamp=0; opt_debug=0
trap 'quit=1' INT QUIT TERM PIPE HUP    # sends execution to end tracing section

function usage {
    cat <<-END >&2
USAGE: cachestat [-Dht] [interval]
                 -D              # print debug counters
                 -h              # this usage message
                 -t              # include timestamp
                 interval        # output interval in secs (default 1)
  eg,
       cachestat                 # show stats every second
       cachestat 5               # show stats every 5 seconds

See the man page and example file for more info.
END
    exit
}

function warn {
    if ! eval "$@"; then
        echo >&2 "WARNING: command failed \"$@\""
    fi
}

function die {
    echo >&2 "$@"
    exit 1
}

### process options
while getopts Dht opt
do
    case $opt in
    D)      opt_debug=1 ;;
    t)      opt_timestamp=1 ;;
    h|?)    usage ;;
    esac
done
shift $(( $OPTIND - 1 ))

### option logic
if (( $# )); then
    interval=$1
fi
echo "Counting cache functions... Output every $interval seconds."

### check permissions
cd $tracing || die "ERROR: accessing tracing. Root user? Kernel has FTRACE?
    debugfs mounted? (mount -t debugfs debugfs /sys/kernel/debug)"

### enable tracing
sysctl -q kernel.ftrace_enabled=1       # doesn't set exit status
printf "mark_page_accessed\nmark_buffer_dirty\nadd_to_page_cache_lru\naccount_page_dirtied\n" > set_ftrace_filter || \
    die "ERROR: tracing these four kernel functions: mark_page_accessed,"\
        "mark_buffer_dirty, add_to_page_cache_lru and account_page_dirtied (unknown kernel version?). Exiting."
warn "echo nop > current_tracer"
if ! echo 1 > function_profile_enabled; then
    echo > set_ftrace_filter
    die "ERROR: enabling function profiling. Have CONFIG_FUNCTION_PROFILER? Exiting."
fi

(( opt_timestamp )) && printf "%-8s " TIME
printf "%8s %8s %8s %8s %12s %10s" HITS MISSES DIRTIES RATIO "BUFFERS_MB" "CACHE_MB"
(( opt_debug )) && printf "  DEBUG"
echo

### summarize
quit=0; secs=0
# opt_duration and duration are never set by this script; unset names evaluate
# to 0 in bash arithmetic, so this loops until a trapped signal sets quit.
while (( !quit && (!opt_duration || secs < duration) )); do
    (( secs += interval ))
    echo 0 > function_profile_enabled
    echo 1 > function_profile_enabled
    sleep $interval

    # %(...)T needs bash 4.2 or later; -1 means the current time
    (( opt_timestamp )) && printf "%(%H:%M:%S)T " -1

    # cat both meminfo and trace stats, and let awk pick them apart
    cat /proc/meminfo trace_stat/function* | awk -v debug=$opt_debug '
    # match meminfo stats:
    $1 == "Buffers:" && $3 == "kB" { buffers_mb = $2 / 1024 }
    $1 == "Cached:" && $3 == "kB" { cached_mb = $2 / 1024 }
    # identify and save trace counts:
    $2 ~ /[0-9]/ && $3 != "kB" { a[$1] += $2 }
    END {
        mpa = a["mark_page_accessed"]
        mbd = a["mark_buffer_dirty"]
        apcl = a["add_to_page_cache_lru"]
        apd = a["account_page_dirtied"]

        total = mpa - mbd
        misses = apcl - apd
        if (misses < 0)
            misses = 0
        hits = total - misses

        # guard against division by zero when no accesses were counted
        ratio = (total > 0) ? 100 * hits / total : 0
        printf "%8d %8d %8d %7.1f%% %12.0f %10.0f", hits, misses, mbd,
            ratio, buffers_mb, cached_mb
        if (debug)
            printf "  (%d %d %d %d)", mpa, mbd, apcl, apd
        printf "\n"
    }'
done

### end tracing
echo 2>/dev/null
echo "Ending tracing..." 2>/dev/null
warn "echo 0 > function_profile_enabled"
warn "echo > set_ftrace_filter"
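The way the script above turns its four function counts into HITS, MISSES and RATIO can be tried in isolation. This is a minimal sketch under stated assumptions, not the author's code: `derive_stats` is a hypothetical wrapper around the same arithmetic as the script's awk END block, reporting a ratio of 0 when no accesses were counted, and the counter values passed to it are made up (the first set is chosen so that it reproduces the 08:29:01 row of the example output).

```shell
#!/bin/bash
# derive_stats MPA MBD APCL APD
# Hypothetical helper: applies the arithmetic from the script's awk END block
# to four counts given on the command line.
#   MPA  = mark_page_accessed      MBD = mark_buffer_dirty
#   APCL = add_to_page_cache_lru   APD = account_page_dirtied
derive_stats() {
    awk -v mpa="$1" -v mbd="$2" -v apcl="$3" -v apd="$4" 'BEGIN {
        total = mpa - mbd           # accesses that were not buffer writes
        misses = apcl - apd         # page additions that were not dirties
        if (misses < 0)
            misses = 0
        hits = total - misses
        # report a ratio of 0 when there were no accesses, to avoid dividing by zero
        ratio = (total > 0) ? 100 * hits / total : 0
        printf "hits=%d misses=%d ratio=%.1f%%\n", hits, misses, ratio
    }'
}

# Made-up counts, not captured from a real system:
derive_stats 21264 0 20489 0    # prints hits=775 misses=20489 ratio=3.6%
derive_stats 0 0 0 0            # idle interval: prints hits=0 misses=0 ratio=0.0%
```

Running it with `-D`-style counts taken from a live system is one way to see how much each of the four functions contributes to a given output row.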
111
man/man8/cachestat.8
Normal file
@@ -0,0 +1,111 @@
.TH cachestat 8  "2014-12-28" "USER COMMANDS"
.SH NAME
cachestat \- Measure page cache hits/misses. Uses Linux ftrace.
.SH SYNOPSIS
.B cachestat
[\-Dht] [interval]
.SH DESCRIPTION
This tool provides basic cache hit/miss statistics for the Linux page cache.

Its current implementation uses Linux ftrace dynamic function profiling to
create custom in-kernel counters, which is a workaround until such counters
can be built into the kernel. Specifically, four kernel functions are counted:
.IP
mark_page_accessed() for measuring cache accesses
.IP
mark_buffer_dirty() for measuring cache writes
.IP
add_to_page_cache_lru() for measuring page additions
.IP
account_page_dirtied() for measuring page dirties
.PP
It is possible that these functions have been renamed (or are different
logically) for your kernel version, and this script will not work as-is.
This was written for a Linux 3.13 kernel, and tested on a few other versions.
This script is a sandcastle: the kernel may wash some away, and you'll
need to rebuild.

This program's implementation can be improved in the future when other
kernel capabilities are made available. If you need a more reliable tool now,
then consider other tracing alternatives (eg, SystemTap). This tool is really
a proof of concept to see what ftrace can currently do.

WARNING: This uses dynamic tracing of kernel functions, and could cause
kernel panics or freezes. Test, and know what you are doing, before use.
It also traces cache activity, which can be frequent, and cost some overhead.
The statistics should be treated as best-effort: there may be some error
margin depending on unusual workload types.

Since this uses ftrace, only the root user can use this tool.
.SH REQUIREMENTS
CONFIG_FUNCTION_PROFILER, which you may already have enabled and available on
recent kernels, and awk.
.SH OPTIONS
.TP
\-D
Include extra fields for debug purposes (see script).
.TP
\-h
Print usage message.
.TP
\-t
Include a timestamp (HH:MM:SS) on each output line.
.TP
interval
Output interval in seconds. Default is 1.
.SH EXAMPLES
.TP
Show per-second page cache statistics:
#
.B cachestat
.SH FIELDS
.TP
TIME
Time, in HH:MM:SS.
.TP
HITS
Number of page cache hits (reads). Each hit is for one memory page (the size
depends on your processor architecture; commonly 4 Kbytes). Since this tool
outputs at a timed interval, this field indicates the cache hit rate.
.TP
MISSES
Number of page cache misses (reads from storage I/O). Each miss is for one
memory page. Cache misses should be causing disk I/O. Run iostat(1) for
correlation (although the miss count and size by the time disk I/O is issued
can differ due to I/O subsystem merging).
.TP
DIRTIES
Number of times a page in the page cache was written to and thus "dirtied".
The same page may be counted multiple times per interval, if it is written
to multiple times. This field gives an indication of how much cache churn there
is, caused by applications writing data.
.TP
RATIO
The ratio of cache hits to total cache accesses (hits + misses), as a
percentage.
.TP
BUFFERS_MB
Size of the buffer cache, for disk I/O. From /proc/meminfo.
.TP
CACHE_MB
Size of the page cache, for file system I/O. From /proc/meminfo.
.SH OVERHEAD
This tool currently uses ftrace function profiling, which provides efficient
in-kernel counters. However, the functions profiled are executed frequently,
so the overheads can add up. Test and measure before use. My own testing
showed around a 2% loss in application performance while this tool was running.
.SH SOURCE
This is from the perf-tools collection.
.IP
https://github.com/brendangregg/perf-tools
.PP
Also look under the examples directory for a text file containing example
usage, output, and commentary for this tool.
.SH OS
Linux
.SH STABILITY
Unstable - in development.
.SH AUTHOR
Brendan Gregg
.SH SEE ALSO
iostat(1), iosnoop(8)
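The man page above warns that the four traced functions may have been renamed on other kernel versions. One way to check before running the tool is to look them up in ftrace's `available_filter_functions` file. The following is a minimal sketch, not part of the commit: `missing_functions` is a hypothetical helper, and the path assumes debugfs is mounted at /sys/kernel/debug, as the script itself does.

```shell
#!/bin/bash
# missing_functions FILE NAME...
# Hypothetical helper: prints each NAME that does not appear as the first
# field of a line in FILE (entries for module functions carry a trailing
# "[module]" tag, so only the first field is compared).
missing_functions() {
    local file=$1; shift
    local fn
    for fn in "$@"; do
        awk -v fn="$fn" '$1 == fn { found = 1; exit } END { exit !found }' \
            "$file" 2>/dev/null || echo "$fn"
    done
}

# Assumed location; adjust if tracefs or debugfs is mounted elsewhere.
list=/sys/kernel/debug/tracing/available_filter_functions
if [ -r "$list" ]; then
    missing_functions "$list" mark_page_accessed mark_buffer_dirty \
        add_to_page_cache_lru account_page_dirtied
fi
```

No output means all four names are present. Any name printed is one that cannot be set in `set_ftrace_filter` on the running kernel, which is the case where the script exits with its "unknown kernel version?" error.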