perf-stat-hist example

2025-12-01 07:26:04 +07:00 · 2014-07-20 01:17:48 -07:00
parent 38e4ee1a5b
commit 481b752a16
2 changed files with 80 additions and 5 deletions
--- a/README.md
+++ b/README.md
@@ -11,14 +11,14 @@ Many of these tools employ workarounds so that functionality is possible without

 Using perf_events:

- misc/[perf-stat-hist](misc/perf-stat-hist): power-of aggregations for tracepoint variables.
+- misc/[perf-stat-hist](misc/perf-stat-hist): power-of aggregations for tracepoint variables. [Examples](examples/perf-stat-hist_example.txt).

 Using ftrace:

- [iosnoop](iosnoop): trace disk I/O with details including latency.
- [execsnoop](execsnoop): trace process exec() with command line argument details.
- kernel/[funccount](kernel/funccount): count kernel functions that match a string.
- kernel/[functrace](kernel/functrace): trace kernel functions that match a string.
+- [iosnoop](iosnoop): trace disk I/O with details including latency. [Examples](examples/iosnoop_example.txt).
+- [execsnoop](execsnoop): trace process exec() with command line argument details. [Examples](examples/execsnoop_example.txt).
+- kernel/[funccount](kernel/funccount): count kernel functions that match a string. [Examples](examples/funccount_example.txt).
+- kernel/[functrace](kernel/functrace): trace kernel functions that match a string. [Examples](examples/functrace_example.txt).
 - tools/[reset-ftrace](tools/reset-ftrace): reset ftrace state if needed.

 ## Prerequisites
--- a/examples/perf-stat-hist_example.txt
+++ b/examples/perf-stat-hist_example.txt
@@ -0,0 +1,75 @@
+Demonstrations of perf-stat-hist, the Linux perf_events version.
+
+Tracing the net:net_dev_xmit tracepoint, and building a power-of-4 histogram
+for the "len" variable, for 10 seconds:
+
+# ./perf-stat-hist net:net_dev_xmit len 10
+Tracing net:net_dev_xmit, power-of-4, max 1048576, for 10 seconds...
+
+            Range          : Count    Distribution
+            0              : 0        |                                      |
+            1 -> 3         : 0        |                                      |
+            4 -> 15        : 0        |                                      |
+           16 -> 63        : 2        |#                                     |
+           64 -> 255       : 30       |###                                   |
+          256 -> 1023      : 3        |#                                     |
+         1024 -> 4095      : 446      |######################################|
+         4096 -> 16383     : 0        |                                      |
+        16384 -> 65535     : 0        |                                      |
+        65536 -> 262143    : 0        |                                      |
+       262144 -> 1048575   : 0        |                                      |
+      1048576 ->           : 0        |                                      |
+
+This showed that most of the network transmits were between 1024 and 4095 bytes,
+with a handful between 64 and 255 bytes.
+
+Cat the format file for the tracepoint to see what other variables are available
+to trace. Eg:
+
+# cat /sys/kernel/debug/tracing/events/net/net_dev_xmit/format 
+name: net_dev_xmit
+ID: 1078
+format:
+	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
+	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
+	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
+	field:int common_pid;	offset:4;	size:4;	signed:1;
+
+	field:void * skbaddr;	offset:8;	size:8;	signed:0;
+	field:unsigned int len;	offset:16;	size:4;	signed:0;
+	field:int rc;	offset:20;	size:4;	signed:1;
+	field:__data_loc char[] name;	offset:24;	size:4;	signed:1;
+
+print fmt: "dev=%s skbaddr=%p len=%u rc=%d", __get_str(name), REC->skbaddr, REC->len, REC->rc
+
+That's where "len" came from.
+
+This works by creating a series of tracepoint and filter pairs for each
+histogram bucket, and doing in-kernel counts. The overhead should in many cases
+be better than user space post-processing, however, this approach is still
+not ideal. I've called it a "perf hacktogram". The overhead is relative to
+the frequency of events, multiplied by the number of buckets. You can modify
+the script to use power-of-2 instead, or whatever you like, but the overhead
+for more buckets will be higher.
+
+
+Histogram of the returned read() syscall sizes:
+
+# ./perf-stat-hist syscalls:sys_exit_read ret 10
+Tracing syscalls:sys_exit_read, power-of-4, max 1048576, for 10 seconds...
+
+            Range          : Count    Distribution
+            0              : 90       |#                                     |
+            1 -> 3         : 9587     |######################################|
+            4 -> 15        : 69       |#                                     |
+           16 -> 63        : 590      |###                                   |
+           64 -> 255       : 250      |#                                     |
+          256 -> 1023      : 389      |##                                    |
+         1024 -> 4095      : 296      |##                                    |
+         4096 -> 16383     : 183      |#                                     |
+        16384 -> 65535     : 12       |#                                     |
+        65536 -> 262143    : 0        |                                      |
+       262144 -> 1048575   : 0        |                                      |
+      1048576 ->           : 0        |                                      |
+
+Most of our read()s were tiny, between 1 and 3 bytes.