The rate is 99 Hertz rather than 100 Hertz to avoid accidentally sampling in lockstep with some periodic activity, which would produce skewed results. This rate is also coarse: you may want to raise it (e.g., up to 997 Hertz) for finer resolution, especially when you are sampling short bursts of activity and still need enough samples to be useful. Bear in mind that higher frequencies mean higher overhead.
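To make the lockstep problem concrete, here is a minimal simulation; it is not part of perf. It assumes a hypothetical workload that repeats every 10 ms and spends the first 2 ms of each cycle in a hot path, so an unbiased sampler should attribute about 20% of samples to that path. The names `simulateSampling`, `kWorkPeriodUs`, and `kHotUs` are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>

// Hypothetical workload: one cycle every 10 ms (100 Hz),
// with the first 2 ms of each cycle spent in the hot path.
constexpr std::int64_t kWorkPeriodUs = 10000;
constexpr std::int64_t kHotUs = 2000;

struct SampleResult {
    std::int64_t samples;     // total samples taken
    std::int64_t hotSamples;  // samples that landed in the hot path
};

// Take evenly spaced samples at freqHz for the given number of seconds
// and count how many of them fall inside the hot 2 ms window.
SampleResult simulateSampling(std::int64_t freqHz, std::int64_t seconds) {
    assert(freqHz > 0 && seconds > 0);
    SampleResult result{0, 0};
    const std::int64_t total = freqHz * seconds;
    for (std::int64_t k = 0; k < total; ++k) {
        const std::int64_t tUs = k * 1000000 / freqHz;  // time of sample k
        if (tUs % kWorkPeriodUs < kHotUs) {
            ++result.hotSamples;
        }
        ++result.samples;
    }
    return result;
}

// Print the hot-path share seen at 100 Hz and at 99 Hz over one second.
void printComparison() {
    for (std::int64_t freqHz : {100, 99}) {
        const SampleResult r = simulateSampling(freqHz, 1);
        std::cout << freqHz << " Hz: " << r.hotSamples << " of "
                  << r.samples << " samples in the hot path\n";
    }
}
```

Under these assumptions, the 100 Hz sampler always lands at the same point of the cycle and attributes 100 of 100 samples to the hot path, while the 99 Hz sampler drifts through the cycle and attributes 20 of 99, close to the true 20% share.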
These are grouped into libraries of tracepoints; e.g., "sock:" for socket events and "sched:" for CPU scheduler events. A key value of tracepoints is that they should have a stable API, so tools written against them on one kernel version should work on later versions as well.
syscalls is another such group of events.
Example
Suppose we have a program whose virtual memory grows rapidly because of mmap. How do we track down where the leak comes from?
    // g++ -std=c++11 mmapButNotWritten.cpp -O3 -ggdb -o mmapButNotWritten
    #include <sys/mman.h>
    #include <iostream>

    int main() {
        auto p = mmap(nullptr, 8 * 1024, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        std::cout << p << std::endl;
    }
    # sudo perf record --call-graph dwarf -e "syscalls:sys_exit_mmap" ./mmapButNotWritten
    0x7f955b1b9000
On kernel 4.9.0, using kprobes can fail with the error "mmap_region is out of .text, skip it.", so this approach is not recommended there.
    # sudo perf probe -V mmap_region
    Available variables at mmap_region
            @<mmap_region+0>
                    char*   __func__
                    long unsigned int       addr
                    long unsigned int       len
                    long unsigned int       pgoff
                    struct file*    file
                    vm_flags_t      vm_flags
    # sudo perf probe --add 'mmap_region addr'
    Added new event:
      probe:mmap_region    (on mmap_region with addr)

    You can now use it in all perf tools, such as:

            perf record -e probe:mmap_region -aR sleep 1

    # sudo perf record --call-graph dwarf -e probe:mmap_region ./mmapButNotWritten
    0x7f8eb7a04000
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.217 MB perf.data (25 samples) ]
    // g++ -std=c++11 -shared -fPIC -O3 -ggdb math.cpp -o libmath.so
    int funcInLib(int x, int y) {
        return x + y;
    }
    // g++ -std=c++11 -O3 -ggdb main.cpp -L$PWD -lmath -Wl,-rpath=$PWD -o main
    int funcInLib(int x, int y);

    int main() {
        funcInLib(1, 2);
    }
Example 1: Inserting uprobes with perf
    # sudo perf probe -x libmath.so '--add=funcInLib x y'
    Probe on address 0x580 to force probing at the function entry.

    Added new event:
      probe_libmath:funcInLib (on funcInLib in /home/demons/uprobe/libmath.so with x y)
    # sudo su
    # echo > /sys/kernel/debug/tracing/uprobe_events
    # exit
Software Events
Software events may have a default period. This means that when you use them for sampling, you're sampling a subset of events, not tracing every event. You can check with perf record -vv:
This default means that the kernel adjusts the sampling rate so that it captures about 4,000 context switch events per second. If you really meant to record them all, use -c 1: