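Introduction to perf
perf can sample all processes over a period of time, or only a specified process or event. It can also report aggregated information for the entire call chain in the form of call stacks.
Installing perf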
yum install perf -y
Capturing perf data for ksoftirqd
Use top to find the PID of ksoftirqd
top - 18:57:28 up 1 min, 1 user, load average: 2.19, 0.78, 0.28
Tasks: 196 total, 2 running, 194 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.5 us, 4.0 sy, 0.0 ni, 93.2 id, 0.0 wa, 0.8 hi, 0.6 si, 0.0 st
MiB Mem : 7438.3 total, 4848.6 free, 1048.2 used, 1541.5 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 6016.0 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2694 root 20 0 1110344 362324 76024 S 10.6 4.8 0:14.05 kube-apiserver
2256 root 20 0 10.7g 69304 25036 S 5.3 0.9 0:05.19 etcd
885 root 20 0 2010184 109532 67372 S 4.7 1.4 0:05.44 kubelet
2234 root 20 0 825312 118788 61484 S 2.7 1.6 0:05.17 kube-controller
3823 root 20 0 1673000 68508 45708 S 2.0 0.9 0:01.27 calico-node
993 root 20 0 2417748 104240 50380 S 1.0 1.4 0:03.95 dockerd
2236 root 20 0 754556 53360 35484 S 0.7 0.7 0:03.82 kube-scheduler
855 root 20 0 2094764 55540 31076 S 0.3 0.7 0:01.13 containerd
1515 root 20 0 1275676 40432 27844 S 0.3 0.5 0:00.38 cri-dockerd
3720 root 20 0 751092 42528 29800 S 0.3 0.6 0:00.40 coredns
4342 root 20 0 751092 44044 29676 S 0.3 0.6 0:00.10 coredns
1 root 20 0 103164 11080 8364 S 0.0 0.1 0:03.68 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.02 kthreadd
3 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_gp
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_par_gp
5 root 20 0 0 0 0 I 0.0 0.0 0:00.00 kworker/0:0-cgroup_pidlist_destroy
6 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/0:0H-kblockd
7 root 20 0 0 0 0 I 0.0 0.0 0:00.02 kworker/u256:0-flush-253:0
8 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 mm_percpu_wq
9 root 20 0 0 0 0 S 0.0 0.0 0:00.02 ksoftirqd/0 # softirq thread on cpu0, PID 9
10 root 20 0 0 0 0 R 0.0 0.0 0:00.19 rcu_sched
11 root 20 0 0 0 0 I 0.0 0.0 0:00.00 rcu_bh
12 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
13 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/0
14 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/1
15 root rt 0 0 0 0 S 0.0 0.0 0:00.77 migration/1
16 root 20 0 0 0 0 S 0.0 0.0 0:00.03 ksoftirqd/1 # softirq thread on cpu1, PID 16
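Instead of scanning the top output by eye, the per-CPU ksoftirqd threads can be picked out with a small filter. The following is a minimal sketch, assuming a procps-style ps that accepts `-eo pid=,comm=`; the function name `filter_ksoftirqd` is made up for this example and is not part of perf or procps.

```shell
# filter_ksoftirqd: read "PID COMMAND" lines on stdin and print
# "cpu<N> pid=<PID>" for every ksoftirqd/<N> kernel thread.
# (Hypothetical helper name, used only for illustration.)
filter_ksoftirqd() {
    awk '$2 ~ /^ksoftirqd\/[0-9]+$/ {
        split($2, parts, "/")          # parts[2] is the CPU number
        printf "cpu%s pid=%s\n", parts[2], $1
    }'
}

# Usage on a live system (the trailing "=" suppresses the ps header):
#   ps -eo pid=,comm= | filter_ksoftirqd
```

The PID printed for the CPU of interest is the value to pass to perf record -p in the next step.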
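Sampling and analyzing the behavior of ksoftirqd
- Use perf to sample the ksoftirqd process. After the command below finishes, a perf.data file is created in the current directory.
# Sample for 30 s, then exit. -p selects the PID; the -a (system-wide) flag is overridden by -p, as the warning in the output shows, so it can be dropped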
[root@openeuler ~]# perf record -a -g -p 9 -- sleep 30
Warning:
PID/TID switch overriding SYSTEM
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.026 MB perf.data (10 samples) ]
[root@openeuler ~]# ls
anaconda-ks.cfg perf.data
- Observe the behavior of the kernel thread ksoftirqd
perf report
perf report --stdio
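Running either command above produces a call-stack report; in the interactive view, press "e" to expand an entry.
The call stack contains net_rx_action and netif_receive_skb, which shows that the thread is receiving network packets (rx stands for receive).
Flame graphs
The report generated by perf report has to be expanded entry by entry and is deeply nested, so it is not intuitive to read. Brendan Gregg invented the flame graph, which presents the aggregated result more intuitively as a vector image.
Commands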
git clone https://github.com/brendangregg/FlameGraph
cd FlameGraph/
perf script -i /root/perf.data | ./stackcollapse-perf.pl --all | ./flamegraph.pl > ksoftirqd.svg
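/root/perf.data is the path of the file generated by perf record above
In case the repository cannot be cloned from GitHub, a download link hosted in China is provided. The link contains an archive, which must be extracted after downloading.
https://www.123pan.com/s/ULM6Vv-QM313.html extraction code: eDtq
Example session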
[root@openeuler ~]# git clone https://github.com/brendangregg/FlameGraph
Cloning into 'FlameGraph'...
remote: Enumerating objects: 1285, done.
remote: Counting objects: 100% (708/708), done.
remote: Compressing objects: 100% (148/148), done.
remote: Total 1285 (delta 584), reused 574 (delta 560), pack-reused 577
Receiving objects: 100% (1285/1285), 1.92 MiB | 3.60 MiB/s, done.
Resolving deltas: 100% (761/761), done.
[root@openeuler ~]# cd FlameGraph/
[root@openeuler FlameGraph]# ls
aix-perf.pl files.pl stackcollapse-chrome-tracing.py stackcollapse-ljp.awk stackcollapse-vtune-mc.pl
demos flamegraph.pl stackcollapse-elfutils.pl stackcollapse-perf.pl stackcollapse-vtune.pl
dev jmaps stackcollapse-faulthandler.pl stackcollapse-perf-sched.awk stackcollapse-wcp.pl
difffolded.pl pkgsplit-perf.pl stackcollapse-gdb.pl stackcollapse.pl stackcollapse-xdebug.php
docs range-perf.pl stackcollapse-go.pl stackcollapse-pmc.pl test
example-dtrace-stacks.txt README.md stackcollapse-ibmjava.pl stackcollapse-recursive.pl test.sh
example-dtrace.svg record-test.sh stackcollapse-instruments.pl stackcollapse-sample.awk
example-perf-stacks.txt.gz stackcollapse-aix.pl stackcollapse-java-exceptions.pl stackcollapse-stap.pl
example-perf.svg stackcollapse-bpftrace.pl stackcollapse-jstack.pl stackcollapse-vsprof.pl
[root@openeuler FlameGraph]# perf script -i /root/perf.data | ./stackcollapse-perf.pl --all | ./flamegraph.pl > ksoftirqd.svg
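[root@openeuler FlameGraph]# sz ksoftirqd.svg # download the generated SVG image to the local machine
Open the generated result in a browser
The image looks like flickering flames, hence the name flame graph. To read a flame graph, the most important thing is to distinguish clearly between the meaning of the horizontal and vertical axes.
The horizontal axis represents the number and proportion of samples. The wider a function is on the horizontal axis, the more samples it appeared in, which means it spent more time executing. Functions on the same level are sorted alphabetically, not by time.
The vertical axis represents the call stack, expanded from bottom to top following the call relationships. In other words, of two vertically adjacent functions, the lower one is the parent of the upper one. The deeper the call stack, the taller the graph.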
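To make the horizontal axis concrete: stackcollapse-perf.pl writes one line per unique stack, with frames joined by ";" and the sample count as the last field, and a frame's width is proportional to the samples of all stacks that pass through it. The sketch below computes that share for one function. The helper name `frame_share` is hypothetical, and the stacks and counts in the usage example are made up for illustration, not taken from the capture above.

```shell
# frame_share FUNC: read folded stacks ("a;b;c COUNT") on stdin and print
# the percentage of all samples whose stack contains the frame FUNC.
# (Hypothetical helper name, not part of the FlameGraph repository.)
frame_share() {
    awk -v target="$1" '
    {
        count = $NF                          # last field is the sample count
        stack = $0
        sub(/ [0-9]+$/, "", stack)           # strip the trailing count
        total += count
        n = split(stack, frames, ";")
        for (i = 1; i <= n; i++)
            if (frames[i] == target) { hits += count; break }
    }
    END { if (total > 0) printf "%.1f\n", 100 * hits / total }'
}

# Illustrative input only: these stacks and counts are invented.
printf '%s\n' \
    'ksoftirqd/0;run_ksoftirqd;__do_softirq;net_rx_action 7' \
    'ksoftirqd/0;run_ksoftirqd;__do_softirq;rcu_core 3' | frame_share net_rx_action
# prints 70.0
```

On real data, the input would be the output of perf script -i /root/perf.data | ./stackcollapse-perf.pl --all, piped into frame_share with the function name of interest.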
FAQ
BEGIN failed--compilation aborted at ./flamegraph.pl line 97.
- Symptom
[root@host-192-168-0-35 FlameGraph]# perf script -i /root/perf.data | ./stackcollapse-perf.pl --all | ./flamegraph.pl > aa.svg
Can't locate open.pm in @INC (you may need to install the open module) (@INC contains: /usr/local/lib64/perl5/5.36 /usr/local/share/perl5/5.36 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5) at ./flamegraph.pl line 97.
BEGIN failed--compilation aborted at ./flamegraph.pl line 97.
- Solution
yum install perl-open
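The same kind of error can appear for other Perl modules, so it may help to extract the missing module name from the message. The following is a minimal sketch; the helper name `perl_module_from_error` is hypothetical, and installing by `perl(Module)` assumes that the distribution's Perl packages declare RPM provides of that form, which should be verified on the target system.

```shell
# perl_module_from_error: read a Perl "Can't locate X/Y.pm in @INC ..."
# message on stdin and print the module name (X::Y).
# (Hypothetical helper name, used only for illustration.)
perl_module_from_error() {
    sed -n "s/^Can't locate \([^ ]*\)\.pm in @INC.*/\1/p" | sed 's|/|::|g'
}

# Usage (assumption: the package declares a perl(<module>) provide):
#   mod=$(./flamegraph.pl </dev/null 2>&1 | perl_module_from_error)
#   yum install "perl($mod)"
```

For the error shown above, the helper prints open, which corresponds to the perl-open package installed in the solution.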