Introduction to perf

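perf can sample all processes over a period of time, or sample only a specified process or event. It can also print the results as call stacks, giving a summary for the whole call chain.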

Installing perf

yum install perf -y

Capturing perf data for ksoftirqd

Use top to find the PID of ksoftirqd

top - 18:57:28 up 1 min,  1 user,  load average: 2.19, 0.78, 0.28
Tasks: 196 total,   2 running, 194 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.5 us,  4.0 sy,  0.0 ni, 93.2 id,  0.0 wa,  0.8 hi,  0.6 si,  0.0 st
MiB Mem :   7438.3 total,   4848.6 free,   1048.2 used,   1541.5 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.   6016.0 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                             
   2694 root      20   0 1110344 362324  76024 S  10.6   4.8   0:14.05 kube-apiserver                                                                                      
   2256 root      20   0   10.7g  69304  25036 S   5.3   0.9   0:05.19 etcd                                                                                                
    885 root      20   0 2010184 109532  67372 S   4.7   1.4   0:05.44 kubelet                                                                                             
   2234 root      20   0  825312 118788  61484 S   2.7   1.6   0:05.17 kube-controller                                                                                     
   3823 root      20   0 1673000  68508  45708 S   2.0   0.9   0:01.27 calico-node                                                                                         
    993 root      20   0 2417748 104240  50380 S   1.0   1.4   0:03.95 dockerd                                                                                             
   2236 root      20   0  754556  53360  35484 S   0.7   0.7   0:03.82 kube-scheduler                                                                                      
    855 root      20   0 2094764  55540  31076 S   0.3   0.7   0:01.13 containerd                                                                                          
   1515 root      20   0 1275676  40432  27844 S   0.3   0.5   0:00.38 cri-dockerd                                                                                         
   3720 root      20   0  751092  42528  29800 S   0.3   0.6   0:00.40 coredns                                                                                             
   4342 root      20   0  751092  44044  29676 S   0.3   0.6   0:00.10 coredns                                                                                             
      1 root      20   0  103164  11080   8364 S   0.0   0.1   0:03.68 systemd                                                                                             
      2 root      20   0       0      0      0 S   0.0   0.0   0:00.02 kthreadd                                                                                            
      3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp                                                                                              
      4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_par_gp                                                                                          
      5 root      20   0       0      0      0 I   0.0   0.0   0:00.00 kworker/0:0-cgroup_pidlist_destroy                                                                  
      6 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/0:0H-kblockd                                                                                
      7 root      20   0       0      0      0 I   0.0   0.0   0:00.02 kworker/u256:0-flush-253:0                                                                          
      8 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 mm_percpu_wq                                                                                        
      9 root      20   0       0      0      0 S   0.0   0.0   0:00.02 ksoftirqd/0     # softirq thread for CPU 0, PID 9
     10 root      20   0       0      0      0 R   0.0   0.0   0:00.19 rcu_sched                                                                                           
     11 root      20   0       0      0      0 I   0.0   0.0   0:00.00 rcu_bh                                                                                              
     12 root      rt   0       0      0      0 S   0.0   0.0   0:00.00 migration/0                                                                                         
     13 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/0                                                                                             
     14 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/1                                                                                             
     15 root      rt   0       0      0      0 S   0.0   0.0   0:00.77 migration/1                                                                                         
     16 root      20   0       0      0      0 S   0.0   0.0   0:00.03 ksoftirqd/1     # softirq thread for CPU 1, PID 16
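
Scanning the top output by eye works, but the PIDs can also be extracted with a short filter. The following is a minimal sketch, not part of perf or procps: the function name ksoftirqd_pids is made up for illustration, and it assumes its input looks like the output of `ps -eo pid,comm` (a PID column followed by the thread name).

```shell
#!/bin/sh
# Hypothetical helper (illustration only).
# Reads lines of the form "<pid> <comm>", as printed by `ps -eo pid,comm`,
# and prints "<pid> <cpu>" for every ksoftirqd/<cpu> kernel thread.
ksoftirqd_pids() {
    awk '$2 ~ /^ksoftirqd\/[0-9]+$/ {
        split($2, parts, "/")   # parts[2] holds the CPU number
        print $1, parts[2]
    }'
}
```

Typical use would be `ps -eo pid,comm | ksoftirqd_pids`. On the host shown above this should list PID 9 for CPU 0 and PID 16 for CPU 1.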

Sampling and analyzing the behavior of ksoftirqd

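  • Use perf to sample the ksoftirqd process. Running the command below creates a perf.data file in the current directory.
# Sample for 30s and then exit; -p specifies the PID (-a is overridden by -p, hence the warning below)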
[root@openeuler ~]# perf record -a -g -p 9 -- sleep 30
Warning:
PID/TID switch overriding SYSTEM
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.026 MB perf.data (10 samples) ]


[root@openeuler ~]# ls
anaconda-ks.cfg  perf.data
  • Inspect the behavior of the ksoftirqd kernel thread
perf report
perf report --stdio

After running the commands above, perf produces a call stack report like the one below. Press "e" to expand an entry.

[Image: call stack report from perf report]

The call stack contains net_rx_action and netif_receive_skb, which shows that the thread is receiving network packets (rx stands for receive).
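
To check how often a function such as net_rx_action shows up without expanding the report by hand, the samples can be counted from the text output of perf script. The following is a rough sketch under stated assumptions: count_samples_with is a hypothetical helper name, and the input is assumed to follow the default perf script layout, where each sample is a header line followed by indented stack frames and samples are separated by a blank line.

```shell
#!/bin/sh
# Hypothetical helper (illustration only).
# Reads `perf script` text output on stdin and reports how many samples
# have the given symbol somewhere in their call stack.
count_samples_with() {
    # $1: symbol name to look for, e.g. net_rx_action
    awk -v sym="$1" '
        BEGIN { RS = "" }       # paragraph mode: one record per sample
        { total++ }
        # A frame looks like "<addr> <symbol>+<offset> (<dso>)"; also accept
        # output where the offset is not printed.
        index($0, " " sym "+") || index($0, " " sym " ") { hits++ }
        END { printf "%d of %d samples contain %s\n", hits, total, sym }
    '
}
```

For the data recorded above, this could be run as `perf script -i perf.data | count_samples_with net_rx_action`.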

Flame graphs

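The report produced by perf report has to be expanded by hand and is deeply nested, so it is hard to read at a glance. Brendan Gregg invented the flame graph, which presents the same summary more intuitively as a vector image.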

Commands

git clone https://github.com/brendangregg/FlameGraph
cd FlameGraph/
perf script -i /root/perf.data | ./stackcollapse-perf.pl --all |  ./flamegraph.pl > ksoftirqd.svg
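  • /root/perf.data is the path of the file generated by perf record above

  • In case the repository cannot be cloned from GitHub, a download link hosted in China is provided. The link points to an archive, which must be extracted after downloading.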

https://www.123pan.com/s/ULM6Vv-QM313.html extraction code: eDtq

Example session

[root@openeuler ~]# git clone https://github.com/brendangregg/FlameGraph
Cloning into 'FlameGraph'...
remote: Enumerating objects: 1285, done.
remote: Counting objects: 100% (708/708), done.
remote: Compressing objects: 100% (148/148), done.
remote: Total 1285 (delta 584), reused 574 (delta 560), pack-reused 577
Receiving objects: 100% (1285/1285), 1.92 MiB | 3.60 MiB/s, done.
Resolving deltas: 100% (761/761), done.
[root@openeuler ~]# cd FlameGraph/
[root@openeuler FlameGraph]# ls
aix-perf.pl                 files.pl                   stackcollapse-chrome-tracing.py   stackcollapse-ljp.awk         stackcollapse-vtune-mc.pl
demos                       flamegraph.pl              stackcollapse-elfutils.pl         stackcollapse-perf.pl         stackcollapse-vtune.pl
dev                         jmaps                      stackcollapse-faulthandler.pl     stackcollapse-perf-sched.awk  stackcollapse-wcp.pl
difffolded.pl               pkgsplit-perf.pl           stackcollapse-gdb.pl              stackcollapse.pl              stackcollapse-xdebug.php
docs                        range-perf.pl              stackcollapse-go.pl               stackcollapse-pmc.pl          test
example-dtrace-stacks.txt   README.md                  stackcollapse-ibmjava.pl          stackcollapse-recursive.pl    test.sh
example-dtrace.svg          record-test.sh             stackcollapse-instruments.pl      stackcollapse-sample.awk
example-perf-stacks.txt.gz  stackcollapse-aix.pl       stackcollapse-java-exceptions.pl  stackcollapse-stap.pl
example-perf.svg            stackcollapse-bpftrace.pl  stackcollapse-jstack.pl           stackcollapse-vsprof.pl
[root@openeuler FlameGraph]# perf script -i /root/perf.data | ./stackcollapse-perf.pl --all |  ./flamegraph.pl > ksoftirqd.svg
[root@openeuler FlameGraph]# sz ksoftirqd.svg       # download the generated SVG image to the local machine

Open the generated result in a browser

[Image: flame graph for ksoftirqd (ksoftirqd.svg)]

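The picture looks like flickering flames, which is why it is called a flame graph. To read a flame graph, the most important thing is to understand what the horizontal and vertical axes mean.

The horizontal axis represents the number and proportion of samples. The wider a function is on the horizontal axis, the more samples it appears in, which means it ran for longer. Functions on the same level are sorted alphabetically, so the left-to-right order carries no timing information.

The vertical axis represents the call stack, expanded from bottom to top according to the call relationships. In other words, of two vertically adjacent functions, the lower one is the parent (caller) of the upper one. The deeper the call stack, the taller the graph.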
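
The width of a function can also be computed directly from the folded stacks that stackcollapse-perf.pl prints (one line per unique stack, frames joined by ";", followed by a sample count). The following is a minimal sketch with a hypothetical helper name, frame_width. It sums the samples of every stack that contains the given frame, which corresponds to the box width for a function that appears once per stack. Note that with the --all option used above, stackcollapse-perf.pl annotates kernel frames with a _[k] suffix, so the frame name must be passed with that suffix.

```shell
#!/bin/sh
# Hypothetical helper (illustration only).
# Reads folded stacks ("comm;frame1;frame2 <count>") on stdin and prints the
# share of samples whose stack contains the given frame.
frame_width() {
    # $1: frame name exactly as it appears in the folded stacks
    awk -v fn="$1" '
        {
            count = $NF                     # last field is the sample count
            stack = $0
            sub(/ [0-9]+$/, "", stack)      # strip the trailing count
            total += count
            # Wrap in ";" so only whole frame names match
            if (index(";" stack ";", ";" fn ";")) hits += count
        }
        END {
            if (total > 0)
                printf "%s: %d/%d samples (%.1f%%)\n", fn, hits, total, 100 * hits / total
        }
    '
}
```

One possible invocation is `perf script -i /root/perf.data | ./stackcollapse-perf.pl --all | frame_width 'net_rx_action_[k]'`.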

FAQ

BEGIN failed--compilation aborted at ./flamegraph.pl line 97.

  • Symptom
[root@host-192-168-0-35 FlameGraph]# perf script -i /root/perf.data | ./stackcollapse-perf.pl --all |  ./flamegraph.pl > aa.svg
Can't locate open.pm in @INC (you may need to install the open module) (@INC contains: /usr/local/lib64/perl5/5.36 /usr/local/share/perl5/5.36 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5) at ./flamegraph.pl line 97.
BEGIN failed--compilation aborted at ./flamegraph.pl line 97.
  • Solution
yum install perl-open
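
Whether the module is present can be checked before running flamegraph.pl. The following is a small sketch; perl_module_available is a hypothetical helper name. It relies on perl's -m switch, which loads a module without calling its import routine (the open pragma refuses to be imported without arguments, so -M would fail here even when the module is installed).

```shell
#!/bin/sh
# Hypothetical helper (illustration only).
# Succeeds if perl can load the named module, fails otherwise
# (including when perl itself is not installed).
perl_module_available() {
    # $1: Perl module name, e.g. open
    perl -m"$1" -e 1 >/dev/null 2>&1
}
```

For example: `perl_module_available open || yum install perl-open`.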
