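Chapter 2
2.3 Extended BPF
2.4 Stack Trace Walking
2.5 Flame Graphs
2.6 Event Sources
2.7 kprobes
Kernel-level dynamic probes
2.8 uprobes
User-level dynamic probes
2.9 Tracepoints
2.10 USDT
2.11 Dynamic USDT
2.12 PMCs (Performance Monitoring Counters)
2.13 perf_events
Chapter 3
3.1 Overview
3.2 Performance Analysis Methodology
3.2.1 Workload Characterization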
Who is causing the load (e.g., PID, process name, UID, IP address)?
Why is the load called (code path, stack trace, flame graph)?
What is the load (IOPS, throughput, type)?
How is the load changing over time (per-interval summaries)?
3.2.2 下钻分析
Start by examining the highest level, then examine the next-level details, following the most promising clue downward.
3.2.3 USE方法论
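Utilization: how busy the resource is.
Saturation: the degree to which work is queued waiting for the resource.
Errors: the count of error events.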
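The USE method amounts to crossing a list of resources with the three metrics and checking every pair. The sketch below only generates that checklist; the resource names are illustrative assumptions, not a list taken from these notes.

```python
from itertools import product

# Illustrative resource list (an assumption); a real checklist would
# enumerate the resources of the system under study.
RESOURCES = ["CPU", "memory", "disk", "network"]
METRICS = ["utilization", "saturation", "errors"]


def use_checklist(resources=RESOURCES):
    """Return one (resource, metric) pair per check the USE method asks for."""
    return list(product(resources, METRICS))


if __name__ == "__main__":
    for resource, metric in use_checklist():
        print(f"[ ] {resource}: {metric}")
```

Working through the pairs in order makes it harder to skip a resource that happens to have no convenient tool.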
3.2.4 Checklists
3.3 Linux 60-Second Analysis
3.3.1 uptime
3.3.2 dmesg | tail
3.3.3 vmstat 1
3.3.4 mpstat -P ALL 1
3.3.5 pidstat 1
3.3.6 iostat -xz 1
3.3.7 free -m
3.3.8 sar -n DEV 1
3.3.9 sar -n TCP,ETCP 1
3.3.10 top
3.4 BCC Tool Checklist
3.4.1 execsnoop
Traces new process execution via the exec() system call, one line per process.
    # execsnoop
    PCOMM            PID    RET ARGS
    supervise        9660     0 ./run
    supervise        9661     0 ./run
    mkdir            9662     0 /bin/mkdir -p ./main
    run              9663     0 ./run
3.4.2 opensnoop
Traces open() system calls, one line per call, showing the path and the result.
    # opensnoop
    PID    COMM               FD ERR PATH
    1565   redis-server        5   0 /proc/1565/stat
    1603   snmpd               9   0 /proc/net/dev
    1603   snmpd              11   0 /proc/net/if_inet6
    1603   snmpd              -1   2 /sys/class/net/eth0/device/vendor
    1603   snmpd              11   0 /proc/sys/net/ipv4/neigh/eth0/retrans_time_ms
    1603   snmpd              11   0 /proc/sys/net/ipv6/neigh/eth0/retrans_time
3.4.3 ext4slower
Traces ext4 file system operations that are slower than a threshold (10 ms in this run).
    # ext4slower
    Tracing ext4 operations slower than 10 ms
    TIME     COMM           PID    T BYTES   OFF_KB   LAT(ms) FILENAME
    06:35:01 cron           16464  R 1249    0          16.05 common-auth
    06:35:01 cron           16463  R 1249    0          16.04 common-auth
    06:35:01 cron           16465  R 1249    0          16.03 common-auth
    06:35:01 cron           16465  R 4096    0          10.62 login.defs
    06:35:01 cron           16464  R 4096    0          10.61 login.defs
3.4.4 biolatency
Histogram of block I/O latency (here in milliseconds, via -m).
    # biolatency -m
    Tracing block device I/O... Hit Ctrl-C to end.
    ^C
         msecs               : count     distribution
             0 -> 1          : 16335    |****************************************|
             2 -> 3          : 2272     |*****                                   |
             4 -> 7          : 3603     |********                                |
             8 -> 15         : 4328     |**********                              |
            16 -> 31         : 3379     |********                                |
            32 -> 63         : 5815     |**************                          |
            64 -> 127        : 0        |                                        |
           128 -> 255        : 0        |                                        |
           256 -> 511        : 0        |                                        |
           512 -> 1023       : 1        |                                        |
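The row labels in the biolatency output above (0 -> 1, 2 -> 3, 4 -> 7, ...) are power-of-two buckets. The following is a minimal user-space sketch of that bucketing, for illustration only; the function names are hypothetical, and the real tool builds its histogram in kernel context rather than this way.

```python
def log2_bucket(value):
    """Return the (low, high) power-of-two bucket holding a non-negative int."""
    if value < 2:
        return (0, 1)
    low = 1 << (value.bit_length() - 1)  # largest power of two <= value
    return (low, 2 * low - 1)


def log2_hist(values):
    """Count values per power-of-two bucket, ordered by bucket."""
    counts = {}
    for v in values:
        bucket = log2_bucket(v)
        counts[bucket] = counts.get(bucket, 0) + 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Made-up latencies in milliseconds, only to exercise the function.
    for (low, high), count in log2_hist([0, 1, 2, 5, 6, 700]).items():
        print(f"{low:>6} -> {high:<6} : {count}")
```

Power-of-two buckets keep the table short while still exposing outliers, such as the single I/O in the 512 -> 1023 ms row.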
3.4.5 biosnoop
Prints one line per block I/O, including its size and latency.
    # biosnoop
    TIME(s)        COMM           PID    DISK    T  SECTOR    BYTES   LAT(ms)
    0.000004001    supervise      1950   xvda1   W  13092560  4096       0.74
    0.000178002    supervise      1950   xvda1   W  13092432  4096       0.61
    0.001469001    supervise      1956   xvda1   W  13092440  4096       1.24
    0.001588002    supervise      1956   xvda1   W  13115128  4096       1.09
    1.022346001    supervise      1950   xvda1   W  13115272  4096       0.98
    [...]
3.4.6 cachestat
File system cache statistics: hits, misses, and hit ratio, one line per interval.
    # cachestat
        HITS   MISSES  DIRTIES HITRATIO   BUFFERS_MB  CACHED_MB
       53401     2755    20953   95.09%           14      90223
       49599     4098    21460   92.37%           14      90230
       16601     2689    61329   86.06%           14      90381
       15197     2477    58028   85.99%           14      90522
    [...]
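The HITRATIO column in the cachestat output above is consistent with hits / (hits + misses). A short check against the four rows shown, with `hit_ratio` as a hypothetical helper name:

```python
def hit_ratio(hits, misses):
    """Cache hit ratio as a percentage; 0.0 when there were no accesses."""
    total = hits + misses
    return 100.0 * hits / total if total else 0.0


# (HITS, MISSES) pairs copied from the cachestat output above.
ROWS = [(53401, 2755), (49599, 4098), (16601, 2689), (15197, 2477)]

if __name__ == "__main__":
    for hits, misses in ROWS:
        print(f"{hits:>8} {misses:>8} {hit_ratio(hits, misses):8.2f}%")
```

Note that DIRTIES does not enter the ratio: the third and fourth rows have far more dirtied pages, yet the ratio still follows from hits and misses alone.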
3.4.7 tcpconnect
Traces active TCP connections (connect()), one line per connection.
    # tcpconnect
    PID    COMM         IP SADDR            DADDR            DPORT
    1479   telnet       4  127.0.0.1        127.0.0.1        23
    1469   curl         4  10.201.219.236   54.245.105.25    80
    1469   curl         4  10.201.219.236   54.67.101.145    80
    1991   telnet       6  ::1              ::1              23
    2015   ssh          6  fe80::2000:bff:fe82:3ac fe80::2000:bff:fe82:3ac 22
    [...]
3.4.8 tcpaccept
Traces passive TCP connections (accept()), one line per connection.
    # tcpaccept
    PID    COMM         IP RADDR            LADDR            LPORT
    907    sshd         4  192.168.56.1     192.168.56.102   22
    907    sshd         4  127.0.0.1        127.0.0.1        22
    5389   perl         6  1234:ab12:2040:5020:2299:0:5:0 1234:ab12:2040:5020:2299:0:5:0 7001
    [...]
3.4.9 tcpretrans
Traces TCP retransmits, one line per retransmitted packet, with addresses and TCP state.
    # tcpretrans
    TIME     PID    IP LADDR:LPORT          T> RADDR:RPORT          STATE
    01:55:05 0      4  10.153.223.157:22    R> 69.53.245.40:34619   ESTABLISHED
    01:55:05 0      4  10.153.223.157:22    R> 69.53.245.40:34619   ESTABLISHED
    01:55:17 0      4  10.153.223.157:22    R> 69.53.245.40:22957   ESTABLISHED
    [...]
3.4.10 runqlat
Histogram of CPU run queue latency: how long threads wait for their turn on a CPU.
    # runqlat
    Tracing run queue latency... Hit Ctrl-C to end.
    ^C
         usecs               : count     distribution
             0 -> 1          : 233      |***********                             |
             2 -> 3          : 742      |************************************    |
             4 -> 7          : 203      |**********                              |
             8 -> 15         : 173      |********                                |
            16 -> 31         : 24       |*                                       |
            32 -> 63         : 0        |                                        |
            64 -> 127        : 30       |*                                       |
           128 -> 255        : 6        |                                        |
           256 -> 511        : 3        |                                        |
           512 -> 1023       : 5        |                                        |
          1024 -> 2047       : 27       |*                                       |
          2048 -> 4095       : 30       |*                                       |
          4096 -> 8191       : 20       |                                        |
          8192 -> 16383      : 29       |*                                       |
         16384 -> 32767      : 809      |****************************************|
         32768 -> 65535      : 64       |***                                     |
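The runqlat output above is bimodal: most wakeups wait a few microseconds, but a second mode sits at 16 to 32 ms. One way to put a number on the slow mode is to compute the share of events at or above a threshold bucket. This is a sketch over the counts shown above; `share_at_or_above` is a hypothetical helper, not part of runqlat.

```python
# (low_usecs, high_usecs, count) rows copied from the runqlat output above.
RUNQLAT = [
    (0, 1, 233), (2, 3, 742), (4, 7, 203), (8, 15, 173),
    (16, 31, 24), (32, 63, 0), (64, 127, 30), (128, 255, 6),
    (256, 511, 3), (512, 1023, 5), (1024, 2047, 27), (2048, 4095, 30),
    (4096, 8191, 20), (8192, 16383, 29), (16384, 32767, 809),
    (32768, 65535, 64),
]


def share_at_or_above(hist, threshold_usecs):
    """Fraction of events in buckets whose lower bound is >= the threshold."""
    total = sum(count for _, _, count in hist)
    slow = sum(count for low, _, count in hist if low >= threshold_usecs)
    return slow / total if total else 0.0


if __name__ == "__main__":
    share = share_at_or_above(RUNQLAT, 16384)
    print(f"{share:.1%} of wakeups waited at least 16384 us")
```

For these counts, 873 of 2398 events fall in the two slowest buckets, which is roughly 36% of all wakeups.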
3.4.11 profile
CPU profiler: samples stack traces at a timed interval and counts each unique stack.
    # profile
    Sampling at 49 Hertz of all threads by user + kernel stack... Hit Ctrl-C to end.
    ^C
    [...]
        copy_user_enhanced_fast_string
        copy_user_enhanced_fast_string
        _copy_from_iter_full
        tcp_sendmsg_locked
        tcp_sendmsg
        inet_sendmsg
        sock_sendmsg
        sock_write_iter
        new_sync_write
        __vfs_write
        vfs_write
        SyS_write
        do_syscall_64
        entry_SYSCALL_64_after_hwframe
        [unknown]
        [unknown]
        -                iperf (24092)
            58
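Flame graph tooling usually consumes stacks in a "folded" form: one line per unique stack, frames joined by semicolons from root to leaf, followed by the sample count. The profile output above lists frames leaf first, so folding means reversing them. The sketch below does this by hand for the one stack shown; `fold_stack` is a hypothetical helper, and putting the process name first is an assumption about the desired layout.

```python
def fold_stack(frames_leaf_first, comm, count):
    """Build one folded-stack line: comm;root;...;leaf count."""
    frames_root_first = list(reversed(frames_leaf_first))
    return ";".join([comm] + frames_root_first) + f" {count}"


# Frames copied from the profile output above, in printed (leaf-first) order.
FRAMES = [
    "copy_user_enhanced_fast_string",
    "copy_user_enhanced_fast_string",
    "_copy_from_iter_full",
    "tcp_sendmsg_locked",
    "tcp_sendmsg",
    "inet_sendmsg",
    "sock_sendmsg",
    "sock_write_iter",
    "new_sync_write",
    "__vfs_write",
    "vfs_write",
    "SyS_write",
    "do_syscall_64",
    "entry_SYSCALL_64_after_hwframe",
    "[unknown]",
    "[unknown]",
]

FOLDED = fold_stack(FRAMES, "iperf", 58)

if __name__ == "__main__":
    print(FOLDED)
```

In a flame graph built from such lines, the width of each frame is proportional to the summed counts of the stacks that pass through it.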
Chapter 4
4.1 BCC Components
4.2 BCC Features
4.3 BCC Installation
4.4 BCC Tools
4.5 funccount
Counts calls to the specified functions.
4.6 stackcount
Counts the stack traces that led to an event.
4.7 trace
Prints per-event details, for tracing and debugging program execution.
4.8 argdist
Summarizes the distribution of function arguments.
4.9 Tool Documentation 4.9.1 Man Page: opensnoop
4.9.2 Examples File: opensnoop
4.10 Developing BCC Tools
4.11 BCC Internals
4.12 BCC Debugging 4.12.1 printf() Debugging 4.12.2 BCC Debug Output 4.12.3 BCC Debug Flag 4.12.4 bpflist 4.12.6 dmesg 4.12.7 Resetting Events
4.13 Summary
Chapter 5
5.1 bpftrace Components
5.2 bpftrace Features 5.2.1 bpftrace Event Sources 5.2.2 bpftrace Actions 5.2.3 bpftrace General Features
5.3 bpftrace Installation 5.3.1 Kernel Requirements 5.3.2 Ubuntu 5.3.3 Fedora 5.3.4 Post-Build Steps 5.3.5 Other Distributions
5.4 bpftrace Tools
5.5 bpftrace One-Liners
5.6 bpftrace Documentation
5.7 bpftrace Programming 5.7.1 Usage 5.7.2 Program Structure 5.7.5 Probe Wildcards 5.7.6 Filters 5.7.7 Actions 5.7.8 Hello, World! 5.7.9 Functions 5.7.10 Variables 5.7.11 Map Functions 5.7.12 Timing vfs_read()
5.8 bpftrace Usage
5.9 bpftrace Probe Types 5.9.1 tracepoint 5.9.2 usdt 5.9.3 kprobe and kretprobe 5.9.4 uprobe and uretprobe 5.9.5 software and hardware 5.9.6 profile and interval
5.10 bpftrace Flow Control 5.10.1 Filter 5.10.2 Ternary Operators 5.10.3 If Statements 5.10.4 Unrolled Loops
5.11 bpftrace Operators
5.12 bpftrace Variables 5.12.1 Built-in Variables 5.12.2 Built-ins: pid, comm, and uid 5.12.3 Built-ins: kstack and ustack 5.12.4 Built-ins: Positional Parameters 5.12.5 Scratch 5.12.6 Maps
5.13 bpftrace Functions 5.13.1 printf() 5.13.2 join() 5.13.3 str() 5.13.4 kstack() and ustack() 5.13.5 ksym() and usym() 5.13.6 kaddr() and uaddr() 5.13.7 system() 5.13.8 exit()
5.14 bpftrace Map Functions 5.14.1 count() 5.14.2 sum(), avg(), min(), and max() 5.14.3 hist() 5.14.4 lhist() 5.14.5 delete() 5.14.6 clear() and zero() 5.14.7 print()
5.15 bpftrace Future Work 5.15.1 Explicit Address Modes 5.15.2 Other Additions 5.15.3 ply
5.16 bpftrace Internals
5.17 bpftrace Debugging 5.17.1 printf() Debugging 5.17.2 Debug Mode 5.17.3 Verbose Mode
5.18 Summary
Chapter 6 CPU
6.1 Background 6.1.1 CPU Fundamentals 6.1.2 BPF Capabilities 6.1.3 Strategy
6.2 Traditional Tools 6.2.1 Kernel Statistics 6.2.2 Hardware Statistics 6.2.3 Hardware Sampling 6.2.4 Timed Sampling 6.2.5 Event Statistics and Tracing
6.3 BPF Tools 6.3.1 execsnoop 6.3.2 exitsnoop 6.3.3 runqlat 6.3.4 runqlen 6.3.5 runqslower 6.3.6 cpudist 6.3.7 cpufreq 6.3.8 profile 6.3.9 offcputime 6.3.10 syscount 6.3.11 argdist and trace 6.3.12 funccount 6.3.13 softirqs 6.3.14 hardirqs 6.3.15 smpcalls 6.3.16 llcstat
6.4 BPF One-Liners 6.4.1 BCC 6.4.2 bpftrace
Chapter 7 Memory
7.1 Background 7.1.1 Memory Fundamentals 7.1.2 BPF Capabilities 7.1.3 Strategy
7.2 Traditional Tools 7.2.1 Kernel Log 7.2.2 Kernel Statistics 7.2.3 Hardware Statistics and Sampling
7.3 BPF Tools 7.3.1 oomkill 7.3.2 memleak 7.3.3 mmapsnoop 7.3.4 brkstack 7.3.5 shmsnoop 7.3.6 faults 7.3.7 ffaults 7.3.8 vmscan 7.3.9 drsnoop 7.3.10 swapin 7.3.11 hfaults
7.4 BPF One-Liners 7.4.1 BCC 7.4.2 bpftrace
7.5 Optional Exercises
7.6 Summary
Chapter 8 File Systems
8.1 Background 8.1.1 File Systems Fundamentals 8.1.2 BPF Capabilities 8.1.3 Strategy
8.2 Traditional Tools 8.2.1 df 8.2.2 mount 8.2.3 strace 8.2.4 perf 8.2.5 fatrace
8.3 BPF Tools 8.3.1 opensnoop 8.3.2 statsnoop 8.3.3 syncsnoop 8.3.4 mmapfiles 8.3.5 scread 8.3.6 fmapfault 8.3.7 filelife 8.3.8 vfsstat 8.3.9 vfscount 8.3.10 vfssize 8.3.11 fsrwstat 8.3.12 fileslower 8.3.13 filetop 8.3.14 writesync 8.3.15 filetype 8.3.16 cachestat 8.3.17 writeback 8.3.18 dcstat 8.3.19 dcsnoop 8.3.20 mountsnoop 8.3.21 xfsslower 8.3.22 xfsdist 8.3.23 ext4dist 8.3.24 icstat 8.3.25 bufgrow 8.3.26 readahead
8.4 BPF One-Liners
Chapter 9 Disk I/O
9.1 Background
9.2 Traditional Tools
9.3 BPF Tools 9.3.1 biolatency 9.3.2 biosnoop 9.3.3 biotop 9.3.4 bitesize 9.3.5 seeksize 9.3.6 biopattern 9.3.7 biostacks 9.3.8 bioerr 9.3.9 mdflush 9.3.10 iosched 9.3.11 scsilatency 9.3.12 scsiresult 9.3.13 nvmelatency
9.4 BPF One-Liners
Chapter 10 Networking
10.1 Background
10.2 Traditional Tools
10.3 BPF Tools 10.3.1 sockstat 10.3.2 sofamily 10.3.3 soprotocol 10.3.4 soconnect 10.3.5 soaccept 10.3.6 socketio 10.3.7 socksize 10.3.8 sormem 10.3.9 soconnlat 10.3.10 so1stbyte 10.3.11 tcpconnect 10.3.12 tcpaccept 10.3.13 tcplife 10.3.14 tcptop 10.3.15 tcpsnoop 10.3.16 tcpretrans 10.3.17 tcpsynbl 10.3.18 tcpwin 10.3.19 tcpnagle 10.3.20 udpconnect 10.3.21 gethostlatency 10.3.22 ipecn 10.3.23 superping 10.3.24 qdisc-fq 10.3.25 qdisc-cbq, qdisc-cbs, qdisc-codel, qdisc-fq_codel, qdisc-red, and qdisc-tbf 10.3.26 netsize 10.3.27 nettxlat 10.3.28 skbdrop 10.3.29 skblife 10.3.30 ieee80211scan
10.4 BPF One-Liners 10.4.1 BCC 10.4.2 bpftrace 10.4.3 BPF One-Liners Examples
Chapter 11 Security
11.1 Background 11.1.1 BPF Capabilities 11.1.2 Unprivileged BPF Users 11.1.3 Configuring BPF Security 11.1.4 Strategy
11.2 BPF Tools 11.2.1 execsnoop 11.2.2 elfsnoop 11.2.3 modsnoop 11.2.4 bashreadline 11.2.5 shellsnoop 11.2.6 ttysnoop 11.2.7 opensnoop 11.2.8 eperm 11.2.9 tcpconnect and tcpaccept 11.2.10 tcpreset 11.2.11 capable 11.2.12 setuids
11.3 BPF One-Liners 11.3.1 BCC 11.3.2 bpftrace 11.3.3 BPF One-Liners Examples
Chapter 12 Languages
12.1 Background 12.1.1 Compiled 12.1.2 JIT Compiled 12.1.3 Interpreted 12.1.4 BPF Capabilities 12.1.5 Strategy
12.2 C 12.2.1 C Function Symbols 12.2.2 C Stack Traces 12.2.3 C Function Tracing 12.2.4 C Function Offset Tracing 12.2.5 C USDT 12.2.6 C One-Liners
12.3 Java 12.3.1 libjvm Tracing 12.3.2 jnistacks 12.3.3 Java Thread Names 12.3.4 Java Method Symbols 12.3.5 Java Stack Traces 12.3.6 Java USDT Probes 12.3.7 profile 12.3.8 offcputime 12.3.9 stackcount 12.3.10 javastat 12.3.11 javathreads 12.3.12 javacalls
12.3.13 javaflow
12.3.14 javagc
12.3.15 javaobjnew
12.3.16 Java One-Liners
12.4 bash shell 12.4.1 Function Counts
12.4.2 Function Argument Tracing (bashfunc.bt)
12.4.3 Function Latency (bashfunclat.bt)
12.4.4 /bin/bash
12.4.5 /bin/bash USDT
12.4.6 bash One-Liners
12.5 Other Languages 12.5.1 JavaScript (Node.js)
12.5.2 C++
12.5.3 Golang
12.6 Summary
Chapter 13 Applications
13.1 Background 13.1.1 Application Fundamentals
13.1.2 Application Example: MySQL Server
13.1.3 BPF Capabilities
13.1.4 Strategy
13.2 BPF Tools 13.2.1 execsnoop
13.2.2 threadsnoop
13.2.3 profile
13.2.4 threaded
13.2.5 offcputime
13.2.6 offcpuhist
13.2.7 syscount
13.2.8 ioprofile
13.2.9 libc Frame Pointers
13.2.10 mysqld_qslower
13.2.11 mysqld_clat
13.2.12 signals
13.2.13 killsnoop
13.2.14 pmlock and pmheld
13.2.15 naptime
13.3 BPF One-Liners 13.3.1 BCC
13.3.2 bpftrace
13.4 BPF One-Liners Examples
Chapter 14 Kernel
14.1 Background 14.1.1 Kernel Fundamentals
14.1.2 BPF Capabilities
14.2 Strategy
14.3 Traditional Tools 14.3.1 Ftrace
14.3.2 perf sched
14.3.3 slabtop
14.4 BPF Tools 14.4.1 loads
14.4.2 offcputime
14.4.3 wakeuptime
14.4.4 offwaketime
14.4.5 mlock and mheld
14.4.6 Spin Locks
14.4.7 kmem
14.4.8 kpages
14.4.9 memleak
14.4.10 slabratetop
14.4.11 numamove
14.4.12 workq
14.4.13 Tasklets
14.5 BPF One-Liners 14.5.1 BCC 14.5.2 bpftrace
14.6 BPF One-Liners Examples 14.6.1 Counting System Calls by Syscall Function 14.6.2 Counting hrtimer Starts by Kernel Function
14.7 Challenges 14.8 Summary
Chapter 15 Containers
15.1 Background 15.1.1 BPF Capabilities
15.1.2 Challenges
15.1.3 Strategy
15.2 Traditional Tools 15.2.1 From the Host
15.2.2 From the Container
15.2.3 systemd-cgtop
15.2.4 kubectl top
15.2.5 docker stats
15.2.6 /sys/fs/cgroups
15.2.7 perf
15.3 BPF Tools 15.3.1 runqlat
15.3.2 pidnss
15.3.3 blkthrot
15.3.4 overlayfs
15.4 BPF One-Liners
15.5 Optional Exercises
Chapter 16 Hypervisors
16.1 Background 16.1.1 BPF Capabilities
16.1.2 Suggested Strategies
16.2 Traditional Tools
16.3 Guest BPF Tools 16.3.1 Xen Hypercalls
16.3.2 xenhyper
16.3.3 Xen Callbacks
16.3.4 cpustolen
16.3.5 HVM Exit Tracing
16.4 Host BPF Tools 16.4.1 kvmexits
16.4.2 Future Work
Chapter 17 Other BPF Performance Tools
17.1.1 Visualizations
17.1.2 Visualization: Heat Maps
17.1.3 Visualization: Tabular Data
17.1.4 BCC Provided Metrics
17.1.5 Internals
17.1.6 Installing PCP and Vector
17.1.7 Connecting and Viewing Data
17.1.8 Configuring the BCC PMDA
17.1.9 Future Work
17.1.10 Further Reading
17.2.1 Installation and Configuration
17.2.2 Connecting and Viewing Data
17.2.3 Future Work
17.2.4 Further Reading
17.3 Cloudflare eBPF Prometheus Exporter (with Grafana) 17.3.1 Build and Run the ebpf Exporter
17.3.3 Set Up a Query in Grafana
17.3.4 Further Reading
17.4 kubectl-trace 17.4.1 Tracing Nodes
17.4.2 Tracing Pods and Containers
17.4.3 Further Reading
17.6 Summary