为什么使用bpf分析耗时问题
网络性能时分析抓包工具是一大利器,tcpdump/wireshark等抓包工具底层都是使用bpf提供的库,抓包位置在软中断从网卡队列(ring buffer)中读取数据后发送给协议栈时同时发给抓包程序一份,也就是只能知道同一seq数据包在网卡接口处一来一回的时间,比较难分析的延时问题都是偶尔的一次抖动(一天一次,几个小时一次这样),所以抖动的原因是发生在内核还是用户太就说不清了(在tcp各监控指标都看不出异常的情况下)。使用bpf可通过kprobe或者tracepoint在协议栈各层的关键函数中添加hook点,当数据包经过该函数时,打印出seq、network namespace、时间戳等关键信息,帮助我们快速定位或者缩小问题范围。
ebpf工作原理
-
这里简单介绍下ebpf的原理,用户写好的ebpf程序,通过LLVM或者clang编译好后,通过bpf校验器校验(如防止程序导致内核崩溃),插入到内核中的bpf虚拟机中,例如图中的filter位置。通过网络处理函的静态探针(也就是tracepoint内核代码中预先定义好的hook点)或者动态探针(内核函数中只有少数函数有tracepoint,没有tracepoint的函数就需要静态探针如:kprobe,kprobe通过注册函数地址,然后在函数被调用时调用我们的bpf程序)调用bpf程序,过滤结果存到buffer,用户层通过系统调用获取过滤结果。这个图来自tcpdump papers值给出了从网卡接口函数获数据包,对于protocal stack或者内核其他子系统的工作原理都类似。
- 本文使用的bpftrace开发bpf程序,bpftrace(https://github.com/iovisor/bpftrace)实现了一套前端语言类似awk和c,不需要关心bpf程序的编译过程。同时作者也将bcc的工具用bpftrace实现了一份,安装即用。bpftrace开发的好处就是其语法简单,缺点就是太简单,不能自定义函数,不能调用内核函数,所以只适合开发一些简单的小工具。开发本文介绍的小工具时,已经有些吃力。
场景
容器环境中,用户反馈在流量没有突增的情况下,请求redis p99时而发生抖动,频率大概是一天一次。排除redis服务端,以及容器的cpu、内存等问题。数据包在协议栈的耗时无法得知,所以决定完善tcp高级指标的监控。同时我们也需要一个工具,能够知道从网卡队列到tcp状态机之前的耗时。
协议栈关键函数
- 发送:
ip_queue_xmit(tcp层处理后的段都要调用这个函数发送ip层的发送队列)------> ip_finish_output2(进入到链路层之前的处理函数,也就是netfilter和路由之后) -----> __dev_queue_xmit(将数据发送到驱动层) - 接收:
netif_rx(软中断调用的处理函数)-----> __netif_receive_skb(协议栈报文接收的入口)-----> ip_rcv(ip层入口)-----> tcp_v4_rcv(tcp层入口)-----> tcp_rcv_state_process(tcp状态机) - 网桥(因为容器使用了docker):
br_handle_frame
大家可能看到发送过程没有4层的关键函数,因为调用ip_queue_xmit之前是没有填充完整的tcp协议头的,没办法在一个函数中获取到我们需要的完整系信息。
-
下面这张图描述了网络报从宿主机到容器的一些关键函数
使用bpftrace实现功能
- bpftrace的语法建议直接去github上看,文档比较新。如果要在centos上使用bpftrace,建议使用ubuntu打镜像直接使用ubuntu的安装包,可以参考上篇文章https://www.jianshu.com/p/54cb1f2b3614。
- 下边直接给出__netif_receive_skb函数的kprobe打点代码,其他函数的代码基本一样。上边说了,由于bpftrace语法功能太简单,因为不能定义函数,所以代码重复很多。另外提一下,bpftrace还是强类型的,所以过滤ip功能,要自己把需要过滤的ip转化成网络字节序。
- 一共使用了3个命令行参数,参数一和二用于过滤IP,参数三是物理机network ns,这个ns的作用是在宿主机的二层数据处理函数__netif_receive_skb上打上时间戳,然后在容器的ns中的tcp_rcv_state_process函数处计算下时延。这样方便快速的判断是否有异常。我们程序中时延大于30ms就会打印一条日志。这是接收时,发送数据亦然。
kprobe:__netif_receive_skb
/
$1 == ((struct iphdr *)(((struct sk_buff *) arg0)->head + ((struct sk_buff *) arg0)->network_header))->saddr ||
$1 == ((struct iphdr *)(((struct sk_buff *) arg0)->head + ((struct sk_buff *) arg0)->network_header))->daddr ||
$2 == ((struct iphdr *)(((struct sk_buff *) arg0)->head + ((struct sk_buff *) arg0)->network_header))->saddr ||
$2 == ((struct iphdr *)(((struct sk_buff *) arg0)->head + ((struct sk_buff *) arg0)->network_header))->saddr
/
{
$skb = ((struct sk_buff *) arg0);
$net = $skb->dev->nd_net.net;
$netif = $skb->dev->name;
$nsid = $net->ns.inum;
$iphd = ((struct iphdr *)($skb->head + $skb->network_header));
$srcaddr = $iphd->saddr;
$dstaddr = $iphd->daddr;
$srcip = ntop($iphd->saddr);
$dstip = ntop($iphd->daddr);
if ($iphd->protocol == IPPROTO_TCP) {
$tcphd = ((struct tcphdr *)($skb->head + $skb->transport_header));
$sport = $tcphd->source;
$sport = ($sport >> 8) | (($sport << 8) & 0x00FF00);
$dport = $tcphd->dest;
$dport = ($dport >> 8) | (($dport << 8) & 0x00FF00);
// __u32 be32_to_cpu(const __be32);
$seq = $tcphd->seq;
$seq = ($seq >> 24) | (($seq & 0x00FF0000) >> 8) | (( $seq & 0x00FF00) << 8) | (($seq & 0xFF) << 24);
$ack = $tcphd->ack_seq;
$ack = ($ack >> 24) | (($ack & 0x00FF0000) >> 8) | (( $ack & 0x00FF00) << 8) | (($ack & 0xFF) << 24);
if($nsid == $3) {
@rcvpkg[$seq] = nsecs;
}
$win = $tcphd->window;
$win = ($win >> 8) | (($win << 8) & 0x00FF00);
$pkgflag = $tcphd->syn > 0 ? "syn" : ($tcphd->fin > 0 ? "fin" : ($tcphd->rst > 0 ? "rst" : ($tcphd->psh > 0 ? "psh" : ($tcphd->ack > 0 ? "ack" : ""))));
time("%H:%M:%S ");
printf("%-19u %-15s %d,%s,%s,%-20d ", $nsid, $netif, pid, comm, func, cpu);
printf("flags:%s, seq:%-u, ack:%u, win:%-25u ", $pkgflag, $seq, $ack, $win);
printf("%s:%-15d %s:%-15d %d ms\n", $srcip, $sport, $dstip, $dport, (nsecs / 1000000) % 1000);
}
}
测试验证
- 用docker启一个nginx容器
[root@ ~]# docker ps | grep 639267e5f60f
639267e5f60f nginx "/docker-entrypoint.…" 13 days ago
- 获取docker pid
[root@ ~]# docker inspect --format {{.State.Pid}} 639267e5f60f
1946308
- 模拟容器网卡eth0发包延时100ms
nsenter -n -t 1946308 tc qdisc add dev eth0 root netem delay 100ms
- 模拟veth peer,就是宿主机上的veth,200ms的发包延时
tc qdisc add dev veth4458ea2 root netem delay 200ms
- 运行工具
root@:/# ./pkgtool.bt 1933764362 1966204682 4026531992
Attaching 11 probes...
Tracing tcp pkg. Hit Ctrl-C to end.
TIME NETWORK NS INTERFACE PID,COMM,FUNC,CPU PKGFLAGS SADDR:SPORT DADDR:DPORT TIMESTAMP
- 请求容器中的nginx,在除容器的宿主机之外的机器上请求。
[root@ ~]# curl 10.231.66.115(容器ip)
- 观察输出结果
12:11:24 4026531992 eth0 0,swapper/5,__netif_receive_skb,5 flags:syn, seq:3511028477, ack:0, win:0 10.231.66.115:49847 10.46.242.54:80 787 ms
12:11:24 4026531992 eth0 0,swapper/5,ip_rcv,5 flags:syn, seq:3511028477, ack:0, win:0 10.231.66.115:49847 10.46.242.54:80 787 ms
12:11:24 4026531992 docker0 0,swapper/5,ip_finish_output2,5 flags:syn, seq:3511028477, ack:0, win:0 10.231.66.115:49847 10.46.242.54:80 787 ms
12:11:24 4026531992 docker0 0,swapper/5,__dev_queue_xmit,5 flags:syn, seq:3511028477, ack:0, win:0 10.231.66.115:49847 10.46.242.54:80 787 ms
12:11:24 4026531992 veth4458ea2 0,swapper/5,__dev_queue_xmit,5 flags:syn, seq:3511028477, ack:0, win:0 10.231.66.115:49847 10.46.242.54:80 787 ms
12:11:24 4026538786 eth0 0,swapper/5,netif_rx,5 flags:syn, seq:3511028477, ack:0, win:0 10.231.66.115:49847 10.46.242.54:80 987 ms
12:11:24 4026538786 eth0 0,swapper/5,__netif_receive_skb,5 flags:syn, seq:3511028477, ack:0, win:0 10.231.66.115:49847 10.46.242.54:80 987 ms
12:11:24 4026538786 eth0 0,swapper/5,ip_rcv,5 flags:syn, seq:3511028477, ack:0, win:0 10.231.66.115:49847 10.46.242.54:80 988 ms
12:11:24 4026538786 eth0 0,swapper/5,tcp_v4_rcv,5 flags:syn, seq:3511028477, ack:0, win:0 10.231.66.115:49847 10.46.242.54:80 988 ms
WARN:RCV slow pkg: duration is 200 seq is 3511028477
12:11:24 4026538786 0,swapper/5,tcp_rcv_state_process,5 flags:syn, seq:3511028477, ack:0, win:0 10.231.66.115:49847 10.46.242.54:80 988 ms
12:11:24 4026538786 eth0 0,swapper/5,ip_finish_output2,5 flags:syn, seq:278643593, ack:3511028478, win:0 10.46.242.54:80 10.231.66.115:49847 988 ms
12:11:24 4026538786 eth0 0,swapper/5,__dev_queue_xmit,5 flags:syn, seq:278643593, ack:3511028478, win:0 10.46.242.54:80 10.231.66.115:49847 988 ms
12:11:24 4026531992 veth4458ea2 0,swapper/5,netif_rx,5 flags:syn, seq:278643593, ack:3511028478, win:0 10.46.242.54:80 10.231.66.115:49847 88 ms
12:11:24 4026531992 veth4458ea2 36,ksoftirqd/5,__netif_receive_skb,5 flags:syn, seq:278643593, ack:3511028478, win:0 10.46.242.54:80 10.231.66.115:49847 88 ms
12:11:24 4026531992 veth4458ea2 36,ksoftirqd/5,br_handle_frame_finish,5 flags:syn, seq:278643593, ack:3511028478, win:0 10.46.242.54:80 10.231.66.115:49847 88 ms
12:11:24 4026531992 docker0 36,ksoftirqd/5,__netif_receive_skb,5 flags:syn, seq:278643593, ack:3511028478, win:0 10.46.242.54:80 10.231.66.115:49847 88 ms
12:11:24 4026531992 docker0 36,ksoftirqd/5,ip_rcv,5 flags:syn, seq:278643593, ack:3511028478, win:0 10.46.242.54:80 10.231.66.115:49847 88 ms
12:11:24 4026531992 eth0 36,ksoftirqd/5,ip_finish_output2,5 flags:syn, seq:278643593, ack:3511028478, win:0 10.46.242.54:80 10.231.66.115:49847 88 ms
12:11:24 4026531992 eth0 36,ksoftirqd/5,__dev_queue_xmit,5 flags:syn, seq:278643593, ack:3511028478, win:0 10.46.242.54:80 10.231.66.115:49847 88 ms
12:11:24 4026531992 eth0 0,swapper/5,__netif_receive_skb,5 flags:ack, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 95 ms
12:11:24 4026531992 eth0 0,swapper/5,ip_rcv,5 flags:ack, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 95 ms
12:11:24 4026531992 docker0 0,swapper/5,ip_finish_output2,5 flags:ack, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 95 ms
12:11:24 4026531992 docker0 0,swapper/5,__dev_queue_xmit,5 flags:ack, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 95 ms
12:11:24 4026531992 veth4458ea2 0,swapper/5,__dev_queue_xmit,5 flags:ack, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 95 ms
12:11:24 4026531992 eth0 0,swapper/5,__netif_receive_skb,5 flags:psh, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 95 ms
12:11:24 4026531992 eth0 0,swapper/5,ip_rcv,5 flags:psh, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 95 ms
12:11:24 4026531992 docker0 0,swapper/5,ip_finish_output2,5 flags:psh, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 95 ms
12:11:24 4026531992 docker0 0,swapper/5,__dev_queue_xmit,5 flags:psh, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 95 ms
12:11:24 4026531992 veth4458ea2 0,swapper/5,__dev_queue_xmit,5 flags:psh, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 95 ms
12:11:24 4026538786 eth0 0,swapper/5,netif_rx,5 flags:ack, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 295 ms
12:11:24 4026538786 eth0 36,ksoftirqd/5,__netif_receive_skb,5 flags:ack, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 295 ms
12:11:24 4026538786 eth0 36,ksoftirqd/5,ip_rcv,5 flags:ack, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 295 ms
12:11:24 4026538786 eth0 36,ksoftirqd/5,tcp_v4_rcv,5 flags:ack, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 295 ms
WARN:RCV slow pkg: duration is 200 seq is 3511028478
12:11:24 4026538786 eth0 36,ksoftirqd/5,tcp_rcv_state_process,5 flags:ack, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 295 ms
12:11:24 4026538786 eth0 36,ksoftirqd/5,netif_rx,5 flags:psh, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 295 ms
12:11:24 4026538786 eth0 36,ksoftirqd/5,__netif_receive_skb,5 flags:psh, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 295 ms
12:11:24 4026538786 eth0 36,ksoftirqd/5,ip_rcv,5 flags:psh, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 295 ms
12:11:24 4026538786 eth0 36,ksoftirqd/5,tcp_v4_rcv,5 flags:psh, seq:3511028478, ack:278643594, win:0 10.231.66.115:49847 10.46.242.54:80 295 ms
12:11:24 4026538786 36,ksoftirqd/5,__ip_queue_xmit,5 flags:ack, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 295 ms
12:11:24 4026538786 eth0 36,ksoftirqd/5,ip_finish_output2,5 flags:ack, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 295 ms
12:11:24 4026538786 eth0 36,ksoftirqd/5,__dev_queue_xmit,5 flags:ack, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 295 ms
12:11:24 4026538786 1946355,nginx,__ip_queue_xmit,10 flags:psh, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 296 ms
12:11:24 4026538786 eth0 1946355,nginx,ip_finish_output2,10 flags:psh, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 296 ms
12:11:24 4026538786 eth0 1946355,nginx,__dev_queue_xmit,10 flags:psh, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 296 ms
12:11:24 4026538786 1946355,nginx,__ip_queue_xmit,10 flags:psh, seq:278643832, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 296 ms
12:11:24 4026538786 eth0 1946355,nginx,ip_finish_output2,10 flags:psh, seq:278643832, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 296 ms
12:11:24 4026538786 eth0 1946355,nginx,__dev_queue_xmit,10 flags:psh, seq:278643832, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 296 ms
12:11:25 4026531992 veth4458ea2 0,swapper/5,netif_rx,5 flags:ack, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 395 ms
12:11:25 4026531992 veth4458ea2 0,swapper/5,__netif_receive_skb,5 flags:ack, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
12:11:25 4026531992 veth4458ea2 0,swapper/5,br_handle_frame_finish,5 flags:ack, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
12:11:25 4026531992 docker0 0,swapper/5,__netif_receive_skb,5 flags:ack, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
12:11:25 4026531992 docker0 0,swapper/5,ip_rcv,5 flags:ack, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
12:11:25 4026531992 eth0 0,swapper/5,ip_finish_output2,5 flags:ack, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
WARN: SEND slow pkg: duration is 99 seq is 278643594
12:11:25 4026531992 eth0 0,swapper/5,__dev_queue_xmit,5 flags:ack, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
12:11:25 4026531992 veth4458ea2 36,ksoftirqd/5,netif_rx,5 flags:psh, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
12:11:25 4026531992 veth4458ea2 36,ksoftirqd/5,__netif_receive_skb,5 flags:psh, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
12:11:25 4026531992 veth4458ea2 36,ksoftirqd/5,br_handle_frame_finish,5 flags:psh, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
12:11:25 4026531992 docker0 36,ksoftirqd/5,__netif_receive_skb,5 flags:psh, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
12:11:25 4026531992 docker0 36,ksoftirqd/5,ip_rcv,5 flags:psh, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
12:11:25 4026531992 eth0 36,ksoftirqd/5,ip_finish_output2,5 flags:psh, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
12:11:25 4026531992 eth0 36,ksoftirqd/5,__dev_queue_xmit,5 flags:psh, seq:278643594, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
12:11:25 4026531992 veth4458ea2 36,ksoftirqd/5,netif_rx,5 flags:psh, seq:278643832, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
12:11:25 4026531992 veth4458ea2 36,ksoftirqd/5,__netif_receive_skb,5 flags:psh, seq:278643832, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
12:11:25 4026531992 veth4458ea2 36,ksoftirqd/5,br_handle_frame_finish,5 flags:psh, seq:278643832, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
12:11:25 4026531992 docker0 36,ksoftirqd/5,__netif_receive_skb,5 flags:psh, seq:278643832, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
12:11:25 4026531992 docker0 36,ksoftirqd/5,ip_rcv,5 flags:psh, seq:278643832, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
12:11:25 4026531992 eth0 36,ksoftirqd/5,ip_finish_output2,5 flags:psh, seq:278643832, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
WARN: SEND slow pkg: duration is 100 seq is 278643832
12:11:25 4026531992 eth0 36,ksoftirqd/5,__dev_queue_xmit,5 flags:psh, seq:278643832, ack:3511028554, win:0 10.46.242.54:80 10.231.66.115:49847 396 ms
12:11:25 4026531992 eth0 0,swapper/5,__netif_receive_skb,5 flags:fin, seq:3511028554, ack:278644444, win:0 10.231.66.115:49847 10.46.242.54:80 408 ms
12:11:25 4026531992 eth0 0,swapper/5,ip_rcv,5 flags:fin, seq:3511028554, ack:278644444, win:0 10.231.66.115:49847 10.46.242.54:80 408 ms
12:11:25 4026531992 docker0 0,swapper/5,ip_finish_output2,5 flags:fin, seq:3511028554, ack:278644444, win:0 10.231.66.115:49847 10.46.242.54:80 408 ms
12:11:25 4026531992 docker0 0,swapper/5,__dev_queue_xmit,5 flags:fin, seq:3511028554, ack:278644444, win:0 10.231.66.115:49847 10.46.242.54:80 408 ms
12:11:25 4026531992 veth4458ea2 0,swapper/5,__dev_queue_xmit,5 flags:fin, seq:3511028554, ack:278644444, win:0 10.231.66.115:49847 10.46.242.54:80 408 ms
12:11:25 4026538786 eth0 0,swapper/5,netif_rx,5 flags:fin, seq:3511028554, ack:278644444, win:0 10.231.66.115:49847 10.46.242.54:80 608 ms
12:11:25 4026538786 eth0 0,swapper/5,__netif_receive_skb,5 flags:fin, seq:3511028554, ack:278644444, win:0 10.231.66.115:49847 10.46.242.54:80 608 ms
12:11:25 4026538786 eth0 0,swapper/5,ip_rcv,5 flags:fin, seq:3511028554, ack:278644444, win:0 10.231.66.115:49847 10.46.242.54:80 608 ms
12:11:25 4026538786 eth0 0,swapper/5,tcp_v4_rcv,5 flags:fin, seq:3511028554, ack:278644444, win:0 10.231.66.115:49847 10.46.242.54:80 608 ms
12:11:25 4026538786 1946355,nginx,__ip_queue_xmit,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 608 ms
12:11:25 4026538786 eth0 1946355,nginx,ip_finish_output2,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 608 ms
12:11:25 4026538786 eth0 1946355,nginx,__dev_queue_xmit,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 608 ms
12:11:25 4026531992 veth4458ea2 0,swapper/10,netif_rx,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 708 ms
12:11:25 4026531992 veth4458ea2 0,swapper/10,__netif_receive_skb,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 708 ms
12:11:25 4026531992 veth4458ea2 0,swapper/10,br_handle_frame_finish,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 708 ms
12:11:25 4026531992 docker0 0,swapper/10,__netif_receive_skb,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 708 ms
12:11:25 4026531992 docker0 0,swapper/10,ip_rcv,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 708 ms
12:11:25 4026531992 eth0 0,swapper/10,ip_finish_output2,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 708 ms
WARN: SEND slow pkg: duration is 100 seq is 278644444
12:11:25 4026531992 eth0 0,swapper/10,__dev_queue_xmit,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 708 ms
12:11:25 4026531992 eth0 0,swapper/5,__netif_receive_skb,5 flags:ack, seq:3511028555, ack:278644445, win:0 10.231.66.115:49847 10.46.242.54:80 720 ms
12:11:25 4026531992 eth0 0,swapper/5,ip_rcv,5 flags:ack, seq:3511028555, ack:278644445, win:0 10.231.66.115:49847 10.46.242.54:80 720 ms
12:11:25 4026531992 docker0 0,swapper/5,ip_finish_output2,5 flags:ack, seq:3511028555, ack:278644445, win:0 10.231.66.115:49847 10.46.242.54:80 720 ms
12:11:25 4026531992 docker0 0,swapper/5,__dev_queue_xmit,5 flags:ack, seq:3511028555, ack:278644445, win:0 10.231.66.115:49847 10.46.242.54:80 720 ms
12:11:25 4026531992 veth4458ea2 0,swapper/5,__dev_queue_xmit,5 flags:ack, seq:3511028555, ack:278644445, win:0 10.231.66.115:49847 10.46.242.54:80 720 ms
12:11:25 4026538786 eth0 0,swapper/5,netif_rx,5 flags:ack, seq:3511028555, ack:278644445, win:0 10.231.66.115:49847 10.46.242.54:80 920 ms
12:11:25 4026538786 eth0 0,swapper/5,__netif_receive_skb,5 flags:ack, seq:3511028555, ack:278644445, win:0 10.231.66.115:49847 10.46.242.54:80 921 ms
12:11:25 4026538786 eth0 0,swapper/5,ip_rcv,5 flags:ack, seq:3511028555, ack:278644445, win:0 10.231.66.115:49847 10.46.242.54:80 921 ms
12:11:25 4026538786 eth0 0,swapper/5,tcp_v4_rcv,5 flags:ack, seq:3511028555, ack:278644445, win:0 10.231.66.115:49847 10.46.242.54:80 921 ms
WARN:RCV slow pkg: duration is 200 seq is 3511028555
12:11:25 4026538786 0,swapper/5,tcp_rcv_state_process,5 flags:ack, seq:3511028555, ack:278644445, win:0 10.231.66.115:49847 10.46.242.54:80 921 ms
- 分析结果,发送包的100ms和接收包的200ms都已经打印出来了。拿“WARN: SEND slow pkg: duration is 100 seq is 278644444”这条日志看看,序列号是“278644444”的包从容器到宿主机上延时超过了100ms(正常的情况下不会超过5ms,如果超过5ms物理机压力可能比较大,长时间这样的话需要额外关注了)。拿着序列号“seq:278644444”,可以过滤出下面的信息:
12:11:25 4026538786 1946355,nginx,__ip_queue_xmit,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 608 ms
12:11:25 4026538786 eth0 1946355,nginx,ip_finish_output2,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 608 ms
12:11:25 4026538786 eth0 1946355,nginx,__dev_queue_xmit,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 608 ms
12:11:25 4026531992 veth4458ea2 0,swapper/10,netif_rx,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 708 ms
12:11:25 4026531992 veth4458ea2 0,swapper/10,__netif_receive_skb,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 708 ms
12:11:25 4026531992 veth4458ea2 0,swapper/10,br_handle_frame_finish,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 708 ms
12:11:25 4026531992 docker0 0,swapper/10,__netif_receive_skb,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 708 ms
12:11:25 4026531992 docker0 0,swapper/10,ip_rcv,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 708 ms
12:11:25 4026531992 eth0 0,swapper/10,ip_finish_output2,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 708 ms
WARN: SEND slow pkg: duration is 100 seq is 278644444
12:11:25 4026531992 eth0 0,swapper/10,__dev_queue_xmit,10 flags:fin, seq:278644444, ack:3511028555, win:0 10.46.242.54:80 10.231.66.115:49847 708 ms
可以看到容器网卡发包后(__dev_queue_xmit)到veth收包(netif_rx),时间戳由608 ms 到了708 ms(为了直观,我们把时间戳由ns转化为了ms)。这样很快就能把问题缩小到网卡接口层。
- 上边的测试只是为了说明工具,定位问题便捷。我们目前还没遇见在这一链路发生延时的问题,即使这样我们能快速的排除这段链路的问题,缩小问题范围。
最后
- bpftrace作者提供了很多实用工具,也可以根据其提供的编程语言实现自己的工具。在传统工具得不到什么信息时,bpftrace可能为你开荒破土。
- 由于bpftrace语法功能的原因,这个工具还是打印太多的数据,如果有需要像这样稍微复杂一点的工具,应该直接用c写,只打印出时延超过预期的数据。
- 网络性能分析时,一直想要确定物理机二层到容器tcp层之间是否有问题,也一直想有个这样的工具。得益于“参考”的第一个连接ebay的一边性能分析文章,否则不会这么顺畅。
参考
https://mp.weixin.qq.com/s/ZUS94PMCKsqgZFHX9b99-g
https://github.com/iovisor/bpftrace/blob/master/docs/reference_guide.md
https://www.tcpdump.org/papers/bpf-usenix93.pdf