I. Docker networking overview
I have recently been experimenting with building a simple LVS setup on top of Docker, which calls for a clear understanding of how Docker networking works. As is well known, Docker isolates resources using the Linux namespace and cgroup mechanisms; on top of that isolation, containers still need to communicate with external networks and with one another.
This post walks through how different containers communicate in Docker.
II. Docker network management
1. Installing the bridge management tool
yum -y install bridge-utils //install the bridge-utils package
brctl show //three interfaces are attached to the docker0 bridge; they are the host ends of veth pairs, with one end plugged into docker0 and the peer end inside each container
bridge name bridge id STP enabled interfaces
docker0 8000.0242cf2c2f1e no veth3ebe827
veth91e2dfa
vethfa44035
2. Docker networking basics
ip link show //list network interfaces
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 52:54:00:4c:8b:a6 brd ff:ff:ff:ff:ff:ff
3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether 02:42:cf:2c:2f:1e brd ff:ff:ff:ff:ff:ff
5: veth3ebe827@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
link/ether de:08:fb:fa:18:5c brd ff:ff:ff:ff:ff:ff link-netnsid 0
9: veth91e2dfa@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
link/ether ae:4d:e5:ab:8e:90 brd ff:ff:ff:ff:ff:ff link-netnsid 1
11: vethfa44035@if10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
link/ether 76:3b:de:b3:17:ce brd ff:ff:ff:ff:ff:ff link-netnsid 2
iptables -t nat -vnL //inspect the NAT rules. The POSTROUTING MASQUERADE rule (source-address masquerading, which automatically picks a suitable outgoing address) matches packets arriving on any interface and leaving on any interface except docker0, with a source address in 172.18.0.0/16 and any destination
Chain PREROUTING (policy ACCEPT 1032K packets, 621M bytes)
pkts bytes target prot opt in out source destination
244K 9070K DOCKER all -- * * 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL
Chain INPUT (policy ACCEPT 235K packets, 8691K bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 125K packets, 7691K bytes)
pkts bytes target prot opt in out source destination
0 0 DOCKER all -- * * 0.0.0.0/0 !127.0.0.0/8 ADDRTYPE match dst-type LOCAL
Chain POSTROUTING (policy ACCEPT 125K packets, 7693K bytes)
pkts bytes target prot opt in out source destination
796K 613M MASQUERADE all -- * !docker0 172.18.0.0/16 0.0.0.0/0
0 0 MASQUERADE tcp -- * * 172.18.0.2 172.18.0.2 tcp dpt:80
0 0 MASQUERADE tcp -- * * 172.18.0.3 172.18.0.3 tcp dpt:80
0 0 MASQUERADE tcp -- * * 172.18.0.4 172.18.0.4 tcp dpt:80
Chain DOCKER (2 references)
pkts bytes target prot opt in out source destination
2 168 RETURN all -- docker0 * 0.0.0.0/0 0.0.0.0/0
26 1312 DNAT tcp -- !docker0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8001 to:172.18.0.2:80
6 292 DNAT tcp -- !docker0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8002 to:172.18.0.3:80
2 80 DNAT tcp -- !docker0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8003 to:172.18.0.4:80
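These DNAT rules are what make published ports work: a packet arriving on a host port is rewritten to the container's IP and port before routing. As an aside (not part of the original walkthrough), a short awk filter can condense such a listing into host-port-to-container mappings; the sample text below is copied from the output above:

```shell
# Summarize DNAT rules from `iptables -t nat -vnL` output into
# "host port -> container ip:port" pairs. The sample mirrors the listing above.
sample='2 168 RETURN all -- docker0 * 0.0.0.0/0 0.0.0.0/0
26 1312 DNAT tcp -- !docker0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8001 to:172.18.0.2:80
6 292 DNAT tcp -- !docker0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8002 to:172.18.0.3:80'

mappings=$(echo "$sample" | awk '/DNAT/ {
  for (i = 1; i <= NF; i++) {
    if ($i ~ /^dpt:/) { sub(/^dpt:/, "", $i); port = $i }
    if ($i ~ /^to:/)  { sub(/^to:/,  "", $i); dest = $i }
  }
  print port " -> " dest
}')
echo "$mappings"
```

On a live host the same idea applies to the real `iptables -t nat -vnL DOCKER` output.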
Enter the busybox container and access the nginx container's service:
docker exec -it myhttp1 sh
wget -O - -q http://172.18.0.2
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="Generator" content="EditPlus®">
<meta name="Author" content="">
<meta name="Keywords" content="">
<meta name="Description" content="">
<title>Document</title>
</head>
<body>
hello111
</body>
</html>
docker network ls
NETWORK ID NAME DRIVER SCOPE
eb959f8fbd39 bridge bridge local
eda31bfe2346 host host local
7e2ac7e43f07 none null local
docker network inspect bridge
[
{
"Name": "bridge",
"Id": "eb959f8fbd396a04fadef114cebf2d81b90591c895d5d0f1e3396e11c0260435",
"Created": "2020-01-03T17:22:51.609258815+08:00",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "172.18.0.0/16",
"Gateway": "172.18.0.1"
}
]
},
"Internal": false,
"Attachable": false,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {
"0282f65f4ac9dbe593920363075a57a71408ad523ac2874965280eb8803aff68": {
"Name": "myhttp1",
"EndpointID": "93df39181193d663b7e93907349d7c1de8da2c6ab02acd047c035272d8d1d069",
"MacAddress": "02:42:ac:12:00:04",
"IPv4Address": "172.18.0.4/16",
"IPv6Address": ""
},
"0c82865fb5f217d0c14b2edf038e4330274e1ef0a9b25ddf7fddbdd428109ec9": {
"Name": "mynginx2",
"EndpointID": "315452f8885ff072eb4b5216ff0d0995a747e2ebb09271601e4e123b0a7244b1",
"MacAddress": "02:42:ac:12:00:03",
"IPv4Address": "172.18.0.3/16",
"IPv6Address": ""
},
"19518cf1ed5a78b32b25147b85c39b660d8b119dfda4b958ced7a899b1ee4273": {
"Name": "mynginx1",
"EndpointID": "5ce90e778148b135514db94ce05bd4cbc84d92705e0e1f85b60fa0e9e590ec3b",
"MacAddress": "02:42:ac:12:00:02",
"IPv4Address": "172.18.0.2/16",
"IPv6Address": ""
}
},
"Options": {
"com.docker.network.bridge.default_bridge": "true",
"com.docker.network.bridge.enable_icc": "true",
"com.docker.network.bridge.enable_ip_masquerade": "true",
"com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
"com.docker.network.bridge.name": "docker0",
"com.docker.network.driver.mtu": "1500"
},
"Labels": {}
}
]
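To pull just the container-to-IP mapping out of that JSON on a live host, `docker network inspect` supports Go templates (e.g. `-f '{{range .Containers}}{{.Name}} {{.IPv4Address}} {{end}}'`). As a self-contained illustration that works without a Docker daemon, the same extraction can be sketched with grep/sed over a saved copy; the JSON below is a trimmed, hypothetical excerpt of the output above:

```shell
# Trimmed, hypothetical excerpt of `docker network inspect bridge` output:
cat > /tmp/bridge-net.json <<'EOF'
{
  "Containers": {
    "0282f65f4ac9": {
      "Name": "myhttp1",
      "IPv4Address": "172.18.0.4/16"
    },
    "19518cf1ed5a": {
      "Name": "mynginx1",
      "IPv4Address": "172.18.0.2/16"
    }
  }
}
EOF

# Keep only the Name/IPv4Address values and pair them up line by line.
pairs=$(grep -E '"(Name|IPv4Address)":' /tmp/bridge-net.json \
  | sed -E 's/.*"(Name|IPv4Address)": *"([^"]*)".*/\2/' \
  | paste - -)
echo "$pairs"
```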
3. Docker networking in practice
Create two network namespaces:
ip netns add r1
ip netns add r2
ip netns list
r2
r1
Run ifconfig inside a network namespace:
ip netns exec r1 ifconfig //no output, because no interface has been brought up yet
ip netns exec r1 ifconfig -a
lo: flags=8<LOOPBACK> mtu 65536
loop txqueuelen 1000 (Local Loopback)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Create a veth pair and manually assign one end to a network namespace:
ip link add name veth1.1 type veth peer name veth1.2
ip link show
12: veth1.2@veth1.1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 2a:00:8a:72:7e:2c brd ff:ff:ff:ff:ff:ff
13: veth1.1@veth1.2: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 06:c7:be:2d:29:ab brd ff:ff:ff:ff:ff:ff
Move the veth1.2 device into network namespace r1:
ip link set dev veth1.2 netns r1
ip link show
13: veth1.1@if12: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 06:c7:be:2d:29:ab brd ff:ff:ff:ff:ff:ff link-netnsid 3
Inside the namespace, verify that veth1.2 has arrived:
ip netns exec r1 ifconfig -a
veth1.2: flags=4098<BROADCAST,MULTICAST> mtu 1500
ether 2a:00:8a:72:7e:2c txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Inside the namespace, rename veth1.2 to eth0:
ip netns exec r1 ip link set dev veth1.2 name eth0
ip netns exec r1 ifconfig -a
eth0: flags=4098<BROADCAST,MULTICAST> mtu 1500
ether 2a:00:8a:72:7e:2c txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Assign an address to veth1.1 and bring it up:
ifconfig veth1.1 10.1.0.1/24 up
veth1.1: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 10.1.0.1 netmask 255.255.255.0 broadcast 10.1.0.255
ether 06:c7:be:2d:29:ab txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Inside the namespace, assign an address to eth0 (formerly veth1.2) and bring it up:
ip netns exec r1 ifconfig eth0 10.1.0.2/24 up
ip netns exec r1 ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.1.0.2 netmask 255.255.255.0 broadcast 10.1.0.255
inet6 fe80::2800:8aff:fe72:7e2c prefixlen 64 scopeid 0x20<link>
ether 2a:00:8a:72:7e:2c txqueuelen 1000 (Ethernet)
RX packets 8 bytes 648 (648.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 8 bytes 648 (648.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
ping 10.1.0.2
PING 10.1.0.2 (10.1.0.2) 56(84) bytes of data.
64 bytes from 10.1.0.2: icmp_seq=1 ttl=64 time=0.037 ms
64 bytes from 10.1.0.2: icmp_seq=2 ttl=64 time=0.041 ms
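The ping succeeds because both veth ends were given addresses in the same /24 (10.1.0.1 and 10.1.0.2), so the kernel delivers the packets directly without routing. A tiny helper (hypothetical, pure shell) makes the check explicit:

```shell
# Hypothetical helper: do two IPv4 addresses fall in the same /24?
same_slash24() {
  a=${1%.*}   # strip the final octet
  b=${2%.*}
  [ "$a" = "$b" ] && echo yes || echo no
}
same_slash24 10.1.0.1 10.1.0.2   # yes: the two veth ends above share 10.1.0.0/24
same_slash24 10.1.0.1 10.2.0.2   # no
```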
Move veth1.1 into network namespace r2:
ip link set dev veth1.1 netns r2
ifconfig //veth1.1 has disappeared from the host
ip netns exec r2 ifconfig veth1.1 10.1.0.3/24 up //assign an address and bring it up
ip netns exec r2 ifconfig //veth1.1's IP information is now visible here
ip netns exec r2 ping 10.1.0.2 //ping r1's address from inside r2
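The whole walkthrough above can be condensed into one script. This is a sketch under the same assumed names (r1, r2, veth1.1/veth1.2); creating namespaces requires root and CAP_NET_ADMIN, so the script backs off politely where it cannot run:

```shell
#!/bin/sh
# Condensed version of the namespace walkthrough above.
netns_demo() {
  if [ "$(id -u)" -ne 0 ]; then echo "skipped: needs root"; return 0; fi
  ip netns add r1 2>/dev/null || { echo "skipped: cannot create netns"; return 0; }
  ip netns add r2
  ip link add name veth1.1 type veth peer name veth1.2
  ip link set dev veth1.2 netns r1                    # move one end into r1
  ip netns exec r1 ip link set dev veth1.2 name eth0  # rename it inside r1
  ip netns exec r1 ip addr add 10.1.0.2/24 dev eth0
  ip netns exec r1 ip link set eth0 up
  ip link set dev veth1.1 netns r2                    # move the other end into r2
  ip netns exec r2 ip addr add 10.1.0.3/24 dev veth1.1
  ip netns exec r2 ip link set veth1.1 up
  ip netns exec r2 ping -c 2 10.1.0.2 >/dev/null 2>&1 \
    && echo "r2 -> r1 ok" || echo "r2 -> r1 failed"
  ip netns del r1; ip netns del r2                    # clean up
}
netns_demo
```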
4. Docker container network modes
docker run --name t1 -ti --rm busybox:latest //--rm deletes the container as soon as it exits
docker ps -a //after removal the container no longer appears
bridge mode:
This is Docker's default network mode. Each container gets an independent network namespace with its own network interface. Docker builds an isolated network environment for the container, separating the host from containers and containers from one another, while the docker0 bridge still allows communication between containers, between containers and the host, and with the outside world.
docker run --name t1 --network bridge -ti --rm busybox:latest //explicitly selecting the bridge network is the same as the default
docker run --name t1 --network none -ti --rm busybox:latest //a closed container: no network device is created apart from loopback
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
docker run --name t1 --network bridge -h t1.maobe.com -ti --rm busybox:latest //set the hostname
hostname
t1.maobe.com
cat /etc/resolv.conf
nameserver 8.8.8.8
nameserver 172.18.0.1
nslookup -type=A www.baidu.com
Server: 8.8.8.8
Address: 8.8.8.8:53
Non-authoritative answer:
www.baidu.com canonical name = www.a.shifen.com
www.a.shifen.com canonical name = www.wshifen.com
Name: www.wshifen.com
Address: 104.193.88.123
Name: www.wshifen.com
Address: 104.193.88.77
docker run --name t1 --network bridge -h t1.maobe.com --dns 114.114.114.114 -ti --rm busybox:latest //set the DNS server
cat /etc/resolv.conf
nameserver 114.114.114.114
options timeout:1 rotate
nslookup -type=A www.baidu.com
Server: 114.114.114.114
Address: 114.114.114.114:53
Non-authoritative answer:
www.baidu.com canonical name = www.a.shifen.com
Name: www.a.shifen.com
Address: 220.181.38.150
Name: www.a.shifen.com
Address: 220.181.38.149
docker run --name t1 --network bridge -h t1.maobe.com --dns 114.114.114.114 --dns-search ilinux.io -ti --rm busybox:latest
cat /etc/resolv.conf
search ilinux.io
nameserver 114.114.114.114
options timeout:1 rotate
docker run --name t1 --network bridge -h t1.maobe.com --dns 114.114.114.114 --dns-search ilinux.io --add-host www.maobe.com:1.1.1.1 -ti --rm busybox:latest //inject a record into the container's hosts file
cat /etc/hosts
1.1.1.1 www.maobe.com
172.18.0.2 t1.maobe.com t1
docker run --name mynginx3 -p 8003:80 -d nginx:latest
iptables -t nat -vnL
Chain DOCKER (2 references)
pkts bytes target prot opt in out source destination
0 0 RETURN all -- docker0 * 0.0.0.0/0 0.0.0.0/0
0 0 DNAT tcp -- !docker0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8003 to:172.18.0.3:80
docker port mynginx3
80/tcp -> 0.0.0.0:8003 //container port 80 is mapped to host port 8003
docker kill mynginx3
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d254eac61620 nginx:latest "nginx -g 'daemon of…" 13 minutes ago Exited (137) 4 seconds ago mynginx3
docker run --name mynginx3 -p 172.17.0.10:8003:80 --rm -d nginx:latest //bind to a specific host IP and port
docker port mynginx3
80/tcp -> 172.17.0.10:8003
docker kill mynginx3 //started with --rm, so killing it also removes it immediately; it no longer shows in docker ps -a
container mode:
In container ("joined") mode the relationship between container and host is similar to bridge mode, but a newly created container does not get its own network interface or IP address; instead it shares the IP address and port space of the specified container, and has no independent network environment of its own.
docker run --name t1 -ti --rm busybox:latest
docker run --name t2 --network container:t1 -ti --rm busybox:latest
In the t2 container:
echo "hello world" > /tmp/index.html
httpd -h /tmp
netstat -lntp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 :::80 :::* LISTEN 9/httpd
In the t1 container:
wget -O - -q localhost
hello world
host mode:
Host mode, also known as host networking, shares the host's network namespace with the container, so the container and host share the same IP address. This lets a container combine the advantages of containerization with direct use of the host's network stack, much like a system-level daemon (compare a Kubernetes DaemonSet).
docker run --name t2 --network host -ti --rm busybox:latest
eth0 Link encap:Ethernet HWaddr 52:54:00:4C:8B:A6
inet addr:172.17.0.10 Bcast:172.17.15.255 Mask:255.255.240.0
echo "hello container" > /tmp/index.html
httpd -h /tmp
On the host: curl localhost
hello container
5. Docker bridges
Network properties of the docker0 bridge
Customize the docker0 bridge's network properties in /etc/docker/daemon.json:
{
"bip": "192.168.1.5/24",
"fixed-cidr": "10.20.0.0/16",
"fixed-cidr-v6": "2001:db8::/64",
"mtu": 1500,
"default-gateway": "10.20.1.1",
"default-gateway-v6": "2001:db8:abcd::89",
"dns": ["10.20.1.2","10.20.1.3"]
}
The core option is bip, short for bridge IP, which sets the docker0 bridge's own IP address; the other options can be derived from it.
The dockerd daemon is the server half of Docker's client/server architecture. By default it listens only on a Unix socket, /var/run/docker.sock. To have it also listen on a TCP socket, add the following to /etc/docker/daemon.json:
"hosts": ["tcp://0.0.0.0:2375", "unix:///var/run/docker.sock"]
Alternatively, pass the -H|--host option directly to dockerd.
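For context, a complete /etc/docker/daemon.json combining the TCP listener with the bridge settings discussed earlier might look like this (a sketch; the addresses are illustrative):

```json
{
  "bip": "192.168.1.5/24",
  "hosts": ["tcp://0.0.0.0:2375", "unix:///var/run/docker.sock"]
}
```

Note that on systemd-based distributions the dockerd unit file often already passes -H, which conflicts with a hosts key in daemon.json, so only one of the two should be used; and an unauthenticated TCP listener on 2375 should never be exposed beyond a trusted network.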
Docker system information:
docker info
Client:
Debug Mode: false
Server:
Containers: 3
Running: 0
Paused: 0
Stopped: 3
Images: 5
Server Version: 19.03.5
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
...
Create a Docker network:
docker network create -d bridge --subnet "172.26.0.0/16" --gateway "172.26.0.1" mybr0
docker network ls
NETWORK ID NAME DRIVER SCOPE
3625122d6da0 mybr0 bridge local
ifconfig
br-3625122d6da0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 172.26.0.1 netmask 255.255.0.0 broadcast 172.26.255.255
ifconfig br-3625122d6da0 172.26.0.1/16 down //take the newly created bridge down
ip link set dev br-3625122d6da0 name mybr0 //rename it
ifconfig mybr0 172.26.0.1/16 up //bring it back up
ifconfig
mybr0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 172.26.0.1 netmask 255.255.0.0 broadcast 172.26.255.255
Create a container attached to mybr0 (note the bridge device was renamed above):
docker run --name t2 --network mybr0 -ti --rm busybox:latest
ifconfig
eth0 Link encap:Ethernet HWaddr 02:42:AC:1A:00:02
inet addr:172.26.0.2 Bcast:172.26.255.255 Mask:255.255.0.0
docker run --name t1 -it --rm busybox:latest
ping 172.26.0.2
cat /proc/sys/net/ipv4/ip_forward //check on the host whether IP forwarding is enabled
1
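Bridged containers reach the outside world only if the host forwards packets between docker0 and the outbound interface, which is why this check matters. A small read-only probe (safe to run anywhere /proc is mounted):

```shell
# Read the current forwarding setting; "1" means the host will route
# packets between docker0 and the outbound interface.
fwd=$(cat /proc/sys/net/ipv4/ip_forward 2>/dev/null || echo "unknown")
echo "ip_forward=$fwd"
# To enable it persistently, one would typically put
#   net.ipv4.ip_forward = 1
# in /etc/sysctl.conf (or a file under /etc/sysctl.d/) and run `sysctl -p`.
```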