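I had not started my k8s test environment for a long time. When I brought it up today and listed the nodes from the master, node2 was in the NotReady state.
Checking on node2 showed that kubelet had stopped running.
kubelet reported this error: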
part of the existing bootstrap client certificate is expired: 2022-06-04
Looking at /etc/kubernetes/kubelet.conf shows that the client certificate path is /var/lib/kubelet/pki/kubelet-client-current.pem
cat /etc/kubernetes/kubelet.conf
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1......UtLS0tLQo=
    server: https://192.168.100.201:6443
  name: default-cluster
contexts:
- context:
    cluster: default-cluster
    namespace: default
    user: default-auth
  name: default-context
current-context: default-context
kind: Config
preferences: {}
users:
- name: default-auth
  user:
    client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
    client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
Then switch to /var/lib/kubelet/pki/ and check the certificate dates
cd /var/lib/kubelet/pki
ll
total 20
-rw------- 1 root root 1061 Sep 14 2020 kubelet-client-2020-09-14-18-00-01.pem
-rw------- 1 root root 1061 Jun 4 2021 kubelet-client-2021-06-04-19-03-23.pem
-rw------- 1 root root 1066 Jun 10 11:00 kubelet-client-2022-06-10-11-00-15.pem
lrwxrwxrwx 1 root root 59 Jun 10 11:00 kubelet-client-current.pem -> /var/lib/kubelet/pki/kubelet-client-2021-06-04-19-03-23.pem
-rw-r--r-- 1 root root 2144 Sep 14 2020 kubelet.crt
-rw------- 1 root root 1679 Sep 14 2020 kubelet.key
kubelet-client-current.pem points to kubelet-client-2021-06-04-19-03-23.pem. Today is 2022-06-10, so the certificate has expired.
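The file names in the listing appear to carry the time at which each client certificate was written, so the symlink target alone already hints at the certificate's age. The following is a minimal sketch that pulls the date out of such a file name, assuming that naming pattern holds. `cert_issue_date` is a hypothetical helper name used for illustration; it is not part of kubelet or kubeadm.

```shell
# cert_issue_date PATH
# Print the YYYY-MM-DD part of a file name shaped like
# kubelet-client-YYYY-MM-DD-HH-MM-SS.pem (pattern assumed from the listing).
# Prints nothing when the name does not match, e.g. for the symlink itself.
cert_issue_date() {
    basename "$1" |
        sed -n 's/^kubelet-client-\([0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}\)-.*\.pem$/\1/p'
}

# Possible usage on the node (resolves the symlink first):
#   cert_issue_date "$(readlink -f /var/lib/kubelet/pki/kubelet-client-current.pem)"
```

This is only a quick hint taken from the file name; the dates stored inside the certificate remain the authoritative source.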
Check the certificate's validity period on node2
# openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -text | grep Not
Not Before: Jun 4 10:58:23 2021 GMT
Not After : Jun 4 10:58:23 2022 GMT
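The comparison between the "Not After" value and the current date can also be scripted. The sketch below assumes GNU `date`, which can parse the timestamp format that openssl prints. `is_expired` is a hypothetical helper name, and the optional second argument exists only so the check can be tried against a fixed date.

```shell
# is_expired NOT_AFTER [NOW]
# Return 0 when NOT_AFTER lies before NOW (default: the current time),
# 1 when it does not, and 2 when a date cannot be parsed.
# Assumes GNU date (the -d option).
is_expired() {
    local not_after_s now_s
    not_after_s=$(date -u -d "$1" +%s) || return 2
    now_s=$(date -u -d "${2:-now}" +%s) || return 2
    [ "$not_after_s" -lt "$now_s" ]
}

# Possible usage with the values shown above:
#   is_expired "Jun 4 10:58:23 2022 GMT" && echo "certificate has expired"
```

To feed it directly from a file, `openssl x509 -enddate -noout -in <file>` prints a `notAfter=...` line whose value can be passed as the first argument. `openssl x509 -checkend 0 -noout -in <file>` is another option that reports expiry through its exit status.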
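Since my master node and node1 are both healthy,
I can regenerate the certificates on the master with the kubeadm.yaml configuration file I used before.
# Back up the existing certificates
# cp -rp /etc/kubernetes /etc/kubernetes.bak
# Generate new certificates (from kubeadm v1.20 onward the command is `kubeadm certs renew all`)
# kubeadm alpha certs renew all --config=kubeadm.yaml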
W0610 09:24:36.851093 26346 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
certificate embedded in the kubeconfig file for the admin to use and for kubeadm itself renewed
certificate for serving the Kubernetes API renewed
certificate the apiserver uses to access etcd renewed
certificate for the API server to connect to kubelet renewed
certificate embedded in the kubeconfig file for the controller manager to use renewed
certificate for liveness probes to healthcheck etcd renewed
certificate for etcd nodes to communicate with each other renewed
certificate for serving etcd renewed
certificate for the front proxy client renewed
certificate embedded in the kubeconfig file for the scheduler manager to use renewed
# Back up the existing configuration files
# mkdir /root/backconf
# mv /etc/kubernetes/*.conf /root/backconf/
# ll backconf/
total 32
-rw------- 1 root root 5451 Jun 10 09:24 admin.conf
-rw------- 1 root root 5491 Jun 10 09:24 controller-manager.conf
-rw------- 1 root root 5463 Sep 1 2021 kubelet.conf
-rw------- 1 root root 5439 Jun 10 09:24 scheduler.conf
# Regenerate the configuration files
# kubeadm init phase kubeconfig all --config kubeadm.yaml
W0610 09:26:59.426236 27497 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
# ll /etc/kubernetes/
total 52
-rw------- 1 root root 5451 Jun 10 09:27 admin.conf
-rw-r--r-- 1 root root 1025 Mar 23 2021 ca.crt
-rw-r--r-- 1 root root 3117 Mar 23 2021 cert.pfx
-rw-r--r-- 1 root root 1082 Mar 23 2021 client.crt
-rw-r--r-- 1 root root 1679 Mar 23 2021 client.key
-rw------- 1 root root 5487 Jun 10 09:27 controller-manager.conf
-rw------- 1 root root 5459 Jun 10 09:27 kubelet.conf
drwxr-xr-x 2 root root 113 Oct 6 2021 manifests
drwxr-xr-x 3 root root 4096 Sep 14 2020 pki
-rw------- 1 root root 5439 Jun 10 09:27 scheduler.conf
# Overwrite ~/.kube/config with the newly generated admin.conf:
mv $HOME/.kube/config $HOME/.kube/config.old
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
chmod 600 $HOME/.kube/config
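# Restart the four containers kube-apiserver, kube-controller-manager, kube-scheduler and etcd (be sure to use `ps -a`; otherwise containers that are not currently running are not listed and will not be started):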
# docker ps -a | grep -v pause | grep -E "etcd|scheduler|controller|apiserver" | awk '{print $1}' | awk '{print "docker","restart",$1}' | bash
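The selection part of that pipeline can be separated from the restart, which makes it possible to review the container IDs before anything is restarted. This is a minimal sketch using the same filters as the command above; `control_plane_ids` is a hypothetical helper name.

```shell
# control_plane_ids
# Read `docker ps -a` output on stdin and print the IDs (first column) of
# the etcd, scheduler, controller-manager and apiserver containers,
# skipping the pause containers, exactly as the one-liner above filters them.
control_plane_ids() {
    grep -v pause |
        grep -E 'etcd|scheduler|controller|apiserver' |
        awk '{print $1}'
}

# Possible usage on the master:
#   docker ps -a | control_plane_ids                          # review first
#   docker ps -a | control_plane_ids | xargs -r docker restart
```

The `-r` flag of `xargs` is a GNU extension that skips the command when no IDs are found; on other platforms it may need to be dropped.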
# Restart kubelet (or the related components) on each node:
systemctl restart kubelet
The master node is now updated. Next, create a token, which is needed when updating the worker nodes
# kubeadm token create --print-join-command
W0610 09:40:30.975578 2435 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
kubeadm join 192.168.100.201:6443 --token 6co5f1.g8wnog41jopfchp8 --discovery-token-ca-cert-hash sha256:8adf630dbe900681db88950f0877faa7be4308f6fd837029ab7e9e41dd0eafd6
# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
6co5f1.g8wnog41jopfchp8 23h 2022-06-11T09:40:31+08:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
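Add the node back to the cluster (the old kubelet configuration files must be deleted first, otherwise the join fails)
First back up the directory that holds the configuration files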
cp -r /etc/kubernetes /etc/kubernetes.bak
# ll /etc/kubernetes*
/etc/kubernetes:
total 4
-rw------- 1 root root 1856 Sep 14 2020 kubelet.conf
drwxr-xr-x 2 root root 6 Apr 9 2020 manifests
drwxr-xr-x 2 root root 20 Sep 14 2020 pki
/etc/kubernetes.bak:
total 4
-rw------- 1 root root 1856 Jun 10 10:58 kubelet.conf
drwxr-xr-x 2 root root 6 Jun 10 10:58 manifests
drwxr-xr-x 2 root root 20 Jun 10 10:58 pki
Then delete the old kubelet configuration files
# rm -rf /etc/kubernetes/kubelet.conf
# rm -rf /etc/kubernetes/pki/ca.crt
# rm -rf /etc/kubernetes/bootstrap-kubelet.conf # I did not have this file
# systemctl stop kubelet
# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: inactive (dead) since Fri 2022-06-10 09:38:04 CST; 1h 20min ago
Docs: https://kubernetes.io/docs/
Process: 31448 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=0/SUCCESS)
Main PID: 31448 (code=exited, status=0/SUCCESS)
Jun 10 09:37:59 node2 kubelet[31448]: E0610 09:37:59.469934 31448 reflector.go:178] object-"loki"/"loki": Failed to list *v1.Secret: secrets "loki" is forb...this object
Jun 10 09:37:59 node2 kubelet[31448]: W0610 09:37:59.676710 31448 status_manager.go:572] Failed to update status for pod "loki-0_loki(e0ea4379-7e48-4107-83...\"Initializ
Jun 10 09:38:00 node2 kubelet[31448]: W0610 09:38:00.077588 31448 status_manager.go:572] Failed to update status for pod "sentinel-0_default(49b3d865-37ae-...type\":\"In
Jun 10 09:38:00 node2 kubelet[31448]: W0610 09:38:00.476110 31448 status_manager.go:572] Failed to update status for pod "usercenter-deployment-7bf4744f58-...ementOrder/
Jun 10 09:38:00 node2 kubelet[31448]: W0610 09:38:00.877862 31448 status_manager.go:572] Failed to update status for pod "getaway-deployment-6595fb8444-ztf...ntOrder/con
Jun 10 09:38:02 node2 kubelet[31448]: I0610 09:38:02.721843 31448 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
Jun 10 09:38:02 node2 kubelet[31448]: I0610 09:38:02.849726 31448 kubelet_node_status.go:70] Attempting to register node node2
Jun 10 09:38:02 node2 kubelet[31448]: E0610 09:38:02.859581 31448 kubelet_node_status.go:92] Unable to register node "node2" with API server: nodes "node2"...ode "node2"
Jun 10 09:38:04 node2 systemd[1]: Stopping kubelet: The Kubernetes Node Agent...
Jun 10 09:38:04 node2 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Hint: Some lines were ellipsized, use -l to show in full.
Rejoin node2 to the cluster
# kubeadm join 192.168.100.201:6443 --token 6co5f1.g8wnog41jopfchp8 --discovery-token-ca-cert-hash sha256:8adf630dbe900681db88950f0877faa7be4308f6fd837029ab7e9e41dd0eafd6
W0610 11:00:11.849573 5754 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.18" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Verify the result
[root@master ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
master Ready master 633d v1.18.1
node1 Ready <none> 633d v1.18.1
node2 Ready <none> 633d v1.18.1
[root@master ~]# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-7ff77c879f-629sv 1/1 Running 15 633d
coredns-7ff77c879f-hk25m 1/1 Running 15 633d
default-http-backend-55fb564b-rrddj 1/1 Running 3 146d
etcd-master 1/1 Running 15 633d
kube-apiserver-master 1/1 Running 8 386d
kube-controller-manager-master 1/1 Running 7 281d
kube-flannel-ds-amd64-g885t 1/1 Running 15 633d
kube-flannel-ds-amd64-nm5xp 1/1 Running 14 633d
kube-flannel-ds-amd64-zd56s 1/1 Running 15 633d
kube-proxy-rdf9s 1/1 Running 16 633d
kube-proxy-rsm5n 1/1 Running 14 633d
kube-proxy-wc7zr 1/1 Running 15 633d
kube-scheduler-master 1/1 Running 17 633d
kube-state-metrics-99d76dd5d-srlvt 1/1 Running 8 300d
metrics-server-7b75fd6bfb-4prml 1/1 Running 9 386d
nginx-ingress-controller-5cf88d6db5-mqp8c 1/1 Running 3 146d
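For a cluster with more nodes, scanning the STATUS column by eye gets tedious. The sketch below filters `kubectl get node` output for anything whose status is not exactly Ready. `not_ready_nodes` is a hypothetical helper name, and the column layout is assumed to match the output shown above.

```shell
# not_ready_nodes
# Read `kubectl get node` output on stdin, skip the header line, and print
# the name of every node whose STATUS column is not exactly "Ready".
# Note: a cordoned node ("Ready,SchedulingDisabled") is reported as well.
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" {print $1}'
}

# Possible usage on the master:
#   kubectl get node | not_ready_nodes
```

With the output above this prints nothing, since all three nodes are Ready.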