Notes on the problems I recently ran into while installing k8s on an arm64 EulerOS system.
Base environment
The system is EulerOS-2.0-SP10. Many of the articles I found say everything can be installed through yum.
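Mounting data disks
On machines with data disks, the disks can be listed with fdisk -l. For example, the current machine has three 7T data disks, and the steps are as follows:
- Use parted to give the data disk a gpt label
- Use fdisk to create the partition on the data disk
- Format the partition with mkfs.ext4
- Mount it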
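The four steps above could be scripted roughly as follows. This is only a sketch under assumptions: the device /dev/sdb and the mount point /data1 are hypothetical, `run` and `prepare_disk` are helper names made up for this example, and `parted mkpart` stands in for the interactive fdisk step so the script can run unattended. Because these commands wipe the disk, the sketch only prints them unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch: prepare one data disk. Device and mount point are hypothetical examples.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare_disk() {
    local disk="$1" mount_point="$2"
    run parted -s "$disk" mklabel gpt                   # GPT label on the whole disk
    run parted -s "$disk" mkpart primary ext4 0% 100%   # one partition spanning the disk
    # "${disk}1" holds for /dev/sdX names; NVMe disks use a "p1" suffix instead
    run mkfs.ext4 "${disk}1"
    run mkdir -p "$mount_point"
    run mount "${disk}1" "$mount_point"
}

prepare_disk /dev/sdb /data1
```

To keep the mount across reboots, an entry in /etc/fstab is still needed; that step is left out here.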
Firewall
# Settings: lift the firewall restrictions
Load the module:
modprobe br_netfilter
Persist it (so it is not lost after a reboot):
cat << EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
cat <<EOF >> /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-iptables=1
net.ipv4.ip_forward=1
EOF
sysctl -p /etc/sysctl.d/k8s.conf
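# Some systems have a firewall that needs to be turned off
ufw disable
# ufw is the Debian/Ubuntu tool; if the system runs firewalld instead: systemctl disable --now firewalld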
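Since the sysctl file above is written with `>>`, rerunning the steps appends to it, so it can be worth checking what actually ended up in the file. A minimal sketch, assuming the file path used above; `check_k8s_sysctl` is a hypothetical helper name, not part of any tool:

```shell
#!/usr/bin/env bash
# Sketch: report which of the three required keys are not set to 1 in a sysctl file.
check_k8s_sysctl() {
    local file="${1:-/etc/sysctl.d/k8s.conf}" key missing=0
    for key in net.bridge.bridge-nf-call-ip6tables \
               net.bridge.bridge-nf-call-iptables \
               net.ipv4.ip_forward; do
        # tr strips spaces and tabs so "key = 1" and "key=1" both match;
        # grep -x -F requires the whole line to equal the fixed string
        if ! tr -d ' \t' < "$file" | grep -qxF "${key}=1"; then
            echo "not set to 1: ${key}"
            missing=1
        fi
    done
    return "$missing"
}

# Example: check_k8s_sysctl /etc/sysctl.d/k8s.conf && echo "sysctl file looks complete"
```

This only inspects the file. The live values can be read with `sysctl net.ipv4.ip_forward` once br_netfilter is loaded.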
Runtime
docker
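Packages mostly come from the Huawei mirror; dnf can also be used to install them.
# The repo here needs to be the Aliyun mirror; other mirrors fail with a "cannot sync" error.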
yum install -y device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum makecache
yum install -y python3-policycoreutils policycoreutils policycoreutils-python-utils selinux-policy selinux-policy-base selinux-policy-targeted
yum install -y container-selinux
# On some EulerOS systems container-selinux may have to be downloaded manually
# wget https://mirrors.huaweicloud.com/centos-altarch/7/extras/aarch64/Packages/container-selinux-2.119.2-1.911c772.el7_8.noarch.rpm
# dnf install -y https://mirrors.huaweicloud.com/centos-altarch/8.1.1911/AppStream/armhfp/os/Packages/container-selinux-2.94-1.git1e99f1d.module_el8.1.0+132+34fc7673.noarch.rpm
# Install docker-ce
yum install -y docker-ce-18.06.3.ce-3.el7
# Start docker
systemctl enable docker && systemctl start docker
cri-docker
# Release packages can be found and downloaded directly on GitHub: https://github.com/Mirantis/cri-dockerd
wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.14/cri-dockerd-0.3.14.arm64.tgz
tar -zxvf cri-dockerd-0.3.14.arm64.tgz
cp cri-dockerd/cri-dockerd /usr/bin/
chmod +x /usr/bin/cri-dockerd
# Create the systemd service unit
cat <<"EOF" > /usr/lib/systemd/system/cri-docker.service
[Unit]
Description=CRI Interface for Docker Application Container Engine
Documentation=https://docs.mirantis.com
After=network-online.target firewalld.service docker.service
Wants=network-online.target
Requires=cri-docker.socket
[Service]
Type=notify
ExecStart=/usr/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.9
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
StartLimitBurst=3
StartLimitInterval=60s
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Delegate=yes
KillMode=process
[Install]
WantedBy=multi-user.target
EOF
# Create the socket unit
cat <<"EOF" > /usr/lib/systemd/system/cri-docker.socket
[Unit]
Description=CRI Docker Socket for the API
PartOf=cri-docker.service
[Socket]
ListenStream=%t/cri-dockerd.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker
[Install]
WantedBy=sockets.target
EOF
# Start cri-docker
systemctl daemon-reload
systemctl start cri-docker
systemctl enable cri-docker
systemctl is-active cri-docker
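If it does not start properly and reports an error about the docker group (the socket unit sets SocketGroup=docker, so the group has to exist),
add the docker user group with groupadd docker,
then start it again.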
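To see in advance whether the group is the problem, a small check along these lines may help. It is a sketch that only looks at local groups in /etc/group; `group_exists` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Sketch: test for a local group before (re)starting cri-docker.
group_exists() {
    # /etc/group lines look like "name:x:gid:members"
    grep -q "^${1}:" /etc/group
}

if group_exists docker; then
    echo "docker group exists"
else
    echo "docker group missing: run 'groupadd docker', then 'systemctl restart cri-docker.socket cri-docker'"
fi
```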
k8s components
Installing the components
# yum repo; a mirror inside China can be used instead, but it may lack some k8s versions
cat >/etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.27/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.27/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
yum clean all && yum makecache
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes --skip-broken --nobest
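If you get the error "Problem: cannot install the best candidate for the job nothing provides socat needed by kubelet-1.27.15-150500.1.1.aarch64",
the socat package has to be downloaded and installed by hand. I could not find a usable repo for it; the download link is: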
https://www.rpmfind.net/linux/rpm2html/search.php?query=socat&submit=Search+...&system=&arch=aarch64
# Install
rpm -Uvh --nodeps socat-2.0.0-0.b9.9.mga8.aarch64.rpm
# Rerunning the kubelet installation then works
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes --skip-broken --nobest
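Before retrying the install, it can save a round trip to check whether socat is already on the machine. A minimal sketch; `missing_cmds` is a hypothetical helper, and the list of commands passed to it is up to you:

```shell
#!/usr/bin/env bash
# Sketch: print every command from the argument list that is not found in PATH.
missing_cmds() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Prints "socat" if it still has to be installed, nothing otherwise
missing_cmds socat
```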
Cluster initialization
- Generate the config
# Create the config
kubeadm config print init-defaults > kubeadm-config.yaml
- Edit the config
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: xxx # IP of this master node; the LB address goes in controlPlaneEndpoint below
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock # depends on the runtime: usually unix:///var/run/containerd/containerd.sock or unix:///var/run/cri-dockerd.sock
  imagePullPolicy: IfNotPresent
  name: devserver-test-npu-master # hostname of master-01; it needs a hosts mapping on the master-01 node
  taints: null
---
controlPlaneEndpoint: xxx:6443 # required; usually the internal LB address fronting port 6443 of the three masters; for a single node, use that node's IP
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: devserver-test # cluster name, can be customized
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers # switched to a mirror inside China, usually Aliyun's
kind: ClusterConfiguration
kubernetesVersion: 1.27.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.249.0.0/16 # pod subnet; change it here if different k8s clusters should use different ranges
scheduler: {}
- Initialize
kubeadm init --config=kubeadm-config.yaml
At this point you can install a network plugin (calico or flannel) and add master and node nodes.
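For reference, the usual follow-up on the first master after a successful kubeadm init looks roughly like this. It is a sketch, not what was run on this machine: `run`, `post_init` and DRY_RUN are helper names made up for this example, and by default the commands are only printed.

```shell
#!/usr/bin/env bash
# Sketch of the usual steps on the first master once `kubeadm init` has succeeded.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them

run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

post_init() {
    # Let kubectl talk to the new cluster as the current user
    run mkdir -p "$HOME/.kube"
    run cp /etc/kubernetes/admin.conf "$HOME/.kube/config"
    # The node shows NotReady until a network plugin is installed
    run kubectl get nodes
    # Prints a ready-made `kubeadm join ...` line to run on each node being added
    run kubeadm token create --print-join-command
}

post_init
```

The printed join line is for worker nodes. Joining additional masters also needs --control-plane and a certificate key. If kubeadm reports more than one CRI socket on a node, appending --cri-socket unix:///var/run/cri-dockerd.sock to the join command should pick the one used here.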
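Node environment initialization
NPU nodes also need the matching Ascend components installed, with ascend-docker-runtime used to mount the NPUs into containers. To be done.
Images from docker.io cannot be pulled directly at the moment, which makes starting some of the related components rather troublesome.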