Using Containerd with Kubeadm

I won't repeat the host initialization steps here; see the initialization section of the article "Building a Highly Available Cluster with Kubeadm". I'll start directly from the Containerd installation, using three hosts for this demonstration.

IP            Hostname       Role    CPU  Memory
172.16.50.200 k8s-master-01  master  4    8G
172.16.50.203 k8s-node-01    node    4    8G
172.16.50.204 k8s-node-02    node    4    8G

Upgrading the System Kernel

The default CentOS 7.6 kernel is 3.10.0-957.el7.x86_64, which is quite old and cannot use cgroup v2. In production we have also run into bugs with the stock kernel, so I upgrade it here. Do this according to your own needs; the default kernel works fine as well.
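For reference, cgroup v2 needs at least kernel 4.5 (and only became practical on much newer kernels); a quick check of the running kernel, written as a generic sketch:

```shell
# Compare the running kernel against the 4.5 minimum required for cgroup v2
required="4.5"
current=$(uname -r | cut -d- -f1)
# sort -V orders version strings; if "required" sorts first, the kernel is new enough
lowest=$(printf '%s\n%s\n' "$required" "$current" | sort -V | head -n1)
if [ "$lowest" = "$required" ]; then
    echo "kernel $current is new enough for cgroup v2"
else
    echo "kernel $current is too old for cgroup v2"
fi
```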

Upgrading the kernel requires the elrepo yum repository. First import the elrepo key and install elrepo:

[root@k8s-master-01 ~]# rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
[root@k8s-master-01 ~]# rpm -Uvh https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm

List the available kernels:

[root@k8s-master-01 ~]# yum --disablerepo="*" --enablerepo="elrepo-kernel" list available  --showduplicates

Kernel choices:

  • kernel-lt (lt = long-term): long-term support version
  • kernel-ml (ml = mainline): mainline version

Install the latest kernel:

[root@k8s-master-01 ~]# yum --enablerepo=elrepo-kernel install kernel-ml  kernel-ml-devel

Change the kernel boot order. The default boot entry is usually 1; the newly installed kernel is inserted at the front as entry 0. (You can skip this step if you don't mind selecting the kernel manually at each boot.)

[root@k8s-master-01 ~]# grub2-set-default  0 && grub2-mkconfig -o /etc/grub2.cfg

Use the following command to confirm that the default boot entry points to the newly installed kernel:

[root@k8s-master-01 ~]# grubby --default-kernel
reboot

Enabling cgroup v2

To enable cgroup v2, add systemd.unified_cgroup_hierarchy=1 to the kernel command line. The node must be rebooted for the parameter to take effect.

yum install -y grubby
grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=1"
grubby --info=ALL
reboot
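After the reboot, you can confirm which cgroup hierarchy the host is actually running; on a pure cgroup v2 host the unified hierarchy is mounted at /sys/fs/cgroup as cgroup2fs:

```shell
# /sys/fs/cgroup is cgroup2fs on a pure cgroup v2 host, tmpfs on v1/hybrid setups
fstype=$(stat -fc %T /sys/fs/cgroup/ 2>/dev/null)
if [ "$fstype" = "cgroup2fs" ]; then
    echo "cgroup v2 is active"
else
    echo "cgroup v1 or hybrid (found: ${fstype:-none})"
fi
```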

Installing containerd

yum install -y yum-utils
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum -y install containerd.io

Configuring containerd

mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml

Using the systemd cgroup driver

To use the systemd cgroup driver with runc, set the following in /etc/containerd/config.toml:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  ...
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true
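If you prefer to script the change, a sed one-liner also works; demonstrated here on a scratch copy rather than the real /etc/containerd/config.toml:

```shell
# Flip SystemdCgroup from false to true (scratch copy for demonstration;
# on a real node, point sed at /etc/containerd/config.toml instead)
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = false
EOF
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' "$cfg"
result=$(grep 'SystemdCgroup' "$cfg")
echo "$result"
rm -f "$cfg"
```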

Restarting containerd

After modifying the configuration file, restart containerd for the changes to take effect:

systemctl restart containerd

Installing crictl

crictl is a client tool for containerd, used to manage the containers running in it; it is more convenient than the default ctr. crictl uses the k8s.io namespace, which is also where Kubernetes keeps its images.

VERSION="v1.22.0"  # download the release matching your Kubernetes version
wget https://github.com/kubernetes-sigs/cri-tools/releases/download/$VERSION/crictl-$VERSION-linux-amd64.tar.gz
tar zxvf crictl-$VERSION-linux-amd64.tar.gz 
chown root:root crictl
mv crictl /usr/bin/
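crictl is versioned in step with Kubernetes minor releases, so the VERSION above can be derived from the cluster version; a small sketch:

```shell
# Derive the matching crictl release series from a Kubernetes version (v1.22.4 -> v1.22.0)
K8S_VERSION="v1.22.4"
CRICTL_VERSION="$(echo "$K8S_VERSION" | cut -d. -f1,2).0"
echo "$CRICTL_VERSION"
```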

Configuring crictl

Edit /etc/crictl.yaml:

runtime-endpoint: unix:///run/containerd/containerd.sock 
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10

Deploying with kubeadm

The default package repository is hosted abroad and may be unreachable, so we use a domestic mirror. Run this on all machines:

cat <<EOF >/etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
EOF

Install the packages on the master node:

yum install -y \
kubeadm-1.22.4-0 \
kubectl-1.22.4-0 \
kubelet-1.22.4-0 \
--disableexcludes=kubernetes && \
systemctl enable kubelet

Install the packages on the worker nodes:

yum install -y \
kubeadm-1.22.4-0 \
kubectl-1.22.4-0 \
kubelet-1.22.4-0 \
--disableexcludes=kubernetes && \
systemctl enable kubelet

kubeadm configuration parameters

Print the default init configuration (run this on one master node only):

kubeadm config print init-defaults > initconfig.yaml

Modify the defaults into the following configuration. Since this is a demo environment, etcd runs inside the Kubernetes cluster.

apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
imageRepository: registry.aliyuncs.com/k8sxio
kubernetesVersion: v1.22.4
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16
controlPlaneEndpoint: 172.16.50.200:6443
apiServer:
  timeoutForControlPlane: 4m0s
  extraArgs:
    authorization-mode: "Node,RBAC"
    enable-admission-plugins: "NamespaceLifecycle,LimitRanger,ServiceAccount,PersistentVolumeClaimResize,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota,Priority"
    runtime-config: api/all=true
    storage-backend: etcd3
  certSANs:
  - 127.0.0.1
  - localhost
  - 172.16.50.200
  - k8s-master-01
  extraVolumes:
  - hostPath: /etc/localtime
    mountPath: /etc/localtime
    name: localtime
    readOnly: true
controllerManager:
  extraArgs:
    bind-address: "0.0.0.0"
  extraVolumes:
  - hostPath: /etc/localtime
    mountPath: /etc/localtime
    name: localtime
    readOnly: true
scheduler:
  extraArgs:
    bind-address: "0.0.0.0"
  extraVolumes:
  - hostPath: /etc/localtime
    mountPath: /etc/localtime
    name: localtime
    readOnly: true
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: "/var/lib/etcd"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs # or iptables
ipvs:
  excludeCIDRs: null
  minSyncPeriod: 0s
  scheduler: "rr" # scheduling algorithm
  strictARP: false
  syncPeriod: 15s
iptables:
  masqueradeAll: true
  masqueradeBit: 14
  minSyncPeriod: 0s
  syncPeriod: 30s
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
failSwapOn: true # set to false if swap is enabled
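Note that mode: ipvs assumes the IPVS kernel modules are available on every node (module names per the standard kube-proxy requirements); a quick check:

```shell
# Try to load the IPVS modules kube-proxy needs and show what is present
for m in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack; do
    modprobe "$m" 2>/dev/null || true
done
loaded=$(lsmod 2>/dev/null | grep -E 'ip_vs|nf_conntrack' || true)
echo "${loaded:-no ipvs modules loaded}"
```

If the modules cannot be loaded, kube-proxy falls back to iptables mode.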

For customizing the parameters further, see the official documentation:

https://kubernetes.io/zh/docs/setup/production-environment/tools/kubeadm/control-plane-flags/
https://kubernetes.io/docs/reference/config-api/kubeadm-config.v1beta3/#kubeadm-k8s-io-v1beta3-ClusterConfiguration
https://kubernetes.io/zh/docs/setup/production-environment/tools/kubeadm/kubelet-integration/#propagating-cluster-level-configuration-to-each-kubelet

Dry-run mode (simulates the init to verify that it would succeed):

kubeadm init --config initconfig.yaml --dry-run

Check that the image list is correct:

kubeadm config images list --config initconfig.yaml

Pre-pull the images:

kubeadm config images pull --config initconfig.yaml # output follows
[config/images] Pulled registry.aliyuncs.com/k8sxio/kube-apiserver:v1.22.4
[config/images] Pulled registry.aliyuncs.com/k8sxio/kube-controller-manager:v1.22.4
[config/images] Pulled registry.aliyuncs.com/k8sxio/kube-scheduler:v1.22.4
[config/images] Pulled registry.aliyuncs.com/k8sxio/kube-proxy:v1.22.4
[config/images] Pulled registry.aliyuncs.com/k8sxio/pause:3.5
[config/images] Pulled registry.aliyuncs.com/k8sxio/etcd:3.5.0-0
failed to pull image "registry.aliyuncs.com/k8sxio/coredns:v1.8.4"

Changing the containerd pause image address

The sandbox_image address in the default containerd configuration cannot be pulled from here, so change it to the pause image address we pre-pulled above.

vim /etc/containerd/config.toml
sandbox_image = "registry.aliyuncs.com/k8sxio/pause:3.5"
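The same edit can be scripted; shown here on a scratch copy (the default upstream value k8s.gcr.io/pause:3.5 is assumed for illustration):

```shell
# Rewrite sandbox_image regardless of its current value (scratch copy for demonstration;
# on a real node, run the sed against /etc/containerd/config.toml)
cfg=$(mktemp)
echo 'sandbox_image = "k8s.gcr.io/pause:3.5"' > "$cfg"
sed -i 's#sandbox_image = ".*"#sandbox_image = "registry.aliyuncs.com/k8sxio/pause:3.5"#' "$cfg"
result=$(cat "$cfg")
echo "$result"
rm -f "$cfg"
```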

kubeadm init

Because the coredns image pull failed above, I add --ignore-preflight-errors="ImagePull" to skip the image error and deal with it afterwards. (The upstream image is published as coredns/coredns:1.8.4, so one option is to pull it manually and retag it to the expected name.)

kubeadm init --config initconfig.yaml  --ignore-preflight-errors="ImagePull"

Note the token printed after init, and copy the kubectl kubeconfig; its default path is ~/.kube/config.

mkdir -p $HOME/.kube
cp /etc/kubernetes/admin.conf $HOME/.kube/config

The init YAML is actually stored in a ConfigMap in the cluster, so it can be inspected at any time; it is also used when other nodes and masters join.

kubectl get cm -n kube-system
kubectl -n kube-system get cm kubeadm-config -o yaml

Joining the worker nodes

kubeadm join 172.16.50.200:6443 --token 4yvo6z.wv2u5tmehdhv4dc9 --discovery-token-ca-cert-hash sha256:b0c0724a1fbee5e53e3bd436902960b5aba17c298544155a0a10b219ef711266
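The --discovery-token-ca-cert-hash value is just the SHA-256 of the cluster CA's public key in DER form. If you lose it, it can be recomputed from /etc/kubernetes/pki/ca.crt on a master; the derivation is demonstrated below on a throwaway self-signed certificate:

```shell
# Recompute a discovery-token-ca-cert-hash (demo CA for illustration;
# on a real cluster, use /etc/kubernetes/pki/ca.crt instead)
tmp=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj "/CN=demo-ca" \
    -keyout "$tmp/ca.key" -out "$tmp/ca.crt" 2>/dev/null
hash=$(openssl x509 -pubkey -in "$tmp/ca.crt" \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex | sed 's/^.* //')
echo "sha256:$hash"
rm -rf "$tmp"
```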

Installing flannel

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Checking the cluster nodes

[root@k8s-master-01 ~]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s-master-01 Ready control-plane,master 16h v1.22.4 172.16.50.200 <none> CentOS Linux 7 (Core) 5.15.6-1.el7.elrepo.x86_64 containerd://1.4.12
k8s-node-01 Ready <none> 24m v1.22.4 172.16.50.203 <none> CentOS Linux 7 (Core) 5.15.6-1.el7.elrepo.x86_64 containerd://1.4.12
k8s-node-02 Ready <none> 23m v1.22.4 172.16.50.204 <none> CentOS Linux 7 (Core) 5.15.6-1.el7.elrepo.x86_64 containerd://1.4.12

Adjusting the kubelet parameters

Modify the parameters in /var/lib/kubelet/kubeadm-flags.env to resolve two parameter warnings in the kubelet logs.

KUBELET_KUBEADM_ARGS="--container-runtime=remote --container-runtime-endpoint=unix:///run/containerd/containerd.sock"

Verifying cluster availability

kubectl -n kube-system get pod -o wide

Once all pods in the kube-system namespace are Running, test the cluster:

cat<<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:alpine
        name: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
---
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - name: busybox
    image: zhangguanzhang/centos
    command:
    - sleep
    - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
EOF

Verifying cluster DNS

[root@k8s-master-01 ~]# kubectl exec -ti busybox -- nslookup kubernetes
Server: 10.96.0.10
Address: 10.96.0.10#53

Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1
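The name resolves because cluster DNS names follow a fixed pattern, <service>.<namespace>.svc.<clusterDomain>, using the dnsDomain cluster.local configured earlier:

```shell
# Compose the FQDN of a service from its parts
svc="kubernetes"; ns="default"; domain="cluster.local"
fqdn="${svc}.${ns}.svc.${domain}"
echo "$fqdn"
```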

From the master, curl the nginx service IP; seeing the nginx index page means the cluster is working. In my case the nginx service IP is 10.102.137.186.

[root@k8s-master-01 ~]# curl 10.102.137.186
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

Switching the Container Runtime of a Kubernetes Cluster

Environment

  • OS: CentOS 7.6 (latest elrepo kernel at the time)
  • Container runtime: Docker CE 20.10.11
  • Kubernetes: v1.22.4
[root@k8s-master-01 ~]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s-master-01 Ready control-plane,master 11m v1.22.4 172.16.50.200 <none> CentOS Linux 7 (Core) 5.4.163-1.el7.elrepo.x86_64 docker://20.10.11
k8s-node-01 Ready <none> 10m v1.22.4 172.16.50.203 <none> CentOS Linux 7 (Core) 5.4.163-1.el7.elrepo.x86_64 docker://20.10.11
k8s-node-02 Ready <none> 10m v1.22.4 172.16.50.204 <none> CentOS Linux 7 (Core) 5.4.163-1.el7.elrepo.x86_64 docker://20.10.11

Mark the node unschedulable and evict the pod resources running on it.

[root@k8s-master-01 ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox 1/1 Running 0 14m 10.244.1.2 k8s-node-01 <none> <none>
nginx-7fb7fd49b4-pzqm6 1/1 Running 0 14m 10.244.2.2 k8s-node-02 <none> <none>

Evict the pods on this node to other nodes in the cluster:

kubectl drain k8s-node-02 --delete-emptydir-data --force --ignore-daemonsets

Check which node the pods that were running on this node were rescheduled to:

kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox 1/1 Running 0 23m 10.244.1.2 k8s-node-01 <none> <none>
nginx-7fb7fd49b4-bg5pn 1/1 Running 0 3m18s 10.244.1.3 k8s-node-01 <none> <none>

Check the node resources in the Kubernetes cluster:

[root@k8s-master-01 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master-01 Ready control-plane,master 33m v1.22.4
k8s-node-01 Ready <none> 32m v1.22.4
k8s-node-02 Ready,SchedulingDisabled <none> 32m v1.22.4

As shown above, k8s-node-02 is no longer schedulable. Now switch the container runtime.

Uninstalling the existing Docker

rpm -qa | grep docker
rpm -e docker-ce-20.10.11-3.el7.x86_64 docker-ce-cli-20.10.11-3.el7.x86_64 docker-ce-rootless-extras-20.10.11-3.el7.x86_64 docker-scan-plugin-0.9.0-3.el7.x86_64

For installing containerd, see the earlier part of this article; the procedure is identical, so I won't repeat it here.

Configuring the kubelet to use containerd

vim /var/lib/kubelet/kubeadm-flags.env 
KUBELET_KUBEADM_ARGS="--container-runtime=remote --container-runtime-endpoint=unix:///run/containerd/containerd.sock"

Restart the kubelet:

systemctl daemon-reload
systemctl restart kubelet

Verify that the container runtime was successfully switched to containerd:

[root@k8s-master-01 ~]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s-master-01 Ready control-plane,master 51m v1.22.4 172.16.50.200 <none> CentOS Linux 7 (Core) 5.4.163-1.el7.elrepo.x86_64 docker://20.10.11
k8s-node-01 Ready <none> 51m v1.22.4 172.16.50.203 <none> CentOS Linux 7 (Core) 5.4.163-1.el7.elrepo.x86_64 docker://20.10.11
k8s-node-02 Ready,SchedulingDisabled <none> 51m v1.22.4 172.16.50.204 <none> CentOS Linux 7 (Core) 5.4.163-1.el7.elrepo.x86_64 containerd://1.4.12

Bring the node back online by marking it schedulable again:

# replace <node-to-drain> with the name of the node
kubectl uncordon <node-to-drain>
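The whole drain / convert / uncordon cycle can then be repeated node by node to migrate the entire cluster; a sketch of the order, using this demo's node names (echo only, standing in for the real steps):

```shell
# One-node-at-a-time migration order (replace the echo with the real drain/convert/uncordon steps)
for node in k8s-node-01 k8s-node-02; do
    echo "migrate runtime on $node: drain -> swap docker for containerd -> restart kubelet -> uncordon"
done
```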

Using Containerd with Kubeadm
https://system51.github.io/2021/12/06/kubeadm-use-containerd/
Author: Mr.Ye
Published: December 6, 2021