Kubernetes: fixing nodes that cannot join the cluster after an upgrade
Kubernetes upgrade path
1.15 to 1.16
1.16 to 1.18
Docker version: 18.06.3
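kubeadm's upgrade documentation states that skipping minor versions is unsupported, so the 1.16 to 1.18 step above falls outside the supported path. Whether that caused the problems described below is an inference and was not verified here. A minimal sketch of a pre-upgrade check, assuming both versions share the same major version and look like 1.16 or 1.16.3 (the function name is hypothetical):

```shell
# Succeeds only when the target minor version equals the current minor
# version or is exactly one above it. The major version is assumed equal.
is_supported_upgrade_step() {
    from_minor="$(printf '%s' "$1" | cut -d. -f2)"
    to_minor="$(printf '%s' "$2" | cut -d. -f2)"
    [ "$to_minor" -ge "$from_minor" ] && [ $((to_minor - from_minor)) -le 1 ]
}
```

For example, is_supported_upgrade_step 1.16 1.18 || echo "upgrade through 1.17 first" prints the warning for the path used here.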
Problem 1
detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd"
Cause: Docker is using the cgroupfs cgroup driver, which conflicts with systemd.
Solution
[root@qonde1-7 ~]# docker info | grep Cgroup
Cgroup Driver: cgroupfs
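The command above shows that the current cgroup driver is cgroupfs; it needs to be changed to systemd in /etc/docker/daemon.json:
[root@qonde1-7 ~]# cat /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}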
[root@qonde1-7 ~]# systemctl daemon-reload
[root@qonde1-7 ~]# systemctl restart docker
[root@qonde1-7 ~]# docker info | grep Cgroup
Cgroup Driver: systemd
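The check above can be wrapped in a small function, for example to run on each node before attempting kubeadm join. This is only a sketch: the function name is made up, and it assumes docker info prints a "Cgroup Driver: <name>" line as in the output above.

```shell
# Succeeds when the cgroup driver reported by `docker info` equals "$1".
docker_cgroup_driver_is() {
    # Split the "Cgroup Driver: <name>" line on ": " and keep the value.
    actual="$(docker info 2>/dev/null | awk -F': ' '/Cgroup Driver/ {print $2}')"
    [ "$actual" = "$1" ]
}
```

Usage: docker_cgroup_driver_is systemd || echo "Docker is not using the systemd cgroup driver"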
Problem 2
error execution phase kubelet-start: configmaps "kubelet-config-1.19" is forbidden: User "system:bootstrap:xvnp3x" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
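Troubleshooting: check the kubeadm, kubectl, and kubelet versions in the cluster, and the Docker version.
Cause: the node being joined turned out to run the same version as the cluster. Further investigation showed that the upgrade itself was the problem:
the role/rolebinding listing has no kubeadm:kubelet-config-1.18 role.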
[root@qmaster1-1 ~]# kubectl get role,rolebinding -n kube-system |grep kubeadm
role.rbac.authorization.k8s.io/kubeadm:kubelet-config-1.15 2021-05-15T06:42:22Z
role.rbac.authorization.k8s.io/kubeadm:kubelet-config-1.16 2021-08-12T06:10:51Z
role.rbac.authorization.k8s.io/kubeadm:nodes-kubeadm-config 2021-05-15T06:42:22Z
rolebinding.rbac.authorization.k8s.io/kubeadm:kubelet-config-1.15 Role/kubeadm:kubelet-config-1.15 165d
rolebinding.rbac.authorization.k8s.io/kubeadm:kubelet-config-1.16 Role/kubeadm:kubelet-config-1.16 76d
rolebinding.rbac.authorization.k8s.io/kubeadm:kubelet-config-1.18 Role/kubeadm:kubelet-config-1.18 6h22m
rolebinding.rbac.authorization.k8s.io/kubeadm:nodes-kubeadm-config Role/kubeadm:nodes-kubeadm-config 165d
Solution: create a new ConfigMap "kubelet-config-1.18" from the existing ConfigMap "kubelet-config-1.16"
kubectl get cm --all-namespaces
kubectl -n kube-system get cm kubelet-config-1.16 -o yaml > kubelet-config-1.18-cm.yaml
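Two edits are needed:
vim kubelet-config-1.18-cm.yaml
# at the bottom, under metadata, change the name to:
#   name: kubelet-config-1.18
# delete selfLink (also delete resourceVersion and uid if present; the API server rejects a create request that sets resourceVersion)
Create it: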
kubectl -n kube-system create -f kubelet-config-1.18-cm.yaml
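The manual edit can also be done with sed. The following is a sketch rather than a tested procedure: retarget_kubelet_cm is a hypothetical helper, and it assumes the metadata fields are indented by exactly two spaces, as in kubectl's -o yaml output, so that lines inside the embedded kubelet configuration are left alone.

```shell
# Reads a ConfigMap manifest on stdin and writes a copy to stdout with
#   - metadata.name changed from kubelet-config-<old> to kubelet-config-<new>
#   - server-populated metadata fields removed
# Usage: retarget_kubelet_cm OLD_VERSION NEW_VERSION
retarget_kubelet_cm() {
    # Escape the dots so "1.16" is matched literally in the regex.
    old_re="$(printf '%s' "$1" | sed 's/\./\\./g')"
    new="$2"
    sed -E \
        -e '/^  (creationTimestamp|resourceVersion|selfLink|uid):/d' \
        -e "s/^(  name: kubelet-config-)${old_re}\$/\1${new}/"
}
```

It would be used in place of the vim step, for example: kubectl -n kube-system get cm kubelet-config-1.16 -o yaml | retarget_kubelet_cm 1.16 1.18 > kubelet-config-1.18-cm.yaml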
Then recreate the role and rolebinding configuration.
Create the following YAML file and apply it with kubectl create -f. It allows nodes joining with a kubeadm bootstrap token to get Node objects.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kubeadm:get-nodes
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubeadm:get-nodes
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubeadm:get-nodes
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:bootstrappers:kubeadm:default-node-token
Problem 3
error execution phase kubelet-start: cannot get Node "qnode1-7": nodes "qnode1-7" is forbidden: User "system:bootstrap:rhjty0" cannot get resource "nodes" in API group
Troubleshooting: after the upgrade to 1.18, the corresponding role and rolebinding were missing.
Generate the YAML from the 1.16 role and rolebinding.
Remember to change the version before running the create command.
YAML file for the role
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  # creationTimestamp, resourceVersion, selfLink and uid from the exported
  # object are omitted; the API server sets them when the object is created
  name: kubeadm:kubelet-config-1.18
  namespace: kube-system
rules:
- apiGroups:
  - ""
  resourceNames:
  - kubelet-config-1.18
  resources:
  - configmaps
  verbs:
  - get
YAML file for the rolebinding
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  # creationTimestamp, resourceVersion, selfLink and uid from the exported
  # object are omitted; the API server sets them when the object is created
  name: kubeadm:kubelet-config-1.18
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubeadm:kubelet-config-1.18
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:nodes
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:bootstrappers:kubeadm:default-node-token
Check the resulting ConfigMap, role, and rolebinding
[root@qmaster1-1 ~]# kubectl get cm -n kube-system kubelet-config-1.18
NAME DATA AGE
kubelet-config-1.18 1 22h
[root@qmaster1-1 ~]# kubectl get roles -n kube-system kubeadm:kubelet-config-1.18
NAME CREATED AT
kubeadm:kubelet-config-1.18 2021-10-27T01:31:47Z
[root@qmaster1-1 ~]# kubectl get rolebindings -n kube-system kubeadm:kubelet-config-1.18
NAME ROLE AGE
kubeadm:kubelet-config-1.18 Role/kubeadm:kubelet-config-1.18 6h36m
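Listing the objects only shows that they exist. To check that a bootstrap token would actually be allowed through, kubectl auth can-i can impersonate the bootstrap group. A sketch, assuming your own account is allowed to impersonate users and groups; the helper name and the user name system:bootstrap:example are placeholders:

```shell
# Asks the API server whether a member of the kubeadm bootstrap group may
# perform the given action. All arguments (verb, resource, and options
# such as -n kube-system) are passed straight through to kubectl.
bootstrap_can_i() {
    kubectl auth can-i "$@" \
        --as=system:bootstrap:example \
        --as-group=system:bootstrappers:kubeadm:default-node-token
}
```

For the two errors above, the relevant checks would be bootstrap_can_i get configmap/kubelet-config-1.18 -n kube-system and bootstrap_can_i get nodes. Each prints yes or no.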
Problem 4
error: failed to run Kubelet: cannot create certificate signing request: certificatesigningrequests.certificates.k8s.io is forbidden: User "kubelet-bootstrap" cannot create certificatesigningrequests.certificates.k8s.io at the cluster scope
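Before version 1.8, with RBAC enabled, the apiserver bound the system:nodes group to the system:node ClusterRole by default. From v1.8 onward this binding no longer exists by default and has to be created manually; otherwise the kubelet reports a permission error after starting, and kubectl get nodes shows that the node never becomes Ready.
Cause: kubelet-bootstrap does not have permission to create certificate signing requests, so the permission has to be granted by binding the user to a suitable role.
Solution
List the cluster roles and cluster role bindings
[root@qmaster1-1 ~]# kubectl get clusterrolebinding
[root@qmaster1-1 ~]# kubectl get clusterrole
Show the details of the system:node role binding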
[root@qmaster1-1 ~]# kubectl describe clusterrolebindings system:node
Create the role binding
This grants the ClusterRole across the whole cluster, including all namespaces
[root@qmaster1-1 ~]# kubectl create clusterrolebinding kubelet-node-clusterbinding --clusterrole=system:node --group=system:nodes
[root@qmaster1-1 ~]# kubectl describe clusterrolebindings kubelet-node-clusterbinding
Name:         kubelet-node-clusterbinding
Labels:       <none>
Annotations:  <none>
Role:
  Kind:  ClusterRole
  Name:  system:node
Subjects:
  Kind   Name          Namespace
  ----   ----          ---------
  Group  system:nodes
Join the cluster
On the master, run the following command to print the join command, then run that join command on the node
[root@qmaster1-1 ~]# kubeadm token create --print-join-command
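After running the printed join command on the node, it is worth confirming from the master that the node reaches the Ready state. A minimal sketch using a JSONPath query on the node's Ready condition; node_is_ready is a hypothetical helper:

```shell
# Succeeds when the named node reports status "True" for its Ready condition.
node_is_ready() {
    status="$(kubectl get node "$1" \
        -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' 2>/dev/null)"
    [ "$status" = "True" ]
}
```

Usage: node_is_ready qnode1-7 && echo "node joined and Ready"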