[Pitfalls encountered while deploying k8s]
System environment
OS version | Docker version | Role | IP address |
CentOS 8.4.2105 (Linux version 4.18.0-348.xx, Red Hat 8.5.0-4) | 20.10.12 | k8s-master | 192.168.100.129 |
 | | k8s-node1 | 192.168.100.130 |
 | | k8s-node2 | 192.168.100.131 |
Kubernetes components
Component | Version |
kubeadm | 1.23.1-0 |
kubectl | 1.23.1-0 |
kubelet | 1.23.1-0 |
Docker images
REPOSITORY | TAG | IMAGE ID | SIZE |
registry.aliyuncs.com/google_containers/kube-apiserver | v1.23.1 | b6d7abedde39 | 135MB |
registry.aliyuncs.com/google_containers/kube-proxy | v1.23.1 | b46c42588d51 | 112MB |
registry.aliyuncs.com/google_containers/kube-controller-manager | v1.23.1 | f51846a4fd28 | 125MB |
registry.aliyuncs.com/google_containers/kube-scheduler | v1.23.1 | 71d575efe628 | 53.5MB |
registry.aliyuncs.com/google_containers/etcd | 3.5.1-0 | 25f8c7f3da61 | 293MB |
registry.aliyuncs.com/google_containers/coredns | v1.8.6 | a4ca41631cc7 | 46.8MB |
registry.aliyuncs.com/google_containers/pause | 3.6 | 6270bb605e12 | 683kB |
Running init on the master with the following command produced the problems described below:
kubeadm init \
--apiserver-advertise-address=192.168.100.129 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.23.1 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16
# service-cidr and pod-network-cidr can be any ranges that do not conflict with the IPs of the other machines
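The non-conflict requirement in the comment above can be checked mechanically. Below is a minimal sketch in POSIX shell, assuming IPv4 only. The helper names ip_to_int and ip_in_cidr are made up for this illustration; the addresses and ranges are the ones used in this post.

```shell
#!/bin/sh
# Convert a dotted IPv4 address to a 32-bit integer (hypothetical helper).
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Return 0 if address $1 lies inside CIDR block $2, e.g. 10.96.0.0/12.
ip_in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Node IPs and CIDR ranges taken from the tables and the init command above.
for ip in 192.168.100.129 192.168.100.130 192.168.100.131; do
    for cidr in 10.96.0.0/12 10.244.0.0/16; do
        if ip_in_cidr "$ip" "$cidr"; then
            echo "conflict: $ip is inside $cidr"
        fi
    done
done
```

With the values from this post the loop prints nothing, because none of the 192.168.100.x node addresses fall inside 10.96.0.0/12 or 10.244.0.0/16.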
I hit several problems during deployment that are still unresolved. I also ran a reset before every init (see 4.4.2 for details):
1. error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
Try restarting kubelet:
systemctl restart kubelet && systemctl enable kubelet
systemctl status kubelet
If restarting kubelet fails, the swap partition may still be enabled. Turn it off again with swapoff -a and repeat the steps above.
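Before repeating the restart, it can help to confirm whether swap is really on. The sketch below assumes a Linux host where /proc/swaps lists active swap devices under a one-line header. The function name swap_is_active is hypothetical, and the optional path argument exists only so the check can be tried against a sample file.

```shell
#!/bin/sh
# Return 0 if the given /proc/swaps-style file lists at least one swap device.
# The first line of /proc/swaps is a header, so only later non-empty lines count.
swap_is_active() {
    awk 'NR > 1 && NF > 0 { n++ } END { exit (n > 0) ? 0 : 1 }' "${1:-/proc/swaps}"
}

if swap_is_active; then
    echo "swap is still on: run swapoff -a, then restart kubelet"
else
    echo "swap is off"
fi
```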
2. Initialization kept failing for me with the output below. This is not solved yet; I will update later.
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
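I tried the following:
1. Open /usr/lib/systemd/system/docker.service and add the line Environment="NO_PROXY=127.0.0.1/8, 127.0.0.1/16". It must be placed after Type=notify in the [Service] section. Then reload the systemd daemon and restart docker and kubelet: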
sudo systemctl daemon-reload
sudo systemctl restart docker
sudo systemctl restart kubelet
No effect.
2. Add the following to /etc/docker/daemon.json. Docker's default cgroup driver is cgroupfs; this setting switches Docker to systemd so that it matches the kubelet's default driver:
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
or
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
Then restart:
sudo systemctl daemon-reload
sudo systemctl restart docker
sudo systemctl restart kubelet
Still no effect.
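One way to confirm that the cgroup driver setting from attempt 2 was picked up is to compare what Docker and the kubelet are configured with. This is a rough sketch: the function names are hypothetical, the sed patterns assume the simple layouts shown in this post, and the default paths (/etc/docker/daemon.json and /var/lib/kubelet/config.yaml) are assumptions about a typical kubeadm install.

```shell
#!/bin/sh
# Read the cgroup driver requested in a Docker daemon.json-style file.
docker_json_driver() {
    sed -n 's/.*native\.cgroupdriver=\([a-z]*\).*/\1/p' "$1"
}

# Read the cgroupDriver field from a kubelet configuration YAML file.
kubelet_yaml_driver() {
    sed -n 's/^cgroupDriver:[[:space:]]*//p' "$1"
}

# Compare the two settings; prints the result and returns 1 on mismatch.
# Default paths are assumptions, override them with $1 and $2.
compare_drivers() {
    d=$(docker_json_driver "${1:-/etc/docker/daemon.json}")
    k=$(kubelet_yaml_driver "${2:-/var/lib/kubelet/config.yaml}")
    if [ -n "$d" ] && [ "$d" = "$k" ]; then
        echo "match: $d"
    else
        echo "mismatch: docker='$d' kubelet='$k'"
        return 1
    fi
}
```

Calling compare_drivers with no arguments checks the default paths. The driver the running Docker daemon actually uses can also be read with docker info --format '{{.CgroupDriver}}'.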
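3. Some people suggest disabling swap permanently. Mine was already permanently disabled, and rebooting the system made no difference either.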
sudo swapoff -a
sudo sed -i '/ swap / s/^/#/' /etc/fstab
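To see what the sed expression above does before touching the real /etc/fstab, it can be run against a scratch copy. The fstab entries below are invented sample data and active_swap_entries is a hypothetical helper. Note that the pattern only matches when the word swap is surrounded by plain spaces, so a tab-separated fstab would be left unchanged.

```shell
#!/bin/sh
# Build a scratch fstab; these two entries are made-up sample data.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
/dev/mapper/cl-root /     xfs  defaults 0 0
/dev/mapper/cl-swap none  swap defaults 0 0
EOF

# Same expression as above, applied to the scratch file only (GNU sed -i).
sed -i '/ swap / s/^/#/' "$tmp"
cat "$tmp"

# Count swap entries that are still active, i.e. not commented out.
active_swap_entries() {
    grep -v '^[[:space:]]*#' "$1" | grep -c '[[:space:]]swap[[:space:]]' || true
}
active_swap_entries "$tmp"
```

Running the same helper against /etc/fstab and getting a count above zero would suggest that swap comes back after a reboot.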
4. Add the parameter --ignore-preflight-errors=Swap when running init:
kubeadm init \
--apiserver-advertise-address=192.168.100.129 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.23.1 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16 \
--ignore-preflight-errors=Swap
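Still no effect.
Later, after switching to version 1.21.1, init succeeded, so the 1.23.1 failure was left for later investigation.
Follow-up: the error messages in 1.23.1 are not as friendly as those in 1.21.1. The initial failure was in fact the same as in 1.21.1: the coredns/coredns:v1.8.0 image failed to download. The following workaround also resolves it: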
docker pull coredns/coredns:1.8.0
docker tag coredns/coredns:1.8.0 registry.aliyuncs.com/google_containers/coredns/coredns:v1.8.0
docker rmi coredns/coredns:1.8.0
Run init again and it succeeds.
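The pull, tag, and remove steps above follow a fixed pattern, so they can be generated for any image. The sketch below is a dry run that only prints the commands. The function name retag_cmds is hypothetical, and the rule of prefixing the tag with "v" is an assumption taken from the single coredns example in this post.

```shell
#!/bin/sh
# Print the pull/tag/rmi commands for mirroring one image (dry run only).
retag_cmds() {
    src=$1              # source image, e.g. coredns/coredns:1.8.0
    repo=$2             # target repository prefix
    name=${src%:*}      # image name without the tag
    tag=${src##*:}      # tag without the name
    dst="$repo/$name:v$tag"   # assumed naming: the target tag gains a "v" prefix
    echo "docker pull $src"
    echo "docker tag $src $dst"
    echo "docker rmi $src"
}

retag_cmds coredns/coredns:1.8.0 registry.aliyuncs.com/google_containers
```

For the arguments shown, the printed lines are the same three commands listed above. To find out which image names and tags a given release expects in the first place, kubeadm config images list accepts the same --kubernetes-version and --image-repository flags as init.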