k8s Learning Notes: Dynamic PV Provisioning with Ceph RBD
// Reference: https://github.com/kubernetes-retired/external-storage/tree/master/ceph/rbd
// Reference: https://www.wenjiangs.com/doc/hqefraum
1. Create a pool, a data pool dedicated to dynamic PVs (see the sketch below)
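A minimal sketch of step 1, assuming the pool name kube and PG count 8 that are used later in this note (the same create command reappears below when the kube user is set up):
# create the pool, tag it for RBD use, and initialize it
ceph osd pool create kube 8 8
ceph osd pool application enable kube rbd
rbd pool init kube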
2. Create ceph-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret-admin
  namespace: kube-system
type: "kubernetes.io/rbd"
data:
  key: QVFCTVQrUmdxYkxzTUJBQS90ZExaTUVBNjY5bmxtODJkNitCeXc9PQ==
---
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret
  namespace: kube-system
type: "kubernetes.io/rbd"
data:
  key: QVFETkJ3UmhxamhVTkJBQVhjWTJoQUlpVnczQmlOc1F6bndoUlE9PQ==
The key values of ceph-secret-admin and ceph-secret may be identical here; the content above follows https://github.com/kubernetes-retired/external-storage/tree/master/ceph/rbd.
Using the admin user grants full permissions for every Ceph operation.
If you create a new user instead, you can scope its Ceph permissions more narrowly, for example:
ceph osd pool create kube 8 8
ceph auth add client.kube mon 'allow r' osd 'allow rwx pool=kube'
ceph auth get-key client.kube > /tmp/key
kubectl create secret generic ceph-secret --from-file=/tmp/key --namespace=kube-system --type=kubernetes.io/rbd
Note that the key returned by get-key still needs to be base64-encoded; using it directly, exactly as the upstream docs show, results in an error:
ceph auth get-key client.kube | base64
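The admin secret can be created the same way instead of hand-writing the YAML from step 2. A sketch; the explicit key= mapping is only there to keep the Secret's data key named key:
ceph auth get-key client.admin > /tmp/admin.key
kubectl create secret generic ceph-secret-admin --from-file=key=/tmp/admin.key --namespace=kube-system --type=kubernetes.io/rbd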
// Check:
//kubectl get secret -n kube-system |grep ceph
ceph-secret kubernetes.io/rbd 1 3d11h
ceph-secret-admin kubernetes.io/rbd 1 3d11h
3. Deploy rbd-provisioner
Note: kube-controller-manager itself runs in a container, so it cannot invoke the Ceph tooling installed on the physical hosts. A separate rbd-provisioner has to be deployed for provisioning to work; otherwise it fails with an error like:
"rbd: create volume failed, err: failed to create rbd image: executable file not found in $PATH:"
Create rbd-provisioner.yaml:
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rbd-provisioner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
  - apiGroups: [""]
    resources: ["services"]
    resourceNames: ["kube-dns", "coredns"]
    verbs: ["list", "get"]
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rbd-provisioner
subjects:
  - kind: ServiceAccount
    name: rbd-provisioner
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: rbd-provisioner
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: rbd-provisioner
  namespace: kube-system
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: rbd-provisioner
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: rbd-provisioner
subjects:
  - kind: ServiceAccount
    name: rbd-provisioner
    namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rbd-provisioner
  namespace: kube-system
  labels:
    app: rbd-provisioner
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rbd-provisioner
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: rbd-provisioner
    spec:
      nodeSelector:
        app: rbd-provisioner
      containers:
        - name: rbd-provisioner
          image: "quay.io/external_storage/rbd-provisioner:latest"
          volumeMounts:
            - name: ceph-conf
              mountPath: /etc/ceph
          env:
            - name: PROVISIONER_NAME
              value: ceph.com/rbd
      serviceAccount: rbd-provisioner
      volumes:
        - name: ceph-conf
          hostPath:
            path: /etc/ceph
      tolerations:
        - key: "node-role.kubernetes.io/master"
          operator: "Exists"
          effect: "NoSchedule"
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rbd-provisioner
  namespace: kube-system
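Apply the manifest and make sure the provisioner pod comes up; the pod name suffix will of course differ per cluster:
kubectl apply -f rbd-provisioner.yaml
//kubectl get pod -n kube-system |grep rbd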
4. Create ceph-rbd-sc.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-rbd
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"
provisioner: ceph.com/rbd
#reclaimPolicy: Retain
allowVolumeExpansion: true
parameters:
  monitors: 10.12.70.201:6789,10.12.70.202:6789,10.12.70.203:6789
  adminId: admin
  adminSecretName: ceph-secret-admin
  adminSecretNamespace: kube-system
  pool: kube
  userId: kube
  userSecretName: ceph-secret
  userSecretNamespace: kube-system
  fsType: ext4
  imageFormat: "2"
  imageFeatures: "layering"
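Apply it and check that the StorageClass exists and is marked as default:
kubectl apply -f ceph-rbd-sc.yaml
//kubectl get sc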
5. Create a PVC and a test application
apiVersion: v1
kind: Pod
metadata:
  name: ceph-pod1
spec:
  containers:
    - name: ceph-busybox
      image: busybox
      command: ["sleep", "60000"]
      volumeMounts:
        - name: ceph-vol1
          mountPath: /usr/share/busybox
          readOnly: false
  volumes:
    - name: ceph-vol1
      persistentVolumeClaim:
        claimName: ceph-claim
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ceph-claim
spec:
  storageClassName: ceph-rbd
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
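Both objects can go into one file and be applied together; the file name ceph-pod.yaml is only an assumed name for this manifest:
kubectl apply -f ceph-pod.yaml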
// Error
After running step 5, the pod stays in Pending, and the PVC turns out to be Pending as well:
//kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
ceph-claim pending ceph-rbd 2d10h
This means the rbd-provisioner is not doing its job. Check the provisioner pod's logs for the error:
//kubectl get pod -n kube-system |grep rbd
rbd-provisioner-f4956975f-4ksqt 1/1 Running 0 2d15h
//kubectl logs rbd-provisioner-f4956975f-4ksqt -n kube-system
// Error
1 controller.go:1004] provision "default/ceph-claim" class "ceph-rbd": unexpected error getting claim reference: selfLink was empty, can't make reference
// Some digging turned up the cause: Kubernetes 1.20 disabled selfLink.
The current workaround is to edit /etc/kubernetes/manifests/kube-apiserver.yaml.
Under:
spec:
  containers:
  - command:
    - kube-apiserver
add this line:
    - --feature-gates=RemoveSelfLink=false
This change has to be made on every k8s master node.
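Once the kubelet has restarted the static pod, the flag can be spot-checked; kube-apiserver-k8s70131 follows the kubeadm static-pod naming for the master node used in this note, so adjust it to your node name:
//kubectl -n kube-system get pod kube-apiserver-k8s70131 -o yaml | grep RemoveSelfLink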
// After the change the PVC is still Pending; check the logs again
// Error
I0728 14:26:55.704256 1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"ceph-claim", UID:"e252cc3d-4ff0-400f-9bc2-feee20ecbb40", APIVersion:"v1", ResourceVersion:"19495043", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "ceph-rbd": failed to create rbd image: exit status 13, command output: did not load config file, using default settings.
2021-07-28 14:26:52.645 7f70da266900 -1 Errors while parsing config file!
2021-07-28 14:26:52.645 7f70da266900 -1 parse_file: cannot open /etc/ceph/ceph.conf: (2) No such file or directory
2021-07-28 14:26:52.645 7f70da266900 -1 parse_file: cannot open /root/.ceph/ceph.conf: (2) No such file or directory
2021-07-28 14:26:52.645 7f70da266900 -1 parse_file: cannot open ceph.conf: (2) No such file or directory
2021-07-28 14:26:52.645 7f70da266900 -1 Errors while parsing config file!
2021-07-28 14:26:52.645 7f70da266900 -1 parse_file: cannot open /etc/ceph/ceph.conf: (2) No such file or directory
2021-07-28 14:26:52.645 7f70da266900 -1 parse_file: cannot open /root/.ceph/ceph.conf: (2) No such file or directory
2021-07-28 14:26:52.645 7f70da266900 -1 parse_file: cannot open ceph.conf: (2) No such file or directory
2021-07-28 14:26:52.685 7f70da266900 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-07-28 14:26:55.689 7f70da266900 -1 monclient: get_monmap_and_config failed to get config
2021-07-28 14:26:55.689 7f70da266900 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
rbd: couldn't connect to the cluster!
This error means that rbd-provisioner needs ceph.conf and the related configuration. A temporary workaround found online is to docker cp the files from the host's /etc/ceph/ into the running container. (By the way, once rbd-provisioner.yaml has been applied successfully, Docker pulls a local copy of the quay.io/external_storage/rbd-provisioner:latest image, as shown below.)
//sudo docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
quay.io/external_storage/rbd-provisioner latest 9fb54e49f9bf 2 years ago 405MB
// Temporary copy command
//sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
52218eacb4a9 quay.io/external_storage/rbd-provisioner "/usr/local/bin/rbd-…" 2 days ago Up 2 days k8s_rbd-provisioner_rbd-provisioner-f4956975f-4ksqt_kube-system_c6e08e90-3775-45f2-90fe-9fbc0eb16efc_0
//sudo docker cp /etc/ceph/ceph.conf 52218eacb4a9:/etc/ceph
With that approach the copied file is lost as soon as the container restarts, so I added a hostPath volume to rbd-provisioner.yaml instead, mounting the host directory into the container, as follows:
      containers:
        - name: rbd-provisioner
          image: "quay.io/external_storage/rbd-provisioner:latest"
          volumeMounts:
            - name: ceph-conf
              mountPath: /etc/ceph
          env:
            - name: PROVISIONER_NAME
              value: ceph.com/rbd
      serviceAccount: rbd-provisioner
      volumes:
        - name: ceph-conf
          hostPath:
            path: /etc/ceph
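To confirm that the host's Ceph configuration is now visible inside the container (the Deployment recreates the pod after this change, so the name suffix will differ from the one shown earlier):
//kubectl exec -n kube-system rbd-provisioner-f4956975f-4ksqt -- ls /etc/ceph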
// To ensure rbd-provisioner runs on the same node as kube-controller-manager, label the master node
//kubectl label nodes k8s70131 app=rbd-provisioner
//kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
k8s70131 Ready control-plane,master 137d v1.21.2 app=rbd-provisioner,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s70132,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=
// The LABELS column now shows app=rbd-provisioner
// To remove the label:
//kubectl label nodes k8s70131 app-
Because the master nodes are tainted, the pod needs a matching toleration to be scheduled there. Add the following to rbd-provisioner.yaml:
      tolerations:
        - key: "node-role.kubernetes.io/master"
          operator: "Exists"
          effect: "NoSchedule"
Then check again: both pods now run on the same node.
//kubectl get pod -n kube-system -o wide
rbd-provisioner-f4956975f-4ksqt 1/1 Running 0 2d16h 22.244.157.77 k8s70131 <none> <none>
kube-controller-manager-k8s70131 1/1 Running 2 44d 10.12.70.131 k8s70131 <none> <none>
// At this point the PVC was still Pending, and the logs no longer showed errors. After a lot of poking around, the cause turned out to be that the ceph-common version inside the rbd-provisioner image did not match the Ceph version of the physical cluster, so the next step is to upgrade Ceph inside the image.
// On the host, run ceph -v
ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable)
The host version is v15.2.13.
// Enter the container
//kubectl get pod -n kube-system |grep rbd
rbd-provisioner-f4956975f-4ksqt 1/1 Running 0 2d16h
//kubectl describe pod rbd-provisioner-f4956975f-4ksqt -n kube-system
Containers:
rbd-provisioner:
Container ID: docker://52218eacb4a91b8338cf38958fe5a5213f0bd4cc8c4d5b3d15d9cda69e8af98e
Image: quay.io/external_storage/rbd-provisioner:latest
Image ID: docker-pullable://quay.io/external_storage/rbd-provisioner@sha256:94fd36b8625141b62ff1addfa914d45f7b39619e55891bad0294263ecd2ce09a
Port: <none>
Host Port: <none>
State: Running
Started: Sat, 31 Jul 2021 18:27:44 +0800
Ready: True
Restart Count: 0
Environment:
PROVISIONER_NAME: ceph.com/rbd
Mounts:
/etc/ceph from ceph-conf (rw)
/var/run/secrets/kubernetes.io/serviceaccount from rbd-provisioner-token-5lx5m (ro)
//kubectl exec -it rbd-provisioner-f4956975f-4ksqt -c rbd-provisioner -n kube-system -- bash
Inside the container, ceph -v reports v13.2.1.
Upgrade Ceph inside the container.
Edit /etc/yum.repos.d/ceph.repo to point at the Aliyun mirror (a sketch of the repo file follows the URLs):
https://mirrors.aliyun.com/ceph/keys/
https://mirrors.aliyun.com/ceph/rpm-15.2.13/el7/
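A possible /etc/yum.repos.d/ceph.repo based on the mirror above; the section names and the x86_64/noarch sub-paths are assumptions following the usual Ceph repository layout, not taken from the original notes:
# Illustrative repo file; adjust baseurl/gpgkey paths to your mirror
cat > /etc/yum.repos.d/ceph.repo <<'EOF'
[ceph]
name=Ceph packages
baseurl=https://mirrors.aliyun.com/ceph/rpm-15.2.13/el7/x86_64/
enabled=1
gpgcheck=1
gpgkey=https://mirrors.aliyun.com/ceph/keys/release.asc

[ceph-noarch]
name=Ceph noarch packages
baseurl=https://mirrors.aliyun.com/ceph/rpm-15.2.13/el7/noarch/
enabled=1
gpgcheck=1
gpgkey=https://mirrors.aliyun.com/ceph/keys/release.asc
EOF
Then refresh the yum metadata and upgrade: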
yum clean all
yum makecache
yum -y update
After the upgrade, check the PVC again:
//kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
ceph-claim Bound pvc-6f7de232-7a76-454f-8c72-f24e21e3230a 2Gi RWO ceph-rbd 2d11h
// Check the PV
//kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-6f7de232-7a76-454f-8c72-f24e21e3230a 2Gi RWO Delete Bound default/ceph-claim ceph-rbd 2d11h
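As an optional final check, the dynamically provisioned image should show up in the kube pool on the Ceph side, and the RBD volume should be mounted inside the test pod:
//rbd ls -p kube
//kubectl exec -it ceph-pod1 -- df -h /usr/share/busybox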
Ceph RBD dynamic PV provisioning on k8s is now working.