Using GPUs from a Kubernetes pod
Steps for using GPUs from a pod. Version requirements:
- NVIDIA drivers ~= 384.81
- nvidia-docker version > 2.0
- Kubernetes version >= 1.10
- The CUDA driver is already installed
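The minimum versions above can be checked with a small version comparison before continuing. This is a minimal sketch, not part of the original procedure: `version_ge` is a hypothetical helper name, and it assumes GNU `sort -V` (version sort) is available on the machine.

```shell
# Hypothetical helper: succeeds when version $1 is >= version $2.
# Assumes GNU coreutils `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example checks against the Kubernetes minimum listed above.
version_ge "1.12" "1.10" && echo "Kubernetes 1.12 meets the >= 1.10 requirement"
version_ge "1.9" "1.10" || echo "Kubernetes 1.9 is too old"
```

Version sort is used because a plain string comparison would rank 1.9 above 1.10. The installed driver version can be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader` and compared by eye against the driver requirement.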
1. Run the following commands in a terminal
1.1 Ubuntu
- Set up the repository
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
- Add the repository's GPG key and package source list
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
- Install nvidia-docker2
sudo apt-get update && sudo apt-get install -y nvidia-docker2
- Restart Docker so that it picks up the new runtime
sudo systemctl restart docker
1.2 CentOS
- Set up the repository
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | \
sudo tee /etc/yum.repos.d/nvidia-docker.repo
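- Refresh the repository key and metadata (DIST holds the release version; the other commands shown here do not use it)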
DIST=$(sed -n 's/releasever=//p' /etc/yum.conf)
DIST=${DIST:-$(. /etc/os-release; echo $VERSION_ID)}
sudo yum makecache
- Install nvidia-docker2
sudo yum install nvidia-docker2
- Reload the Docker daemon configuration
sudo pkill -SIGHUP dockerd
- Test that the installation succeeded
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
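2. Make nvidia the default Docker runtime. After editing, cat /etc/docker/daemon.json should show the following; restart Docker afterwards: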
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
sudo systemctl restart docker
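To confirm the setting before moving on, the file can be checked from the shell. The sketch below is an illustration only: `has_nvidia_default_runtime` and the `DAEMON_JSON` variable are hypothetical names, and the check is a plain text match that does not validate the JSON itself.

```shell
# Hypothetical helper: succeeds if the given daemon.json sets
# "default-runtime" to "nvidia". A plain grep is used so that no JSON
# tool needs to be installed; it does not validate the JSON itself.
has_nvidia_default_runtime() {
    grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$1"
}

# Defaults to the path used in this article; set DAEMON_JSON to override.
config="${DAEMON_JSON:-/etc/docker/daemon.json}"
if [ -r "$config" ] && has_nvidia_default_runtime "$config"; then
    echo "nvidia is the default runtime in $config"
else
    echo "nvidia is NOT the default runtime in $config"
fi
```

Once Docker has been restarted, `docker info` should also list nvidia as the default runtime.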
3. Enable GPU support in Kubernetes by deploying the NVIDIA device plugin. Run this on the master node:
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.12/nvidia-device-plugin.yml
Expected output:
daemonset.apps/nvidia-device-plugin-daemonset created
4. Check that the nodes now report GPU resources; nvidia.com/gpu should appear under Capacity and Allocatable
kubectl describe nodes
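The output of `kubectl describe nodes` is long, so it can help to filter it down to the GPU lines. The following is a minimal sketch under the assumption that the output keeps its usual `Name:` and `nvidia.com/gpu` lines; `gpu_summary` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `kubectl describe nodes` output on stdin and
# prints each node name followed by that node's nvidia.com/gpu lines.
gpu_summary() {
    awk '/^Name:/ { print $2 } /nvidia\.com\/gpu/ { print "  " $1, $2 }'
}

# Usage (requires access to the cluster):
#   kubectl describe nodes | gpu_summary
```

A node that shows no nvidia.com/gpu line has not registered any GPUs, which usually means the device plugin pod is not running on it or Docker is not using the nvidia runtime there.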
5. Start a pod that uses GPUs
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  containers:
    - name: cuda-container
      image: nvidia/cuda:9.0-devel
      resources:
        limits:
          nvidia.com/gpu: 2 # requesting 2 GPUs
    - name: digits-container
      image: nvidia/digits:6.0
      resources:
        limits:
          nvidia.com/gpu: 2 # requesting 2 GPUs
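A pod is always scheduled onto a single node, and its containers do not share GPUs, so the node must have at least as many allocatable GPUs as the sum of all container limits (2 + 2 = 4 for this manifest). The sketch below adds up the limits in a manifest file. `total_gpu_limits` is a hypothetical helper and `gpu-pod.yaml` is an assumed file name for the manifest above; the scan is plain text matching, not a YAML parser.

```shell
# Hypothetical helper: sums every "nvidia.com/gpu: N" value in a manifest.
# It is a plain text scan, not a YAML parser, so it assumes one value per line.
total_gpu_limits() {
    awk '$1 == "nvidia.com/gpu:" { sum += $2 } END { print sum + 0 }' "$1"
}

# Usage, assuming the manifest was saved as gpu-pod.yaml:
#   total_gpu_limits gpu-pod.yaml
#   kubectl create -f gpu-pod.yaml
#   kubectl get pod gpu-pod
```

If no node can supply that many GPUs, the pod stays in the Pending state; lowering the limits or splitting the containers into separate pods are the usual ways around this.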