Steps for using GPUs from a Kubernetes pod. Version requirements:

  • NVIDIA drivers ~= 384.81
  • nvidia-docker version > 2.0
  • Kubernetes version >= 1.10
  • CUDA driver already installed
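
The driver requirement can be checked before going further. The following is a minimal sketch, not part of the original procedure: the helper name version_ge is made up for illustration, GNU sort -V is assumed to be available, and "~= 384.81" is read here as a minimum version, which is an assumption.

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when version A >= version B.
# Relies on GNU sort -V (version sort), assumed to be available on the node.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Minimum driver version taken from the requirements list above
# (treating "~= 384.81" as a lower bound is an assumption).
min_driver="384.81"

# Only query the driver when nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$driver" "$min_driver"; then
        echo "driver $driver is new enough (>= $min_driver)"
    else
        echo "driver $driver is older than $min_driver"
    fi
else
    echo "nvidia-smi not found; the NVIDIA driver does not appear to be installed"
fi
```

The same helper can compare other versions from the list, for example version_ge 1.12 1.10 for a Kubernetes release.
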

1. In a terminal, install nvidia-docker2 with the commands for your distribution:

1.1 Ubuntu

  • Detect the distribution (used in the repository URL)

      distribution=$(. /etc/os-release;echo $ID$VERSION_ID)

  • Add the repository GPG key and package source list

      curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -

      curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

  • Install nvidia-docker2

      sudo apt-get update && sudo apt-get install -y nvidia-docker2

  • Restart Docker so that it picks up the new runtime

      sudo systemctl restart docker
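
The distribution assignment above sources /etc/os-release and concatenates ID and VERSION_ID; on Ubuntu 16.04, for instance, that gives ubuntu16.04. The sketch below wraps the same idea in a function so the value can be inspected before it is used in the repository URL. The function name distribution_from is hypothetical and not part of nvidia-docker.

```shell
#!/usr/bin/env bash
# distribution_from FILE: print ID followed by VERSION_ID from an
# os-release style file. The subshell keeps the sourced variables
# out of the calling shell.
distribution_from() {
    ( . "$1" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

if [ -r /etc/os-release ]; then
    distribution="$(distribution_from /etc/os-release)"
    echo "distribution=$distribution"
    echo "repository list URL: https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list"
fi
```

Printing the URL first makes it easy to spot an unexpected distribution string before curl writes anything under /etc/apt/sources.list.d.
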

1.2 CentOS

  • Set up the repository

      distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
      curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | \
        sudo tee /etc/yum.repos.d/nvidia-docker.repo

  • Refresh the repository metadata and keys (DIST is set here but is not referenced by the other commands shown in this section)

      DIST=$(sed -n 's/releasever=//p' /etc/yum.conf)
      DIST=${DIST:-$(. /etc/os-release; echo $VERSION_ID)}
      sudo yum makecache

  • Install nvidia-docker2

      sudo yum install nvidia-docker2

  • Reload the Docker daemon configuration

      sudo pkill -SIGHUP dockerd

  • Test whether the installation succeeded (this check applies to Ubuntu as well)

      docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi

 

 

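2. Make nvidia the default Docker runtime. Add the "default-runtime" entry to /etc/docker/daemon.json if it is not already there, then run cat /etc/docker/daemon.json to confirm that the file reads as follows:

    {
        "default-runtime": "nvidia",
        "runtimes": {
            "nvidia": {
                "path": "/usr/bin/nvidia-container-runtime",
                "runtimeArgs": []
            }
        }
    }

Restart Docker to apply the change:

    sudo systemctl restart docker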
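
The default-runtime setting can also be checked from a script. This is a rough sketch under stated assumptions: has_nvidia_default is a made-up helper that does a plain text match rather than parsing JSON, and the docker info query only works when the Docker daemon is running.

```shell
#!/usr/bin/env bash
# has_nvidia_default FILE: succeeds when FILE sets "default-runtime" to "nvidia".
# This is a plain text match, not a JSON parser, so treat it as a quick sanity check.
has_nvidia_default() {
    grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$1"
}

config=/etc/docker/daemon.json
if [ -r "$config" ]; then
    if has_nvidia_default "$config"; then
        echo "$config: default runtime is nvidia"
    else
        echo "$config: default runtime is not nvidia"
    fi
fi

# After the restart, the running daemon can be asked directly.
if command -v docker >/dev/null 2>&1; then
    docker info --format '{{.DefaultRuntime}}' || true
fi
```

If the file check passes but docker info still reports a different runtime, the daemon has most likely not been restarted since the file was edited.
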

3. Enable GPU support in Kubernetes by deploying the NVIDIA device plugin. Run the following on the master node:

    kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.12/nvidia-device-plugin.yml

Expected output:

    daemonset.apps/nvidia-device-plugin-daemonset created

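4. Check whether GPU resources are available after the installation. Each GPU node should list nvidia.com/gpu under Capacity and Allocatable:

    kubectl describe nodes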
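
The describe output is long, so it can help to filter it down to the GPU lines. A small sketch, assuming kubectl is already configured for the cluster; gpu_summary is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# gpu_summary: read `kubectl describe nodes` output on stdin and keep only
# the node name lines and the lines that mention nvidia.com/gpu.
gpu_summary() {
    grep -E '^Name:|nvidia\.com/gpu'
}

if command -v kubectl >/dev/null 2>&1; then
    kubectl describe nodes | gpu_summary || echo "no nodes or GPU resources reported"
fi
```

A node that shows its Name line but no nvidia.com/gpu lines is not advertising GPUs, which usually points back to step 2 or step 3 on that node.
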

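5. Launch a pod that uses GPUs. The example below defines two containers that each request 2 GPUs, so a node needs at least 4 allocatable GPUs for the pod to be scheduled:

    apiVersion: v1
    kind: Pod
    metadata:
      name: gpu-pod
    spec:
      containers:
        - name: cuda-container
          image: nvidia/cuda:9.0-devel
          resources:
            limits:
              nvidia.com/gpu: 2 # requesting 2 GPUs
        - name: digits-container
          image: nvidia/digits:6.0
          resources:
            limits:
              nvidia.com/gpu: 2 # requesting 2 GPUs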
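
Before submitting the pod, the total number of GPUs it asks for can be compared with what step 4 reported. A minimal sketch: the file name gpu-pod.yaml and the helper total_gpu_limit are hypothetical, and the helper matches lines rather than parsing YAML, so it assumes the manifest only sets limits, as the one above does.

```shell
#!/usr/bin/env bash
# total_gpu_limit FILE: sum every "nvidia.com/gpu: N" value found in FILE.
# Line-based match, not a YAML parser; a manifest that also lists
# requests would be counted twice.
total_gpu_limit() {
    awk '$1 == "nvidia.com/gpu:" { sum += $2 } END { print sum + 0 }' "$1"
}

manifest="gpu-pod.yaml"   # hypothetical file name holding the pod spec above
if [ -r "$manifest" ]; then
    echo "GPUs requested by $manifest: $(total_gpu_limit "$manifest")"
fi
```

With the manifest saved under that name, kubectl create -f gpu-pod.yaml submits the pod and kubectl get pod gpu-pod shows whether it was scheduled.
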