@nanmi
Last active January 13, 2021 09:21
Install nvidia-docker2 on CentOS and Ubuntu

Installing nvidia-docker2 on CentOS

2.1 Set up the repository

$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo
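The `distribution` value above is simply `ID` and `VERSION_ID` from /etc/os-release joined together, and it selects the directory the repo file is fetched from. A minimal sketch for previewing the resulting URL before piping it into `tee`; the helper name `nvidia_repo_url` and its two optional arguments are assumptions made for illustration, not part of nvidia-docker:

```shell
# Hypothetical helper: print the nvidia-docker repo URL for an os-release file.
# $1: path to an os-release file (default: /etc/os-release)
# $2: repo file name (default: nvidia-docker.repo; use nvidia-docker.list on Ubuntu)
# The body runs in a subshell so ID and VERSION_ID do not leak into the caller.
nvidia_repo_url() (
    os_release="${1:-/etc/os-release}"
    repo_file="${2:-nvidia-docker.repo}"
    . "$os_release"
    echo "https://nvidia.github.io/nvidia-docker/${ID}${VERSION_ID}/${repo_file}"
)
```

Running `nvidia_repo_url` with no arguments prints the URL for the current machine; opening it in a browser or with `curl -I` is a quick way to see whether a repo exists for that distribution before writing anything under /etc.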

2.2 Update the repository key

$ DIST=$(sed -n 's/releasever=//p' /etc/yum.conf)
$ DIST=${DIST:-$(. /etc/os-release; echo $VERSION_ID)}
$ sudo yum makecache

2.3 Install nvidia-docker2

$ sudo yum install nvidia-docker2

2.4 Reload the Docker daemon configuration

$ sudo pkill -SIGHUP dockerd

Installing nvidia-docker2 on Ubuntu

1. Add the repo

$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt-get update

2. Install nvidia-docker2

$ sudo apt-get install -y nvidia-docker2
$ sudo pkill -SIGHUP dockerd

3. Restart the Docker service

$ sudo systemctl daemon-reload
$ sudo systemctl restart docker

Verify the installation

$ docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi

The first run takes a few minutes to download the image. Output like the following means the installation succeeded:

Wed Mar 25 04:58:46 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.165.02   Driver Version: 418.165.02   CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-SXM2...  Off  | 00000000:04:00.0 Off |                    0 |
| N/A   30C    P0    42W / 300W |      0MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P100-SXM2...  Off  | 00000000:06:00.0 Off |                    0 |
| N/A   27C    P0    41W / 300W |      0MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla P100-SXM2...  Off  | 00000000:07:00.0 Off |                    0 |
| N/A   30C    P0    39W / 300W |      0MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla P100-SXM2...  Off  | 00000000:08:00.0 Off |                    0 |
| N/A   28C    P0    33W / 300W |      0MiB / 16280MiB |      5%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Using --gpus all

nvidia-container-runtime-script.sh

curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
          sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
          sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
sudo apt-get update
---
Edit /etc/docker/daemon.json so that the whole file reads as follows:
{
    "registry-mirrors": ["your registry mirror URL"],
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
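Docker will not start the nvidia runtime if the `path` in this file points at a binary that is not installed, so it can help to read the value back before restarting. A rough sketch, assuming the file is laid out one key per line as above; `runtime_paths` is a hypothetical helper, and it is a line-oriented `sed` match, not a real JSON parser:

```shell
# Hypothetical helper: print every "path" value found in a daemon.json file.
# $1: file to inspect (default: /etc/docker/daemon.json)
# Assumes each "path" key sits on its own line, as in the file shown above.
runtime_paths() {
    sed -n 's/.*"path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' \
        "${1:-/etc/docker/daemon.json}"
}
```

One way to use it is `[ -x "$(runtime_paths)" ] || echo "runtime binary not found"`, which only makes sense while a single runtime is configured.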

Then restart the Docker service:

$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
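Once Docker is back up, the `--gpus` flag (available in Docker 19.03 and later) can be tried in place of `--runtime=nvidia`. A small sketch that wraps the check in a function; the name `gpu_smoke_test` and the `DOCKER` override variable are assumptions for illustration, and the default image tag is the one used in the earlier test, which may or may not still be published:

```shell
# Hypothetical wrapper: run nvidia-smi in a throwaway container with all GPUs.
# $1: image to run (default: the tag used in the earlier test)
# DOCKER: command to invoke instead of docker (assumption, handy for dry runs)
gpu_smoke_test() (
    image="${1:-nvidia/cuda:10.0-base}"
    "${DOCKER:-docker}" run --rm --gpus all "$image" nvidia-smi
)
```

Calling `DOCKER=echo gpu_smoke_test` prints the arguments without starting a container, while a plain `gpu_smoke_test` should produce the same kind of nvidia-smi table as the `--runtime=nvidia` test when the setup is working.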