This gist is a note about installing nvidia-docker on Pop!_OS 20.10. nvidia-docker is used to let Docker containers compute on the GPU.
The basic installation is covered in Nvidia's official documentation, but a few tweaks are needed to make it work on Pop!_OS 20.10.
No surprises here. Following the official documentation should work.
Pop!_OS is an "Unsupported distribution" in Nvidia's source. Ubuntu 20.10 is not supported by Nvidia's source yet either. So we need to set the distribution to ubuntu20.04 when adding the sources. For instance:
distribution="ubuntu20.04" \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
Reference:
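To see why the override above is needed, it can help to print the distribution string that Nvidia's instructions derive from /etc/os-release. The sketch below is only an illustration: the function name detect_distribution and its optional path argument are made up for this note. On Pop!_OS 20.10 it would likely print something like pop20.10, which is the value Nvidia's source does not recognise.

```shell
# Print the distribution string built from an os-release style file.
# The optional path argument is a hypothetical convenience; it defaults
# to /etc/os-release, the file Nvidia's instructions read.
detect_distribution() {
    # Source the file in a subshell so ID and VERSION_ID do not leak
    # into the calling shell.
    ( . "${1:-/etc/os-release}" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Show the value for the current machine, if the file is readable.
if [ -r /etc/os-release ]; then
    detect_distribution
fi
```

If the printed value is not one Nvidia's source lists, hard-coding distribution="ubuntu20.04" as shown above is the workaround this note uses.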
While installing nvidia-docker2, I got the following error:
(base) ➜ ~ sudo apt-get install -y nvidia-docker2
Reading package lists... Done
Building dependency tree
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
nvidia-docker2 : Depends: nvidia-container-runtime (>= 3.5.0) but 3.4.0-1pop1~1601325114~20.10~2880fc6 is to be installed
E: Unable to correct problems, you have held broken packages.
This happens because Pop!_OS's own source for the Nvidia driver has a higher priority than Nvidia's official source, but the nvidia-docker2 dependencies it provides lag behind the versions in Nvidia's official source. To fix that, we can give the nvidia-docker source a higher priority as follows.
sudo vi /etc/apt/preferences.d/nvidia-docker-pin-1002
with the following content:
Package: *
Pin: origin nvidia.github.io
Pin-Priority: 1002
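If you would rather not edit the file by hand, the same stanza can be written non-interactively. This is a minimal sketch, not part of Nvidia's documentation: pin_stanza is a made-up helper name, and the commands that need root are left commented out so the block is safe to paste as-is.

```shell
# Emit the pin stanza from this note on stdout.
pin_stanza() {
    printf '%s\n' \
        'Package: *' \
        'Pin: origin nvidia.github.io' \
        'Pin-Priority: 1002'
}

# Preview what would be written.
pin_stanza

# Suggested usage (needs root, so left commented out):
#   pin_stanza | sudo tee /etc/apt/preferences.d/nvidia-docker-pin-1002
#   sudo apt-get update
# Then check which version apt now prefers and from which source:
#   apt-cache policy nvidia-container-runtime
```

After the pin is in place, apt-cache policy should list the nvidia.github.io entries with priority 1002, and the candidate version should come from that source.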
Then follow the official documentation by running the following commands. The last one launches a container with GPU access.
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker
sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
Reference:
Hey everyone, I have noticed that there are two main methods for installing nvidia-docker2 on Pop!_OS 22.04. One is described in the System76 support article (updated in March 2023), and the other is outlined in this gist. As a non-expert, I was curious about the differences between these two methods, and what the advantages and disadvantages of each might be. So I asked ChatGPT-4.0 to explain the differences and here's the comprehensive response it provided:
The first method, as outlined in the System76 support article, involves using the nvidia-container-toolkit package and executes the following instructions:
This approach appears straightforward and may be easier for novice users to follow. Each command updates the system, installs the necessary packages, adds the current user to the Docker group (allowing Docker commands to be run without sudo), modifies a kernel parameter to disable the unified cgroup hierarchy (a feature of systemd), reboots the system, configures Docker to use the NVIDIA libraries when running containers, and restarts Docker. However, it might not support the most recent versions of CUDA if the nvidia-container-toolkit package hasn't been updated recently. Notably, this method involves disabling the unified cgroup hierarchy feature of systemd, a significant system component, which could potentially lead to compatibility issues with future software that expects this feature to be enabled.
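Since this method hinges on the unified cgroup hierarchy, it may be worth checking which cgroup mode a machine is currently running before and after the change. The following is a rough sketch that assumes GNU stat is available; cgroup_mode is a made-up helper name.

```shell
# Map the filesystem type mounted at /sys/fs/cgroup to a cgroup mode label.
cgroup_mode() {
    case "$1" in
        cgroup2fs) echo "v2 (unified)" ;;
        tmpfs)     echo "v1 (legacy or hybrid)" ;;
        *)         echo "unknown" ;;
    esac
}

# stat -f inspects the filesystem rather than the file; %T prints its type.
if [ -d /sys/fs/cgroup ]; then
    cgroup_mode "$(stat -fc %T /sys/fs/cgroup)"
fi
```

A result of v2 (unified) means the unified hierarchy is enabled; after applying the kernel parameter from the first method and rebooting, the same check would be expected to report v1.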
The second method, as described in this gist, involves using the nvidia-docker2 package and executes the following instructions:
This method seems more complex and may require a deeper understanding of Docker and Linux systems. However, it can be considered less invasive as it avoids modifying system components, and more flexible as it might support more recent versions of CUDA if the nvidia-docker2 package has been recently updated.
When considering the benefits and drawbacks of the two methods, the first option may be simpler to execute and more reliable due to its direct support from System76, the developers of Pop!_OS. However, it may not be compatible with the most current versions of CUDA. In contrast, the second option may support newer versions of CUDA, but it may be more challenging to implement and less stable since it lacks direct support from System76.
In order to determine the most suitable method for your needs, it is necessary to consider various factors such as your technical proficiency, the version of CUDA you plan to use, and your specific requirements. Which of the two methods you choose will ultimately depend on these considerations.