The distro matters! I'm using Ubuntu 20.04 since 22.04 isn't supported by CUDA 11.6.
Inspired by the post "nvidia-smi指令报错" ("nvidia-smi command reports an error").
sudo apt install nvidia-driver-470 nvidia-settings
Driver version 470 comes with CUDA 11.4 by default. Leave it alone and use a separate CUDA that is configured through conda or Docker.
Don't bother installing CUDA with apt, the system package manager.
All you need is the driver, thanks to CUDA's minor version compatibility: an 11.x toolkit can run on a driver that shipped with a different 11.x release.
(CUDA 11 and Later Defaults to Minor Version Compatibility)
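As a rough illustration of the rule above, here is a minimal sketch that checks whether a toolkit version sits in the same CUDA major release as the one the driver reports. The function names `same_cuda_major` and `driver_cuda_version` are hypothetical helpers, not part of any NVIDIA tool, and the check ignores the extra limitations the NVIDIA documentation lists for minor version compatibility.

```shell
# Hypothetical helper: succeed if the toolkit is in the same CUDA major
# release as the driver, and that release is 11 or later.
# Usage: same_cuda_major DRIVER_CUDA TOOLKIT_CUDA   (e.g. 11.4 11.6.2)
same_cuda_major() {
  driver_major="${1%%.*}"    # "11.4"   -> "11"
  toolkit_major="${2%%.*}"   # "11.6.2" -> "11"
  [ "$driver_major" -ge 11 ] && [ "$driver_major" = "$toolkit_major" ]
}

# Hypothetical helper: read nvidia-smi output on stdin and print the
# value of the "CUDA Version: X.Y" field from its header.
driver_cuda_version() {
  sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}
```

On a machine with the driver installed, something like `same_cuda_major "$(nvidia-smi | driver_cuda_version)" 11.6.2 && echo ok` should tell you whether the 11.6 image is worth trying.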
Install Docker first. See Install Docker Engine on Ubuntu.
# adding Docker's apt repository is skipped here
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-compose-plugin
Next, you have to set up the NVIDIA Container Toolkit. See Setting up NVIDIA Container Toolkit.
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker
Get an image from Docker Hub.
# check out the tag naming convention
sudo docker run --rm --gpus all nvidia/cuda:11.6.2-devel-ubuntu20.04 nvidia-smi
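The tag used above reads as CUDA version, image flavor, then distro. A minimal sketch of that convention follows, assuming only the three plain flavors (`base`, `runtime`, `devel`); tags with an extra cuDNN component are not handled, and `cuda_image_tag` is a hypothetical helper name.

```shell
# Hypothetical helper: assemble an nvidia/cuda image reference from its parts.
# Usage: cuda_image_tag CUDA_VERSION FLAVOR DISTRO
cuda_image_tag() {
  case "$2" in
    base|runtime|devel) ;;   # flavors this sketch knows about
    *) echo "unknown flavor: $2" >&2; return 1 ;;
  esac
  echo "nvidia/cuda:$1-$2-$3"
}
```

For example, `cuda_image_tag 11.6.2 devel ubuntu20.04` prints the image name used in the command above.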
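You should use --ipc=host or increase the shared memory size with --shm-size when playing with PyTorch. See: docker image with --ipc=host option.
Nothing goes wrong when running inference with the AUTOMATIC WebUI, but training will be problematic, presumably because the data-loading worker processes need more shared memory than a container gets by default.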
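To make the two options concrete, here is a small sketch that prints (but does not run) the corresponding command line. `pytorch_run_cmd` is a hypothetical helper, and the `8g` in the usage note is an arbitrary example value, not a recommendation.

```shell
# Hypothetical helper: print a docker command line for a PyTorch container.
# With a second argument it sets --shm-size; without one it uses --ipc=host.
# Usage: pytorch_run_cmd IMAGE [SHM_SIZE]
pytorch_run_cmd() {
  image="$1"
  shm="${2:-}"
  if [ -n "$shm" ]; then
    echo "sudo docker run --rm --gpus all --shm-size=$shm $image"
  else
    echo "sudo docker run --rm --gpus all --ipc=host $image"
  fi
}
```

`pytorch_run_cmd docker-webui` gives the --ipc=host variant and `pytorch_run_cmd docker-webui 8g` the --shm-size one. Inside the container, `df -h /dev/shm` shows how much shared memory you actually got.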
Using Docker as a VM is not recommended and not best practice, but I'm just lazy and it just works. Your image size will grow uncontrollably, because every change made in the container gets saved rather than only the final state. I know it's stupid.
An ugly solution is to use volumes to persist most of the working files and the conda environment.
sudo docker run --name dummy -d docker-webui
sudo docker cp dummy:/root /home/ubuntu/
sudo docker cp dummy:/opt/conda /home/ubuntu/
sudo docker stop dummy && sudo docker rm dummy
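Running the copy-out a second time could clobber state you have already built up on the host, so it may be worth guarding. Below is a minimal sketch under two assumptions of mine: that an empty or missing host directory means "not seeded yet", and that `docker create` (which makes a container without starting it) is an acceptable swap for `docker run -d` followed by `docker stop`. `needs_seed` and `seed_cmds` are hypothetical helper names, and `seed_cmds` only prints the commands.

```shell
# Hypothetical helper: succeed if DIR is missing or empty,
# i.e. it still needs to be seeded from the image.
needs_seed() {
  [ ! -d "$1" ] || [ -z "$(ls -A "$1" 2>/dev/null)" ]
}

# Hypothetical helper: print (dry run) the commands that copy /root and
# /opt/conda out of IMAGE into HOST_DIR, using a created-but-not-started
# container named "dummy".
# Usage: seed_cmds IMAGE HOST_DIR
seed_cmds() {
  image="$1"
  host_dir="${2%/}"
  echo "sudo docker create --name dummy $image"
  echo "sudo docker cp dummy:/root $host_dir/"
  echo "sudo docker cp dummy:/opt/conda $host_dir/"
  echo "sudo docker rm dummy"
}
```

A typical use would be `needs_seed /home/ubuntu/conda && seed_cmds docker-webui /home/ubuntu`, then reviewing the printed commands before running them.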
# (earlier part omitted)
volumes:
  - /home/ubuntu/workplace:/workplace
  # copy these out of the container once, right after the build
  # Why? So that I don't lose all of the state
  - /home/ubuntu/conda:/opt/conda
  - /home/ubuntu/root:/root
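If you start the container with plain docker run rather than through a Compose file, the same three mounts can be passed as -v flags. A minimal sketch, where `volume_flags` is a hypothetical helper and the directory layout is the one from the volumes list above:

```shell
# Hypothetical helper: print the -v flags matching the volumes list above.
# Usage: volume_flags HOST_BASE   (e.g. /home/ubuntu)
volume_flags() {
  base="${1%/}"   # drop one trailing slash, if any
  echo "-v $base/workplace:/workplace -v $base/conda:/opt/conda -v $base/root:/root"
}
```

For example: `sudo docker run --rm --gpus all --ipc=host $(volume_flags /home/ubuntu) docker-webui`. The unquoted command substitution is deliberate here so the flags split into separate arguments, which only works while the paths contain no spaces.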
Stupid, but it works as expected, and I can't find a better solution short of using a real VM.
Don’t treat docker containers like a VM, you’ll be shooting yourself in the foot on down the road.
See also How to flatten a Docker image?