Getting dlandon's docker image to work reliably with a GPU using NVIDIA's pre-built CUDA/cuDNN image
Raw notes on what I did to get GPU/CUDA/CuDNN working in dlandon's docker image. Not optimized. So I don't forget later.
There are two ways:
- Use his docker image, which derives from phusion, and manually install CUDA/cuDNN. This did not work for me: I got all sorts of segfaults, install errors, and driver mismatch errors (even though my host/docker CUDA versions were identical). When things finally worked after various package errors installing CUDA, the image failed after a restart. I've come to the conclusion that a CUDA/cuDNN install is very sensitive to its environment, and what works for one person may fail dramatically for another, especially when we are trying to link a host GPU to a docker container. May as well leave this to the experts (NVIDIA). So the next option:
- Modify his Dockerfile to derive from a pre-installed version of CUDA/cuDNN from NVIDIA and put the rest of his steps in (plus a step or two needed to get the init process working, which his docker image relies on). Pro: No need to mess with CUDA/cuDNN, no need to remove/recompile packages after the GPU software install. Con: Possibly a larger/less optimized docker image (haven't compared), but something that can be cleaned up by those who want to.
Step 1: clone his repo
git clone https://github.com/dlandon/zoneminder
Step 2: Grab phusion's my_init script that dlandon uses in his Dockerfile
wget https://raw.githubusercontent.com/phusion/baseimage-docker/master/image/bin/my_init
chmod a+x ./my_init
Step 3: Modify his Dockerfile
Let's call the new one Dockerfile.nvidia:
FIXME: I did not bother figuring out if we could avoid my_init completely. Like I said, my goal was to get things working. So the easiest way for me was to look at how phusion did service init and borrow that script, along with any deps (runit-systemd).
I made the following changes:
- Switched base from phusion to the pre-created CUDA+CuDNN dev package from NVIDIA. This is needed to compile apps that need CUDA/CuDNN (like OpenCV)
- Copied the downloaded ./my_init script to /sbin
- Added a couple of packages we need that phusion builds in that nvidia does not so compilation would work
FIXME: Better to put a mark on cuda packages
- Removed the apt-get clean and apt-get autoremove parts for now. Note that apt-get autoremove removes automatically installed packages that nothing else depends on. That is not a good idea for the CUDA/cuDNN libs: when this image builds, nothing depends on them yet, so they get removed, and when we come to compiling OpenCV it obviously doesn't find the dev libraries. What was odd was that even apt-get clean was removing various CUDA libs, causing the same issue. A better solution would likely be to mark all CUDA/cuDNN libs to be excluded.
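As a rough illustration of the "mark the CUDA/cuDNN packages" idea, here is a minimal Python sketch that picks CUDA-related names out of a list of installed packages and builds an apt-mark manual command for them (packages marked manual are not candidates for apt-get autoremove). The prefix list and the helper names are my assumptions, not something taken from dlandon's Dockerfile, and this does nothing about the apt-get clean behaviour noted above.

```python
import shlex

# ASSUMPTION: name prefixes that cover the CUDA/cuDNN packages in the base image.
# Check the real package names in your image before relying on this list.
CUDA_PREFIXES = ("cuda-", "libcudnn", "libcublas", "libnccl")


def cuda_packages(installed):
    """Return the installed package names that look CUDA/cuDNN related, sorted."""
    return sorted(p for p in installed if p.startswith(CUDA_PREFIXES))


def apt_mark_manual_command(installed):
    """Build an `apt-mark manual ...` command line, or '' if nothing matches."""
    pkgs = cuda_packages(installed)
    if not pkgs:
        return ""
    return "apt-mark manual " + " ".join(shlex.quote(p) for p in pkgs)


if __name__ == "__main__":
    # Hypothetical package list, for illustration only.
    sample = ["apache2", "libcudnn7-dev", "cuda-cudart-10-1", "wget"]
    print(apt_mark_manual_command(sample))
```

The package list could come from dpkg-query -W -f='${Package}\n' inside the image, and the resulting command would need to run before any autoremove step.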
VERY IMPORTANT: You can't pick a random CUDA version. You MUST match your GPU driver version to the CUDA version from here. In my case, my driver version, 430.26, requires CUDA 10.1. I can't make it work with CUDA 10.2: compilation etc. will work, but when you try to actually use an app that needs the GPU via CUDA, it will fail.
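To make the driver/CUDA matching concrete, the sketch below looks up the newest CUDA release a given Linux driver can run. The minimum-driver figures are my transcription of NVIDIA's CUDA release notes and are an assumption you should verify against the current table; the function names are illustrative only.

```python
# Minimum Linux driver version needed by each CUDA toolkit release, newest first.
# ASSUMPTION: figures transcribed from NVIDIA's CUDA release notes; verify them
# against the current compatibility table before relying on this.
MIN_DRIVER_FOR_CUDA = [
    ("10.2", (440, 33)),
    ("10.1", (418, 39)),
    ("10.0", (410, 48)),
    ("9.2", (396, 26)),
]


def parse_driver(version):
    """Turn a driver string such as '430.26' into a comparable tuple (430, 26)."""
    return tuple(int(part) for part in version.strip().split("."))


def newest_supported_cuda(driver_version):
    """Return the newest CUDA release in the table the driver supports, or None."""
    driver = parse_driver(driver_version)
    for cuda, minimum in MIN_DRIVER_FOR_CUDA:
        if driver >= minimum:
            return cuda
    return None


if __name__ == "__main__":
    # Driver version from these notes.
    print(newest_supported_cuda("430.26"))
```

The host driver version can be read with nvidia-smi --query-gpu=driver_version --format=csv,noheader and fed to newest_supported_cuda to pick the base image tag.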
# Use the right cuda version for your driver. change to 10.2 or others as needed
FROM nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04
LABEL maintainer="dlandon"
ENV DEBCONF_NONINTERACTIVE_SEEN="true" \
DEBIAN_FRONTEND="noninteractive" \
DISABLE_SSH="true" \
HOME="/root" \
LC_ALL="C.UTF-8" \
LANG="en_US.UTF-8" \
LANGUAGE="en_US.UTF-8" \
TZ="Etc/UTC" \
TERM="xterm"
ENV PHP_VERS="7.4" \
ZM_VERS="1.34" \
ZMEVENT_VERS="5.7.4" \
SHMEM="50%" \
PUID="99" \
PGID="100"
COPY init/ /etc/my_init.d/
COPY defaults/ /root/
COPY ./my_init /sbin/
RUN apt-get update && \
apt-get -y install --no-install-recommends software-properties-common runit-systemd && \
add-apt-repository -y ppa:iconnor/zoneminder-$ZM_VERS && \
add-apt-repository ppa:ondrej/php && \
apt-get update && \
apt-get -y upgrade -o Dpkg::Options::="--force-confold" && \
apt-get -y dist-upgrade -o Dpkg::Options::="--force-confold" && \
apt-get -y install apache2 mariadb-server && \
apt-get -y install ssmtp mailutils net-tools wget sudo make && \
apt-get -y install php$PHP_VERS php$PHP_VERS-fpm libapache2-mod-php$PHP_VERS php$PHP_VERS-mysql php$PHP_VERS-gd && \
apt-get -y install libcrypt-mysql-perl libyaml-perl libjson-perl libavutil-dev && \
apt-get -y install --no-install-recommends libvlc-dev libvlccore-dev vlc
RUN apt-get -y install zoneminder
RUN rm /etc/mysql/my.cnf && \
cp /etc/mysql/mariadb.conf.d/50-server.cnf /etc/mysql/my.cnf && \
adduser www-data video && \
a2enmod php$PHP_VERS proxy_fcgi ssl rewrite expires headers && \
a2enconf php$PHP_VERS-fpm zoneminder && \
echo "extension=apcu.so" > /etc/php/$PHP_VERS/mods-available/apcu.ini && \
echo "extension=mcrypt.so" > /etc/php/$PHP_VERS/mods-available/mcrypt.ini && \
perl -MCPAN -e "force install Net::WebSocket::Server" && \
perl -MCPAN -e "force install LWP::Protocol::https" && \
perl -MCPAN -e "force install Config::IniFiles" && \
perl -MCPAN -e "force install Net::MQTT::Simple" && \
perl -MCPAN -e "force install Net::MQTT::Simple::Auth"
RUN cd /root && \
chown -R www-data:www-data /usr/share/zoneminder/ && \
echo "ServerName localhost" >> /etc/apache2/apache2.conf && \
sed -i "s|^;date.timezone =.*|date.timezone = ${TZ}|" /etc/php/$PHP_VERS/apache2/php.ini && \
service mysql start && \
mysql -uroot < /usr/share/zoneminder/db/zm_create.sql && \
mysql -uroot -e "grant all on zm.* to 'zmuser'@localhost identified by 'zmpass';" && \
mysqladmin -uroot reload && \
mysql -sfu root < "mysql_secure_installation.sql" && \
rm mysql_secure_installation.sql && \
mysql -sfu root < "mysql_defaults.sql" && \
rm mysql_defaults.sql
RUN mv /root/zoneminder /etc/init.d/zoneminder && \
chmod +x /etc/init.d/zoneminder && \
service mysql restart && \
sleep 5 && \
service apache2 restart && \
service zoneminder start
RUN systemd-tmpfiles --create zoneminder.conf && \
mv /root/default-ssl.conf /etc/apache2/sites-enabled/default-ssl.conf && \
mkdir /etc/apache2/ssl/ && \
mkdir -p /var/lib/zmeventnotification/images && \
chown -R www-data:www-data /var/lib/zmeventnotification/ && \
chmod -R +x /etc/my_init.d/ && \
cp -p /etc/zm/zm.conf /root/zm.conf && \
echo "#!/bin/sh\n\n/usr/bin/zmaudit.pl -f" >> /etc/cron.weekly/zmaudit && \
chmod +x /etc/cron.weekly/zmaudit
RUN rm -rf /tmp/* /var/tmp/* && \
chmod +x /etc/my_init.d/*.sh
VOLUME ["/config", "/var/cache/zoneminder"]
EXPOSE 80 443 9000
CMD ["/sbin/my_init"]
Step 4: Build the modified dockerfile
docker build -t zoneminder -f Dockerfile.nvidia .
Step 5: Now run it:
Note the --gpus option: you need nvidia-docker set up. See this.
Also note since the docker image starts with CuDNN+CUDA, installing face recognition should automatically use the GPU. I turned it off in my experiments just for setup speed.
DATADIR_BASE=/home/pp/fiddle/docker/appdata/Zoneminder
docker run -d --name="Zoneminder" \
--gpus all \
--net="bridge" \
--privileged="true" \
-p 8443:443/tcp \
-p 9990:9000/tcp \
-e TZ="America/New_York" \
-e SHMEM="70%" \
-e PUID="99" \
-e PGID="100" \
-e INSTALL_HOOK="1" \
-e INSTALL_FACE="0" \
-e INSTALL_TINY_YOLO="1" \
-e INSTALL_YOLO="1" \
-v "${DATADIR_BASE}/config":"/config":rw \
-v "${DATADIR_BASE}/data":"/var/cache/zoneminder":rw \
zoneminder
Monitor first run progress by monitoring logs:
docker logs -f Zoneminder
Don't rush into compiling OpenCV. A lot of development tools are set up when this image is first run. Monitor the logs and make sure it's done installing all packages.
Step 6: Compile OpenCV with GPU support
Once it is running, open a shell in the container:
docker exec -it Zoneminder /bin/bash
You can run nvidia-smi and you should see both CUDA (ignore the version it reports: ref) and your GPU. If not, boo. You're screwed. Don't pass Go.
Note that the CUDA version shown in nvidia-smi does not indicate the CUDA toolchain/lib version in the container. To make sure you are using the right CUDA library, check the version using nvcc --version
In my case:
root@a611be0d9502:/# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
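If you want to script this check, a small sketch like the following pulls the release number out of the nvcc --version text and compares it with the CUDA version in the base image tag. Both helper names are made up for illustration, and the sample strings are the ones from these notes.

```python
import re


def nvcc_release(nvcc_output):
    """Return the toolkit release (e.g. '10.1') from `nvcc --version` text, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def image_cuda_version(image):
    """Return the CUDA version embedded in an nvidia/cuda image tag, or None."""
    match = re.search(r"nvidia/cuda:(\d+\.\d+)", image)
    return match.group(1) if match else None


# Sample inputs copied from these notes.
NVCC_SAMPLE = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243"""
BASE_IMAGE = "nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04"

if __name__ == "__main__":
    toolkit = nvcc_release(NVCC_SAMPLE)
    image = image_cuda_version(BASE_IMAGE)
    print("toolkit:", toolkit, "image:", image, "match:", toolkit == image)
```

Inside the container, the nvcc text could come from subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout instead of the hard-coded sample.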
If you already have the CPU version of opencv installed, remove it (the default docker setup installs the cpu version)
pip3 uninstall opencv-contrib-python
You can now proceed to install OpenCV as per dlandon's repo. Note that since we are using a pre-packaged CUDA+cuDNN install, you'll have to specify a CUDA_ARCH_BIN. Go here to get the value (you are looking for the number under "Compute Capability" for your GPU card).
I've not listed exact commands; look at his opencv.sh script and skip directly to the part where the support libraries of OpenCV are installed (ignore all the CUDA/cuDNN stuff as we already have it).
My modified cmake command was (note the extra CUDA_ARCH_BIN):
cmake -v -D CMAKE_BUILD_TYPE=RELEASE \
-D CMAKE_INSTALL_PREFIX=/usr/local \
-D INSTALL_PYTHON_EXAMPLES=OFF \
-D INSTALL_C_EXAMPLES=OFF \
-D OPENCV_ENABLE_NONFREE=ON \
-D WITH_CUDA=ON \
-D WITH_CUDNN=ON \
-D OPENCV_DNN_CUDA=ON \
-D CUDA_ARCH_BIN=6.1 \
-D ENABLE_FAST_MATH=1 \
-D CUDA_FAST_MATH=1 \
-D WITH_CUBLAS=1 \
-D OPENCV_EXTRA_MODULES_PATH=/<path>/<to>/opencv_contrib/modules/ \
-D HAVE_opencv_python3=ON \
-D PYTHON_EXECUTABLE=/usr/bin/python3 \
-D BUILD_EXAMPLES=OFF ..
make
make install
Step 7: Test GPU and cv2
Compilation can work, but things can break if you actually try using the GPU.
root@5d4231e625c3:~# python3
Python 3.6.9 (default, Nov 7 2019, 10:44:02)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> print(cv2.getBuildInformation())
This should print out gobs of info including: (your arch/ver may be different)
NVIDIA CUDA: YES (ver 10.1, CUFFT CUBLAS FAST_MATH)
NVIDIA GPU arch: 61
NVIDIA PTX archs:
cuDNN: YES (ver 7.6.5)
While the message only means OpenCV was compiled with CUDA/CuDNN, which we already know, now we also know importing cv2 doesn't segfault. Test 1 passed.
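Rather than eyeballing the build information, a sketch along these lines could check the relevant fields automatically. It assumes the field labels look like the excerpt above and that the GPU arch is the CUDA_ARCH_BIN value with the dot removed (6.1 shows up as 61); the helper names are hypothetical.

```python
import re


def build_field(build_info, label):
    """Return the value after 'label:' in OpenCV build info text, or None."""
    pattern = rf"^[ \t]*{re.escape(label)}:[ \t]*(.*)$"
    match = re.search(pattern, build_info, re.MULTILINE)
    return match.group(1).strip() if match else None


def cuda_build_ok(build_info, cuda_arch_bin):
    """True if CUDA and cuDNN report YES and the expected GPU arch is listed."""
    cuda = build_field(build_info, "NVIDIA CUDA") or ""
    cudnn = build_field(build_info, "cuDNN") or ""
    arch = build_field(build_info, "NVIDIA GPU arch") or ""
    expected_arch = cuda_arch_bin.replace(".", "")  # "6.1" -> "61"
    return (
        cuda.startswith("YES")
        and cudnn.startswith("YES")
        and expected_arch in arch.split()
    )


# Excerpt in the shape shown in these notes.
SAMPLE = """
  NVIDIA CUDA:                   YES (ver 10.1, CUFFT CUBLAS FAST_MATH)
    NVIDIA GPU arch:             61
    NVIDIA PTX archs:
    cuDNN:                       YES (ver 7.6.5)
"""

if __name__ == "__main__":
    print(cuda_build_ok(SAMPLE, "6.1"))
```

In the container you would pass cv2.getBuildInformation() as build_info. Like the manual check, this only confirms how OpenCV was compiled, not that the GPU is usable at run time.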
Now check DNN GPU support:
import cv2

# Paths to the YOLOv3 config and weights files
config_file_abs_path = '/var/lib/zmeventnotification/models/yolov3/yolov3.cfg'
weights_file_abs_path = '/var/lib/zmeventnotification/models/yolov3/yolov3.weights'

# Load the network and ask OpenCV's DNN module to use the CUDA backend and target
net = cv2.dnn.readNet(weights_file_abs_path, config_file_abs_path)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)

# Pause here so you can inspect GPU usage from another shell
input("Now do nvidia-smi and you should see this process consume GPU memory")
print("Bye\n")
If you get errors, for example error: (-217:Gpu API call) system has unsupported display driver / cuda driver combination in function 'getCudaEnabledDeviceCount', you used the wrong CUDA version. You need to match your CUDA version to the GPU driver version.
If everything runs fine and you see your GPU memory being consumed, test 2 passed. You're all done.
Hi! Your modified Dockerfile results in the creation of a "none" image (with a "none" tag). Please advise how to build a normally functioning ZM with ES + Nvidia GPU (+CUDA). I tried to build on my own from dlandon's Docker; everything ran smoothly, but somehow events were not processed by zm_detect (only in the case of hardware CUDA-enabled OpenCV).