@sberryman
Last active July 13, 2020 12:06
DeepMatching GPU version on Ubuntu 16.04 (Docker)

DeepMatching is an algorithm that finds corresponding points in two images.

Thanks @WIStudent

This wouldn't have been possible without the work of @WIStudent and the gist posted here: https://gist.github.com/WIStudent/08072e8dd41487d2dde7ce75eec3dcbf

cuDNN v3

Downloading cuDNN requires a free NVIDIA developer account.

  1. Visit https://developer.nvidia.com/cudnn
  2. Click the "DOWNLOAD cuDNN" button
  3. Login or join
  4. Review terms
  5. Click "Archived cuDNN Releases"
  6. Expand "Download cuDNN v3 (September 8, 2015), for CUDA 7.0 and later."
  7. Download "cuDNN v3 Library for Linux (Updated August 30th, 2016)."
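
Running step 3 below serves this archive to the Docker build with Python's built-in HTTP server. As a minimal sketch (the download directory and port here are just examples):

cd ~/Downloads    # wherever cudnn-7.0-linux-x64-v3.0.8-prod.tgz was saved
python -m SimpleHTTPServer 8001
# the Docker build can now fetch
# http://{local_machine_ip_address}:8001/cudnn-7.0-linux-x64-v3.0.8-prod.tgz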

Running

  1. Create an empty directory
  2. Copy the Dockerfile and Makefile to the new directory
  3. Edit the cuDNN URL in the Dockerfile (line 49, the wget line marked TO-DO). I run python -m SimpleHTTPServer 8001 in the directory where I downloaded the file and point the URL at my local machine, e.g. http://{local_machine_ip_address}:8001/cudnn-7.0-linux-x64-v3.0.8-prod.tgz
  4. Build: docker build -t deepmatching:gpu .
  5. Run: docker run -it --rm deepmatching:gpu
  6. cd /root/web_gpudm_1.0/
  7. python deep_matching_gpu.py -GPU 0 liberty1.png liberty2.png (a full session is sketched below)
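
Putting the steps together, an end-to-end session might look like this (a minimal sketch; the directory name is arbitrary, and exposing the GPU to the container is covered after the Dockerfile below):

mkdir deepmatching && cd deepmatching
# copy the Dockerfile and Makefile here and fix the cuDNN URL (step 3), then:
docker build -t deepmatching:gpu .
docker run -it --rm deepmatching:gpu
# inside the container:
cd /root/web_gpudm_1.0/
python deep_matching_gpu.py -GPU 0 liberty1.png liberty2.png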
Dockerfile

FROM nvidia/cuda:8.0-cudnn5-devel
# PYTHON 2, DO NOT USE PYTHON 3!
LABEL maintainer "Shaun Berryman <shaun@shaunberryman.com>"
# Suppress warnings about missing front-end. As recommended at:
# http://stackoverflow.com/questions/22466255/is-it-possibe-to-answer-dialog-questions-when-installing-under-docker
ARG DEBIAN_FRONTEND=noninteractive
# Install some dependencies
RUN apt-get update && \
    apt-get install -y \
      git \
      unzip \
      wget \
      build-essential \
      cmake \
      nano \
      pkg-config \
      libprotobuf-dev \
      libleveldb-dev \
      libsnappy-dev \
      libhdf5-serial-dev \
      protobuf-compiler \
      libatlas-base-dev \
      && \
    apt-get install --no-install-recommends -y libboost-all-dev && \
    apt-get install -y \
      libgflags-dev \
      libgoogle-glog-dev \
      liblmdb-dev \
      python-pip \
      python-dev \
      python-numpy \
      python-scipy \
      libopencv-dev \
      python-opencv \
      python-matplotlib \
      swig \
      && \
    apt-get clean && \
    apt-get autoremove && \
    rm -rf /var/lib/apt/lists/*
# cuDNN v3
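# The archive is fetched over HTTP (serve it from your workstation, see
# "Running" step 3), then its headers and libraries are moved into standard
# system paths so the bundled Caffe build finds them without extra flags.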
RUN cd /root && \
    wget "http://TO-DO/cudnn-7.0-linux-x64-v3.0.8-prod.tgz" && \
    tar xf cudnn-7.0-linux-x64-v3.0.8-prod.tgz && \
    rm cudnn-7.0-linux-x64-v3.0.8-prod.tgz && \
    mv /root/cuda/include/* /usr/include/ && \
    mv /root/cuda/lib64/* /usr/lib/x86_64-linux-gnu/ && \
    rm -rf /root/cuda
# DeepMatching GPU
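# Download the DeepMatching GPU 1.0 release, build its bundled Caffe fork
# with CMake, install Caffe's Python requirements, and finally compile the
# SWIG wrapper (_gpudm.so) using the Makefile copied in above.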
COPY Makefile /root/
RUN cd /root && \
    # wget "DOWNLOAD FROM NVIDIA DIRECTLY" && \
    wget "http://lear.inrialpes.fr/src/deepmatching/code/deepmatching_gpu_1.0.zip" && \
    unzip deepmatching_gpu_1.0.zip && \
    rm deepmatching_gpu_1.0.zip && \
    cd web_gpudm_1.0 && \
    unzip caffe.zip && \
    rm caffe.zip && \
    cd caffe/python && \
    for req in $(cat requirements.txt); do pip install $req; done && \
    cd ../ && \
    mkdir build && \
    cd build && \
    cmake .. && \
    make -j"$(nproc)" && \
    make install -j"$(nproc)" && \
    cd ../../ && \
    mv /root/Makefile /root/web_gpudm_1.0/ && \
    make all
WORKDIR "/root"
CMD ["/bin/bash"]
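
Note that a plain docker run does not expose the GPU; you need NVIDIA's container tooling on the host. A hedged example (the exact flag depends on your Docker version, and older setups use the separate nvidia-docker wrapper instead):

docker run --runtime=nvidia -it --rm deepmatching:gpu
nvidia-smi   # inside the container; should list your GPUs (sample output in the comments below)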
Makefile

# Path to caffe's install directory that was created by CMake
CAFFEDIR=/root/web_gpudm_1.0/caffe/build/install
CAFFELIB=$(CAFFEDIR)/lib
# 1: Embed the location of the libcaffe.so library in the _gpudm.so library via an rpath. Useful if CAFFEDIR
# is not in a standard library location, because then you don't need to set LD_LIBRARY_PATH before using
# _gpudm.so. The disadvantage is that _gpudm.so stops working if you move the CAFFEDIR folder.
INCLUDE_CAFFE_LOCATION = 1
OPTFLAGS=-g -O2
# Path to python header file
INCLUDES += -I/usr/include/python2.7/
# Path to caffe's header files
INCLUDES += -I$(CAFFEDIR)/include/
# Path to hdf5's header files
INCLUDES += -I/usr/include/hdf5/serial/
# Path to CUDA
INCLUDES += -I/usr/local/cuda-8.0/targets/x86_64-linux/include/
#include gpudm/Makefile.config
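# One -gencode pair per target GPU generation; add a pair for your card's
# compute capability if it isn't listed. The 64-bit atomicMax these kernels
# rely on requires compute capability 3.5 or newer, so don't add sm_30 (see
# the atomicMax errors from a compute_30 build in the comments below).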
CUDA_ARCH := \
-gencode arch=compute_35,code=sm_35 \
-gencode arch=compute_50,code=sm_50 \
-gencode arch=compute_60,code=sm_60 \
-gencode arch=compute_61,code=sm_61
HEADERS := $(shell find . -maxdepth 1 -name '*.hpp')
EXTRA_LAYERS := $(shell find . -maxdepth 1 -name '*.hpp')
all: _gpudm.so

_gpudm.so: gpudm_wrap.o $(EXTRA_LAYERS:.hpp=.o) $(EXTRA_LAYERS:.hpp=.cuo)
ifeq ($(INCLUDE_CAFFE_LOCATION),1)
	g++ $(OPTFLAGS) -fPIC -L$(CAFFELIB) $^ -shared -Xlinker -rpath $(CAFFELIB) -o $@ -lcaffe -lcusparse
else
	g++ $(OPTFLAGS) -fPIC -L$(CAFFELIB) $^ -shared -o $@ -lcaffe -lcusparse
endif

%.cuo: %.cu %.hpp
	nvcc $(CUDA_ARCH) -Xcompiler -fPIC $(INCLUDES) $(OPTFLAGS) -c $< -o $@

gpudm_wrap.cxx: gpudm.swig $(HEADERS)
	swig -cpperraswarn -python -c++ $(INCLUDES) gpudm.swig

gpudm_wrap.o: gpudm_wrap.cxx
	g++ $(OPTFLAGS) -c gpudm_wrap.cxx -fPIC $(INCLUDES) -o gpudm_wrap.o

%.o: %.cpp %.hpp
	g++ $(OPTFLAGS) -c $< -fPIC $(INCLUDES) -L$(CAFFELIB) -o $@

clean:
	rm -f *.pyc *~ _gpudm.so gpudm_wrap.o $(EXTRA_LAYERS:.hpp=.o) $(EXTRA_LAYERS:.hpp=.cuo)

cleanswig: clean
	rm -f gpudm.py gpudm_wrap.cxx gpudm_wrap.o
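
If you tweak the wrapper sources or this Makefile inside a running container, a quick rebuild cycle using the targets above looks like:

cd /root/web_gpudm_1.0
make cleanswig   # removes the generated SWIG wrapper and all objects (cleanswig also runs clean)
make all         # regenerates gpudm_wrap.cxx and relinks _gpudm.so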
@Guptajakala

Thanks for sharing!
I ran into this problem even though I didn't change any code. Do you happen to have any idea?
@shensheng27 @sberryman

extra_layers.cu(692): error: no instance of overloaded function "atomicMax" matches the argument list
            argument types are: (caffe::ULONG *, caffe::ULONG)
          detected during:
            instantiation of "void caffe::kernel_parent_children<Dtype,use_nghrad,safe_write>(long, int, __nv_bool, const Dtype *, int, int, int, Dtype *, int, int, int, int, int, int, int, void *) [with Dtype=float, use_nghrad=false, safe_write=true]" 
(850): here
            instantiation of "void caffe::DeepMatchingArgMaxLayer<Dtype>::Forward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=float]" 
(873): here

extra_layers.cu(695): error: no instance of overloaded function "atomicMax" matches the argument list
            argument types are: (caffe::ULONG *, caffe::ULONG)
          detected during:
            instantiation of "void caffe::kernel_parent_children<Dtype,use_nghrad,safe_write>(long, int, __nv_bool, const Dtype *, int, int, int, Dtype *, int, int, int, int, int, int, int, void *) [with Dtype=float, use_nghrad=false, safe_write=true]" 
(850): here
            instantiation of "void caffe::DeepMatchingArgMaxLayer<Dtype>::Forward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=float]" 
(873): here

2 errors detected in the compilation of "/tmp/tmpxft_00001d4f_00000000-16_extra_layers.compute_30.cpp1.ii".
Makefile:42: recipe for target 'extra_layers.cuo' failed
make: *** [extra_layers.cuo] Error 2

@sberryman (Author)

@Guptajakala, it has been so long I completely forgot about this gist! My assumption would be version issues on the dependencies or not running nvidia-docker. Nvidia's support for Docker has improved quite a bit since I ran into this issue. Can you run an interactive session with the base image nvidia/cuda:8.0-cudnn5-devel and ensure the GPU is exposed?

docker run -it --rm nvidia/cuda:8.0-cudnn5-devel /bin/bash

then check GPU is exposed with:

nvidia-smi

You should see something similar to this:

Fri Apr 10 19:56:36 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.50       Driver Version: 430.50       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:17:00.0 Off |                  N/A |
| 67%   68C    P2   220W / 280W |   9797MiB / 11178MiB |     94%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:65:00.0  On |                  N/A |
| 57%   69C    P2   220W / 280W |   9918MiB / 11175MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

@Guptajakala commented Apr 11, 2020

@sberryman, it seems to be some other problem, because my output is the same as yours. Thanks for your help anyway, and for this great gist!
