Installing Caffe with nvidia-docker on Ubuntu 14.04.3. ref: http://qiita.com/daxanya1/items/f04c7f75a6d2ecb92b23
FROM cuda:7.5_cudnn70
COPY Anaconda2-2.4.1-Linux-x86_64.sh /opt
# caffe makefile:use anaconda2 / use cudnn3
COPY Makefile.config /opt
# install anaconda2
RUN echo 'export PATH=/opt/anaconda2/bin:$PATH' > /etc/profile.d/conda.sh && \
cd /opt && \
/bin/bash /opt/Anaconda2-2.4.1-Linux-x86_64.sh -b -p /opt/anaconda2 && \
rm /opt/Anaconda2-2.4.1-Linux-x86_64.sh
# prepare caffe
RUN apt-get update && apt-get install -y \
libatlas-base-dev libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libboost-all-dev libhdf5-serial-dev libgflags-dev libgoogle-glog-dev liblmdb-dev protobuf-compiler git wget && \
rm -rf /var/lib/apt/lists/*
# workaround for "libdc1394 error: Failed to initialize libdc1394"
# ref) http://stackoverflow.com/questions/12689304/ctypes-error-libdc1394-error-failed-to-initialize-libdc1394
RUN ln -s /dev/null /dev/raw1394
# workaround for "Error loading shared library libhdf5_hl.so"
# ref) https://github.com/BVLC/caffe/issues/1463
RUN cp /opt/anaconda2/pkgs/hdf5-1.8.15.1-2/lib/libhdf5.so.10.0.1 /lib/x86_64-linux-gnu/ && \
cp /opt/anaconda2/pkgs/hdf5-1.8.15.1-2/lib/libhdf5_hl.so.10.0.1 /lib/x86_64-linux-gnu/ && \
ldconfig
# make caffe
RUN cd /opt && \
git clone https://github.com/BVLC/caffe.git && \
cd caffe && \
cp /opt/Makefile.config . && \
make all -j4 && \
make test -j4
# ref) https://hub.docker.com/r/eduwass/face-the-internet-worker/~/dockerfile/
CMD sh -c 'ln -s /dev/null /dev/raw1394'; bash
$ sudo apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
$ echo 'deb https://apt.dockerproject.org/repo ubuntu-trusty main' | sudo tee /etc/apt/sources.list.d/docker.list
$ sudo apt-get update
$ sudo apt-cache policy docker-engine
$ sudo apt-get install docker-engine
$ sudo service docker start
$ git clone https://github.com/NVIDIA/nvidia-docker.git
$ cd nvidia-docker
$ sudo make install
$ sudo nvidia-docker volume setup
$ ls
Anaconda2-2.4.1-Linux-x86_64.sh Dockerfile Makefile.config
$ sudo nvidia-docker build -t ml_caffe:7570ana2 .
$ sudo nvidia-docker run -it --name "runtest" ml_caffe:7570ana2
root@e14ad9a3b2d6:/# cd /opt/caffe; make runtest
[----------] Global test environment tear-down
[==========] 1744 tests from 257 test cases ran. (241958 ms total)
[ PASSED ] 1744 tests.
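If the `make runtest` output is saved to a file, the summary lines above can be checked mechanically instead of by eye. The following is a minimal sketch, assuming the log uses the googletest summary format shown above; the function name `gtest_all_passed` is hypothetical and not part of Caffe or googletest.

```shell
#!/bin/sh
# Hypothetical helper: read a googletest log on stdin and succeed only if
# the number of tests that ran equals the number reported as PASSED
# and no FAILED summary line is present.
gtest_all_passed() {
    awk '
        /^\[==========\] [0-9]+ tests? from .* ran/ { ran = $2 }
        /^\[ +PASSED +\] [0-9]+ tests?/            { passed = $(NF-1) }
        /^\[ +FAILED +\]/                          { failed = 1 }
        END { exit !(ran > 0 && ran == passed && !failed) }
    '
}
```

For example, inside the container one might run `make runtest 2>&1 | tee runtest.log` and then `gtest_all_passed < runtest.log && echo OK`. A log that aborts before the summary is printed (as in the plain `docker run` case) is treated as a failure.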
$ sudo docker run -it --name "runtest_docker" ml_caffe:7570ana2
root@b25a57ebfe81:/# cd /opt/caffe; make runtest
.build_release/tools/caffe
caffe: command line brew
usage: caffe <command> <args>
:
(output omitted)
:
[----------] 1 test from HDF5OutputLayerTest/3, where TypeParam = caffe::GPUDevice<double>
[ RUN ] HDF5OutputLayerTest/3.TestForward
F0118 12:02:51.914427 240 syncedmem.hpp:18] Check failed: error == cudaSuccess (35 vs. 0) CUDA driver version is insufficient for CUDA runtime version
*** Check failure stack trace: ***
:
(remaining output omitted)
:
$ sudo nvidia-docker run -it --name "mnist" ml_caffe:7570ana2
root@a99d3a442790:/# cd /opt/caffe/
root@a99d3a442790:/opt/caffe# ./data/mnist/get_mnist.sh
root@a99d3a442790:/opt/caffe# ./examples/mnist/create_mnist.sh
root@a99d3a442790:/opt/caffe# ./examples/mnist/train_lenet.sh
I0118 13:18:48.695695 34 solver.cpp:459] Snapshotting to binary proto file examples/mnist/lenet_iter_10000.caffemodel
I0118 13:18:48.700121 34 sgd_solver.cpp:273] Snapshotting solver state to binary proto file examples/mnist/lenet_iter_10000.solverstate
I0118 13:18:48.702507 34 solver.cpp:321] Iteration 10000, loss = 0.00265015
I0118 13:18:48.702540 34 solver.cpp:341] Iteration 10000, Testing net (#0)
I0118 13:18:48.765329 34 solver.cpp:409] Test net output #0: accuracy = 0.9905
I0118 13:18:48.765362 34 solver.cpp:409] Test net output #1: loss = 0.0263541 (* 1 = 0.0263541 loss)
I0118 13:18:48.765379 34 solver.cpp:326] Optimization Done.
I0118 13:18:48.765384 34 caffe.cpp:215] Optimization Done.
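To pull the final accuracy out of a saved training log, the `Test net output` line above can be filtered with `sed`. This is a minimal sketch, assuming the log line format shown above; the function name `last_accuracy` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a Caffe training log on stdin and print the
# last accuracy value reported on a "Test net output #N: accuracy = X" line.
last_accuracy() {
    sed -n 's/.*Test net output #[0-9]*: accuracy = \([0-9.]*\).*/\1/p' | tail -n 1
}
```

Caffe writes its log to stderr, so a run such as `./examples/mnist/train_lenet.sh 2>&1 | tee train.log` followed by `last_accuracy < train.log` would be one way to use it.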
$ sudo docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
$ sudo nvidia-docker run -it --rm ml_caffe:7570ana2
root@34302aeabcfa:/# exit
exit
$ sudo nvidia-docker run -it --rm ml_caffe:7570ana2
Error response from daemon: Error looking up volume plugin nvidia-docker: Plugin not found
$ sudo nvidia-docker run nvidia/cuda nvidia-smi
+------------------------------------------------------+
| NVIDIA-SMI 352.63 Driver Version: 352.63 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 980 Ti Off | 0000:01:00.0 On | N/A |
| 0% 50C P8 23W / 275W | 99MiB / 6140MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
$ sudo docker run nvidia/cuda nvidia-smi
exec: "nvidia-smi": executable file not found in $PATH
$ sudo nvidia-docker run -it --name "test1" nvidia/cuda /bin/bash
(detach with ctrl-p, ctrl-q)
$ sudo docker attach test1
$ nvidia-smi
-> succeeds
$ sudo nvidia-docker build -t cuda:7.5_cudnn70 ./ubuntu/cuda/7.5/devel/cudnn3
$ curl "https://3230d63b5fc54e62148e-c95ac804525aac4b6dba79b00b39d1d3.ssl.cf1.rackcdn.com/Anaconda2-2.4.1-Linux-x86_64.sh" > Anaconda2-2.4.1-Linux-x86_64.sh
$ curl https://raw.githubusercontent.com/BVLC/caffe/master/Makefile.config.example -O
$ cp Makefile.config.example Makefile.config
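The copied Makefile.config still has to be edited by hand before the image is built. As one illustration, the cuDNN switch could be enabled with a small script rather than an editor. This is a minimal sketch, assuming the example file contains the commented-out line `# USE_CUDNN := 1`; the function name `enable_cudnn` is hypothetical, and the other settings (such as `ANACONDA_HOME`) would still need to be changed separately.

```shell
#!/bin/sh
# Hypothetical helper: uncomment the "USE_CUDNN := 1" line in the
# Makefile.config given as the first argument. All other lines are left as-is.
enable_cudnn() {
    tmp=$(mktemp) &&
    sed 's/^#[[:space:]]*\(USE_CUDNN[[:space:]]*:=[[:space:]]*1\)/\1/' "$1" > "$tmp" &&
    cat "$tmp" > "$1" &&   # overwrite in place so the file keeps its permissions
    rm -f "$tmp"
}
```

Usage would be `enable_cudnn Makefile.config`, after which `grep USE_CUDNN Makefile.config` should show the line without the leading `#`.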
USE_CUDNN := 1
CUDA_DIR := /usr/local/cuda
CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
-gencode arch=compute_20,code=sm_21 \
-gencode arch=compute_30,code=sm_30 \
-gencode arch=compute_35,code=sm_35 \
-gencode arch=compute_50,code=sm_50 \
-gencode arch=compute_50,code=compute_50
BLAS := atlas
ANACONDA_HOME := /opt/anaconda2
PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
$(ANACONDA_HOME)/include/python2.7 \
$(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include
PYTHON_LIB := $(ANACONDA_HOME)/lib
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib
BUILD_DIR := build
DISTRIBUTE_DIR := distribute
TEST_GPUID := 0
Q ?= @