Setup Nvidia Docker and TensorFlow Model Server with GPU on Ubuntu (for CUDA 9)

Install Docker

Reference: Docker's official installation guide for Ubuntu (https://docs.docker.com/engine/install/ubuntu/)

# prerequisites for the commands below (if not already installed)
sudo apt-get update
sudo apt-get install -y curl software-properties-common

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt-get update
sudo apt-get install -y docker-ce

# check docker status
sudo systemctl status docker
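
Optionally, as a quick sanity check that the engine can pull and run containers, try Docker's stock hello-world image:

# should print a "Hello from Docker!" message and exit
sudo docker run --rm hello-world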


Changing default docker container location (Optional)

Stop docker

sudo systemctl stop docker

Edit /etc/docker/daemon.json

sudo vim /etc/docker/daemon.json

And then add the following lines:

{
  "graph":"/PATH/TO/YOUR/CONTAINER/STORAGE"
}
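
Note: newer Docker releases (17.05+) deprecate the "graph" key in favour of "data-root", which takes the same value:

{
  "data-root":"/PATH/TO/YOUR/CONTAINER/STORAGE"
}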

Start Docker

sudo systemctl start docker
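
To confirm Docker picked up the new location, check the storage root it reports:

# "Docker Root Dir" should now show your custom path
sudo docker info | grep "Docker Root Dir"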

Install nvidia docker

Reference: the nvidia-docker repository (https://github.com/NVIDIA/nvidia-docker)

# If you have nvidia-docker 1.0 installed: we need to remove it and all existing GPU containers
sudo docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
sudo apt-get purge -y nvidia-docker

# Add the package repositories
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update

# Install nvidia-docker2 and reload the Docker daemon configuration
sudo apt-get install -y nvidia-docker2
sudo pkill -SIGHUP dockerd

# restart service
sudo systemctl restart docker

# Test nvidia-smi with the latest official CUDA image
sudo docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi

Check if nvidia driver works

# The test above downloaded a docker image called 'nvidia/cuda'
# we will now run a bash shell inside that image

sudo docker images
# Copy the 'IMAGE ID' of the docker 'nvidia/cuda'

# run the docker in bash
sudo docker run --runtime=nvidia -it IMAGE_ID bash

# Once inside the container, the prompt will look like "root@670724d97363:   "
# Now test that the nvidia drivers are working
nvidia-smi

# It should show something like this
# +-----------------------------------------------------------------------------+
# | NVIDIA-SMI 390.116                Driver Version: 390.116                   |
# |-------------------------------+----------------------+----------------------+
# | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
# | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
# |===============================+======================+======================|
# |   0  GeForce GT 1030     Off  | 00000000:1D:00.0  On |                  N/A |
# | 36%   44C    P8    N/A /  30W |    286MiB /  1998MiB |      1%      Default |
# +-------------------------------+----------------------+----------------------+
#                                                                                
# +-----------------------------------------------------------------------------+
# | Processes:                                                       GPU Memory |
# |  GPU       PID   Type   Process name                             Usage      |
# |=============================================================================|
# +-----------------------------------------------------------------------------+

# Then exit
exit

Download Tensorflow serving GPU

Note: pulling TF Serving 1.12.0 (built against CUDA 9).
Reference: the TensorFlow Serving Docker docs; pick a version from the 'tensorflow/serving' tags on Docker Hub.

sudo docker pull tensorflow/serving:1.12.0-gpu

You will now have a docker image named 'tensorflow/serving':

sudo docker images
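
Aside: if you only need to serve a single model, you can skip the image-customisation steps below and bind-mount the model from the host instead. A minimal sketch, assuming a SavedModel under /path/to/my_model on the host (the path and model name are placeholders):

# the serving image looks for models under /models/<MODEL_NAME> by default
sudo docker run --runtime=nvidia --rm -p 8501:8501 \
    -v /path/to/my_model:/models/my_model \
    -e MODEL_NAME=my_model \
    tensorflow/serving:1.12.0-gpu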

Copying your custom model or tensorflow model script to the docker image

The 'tensorflow/serving' docker image has a default ENTRYPOINT that runs the 'tensorflow_model_server' command when the container starts.
To get a shell instead, we need to override the ENTRYPOINT when running it.

# First copy the IMAGE ID of the 'tensorflow/serving' image
sudo docker run --runtime=nvidia --entrypoint bash -it 'IMAGE_ID_tensorflow/serving'

Keep the container running and open a new terminal, so that we can copy our models into it.

# Get the Container ID of the running docker
sudo docker ps
# Copy the CONTAINER ID

# Now copy your models into it
sudo docker cp <SOURCE_MODEL_PATH> <CONTAINER_ID>:<DESTINATION_MODEL_PATH>

# Once you have finished copying, save the changes by committing the container
sudo docker commit <CONTAINER_ID> <your_repo_name>:<tag_name>
# here <your_repo_name> --> tensorflow/serving
# <tag_name> --> works like a commit name

Once completed you can exit from the docker container using:

exit
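
A sketch of how the committed image might then be run with its default ENTRYPOINT, assuming the model was copied to /models/my_model inside the container (the model name is illustrative; 8500 is gRPC, 8501 is REST):

sudo docker run --runtime=nvidia --rm -p 8500:8500 -p 8501:8501 \
    -e MODEL_NAME=my_model \
    tensorflow/serving:<tag_name>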

If your custom tensorflow model server needs to run a script (usually to serve multiple models from one model server), follow this (a sketch of such a script is given after the list):

  1. Follow the same steps above to copy your script into the container
  2. Make the script executable (inside the container):
    chmod +x script_path.sh
  3. Then save the container with docker commit, as above
  4. Exit from the running container
  5. Get the IMAGE ID of the newly committed image
  6. Then run the container:
    sudo docker run --runtime=nvidia -it --entrypoint=<PATH_TO_THE_SCRIPT> IMAGE_ID
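
As an illustration, a minimal multi-model script of this kind might just start the model server with a model config file. The file names and model names below are assumptions, not part of the original setup:

#!/bin/bash
# hypothetical serve_models.sh: start the server with a multi-model config.
# /models/models.config is a TF Serving model config file (text proto), e.g.:
#   model_config_list {
#     config { name: 'model_a' base_path: '/models/model_a' model_platform: 'tensorflow' }
#     config { name: 'model_b' base_path: '/models/model_b' model_platform: 'tensorflow' }
#   }
tensorflow_model_server --port=8500 --rest_api_port=8501 \
    --model_config_file=/models/models.config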