how to install torchserve and get your first model running

Installing TorchServe

Machine Type

GCP, Ubuntu instance (Canonical, Ubuntu, 16.04 LTS, amd64 xenial image built on 2020-06-10, supports Shielded VM features)

To get this example to actually work, I followed the official documentation, this blog post by AWS, and this YouTube demo

  1. Install Java 11
sudo add-apt-repository ppa:openjdk-r/ppa
sudo apt-get update
sudo apt-get install openjdk-11-jdk
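
You can double check the Java install with:

java -version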

1.1 Install Python 3.7

sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.7 python3.7-dev
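
Verify the install with:

python3.7 --version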

1.2 Install PIP

sudo apt install python-pip python3-venv python3-pip
pip install --upgrade pip

1.3 Install the right version of CUDA

This is no small feat: the versions of CUDA, the NVIDIA driver, and PyTorch must all be aligned.

  2. Install virtualenvwrapper
pip install virtualenvwrapper

If you run into pip install issues, this might help.
If you have trouble locating virtualenvwrapper.sh, this might help.
Also do a pip check to make sure there are no other missing packages.
Lastly, make mkvirtualenv use Python 3.7 by adding this alias to your ~/.bashrc:

alias mkvirtualenv3='mkvirtualenv --python=`which python3.7` '
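
Note that virtualenvwrapper also needs to be sourced in your ~/.bashrc; a typical setup looks something like this (the exact path to virtualenvwrapper.sh depends on where pip installed it, see the note above):

export WORKON_HOME=$HOME/.virtualenvs            # where the virtualenvs will live
export VIRTUALENVWRAPPER_PYTHON=$(which python3) # python used to run virtualenvwrapper itself
source /usr/local/bin/virtualenvwrapper.sh       # adjust the path if yours is e.g. in ~/.local/bin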
  3. Create a torchserve3 environment and install torchserve and torch-model-archiver
mkvirtualenv3 torchserve3
pip install torch torchtext torchvision sentencepiece psutil future
pip install torchserve torch-model-archiver

Now torchserve is available in your virtualenv torchserve3.

Check that the GPU is available by:

python -m torch.utils.collect_env

If you need to uninstall the wrong version of CUDA, see here and here.
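
A quick one-liner sanity check (assuming torch is already installed in the virtualenv) to confirm PyTorch can actually see the GPU:

python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"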

Extra Notes on PyTorch & CUDA
  • CUDA 10.0 only works with torch==1.2 and torchvision==0.4.0, but TorchServe requires torch>=1.5
  • For our specific machine, we need driver nvidia-418, cuda-10.1, and this torch and torchvision install:
pip install torch==1.5.1+cu101 torchvision==0.6.1+cu101 -f https://download.pytorch.org/whl/torch_stable.html

Start TorchServe

TorchServe needs a model_store directory from which archived models (*.mar) will be served. Mine is at ~/models/model_store/.

Start TorchServe (ideally do this in a screen) by:

torchserve --start --model-store ~/models/model_store/
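
When you need to shut it down later:

torchserve --stop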

Configuring a Public API

You first need to enable SSL.
Assuming you are using the keystore method, create a config.properties file with the following:

inference_address=https://0.0.0.0:8080
management_address=https://0.0.0.0:8081
keystore=keystore.p12
keystore_pass=changeit
keystore_type=PKCS12
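
If you don't have a keystore.p12 yet, a self-signed one can be generated with Java's keytool, roughly like this (it will prompt for certificate details; the password must match keystore_pass above):

keytool -genkeypair -keyalg RSA -keysize 2048 -validity 3600 \
  -alias ts -storetype PKCS12 -keystore keystore.p12 -storepass changeit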

Then start TorchServe (in the same directory as your keystore.p12 and config.properties) with the following:

torchserve --start --model-store ~/models/model_store/ --ts-config config.properties
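
Since the keystore above is self-signed, test the HTTPS endpoints with curl's -k (skip certificate verification) flag, e.g.:

curl -k https://localhost:8080/ping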

Archiving a Model

First, clone the TorchServe repo to get access to the example model file and extra files:

git clone https://github.com/pytorch/serve.git
  1. Download a trained model into your model_store directory (mine is ~/models/model_store)
wget https://download.pytorch.org/models/densenet161-8d451a50.pth -P ~/models/model_store
  2. Archive the model (run this in the parent directory of where your TorchServe repo directory sits)
torch-model-archiver --model-name densenet161 \
--version 1.0 --model-file serve/examples/image_classifier/densenet_161/model.py \
--serialized-file ~/models/model_store/densenet161-8d451a50.pth \
--extra-files serve/examples/image_classifier/index_to_name.json \
--handler image_classifier
  3. Move the archived model into model_store
mv densenet161.mar ~/models/model_store/

Optionally, you can serve the model directly at startup (ideally do this in a screen):

torchserve --start --model-store ~/models/model_store/ --models densenet161=densenet161.mar

To register our DenseNet161 model in ~/models/model_store/:

curl -X POST "http://localhost:8081/models?url=densenet161.mar"

To configure workers, number of GPUs, timeout, etc., see here.
We will add GPUs to our densenet161 model like so:

curl -v -X PUT "http://localhost:8081/models/densenet161?min_worker=8&number_gpu=2&synchronous=true"

Using the batch_size (max batch size the model expects to handle) and max_batch_delay (milliseconds to wait to fill up a batch) flags in the management API, we can enable batch inference like so:

# set batch size to 8 and max delay to 50ms for the model densenet161
curl -X POST "localhost:8081/models?url=densenet161.mar&batch_size=8&max_batch_delay=50"

See Running Models

curl "http://localhost:8081/models"

To see the details of a running model, for example our DenseNet161:

curl "http://localhost:8081/models/densenet161"

To simply check the health of TorchServe:

curl http://localhost:8080/ping

Make Inference Request

  1. Download test image
curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg
  2. Send the image to the Inference API
curl -X POST http://127.0.0.1:8080/predictions/densenet161 -T kitten.jpg
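
To actually exercise the batch settings configured earlier, several requests have to arrive within the max_batch_delay window; a quick way to simulate that from the shell is to fire them concurrently:

# send 8 concurrent requests so TorchServe can group them into one batch
for i in $(seq 1 8); do
  curl -s -X POST http://127.0.0.1:8080/predictions/densenet161 -T kitten.jpg &
done
wait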
