We will be referring to the tutorial at Microsoft CNTK Docs.
📹 Download the cntk-setup video or view a slightly lower quality version here.
- Ubuntu 16.04
- CNTK 2.2 (GPU) docker image
  - This image will be ~9 GB, so make sure you have at least that much free disk space, plus space for your images.
  - in progress
- Standard NC6 (6 vcpus, 56 GB memory) GPU on Azure (or CPU)
- Docker
- Nvidia-Docker
- CUDA 8.0
- CuDNN 6.0
- Python 3.6
nvidia-docker build -t <path> .
nvidia-docker run -it <container id> bash
nvidia-docker start <container id>
To see if CNTK is installed:
python -c "import cntk; print(cntk.__version__)"
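If the one-liner above fails with an ImportError, it can help to check whether the module is visible to the interpreter at all before importing it. A minimal sketch using only the standard library (the helper name is_module_available is ours, not part of CNTK):

```python
import importlib.util


def is_module_available(name):
    """Return True if the import system can locate `name`, without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if is_module_available("cntk"):
        import cntk
        print("CNTK version:", cntk.__version__)
    else:
        print("CNTK is not installed in this Python environment")
```

Run it with the same python executable you use inside the container, since each Python installation has its own set of packages.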
Verify GPU works in container by running:
nvidia-smi
If you see No running processes found (see https://devtalk.nvidia.com/default/topic/539632/k20-with-high-utilization-but-no-compute-processes-/), enable persistence mode:
sudo nvidia-persistenced --persistence-mode
Check what version of CUDA you have installed:
nvcc -V
If this command is not found, refer to this GitHub resource:
sudo apt-get install nvidia-375 nvidia-modprobe
- Make sure you have set up all the requirements above
- Follow the FasterRCNN guide to run CNTK with your own images or look at the summary below:
- Prepare your image data by annotating it with bounding boxes (I would recommend the VOTT tagging tool)
- Store your custom images in Examples/Image/DataSets/<custom images directory>
- Download the AlexNet model by running the following from Examples/Image/Detection/FastRCNN:
python install_data_and_model.py
- Edit CNTK/Examples/Image/Detection/utils/annotations/annotations_helper.py.
Change from the default Grocery Image Data Set
data_set_path = os.path.join(abs_path, "../../../DataSets/Grocery")
To Your Custom Image Data Set
data_set_path = os.path.join(abs_path, "../../../DataSets/<custom images directory>")
- Run Examples/Image/Detection/utils/annotations/annotations_helper.py:
python annotations_helper.py
- Create a configuration file for your own dataset in Examples/Image/Detection/utils/configs called <custom image dataset name>_config.py.
Edit these parameters:
__C.DATA.DATASET
__C.DATA.MAP_FILE_PATH
__C.DATA.NUM_TRAIN_IMAGES
__C.DATA.NUM_TEST_IMAGES
# data set config
__C.DATA.DATASET = "<custom image dataset name>"
__C.DATA.MAP_FILE_PATH = "/cntk/Examples/Image/DataSets/<Custom image data folder>"
__C.DATA.CLASS_MAP_FILE = "class_map.txt"
__C.DATA.TRAIN_MAP_FILE = "train_img_file.txt"
__C.DATA.TRAIN_ROI_FILE = "train_roi_file.txt"
__C.DATA.TEST_MAP_FILE = "test_img_file.txt"
__C.DATA.TEST_ROI_FILE = "test_roi_file.txt"
__C.DATA.NUM_TRAIN_IMAGES = <number of images to train>
__C.DATA.NUM_TEST_IMAGES = <number of images to test>
__C.DATA.PROPOSAL_LAYER_SCALES = [4, 8, 12]
- Change the dataset_cfg in the get_configuration() method of CNTK/Examples/Image/Detection/FasterRCNN/run_faster_rcnn.py to
from utils.configs.<custom image dataset name>_config import cfg as dataset_cfg
- Edit CNTK/Examples/Image/Detection/FasterRCNN/FasterRCNN_config.py. (More details here)
__C.CNTK.MAKE_MODE = False
__C.CNTK.DEBUG_OUTPUT = True
__C.VISUALIZE_RESULTS = True
__C.USE_GPU_NMS = True
__C.RESULTS_NMS_CONF_THRESHOLD = 0.82
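Two of the steps above require editing CNTK's example sources by hand: the dataset folder in annotations_helper.py and the config import in run_faster_rcnn.py. As a sketch of how both could be driven by a dataset name instead, here is a small helper module. Everything in it is our own assumption: CNTK_DATASET_DIR is a hypothetical environment variable that CNTK does not read itself, and the function names are illustrative only.

```python
import importlib
import os


def resolve_data_set_path(abs_path, default_name="Grocery", env_var="CNTK_DATASET_DIR"):
    """Build the dataset path, letting an environment variable override the default folder.

    CNTK_DATASET_DIR is a hypothetical variable name chosen for this sketch.
    """
    name = os.environ.get(env_var) or default_name
    return os.path.normpath(os.path.join(abs_path, "../../../DataSets", name))


def dataset_config_module(dataset_name):
    """Module path of the per-dataset config, following the <name>_config.py naming above."""
    return "utils.configs.{}_config".format(dataset_name)


def load_dataset_cfg(dataset_name):
    """Import utils.configs.<dataset_name>_config and return its cfg object.

    This only works when run from Examples/Image/Detection, so that utils is importable.
    """
    return importlib.import_module(dataset_config_module(dataset_name)).cfg
```

With something like this in place, switching datasets would mean setting one environment variable rather than editing two files, at the cost of carrying a local patch against the CNTK examples.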
- Make sure to have followed all steps in Get Ready to Run FastRCNN on a Custom Dataset
- Make sure that MAKE_MODE is False in CNTK/Examples/Image/Detection/FasterRCNN/FasterRCNN_config.py
__C.CNTK.MAKE_MODE = False
python run_faster_rcnn.py
- Make sure you obtained a trained model by following the steps in Train and Test with FastRCNN
- Edit CNTK/Examples/Image/Detection/FasterRCNN/FasterRCNN_config.py to skip training. If MAKE_MODE is set to True, training will be skipped if a trained model already exists.
__C.CNTK.MAKE_MODE = True
- Edit Examples/Image/Detection/utils/configs/<custom image dataset name>_config.py to point to your trained model path. In this case, the trained model is faster_rcnn_eval_AlexNet_e2e.model.
__C.DATA.MODEL_PATH="/cntk/Examples/Image/Detection/FasterRCNN/Output/faster_rcnn_eval_AlexNet_e2e.model"
python run_faster_rcnn.py
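The MAKE_MODE behavior described above can be summarized in a few lines. This is our paraphrase of the documented behavior, not CNTK's actual code, and should_train is a hypothetical name:

```python
import os


def should_train(make_mode, model_path):
    """Return False only when MAKE_MODE is on and a trained model file already exists."""
    return not (make_mode and os.path.exists(model_path))
```

So with MAKE_MODE = True and the model from the previous section present at MODEL_PATH, the run goes straight to evaluation; if the model file is missing, training runs regardless of MAKE_MODE.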
sudo apt-get install nvidia-375 nvidia-modprobe
sudo nvidia-docker-plugin
Use Python 3.4. The versions currently contained in the repository are Python 3.5 for Windows and only Python 3.4 for Linux, all 64-bit. If you need a different version, you can compile it following the steps described at
Linux: https://github.com/rbgirshick/py-faster-rcnn
There are CNTK docker images for Python 3.5+, but those images only work if you use Python 3.4. If you don't use Python 3.4, you will get tons of errors with the .so files. Supposedly they added the Cython dependencies for Linux Python 3.5 and 3.6, but we still get errors when running versions of Python other than 3.4. We were using the latest CNTK docker image, 2.1-gpu-python3.5-cuda8.0-cudnn6.0.
apt-get install libqt4-core libqt4-dev libqt4-gui qt4-dev-tools
This error comes from missing NVIDIA libraries in your docker container. Use nvidia-docker to run the container, and verify GPU works in container by running nvidia-smi first.
pip install image
File "/cntk/Examples/Image/Detection/FasterRCNN/../utils/plot_helpers.py", line 145, in plot_test_set_results
img_path = img_file_names[i]
IndexError: list index out of range
Make sure the number of test images in your config file is correct! Namely, in /cntk/Examples/Image/Detection/utils/configs we had created a custom Reverb_config.py. Verify that the value of __C.DATA.NUM_TEST_IMAGES matches the number of images listed in test_img_file.txt in /cntk/Examples/Image/DataSets/Reverb/labelled-guitars, which was generated by the annotations_helper.py script. (labelled-guitars is in the folder structure specified by CNTK, with positive, negative, etc.)
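A quick way to catch this mismatch before a long run is to count the entries in the map file and compare them with the configured value. A minimal sketch, assuming the map file has one image per non-empty line (the function names and the sample lines in the demo are ours, for illustration only):

```python
import os
import tempfile


def count_map_entries(map_file_path):
    """Count non-empty lines in a map file such as test_img_file.txt."""
    with open(map_file_path) as f:
        return sum(1 for line in f if line.strip())


def check_image_count(map_file_path, configured_count):
    """Raise a descriptive error when the configured count disagrees with the map file."""
    actual = count_map_entries(map_file_path)
    if actual != configured_count:
        raise ValueError(
            "{} lists {} images but the config says {}".format(
                map_file_path, actual, configured_count))
    return actual


if __name__ == "__main__":
    # Demo with a throwaway file standing in for test_img_file.txt.
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "test_img_file.txt")
        with open(demo, "w") as f:
            f.write("0\timg0.jpg\t0\n1\timg1.jpg\t0\n")
        print(check_image_count(demo, 2))
```

In our case the call would be check_image_count on the test_img_file.txt under labelled-guitars, with the NUM_TEST_IMAGES value from Reverb_config.py as the second argument.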
- File "/cntk/Examples/Image/Detection/FasterRCNN/../utils/plot_helpers.py", line 134, in plot_test_set_results
from matplotlib.pyplot import imsave
This is due to import matplotlib.pyplot not working in Python. To get it to work, make sure matplotlib is updated. I had an issue where matplotlib was stuck at an older version, 1.5.0:
conda install pyqt
pip uninstall -y matplotlib && pip install -U matplotlib
- CNTK examples in 2.1 do not match the CNTK guide, which only covers CNTK 2.2
- CNTK FasterRCNN config file should be cleaned up on CNTK 2.1
- CNTK 2.2 uses a certain format for annotated source images, CNTK 2.1 does NOT use annotated images, so we have no way of knowing how to set up our images to get CNTK running on 2.1
- No makefile in CNTK 2.1
- Dockerfiles updates to support Python 3.5+ on Linux - the binaries for cython_modules only come prepackaged for Windows builds, not Linux builds.
- Dockerfiles updates to include all the required pip packages -- we had to manually install pip packages
- Cleanup CNTK libs so loading your own data is taken from the environment:
Change the folder in that script to your data folder
After storing your images in the described folder structure and annotating them please run python Examples/Image/Detection/utils/annotations/annotations_helper.py
- Cleanup CNTK libs so loading your own data is taken from the environment:
Change the dataset_cfg in the get_configuration() method of run_faster_rcnn.py to from utils.configs.MyDataSet_config import cfg as dataset_cfg
- CNTK forces you to have at least one image saved in the training data folders (negative and positive), even though you may only want to test on images given a pre-trained model.
- If you have no images in training, it will fail and not include the correct class labels in class_map.txt necessary to run annotations_helper.py
- Remove the specification of the number of training images and the number of test images from the config file. CNTK should handle any number of training/test images you provide.
- Update Readme for CNTK 2.2. It should mention that an Output folder is created after training and what we should expect to see after training (explain more about what new files are generated after training)
- Printed progress status stays at 0.0% even though it is training images
- Output descriptive error messages. When I tried to run CNTK with different test images:
Evaluating Faster R-CNN model for 3 images.
Traceback (most recent call last):
File "run_faster_rcnn.py", line 34, in <module>
eval_results = compute_test_set_aps(trained_model, cfg)
File "/cntk/Examples/Image/Detection/FasterRCNN/FasterRCNN_eval.py", line 86, in compute_test_set_aps
mb_data = minibatch_source.next_minibatch(1, input_map=input_map)
File "/cntk/Examples/Image/Detection/FasterRCNN/../utils/od_mb_source.py", line 70, in next_minibatch
img_data, roi_data, img_dims, proposals, label_targets, bbox_targets, bbox_inside_weights = self.od_reader.get_next_input()
File "/cntk/Examples/Image/Detection/FasterRCNN/../utils/od_reader.py", line 57, in get_next_input
index = self._get_next_image_index()
File "/cntk/Examples/Image/Detection/FasterRCNN/../utils/od_reader.py", line 206, in _get_next_image_index
next_image_index = self._reading_order[self._reading_index]
IndexError: index 0 is out of bounds for axis 0 with size 0
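To illustrate the kind of message we would have liked to see instead of the bare IndexError above, here is a sketch of a guard around the failing lookup. It is a hypothetical stand-in, not the code in od_reader.py; the function name, parameters, and wording of the message are ours:

```python
def next_image_index(reading_order, reading_index, map_file="<map file>"):
    """Return the next image index, failing with a readable message if no images were loaded."""
    if len(reading_order) == 0:
        raise ValueError(
            "No images were loaded from {}. Check that the map file exists, is not empty, "
            "and that NUM_TRAIN_IMAGES / NUM_TEST_IMAGES match it.".format(map_file))
    return reading_order[reading_index]
```

An error like this points directly at the map file and the image counts in the dataset config, which is where the problem turned out to be for us.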