Forked from douglasrizzo/
Created January 8, 2019 12:01
TensorFlow Object Detection Model Training

This is a summary of this nice tutorial.

  1. Install TensorFlow.
  2. Download the TensorFlow models repository.

Annotating the dataset

  1. Install labelImg. This is a Python package, which means you can install it via pip, but the one from GitHub is better. It saves annotations in the PASCAL VOC format.
  2. Annotate your dataset using labelImg.
  3. Use this script to convert the XML files generated by labelImg into a single CSV file.
  4. Split the CSV file into two: one with training examples and one with evaluation examples. Select images randomly, making sure that objects from all classes are present in both CSV files. A common split is 75 to 80% for training and the rest for evaluation.
  5. Use this script to convert the two CSV files (e.g. train.csv and eval.csv) into TFRecord files (e.g. train.record and eval.record), the data format TensorFlow is most familiar with. Be aware that you have to change these lines in the file, mapping each of YOUR object classes to an integer value. Use sequential values, beginning at 1.
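The random split in step 4 can be sketched in plain Python. This is only an illustration, not the tutorial's own script: it assumes the CSV has `filename` and `class` columns (adjust the keys to match your header), and the function name `split_annotations` is made up for this example. It splits by image, so that all boxes of one image end up on the same side, and retries the shuffle until every class shows up in both sets.

```python
import random
from collections import defaultdict


def split_annotations(rows, train_fraction=0.8, seed=0, max_tries=100):
    """Split annotation rows into train/eval sets, image by image.

    `rows` is a list of dicts with at least 'filename' and 'class' keys
    (assumed column names; adjust them to match your CSV header).
    """
    # Group rows by image so one image never lands in both sets.
    by_image = defaultdict(list)
    for row in rows:
        by_image[row["filename"]].append(row)

    all_classes = {row["class"] for row in rows}
    filenames = sorted(by_image)
    rng = random.Random(seed)

    for _ in range(max_tries):
        rng.shuffle(filenames)
        cut = int(len(filenames) * train_fraction)
        train = [r for name in filenames[:cut] for r in by_image[name]]
        evaluation = [r for name in filenames[cut:] for r in by_image[name]]
        # Accept the split only if every class appears on both sides.
        if ({r["class"] for r in train} == all_classes
                and {r["class"] for r in evaluation} == all_classes):
            return train, evaluation

    raise ValueError("no split found with every class in both sets")
```

The rows can be read with `csv.DictReader` and each half written back with `csv.DictWriter` to produce train.csv and eval.csv.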

Traversing the text file hell...

  1. Create a label map, like one of these. Make sure class numbers are the same ones that were used in the script from the last step, when creating the TFRecords.
  2. Download one of the pre-trained neural network models provided on this page. The ones trained on the MS COCO dataset are a good choice, since they were also trained to detect objects.
  3. Provide a training pipeline, which is a config file that usually comes in the tar.gz file downloaded in the last step. If it doesn't, sample configs can be found here (they need some tweaking before use, for example changing the number of classes). You can find a tutorial on how to create your own here.
    • The pipeline config file has some fields that must be adjusted before training is started. Its header describes which ones. Usually, they are the fields that point to the label map, the training and evaluation directories and the neural network checkpoint. If you downloaded one of the models provided on this page, you should untar the tar.gz file and point the checkpoint path inside the pipeline config file to the "untarred" directory of the model (see this answer for help).
    • You should also check the number of classes. MS COCO has 90 classes, but your problem may have more or fewer.
    • There are additional parameters that may affect how much RAM is consumed by the training process, as well as the quality of the training, but I won't go over those here.
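Since the class ids must agree across the TFRecord script, the label map and the pipeline config, one way to keep them in sync is to derive all three from a single class list. The sketch below is an assumption-laden illustration: `CLASSES` is a placeholder list, the helper names are hypothetical, the label map text follows the `item { id: ... name: ... }` layout used by the sample label maps, and `set_num_classes` is a plain text substitution on the config, not an official API.

```python
import re

# Hypothetical class list; replace with your own classes, in a fixed order.
CLASSES = ["cat", "dog"]


def class_text_to_int(label):
    """Map a class name to its integer id (sequential, starting at 1)."""
    return CLASSES.index(label) + 1


def build_label_map(classes):
    """Return label map text with one `item` entry per class."""
    items = []
    for class_id, name in enumerate(classes, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (class_id, name))
    return "\n".join(items)


def set_num_classes(config_text, num_classes):
    """Replace every `num_classes: N` value in a pipeline config string."""
    return re.sub(r"num_classes:\s*\d+",
                  "num_classes: %d" % num_classes,
                  config_text)
```

Writing the output of `build_label_map(CLASSES)` to a .pbtxt file and passing `len(CLASSES)` to `set_num_classes` avoids the ids drifting apart when a class is added later.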

Training the network

  1. Train the model. This is how you do it locally. Optional: to check training progress, start TensorBoard with its --logdir pointing to the --train_dir of object_detection/
  2. Export the network, like this.
  3. Use the exported .pb in your object detector.
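Once the exported model is running in your detector, its raw outputs usually need some post-processing. The sketch below assumes, as I understand the Object Detection API's exported graphs, that boxes come back normalized as [ymin, xmin, ymax, xmax] alongside parallel lists of scores and class ids. The function name, the output dictionary layout and the 0.5 threshold are choices made for this example only.

```python
def filter_detections(boxes, scores, classes,
                      image_width, image_height, min_score=0.5):
    """Keep detections above `min_score` and convert boxes to pixels.

    Assumes each box is normalized [ymin, xmin, ymax, xmax] and that
    `boxes`, `scores` and `classes` are parallel sequences.
    """
    results = []
    for box, score, class_id in zip(boxes, scores, classes):
        if score < min_score:
            continue  # Drop low-confidence detections.
        ymin, xmin, ymax, xmax = box
        results.append({
            "class_id": int(class_id),
            "score": float(score),
            # Pixel box as (left, top, right, bottom).
            "box_pixels": (
                int(round(xmin * image_width)),
                int(round(ymin * image_height)),
                int(round(xmax * image_width)),
                int(round(ymax * image_height)),
            ),
        })
    return results
```

The `class_id` values map back to class names through the same label map used for training.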


In the data augmentation section of the training pipeline, options can be added or removed to try to improve training. Some of the options are listed here.
