@douglasrizzo
Last active November 15, 2023 22:36
TensorFlow Object Detection Model Training

How to train your own object detection models using the TensorFlow Object Detection API (2020 Update)

This started as a summary of this nice tutorial, but has since become its own thing.

Prerequisites

  1. Choose a TensorFlow installation. TensorFlow 1 and 2 have different neural networks available, so check here and here to make your choice.

    • Tip: if you opt for one of the TF1 models, please note that the Object Detection API is only officially compatible with TF 1.15.0, which works only with CUDA 10.0 (unless you compile from source). From personal experience, I know that TF 1.12 and earlier versions no longer work with the Object Detection API.
  2. Install TensorFlow.

  3. Download the TensorFlow models repository and install the Object Detection API [TF1] [TF2].

Annotating images and serializing the dataset

For these steps, I recommend a collection of scripts I made, which are available in this repository. All of the scripts mentioned in this section receive arguments from the command line and print help messages through the -h/--help flags. Also check the README of the repository they come from for more details, if needed.

  1. Install labelImg. This is a Python package, which means you can install it via pip, but the one from GitHub is better.

  2. Annotate your dataset using labelImg. Each image you annotate will have its annotations saved to an individual XML file with the name of the original image file and the .xml extension.

  3. Use this script to convert the XML files generated by labelImg into a single CSV file.

    • Optional: Use this script to separate the CSV file into two, one with training examples and one with evaluation examples. Let's call them train.csv and eval.csv. Images will be selected randomly and there are options to stratify examples by class, making sure that objects from all classes are present in both datasets. The usual proportions are 75% to 80% of the annotated objects used for training and the rest for the evaluation dataset.
  4. Create a "label map" for your classes. You can check some examples to understand what they look like. You can also generate one from your original CSV file with this script.

  5. Use this script to convert each of your CSV files into a TFRecord file (e.g., train.csv into train.record and eval.csv into eval.record), the serialized data format that the Object Detection API reads its datasets from. You'll need to point to the directory where the image files are stored and to the label map generated in the previous step.

    • Tip: if you notice mistakes during the creation of these files, you can check their contents and compare to the ones in these examples.
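To make steps 3 and 4 more concrete, here is a minimal sketch of what such conversions can look like. It is not the code of the scripts linked above. It assumes the XML files follow the Pascal VOC layout that labelImg writes by default, and the sample annotation, the CSV column names and the function names (voc_xml_to_rows, rows_to_csv, label_map_text) are all hypothetical choices made for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample annotation in the Pascal VOC layout (assumed labelImg default).
SAMPLE_XML = """<annotation>
  <filename>img001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>200</xmax><ymax>220</ymax></bndbox>
  </object>
  <object>
    <name>cat</name>
    <bndbox><xmin>300</xmin><ymin>40</ymin><xmax>500</xmax><ymax>400</ymax></bndbox>
  </object>
</annotation>"""

# Assumed column layout; the linked scripts may use different names.
COLUMNS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]


def voc_xml_to_rows(xml_text):
    """Return one dict per annotated object found in a single XML document."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.findall("object"):
        row = {"filename": filename, "width": width, "height": height,
               "class": obj.findtext("name")}
        for corner in ("xmin", "ymin", "xmax", "ymax"):
            row[corner] = int(obj.findtext("bndbox/" + corner))
        rows.append(row)
    return rows


def rows_to_csv(rows):
    """Serialize the rows into CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def label_map_text(class_names):
    """Render a label map; ids start at 1 because 0 is reserved for the background."""
    items = []
    for class_id, name in enumerate(sorted(set(class_names)), start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (class_id, name))
    return "\n".join(items)


rows = voc_xml_to_rows(SAMPLE_XML)
print(rows_to_csv(rows))
print(label_map_text(row["class"] for row in rows))
```

In a real dataset you would loop over every .xml file in the annotation directory, concatenate the rows and write the CSV and the label map to disk instead of printing them.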

Choosing a neural network and preparing the training pipeline

  1. Download the neural network model of your choice from either the Detection Model Zoo [TF1][TF2] or from the models trained for classification available here and here. This is the step in which your choice of TensorFlow version will make a difference. From my experience, many of the classification models work with TF 1.15, but I do not know whether they work with TF 2.

  2. Provide a training pipeline, which is a file with .config extension that describes the training procedure. The models provided in the Detection Zoo come with their own pipelines inside their .tar.gz file, but the classification models do not. In this situation, your options are to:

    • download one that is close enough from here (I have successfully done that to train classification MobileNets V1, V2 and V3 for detection).
    • create your own, by following this tutorial.

    The pipeline config file has some fields that must be adjusted before training is started. The first thing you'll definitely want to keep an eye on is the num_classes attribute, which you'll need to change to the number of classes in your personal dataset.

    Other important fields are the ones with the PATH_TO_BE_CONFIGURED string. In these fields, you'll need to point to the files they ask for, such as the label map, the training and evaluation TFRecords and the neural network checkpoint, which is a file with an extension like .ckpt or .ckpt.data-####-of-####. This file also comes with the .tar.gz file.

    If you are using a model from the Detection Zoo, set the fine_tune_checkpoint_type field to "detection"; otherwise, set it to "classification".

    There are additional parameters that may affect how much RAM is consumed by the training process, as well as the quality of the training. Things like the batch size or how many batches TensorFlow can prefetch and keep in memory may considerably increase the amount of RAM necessary, but I won't go over those here as there is too much trial and error in adjusting those.
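As an illustration of the adjustments described above, the following sketch fills in a pipeline config by plain text substitution. The config fragment is a heavily trimmed, hypothetical example (real pipelines contain many more fields), the file paths are made up, and fill_pipeline is a helper name invented here, not part of the Object Detection API.

```python
import re

# Trimmed, hypothetical pipeline fragment; real .config files are much longer.
SAMPLE_CONFIG = '''model {
  ssd {
    num_classes: 90
  }
}
train_config {
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
  fine_tune_checkpoint_type: "detection"
}
train_input_reader {
  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED/train.record"
  }
}
eval_input_reader {
  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED/eval.record"
  }
}
'''


def fill_pipeline(config_text, num_classes, replacements):
    """Set num_classes and swap each placeholder path for a real one."""
    text = re.sub(r"(num_classes:[ \t]*)\d+",
                  lambda match: match.group(1) + str(num_classes),
                  config_text)
    for placeholder, real_path in replacements.items():
        text = text.replace(placeholder, real_path)
    # Fail loudly if any placeholder was forgotten.
    if "PATH_TO_BE_CONFIGURED" in text:
        raise ValueError("some PATH_TO_BE_CONFIGURED fields were not filled in")
    return text


# Hypothetical locations of the checkpoint, label map and TFRecord files.
filled = fill_pipeline(SAMPLE_CONFIG, num_classes=3, replacements={
    "PATH_TO_BE_CONFIGURED/model.ckpt": "pretrained/model.ckpt",
    "PATH_TO_BE_CONFIGURED/label_map.pbtxt": "data/label_map.pbtxt",
    "PATH_TO_BE_CONFIGURED/train.record": "data/train.record",
    "PATH_TO_BE_CONFIGURED/eval.record": "data/eval.record",
})
print(filled)
```

Editing the file by hand in a text editor works just as well; the check at the end of fill_pipeline only guards against leaving a placeholder behind.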

Training the network

  1. Train the model. To do it locally, follow the steps available here: [TF1][TF2]. Optional: in order to check training progress, TensorBoard can be started pointing its --logdir to the --model_dir of object_detection/model_main.py.

  2. Export the network, like this.

  3. Use the exported .pb in your object detector.

Final Tips

In the data augmentation section of the training pipeline, some options can be added or removed to try and make the training better. Some of the options are listed here.

@douglasrizzo
Author

@MasudHaider I imagine so, as long as you are able to clone and install the TF Object Detection API into your Colab, the other steps are simple.

@Idakwo

Idakwo commented Oct 28, 2019

@douglasrizzo Can you please share a file to recreate the environment you used? Like a requirement.txt or .yml. I am getting one error after the other after fixing a dependency error. There are several GitHub commits to TF's Object Detection repo correlating with different TF versions. I see changes in the repo to TF2.0. It'd be great to know which commit worked for you and the corresponding setup. Thanks

@douglasrizzo
Author

Hi @Idakwo. I haven't updated this tutorial in a while. It works with tensorflow==1.13, but I can't pinpoint the version of the tensorflow/models repo that is compatible with TF 1.13. I also know that object detection does not work with TF 2.0 yet, so maybe that's one source of your errors.

@marcotacuri

When I run the CSV to TFRecords conversion, the following message appears: module 'tensorflow' has no attribute 'app', on the line: flags = tf.app.flags. How do I fix this?

@padhulp

padhulp commented Jul 13, 2020

Your scripts helped me create my TF records at last! Thank you so much

@douglasrizzo
Author

@padhulp glad they helped!

@Raman1121

Hi @douglasrizzo
Great tutorial. I am facing a problem while running 'generate_tfrecord.py' in the imports 'from object_detection.utils import dataset_util'. When I looked at the repo, I could not find any module/ file named object_detection. Where can I find this?

@douglasrizzo
Author

@Raman1121 you need to install the Object detection API [link].

@dossierMe

Hi @douglasrizzo

This is very helpful. Could you help with how to go about if I want to build and train my own DL model?

@andihaki

any reference for training facelandmark keypoint?

@philip-thielges

Hey, in the annotation section, there is a repo linked for using labelImg. Unfortunately it isn't available anymore. Do you still have access or an alternative? :)
