This is a summary of this nice tutorial.
- Install labelImg. It is a Python package, so you can install it via pip, but the version from GitHub is preferable. It saves annotations in the PASCAL VOC XML format.
- Annotate your dataset using labelImg.
- Use this script to convert the XML files generated by labelImg into a single CSV file.
- Split the CSV file into two: one with the training examples and one with the evaluation examples. Select images randomly, making sure that objects from all classes appear in both files. The usual split is 75 to 80% of the images for training and the rest for evaluation.
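The split above can be sketched as follows. This is a minimal, best-effort approach: rows are grouped by image so all boxes of one image stay in the same set, and the `filename` column name is an assumption about the CSV produced in the previous step (checking that every class actually lands in both files is left to you).

```python
import csv
import random
from collections import defaultdict

def split_annotations(rows, train_frac=0.8, seed=42):
    """Split annotation rows (dicts with a 'filename' key) into
    train/eval sets, keeping all boxes of an image together."""
    random.seed(seed)
    by_image = defaultdict(list)
    for row in rows:
        by_image[row["filename"]].append(row)

    images = list(by_image)
    random.shuffle(images)
    cut = int(len(images) * train_frac)
    train_imgs, eval_imgs = images[:cut], images[cut:]

    train = [r for img in train_imgs for r in by_image[img]]
    evaluation = [r for img in eval_imgs for r in by_image[img]]
    return train, evaluation
```

After splitting, write each list back out with `csv.DictWriter` as train.csv and eval.csv, and eyeball the class counts in both files.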
- Use this script to convert the two CSV files (e.g. train.csv and eval.csv) into TFRecord files (e.g. train.record and eval.record), the data format TensorFlow works with natively. Be aware that you have to change these lines in the script, mapping each of YOUR object classes to an integer value. Use sequential values, beginning at 1.
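The mapping to edit looks roughly like this. The class names 'cat' and 'dog' are hypothetical placeholders; replace them with your own labels, numbered sequentially from 1.

```python
# Hypothetical two-class version of the label-to-id mapping you must
# edit in the conversion script before generating the TFRecords.
def class_text_to_int(row_label):
    if row_label == 'cat':
        return 1
    elif row_label == 'dog':
        return 2
    else:
        return None  # rows with unknown labels get skipped
```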
- Create a label map, like one of these. Make sure the class numbers match the ones used in the script from the previous step when creating the TFRecords.
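For the hypothetical two-class example above, the label map (a .pbtxt file) would look like this; the ids must match the integers returned by the conversion script:

```
item {
  id: 1
  name: 'cat'
}
item {
  id: 2
  name: 'dog'
}
```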
- Download one of the neural network models provided on this page. The ones trained on the MSCoco dataset are usually the best starting points, since they were trained on a wide variety of everyday objects.
- Provide a training pipeline, which is a config file that usually comes in the tar.gz file downloaded in the last step. If it doesn't, config files can be found here (they need some tweaking before use, for example changing the number of classes). You can find a tutorial on how to create your own here.
- The pipeline config file has some fields that must be adjusted before training starts. Its header describes which ones. Usually they are the fields that point to the label map, the training and evaluation directories, and the neural network checkpoint. If you downloaded one of the models provided on this page, you should untar the tar.gz file and point the checkpoint path inside the pipeline config file to the untarred directory of the model (see this answer for help).
- You should also check the number of classes. MSCoco has 90 classes, but your problem may have more or fewer.
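Putting those adjustments together, the relevant parts of a pipeline config look roughly like this. Field names follow the Object Detection API's pipeline.proto; all paths and the class count are placeholders you must replace:

```
model {
  ssd {
    num_classes: 2  # match your label map, not MSCoco's 90
  }
}
train_config {
  fine_tune_checkpoint: "path/to/untarred_model/model.ckpt"
}
train_input_reader {
  label_map_path: "path/to/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "path/to/train.record"
  }
}
eval_input_reader {
  label_map_path: "path/to/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "path/to/eval.record"
  }
}
```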
- There are additional parameters that may affect how much RAM is consumed by the training process, as well as the quality of the training, but I won't go over those here.
- Train the model. This is how you do it locally. Optional: to check training progress, TensorBoard can be started with its logdir pointing at the training directory.
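As a sketch, training and monitoring locally look like the commands below. Script and flag names follow the Object Detection API of that era (newer releases use model_main.py instead of train.py), and all paths are placeholders:

```shell
# From the object_detection directory:
python train.py \
    --logtostderr \
    --pipeline_config_path=path/to/pipeline.config \
    --train_dir=path/to/train_dir

# Optional, in a second terminal: watch progress in the browser.
tensorboard --logdir=path/to/train_dir
```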
- Export the network, like this.
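The export step looks roughly like this. The script and flags come from the Object Detection API; the checkpoint number and paths are placeholders:

```shell
python export_inference_graph.py \
    --input_type=image_tensor \
    --pipeline_config_path=path/to/pipeline.config \
    --trained_checkpoint_prefix=path/to/train_dir/model.ckpt-XXXX \
    --output_directory=path/to/exported_model
```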
- Use the exported .pb file in your object detector.
In the data augmentation section of the training pipeline, options can be added or removed to try to improve training. Some of the options are listed here.
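For example, two common augmentation options (both exist in the API's preprocessor.proto) can be enabled inside train_config like this:

```
train_config {
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}
```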