In this note I will show you how to use YOLOv2, a state-of-the-art real-time object detection system, with custom objects under ROS. It works only on Linux; I personally used Ubuntu 14.04 and ROS Indigo.
- A computer with Linux
- ROS
- CUDA - recommended
- OpenCV - recommended
For further information please read this.
First, clone the repository recursively into a ROS workspace from here. The necessary instructions are presented there in detail.
Now navigate to the darknet folder inside the darknet_ros package and adjust the preferences in the Makefile. To use CUDA and OpenCV, change the corresponding 0 to 1. Don't forget to run make in the terminal afterwards. You can verify that YOLO itself works by navigating to the darknet folder and following the instructions on the official YOLO site. Don't forget to download the weights!
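For reference, after enabling both options, the relevant lines at the top of the Makefile should look like this (assuming the stock darknet Makefile layout):

```makefile
GPU=1      # was 0; requires CUDA
OPENCV=1   # was 0; requires OpenCV
```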
After building with
catkin build darknet_ros -DCMAKE_BUILD_TYPE=Release
and modifying the .yaml and .launch files according to your preferences, you should be able to start it by typing
roslaunch darknet_ros darknet_ros.launch
Here you can find the data I used for my project.
- copy the cfg/obj.cfg in your_path/darknet_ros/darknet_ros/yolo_network_config/cfg
- copy the desired weight to your_path/darknet_ros/darknet_ros/yolo_network_config/weights
- simple_7100.weights - trained on only a few images
- obj_8800.weights - trained on all the image data
- insert config/my_yolo.yaml to your_path/darknet_ros/darknet_ros/config
- modify the specific line in darknet_ros.launch like this
file="$(find darknet_ros)/config/my_yolo.yaml"
(or you can create your own .yaml based on this.)
- in ros.yaml you can modify the topics
- in case of a compressed stream, paste this at the beginning of the .launch file
<node name="republish" type="republish" pkg="image_transport" args="compressed in:=/kinect_head/rgb/image_color raw out:=/camera/rgb/image_raw" />
(change the topic if needed.)
- start the camera stream and enjoy!
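For orientation, a minimal my_yolo.yaml might look roughly like the sketch below. The config file, weight file, threshold, and class names are placeholders; check the .yaml files already shipped with darknet_ros for the exact schema your version expects.

```yaml
yolo_model:
  config_file:
    name: obj.cfg
  weight_file:
    name: obj_8800.weights
  threshold:
    value: 0.3
  detection_classes:
    names:
      - my_first_class
      - my_second_class
```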
First of all you need a dataset which is big enough. I learned a lot from this article, although I changed a few things.
If you have properly cropped images of your objects, with only the specific object in the picture (see the example below), then you can use the find_image_size.py script from my repository. Before this, create a separate folder in your_path/darknet/data for each object you want to detect (let's call these classes from now on). Once you have your folders with the images, copy the script into a folder, modify categoryNumber (line 24) and run it. One .txt file will be generated per image, already in YOLO format. categoryNumber ranges from 0 to the number of classes minus 1; increase it by 1 for each new class, copy the script into the corresponding folder and run it again.
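To see what the generated .txt files contain, here is a minimal sketch of the YOLO label format (yolo_label is a hypothetical helper, not part of the repository): each line stores the class index followed by the box center and size, normalized to the range [0, 1] by the image dimensions.

```python
def yolo_label(category, box, img_w, img_h):
    """Convert an absolute pixel box (x_min, y_min, x_max, y_max)
    into a normalized YOLO label line: class x_center y_center w h."""
    x_min, y_min, x_max, y_max = box
    x_c = (x_min + x_max) / 2.0 / img_w
    y_c = (y_min + y_max) / 2.0 / img_h
    w = (x_max - x_min) / float(img_w)
    h = (y_max - y_min) / float(img_h)
    return "%d %.6f %.6f %.6f %.6f" % (category, x_c, y_c, w, h)

# For a properly cropped image, the object spans the whole frame:
print(yolo_label(0, (0, 0, 640, 480), 640, 480))
# 0 0.500000 0.500000 1.000000 1.000000
```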
Otherwise you need to use a labeling tool. I personally prefer Euclid: it's simple, easy to use, supports the YOLO format, and can label images containing multiple classes. It generates the necessary files; you just need to put them next to the images. Be aware that this method is really time-consuming.
Once you are done with labeling and you have all the .txt files next to your pictures, with each class in a separate folder, you can run process.py from your_path/darknet_ros/darknet/data/my_data. This will generate and populate the train.txt and test.txt files in the my_data folder; these contain the paths to the images used for training. Note that for now these scripts support only .png and .jpg formats, but you can change that as you wish.
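The core idea behind process.py can be sketched as a simple shuffle-and-split (split_dataset is a hypothetical illustration; the real script's split ratio and ordering may differ):

```python
import random

def split_dataset(image_paths, test_fraction=0.1, seed=0):
    """Shuffle the image paths and split them into train/test lists,
    as written out to train.txt and test.txt."""
    paths = sorted(image_paths)
    random.Random(seed).shuffle(paths)
    n_test = max(1, int(len(paths) * test_fraction))
    return paths[n_test:], paths[:n_test]

train, test = split_dataset(["img%03d.jpg" % i for i in range(20)])
print(len(train), len(test))  # 18 2
```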
Preparing the YOLO configuration files is wonderfully described here. I did the same, although I had to change
subdivisions=16
in my obj.cfg file.
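The batch and subdivisions settings work together: each batch of images is processed in batch/subdivisions chunks, so raising subdivisions lowers GPU memory usage without changing the effective batch size. With the value above (and assuming the usual batch=64 from the reference config):

```ini
batch=64
subdivisions=16   ; 64 / 16 = 4 images per forward pass
```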
After modifying the .cfg, .data and .names files accordingly, placing them in the correct folders and double-checking the paths(!), you can start the actual training. Download this and place it in the darknet folder, then run
./darknet detector train cfg/obj.data cfg/obj.cfg darknet19_448.conv.23
The training should start.
Depending on your GPU (use CUDA!), training can last for hours or days. For decent performance you need many iterations, but try to avoid overfitting. You can stop training at any time if you want to test it early, but keep in mind that backup weights are saved every 100 iterations up to iteration 1000, and only every 1000 iterations after that. If you wish to continue training, you can simply use something like
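The backup schedule described above can be summarized with a tiny helper (is_checkpoint is a hypothetical illustration of the default behaviour; the exact details depend on your darknet build):

```python
def is_checkpoint(iteration):
    """Return True if darknet writes a backup .weights file at this
    iteration: every 100 iterations up to 1000, then every 1000."""
    if iteration < 1000:
        return iteration > 0 and iteration % 100 == 0
    return iteration % 1000 == 0

print([i for i in range(0, 2001, 100) if is_checkpoint(i)])
# [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000]
```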
./darknet detector train cfg/obj.data cfg/obj.cfg obj_1000.weights
You can test your model on a picture containing some of the classes you trained on; just put it in the data folder.
./darknet detector test cfg/obj.data cfg/obj.cfg obj_1000.weights data/your_image.jpg
In case you have any issues, this might be helpful.
In my case there was a build problem when using the ROS package with CUDA. Adding
list(APPEND CUDA_NVCC_FLAGS "-std=c++11")
to the CMakeLists.txt, right after
FIND_PACKAGE(CUDA)
if (CUDA_FOUND)
solved it.
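Put together, the relevant section of CMakeLists.txt looks roughly like this (a sketch; the surrounding contents depend on your darknet_ros version):

```cmake
FIND_PACKAGE(CUDA)
if (CUDA_FOUND)
  # Compile the CUDA sources with C++11 support.
  list(APPEND CUDA_NVCC_FLAGS "-std=c++11")
endif()
```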