OpenDetection (OD) is a standalone open source project for object detection and recognition in images and 3D point clouds. The support CNN based classifiers and object detection methods with Caffe backend were added as part of Google Summmer of Code 2017. All related code can be found at https://github.com/gautamMalu/opendetection.git
- Interfacing of Caffe in OpenDetection framework
- Addition of CNN classifier class (ODClassifier.h)
- Providing methods to convert data from folder/list into LMDB format
- Addition of CNN training module (ODConvTrainer.h)
- Addition of CNN Object Detection module (ODConvDetector.h)
- Addition of methods to claculate mAP (mean Average Precision) for evaluation of object detection methods
Detailed descriptions of these are in section titled Detailed Descriptions.
The compilation and installation of OD is fairly simple. You first have to install its dependencies before compiling. You can find the instructions for installation here (http://opendetection.com/installation_instruction.html).
Opendetection fork with all the CNN moduels: https://github.com/gautamMalu/opendetection.git
Apart from dependencies mentioned in above link, OD also require Caffe for CNN modules. We are using this fork of Caffe (https://github.com/gautamMalu/caffe-faster-rcnn/) as official Caffe does not support faster-rcnn models for object detection. Build this Caffe and then run
make distribute
Required Settings in OD build system:
in cmake/od_mandatory_dependency.cmake
set(Caffe_DISTRIBUTE_DIR "YOUR_PATH_TO_CAFFE/caffe-faster-rcnn/distribute"
Apart from Caffe some othere dependencies (Glog, LMDB, GFlags) were also added, however they are dependecies of Caffe also, so they must already be in your system.
We have added some demos to showcase the usage of CNN modules.
To run this example, first download the CUB dataset and Alexnet model, assuming running these commands from OD root directory,
and the build directory is opendetection/build
./examples/classification/cub/get_cub.sh
Convert CUB dataset into LMDB format using dataset modules for training
./examples/classification/cub/create.cub.sh
Start finetuing of Alexnet for CUB
./build/examples/classification/cub/train_cub ./examples/classifcation/cub/solver.prototxt ./examples/classifcation/cub/bvlc_reference_caffenet.caffemodel
Classify an Image from CUB dataset from trained model
./build/examples/classification/cub/classify_cub model_weight_file image_src
In this example we use pre-trained faster-rcnn models which are trained on PASCAL VOC dataset and generate video feeds with overlaid detected objects.
For this first we have to download the faster-rcnn models.
./examples/objectdetector/fetch_faster_rcnn_models.sh
Once we have downloaded the modeles, run this command to run the demo with ZF net
./build/examples/objectdetector/od_faster_rcnn_cam ./examples/objectdetector/faster_rcnn_models/zf/test.prototxt ./examples/objectdetector/faster_rcnn_models/ZF_faster_rcnn_final.caffemodel
commit: https://github.com/gautamMalu/opendetection/commit/eac84d7304e305224b6705ec7c7e127c48acb0cb
I have added a FindCaffe.cmake script for the Caffe dependency, for other dependecies also I have respective cmake scripts in cmake/modules/
path (FindGFlags.cmake, FindGlog.cmake, FindLMDB.cmake), these cmake scripts were adopted from Caffe.
Required Settings in OD build system are either we set caffe_dsitrbute_dir path in cmake/od_mandatory_dependency.cmake
as
set(Caffe_DISTRIBUTE_DIR "YOUR_PATH_TO_CAFFE/caffe-faster-rcnn/distribute"
Or in cmake we can specify the Caffe_INCLUDE_DIRS which should be YOUR_PATH_TO_CAFFE/distribute/include
and Caffe_LIBRARIES which should be YOUR_PATH_TO_CAFFE/distribute/lib/libcaffe.so
commit: https://github.com/gautamMalu/opendetection/commit/0e2e516fd730c927d29f96bc6c5c7493210aae5d
First I have made a od::ODClassifer class derived from od::ObjectDetector class and then derived the od::ODConvClassifier class from this base classfier. This classifier class doesn't have training method, that is convered in a separate od::ODConvTrainer module.
examples of usage of this module: mnist and cub.
usage examples: cub example, mnist example.
Usage of Classifier class as shown in cub example:
first declare a classfier
od::g2d::ODConvClassifier *cub = new od::g2d::ODConvClassifier();
Then initiate the classifier with model definition and trained model weight file as:
cub->initClassifier(model_def, weight_file);
then set the mean Image file as:
cub->setMeanFromFile(mean_file);
now to test this classfier's accuracy on list of images along with its labels for top k classes:
float acc = cub->test(Images_ROOT_Folder,file_WITH_LIST_OF_IMAGES_WITH_LABELS,top_k);
TODO: Batch Classification
commits: https://github.com/gautamMalu/opendetection/commit/1556714b3921960f90559f7d632a684e10c71ec6
https://github.com/gautamMalu/opendetection/commit/4755e6434ce0eac2b0e07790e78e0fdacb20c099
I have added different dataset modules as different dataset class, all dataset classes have methods for converting data into LMDB, computing mean image. List of dataset classes implemented: ODDatasetCIFAR.h, ODDatasetFolder.h, ODDatasetList.h, ODDatasetMNIST.h
usage examples: cub example and mnist example
Usage of dataset module as shown in mnist:
In this examples we will create a LMDB dataset from the mnist dataset, this mnist data is in image formate seperated in two folder for training and testing. In both training and testing folders we have separate folders for differet classes.
first we will declare a datasetfolder class instance with lmdb backend:
od::DatasetFolder *mnist = new od::DatasetFolder("","lmdb");
then to create the training_LMDB
dataset from the training_folder
with shuffling:
mnist->convert_dataset(training_folder,training_LMDB,true);
compute mean Image (mnist_mean.binraryproto) from the training data:
mnist->compute_mean_image(training_LMDB, mnist_mean.binraryproto);
Now, create the testing_LMDB
data from testing_folder
without shuffling:
mnist->convert_dataset(testing_folder,testing_LMDB,false);
TODO: Add modules to interface VOC,COCO datasets into OD library. Add modules to convert dataset into other data formats apart from LMDB like HDF5
commits: https://github.com/gautamMalu/opendetection/commit/2601d212cf1e5474e5d9c35ff36f726a2d630aa3
I have addded a seperate od::ODConvTrainer.h module for the training of different CNN based classifiers and Detectors. This class has methods for training, finetuning from a pre-trained model and resume training from the given solver state, and to test a trained model from train_val.protoxt for the given number of test iterations.
usage examples: cub example, mnist example
usage of trainer module as shown in mnist example:
first declare instace of ODConvTrainer class
od::g2d::ODConvTrainer *mnist_trainer = new od::g2d::ODConvTrainer("","");
set its solver parameters for training, here are the solver parameters for mnist {learning rate:0.005, learning rate policy:fixed,max number of iterations: 10000, snapshot interval: 5000, snapshot prefix:"examples/classification/mnist/lenet", train_val model definition file:"examples/classification/mnist/lenet_train_test.prototxt"}
mnist_trainer->setSolverParameters("examples/classification/mnist/lenet_train_test.prototxt",
0.005, "fixed",10000,
5000, "examples/classification/mnist/lenet");
For detailed list of parameters please checkout this method, for setting more complex solver parameter please use setSolverParametersFromFile.
Once we have set the solver parameters we can start training with startTraining method as:
mnist_trainer->startTraining();
TOOD: Provide way of setting different optimization methods apart from SGD like Adam, AdaGrad, SGD with momentum, different learning rate policies from setSolverParameters method.
commits: https://github.com/gautamMalu/opendetection/commit/f87ca13485648df7460e96b1d094145da57c52cd I have added od::g2d::ODConvDetector class derived from generic ODDetector2D class. For this custom class I have implemented the omniDetection method, and initDetector method.
usage examples: live faster rcnn webcam example, faster rcnn example for single image.
usage as shown in live web cam example In this example we run a pre-trained faster-rcnn model which was trained on PASCAL VOC dataset on our webcam feed, and show the detected objects in the same feed with bounding boxes. First we create an instance of ODConvDetector class
od::g2d::ODConvDetector *faster_rcnn = new od::g2d::ODConvDetector();
Then we set its various configuration options for faster-rcnn model like rpn net parameters, NMS threnshold, anchors etc with config json file.
string conf = "examples/objectdetector/faster_rcnn_models/config/voc_config.json";
faster_rcnn->setConfig(conf);
After that we initiate the detector with model definition prototxt file, and model weights file.
faster_rcnn->initDetector(model_def, model_weight);
Once we have initiated the detector we can just get individual frames from the webcam using od::ODFrameGenetor and run this detector on each frame.
od::ODSceneImage * scene = frameGenerator.getNextFrame();
//Detect
od::ODDetections2D *detections = faster_rcnn->detectOmni(scene);
To showcase detections BBoxes overlaid on the frames along with class information
if(detections->size() > 0)
cv::imshow("Overlay", detections->renderMetainfo(*scene).getCVImage());
else
cv::imshow("Overlay", scene->getCVImage());
TODO: Provide methods for training of different CNNs detection models; Provide other detection models such as SSD, mobilenet-SSD; Write scripts to fetch pre-trained and custom trained models it would be similar to the script to fetch classification models. We might need to add some more layer in the custom caffe for SSD. We can also adopt/write scripts to convert models trained in different frameworks(TensorFlow, Torch) into Caffe models
commits: https://github.com/gautamMalu/opendetection/commit/f15b56d97590d949108f9d2cb9f0d401b571db2a
I have adopted this method from caffe-faster-rcnn's voc_ap method.
TODO: Add mAP calculation using voc_2007 metric; Add examples showcasing its working.
Gautam Malu - (https://github.com/gautamMalu)
Opendetection is licensed under the BSD license.
I am thankful to my mentors both Aditya Tiwari, Kripasindhu Sarkar who provided expertise that greatly assisted in this work.