With Google’s Tensorflow Object Detection API, one can choose among state-of-the-art models (Faster R-CNN, SSD, etc.) and train an object detector easily and efficiently. However, as of the day I am writing this post, the Tensorflow documentation does not yet seem to cover how to train an object detector on your own images.
Here is how I managed to train an object detector on my own image set, as a beginner. I hope this can be helpful.
First of all, make sure you have completely installed and validated the Tensorflow Object Detection API.
Prepare Data for Input
We need to prepare the data in a form the object detection trainer can read. For this part, you can also refer to my other post on github describing how to do this. Basically, we want to turn all the sample images and their associated annotation files into .record files for the object detection trainer to read.
- Create a data directory to contain your data: /path/to/data/
- Under the above directory, make two directories, ‘images’ and ‘annotations’, and under ‘annotations’ make another directory ‘xmls’
- Put all your sample images (.jpeg) under /path/to/data/images
- For the annotation .xml files (PASCAL VOC format), you can use this tool. Put all the xml files under /path/to/data/annotations/xmls.
- Create a label map .pbtxt file that contains your class information, something like the tensorflow example but with your own classes (see the example after this list)
- Download create_my_tf_record.py and put it under tensorflow/models/object_detection/. Then open a terminal, cd to that directory, and run the conversion command shown after this list
- You will then find the generated train.record and val.record files at /path/to/output
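For the label map, here is a minimal example for a hypothetical two-class detector (‘cat’ and ‘dog’ are placeholder class names; use your own):

    item {
      id: 1
      name: 'cat'
    }
    item {
      id: 2
      name: 'dog'
    }

As for the conversion command, the exact flags depend on how create_my_tf_record.py is written; modeled on the API’s own create_pet_tf_record.py, the invocation would look something like this (the flag names here are assumptions, so check the script):

    $ python create_my_tf_record.py \
        --data_dir=/path/to/data \
        --label_map_path=/path/to/label_map.pbtxt \
        --output_dir=/path/to/output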
Download a Pretrained Model for Transfer Learning
The rest of the post is very similar to the Tensorflow documentation example, but instead of uploading the training job to the cloud, I will focus on running the training job locally. You can use the COCO-pretrained Faster R-CNN with Resnet-101 model; untar the downloaded tarball and you will find the checkpoint files model.ckpt.* inside.
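For example, assuming the 11_06_2017 release (check the detection model zoo page for the current download link, as the URL below may have changed):

    $ wget http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_coco_11_06_2017.tar.gz
    $ tar -xzvf faster_rcnn_resnet101_coco_11_06_2017.tar.gz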
Configure the Object Detection Pipeline
We shall still follow the original Tensorflow example and use the predefined configuration file template. In tensorflow/models/object_detection/samples/configs, you can find the config file faster_rcnn_resnet101_pets.config. Open it with a text editor and modify these contents:
- num_classes: <your number of classes>
- search for PATH_TO_BE_CONFIGURED, and change to:
- fine_tune_checkpoint: "/path/to/faster_rcnn_resnet101_coco_11_06_2017/model.ckpt"
- train_input_reader, input_path: "/path/to/train.record"
- eval_input_reader, input_path: "/path/to/val.record"
- label_map_path: "/path/to/your/label_map.pbtxt" (the label map file you created earlier)
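After these edits, the relevant sections of the config should look roughly like this (the structure follows the pets template; the paths are placeholders for your own):

    model {
      faster_rcnn {
        num_classes: 2   # your number of classes
        ...
      }
    }
    train_config: {
      ...
      fine_tune_checkpoint: "/path/to/faster_rcnn_resnet101_coco_11_06_2017/model.ckpt"
    }
    train_input_reader: {
      tf_record_input_reader {
        input_path: "/path/to/train.record"
      }
      label_map_path: "/path/to/label_map.pbtxt"
    }
    eval_input_reader: {
      tf_record_input_reader {
        input_path: "/path/to/val.record"
      }
      label_map_path: "/path/to/label_map.pbtxt"
    }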
Start Training and Evaluation
For the training job, open a terminal, cd to tensorflow/models/, and run the command below (taken from the API's documentation on running locally):
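    $ python object_detection/train.py \
        --logtostderr \
        --pipeline_config_path=${PATH_TO_YOUR_PIPELINE_CONFIG} \
        --train_dir=${PATH_TO_TRAIN_DIR}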
where ${PATH_TO_YOUR_PIPELINE_CONFIG} points to the .config file you modified in the previous step and ${PATH_TO_TRAIN_DIR} is where the training checkpoint files and events will be saved.
For the evaluation job, open another terminal and, again from tensorflow/models/, run:
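    $ python object_detection/eval.py \
        --logtostderr \
        --pipeline_config_path=${PATH_TO_YOUR_PIPELINE_CONFIG} \
        --checkpoint_dir=${PATH_TO_TRAIN_DIR} \
        --eval_dir=${PATH_TO_EVAL_DIR}

where ${PATH_TO_EVAL_DIR} is the directory in which evaluation events will be saved.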
Note that the training and evaluation jobs are separate processes, and both will continue running until you kill them with a keyboard interrupt: CTRL+C.
While they are running, you can open up tensorboard and monitor the training progress. Simply run: $ tensorboard --logdir=${PATH_TO_TRAIN_DIR}
If your Tensorflow is built from source, you should run, from the Tensorflow root directory: ./bazel-bin/tensorflow/tensorboard/tensorboard --logdir=${PATH_TO_TRAIN_DIR}
Exporting the Tensorflow Graph
When you are done training, you should export the trained graph for inference by running, from tensorflow/models/, the command below (flag names may differ slightly between API versions; check export_inference_graph.py if it complains):
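    $ python object_detection/export_inference_graph.py \
        --input_type image_tensor \
        --pipeline_config_path ${PATH_TO_YOUR_PIPELINE_CONFIG} \
        --checkpoint_path ${PATH_TO_TRAIN_DIR}/model.ckpt-${CHECKPOINT_NUMBER} \
        --inference_graph_path output_inference_graph.pb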
where checkpoint_path points to ${PATH_TO_TRAIN_DIR}/model.ckpt-${CHECKPOINT_NUMBER} and inference_graph_path points to where you want to save the output_inference_graph.pb
Object Detection
Now you have your own model, and you can test it on images. Simply follow this demo, change the model and test images to your own, and voilà.
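For reference, here is a minimal sketch of the inference step in that demo (the paths are placeholders; the input/output tensor names are the standard ones in graphs exported by this API):

    import numpy as np
    import tensorflow as tf
    from PIL import Image

    # Load the frozen graph exported in the previous step
    detection_graph = tf.Graph()
    with detection_graph.as_default():
        graph_def = tf.GraphDef()
        with tf.gfile.GFile('/path/to/output_inference_graph.pb', 'rb') as f:
            graph_def.ParseFromString(f.read())
            tf.import_graph_def(graph_def, name='')

    with tf.Session(graph=detection_graph) as sess:
        # Read a test image as a uint8 array and add a batch dimension
        image = np.expand_dims(np.array(Image.open('/path/to/test.jpg')), axis=0)
        # Fetch the detection outputs by their standard tensor names
        boxes, scores, classes, num = sess.run(
            [detection_graph.get_tensor_by_name('detection_boxes:0'),
             detection_graph.get_tensor_by_name('detection_scores:0'),
             detection_graph.get_tensor_by_name('detection_classes:0'),
             detection_graph.get_tensor_by_name('num_detections:0')],
            feed_dict={detection_graph.get_tensor_by_name('image_tensor:0'): image})
        print(scores[0][:5], classes[0][:5])  # scores and class ids of the top detections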
A final thought: since this API is officially developed by Google, I assume that it can be easily applied to Android mobile apps. It should be interesting to bring computer vision to the mobile platform.