Encountered while reading “STN-OCR: A Single Neural Network for Text Detection and Text Recognition”, which adopts spatial transformer networks.
This video explains very clearly how a spatial transformer works. Although I didn’t fully understand the interpolation equations, the other parts were clear. At the end, the video briefly compares the spatial transformer with deformable convolutional networks, which is interesting.
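For the interpolation step, here is a minimal sketch of the bilinear sampling kernel as I understand it from the spatial transformer paper: each output value is a weighted sum of input pixels, with weight max(0, 1 - |x - m|) * max(0, 1 - |y - n|) for the pixel at column m and row n. The function name `bilinear_sample` and the tiny 2x2 image are my own illustration, not taken from the video or the paper's code.

```python
import math


def bilinear_sample(image, x, y):
    """Sample a 2-D grid (list of rows) at a real-valued point (x, y).

    x indexes columns and y indexes rows. Pixels outside the grid
    contribute nothing (an assumption made here for simplicity).
    """
    height, width = len(image), len(image[0])
    x0, y0 = math.floor(x), math.floor(y)
    value = 0.0
    # Only the four pixels surrounding (x, y) can have a non-zero weight.
    for row in (y0, y0 + 1):
        for col in (x0, x0 + 1):
            if 0 <= row < height and 0 <= col < width:
                weight = max(0.0, 1 - abs(x - col)) * max(0.0, 1 - abs(y - row))
                value += weight * image[row][col]
    return value


image = [[0, 10],
         [20, 30]]
print(bilinear_sample(image, 0.5, 0.5))  # 15.0, the average of all four pixels
```

Because the weights change smoothly with x and y, gradients can flow back to the sampling coordinates, which is what lets the transformation parameters be learned.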
Two single-class training attempts were made with darkflow YOLO v2 trained from scratch: one successfully produced reliable bounding boxes and the other failed to produce even one. The successful case was a single-class ‘car’ detector; the failed case was a ‘face’ detector. The training results do not make sense, and this post documents the erratic behavior.
Here is the label info for various models. The YOLO9000 labels are here as well, but the weights are not in this repo.
For YOLO9000, refer to this repo: https://github.com/philipperemy/yolo-9000
YOLOv2: 2016.12.25
SSD: 2016.12.29 (last revised); first submitted 2015.12.8
YOLOv2 is reported to outperform SSD, according to the YOLOv2 paper.
I’m not yet sure what mobilenet is. It just seems like a bunch of techniques to reduce and optimize a model for embedded devices.