is concerned with the theory and technology for building artificial systems that obtain information from images or multi-dimensional data.
Object Detection is used almost everywhere these days. The use cases are endless, be it Tracking objects, Video surveillance, Pedestrian detection, Anomaly detection, People Counting, Self-driving cars or Face detection, the list goes on.
A convolutional neural network (CNN) is mainly for image classification. While an R-CNN, with the R standing for region, is for object detection. A typical CNN can only tell you the class of the objects but not where they are located.
Fast R-CNN and faster R-CNN for faster speed object detection.
R-CNN takes a huge amount of time to train the network and cannot be implemented in real time as it takes many seconds for each test imageFast R-CNN
The approach to Fast R-CNN is similar to the R-CNN algorithm. But, instead of feeding the region proposals to the CNN, we feed the input image to the CNN to generate a convolutional feature map.
It faster than R-CNN, because you don't have to feed 2000 region proposals to the convolutional neural network every time.
Similar to Fast R-CNN, the image is provided as an input to a convolutional network which provides a convolutional feature map. Instead of using a selective search algorithm on the feature map to identify the region proposals, a separate network is used to predict the region proposals.R-FCN
Positive-Sensitive Score Maps (Object Detection)
In contrast to previous region-based detectors such as Fast/Faster R-CNN that apply a costly per-region subnetwork hundreds of times, our region-based detector is fully convolutional with almost all computation shared on the entire image. To achieve this goal, R-FCN propose position-sensitive score maps to address a dilemma between translation-invariance in image classification and translation-variance in object detection