Object Detection (DL#112)

While going through the deep learning lecture on object detection, I had a question.
After the “region proposal” step, if we crop the proposed region and feed only that crop to the CNN, how can the network regress a correction to the proposed region, given that it now sees only the cropped image and not the original? Please help me resolve this.

Do you mean creating a bounding box on the original image?

The term used is “crop out” the region. I am not sure whether that means drawing a bounding box around the region or literally cropping it out of the image. The literal reading is what raises my question.
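
On the regression part of the question: in R-CNN-style detectors the regressor usually predicts offsets relative to the proposal box rather than absolute coordinates, so a literal crop can still work as long as the proposal's coordinates are kept alongside it. Below is a minimal sketch of that idea, not necessarily what the lecture implements. The function name `apply_box_deltas` and the (tx, ty, tw, th) layout are assumptions made for illustration.

```python
import math

def apply_box_deltas(proposal, deltas):
    """Refine a proposal box using offsets predicted from the crop.

    proposal: (x1, y1, x2, y2) in original-image pixels (kept from the
              region-proposal step, not recovered from the crop).
    deltas:   (tx, ty, tw, th), assumed to follow the common R-CNN-style
              parameterization: centre shifts scaled by box size, and
              log-scale changes for width and height.
    """
    x1, y1, x2, y2 = proposal
    tx, ty, tw, th = deltas

    # Proposal width, height and centre
    pw, ph = x2 - x1, y2 - y1
    pcx, pcy = x1 + 0.5 * pw, y1 + 0.5 * ph

    # Shift the centre and rescale the size
    cx = pcx + tx * pw
    cy = pcy + ty * ph
    w = pw * math.exp(tw)
    h = ph * math.exp(th)

    # Back to corner format, still in original-image coordinates
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)
```

With all-zero deltas the proposal comes back unchanged. A positive `tx` moves the box to the right by that fraction of the proposal's width. The network only has to say "shift and scale by this much" from what it sees inside the crop, and the correction is then applied to the stored proposal coordinates.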

In the context of the capstone project, you can go either way:

  1. Crop the region out of the image, then run text recognition and transliteration on the crop.
  2. Draw a bounding box on the image, then run text recognition on only the selected part, followed by transliteration.
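
The two options differ mainly in whether the box coordinates are kept after the pixels are selected. Here is a rough sketch of both, using a plain list-of-rows image so it runs without extra libraries. The helper names `crop_region` and `to_image_coords` are hypothetical, and the text recognition step itself is left out.

```python
def crop_region(image, box):
    """Option 1: literally cut the region out of the image.

    image: list of rows, each row a list of pixel values.
    box:   (x1, y1, x2, y2) with x2 and y2 exclusive.
    """
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]

def to_image_coords(point_in_crop, box):
    """Option 2 relies on keeping the box, so that anything located
    inside the crop can be mapped back onto the original image."""
    x1, y1, _, _ = box
    px, py = point_in_crop
    return (px + x1, py + y1)

# Dummy 100-row by 200-column image whose pixel value encodes its position
image = [[(x, y) for x in range(200)] for y in range(100)]
box = (30, 40, 90, 70)

crop = crop_region(image, box)
crop_height, crop_width = len(crop), len(crop[0])  # 30 rows, 60 columns

# A point found at (5, 10) inside the crop sits at (35, 50) in the original
original_point = to_image_coords((5, 10), box)
```

If the image is a NumPy array, the same crop is the slice `image[y1:y2, x1:x2]`. In both options the recognizer sees the same pixels. Keeping the box only matters when results need to be drawn on or related back to the full image.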