DL Capstone Project: Cropping bounding-box images


With respect to the DL-19 Capstone Project, I’d like to know:

  1. How to proceed once I’ve detected the text in an image and successfully enclosed it in a bounding box.

  2. After cropping the bounding box out of the processed image, the next task is to extract the string; I’d like to do a little brainstorming on that.

Awaiting Reply :slightly_smiling_face:

Hi @Bilal_Aamer,
I’ve assigned your problem to one of our moderators who has been working on the Capstone project; he is likely the best person to answer queries of this kind. He will be available in a few days and will get back to you soon. Hope you can understand.

Hi @Bilal_Aamer,

Firstly, you have to resize the cropped-out images to a uniform shape. Each resized image is labelled with the text it contains.
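To make the resizing step concrete, here is a minimal sketch using nearest-neighbour sampling in plain NumPy (in practice you would likely use `cv2.resize` or `PIL.Image.resize`; the function name `resize_nearest` and the 32×128 target shape are my own illustrative choices, not requirements of the project):

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Resize a 2-D grayscale image to (out_h, out_w) by nearest-neighbour sampling."""
    in_h, in_w = img.shape
    # Map each output row/column back to its nearest source row/column.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return img[rows[:, None], cols]

# Example: crops of different sizes all end up as 32x128 arrays.
crop = np.arange(12, dtype=np.uint8).reshape(3, 4)  # a tiny stand-in "crop"
patch = resize_nearest(crop, 32, 128)
assert patch.shape == (32, 128)
```

Whichever resizing routine you use, the point is that every cropped word image must end up with the same height and width before being batched into the recognition network.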

You can use a Convolutional Recurrent Neural Network (CRNN) with CTC loss to train a text recognition model here. The CRNN uses convolutional layers to extract features from the image; the resulting spatially sequential feature map is passed through recurrent layers, which output a probability distribution over characters at each time step. You then combine the per-step character predictions to get the final text.
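That last step (combining per-step predictions) is CTC best-path, or greedy, decoding: take the argmax character at each time step, collapse consecutive repeats, then drop the blank symbol. A minimal pure-Python sketch, assuming the blank class sits at index 0 (the function name and index convention are my own choices, not from the capstone starter code):

```python
def ctc_greedy_decode(prob_matrix, alphabet, blank=0):
    """Best-path CTC decoding.

    prob_matrix: list of per-step probability rows, shape (timesteps, num_classes),
                 where class 0 is the CTC blank and class i>0 maps to alphabet[i-1].
    """
    # 1. Pick the most probable class at each time step.
    best_path = [max(range(len(step)), key=step.__getitem__) for step in prob_matrix]
    # 2. Collapse consecutive repeats, then 3. drop blanks.
    out, prev = [], None
    for idx in best_path:
        if idx != prev and idx != blank:
            out.append(alphabet[idx - 1])
        prev = idx
    return "".join(out)

# Example: 4 time steps over classes (blank, 'a', 'b').
probs = [
    [0.1, 0.8, 0.1],   # 'a'
    [0.1, 0.8, 0.1],   # 'a' (repeat, collapsed)
    [0.9, 0.05, 0.05], # blank
    [0.1, 0.1, 0.8],   # 'b'
]
print(ctc_greedy_decode(probs, "ab"))  # → ab
```

Note that greedy decoding is only an approximation; beam-search CTC decoding usually gives slightly better accuracy, but best-path is the simplest way to get readable strings out of the network.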

For starters, please check this code that implements Captcha Text Recognition using CRNN