- I have completed the text detection part (using detectron2).
- Milestone 2 (image to text): no issues in the CNN and RNN/LSTM part.
- I am facing an issue with the CTC loss part. The PyTorch documentation didn't help me understand it. How do I provide the targets (ground truth) to the CTC loss function? If it accepts some kind of encoding as input, how do I encode the ground truth (say, a Hindi word in our case)?
Kindly clarify this doubt to help me progress further.
Thanks in advance
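
For anyone landing on the same question, here is a minimal sketch of one common way to encode targets for `torch.nn.CTCLoss`: map each character to an integer index, keep index 0 for the blank, and pass the indices together with the length of each word. Several things here are assumptions and not part of the original setup: the two sample words, the helper names (`build_vocab`, `encode`, `decode`, `ctc_loss_for_batch`), the choice of one class per Unicode code point (so Hindi matras and the virama each get their own index), and the random tensor standing in for the CNN + LSTM output.

```python
# Hypothetical ground-truth words; replace with your own labels.
WORDS = ["नमस्ते", "भारत"]

BLANK = 0  # nn.CTCLoss uses index 0 as the blank by default


def build_vocab(words):
    """Map every distinct code point to an integer index starting at 1.

    Index 0 is left free for the CTC blank.
    """
    chars = sorted({ch for word in words for ch in word})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(words, vocab):
    """Return (flat list of indices for all words, list of per-word lengths)."""
    flat = [vocab[ch] for word in words for ch in word]
    lengths = [len(word) for word in words]
    return flat, lengths


def decode(indices, vocab):
    """Turn a list of indices back into a string (inverse of encode)."""
    inverse = {i: ch for ch, i in vocab.items()}
    return "".join(inverse[i] for i in indices)


def ctc_loss_for_batch(words, vocab, time_steps=32):
    """Compute CTC loss for a batch of words against a stand-in model output."""
    # Imported here so the encoding helpers above work without PyTorch.
    import torch
    import torch.nn as nn

    flat, lengths = encode(words, vocab)
    batch = len(words)
    num_classes = len(vocab) + 1  # +1 for the blank

    # Stand-in for the CNN + LSTM output: log-probabilities of shape (T, N, C).
    log_probs = torch.randn(time_steps, batch, num_classes).log_softmax(2)

    # Targets as one 1-D tensor: all words concatenated, no padding needed.
    targets = torch.tensor(flat, dtype=torch.long)
    # Number of valid time steps per sample (here every sample uses all of them).
    input_lengths = torch.full((batch,), time_steps, dtype=torch.long)
    # Number of target indices belonging to each sample.
    target_lengths = torch.tensor(lengths, dtype=torch.long)

    criterion = nn.CTCLoss(blank=BLANK, zero_infinity=True)
    return criterion(log_probs, targets, input_lengths, target_lengths)
```

With this sketch, `vocab = build_vocab(WORDS)` followed by `ctc_loss_for_batch(WORDS, vocab)` returns a scalar loss tensor. The model's last layer would need `len(vocab) + 1` output classes, and each sample's input length must be at least as long as its target. Targets can alternatively be passed as a padded 2-D tensor of shape `(N, S)`; the concatenated 1-D form is used here only because it avoids choosing a padding value.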