Hello guys, I want to use YOLO (or another object detection algorithm) for a text localisation problem in my Deep Learning capstone project, but I don't know how to proceed. I couldn't find any proper tutorials. I also couldn't find a ready-made library for YOLO, so do I need to implement it from scratch, or is there some other workaround?
One more question.
When we scale these images (images with text on them) to fit the input size of a particular CNN, do we also have to scale the coordinates of the words in them? Also, do these coordinates represent pixel indices? If so, I would expect them to be integers, but they are given as decimal (floating-point) values.
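
To make the scaling question concrete, here is a minimal sketch of what I think it would look like. I am assuming the annotations are boxes stored as (x_min, y_min, x_max, y_max) in pixel units of the original image; the function names `scale_box` and `to_pixel_box` are just ones I made up for illustration, not from any library.

```python
def scale_box(box, orig_size, new_size):
    """Scale a box from the original image size to the resized image size.

    box:       (x_min, y_min, x_max, y_max), assumed to be in pixel units
    orig_size: (width, height) of the original image
    new_size:  (width, height) of the resized image fed to the CNN
    """
    sx = new_size[0] / orig_size[0]  # horizontal scale factor
    sy = new_size[1] / orig_size[1]  # vertical scale factor
    x_min, y_min, x_max, y_max = box
    return (x_min * sx, y_min * sy, x_max * sx, y_max * sy)


def to_pixel_box(box):
    """Round float coordinates to integer pixel indices (only needed for drawing or cropping)."""
    return tuple(int(round(v)) for v in box)


# Hypothetical example: an 800x600 image resized to 400x300.
scaled = scale_box((100.0, 50.0, 300.0, 150.0), (800, 600), (400, 300))
print(scaled)
```

If that is right, the floating-point values would just be sub-pixel positions, and rounding would only matter when indexing into the image array.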