A start towards Capstone Project

I know there are a lot of threads on this topic, and I have read all of them, but I am still confused about how to start the Capstone project. I have read the AIBharat article about the project and have an idea of the project structure, but I am struggling to implement it or even get started. What should I write code for first? The dataset provided is also confusing to me. Could a mentor, or any participant who is doing or has completed the project, give a rough approach or a list of steps for getting started?

Basically, you are given the following datasets:

  • Text Detection dataset for Devanagari script (used by languages like Hindi, Marathi, etc.)
  • Text Recognition dataset for Devanagari
  • Transliteration dataset for Hindi to English script conversion

You are supposed to build the following system:

Given an image of a signboard, say a tea stall banner near your home or a railway station board in Hindi, you are supposed to extract the Devanagari text in it and transliterate it to English script.
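
To make the overall flow concrete, here is a rough sketch of how the three pieces could be chained together once each one is trained. It is only a skeleton under assumptions, not the official project code: `SignboardReader`, `crop_box`, and the three `fake_*` functions are hypothetical names made up for illustration, the image is a plain list of pixel rows instead of a real image array, and the stubs simply stand in for the detector, recognizer, and transliteration models you would train on the three datasets.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels
Image = List[List[int]]           # placeholder: grayscale image as rows of pixel values


def crop_box(image: Image, box: Box) -> Image:
    """Cut the region inside `box` out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


@dataclass
class SignboardReader:
    """Chains the three models; each field is a callable you train separately."""
    detect: Callable[[Image], List[Box]]   # full image -> text boxes
    recognize: Callable[[Image], str]      # cropped word image -> Devanagari string
    transliterate: Callable[[str], str]    # Devanagari word -> English-script word

    def read(self, image: Image) -> List[str]:
        # Sort boxes top-to-bottom, then left-to-right, to approximate reading order.
        boxes = sorted(self.detect(image), key=lambda b: (b[1], b[0]))
        words = []
        for box in boxes:
            crop = crop_box(image, box)
            words.append(self.transliterate(self.recognize(crop)))
        return words


# Stub components (assumptions) so the skeleton runs before any model exists.
def fake_detect(image: Image) -> List[Box]:
    return [(2, 0, 4, 2), (0, 0, 2, 2)]


def fake_recognize(crop: Image) -> str:
    return "चाय" if crop[0][0] == 1 else "दुकान"


def fake_transliterate(word: str) -> str:
    return {"चाय": "chai", "दुकान": "dukaan"}[word]


image = [[1, 1, 2, 2],
         [1, 1, 2, 2]]
reader = SignboardReader(fake_detect, fake_recognize, fake_transliterate)
print(reader.read(image))
```

Starting with stubs like these lets you build and test the glue code first, then swap in each real model one at a time as you finish training it.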

Hence, you will first have to train a text detector & recognizer for Hindi as suggested here.
You can even use those repos / code as a starting point.

Before that, you are encouraged to read up on how people generally go about building text detection and text recognition models using deep learning.

Some articles to start with:

Thank you @GokulNC ! I