Dataloader for text recognition crop images

How can I create a dataloader that returns cropped images? A simple custom dataloader in PyTorch returns the original training images, but each image requires multiple crops to extract the text.

I want to create a dataloader which returns only the cropped images in batches.
Please help.

Maybe you can write a custom DataLoader for this…
You can refer to this for some help.

I need help returning multiple cropped images from a single original image,
since a custom dataloader returns only a single original image for a given index.

How can I modify the dataloader to return all cropped images of the original image, with batching?

Are you looking for something that returns a batch of multiple images?

Yes, I want to return multiple cropped images for a given batch.
For example, if batch_size = 4, the dataloader should return all cropped images from the 4 original images.

You can use torchvision.transforms.RandomResizedCrop for this.

I don’t want to crop the original image itself. I want to get all the cropped regions containing text from the original image, in batches, since one original image contains many text regions.

Okay, maybe this can help you out then; otherwise you need to train an object detection model (treating text as objects).
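
If the text-region boxes are already available (from annotations or from a detector), here is a minimal sketch of a dataset that returns every crop of one image, plus a collate function that joins the crops of all images in a batch. The class name, the (x1, y1, x2, y2) box format, the fixed crop size and the synthetic images are all assumptions for illustration, and it assumes every image has at least one box.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader


class TextCropDataset(Dataset):
    """Returns every text crop of one image, resized to a fixed size."""

    def __init__(self, images, boxes, crop_size=(32, 128)):
        # images: list of float (C, H, W) tensors (hypothetical, already loaded).
        # boxes: one list of (x1, y1, x2, y2) pixel boxes per image (assumed format).
        self.images = images
        self.boxes = boxes
        self.crop_size = crop_size

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image = self.images[idx]
        crops = []
        for x1, y1, x2, y2 in self.boxes[idx]:
            crop = image[:, y1:y2, x1:x2].unsqueeze(0)  # (1, C, h, w)
            # Resize so that crops of different sizes can share one tensor.
            crop = F.interpolate(crop, size=self.crop_size,
                                 mode="bilinear", align_corners=False)
            crops.append(crop)
        return torch.cat(crops, dim=0)  # (num_crops, C, H_out, W_out)


def collate_crops(batch):
    # batch is a list of (num_crops_i, C, H, W) tensors, one per original image.
    return torch.cat(batch, dim=0)


# Synthetic data standing in for real images and annotations.
images = [torch.rand(3, 100, 200) for _ in range(4)]
boxes = [
    [(0, 0, 50, 20), (10, 30, 90, 60)],
    [(5, 5, 100, 40)],
    [(0, 0, 30, 30), (40, 40, 80, 70), (100, 10, 180, 50)],
    [(20, 20, 120, 60)],
]

loader = DataLoader(TextCropDataset(images, boxes),
                    batch_size=4, collate_fn=collate_crops)
batch = next(iter(loader))
print(batch.shape)  # all 7 crops from the 4 original images
```

With batch_size = 4 the loader reads 4 original images per step, so the number of crops per batch varies with how many text regions those images contain.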

An object detector will process only 1 image at a time.
Since the training data contains ~100,000 images, using a batch size of 1 will be very slow.
Is there any way to process the data in bigger batches to speed things up?

I’m not sure, but you could check whether OpenCV has a function for this.
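
On the batch size question: torchvision's detection models (Faster R-CNN, for example) take a list of image tensors and a list of target dicts, so a batch does not have to be a single stacked tensor. A rough sketch of a DataLoader that batches variable-sized images this way is below; the dataset class, image sizes and boxes are made up for illustration.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyDetectionDataset(Dataset):
    """Hypothetical dataset: images of different sizes, each with its own boxes."""

    def __init__(self, sizes, boxes):
        self.sizes = sizes  # list of (H, W)
        self.boxes = boxes  # list of lists of (x1, y1, x2, y2)

    def __len__(self):
        return len(self.sizes)

    def __getitem__(self, idx):
        h, w = self.sizes[idx]
        image = torch.rand(3, h, w)  # stands in for loading a real image
        box_tensor = torch.tensor(self.boxes[idx], dtype=torch.float32)
        target = {
            "boxes": box_tensor,
            # Single "text" class, labelled 1 (0 is conventionally background).
            "labels": torch.ones(len(box_tensor), dtype=torch.int64),
        }
        return image, target


def detection_collate(batch):
    # Do not stack: image sizes and box counts differ between samples.
    images, targets = zip(*batch)
    return list(images), list(targets)


sizes = [(100, 200), (80, 120), (64, 64), (150, 90)]
boxes = [
    [(0, 0, 50, 20), (10, 30, 90, 60)],
    [(5, 5, 100, 40)],
    [(0, 0, 30, 30), (10, 10, 40, 40), (20, 20, 60, 60)],
    [(20, 20, 80, 60)],
]

loader = DataLoader(ToyDetectionDataset(sizes, boxes),
                    batch_size=4, collate_fn=detection_collate)
images, targets = next(iter(loader))
print(len(images), [t["boxes"].shape[0] for t in targets])
```

The resulting lists could then be passed to a detection model in one call, which avoids a batch size of 1.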

Can you share the dataloader part of the capstone project solution?

You can use this as a reference.
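
Another possible layout, not taken from the capstone solution, is to index the dataset by text region instead of by image. Then batch_size counts crops directly and the default collate function works unchanged. The names, box format and synthetic data below are assumptions for illustration.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader


class FlatCropDataset(Dataset):
    """One dataset item per text region rather than per original image."""

    def __init__(self, images, boxes, crop_size=(32, 128)):
        self.images = images  # list of float (C, H, W) tensors (hypothetical)
        self.crop_size = crop_size
        # Flatten to (image_index, box) pairs so that len() counts crops.
        self.index = [
            (i, box) for i, img_boxes in enumerate(boxes) for box in img_boxes
        ]

    def __len__(self):
        return len(self.index)

    def __getitem__(self, idx):
        image_idx, (x1, y1, x2, y2) = self.index[idx]
        crop = self.images[image_idx][:, y1:y2, x1:x2].unsqueeze(0)
        crop = F.interpolate(crop, size=self.crop_size,
                             mode="bilinear", align_corners=False)
        # Also return the source image index so crops can be traced back.
        return crop.squeeze(0), image_idx


images = [torch.rand(3, 100, 200) for _ in range(3)]
boxes = [
    [(0, 0, 50, 20), (10, 30, 90, 60)],
    [(5, 5, 100, 40)],
    [(0, 0, 30, 30), (40, 40, 80, 70)],
]

dataset = FlatCropDataset(images, boxes)
loader = DataLoader(dataset, batch_size=4, shuffle=False)
crops, image_ids = next(iter(loader))
print(crops.shape, image_ids.tolist())
```

The trade-off is that a batch no longer lines up with whole original images, which is usually fine when training a recognition model on the crops.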