How to process an XML dataset of bounding boxes for object detection?

I have images with corresponding XML files that contain the bounding-box annotations for object detection.

I don't understand how to extract that data and split it into train and test sets to feed into a model.
If I have JSON files instead, how do I do the same thing?

You can load the XML and JSON files into a pandas DataFrame, or load them directly as dictionaries.
For reference, see "Python XML to JSON, XML to Dict" on JournalDev.
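
To make the "load them as dictionaries" idea concrete, here is a minimal sketch using only the standard library. It assumes the XML files follow the Pascal VOC layout (an `<object>` element with a `<name>` and a `<bndbox>` holding `xmin`, `ymin`, `xmax`, `ymax`); your files may use different tags. The sample annotation, the function names `parse_voc_xml` and `split_by_image`, and the 80/20 split are all made up for illustration.

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical annotation in Pascal VOC style (assumed layout, not from the question).
SAMPLE_XML = """<annotation>
  <filename>img_001.jpg</filename>
  <object>
    <name>cat</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
  <object>
    <name>dog</name>
    <bndbox><xmin>300</xmin><ymin>50</ymin><xmax>400</xmax><ymax>150</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc_xml(xml_text):
    """Return one dict per bounding box found in a VOC-style annotation."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        row = {"filename": filename, "label": obj.findtext("name")}
        for key in ("xmin", "ymin", "xmax", "ymax"):
            # Some tools write coordinates as floats, so go through float first.
            row[key] = int(float(box.findtext(key)))
        rows.append(row)
    return rows


def split_by_image(rows, test_fraction=0.2, seed=0):
    """Split rows into train/test so all boxes of one image stay together."""
    filenames = sorted({r["filename"] for r in rows})
    random.Random(seed).shuffle(filenames)
    n_test = max(1, round(len(filenames) * test_fraction))
    test_files = set(filenames[:n_test])
    train = [r for r in rows if r["filename"] not in test_files]
    test = [r for r in rows if r["filename"] in test_files]
    return train, test


# Fake a dataset of five images by reusing the sample with different filenames.
all_rows = []
for i in range(5):
    all_rows += parse_voc_xml(SAMPLE_XML.replace("img_001", f"img_{i:03d}"))

train_rows, test_rows = split_by_image(all_rows)
```

For files on disk, `ET.parse(path).getroot()` gives the same kind of root element as `ET.fromstring`. The resulting list of dicts can be passed to `pandas.DataFrame(...)` if you prefer a table. For JSON annotations, `json.load` already returns dictionaries and lists, so only the key names (which depend on your annotation tool) need to be mapped to the same row format. Splitting by filename rather than by row keeps boxes from the same image out of both sets.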

How do I feed this data to Faster R-CNN?