Imbalanced dataset

The name2lang dataset is imbalanced. Because of this, the model predicts the language with high accuracy for languages that have many examples, but gets almost 0% accuracy on the other languages. How can we deal with this?

Hi @purushartha,
You can either add more data for the under-represented languages, or train on just a sample of the data from the dominating class so that the class sizes are closer to each other.
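
For the second option, here is a minimal sketch of sampling down the dominating class. It assumes the data is already loaded as a list of `(name, lang)` tuples; the function name `undersample` and its parameters are made up for illustration and are not part of the course code.

```python
import random
from collections import defaultdict


def undersample(pairs, max_per_class=None, seed=0):
    """Return a subset of (name, lang) pairs with at most max_per_class per language.

    If max_per_class is None, every language is cut down to the size of the
    smallest one. (Hypothetical helper, not from the original notebook.)
    """
    rng = random.Random(seed)

    # Group the examples by language label.
    by_lang = defaultdict(list)
    for name, lang in pairs:
        by_lang[lang].append((name, lang))

    if not by_lang:
        return []

    if max_per_class is None:
        max_per_class = min(len(items) for items in by_lang.values())

    balanced = []
    for items in by_lang.values():
        # Small classes are kept whole; large classes are randomly sampled.
        k = min(max_per_class, len(items))
        balanced.extend(rng.sample(items, k))

    # Shuffle so that examples of one language are not grouped together.
    rng.shuffle(balanced)
    return balanced
```

You would call this on the training split only, for example `train_data = undersample(train_data, max_per_class=500)`, and keep the test split untouched so the reported accuracy still reflects the real distribution. The cap of 500 is just an example value.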

