Hey, I am able to understand the concepts without any hassle, but I am unable to do projects on my own because I don't fully know how to convert given data into a dataset that a model can be applied to. How do I learn data preprocessing effectively?
Some options:
- If you are enrolled in the DL/FDS course, there are video lectures where data processing (loading a dataset, cleaning, and so on) is demonstrated and taught; you can follow along and practise.
- Or you can search Kaggle for a data preprocessing tutorial. Generally the dataset is available to practise on along with the tutorial. Here is one such link (there are many more):
  https://www.kaggle.com/ajay1216/practical-guide-on-data-preprocessing-in-python
- If you are looking for sample datasets, you can use the default datasets that ship with libraries like sklearn or seaborn:
  https://scikit-learn.org/stable/datasets/index.html#toy-datasets
  There is accompanying documentation on how to load such datasets.
- There is also the UCI repository if you want to look beyond toy datasets:
  https://archive.ics.uci.edu/ml/datasets.php
And if you run into issues while learning the preprocessing steps, you can post your doubts/questions here on the forum.
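To make the toy-dataset route concrete, here is a minimal sketch of going from a bundled sklearn dataset to arrays a model can be trained on. It assumes scikit-learn 0.23 or newer with pandas installed; the choice of the iris dataset, the 80/20 split, and standard scaling are just illustrative assumptions, not the only way to do it.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load a bundled toy dataset as pandas objects (no download needed).
iris = load_iris(as_frame=True)
X, y = iris.data, iris.target  # feature DataFrame, label Series

# First look at what might need cleaning: column types and missing values.
print(X.dtypes)
print(X.isna().sum())

# Hold out a test set before fitting any transformation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Fitting the scaler only on the training split keeps information from the test set out of the preprocessing step. The same pattern carries over when you swap in a Kaggle or UCI dataset loaded with pandas.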
Will enrolling in the FDS course help me overcome all problems related to preprocessing?
The necessary fundamentals of preprocessing are covered in the course, but you'll still need to practise them on different datasets.
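For a feel of what that practice usually looks like, here is a small sketch of common cleaning steps on a messy table using pandas. The table, its column names, and the choice of median imputation and one-hot encoding are all made up for illustration; this is not taken from the course material, and real datasets will need their own decisions.

```python
import numpy as np
import pandas as pd

# Hypothetical raw table, invented for illustration only.
raw = pd.DataFrame({
    "age": [25, np.nan, 47, 31],
    "city": ["Pune", "Delhi", None, "Pune"],
    "bought": ["yes", "no", "no", "yes"],
})

clean = raw.copy()

# Fill missing numeric values with the column median.
clean["age"] = clean["age"].fillna(clean["age"].median())

# Fill missing categories with an explicit placeholder value.
clean["city"] = clean["city"].fillna("unknown")

# Separate the target column and map its text labels to 0/1.
y = clean.pop("bought").map({"no": 0, "yes": 1})

# One-hot encode the remaining categorical column.
X = pd.get_dummies(clean, columns=["city"])

print(X)
print(y)
```

Trying the same steps on a few different datasets (Kaggle, UCI, or the toy ones) is a good way to see where each choice works and where it does not.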