What is the difference between a training set and a validation set? When we already have training and testing sets, why do we use a validation set, and what is its function?
For a plain difference:

Training set: the model learns from it (we care about the performance here and act on it).
Validation set: the model is evaluated on how well it has learned, so we can tune and select it (we care about the performance here and act on it).
Test set: the model makes its final predictions on unseen data (we only measure the performance; we no longer act on it).
Say we have 100 instances; then typical split ratios (train:validation:test) would be 70:20:10, 70:10:20, or 60:20:20.
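A 70:10:20 split like the one above can be sketched with scikit-learn's `train_test_split` (the variable names and the 100-instance toy data are illustrative only):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 100 instances with one feature each
X = np.arange(100).reshape(-1, 1)
y = np.arange(100)

# First carve off the 20% test set...
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0
)
# ...then split the remaining 80 instances into 70 train / 10 validation
# (0.125 * 80 = 10), giving a 70:10:20 split overall.
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.125, random_state=0
)

print(len(X_train), len(X_val), len(X_test))  # 70 10 20
```

Splitting in two stages keeps the test set untouched until the very end, which is exactly the point of holding it out.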
Why do we care about (i.e., exploit) the training and validation sets but not the test set?
The goal is to build a model with low bias and low variance, that is, one that exhibits both low training error (learning) and low validation error (evaluation). While building it, we can still control and improve its performance, because we are free to exploit the training and validation sets (e.g., by tuning hyperparameters against the validation error).
Once such a low-bias, low-variance model is found, we select it and apply it to new data: the test set. We do not exploit the test data, so at this stage we can no longer control or improve the model's performance; we only measure how well it generalizes.
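The workflow described above can be sketched as follows: tune a hyperparameter by exploiting the validation error, then touch the test set exactly once for the final, unchangeable score. The choice of ridge regression, the alpha grid, and the synthetic data are all illustrative assumptions, not part of the original answer:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic regression data: 100 instances, 5 features
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=100)

# 70:10:20 train/validation/test split
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.125, random_state=0
)

# Exploit the validation set: pick the alpha with the lowest validation error
best_alpha, best_val_err = None, float("inf")
for alpha in [0.01, 0.1, 1.0, 10.0]:
    model = Ridge(alpha=alpha).fit(X_train, y_train)
    val_err = mean_squared_error(y_val, model.predict(X_val))
    if val_err < best_val_err:
        best_alpha, best_val_err = alpha, val_err

# The test set is used exactly once, to report final performance.
final_model = Ridge(alpha=best_alpha).fit(X_train, y_train)
test_err = mean_squared_error(y_test, final_model.predict(X_test))
print(f"best alpha = {best_alpha}, test MSE = {test_err:.4f}")
```

Note that `test_err` is never fed back into model selection; if it were, the test set would effectively become a second validation set and would no longer estimate performance on truly new data.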