Model Fitness
What is bias in a model?
Systematic error: the difference between the model’s average prediction and the actual value
Model Fitness
What is variance in a model?
How much the fitted regression changes when it is trained on different sets of training data
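A minimal sketch of how bias and variance could be estimated by refitting the same kind of regression on many noisy training sets. The curve `true_fn`, the noise level, the polynomial degrees, and the helper name `bias_variance` are assumptions made for illustration; they do not come from the cards. For simplicity the training inputs stay fixed and only the noise on the targets changes between sets.

```python
import numpy as np

def true_fn(x):
    # Hypothetical ground-truth curve, chosen only for this illustration.
    return np.sin(2 * np.pi * x)

def bias_variance(degree, n_sets=200, n_points=20, noise=0.3, seed=0):
    """Estimate squared bias and variance of a polynomial fit of the given degree."""
    rng = np.random.default_rng(seed)
    x_train = np.linspace(0.0, 1.0, n_points)  # same inputs in every training set
    x_eval = np.linspace(0.05, 0.95, 50)       # points where predictions are compared
    preds = np.empty((n_sets, x_eval.size))
    for i in range(n_sets):
        # Each training set gets fresh noise on the targets.
        y_train = true_fn(x_train) + rng.normal(0.0, noise, n_points)
        coeffs = np.polyfit(x_train, y_train, degree)
        preds[i] = np.polyval(coeffs, x_eval)
    mean_pred = preds.mean(axis=0)
    # Bias: how far the average prediction sits from the true value.
    bias_sq = float(np.mean((mean_pred - true_fn(x_eval)) ** 2))
    # Variance: how much predictions move from one training set to the next.
    variance = float(np.mean(preds.var(axis=0)))
    return bias_sq, variance

bias_simple, var_simple = bias_variance(degree=1)
bias_flexible, var_flexible = bias_variance(degree=5)
print(f"degree 1: bias^2={bias_simple:.4f}, variance={var_simple:.4f}")
print(f"degree 5: bias^2={bias_flexible:.4f}, variance={var_flexible:.4f}")
```

Under these assumptions the straight line should show the larger squared bias and the degree-5 fit the larger variance.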
Model Fitness
What is a model’s “fit”?
How well it captures the pattern in the training data and carries that over to new data (has to do with training effectiveness)
Model Fitness
What are the two extremes and middle of the “fit” range?
Under-fit, balanced, over-fit
Underfit Model
What is an under-fit model?
The regression is too simple; it doesn’t capture enough of the pattern in the data to be effective.
Underfit Model
Bias and variance for an under-fit model?
Bias is high and the variance is low
Underfit Model
What is a 2-D visual for an under-fit regression?
Straight line through observations that follow a curve – too general, bad predictor
Underfit Model
How does an under-fit model do predicting from training data?
Poor – the model isn’t specific enough
Underfit Model
How does an under-fit model do predicting from new data it hasn’t seen before?
Also poor – the model isn’t specific enough
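A small sketch of the under-fit case: a straight line fitted to data that follows a curve. The quadratic data generator, the noise level, and the names `make_data` and `mse` are illustrative assumptions, not part of the cards.

```python
import numpy as np

rng = np.random.default_rng(0)
noise_std = 0.5  # assumed noise level for this illustration

def make_data(n):
    # Hypothetical curved relationship: y = x^2 plus noise.
    x = rng.uniform(-3.0, 3.0, n)
    y = x ** 2 + rng.normal(0.0, noise_std, n)
    return x, y

def mse(coeffs, x, y):
    # Mean squared error of a fitted polynomial on (x, y).
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

x_train, y_train = make_data(50)
x_new, y_new = make_data(50)

# Degree 1 is a straight line: too simple for a parabola.
line = np.polyfit(x_train, y_train, 1)

train_mse = mse(line, x_train, y_train)
new_mse = mse(line, x_new, y_new)
print(f"training MSE: {train_mse:.2f}")
print(f"new-data MSE: {new_mse:.2f}")
print(f"noise floor:  {noise_std ** 2:.2f}")
```

Both errors should sit well above the noise floor, matching the cards above: the line predicts poorly on the training data and on data it has not seen.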
Overfit Model
What is an over-fit model?
The regression is too complex; it fits the training data too tightly, noise included.
Overfit Model
Bias and variance for an over-fit model?
Bias is low and the variance is high
Overfit Model
What is a 2-D visual for an over-fit regression?
Regression line directly through every training point – it’s fit to every outlier!
Overfit Model
How does an over-fit model do predicting from training data?
Near-perfect – it has effectively memorized the training points
Overfit Model
How does an over-fit model do predicting from new data it hasn’t seen before?
Poorly – it has built the training data’s noise into its predictions
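A small sketch of the over-fit case: a degree-9 polynomial fitted to 10 training points, so the curve passes through every one of them. The straight-line trend, the noise level, and the name `noisy_samples` are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def noisy_samples(x, noise=0.3):
    # Hypothetical data: a simple straight-line trend plus random noise.
    return 2.0 * x + rng.normal(0.0, noise, x.size)

x_train = np.linspace(-1.0, 1.0, 10)
y_train = noisy_samples(x_train)
x_new = rng.uniform(-1.0, 1.0, 200)
y_new = noisy_samples(x_new)

# Degree 9 with 10 points: the polynomial goes through every training point,
# so it reproduces the noise as well as the trend.
coeffs = np.polyfit(x_train, y_train, 9)

train_mse = float(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
new_mse = float(np.mean((np.polyval(coeffs, x_new) - y_new) ** 2))
print(f"training MSE: {train_mse:.2e}")
print(f"new-data MSE: {new_mse:.2e}")
```

The training error should be essentially zero while the error on new data is far larger, which is the gap the two cards above describe.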
Balanced Model
What is a balanced model?
The regression captures the underlying trend without chasing noise or outliers
Balanced Model
Bias and variance for a balanced model?
Bias is low and the variance is low (the best available trade-off between the two)
Balanced Model
What is a 2-D visual for a balanced regression?
Non-linear curve through the largest clusters
Balanced Model
How does a balanced model do predicting from training data?
Good
Balanced Model
How does a balanced model do predicting from new data it hasn’t seen before?
Good
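One common way to look for a balanced model is to fit several complexities and compare their errors on held-out data. The sketch below does this for three polynomial degrees; the sine-shaped data, the noise level, the degrees tried, and the names `sample`, `mse`, and `scores` are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def sample(x, noise=0.2):
    # Hypothetical curved trend plus noise, used only for this comparison.
    return np.sin(np.pi * x) + rng.normal(0.0, noise, x.size)

def mse(coeffs, x, y):
    # Mean squared error of a fitted polynomial on (x, y).
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

x_train = np.linspace(-1.0, 1.0, 20)
y_train = sample(x_train)
x_held = rng.uniform(-1.0, 1.0, 200)  # held-out data the fits never see
y_held = sample(x_held)

scores = {}  # degree -> (training MSE, held-out MSE)
for degree in (1, 3, 12):
    coeffs = np.polyfit(x_train, y_train, degree)
    scores[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_held, y_held))
    print(f"degree {degree:2d}: train={scores[degree][0]:.4f}, held-out={scores[degree][1]:.4f}")

# Pick the degree with the lowest held-out error.
best_degree = min(scores, key=lambda d: scores[d][1])
print("lowest held-out error at degree", best_degree)
```

Training error keeps falling as the degree grows, so it cannot identify the balanced model on its own; the held-out error is the number to compare. In a setup like this the straight line is expected to lose clearly, and the middle degree is usually the one that does well on both sets.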