Model Evaluation Metrics for Supervised Learning
Model evaluation metrics are used to measure how well a machine learning model performs on unseen data.
Different metrics are used depending on the problem:
Supervised Learning
|
|---- Regression Metrics
|     |
|     |--- MSE
|     |--- RMSE
|     |--- MAE
|
|---- Classification Metrics
      |
      |--- Accuracy
      |--- Precision
      |--- Recall
      |--- F1 Score
      |--- Confusion Matrix
      |--- ROC Curve
Regression Metrics
Regression metrics are used when predicting continuous values.
Examples:
- House price prediction
- Salary prediction
- Temperature prediction
1. Mean Squared Error (MSE)
Definition
Mean Squared Error measures the average squared difference between actual and predicted values.
The purpose of squaring:
- Makes all errors positive
- Gives higher penalty to larger errors
Formula:
MSE=\frac{1}{n}\sum_{i=1}^{n}(y_i-\hat y_i)^2
Where:
- n = Number of observations
- y_i = Actual value
- \hat y_i = Predicted value
Example
Actual values:
[10,20,30]
Predicted values:
[12,18,35]
Errors (actual - predicted):
[-2,2,-5]
Squared errors:
[4,4,25]
MSE:
(4+4+25)/3 =11
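The arithmetic above can be reproduced with a minimal Python sketch. The function name `mean_squared_error` is chosen here for illustration and is not tied to any particular library.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)


actual = [10, 20, 30]
predicted = [12, 18, 35]

mse = mean_squared_error(actual, predicted)
print(mse)  # 11.0
```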
Interpretation
Lower MSE → Better model
Advantages
- Penalizes larger errors
Disadvantages
- Sensitive to outliers
2. Root Mean Squared Error (RMSE)
Definition
RMSE is the square root of MSE.
Formula:
RMSE=\sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i-\hat y_i)^2}
Example
If:
MSE=11
Then:
RMSE=√11 ≈3.32
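As a quick check, the same value can be computed directly from the example data used in the MSE section. This is a sketch only; `root_mean_squared_error` is an illustrative name.

```python
import math


def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error."""
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


rmse = root_mean_squared_error([10, 20, 30], [12, 18, 35])
print(round(rmse, 2))  # 3.32
```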
Interpretation
Lower RMSE → Better model
RMSE has the same unit as output values.
Advantages
- Easy interpretation
- Same units as target variable
Disadvantages
- Sensitive to outliers
3. Mean Absolute Error (MAE)
Definition
MAE calculates the average absolute difference between actual and predicted values.
Formula:
MAE=\frac{1}{n}\sum_{i=1}^{n}|y_i-\hat y_i|
Example
Actual:
[10,20,30]
Predicted:
[12,18,35]
Absolute errors:
[2,2,5]
MAE:
(2+2+5)/3 =3
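The same calculation as a short Python sketch (the function name `mean_absolute_error` is illustrative):

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    absolute_errors = [abs(y - y_hat) for y, y_hat in zip(actual, predicted)]
    return sum(absolute_errors) / len(absolute_errors)


mae = mean_absolute_error([10, 20, 30], [12, 18, 35])
print(mae)  # 3.0
```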
Interpretation
Lower MAE → Better model
Advantages
- Easy to understand
- Less affected by outliers than MSE and RMSE
Disadvantages
- Does not strongly penalize large errors
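To make the difference in outlier sensitivity concrete, the sketch below compares MSE and MAE on two sets of predictions that differ in a single value. The data is hypothetical and chosen only for illustration.

```python
def mse(actual, predicted):
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    return sum(abs(y - p) for y, p in zip(actual, predicted)) / len(actual)


# Hypothetical data: the two prediction lists differ only in the last value.
actual = [10, 20, 30, 40]
close = [12, 18, 35, 41]          # every prediction is near the target
with_outlier = [12, 18, 35, 80]   # the last prediction is off by 40

print(mse(actual, close), mae(actual, close))                # 8.5 2.5
print(mse(actual, with_outlier), mae(actual, with_outlier))  # 408.25 12.25
```

With this data, one bad prediction multiplies MSE by roughly 48 but MAE by only about 5, because squaring amplifies the single large error.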
Classification Metrics
Classification metrics are used when predicting categories.
Examples:
- Spam detection
- Disease prediction
- Fraud detection
Confusion Matrix
Confusion Matrix forms the basis for several classification metrics.
                  Predicted Positive   Predicted Negative
Actual Positive   TP                   FN
Actual Negative   FP                   TN
Where:
TP = True Positive
TN = True Negative
FP = False Positive
FN = False Negative
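The four counts can be tallied by comparing each actual label with its prediction. The sketch below assumes binary labels where 1 marks the positive class; the function name `confusion_counts` and the sample labels are hypothetical.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (TP, TN, FP, FN) for binary labels."""
    tp = tn = fp = fn = 0
    for y, y_hat in zip(actual, predicted):
        if y_hat == positive and y == positive:
            tp += 1   # predicted positive, actually positive
        elif y_hat == positive:
            fp += 1   # predicted positive, actually negative
        elif y == positive:
            fn += 1   # predicted negative, actually positive
        else:
            tn += 1   # predicted negative, actually negative
    return tp, tn, fp, fn


# Hypothetical labels: 1 = positive, 0 = negative
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

print(confusion_counts(actual, predicted))  # (3, 3, 1, 1)
```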
Example:
              Predicted Yes   Predicted No
Actual Yes    45              5
Actual No     10              40
4. Accuracy
Definition
Accuracy measures the proportion of all predictions that are correct.
Formula:
Accuracy=\frac{TP+TN}{TP+TN+FP+FN}
Example:
TP=45 TN=40 FP=10 FN=5
Calculation:
Accuracy=(45+40)/(45+40+10+5) =0.85
Accuracy:
85%
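The calculation can be expressed as a small Python function taking the four confusion-matrix counts. The name `accuracy` and the choice to raise an error on empty input are illustrative assumptions.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    return (tp + tn) / total


print(accuracy(tp=45, tn=40, fp=10, fn=5))  # 0.85
```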
Advantages
- Easy to understand
Disadvantages
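- Not suitable for imbalanced datasets
Example:
A dataset contains 990 healthy patients and 10 patients with a disease.
A model that predicts everyone as healthy reaches 99% accuracy,
but it detects none of the disease cases, so it is a poor model.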
5. Precision
Definition
Precision measures how many predicted positive cases were actually positive.
Formula:
Precision=\frac{TP}{TP+FP}
Example:
TP=45 FP=10
Calculation:
45/(45+10) =0.818
Precision:
81.8%
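A minimal sketch of the same calculation follows. Returning 0.0 when the model predicts no positives at all is a convention assumed here, since the formula is undefined in that case.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are actually positive."""
    predicted_positive = tp + fp
    if predicted_positive == 0:
        return 0.0  # assumed convention: no positive predictions were made
    return tp / predicted_positive


p = precision(tp=45, fp=10)
print(round(p, 3))  # 0.818
```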
Interpretation
Among predicted positives, how many are correct?
Applications
- Spam detection
- Fraud detection
6. Recall
Definition
Recall measures how many actual positive cases were identified.
Formula:
Recall=\frac{TP}{TP+FN}
Example:
TP=45 FN=5
Calculation:
45/(45+5) =0.90
Recall:
90%
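The recall calculation mirrors the one for precision, with false negatives in place of false positives. As before, returning 0.0 when there are no actual positives is an assumed convention, not part of the formula.

```python
def recall(tp, fn):
    """Fraction of actual positives that the model identified."""
    actual_positive = tp + fn
    if actual_positive == 0:
        return 0.0  # assumed convention: the data contains no positives
    return tp / actual_positive


r = recall(tp=45, fn=5)
print(r)  # 0.9
```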
Interpretation
Among actual positives, how many were found?
Applications
- Cancer detection
- Disease prediction
7. F1 Score
Definition
F1 score is the harmonic mean of Precision and Recall.
Formula:
F1=2\times\frac{Precision\times Recall}{Precision+Recall}
Example:
Precision:
0.818
Recall:
0.90
Calculation:
F1 = 2×(0.818×0.90)/(0.818+0.90) ≈ 0.857
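The sketch below computes F1 from the unrounded precision and recall of the running example (TP=45, FP=10, FN=5), which gives about 0.857. It also shows the equivalent count-based form 2TP/(2TP+FP+FN). The function name `f1_score` is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # assumed convention when both inputs are zero
    return 2 * precision * recall / (precision + recall)


tp, fp, fn = 45, 10, 5
precision = tp / (tp + fp)
recall = tp / (tp + fn)

f1 = f1_score(precision, recall)
f1_from_counts = 2 * tp / (2 * tp + fp + fn)  # algebraically the same value

print(round(f1, 3))              # 0.857
print(round(f1_from_counts, 3))  # 0.857
```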
Interpretation
Higher F1 Score → Better balance between Precision and Recall
Applications
- Imbalanced datasets, where accuracy alone can be misleading
8. ROC Curve
Definition
ROC stands for Receiver Operating Characteristic.
The ROC curve plots the True Positive Rate against the False Positive Rate as the classification threshold is varied.
Where:
True Positive Rate:
TPR=\frac{TP}{TP+FN}
False Positive Rate:
FPR=\frac{FP}{FP+TN}
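One way to trace the curve is to treat every distinct model score as a threshold, predict positive for all scores at or above it, and record the resulting (FPR, TPR) pair. The sketch below does this in plain Python. The labels and scores are hypothetical, and the code assumes the data contains at least one positive and one negative example.

```python
def roc_points(actual, scores):
    """Return (FPR, TPR) pairs, sweeping the threshold from high to low."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(actual, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(actual, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical labels (1 = positive) and model scores
actual = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(actual, scores)
for fpr, tpr in points:
    print(round(fpr, 2), round(tpr, 2))
```

The curve always starts at (0, 0) and ends at (1, 1); a better model rises toward the top-left corner more quickly.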
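ROC Interpretation
AUC (Area Under the ROC Curve) summarizes the curve as a single number. As a rough guide:
- AUC = 1.0 → Perfect model
- AUC = 0.9 → Excellent model
- AUC = 0.8 → Good model
- AUC = 0.7 → Average model
- AUC = 0.5 → Random prediction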
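AUC can also be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it that way by comparing every positive-negative pair, counting ties as half. The data is hypothetical and the function name `auc_from_scores` is illustrative; this pairwise approach is simple but slow on large datasets.

```python
def auc_from_scores(actual, scores):
    """AUC as the fraction of positive-negative pairs ranked correctly."""
    positive_scores = [s for y, s in zip(actual, scores) if y == 1]
    negative_scores = [s for y, s in zip(actual, scores) if y == 0]
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical labels (1 = positive) and model scores
actual = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

auc = auc_from_scores(actual, scores)
print(round(auc, 3))  # 0.889
```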
Quick Comparison
Metric      Used For         Best Value
MSE         Regression       Lower
RMSE        Regression       Lower
MAE         Regression       Lower
Accuracy    Classification   Higher
Precision   Classification   Higher
Recall      Classification   Higher
F1 Score    Classification   Higher
ROC-AUC     Classification   Higher
Simple Memory Trick
MSE → Squared Error
RMSE → Root of Squared Error
MAE → Absolute Error
Accuracy → Overall Correctness
Precision → Correct predicted positives
Recall → Found actual positives
F1 Score → Balance between Precision and Recall
ROC → Model discrimination ability