Model Evaluation Techniques
This notebook covers only the commonly used evaluation metrics for regression and classification. The list is not exhaustive; you are encouraged to explore the other metrics that are available.
References:
(1) Scikit-Learn : https://scikit-learn.org/stable/modules/model_evaluation.html
(2) https://github.com/maykulkarni/Machine-Learning-Notebooks
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
1. Regression Metrics
1.1 $R^{2}$ score (Coefficient of Determination)
$ R^{2} = 1 - \frac{SS_{res}}{SS_{tot}}$
where,
$ SS_{res} = \sum_{i=1}^{n}(y_{i} - \hat{y}_{i})^{2}$
$ SS_{tot} = \sum_{i=1}^{n}(y_{i} - y_{avg})^{2}$
Possible Values:
$$
R^{2}=\begin{cases}
>0, & \text{better than average predictor}\\
0, & \text{exactly the same as average predictor}\\
<0, & \text{worse than average predictor}
\end{cases}
$$
$R^{2} \in (-\infty, 1]$
More information on the math behind the $R^{2}$ score can be found here: https://nbviewer.jupyter.org/github/maykulkarni/Machine-Learning-Notebooks/blob/master/05.%20Model%20Evaluation/R%20Squared.ipynb
# Most of scikit-learn's evaluation metrics live in the sklearn.metrics module.
from sklearn.metrics import r2_score
Case 1: $R^{2} > 0$
"""
Arbitrarily define y_true and y_pred.
Assume that the following are the ground truth (y_true) and the model predictions (y_pred).
"""
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
plt.plot(y_true)
plt.plot(y_pred)
plt.title('Plot of Ground Truth vs Predictions')
plt.legend(['y_true', 'y_pred'])
plt.xlabel('X')
plt.ylabel('Y')
# The signature of the function is: r2_score(y_true, y_pred)
# An r2_score of 1 means the predictions match the ground truth exactly. In practice you rarely
# reach it; the score approaches 1 as the model fits the data better.
r2_score(y_true, y_pred)
0.9486081370449679
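As a cross-check of the formula for $R^{2}$ given earlier, here is a minimal sketch that computes the score from first principles with NumPy. The helper name `r2_from_scratch` is hypothetical (it is not part of scikit-learn), and the sketch assumes `y_true` is not constant, since $SS_{tot}$ would then be zero.

```python
import numpy as np

def r2_from_scratch(y_true, y_pred):
    """Hypothetical helper: R^2 = 1 - SS_res / SS_tot (assumes SS_tot != 0)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Residual sum of squares: squared errors of the model's predictions.
    ss_res = np.sum((y_true - y_pred) ** 2)
    # Total sum of squares: squared errors of the mean-only predictor.
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Same data as Case 1.
r2_manual = r2_from_scratch([3, -0.5, 2, 7], [2.5, 0.0, 2, 8])
print(r2_manual)
```

The printed value should agree with the `r2_score` result for the same data up to floating-point rounding.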
Case 2: $R^{2} = 0$
plt.plot(y_true)
# Notice how we're predicting a constant value, equal to the mean of the observed ground truth data.
plt.plot([np.mean(y_true)]*len(y_true))
plt.title('Plot of Ground Truth vs Average of the Ground Truth.')
plt.legend(['y_true', 'y_pred'])
plt.xlabel('X')
plt.ylabel('Y')
# An R2 score of 0 corresponds to a horizontal predictor that is just the mean of all the observed values!
r2_score(y_true, [np.mean(y_true)]*len(y_true))
0.0
Case 3: $R^{2} < 0$
# Change the value of y_pred, and let us see what happens to the R2 score.
y_pred = [1.0, 1.0, 1.0, 1.0]
plt.plot(y_true)
plt.plot(y_pred)
plt.legend(['y_true', 'y_pred'])
plt.xlabel('X')
plt.ylabel('Y')
# The signature of the function is: r2_score(y_true, y_pred)
# A negative r2_score means the model fits worse than a constant prediction of the mean of y_true.
r2_score(y_true, y_pred)
-0.4817987152034262
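To see where the negative value comes from, the two sums can be worked out by hand for this case, where every prediction equals 1 and $y_{avg} = 2.875$:

$$
SS_{res} = (3-1)^{2} + (-0.5-1)^{2} + (2-1)^{2} + (7-1)^{2} = 4 + 2.25 + 1 + 36 = 43.25
$$
$$
SS_{tot} = (3-2.875)^{2} + (-0.5-2.875)^{2} + (2-2.875)^{2} + (7-2.875)^{2} = 29.1875
$$
$$
R^{2} = 1 - \frac{43.25}{29.1875} \approx -0.4818
$$

Because $SS_{res} > SS_{tot}$, the ratio exceeds 1 and the score drops below zero.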
1.2 Mean Absolute Error (MAE)
$ \text{MAE} = \frac{1}{n} \sum_{i=1}^{n}|y_{i} - \hat{y}_{i}|$
# Import the function.
from sklearn.metrics import mean_absolute_error
# Signature of the function is : mean_absolute_error(y_true, y_pred)
mean_absolute_error(y_true, y_pred)
2.625
# Same function written from first principles!
np.mean( np.abs( np.array(y_true) - np.array(y_pred) ) )
2.625
For more information on MAE, refer to the following link: https://scikit-learn.org/stable/modules/model_evaluation.html#mean-absolute-error
1.3 Mean Squared Error (MSE)
$ \text{MSE} = \frac{1}{n} \sum_{i=1}^{n}|y_{i} - \hat{y}_{i}|^{2} $
# Import the function.
from sklearn.metrics import mean_squared_error
# Signature of the function is : mean_squared_error(y_true, y_pred)
mean_squared_error(y_true, y_pred)
10.8125
# Same function written from first principles!
np.mean( np.square( np.array(y_true) - np.array(y_pred) ) )
10.8125
For theory on the use of mean_squared_error, refer to: https://scikit-learn.org/stable/modules/model_evaluation.html#mean-squared-error
1.4 Root Mean Squared Error (RMSE)
$ \text{RMSE} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n}|y_{i} - \hat{y}_{i}|^{2} } $
# Import the function.
from sklearn.metrics import mean_squared_error
# Import numpy (will be using some of its functionality)
import numpy as np
# Signature of the function is : mean_squared_error(y_true, y_pred)
np.sqrt(mean_squared_error(y_true, y_pred))
3.2882366094914763
# Same function written from first principles!
np.sqrt( np.mean( np.square( np.array(y_true) - np.array(y_pred) ) ) )
3.2882366094914763
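MAE, MSE and RMSE all start from the same residuals $y_{i} - \hat{y}_{i}$, so as a minimal sketch they can be collected in one helper. The name `error_metrics` is hypothetical (it is not a scikit-learn function), and the data simply repeats the Case 3 values used above.

```python
import numpy as np

def error_metrics(y_true, y_pred):
    """Hypothetical helper returning MAE, MSE and RMSE from the same residuals."""
    residuals = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    mae = np.mean(np.abs(residuals))   # mean absolute error
    mse = np.mean(residuals ** 2)      # mean squared error
    rmse = np.sqrt(mse)                # root of the MSE
    return {"mae": float(mae), "mse": float(mse), "rmse": float(rmse)}

metrics = error_metrics([3, -0.5, 2, 7], [1.0, 1.0, 1.0, 1.0])
print(metrics)
```

Because RMSE is the square root of MSE, it is expressed in the same units as the target, which often makes it easier to compare against MAE.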