 # 3 Regression Metrics You Must Know: MAE, MSE, and RMSE

## Introduction 🔗

Let’s say you’ve built a new machine learning model. How do you know if it’s going to make good predictions?

How can you measure your model’s expected performance in the real world?

Today we’ll address this question for regression models. Specifically, we’ll look at three widely used regression metrics:

• Mean Absolute Error (MAE)
• Mean Squared Error (MSE)
• Root Mean Squared Error (RMSE)

Then I’ll show you how to calculate these metrics using Python and Scikit-Learn.

Let’s get started! Image Credit:  Manfred Irmer

## Regression Error 🔗

In statistics and machine learning, regression refers to a set of techniques used to predict a numerical value based on some inputs.

Suppose you want to train a model to predict airfare for US domestic flights. That would be a regression task because the output (airfare) can take on any value, say, from $100 to$1,000.

Once you’ve trained the model, you must measure its performance using a test dataset.

Let’s say we have a test dataset with 10 entries. We use the model to predict price for each entry. And then compare our predictions against the actual prices:

TABLE 1: Actual vs Predicted Values
Ticket
#
Actual
Price
Predicted
Price
1 250 265
2 110 140
3 500 480
4 200 215
5 330 290
6 490 515
7 670 750
8 210 210
9 435 420
10 375 285

How far off were our predictions from the actual prices? Let’s find out the prediction error for each entry using the below formula:

$$Error = Actual \medspace Value - Predicted \medspace Value$$

TABLE 2: Prediction Errors
Ticket
#
Actual
Price
Predicted
Price
Error
1 250 265 -15
2 110 140 -30
3 500 480 20
4 200 215 -15
5 330 290 40
6 490 515 -25
7 670 750 -80
8 210 210 0
9 435 420 15
10 375 285 90
Total 0

The model over-predicted for some entries (in red). That is, the prediction was higher than the actual value. Thus the error is negative.

For other cases (in yellow), the model under-predicted as its prediction was lower than the actual value. Hence the error is positive.

Knowing individual errors for each entry is fine. But how can we combine all of these errors to give us one metric?

What if we add up all the errors? That won’t work. The negative errors would cancel out the positive errors.

For example, the sum of all errors in TABLE 2 is 0. That would lead us to believe that our model is perfect. That’s not true, though - the model made incorrect predictions for 9 out of 10 test cases!

Let’s try a different approach.

## Mean Absolute Error (MAE) 🔗

As we saw above, the prediction error can be positive or negative. But what if we focus only on the size of the error and ignore the sign? That is, we measure the absolute value of the error.

In that case, we’ll treat two errors the same if they have equal size but only differ in sign (e.g., -80 and +80). Both are equally off from the expected value.

We can get absolute errors by dropping the sign from all the negative values:

TABLE 3: Absolute Errors
Ticket
#
Actual
Price
Predicted
Price
Error Absolute
Error
1 250 265 -15 15
2 110 140 -30 30
3 500 480 20 20
4 200 215 -15 15
5 330 290 40 40
6 490 515 -25 25
7 670 750 -80 80
8 210 210 0 0
9 435 420 15 15
10 375 285 90 90
Total 0 330

And then divide the sum of absolute errors by the number of predictions. That’ll give us the Mean Absolute Error (MAE):

$$Mean \medspace Absolute \medspace Error \medspace (MAE) = \frac{Sum \medspace of \medspace Absolute \medspace Errors}{Number \medspace of \medspace Predictions}$$

Let’s apply this formula to our example:

$$MAE = \frac{330}{10} = 33$$

## MAE vs. RMSE 🔗

Our model’s RMSE ($43.24) is significantly higher than the MAE ($33). Why is that?

Notice in TABLE 4 that we have two absolute errors (80 and 90) that are much larger than the others.

When we square all the errors to find RMSE, these two large errors dominate the others (see the last column in TABLE 4). Hence, they push RMSE to a considerably higher value than MAE.

This explains why RMSE would be a superior metric when we want to minimize larger errors.

## Practice using Python & Scikit-Learn 🔗

Now you are familiar with the regression metrics MAE, MSE, and RMSE. Let’s learn how to calculate them using Python and Scikit-Learn.

We’ll use a kaggle dataset that contains heights and weights measurements for 25,000 individuals.

We’ll first train a model to predict a person’s weight based on height. Then we’ll calculate the metrics to evaluate the model.

First off, let’s load the dataset using pandas:

import pandas as pd

'SOCR-HeightWeight.csv',
# dataset has an extra index column. We don't need it.
# Just load height and weight columns
usecols=[1, 2]
)
dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 25000 entries, 0 to 24999
Data columns (total 2 columns):
#   Column          Non-Null Count  Dtype
---  ------          --------------  -----
0   Height(Inches)  25000 non-null  float64
1   Weight(Pounds)  25000 non-null  float64
dtypes: float64(2)
memory usage: 390.8 KB


The data types for both columns look good. And there are no missing values.

Next, use the seaborn scatterplot to see if heights and weights are associated:

import seaborn as sns

sns.scatterplot(
x=dataset['Height(Inches)'],
y=dataset['Weight(Pounds)'],
) The weight generally goes up as the height increases. So a machine learning model should be able to capture this pattern and predict the weight with reasonable accuracy.

### Build Regression Model 🔗

Let’s use linear regression to build the model. First, we store the inputs and output in separate variables:

# Input
X = dataset['Height(Inches)']
# Output
y = dataset['Weight(Pounds)']


Next, split the dataset into training and test sets. We’ll use the training set to build the model. And then evaluate the model using the test set.

from sklearn.model_selection import train_test_split

# 67% - training set (X_train, y_train)
# 33% - test set (X_test, y_test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

# X_train and X_test are instances of pandas Series because
# they contain only one column. Convert them to DataFrames
X_train = X_train.to_frame()
X_test = X_test.to_frame()


Finally, create and train a model using Scikit-Learn’s LinearRegression:

from sklearn.linear_model import LinearRegression
# Create a new model
model = LinearRegression()
# build the model using the traing data
model.fit(X_train, y_train)


### Calculate Metrics - MAE, MSE, and RMSE 🔗

We now have a fully trained model. Its time to measure it’s performance using the metrics we learned today.

First, let’s predict the weights for the test set:

predicted = model.predict(X_test)

actual_vs_predicted = pd.DataFrame(
{'Actual': y_test,
'Predicted':predicted}
)
# Show first 5 rows


Actual Predicted
7799 127.88 126.79
4427 108.97 117.22
14941 122.29 127.96
11644 118.53 124.57
15548 120.58 126.88

Scikit-Learn provides built-in functions to calculate a variety of metrics. Let’s import two of them we’ll use today:

from sklearn.metrics import (
mean_absolute_error, # MAE
mean_squared_error # MSE
)


First, we’ll compute Mean Absolute Error (MAE) using the function mean_absolute_error:

MAE = mean_absolute_error(
y_true=y_test, # actual values
y_pred=predicted # predicted values
)
MAE.round(2)

8.06


And then calculate Mean Squared Error (MSE) using mean_squared_error:

MSE = mean_squared_error(
y_true=y_test, # actual values
y_pred=predicted # predicted values
)
MSE.round(2)

102.62


Scikit-Learn doesn’t provide a function to provide Root Mean Squared Error (RMSE). But we can get RMSE by taking a square root of MSE:

# Square root of MSE gives RMSE
RMSE = MSE**(1/2)
RMSE.round(2)

10.13


Thus our model will predict weights with MAE and RMSE of 8.06 and 10.13 pounds, respectively.

## Summary & Next Steps 🔗

This post introduced the most commonly used metrics to evaluate regression models. Let’s recap what you learned today:

• What is regression prediction error?
• How to use prediction errors to calculate MAE, MSE, and RMSE.
• MAE vs. RMSE: what’s the difference, and why does it matter?
• How to compute these metrics using Python and Scikit-Learn’s built-in functions.

Now that you know regression metrics, you might wonder: what about classification models - how do I evaluate them? You can learn all about that here and here.

Title Image by babilkulesi