Regression Metrics — MAE, MSE, RMSE, R², MAPE
Choosing the right metric depends on your error tolerance and domain. MAE is robust to outliers (median-like). RMSE penalizes large errors more — use when big errors are costly (safety systems). R² measures explained variance — useful for comparing models. MAPE measures percentage error — useful when errors should scale with the magnitude of predictions. Always choose your metric BEFORE training, based on business requirements.
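As a quick check on the definitions above, each metric can be computed by hand with NumPy before reaching for sklearn. This is a minimal sketch with made-up illustration values:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.0, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

errors = y_true - y_pred
mae = np.mean(np.abs(errors))             # mean |error|, same unit as target
mse = np.mean(errors ** 2)                # squares magnify large errors
rmse = np.sqrt(mse)                       # back in the target's unit
ss_res = np.sum(errors ** 2)              # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - ss_res / ss_tot                  # fraction of variance explained
mape = np.mean(np.abs(errors / y_true))   # undefined if any y_true == 0

print(f"MAE={mae:.4f} MSE={mse:.4f} RMSE={rmse:.4f} R2={r2:.4f} MAPE={mape:.2%}")
```

Note the MAPE line divides by `y_true`, which is exactly why MAPE breaks when the target can be zero.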
Regression Metric Deep Dive
import numpy as np
import pandas as pd
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             r2_score, mean_absolute_percentage_error)
# ACTUAL vs PREDICTED values
y_true = np.array([100, 200, 150, 300, 250, 180, 220, 400, 90, 350])
y_pred = np.array([110, 195, 160, 285, 270, 175, 230, 380, 95, 360])
# --- METRICS ---
mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_true, y_pred)
mape = mean_absolute_percentage_error(y_true, y_pred)
print("Regression Metrics Summary:")
print(f" MAE (Mean Absolute Error): {mae:.2f} -- average |error|, same unit as target")
print(f" MSE (Mean Squared Error): {mse:.2f} -- penalizes large errors more")
print(f" RMSE (Root Mean Squared Error): {rmse:.2f} -- same unit as target, MSE penalty")
print(f" R2 (Coefficient of Det.): {r2:.4f} -- fraction of variance explained")
print(f" MAPE (Mean Abs % Error): {mape:.2%} -- scale-independent percentage")
# WHEN OUTLIER INFLATES RMSE vs MAE
y_true_with_outlier = np.array([100, 200, 150, 300, 5000]) # one huge outlier
y_pred_with_outlier = np.array([110, 195, 160, 285, 290]) # terrible prediction for outlier
print("\nWith outlier (true=5000, pred=290):")
print(f" MAE: {mean_absolute_error(y_true_with_outlier, y_pred_with_outlier):.1f} (not too bad)")
print(f" RMSE: {np.sqrt(mean_squared_error(y_true_with_outlier, y_pred_with_outlier)):.1f} (dominated by outlier!)")
# NEGATIVE R2 EXAMPLE
y_bad = np.array([500, 400, 600, 300, 700]) # very wrong predictions
r2_bad = r2_score(y_true[:5], y_bad)
print(f" R2 can be negative: {r2_bad:.2f} (worse than always predicting mean!)")
# METRIC SELECTION GUIDE
print("\nMetric Selection Guide:")
metric_guide = {
    "MAE": "General purpose, robust to outliers, interpretable (avg error in $ or kg)",
    "RMSE": "When large errors are costly (safety, finance) -- penalizes outlier predictions",
    "R2": "Comparing models on same dataset, understanding explained variance (0=bad, 1=perfect)",
    "MAPE": "When prediction errors should scale: 10% error on $100 vs $10,000",
    "R2 < 0": "Model is WORSE than predicting the mean -- definitely broken",
    "R2 > 0.9": "Very high -- check for data leakage or overfitting",
}
for metric, guidance in metric_guide.items():
    print(f"  {metric:12s}: {guidance}")
Tip
Practice the regression metrics (MAE, MSE, RMSE, R², MAPE) in small, isolated examples before integrating them into larger projects. Breaking concepts into small experiments builds genuine understanding faster than reading alone.
Linear regression is the simplest ML model: y = mx + b, fit by minimizing MSE.
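To make that concrete, `np.polyfit` returns the least-squares line, i.e. the slope and intercept that minimize MSE. A small sketch with made-up, roughly linear data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])  # roughly y = 2x

m, b = np.polyfit(x, y, deg=1)     # degree-1 fit = least-squares line
y_hat = m * x + b
mse = np.mean((y - y_hat) ** 2)    # the quantity the fit minimized
print(f"m={m:.2f}, b={b:.2f}, MSE={mse:.4f}")  # m near 2, small MSE
```

Any other choice of m and b on this data would give a strictly larger MSE, which is what "minimize MSE" means in practice.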
Practice Task
Practice Task — (1) Write a working example computing MAE, MSE, RMSE, R², and MAPE from scratch without looking at notes. (2) Modify it to handle an edge case (empty input, null value, or error state). (3) Share your solution in the Priygop community for feedback.
Common Mistake
A common mistake with regression metrics (MAE, MSE, RMSE, R², MAPE) is skipping edge-case testing: empty inputs, null values, and unexpected data types. Always validate boundary conditions to write robust, production-ready ML code.
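One way to guard those boundary conditions is to validate inputs before computing a metric. This sketch uses a hypothetical `safe_mae` helper (not part of sklearn) built on plain NumPy:

```python
import numpy as np

def safe_mae(y_true, y_pred):
    """MAE with basic edge-case validation (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError(f"shape mismatch: {y_true.shape} vs {y_pred.shape}")
    if y_true.size == 0:
        raise ValueError("empty input: metrics are undefined")
    if np.isnan(y_true).any() or np.isnan(y_pred).any():
        raise ValueError("NaN values present: clean or impute before scoring")
    return float(np.mean(np.abs(y_true - y_pred)))

print(safe_mae([100, 200], [110, 195]))  # 7.5
```

The same validation pattern applies to MSE, RMSE, R², and MAPE; for MAPE you would additionally reject any `y_true` equal to zero.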