Linear Regression
Learn the fundamentals of Linear Regression for predicting continuous values.
60 min • By Priygop Team • Last updated: Feb 2026
What is Linear Regression?
Linear Regression is a fundamental supervised learning algorithm used for predicting continuous values. It assumes a linear relationship between the input features and the target variable.
Key Concepts
- Simple Linear Regression: One input feature
- Multiple Linear Regression: Multiple input features
- Cost Function: Mean Squared Error (MSE)
- Optimization: Gradient Descent, or a closed-form least-squares solution (the approach scikit-learn's LinearRegression uses)
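To make the cost function and optimizer concrete, here is a minimal sketch of gradient descent minimizing MSE for simple linear regression. The toy data, the learning rate, and the step count are illustrative assumptions, not values from this lesson, and the names `mse`, `b0`, and `b1` are hypothetical.

```python
import numpy as np

# Hypothetical toy data: y = 2x + 1 plus small Gaussian noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, size=200)

def mse(b0, b1):
    """Mean squared error of the line b0 + b1 * x on the toy data."""
    return np.mean((y - (b0 + b1 * x)) ** 2)

b0, b1 = 0.0, 0.0        # start with intercept and slope at zero
learning_rate = 0.01     # assumed step size; too large a value diverges
n_steps = 5000           # assumed number of iterations

initial_loss = mse(b0, b1)
for _ in range(n_steps):
    residual = (b0 + b1 * x) - y            # prediction minus target
    grad_b0 = 2.0 * residual.mean()         # d(MSE)/d(b0)
    grad_b1 = 2.0 * (residual * x).mean()   # d(MSE)/d(b1)
    b0 -= learning_rate * grad_b0
    b1 -= learning_rate * grad_b1
final_loss = mse(b0, b1)

print(f"Intercept: {b0:.3f}, Slope: {b1:.3f}")
print(f"MSE before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

Each step moves both parameters a small distance against the gradient of the MSE, so the loss falls until the line settles near the least-squares fit.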
Mathematical Foundation
- Linear Regression Equation: y = β₀ + β₁x₁ + β₂x₂ + ... + βₙxₙ + ε
- Where:
  - y = target value (the model's prediction ŷ is the same sum without ε)
  - β₀ = intercept (bias)
  - βᵢ = coefficient (weight) for feature xᵢ
  - xᵢ = input features
  - ε = error term (variation the features do not explain)
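The equation above can be sketched for two features with plain NumPy. This is an illustration under assumed values: the coefficients in `true_beta`, the noise level, and the sample size are hypothetical. It estimates β by least squares with `np.linalg.lstsq` after adding a column of ones for the intercept.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples = 500
X = rng.normal(size=(n_samples, 2))        # two input features x1, x2

# Hypothetical true parameters [β0, β1, β2]
true_beta = np.array([1.0, 2.0, -3.0])
noise = rng.normal(0, 0.1, size=n_samples)  # the error term ε
y = true_beta[0] + X @ true_beta[1:] + noise

# Prepend a column of ones so β0 is estimated like any other coefficient
X_design = np.column_stack([np.ones(n_samples), X])

# Least-squares estimate of [β0, β1, β2]
beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)

y_hat = X_design @ beta_hat   # predictions: the equation without ε
residuals = y - y_hat         # estimates of ε for each sample

print("Estimated coefficients:", np.round(beta_hat, 3))
```

Because the design matrix includes an intercept column, the residuals average to zero, and the estimated coefficients should land close to the assumed true values.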
Implementation with Scikit-learn
Example
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
import matplotlib.pyplot as plt
# Generate sample data: y = 2x + 1 plus Gaussian noise
np.random.seed(42)
X = np.random.rand(100, 1) * 10
y = 2 * X + 1 + np.random.randn(100, 1) * 0.5
# Split the data
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42
)
# Create and train the model
model = LinearRegression()
model.fit(X_train, y_train)
# Make predictions
y_pred = model.predict(X_test)
# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"Mean Squared Error: {mse:.4f}")
print(f"R² Score: {r2:.4f}")
print(f"Intercept: {model.intercept_[0]:.4f}")
print(f"Coefficient: {model.coef_[0][0]:.4f}")
# Visualize results
plt.scatter(X_test, y_test, color='blue', label='Actual')
plt.plot(X_test, y_pred, color='red', label='Predicted')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Linear Regression Results')
plt.legend()
plt.show()
Model Evaluation
Example
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
# Common evaluation metrics for regression
# (reuses np, X_test, y_test, and y_pred from the previous example)
mae = mean_absolute_error(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
r2 = r2_score(y_test, y_pred)
# Adjusted R-squared
n = len(y_test)
p = X_test.shape[1] # number of features
adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(f"MAE: {mae:.4f}")
print(f"RMSE: {rmse:.4f}")
print(f"R²: {r2:.4f}")
print(f"Adjusted R²: {adjusted_r2:.4f}")
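To show what these metrics measure, the sketch below computes them by hand from their definitions. The four target and prediction values are made up for illustration, and the `_manual` names are hypothetical.

```python
import numpy as np

# Hypothetical targets and predictions, chosen for easy arithmetic
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.5, 5.5, 7.0, 10.0])

errors = y_true - y_hat

mae_manual = np.mean(np.abs(errors))     # average absolute error
mse_manual = np.mean(errors ** 2)        # average squared error
rmse_manual = np.sqrt(mse_manual)        # same units as the target

ss_res = np.sum(errors ** 2)                     # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
r2_manual = 1 - ss_res / ss_tot

n, p = len(y_true), 1   # samples, and an assumed single feature
adjusted_r2_manual = 1 - (1 - r2_manual) * (n - 1) / (n - p - 1)

print(f"MAE: {mae_manual:.4f}")                   # 0.5000
print(f"RMSE: {rmse_manual:.4f}")                 # 0.6124
print(f"R²: {r2_manual:.4f}")                     # 0.9250
print(f"Adjusted R²: {adjusted_r2_manual:.4f}")   # 0.8875
```

R² compares the model's squared error with that of always predicting the mean, and adjusted R² penalizes it for each added feature, so it is never higher than R² when p ≥ 1.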