Model Stacking — Meta-Learning
Stacking (stacked generalization) trains a meta-model on the out-of-fold predictions of base models. Unlike voting, which simply averages, stacking learns how much to trust each base model in different regions of the feature space. It is a staple of winning ML-competition solutions and often adds roughly 0.5-2% over the best single model, at the cost of training every base model plus a meta-model.
Stacking with StackingClassifier
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.ensemble import (StackingClassifier, RandomForestClassifier,
                              GradientBoostingClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
np.random.seed(42)
X, y = make_classification(n_samples=3000, n_features=25, n_informative=12,
                           n_redundant=6, random_state=42)
# BASE MODELS (level 0 estimators)
base_models = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=42, n_jobs=-1)),
    ("gb", GradientBoostingClassifier(n_estimators=100, random_state=42)),
    ("svm", Pipeline([("scaler", StandardScaler()),
                      ("svc", SVC(kernel="rbf", probability=True, random_state=42))])),
    ("knn", Pipeline([("scaler", StandardScaler()),
                      ("knn", KNeighborsClassifier(n_neighbors=10))])),
]
# META-MODEL (level 1) -- learns from base model predictions
meta_model = LogisticRegression(max_iter=1000, C=0.1, random_state=42)
# SKLEARN STACKING
stacker = StackingClassifier(
    estimators=base_models,
    final_estimator=meta_model,
    cv=5,                          # out-of-fold predictions using 5-fold CV
    stack_method="predict_proba",  # base models output probabilities
    passthrough=False,             # only base model outputs go to the meta-model (not raw features)
    n_jobs=-1,
)
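# WHAT cv=5 DOES UNDER THE HOOD -- an illustrative sketch (oof_features is a
# local name, not part of the sklearn API). cross_val_predict builds
# out-of-fold predictions, so the meta-model is never trained on a base
# model's predictions for data that model saw during fitting.
from sklearn.model_selection import cross_val_predict
oof_features = np.column_stack([
    cross_val_predict(model, X, y, cv=5, method="predict_proba")[:, 1]
    for _, model in base_models
])  # shape (n_samples, n_base_models): one OOF probability column per model
meta_model.fit(oof_features, y)  # mirrors what final_estimator does inside StackingClassifier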
# COMPARE INDIVIDUAL MODELS VS STACKING
print("Cross-validation AUC (5-fold):")
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for name, model in base_models:
    score = cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
    print(f" {name:5s}: {score:.4f}")
# Voting ensemble
soft_voting = VotingClassifier(estimators=base_models, voting="soft", n_jobs=-1)
voting_score = cross_val_score(soft_voting, X, y, cv=cv, scoring="roc_auc").mean()
print(f" {'vote':5s}: {voting_score:.4f} (soft voting)")
stacking_score = cross_val_score(stacker, X, y, cv=cv, scoring="roc_auc").mean()
print(f" {'stack':5s}: {stacking_score:.4f} (stacking -- typically best)")
# WHEN STACKING HELPS MOST
stacking_guide = {
    "High diversity": "Base models built on different algorithms make different kinds of errors -> stacking helps most",
    "Uncorrelated errors": "If model A fails on examples 1,2 and model B fails on 3,4 -> the stack can catch all four",
    "Small gain (0.5-2%)": "Rarely worth 10x training time on simple problems; use for competitions or critical prod",
    "Blending alternative": "Simpler scheme: train base models on one half of the data, train the meta-model on their predictions for the other half (see sketch below)",
}
print("\nWhen stacking helps:")
for scenario, explanation in stacking_guide.items():
print(f" {scenario:25s}: {explanation}")Tip