Ensemble Learning — Wisdom of Crowds in ML
Ensemble methods combine multiple models to produce better predictions than any single model. The three main strategies are Bagging (train models on random subsets of the data in parallel, which reduces variance), Boosting (train models sequentially, each focusing on the previous models' errors, which reduces bias), and Stacking (train a meta-model on the predictions of the base models). All three exploit the same insight: diverse models make different errors, and combining their predictions cancels much of that error out.
Ensemble Strategies — Bagging vs Boosting vs Stacking
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import (BaggingClassifier, RandomForestClassifier,
                              AdaBoostClassifier, GradientBoostingClassifier,
                              VotingClassifier, StackingClassifier)
np.random.seed(42)
X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           n_redundant=5, random_state=42)
# SINGLE BASE MODEL (weak learner)
single_tree = DecisionTreeClassifier(max_depth=3, random_state=42)
single_score = cross_val_score(single_tree, X, y, cv=5).mean()
print(f"Single Decision Tree (depth=3): {single_score:.4f}")
# BAGGING: random subsets of data, parallel training
bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(max_depth=5),
    n_estimators=100,
    max_samples=0.8,   # use 80% of training data for each tree
    max_features=0.8,  # use 80% of features for each tree
    bootstrap=True,    # sampling with replacement
    random_state=42,
    n_jobs=-1,
)
bagging_score = cross_val_score(bagging, X, y, cv=5).mean()
print(f"Bagging (100 trees): {bagging_score:.4f} (+{bagging_score-single_score:.4f})")
# RANDOM FOREST: bagging + random feature selection at each split
rf = RandomForestClassifier(n_estimators=100, max_depth=None, random_state=42, n_jobs=-1)
rf_score = cross_val_score(rf, X, y, cv=5).mean()
print(f"Random Forest (100 trees): {rf_score:.4f} (+{rf_score-single_score:.4f})")
# BOOSTING: sequential, each model focuses on previous mistakes
ada = AdaBoostClassifier(n_estimators=100, learning_rate=0.1, random_state=42)
gb = GradientBoostingClassifier(n_estimators=200, learning_rate=0.05, max_depth=3, random_state=42)
ada_score = cross_val_score(ada, X, y, cv=5).mean()
gb_score = cross_val_score(gb, X, y, cv=5).mean()
print(f"AdaBoost (100 stumps): {ada_score:.4f} (+{ada_score-single_score:.4f})")
print(f"Gradient Boosting (200 trees): {gb_score:.4f} (+{gb_score-single_score:.4f})")
# VOTING: combine diverse models (majority vote or avg probability)
voting = VotingClassifier(estimators=[
    ("lr", LogisticRegression(max_iter=1000, random_state=42)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=42, n_jobs=-1)),
    ("gb", GradientBoostingClassifier(n_estimators=100, random_state=42)),
], voting="soft")  # soft: average predicted probabilities (often better than a hard majority vote)
voting_score = cross_val_score(voting, X, y, cv=5).mean()
print(f"Soft Voting (LR+RF+GB): {voting_score:.4f} (+{voting_score-single_score:.4f})")Tip
Tip
Practice ensemble learning in small, isolated examples before integrating it into larger projects. Breaking concepts into small experiments builds genuine understanding faster than reading alone.
Practice Task
(1) Write a working ensemble learning example from scratch without looking at notes. (2) Modify it to handle an edge case (empty input, null value, or error state). (3) Share your solution in the Priygop community for feedback.
Common Mistake
A common mistake with ensemble learning code is skipping edge-case testing: empty inputs, null values, and unexpected data types. Always validate boundary conditions to write robust, production-ready ML code.
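As a minimal, hypothetical sketch of that kind of validation (the safe_predict helper below is not part of scikit-learn), a thin wrapper can check inputs before handing them to an ensemble:
import numpy as np

def safe_predict(model, X):
    # Hypothetical guard around model.predict: reject empty, non-numeric,
    # or NaN-containing inputs instead of failing deep inside the estimator.
    X = np.asarray(X, dtype=float)   # raises ValueError for non-numeric data
    if X.size == 0:
        raise ValueError("empty input: expected at least one sample")
    if X.ndim == 1:
        X = X.reshape(1, -1)         # single sample -> 2D array, as scikit-learn expects
    if np.isnan(X).any():
        raise ValueError("input contains NaN (or null) values")
    return model.predict(X)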