GridSearchCV — Exhaustive Hyperparameter Search
GridSearchCV exhaustively evaluates every combination of hyperparameters specified in a parameter grid, using cross-validation for each. It is the most reliable tuning method for small grids (< 50 combinations). The key advantage: it integrates with Pipelines so hyperparameter tuning includes preprocessing steps — preventing the common mistake of tuning just the model while ignoring preprocessing choices.
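Conceptually, a grid search is just nested loops over the parameter grid. The sketch below is a simplified, hypothetical re-implementation (not GridSearchCV's actual code) that enumerates every combination with itertools.product and scores each one with cross-validation; the tiny 2 x 2 grid is purely illustrative.

```python
from itertools import product

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Hypothetical tiny grid: 2 x 2 = 4 combinations, 4 x 3-fold CV = 12 fits.
grid = {"C": [0.1, 1.0], "max_iter": [2000, 5000]}

best_score, best_params = -1.0, None
keys = list(grid)
for values in product(*(grid[k] for k in keys)):  # every combination
    params = dict(zip(keys, values))
    model = LogisticRegression(**params)
    # Mean CV score for this combination (what GridSearchCV stores
    # as mean_test_score in cv_results_).
    score = cross_val_score(model, X, y, cv=3).mean()
    if score > best_score:
        best_score, best_params = score, params

print(best_params, round(best_score, 4))
```

GridSearchCV does the same enumeration, but adds parallelism, result bookkeeping, and refitting of the best combination.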
GridSearchCV on Full Pipelines
import numpy as np
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split, cross_val_score
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import classification_report
import time
cancer = load_breast_cancer()
X, y = cancer.data, cancer.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
# PIPELINE TO TUNE
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("model", GradientBoostingClassifier(random_state=42)),
])
# PARAMETER GRID (use double underscore to access nested params)
param_grid = {
    "model__n_estimators": [100, 200],
    "model__learning_rate": [0.05, 0.1],
    "model__max_depth": [3, 4, 5],
    "model__subsample": [0.8, 1.0],
}
total_combinations = 1
for v in param_grid.values():
    total_combinations *= len(v)
print(f"Total combinations: {total_combinations} x 5-fold CV = {total_combinations*5} fits")
t0 = time.time()
grid_search = GridSearchCV(
    pipe, param_grid,
    cv=5,
    scoring="roc_auc",
    n_jobs=-1,                # use all CPU cores
    refit=True,               # refit best model on full training data
    verbose=0,
    return_train_score=True,
)
grid_search.fit(X_train, y_train)
t_elapsed = time.time() - t0
print(f"\nGridSearchCV completed in {t_elapsed:.1f}s")
print(f"Best params: {grid_search.best_params_}")
print(f"Best CV AUC: {grid_search.best_score_:.4f}")
# ANALYZE RESULTS
results_df = pd.DataFrame(grid_search.cv_results_)
top_5 = results_df.nlargest(5, "mean_test_score")[
    ["mean_test_score", "std_test_score", "mean_train_score",
     "param_model__n_estimators", "param_model__learning_rate", "param_model__max_depth"]
].round(4)
print("\nTop 5 parameter combinations:")
print(top_5.to_string(index=False))
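Because return_train_score=True is set, cv_results_ also carries mean_train_score, and the gap between training and validation scores is a quick overfitting signal. A self-contained sketch with a deliberately tiny grid (the grid and the use of a "gap" column are illustrative, not part of the example above):

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_breast_cancer(return_X_y=True)

gs = GridSearchCV(
    GradientBoostingClassifier(random_state=42),
    {"max_depth": [2, 5]},      # tiny illustrative grid
    cv=3,
    scoring="roc_auc",
    return_train_score=True,    # required for mean_train_score
    n_jobs=-1,
)
gs.fit(X, y)

df = pd.DataFrame(gs.cv_results_)
# Large gap = the model fits training folds much better than held-out folds.
df["gap"] = df["mean_train_score"] - df["mean_test_score"]
print(df[["param_max_depth", "mean_test_score", "mean_train_score", "gap"]])
```

A combination with a slightly lower mean_test_score but a much smaller gap is often the safer choice in production.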
# FINAL EVALUATION WITH BEST MODEL
best_model = grid_search.best_estimator_
y_pred = best_model.predict(X_test)
print("\nTest classification report (best model):")
print(classification_report(y_test, y_pred, target_names=cancer.target_names))
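The grid is not limited to one estimator either: param_grid may be a list of dicts, each of which can swap out the pipeline's "model" step entirely, so one search compares different model families. A minimal sketch (this puts the RandomForestClassifier imported above to use; the specific grids are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

pipe = Pipeline([("scaler", StandardScaler()), ("model", LogisticRegression())])

# A list of grids: each dict can replace the "model" step with a
# different estimator and tune that estimator's own parameters.
param_grid = [
    {"model": [LogisticRegression(max_iter=1000)], "model__C": [0.1, 1.0]},
    {"model": [RandomForestClassifier(random_state=42)], "model__n_estimators": [100]},
]
gs = GridSearchCV(pipe, param_grid, cv=3, scoring="roc_auc", n_jobs=-1)
gs.fit(X, y)
print(type(gs.best_estimator_.named_steps["model"]).__name__, round(gs.best_score_, 4))
```

This gives 2 + 1 = 3 candidates evaluated in a single cross-validated search, all sharing the same scaler.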
# TUNE PREPROCESSING PARAMS TOO (not just model params)
pipe_full = Pipeline([
    ("poly", PolynomialFeatures(include_bias=False)),
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000, random_state=42)),
])
param_grid_full = {
    "poly__degree": [1, 2],  # tune preprocessing!
    "model__C": [0.01, 0.1, 1.0, 10.0],
    "model__penalty": ["l1", "l2"],
    "model__solver": ["liblinear"],
}
gs_full = GridSearchCV(pipe_full, param_grid_full, cv=5, scoring="roc_auc", n_jobs=-1)
gs_full.fit(X_train, y_train)
print(f"\nFull pipeline tuning best: {gs_full.best_params_}")
print(f"Best AUC: {gs_full.best_score_:.4f}")
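When the grid grows beyond a few dozen combinations, RandomizedSearchCV (imported above but not yet used) is usually the better fit: it samples a fixed number of combinations from distributions instead of enumerating all of them. A minimal sketch, with illustrative distributions and a deliberately small n_iter:

```python
from scipy.stats import loguniform, randint
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_breast_cancer(return_X_y=True)

param_distributions = {
    "learning_rate": loguniform(0.01, 0.3),  # continuous, log-scaled
    "max_depth": randint(2, 6),              # integers 2..5
    "n_estimators": randint(50, 151),        # integers 50..150
}
rs = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=42),
    param_distributions,
    n_iter=5,        # only 5 sampled combinations, not the full product
    cv=3,
    scoring="roc_auc",
    random_state=42,
    n_jobs=-1,
)
rs.fit(X, y)
print(rs.best_params_, round(rs.best_score_, 4))
```

The cost is fixed by n_iter rather than by the size of the search space, which is what makes randomized search scale to large grids.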
Tip
Practice GridSearchCV in small, isolated examples before integrating it into larger projects. Breaking concepts into small experiments builds genuine understanding faster than reading alone.
For search spaces much larger than a few dozen combinations, exhaustive search becomes impractical; randomized search or Bayesian optimizers such as Optuna find good hyperparameters with far fewer fits.
Practice Task
(1) Write a working example of GridSearchCV from scratch without looking at notes. (2) Modify it to handle an edge case (empty input, null value, or error state). (3) Share your solution in the Priygop community for feedback.
Common Mistake
A common mistake with GridSearchCV is skipping edge-case testing — empty inputs, null values, and parameter combinations that are invalid for the chosen estimator. Always validate boundary conditions to write robust, production-ready ML code.