Decision Boundaries — Visualizing Classifiers
Visualizing decision boundaries shows how each algorithm carves up the feature space to separate classes. Linear models draw straight lines (hyperplanes in higher dimensions). Trees draw axis-aligned rectangles. An SVM with an RBF kernel draws smooth, curved boundaries. KNN draws bumpy, irregular boundaries. This intuition tells you which algorithm suits your data's geometry.
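For intuition before the full comparison, a minimal sketch (toy dataset of our own choosing) recovers the straight line a logistic regression actually draws: its boundary is exactly where w1*x1 + w2*x2 + b = 0.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy 2-feature data (dataset parameters are our own choice)
X, y = make_classification(n_samples=100, n_features=2, n_redundant=0,
                           n_clusters_per_class=1, random_state=0)
clf = LogisticRegression().fit(X, y)

# The fitted boundary is the line w1*x1 + w2*x2 + b = 0;
# any point on it gets predicted probability exactly 0.5
w1, w2 = clf.coef_[0]
b = clf.intercept_[0]
print(f"boundary: {w1:.2f}*x1 + {w2:.2f}*x2 + {b:.2f} = 0")
```

Rearranging as x2 = -(w1*x1 + b) / w2 gives the line you could plot directly over the scatter, which is what the contour plots below approximate on a dense grid.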
Comparing Decision Boundaries Across Classifiers
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons, make_circles, make_classification
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
def plot_decision_boundary(model, X, y, ax, title):
    """Fit `model` on standardized features and draw its decision regions."""
    scaler = StandardScaler()
    X_sc = scaler.fit_transform(X)
    model.fit(X_sc, y)
    h = 0.02  # grid step
    x_min, x_max = X_sc[:, 0].min() - 0.5, X_sc[:, 0].max() + 0.5
    y_min, y_max = X_sc[:, 1].min() - 0.5, X_sc[:, 1].max() + 0.5
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    # Predict a class for every grid point, then color each region by class
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    ax.contourf(xx, yy, Z, alpha=0.3, cmap="RdYlBu")
    ax.scatter(X_sc[:, 0], X_sc[:, 1], c=y, cmap="RdYlBu", edgecolors="k", s=25, linewidths=0.5)
    acc = model.score(X_sc, y)  # training accuracy (no held-out split here)
    ax.set_title(f"{title}\nAcc={acc:.2%}", fontsize=10)
    ax.set_xticks([]); ax.set_yticks([])
# DATASETS
np.random.seed(42)
datasets = {
    "Linearly Sep.": make_classification(n_samples=200, n_features=2, n_redundant=0, n_clusters_per_class=1, random_state=42),
    "Moons": make_moons(n_samples=200, noise=0.20, random_state=42),
    "Circles": make_circles(n_samples=200, factor=0.5, noise=0.1, random_state=42),
}
classifiers = [
    ("Logistic Reg.", LogisticRegression(C=1.0, max_iter=1000, random_state=42)),
    ("Decision Tree", DecisionTreeClassifier(max_depth=5, random_state=42)),
    ("KNN (k=5)", KNeighborsClassifier(n_neighbors=5)),
    ("SVM (RBF)", SVC(kernel="rbf", C=1.0, random_state=42)),
    ("Random Forest", RandomForestClassifier(n_estimators=100, random_state=42)),
]
fig, axes = plt.subplots(len(datasets), len(classifiers), figsize=(20, 12))
for row, (dataset_name, (X, y)) in enumerate(datasets.items()):
    for col, (clf_name, clf) in enumerate(classifiers):
        # Show classifier names only on the top row
        plot_decision_boundary(clf, X, y, axes[row, col], clf_name if row == 0 else "")
        if col == 0:
            axes[row, col].set_ylabel(dataset_name, fontsize=11, fontweight="bold")
plt.suptitle("Decision Boundaries: Different Algorithms x Different Data Geometries", fontsize=14, y=1.01)
plt.tight_layout()
plt.savefig("decision_boundaries.png", dpi=80, bbox_inches="tight")
plt.show()
# KEY TAKEAWAYS
takeaways = {
    "Logistic Regression": "Straight-line boundary -- strong baseline, but limited when classes are not linearly separable",
    "Decision Tree": "Axis-aligned rectangular regions -- can overfit as depth grows",
    "KNN": "Bumpy, irregular boundaries -- needs enough data and feature scaling",
    "SVM (RBF)": "Smooth curved boundaries -- handles nonlinear class shapes well",
    "Random Forest": "Averaged axis-aligned regions -- smoother than a single tree, robust to noise",
}
print("\nDecision boundary takeaways:")
for clf, insight in takeaways.items():
    print(f"  {clf:20s}: {insight}")
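The KNN takeaway hinges on feature scaling: distances are dominated by whichever feature has the largest magnitude. A small sketch (the dataset and the deliberately huge-scale noise feature are our own assumptions) shows the effect:

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
# Append a pure-noise feature with a huge scale (std=1000);
# unscaled, it dominates every Euclidean distance
rng = np.random.default_rng(0)
X = np.hstack([X, rng.normal(scale=1000.0, size=(len(X), 1))])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

raw = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr).score(X_te, y_te)
sc = StandardScaler().fit(X_tr)
scaled = KNeighborsClassifier(n_neighbors=5).fit(
    sc.transform(X_tr), y_tr).score(sc.transform(X_te), y_te)
print(f"KNN without scaling: {raw:.2%}  with scaling: {scaled:.2%}")
```

Without scaling, neighbors are chosen almost entirely by the meaningless third feature, so accuracy collapses toward chance; standardizing restores it. This is why `plot_decision_boundary` above applies `StandardScaler` before fitting.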
Tip
Practice decision-boundary visualization in small, isolated examples before integrating it into larger projects. Breaking concepts into small experiments builds genuine understanding faster than reading alone.
Tree = interpretable. Forest = robust. XGBoost = wins.
Practice Task
(1) Write a working decision-boundary comparison from scratch without looking at notes. (2) Modify it to handle an edge case (empty input, null value, or error state). (3) Share your solution in the Priygop community for feedback.
Common Mistake
A common mistake with decision-boundary visualization is skipping edge-case testing -- empty inputs, null values, and unexpected data types. Always validate boundary conditions to write robust, production-ready ML code.
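A minimal sketch of such boundary-condition checks, assuming NumPy-style inputs (the helper name and the exact set of checks are our own choice, not a scikit-learn API):

```python
import numpy as np

def validate_features(X, y):
    """Validate a feature matrix and label vector before fitting/plotting."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    if X.size == 0 or y.size == 0:
        raise ValueError("Empty input: X and y must contain samples")
    if X.ndim != 2:
        raise ValueError(f"Expected a 2D feature matrix, got {X.ndim}D")
    if len(X) != len(y):
        raise ValueError(f"Length mismatch: {len(X)} samples vs {len(y)} labels")
    if np.isnan(X).any():
        raise ValueError("X contains NaN values")
    return X, y
```

Calling this at the top of `plot_decision_boundary` would turn silent failures (or confusing downstream tracebacks) into immediate, readable errors.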