Support Vector Machines (SVM)
SVM finds the hyperplane that maximally separates classes — the decision boundary with the largest margin. Support vectors are the training points closest to the decision boundary — they define it. For non-linearly separable data, the kernel trick implicitly maps features to higher dimensions where they become separable. SVM is powerful for high-dimensional data and robust when classes are well-separated.
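As a minimal sketch of these ideas (assuming a toy two-blob dataset; the variable names here are illustrative), a fitted linear SVC exposes its support vectors directly, and the margin width can be recovered from the learned weight vector as 2/||w||:
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC
# Toy linearly separable data (illustrative only)
X_toy, y_toy = make_blobs(n_samples=60, centers=2, cluster_std=1.0, random_state=0)
clf = SVC(kernel="linear", C=1.0).fit(X_toy, y_toy)
print("Support vectors per class:", clf.n_support_)  # only these points define the boundary
w = clf.coef_[0]  # normal of the separating hyperplane w.x + b = 0
print(f"Margin width: {2 / np.linalg.norm(w):.3f}")  # geometric distance between the class margins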
SVM — Margins, Kernels, and C Parameter
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.datasets import load_breast_cancer, make_circles
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.metrics import classification_report
# DEMO: WHY KERNEL MATTERS -- non-linearly separable data
X_circles, y_circles = make_circles(n_samples=300, factor=0.4, noise=0.1, random_state=42)
fig, axes = plt.subplots(1, 3, figsize=(15, 4))
kernels_to_test = ["linear", "rbf", "poly"]
for ax, kernel in zip(axes, kernels_to_test):
    pipe = Pipeline([("scaler", StandardScaler()), ("svm", SVC(kernel=kernel, C=1.0))])
    pipe.fit(X_circles, y_circles)
    acc = pipe.score(X_circles, y_circles)  # training accuracy (demo only)
    # Decision boundary over a grid covering the data
    xx, yy = np.meshgrid(np.linspace(-1.5, 1.5, 200), np.linspace(-1.5, 1.5, 200))
    Z = pipe.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    ax.contourf(xx, yy, Z, alpha=0.3, cmap="RdBu")
    ax.scatter(X_circles[:, 0], X_circles[:, 1], c=y_circles, cmap="RdBu", edgecolors="k", s=30)
    ax.set_title(f"Kernel: {kernel}\nAccuracy: {acc:.1%}")
plt.suptitle("SVM Kernels on Non-linearly Separable Data (Concentric Circles)")
plt.tight_layout()
plt.savefig("svm_kernels.png", dpi=100, bbox_inches="tight")
plt.show()
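To make the kernel trick less abstract, the sketch below (with a hand-picked gamma chosen purely for illustration) computes the RBF kernel value K(x, z) = exp(-gamma * ||x - z||^2) by hand and checks it against scikit-learn's rbf_kernel:
from sklearn.metrics.pairwise import rbf_kernel
x = np.array([[0.0, 1.0]])
z = np.array([[1.0, 0.5]])
gamma = 0.5  # illustrative value
manual = np.exp(-gamma * np.sum((x - z) ** 2))  # K(x, z) = exp(-gamma * ||x - z||^2)
library = rbf_kernel(x, z, gamma=gamma)[0, 0]
print(f"manual={manual:.6f}  sklearn={library:.6f}")  # both ~0.535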
# SVM ON REAL DATA
cancer = load_breast_cancer()
X, y = cancer.data, cancer.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
# HYPERPARAMETER TUNING: C and gamma
param_grid = {"svm__C": [0.1, 1, 10, 100], "svm__gamma": ["scale", "auto", 0.001, 0.01]}
pipe = Pipeline([("scaler", StandardScaler()), ("svm", SVC(kernel="rbf", probability=True))])
search = GridSearchCV(pipe, param_grid, cv=5, scoring="roc_auc", n_jobs=-1)
search.fit(X_train, y_train)
print(f"Best params: {search.best_params_}")
print(f"Best CV AUC: {search.best_score_:.4f}")
print("\nTest Classification Report:")
print(classification_report(y_test, search.predict(X_test), target_names=cancer.target_names))
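If you want to see how sensitive the model is across the grid rather than just the single best combination, cv_results_ can be inspected directly. A sketch assuming the fitted search object from above:
import pandas as pd
results = pd.DataFrame(search.cv_results_)
cols = ["param_svm__C", "param_svm__gamma", "mean_test_score", "std_test_score"]
print(results[cols].sort_values("mean_test_score", ascending=False).head())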
# SVM PARAMETERS EXPLAINED
svm_params = {
    "C": "Regularization: low C = wide margin, more misclassifications; high C = narrow margin, fewer training errors",
    "kernel": "'rbf' (Gaussian) is a strong default for most datasets; 'linear' for text/high-dimensional data; 'poly' is rarely needed",
    "gamma": "Only for RBF/poly kernels: 'scale' is usually best; controls the width of the Gaussian influence",
    "probability": "Set True to get predict_proba() -- makes training ~2x slower",
}
print("\nSVM Parameter Guide:")
for param, desc in svm_params.items():
    print(f"  {param:15s}: {desc}")
# WHEN TO USE SVM
print("\nSVM vs other algorithms:")
use_cases = [
    ("PREFER SVM when", "data has many features vs samples (text, genomics)"),
    ("PREFER SVM when", "classes are clearly separable (small datasets)"),
    ("AVOID SVM when", "dataset is very large (>50k rows -- training becomes too slow)"),
    ("AVOID SVM when", "you need feature importances (SVM doesn't provide them directly)"),
    ("USE INSTEAD", "Random Forest or XGBoost for most tabular data problems"),
]
for case, reason in use_cases:
    print(f"  {case}: {reason}")
Tip
Practice Support Vector Machines (SVM) in small, isolated examples before integrating into larger projects. Breaking concepts into small experiments builds genuine understanding faster than reading alone.
Practice Task
(1) Write a working example of Support Vector Machines (SVM) from scratch without looking at notes. (2) Modify it to handle an edge case (empty input, null value, or error state). (3) Share your solution in the Priygop community for feedback.
Common Mistake
A common mistake with Support Vector Machines (SVM) is skipping edge case testing: empty inputs, null values, and unexpected data types. Always validate boundary conditions to write robust, production-ready ML code.
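A minimal sketch of the kind of boundary check this warning refers to (safe_predict is a hypothetical helper, not a scikit-learn API):
def safe_predict(model, X):
    X = np.asarray(X, dtype=float)  # raises ValueError if entries aren't numeric
    if X.ndim != 2 or X.shape[0] == 0:
        raise ValueError(f"Expected a non-empty 2D array, got shape {X.shape}")
    if np.isnan(X).any():
        raise ValueError("Input contains NaN values; impute or drop them first")
    return model.predict(X)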