ROC Curve & AUC — Threshold-Independent Evaluation
The ROC curve plots True Positive Rate (Recall) against False Positive Rate across all possible classification thresholds. AUC-ROC (Area Under the Curve) summarizes this into a single number: 0.5 = random guessing, 1.0 = perfect. AUC-ROC is threshold-independent: it measures how well the model ranks positives above negatives. For severe class imbalance, prefer Precision-Recall AUC; when negatives vastly outnumber positives, even many false positives barely move the false positive rate, so ROC curves can look deceptively good.
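Before the full comparison below, the ranking interpretation is worth seeing directly: AUC equals the probability that a randomly chosen positive is scored above a randomly chosen negative. A minimal sketch with made-up toy scores (the arrays here are illustrative, not from the dataset below):
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([0, 0, 0, 1, 1])
y_score = np.array([0.10, 0.40, 0.35, 0.80, 0.70])  # toy scores for illustration

pos, neg = y_score[y_true == 1], y_score[y_true == 0]
# Fraction of (positive, negative) pairs where the positive outranks the
# negative (ties count as half a pair) -- this is exactly AUC.
pairs = (pos[:, None] > neg[None, :]).sum() + 0.5 * (pos[:, None] == neg[None, :]).sum()
print(pairs / (pos.size * neg.size))   # 1.0: every positive outranks every negative
print(roc_auc_score(y_true, y_score))  # matches sklearn's value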
ROC Curve, PR Curve, and Threshold Selection
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import (roc_curve, roc_auc_score, precision_recall_curve,
                             average_precision_score, f1_score)
np.random.seed(42)
X, y = make_classification(n_samples=3000, n_features=15, weights=[0.90, 0.10], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
models = {
    "Logistic Regression": LogisticRegression(class_weight="balanced", max_iter=1000, random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, class_weight="balanced", random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=200, learning_rate=0.05, random_state=42),
}
fig, axes = plt.subplots(1, 2, figsize=(14, 6))
for name, model in models.items():
    model.fit(X_train, y_train)
    y_prob = model.predict_proba(X_test)[:, 1]
    # ROC curve: TPR vs FPR at every threshold
    fpr, tpr, _ = roc_curve(y_test, y_prob)
    auc = roc_auc_score(y_test, y_prob)
    axes[0].plot(fpr, tpr, linewidth=2, label=f"{name} (AUC={auc:.3f})")
    # Precision-Recall curve and average precision
    precision, recall, _ = precision_recall_curve(y_test, y_prob)
    ap = average_precision_score(y_test, y_prob)
    axes[1].plot(recall, precision, linewidth=2, label=f"{name} (AP={ap:.3f})")
# ROC plot formatting
axes[0].plot([0, 1], [0, 1], "k--", linewidth=1, label="Random (AUC=0.50)")
axes[0].fill_between([0, 1], [0, 1], alpha=0.05, color="gray")
axes[0].set_xlabel("False Positive Rate (1 - Specificity)")
axes[0].set_ylabel("True Positive Rate (Recall)")
axes[0].set_title("ROC Curve -- threshold independent")
axes[0].legend(loc="lower right", fontsize=9)
# PR plot formatting
baseline = y_test.mean()
axes[1].axhline(baseline, color="k", linestyle="--", linewidth=1, label=f"Random (AP={baseline:.3f})")
axes[1].set_xlabel("Recall")
axes[1].set_ylabel("Precision")
axes[1].set_title("Precision-Recall Curve -- better for imbalanced data")
axes[1].legend(loc="upper right", fontsize=9)
plt.tight_layout()
plt.savefig("roc_pr_curves.png", dpi=100, bbox_inches="tight")
plt.show()
# OPTIMAL THRESHOLD SELECTION using F1 maximization
best_model = GradientBoostingClassifier(n_estimators=200, learning_rate=0.05, random_state=42)
best_model.fit(X_train, y_train)
y_prob_gb = best_model.predict_proba(X_test)[:, 1]
thresholds = np.arange(0.1, 0.9, 0.05)
f1_scores = [f1_score(y_test, (y_prob_gb >= t).astype(int)) for t in thresholds]
optimal_t = thresholds[np.argmax(f1_scores)]
print(f"Default threshold (0.5) F1: {f1_score(y_test, (y_prob_gb >= 0.5).astype(int)):.4f}")
print(f"Optimal threshold ({optimal_t:.2f}) F1: {max(f1_scores):.4f}")
print("\nROC-AUC vs PR-AUC guide:")
print(" Use ROC-AUC when: classes are roughly balanced, want threshold-free metric")
print(" Use PR-AUC when: severe imbalance (fraud, disease), care about minority class precision")Tip
Tip
Practice ROC curve and AUC evaluation in small, isolated examples before integrating into larger projects. Breaking concepts into small experiments builds genuine understanding faster than reading alone.
Accuracy is misleading on imbalanced data: a model that always predicts the majority class scores high accuracy while catching zero positives. Use F1 or AUC instead.
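To see this concretely, here is a minimal sketch with a synthetic 90/10 split and a degenerate all-negative "model" (toy arrays, not the dataset above):
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

y_true = np.array([0] * 900 + [1] * 100)  # 10% positives, like the data above
y_pred = np.zeros_like(y_true)            # always predict the majority class

print(accuracy_score(y_true, y_pred))             # 0.9 -- looks strong
print(f1_score(y_true, y_pred, zero_division=0))  # 0.0 -- finds no positives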
Practice Task
(1) Write a working example of ROC curve and AUC evaluation from scratch without looking at notes. (2) Modify it to handle an edge case (empty input, null value, or error state). (3) Share your solution in the Priygop community for feedback.
Common Mistake
A common mistake with ROC curve and AUC evaluation is skipping edge case testing: empty inputs, null values, single-class label arrays, and unexpected data types. Always validate boundary conditions to write robust, production-ready ML code.
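One concrete boundary condition worth guarding: roc_auc_score raises a ValueError when y_true contains only one class, which happens with tiny or unstratified evaluation splits. A minimal sketch using a hypothetical safe_auc helper (not a sklearn API):
import numpy as np
from sklearn.metrics import roc_auc_score

def safe_auc(y_true, y_score):  # hypothetical helper for illustration
    y_true = np.asarray(y_true)
    if y_true.size == 0:
        raise ValueError("empty input: nothing to evaluate")
    if np.unique(y_true).size < 2:
        return None  # AUC is undefined with one class; signal it explicitly
    return roc_auc_score(y_true, y_score)

print(safe_auc([0, 0, 1, 1], [0.2, 0.4, 0.6, 0.8]))  # 1.0
print(safe_auc([1, 1, 1], [0.5, 0.6, 0.7]))          # None -- single-class labels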