Decision Trees — Interpretable Non-linear Classification
Decision trees split data by asking yes/no questions about feature values. The tree learns which questions (splits) best separate the classes, measured by Gini impurity or information gain (entropy). The output is a flowchart you can show to a business stakeholder — the most interpretable ML model. The risk: deep trees overfit dramatically. Prune by limiting max_depth.
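To make the split criterion concrete: Gini impurity for a node with class proportions p_i is 1 − Σ p_i². A pure node scores 0; a 50/50 node scores 0.5. A minimal sketch (the `gini_impurity` helper is ours for illustration, not a scikit-learn function):

```python
import numpy as np

def gini_impurity(labels):
    """Gini impurity: the chance of mislabeling a random sample
    if we label it according to the node's class distribution."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

print(gini_impurity([0, 0, 0, 0]))  # pure node -> 0.0
print(gini_impurity([0, 0, 1, 1]))  # 50/50 node -> 0.5
```

The tree greedily picks the split that most reduces the weighted impurity of the resulting child nodes.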
Decision Tree — Gini, Pruning, Visualization
import numpy as np
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text, plot_tree
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import classification_report
import matplotlib.pyplot as plt
cancer = load_breast_cancer()
X, y = cancer.data, cancer.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
# OBSERVE OVERFITTING vs PROPER DEPTH
print("Depth vs Performance:")
print(f"{'Depth':<8} {'Train Acc':>10} {'CV Acc (5-fold)':>16} {'Test Acc':>10} {'Diagnosis'}")
print("-" * 60)
for depth in [1, 2, 3, 5, 8, 15, None]:
    tree = DecisionTreeClassifier(max_depth=depth, criterion="gini", random_state=42)
    tree.fit(X_train, y_train)
    train_acc = tree.score(X_train, y_train)
    cv_acc = cross_val_score(tree, X_train, y_train, cv=5).mean()
    test_acc = tree.score(X_test, y_test)
    gap = train_acc - cv_acc
    diagnosis = "underfitting" if cv_acc < 0.88 else ("overfitting" if gap > 0.05 else "GOOD")
    print(f"{str(depth):<8} {train_acc:>10.4f} {cv_acc:>16.4f} {test_acc:>10.4f} {diagnosis}")
# USE THE BEST DEPTH
best_tree = DecisionTreeClassifier(max_depth=4, criterion="gini", random_state=42)
best_tree.fit(X_train, y_train)
print("\nFull Classification Report (depth=4):")
print(classification_report(y_test, best_tree.predict(X_test), target_names=cancer.target_names))
# VISUALIZE THE TREE (very important for stakeholders)
fig, ax = plt.subplots(figsize=(20, 8))
plot_tree(
    best_tree, ax=ax,
    feature_names=cancer.feature_names,
    class_names=cancer.target_names,
    filled=True, rounded=True, fontsize=8,
)
plt.title("Decision Tree (depth=4) -- Breast Cancer Classification")
plt.tight_layout()
plt.savefig("decision_tree.png", dpi=80, bbox_inches="tight")
plt.show()
# TEXT REPRESENTATION (for reports)
rules = export_text(best_tree, feature_names=list(cancer.feature_names))
print("\nDecision rules (first 20 lines):")
for line in rules.split("\n")[:20]:
    print(line)
# FEATURE IMPORTANCE (Gini-based)
importance_df = pd.DataFrame({
    "Feature": cancer.feature_names,
    "Importance": best_tree.feature_importances_,
}).sort_values("Importance", ascending=False)
print("\nTop 5 features by Gini importance:")
print(importance_df.head(5).round(4).to_string(index=False))
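Limiting max_depth is not the only pruning option: scikit-learn also supports cost-complexity pruning via the ccp_alpha parameter. A hedged sketch (same dataset; the alpha selection loop here is a simple illustration, not a tuned recipe):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# cost_complexity_pruning_path returns the effective alphas at which
# subtrees of the fully grown tree get pruned away
path = DecisionTreeClassifier(random_state=42).cost_complexity_pruning_path(X_train, y_train)

best_alpha, best_cv = 0.0, 0.0
for alpha in path.ccp_alphas:
    tree = DecisionTreeClassifier(random_state=42, ccp_alpha=alpha)
    cv = cross_val_score(tree, X_train, y_train, cv=5).mean()
    if cv > best_cv:
        best_alpha, best_cv = alpha, cv

print(f"best ccp_alpha={best_alpha:.5f}, CV acc={best_cv:.4f}")
```

Unlike max_depth, ccp_alpha prunes branches anywhere in the tree whose complexity is not paid for by impurity reduction, which often yields smaller trees at the same accuracy.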
Tip
Practice decision trees in small, isolated examples before integrating them into larger projects. Breaking concepts into small experiments builds genuine understanding faster than reading alone.
Tree = interpretable. Forest = robust. XGBoost = wins.
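To make that one-liner concrete, here is a quick, illustrative comparison (not a benchmark) of a pruned single tree against a random forest on the same dataset:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Single pruned tree vs. a bagged ensemble of trees, 5-fold CV on the same data
tree_cv = cross_val_score(DecisionTreeClassifier(max_depth=4, random_state=42), X, y, cv=5).mean()
forest_cv = cross_val_score(RandomForestClassifier(n_estimators=200, random_state=42), X, y, cv=5).mean()
print(f"single tree CV: {tree_cv:.4f}, random forest CV: {forest_cv:.4f}")
```

The forest typically edges out the single tree because averaging many decorrelated trees reduces variance, at the cost of losing the single-flowchart interpretability.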
Practice Task
(1) Write a working decision tree classification example from scratch without looking at notes. (2) Modify it to handle an edge case (empty input, null values, or an error state). (3) Share your solution in the Priygop community for feedback.
Common Mistake
A common mistake with decision trees is skipping edge-case testing: empty inputs, null values, and unexpected data types. Always validate boundary conditions to write robust, production-ready ML code.
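A minimal sketch of such boundary checks before prediction (the `validate_features` helper is hypothetical, not a scikit-learn API; sklearn performs its own checks, but validating early gives clearer error messages):

```python
import numpy as np

def validate_features(X, n_features):
    """Illustrative input checks to run before model.predict()."""
    X = np.asarray(X, dtype=float)
    if X.size == 0:
        raise ValueError("empty input: no samples to predict")
    if X.ndim != 2 or X.shape[1] != n_features:
        raise ValueError(f"expected shape (n_samples, {n_features}), got {X.shape}")
    if np.isnan(X).any():
        raise ValueError("input contains NaN values")
    return X

# valid 1-sample batch passes through unchanged
validate_features([[1.0, 2.0]], 2)
```

Note that np.asarray(..., dtype=float) also raises on non-numeric types, covering the "unexpected data type" case.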