SHAP explains any model's predictions using Shapley values from game theory. Install: pip install shap; then import shap. Tree models: explainer = shap.TreeExplainer(model); shap_values = explainer(X) returns a shap.Explanation object. Access values: shap_values.values (N, features), .base_values, .data. Single observation: shap_values[0]. Binary classifiers: shap_values[:, :, 1] selects the positive class. Beeswarm summary: shap.plots.beeswarm(shap_values) shows all features and samples. Bar plot: shap.plots.bar(shap_values) gives mean |SHAP| per feature. Waterfall: shap.plots.waterfall(shap_values[0]) breaks down a single prediction. Force plot: shap.plots.force(shap_values[0]) renders interactive HTML. Dependence: shap.dependence_plot("feature_name", shap_values.values, X) plots SHAP value against feature value with interaction coloring. Linear models: shap.LinearExplainer(linear_model, X_train). Deep models: shap.DeepExplainer(keras_model, background). Any model: shap.KernelExplainer(predict_fn, shap.sample(X, 100)). Global importance: np.abs(shap_values.values).mean(axis=0). Interaction values: tree_explainer.shap_interaction_values(X) returns (N, features, features). Custom explanations: shap.Explanation(values, base_values, data, feature_names). Claude Code generates SHAP explainability dashboards, feature importance reporters, fairness auditors, and model debugging scripts.
CLAUDE.md for SHAP
## SHAP Stack
- Version: shap >= 0.44
- Tree: shap.TreeExplainer(model)(X) → Explanation object
- Linear: shap.LinearExplainer(linear_model, X_background)(X)
- Deep: shap.DeepExplainer(nn_model, X_background)(X)
- Any model: shap.KernelExplainer(predict_fn, shap.sample(X, 100))(X)
- Access: .values (N,F) | .base_values | .data
- Global: shap.plots.bar(shap_values) | shap.plots.beeswarm(shap_values)
- Local: shap.plots.waterfall(shap_values[i]) | shap.plots.force(shap_values[i])
- Binary: shap_values[:,:,1] selects the positive-class SHAP values
SHAP Explainability Pipeline
# ml/shap_pipeline.py — model explainability with SHAP
from __future__ import annotations
import warnings
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; must be set before pyplot is imported
import matplotlib.pyplot as plt
import shap
warnings.filterwarnings("ignore")
# ── 1. Explainer creation ─────────────────────────────────────────────────────
def create_tree_explainer(
model,
feature_perturbation: str = "tree_path_dependent",
model_output: str = "raw", # "raw" | "probability" | "log_loss"
) -> shap.TreeExplainer:
"""
Create TreeExplainer for tree-based models.
Supports XGBoost, LightGBM, scikit-learn trees, CatBoost, etc.
feature_perturbation:
- "tree_path_dependent": exact SHAP (fast, no background needed)
- "interventional": causal SHAP (requires background data)
model_output:
- "raw": log-odds for classifiers, raw value for regressors
- "probability": probability scale (slower)
"""
return shap.TreeExplainer(
model,
feature_perturbation=feature_perturbation,
model_output=model_output,
)
def create_linear_explainer(
    model,
    X_background: np.ndarray | pd.DataFrame,
) -> shap.LinearExplainer:
    """
    Create LinearExplainer for linear models (LogisticRegression, Ridge, Lasso, ElasticNet).
    X_background should be a representative sample (~100-1000 rows) used as the
    masking (background) distribution.
    """
    masker = shap.maskers.Independent(X_background)
    return shap.LinearExplainer(model, masker)
def create_deep_explainer(
model,
X_background: np.ndarray,
n_background: int = 100,
) -> shap.DeepExplainer:
"""
Create DeepExplainer for Keras/PyTorch neural networks.
X_background: representative background samples for baseline expectation.
"""
bg = shap.sample(X_background, n_background) if len(X_background) > n_background else X_background
return shap.DeepExplainer(model, bg)
def create_kernel_explainer(
predict_fn,
X_background: np.ndarray | pd.DataFrame,
n_background: int = 100,
link: str = "identity", # "identity" | "logit"
) -> shap.KernelExplainer:
"""
Model-agnostic KernelExplainer (works with ANY predict function).
Slower than TreeExplainer — use for black-box / custom models.
link="logit" for f(x)=probability → SHAP values in log-odds space.
"""
bg = shap.sample(X_background, n_background)
return shap.KernelExplainer(predict_fn, bg, link=link)
# ── 2. Computing SHAP values ──────────────────────────────────────────────────
def compute_shap_values(
    explainer,
    X: np.ndarray | pd.DataFrame,
    check_additivity: bool = False,
) -> shap.Explanation:
    """
    Compute SHAP values for a dataset.
    Returns shap.Explanation with .values (N, F), .base_values, .data.
    check_additivity applies only to TreeExplainer; False skips the
    additivity sum check for speed.
    """
    if isinstance(explainer, shap.TreeExplainer):
        return explainer(X, check_additivity=check_additivity)
    return explainer(X)
def shap_values_binary(shap_vals: shap.Explanation) -> shap.Explanation:
"""
For binary classifiers that return (N, F, 2) SHAP values,
extract the positive class (index 1).
"""
if shap_vals.values.ndim == 3:
return shap_vals[:, :, 1]
return shap_vals
# ── 3. Global feature importance ─────────────────────────────────────────────
def global_importance(
shap_vals: shap.Explanation,
    feature_names: list[str] | None = None,
top_n: int = 20,
) -> pd.DataFrame:
"""
Compute mean |SHAP| for global feature importance.
Returns DataFrame sorted by importance descending.
"""
vals = shap_vals.values if hasattr(shap_vals, "values") else shap_vals
if vals.ndim == 3:
vals = vals[:, :, 1]
names = feature_names or (
list(shap_vals.feature_names) if hasattr(shap_vals, "feature_names") and shap_vals.feature_names is not None
else [f"feature_{i}" for i in range(vals.shape[1])]
)
mean_abs = np.abs(vals).mean(axis=0)
df = pd.DataFrame({"feature": names, "mean_abs_shap": mean_abs})
return df.sort_values("mean_abs_shap", ascending=False).head(top_n).reset_index(drop=True)
def feature_direction(
shap_vals: shap.Explanation,
X: np.ndarray | pd.DataFrame,
    feature_names: list[str] | None = None,
) -> pd.DataFrame:
"""
Compute correlation between feature value and SHAP value.
Positive = feature increases prediction; negative = decreases.
"""
vals = shap_vals.values if hasattr(shap_vals, "values") else shap_vals
if vals.ndim == 3:
vals = vals[:, :, 1]
X_arr = X.values if isinstance(X, pd.DataFrame) else X
names = feature_names or [f"f{i}" for i in range(vals.shape[1])]
rows = []
for i, name in enumerate(names):
        with np.errstate(invalid="ignore"):
            corr = float(np.corrcoef(X_arr[:, i], vals[:, i])[0, 1])
        if np.isnan(corr):  # constant feature or constant SHAP column
            corr = 0.0
        rows.append({"feature": name, "corr": round(corr, 4),
                     "direction": "positive" if corr > 0 else "negative"})
return pd.DataFrame(rows).sort_values("corr", key=abs, ascending=False)
# ── 4. Individual prediction explanation ─────────────────────────────────────
def explain_prediction(
    shap_vals: shap.Explanation,
    idx: int = 0,
    X: pd.DataFrame | None = None,
    top_n: int = 10,
) -> dict:
    """
    Explain a single prediction.
    Returns base_value, prediction (base + sum of SHAP values), and top contributions.
    """
    sv = shap_vals[idx]
    if sv.values.ndim == 2:  # multi-output (e.g. binary classifier): take positive class
        sv = shap_vals[idx, :, 1]
    base = float(sv.base_values)
    values = np.asarray(sv.values)
    names = list(sv.feature_names) if getattr(sv, "feature_names", None) is not None else None
    if names is None:
        names = list(X.columns) if X is not None else [f"f{i}" for i in range(len(values))]
    contribs = sorted(zip(names, values), key=lambda x: abs(x[1]), reverse=True)[:top_n]
    prediction = base + float(values.sum())
return {
"base_value": round(base, 4),
"prediction": round(prediction, 4),
"top_features": [(n, round(float(v), 4)) for n, v in contribs],
}
# ── 5. Plotting ───────────────────────────────────────────────────────────────
def save_summary_plot(
shap_vals: shap.Explanation,
X: pd.DataFrame,
output_path: str = "shap_summary.png",
    plot_type: str = "dot",  # "dot" (beeswarm-style, default) | "bar" | "violin"
    max_display: int = 20,
) -> str:
    """Save SHAP summary (beeswarm) plot as PNG."""
    plt.figure(figsize=(10, max_display * 0.4 + 2))
shap.summary_plot(
shap_vals, X,
plot_type=plot_type,
max_display=max_display,
show=False,
)
plt.tight_layout()
plt.savefig(output_path, dpi=120, bbox_inches="tight")
plt.close()
print(f"Summary plot saved: {output_path}")
return output_path
def save_waterfall_plot(
shap_vals: shap.Explanation,
idx: int = 0,
output_path: str = "shap_waterfall.png",
max_display: int = 15,
) -> str:
"""Save waterfall plot for a single prediction."""
sv = shap_vals[idx]
if sv.values.ndim == 2:
sv = shap_vals[idx, :, 1]
plt.figure(figsize=(10, max_display * 0.4 + 2))
shap.plots.waterfall(sv, max_display=max_display, show=False)
plt.tight_layout()
plt.savefig(output_path, dpi=120, bbox_inches="tight")
plt.close()
print(f"Waterfall plot saved: {output_path}")
return output_path
def save_bar_plot(
shap_vals: shap.Explanation,
output_path: str = "shap_bar.png",
max_display: int = 20,
) -> str:
"""Save global importance bar chart."""
plt.figure(figsize=(8, max_display * 0.4 + 2))
shap.plots.bar(shap_vals, max_display=max_display, show=False)
plt.tight_layout()
plt.savefig(output_path, dpi=120, bbox_inches="tight")
plt.close()
return output_path
def save_dependence_plot(
shap_vals: shap.Explanation,
feature: str,
X: pd.DataFrame,
interaction: str = "auto",
    output_path: str | None = None,
) -> str:
"""
Save dependence plot showing SHAP value vs feature value.
interaction="auto" picks the feature with strongest interaction.
"""
output_path = output_path or f"shap_dep_{feature}.png"
plt.figure(figsize=(8, 5))
vals = shap_vals.values if shap_vals.values.ndim == 2 else shap_vals.values[:, :, 1]
shap.dependence_plot(feature, vals, X,
interaction_index=interaction, show=False)
plt.tight_layout()
plt.savefig(output_path, dpi=120, bbox_inches="tight")
plt.close()
return output_path
# ── 6. Model debugging / fairness ─────────────────────────────────────────────
def shap_by_subgroup(
shap_vals: shap.Explanation,
group_labels: np.ndarray,
    feature_names: list[str] | None = None,
) -> pd.DataFrame:
"""
Compare mean |SHAP| importance across subgroups.
Useful for detecting fairness issues (different feature usage per group).
"""
vals = shap_vals.values
if vals.ndim == 3:
vals = vals[:, :, 1]
names = feature_names or [f"f{i}" for i in range(vals.shape[1])]
groups = np.unique(group_labels)
rows = {}
for g in groups:
mask = group_labels == g
rows[f"group_{g}"] = np.abs(vals[mask]).mean(axis=0)
return pd.DataFrame(rows, index=names).sort_values(f"group_{groups[0]}", ascending=False)
# ── Demo ──────────────────────────────────────────────────────────────────────
if __name__ == "__main__":
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
print("SHAP Explainability Demo")
print("="*50)
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, random_state=42)
feature_names = [f"feature_{i}" for i in range(10)]
X_df = pd.DataFrame(X, columns=feature_names)
X_tr, X_te, y_tr, y_te = train_test_split(X_df, y, test_size=0.2, random_state=42)
# Train model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_tr, y_tr)
print(f"Model accuracy: {model.score(X_te, y_te):.3f}")
# SHAP
explainer = create_tree_explainer(model)
shap_vals = compute_shap_values(explainer, X_te)
shap_vals = shap_values_binary(shap_vals)
# Global importance
imp = global_importance(shap_vals, feature_names=feature_names, top_n=5)
print(f"\nGlobal feature importance (top 5):\n{imp}")
# Individual explanation
explanation = explain_prediction(shap_vals, idx=0, X=X_te)
print(f"\nPrediction explanation (sample 0):")
print(f" base_value = {explanation['base_value']}")
print(f" prediction = {explanation['prediction']}")
print(f" top features: {explanation['top_features'][:3]}")
# Feature direction
dirs = feature_direction(shap_vals, X_te, feature_names)
print(f"\nFeature directions (top 5):\n{dirs.head()}")
# Save plots
save_summary_plot(shap_vals, X_te, "/tmp/shap_summary.png")
save_waterfall_plot(shap_vals, idx=0, output_path="/tmp/shap_waterfall.png")
save_bar_plot(shap_vals, output_path="/tmp/shap_bar.png")
When to reach for LIME instead: LIME fits locally faithful linear approximations around a single prediction and produces simple "this is why" summaries that are easier to communicate to non-technical stakeholders. SHAP's Shapley values, by contrast, satisfy the four classic attribution axioms (efficiency, symmetry, dummy, linearity): they sum exactly to the gap between the prediction and the baseline and attribute shared contributions consistently, which makes SHAP the standard for regulatory compliance (GDPR right to explanation, model audit trails) where attribution accuracy matters more than communication simplicity. When to reach for scikit-learn's feature_importances_ instead: the built-in impurity-based importance is computed during training and is essentially free, but it is biased toward high-cardinality and correlated features; SHAP's TreeExplainer produces consistent, interaction-aware importances efficiently via the Tree SHAP algorithm, and dependence_plot reveals non-linear feature effects that a single importance score hides. The Claude Skills 360 bundle includes SHAP skill sets covering TreeExplainer for gradient boosting, LinearExplainer, DeepExplainer for neural networks, KernelExplainer for any model, global importance, waterfall and force plots, beeswarm summaries, dependence plots, interaction values, and subgroup fairness analysis. Start with the free tier to try model explainability code generation.