Optuna finds the best hyperparameters with Bayesian optimization (`pip install optuna`). Create a study with `study = optuna.create_study(direction="maximize", study_name="churn-gbm", sampler=optuna.samplers.TPESampler())`, then define an objective: inside `def objective(trial):`, sample with `trial.suggest_float("lr", 1e-4, 0.1, log=True)`, `trial.suggest_int("n_estimators", 50, 500)`, and `trial.suggest_categorical("criterion", ["gini", "entropy"])`, and return the metric to optimize. Run the search with `study.optimize(objective, n_trials=100, timeout=3600, n_jobs=4)` and read results from `study.best_params`, `study.best_value`, and `study.best_trial`. For pruning, pass `pruner=optuna.pruners.MedianPruner(n_startup_trials=5)` to stop unpromising trials early; inside the training loop, report intermediate values with `trial.report(value, step)` followed by `if trial.should_prune(): raise optuna.TrialPruned()`. Persistent storage (`storage="postgresql://user:pass@host/optuna"` or `storage="sqlite:///optuna.db"`) enables parallelism across processes. Visualization: `optuna.visualization.plot_optimization_history(study)`, `plot_param_importances(study)`, and `plot_contour(study, params=["lr", "max_depth"])`. Integrations: `LightGBMTunerCV` auto-tunes all LightGBM params, `optuna.integration.PyTorchLightningPruningCallback(trial, monitor="val_loss")` covers Lightning, and `optuna.integration.MLflowCallback` logs each trial to MLflow. The dashboard runs with `optuna-dashboard sqlite:///optuna.db`; `study.trials_dataframe()` exports trials as a pandas DataFrame; `optuna.copy_study` and `study.add_trial` support warm starts. Claude Code generates Optuna objective functions, sampler configs, pruning setups, parallel study configs, and integration callbacks.
# CLAUDE.md for Optuna
## Optuna Stack
- Version: optuna >= 3.5
- Study: optuna.create_study(direction, sampler=TPESampler()/CmaEsSampler(), pruner=MedianPruner())
- Trials: trial.suggest_float(name, low, high, log=True) / suggest_int / suggest_categorical
- Prune: trial.report(value, step); if trial.should_prune(): raise optuna.TrialPruned()
- Run: study.optimize(objective, n_trials=100, n_jobs=-1, timeout=3600)
- Best: study.best_params — dict, study.best_value — float
- Storage: optuna.create_study(storage="postgresql://..." or "sqlite:///") for parallel
- Viz: optuna.visualization.plot_param_importances(study) etc.
## Objective Functions
```python
# optimization/optuna_search.py — Optuna hyperparameter optimization
from __future__ import annotations

import pickle

import numpy as np
import optuna
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

optuna.logging.set_verbosity(optuna.logging.WARNING)

FEATURE_COLS = ["age", "tenure_days", "monthly_spend", "support_tickets", "last_login_days"]
TARGET_COL = "churned"


def load_data(path: str = "data/train.csv") -> tuple[np.ndarray, np.ndarray]:
    df = pd.read_csv(path)
    return df[FEATURE_COLS].values, df[TARGET_COL].values


# ── Objective: GradientBoosting ───────────────────────────────────────────────
def gbm_objective(trial: optuna.Trial) -> float:
    """Objective for tuning GradientBoostingClassifier."""
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 50, 600, step=50),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "max_depth": trial.suggest_int("max_depth", 2, 8),
        "min_samples_leaf": trial.suggest_int("min_samples_leaf", 5, 100, log=True),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
        "max_features": trial.suggest_categorical("max_features", ["sqrt", "log2", None]),
        "min_samples_split": trial.suggest_int("min_samples_split", 2, 20),
    }
    X, y = load_data()
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
    aucs = []
    for fold, (train_idx, val_idx) in enumerate(cv.split(X, y)):
        X_tr, X_val = X[train_idx], X[val_idx]
        y_tr, y_val = y[train_idx], y[val_idx]
        pipeline = Pipeline([
            ("scaler", StandardScaler()),
            ("clf", GradientBoostingClassifier(**params, random_state=42)),
        ])
        pipeline.fit(X_tr, y_tr)
        auc = roc_auc_score(y_val, pipeline.predict_proba(X_val)[:, 1])
        aucs.append(auc)
        # Report the running mean AUC after each fold so the pruner can stop
        # unpromising trials before all five folds are fit.
        trial.report(float(np.mean(aucs)), step=fold)
        if trial.should_prune():
            raise optuna.TrialPruned()
    return float(np.mean(aucs))


# ── Objective: model type selection ──────────────────────────────────────────
def multi_model_objective(trial: optuna.Trial) -> float:
    """Objective that also searches over model architecture."""
    model_type = trial.suggest_categorical("model_type", ["gbm", "rf"])
    X, y = load_data()
    cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)
    if model_type == "gbm":
        clf = GradientBoostingClassifier(
            n_estimators=trial.suggest_int("n_estimators", 50, 400),
            learning_rate=trial.suggest_float("lr", 1e-3, 0.2, log=True),
            max_depth=trial.suggest_int("max_depth", 2, 6),
            random_state=42,
        )
    else:  # rf
        clf = RandomForestClassifier(
            n_estimators=trial.suggest_int("n_estimators", 50, 400),
            max_depth=trial.suggest_int("max_depth", 3, 15),
            min_samples_leaf=trial.suggest_int("min_samples_leaf", 1, 20),
            class_weight=trial.suggest_categorical("class_weight", [None, "balanced"]),
            random_state=42,
        )
    pipeline = Pipeline([("scaler", StandardScaler()), ("clf", clf)])
    aucs = []
    for train_idx, val_idx in cv.split(X, y):
        pipeline.fit(X[train_idx], y[train_idx])
        aucs.append(roc_auc_score(y[val_idx], pipeline.predict_proba(X[val_idx])[:, 1]))
    return float(np.mean(aucs))


# ── Study creation and optimization ──────────────────────────────────────────
def run_optimization(
    n_trials: int = 100,
    n_jobs: int = 4,
    storage: str = "sqlite:///optuna_churn.db",
    study_name: str = "churn-gbm",
) -> optuna.Study:
    """Create or load a study and run optimization."""
    study = optuna.create_study(
        study_name=study_name,
        direction="maximize",
        storage=storage,
        load_if_exists=True,
        sampler=optuna.samplers.TPESampler(
            n_startup_trials=10,
            multivariate=True,
            constant_liar=True,  # Better for parallel execution
        ),
        pruner=optuna.pruners.MedianPruner(
            n_startup_trials=5,
            n_warmup_steps=1,
        ),
    )
    study.optimize(
        gbm_objective,
        n_trials=n_trials,
        n_jobs=n_jobs,
        timeout=None,
        show_progress_bar=True,
        callbacks=[
            optuna.study.MaxTrialsCallback(n_trials, states=[optuna.trial.TrialState.COMPLETE]),
        ],
    )
    print(f"\nBest AUC: {study.best_value:.4f}")
    print(f"Best params: {study.best_params}")
    return study


# ── Retrain best model ────────────────────────────────────────────────────────
def retrain_best(study: optuna.Study, output_path: str = "best_model.pkl") -> Pipeline:
    """Retrain on full data using best hyperparameters."""
    X, y = load_data()
    params = study.best_params.copy()
    pipeline = Pipeline([
        ("scaler", StandardScaler()),
        ("clf", GradientBoostingClassifier(**params, random_state=42)),
    ])
    pipeline.fit(X, y)
    with open(output_path, "wb") as f:
        pickle.dump(pipeline, f)
    print(f"Best model saved to {output_path}")
    return pipeline


if __name__ == "__main__":
    study = run_optimization(n_trials=100, n_jobs=2)
    retrain_best(study)
```
## LightGBM Integration
```python
# optimization/lgbm_optuna.py — LightGBMTunerCV auto-tuning
from __future__ import annotations

import lightgbm as lgb
import optuna
import optuna.integration.lightgbm as lgb_optuna
import pandas as pd
from sklearn.model_selection import train_test_split

optuna.logging.set_verbosity(optuna.logging.WARNING)

FEATURE_COLS = ["age", "tenure_days", "monthly_spend", "support_tickets", "last_login_days"]


def tune_lightgbm(data_path: str = "data/train.csv") -> dict:
    """Use LightGBMTunerCV to auto-tune all LightGBM hyperparameters."""
    df = pd.read_csv(data_path)
    # LightGBMTunerCV performs its own cross-validation split, so the full
    # training set goes into a single Dataset — no manual hold-out needed.
    dtrain = lgb.Dataset(df[FEATURE_COLS].values, label=df["churned"].values)
    params = {
        "objective": "binary",
        "metric": "auc",
        "verbosity": -1,
        "boosting_type": "gbdt",
    }
    # LightGBMTunerCV handles the full parameter search automatically
    tuner = lgb_optuna.LightGBMTunerCV(
        params,
        dtrain,
        num_boost_round=1000,
        nfold=5,
        seed=42,
        callbacks=[lgb.early_stopping(50), lgb.log_evaluation(-1)],
    )
    tuner.run()
    best_params = tuner.best_params
    print(f"Best AUC (CV): {tuner.best_score:.4f}")
    print(f"Best params: {best_params}")
    return best_params


def lgbm_objective_manual(trial: optuna.Trial) -> float:
    """Manual LightGBM objective for full control."""
    df = pd.read_csv("data/train.csv")
    X, y = df[FEATURE_COLS].values, df["churned"].values
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)
    params = {
        "objective": "binary",
        "metric": "auc",
        "verbosity": -1,
        "num_leaves": trial.suggest_int("num_leaves", 20, 300),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "feature_fraction": trial.suggest_float("feature_fraction", 0.4, 1.0),
        "bagging_fraction": trial.suggest_float("bagging_fraction", 0.4, 1.0),
        "bagging_freq": trial.suggest_int("bagging_freq", 1, 7),
        "min_child_samples": trial.suggest_int("min_child_samples", 5, 100),
        "reg_alpha": trial.suggest_float("reg_alpha", 1e-8, 10.0, log=True),
        "reg_lambda": trial.suggest_float("reg_lambda", 1e-8, 10.0, log=True),
    }
    callbacks = [
        lgb.early_stopping(30, verbose=False),
        lgb.log_evaluation(-1),
        # Prunes the trial when its validation AUC falls behind its peers
        optuna.integration.LightGBMPruningCallback(trial, "auc"),
    ]
    model = lgb.train(
        params,
        lgb.Dataset(X_tr, label=y_tr),
        num_boost_round=500,
        valid_sets=[lgb.Dataset(X_val, label=y_val)],
        callbacks=callbacks,
    )
    return float(model.best_score["valid_0"]["auc"])
```
## Visualization and Dashboard
```python
# scripts/visualize_study.py — Optuna visualization + dashboard
from pathlib import Path

import optuna
import optuna.visualization as vis
import plotly.io as pio


def analyze_study(study_name: str, storage: str = "sqlite:///optuna_churn.db") -> None:
    study = optuna.load_study(study_name=study_name, storage=storage)
    completed = [t for t in study.trials if t.state == optuna.trial.TrialState.COMPLETE]
    pruned = [t for t in study.trials if t.state == optuna.trial.TrialState.PRUNED]
    print(f"Completed trials: {len(completed)}")
    print(f"Pruned trials: {len(pruned)}")
    print(f"Best value: {study.best_value:.4f}")
    print(f"Best params: {study.best_params}")

    # Export top-10 trials
    df = study.trials_dataframe()
    cols = ["number", "value", "params_n_estimators", "params_learning_rate", "params_max_depth"]
    print("\nTop 10 trials by AUC:")
    print(df.nlargest(10, "value")[cols])

    # Save plots
    Path("reports").mkdir(exist_ok=True)
    pio.write_html(vis.plot_optimization_history(study), "reports/optimization_history.html")
    pio.write_html(vis.plot_param_importances(study), "reports/param_importances.html")
    pio.write_html(vis.plot_contour(study, params=["learning_rate", "max_depth"]),
                   "reports/contour_lr_depth.html")
    pio.write_html(vis.plot_parallel_coordinate(study), "reports/parallel_coordinate.html")
    print("\nCharts saved to reports/")
    print("Run: optuna-dashboard sqlite:///optuna_churn.db  # for interactive UI")


if __name__ == "__main__":
    analyze_study("churn-gbm")
```
Consider the Ray Tune alternative when you need distributed hyperparameter search across a Ray cluster, Population Based Training, the ASHA scheduler, or integration with arbitrary ML frameworks including custom training loops: Ray Tune scales across many machines, while Optuna is simpler to set up for single-machine or multi-process search, supports more sampler algorithms out of the box (TPE, CMA-ES, QMC), and ships the LightGBMTuner auto-integration. Consider the Weights & Biases Sweeps alternative when you already use W&B for experiment tracking and want hyperparameter search tightly integrated with your W&B dashboard, artifact logging, and team-visible run comparisons: Sweeps are simpler when you are already logged into W&B, while Optuna works as a standalone library with no external service dependency and supports pluggable storage backends. The Claude Skills 360 bundle includes Optuna skill sets covering objective functions, TPE and CMA-ES samplers, pruning, LightGBM integration, and visualization. Start with the free tier to try hyperparameter optimization generation.