ClearML is an open-source MLOps platform, available self-hosted or managed (`pip install clearml`). `Task.init(project_name="churn", task_name="Train GBM v1")` creates an experiment and auto-captures argparse args, the environment, installed packages, console output, and the Git commit. Manual logging goes through `logger = Task.current_task().get_logger()`: `logger.report_scalar("auc", "train", value=0.87, iteration=epoch)`, `logger.report_confusion_matrix("Confusion", "train", matrix=cm, iteration=0, xlabels=["no", "yes"], ylabels=["no", "yes"])`, `logger.report_image("plots", "roc_curve", image=fig)`, and `logger.report_table("summary", "metrics", table_plot=df)`. Artifacts are uploaded with `task.upload_artifact("model", artifact_object="model.pkl")`, and `task.connect(params_dict)` makes parameter values editable in the UI. Datasets are versioned via `Dataset.create(dataset_project="Churn", dataset_name="customers-v1")`, then `add_files("data/customers.csv")`, `upload()`, `finalize()`, and consumed via `Dataset.get(...)` plus `get_local_copy()`. Pipelines use `@PipelineDecorator.component` on each step function and `@PipelineDecorator.pipeline` on the orchestrator, with `PipelineDecorator.run_locally()` for testing. HPO runs through `HyperParameterOptimizer(Task.get_task(project_name="churn", task_name="Train GBM"), hyper_parameters=[UniformParameterRange("lr", 0.001, 0.3)], objective_metric_title="auc", objective_metric_series="val", optimizer_class=OptimizerOptuna, max_number_of_concurrent_tasks=4)`, started with `optimizer.start()`. The model registry wraps weights with `OutputModel(task=task, framework="scikit-learn")` and `model.update_weights("model.pkl")`. Claude Code generates ClearML Task configs, datasets, pipeline decorators, HPO setups, and TypeScript API clients.
# CLAUDE.md for ClearML
## ClearML Stack
- Version: clearml >= 1.14
- Task: Task.init(project_name, task_name) — auto-captures args/git/packages
- Log: logger.report_scalar/report_image/report_confusion_matrix/report_table
- Connect: task.connect(config_dict) — makes hyperparams editable in UI
- Dataset: Dataset.create → add_files → upload → finalize; Dataset.get → get_local_copy
- Pipeline: @PipelineDecorator.component + @PipelineDecorator.pipeline
- HPO: HyperParameterOptimizer with UniformParameterRange/DiscreteParameterRange
- Model: OutputModel(task) → update_weights + publish + tag
## Task Tracking
```python
# train_clearml.py — full ClearML experiment with logging
from __future__ import annotations

import pickle
from pathlib import Path

import matplotlib

matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from clearml import Dataset, Logger, OutputModel, Task
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import RocCurveDisplay, confusion_matrix, roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

FEATURE_COLS = ["age", "tenure_days", "monthly_spend", "support_tickets", "last_login_days"]


def main():
    # ── Init task ─────────────────────────────────────────────────────────
    task = Task.init(
        project_name="churn-prediction",
        task_name="Train GBM",
        task_type=Task.TaskTypes.training,
        reuse_last_task_id=False,
    )
    # Hyperparameters — connect makes them editable in the ClearML dashboard
    config = {
        "n_estimators": 200,
        "learning_rate": 0.05,
        "max_depth": 4,
        "min_samples_leaf": 10,
        "random_state": 42,
    }
    task.connect(config, name="hyperparams")
    logger: Logger = task.get_logger()

    # ── Load dataset ──────────────────────────────────────────────────────
    try:
        # Try the ClearML Dataset registry first
        dataset = Dataset.get(dataset_project="Churn", dataset_name="customers-v1")
        data_path = dataset.get_local_copy()
        df = pd.read_csv(f"{data_path}/customers.csv")
    except Exception:
        # Fall back to a local file
        df = pd.read_csv("data/train.csv")
    logger.report_single_value("n_samples", len(df))
    logger.report_single_value("target_rate", float(df["churned"].mean()))
    X = df[FEATURE_COLS].values
    y = df["churned"].values

    # ── Cross-validation ──────────────────────────────────────────────────
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=config["random_state"])
    aucs: list[float] = []
    for fold, (train_idx, val_idx) in enumerate(cv.split(X, y)):
        pipeline = Pipeline([
            ("scaler", StandardScaler()),
            ("clf", GradientBoostingClassifier(
                n_estimators=config["n_estimators"],
                learning_rate=config["learning_rate"],
                max_depth=config["max_depth"],
                min_samples_leaf=config["min_samples_leaf"],
                random_state=config["random_state"],
            )),
        ])
        pipeline.fit(X[train_idx], y[train_idx])
        auc = roc_auc_score(y[val_idx], pipeline.predict_proba(X[val_idx])[:, 1])
        aucs.append(auc)
        logger.report_scalar("auc", "val_fold", value=auc, iteration=fold)
    mean_auc = float(np.mean(aucs))
    logger.report_single_value("cv_auc_mean", mean_auc)
    logger.report_single_value("cv_auc_std", float(np.std(aucs)))
    logger.report_scalar("summary", "cv_auc_mean", value=mean_auc, iteration=0)

    # ── Full training ─────────────────────────────────────────────────────
    final = Pipeline([
        ("scaler", StandardScaler()),
        ("clf", GradientBoostingClassifier(**config)),
    ])
    final.fit(X, y)
    y_pred = final.predict(X)
    y_proba = final.predict_proba(X)[:, 1]
    train_auc = float(roc_auc_score(y, y_proba))
    logger.report_scalar("auc", "train", value=train_auc, iteration=0)
    logger.report_scalar("auc", "cv_mean", value=mean_auc, iteration=0)

    # ── Confusion matrix ──────────────────────────────────────────────────
    cm = confusion_matrix(y, y_pred)
    logger.report_confusion_matrix(
        title="Confusion Matrix",
        series="train",
        matrix=cm,
        iteration=0,
        xlabels=["no_churn", "churn"],
        ylabels=["no_churn", "churn"],
    )

    # ── ROC curve ─────────────────────────────────────────────────────────
    fig, ax = plt.subplots(figsize=(6, 5))
    RocCurveDisplay.from_predictions(y, y_proba, ax=ax)
    ax.set_title(f"ROC Curve (AUC={mean_auc:.4f})")
    logger.report_matplotlib_figure(title="ROC Curve", series="train", iteration=0, figure=fig)
    plt.close(fig)

    # ── Feature importance table ──────────────────────────────────────────
    importances = final.named_steps["clf"].feature_importances_
    fi_df = pd.DataFrame({"feature": FEATURE_COLS, "importance": importances})
    logger.report_table(title="Feature Importance", series="train", table_plot=fi_df, iteration=0)

    # ── Save and register model ───────────────────────────────────────────
    Path("models").mkdir(exist_ok=True)
    model_path = "models/churn_model.pkl"
    with open(model_path, "wb") as f:
        pickle.dump(final, f)
    output_model = OutputModel(task=task, framework="scikit-learn", label_enumeration={"churned": 1})
    output_model.update_weights(weights_filename=model_path, auto_delete_file=False)
    output_model.publish()
    output_model.add_tags(["gbm", "production-candidate"])
    task.upload_artifact("model", artifact_object=model_path)
    print(f"\nCV AUC: {mean_auc:.4f}")
    task.close()


if __name__ == "__main__":
    main()
```
## ClearML Pipeline
```python
# pipelines/churn_pipeline.py — ClearML Pipeline with decorator API
from __future__ import annotations

from clearml.automation import PipelineDecorator


@PipelineDecorator.component(
    return_values=["train_path", "test_path"],
    cache=True,
    task_type="data_processing",
)
def prepare_data(dataset_name: str = "customers-v1", test_size: float = 0.2) -> tuple[str, str]:
    # Components run as standalone tasks — import everything they need locally
    import os

    import pandas as pd
    from clearml import Dataset
    from sklearn.model_selection import train_test_split

    dataset = Dataset.get(dataset_project="Churn", dataset_name=dataset_name)
    data_path = dataset.get_local_copy()
    df = pd.read_csv(f"{data_path}/customers.csv")
    train_df, test_df = train_test_split(df, test_size=test_size, stratify=df["churned"], random_state=42)
    os.makedirs("/tmp/churn_pipeline", exist_ok=True)
    train_df.to_csv("/tmp/churn_pipeline/train.csv", index=False)
    test_df.to_csv("/tmp/churn_pipeline/test.csv", index=False)
    return "/tmp/churn_pipeline/train.csv", "/tmp/churn_pipeline/test.csv"


@PipelineDecorator.component(
    return_values=["model_path", "cv_auc"],
    task_type="training",
    packages=["scikit-learn>=1.2"],
)
def train_model(
    train_path: str,
    n_estimators: int = 200,
    learning_rate: float = 0.05,
) -> tuple[str, float]:
    import os
    import pickle

    import numpy as np
    import pandas as pd
    from clearml import Task
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    FEATURE_COLS = ["age", "tenure_days", "monthly_spend", "support_tickets", "last_login_days"]
    df = pd.read_csv(train_path)
    X, y = df[FEATURE_COLS].values, df["churned"].values
    pipeline = Pipeline([
        ("scaler", StandardScaler()),
        ("clf", GradientBoostingClassifier(n_estimators=n_estimators, learning_rate=learning_rate)),
    ])
    cv_auc = float(np.mean(cross_val_score(pipeline, X, y, cv=5, scoring="roc_auc")))
    pipeline.fit(X, y)
    task = Task.current_task()
    task.get_logger().report_single_value("cv_auc", cv_auc)
    os.makedirs("/tmp/churn_pipeline", exist_ok=True)  # may run on a fresh worker
    model_path = "/tmp/churn_pipeline/model.pkl"
    with open(model_path, "wb") as f:
        pickle.dump(pipeline, f)
    return model_path, cv_auc


@PipelineDecorator.pipeline(
    name="churn-training-pipeline",
    project="churn-prediction",
    version="1.0",
)
def churn_pipeline(
    dataset_name: str = "customers-v1",
    n_estimators: int = 200,
    learning_rate: float = 0.05,
):
    train_path, test_path = prepare_data(dataset_name=dataset_name)
    model_path, cv_auc = train_model(train_path=train_path, n_estimators=n_estimators, learning_rate=learning_rate)
    print(f"Pipeline complete: model={model_path}, cv_auc={cv_auc:.4f}")


if __name__ == "__main__":
    PipelineDecorator.run_locally()  # Run all steps in-process for local testing
    churn_pipeline()
```
Choose ZenML instead when you need a stack-portable pipeline framework with typed step outputs, artifact lineage, and support for Kubeflow, SageMaker, or Vertex orchestrators without code changes; ZenML abstracts the stack, while ClearML is a more opinionated all-in-one platform that self-hosts experiment tracking, dataset versioning, the model registry, and clearml-agent distributed task execution in a single open-source deployment. Choose MLflow instead when you need a pure open-source experiment tracking server with a minimal dependency footprint, model flavors for many frameworks, and REST API model deployment; MLflow is simpler to run standalone, while ClearML provides a richer integrated platform with agent-based remote execution, Dataset versioning with lineage, a built-in HPO optimizer, and a full-featured web UI without additional infrastructure such as MinIO or Postgres. The Claude Skills 360 bundle includes ClearML skill sets covering Task tracking, Dataset versioning, Pipeline decorators, HPO setup, and TypeScript API clients. Start with the free tier to try open-source MLOps generation.