Ultralytics YOLO is one of the fastest object detection and segmentation frameworks. Install with pip install ultralytics, then from ultralytics import YOLO and load with model = YOLO("yolo11n.pt") — sizes: n (nano), s (small), m (medium), l (large), x (extra-large). Task-specific weights: yolo11n-seg.pt (segmentation), yolo11n-pose.pt (pose), yolo11n-obb.pt (oriented boxes). Detect: results = model("image.jpg", conf=0.45, iou=0.5); batch: results = model(["img1.jpg", "img2.jpg", "img3.jpg"]); video: results = model("video.mp4", stream=True), then for r in results: r.save(filename="pred.jpg"). Access boxes via results[0].boxes.xyxy, .conf, .cls, .xywh; class names via results[0].names; results[0].plot() returns an annotated NumPy array and results[0].save(filename="out.jpg") writes it. Train: model.train(data="coco.yaml", epochs=100, imgsz=640, batch=16, device=0); resume with model.train(resume=True); model.val() returns metrics including box.map50. Export: model.export(format="onnx") or "tensorrt", "coreml", "tflite". Track: results = model.track("video.mp4", persist=True, tracker="bytetrack.yaml"), with track IDs in results[0].boxes.id. Segmentation polygons: results[0].masks.xy; pose keypoints: results[0].keypoints.xy — shape (N, 17, 2) for COCO keypoints. Custom dataset YAML: path: ./dataset, train: images/train, val: images/val, nc: 2, names: [cat, dog]. from ultralytics import solutions provides speed estimation, queue management, and distance calculation. Claude Code generates YOLO inference scripts, training configs, custom dataset pipelines, video trackers, and export workflows.
CLAUDE.md for Ultralytics YOLO
## Ultralytics Stack
- Version: ultralytics >= 8.3
- Load: YOLO("yolo11n.pt" | "yolo11n-seg.pt" | "yolo11n-pose.pt" | custom.pt)
- Detect: model.predict(source, conf=0.45, iou=0.5, device=0, stream=True)
- Results: .boxes.xyxy | .boxes.conf | .boxes.cls | .masks.xy | .keypoints.xy
- Train: model.train(data="dataset.yaml", epochs, imgsz=640, batch)
- Export: model.export(format="onnx" | "tensorrt" | "coreml" | "tflite")
- Track: model.track(source, persist=True, tracker="bytetrack.yaml")
- Val: model.val() → metrics.box.map50 (mAP@0.5), map (mAP@0.5:0.95)
- Source: path | URL | np.ndarray | torch.Tensor | list | generator
## Ultralytics YOLO Pipeline
# vision/ultralytics_pipeline.py — object detection and tracking with YOLO
from __future__ import annotations
import os
import cv2
import numpy as np
from pathlib import Path
from typing import Generator
from ultralytics import YOLO
# ── 1. Model loading ──────────────────────────────────────────────────────────
def load_detector(model_size: str = "n", pretrained: bool = True) -> YOLO:
"""
Load YOLO detection model.
    Sizes (approx. params): n (2.6M) | s (9.4M) | m (20M) | l (25M) | x (57M)
"""
model_name = f"yolo11{model_size}.pt" if pretrained else f"yolo11{model_size}.yaml"
model = YOLO(model_name)
    print(f"Loaded YOLO11{model_size.upper()} ({model_name})")
return model
def load_segmenter(model_size: str = "n") -> YOLO:
"""Load YOLO instance segmentation model."""
return YOLO(f"yolo11{model_size}-seg.pt")
def load_pose_model(model_size: str = "n") -> YOLO:
"""Load YOLO pose estimation model (17 COCO keypoints)."""
return YOLO(f"yolo11{model_size}-pose.pt")
def load_custom_model(weights_path: str) -> YOLO:
"""Load a custom or fine-tuned YOLO model."""
return YOLO(weights_path)
# ── 2. Image inference ────────────────────────────────────────────────────────
def detect_objects(
model: YOLO,
source, # str path | np.ndarray | list | URL
conf: float = 0.45,
iou: float = 0.5,
imgsz: int = 640,
    classes: list[int] | None = None,  # Filter to specific class IDs
    agnostic: bool = False,  # Class-agnostic NMS
device: str = "cpu",
) -> list[dict] | list[list[dict]]:
    """
    Run object detection and return structured results.
    A single image returns a flat list of dicts (bbox, score, class);
    a batch returns one such list per image.
    """
results = model.predict(
source=source,
conf=conf,
iou=iou,
imgsz=imgsz,
classes=classes,
agnostic_nms=agnostic,
device=device,
verbose=False,
)
detections = []
for r in results:
boxes = r.boxes
img_dets = []
for i in range(len(boxes)):
xyxy = boxes.xyxy[i].tolist()
score = float(boxes.conf[i])
cls = int(boxes.cls[i])
name = r.names[cls]
img_dets.append({
"bbox": [round(c, 1) for c in xyxy], # [x1, y1, x2, y2]
"score": round(score, 3),
"class_id": cls,
"class_name": name,
})
detections.append(img_dets)
    # Flatten for single-image input; keep nested lists for batches
    return detections[0] if len(detections) == 1 else detections
def count_objects(
model: YOLO,
source,
    classes: list[int] | None = None,
) -> dict[str, int]:
    """Count detections per class in an image."""
    dets = detect_objects(model, source, classes=classes)
    # Normalize: a single image yields a flat list of dicts, a batch a nested list
    if dets and isinstance(dets[0], dict):
        dets = [dets]
    counts: dict[str, int] = {}
    for img_dets in dets:
        for d in img_dets:
            counts[d["class_name"]] = counts.get(d["class_name"], 0) + 1
    return counts
# ── 3. Instance segmentation ──────────────────────────────────────────────────
def segment_objects(
model: YOLO,
image: np.ndarray, # BGR image (OpenCV format)
conf: float = 0.45,
device: str = "cpu",
) -> tuple[np.ndarray, list[dict]]:
"""
Run instance segmentation.
Returns annotated image and list of dicts with mask polygons.
"""
results = model.predict(image, conf=conf, device=device, verbose=False)
r = results[0]
segments = []
if r.masks is not None:
for i, (mask, box) in enumerate(zip(r.masks.xy, r.boxes)):
segments.append({
"polygon": mask.tolist(), # List of (x, y) points
"bbox": r.boxes.xyxy[i].tolist(),
"score": float(box.conf),
"class_name": r.names[int(box.cls)],
})
annotated = r.plot(boxes=True, masks=True)
return annotated, segments
def create_segmentation_mask(
image_shape: tuple,
polygon: list,
fill: int = 255,
) -> np.ndarray:
"""Create a binary mask from a polygon."""
mask = np.zeros(image_shape[:2], dtype=np.uint8)
pts = np.array(polygon, dtype=np.int32)
cv2.fillPoly(mask, [pts], fill)
return mask
# ── 4. Pose estimation ────────────────────────────────────────────────────────
COCO_KEYPOINTS = [
"nose", "left_eye", "right_eye", "left_ear", "right_ear",
"left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
"left_wrist", "right_wrist", "left_hip", "right_hip",
"left_knee", "right_knee", "left_ankle", "right_ankle",
]
COCO_SKELETON = [  # 1-indexed keypoint pairs (COCO annotation convention)
(16, 14), (14, 12), (17, 15), (15, 13), (12, 13),
(6, 12), (7, 13), (6, 7), (6, 8), (7, 9), (8, 10),
(9, 11), (2, 3), (1, 2), (1, 3), (2, 4), (3, 5), (4, 6), (5, 7),
]
def estimate_pose(
model: YOLO,
image: np.ndarray,
conf: float = 0.45,
) -> list[dict]:
"""
Estimate human pose keypoints.
Returns list of people with keypoints and visibility.
"""
results = model.predict(image, conf=conf, verbose=False)
r = results[0]
poses = []
if r.keypoints is not None:
kpts = r.keypoints.xy.cpu().numpy() # (N_people, 17, 2)
confs = r.keypoints.conf.cpu().numpy() if r.keypoints.conf is not None else None
for person_idx in range(len(kpts)):
person_kpts = {}
for kpt_idx, kpt_name in enumerate(COCO_KEYPOINTS):
x, y = kpts[person_idx, kpt_idx]
conf_val = float(confs[person_idx, kpt_idx]) if confs is not None else 1.0
person_kpts[kpt_name] = {"x": float(x), "y": float(y), "conf": conf_val}
poses.append({
"bbox": r.boxes.xyxy[person_idx].tolist() if r.boxes else None,
"keypoints": person_kpts,
})
return poses
# ── 5. Video tracking ─────────────────────────────────────────────────────────
def track_objects_video(
model: YOLO,
video_path: str,
output_path: str = "tracked_output.mp4",
conf: float = 0.45,
    tracker: str = "bytetrack.yaml",  # "botsort.yaml" adds Re-ID
    classes: list[int] | None = None,
) -> dict[int, list[dict]]:
"""
Track objects across video frames.
Returns dict mapping track_id → list of per-frame detections.
"""
cap = cv2.VideoCapture(video_path)
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # Fall back when FPS metadata is missing
fourcc = cv2.VideoWriter_fourcc(*"mp4v")
out = cv2.VideoWriter(output_path, fourcc, fps, (w, h))
track_history: dict[int, list[dict]] = {}
frame_num = 0
results = model.track(
video_path,
stream=True,
persist=True,
conf=conf,
tracker=tracker,
classes=classes,
verbose=False,
)
for r in results:
annotated = r.plot(line_width=2)
out.write(annotated)
if r.boxes.id is not None:
for i, track_id in enumerate(r.boxes.id.int().tolist()):
if track_id not in track_history:
track_history[track_id] = []
track_history[track_id].append({
"frame": frame_num,
"bbox": r.boxes.xyxy[i].tolist(),
"conf": float(r.boxes.conf[i]),
"class": r.names[int(r.boxes.cls[i])],
})
frame_num += 1
cap.release()
out.release()
print(f"Tracked {len(track_history)} unique objects → {output_path}")
return track_history
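The track_history structure returned above lends itself to simple post-hoc analytics. A minimal sketch (the helper name is ours, not an Ultralytics API) that reports how many frames each track survived:

```python
def track_lifetimes(track_history: dict[int, list[dict]]) -> dict[int, int]:
    """Frames survived per track ID, given track_objects_video-style history."""
    return {tid: len(frames) for tid, frames in track_history.items()}

# Example with a hand-built history: track 1 lasts 2 frames, track 7 lasts 1
history = {
    1: [{"frame": 0, "bbox": [0, 0, 10, 10]}, {"frame": 1, "bbox": [1, 0, 11, 10]}],
    7: [{"frame": 5, "bbox": [50, 50, 60, 60]}],
}
print(track_lifetimes(history))  # {1: 2, 7: 1}
```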
# ── 6. Custom training ────────────────────────────────────────────────────────
def create_dataset_yaml(
dataset_dir: str,
class_names: list[str],
yaml_path: str = "dataset.yaml",
) -> str:
"""
Create YOLO training YAML config from a dataset directory.
Expected structure: dataset_dir/images/{train,val}/ and dataset_dir/labels/{train,val}/
"""
content = f"""# YOLO Dataset Config
path: {os.path.abspath(dataset_dir)}
train: images/train
val: images/val
test: images/test # optional
nc: {len(class_names)}
names: {class_names}
"""
Path(yaml_path).write_text(content)
print(f"Dataset YAML saved: {yaml_path}")
return yaml_path
def fine_tune(
base_model: str = "yolo11n.pt",
data_yaml: str = "dataset.yaml",
epochs: int = 100,
imgsz: int = 640,
batch: int = 16,
device: str = "0", # "0" for GPU, "cpu" for CPU
patience: int = 50,
project: str = "./runs",
name: str = "custom_yolo",
augment: bool = True,
) -> YOLO:
"""
Fine-tune YOLO on a custom dataset.
Returns trained model.
"""
model = YOLO(base_model)
results = model.train(
data=data_yaml,
epochs=epochs,
imgsz=imgsz,
batch=batch,
device=device,
patience=patience,
project=project,
name=name,
# Augmentation
hsv_h=0.015 if augment else 0.0,
hsv_s=0.7 if augment else 0.0,
degrees=0.0,
translate=0.1,
scale=0.5,
fliplr=0.5,
mosaic=1.0 if augment else 0.0,
mixup=0.1 if augment else 0.0,
)
print(f"Training complete. Best mAP50: {results.results_dict.get('metrics/mAP50(B)', 0):.3f}")
return model
# ── 7. Model export ───────────────────────────────────────────────────────────
def export_model(
model: YOLO,
format: str = "onnx", # onnx | tensorrt | coreml | tflite | openvino
imgsz: int = 640,
half: bool = False,
simplify: bool = True,
) -> str:
"""Export YOLO model for deployment."""
exported_path = model.export(
format=format,
imgsz=imgsz,
half=half,
simplify=simplify,
dynamic=False,
)
print(f"Exported to {format}: {exported_path}")
return str(exported_path)
if __name__ == "__main__":
# Load model
model = load_detector("n") # Smallest, fastest
# Inference on a sample image
import urllib.request
img_url = "https://ultralytics.com/images/bus.jpg"
urllib.request.urlretrieve(img_url, "bus.jpg")
# Detect
dets = detect_objects(model, "bus.jpg", conf=0.45)
print(f"\nDetected {len(dets)} objects:")
for d in dets[:5]:
print(f" {d['class_name']}: {d['score']:.2f} @ {d['bbox']}")
# Count
counts = count_objects(model, "bus.jpg")
print(f"\nObject counts: {counts}")
# Annotate and save
results = model.predict("bus.jpg", verbose=False)
results[0].save("bus_annotated.jpg")
print("Saved: bus_annotated.jpg")
For the Detectron2 alternative: choose Detectron2 when you need Faster R-CNN, Mask R-CNN, or Panoptic FPN architectures, reference implementations of Facebook AI Research papers, and fine-grained control over region proposal networks. Detectron2 provides research-grade two-stage detector implementations, while Ultralytics YOLO's single-stage architecture is roughly 10-50x faster at inference, trains in hours instead of days, and ships pose estimation, OBB, and video tracking out of the box with a simpler API.

For the MMDetection alternative: choose MMDetection when you need OpenMMLab's comprehensive zoo of 200+ detection algorithms, including DETR, Sparse R-CNN, and ATSS, with a modular, config-driven architecture. MMDetection offers the widest algorithm coverage, while Ultralytics is better suited to production deployment: ONNX, TensorRT, CoreML, and TFLite export, built-in ByteTrack video tracking, and a single model.export() call replacing custom deployment pipelines.

The Claude Skills 360 bundle includes Ultralytics skill sets covering YOLO11 detection, instance segmentation, pose estimation, video tracking, custom dataset YAML, fine-tuning, multi-format export, and result visualization. Start with the free tier to try object detection code generation.
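Production deployment after model.export(format="onnx") means decoding the raw network output yourself. A hedged NumPy sketch, assuming the (1, 4 + num_classes, num_anchors) layout that YOLO-style detection exports commonly produce — cx, cy, w, h in input pixels followed by per-class scores; verify the layout of your own export, and note that NMS is omitted here:

```python
import numpy as np

def decode_yolo_output(raw: np.ndarray, conf_thres: float = 0.45):
    """Decode a (1, 4 + nc, N) detection tensor into xyxy boxes, scores, class IDs."""
    preds = raw[0].T                   # (N, 4 + nc)
    boxes_cxcywh = preds[:, :4]        # cx, cy, w, h per anchor
    scores_all = preds[:, 4:]          # per-class confidence
    cls_ids = scores_all.argmax(axis=1)
    scores = scores_all.max(axis=1)
    keep = scores >= conf_thres        # confidence filter (NMS not shown)
    cx, cy, w, h = boxes_cxcywh[keep].T
    xyxy = np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)
    return xyxy, scores[keep], cls_ids[keep]

# Tiny synthetic example: 2 classes, 3 anchors, one confident box
raw = np.zeros((1, 6, 3), dtype=np.float32)
raw[0, :4, 0] = [320, 320, 100, 50]   # cx, cy, w, h
raw[0, 5, 0] = 0.9                    # class-1 score
boxes, scores, cls = decode_yolo_output(raw)
print(boxes, scores, cls)  # one box: [270, 295, 370, 345], score 0.9, class 1
```

Feed this decoder the output of an onnxruntime session running the exported model; the preprocessing (letterbox resize, /255 normalization, CHW layout) must match what Ultralytics used at export time.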