pytest-benchmark measures Python performance in tests. pip install pytest-benchmark. Fixture: def test_sort(benchmark): result = benchmark(sorted, [3,1,2]); assert result == [1,2,3]. Lambda: benchmark(lambda: my_func(arg)). Setup: benchmark.pedantic(func, args=(a,b), setup=setup_fn, rounds=100, iterations=10). Group: @pytest.mark.benchmark(group="sorting"). Run: pytest --benchmark-only. Skip in normal runs: pytest --benchmark-skip. Storage: pytest --benchmark-autosave — saves JSON in .benchmarks/. Compare: pytest --benchmark-compare — compare with last saved. --benchmark-compare=0001 — compare with specific run. --benchmark-compare-fail=mean:5% — fail if mean regresses by >5%. Warmup: benchmark.warmup_rounds (default 1). rounds: number of rounds (default auto). timer: time.perf_counter (default). disable_gc=True — disable garbage collector during timing. calibrate_timer=True. Histogram: pytest --benchmark-histogram — generates PNG. JSON: pytest --benchmark-json=output.json. Parametrize: @pytest.mark.parametrize("fn", [fn1, fn2]). def test_compare(benchmark, fn): benchmark(fn, data). Min/mean/stddev/ops in output. benchmark.stats — access stats in test. benchmark.stats.mean, benchmark.stats.min, benchmark.stats.stddev. CI: export PYTHONHASHSEED=0 for reproducibility. Store .benchmarks/ in git. --benchmark-compare-fail gates CI. Claude Code generates pytest-benchmark fixtures, pedantic setups, and compare-fail thresholds for CI.
CLAUDE.md for pytest-benchmark
## pytest-benchmark Stack
- Version: pytest-benchmark >= 4.0 | pip install pytest-benchmark
- Fixture: def test_fn(benchmark): result = benchmark(func, *args, **kwargs)
- Pedantic: benchmark.pedantic(fn, args=(...), setup=setup_fn, rounds=N)
- Group: @pytest.mark.benchmark(group="name") — group related benchmarks
- CI: pytest --benchmark-autosave && pytest --benchmark-compare-fail=mean:5%
- Skip: pytest --benchmark-skip | pytest --benchmark-only for bench-only runs
- Stats: benchmark.stats.mean | .min | .max | .stddev after test runs
pytest-benchmark Performance Testing Pipeline
# tests/test_benchmarks.py — pytest-benchmark patterns
from __future__ import annotations

import json
import re
import time
from collections import defaultdict
from functools import lru_cache
from typing import Any

import pytest
# ─────────────────────────────────────────────────────────────────────────────
# Functions under benchmark
# ─────────────────────────────────────────────────────────────────────────────
# Sorting implementations
def bubble_sort(arr: list[int]) -> list[int]:
    """Return a sorted copy of *arr* via bubble sort.

    Quadratic on purpose — serves as the slow baseline in the sort benchmarks,
    so no early-exit optimisation is applied.
    """
    out = list(arr)
    size = len(out)
    for done in range(size):
        # After each pass the largest remaining value has bubbled to the end.
        for idx in range(size - done - 1):
            if out[idx] > out[idx + 1]:
                out[idx], out[idx + 1] = out[idx + 1], out[idx]
    return out
def insertion_sort(arr: list[int]) -> list[int]:
    """Return a sorted copy of *arr* via insertion sort.

    O(n^2) worst case but fast on small / nearly-sorted inputs — the middle
    contender in the sort benchmarks.
    """
    out = list(arr)
    for pos in range(1, len(out)):
        current = out[pos]
        gap = pos
        # Shift larger elements right until current's slot is found.
        while gap > 0 and out[gap - 1] > current:
            out[gap] = out[gap - 1]
            gap -= 1
        out[gap] = current
    return out
# Serialization implementations
def slow_json_build(records: list[dict]) -> str:
    """Serialise each record separately, then splice the pieces into an array.

    Deliberately inefficient baseline for the json-build benchmark group.
    """
    encoded = [json.dumps(record) for record in records]
    return "[" + ",".join(encoded) + "]"
def fast_json_build(records: list[dict]) -> str:
    """Serialise all records in one json.dumps call (fast contender)."""
    return json.dumps(records)
# Regex implementations
def slow_email_validate(emails: list[str]) -> list[bool]:
    """Validate addresses, handing the raw pattern string to re.match each time.

    Baseline for the email-validation benchmark: the pattern is looked up in
    re's internal cache on every call instead of being pre-compiled.
    """
    pattern = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"
    flags = []
    for address in emails:
        flags.append(re.match(pattern, address) is not None)
    return flags
# Compiled once at import time — the fast contender reuses this pattern.
_EMAIL_RE = re.compile(r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$")
def fast_email_validate(emails: list[str]) -> list[bool]:
    """Validate addresses with the pre-compiled module-level pattern."""
    matcher = _EMAIL_RE.match
    return [matcher(address) is not None for address in emails]
# Fibonacci implementations
def fib_recursive(n: int) -> int:
    """Naive recursive Fibonacci — exponential time, the benchmark worst case."""
    return n if n <= 1 else fib_recursive(n - 1) + fib_recursive(n - 2)
def fib_iterative(n: int) -> int:
    """Linear-time Fibonacci via a rolling pair of values."""
    prev, curr = 0, 1
    for _ in range(n):
        prev, curr = curr, prev + curr
    return prev
@lru_cache(maxsize=None)
def fib_memoized(n: int) -> int:
    """Recursive Fibonacci with an unbounded memo cache.

    First call is linear in n; repeat calls are O(1) cache hits — the
    benchmark clears the cache per round to measure the cold path.
    """
    return n if n <= 1 else fib_memoized(n - 1) + fib_memoized(n - 2)
# Data processing
def process_records_dict(records: list[dict]) -> dict[str, list]:
    """Group records by their "category" key using dict.setdefault.

    Records missing the key fall into the "unknown" bucket.
    """
    grouped: dict[str, list] = {}
    for record in records:
        grouped.setdefault(record.get("category", "unknown"), []).append(record)
    return grouped
def process_records_defaultdict(records: list[dict]) -> dict[str, list]:
    """Group records by their "category" key using collections.defaultdict.

    Records missing the key fall into the "unknown" bucket; the result is
    converted back to a plain dict so both group-by contenders return the
    same type.

    Note: the defaultdict import is hoisted to module level — importing
    inside a benchmarked function pays import-machinery overhead on every
    call, which would skew the comparison against process_records_dict.
    """
    result: dict[str, list] = defaultdict(list)
    for r in records:
        result[r.get("category", "unknown")].append(r)
    return dict(result)
# ─────────────────────────────────────────────────────────────────────────────
# Fixtures — shared test data
# ─────────────────────────────────────────────────────────────────────────────
@pytest.fixture
def small_list() -> list[int]:
    """100 pseudo-random ints with a fixed seed, so every run sorts identical data."""
    import random
    gen = random.Random(42)
    return [gen.randint(0, 1000) for _ in range(100)]
@pytest.fixture
def medium_list() -> list[int]:
    """1000 pseudo-random ints with a fixed seed (reproducible benchmark input)."""
    import random
    gen = random.Random(42)
    return [gen.randint(0, 1000) for _ in range(1_000)]
@pytest.fixture
def email_list() -> list[str]:
    """500 syntactically valid addresses — every one should pass validation."""
    return [f"user{n}@example.com" for n in range(500)]
@pytest.fixture
def records() -> list[dict]:
    """1000 product dicts cycling deterministically through five categories."""
    categories = ["Electronics", "Clothing", "Books", "Home", "Sports"]
    out: list[dict] = []
    for i in range(1_000):
        out.append(
            {"id": i, "name": f"Product {i}", "category": categories[i % 5], "price": i * 0.99}
        )
    return out
# ─────────────────────────────────────────────────────────────────────────────
# 1. Basic benchmark — callable with args
# ─────────────────────────────────────────────────────────────────────────────
class TestSortBenchmarks:
    """Head-to-head sort benchmarks over the same 100-element fixture list."""

    @pytest.mark.benchmark(group="sort-small")
    def test_builtin_sort(self, benchmark, small_list: list[int]) -> None:
        expected = sorted(small_list)
        assert benchmark(sorted, small_list) == expected

    @pytest.mark.benchmark(group="sort-small")
    def test_insertion_sort(self, benchmark, small_list: list[int]) -> None:
        expected = sorted(small_list)
        assert benchmark(insertion_sort, small_list) == expected

    @pytest.mark.benchmark(group="sort-small")
    def test_bubble_sort(self, benchmark, small_list: list[int]) -> None:
        """Expected to be slowest — demonstrates regression detection value."""
        expected = sorted(small_list)
        assert benchmark(bubble_sort, small_list) == expected
# ─────────────────────────────────────────────────────────────────────────────
# 2. benchmark.pedantic — fine-grained rounds and iterations
# ─────────────────────────────────────────────────────────────────────────────
class TestFibonacci:
    """Compare Fibonacci implementations with pedantic round/iteration control."""

    @pytest.mark.benchmark(group="fibonacci")
    def test_iterative(self, benchmark) -> None:
        result = benchmark.pedantic(fib_iterative, args=(30,), rounds=200, iterations=5)
        assert result == 832040

    @pytest.mark.benchmark(group="fibonacci")
    def test_memoized(self, benchmark) -> None:
        # setup= clears the memo cache before every round so each round times a
        # cold cache.  pytest-benchmark forbids iterations > 1 together with a
        # setup function (iterations would share one setup call), so the
        # original iterations=5 raised at runtime — it must be 1 here.
        result = benchmark.pedantic(fib_memoized, args=(30,), rounds=200, iterations=1,
                                    setup=fib_memoized.cache_clear)
        assert result == 832040

    @pytest.mark.benchmark(group="fibonacci-slow")
    @pytest.mark.slow
    def test_recursive(self, benchmark) -> None:
        """Recursive is exponential — skip in fast test runs with -m 'not slow'."""
        result = benchmark.pedantic(fib_recursive, args=(25,), rounds=5)
        assert result == 75025
# ─────────────────────────────────────────────────────────────────────────────
# 3. Comparing implementations via parametrize
# ─────────────────────────────────────────────────────────────────────────────
@pytest.mark.parametrize("validate_fn", [slow_email_validate, fast_email_validate],
                         ids=["slow", "fast"])
@pytest.mark.benchmark(group="email-validation")
def test_email_validation(benchmark, validate_fn, email_list: list[str]) -> None:
    """Benchmark both validators on the same addresses; all must be valid.

    The original parametrization carried an unused ``label`` value alongside
    each function — ids are supplied explicitly, so the dead parameter is
    dropped.
    """
    result = benchmark(validate_fn, email_list)
    assert len(result) == len(email_list)
    assert all(r is True for r in result)
@pytest.mark.parametrize("build_fn", [slow_json_build, fast_json_build],
                         ids=["slow", "fast"])
@pytest.mark.benchmark(group="json-build")
def test_json_build(benchmark, build_fn, records: list[dict]) -> None:
    """Both builders must emit parseable JSON covering all 100 records."""
    payload = benchmark(build_fn, records[:100])
    assert len(json.loads(payload)) == 100
# ─────────────────────────────────────────────────────────────────────────────
# 4. Data structure comparison
# ─────────────────────────────────────────────────────────────────────────────
@pytest.mark.parametrize("process_fn", [
    process_records_dict,
    process_records_defaultdict,
], ids=["dict-setdefault", "defaultdict"])
@pytest.mark.benchmark(group="groupby")
def test_group_by_category(benchmark, process_fn, records: list[dict]) -> None:
    """Grouping must include a known category and cover every record.

    Counts group sizes directly — the original ``sum(result.values(), [])``
    concatenated every group into a quadratic throwaway list just to take
    its length.
    """
    result = benchmark(process_fn, records)
    assert "Electronics" in result
    assert sum(len(group) for group in result.values()) == len(records)
# ─────────────────────────────────────────────────────────────────────────────
# 5. Accessing stats in assertions
# ─────────────────────────────────────────────────────────────────────────────
def test_sort_stats(benchmark, medium_list: list[int]) -> None:
    """Run sorted() under benchmark, then gate on the collected statistics."""
    benchmark(sorted, medium_list)
    stats = benchmark.stats
    # NOTE(review): assumes mean/stddev are reachable as benchmark.stats.mean —
    # some pytest-benchmark versions expose them at benchmark.stats.stats.*;
    # confirm against the installed version.
    # Sorting 1000 integers should average well under one millisecond.
    assert stats.mean < 0.001, f"Mean {stats.mean:.6f}s exceeds 1ms threshold"
    # Coefficient of variation (stddev / mean) below 50% => stable measurement.
    if stats.mean > 0:
        cv = stats.stddev / stats.mean
        assert cv < 0.5, f"High measurement variance: CV={cv:.2%}"
def test_json_dumps_stats(benchmark, records: list[dict]) -> None:
    """Serialising 50 small dicts must stay under a 500µs mean budget."""
    benchmark(json.dumps, records[:50])
    assert benchmark.stats.mean < 0.0005
For the timeit alternative — timeit.timeit("sorted([3,1,2])", number=100_000) gives you a raw number but no statistics — no standard deviation, no outlier detection, no warmup — and the result varies across machines and Python versions with no stored baseline for comparison, while pytest-benchmark automatically runs warmup rounds, computes mean/median/stddev/IQR, stores results as JSON in .benchmarks/, and --benchmark-compare-fail=mean:10% fails CI when a commit regresses the mean by more than 10% relative to the stored baseline. For the cProfile / line_profiler alternative — profilers tell you where time is spent in a single run, while pytest-benchmark measures how much time a specific function takes across many runs with statistical confidence — the two are complementary: use pytest-benchmark’s regression gate to detect regressions in CI, and use cProfile when a benchmark fails to identify the hot path to optimize. The Claude Skills 360 bundle includes pytest-benchmark skill sets covering benchmark fixture with callable, benchmark.pedantic for rounds/iterations/setup, @pytest.mark.benchmark group annotation, parametrize for head-to-head implementation comparisons, benchmark.stats.mean/stddev post-run assertions, --benchmark-autosave for baseline storage, --benchmark-compare and --benchmark-compare-fail for CI gating, disable_gc for stable measurements, JSON and histogram output, and pytest fixture integration for pre-built test data. Start with the free tier to try performance benchmarking code generation.