Python’s gc module exposes the cyclic garbage collector. import gc. collect: gc.collect(generation=2) → number of unreachable objects found; generation 0/1/2 (0 = youngest). disable/enable: gc.disable() / gc.enable() — turn automatic collection off/on. isenabled: gc.isenabled(). get_count: gc.get_count() → (gen0, gen1, gen2) — current collection counts per generation. get_threshold: gc.get_threshold() → (700, 10, 10) defaults. set_threshold: gc.set_threshold(700, 10, 10) — tune collection frequency. get_objects: gc.get_objects(generation=None) → all tracked objects (expensive; the generation filter requires Python 3.8+). get_referrers: gc.get_referrers(obj) → list of objects that hold a reference to obj. get_referents: gc.get_referents(*objs) → objects directly referenced by objs. is_tracked: gc.is_tracked(obj) — True if the GC tracks this object (containers are tracked; atomic scalars usually are not). freeze: gc.freeze() — move all tracked objects to a permanent generation that is never collected (reduces copy-on-write faults after fork). get_freeze_count: gc.get_freeze_count(). callbacks: gc.callbacks — list of callables invoked before and after each collection; signature callback(phase, info) where phase is "start" or "stop". set_debug: gc.set_debug(gc.DEBUG_LEAK | gc.DEBUG_STATS). Three-generation model: gen0 (new), gen1 (survived one collection), gen2 (old) — collection cascades upward based on thresholds. Claude Code generates memory profilers, cycle detectors, fork-safe GC configurators, and leak finders.
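The generation counters are easy to observe directly: every tracked container allocation ticks gen0, and crossing threshold0 is what triggers an automatic collection. A minimal sketch (exact counts vary by interpreter version and what else has run):

```python
import gc

gc.disable()          # stop automatic collection so counts only grow
gc.collect()          # start from a clean slate
gen0_before = gc.get_count()[0]

junk = [[] for _ in range(500)]   # each new list is a tracked container

gen0_after = gc.get_count()[0]
print(gen0_after - gen0_before)   # >= 500: one tick per tracked allocation

# Once gen0 exceeds threshold0 (default 700), an automatic collection
# would run; survivors move to gen1, and after every threshold1 gen0
# passes a gen1 pass runs, cascading the same way into gen2.
gc.enable()
```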
CLAUDE.md for gc
## gc Stack
- Stdlib: import gc
- Force: gc.collect() # collect all generations
- Count: gc.get_count() # (gen0, gen1, gen2) pending objects
- Referrers: gc.get_referrers(obj) # who holds a ref to obj?
- Pre-fork: gc.freeze() # immortalize objects before os.fork()
- Track?: gc.is_tracked(obj) # False for ints/strings/scalars
- Debug: gc.set_debug(gc.DEBUG_LEAK) # print leaked objects
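Two entries in the stack above, gc.is_tracked and gc.set_debug, are easy to sanity-check interactively; a minimal sketch of CPython's documented behavior:

```python
import gc

# Containers can participate in cycles, so the GC tracks them;
# atomic scalars cannot reference anything and are never tracked.
print(gc.is_tracked([]))       # True: lists are containers
print(gc.is_tracked(42))       # False: ints hold no references
print(gc.is_tracked("hello"))  # False: strings are atomic
print(gc.is_tracked({}))       # False: empty dicts start untracked

# DEBUG flags make the collector report to stderr; combine with |.
gc.set_debug(gc.DEBUG_STATS | gc.DEBUG_COLLECTABLE)
gc.collect()
gc.set_debug(0)  # always reset, or every later collection stays noisy
```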
## gc Memory Management Pipeline
# app/gcutil.py — force collect, cycle finder, memory stats, leak detector
from __future__ import annotations
import gc
import sys
import tracemalloc
from collections import defaultdict
from dataclasses import dataclass
from typing import Any
# ─────────────────────────────────────────────────────────────────────────────
# 1. Basic helpers
# ─────────────────────────────────────────────────────────────────────────────
def force_collect() -> int:
"""
Force a full GC cycle across all three generations.
Returns the total number of unreachable objects collected.
Example:
freed = force_collect()
print(f"freed {freed} objects")
"""
return gc.collect(2)
def gc_counts() -> dict[str, int]:
"""
Return current object counts per GC generation.
Example:
print(gc_counts()) # {"gen0": 123, "gen1": 5, "gen2": 2}
"""
g0, g1, g2 = gc.get_count()
return {"gen0": g0, "gen1": g1, "gen2": g2}
def gc_thresholds() -> dict[str, int]:
"""Return current GC thresholds for each generation."""
t0, t1, t2 = gc.get_threshold()
return {"gen0": t0, "gen1": t1, "gen2": t2}
def tune_gc(gen0: int = 700, gen1: int = 10, gen2: int = 10) -> None:
"""
Tune GC thresholds. Lower gen0 = more frequent collection (lower throughput,
lower peak memory). Higher gen0 = less GC overhead (higher peak memory).
Example:
tune_gc(gen0=1000) # batch job: reduce GC interruptions
"""
gc.set_threshold(gen0, gen1, gen2)
def object_type_counts(generation: int | None = None) -> dict[str, int]:
"""
Return a dict of {type_name: count} for all objects tracked by the GC.
Warning: calls gc.get_objects() — expensive for large heaps.
Example:
counts = object_type_counts()
for t, n in sorted(counts.items(), key=lambda x: -x[1])[:10]:
print(f" {t:40s} {n:,}")
"""
counts: dict[str, int] = defaultdict(int)
for obj in gc.get_objects(generation=generation):
counts[type(obj).__name__] += 1
return dict(counts)
# ─────────────────────────────────────────────────────────────────────────────
# 2. Cycle and referrer analysis
# ─────────────────────────────────────────────────────────────────────────────
def find_referrers(obj: Any) -> list[Any]:
    """
    Return all objects that directly hold a reference to obj,
    excluding stack frames (including this function's own frame).
    Example:
        refs = find_referrers(my_object)
        for r in refs:
            print(type(r).__name__, id(r))
    """
    frame_type = type(sys._getframe())
    return [r for r in gc.get_referrers(obj) if not isinstance(r, frame_type)]
def ref_chain(obj: Any, max_depth: int = 3, _seen: set | None = None) -> list[str]:
"""
Breadth-first summary of who references obj up to max_depth levels.
Returns human-readable lines like " dict @ 0x... (len=5)".
Example:
for line in ref_chain(leaking_object):
print(line)
"""
if _seen is None:
_seen = {id(obj)}
lines: list[str] = []
queue = [(obj, 0)]
while queue:
current, depth = queue.pop(0)
if depth >= max_depth:
continue
for ref in gc.get_referrers(current):
if id(ref) in _seen:
continue
if isinstance(ref, type(sys._getframe())):
continue
_seen.add(id(ref))
desc = f"{' ' * (depth+1)}{type(ref).__name__} @ {hex(id(ref))}"
if isinstance(ref, dict):
desc += f" (len={len(ref)})"
elif isinstance(ref, (list, tuple)):
desc += f" (len={len(ref)})"
lines.append(desc)
queue.append((ref, depth + 1))
return lines
def has_cycle(obj: Any) -> bool:
    """
    Return True if obj is part of a reference cycle, i.e. obj is
    reachable from itself by following gc.get_referents().
    Heuristic: the traversal skips classes, modules, and functions so
    it does not wander into globals and report spurious cycles.
    Example:
        node = Node()
        node.self_ref = node
        has_cycle(node)  # True
    """
    skip_types = (type, type(sys), type(has_cycle))  # classes, modules, functions
    seen: set[int] = set()
    stack = list(gc.get_referents(obj))
    while stack:
        current = stack.pop()
        if current is obj:
            return True
        if id(current) in seen or isinstance(current, skip_types):
            continue
        seen.add(id(current))
        stack.extend(gc.get_referents(current))
    return False
# ─────────────────────────────────────────────────────────────────────────────
# 3. Memory snapshot with tracemalloc
# ─────────────────────────────────────────────────────────────────────────────
@dataclass
class MemorySnapshot:
top_files: list[dict]
total_kb: float
peak_kb: float
@classmethod
def take(cls, top: int = 10, key_type: str = "lineno") -> "MemorySnapshot":
"""
Take a tracemalloc memory snapshot and return the top allocators.
Example:
snap = MemorySnapshot.take(top=20)
for row in snap.top_files:
print(row)
"""
if not tracemalloc.is_tracing():
tracemalloc.start()
snapshot = tracemalloc.take_snapshot()
stats = snapshot.statistics(key_type)[:top]
rows = []
for stat in stats:
frame = stat.traceback[0]
rows.append({
"file": frame.filename,
"lineno": frame.lineno,
"size_kb": round(stat.size / 1024, 2),
"count": stat.count,
})
current, peak = tracemalloc.get_traced_memory()
return cls(
top_files=rows,
total_kb=round(current / 1024, 2),
peak_kb=round(peak / 1024, 2),
)
def report(self, n: int = 10) -> str:
lines = [f"Memory: current={self.total_kb:.1f} KB peak={self.peak_kb:.1f} KB"]
lines.append(f"{'File':50s} {'Line':>6} {'KB':>8} {'Count':>8}")
lines.append("─" * 80)
for row in self.top_files[:n]:
fname = row["file"]
if len(fname) > 48:
fname = "…" + fname[-47:]
lines.append(f" {fname:48s} {row['lineno']:>6} {row['size_kb']:>8.2f} {row['count']:>8,}")
return "\n".join(lines)
def memory_diff(
before: tracemalloc.Snapshot,
after: tracemalloc.Snapshot,
top: int = 10,
) -> list[dict]:
"""
Compare two tracemalloc snapshots and return top memory growth locations.
Example:
before = tracemalloc.take_snapshot()
run_operation()
after = tracemalloc.take_snapshot()
for row in memory_diff(before, after):
print(row)
"""
stats = after.compare_to(before, "lineno")[:top]
return [
{
"file": stat.traceback[0].filename,
"lineno": stat.traceback[0].lineno,
"delta_kb": round(stat.size_diff / 1024, 2),
"new_size_kb": round(stat.size / 1024, 2),
}
for stat in stats
]
# ─────────────────────────────────────────────────────────────────────────────
# 4. GC callbacks / hooks
# ─────────────────────────────────────────────────────────────────────────────
class GCStatsCollector:
    """
    Attach to gc.callbacks to count collection cycles per generation
    and total objects collected.
    Example:
        stats = GCStatsCollector.install()
        # ... run workload ...
        print(stats.report())
        stats.uninstall()
    """
    def __init__(self) -> None:
        self.collections: dict[str, int] = {"gen0": 0, "gen1": 0, "gen2": 0}
        self.total_collected: int = 0
    def _callback(self, phase: str, info: dict) -> None:
        # info always carries "generation"; on "stop" it also carries
        # "collected" and "uncollectable" counts.
        if phase != "stop":
            return
        key = f"gen{info.get('generation', 0)}"
        self.collections[key] = self.collections.get(key, 0) + 1
        self.total_collected += info.get("collected", 0)
@classmethod
def install(cls) -> "GCStatsCollector":
inst = cls()
gc.callbacks.append(inst._callback)
return inst
def uninstall(self) -> None:
try:
gc.callbacks.remove(self._callback)
except ValueError:
pass
def report(self) -> str:
lines = ["GC Stats:"]
for gen, count in self.collections.items():
lines.append(f" {gen}: {count} collections")
lines.append(f" total objects freed: {self.total_collected:,}")
return "\n".join(lines)
# ─────────────────────────────────────────────────────────────────────────────
# 5. Fork-safe GC freeze
# ─────────────────────────────────────────────────────────────────────────────
def freeze_before_fork() -> int:
"""
Call gc.freeze() to move all tracked objects to the permanent generation.
This prevents copy-on-write page faults in forked worker processes.
Call this after application initialization, before forking.
Returns the number of frozen objects.
Example:
load_model()
load_index()
frozen = freeze_before_fork()
print(f"Froze {frozen:,} objects before fork")
os.fork()
"""
gc.collect()
gc.freeze()
return gc.get_freeze_count()
# ─────────────────────────────────────────────────────────────────────────────
# Demo
# ─────────────────────────────────────────────────────────────────────────────
if __name__ == "__main__":
print("=== gc demo ===")
print("\n--- gc_counts / force_collect ---")
    # Create cyclic garbage: acyclic objects are freed by refcounting
    # alone, so only cycles give force_collect() something to do
    for _ in range(1000):
        x = []
        x.append(x)  # self-referencing list; unreachable after rebinding
counts_before = gc_counts()
freed = force_collect()
counts_after = gc_counts()
print(f" before: {counts_before}")
print(f" after: {counts_after}")
print(f" freed: {freed}")
print("\n--- gc_thresholds / tune_gc ---")
orig = gc_thresholds()
print(f" original: {orig}")
tune_gc(gen0=1000, gen1=20, gen2=20)
print(f" tuned: {gc_thresholds()}")
gc.set_threshold(orig["gen0"], orig["gen1"], orig["gen2"]) # restore
print("\n--- object_type_counts (top 5) ---")
counts = object_type_counts()
top5 = sorted(counts.items(), key=lambda x: -x[1])[:5]
for t, n in top5:
print(f" {t:30s} {n:,}")
print("\n--- reference cycle detection ---")
class Node:
def __init__(self, val):
self.val = val
self.ref = None
a = Node(1)
b = Node(2)
a.ref = b
b.ref = a # cycle
gc.disable()
del a, b
unreachable = gc.collect()
print(f" cycling objects collected: {unreachable}")
gc.enable()
print("\n--- find_referrers ---")
target = {"key": "value"}
container = [target]
refs = find_referrers(target)
print(f" referrers of dict: {[type(r).__name__ for r in refs]}")
print("\n--- GCStatsCollector ---")
collector = GCStatsCollector.install()
for _ in range(3):
_ = [{i: list(range(i)) for i in range(50)} for _ in range(100)]
gc.collect()
collector.uninstall()
print(collector.report())
print("\n--- MemorySnapshot ---")
tracemalloc.start()
big = [list(range(1000)) for _ in range(100)]
snap = MemorySnapshot.take(top=5)
print(snap.report(n=3))
del big
print("\n--- freeze_before_fork ---")
n_frozen = freeze_before_fork()
print(f" frozen objects: {n_frozen:,}")
print("\n=== done ===")
For the tracemalloc alternative — tracemalloc (also stdlib) traces memory allocations at the Python level: tracemalloc.start(), take_snapshot(), and .compare_to() return allocation diffs by file/line. It shows where memory is allocated; gc shows what objects exist and detects cycles. tracemalloc is for “memory grew by 50 MB — find the code that allocated it”; gc is for “objects are not being collected — find what holds references”. Use them together: tracemalloc for allocation profiling, gc.get_referrers() for why objects are not being freed.

For the objgraph alternative — objgraph (PyPI) wraps gc.get_objects() and gc.get_referrers() with human-readable growth reports (objgraph.show_growth()), renders reference graphs to PNG (show_refs()), and finds the root paths keeping an object alive (find_backref_chain()). Use objgraph for interactive memory-debugging sessions in a REPL or Jupyter notebook where visualizations matter; use gc directly in production code for lightweight custom leak detection, GC tuning, and pre-fork freezing.

The Claude Skills 360 bundle includes gc skill sets covering basic control (force_collect()/gc_counts()/gc_thresholds()/tune_gc()), object and cycle inspection (object_type_counts()/find_referrers()/ref_chain()/has_cycle()), tracemalloc integration (MemorySnapshot/memory_diff()), the GCStatsCollector gc.callbacks hook, and the freeze_before_fork() fork-safe freeze helper. Start with the free tier to try memory management and gc pipeline code generation.
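The "use them together" advice can be sketched end to end: tracemalloc locates the allocation site of the growth, then gc.get_referrers() explains why the objects survive. The `_cache` dict and `handle_request` function here are illustrative stand-ins for a leaking component, not names from the code above:

```python
import gc
import tracemalloc

_cache: dict[int, list] = {}   # hypothetical module-level cache that "leaks"

def handle_request(i: int) -> None:
    _cache[i] = [0] * 1000     # grows forever: nothing ever evicts entries

tracemalloc.start()
before = tracemalloc.take_snapshot()
for i in range(200):
    handle_request(i)
after = tracemalloc.take_snapshot()

# Step 1: tracemalloc answers "where was the growth allocated?"
top = after.compare_to(before, "lineno")[0]
print(top.traceback[0].filename, top.traceback[0].lineno, top.size_diff)

# Step 2: gc answers "what still holds those objects?"
victim = _cache[0]
holders = [type(r).__name__ for r in gc.get_referrers(victim)]
print(holders)                  # a dict (the cache itself) shows up here
tracemalloc.stop()
```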