Python’s gc module exposes the cyclic garbage collector. import gc. collect: gc.collect(generation=2) → number of unreachable objects found; generation 0/1/2 (0 = youngest). disable/enable: gc.disable() / gc.enable() — turn automatic collection off/on. isenabled: gc.isenabled(). get_count: gc.get_count() → (gen0, gen1, gen2) — current collection counts per generation. get_threshold: gc.get_threshold() → (700, 10, 10) defaults. set_threshold: gc.set_threshold(700, 10, 10) — tune collection frequency. get_objects: gc.get_objects(generation=None) → all tracked objects (expensive; the generation filter requires Python 3.8+). get_referrers: gc.get_referrers(obj) → list of objects that hold a reference to obj. get_referents: gc.get_referents(*objs) → objects directly referenced by objs. is_tracked: gc.is_tracked(obj) — True if the GC tracks this object (containers are tracked; atomic scalars usually are not). freeze: gc.freeze() — move all tracked objects to a permanent generation that is never collected (reduces copy-on-write faults after fork). get_freeze_count: gc.get_freeze_count(). callbacks: gc.callbacks — list of callables invoked before and after each collection; signature callback(phase, info) where phase is "start" or "stop". set_debug: gc.set_debug(gc.DEBUG_LEAK | gc.DEBUG_STATS). Three-generation model: gen0 (new), gen1 (survived one collection), gen2 (old) — collection cascades upward based on thresholds. Claude Code generates memory profilers, cycle detectors, fork-safe GC configurators, and leak finders.
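The generation counters are easy to observe directly: every tracked container allocation ticks gen0, and crossing threshold0 is what triggers an automatic collection. A minimal sketch (exact counts vary by interpreter version and what else has run):

```python
import gc

gc.disable()          # stop automatic collection so counts only grow
gc.collect()          # start from a clean slate
gen0_before = gc.get_count()[0]

junk = [[] for _ in range(500)]   # each new list is a tracked container

gen0_after = gc.get_count()[0]
print(gen0_after - gen0_before)   # >= 500: one tick per tracked allocation

# Once gen0 exceeds threshold0 (default 700), an automatic collection
# would run; survivors move to gen1, and after every threshold1 gen0
# passes a gen1 pass runs, cascading the same way into gen2.
gc.enable()
```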
CLAUDE.md for gc
## gc Stack
- Stdlib: import gc
- Force: gc.collect() # collect all generations
- Count: gc.get_count() # (gen0, gen1, gen2) pending objects
- Referrers: gc.get_referrers(obj) # who holds a ref to obj?
- Pre-fork: gc.freeze() # immortalize objects before os.fork()
- Track?: gc.is_tracked(obj) # False for ints/strings/scalars
- Debug: gc.set_debug(gc.DEBUG_LEAK) # print leaked objects
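Two entries in the stack above, gc.is_tracked and gc.set_debug, are easy to sanity-check interactively; a minimal sketch of CPython's documented behavior:

```python
import gc

# Containers can participate in cycles, so the GC tracks them;
# atomic scalars cannot reference anything and are never tracked.
print(gc.is_tracked([]))       # True: lists are containers
print(gc.is_tracked(42))       # False: ints hold no references
print(gc.is_tracked("hello"))  # False: strings are atomic
print(gc.is_tracked({}))       # False: empty dicts start untracked

# DEBUG flags make the collector report to stderr; combine with |.
gc.set_debug(gc.DEBUG_STATS | gc.DEBUG_COLLECTABLE)
gc.collect()
gc.set_debug(0)  # always reset, or every later collection stays noisy
```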
## gc Memory Management Pipeline
# app/gcutil.py — force collect, cycle finder, memory stats, leak detector
from __future__ import annotations
import gc
import sys
import tracemalloc
from collections import defaultdict
from dataclasses import dataclass
from typing import Any
# ─────────────────────────────────────────────────────────────────────────────
# 1. Basic helpers
# ─────────────────────────────────────────────────────────────────────────────
def force_collect() -> int:
"""
Force a full GC cycle across all three generations.
Returns the total number of unreachable objects collected.
Example:
freed = force_collect()
print(f"freed {freed} objects")
"""
return gc.collect(2)
def gc_counts() -> dict[str, int]:
"""
Return current object counts per GC generation.
Example:
print(gc_counts()) # {"gen0": 123, "gen1": 5, "gen2": 2}
"""
g0, g1, g2 = gc.get_count()
return {"gen0": g0, "gen1": g1, "gen2": g2}
def gc_thresholds() -> dict[str, int]:
"""Return current GC thresholds for each generation."""
t0, t1, t2 = gc.get_threshold()
return {"gen0": t0, "gen1": t1, "gen2": t2}
def tune_gc(gen0: int = 700, gen1: int = 10, gen2: int = 10) -> None:
"""
Tune GC thresholds. Lower gen0 = more frequent collection (lower throughput,
lower peak memory). Higher gen0 = less GC overhead (higher peak memory).
Example:
tune_gc(gen0=1000) # batch job: reduce GC interruptions
"""
gc.set_threshold(gen0, gen1, gen2)
def object_type_counts(generation: int | None = None) -> dict[str, int]:
"""
Return a dict of {type_name: count} for all objects tracked by the GC.
Warning: calls gc.get_objects() — expensive for large heaps.
Example:
counts = object_type_counts()
for t, n in sorted(counts.items(), key=lambda x: -x[1])[:10]:
print(f" {t:40s} {n:,}")
"""
counts: dict[str, int] = defaultdict(int)
for obj in gc.get_objects(generation=generation):
counts[type(obj).__name__] += 1
return dict(counts)
# ─────────────────────────────────────────────────────────────────────────────
# 2. Cycle and referrer analysis
# ─────────────────────────────────────────────────────────────────────────────
def find_referrers(obj: Any) -> list[Any]:
    """
    Return all objects that directly hold a reference to obj,
    excluding stack frames (including this function's own frame).
    Example:
        refs = find_referrers(my_object)
        for r in refs:
            print(type(r).__name__, id(r))
    """
    frame_type = type(sys._getframe())
    return [r for r in gc.get_referrers(obj) if not isinstance(r, frame_type)]
def ref_chain(obj: Any, max_depth: int = 3, _seen: set | None = None) -> list[str]:
"""
Breadth-first summary of who references obj up to max_depth levels.
Returns human-readable lines like " dict @ 0x... (len=5)".
Example:
for line in ref_chain(leaking_object):
print(line)
"""
if _seen is None:
_seen = {id(obj)}
lines: list[str] = []
queue = [(obj, 0)]
while queue:
current, depth = queue.pop(0)
if depth >= max_depth:
continue
for ref in gc.get_referrers(current):
if id(ref) in _seen:
continue
if isinstance(ref, type(sys._getframe())):
continue
_seen.add(id(ref))
desc = f"{' ' * (depth+1)}{type(ref).__name__} @ {hex(id(ref))}"
if isinstance(ref, dict):
desc += f" (len={len(ref)})"
elif isinstance(ref, (list, tuple)):
desc += f" (len={len(ref)})"
lines.append(desc)
queue.append((ref, depth + 1))
return lines
def has_cycle(obj: Any) -> bool:
    """
    Return True if obj is part of a reference cycle, i.e. obj is
    reachable from itself by following gc.get_referents().
    Heuristic: the traversal skips classes, modules, and functions so
    it does not wander into globals and report spurious cycles.
    Example:
        node = Node()
        node.self_ref = node
        has_cycle(node)  # True
    """
    skip_types = (type, type(sys), type(has_cycle))  # classes, modules, functions
    seen: set[int] = set()
    stack = list(gc.get_referents(obj))
    while stack:
        current = stack.pop()
        if current is obj:
            return True
        if id(current) in seen or isinstance(current, skip_types):
            continue
        seen.add(id(current))
        stack.extend(gc.get_referents(current))
    return False
# ─────────────────────────────────────────────────────────────────────────────
# 3. Memory snapshot with tracemalloc
# ─────────────────────────────────────────────────────────────────────────────
@dataclass
class MemorySnapshot:
top_files: list[dict]
total_kb: float
peak_kb: float
@classmethod
def take(cls, top: int = 10, key_type: str = "lineno") -> "MemorySnapshot":
"""
Take a tracemalloc memory snapshot and return the top allocators.
Example:
snap = MemorySnapshot.take(top=20)
for row in snap.top_files:
print(row)
"""
if not tracemalloc.is_tracing():
tracemalloc.start()
snapshot = tracemalloc.take_snapshot()
stats = snapshot.statistics(key_type)[:top]
rows = []
for stat in stats:
frame = stat.traceback[0]
rows.append({
"file": frame.filename,
"lineno": frame.lineno,
"size_kb": round(stat.size / 1024, 2),
"count": stat.count,
})
current, peak = tracemalloc.get_traced_memory()
return cls(
top_files=rows,
total_kb=round(current / 1024, 2),
peak_kb=round(peak / 1024, 2),
)
def report(self, n: int = 10) -> str:
lines = [f"Memory: current={self.total_kb:.1f} KB peak={self.peak_kb:.1f} KB"]
lines.append(f"{'File':50s} {'Line':>6} {'KB':>8} {'Count':>8}")
lines.append("─" * 80)
for row in self.top_files[:n]:
fname = row["file"]
if len(fname) > 48:
fname = "…" + fname[-47:]
lines.append(f" {fname:48s} {row['lineno']:>6} {row['size_kb']:>8.2f} {row['count']:>8,}")
return "\n".join(lines)
def memory_diff(
before: tracemalloc.Snapshot,
after: tracemalloc.Snapshot,
top: int = 10,
) -> list[dict]:
"""
Compare two tracemalloc snapshots and return top memory growth locations.
Example:
before = tracemalloc.take_snapshot()
run_operation()
after = tracemalloc.take_snapshot()
for row in memory_diff(before, after):
print(row)
"""
stats = after.compare_to(before, "lineno")[:top]
return [
{
"file": stat.traceback[0].filename,
"lineno": stat.traceback[0].lineno,
"delta_kb": round(stat.size_diff / 1024, 2),
"new_size_kb": round(stat.size / 1024, 2),
}
for stat in stats
]
# ─────────────────────────────────────────────────────────────────────────────
# 4. GC callbacks / hooks
# ─────────────────────────────────────────────────────────────────────────────
class GCStatsCollector:
    """
    Attach to gc.callbacks to count collection cycles per generation
    and total objects collected.
    Example:
        stats = GCStatsCollector.install()
        # ... run workload ...
        print(stats.report())
        stats.uninstall()
    """
    def __init__(self) -> None:
        self.collections: dict[str, int] = {"gen0": 0, "gen1": 0, "gen2": 0}
        self.total_collected: int = 0
    def _callback(self, phase: str, info: dict) -> None:
        # info always carries "generation"; on "stop" it also carries
        # "collected" and "uncollectable" counts.
        if phase != "stop":
            return
        key = f"gen{info.get('generation', 0)}"
        self.collections[key] = self.collections.get(key, 0) + 1
        self.total_collected += info.get("collected", 0)
@classmethod
def install(cls) -> "GCStatsCollector":
inst = cls()
gc.callbacks.append(inst._callback)
return inst
def uninstall(self) -> None:
try:
gc.callbacks.remove(self._callback)
except ValueError:
pass
def report(self) -> str:
lines = ["GC Stats:"]
for gen, count in self.collections.items():
lines.append(f" {gen}: {count} collections")
lines.append(f" total objects freed: {self.total_collected:,}")
return "\n".join(lines)
# ─────────────────────────────────────────────────────────────────────────────
# 5. Fork-safe GC freeze
# ─────────────────────────────────────────────────────────────────────────────
def freeze_before_fork() -> int:
"""
Call gc.freeze() to move all tracked objects to the permanent generation.
This prevents copy-on-write page faults in forked worker processes.
Call this after application initialization, before forking.
Returns the number of frozen objects.
Example:
load_model()
load_index()
frozen = freeze_before_fork()
print(f"Froze {frozen:,} objects before fork")
os.fork()
"""
gc.collect()
gc.freeze()
return gc.get_freeze_count()
# ─────────────────────────────────────────────────────────────────────────────
# Demo
# ─────────────────────────────────────────────────────────────────────────────
if __name__ == "__main__":
print("=== gc demo ===")
print("\n--- gc_counts / force_collect ---")
    # Create cyclic garbage: acyclic objects are freed by refcounting
    # alone, so only cycles give force_collect() something to do
    for _ in range(1000):
        x = []
        x.append(x)  # self-referencing list; unreachable after rebinding
counts_before = gc_counts()
freed = force_collect()
counts_after = gc_counts()
print(f" before: {counts_before}")
print(f" after: {counts_after}")
print(f" freed: {freed}")
print("\n--- gc_thresholds / tune_gc ---")
orig = gc_thresholds()
print(f" original: {orig}")
tune_gc(gen0=1000, gen1=20, gen2=20)
print(f" tuned: {gc_thresholds()}")
gc.set_threshold(orig["gen0"], orig["gen1"], orig["gen2"]) # restore
print("\n--- object_type_counts (top 5) ---")
counts = object_type_counts()
top5 = sorted(counts.items(), key=lambda x: -x[1])[:5]
for t, n in top5:
print(f" {t:30s} {n:,}")
print("\n--- reference cycle detection ---")
class Node:
def __init__(self, val):
self.val = val
self.ref = None
a = Node(1)
b = Node(2)
a.ref = b
b.ref = a # cycle
gc.disable()
del a, b
unreachable = gc.collect()
print(f" cycling objects collected: {unreachable}")
gc.enable()
print("\n--- find_referrers ---")
target = {"key": "value"}
container = [target]
refs = find_referrers(target)
print(f" referrers of dict: {[type(r).__name__ for r in refs]}")
print("\n--- GCStatsCollector ---")
collector = GCStatsCollector.install()
for _ in range(3):
_ = [{i: list(range(i)) for i in range(50)} for _ in range(100)]
gc.collect()
collector.uninstall()
print(collector.report())
print("\n--- MemorySnapshot ---")
tracemalloc.start()
big = [list(range(1000)) for _ in range(100)]
snap = MemorySnapshot.take(top=5)
print(snap.report(n=3))
del big
print("\n--- freeze_before_fork ---")
n_frozen = freeze_before_fork()
print(f" frozen objects: {n_frozen:,}")
print("\n=== done ===")
For the tracemalloc alternative — tracemalloc (also stdlib) traces memory allocations at the Python level: tracemalloc.start(), take_snapshot(), and .compare_to() return allocation diffs by file/line. It shows where memory is allocated; gc shows what objects exist and detects cycles. tracemalloc is for “memory grew by 50 MB — find the code that allocated it”; gc is for “objects are not being collected — find what holds references”. Use them together: tracemalloc for allocation profiling, gc.get_referrers() for why objects are not being freed.

For the objgraph alternative — objgraph (PyPI) wraps gc.get_objects() and gc.get_referrers() with human-readable growth reports (objgraph.show_growth()), renders reference graphs to PNG (show_refs()), and finds the root paths keeping an object alive (find_backref_chain()). Use objgraph for interactive memory-debugging sessions in a REPL or Jupyter notebook where visualizations matter; use gc directly in production code for lightweight custom leak detection, GC tuning, and pre-fork freezing.

The Claude Skills 360 bundle includes gc skill sets covering basic control (force_collect()/gc_counts()/gc_thresholds()/tune_gc()), object and cycle inspection (object_type_counts()/find_referrers()/ref_chain()/has_cycle()), tracemalloc integration (MemorySnapshot/memory_diff()), the GCStatsCollector gc.callbacks hook, and the freeze_before_fork() fork-safe freeze helper. Start with the free tier to try memory management and gc pipeline code generation.
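The "use them together" advice can be sketched end to end: tracemalloc locates the allocation site of the growth, then gc.get_referrers() explains why the objects survive. The `_cache` dict and `handle_request` function here are illustrative stand-ins for a leaking component, not names from the code above:

```python
import gc
import tracemalloc

_cache: dict[int, list] = {}   # hypothetical module-level cache that "leaks"

def handle_request(i: int) -> None:
    _cache[i] = [0] * 1000     # grows forever: nothing ever evicts entries

tracemalloc.start()
before = tracemalloc.take_snapshot()
for i in range(200):
    handle_request(i)
after = tracemalloc.take_snapshot()

# Step 1: tracemalloc answers "where was the growth allocated?"
top = after.compare_to(before, "lineno")[0]
print(top.traceback[0].filename, top.traceback[0].lineno, top.size_diff)

# Step 2: gc answers "what still holds those objects?"
victim = _cache[0]
holders = [type(r).__name__ for r in gc.get_referrers(victim)]
print(holders)                  # a dict (the cache itself) shows up here
tracemalloc.stop()
```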