Claude Code for ast: Python Abstract Syntax Trees

Published: August 14, 2028

Read time: 5 min read

By: Claude Skills 360

Python's ast module parses source code into an Abstract Syntax Tree (AST) for analysis and transformation. After import ast, call tree = ast.parse(source, filename="<string>", mode="exec"); the mode is "exec" for a module, "eval" for a single expression, or "single" for one interactive statement. ast.dump(tree, indent=2) returns a human-readable string of the tree (the indent parameter requires Python 3.9+), and ast.unparse(node) (Python 3.9+) turns a tree back into source. ast.literal_eval("{'a': 1}") evaluates literal expressions only, without executing arbitrary code. ast.walk(tree) iterates over every node in no guaranteed order. ast.get_docstring(node) extracts the docstring from a Module, ClassDef, FunctionDef, or AsyncFunctionDef. ast.fix_missing_locations(tree) fills in lineno and col_offset after you construct nodes by hand.

To traverse by node type, subclass ast.NodeVisitor and define visit_X for each node type X, for example def visit_FunctionDef(self, node); call self.generic_visit(node) to recurse into children. ast.NodeTransformer works the same way, but each visit_X returns the node, a replacement node, or None to delete it. Key node types include Module, FunctionDef, AsyncFunctionDef, ClassDef, Assign, AnnAssign, Return, Import, ImportFrom, Call, Name, Attribute, Constant, and arg. To run a tree, compile it with code = compile(tree, "<string>", "exec") and pass the code object to exec(code). Claude Code generates linters, security scanners, import analyzers, dead code detectors, and source-to-source transformers.
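As a quick orientation before the full pipeline, the following minimal sketch exercises the basic calls described above on a tiny, made-up snippet. The greet function and the sample strings are illustrative assumptions, not part of any real project.

```python
import ast

source = '''
def greet(name):
    """Return a greeting."""
    return "Hello, " + name
'''

# mode="exec" (the default) parses a whole module.
module = ast.parse(source)
func = module.body[0]
print(type(func).__name__)        # FunctionDef
print(ast.get_docstring(func))    # Return a greeting.

# mode="eval" parses a single expression and returns an ast.Expression.
expr = ast.parse("1 + 2 * 3", mode="eval")
print(type(expr.body).__name__)   # BinOp
print(ast.unparse(expr.body))     # 1 + 2 * 3

# ast.walk yields every node; here we collect the identifiers that are read.
# Function parameters are ast.arg nodes, so only the use of `name` shows up.
names = sorted({n.id for n in ast.walk(module) if isinstance(n, ast.Name)})
print(names)                      # ['name']

# literal_eval accepts literals only; a function call raises ValueError.
print(ast.literal_eval("{'a': 1}"))
try:
    ast.literal_eval("len('abc')")
except ValueError:
    print("rejected non-literal expression")
```

Note that ast.unparse normalizes spacing, so the string it returns matches the input here only because the input was already in canonical form.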

CLAUDE.md for ast

## ast Stack
- Stdlib: import ast
- Parse:   tree = ast.parse(source_str)
- Safe:    ast.literal_eval(s)             # no code execution — literals only
- Walk:    for node in ast.walk(tree):
- Visitor: class V(ast.NodeVisitor): def visit_FunctionDef(self, n): ...
- Transform: class T(ast.NodeTransformer): def visit_Call(self, n): return n
- Unparse: ast.unparse(tree)               # Python 3.9+

ast Static Analysis Pipeline

# app/astutil.py — imports, function scan, call finder, linter, transformer
from __future__ import annotations

import ast
import textwrap
from dataclasses import dataclass
from pathlib import Path
from typing import Any


# ─────────────────────────────────────────────────────────────────────────────
# 1. Parse and dump helpers
# ─────────────────────────────────────────────────────────────────────────────

def parse_source(source: str, filename: str = "<string>") -> ast.Module:
    """
    Parse Python source code into an AST Module.

    Example:
        tree = parse_source(Path("app.py").read_text(), "app.py")
    """
    return ast.parse(source, filename=filename)


def parse_file(path: str | Path) -> ast.Module:
    """Parse a Python source file and return its AST."""
    p = Path(path)
    return ast.parse(p.read_text(encoding="utf-8"), filename=str(p))


def dump_ast(tree: ast.AST, indent: int = 2) -> str:
    """Return a human-readable indented dump of an AST node."""
    return ast.dump(tree, indent=indent)


def safe_eval(expr: str) -> Any:
    """
    Safely evaluate a Python literal expression string.
    Handles str, bytes, int, float, complex, bool, None, dict, list, tuple, set.
    Raises ValueError on non-literal expressions and SyntaxError on input that does not parse.

    Example:
        safe_eval("[1, 2, 3]")          # [1, 2, 3]
        safe_eval("{'key': 'value'}")   # {"key": "value"}
        safe_eval("open('f')")          # ValueError — not a literal
    """
    return ast.literal_eval(expr)


def back_to_source(tree: ast.AST) -> str:
    """
    Convert an AST node back to Python source code (Python 3.9+).

    Example:
        tree = parse_source("x = 1 + 2")
        code = back_to_source(tree)   # "x = 1 + 2"
    """
    return ast.unparse(tree)


# ─────────────────────────────────────────────────────────────────────────────
# 2. Import analysis
# ─────────────────────────────────────────────────────────────────────────────

@dataclass
class ImportInfo:
    module:   str       # "os.path" or "json"
    names:    list[str] # ["path"] for "import os.path as path" or ["loads"] for "from json import loads"
    aliases:  dict[str, str]  # {alias: real_name}
    lineno:   int
    is_from:  bool      # True = "from X import Y"


def extract_imports(tree: ast.Module) -> list[ImportInfo]:
    """
    Extract all import statements from an AST.

    Example:
        for imp in extract_imports(parse_source(source)):
            print(imp.module, imp.names)
    """
    results = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                results.append(ImportInfo(
                    module=alias.name,
                    names=[alias.asname or alias.name],
                    aliases={alias.asname: alias.name} if alias.asname else {},
                    lineno=node.lineno,
                    is_from=False,
                ))
        elif isinstance(node, ast.ImportFrom):
            module = node.module or ""
            results.append(ImportInfo(
                module=module,
                names=[a.asname or a.name for a in node.names],
                aliases={a.asname: a.name for a in node.names if a.asname},
                lineno=node.lineno,
                is_from=True,
            ))
    return results


def third_party_imports(tree: ast.Module, stdlib_modules: set[str] | None = None) -> list[str]:
    """
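    Return top-level module names that are likely third-party (not in stdlib).
    When stdlib_modules is None, a small built-in set is used; for fuller coverage
    pass set(sys.stdlib_module_names) (Python 3.10+). Relative imports such as
    "from .utils import x" are not distinguished and may appear in the result.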

    Example:
        third = third_party_imports(parse_file("app.py"))
    """
    _stdlib = stdlib_modules or {
        "os", "sys", "re", "json", "math", "datetime", "pathlib", "typing",
        "collections", "itertools", "functools", "io", "abc", "dataclasses",
        "enum", "copy", "warnings", "logging", "hashlib", "hmac", "struct",
        "queue", "threading", "asyncio", "urllib", "http", "email", "csv",
        "inspect", "ast", "traceback", "gc", "platform", "random", "secrets",
        "fractions", "decimal", "statistics", "operator", "string", "textwrap",
        "time", "calendar", "zlib", "gzip", "zipfile", "tarfile", "array",
        "weakref", "contextlib", "pprint", "pickle", "shelve", "uuid",
        "socket", "ssl", "subprocess", "shutil", "tempfile", "glob", "fnmatch",
    }
    imports = extract_imports(tree)
    third = []
    for imp in imports:
        top = imp.module.split(".")[0]
        if top and top not in _stdlib and top not in third:
            third.append(top)
    return sorted(third)


# ─────────────────────────────────────────────────────────────────────────────
# 3. Function / class summary
# ─────────────────────────────────────────────────────────────────────────────

@dataclass
class FunctionSummary:
    name:       str
    lineno:     int
    args:       list[str]
    is_async:   bool
    docstring:  str | None
    decorators: list[str]
    returns:    str | None


def list_functions(tree: ast.Module) -> list[FunctionSummary]:
    """
    Return summaries of every function definition in the tree, including
    nested functions and class methods.

    Example:
        for fn in list_functions(parse_source(source)):
            print(f"  {fn.name}({', '.join(fn.args)})")
    """
    results: list[FunctionSummary] = []

    class Visitor(ast.NodeVisitor):
        def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
            _collect(node, is_async=False)
            self.generic_visit(node)

        def visit_AsyncFunctionDef(self, node: ast.AsyncFunctionDef) -> None:
            _collect(node, is_async=True)
            self.generic_visit(node)

    def _collect(node: ast.FunctionDef | ast.AsyncFunctionDef, is_async: bool) -> None:
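        # Regular positional parameters only; positional-only, keyword-only,
        # *args, and **kwargs parameters are not included.
        args = [a.arg for a in node.args.args]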
        decs = []
        for dec in node.decorator_list:
            decs.append(ast.unparse(dec))
        ret = ast.unparse(node.returns) if node.returns else None
        results.append(FunctionSummary(
            name=node.name,
            lineno=node.lineno,
            args=args,
            is_async=is_async,
            docstring=ast.get_docstring(node),
            decorators=decs,
            returns=ret,
        ))

    Visitor().visit(tree)
    return results


@dataclass
class ClassSummary:
    name:      str
    lineno:    int
    bases:     list[str]
    methods:   list[str]
    docstring: str | None


def list_classes(tree: ast.Module) -> list[ClassSummary]:
    """
    Return summaries of all class definitions.

    Example:
        for cls in list_classes(parse_file("models.py")):
            print(f"  class {cls.name}({', '.join(cls.bases)})")
    """
    results = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            bases = [ast.unparse(b) for b in node.bases]
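            # Look at direct children only, so helper functions nested inside
            # methods and methods of nested classes are not reported here.
            methods = [
                n.name for n in node.body
                if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
            ]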
            results.append(ClassSummary(
                name=node.name,
                lineno=node.lineno,
                bases=bases,
                methods=methods,
                docstring=ast.get_docstring(node),
            ))
    return results


# ─────────────────────────────────────────────────────────────────────────────
# 4. Security / linting checks
# ─────────────────────────────────────────────────────────────────────────────

@dataclass
class Issue:
    rule:    str
    message: str
    lineno:  int
    col:     int


class BasicLinter(ast.NodeVisitor):
    """
    A simple NodeVisitor-based linter demonstrating common checks.

    Checks:
    - SECURITY001: use of exec() or eval()
    - SECURITY002: subprocess calls with shell=True
    - STYLE001:    bare except without specific exception type
    - STYLE002:    assert statements (removed in optimized bytecode)
    """

    def __init__(self) -> None:
        self.issues: list[Issue] = []

    def _add(self, rule: str, msg: str, node: ast.AST) -> None:
        self.issues.append(Issue(rule, msg, getattr(node, "lineno", 0), getattr(node, "col_offset", 0)))

    def visit_Call(self, node: ast.Call) -> None:
        fn = node.func
        # exec / eval
        if isinstance(fn, ast.Name) and fn.id in ("exec", "eval"):
            self._add("SECURITY001", f"Use of {fn.id}() is potentially dangerous", node)
        # subprocess with shell=True
        if isinstance(fn, (ast.Name, ast.Attribute)):
            fn_name = fn.id if isinstance(fn, ast.Name) else fn.attr
            if fn_name in ("call", "run", "Popen", "check_call", "check_output"):
                for kw in node.keywords:
                    if kw.arg == "shell" and isinstance(kw.value, ast.Constant) and kw.value.value:
                        self._add("SECURITY002", "subprocess called with shell=True (injection risk)", node)
        self.generic_visit(node)

    def visit_ExceptHandler(self, node: ast.ExceptHandler) -> None:
        if node.type is None:
            self._add("STYLE001", "Bare except clause — catches all exceptions including SystemExit", node)
        self.generic_visit(node)

    def visit_Assert(self, node: ast.Assert) -> None:
        self._add("STYLE002", "Assert removed by -O flag — use explicit if/raise for runtime checks", node)
        self.generic_visit(node)


def lint_source(source: str, filename: str = "<string>") -> list[Issue]:
    """
    Run BasicLinter on source; return list of issues.

    Example:
        issues = lint_source(Path("script.py").read_text())
        for issue in issues:
            print(f"  [{issue.rule}] line {issue.lineno}: {issue.message}")
    """
    tree = ast.parse(source, filename=filename)
    linter = BasicLinter()
    linter.visit(tree)
    return sorted(linter.issues, key=lambda i: i.lineno)


# ─────────────────────────────────────────────────────────────────────────────
# 5. Source transformer
# ─────────────────────────────────────────────────────────────────────────────

class PrintToLogTransformer(ast.NodeTransformer):
    """
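    NodeTransformer that replaces bare print() calls with logger.info() calls.
    Demonstrates in-place AST transformation. Arguments are passed through
    unchanged, so calls with several positional arguments or print-specific
    keywords (sep, end, file, flush) need manual review after the rewrite.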
    """

    def visit_Call(self, node: ast.Call) -> ast.AST:
        self.generic_visit(node)
        if isinstance(node.func, ast.Name) and node.func.id == "print":
            # Replace print(args) → logger.info(args)
            new_func = ast.Attribute(
                value=ast.Name(id="logger", ctx=ast.Load()),
                attr="info",
                ctx=ast.Load(),
            )
            return ast.Call(func=new_func, args=node.args, keywords=node.keywords)
        return node


def replace_print_with_logger(source: str) -> str:
    """
    Return source with bare print() calls replaced by logger.info() calls.

    Example:
        new_source = replace_print_with_logger('print("hello")')
        # → "logger.info('hello')"  (ast.unparse emits single-quoted strings)
    """
    tree = ast.parse(source)
    transformer = PrintToLogTransformer()
    new_tree = transformer.visit(tree)
    ast.fix_missing_locations(new_tree)
    return ast.unparse(new_tree)


# ─────────────────────────────────────────────────────────────────────────────
# Demo
# ─────────────────────────────────────────────────────────────────────────────

if __name__ == "__main__":
    source = textwrap.dedent("""\
        import os
        import json
        from datetime import datetime
        import requests   # third-party

        class Config:
            \"\"\"Application configuration.\"\"\"
            host: str = "localhost"
            port: int = 8080

            def load(self, path: str) -> dict:
                \"\"\"Load config from a JSON file.\"\"\"
                with open(path) as f:
                    return json.load(f)

        async def fetch_data(url: str, timeout: int = 30) -> dict:
            \"\"\"Fetch JSON from a URL.\"\"\"
            result = await client.get(url, timeout=timeout)
            return result.json()

        def risky(code):
            eval(code)
            exec(code)

        def bad_except():
            try:
                pass
            except:
                pass
            assert True, "sanity"

        print("Starting up")
    """)

    print("=== ast demo ===")

    print("\n--- parse + dump (first 10 lines) ---")
    tree = parse_source(source, "demo.py")
    dump = dump_ast(tree, indent=2)
    print("\n".join(dump.splitlines()[:10]) + "\n  ...")

    print("\n--- extract_imports ---")
    for imp in extract_imports(tree):
        print(f"  line {imp.lineno}: {'from ' if imp.is_from else ''}{imp.module} → {imp.names}")

    print("\n--- third_party_imports ---")
    print(f"  {third_party_imports(tree)}")

    print("\n--- list_functions ---")
    for fn in list_functions(tree):
        async_label = "async " if fn.is_async else ""
        print(f"  {async_label}def {fn.name}({', '.join(fn.args)}) → {fn.returns}  line={fn.lineno}")

    print("\n--- list_classes ---")
    for cls in list_classes(tree):
        print(f"  class {cls.name}({', '.join(cls.bases)})  methods={cls.methods}")

    print("\n--- lint_source ---")
    issues = lint_source(source)
    for issue in issues:
        print(f"  [{issue.rule}] line {issue.lineno}: {issue.message}")

    print("\n--- replace_print_with_logger ---")
    transformed = replace_print_with_logger('print("Starting up")\nprint("Done")')
    print(f"  {transformed!r}")

    print("\n--- safe_eval ---")
    print(f"  safe_eval('[1, 2, 3]') = {safe_eval('[1, 2, 3]')}")
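    # Keep the literal in a variable: braces and escaped quotes inside an
    # f-string would otherwise be parsed as a replacement field.
    literal = "{'a': 1}"
    print(f"  safe_eval({literal!r}) = {safe_eval(literal)}")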
    try:
        safe_eval("__import__('os').system('echo hi')")
    except ValueError as e:
        print(f"  malicious eval blocked: {type(e).__name__}")

    print("\n--- back_to_source (round-trip) ---")
    simple = "x = 1 + 2 * 3"
    rt = back_to_source(parse_source(simple))
    print(f"  '{simple}' → '{rt}'")

    print("\n=== done ===")
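The pipeline above stops at ast.unparse(), but the introduction also mentions deleting nodes by returning None from a NodeTransformer and running a tree with compile() and exec(). The following minimal sketch shows both steps together. StripAsserts, double, and the sample source are hypothetical names chosen for illustration, and exec() should only ever be used on source you trust.

```python
import ast


class StripAsserts(ast.NodeTransformer):
    """Hypothetical transformer: drop every assert statement."""

    def visit_Assert(self, node: ast.Assert) -> None:
        # Returning None removes the statement from its parent's body.
        # Caveat: if the assert was the only statement in a block, the
        # resulting tree is invalid and compile() will reject it.
        return None


source = """
def double(x):
    assert isinstance(x, int), "ints only"
    return x * 2

result = double("ab")
"""

tree = StripAsserts().visit(ast.parse(source))
ast.fix_missing_locations(tree)

# compile() accepts an AST directly; exec() then runs the code object.
namespace: dict = {}
exec(compile(tree, "<transformed>", "exec"), namespace)

# With the assert removed, the string argument is accepted and doubled.
print(namespace["result"])   # abab
print(ast.unparse(tree))
```

Executing into a dedicated namespace dict keeps the transformed code's names out of the caller's globals, which makes the result easy to inspect.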

ast versus inspect: inspect.getsource() retrieves the source code of live, already-imported objects, and the inspect module also provides signature(), getmembers(), and call-stack introspection. ast parses source text without importing or executing it. Use inspect when you have a live object and need its runtime interface; use ast for static analysis, linters, code scanners, and transformation tools that must work on arbitrary source files without running them.

ast versus libcst and rope: libcst (PyPI) provides a Concrete Syntax Tree that preserves whitespace, comments, and formatting, which enables source-faithful refactoring. rope (PyPI) is a Python refactoring toolkit (rename, extract, inline, move). The stdlib ast discards comments and formatting, so a round trip through ast.unparse() reformats the code. Use libcst when transformations must preserve the existing code style (comments, blank lines, trailing commas); use rope for editor-integrated rename and extract refactoring; use ast for lightweight analysis, security scanning, and transformations where reformatting is acceptable.

The Claude Skills 360 bundle includes ast skill sets covering the parse_source()/parse_file()/dump_ast()/safe_eval()/back_to_source() core helpers, extract_imports()/third_party_imports() import analysis, list_functions()/list_classes() structure extraction, the BasicLinter NodeVisitor for the SECURITY001/SECURITY002/STYLE001/STYLE002 checks, and the PrintToLogTransformer NodeTransformer with replace_print_with_logger() for source transformation. Start with the free tier to try Python source analysis patterns and ast pipeline code generation.
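To make the ast trade-offs concrete, here is a small sketch contrasting the two stdlib approaches and showing what a round trip through ast.unparse() drops. The sample function and the text string are made-up inputs for illustration only.

```python
import ast
import inspect


def sample(a, b=2):
    return a + b


# inspect needs a live object and reports its runtime interface.
print(inspect.signature(sample))   # (a, b=2)

# ast needs only text, and never imports or runs it. The round trip
# discards the comment, the blank lines, the extra spaces, and the
# redundant parentheses.
text = "x = 1  # the answer\n\n\ny   =   (x + 1)\n"
round_tripped = ast.unparse(ast.parse(text))
print(round_tripped)
# x = 1
# y = x + 1
```

When losing a comment like the one above would be unacceptable, that is the signal to reach for a formatting-preserving tool rather than ast.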
