Claude Code for construct: Binary Protocol Parsing in Python — Claude Skills 360 Blog
Blog / AI / Claude Code for construct: Binary Protocol Parsing in Python
AI

Claude Code for construct: Binary Protocol Parsing in Python

Published: June 7, 2028
Read time: 5 min read
By: Claude Skills 360

construct is a declarative binary parser/builder for Python. pip install construct. Struct: from construct import Struct, Int8ub, Int16ub; Header = Struct("magic"/Int16ub, "version"/Int8ub). Parse: obj = Header.parse(b"\x00\x01\x02"). Build: Header.build({"magic": 1, "version": 2}). sizeof: Header.sizeof() → 3. Integer types: Int8ub Int8sb Int16ub Int16sb Int32ub Int32sb Int64ub Int64sb (u=unsigned, s=signed, b=big, l=little). Float: Float32b Float64b. Byte: Byte = Int8ub. Bytes: Bytes(4) — fixed length. GreedyBytes: rest of stream. String: PascalString(Int8ub, "utf8") — length-prefixed. CString: null-terminated. Array: Array(3, Int16ub) → list of 3 ints. GreedyRange: until end. Sequence: Sequence(Int8ub, Int16ub). Switch: Switch(this.type, {0: Format0, 1: Format1}). If: If(this.has_data, Bytes(4)). Computed: Computed(lambda ctx: ctx.length * 2). Rebuild: Rebuild(field, lambda ctx: len(ctx.data)). Enum: Enum(Byte, OK=0, ERR=1). BitStruct: BitStruct("flags"/FlagsEnum(BitsSwapped(Byte), ...)). Flag: Flag — 1-bit boolean. Checksum: from construct.lib import Checksum. Pass: Pass — skip field. Padding: Padding(2). Default: Default(Int8ub, 0). FocusedSeq: FocusedSeq("key", ...). Container: dict-like result object. this: this.field — access sibling. LazyBound: recursive structures. Claude Code generates construct binary protocol parsers, file format handlers, and network packet decoders.

CLAUDE.md for construct

## construct Stack
- Version: construct >= 2.10 | pip install construct
- Struct: Struct("field"/Type, "field2"/Type2)
- Types: Int8ub|Int16ul|Int32sb|Int64ub | Float32b | Bytes(n) | GreedyBytes
- String: PascalString(Int8ub, "utf8") | CString("utf8") | GreedyString("utf8")
- Repeat: Array(n, Type) | PrefixedArray(Int8ub, Type) | GreedyRange(Type)
- Conditional: Switch(this.type, {0: T0, 1: T1}) | If(this.flag, Type)

construct Binary Parsing Pipeline

# app/binary.py — construct Struct definitions, parsing, building, protocols, file formats
from __future__ import annotations

import io
from pathlib import Path
from typing import Any

from construct import (
    Array,
    Byte,
    Bytes,
    ByteSwapped,
    Computed,
    Const,
    CString,
    Default,
    Enum,
    Flag,
    GreedyBytes,
    GreedyRange,
    GreedyString,
    If,
    IfThenElse,
    Int8sb,
    Int8ub,
    Int16sb,
    Int16ub,
    Int16ul,
    Int32sb,
    Int32ub,
    Int32ul,
    Int64ub,
    Int64ul,
    Optional,
    Padding,
    Pass,
    PascalString,
    Prefixed,
    PrefixedArray,
    Rebuild,
    RepeatUntil,
    Select,
    Sequence,
    Struct,
    Switch,
    this,
    Const,
    Float32b,
    Float64b,
    BitStruct,
    BitsSwapped,
    FlagsEnum,
    NullTerminated,
    ChecksumError,
    RawCopy,
    Tunnel,
    Compressed,
    Checksum,
    Container,
    ListContainer,
)


# ─────────────────────────────────────────────────────────────────────────────
# 1. Basic type helpers
# ─────────────────────────────────────────────────────────────────────────────

def parse_bytes(definition, data: bytes) -> Container:
    """
    Parse bytes using a construct definition.

    Example:
        result = parse_bytes(MyStruct, raw_bytes)
        print(result.field_name)
    """
    return definition.parse(data)


def build_bytes(definition, obj: dict | Container) -> bytes:
    """
    Serialize a dict/Container to bytes using a construct definition.

    Example:
        data = build_bytes(MyStruct, {"magic": 0x1234, "version": 1})
    """
    return definition.build(obj)


def parse_file(definition, path: str | Path) -> Container:
    """Parse a binary file using a construct definition."""
    return definition.parse_file(str(path))


def size_of(definition) -> int | None:
    """
    Return the fixed byte size of a definition, or None if variable-length.
    """
    try:
        return definition.sizeof()
    except Exception:
        return None


# ─────────────────────────────────────────────────────────────────────────────
# 2. Common protocol primitives
# ─────────────────────────────────────────────────────────────────────────────

# IPv4 packet header (simplified)
IPv4Header = Struct(
    "version_ihl"  / Byte,          # version(4b) + IHL(4b)
    "tos"          / Byte,          # type of service
    "total_length" / Int16ub,
    "id"           / Int16ub,
    "flags_offset" / Int16ub,       # flags(3b) + fragment offset(13b)
    "ttl"          / Byte,
    "protocol"     / Byte,          # 6=TCP, 17=UDP, 1=ICMP
    "checksum"     / Int16ub,
    "src_ip"       / Bytes(4),
    "dst_ip"       / Bytes(4),
)


# DNS question record
DNSQuestion = Struct(
    "qname"   / RepeatUntil(lambda x, lst, ctx: x == 0, Byte),
    "qtype"   / Int16ub,
    "qclass"  / Int16ub,
)


# TLV (Type-Length-Value) — common in binary protocols
TLVRecord = Struct(
    "type"    / Int8ub,
    "length"  / Int16ub,
    "value"   / Bytes(this.length),
)


def parse_tlv_stream(data: bytes) -> list[Container]:
    """Parse a stream of TLV records until data is exhausted."""
    stream  = io.BytesIO(data)
    records = []
    while stream.tell() < len(data):
        try:
            rec = TLVRecord.parse_stream(stream)
            records.append(rec)
        except Exception:
            break
    return records


# ─────────────────────────────────────────────────────────────────────────────
# 3. Message framing formats
# ─────────────────────────────────────────────────────────────────────────────

# Version-tagged message envelope
MessageTypeEnum = Enum(Byte, PING=0, PONG=1, DATA=2, ACK=3, ERROR=4)

PingPayload  = Struct("seq" / Int32ub)
PongPayload  = Struct("seq" / Int32ub)
DataPayload  = Struct(
    "seq"     / Int32ub,
    "length"  / Rebuild(Int16ub, lambda ctx: len(ctx.get("body", b""))),
    "body"    / Bytes(this.length),
)
ErrorPayload = Struct(
    "code"    / Int16ub,
    "message" / PascalString(Int8ub, "utf8"),
)

Message = Struct(
    "magic"     / Const(b"\xCA\xFE"),
    "version"   / Default(Byte, 1),
    "type"      / MessageTypeEnum,
    "payload"   / Switch(this.type, {
        "PING":  PingPayload,
        "PONG":  PongPayload,
        "DATA":  DataPayload,
        "ACK":   Struct("seq" / Int32ub),
        "ERROR": ErrorPayload,
    }),
)


def encode_message(msg_type: str, **payload_fields) -> bytes:
    """
    Build a framed Message.

    Example:
        raw = encode_message("DATA", seq=1, body=b"hello world")
        raw = encode_message("PING", seq=42)
    """
    return Message.build({
        "version": 1,
        "type":    msg_type,
        "payload": payload_fields,
    })


def decode_message(data: bytes) -> Container:
    """
    Parse a framed Message from raw bytes.

    Example:
        msg = decode_message(raw_bytes)
        if msg.type == "DATA":
            process(msg.payload.body)
    """
    return Message.parse(data)


# ─────────────────────────────────────────────────────────────────────────────
# 4. File format: simple binary file
# ─────────────────────────────────────────────────────────────────────────────

# Generic binary file with a header, directory table, and data blobs
FileHeaderStruct = Struct(
    "magic"       / Const(b"BDAT"),
    "version"     / Int16ub,
    "entry_count" / Int16ub,
    "flags"       / Default(Int8ub, 0),
    "reserved"    / Padding(3),
)

FileEntryStruct = Struct(
    "name"    / PascalString(Int8ub, "utf8"),
    "offset"  / Int32ub,
    "length"  / Int32ub,
    "crc"     / Int32ub,
)

BinaryFile = Struct(
    "header"  / FileHeaderStruct,
    "entries" / Array(this.header.entry_count, FileEntryStruct),
    "data"    / GreedyBytes,
)


def create_binary_file(entries: dict[str, bytes]) -> bytes:
    """
    Create a simple binary container with named blobs.

    Example:
        container = create_binary_file({
            "config": b'{"debug": true}',
            "data":   raw_data_bytes,
        })
    """
    import zlib
    entry_list  = []
    blobs       = []
    offset      = 0

    for name, blob in entries.items():
        entry_list.append({
            "name":   name,
            "offset": offset,
            "length": len(blob),
            "crc":    zlib.crc32(blob) & 0xFFFFFFFF,
        })
        blobs.append(blob)
        offset += len(blob)

    return BinaryFile.build({
        "header": {
            "version":     1,
            "entry_count": len(entry_list),
        },
        "entries": entry_list,
        "data":    b"".join(blobs),
    })


def read_binary_file(data: bytes) -> dict[str, bytes]:
    """
    Parse a binary container and return {name: bytes} dict.

    Example:
        blobs = read_binary_file(container_bytes)
        config = json.loads(blobs["config"])
    """
    container = BinaryFile.parse(data)
    raw       = container.data
    result    = {}
    for entry in container.entries:
        result[entry.name] = raw[entry.offset: entry.offset + entry.length]
    return result


# ─────────────────────────────────────────────────────────────────────────────
# 5. Bit-level parsing
# ─────────────────────────────────────────────────────────────────────────────

# TCP flags byte (simplified)
TCPFlags = BitStruct(
    "cwr" / Flag,
    "ece" / Flag,
    "urg" / Flag,
    "ack" / Flag,
    "psh" / Flag,
    "rst" / Flag,
    "syn" / Flag,
    "fin" / Flag,
)


def parse_tcp_flags(flags_byte: int) -> Container:
    """
    Parse TCP control bits from a single byte.

    Example:
        flags = parse_tcp_flags(0x12)  # SYN+ACK
        assert flags.syn and flags.ack
    """
    return TCPFlags.parse(bytes([flags_byte]))


def build_tcp_flags(**flags: bool) -> int:
    """
    Build a TCP flags byte from named flags.

    Example:
        byte = build_tcp_flags(syn=True, ack=True)  # 0x12
    """
    all_flags = {k: False for k in ["cwr","ece","urg","ack","psh","rst","syn","fin"]}
    all_flags.update(flags)
    encoded = TCPFlags.build(all_flags)
    return encoded[0]


# ─────────────────────────────────────────────────────────────────────────────
# Demo
# ─────────────────────────────────────────────────────────────────────────────

if __name__ == "__main__":
    print("=== TLV parsing ===")
    # Build TLV records manually
    r1 = TLVRecord.build({"type": 1, "length": 4, "value": b"test"})
    r2 = TLVRecord.build({"type": 2, "length": 3, "value": b"abc"})
    records = parse_tlv_stream(r1 + r2)
    for r in records:
        print(f"  type={r.type} len={r.length} value={r.value!r}")

    print("\n=== Message framing ===")
    raw_ping = encode_message("PING", seq=1)
    print(f"  PING bytes ({len(raw_ping)}): {raw_ping.hex()}")
    decoded = decode_message(raw_ping)
    print(f"  Decoded: type={decoded.type} seq={decoded.payload.seq}")

    raw_data = encode_message("DATA", seq=7, body=b"hello world")
    msg = decode_message(raw_data)
    print(f"  DATA:  seq={msg.payload.seq} body={msg.payload.body!r}")

    print("\n=== Binary file container ===")
    container = create_binary_file({
        "config": b'{"debug": true}',
        "model":  b"\x00" * 16,
    })
    print(f"  Container size: {len(container)} bytes")
    blobs = read_binary_file(container)
    for name, blob in blobs.items():
        print(f"  {name!r}: {len(blob)} bytes")

    print("\n=== TCP flags ===")
    syn_ack = parse_tcp_flags(0x12)
    print(f"  0x12: syn={syn_ack.syn} ack={syn_ack.ack} fin={syn_ack.fin}")
    built = build_tcp_flags(syn=True, ack=True)
    print(f"  syn+ack built: 0x{built:02x}")

    print("\n=== IPv4 header size ===")
    print(f"  IPv4Header.sizeof(): {size_of(IPv4Header)}")

For the struct stdlib alternative — Python’s built-in struct.pack/struct.unpack is fast and zero-dependency but requires manual format string management and positional tuple access; construct defines formats declaratively with named fields, conditional parsing (Switch, If), enums, and build() — use struct for trivial fixed-format parsing or performance-critical hot paths, construct when you’re implementing a complete protocol parser with conditional and variable-length fields. For the bitstruct / bitstring alternative — bitstruct and bitstring focus specifically on bit-level manipulation and stream reading; construct provides integrated bit-level parsing via BitStruct, Flag, and BitsSwapped alongside the full byte-level Struct system — use construct for protocols that mix byte-level and bit-level fields in the same definition. The Claude Skills 360 bundle includes construct skill sets covering parse_bytes()/build_bytes()/parse_file()/size_of(), TLVRecord stream parsing, IPv4Header/DNSQuestion primitives, MessageTypeEnum/Message framing with Switch payload dispatch, encode_message()/decode_message(), FileHeaderStruct/BinaryFile container create_binary_file()/read_binary_file(), TCPFlags BitStruct parse_tcp_flags()/build_tcp_flags(). Start with the free tier to try binary protocol parsing and serialization code generation.

Keep Reading

AI

Claude Code for email.contentmanager: Python Email Content Accessors

Read and write EmailMessage body content with Python's email.contentmanager module and Claude Code — email contentmanager ContentManager for the class that maps content types to get and set handler functions allowing EmailMessage to support get_content and set_content with type-specific behaviour, email contentmanager raw_data_manager for the ContentManager instance that handles raw bytes and str payloads without any conversion, email contentmanager content_manager for the standard ContentManager instance used by email.policy.default that intelligently handles text plain text html multipart and binary content types, email contentmanager get_content_text for the handler that returns the decoded text payload of a text-star message part as a str, email contentmanager get_content_binary for the handler that returns the raw decoded bytes payload of a non-text message part, email contentmanager get_data_manager for the get-handler lookup used by EmailMessage get_content to find the right reader function for the content type, email contentmanager set_content text for the handler that creates and sets a text part correctly choosing charset and transfer encoding, email contentmanager set_content bytes for the handler that creates and sets a binary part with base64 encoding and optional filename Content-Disposition, email contentmanager EmailMessage get_content for the method that reads the message body using the registered content manager handlers, email contentmanager EmailMessage set_content for the method that sets the message body and MIME headers in one call, email contentmanager EmailMessage make_alternative make_mixed make_related for the methods that convert a simple message into a multipart container, email contentmanager EmailMessage add_attachment for the method that attaches a file or bytes to a multipart message, and email contentmanager integration with email.message and email.policy and email.mime and io for building high-level email readers attachment extractors text body accessors HTML readers and policy-aware MIME construction pipelines.

5 min read Feb 12, 2029
AI

Claude Code for email.charset: Python Email Charset Encoding

Control header and body encoding for international email with Python's email.charset module and Claude Code — email charset Charset for the class that wraps a character set name with the encoding rules for header encoding and body encoding describing how to encode text for that charset in email messages, email charset Charset header_encoding for the attribute specifying whether headers using this charset should use QP quoted-printable encoding BASE64 encoding or no encoding, email charset Charset body_encoding for the attribute specifying the Content-Transfer-Encoding to use for message bodies in this charset such as QP or BASE64, email charset Charset output_codec for the attribute giving the Python codec name used to encode the string to bytes for the wire format, email charset Charset input_codec for the attribute giving the Python codec name used to decode incoming bytes to str, email charset Charset get_output_charset for returning the output charset name, email charset Charset header_encode for encoding a header string using the charset's header_encoding method, email charset Charset body_encode for encoding body content using the charset's body_encoding, email charset Charset convert for converting a string from the input_codec to the output_codec, email charset add_charset for registering a new charset with custom encoding rules in the global charset registry, email charset add_alias for adding an alias name that maps to an existing registered charset, email charset add_codec for registering a codec name mapping for use by the charset machinery, and email charset integration with email.message and email.mime and email.policy and email.encoders for building international email senders non-ASCII header encoders Content-Transfer-Encoding selectors charset-aware message constructors and MIME encoding pipelines.

5 min read Feb 11, 2029
AI

Claude Code for email.utils: Python Email Address and Header Utilities

Parse and format RFC 2822 email addresses and dates with Python's email.utils module and Claude Code — email utils parseaddr for splitting a display-name plus angle-bracket address string into a realname and email address tuple, email utils formataddr for combining a realname and address string into a properly quoted RFC 2822 address with angle brackets, email utils getaddresses for parsing a list of raw address header strings each potentially containing multiple comma-separated addresses into a list of realname address tuples, email utils parsedate for parsing an RFC 2822 date string into a nine-tuple compatible with time.mktime, email utils parsedate_tz for parsing an RFC 2822 date string into a ten-tuple that includes the UTC offset timezone in seconds, email utils parsedate_to_datetime for parsing an RFC 2822 date string into an aware datetime object with timezone, email utils formatdate for formatting a POSIX timestamp or the current time as an RFC 2822 date string with optional usegmt and localtime flags, email utils format_datetime for formatting a datetime object as an RFC 2822 date string, email utils make_msgid for generating a globally unique Message-ID string with optional idstring and domain components, email utils decode_rfc2231 for decoding an RFC 2231 encoded parameter value into a tuple of charset language and value, email utils encode_rfc2231 for encoding a string as an RFC 2231 encoded parameter value, email utils collapse_rfc2231_value for collapsing a decoded RFC 2231 tuple to a Unicode string, and email utils integration with email.message and email.headerregistry and datetime and time for building address parsers date formatters message-id generators header extractors and RFC-compliant email construction utilities.

5 min read Feb 10, 2029

Put these ideas into practice

Claude Skills 360 gives you production-ready skills for everything in this article — and 2,350+ more. Start free or go all-in.

Back to Blog

Get 360 skills free