Python dataclasses auto-generate __init__, __repr__, and __eq__ from field annotations. Import the essentials with from dataclasses import dataclass, field. The main features:
- Basic: @dataclass class Point: x: float; y: float
- Default: @dataclass class Config: debug: bool = False; port: int = 8080
- Mutable default: tags: list = field(default_factory=list)
- Computed: name: str = field(init=False), assigned in __post_init__
- post_init: def __post_init__(self): if self.x < 0: raise ValueError(...)
- asdict / astuple: from dataclasses import asdict, astuple; asdict(point) → dict; astuple(point) → tuple
- replace: from dataclasses import replace; p2 = replace(p, y=10)
- frozen: @dataclass(frozen=True) makes instances immutable and hashable
- order: @dataclass(order=True) adds <, >, <=, >=
- slots: @dataclass(slots=True) (Python 3.10+) gives faster attribute access and lower memory use
- kw_only: @dataclass(kw_only=True) (Python 3.10+) makes every field keyword-only
- KW_ONLY sentinel: field1: int; _: KW_ONLY; field2: str (fields after the sentinel are keyword-only, Python 3.10+)
- InitVar: invite_code: InitVar[str] is passed to __init__ and __post_init__ but not stored
- ClassVar: count: ClassVar[int] = 0 is a class attribute, not an __init__ parameter
- fields(): from dataclasses import fields; [(f.name, f.type) for f in fields(cls)]
- is_dataclass(): from dataclasses import is_dataclass; is_dataclass(obj)
- make_dataclass: D = make_dataclass("D", [("x", int), ("y", int)])
- Inheritance: @dataclass class Child(Parent): extra: str
- JSON: json.dumps(asdict(obj)), which works when every field value is JSON-serializable
Claude Code generates typed data models, DTO classes, config dataclasses, and serialization helpers.
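As a quick illustration of several features from the overview above (default_factory, frozen, order, replace, asdict, astuple), here is a minimal sketch. The Config and Point classes mirror the one-line snippets in the overview; the variable names are only illustrative.

```python
from dataclasses import asdict, astuple, dataclass, field, replace


@dataclass
class Config:
    debug: bool = False
    port: int = 8080
    # default_factory builds a fresh list for every instance
    tags: list = field(default_factory=list)


@dataclass(frozen=True, order=True)
class Point:
    x: float
    y: float


a, b = Config(), Config()
a.tags.append("web")  # does not leak into b.tags

p = Point(1.0, 2.0)
p2 = replace(p, y=10.0)  # new instance; p itself is unchanged
lookup = {p: "first"}    # frozen=True (with the default eq=True) makes Point hashable

print(asdict(p2))   # dict view of the fields
print(astuple(p))   # tuple view of the fields
print(p < p2)       # order=True compares fields in declaration order
```

A plain tags: list = [] default would be rejected by @dataclass with a ValueError, which is why default_factory is the usual route for mutable defaults.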
CLAUDE.md for dataclasses
## dataclasses Stack
- Stdlib: from dataclasses import dataclass, field, asdict, astuple, replace, fields, is_dataclass
- Basic: @dataclass | @dataclass(frozen=True) | @dataclass(order=True, slots=True)
- Field: field(default=val) | field(default_factory=list) | field(repr=False, compare=False)
- Validate: def __post_init__(self): validate fields, raise ValueError
- Serialize: asdict(obj) → dict | json.dumps(asdict(obj)) | replace(obj, field=newval)
- Introspect: fields(cls) | is_dataclass(obj) | f.name, f.type, f.default for f in fields(cls)
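The Field and Introspect entries above can be sketched as follows. This is only an illustration: the Account class, its token field, and the describe() helper are hypothetical names that do not appear elsewhere in this article.

```python
from dataclasses import MISSING, dataclass, field, fields, is_dataclass


@dataclass
class Account:
    username: str
    # Hypothetical secret: hidden from repr() and ignored by ==
    token: str = field(default="", repr=False, compare=False)
    retries: int = 3


def describe(cls) -> list:
    """Return (name, type, has_default) for every field of a dataclass."""
    return [
        (
            f.name,
            f.type,
            f.default is not MISSING or f.default_factory is not MISSING,
        )
        for f in fields(cls)
    ]


a = Account("alice", token="s3cret")
b = Account("alice", token="other")

print(a)                      # token is omitted from the repr
print(a == b)                 # token is excluded from comparison
print(is_dataclass(Account))  # works on classes and on instances
print(describe(Account))
```

Note that f.type holds whatever the annotation evaluates to, so under from __future__ import annotations it is a string rather than a class.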
dataclasses Modeling Pipeline
# app/models.py: dataclasses with validation, serialization, inheritance, factories
from __future__ import annotations

import json
import re
from dataclasses import (
    KW_ONLY,
    InitVar,
    asdict,
    astuple,
    dataclass,
    field,
    fields,
    is_dataclass,
    make_dataclass,
    replace,
)
from datetime import date, datetime
from typing import Any, ClassVar  # ClassVar lives in typing, not in dataclasses
# ─────────────────────────────────────────────────────────────────────────────
# 1. Basic data models
# ─────────────────────────────────────────────────────────────────────────────
@dataclass
class Point:
    """
    Simple 2D point.

    Example:
        p = Point(1.0, 2.0)
        p2 = replace(p, y=5.0)
        assert p2.x == 1.0 and p2.y == 5.0
    """

    x: float
    y: float

    def distance_to(self, other: Point) -> float:
        return ((self.x - other.x) ** 2 + (self.y - other.y) ** 2) ** 0.5


@dataclass(frozen=True, order=True)
class Version:
    """
    Immutable, comparable semantic version.

    Example:
        v1 = Version(2, 1, 0)
        v2 = Version(2, 0, 3)
        assert v1 > v2
        versions = sorted([v2, v1, Version(1, 0, 0)])
    """

    major: int
    minor: int
    patch: int = 0

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"

    @classmethod
    def parse(cls, s: str) -> Version:
        parts = [int(p) for p in s.split(".", 2)]
        return cls(*parts) if len(parts) >= 2 else cls(parts[0], 0, 0)
@dataclass(slots=True)
class Vector2D:
    """
    Memory-efficient 2D vector (slots=True drops the per-instance __dict__,
    which typically reduces memory use; requires Python 3.10+).

    Example:
        v = Vector2D(3.0, 4.0)
        print(v.length())  # 5.0
    """

    x: float
    y: float

    def length(self) -> float:
        return (self.x ** 2 + self.y ** 2) ** 0.5

    def add(self, other: Vector2D) -> Vector2D:
        return Vector2D(self.x + other.x, self.y + other.y)

    def scale(self, factor: float) -> Vector2D:
        return Vector2D(self.x * factor, self.y * factor)
# ─────────────────────────────────────────────────────────────────────────────
# 2. Validated models with __post_init__
# ─────────────────────────────────────────────────────────────────────────────
@dataclass
class EmailAddress:
    """
    Validated email with normalized representation.

    Example:
        email = EmailAddress("  Alice@Example.COM  ")
        print(email.address)  # "alice@example.com"
        print(email.domain)   # "example.com"
    """

    _raw: InitVar[str]  # passed to __post_init__, never stored
    address: str = field(init=False)
    local: str = field(init=False, repr=False)
    domain: str = field(init=False)
    _pattern: ClassVar[re.Pattern] = re.compile(
        r"^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$"
    )

    def __post_init__(self, _raw: str) -> None:
        # Normalize first, then validate the normalized form.
        normalized = _raw.strip().lower()
        if not self._pattern.match(normalized):
            raise ValueError(f"Invalid email address: {_raw!r}")
        self.address = normalized
        self.local, self.domain = normalized.split("@", 1)
@dataclass
class DateRange:
    """
    Inclusive date range with validation.

    Example:
        dr = DateRange(date(2024, 1, 1), date(2024, 3, 31))
        assert date(2024, 2, 15) in dr
        print(dr.days)  # 91
    """

    start: date
    end: date

    def __post_init__(self) -> None:
        if self.end < self.start:
            raise ValueError(f"end ({self.end}) must be >= start ({self.start})")

    def __contains__(self, d: date) -> bool:
        return self.start <= d <= self.end

    @property
    def days(self) -> int:
        return (self.end - self.start).days + 1

    def overlaps(self, other: DateRange) -> bool:
        return self.start <= other.end and other.start <= self.end
@dataclass
class Money:
    """
    Money value with currency. The amount is a float rounded to 2 decimal
    places, which is fine for a demo but not for real accounting.

    Example:
        price = Money(9.99, "USD")
        total = price.add(Money(2.50, "USD"))
        print(str(total))  # "USD 12.49"
    """

    amount: float
    currency: str = "USD"

    def __post_init__(self) -> None:
        if len(self.currency) != 3:
            raise ValueError(f"currency must be 3-letter ISO code, got {self.currency!r}")
        self.amount = round(self.amount, 2)

    def add(self, other: Money) -> Money:
        if self.currency != other.currency:
            raise ValueError(f"Cannot add {self.currency} and {other.currency}")
        return Money(self.amount + other.amount, self.currency)

    def __str__(self) -> str:
        return f"{self.currency} {self.amount:.2f}"
# ─────────────────────────────────────────────────────────────────────────────
# 3. Nested and collection fields
# ─────────────────────────────────────────────────────────────────────────────
@dataclass
class Address:
    street: str
    city: str
    state: str
    zip: str
    country: str = "US"


@dataclass
class UserProfile:
    """
    Rich user model with nested and collection fields.

    Example:
        user = UserProfile(
            id=1, name="Alice", email=EmailAddress("alice@example.com"),
            address=Address("123 Main St", "Springfield", "IL", "62701"),
        )
        user.tags.append("premium")
        d = asdict(user)  # nested dict; datetime values still need conversion for JSON
    """

    id: int
    name: str
    email: EmailAddress
    _: KW_ONLY  # all fields after this are keyword-only
    address: Address | None = None
    tags: list[str] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)
    # Naive UTC timestamp; note that datetime.utcnow() is deprecated since Python 3.12.
    created: datetime = field(default_factory=datetime.utcnow)
    active: bool = True

    @property
    def display_name(self) -> str:
        return self.name.strip() or self.email.local
# ─────────────────────────────────────────────────────────────────────────────
# 4. Serialization helpers
# ─────────────────────────────────────────────────────────────────────────────
def _serialize(val: Any) -> Any:
    """Recursively serialize dataclasses and non-JSON types."""
    if is_dataclass(val) and not isinstance(val, type):
        return {k: _serialize(v) for k, v in asdict(val).items()}
    if isinstance(val, date):  # also matches datetime, a subclass of date
        return val.isoformat()
    if isinstance(val, (list, tuple)):
        return [_serialize(v) for v in val]
    if isinstance(val, dict):
        return {k: _serialize(v) for k, v in val.items()}
    return val


def to_json(obj: Any, indent: int | None = None) -> str:
    """
    Serialize a dataclass (or nested structure) to JSON string.
    Handles datetime, date, and nested dataclasses.

    Example:
        json_str = to_json(user, indent=2)
    """
    return json.dumps(_serialize(obj), indent=indent, default=str)


def from_dict(cls, data: dict) -> Any:
    """
    Construct a (flat) dataclass from a dict, ignoring unknown keys.

    Example:
        config = from_dict(UserConfig, {"port": 8080, "debug": True, "unknown_key": "x"})
    """
    # Only fields accepted by __init__ can be passed as keyword arguments.
    valid_keys = {f.name for f in fields(cls) if f.init}
    filtered = {k: v for k, v in data.items() if k in valid_keys}
    return cls(**filtered)
# ─────────────────────────────────────────────────────────────────────────────
# 5. Inheritance and mixins
# ─────────────────────────────────────────────────────────────────────────────
@dataclass
class BaseEntity:
    """Base entity with auto-managed metadata."""

    id: int
    created_at: datetime = field(default_factory=datetime.utcnow, repr=False)
    updated_at: datetime = field(default_factory=datetime.utcnow, repr=False)

    def touch(self) -> None:
        # The class is not frozen, so plain assignment is enough.
        self.updated_at = datetime.utcnow()


@dataclass
class Product(BaseEntity):
    """
    A product extending BaseEntity.

    Example:
        p = Product(id=1, name="Widget", price=Money(9.99), sku="WGT-001")
        print(p)  # Product(id=1, name='Widget', price=Money(amount=9.99, currency='USD'), sku='WGT-001', ...)
        d = asdict(p)  # nested dict of all fields
    """

    # Fields added after inherited defaults must themselves have defaults.
    name: str = ""
    price: Money = field(default_factory=lambda: Money(0.0))
    sku: str = ""
    in_stock: bool = True
    tags: list[str] = field(default_factory=list)
# ─────────────────────────────────────────────────────────────────────────────
# 6. Dynamic dataclass factory
# ─────────────────────────────────────────────────────────────────────────────
def schema_to_dataclass(name: str, schema: dict[str, type]) -> type:
    """
    Dynamically create a dataclass from a name → type mapping.

    Example:
        Record = schema_to_dataclass("Record", {"id": int, "name": str, "score": float})
        r = Record(id=1, name="Alice", score=0.95)
    """
    return make_dataclass(name, list(schema.items()))
# ─────────────────────────────────────────────────────────────────────────────
# Demo
# ─────────────────────────────────────────────────────────────────────────────
if __name__ == "__main__":
    print("=== dataclasses demo ===")

    print("\n--- Version (frozen, order) ---")
    v1 = Version(2, 1, 0)
    v2 = Version(2, 0, 3)
    v3 = Version.parse("1.5.2")
    print(f"  v1={v1}, v2={v2}  v1>v2: {v1 > v2}")
    print(f"  sorted: {sorted([v1, v2, v3])}")

    print("\n--- EmailAddress (InitVar, ClassVar) ---")
    email = EmailAddress("  Alice@Example.COM  ")
    print(f"  address={email.address}, domain={email.domain}")
    try:
        EmailAddress("not-an-email")
    except ValueError as e:
        print(f"  validation: {e}")

    print("\n--- DateRange ---")
    dr = DateRange(date(2024, 1, 1), date(2024, 3, 31))
    print(f"  days={dr.days}, contains 2024-02-15: {date(2024, 2, 15) in dr}")

    print("\n--- Money ---")
    price = Money(9.99)
    tax = Money(0.80)
    print(f"  {price} + {tax} = {price.add(tax)}")

    print("\n--- UserProfile (KW_ONLY, nested, to_json) ---")
    user = UserProfile(
        id=1,
        name="Alice",
        email=EmailAddress("alice@example.com"),
        address=Address("123 Main St", "Springfield", "IL", "62701"),
        tags=["premium", "beta"],
    )
    print(f"  display_name: {user.display_name}")
    j = to_json(user, indent=None)
    parsed = json.loads(j)
    print(f"  JSON keys: {list(parsed.keys())}")

    print("\n--- Product (inheritance) ---")
    p = Product(id=42, name="Widget", price=Money(19.99), sku="WGT-001")
    p.tags.append("sale")
    print(f"  {p.name}: {p.price}  sku={p.sku}  tags={p.tags}")

    print("\n--- schema_to_dataclass ---")
    Row = schema_to_dataclass("Row", {"id": int, "name": str, "score": float})
    r = Row(id=1, name="Alice", score=0.95)
    print(f"  {r}")
    print(f"  asdict: {asdict(r)}")

    print("\n--- replace ---")
    p2 = replace(Point(1.0, 2.0), y=10.0)
    print(f"  replace: {p2}")

    print("\n--- Vector2D (slots) ---")
    v = Vector2D(3.0, 4.0)
    print(f"  length: {v.length()}  scaled: {v.scale(2.0)}")

    print("\n=== done ===")
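The from_dict() helper in the listing above is documented as handling flat dataclasses only. As a rough sketch of how nested dataclass fields could be rebuilt too, the hypothetical from_nested_dict() below resolves annotations with typing.get_type_hints and recurses when a field's type is itself a dataclass. It is an assumption-laden illustration, not part of the listing: Customer is an invented class, Address is a trimmed-down stand-in, and Optional, list, and dict element types are deliberately not handled.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, get_type_hints


@dataclass
class Address:  # simplified stand-in for the Address model
    street: str
    city: str


@dataclass
class Customer:  # hypothetical model used only for this sketch
    name: str
    address: Address


def from_nested_dict(cls: type, data: dict) -> Any:
    """Build cls from a dict, recursing into fields typed as dataclasses."""
    hints = get_type_hints(cls)  # also resolves string annotations
    kwargs = {}
    for f in fields(cls):
        if not f.init or f.name not in data:
            continue  # skip init=False fields and missing keys
        value = data[f.name]
        target = hints[f.name]
        if is_dataclass(target) and isinstance(value, dict):
            value = from_nested_dict(target, value)
        kwargs[f.name] = value
    return cls(**kwargs)


customer = from_nested_dict(
    Customer,
    {
        "name": "Alice",
        "address": {"street": "123 Main St", "city": "Springfield"},
        "unknown_key": "ignored",
    },
)
print(customer)
```

Missing required keys still surface as a TypeError from the generated __init__, which is usually the behavior you want for malformed input.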
For the pydantic alternative: Pydantic v2 provides runtime type validation and coercion (by default a string "42" is coerced to int), schema generation, serialization via .model_dump() and .model_dump_json(), and validators declared as decorators, all with a dataclass-like definition syntax. Python dataclasses perform no runtime validation or coercion beyond what you write in __post_init__, so they add essentially no overhead, and they ship with the standard library. Use Pydantic for API input/output models where you need schema generation, automatic type coercion, and rich validators; use dataclasses for internal data models, DTOs between modules, and anywhere you want typed structure without external dependencies.
For the attrs alternative: attrs (@attr.s / @attr.define) predates dataclasses and offers a similar feature set, plus validators and converters declared per field and deeper customization of __init__, hashing, and slot behavior. Python dataclasses cover the most common attrs use cases as a stdlib solution. Use attrs for complex field-level validation or converter logic, or in legacy code bases that already depend on it (only old attrs releases still support Python 2.7); use dataclasses for new code on Python 3.7+ where stdlib-only is preferred.
The Claude Skills 360 bundle includes dataclasses skill sets covering Point/Version/Vector2D simple models, EmailAddress/DateRange/Money validated models with __post_init__, UserProfile with KW_ONLY/nested/collection fields, BaseEntity/Product inheritance, to_json()/from_dict() serialization helpers, and schema_to_dataclass() dynamic factory. Start with the free tier to try typed data modeling and dataclass pipeline code generation.
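To make the coercion point from the Pydantic comparison concrete using only the standard library, the sketch below shows that a dataclass stores whatever it is given, and that any coercion has to be written by hand in __post_init__. The Plain and Coerced class names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Plain:
    port: int  # annotation only; nothing is checked or converted at runtime


@dataclass
class Coerced:
    port: int

    def __post_init__(self) -> None:
        # Hand-written coercion; int() raises ValueError for non-numeric text
        self.port = int(self.port)


plain = Plain("42")      # the string is stored as-is
coerced = Coerced("42")  # converted to an int by __post_init__

print(type(plain.port).__name__)
print(type(coerced.port).__name__)
```

This is the trade-off described above: dataclasses stay dependency-free and predictable, while every validation or conversion rule is code you own.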