Claude Code for wcwidth: Unicode Display Width in Python

Published: March 6, 2028 • Read time: 5 min • By: Claude Skills 360

wcwidth measures how many terminal columns a Unicode character or string occupies. Install it with pip install wcwidth and import the two core functions with from wcwidth import wcwidth, wcswidth.

Single characters: wcwidth("A") → 1. wcwidth("中") → 2. wcwidth("\u0300") → 0 (combining accent). wcwidth("\x00") → 0. Wide characters (all CJK unified ideographs, katakana, fullwidth forms) → 2. Combining marks are zero-width → 0. Other control characters → -1.

Strings: wcswidth("Hello") → 5. wcswidth("Hello 世界") → 10. wcswidth returns -1 if the string contains a non-printable control character. For a fallback, test the result explicitly (w if w >= 0 else len(text)); wcswidth(text) or len(text) does not work, because -1 is truthy.

Padding: text.ljust(width) is wrong for CJK; use a display_ljust(text, width) helper. Truncate: loop over characters summing wcwidth until the column budget is exhausted, rather than slicing by index. Table: calculate column widths using wcswidth, not len. Progress bar: use wcswidth(label) to account for wide emoji and CJK labels.

East Asian Width: full/wide characters (F/W) → 2 columns. Half/narrow/neutral/ambiguous → 1 column. Emoji: most emoji are wide (2 columns), but presentation modifiers affect display. wcwidth("👋") → 2. wcwidth("❤") → 1 (narrow heart).

Claude Code generates wcwidth-aware table formatters, terminal progress bars, and Unicode column alignment utilities.
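
The return values above can be checked in a few lines. This is a minimal sketch that assumes wcwidth is installed; `columns` is a hypothetical helper name used only to illustrate an explicit fallback for the -1 case.

```python
from wcwidth import wcwidth, wcswidth


def columns(text: str) -> int:
    """Display width of text, falling back to len() when wcswidth reports -1."""
    width = wcswidth(text)
    # -1 is truthy, so `wcswidth(text) or len(text)` would return -1 here.
    return width if width >= 0 else len(text)


print(wcwidth("A"), wcwidth("中"), wcwidth("\u0300"))  # 1 2 0
print(wcwidth("👋"), wcwidth("❤"))                     # 2 1
print(wcswidth("Hello 世界"))                          # 10
print(wcswidth("bell\x07"))                            # -1 (BEL is a control character)
print(columns("bell\x07"))                             # 5 (len() fallback)
```

The len() fallback is only an approximation: it counts the control character as one column, which may or may not match what a given terminal does with it.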

CLAUDE.md for wcwidth

## wcwidth Stack
- Version: wcwidth >= 0.2 | pip install wcwidth
- Char width: wcwidth("A") → 1 | wcwidth("中") → 2 | wcwidth("\u0300") → 0
- String width: wcswidth("Hello 世界") → 10 | returns -1 for non-printable chars
- Pad: never use str.ljust(n) for Unicode — use display_pad(text, width) helpers
- Truncate: sum wcwidth per character until budget exhausted, not s[:n]
- Tables: compute column widths with wcswidth, not len(); pad with display_ljust()
- Emoji: wcwidth("👋") → 2 (wide) | wcwidth("❤") → 1 (narrow)
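
To see why the padding and truncation rules matter, the sketch below compares the built-in string operations with a column-aware pad. `pad_columns` is a hypothetical helper written for this illustration, and it assumes printable input (no control characters).

```python
from wcwidth import wcswidth


def pad_columns(text: str, width: int) -> str:
    """Hypothetical helper: right-pad to `width` display columns (printable text assumed)."""
    return text + " " * max(0, width - wcswidth(text))


city = "北京"
print(wcswidth(city.ljust(8)))         # 10: ljust counted 2 code points and added 6 spaces
print(wcswidth(pad_columns(city, 8)))  # 8: 4 columns of text plus 4 spaces
print(wcswidth("北京上海"[:4]))         # 8: slicing 4 code points keeps 8 columns
```

In each case the str method works in code points, so a field meant to be 8 or 4 columns wide ends up wider once CJK text is involved.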

wcwidth Unicode Display Width Pipeline

# app/display_width.py — wcwidth terminal alignment, padding, and table formatting
from __future__ import annotations

from wcwidth import wcwidth, wcswidth


# ─────────────────────────────────────────────────────────────────────────────
# 1. Character and string width
# ─────────────────────────────────────────────────────────────────────────────

def char_width(ch: str) -> int:
    """
    Return the display width of a single character.
    0: combining/zero-width characters (e.g. U+0300 combining grave)
    1: standard ASCII and most Latin characters
    2: CJK ideographs, fullwidth forms, most emoji
    -1: non-printable control characters
    """
    return wcwidth(ch)


def string_width(text: str) -> int:
    """
    Return the total display width of a string in a terminal.
    Each wide (CJK/emoji) character counts as 2; zero-width counts as 0.
    Returns -1 if the string contains non-printable control characters.
    "Hello 世界" → 10 (5 + 1 space + 2 + 2)
    """
    return wcswidth(text)


def safe_string_width(text: str) -> int:
    """
    Return display width; fall back to len(text) if wcswidth returns -1.
    The fallback is approximate: for text with escape sequences (e.g. ANSI
    color codes), len() also counts the invisible escape characters.
    """
    w = wcswidth(text)
    return w if w >= 0 else len(text)


def has_wide_chars(text: str) -> bool:
    """Return True if the string contains any wide (2-column) characters."""
    return any(wcwidth(ch) == 2 for ch in text)


def has_zero_width(text: str) -> bool:
    """Return True if the string contains zero-width combining characters."""
    return any(wcwidth(ch) == 0 for ch in text)


# ─────────────────────────────────────────────────────────────────────────────
# 2. Unicode-safe padding and alignment
# ─────────────────────────────────────────────────────────────────────────────

def display_ljust(text: str, width: int, fillchar: str = " ") -> str:
    """
    Left-justify text in a field of `width` display columns.
    Correctly handles wide CJK characters that occupy 2 columns.
    "Hello 世界" display-width=10 → padded to 15 with 5 spaces → 15 columns wide.
    """
    current = safe_string_width(text)
    padding = max(0, width - current)
    return text + fillchar * padding


def display_rjust(text: str, width: int, fillchar: str = " ") -> str:
    """Right-justify text in a field of `width` display columns."""
    current = safe_string_width(text)
    padding = max(0, width - current)
    return fillchar * padding + text


def display_center(text: str, width: int, fillchar: str = " ") -> str:
    """Center text in a field of `width` display columns."""
    current = safe_string_width(text)
    padding = max(0, width - current)
    left    = padding // 2
    right   = padding - left
    return fillchar * left + text + fillchar * right


def display_pad(text: str, width: int, align: str = "left", fillchar: str = " ") -> str:
    """
    Pad text to `width` display columns.
    align: "left" (default), "right", "center"
    """
    if align == "right":
        return display_rjust(text, width, fillchar)
    if align == "center":
        return display_center(text, width, fillchar)
    return display_ljust(text, width, fillchar)


# ─────────────────────────────────────────────────────────────────────────────
# 3. Unicode-safe truncation
# ─────────────────────────────────────────────────────────────────────────────

def display_truncate(text: str, max_width: int, ellipsis: str = "…") -> str:
    """
    Truncate text to at most `max_width` display columns.
    Adds `ellipsis` only if truncation occurs; text that already fits is
    returned unchanged. Never cuts in the middle of a wide character.
    Assumes `max_width` is at least the display width of `ellipsis`.
    "Hello 世界 World" truncated to 10 → "Hello 世…" (9 columns; 界 would not fit)
    """
    # Per-character widths; non-printable characters count as 1 for safety
    widths = [w if (w := wcwidth(ch)) >= 0 else 1 for ch in text]
    if sum(widths) <= max_width:
        return text  # fits without truncation

    # Reserve room for the ellipsis, then keep whole characters up to the budget
    budget = max_width - safe_string_width(ellipsis)
    used      = 0
    cut_point = 0
    for i, cw in enumerate(widths):
        if used + cw > budget:
            break
        used += cw
        cut_point = i + 1
    return text[:cut_point] + ellipsis


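def display_slice(text: str, start_col: int, end_col: int) -> str:
    """
    Extract a substring that spans columns [start_col, end_col).
    A wide character is kept only if it fits entirely inside the range; one
    that straddles either boundary is replaced by spaces for the columns it
    covers inside the range, so the result keeps its column positions.
    """
    result    = []
    col       = 0
    base_kept = False  # was the most recent visible character copied intact?
    for ch in text:
        cw = max(0, wcwidth(ch))
        if cw == 0:
            # Zero-width marks travel with the character they attach to
            if base_kept:
                result.append(ch)
            continue
        if col >= end_col:
            break
        ch_end = col + cw
        base_kept = col >= start_col and ch_end <= end_col
        if base_kept:
            result.append(ch)
        elif ch_end > start_col:
            # Straddles a boundary: pad the visible columns with spaces
            result.append(" " * (min(ch_end, end_col) - max(col, start_col)))
        col = ch_end
    return "".join(result)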


# ─────────────────────────────────────────────────────────────────────────────
# 4. Table formatting
# ─────────────────────────────────────────────────────────────────────────────

def format_table(
    rows: list[list[str]],
    headers: list[str] | None = None,
    padding: int = 1,
    separator: str = "│",
    header_separator: str = "─",
) -> str:
    """
    Format a 2D list of strings as a fixed-width terminal table.
    Uses display widths (wcswidth) instead of len() for CJK-safe alignment.

    format_table(
        headers=["Name", "City"],
        rows=[["Alice", "München"], ["Bob", "北京"], ["Charlie", "New York"]]
    )
    """
    all_rows = ([headers] if headers else []) + rows
    num_cols = max((len(r) for r in all_rows), default=0)  # empty input → empty table

    # Compute column widths as maximum display width per column
    col_widths = [0] * num_cols
    for row in all_rows:
        for j, cell in enumerate(row):
            col_widths[j] = max(col_widths[j], safe_string_width(str(cell)))

    pad = " " * padding

    def format_row(row: list[str]) -> str:
        cells = []
        for j in range(num_cols):
            cell = str(row[j]) if j < len(row) else ""
            cells.append(pad + display_ljust(cell, col_widths[j]) + pad)
        return separator + separator.join(cells) + separator

    lines = []
    if headers:
        lines.append(format_row(headers))
        # Separator line
        divider_cells = [header_separator * (col_widths[j] + 2 * padding) for j in range(num_cols)]
        lines.append("├" + "┼".join(divider_cells) + "┤")

    for row in rows:
        lines.append(format_row(row))

    return "\n".join(lines)


# ─────────────────────────────────────────────────────────────────────────────
# 5. Progress bar label helper
# ─────────────────────────────────────────────────────────────────────────────

def progress_label(label: str, max_cols: int = 20) -> str:
    """
    Fit a label into exactly max_cols display columns.
    Wide CJK/emoji characters are accounted for so the bar stays aligned.
    Truncation can land one column short of max_cols when the next character
    is wide, so the truncated label is padded back out to the full width.
    """
    return display_ljust(display_truncate(label, max_cols), max_cols)


# ─────────────────────────────────────────────────────────────────────────────
# Demo
# ─────────────────────────────────────────────────────────────────────────────

if __name__ == "__main__":
    print("=== Character widths ===")
    samples = [
        ("ASCII A",        "A"),
        ("Space",          " "),
        ("CJK 中",         "中"),
        ("CJK 北",         "北"),
        ("Emoji 👋",       "👋"),
        ("Narrow heart ❤", "❤"),
        ("Combining ̀",   "\u0300"),
        ("Fullwidth Ａ",   "\uff21"),
        ("Katakana ア",    "\u30a2"),
    ]
    for label, ch in samples:
        # Pad the label by display columns; f-string width specs count code points
        print(f"  {display_ljust(label, 20)} wcwidth={wcwidth(ch):2}  char={ch!r}")

    print("\n=== String display widths ===")
    strings = [
        "Hello",
        "Hello 世界",
        "北京 Shanghai",
        "café",
        "👋 Hello",
        "ASCII only",
    ]
    for s in strings:
        print(f"  len={len(s):3}  wcswidth={wcswidth(s):3}  {s!r}")

    print("\n=== Padding (width=15) ===")
    texts = ["Hello", "Hello 世界", "北京", "café", "👋 Hi"]
    for t in texts:
        padded = display_ljust(t, 15)
        print(f"  |{padded}|  (input={safe_string_width(t)} cols)")

    print("\n=== Truncation (max_width=10) ===")
    for t in texts:
        print(f"  {display_ljust(repr(t), 20)} → {display_truncate(t, 10)!r}")

    print("\n=== Table ===")
    table = format_table(
        headers=["Name", "City", "Score"],
        rows=[
            ["Alice",   "München",  "98"],
            ["Bob",     "北京",     "87"],
            ["Charlie", "New York",  "92"],
            ["飞鸿",    "上海",     "76"],
        ],
    )
    print(table)
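
One case the len() fallback in safe_string_width() handles only roughly is colored text: the escape character makes wcswidth return -1, and len() then counts the invisible escape bytes as columns. A possible refinement, sketched here under the assumption that only CSI sequences (such as SGR color codes) appear in the text, is to strip them before measuring. `CSI_RE` and `visible_width` are hypothetical names, not part of wcwidth.

```python
import re

from wcwidth import wcswidth

# Assumption: only CSI sequences are present (ESC [ parameters, intermediates, final byte).
# This covers SGR color codes such as "\x1b[31m" but not OSC or other escape types.
CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def visible_width(text: str) -> int:
    """Display width of text after removing CSI escape sequences."""
    stripped = CSI_RE.sub("", text)
    width = wcswidth(stripped)
    return width if width >= 0 else len(stripped)


red = "\x1b[31m北京\x1b[0m"
print(len(red))            # 11: counts the 9 escape characters too
print(wcswidth(red))       # -1: ESC is a control character
print(visible_width(red))  # 4: two wide characters
```

Padding helpers would then compute the padding from visible_width() while still emitting the original, colored string.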

For the len() alternative: len(text) counts code points, not display columns. "Hello 世界" has a len of 8 but displays 10 columns wide, because each CJK character occupies 2 columns in a monospace terminal. Using len for column alignment in a table with CJK content causes misaligned rows, which is why text coming from APIs, CSV imports, or user input with Japanese, Chinese, or Korean text needs wcswidth instead.

For the unicodedata.east_asian_width() alternative: unicodedata.east_asian_width(ch) returns the East Asian Width category string ("W", "F", "H", "Na", "A", "N"), which you would then have to map to 1 or 2 yourself, while also handling ambiguous characters, zero-width combining marks, and control characters. wcwidth wraps that mapping in a single call, which makes it the simpler API for terminal column math.

The Claude Skills 360 bundle includes wcwidth skill sets covering wcwidth() single-character width, wcswidth() string display width, safe_string_width() with fallback, has_wide_chars()/has_zero_width() detection helpers, display_ljust()/display_rjust()/display_center() padding functions, display_pad() multi-alignment, display_truncate() safe truncation at a column boundary, display_slice() column range extraction, the format_table() CJK-safe terminal table renderer, and the progress_label() fixed-width bar label helper. Start with the free tier to try Unicode-aware terminal formatting code generation.
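
To make the unicodedata comparison concrete, here is a rough sketch of what the hand-rolled mapping tends to look like using only the standard library. `naive_width` is a hypothetical helper, not part of any library, and it is deliberately incomplete.

```python
import unicodedata


def naive_width(ch: str) -> int:
    """Hand-rolled width guess built only on unicodedata (hypothetical helper)."""
    if unicodedata.combining(ch):
        return 0  # combining marks stack on the previous character
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2  # wide and fullwidth characters take two columns
    return 1      # H, Na, N, and A are all treated as one column


for ch in ("A", "中", "\uff21", "\u0300", "\u200b", "\x07"):
    eaw = unicodedata.east_asian_width(ch)
    print(f"U+{ord(ch):04X}  eaw={eaw:2}  naive={naive_width(ch)}")
```

The first four characters come out as expected, but this sketch reports 1 for U+200B (zero width space) and for the BEL control character, the kind of special case the hand-rolled approach has to add one by one.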
