eBPF (extended Berkeley Packet Filter) lets you run sandboxed programs in the Linux kernel without changing kernel source code or loading kernel modules. It’s the technology behind modern observability tools like Cilium, Falco, and Pixie. With Claude Code, you can write eBPF programs for custom tracing, profiling, and security monitoring without being a kernel expert.
What eBPF Can Do
eBPF programs attach to kernel hooks — system calls, network events, CPU samples, function entries — and execute safely inside the kernel with near-zero overhead. Common use cases:
- Performance profiling: CPU flamegraphs showing exactly which code paths are hot
- System call tracing: which files are being opened, which syscalls are slow
- Network observability: packet inspection, latency measurement, connection tracking
- Security monitoring: detecting anomalous syscall patterns, unexpected file access
CLAUDE.md for eBPF Projects
## eBPF Development Setup
- BCC Python tools for quick exploration and scripting
- libbpf + CO-RE for portable production programs (compiled once, run anywhere)
- bpftool for inspecting loaded programs and maps
- Linux kernel 5.15+ with BTF enabled (CONFIG_DEBUG_INFO_BTF) — CO-RE relocation needs BTF; ring buffer maps require 5.8+
- Build: clang-14+ with BTF support
## Safety Rules
- eBPF programs run in kernel — verifier rejects unsafe code (unbounded loops, null deref)
- Always use bounded loops: for (i = 0; i < MAX_ENTRIES; i++)
- Initialize all stack variables before use
- Ring buffer preferred over perf_event_array for new programs (more efficient)
Tracing Slow System Calls with BCC
Which system calls in my application are taking more than 1ms?
Show me the distribution by syscall and process.
#!/usr/bin/env python3
# trace_slow_syscalls.py — find slow syscalls with BCC
from bcc import BPF
import ctypes
# BPF program (C, runs in kernel)
bpf_text = """
#include <uapi/linux/ptrace.h>
// Track syscall start times
BPF_HASH(start, u64, u64);
// Output ring buffer — more efficient than perf_event_array
struct event_t {
u32 pid;
u32 syscall_id;
u64 duration_ns;
char comm[16];
};
BPF_RINGBUF_OUTPUT(events, 1 << 20); // 1MB ring buffer
// On syscall entry: record start time
TRACEPOINT_PROBE(raw_syscalls, sys_enter) {
u64 id = bpf_get_current_pid_tgid();
u64 ts = bpf_ktime_get_ns();
start.update(&id, &ts);
return 0;
}
// On syscall exit: measure duration, emit if slow
TRACEPOINT_PROBE(raw_syscalls, sys_exit) {
u64 id = bpf_get_current_pid_tgid();
u64 *tsp = start.lookup(&id);
if (!tsp) return 0;
u64 duration = bpf_ktime_get_ns() - *tsp;
start.delete(&id);
// Only emit events > 1ms
if (duration < 1000000) return 0;
struct event_t *event = events.ringbuf_reserve(sizeof(*event));
if (!event) return 0;
event->pid = id >> 32;
event->syscall_id = args->id;
event->duration_ns = duration;
bpf_get_current_comm(&event->comm, sizeof(event->comm));
events.ringbuf_submit(event, 0);
return 0;
}
"""
# Syscall name lookup
import subprocess
import functools


@functools.lru_cache(maxsize=None)
def get_syscall_name(syscall_id):
    """Resolve a numeric syscall id to its name via ausyscall(8).

    Returns the fallback 'syscall_<id>' when ausyscall is not installed,
    hangs, or does not know the id. Cached because the same handful of ids
    recur on every traced event and spawning a subprocess per event is
    expensive.
    """
    try:
        result = subprocess.run(
            ['ausyscall', str(syscall_id)],
            capture_output=True, text=True, timeout=1,
        )
        return result.stdout.strip() or f'syscall_{syscall_id}'
    # Narrowed from a bare `except:` — only catch "ausyscall missing/failed",
    # not KeyboardInterrupt and friends.
    except (OSError, subprocess.SubprocessError):
        return f'syscall_{syscall_id}'
# ctypes mirror of the kernel-side struct event_t defined in bpf_text.
# Field order and widths must match the C layout exactly, or the ring
# buffer cast in the event callback misreads memory.
class Event(ctypes.Structure):
    _fields_ = [
        ("pid", ctypes.c_uint32),        # upper 32 bits of pid_tgid (process id)
        ("syscall_id", ctypes.c_uint32),
        ("duration_ns", ctypes.c_uint64),
        ("comm", ctypes.c_char * 16),    # task name, NUL-padded
    ]
# Compile the embedded C and load it into the kernel (requires root).
b = BPF(text=bpf_text)
# Was an f-string with no placeholders; plain literal is equivalent.
print("Tracing syscalls > 1ms... Ctrl+C to stop")
print(f"{'COMM':<20} {'PID':<8} {'SYSCALL':<20} {'LATENCY'}")
def handle_event(ctx, data, size):
    """Ring buffer callback: decode one event_t record and print a row."""
    ev = ctypes.cast(data, ctypes.POINTER(Event)).contents
    proc = ev.comm.decode('utf-8', errors='replace')
    name = get_syscall_name(ev.syscall_id)
    ms = ev.duration_ns / 1_000_000
    print(f"{proc:<20} {ev.pid:<8} {name:<20} {ms:.2f}ms")
# Register the callback and poll the ring buffer until interrupted.
b["events"].open_ring_buffer(handle_event)
try:
    while True:
        b.ring_buffer_poll()  # dispatches pending events to handle_event
except KeyboardInterrupt:
    pass  # Ctrl+C exits quietly; BPF programs are detached on process exit
CPU Profiling with Stack Traces
Generate a CPU flamegraph for my Python application.
I want to see which functions are consuming CPU time.
#!/usr/bin/env python3
# cpu_profile.py — sample CPU stacks at 99Hz and generate flamegraph data
from bcc import BPF
import signal
import sys
import os
bpf_text = """
#include <uapi/linux/ptrace.h>
#include <linux/sched.h>
struct key_t {
u32 pid;
u32 tgid;
int user_stack_id;
int kernel_stack_id;
char name[TASK_COMM_LEN];
};
BPF_HASH(counts, struct key_t);
BPF_STACK_TRACE(stack_traces, 16384);
// Triggered by perf timer at 99Hz
int do_perf_event(struct bpf_perf_event_data *ctx) {
u32 pid = bpf_get_current_pid_tgid() >> 32;
u32 tgid = bpf_get_current_pid_tgid();
// Filter to specific PID if set
if (FILTER_PID > 0 && pid != FILTER_PID) return 0;
struct key_t key = {};
key.pid = pid;
key.tgid = tgid;
key.user_stack_id = stack_traces.get_stackid(ctx, BPF_F_USER_STACK);
key.kernel_stack_id = stack_traces.get_stackid(ctx, 0);
bpf_get_current_comm(&key.name, sizeof(key.name));
u64 *count = counts.lookup_or_try_init(&key, &(u64){0});
if (count) (*count)++;
return 0;
}
"""
# Optional PID filter from argv; 0 means profile every process.
target_pid = int(sys.argv[1]) if len(sys.argv) > 1 else 0
bpf_text = bpf_text.replace('FILTER_PID', str(target_pid))
b = BPF(text=bpf_text)
# Attach to CPU sampling event at 99Hz.
# BUG FIX: the original passed ev_config=1, which is
# PERF_COUNT_HW_INSTRUCTIONS, contradicting its own comment. CPU-cycle
# sampling (what a CPU flamegraph wants) is config 0.
b.attach_perf_event(
    ev_type=0,    # PERF_TYPE_HARDWARE
    ev_config=0,  # PERF_COUNT_HW_CPU_CYCLES
    fn_name="do_perf_event",
    sample_period=0,
    sample_freq=99,  # 99Hz avoids lockstep with periodic kernel timers
)
print(f"Profiling {'PID ' + str(target_pid) if target_pid else 'all PIDs'} for 30 seconds...")
import time
time.sleep(30)
# Output in folded format for flamegraph.pl: one line per unique stack,
# "frame;frame;...;frame count", ordered root-first.
with open('/tmp/out.folded', 'w') as f:
    for k, v in b["counts"].items():
        user_stack = []
        kernel_stack = []
        if k.user_stack_id >= 0:  # negative ids signal a failed stack capture
            user_stack = [b.sym(addr, k.pid).decode('utf-8', errors='replace')
                          for addr in b["stack_traces"].walk(k.user_stack_id)]
        if k.kernel_stack_id >= 0:
            kernel_stack = [b.ksym(addr).decode('utf-8', errors='replace')
                            for addr in b["stack_traces"].walk(k.kernel_stack_id)]
        # walk() yields frames leaf-first. Folded format wants root-first,
        # with kernel frames stacked ON TOP of (i.e. after) the user frames.
        # BUG FIX: the original reversed the concatenation, which placed the
        # kernel stack at the flamegraph root and the user stack on top.
        frames = list(reversed(user_stack)) + list(reversed(kernel_stack))
        comm = k.name.decode('utf-8', errors='replace')
        f.write(f"{comm};{';'.join(frames)} {v.value}\n")
print("Wrote /tmp/out.folded")
print("Generate flamegraph: flamegraph.pl /tmp/out.folded > /tmp/flamegraph.svg")
Network Packet Tracing with libbpf (CO-RE)
Trace TCP connection latency — time from SYN to established.
This needs to be a portable binary I can deploy to production servers.
// tcp_latency.bpf.c — CO-RE (Compile Once, Run Everywhere)
// Compiled with: clang -O2 -g -target bpf -D__TARGET_ARCH_x86 -I/usr/include/bpf -c tcp_latency.bpf.c
#include "vmlinux.h" // Generated BTF header
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>
// Track connection start times keyed by sock pointer
// Track connection start times keyed by sock pointer: the same kernel
// socket is seen at SYN_SENT and at ESTABLISHED, so its address is a
// stable per-connection key. Capacity bounds in-flight connects.
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 65536);
    __type(key, struct sock *);
    __type(value, u64);           // bpf_ktime_get_ns() taken at SYN_SENT
} syn_start SEC(".maps");

// Output events: one record per completed handshake, read by user space.
struct conn_event {
    u32 saddr;        // skc_rcv_saddr, copied raw (no byte-order conversion)
    u32 daddr;        // skc_daddr, copied raw
    u16 sport;        // skc_num, copied raw
    u16 dport;        // skc_dport passed through bpf_ntohs below
    u64 latency_ns;   // SYN_SENT -> ESTABLISHED
    u32 pid;          // tgid of the task running when ESTABLISHED fires
};

struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 1 << 24); // 16MB
} events SEC(".maps");
// Hook: TCP state transitions. SYN_SENT starts the clock for an outgoing
// connection; ESTABLISHED stops it and emits a conn_event to user space.
SEC("tracepoint/sock/inet_sock_set_state")
int trace_tcp_state(struct trace_event_raw_inet_sock_set_state *ctx) {
    if (ctx->protocol != IPPROTO_TCP)
        return 0;

    struct sock *sk = (struct sock *)ctx->skaddr;
    int state = ctx->newstate;

    if (state == TCP_SYN_SENT) {
        // Client sent SYN: remember when, keyed by the socket pointer.
        u64 now = bpf_ktime_get_ns();
        bpf_map_update_elem(&syn_start, &sk, &now, BPF_ANY);
        return 0;
    }

    if (state != TCP_ESTABLISHED)
        return 0;

    // Handshake finished — measure SYN -> ESTABLISHED latency.
    u64 *started = bpf_map_lookup_elem(&syn_start, &sk);
    if (!started)
        return 0;                  // never saw the SYN for this socket
    u64 elapsed = bpf_ktime_get_ns() - *started;
    bpf_map_delete_elem(&syn_start, &sk);

    struct conn_event *evt = bpf_ringbuf_reserve(&events, sizeof(*evt), 0);
    if (!evt)
        return 0;                  // ring buffer full; drop this sample

    // CO-RE: read kernel struct fields safely across kernel versions.
    evt->saddr = BPF_CORE_READ(sk, __sk_common.skc_rcv_saddr);
    evt->daddr = BPF_CORE_READ(sk, __sk_common.skc_daddr);
    evt->sport = BPF_CORE_READ(sk, __sk_common.skc_num);
    evt->dport = bpf_ntohs(BPF_CORE_READ(sk, __sk_common.skc_dport));
    evt->latency_ns = elapsed;
    evt->pid = bpf_get_current_pid_tgid() >> 32;
    bpf_ringbuf_submit(evt, 0);
    return 0;
}
// License declaration — a GPL-compatible license is required to use
// GPL-only BPF helpers (e.g. bpf_ktime_get_ns, used above).
char LICENSE[] SEC("license") = "GPL";
Security Monitoring — Detecting Unexpected Behavior
Alert when any process tries to write to /etc/passwd or /etc/shadow.
This should work across all processes, even containerized ones.
#!/usr/bin/env python3
# detect_etc_writes.py — alert on writes to /etc/ sensitive files
from bcc import BPF
import ctypes
import pwd
import os
bpf_text = """
#include <uapi/linux/ptrace.h>
#include <linux/fs.h>
struct alert_t {
u32 pid;
u32 uid;
char comm[16];
char filename[64];
u32 container_id; // namespace inode as proxy for container
};
BPF_RINGBUF_OUTPUT(alerts, 1 << 16);
// Trace openat() for writes to /etc/
TRACEPOINT_PROBE(syscalls, sys_enter_openat) {
// Check flags for write intent: O_WRONLY=1, O_RDWR=2, O_CREAT=0x40
if (!(args->flags & (O_WRONLY | O_RDWR | O_CREAT))) return 0;
char filename[64];
bpf_probe_read_user_str(filename, sizeof(filename), args->filename);
// Check if target is in /etc/
if (filename[0] != '/' || filename[1] != 'e' ||
filename[2] != 't' || filename[3] != 'c' ||
filename[4] != '/') return 0;
// Skip expected writers (package managers, etc.)
char comm[16];
bpf_get_current_comm(&comm, sizeof(comm));
struct alert_t *alert = alerts.ringbuf_reserve(sizeof(*alert));
if (!alert) return 0;
alert->pid = bpf_get_current_pid_tgid() >> 32;
alert->uid = bpf_get_current_uid_gid();
bpf_get_current_comm(&alert->comm, sizeof(alert->comm));
bpf_probe_read_kernel_str(&alert->filename, sizeof(alert->filename), filename);
alerts.ringbuf_submit(alert, 0);
return 0;
}
"""
class Alert(ctypes.Structure):
    """ctypes mirror of the kernel-side struct alert_t.

    Field order and widths must match the C layout exactly, or the ring
    buffer cast in handle_alert misreads event memory.

    BUG FIX: the original omitted container_id, which the C struct
    declares after filename, so the Python view silently truncated the
    record.
    """
    _fields_ = [
        ("pid", ctypes.c_uint32),
        ("uid", ctypes.c_uint32),
        ("comm", ctypes.c_char * 16),
        ("filename", ctypes.c_char * 64),
        ("container_id", ctypes.c_uint32),  # namespace inode (container proxy)
    ]
# Compile the embedded C and load it into the kernel (requires root).
b = BPF(text=bpf_text)
print("Monitoring /etc/ writes... (Ctrl+C to stop)")
def handle_alert(ctx, data, size):
    """Ring buffer callback: decode one alert_t and print a two-line alert.

    Resolves the uid to a username and snapshots /proc/<pid>/cmdline for
    context — best effort, since the process may already have exited.
    """
    alert = ctypes.cast(data, ctypes.POINTER(Alert)).contents
    comm = alert.comm.decode('utf-8', errors='replace')
    filename = alert.filename.decode('utf-8', errors='replace')
    try:
        username = pwd.getpwuid(alert.uid).pw_name
    except KeyError:
        username = str(alert.uid)  # uid with no passwd entry (e.g. in a container)
    # Get process cmdline for more context
    try:
        with open(f'/proc/{alert.pid}/cmdline', 'rb') as f:
            cmdline = f.read().replace(b'\x00', b' ').decode('utf-8', errors='replace').strip()
    except OSError:  # narrowed from bare except: process exited / no permission
        cmdline = 'unknown'
    # BUG FIX: the original printed the literal "(unknown)" here instead of
    # the decoded target path it had just extracted.
    print(f"🚨 ALERT: {comm} (PID {alert.pid}, user {username}) writing to {filename}")
    print(f"    cmdline: {cmdline[:100]}")
# Register the callback and poll until Ctrl+C.
b["alerts"].open_ring_buffer(handle_alert)
try:
    while True:
        b.ring_buffer_poll(timeout=100)  # ms timeout keeps Ctrl+C responsive
except KeyboardInterrupt:
    pass
Using Claude Code for eBPF Development
The eBPF verifier is strict. When your program fails to load, Claude Code interprets verifier errors:
My eBPF program fails to load with:
"R1 type=map_value expected=fp, pkt, pkt_end, pkt_meta, mem, ringbuf_mem, buf, trusted_ptr_"
What does this mean and how do I fix it?
This means you’re passing a map value pointer where the verifier expects a specific pointer type. Common cause: using a pointer after bpf_map_lookup_elem without checking for NULL first. Fix: always check if (!ptr) return 0; before using pointers from map lookups — the verifier requires null checks to prove safety.
For eBPF in production Kubernetes clusters with Cilium (which is built on eBPF for networking and observability), see the Kubernetes guide. For system-level performance optimization combining eBPF profiling with application-level changes, the performance optimization guide covers the full stack. The Claude Skills 360 bundle includes systems programming skill sets for eBPF, kernel tracing, and low-level performance analysis. Start with the free tier to try tracing program generation.