LLM agents go beyond single-turn completions: they use tools to take actions, maintain context across steps, and reason iteratively until a task is complete. Building reliable agents requires careful design of tool schemas, robust error handling when models call tools incorrectly, and memory management to keep agents focused on the current task. Claude Code, being an agent itself, generates agent code with deep understanding of these patterns.
This guide covers LLM agent development with Claude Code: tool calling, memory systems, ReAct orchestration, streaming, and evaluation.
## CLAUDE.md for Agent Projects
## LLM Agent Stack
- Anthropic Claude API (claude-sonnet-4-6 for agents, haiku for quick tasks)
- Tool calling: Anthropic tool_use content blocks
- Memory: in-context window + Redis for longer-running agents
- Orchestration: custom ReAct loop (not LangChain — too much abstraction)
## Agent Patterns
- Tools: narrow and composable — one thing each, clear error messages
- Tools should be idempotent where possible
- Always handle ToolUseBlock and ToolResultBlock in message loops
- Stream long-running agent tasks — don't leave users waiting for 30+ second responses
- Evaluate with assertions on intermediate steps, not just final output
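The idempotency guideline above can be sketched as a small wrapper that caches tool results by a caller-supplied key, so a retried call doesn't duplicate its side effect. This is a minimal sketch: the `idempotent` helper is hypothetical, and the in-memory `Map` is for illustration only — a multi-process agent would back this with Redis or the tool backend's own idempotency keys.

```typescript
// Hypothetical helper: dedupe repeated tool calls by a deterministic key.
// The in-memory Map is illustrative; use Redis for anything multi-process.
const completed = new Map<string, unknown>();

export async function idempotent<T>(key: string, fn: () => Promise<T>): Promise<T> {
  if (completed.has(key)) {
    // Same key → same result; the side effect runs at most once
    return completed.get(key) as T;
  }
  const result = await fn();
  completed.set(key, result);
  return result;
}
```

A ticket-creating tool might derive the key from the user ID plus a hash of the title, so a model that retries after a timeout doesn't file the same ticket twice.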
## Tool Definition and Calling
Create an agent that can search a knowledge base, create tickets,
and send Slack notifications. Define the tools and the agent loop.
// src/tools/definitions.ts
import Anthropic from '@anthropic-ai/sdk';
export const tools: Anthropic.Tool[] = [
{
name: 'search_knowledge_base',
description: 'Search the internal knowledge base for documentation, runbooks, and FAQs. Use this to answer questions before creating tickets.',
input_schema: {
type: 'object' as const,
required: ['query'],
properties: {
query: {
type: 'string',
description: 'Search query — be specific for better results',
},
max_results: {
type: 'number',
description: 'Maximum results to return (default: 5, max: 20)',
default: 5,
},
},
},
},
{
name: 'create_ticket',
description: 'Create a support ticket in the issue tracker. Only use this when the knowledge base doesn\'t answer the question.',
input_schema: {
type: 'object' as const,
required: ['title', 'description', 'priority'],
properties: {
title: { type: 'string', description: 'Brief ticket title (< 80 chars)' },
description: { type: 'string', description: 'Full description with context and steps to reproduce' },
priority: {
type: 'string',
enum: ['low', 'medium', 'high', 'critical'],
},
assignee_email: {
type: 'string',
format: 'email',
description: 'Assign to specific team member (optional)',
},
},
},
},
{
name: 'send_slack_notification',
description: 'Send a Slack message to a channel or user. Use for urgent issues or to notify team leads.',
input_schema: {
type: 'object' as const,
required: ['channel', 'message'],
properties: {
channel: {
type: 'string',
description: 'Channel name (e.g., #ops-alerts) or user ID (e.g., U12345)',
},
message: { type: 'string' },
urgent: {
type: 'boolean',
description: 'Add @here mention for urgent issues (default: false)',
default: false,
},
},
},
},
];
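The agent loop below imports `executeTool` from `src/tools/executor`, which isn't shown above. Here is a minimal dispatcher sketch; the handler bodies are stubs (placeholder return values, no real service calls) that you would wire to your search backend, issue tracker, and Slack client.

```typescript
// src/tools/executor.ts — sketch: dispatch tool calls by name to handler functions.
// Handler bodies are stubs; replace them with real service calls.
type ToolHandler = (input: Record<string, unknown>) => Promise<unknown>;

const handlers: Record<string, ToolHandler> = {
  search_knowledge_base: async ({ query }) => {
    return { query, results: [], total: 0 }; // TODO: call your search backend
  },
  create_ticket: async () => {
    return { ticketId: 'TCK-PLACEHOLDER', status: 'open' }; // TODO: issue tracker API
  },
  send_slack_notification: async ({ channel }) => {
    return { ok: true, channel }; // TODO: Slack chat.postMessage
  },
};

export async function executeTool(name: string, input: unknown): Promise<unknown> {
  const handler = handlers[name];
  if (!handler) {
    // A clear, specific error lets the model correct itself on the next turn
    throw new Error(`Unknown tool "${name}". Valid tools: ${Object.keys(handlers).join(', ')}`);
  }
  return handler((input ?? {}) as Record<string, unknown>);
}
```

Keeping the dispatcher as a plain name-to-function map makes adding a tool a two-step change: one schema entry, one handler.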
## Agent Orchestration Loop
// src/agent/support-agent.ts
import Anthropic from '@anthropic-ai/sdk';
import { tools } from '../tools/definitions';
import { executeTool } from '../tools/executor';
const client = new Anthropic();
export async function runSupportAgent(
userMessage: string,
conversationHistory: Anthropic.MessageParam[] = [],
): Promise<{ response: string; actionsLog: string[] }> {
const actionsLog: string[] = [];
const messages: Anthropic.MessageParam[] = [
...conversationHistory,
{ role: 'user', content: userMessage },
];
const systemPrompt = `You are a support agent. When a user reports an issue:
1. First search the knowledge base to find existing documentation
2. If the knowledge base answers the question, respond directly
3. If the issue requires action, create a ticket
4. For critical issues (production down, data loss), also send a Slack notification to #ops-alerts
Always be concise and clear in your responses.`;
// ReAct loop: reason → act → observe → repeat until done.
// Bound the loop so a confused model can't spin forever.
const MAX_TURNS = 10;
for (let turn = 0; turn < MAX_TURNS; turn++) {
const response = await client.messages.create({
model: 'claude-sonnet-4-6',
max_tokens: 4096,
system: systemPrompt,
tools,
messages,
});
if (response.stop_reason === 'end_turn') {
// Agent finished — extract text response
const text = response.content
.filter((b): b is Anthropic.TextBlock => b.type === 'text')
.map(b => b.text)
.join('');
return { response: text, actionsLog };
}
if (response.stop_reason === 'tool_use') {
// Add assistant's tool call to history
messages.push({ role: 'assistant', content: response.content });
// Execute all tool calls (may be multiple in one turn)
const toolResults: Anthropic.ToolResultBlockParam[] = [];
for (const block of response.content) {
if (block.type !== 'tool_use') continue;
actionsLog.push(`Tool: ${block.name} | Input: ${JSON.stringify(block.input)}`);
try {
const result = await executeTool(block.name, block.input);
toolResults.push({
type: 'tool_result',
tool_use_id: block.id,
content: JSON.stringify(result),
});
} catch (error) {
// Return errors to the agent — let it decide how to proceed
toolResults.push({
type: 'tool_result',
tool_use_id: block.id,
is_error: true,
content: `Error: ${(error as Error).message}`,
});
}
}
// Add tool results to history and continue the loop
messages.push({ role: 'user', content: toolResults });
continue;
}
// Unexpected stop reason
break;
}
return { response: 'Agent stopped unexpectedly', actionsLog };
}
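The loop above reports every tool error straight back to the model. For transient failures (network blips, rate limits) it is often cheaper to retry locally before spending a model turn on recovery. A sketch of a backoff wrapper you could put around `executeTool` — the `withRetry` name, attempt count, and delays are illustrative:

```typescript
// Sketch: retry a tool call with exponential backoff before surfacing the error
export async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // Wait 200ms, 400ms, 800ms, ... between attempts
      await new Promise(resolve => setTimeout(resolve, baseDelayMs * 2 ** i));
    }
  }
  throw lastError;
}
```

Inside the loop this becomes `await withRetry(() => executeTool(block.name, block.input))`; only errors that survive all attempts reach the model as a `tool_result` with `is_error: true`.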
## Memory Systems
The agent needs to remember context across multiple requests
from the same user. Add a memory layer.
// src/memory/agent-memory.ts
import { Redis } from 'ioredis';
import Anthropic from '@anthropic-ai/sdk';
const redis = new Redis(process.env.REDIS_URL!);
interface AgentMemory {
userId: string;
recentMessages: Anthropic.MessageParam[];
facts: string[]; // Extracted facts about the user's environment
openTickets: string[]; // Ticket IDs created this session
}
export async function loadMemory(userId: string): Promise<AgentMemory> {
const stored = await redis.get(`agent:memory:${userId}`);
if (!stored) {
return { userId, recentMessages: [], facts: [], openTickets: [] };
}
return JSON.parse(stored);
}
export async function saveMemory(memory: AgentMemory): Promise<void> {
// Keep last 20 messages for context window efficiency
memory.recentMessages = memory.recentMessages.slice(-20);
await redis.setex(
`agent:memory:${memory.userId}`,
7 * 24 * 60 * 60, // 7 day TTL
JSON.stringify(memory),
);
}
export async function extractAndSaveFacts(
userId: string,
conversation: Anthropic.MessageParam[],
): Promise<void> {
// Use a quick model to extract facts from the conversation
const client = new Anthropic();
const response = await client.messages.create({
model: 'claude-haiku-4-5-20251001',
max_tokens: 500,
system: 'Extract factual observations about the user\'s technical environment from this conversation. Return a JSON array of strings. Only extract concrete facts (OS, stack, error patterns), not questions or hypotheticals.',
messages: [{ role: 'user', content: JSON.stringify(conversation) }],
});
try {
const facts = JSON.parse((response.content[0] as Anthropic.TextBlock).text) as string[];
const memory = await loadMemory(userId);
memory.facts = [...new Set([...memory.facts, ...facts])].slice(-50); // Max 50 facts
await saveMemory(memory);
} catch {
// Extraction failed — not critical
}
}
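The `slice(-20)` in `saveMemory` is a blunt cutoff: twenty short user messages and twenty large tool-result dumps cost very different amounts of context window. A rough token-budget trim is one alternative. This is a sketch — the `trimToTokenBudget` helper is hypothetical, and the ~4-characters-per-token estimate is a heuristic, so substitute a real tokenizer where accuracy matters.

```typescript
// Sketch: keep the most recent messages that fit a rough token budget
interface ChatMessage {
  role: string;
  content: unknown;
}

export function trimToTokenBudget(messages: ChatMessage[], maxTokens: number): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = 0;
  // Walk backwards so the newest messages survive the cut
  for (let i = messages.length - 1; i >= 0; i--) {
    const estimate = Math.ceil(JSON.stringify(messages[i].content).length / 4);
    if (used + estimate > maxTokens) break;
    used += estimate;
    kept.unshift(messages[i]);
  }
  return kept;
}
```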
## Streaming Agent Responses
// src/agent/streaming-agent.ts
import Anthropic from '@anthropic-ai/sdk';
import { tools } from '../tools/definitions';
import { loadMemory } from '../memory/agent-memory';
// Assumes support-agent.ts exports its system prompt for reuse
import { systemPrompt } from './support-agent';
const client = new Anthropic();
export async function* streamSupportAgent(
userMessage: string,
userId: string,
): AsyncGenerator<{ type: 'text' | 'tool_call' | 'error'; content: string }> {
const memory = await loadMemory(userId);
const stream = await client.messages.create({
model: 'claude-sonnet-4-6',
max_tokens: 4096,
system: systemPrompt,
tools,
messages: [...memory.recentMessages, { role: 'user', content: userMessage }],
stream: true,
});
let currentToolName = '';
let currentToolInput = '';
for await (const event of stream) {
if (event.type === 'content_block_start' && event.content_block.type === 'tool_use') {
currentToolName = event.content_block.name;
currentToolInput = '';
yield { type: 'tool_call', content: `Using ${currentToolName}...` };
}
if (event.type === 'content_block_delta') {
if (event.delta.type === 'text_delta') {
yield { type: 'text', content: event.delta.text };
} else if (event.delta.type === 'input_json_delta') {
currentToolInput += event.delta.partial_json;
}
}
if (event.type === 'content_block_stop' && currentToolName) {
// Tool input JSON is now complete — parse currentToolInput, execute the
// tool, and feed the result back, mirroring the non-streaming loop above
currentToolName = '';
currentToolInput = '';
}
}
}
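Each yielded chunk still has to reach the browser; Server-Sent Events is a natural transport for this one-directional stream. A sketch of a frame formatter — the event names and JSON payload shape are assumptions, so match them to whatever your frontend expects:

```typescript
// Sketch: serialize one agent stream chunk as a Server-Sent Events frame
export function toSSE(chunk: { type: string; content: string }): string {
  // An SSE frame is "event:" and "data:" lines terminated by a blank line
  return `event: ${chunk.type}\ndata: ${JSON.stringify({ content: chunk.content })}\n\n`;
}
```

In an HTTP handler you would write `res.write(toSSE(chunk))` inside `for await (const chunk of streamSupportAgent(message, userId))`, after setting the `text/event-stream` content type.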
## Agent Evaluation
// tests/agent.eval.ts — Evaluation framework for agent correctness
import { describe, it, expect } from 'vitest'; // or your test runner of choice
import { runSupportAgent } from '../src/agent/support-agent';
describe('Support agent evaluations', () => {
it('searches knowledge base before creating tickets', async () => {
const { actionsLog } = await runSupportAgent(
'How do I reset my 2FA device?'
);
// Must search before deciding to create a ticket
const searchAction = actionsLog.find(a => a.includes('search_knowledge_base'));
const ticketAction = actionsLog.find(a => a.includes('create_ticket'));
expect(searchAction).toBeTruthy(); // Must search
// For a knowable question, should not create a ticket
expect(ticketAction).toBeUndefined();
});
it('creates ticket for novel issues not in knowledge base', async () => {
const { actionsLog } = await runSupportAgent(
'The deployment pipeline is failing with error code XZ-9991'
);
const ticketAction = actionsLog.find(a => a.includes('create_ticket'));
expect(ticketAction).toBeTruthy();
});
it('notifies Slack for critical issues', async () => {
const { actionsLog } = await runSupportAgent(
'URGENT: Production database is down, all users affected'
);
const slackAction = actionsLog.find(a => a.includes('send_slack_notification'));
expect(slackAction).toBeTruthy();
// Should use #ops-alerts channel
expect(slackAction).toContain('ops-alerts');
});
});
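Agent behavior is stochastic, so a single passing run proves little. A common pattern is to repeat each eval and gate on a pass rate rather than one result. This is a sketch; the `passRate` helper is hypothetical, and the run count and any threshold you pick are arbitrary starting points to tune against your flakiness budget.

```typescript
// Sketch: run an eval several times and report the fraction that passed
export async function passRate(
  evalFn: () => Promise<void>, // Throws on failure, like a test body
  runs = 5,
): Promise<number> {
  let passed = 0;
  for (let i = 0; i < runs; i++) {
    try {
      await evalFn();
      passed++;
    } catch {
      // A failed run just lowers the rate; it doesn't abort the batch
    }
  }
  return passed / runs;
}
```

An eval might then assert `expect(await passRate(() => criticalIssueEval())).toBeGreaterThanOrEqual(0.8)`, where `criticalIssueEval` is one of the test bodies above extracted into a function.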
For integrating ML models and embeddings into agent memory, see the machine learning guide. For streaming LLM responses over WebSockets and SSE, see the WebSockets guide. The Claude Skills 360 bundle includes LLM integration skill sets covering tool use patterns, agent orchestration, and evaluation frameworks. Start with the free tier to try agent code generation.