AWS Bedrock provides access to Claude, Llama, Mistral, and other foundation models through a unified API with IAM, VPC, and CloudWatch integration. Bedrock Agents enable autonomous multi-step task execution with custom action groups. Bedrock Knowledge Bases provide managed RAG over S3 documents. Claude Code generates Bedrock invocation patterns, streaming handlers, agent definitions, and the CDK infrastructure for deploying AI applications on AWS.

CLAUDE.md for Bedrock Projects

## Bedrock Stack
- AWS SDK v3 with @aws-sdk/client-bedrock-runtime
- Primary model: anthropic.claude-sonnet-4-5-20251022-v2:0 (latest Claude Sonnet)
- Streaming: ConverseStreamCommand for latency-sensitive UX
- Agents: Bedrock Agents with Lambda action groups for tool use
- RAG: Bedrock Knowledge Bases (S3 → OpenSearch Serverless → retrieval)
- Guardrails: content filtering + PII detection on all user-facing prompts
- Region: us-east-1 (broadest model availability)
- Auth: IAM role with bedrock:InvokeModel + bedrock:InvokeAgent permissions
- Cost: use the Converse API with prompt caching for repeated system prompts (up to 90% savings on cached input tokens)
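
The caching bullet above relies on Converse prompt caching, which is requested by placing a `cachePoint` block after the content to be cached. A minimal sketch, assuming the chosen model supports prompt caching on Bedrock; `buildCachedSystem` is a hypothetical helper, and the local `SystemBlock` type is a simplified stand-in for the SDK's `SystemContentBlock`.

```typescript
// Simplified stand-in for the SDK's SystemContentBlock
// (assumption: only the two shapes used in this sketch)
type SystemBlock = { text: string } | { cachePoint: { type: 'default' } };

// Hypothetical helper: returns system blocks with a cache checkpoint placed
// after the prompt, so repeated calls with an identical prefix can hit the cache.
function buildCachedSystem(systemPrompt?: string): SystemBlock[] | undefined {
    if (!systemPrompt) return undefined;
    return [{ text: systemPrompt }, { cachePoint: { type: 'default' } }];
}
```

The result would be passed as the `system` field of a `ConverseCommand`. Bedrock only caches prefixes above a model-specific minimum token count, so short system prompts may see no savings.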

Basic Invocation with Converse API

// lib/bedrock.ts
import {
    BedrockRuntimeClient,
    ConverseCommand,
    ConverseStreamCommand,
    type Message,
} from '@aws-sdk/client-bedrock-runtime';

const client = new BedrockRuntimeClient({ region: 'us-east-1' });

const DEFAULT_MODEL = 'anthropic.claude-sonnet-4-5-20251022-v2:0';

export interface ConversationOptions {
    modelId?: string;
    maxTokens?: number;
    temperature?: number;
    systemPrompt?: string;
    guardrailId?: string;
    guardrailVersion?: string;
}

export async function chat(
    messages: Message[],
    options: ConversationOptions = {},
): Promise<string> {
    const {
        modelId = DEFAULT_MODEL,
        maxTokens = 2048,
        temperature = 0.7,
        systemPrompt,
        guardrailId,
        guardrailVersion,
    } = options;
    
    const response = await client.send(new ConverseCommand({
        modelId,
        messages,
        system: systemPrompt ? [{ text: systemPrompt }] : undefined,
        inferenceConfig: { maxTokens, temperature },
        guardrailConfig: guardrailId
            ? { guardrailIdentifier: guardrailId, guardrailVersion: guardrailVersion ?? 'DRAFT', trace: 'enabled' }
            : undefined,
    }));
    
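    // Check for guardrail intervention first: a blocked response still carries
    // text (the guardrail's configured message), so a text check alone would hide it
    if (response.stopReason === 'guardrail_intervened') {
        throw new Error('Content filtered by guardrail');
    }
    
    // Join all text blocks; the first content block is not guaranteed to be text
    const text = (response.output?.message?.content ?? [])
        .map(block => block.text ?? '')
        .join('');
    if (text) return text;
    
    throw new Error('Unexpected response format');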
}

// Streaming response — yields text chunks
export async function* chatStream(
    messages: Message[],
    options: ConversationOptions = {},
): AsyncGenerator<string> {
    const { modelId = DEFAULT_MODEL, maxTokens = 2048, systemPrompt } = options;
    
    const response = await client.send(new ConverseStreamCommand({
        modelId,
        messages,
        system: systemPrompt ? [{ text: systemPrompt }] : undefined,
        inferenceConfig: { maxTokens },
    }));
    
    if (!response.stream) throw new Error('No stream in response');
    
    for await (const event of response.stream) {
        // Text deltas arrive on contentBlockDelta.delta.text
        const text = event.contentBlockDelta?.delta?.text;
        if (text) yield text;
    }
}
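
A short sketch of how a caller might consume `chatStream`: forward each chunk as it arrives and keep the full text for logging or storage. `collectStream` and `onChunk` are hypothetical names, and the fake generator stands in for `chatStream` so the example runs without AWS credentials.

```typescript
// Hypothetical helper: drains any async iterable of text chunks,
// invoking onChunk for each piece and returning the concatenated text.
async function collectStream(
    stream: AsyncIterable<string>,
    onChunk: (chunk: string) => void = () => {},
): Promise<string> {
    let full = '';
    for await (const chunk of stream) {
        onChunk(chunk); // e.g. write to an HTTP response or WebSocket
        full += chunk;
    }
    return full;
}

// Stand-in for chatStream(messages, options) so the sketch is runnable offline
async function* fakeChatStream(): AsyncGenerator<string> {
    yield 'Hello';
    yield ', ';
    yield 'world';
}
```

With the real generator, the call would look like `await collectStream(chatStream(messages), chunk => process.stdout.write(chunk))`.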

Tool Use (Function Calling)

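// lib/bedrockTools.ts: tool use with ConverseCommand
import {
    BedrockRuntimeClient,
    ConverseCommand,
    type Message,
    type Tool,
    type ToolResultBlock,
} from '@aws-sdk/client-bedrock-runtime';
// Hypothetical module exposing getOrder and listOrders; replace with your own service
import { orderService } from './orderService';

// Same client settings and model ID as lib/bedrock.ts
const client = new BedrockRuntimeClient({ region: 'us-east-1' });
const DEFAULT_MODEL = 'anthropic.claude-sonnet-4-5-20251022-v2:0';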

const TOOLS: Tool[] = [
    {
        toolSpec: {
            name: 'get_order',
            description: 'Look up an order by ID. Returns order details including status, items, and total.',
            inputSchema: {
                json: {
                    type: 'object',
                    properties: {
                        order_id: { type: 'string', description: 'The order ID (e.g. ord_123abc)' },
                    },
                    required: ['order_id'],
                },
            },
        },
    },
    {
        toolSpec: {
            name: 'list_orders',
            description: 'List orders for a customer, optionally filtered by status.',
            inputSchema: {
                json: {
                    type: 'object',
                    properties: {
                        customer_id: { type: 'string' },
                        status: { type: 'string', enum: ['pending', 'processing', 'shipped', 'delivered', 'cancelled'] },
                        limit: { type: 'integer', minimum: 1, maximum: 50 },
                    },
                    required: ['customer_id'],
                },
            },
        },
    },
];

async function executeToolCall(name: string, input: Record<string, unknown>): Promise<string> {
    switch (name) {
        case 'get_order':
            return JSON.stringify(await orderService.getOrder(input.order_id as string));
        case 'list_orders':
            return JSON.stringify(await orderService.listOrders(input));
        default:
            throw new Error(`Unknown tool: ${name}`);
    }
}

export async function runOrderAgent(userMessage: string): Promise<string> {
    const messages: Message[] = [{ role: 'user', content: [{ text: userMessage }] }];
    
    while (true) {
        const response = await client.send(new ConverseCommand({
            modelId: DEFAULT_MODEL,
            messages,
            toolConfig: { tools: TOOLS, toolChoice: { auto: {} } },
            system: [{ text: 'You are a helpful order management assistant. Use the provided tools to answer questions about orders.' }],
            inferenceConfig: { maxTokens: 2048 },
        }));
        
        // End of conversation
        if (response.stopReason === 'end_turn') {
            return response.output?.message?.content?.[0]?.text ?? '';
        }
        
        // Tool use requested
        if (response.stopReason === 'tool_use') {
            const assistantContent = response.output?.message?.content ?? [];
            messages.push({ role: 'assistant', content: assistantContent });
            
            // Execute all tool calls in parallel
            const toolResults: ToolResultBlock[] = await Promise.all(
                assistantContent
                    .filter(block => 'toolUse' in block && block.toolUse)
                    .map(async (block): Promise<ToolResultBlock> => {
                        const { toolUseId, name, input } = block.toolUse!;
                        try {
                            const result = await executeToolCall(name!, input as Record<string, unknown>);
                            return { toolUseId: toolUseId!, content: [{ text: result }] };
                        } catch (e) {
                            // Report the failure to the model instead of aborting the loop
                            const message = e instanceof Error ? e.message : String(e);
                            return { toolUseId: toolUseId!, content: [{ text: `Error: ${message}` }], status: 'error' };
                        }
                    })
            );
            
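            messages.push({ role: 'user', content: toolResults.map(r => ({ toolResult: r })) });
            continue;
        }
        
        // Any other stop reason (e.g. max_tokens) would otherwise repeat the same request forever
        throw new Error(`Unexpected stop reason: ${response.stopReason}`);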
    }
}
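
Because the model generates the tool input, it is worth validating it against the declared schema before calling into `orderService`. The following is a minimal hand-rolled sketch for `list_orders`; `parseListOrdersInput` and `ListOrdersInput` are hypothetical names, and a schema library could replace the manual checks.

```typescript
const ORDER_STATUSES = ['pending', 'processing', 'shipped', 'delivered', 'cancelled'] as const;
type OrderStatus = (typeof ORDER_STATUSES)[number];

interface ListOrdersInput {
    customer_id: string;
    status?: OrderStatus;
    limit?: number;
}

// Hypothetical validator mirroring the list_orders inputSchema.
// Throws with a message that can be returned to the model as an error tool result.
function parseListOrdersInput(input: Record<string, unknown>): ListOrdersInput {
    const { customer_id, status, limit } = input;
    if (typeof customer_id !== 'string') {
        throw new Error('customer_id is required and must be a string');
    }
    if (status !== undefined && !ORDER_STATUSES.some(s => s === status)) {
        throw new Error(`status must be one of: ${ORDER_STATUSES.join(', ')}`);
    }
    if (
        limit !== undefined &&
        (typeof limit !== 'number' || !Number.isInteger(limit) || limit < 1 || limit > 50)
    ) {
        throw new Error('limit must be an integer between 1 and 50');
    }
    return {
        customer_id,
        status: status as OrderStatus | undefined,
        limit: limit as number | undefined,
    };
}
```

Inside `executeToolCall`, the `list_orders` case could then call `orderService.listOrders(parseListOrdersInput(input))`, so a malformed call surfaces as an error result the model can correct.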

Bedrock Knowledge Base (RAG)

// lib/knowledgeBase.ts
import {
    BedrockAgentRuntimeClient,
    RetrieveAndGenerateCommand,
    RetrieveCommand,
} from '@aws-sdk/client-bedrock-agent-runtime';

const agentRuntimeClient = new BedrockAgentRuntimeClient({ region: 'us-east-1' });

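const KNOWLEDGE_BASE_ID = process.env.BEDROCK_KB_ID!;
// Same model ID as lib/bedrock.ts; read from the Lambda environment when set
const DEFAULT_MODEL = process.env.MODEL_ID ?? 'anthropic.claude-sonnet-4-5-20251022-v2:0';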

// Retrieve + generate in one call
export async function askKnowledgeBase(question: string, sessionId?: string): Promise<{
    answer: string;
    citations: Array<{ text: string; location: string }>;
    sessionId: string;
}> {
    const response = await agentRuntimeClient.send(new RetrieveAndGenerateCommand({
        input: { text: question },
        retrieveAndGenerateConfiguration: {
            type: 'KNOWLEDGE_BASE',
            knowledgeBaseConfiguration: {
                knowledgeBaseId: KNOWLEDGE_BASE_ID,
                modelArn: `arn:aws:bedrock:us-east-1::foundation-model/${DEFAULT_MODEL}`,
                retrievalConfiguration: {
                    vectorSearchConfiguration: { numberOfResults: 5 },
                },
                generationConfiguration: {
                    inferenceConfig: {
                        textInferenceConfig: {
                            maxTokens: 1024,
                            temperature: 0.3,  // Lower temp for factual retrieval
                        },
                    },
                },
            },
        },
        sessionId,
    }));
    
    const citations = (response.citations ?? []).flatMap(c =>
        (c.retrievedReferences ?? []).map(ref => ({
            text: ref.content?.text ?? '',
            location: ref.location?.s3Location?.uri ?? '',
        }))
    );
    
    return {
        answer: response.output?.text ?? '',
        citations,
        sessionId: response.sessionId ?? '',
    };
}

// Pure retrieval: get relevant chunks without generation
export async function retrieveChunks(query: string, k = 5) {
    const response = await agentRuntimeClient.send(new RetrieveCommand({
        knowledgeBaseId: KNOWLEDGE_BASE_ID,
        retrievalQuery: { text: query },
        retrievalConfiguration: {
            vectorSearchConfiguration: { numberOfResults: k },
        },
    }));
    
    return (response.retrievalResults ?? []).map(r => ({
        text: r.content?.text ?? '',
        score: r.score ?? 0,
        source: r.location?.s3Location?.uri ?? '',
    }));
}
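
When `retrieveChunks` is paired with your own generation step, the retrieved chunks have to be filtered and formatted into a prompt. A minimal sketch, assuming the `{ text, score, source }` shape returned above; `buildContext` and the `0.4` score threshold are illustrative assumptions, not recommended values.

```typescript
interface RetrievedChunk {
    text: string;
    score: number;
    source: string;
}

// Hypothetical helper: drops low-scoring and empty chunks, orders the rest by
// score, and numbers them so the model can cite sources as [1], [2], ...
function buildContext(chunks: RetrievedChunk[], minScore = 0.4): string {
    return chunks
        .filter(c => c.score >= minScore && c.text.trim().length > 0) // filter() copies, so sort() leaves the input intact
        .sort((a, b) => b.score - a.score)
        .map((c, i) => `[${i + 1}] (${c.source})\n${c.text.trim()}`)
        .join('\n\n');
}
```

The resulting string could be prepended to the user question and sent through `chat`, which keeps control over the prompt that `RetrieveAndGenerateCommand` otherwise manages for you.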

CDK Infrastructure

// infra/bedrock-stack.ts
import * as cdk from 'aws-cdk-lib';
import * as iam from 'aws-cdk-lib/aws-iam';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as s3 from 'aws-cdk-lib/aws-s3';

export class BedrockAiStack extends cdk.Stack {
    constructor(scope: cdk.App, id: string, props?: cdk.StackProps) {
        super(scope, id, props);
        
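        // IAM role for Lambda to call Bedrock
        const bedrockRole = new iam.Role(this, 'BedrockLambdaRole', {
            assumedBy: new iam.ServicePrincipal('lambda.amazonaws.com'),
            // A custom role does not get CloudWatch Logs permissions automatically
            managedPolicies: [
                iam.ManagedPolicy.fromAwsManagedPolicyName('service-role/AWSLambdaBasicExecutionRole'),
            ],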
            inlinePolicies: {
                bedrock: new iam.PolicyDocument({
                    statements: [
                        new iam.PolicyStatement({
                            effect: iam.Effect.ALLOW,
                            actions: [
                                'bedrock:InvokeModel',
                                'bedrock:InvokeModelWithResponseStream',
                                'bedrock:Retrieve',
                            ],
                            resources: [
                                `arn:aws:bedrock:*::foundation-model/*`,
                                `arn:aws:bedrock:${this.region}:${this.account}:knowledge-base/*`,
                            ],
                        }),
                        // RetrieveAndGenerate does not support resource-level permissions
                        new iam.PolicyStatement({
                            effect: iam.Effect.ALLOW,
                            actions: ['bedrock:RetrieveAndGenerate'],
                            resources: ['*'],
                        }),
                    ],
                }),
            },
        });
        
        const chatFunction = new lambda.Function(this, 'ChatFunction', {
            runtime: lambda.Runtime.NODEJS_22_X,
            handler: 'index.handler',
            code: lambda.Code.fromAsset('lambda/chat'),
            role: bedrockRole,
            timeout: cdk.Duration.seconds(30),
            memorySize: 512,
            environment: {
                MODEL_ID: 'anthropic.claude-sonnet-4-5-20251022-v2:0',
                // Name must match what lib/knowledgeBase.ts reads (process.env.BEDROCK_KB_ID)
                BEDROCK_KB_ID: process.env.BEDROCK_KB_ID ?? '',
            },
        });
    }
}
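
The stack points at `lambda/chat` with handler `index.handler`, which is not shown here. One possible shape, as a sketch: it assumes an API Gateway style proxy event whose JSON body contains a `message` field, and it takes the chat function as a parameter so it can be tested without AWS. `makeHandler`, `ProxyEvent`, `ProxyResult`, and `ChatFn` are hypothetical names.

```typescript
// Minimal local types (assumption: API Gateway proxy-style event and result)
interface ProxyEvent { body: string | null }
interface ProxyResult { statusCode: number; body: string }
type ChatFn = (message: string) => Promise<string>;

// Hypothetical factory: wraps a chat function in a Lambda-style handler
function makeHandler(chatFn: ChatFn) {
    return async (event: ProxyEvent): Promise<ProxyResult> => {
        let message: unknown;
        try {
            message = JSON.parse(event.body ?? '{}').message;
        } catch (_e) {
            return { statusCode: 400, body: JSON.stringify({ error: 'Body must be a JSON object' }) };
        }
        if (typeof message !== 'string' || message.trim() === '') {
            return { statusCode: 400, body: JSON.stringify({ error: 'message is required' }) };
        }
        try {
            const reply = await chatFn(message);
            return { statusCode: 200, body: JSON.stringify({ reply }) };
        } catch (_e) {
            // Avoid leaking internal error details to the caller
            return { statusCode: 502, body: JSON.stringify({ error: 'Model invocation failed' }) };
        }
    };
}
```

In `lambda/chat/index.ts`, this could be wired up as `export const handler = makeHandler(msg => chat([{ role: 'user', content: [{ text: msg }] }], { modelId: process.env.MODEL_ID }))`, reusing `chat` from `lib/bedrock.ts`.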

For direct Anthropic SDK usage without AWS Bedrock, see the Anthropic SDK guide, which covers tool use, streaming, and agentic loop patterns. For an orchestration layer that wraps Bedrock models, the LangChain guide covers stateful agent graphs with LangChain and LangGraph. The Claude Skills 360 bundle includes AWS Bedrock skill sets covering model invocation, Bedrock Agents, and Knowledge Base RAG patterns. Start with the free tier to try Bedrock integration generation.

