Claude Code for WebSocket Scaling: Redis Pub/Sub and Sticky Sessions — Claude Skills 360 Blog
Development

Claude Code for WebSocket Scaling: Redis Pub/Sub and Sticky Sessions

Published: July 3, 2026
Read time: 8 min
By: Claude Skills 360

A single WebSocket server works fine until you need more than one instance — then messages sent to a user connected to instance A don’t reach that user from instance B. The fix is a pub/sub layer (Redis is standard) that broadcasts messages across all server instances. Claude Code generates the adapter configuration, reconnection logic, and health check patterns that make multi-instance WebSocket deployments reliable.

This guide covers WebSocket scaling with Claude Code: Redis adapter for Socket.io, sticky sessions in load balancers, connection lifecycle, and graceful shutdown.

The Multi-Instance Problem

We're scaling our chat app to 3 server instances.
Messages from users on server 1 aren't reaching
users on server 2. Fix it.

Without a shared broadcast layer, a message only reaches the users connected to the same server instance; users on every other instance never see it. Publishing each message through Redis pub/sub closes that gap:

Client A → Server 1 → message published to Redis

                     Redis pub/sub
                   ↙       ↓       ↘
            Server 1   Server 2   Server 3
                           ↓          ↓
                  Client B receives it  Client C receives it

Redis pub/sub makes every server instance see every message.
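The mechanics are easier to see stripped of networking entirely. Here is a deliberately simplified in-memory model of the relay: a `Broker` stands in for Redis, and each `Instance` re-delivers every broadcast to its own locally connected clients. All names here are illustrative, not a Socket.io or Redis API.

```typescript
type Handler = (msg: string) => void;

// Stand-in for Redis pub/sub: every subscriber hears every published message
class Broker {
  private subscribers: Handler[] = [];
  subscribe(fn: Handler): void { this.subscribers.push(fn); }
  publish(msg: string): void { this.subscribers.forEach(fn => fn(msg)); }
}

// Stand-in for one server instance holding its own client connections
class Instance {
  inbox = new Map<string, string[]>(); // clientId -> delivered messages

  constructor(private broker: Broker) {
    // Each instance subscribes once and fans incoming messages
    // out to its own locally connected clients
    broker.subscribe(msg => {
      for (const msgs of this.inbox.values()) msgs.push(msg);
    });
  }

  connect(clientId: string): void { this.inbox.set(clientId, []); }

  // Outgoing messages go through the broker, not straight to local clients
  send(msg: string): void { this.broker.publish(msg); }
}

const broker = new Broker();
const server1 = new Instance(broker);
const server2 = new Instance(broker);
server1.connect('clientA');
server2.connect('clientB');

server1.send('hello'); // sent via server 1...
// ...and clientB on server 2 receives it, because server 2 also subscribes
```

This is exactly the substitution the Redis adapter makes: `emit` publishes instead of delivering locally, and delivery happens in every instance's subscription callback.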

Socket.io with Redis Adapter

// src/socket-server.ts
import { createServer } from 'http';
import express from 'express';
import { Server } from 'socket.io';
import { createAdapter } from '@socket.io/redis-adapter';
import { createClient } from 'redis';

const app = express();
const httpServer = createServer(app);

const io = new Server(httpServer, {
  cors: {
    // Browsers reject a wildcard origin when credentials are enabled,
    // so ALLOWED_ORIGIN must be set explicitly in production
    origin: process.env.ALLOWED_ORIGIN ?? 'http://localhost:3000',
    methods: ['GET', 'POST'],
    credentials: true,
  },
  // Heartbeat settings: a client that misses pings is considered gone;
  // the client library then retries with exponential backoff
  pingTimeout: 20000,
  pingInterval: 10000,
  
  // Allow both transports; polling clients upgrade to WebSocket.
  // The polling transport is why sticky sessions are needed (covered later)
  transports: ['websocket', 'polling'],
});

// Redis adapter — replaces default in-memory adapter
const pubClient = createClient({ url: process.env.REDIS_URL });
const subClient = pubClient.duplicate();

await Promise.all([pubClient.connect(), subClient.connect()]);

io.adapter(createAdapter(pubClient, subClient));
console.log('Socket.io using Redis adapter');

// Now io.to('room').emit() broadcasts via Redis to all instances
export { io };
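The intro mentioned health checks: an instance whose Redis clients drop should report unhealthy so the load balancer stops routing new connections to it. A small aggregator like the sketch below (the names and shape are assumptions, not a Socket.io API) can back a health endpoint, with something like `(await pubClient.ping()) === 'PONG'` wired in as one of the checks:

```typescript
// Hypothetical health aggregator: runs named async checks, reports overall status
type Check = () => Promise<boolean>;

async function healthStatus(
  checks: Record<string, Check>,
): Promise<{ healthy: boolean; results: Record<string, boolean> }> {
  const results: Record<string, boolean> = {};
  for (const [name, check] of Object.entries(checks)) {
    try {
      results[name] = await check();
    } catch {
      results[name] = false; // a throwing check counts as unhealthy
    }
  }
  return { healthy: Object.values(results).every(Boolean), results };
}

// Wiring it up might look like:
// app.get('/healthz', async (_req, res) => {
//   const status = await healthStatus({
//     redisPub: async () => (await pubClient.ping()) === 'PONG',
//     redisSub: async () => subClient.isReady,
//   });
//   res.status(status.healthy ? 200 : 503).json(status);
// });
```

A 503 from this endpoint is what tells the load balancer or Kubernetes readiness probe to drain the instance.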

Room-Based Broadcasting

// src/socket-handlers/chat.ts
import { Server, Socket } from 'socket.io';
import { checkChannelAccess, saveMessage } from '../lib';

export function registerChatHandlers(io: Server, socket: Socket) {
  // Auth middleware ran before this — socket.data.userId is set
  const userId = socket.data.userId;
  
  socket.on('join_channel', async (channelId: string) => {
    // Validate user has access to this channel
    const hasAccess = await checkChannelAccess(userId, channelId);
    
    if (!hasAccess) {
      socket.emit('error', { code: 'FORBIDDEN', message: 'Access denied' });
      return;
    }
    
    await socket.join(`channel:${channelId}`);
    
    // Notify others in the room
    socket.to(`channel:${channelId}`).emit('user_joined', {
      userId,
      channelId,
      timestamp: new Date().toISOString(),
    });
    
    socket.emit('joined', { channelId });
  });
  
  socket.on('send_message', async ({ channelId, content, clientMessageId }: {
    channelId: string;
    content: string;
    clientMessageId: string; // Client-generated ID for deduplication
  }) => {
    // Save to database
    const message = await saveMessage({
      channelId,
      userId,
      content,
      clientMessageId,
    });
    
    // Broadcast to ALL users in channel — via Redis across all instances
    io.to(`channel:${channelId}`).emit('new_message', {
      id: message.id,
      channelId,
      userId,
      content: message.content,
      timestamp: message.createdAt,
      clientMessageId,
    });
  });
  
  socket.on('typing', ({ channelId }: { channelId: string }) => {
    // Notify others (not sender) — volatile: ok to drop if disconnected
    socket.volatile.to(`channel:${channelId}`).emit('user_typing', {
      userId,
      channelId,
    });
  });
  
  socket.on('disconnect', (reason) => {
    console.log(`User ${userId} disconnected: ${reason}`);
    // Room cleanup is automatic — Socket.io removes from all rooms on disconnect
  });
}
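The `clientMessageId` field above exists so a retried `send_message` (a client may re-emit after a reconnect) doesn't produce a duplicate. In production that check belongs in `saveMessage`, typically as a unique index on `clientMessageId`; as a sketch of the idea, a bounded in-memory guard (a hypothetical helper, not part of Socket.io) looks like this:

```typescript
// Hypothetical dedup guard keyed on client-generated message IDs.
// Bounded so it cannot grow without limit on a long-lived server.
class DedupGuard {
  private seen = new Set<string>();

  constructor(private maxEntries = 10_000) {}

  /** Returns true the first time an ID is seen, false on any retry. */
  firstTime(clientMessageId: string): boolean {
    if (this.seen.has(clientMessageId)) return false;
    if (this.seen.size >= this.maxEntries) {
      // Evict the oldest entry; Sets iterate in insertion order
      const oldest = this.seen.values().next().value;
      if (oldest !== undefined) this.seen.delete(oldest);
    }
    this.seen.add(clientMessageId);
    return true;
  }
}

const guard = new DedupGuard();
guard.firstTime('msg-1'); // → true: process the message
guard.firstTime('msg-1'); // → false: retry, skip save and broadcast
```

With sticky sessions, a given client's retries land on the same instance, so a per-instance guard is enough; without them, only the database-level check is reliable.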

Authentication Middleware

// src/socket-middleware/auth.ts
import { Server } from 'socket.io';
import { verifyJWT } from '../lib/jwt';

export function applyAuthMiddleware(io: Server) {
  io.use(async (socket, next) => {
    const token = socket.handshake.auth.token ||
                  socket.handshake.headers.authorization?.replace('Bearer ', '');
    
    if (!token) {
      return next(new Error('Authentication required'));
    }
    
    const user = verifyJWT(token);
    if (!user) {
      return next(new Error('Invalid or expired token'));
    }
    
    // Attach user to socket for use in handlers
    socket.data.userId = user.id;
    socket.data.user = user;
    
    next();
  });
}
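The header branch in the middleware is slightly fragile: the `Bearer` scheme is case-insensitive per RFC 6750, and a plain `.replace('Bearer ', '')` leaves non-bearer values untouched. Pulling the extraction into a pure helper makes it stricter and easy to unit-test (the function name is illustrative):

```typescript
// Hypothetical helper: pull a bearer token from the Socket.io auth
// payload or from a standard Authorization header
function extractToken(
  auth: { token?: string },
  authorizationHeader?: string,
): string | null {
  if (auth.token) return auth.token;
  if (!authorizationHeader) return null;
  // The "Bearer" scheme is case-insensitive (RFC 6750 / RFC 7235)
  const match = /^Bearer\s+(.+)$/i.exec(authorizationHeader.trim());
  return match ? match[1] : null;
}

extractToken({ token: 'abc' });          // → 'abc'
extractToken({}, 'bearer xyz');          // → 'xyz'
extractToken({}, 'Basic dXNlcjpwdw=='); // → null: not a bearer token
```

Inside the middleware this would replace the two-branch `token` lookup, with the rest unchanged.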

Sticky Session Load Balancing

Our Kubernetes ingress is round-robin load balancing WebSocket connections.
The HTTP upgrade handshake lands on server 1 but subsequent requests
go to server 2 and fail.

Socket.io's long-polling transport splits one session across many HTTP requests, and the WebSocket upgrade handshake must complete against the instance that started it, so all of a client's traffic has to keep reaching the same server instance. Nginx handles this with ip_hash:

# nginx.conf
upstream socket_servers {
  ip_hash;  # Sticky sessions: same IP → same upstream
  server socket-1:3000;
  server socket-2:3000;
  server socket-3:3000;
}

server {
  listen 80;
  
  location /socket.io/ {
    proxy_pass http://socket_servers;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    
    # WebSocket connections can be long-lived
    proxy_read_timeout 3600s;
    proxy_send_timeout 3600s;
  }
}
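ip_hash keys routing on the client address, so one client's polling requests and WebSocket upgrade all land on the same upstream. The idea can be sketched as a pure function; this is a deliberately simplified illustration, not nginx's actual algorithm (nginx does hash only the first three octets of an IPv4 address, but uses its own hash function and weighting):

```typescript
// Simplified sticky routing: hash the client IP's /24 prefix to pick an upstream
const upstreams = ['socket-1:3000', 'socket-2:3000', 'socket-3:3000'];

function pickUpstream(clientIp: string): string {
  // Like nginx ip_hash, key on the first three octets (the /24 prefix)
  const prefix = clientIp.split('.').slice(0, 3).join('.');
  let hash = 0;
  for (const ch of prefix) hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  return upstreams[hash % upstreams.length];
}

// The same client always maps to the same instance:
pickUpstream('203.0.113.7') === pickUpstream('203.0.113.7'); // always true
// Hosts in the same /24 share an upstream, as with nginx ip_hash:
pickUpstream('203.0.113.7') === pickUpstream('203.0.113.99'); // true
```

The trade-off is visible in the sketch: stickiness comes from determinism, so clients behind one NAT or corporate proxy all hash to the same instance, which is why cookie-based affinity (below) balances better.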

For Kubernetes with more sophisticated routing:

# Ingress annotation for sticky sessions
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/affinity-mode: "balanced"
    nginx.ingress.kubernetes.io/session-cookie-name: "socket-server"
    nginx.ingress.kubernetes.io/session-cookie-expires: "172800"
    nginx.ingress.kubernetes.io/session-cookie-max-age: "172800"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"

Graceful Shutdown

When we deploy a new version, existing WebSocket connections
get dropped abruptly. Add graceful shutdown.

// src/graceful-shutdown.ts
import { Server } from 'socket.io';
import { Server as HttpServer } from 'http';

export function setupGracefulShutdown(
  io: Server,
  httpServer: HttpServer,
) {
  const shutdown = async (signal: string) => {
    console.log(`${signal} received — graceful shutdown starting`);
    
    // Notify clients on THIS instance to reconnect elsewhere.
    // The .local flag keeps the broadcast off the Redis adapter; without it,
    // clients on every instance would be told to reconnect.
    io.local.emit('server_shutdown', {
      message: 'Server restarting — reconnecting you to another instance',
      reconnectDelay: 2000,
    });
    
    // Give clients 2 seconds to receive the message
    await new Promise(resolve => setTimeout(resolve, 2000));
    
    // Disconnect the sockets on this instance (again .local: with the Redis
    // adapter, plain fetchSockets() returns sockets from ALL instances)
    for (const socket of await io.local.fetchSockets()) {
      socket.disconnect(true);
    }
    
    // Stop accepting new WebSocket connections
    io.close(() => {
      console.log('Socket.io server closed');
    });
    
    // Stop accepting new HTTP connections
    httpServer.close(() => {
      console.log('HTTP server closed');
      process.exit(0);
    });
    
    // Force exit after 30s
    setTimeout(() => {
      console.error('Forced shutdown after timeout');
      process.exit(1);
    }, 30_000);
  };
  
  process.on('SIGTERM', () => shutdown('SIGTERM'));
  process.on('SIGINT', () => shutdown('SIGINT'));
}

Client-side reconnection with backoff:

// Frontend — Socket.io client reconnects automatically
const socket = io('https://api.myapp.com', {
  auth: { token: getAuthToken() },
  reconnection: true,
  reconnectionAttempts: 10,
  reconnectionDelay: 1000,
  reconnectionDelayMax: 10000,
  randomizationFactor: 0.5,
});

let shutdownDelay = 2000;

socket.on('server_shutdown', ({ reconnectDelay }) => {
  // The server is shutting down and will disconnect us shortly
  shutdownDelay = reconnectDelay;
});

socket.on('disconnect', (reason) => {
  // A server-initiated disconnect ('io server disconnect') disables the
  // automatic reconnection configured above, so reconnect manually;
  // other reasons (transport close, ping timeout) reconnect automatically
  if (reason === 'io server disconnect') {
    setTimeout(() => socket.connect(), shutdownDelay);
  }
});

socket.on('connect_error', (error) => {
  console.error('Connection error:', error.message);
  // Socket.io retries automatically — just log
});
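The reconnection options above translate into a capped, jittered exponential backoff. The schedule they produce can be approximated with a small function; this is an illustration of the shape of the delays, not socket.io-client's exact internal code:

```typescript
// Approximate reconnection delay for the client options shown above
function reconnectDelay(
  attempt: number,                    // 1-based retry attempt
  base = 1000,                        // reconnectionDelay
  max = 10_000,                       // reconnectionDelayMax
  jitter = 0.5,                       // randomizationFactor
  rand: () => number = Math.random,   // injectable for testing
): number {
  const exp = Math.min(base * 2 ** (attempt - 1), max); // capped exponential
  const spread = exp * jitter * (2 * rand() - 1);       // +/- jitter
  return Math.min(Math.max(exp + spread, 0), max);
}

const noJitter = () => 0.5; // midpoint of the random range: zero spread
reconnectDelay(1, 1000, 10_000, 0.5, noJitter); // → 1000
reconnectDelay(4, 1000, 10_000, 0.5, noJitter); // → 8000
reconnectDelay(6, 1000, 10_000, 0.5, noJitter); // → 10000 (capped)
```

The jitter matters during a rolling deploy: without it, every client disconnected in the same instant retries in the same instant, hammering the surviving instances in waves.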

For the foundational WebSocket and real-time patterns, see the WebSockets guide. For Redis used for pub/sub and caching alongside WebSockets, see the Redis guide. For deployment in Kubernetes with horizontal pod autoscaling and proper shutdown hooks, see the Kubernetes guide. The Claude Skills 360 bundle includes WebSocket scaling skill sets for production chat and real-time applications. Start with the free tier to try WebSocket architecture patterns.

Put these ideas into practice

Claude Skills 360 gives you production-ready skills for everything in this article — and 2,350+ more. Start free or go all-in.
