Scheduled jobs are where reliability matters most — no one notices when they succeed, and production incidents happen when they fail silently or run twice. Claude Code generates job infrastructure with proper retry logic, distributed locking, and observability. The pattern applies whether you’re using node-cron for simple schedules, BullMQ for distributed queues, or cloud scheduler services.
This guide covers scheduled jobs with Claude Code: basic cron, BullMQ queues, distributed locking, and job monitoring.
Simple Cron with node-cron
We need a daily report email at 9am UTC.
Also weekly cleanup of expired sessions every Sunday midnight.
// src/jobs/scheduler.ts
import cron from 'node-cron';
import * as Sentry from '@sentry/node';
import { sendDailyReport } from './daily-report';
import { cleanupExpiredSessions } from './cleanup-sessions';
import { logger } from '../lib/logger';
export function startScheduler() {
// Daily report — 9:00 AM UTC
cron.schedule('0 9 * * *', async () => {
await runJob('daily-report', sendDailyReport);
}, {
timezone: 'UTC',
});
// Weekly cleanup — Sundays at midnight UTC
cron.schedule('0 0 * * 0', async () => {
await runJob('cleanup-expired-sessions', cleanupExpiredSessions);
}, {
timezone: 'UTC',
});
logger.info('Scheduler started');
}
// Wrapper: adds error handling, duration logging, and Sentry reporting
async function runJob(name: string, fn: () => Promise<void>) {
const start = Date.now();
logger.info({ job: name }, 'Job started');
try {
await fn();
const duration = Date.now() - start;
logger.info({ job: name, duration }, 'Job completed successfully');
// Heartbeat to monitoring (e.g., BetterUptime, Healthchecks.io)
const pingUrl = process.env[`${name.toUpperCase().replace(/-/g, '_')}_PING_URL`];
if (pingUrl) {
await fetch(pingUrl);
}
} catch (error) {
logger.error({ job: name, error }, 'Job failed');
Sentry.captureException(error, { tags: { job: name } });
// Don't re-throw — cron continues running other schedules
}
}
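The heartbeat lookup derives an environment variable name from the job name (`daily-report` becomes `DAILY_REPORT_PING_URL`). That mapping is worth pulling into a small helper so the scheduler and your deployment config can't drift apart. A minimal sketch (the `pingUrlEnvName` helper is hypothetical, not part of the code above):

```typescript
// Derive the heartbeat env var name from a job name:
// 'daily-report' -> 'DAILY_REPORT_PING_URL'
export function pingUrlEnvName(jobName: string): string {
  return `${jobName.toUpperCase().replace(/-/g, '_')}_PING_URL`;
}

// Inside runJob, the heartbeat check then reads:
// const pingUrl = process.env[pingUrlEnvName(name)];
// if (pingUrl) await fetch(pingUrl);
```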
BullMQ for Distributed Job Queues
We process thousands of image resizes per day.
They're currently blocking the web request.
Move them to a background queue.
// src/jobs/queues.ts
import { Queue, Worker, type Job } from 'bullmq';
import Redis from 'ioredis';
import { logger } from '../lib/logger';
const connection = new Redis({
host: process.env.REDIS_HOST!,
port: parseInt(process.env.REDIS_PORT ?? '6379', 10),
maxRetriesPerRequest: null, // Required by BullMQ
});
// Queue definition
export const imageQueue = new Queue('image-processing', {
connection,
defaultJobOptions: {
attempts: 3, // Retry 3 times on failure
backoff: {
type: 'exponential',
delay: 2000, // 2s, 4s, 8s
},
removeOnComplete: { age: 24 * 60 * 60 }, // Keep 24h
removeOnFail: { age: 7 * 24 * 60 * 60 }, // Keep failed 7 days
},
});
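With `type: 'exponential'`, BullMQ doubles the wait between retries starting from `delay`: retry n waits `delay * 2^(n - 1)` milliseconds. A sketch of that formula, for illustration only (BullMQ computes this internally; you never call it yourself):

```typescript
// Exponential backoff as used for retry scheduling:
// retry 1 waits delay, retry 2 waits 2 * delay, retry 3 waits 4 * delay.
export function exponentialBackoffMs(
  baseDelayMs: number,
  attemptsMade: number,
): number {
  return baseDelayMs * 2 ** (attemptsMade - 1);
}
```

With `delay: 2000` and `attempts: 3`, the three retries wait 2000, 4000, and 8000 ms, matching the 2s, 4s, 8s in the comment above.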
// Add jobs from the web server
export async function enqueueImageResize(imageId: string, userId: string) {
const job = await imageQueue.add(
'resize',
{ imageId, userId },
{
// Prevent duplicate processing of same image
jobId: `resize-${imageId}`, // Deterministic ID — duplicates are ignored
priority: 1,
},
);
return job.id;
}
// Worker — runs in a separate process
export function startImageWorker() {
const worker = new Worker(
'image-processing',
async (job: Job) => {
const { imageId, userId } = job.data;
// downloadImage, resizeImage, and uploadResizedImage are app-specific helpers, assumed imported elsewhere
// Update progress — visible in monitoring
await job.updateProgress(0);
const original = await downloadImage(imageId);
await job.updateProgress(25);
const resized = await resizeImage(original, { width: 800, height: 600 });
await job.updateProgress(75);
await uploadResizedImage(imageId, resized);
await job.updateProgress(100);
// Store result — accessible to the job producer
return { resizedUrl: `https://cdn.example.com/images/${imageId}/800x600.webp` };
},
{
connection,
concurrency: 5, // Process 5 images simultaneously
limiter: {
max: 100, // Max 100 jobs per minute (rate limiting)
duration: 60 * 1000,
},
},
);
worker.on('completed', (job) => {
logger.info({ jobId: job.id, result: job.returnvalue }, 'Image processed');
});
worker.on('failed', (job, error) => {
logger.error({ jobId: job?.id, error: error.message }, 'Image processing failed');
});
return worker;
}
// Web server — enqueue and return immediately
app.post('/api/images/:id/resize', authenticate, async (req, res) => {
const jobId = await enqueueImageResize(req.params.id, req.user.id);
res.status(202).json({
message: 'Resize queued',
jobId,
statusUrl: `/api/jobs/${jobId}`,
});
// 202 Accepted — processing happens asynchronously
});
// Check job status
app.get('/api/jobs/:jobId', authenticate, async (req, res) => {
const job = await imageQueue.getJob(req.params.jobId);
if (!job) return res.status(404).json({ error: 'Job not found' });
const state = await job.getState();
res.json({
id: job.id,
state,
progress: job.progress,
result: job.returnvalue,
failedReason: job.failedReason,
attemptsMade: job.attemptsMade,
});
});
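A client polling the status endpoint needs to know when to stop. BullMQ's `getState()` returns strings such as `'waiting'`, `'active'`, `'delayed'`, `'completed'`, and `'failed'`; only the last two are terminal. A small helper for the polling loop (a sketch, assuming those documented state names):

```typescript
// Job states after which polling can stop: the job will not change again.
const TERMINAL_STATES = new Set(['completed', 'failed']);

export function isTerminalState(state: string): boolean {
  return TERMINAL_STATES.has(state);
}

// A client polls /api/jobs/:jobId until isTerminalState(state) is true,
// then reads `result` on success or `failedReason` on failure.
```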
Distributed Locking
The session cleanup job is running on 3 servers simultaneously.
It's processing the same sessions 3 times. Add a distributed lock.
// src/lib/distributed-lock.ts
import Redis from 'ioredis';
import { logger } from './logger';
const redis = new Redis(process.env.REDIS_URL!);
// Simple distributed lock using Redis SET NX
export async function withLock<T>(
lockKey: string,
ttlSeconds: number,
fn: () => Promise<T>,
): Promise<T | null> {
const lockValue = `${process.pid}-${Date.now()}-${Math.random()}`;
// Atomic: SET only if the key doesn't exist
const acquired = await redis.set(
`lock:${lockKey}`,
lockValue,
'EX', ttlSeconds, // TTL in seconds
'NX', // Only set if Not eXists
);
if (!acquired) {
logger.info({ lock: lockKey }, 'Lock not acquired — another instance is running');
return null;
}
logger.info({ lock: lockKey }, 'Lock acquired');
try {
return await fn();
} finally {
// Only release if we still hold the lock (value matches)
const luaScript = `
if redis.call("get", KEYS[1]) == ARGV[1] then
return redis.call("del", KEYS[1])
else
return 0
end
`;
await redis.eval(luaScript, 1, `lock:${lockKey}`, lockValue);
logger.info({ lock: lockKey }, 'Lock released');
}
}
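The Lua script exists because a lock can expire mid-job and be re-acquired by another instance; an unconditional DEL would then release someone else's lock. The invariant it protects can be illustrated in-process with a Map standing in for Redis (a sketch of the check-and-delete logic only, not production code):

```typescript
// In-memory stand-in for the Redis key space holding locks.
export const locks = new Map<string, string>();

// Mirrors the Lua script: delete the lock only if the stored value
// matches ours, i.e. only if we are still the owner.
export function releaseIfOwner(key: string, value: string): boolean {
  if (locks.get(key) === value) {
    locks.delete(key);
    return true;
  }
  return false;
}
```

A stale holder whose lock expired and was re-acquired by another instance gets `false` back and leaves the new owner's lock intact, which is exactly what the GET-then-DEL script guarantees atomically on the Redis side.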
// Usage in cleanup job
async function cleanupExpiredSessions() {
const result = await withLock('cleanup-expired-sessions', 300, async () => {
const deleted = await db.query(
'DELETE FROM sessions WHERE expires_at < NOW() RETURNING id'
);
logger.info({ count: deleted.rowCount }, 'Expired sessions cleaned up');
return deleted.rowCount;
});
if (result === null) {
logger.info('Cleanup skipped — another instance is running');
}
}
For running background workers in Kubernetes with proper scaling and shutdown hooks, see the Kubernetes guide. For observability on job failures including distributed tracing, see the OpenTelemetry guide. The Claude Skills 360 bundle includes job queue skill sets for BullMQ and distributed scheduling patterns. Start with the free tier to move synchronous operations to background queues.