Platform engineering is the practice of building internal tooling that makes developers more productive. Instead of each team solving the same infrastructure problems independently, the platform team builds once: service templates, deployment pipelines, observability scaffolding, and self-service APIs for common operations. Claude Code accelerates building both the platform itself and the tooling developers use to interact with it.
## Golden Path Service Templates
```
Create a golden path template for new microservices.
Every new service should get: HTTP server, health checks,
structured logging, Prometheus metrics, distributed tracing,
Dockerfile, and Kubernetes manifests — all pre-wired.
```
### CLAUDE.md for Platform Engineering
```markdown
## Platform Stack
- Languages: Go (primary), TypeScript (secondary)
- Container registry: ECR (us-east-1)
- Kubernetes: EKS, namespace per team, ArgoCD for GitOps
- Service mesh: Istio with mTLS (see service-mesh playbook)
- Observability: OpenTelemetry → Grafana Cloud (traces + metrics + logs)
- Secret management: AWS Secrets Manager + External Secrets Operator

## Golden Path Requirements
Every new service MUST have:
- /healthz (liveness) and /readyz (readiness) endpoints
- Structured JSON logs via zap (Go) or pino (Node)
- OTLP traces initialized from OTEL_EXPORTER_OTLP_ENDPOINT env var
- Prometheus /metrics endpoint
- Graceful shutdown on SIGTERM (drain connections, then exit)
- Multi-stage Dockerfile (< 100MB final image)
- Kubernetes Deployment + Service + HPA manifests
```
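The multi-stage Dockerfile requirement can be sketched as follows — a minimal example for the Go golden path, not a mandated file. The Go version, distroless base image, and `cmd/server` layout are assumptions of this sketch:

```dockerfile
# Build stage — full Go toolchain
FROM golang:1.22 AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# Static binary so the runtime stage needs no libc
RUN CGO_ENABLED=0 go build -o /bin/server ./cmd/server

# Runtime stage — distroless static image, well under the 100MB budget
FROM gcr.io/distroless/static-debian12
COPY --from=build /bin/server /server
USER nonroot:nonroot
ENTRYPOINT ["/server"]
```

The distroless base keeps the final image to the binary plus CA certificates, which also shrinks the attack surface.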
```
Generate a golden path Go service template for: user-service.
Apply all platform standards. Output the complete file tree.
```
```go
// cmd/server/main.go — golden path Go service
package main

import (
	"context"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"

	"github.com/prometheus/client_golang/prometheus/promhttp"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
	"go.opentelemetry.io/otel/sdk/trace"
	"go.uber.org/zap"

	"github.com/mycompany/user-service/internal/api"
	"github.com/mycompany/user-service/internal/config"
	"github.com/mycompany/user-service/internal/database"
)

func main() {
	// Structured logger — always first
	logger, _ := zap.NewProduction()
	defer logger.Sync()

	cfg := config.Load()

	// Initialize OpenTelemetry tracing
	tp := initTracing(cfg.OtelEndpoint, logger)
	defer tp.Shutdown(context.Background())

	// Database connection
	db := database.Connect(cfg.DatabaseURL, logger)
	defer db.Close()

	// HTTP server with all required endpoints
	mux := http.NewServeMux()

	// Platform requirements
	mux.HandleFunc("/healthz", healthz)
	mux.HandleFunc("/readyz", readyz(db))
	mux.Handle("/metrics", promhttp.Handler())

	// Service endpoints
	handler := api.NewHandler(db, logger)
	handler.Register(mux)

	srv := &http.Server{
		Addr:         ":" + cfg.Port,
		Handler:      mux,
		ReadTimeout:  10 * time.Second,
		WriteTimeout: 30 * time.Second,
		IdleTimeout:  120 * time.Second,
	}

	// Start server
	go func() {
		logger.Info("Starting server", zap.String("port", cfg.Port))
		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			logger.Fatal("Server failed", zap.Error(err))
		}
	}()

	// Graceful shutdown on SIGTERM/SIGINT
	quit := make(chan os.Signal, 1)
	signal.Notify(quit, syscall.SIGTERM, syscall.SIGINT)
	<-quit

	logger.Info("Shutdown signal received, draining connections...")
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		logger.Error("Server shutdown error", zap.Error(err))
	}
	logger.Info("Server stopped")
}

// healthz is the liveness probe — the process is up, so always report OK.
func healthz(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusOK)
}

// readyz is the readiness probe — fail when the database is unreachable so
// Kubernetes stops routing traffic to this pod. Assumes the internal
// database package returns a *database.DB exposing PingContext.
func readyz(db *database.DB) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
		defer cancel()
		if err := db.PingContext(ctx); err != nil {
			http.Error(w, "not ready: database unreachable", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	}
}

func initTracing(endpoint string, logger *zap.Logger) *trace.TracerProvider {
	if endpoint == "" {
		logger.Warn("OTEL_EXPORTER_OTLP_ENDPOINT not set — tracing disabled")
		return trace.NewTracerProvider()
	}
	exporter, err := otlptracehttp.New(
		context.Background(),
		otlptracehttp.WithEndpoint(endpoint),
		otlptracehttp.WithInsecure(),
	)
	if err != nil {
		logger.Error("Failed to create trace exporter", zap.Error(err))
		return trace.NewTracerProvider()
	}
	tp := trace.NewTracerProvider(
		trace.WithBatcher(exporter),
		trace.WithSampler(trace.ParentBased(trace.TraceIDRatioBased(0.1))), // 10% sampling
	)
	otel.SetTracerProvider(tp)
	return tp
}
```
## Self-Service Infrastructure API
```
Developers should be able to provision a new database
via API call without involving the platform team.
Build the provisioning service.
```
```typescript
// platform-api/src/routes/databases.ts
import { Router } from 'express';
import { z } from 'zod';
import { RDSClient, CreateDBInstanceCommand, DescribeDBInstancesCommand } from '@aws-sdk/client-rds';
import { SecretsManagerClient, CreateSecretCommand } from '@aws-sdk/client-secrets-manager';
import { db } from '../lib/db';
import { requireAuth, requireTeamMembership } from '../middleware/auth';
import { auditLog } from '../lib/audit';

const router = Router();
const rds = new RDSClient({ region: 'us-east-1' });
const secrets = new SecretsManagerClient({ region: 'us-east-1' });

const ProvisionRequest = z.object({
  name: z.string().regex(/^[a-z][a-z0-9-]{2,30}$/, 'Must be lowercase alphanumeric with hyphens'),
  tier: z.enum(['dev', 'staging', 'production']),
  engine: z.enum(['postgres', 'mysql']).default('postgres'),
  instanceClass: z.enum(['db.t3.micro', 'db.t3.small', 'db.t3.medium']).default('db.t3.micro'),
  multiAz: z.boolean().default(false),
});

// POST /api/databases — self-service database provisioning
router.post('/', requireAuth, async (req, res) => {
  const team = req.user.team;
  const parse = ProvisionRequest.safeParse(req.body);
  if (!parse.success) {
    return res.status(400).json({ error: 'Validation failed', details: parse.error.flatten() });
  }
  const { name, tier, engine, instanceClass, multiAz } = parse.data;

  // Enforce guardrails
  if (tier === 'production' && instanceClass === 'db.t3.micro') {
    return res.status(400).json({
      error: 'Production databases must use at least db.t3.small',
      hint: 'Use db.t3.small or larger for production workloads'
    });
  }
  if (tier === 'production' && !multiAz) {
    return res.status(400).json({
      error: 'Production databases require Multi-AZ for availability'
    });
  }

  const dbIdentifier = `${team}-${name}-${tier}`;

  // Check for duplicate
  const existing = await db('databases').where({ identifier: dbIdentifier }).first();
  if (existing) {
    return res.status(409).json({ error: `Database ${dbIdentifier} already exists` });
  }

  // Generate credentials (helpers defined elsewhere in the service)
  const password = generateSecurePassword();
  const secretArn = await createSecret(secrets, dbIdentifier, password);

  // Provision RDS instance
  await rds.send(new CreateDBInstanceCommand({
    DBInstanceIdentifier: dbIdentifier,
    DBInstanceClass: instanceClass,
    Engine: engine,
    MasterUsername: 'admin',
    MasterUserPassword: password,
    AllocatedStorage: tier === 'production' ? 100 : 20,
    StorageType: 'gp3',
    MultiAZ: multiAz,
    DeletionProtection: tier === 'production',
    BackupRetentionPeriod: tier === 'production' ? 7 : 1,
    Tags: [
      { Key: 'Team', Value: team },
      { Key: 'Environment', Value: tier },
      { Key: 'ManagedBy', Value: 'platform-api' },
    ],
  }));

  // Track in platform database
  await db('databases').insert({
    identifier: dbIdentifier,
    team,
    tier,
    engine,
    instance_class: instanceClass,
    secret_arn: secretArn,
    status: 'creating',
    created_by: req.user.id,
  });

  await auditLog({ action: 'database.created', actor: req.user.id, resource: dbIdentifier, tier });

  res.status(202).json({
    identifier: dbIdentifier,
    status: 'creating',
    estimatedReadyAt: new Date(Date.now() + 15 * 60 * 1000).toISOString(),
    secretArn: secretArn,
    message: `Database ${dbIdentifier} is being provisioned. Check status with GET /api/databases/${dbIdentifier}`,
  });
});

export default router;
```
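The route above calls `generateSecurePassword()` and `createSecret()` without showing them. Here is a minimal sketch of the password helper, using only Node's built-in `crypto` module — the character set and 32-character default are choices of this sketch, constrained by RDS's rule that master passwords may not contain `/`, `@`, `"`, or spaces:

```typescript
// platform-api/src/lib/secrets.ts — hypothetical location for the helpers
import { randomInt } from 'crypto';

// Symbols limited to characters RDS accepts in master passwords
// (printable ASCII except '/', '@', '"', and space).
const ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!#$%^*-_+=';

export function generateSecurePassword(length = 32): string {
  let out = '';
  for (let i = 0; i < length; i++) {
    // randomInt draws from a CSPRNG, unlike Math.random()
    out += ALPHABET[randomInt(ALPHABET.length)];
  }
  return out;
}
```

`createSecret` would then wrap the already-imported `CreateSecretCommand`, storing `{ username, password }` as JSON under a name such as `rds/<identifier>/credentials` so External Secrets Operator can sync it into the cluster.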
## Backstage Software Catalog
```
Set up a Backstage catalog that auto-discovers services
from our Kubernetes cluster and GitHub repos.
```
```yaml
# catalog/templates/microservice/template.yaml
# Backstage Software Template — creates a new service with golden path
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: microservice-template
  title: New Microservice
  description: Creates a new Go microservice with all platform integrations
  tags:
    - go
    - golden-path
    - recommended
spec:
  owner: platform-team
  type: service
  parameters:
    - title: Service Details
      required:
        - name
        - owner
        - description
      properties:
        name:
          title: Service Name
          type: string
          pattern: '^[a-z][a-z0-9-]{2,30}-service$'
          description: "Must end in -service (e.g., payment-service)"
          ui:autofocus: true
        owner:
          title: Owning Team
          type: string
          ui:field: OwnerPicker
          ui:options:
            catalogFilter:
              kind: Group
        description:
          title: Description
          type: string
        tier:
          title: Service Tier
          type: string
          enum: [internal, external, critical]
          description: "critical = PagerDuty rotation, SLA monitoring required"
    - title: Choose Repository Location
      properties:
        repoUrl:
          title: Repository Location
          type: string
          ui:field: RepoUrlPicker
          ui:options:
            allowedHosts:
              - github.com
            allowedOrganizations:
              - mycompany
  steps:
    - id: fetch-template
      name: Generate service from template
      action: fetch:template
      input:
        url: ./skeleton
        values:
          name: ${{ parameters.name }}
          owner: ${{ parameters.owner }}
          description: ${{ parameters.description }}
          tier: ${{ parameters.tier }}
    - id: publish
      name: Publish to GitHub
      action: publish:github
      input:
        allowedHosts: ['github.com']
        description: ${{ parameters.description }}
        repoUrl: ${{ parameters.repoUrl }}
        defaultBranch: main
        gitCommitMessage: "feat: initialize service from golden path template"
    - id: register
      name: Register in catalog
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps['publish'].output.repoContentsUrl }}
        catalogInfoPath: '/catalog-info.yaml'
    - id: create-jira
      name: Create onboarding ticket
      action: jira:issue:create
      input:
        projectKey: PLATFORM
        summary: "Platform onboarding: ${{ parameters.name }}"
        description: "New service ${{ parameters.name }} (owner: ${{ parameters.owner }}) needs: 1) PagerDuty rotation setup, 2) Grafana dashboard, 3) SLO definition"
  output:
    links:
      - title: Open Repository
        url: ${{ steps['publish'].output.remoteUrl }}
      - title: Open in Catalog
        url: ${{ steps['register'].output.catalogInfoUrl }}
```
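The `catalog:register` step expects a `catalog-info.yaml` in each generated repo, so the skeleton must template one. A minimal sketch — the `github.com/project-slug` and `backstage.io/kubernetes-label-selector` annotations are standard Backstage plugin annotations, but the label selector value and `lifecycle` shown here are assumptions:

```yaml
# skeleton/catalog-info.yaml — templated by the fetch:template step
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: ${{ values.name }}
  description: ${{ values.description }}
  annotations:
    github.com/project-slug: mycompany/${{ values.name }}
    backstage.io/kubernetes-label-selector: 'app=${{ values.name }}'
spec:
  type: service
  lifecycle: production
  owner: ${{ values.owner }}
```

The Kubernetes annotation is what lets the Backstage Kubernetes plugin auto-discover the service's pods in the cluster, as the prompt requests.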
## Platform CLI
```
Developers need a CLI to interact with the platform:
check service health, view logs, open dashboards,
request access — without learning kubectl.
```
```typescript
// platform-cli/src/commands/status.ts
import { Command } from 'commander';
import chalk from 'chalk';
import ora from 'ora';
import Table from 'cli-table3';
import { platformApi } from '../lib/api';
// getTeamFromGitConfig is a helper defined elsewhere in the CLI
import { getTeamFromGitConfig } from '../lib/team';

export const statusCommand = new Command('status')
  .description('Show health status of your team\'s services')
  .option('-t, --team <team>', 'Team name (default: inferred from git config)')
  .action(async (options) => {
    const team = options.team ?? await getTeamFromGitConfig();
    const spinner = ora(`Fetching status for team ${team}...`).start();
    try {
      const services = await platformApi.get(`/services?team=${team}`);
      spinner.stop();

      const table = new Table({
        head: ['Service', 'Status', 'Pods', 'CPU', 'Memory', 'Error Rate', 'P99'],
        style: { head: ['cyan'] },
      });
      for (const svc of services) {
        const statusColor = svc.status === 'healthy' ? chalk.green
          : svc.status === 'degraded' ? chalk.yellow
          : chalk.red;
        table.push([
          chalk.bold(svc.name),
          statusColor(svc.status),
          `${svc.pods.ready}/${svc.pods.total}`,
          `${svc.metrics.cpuPercent}%`,
          `${svc.metrics.memoryMb}MB`,
          svc.metrics.errorRate > 1
            ? chalk.red(`${svc.metrics.errorRate.toFixed(2)}%`)
            : `${svc.metrics.errorRate.toFixed(2)}%`,
          svc.metrics.p99LatencyMs > 500
            ? chalk.yellow(`${svc.metrics.p99LatencyMs}ms`)
            : `${svc.metrics.p99LatencyMs}ms`,
        ]);
      }
      console.log(table.toString());

      const unhealthy = services.filter(s => s.status !== 'healthy');
      if (unhealthy.length > 0) {
        console.log(chalk.red(`\n⚠ ${unhealthy.length} service(s) need attention`));
        for (const svc of unhealthy) {
          console.log(chalk.red(`  ${svc.name}: ${svc.statusReason}`));
          console.log(chalk.dim(`  Run: platform logs ${svc.name} --tail=50`));
        }
      }
    } catch (err) {
      spinner.fail('Failed to fetch status');
      console.error(chalk.red(err.message));
      process.exit(1);
    }
  });
```
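The command above infers the team from git config via a helper it does not define. A plausible sketch, assuming only Node built-ins — the `platform.team` git config key is an invented convention, and the injectable runner exists purely so the helper can be exercised without a real git repo:

```typescript
// platform-cli/src/lib/team.ts — hypothetical helper; the "platform.team"
// git config key is an assumption, not a documented convention
import { execFile } from 'child_process';
import { promisify } from 'util';

const execFileAsync = promisify(execFile);

type GitRunner = (args: string[]) => Promise<string>;

// Default runner shells out to the real git binary
const defaultRunner: GitRunner = async (args) =>
  (await execFileAsync('git', args)).stdout;

// Resolves the team set once per machine, e.g. via:
//   git config --global platform.team payments
export async function getTeamFromGitConfig(
  runGit: GitRunner = defaultRunner, // injectable for testing
): Promise<string> {
  // git exits non-zero when the key is unset; treat that as "no team"
  const stdout = await runGit(['config', '--get', 'platform.team']).catch(() => '');
  const team = stdout.trim();
  if (team) return team;
  throw new Error(
    'No team configured. Pass --team or run: git config --global platform.team <name>',
  );
}
```

Failing loudly with the exact command to run keeps the CLI self-documenting, which matters more for a platform tool than a clever fallback.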
For the GitOps foundation that platform teams typically use to manage cluster state, see the GitOps and ArgoCD guide. For the Kubernetes infrastructure underlying the platform, the Kubernetes guide and Helm charts guide cover the deployment layer. The Claude Skills 360 bundle includes platform engineering skill sets for IDP setup, service templates, and self-service tooling. Start with the free tier to try service template generation.