Cloudflare Workers: Patterns That Actually Matter in Production

Workers are real edge compute — not Lambda@Edge duct-taped to CloudFront. But there are a handful of configuration mistakes and runtime pitfalls that will wreck you in production if you don't know about them. Here's what I've learned shipping Workers at scale.

The pitch for Cloudflare Workers is simple: your code runs in 300+ data centers globally, cold start is milliseconds, and you don't manage any servers. The reality mostly holds up. But "zero-config edge" implies a few things that aren't actually true out of the box, and the failure modes are subtle.

Configuration must-haves

Set compatibility_date to today

Every new Workers project should have a compatibility_date set to the current date in wrangler.jsonc. This opts you into the latest runtime behavior, APIs, and bug fixes. Older dates preserve legacy behavior for existing projects — fine for stability, but new projects should start current:

{
  "name": "my-worker",
  "compatibility_date": "2026-04-08",
  "compatibility_flags": ["nodejs_compat"]
}

Always enable nodejs_compat

Without nodejs_compat, imports from node:crypto, node:buffer, and node:stream fail at runtime with cryptic errors. Most non-trivial packages depend on at least one of these. Enable the flag and stop debugging phantom import failures.
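A quick sanity check that the flag is doing its job: a sketch using two node: builtins (sha256Hex is my own helper name, not part of any Workers API):

```typescript
// With nodejs_compat enabled, node: builtins resolve inside the Worker.
import { Buffer } from "node:buffer";
import { createHash } from "node:crypto";

// Hypothetical helper: hash a string with node:crypto, which would throw
// at import time if the compatibility flag were missing.
export function sha256Hex(input: string): string {
  return createHash("sha256").update(Buffer.from(input, "utf8")).digest("hex");
}
```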

Never hand-write your Env interface

If you're using TypeScript, run npx wrangler types to auto-generate your Env interface from your actual wrangler.jsonc bindings. Hand-writing it means it will drift from reality. Add --check in CI to fail if types are stale:

# Generate types locally
npx wrangler types

# CI check — exits 1 if wrangler.jsonc and generated types are out of sync
npx wrangler types --check

Streaming bodies — the 128 MB wall

Workers have a 128 MB memory limit. That's not a theoretical concern — it bites you as soon as you start buffering response bodies. The classic mistake:

// WRONG: buffers the entire response body into memory
const response = await fetch(upstreamUrl);
const text = await response.text(); // 150 MB → OOM

// RIGHT: stream the body directly without buffering
const response = await fetch(upstreamUrl);
return new Response(response.body, {
  status: response.status,
  headers: response.headers,
});

If you absolutely need to inspect or modify the body, enforce a size limit before reading:

export default {
  async fetch(request: Request): Promise<Response> {
    const MAX_BYTES = 10 * 1024 * 1024; // 10 MB
    const contentLength = Number(request.headers.get("content-length") ?? 0);
    if (contentLength > MAX_BYTES) {
      return new Response("Request too large", { status: 413 });
    }
    const body = await request.arrayBuffer();
    // process body...
    return new Response("OK");
  }
}

waitUntil for post-response work

ctx.waitUntil() lets you run work after the response has been sent — analytics, cache writes, logging that isn't on the critical path. The response returns fast; the work continues behind the scenes for up to 30 seconds.

export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const response = await handleRequest(request, env);

    // Fire and forget — happens after response is sent
    ctx.waitUntil(logToAnalytics(request, response, env));

    return response;
  }
}

Pitfall: don't destructure ctx. A detached waitUntil loses its this binding and throws "Illegal invocation":

const { waitUntil } = ctx; waitUntil(promise); // ❌

Always call it directly: ctx.waitUntil(promise); // ✅

Queues vs Workflows: which one you actually need

Both handle async work, but they're different tools. Getting this wrong means either over-engineering or hitting a wall when your use case doesn't fit.

Reach for Queues when:
- Decoupling producer from consumer
- Fan-out (one event → many consumers)
- Buffering or batching messages
- Simple single-step background jobs
- At-least-once delivery with retries

Reach for Workflows when:
- Multiple dependent steps
- Step results must be persisted
- Only failed steps should retry (not the whole job)
- Long-running processes (hours or days)
- Human-approval steps via step.waitForEvent()

The patterns compose: use a Queue for high-throughput ingestion, then have the consumer trigger a Workflow per item for complex multi-step fulfillment. Queue handles the burst; Workflow handles the durability.

// Queue consumer triggers a Workflow per message
export default {
  async queue(batch: MessageBatch<{ orderId: string }>, env: Env): Promise<void> {
    for (const msg of batch.messages) {
      await env.MY_WORKFLOW.create({
        id: msg.body.orderId,
        params: msg.body,
      });
      msg.ack();
    }
  }
}

Service bindings for Worker-to-Worker

If you have multiple Workers and one needs to call another, don't make an HTTP request to the public URL. Use service bindings: they're free, bypass the public internet, and support type-safe RPC via WorkerEntrypoint.

// auth-worker.ts
import { WorkerEntrypoint } from "cloudflare:workers";

export default class AuthService extends WorkerEntrypoint<Env> {
  async verifyToken(token: string): Promise<boolean> {
    // Your auth logic here
    return token === await this.env.DB.prepare(
      "SELECT token FROM sessions WHERE token = ?"
    ).bind(token).first("token");
  }
}

// api-worker.ts — calls auth-worker via binding, no HTTP
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const token = request.headers.get("Authorization")?.slice(7) ?? "";
    const valid = await env.AUTH_SERVICE.verifyToken(token);
    if (!valid) return new Response("Unauthorized", { status: 401 });
    // ...
  }
}

Declare the binding in wrangler.jsonc:

{
  "services": [
    { "binding": "AUTH_SERVICE", "service": "auth-worker" }
  ]
}

Hyperdrive for external databases

Every request to an external Postgres or MySQL database pays a TCP + TLS + auth handshake overhead. From a Cloudflare edge node, that's often 300–500ms before your first query even runs. Hyperdrive maintains a regional connection pool close to your database and eliminates that overhead.

import { Client } from "pg";

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Create a new Client per request — Hyperdrive manages the actual pool
    const client = new Client({ connectionString: env.HYPERDRIVE.connectionString });
    await client.connect();

    try {
      const result = await client.query("SELECT * FROM posts LIMIT 10");
      return Response.json(result.rows);
    } finally {
      await client.end();
    }
  }
}

Requires nodejs_compat and a Hyperdrive binding in wrangler.jsonc. If you're hitting an external database from Workers without Hyperdrive, you're leaving significant latency on the table.
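The binding half of that looks roughly like this in wrangler.jsonc (the id placeholder comes from wrangler hyperdrive create; the binding name just has to match what your code reads as env.HYPERDRIVE):

{
  "compatibility_flags": ["nodejs_compat"],
  "hyperdrive": [
    { "binding": "HYPERDRIVE", "id": "<your-hyperdrive-config-id>" }
  ]
}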

Secrets: wrangler secret put, never in code

API keys, database URLs, OAuth secrets — none of these belong in wrangler.jsonc or source code:

# Deploy a secret (takes effect immediately)
wrangler secret put DATABASE_URL

# Stage a secret without deploying (use with gradual rollouts)
wrangler versions secret put DATABASE_URL

# In local dev: .dev.vars file (git-ignored, never committed)
# DATABASE_URL=postgres://localhost/mydb

Access secrets in the Worker exactly like any other binding: env.DATABASE_URL. No special handling needed — they're just environment variables that Cloudflare encrypts at rest and injects at runtime.

Custom domains vs routes — the DNS mistake everyone makes

These are different things with different DNS requirements. A custom domain makes the Worker itself the origin: Cloudflare creates and manages the DNS record for you. A route only intercepts traffic on a hostname that already resolves, so it assumes a DNS record exists.

The common failure: you add a route for api.example.com/v2/* but there's no DNS record for api.example.com. Result: ERR_NAME_NOT_RESOLVED and an hour of debugging.

Fix: if you're using a route but have no real origin behind it, add a proxied AAAA record pointing to 100:: as a placeholder. Cloudflare intercepts the request before it ever reaches that address.
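For reference, the route itself is declared in wrangler.jsonc roughly like this (pattern and zone_name here are placeholders for your own zone):

{
  "routes": [
    { "pattern": "api.example.com/v2/*", "zone_name": "example.com" }
  ]
}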

Global mutable state will betray you

Workers reuse isolates across requests. Module-level variables persist between requests in the same isolate. This is intentional for performance (it's how you cache model weights or DB connections), but it means you can accidentally leak data between requests:

// WRONG — currentUser bleeds across requests
let currentUser: User | null = null;

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    currentUser = await getUser(request, env); // race condition
    const data = await getDataFor(currentUser);
    return Response.json(data);
  }
}

// RIGHT — pass state through function arguments
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const user = await getUser(request, env);
    const data = await getDataFor(user);
    return Response.json(data);
  }
}

Observability: turn it on before you need it

Enable logs and traces in wrangler.jsonc before you deploy to production. You want the data already collected when the first intermittent error appears, not after:

{
  "observability": {
    "enabled": true,
    "logs": { "head_sampling_rate": 1 },
    "traces": { "enabled": true, "head_sampling_rate": 0.01 }
  }
}

Use structured JSON logs, not plain strings. They're queryable in the dashboard:

// Queryable
console.log(JSON.stringify({ event: "request", path: url.pathname, status: 200, durationMs: 42 }));

// Not queryable
console.log("Request to /api/users completed in 42ms");

console.error() maps to "error" severity; console.warn() maps to "warning". Use them consistently so your alerts fire on the right things.
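To keep the level and the shape consistent, I like a tiny helper along these lines (logEvent and its fields are my own sketch, not a Workers API):

```typescript
type LogLevel = "info" | "warning" | "error";

// Hypothetical helper: emit one JSON line per event, routed to the
// console method whose severity matches the level.
export function logEvent(
  level: LogLevel,
  event: string,
  fields: Record<string, unknown> = {}
): string {
  const line = JSON.stringify({ level, event, ...fields });
  if (level === "error") console.error(line);
  else if (level === "warning") console.warn(line);
  else console.log(line);
  return line;
}
```

Returning the serialized line also makes the helper trivial to unit test.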

Security: two things that are easy to get wrong

Token comparison timing attacks

Direct string comparison leaks timing information. Use constant-time comparison instead:

async function verifyToken(provided: string, expected: string): Promise<boolean> {
  const encoder = new TextEncoder();
  const [providedHash, expectedHash] = await Promise.all([
    crypto.subtle.digest("SHA-256", encoder.encode(provided)),
    crypto.subtle.digest("SHA-256", encoder.encode(expected)),
  ]);
  return crypto.subtle.timingSafeEqual(providedHash, expectedHash);
}

Random values for security-sensitive operations

Math.random() is not cryptographically secure. For tokens, IDs, and anything security-related:

// Unique ID
const id = crypto.randomUUID();

// Random bytes
const bytes = new Uint8Array(32);
crypto.getRandomValues(bytes);

The non-obvious one: no floating promises

An unawaited promise in a Worker is silently dropped when the isolate terminates. The work doesn't run. The error doesn't surface. You just lose data:

// WRONG — this analytics write may never complete
writeAnalytics(event); // floating promise

// RIGHT — awaited
await writeAnalytics(event);

// RIGHT — if you don't want to block the response
ctx.waitUntil(writeAnalytics(event));

Enable @typescript-eslint/no-floating-promises in your ESLint config and let the linter catch these before they reach production.
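A sketch of that in flat config, assuming the typescript-eslint package (the rule needs type information, so projectService is turned on):

// eslint.config.mjs
import tseslint from "typescript-eslint";

export default tseslint.config(
  ...tseslint.configs.recommendedTypeChecked,
  {
    languageOptions: {
      // no-floating-promises requires type info; projectService wires up tsconfig
      parserOptions: { projectService: true },
    },
    rules: {
      "@typescript-eslint/no-floating-promises": "error",
    },
  }
);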