Chapter 21: Security, Compliance, and Deployment
How do we meet security and compliance requirements?
Security requirements kill platform choices. An engineering team can build features on any modern platform, but enterprise security (compliance certifications, data residency, audit trails) narrows the options drastically. Cloudflare passes most enterprise security reviews, but extracting that value requires deliberate architecture.
This chapter covers decisions that make enterprise adoption possible: how the isolate model shapes your security posture, authentication patterns, compliance constraints, and why Cloudflare's deployment model enables confidence traditional platforms can't match.
The security model of isolates
Understanding the isolate model's security implications shapes every decision on the platform.
Workers run in V8 isolates, the same technology that keeps browser tabs from reading each other's memory. This has been battle-tested across billions of browser instances daily.
Traditional servers are long-lived processes with long-lived attack surfaces. A containerised application maintains memory across requests. A credential cached at 9am remains available at 5pm for any exploit achieving code execution. Isolates are ephemeral. Memory clears after execution. The exploitation window shrinks from hours to milliseconds.
This model affects how you handle secrets. In a traditional server, you might load database credentials at startup and reuse them across requests. In Workers, secrets are injected per-request through the environment object. No persistent process to compromise, no memory to dump for credentials. The attack surface is fundamentally smaller.
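A minimal sketch of that pattern, assuming a hypothetical `UPSTREAM_API_KEY` secret binding and a placeholder upstream URL:

```typescript
// Sketch: the secret arrives on `env` per request; nothing is cached in
// module-level state between requests. Binding name and URL are hypothetical.
interface Env {
  UPSTREAM_API_KEY: string; // set via `wrangler secret put UPSTREAM_API_KEY`
}

// Build the outbound headers for this request only.
export function authHeaders(env: Env): Record<string, string> {
  return { Authorization: `Bearer ${env.UPSTREAM_API_KEY}` };
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const upstream = await fetch("https://api.example.com/data", {
      headers: authHeaders(env),
    });
    return new Response(upstream.body, { status: upstream.status });
  },
};
```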
What isolates don't protect
The V8 isolate model protects you from other customers' code, not from your own mistakes. Isolates provide no protection against application vulnerabilities, misconfigured bindings, or accidentally persisting sensitive data to storage services.
Understanding the security model means understanding its boundaries. Isolates protect against memory-based attacks across requests but don't address several threat categories that remain your responsibility.
Isolates don't protect against your own code. If your Worker exfiltrates data through legitimate network calls, whether through supply chain compromise, logic bug, or malicious insider code, the isolate model is irrelevant. The code runs with the permissions you've granted.
Isolates don't protect against misconfiguration. Overly permissive CORS headers, bindings that grant production data access from staging environments, secrets accidentally logged: these are architectural failures, not runtime vulnerabilities. The platform can't save you from yourself.
Isolates don't prevent persisting sensitive data inappropriately. The "narrow window" benefit assumes your code doesn't deliberately write sensitive information to KV, D1, or Durable Objects where it persists indefinitely. Cache authentication tokens in KV without TTLs and you've recreated the persistent state problem the isolate model avoids.
The isolate model shrinks your attack surface. It doesn't eliminate the need for secure coding practices, careful configuration management, and deliberate data handling. Think of it as defence in depth: the platform provides multiple layers, but you're responsible for others.
Defence in depth: what the platform provides
The isolate model is the first layer, but Cloudflare doesn't stop there. Understanding the additional layers matters for security evaluations and explains why certain constraints exist.
Isolates share processes with other customers' code; that's what makes the model economically viable. But multiple overlapping protections prevent cross-isolate attacks even if V8 itself has vulnerabilities.
The V8 sandbox, deployed in Workers since 2024, eliminates entire classes of memory-corruption attacks. Traditional exploits corrupt pointers in V8's heap to reach other parts of the process. The sandbox removes all pointers from the heap entirely, replacing them with 32-bit offsets trapped within a 4 GiB "pointer cage." Even if an attacker corrupts an offset, the corrupted value can only point within the cage; it cannot escape to other isolates or system memory. Defence through architecture: the attack surface is removed rather than defended.
Hardware memory protection keys add another layer. Modern CPUs support Protection Keys for Userspace, allowing different threads to have different memory permissions. Cloudflare assigns each isolate a random protection key from the available set. If a V8 vulnerability allowed reading another isolate's memory, the mismatched protection key triggers a hardware trap rather than returning data. With roughly twelve keys available, this catches approximately 92% of cross-isolate access attempts, not by detecting the attack in software but by having the CPU itself refuse the access.
Cordons group Workers by trust level within the runtime. A free-tier Worker never executes in the same process as an enterprise customer's code. If a zero-day V8 vulnerability allowed cross-isolate access within a process, cordon boundaries limit blast radius to Workers at similar trust levels.
Beyond the runtime, a process-level sandbox (the "layer 2 sandbox") has been in place since Workers launched. Using Linux namespaces and seccomp, it prohibits all filesystem access and all direct network access from the sandbox process. Workers communicate only through controlled channels to supervisor processes that mediate external interaction. A Worker cannot read arbitrary files because no filesystem exists. It cannot open arbitrary network connections because the sandbox blocks all network syscalls; outbound HTTP routes through a proxy that enforces restrictions and adds tracing headers. Even complete V8 runtime compromise leaves the attacker trapped in a sandbox with no filesystem, no network, and no path to other customers' data.
The V8 engine itself receives security attention proportional to its importance. Google pays substantial bounties for V8 sandbox escapes and operates continuous fuzzing infrastructure. When a patch is published, Cloudflare's automated pipeline builds and deploys it within hours. The patch gap: under 24 hours from V8 fix publication to production deployment, compared to the days or weeks typical of platforms that coordinate updates across customer-managed infrastructure.
Why certain APIs don't exist
Some Workers constraints that appear to be performance limitations are actually security requirements. Understanding this reframes them from restrictions to features.
Workers provide no high-resolution timers. When you call Date.now() during request handling, the value is frozen at the start of the request and advances only when your Worker performs I/O; it never ticks forward mid-computation. This prevents timing attacks, including Spectre-class vulnerabilities that rely on measuring execution time to infer secrets from speculative execution. You cannot construct a timer by other means because there's no access to threading, shared memory, or other primitives that could serve as timing oracles.
The single-threaded execution model similarly serves security, not just simplicity. Threading would enable timing channels between concurrent operations. Shared memory would create side-channel opportunities. By eliminating these primitives entirely, Workers close attack vectors that other platforms address through microcode updates, kernel patches, and ongoing vigilance.
Defence through absence. The platform cannot leak timing information through high-resolution timers because high-resolution timers don't exist. It cannot suffer thread-based side channels because threads don't exist. These aren't features too difficult to implement; they are attack surfaces deliberately excluded.
For workloads that genuinely need threading or precise timing, Containers provide a different security model with different trade-offs. The constraints exist in Workers specifically because the multi-tenant isolate model demands them.
Authentication at the edge
Moving authentication to the edge isn't a security decision; it's an economic decision that happens to improve security.
Traditional flow: an invalid request arrives, passes through your load balancer, reaches your application server, triggers a database query to validate credentials, fails validation, and returns an error. You've consumed compute, database connections, and network bandwidth for a request that was never legitimate.
At the edge, that same invalid request hits your Worker, fails JWT validation in microseconds, and returns immediately. No backend resources consumed. No database query executed.
If ten percent of your traffic fails authentication (expired tokens, malformed requests, bot probes), edge validation prevents that ten percent from reaching your backend. For an application serving ten million requests daily, that's one million backend requests eliminated. At typical compute and database connection costs, the savings compound quickly.
Choosing an authentication strategy
Cloudflare supports multiple authentication patterns, each suited to different scenarios. The choice matters more than implementation details.
Cloudflare Access delegates authentication to identity providers you already use: Okta, Azure AD, Google Workspace, or dozens of others. Users authenticate through their familiar login flow; your Worker receives verified identity in request headers. Access handles session management, token refresh, and identity provider integration. Your code simply reads verified identity from headers and proceeds with business logic.
The architectural benefit is separation of concerns. Access handles the complexity of SAML, OIDC, and identity provider quirks. You avoid reimplementing federated authentication, which is notoriously easy to get subtly wrong.
Access excels when you have existing identity infrastructure, when you're protecting internal tools or employee-facing applications, or when compliance requires audit trails tied to corporate identity. Less suited to consumer applications where users don't have accounts in your identity provider.
Custom JWT validation makes sense when building consumer applications with your own user database, when you need fine-grained claims beyond basic identity, or when building API-first services where federated identity doesn't fit. You control token format, claims structure, and validation logic. The trade-off: you handle token issuance, refresh flows, and revocation.
JWT validation at the edge is straightforward; libraries handle the cryptography. The architectural decisions matter more than the code. What claims do you need? How long should tokens live? How do you handle refresh? These questions have the same answers anywhere; edge location doesn't change token design principles.
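As a sketch of the expiry decision alone (signature verification is deliberately elided; a real Worker must verify the signature, typically via a JWT library, before trusting any claim):

```typescript
// Decode the JWT payload segment (base64url) WITHOUT verifying the signature.
// Sketch only: never act on claims from an unverified token in production.
function decodePayload(token: string): { exp?: number } {
  const seg = token.split(".")[1] ?? "";
  const padded = seg + "=".repeat((4 - (seg.length % 4)) % 4);
  return JSON.parse(atob(padded.replace(/-/g, "+").replace(/_/g, "/")));
}

// Reject expired tokens at the edge before any backend work happens.
export function isExpired(token: string, nowMs: number): boolean {
  const { exp } = decodePayload(token);
  // `exp` is seconds since the epoch (RFC 7519); a missing exp is treated as expired.
  return exp === undefined || exp * 1000 <= nowMs;
}
```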
API keys suit service-to-service communication where simplicity outweighs sophistication. Generate a random key, hash it for storage, validate on each request. API keys work well for tracking usage by client, building developer APIs with usage tiers, or when the calling service can securely store credentials.
The decision framework:
| Scenario | Recommended Approach |
|---|---|
| Internal tools, employee access | Cloudflare Access |
| Consumer application, own user database | Custom JWT |
| B2B SaaS with SSO requirements | Cloudflare Access or SAML integration |
| Public API with developer keys | API keys with rate limiting |
| Service-to-service, high trust | API keys or mutual TLS |
| Mixed: employees and external users | Access for employees, JWT for external |
Where session state lives
Where you store sessions depends fundamentally on what "session" means. A session might be a token with metadata (KV), queryable user state (D1), or a coordination point across concurrent requests (Durable Objects).
KV suits sessions that are fundamentally tokens with metadata: user ID, expiration time, perhaps preferences. KV's eventual consistency means a user logging in on one continent might not see that session on another for up to sixty seconds. For most consumer applications, this latency is invisible; users don't switch continents mid-session. KV sessions are cheap, fast to read, and simple.
The limitation is query capability. You cannot ask KV "find all sessions for user X" or "count active sessions." For single-session policies or displaying active sessions to users, KV's key-value model doesn't help.
D1 suits sessions requiring strong consistency or queries across session data. When you need to list all active sessions for a user, invalidate sessions by user ID, or enforce "single session only" policies, D1's relational model makes these operations natural. Writes are strongly consistent, so a session created in one location is immediately valid everywhere, at the cost of slightly higher latency than KV for reads (though read replicas mitigate this).
Durable Objects suit sessions that are coordination points. If the session must track WebSocket connections, maintain presence information, or coordinate across multiple simultaneous requests, Durable Objects provide the single-threaded consistency guarantee that prevents race conditions. A user opening your application in multiple tabs should see consistent state across all of them; the Durable Object ensures requests process sequentially rather than racing.
The test is straightforward: does your "session" need to do anything, or does it just need to exist? Tokens that prove identity belong in KV. Records that track state belong in D1. Actors that coordinate behaviour belong in Durable Objects.
Secrets management
Wrangler secrets are secure; the real question is whether you need capabilities that static secrets cannot provide, such as rotation auditing, dynamic generation, or cross-platform consistency.
Wrangler secrets: simple and usually sufficient
Wrangler secrets are encrypted at rest, injected at runtime, and never appear in logs or configuration files. Your Worker accesses secrets through the environment object, identical to regular environment variables but with encryption and isolation guarantees.
The limitation: Wrangler secrets are static. Changing a secret requires redeployment. For most applications, this is irrelevant; you change secrets rarely, and redeployment takes seconds.
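The Wrangler commands involved are short; `secret put` prompts for the value interactively so it never lands in configuration files or shell history:

```shell
# Set or rotate a secret (prompts for the value; never stored in wrangler.toml)
npx wrangler secret put UPSTREAM_API_KEY

# List secret names for the Worker (values are never shown)
npx wrangler secret list
```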
Most teams should start with Wrangler secrets. Simple, secure, sufficient for the vast majority of applications. The complexity of external secrets managers is justified only when you have specific requirements static secrets can't meet.
When external secrets managers matter
External secrets managers (HashiCorp Vault, AWS Secrets Manager, cloud provider equivalents) add complexity but provide capabilities Wrangler secrets lack.
Zero-downtime rotation matters for PCI compliance where cryptographic keys must rotate on schedule, or high-security environments where credential rotation is measured in hours rather than months.
Centralised audit matters when compliance frameworks require tracking who accessed which secrets when, with tamper-proof logs that cannot be modified by the teams using the secrets.
Dynamic secrets matter when generating short-lived database credentials on demand, reducing blast radius if credentials are compromised.
Cross-platform consistency matters when managing secrets across Cloudflare, AWS, and GCP, and drift between platforms creates operational risk.
| Your situation | Recommendation |
|---|---|
| Secrets change less than monthly | Wrangler secrets |
| PCI/SOC2 requires rotation audit trail | External manager |
| Multi-cloud, need consistency | External manager |
| Startup, moving fast | Wrangler secrets |
| Enterprise, compliance-driven | External manager |
If you need external secrets manager capabilities, the integration is justified. If not, you're adding network calls, caching strategies, fallback handling, and operational complexity for no benefit.
Rotation without downtime
When secrets must rotate without interrupting service, support overlapping validity by keeping both old and new keys valid during the transition. Your validation logic accepts either key, allowing you to set the new key, deploy with both valid, wait for all clients to update, and finally remove the old key.
This dual-key period prevents rotation from causing failures. The pattern is simple but requires designing for it from the start; retrofitting dual-key support onto a system assuming single keys is substantially more work than building it in initially.
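Assuming a shared-secret key, the dual-key window can be sketched as:

```typescript
// Sketch of overlapping-validity rotation: both keys are accepted during the
// transition window, so deploys and client updates can happen in any order.
interface SigningKeys {
  current: string;
  previous?: string; // present only during a rotation window
}

// Production code should use a constant-time comparison for real secrets.
export function isValidKey(presented: string, keys: SigningKeys): boolean {
  return presented === keys.current || presented === keys.previous;
}

// Start rotation: promote the new key, keep the old one valid.
export function rotate(keys: SigningKeys, newKey: string): SigningKeys {
  return { current: newKey, previous: keys.current };
}

// Finish rotation once all clients have updated: drop the old key.
export function completeRotation(keys: SigningKeys): SigningKeys {
  return { current: keys.current };
}
```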
Compliance as architecture
Compliance certifications prove Cloudflare's infrastructure meets standards, but your architecture determines whether your use of that infrastructure meets standards. The platform is eligible; you make it compliant.
Cloudflare maintains SOC 2 Type II, ISO 27001, GDPR compliance documentation, HIPAA eligibility with Business Associate Agreements, and PCI DSS compliance. Your account team can provide attestation documents for procurement.
What matters more: understanding how compliance requirements translate into architectural decisions. Listing certifications doesn't help you build compliant systems.
GDPR: design for deletion
GDPR grants data subjects specific rights. The architectural challenge: these rights assume you can find all data about a person, and most systems aren't designed for that.
The right to deletion sounds simple until you've spread user data across D1 tables, R2 objects, KV entries, and Durable Object storage. GDPR doesn't care about your storage architecture. It cares whether you can find and delete everything.
Design your data model around a consistent user identifier that appears everywhere. When a deletion request arrives, you need a tractable query ("find everything with user_id X"), not an archaeological expedition through heterogeneous storage systems. D1 foreign keys help within a database, but deletion across D1, R2, and KV requires application logic you must design intentionally.
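A sketch of that application logic, with hypothetical store wrappers that each know how to delete by the shared identifier:

```typescript
// Sketch: fan deletion out across stores, assuming every record is keyed by
// the same user_id. Store names and interfaces are hypothetical.
interface UserDataStore {
  name: string;
  deleteByUserId(userId: string): Promise<number>; // returns records removed
}

export async function deleteUserEverywhere(
  userId: string,
  stores: UserDataStore[],
): Promise<Record<string, number>> {
  const report: Record<string, number> = {};
  // Delete from every store and keep a per-store count for the audit trail.
  for (const store of stores) {
    report[store.name] = await store.deleteByUserId(userId);
  }
  return report;
}
```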
The right to access means users can request all data you hold. Same architectural principle: if you cannot enumerate all storage locations for a user, you cannot fulfil access requests accurately.
Data minimisation is an architectural philosophy, not a technical control. Before storing something, ask whether you actually need it. Cloudflare's storage makes keeping data easy and cheap; compliance requires discipline about what to keep. Data you don't store is data you don't need to protect, export, or delete.
Data residency: performance trade-offs
Jurisdictional restrictions that keep data in specific regions carry consequences beyond compliance checkboxes.
Restricting a Durable Object to the EU means that object only runs in EU data centres. Users in Asia experience intercontinental round-trips: hundreds of milliseconds instead of tens. Compliance and performance are often in tension; architecture is deciding where that tension resolves.
When restricting Durable Objects to a jurisdiction, think about who accesses the data. EU data accessed only by EU users? The restriction costs nothing. Global users accessing EU-restricted data? You're adding hundreds of milliseconds to every interaction.
The decision framework for jurisdictional restrictions:
| Data Type | Restriction Typically Needed? | Performance Impact |
|---|---|---|
| PII of EU residents | Yes, for GDPR | Acceptable; users are in EU |
| Financial records for EU entity | Depends on regulations | Acceptable; transactions are not latency-critical |
| Real-time game state | Rarely | Severe; latency matters |
| Global user preferences | Rarely | Unnecessary; preferences are not usually sensitive |
D1 database location works similarly. Creating a database with location constraints ensures the primary resides in the specified region. Read replicas may exist elsewhere for performance, but writes always route to the restricted primary.
Data Localization Suite: enterprise controls
For enterprises with stringent data sovereignty requirements, Cloudflare offers the Data Localization Suite as an Enterprise add-on. The suite provides three independent controls that layer together depending on your compliance needs.
Regional Services controls where TLS termination and request processing occur. When you configure a hostname for EU Regional Services, requests from anywhere in the world route to EU data centres before decryption. A user in Singapore connecting to your EU-regionalised hostname experiences their traffic routed across the globe to Europe before Cloudflare decrypts and processes it. Layer 3/4 DDoS mitigation still happens globally, but all application-layer processing (WAF, Workers, caching) occurs only within your specified region.
Geo Key Manager controls where your TLS private keys are stored. By default, Cloudflare distributes encrypted keys to all data centres for local TLS termination. With Geo Key Manager, keys stay within specified regions. Data centres outside those regions must request session keys from key-holding data centres, adding latency to connection establishment. This latency only affects the initial TLS handshake; subsequent requests and session resumption don't require key server access. The trade-off is meaningful: a user in Asia connecting to a site with EU-only keys pays an intercontinental round-trip on first connection.
Customer Metadata Boundary controls where your logs and analytics are stored. With CMB configured for EU, all request metadata stays in EU infrastructure. The consequence: dashboard analytics and some API endpoints won't work when accessed from outside your configured region unless you enable out-of-region access (which allows viewing but keeps storage localised).
These controls are independent. You might use Regional Services without Geo Key Manager, or Geo Key Manager without Customer Metadata Boundary. Each adds compliance capability at a cost: Regional Services adds latency for non-regional users; Geo Key Manager adds connection establishment latency; Customer Metadata Boundary restricts analytics access. Understand each trade-off before enabling.
Not all Cloudflare products work with all DLS features. HTTP/3 doesn't work with Regional Services. Smart Placement is incompatible. Several dashboard analytics won't populate with EU Customer Metadata Boundary. Verify product compatibility against Cloudflare's current documentation before committing to DLS for compliance purposes.
Regional coverage varies by control. Regional Services supports dozens of regions including individual countries, the EU, FedRAMP-compliant data centres, and even specific US states. Customer Metadata Boundary currently supports only EU and US. Geo Key Manager supports a narrower set including EU, US, and select individual countries. The intersection of all three controls determines what's actually achievable for multi-layered compliance requirements.
HIPAA: eligibility is not compliance
Cloudflare's SOC 2 certification or HIPAA BAA covers their infrastructure, not your application. You remain responsible for access controls, audit logging, data handling, and all other application-level compliance requirements.
HIPAA requires a Business Associate Agreement before storing protected health information. Cloudflare offers BAAs for enterprise accounts. But a BAA doesn't make your application HIPAA-compliant; it makes Cloudflare's infrastructure eligible for use in a compliant application.
Your architecture must still implement access controls, audit logging, encryption requirements, and minimum necessary access principles. Workers provide good building blocks (isolated execution, encrypted storage, no persistent memory), but compliance requires deliberate design, not just platform selection.
The BAA is table stakes, not the finish line. Cloudflare won't be the reason you fail a HIPAA audit. Whether you pass depends on everything you build on top.
PCI: scope reduction is the strategy
The strongest PCI strategy is scope reduction: don't handle cardholder data at all. Use payment processors (Stripe, Adyen, Braintree) that tokenise card data before it reaches your systems. Your Workers process tokens, not card numbers. Tokens aren't cardholder data, so PCI scope shrinks dramatically.
If you must handle card data directly, requirements escalate significantly: encryption in transit, encryption at rest, access logging, key rotation, vulnerability management, penetration testing, and more. Cloudflare's infrastructure provides some of these controls, but you're responsible for application-level implementation.
Most applications should avoid this complexity through tokenisation. The payment processor handles PCI compliance for card handling; you handle PCI compliance for everything else, a much smaller surface area.
What compliance can't easily achieve
Honest assessment of limitations builds credibility. Some compliance requirements are genuinely difficult on Cloudflare's platform.
Cross-region audit logging requires intentional design. Logs from Workers don't automatically aggregate into a central, tamper-proof audit store. For comprehensive audit trails, build the aggregation and ship logs to a centralised system via Logpush or application-level logging.
Fine-grained access control within a Worker is your responsibility. If multiple teams share a Worker and compliance requires tracking which team accessed which data, that's application logic you implement, not platform capability you configure.
Infrastructure you control is required by some compliance frameworks. Edge compute is, by definition, someone else's infrastructure. If compliance mandates that all processing occur on hardware you own, serverless platforms (Cloudflare included) don't fit.
Deployment as hypothesis testing
A deployment without predefined success criteria isn't a gradual rollout; it's a gradual hope that drifts until reality intrudes. Define what "working" means before you deploy, or you'll discover what "broken" means from your users.
Why Cloudflare deployment is different
Rollback on traditional platforms means "redeploy the previous version." Rollback on Cloudflare means "route traffic differently." It's the difference between minutes and milliseconds.
Traditional deployments replace infrastructure. Tear down the old version, stand up the new one. Rollback means reversing that process: redeploying the previous artifact, waiting for instances to become healthy, shifting traffic.
Cloudflare deployments add code versions. Your previous version isn't torn down when you deploy; it continues running, serving its percentage of traffic. Both versions exist simultaneously. Gradual rollout means changing routing percentages, not provisioning infrastructure. Rollback means routing back to the version already running.
This architectural difference makes aggressive gradual rollouts viable. Deploy at 1% knowing that if something breaks, you're seconds from recovery, not minutes. The safety margin is larger because the recovery time is shorter.
Defining success criteria
Before deploying, establish measurable criteria. Write these down before you deploy, not while staring at ambiguous metrics trying to decide whether the deployment succeeded.
Error rate threshold: "Error rate must not increase by more than 0.1% compared to the previous version." Be specific about what counts as an error: 5xx responses only, or also 4xx that indicate bugs? Over what time window: instantaneous, or averaged over five minutes?
Latency threshold: "P99 latency must remain below 200ms." Choose the percentile that matters for your application. P50 hides tail latency issues that affect your most complex requests or unluckiest users. P99 or P99.9 reveals them.
Business metrics: "Conversion rate must not decrease." "API success rate must stay above 99.9%." Connect deployment health to outcomes that matter. A deployment passing technical metrics but tanking business metrics isn't successful.
No new error types: "No error messages appearing that were not present in the previous version." New errors often indicate new bugs. This criterion catches issues that might not increase error rate significantly but indicate problems.
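These criteria can be encoded as data and checked mechanically; the thresholds below mirror the examples in this section and are illustrative, not recommendations:

```typescript
// Sketch: evaluate a candidate version's metrics against the baseline.
// Returns the list of violated criteria; empty means the rollout can proceed.
interface Metrics {
  errorRate: number; // fraction of failed requests, e.g. 0.002
  p99LatencyMs: number;
  errorTypes: Set<string>;
}

export function deploymentHealthy(baseline: Metrics, candidate: Metrics): string[] {
  const failures: string[] = [];
  if (candidate.errorRate > baseline.errorRate + 0.001) {
    failures.push("error rate increased by more than 0.1%");
  }
  if (candidate.p99LatencyMs > 200) {
    failures.push("P99 latency above 200ms");
  }
  for (const type of candidate.errorTypes) {
    if (!baseline.errorTypes.has(type)) failures.push(`new error type: ${type}`);
  }
  return failures;
}
```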
Gradual rollout percentages
Start at a percentage that gives signal without excessive risk.
One percent for high-risk changes: database schema migrations, authentication changes, anything touching payment flows. Verify nothing catastrophic happens before broader exposure.
Ten percent for moderate-risk changes: new features, significant refactoring, dependency updates. Enough traffic to surface issues, limited blast radius if something goes wrong.
Fifty percent for low-risk changes: bug fixes with good test coverage, minor feature adjustments, configuration changes. Mostly verifying that nothing unexpected happens.
The goal is statistical significance. At 1% of traffic, you need enough requests to distinguish signal from noise. High-traffic application: 1% might be thousands of requests per minute, plenty of signal. Low-traffic application: 1% might be ten requests per hour, insufficient to detect problems. Adjust percentages to your traffic volume.
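One rough way to size the wait, offered as a heuristic assumption rather than a platform rule: keep the canary at a given percentage until it would have produced a minimum number of errors at the baseline error rate.

```typescript
// Heuristic sketch: how many canary requests are needed before "no elevated
// errors" is meaningful, given the baseline error rate. The minErrors default
// is an assumption, not a statistical guarantee.
export function requestsNeeded(baselineErrorRate: number, minErrors = 10): number {
  return Math.ceil(minErrors / baselineErrorRate);
}

// At a 0.1% baseline error rate, a canary slice needs roughly 10,000 requests
// before the absence of errors tells you much.
```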
Bake time
How long at each percentage? Long enough for problems to surface.
Some issues appear immediately: crashes, syntax errors, obvious logic bugs. These surface in minutes.
Other issues take time: race conditions under load, edge cases triggered by specific user behaviour, performance degradation that only manifests under sustained traffic. These might take hours.
A reasonable starting point: one hour at each percentage stage, or until you've processed at least ten thousand requests at that percentage, whichever comes later. Adjust based on traffic patterns and risk tolerance.
Duration matters more for time-dependent bugs. If your application has daily patterns (peak traffic at certain hours, batch jobs that run at midnight), you might need a full cycle to verify the deployment handles all scenarios.
Rollback triggers
Define when to rollback before you need to decide under pressure. Predefined triggers prevent the "let's wait and see" instinct that lets small problems become outages.
Automatic triggers: error rate exceeds threshold by 2x, any 5xx errors in the first hundred requests, P99 latency doubles. Implement these in your deployment pipeline or monitoring system.
Manual triggers: customer complaints about the new version, unexpected behaviour in monitoring dashboards, gut feeling that something is wrong. Experienced engineers develop intuition for deployments that don't feel right. Trust that intuition; rollback is cheap.
When triggers fire, rollback immediately. Both versions are already running; you're just changing routing percentages. No reason to delay.
Environment progression
Maintain separate environments: development, staging, production. Deploy to staging first. Run automated tests against staging. Verify manually. Only then deploy to production with gradual rollout.
The binding model makes environment separation clean. Staging and production Workers are identical artifacts with different bindings. Staging points to staging databases, staging KV namespaces, staging Durable Object namespaces. Production points to production resources.
What should differ between environments? Data sources must differ; you don't want staging tests modifying production data. Code paths should not differ. Avoid conditional logic based on environment that means staging doesn't test production behaviour.
Feature flags provide safer conditional behaviour than environment checks. A feature flag tested in staging behaves identically in production; an environment check by definition behaves differently. Need to disable a feature in production that's enabled in staging? Use a feature flag, not environment detection.
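A sketch of the distinction, with a minimal flag source (the lookup mechanism, whether an env var or a KV namespace, is an assumption):

```typescript
// Sketch: one code path in every environment, gated by flag data rather than
// an environment check. Staging and production run identical logic; only the
// flag values behind `FlagSource` differ.
interface FlagSource {
  get(name: string): string | undefined;
}

export function flagEnabled(flags: FlagSource, name: string): boolean {
  // Unknown or non-"true" values read as disabled.
  return flags.get(name) === "true";
}
```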
Automated deployment with Workers builds
Manual deployments work for side projects. Production systems need automation: code changes trigger builds, tests run, deployments happen consistently regardless of which engineer pushed the commit.
Workers Builds is Cloudflare's native CI/CD system. Connect your GitHub or GitLab repository, configure build settings, and every push to your production branch automatically deploys. No external CI system required, no deployment scripts to maintain, no credentials to manage.
Connecting a repository
For a new Worker, create it through the dashboard and select "Import a repository" during setup. For existing Workers, navigate to Settings → Builds → Connect, then authorise Cloudflare to access your Git provider.
Once connected, configure three essential settings:
Production branch determines which branch triggers production deployments. Pushes run your build command followed by wrangler deploy. Most teams use main or master.
Build command runs before deployment. If your Worker requires compilation, bundling, or code generation, specify that command here. A TypeScript Worker might use npm run build; a Worker with no build step can leave this empty.
Deploy command defaults to npx wrangler deploy. Customise for specific Wrangler flags or environment targeting.
Build command: npm run build
Deploy command: npx wrangler deploy --env production
The Wrangler version comes from your package.json. Pin it to an exact version (no ^ range) to ensure consistent builds:
{
"devDependencies": {
"wrangler": "3.91.0"
}
}
Preview URLs for every branch
Production deployments matter, but so does reviewing changes before they reach production. Enable non-production branch builds to generate preview URLs for every branch and pull request.
When enabled, pushes to non-production branches trigger builds but execute a different deploy command. By default: npx wrangler versions upload. This creates a new version without promoting to production, accessible at a stable preview URL based on branch name.
Each pull request receives two URLs: a commit-specific URL that changes with each push, and a branch-specific URL that always points to the latest commit. Share the branch URL with reviewers; it updates automatically as you push fixes.
The preview URL pattern: <branch-name>-<worker-name>.<subdomain>.workers.dev. A branch named feature/new-checkout on a Worker named api produces a preview at feature-new-checkout-api.your-subdomain.workers.dev.
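That naming convention can be sketched as a small helper; this is an illustration of the pattern described above, and the exact normalisation Cloudflare applies to branch names may differ in edge cases:

```typescript
// Derive a preview hostname from branch name, Worker name, and
// workers.dev subdomain, per the pattern
// <branch-name>-<worker-name>.<subdomain>.workers.dev.
function previewHostname(
  branch: string,
  worker: string,
  subdomain: string,
): string {
  // Lowercase, collapse non-alphanumeric runs to "-", trim edges.
  const slug = branch
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return `${slug}-${worker}.${subdomain}.workers.dev`;
}
```

For the example above, previewHostname("feature/new-checkout", "api", "your-subdomain") yields feature-new-checkout-api.your-subdomain.workers.dev.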
Monorepo support
Organisations with multiple Workers in a single repository need fine-grained control over which Workers rebuild when code changes.
Connect each Worker to the same repository but configure different root directories. A Worker at services/api/ only sees changes within that directory; a Worker at services/auth/ only sees its own changes. Set the root directory in Settings → Builds → Root directory.
Build watch paths provide additional control. By default, any file change triggers a build. Configure include and exclude patterns to specify which paths trigger rebuilds:
Include paths: services/api/**, shared/utils/**
Exclude paths: **/*.md, **/*.test.ts
A change to services/api/handler.ts triggers the API Worker build. A change to services/auth/config.ts doesn't. A change to shared/utils/helpers.ts triggers both if both include it in their watch paths. For monorepos with many Workers, this prevents unnecessary builds and keeps deployment times predictable.
When to use external CI/CD instead
Workers Builds covers most deployment needs but doesn't fit every situation.
Self-hosted Git isn't supported. If your repository lives on self-hosted GitHub Enterprise or GitLab, use external CI/CD with Wrangler commands.
Complex testing requirements may need more than Workers Builds provides. If deployment depends on integration tests against staging databases, security scans, or approval workflows, orchestrate those in GitHub Actions or your existing CI platform, then call wrangler deploy at the end.
Multi-service coordination where Worker deployment must happen atomically with database migrations or other service updates requires external orchestration. Workers Builds deploys Workers; it doesn't coordinate with external systems.
The decision: if your deployment is "build, maybe run tests, deploy Worker," Workers Builds handles it with minimal configuration. Complex orchestration? External CI/CD provides the flexibility you need.
Private network connectivity
Some backends aren't publicly accessible. Your application needs data from a database inside a VPC, an internal API behind a firewall, a legacy system that cannot be exposed to the internet. Workers VPC provides secure connectivity through Cloudflare Tunnel with a binding-based access model that prevents SSRF vulnerabilities.
Cloudflare Tunnel: the foundation
Cloudflare Tunnel creates an outbound-only connection from your private network to Cloudflare's edge. Install cloudflared on a server inside your network; it establishes a connection to Cloudflare; Workers route requests through that tunnel to your internal resources.
The security model is compelling: no inbound ports opened, no public IP addresses exposed, no firewall rules to maintain. The tunnel initiates from inside your network. If the tunnel process stops, connectivity stops; no dormant attack surface waiting to be exploited.
For production deployments, run at least two cloudflared replicas on separate hosts for redundancy, each with at least 4 GB of RAM and 4 CPU cores.
VPC services: the binding layer
Workers VPC is currently in beta. Features and APIs may change before general availability. While in beta, Workers VPC is available for free to all Workers plans.
VPC Services build on Tunnels to provide a binding-based access model. Rather than giving Workers broad access to everything reachable through a tunnel, you create VPC Services representing specific endpoints, then bind those services to Workers.
[[vpc_services]]
binding = "PRIVATE_API"
service_id = "e6a0817c-79c5-40ca-9776-a1c019defe70"
Your Worker accesses the private service through the binding:
const response = await env.PRIVATE_API.fetch(
"http://internal-api.company.local/users"
);
This architecture provides SSRF protection by design. A VPC Service binding only reaches the specific configured endpoint. An attacker who compromises your Worker's request handling cannot pivot to arbitrary internal hosts; the binding restricts access to the configured service. This mirrors how bindings to D1, R2, and other Cloudflare resources prevent SSRF attacks.
VPC Services also enable role-based access control. The Connectivity Directory Admin role creates and manages services; the Connectivity Directory Bind role allows developers to use existing services without the ability to create new ones. This separation matters in larger organisations where infrastructure teams control what's exposed while application teams consume those services.
The limit is 1,000 VPC Services per account. Each service represents one endpoint in your private network; you can bind multiple Workers to the same service.
The hybrid edge pattern
Workers become particularly valuable as a global edge layer connecting multiple backend clouds. Your organisation might have databases in AWS, APIs in GCP, and legacy systems on-premises. Workers provide a unified global entry point routing to the appropriate backend based on request type.
A single edge layer means consistent authentication, logging, and rate limiting across all backends. Users experience consistent latency regardless of which backend serves their request; the edge is always close, even if backends aren't.
The architectural insight: Workers aren't just a compute platform; they're a global routing layer that happens to execute code. Use that capability to unify fragmented backends into a coherent experience.
Security practices at the edge
A few security practices deserve explicit mention, not because they're Cloudflare-specific, but because the edge model affects implementation.
Rate limiting
Distributed rate limiting is hard. Traditional approaches use centralised counters (Redis, typically), but centralised anything adds latency at the edge.
KV-based rate limiting works for approximate limits where a small percentage of over-limit requests (typically up to 5-10% during burst periods) is acceptable. Eventual consistency means racing requests might all succeed before the counter propagates. For limits around 100 requests per minute per user, this is usually fine. Abuse prevention doesn't require perfect accuracy; it requires making abuse expensive.
Durable Objects-based rate limiting provides exact limits with strong consistency. Route rate limit checks to a Durable Object keyed by user ID; single-threaded execution guarantees accurate counting. Cost: a Durable Object lookup per request, still fast, but not free.
Choose based on precision requirements. Approximate limits protect against abuse while being cheap and fast. Exact limits enforce contractual quotas or prevent resource exhaustion where every request over the limit costs real money.
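The counting logic a rate-limiting Durable Object would run can be sketched as a fixed-window counter. This is a plain-TypeScript illustration of the idea, not the Durable Objects API; inside a Durable Object, single-threaded execution makes the check-and-increment atomic:

```typescript
// Fixed-window rate limiter: at most `limit` requests per `windowMs`.
class FixedWindowLimiter {
  private windowStart = 0;
  private count = 0;

  constructor(
    private readonly limit: number,
    private readonly windowMs: number,
  ) {}

  // Returns true if the request is allowed, false if over the limit.
  allow(nowMs: number): boolean {
    if (nowMs - this.windowStart >= this.windowMs) {
      // A new window has started: reset the counter.
      this.windowStart = nowMs;
      this.count = 0;
    }
    if (this.count >= this.limit) return false;
    this.count += 1;
    return true;
  }
}
```

Keyed by user ID, one such object per user gives exact limits; the KV-based variant runs the same logic against an eventually consistent counter and accepts the resulting slack.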
Input validation
Validate inputs at the edge, before they propagate to backends. Invalid inputs rejected at the edge never consume backend resources.
Use structured validation libraries rather than ad-hoc checks. TypeScript's type system helps at compile time; runtime validation with libraries like Zod catches what types can't express. The combination catches most input errors before they cause problems.
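A minimal runtime check, hand-rolled here for illustration; a library like Zod expresses the same shape declaratively. The payload shape is hypothetical:

```typescript
// Hypothetical request payload for illustration.
interface CreateUserInput {
  email: string;
  name: string;
}

// Runtime type guard: TypeScript types vanish at runtime, so the
// shape must be re-checked on every request before it reaches a
// backend.
function isCreateUserInput(value: unknown): value is CreateUserInput {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.email === "string" &&
    v.email.includes("@") &&
    typeof v.name === "string" &&
    v.name.length > 0
  );
}
```

A handler parses the body with await request.json(), rejects with a 400 when the guard fails, and only then forwards the now-typed payload to the backend.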
Dependency minimisation
Workers have a smaller attack surface than traditional servers: no filesystem, no persistent processes, limited APIs. Maintain that advantage by minimising dependencies.
Every npm package is code you haven't reviewed executing in your security context. Supply chain attacks happen with uncomfortable regularity. Prefer standard APIs over utility libraries. When dependencies are necessary, prefer well-maintained packages with security track records. Audit regularly.
The absence of a filesystem and of process spawning isn't just a constraint; it's a security feature. You cannot accidentally expose sensitive files because no files exist, and you cannot accidentally spawn subprocesses because the capability isn't available. The limitations that sometimes frustrate developers also limit attackers.
What comes next
This chapter completes Part VI: Production Operations. With cost management (Chapter 19), observability (Chapter 20), and security fundamentals (this chapter), you have the operational foundation for production applications.
Part VII covers architecture patterns: reusable designs for common problems, multi-tenant architectures, and honest assessment of when Cloudflare isn't the right choice.