Chapter 26: Building on Cloudflare
The mental models that matter and the path forward
Twenty-five chapters of architecture, trade-offs, and patterns distil down to principles you can carry forward, whether you're deploying your first Worker today or leading a platform migration next quarter.
The mental models that matter
Three conceptual shifts separate teams that thrive on Cloudflare from those that struggle against it. Understand and internalise these:
Think horizontal, not vertical
Cloudflare's platform assumes horizontal scaling from the start. Durable Objects work best as one object per entity, not one object for all entities. D1 databases are designed for many smaller databases, not one monolithic instance. KV stores billions of independent keys rather than hierarchical collections.
This inverts traditional thinking. On hyperscalers, you scale up until you can't, then reluctantly shard. On Cloudflare, you start sharded. A multi-tenant application creates a database per tenant. A rate limiter creates an object per user. A session store distributes across KV keys. The platform's constraints push you toward architectures that scale indefinitely.
If you find yourself asking "how do I make this bigger," you're asking the wrong question. Ask instead: "how do I partition this naturally?"
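To make the partition-per-entity pattern concrete, here is a minimal sketch of a Worker routing each user to their own rate-limiter Durable Object. The binding name RATE_LIMITER and the header used to identify the user are illustrative assumptions, not platform requirements:

```typescript
export interface Env {
  RATE_LIMITER: DurableObjectNamespace;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // The same user always maps to the same object, so each user's
    // rate-limit state lives in exactly one place.
    const userId = request.headers.get("X-User-Id") ?? "anonymous";
    const id = env.RATE_LIMITER.idFromName(userId);
    const stub = env.RATE_LIMITER.get(id);

    // Delegate the rate-limit decision to that user's object.
    return stub.fetch(request);
  },
};
```

The partitioning question answers itself in the first line of the handler: the natural partition key is the user.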
Think global, not regional
Code deploys everywhere. You don't select regions; you restrict them when regulations require it. A Worker written in London runs in São Paulo, Sydney, and San Francisco within seconds of deployment. No multi-region configuration, no replication policies, no failover planning.
Data has locality because physics demands it, but the mental model differs from hyperscaler regions. Durable Objects place themselves near their first requesters. D1 databases can have read replicas at the edge while writes route to the primary. KV caches globally with eventual consistency. The platform handles distribution; you handle the logic.
This eliminates categories of complexity that enterprises spend months managing on hyperscalers. It also eliminates certain controls those enterprises may require. Know which category you're in.
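Restriction, when you do need it, is explicit rather than architectural. A minimal sketch, assuming a Durable Object binding named USER_DATA, of pinning an object's placement to the EU jurisdiction:

```typescript
export interface Env {
  USER_DATA: DurableObjectNamespace;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // By default there is no region selection at all; the platform places
    // the object near its first requester. The jurisdiction below instead
    // restricts placement: the object is created and stays within the EU.
    const eu = env.USER_DATA.jurisdiction("eu");
    const id = eu.idFromName("user-123");
    return eu.get(id).fetch(request);
  },
};
```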
Think primitives, not services
Cloudflare provides building blocks rather than pre-assembled solutions: Workers give you compute (you build your API), D1 gives you SQLite (you design your schema), and Durable Objects give you coordination (you implement your patterns).
This trades initial convenience for lasting flexibility. A hyperscaler managed service might solve eighty percent of your problem instantly but fight you on the remaining twenty percent, whereas Cloudflare primitives require more work upfront but compose without constraint.
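A sketch of what that composition looks like in practice: primitives arrive as typed bindings on a single environment, and the assembly is yours. The binding names (SESSIONS, DB, ROOMS) and the queries are illustrative:

```typescript
export interface Env {
  SESSIONS: KVNamespace;         // KV: eventually consistent global storage
  DB: D1Database;                // D1: SQLite whose schema you design
  ROOMS: DurableObjectNamespace; // Durable Objects: coordination you implement
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Each primitive is a building block; the API you expose is your own.
    const session = await env.SESSIONS.get("session:abc123");

    const { results } = await env.DB
      .prepare("SELECT id, name FROM users WHERE active = ?")
      .bind(1)
      .all();

    // Coordination goes through an object you wrote, not a managed service.
    const room = env.ROOMS.get(env.ROOMS.idFromName("room-42"));
    const presence = await room.fetch("https://room/presence");

    return Response.json({
      session,
      users: results,
      presence: await presence.json(),
    });
  },
};
```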
The truths worth remembering
Constraints are architectural, not temporary. The memory limit exists because thousands of isolates share a process. The database size limits exist because D1 databases are Durable Objects. HTTP-only inbound traffic exists because Cloudflare is a proxy network. These constraints enable the benefits; waiting for them to disappear misunderstands the platform.
Durable Objects have no equivalent elsewhere. AWS, Azure, and Google Cloud offer nothing comparable; single-threaded actors with automatic global routing, durable storage, and coordination guarantees exist only on Cloudflare. This makes them simultaneously the platform's most powerful feature and its greatest lock-in risk: problems that elsewhere require distributed locking, consensus protocols, and careful race-condition handling reduce to straightforward single-threaded code, and that code has no direct equivalent to migrate to.
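The object side of that claim is short enough to show. A minimal sketch of a counter as a single-threaded actor; because the platform delivers events to one object serially, the read-modify-write below cannot race with itself:

```typescript
export class Counter {
  constructor(private state: DurableObjectState) {}

  async fetch(_request: Request): Promise<Response> {
    // One object, one thread: no distributed lock, no compare-and-swap,
    // no retry loop around the increment.
    const count = ((await this.state.storage.get<number>("count")) ?? 0) + 1;
    await this.state.storage.put("count", count);
    return new Response(String(count));
  }
}
```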
CPU time and wall time are different. Workers charge for compute, not for waiting, so a function waiting two seconds for an external API while computing for twenty milliseconds costs the same as one completing in twenty milliseconds with no waiting. This inverts traditional serverless economics and rewards architectures that orchestrate and delegate rather than compute everything locally.
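A sketch of the architecture this rewards: a handler that fans out to slow upstream APIs. The URLs are placeholders; the point is that awaited network time is wall time, not billable CPU time:

```typescript
export default {
  async fetch(_request: Request): Promise<Response> {
    // Both upstream calls may take seconds of wall time; neither accrues
    // CPU time while the Worker is simply awaiting the network.
    const [user, orders] = await Promise.all([
      fetch("https://api.example.com/users/42").then((r) => r.json()),
      fetch("https://api.example.com/orders?user=42").then((r) => r.json()),
    ]);

    // The only CPU-billed work is this small merge.
    return Response.json({ user, orders });
  },
};
```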
Boring architecture endures. The right pattern isn't the cleverest one. It's the one that makes your system predictable, understandable, and debuggable by someone at 3am who's never seen the code.
AI at the edge means AI on the right continent, not AI in every city. Workers AI eliminates the operational complexity of GPU provisioning and model deployment, but don't expect edge deployment to make inference instant. Model inference takes the time it takes; the edge reduces network latency to the inference endpoint, not the inference itself. Choose Workers AI for operational simplicity and unified billing, not for latency miracles.
Design for the platform or choose a different platform. If Cloudflare's primitives map naturally to your domain, you'll build faster and run cheaper than anywhere else. If your design requires frequent workarounds, consider whether a different platform fits better.
The decision framework
When evaluating whether Cloudflare fits a workload, work through three questions in this order:
First, check hard limits. Does your workload exceed memory constraints per request? Require inbound TCP or UDP? Need cross-partition transactions with strong consistency? Demand a single massive datastore rather than partitioned ones? Failing any of these doesn't mean Cloudflare is bad; it means the workload doesn't fit.
Second, assess architectural alignment. Does your workload benefit from edge execution? Does horizontal scaling match your data model? Does eventual consistency work for cross-partition operations? Misalignment creates friction that compounds over time.
Third, consider your team. Has the team built on Cloudflare before? Strong JavaScript or TypeScript experience? Comfortable with primitives rather than managed services? Platform fit includes human fit.
What Cloudflare does best
Request-response workloads with global users benefit immediately. APIs, web applications, and webhook processors see latency improvements without multi-region complexity, and cold starts measured in milliseconds rather than seconds eliminate an entire category of optimisation other platforms require.
Coordination-heavy workloads find a natural home. Real-time collaboration, game state, rate limiting, and session management become dramatically simpler with Durable Objects, with problems requiring distributed locking and consensus protocols elsewhere reducing to single-threaded actors with automatic routing. When coordination extends to audio and video, Cloudflare Realtime provides WebRTC infrastructure with the same global-first philosophy (anycast routing to the nearest SFU, no regional configuration, latency advantages that come from architecture rather than manual optimisation).
I/O-heavy workloads benefit economically. API aggregation, external service orchestration, and webhook processing pay only for compute time, not wall time. Workloads spending ninety percent of their time waiting pay for ten percent.
Latency-sensitive global applications see immediate improvements. When users span continents and milliseconds matter, running code in hundreds of locations changes what's possible.
AI-augmented applications find a natural home when AI is a feature rather than the product. Adding intelligence to existing applications, such as search that understands intent, support that suggests answers, or content that classifies itself, becomes straightforward with Workers AI and Vectorize. The same horizontal scaling philosophy applies: one vector index per tenant, embeddings at the edge, inference as a binding rather than an external service. For applications where "good enough" AI delivered instantly beats "perfect" AI delivered slowly, the platform excels.
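A sketch of that shape, assuming an AI binding and a Vectorize binding named SEARCH_INDEX. The model name is one of Workers AI's embedding models, and the per-tenant partitioning is approximated here with a namespace per tenant within one index; all of these are illustrative choices:

```typescript
export interface Env {
  AI: Ai;                       // Workers AI binding
  SEARCH_INDEX: VectorizeIndex; // one logical index, partitioned by tenant
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { query, tenantId } = await request.json<{
      query: string;
      tenantId: string;
    }>();

    // Inference is a binding call, not an external HTTP round trip.
    const embedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", {
      text: [query],
    });

    // Scope the search to the tenant's partition of the index.
    const matches = await env.SEARCH_INDEX.query(embedding.data[0], {
      topK: 5,
      namespace: tenantId,
    });

    return Response.json(matches);
  },
};
```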
What Cloudflare doesn't do
Memory-intensive workloads exceeding isolate limits cannot run in Workers; complex data transformations, large file processing, and certain ML inference tasks need Containers or external compute.
Long-running compute cannot complete in a single Worker invocation; video transcoding, complex simulations, and large-scale data processing need Workflows, Containers, or different architectures entirely.
Non-HTTP protocols cannot reach your code directly, with one exception: Cloudflare Realtime handles WebRTC for audio and video. Game servers needing custom UDP, proprietary protocols, and IoT devices without HTTP bridges still need traditional cloud infrastructure.
Traditional database patterns expecting single large instances with complex cross-partition queries fight the horizontal model; if your architecture assumes vertical scaling, you'll need to rethink it.
Frontier AI requirements demanding GPT-4 or Claude-level reasoning won't be satisfied by Workers AI's open-source models. If your users compare output quality to ChatGPT and the gap matters, use AI Gateway to route to external providers. Cloudflare's AI stack suits applications where AI enhances functionality; it doesn't compete with dedicated AI infrastructure for applications where AI quality is the primary differentiator.
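Routing through AI Gateway is, in essence, a base-URL change. A sketch with ACCOUNT_ID, GATEWAY_ID, and the model name as placeholders, and the provider key assumed to be configured as a Worker secret:

```typescript
export interface Env {
  OPENAI_API_KEY: string; // assumed to be set as a Worker secret
}

// AI Gateway sits in front of the provider; only the base URL changes.
const GATEWAY_BASE =
  "https://gateway.ai.cloudflare.com/v1/ACCOUNT_ID/GATEWAY_ID/openai";

export default {
  async fetch(_request: Request, env: Env): Promise<Response> {
    const upstream = await fetch(`${GATEWAY_BASE}/chat/completions`, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${env.OPENAI_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: "gpt-4o",
        messages: [{ role: "user", content: "Summarise this ticket." }],
      }),
    });

    // Pass the provider's response straight through; the gateway applies
    // your logging, caching, and rate-limiting configuration on the way.
    return new Response(upstream.body, upstream);
  },
};
```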
The path forward
If evaluating Cloudflare, prototype before committing. Build something small that exercises the primitives your production workload needs, hit constraints deliberately, measure latency from your users' locations, and calculate costs at expected scale. A week of prototyping reveals more than months of analysis.
If migrating, migrate incrementally. Move one service, validate it works, then move the next. Maintain rollback capability until confidence is established through production operation. Measure everything: latency, costs, and operational complexity, before and after. A migration that can't demonstrate improvement isn't a success.
If building new systems, start with Cloudflare's defaults and find reasons not to use them. Global deployment, instant scaling, negligible cold starts, integrated security: you begin with capabilities that require significant effort elsewhere. You'll discover quickly whether the platform's constraints affect your workload.
If Cloudflare doesn't fit, that's a valid conclusion. The platform solves certain problems better than anything else and solves other problems poorly. Choosing the right tool for your workload matters more than platform loyalty.
The opportunity
Cloudflare's Developer Platform represents something genuinely different in cloud computing: not incrementally better but architecturally distinct. The isolate model, global-first deployment, coordination primitives, and horizontal scaling philosophy are fundamental design choices that change what's possible and what's practical.
The platform is younger than the hyperscalers. Gaps exist and edges remain rough. Documentation sometimes lags capability. But the foundation is sound, the trajectory is clear, and the problems it solves well, it solves better than alternatives.
Building on Cloudflare means accepting different trade-offs than AWS, Azure, or GCP. For workloads that fit, those trade-offs unlock capabilities that would be expensive, complex, or impossible elsewhere. For workloads that don't fit, forcing them onto this platform creates more friction than choosing one better suited to the task.
This book's goal has been helping you tell the difference before you commit. You now understand the mental models that matter: horizontal over vertical, global over regional, primitives over managed services. You know when coordination problems call for Durable Objects and when simple storage suffices. You know when real-time media needs Realtime's WebRTC infrastructure and when WebSockets through Durable Objects handle the job. You know when Workers AI fits and when to route to external providers through AI Gateway. You understand why the platform's constraints exist and how to design within them rather than against them.
What remains is the work itself: the satisfying process of turning these patterns into running systems. You have the architectural foundations. You understand the trade-offs. You know what fits and what doesn't.
Now build.