Chapter 25: Migration Playbooks
How do I move existing workloads to Cloudflare, and when should I?
The previous chapter helped determine whether Cloudflare fits your workload. This chapter addresses the practical question: how do you get there?
Migration is not a goal; faster applications, lower costs, and simpler operations are goals. Migration is a means and an expensive one. Before planning how to migrate, establish why and whether expected benefits justify the certain costs. This chapter provides playbooks for common migration scenarios, assuming you've already decided migration is worthwhile.
The migration principles
Successful migrations share common principles regardless of source or target. These aren't best practices to consider. They're requirements that distinguish success from cautionary tales.
Coexistence before cutover
Run old and new systems in parallel, routing some traffic to the new system while the old remains operational. Validate behaviour, compare results, and build confidence. Only after the new system proves itself do you cut over completely.
New systems always misbehave in ways you didn't anticipate: edge cases in data, traffic patterns you didn't test, integrations that assumed behaviours your new system doesn't provide. Coexistence gives you time to discover these problems without user-facing outages.
The cost of running two systems is real but bounded; the cost of a failed atomic cutover is unbounded. Coexistence is insurance worth paying for.
Incremental over atomic
Migrate one service, one data store, one capability at a time; each increment is a chance to learn, adjust, and verify. Atomic migrations compound risks and create uncertainty about what failed if something goes wrong.
Incrementalism also manages organisational risk: a team migrating one service learns lessons they apply to the next, whereas a team migrating everything at once learns lessons they can only apply to the post-mortem.
The objection is usually "but the systems are interconnected, we can't migrate piece by piece," which is sometimes true but more often a failure of imagination. Most systems can be decomposed; the question is whether you've tried. Hybrid architectures are valid intermediate states, not failures to complete migration.
Reversibility as requirement
Every migration step should be reversible. Keep S3 data until R2 proves reliable if you migrate from S3 to R2; maintain the ability to route traffic back if you migrate compute from Lambda to Workers; keep old DNS records available for quick revert if you migrate DNS.
Irreversible migrations are bets; reversible migrations are experiments. You want experiments. The confidence you have before migration is always less justified than you think, and the problems you'll discover are always different from those anticipated.
The cost of reversibility is maintaining old infrastructure during migration, the same cost as coexistence and equally worth paying. Set explicit timelines ("We'll maintain Lambda functions for 30 days after traffic reaches zero") to prevent indefinite accumulation while preserving rollback capability during the risk window.
Observation before and during
Establish baseline metrics before migration (latency distributions, error rates, costs, user experience measures), then monitor the same metrics during and after. Without baselines, you can't know if migration improved anything; without continuous monitoring, you can't catch regressions until users report them.
The metrics that matter depend on why you're migrating. For latency, measure p50, p95, and p99 from representative geographic locations; for cost, track resource consumption at sufficient granularity to compare; for operational simplicity, measure time-to-deploy, incident frequency, and mean time to recovery.
"We migrated successfully" is not a useful claim without metrics. "We reduced p99 latency from 450ms to 120ms while reducing compute costs by 35%" is useful and requires observation; plan for it.
Zero-downtime migration architecture
The principles above describe what to do. This section describes how to do it without users ever noticing. Zero-downtime migration isn't a luxury reserved for organisations with dedicated platform teams; Cloudflare's own infrastructure provides the building blocks that make it the default approach rather than an aspirational goal.
The core insight is that Cloudflare sits between your users and your infrastructure by design. Once traffic flows through Cloudflare's network, you control where it goes, how much goes where, and how quickly you can change your mind. This position makes Cloudflare uniquely suited to orchestrating its own adoption: the same network that will eventually run your application can manage the transition to get there.
The strangler fig pattern
The strangler fig is a tree that grows around its host, gradually replacing it while the host continues to function. The software equivalent, coined by Martin Fowler, describes incrementally replacing a legacy system by routing requests through a new layer that delegates to either new or old implementations.
Workers are natural strangler figs. Deploy a Worker in front of your existing infrastructure, initially proxying every request unchanged to your hyperscaler backend. A simple proxy Worker adds under 5ms of total latency to the request path and gives you a control point for everything that follows.
```typescript
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);

    // Migrated endpoints use new implementation
    if (isMigrated(url.pathname)) {
      return handleLocally(request, env);
    }

    // Everything else proxies to legacy infrastructure
    return fetch(env.LEGACY_ORIGIN + url.pathname + url.search, {
      method: request.method,
      headers: request.headers,
      body: request.body,
    });
  },
};
```
From this starting point, migration becomes a series of small decisions about which endpoints to handle locally. Each decision is independent and reversible. The isMigrated() function can check a KV namespace, allowing you to toggle endpoints without redeploying. Migrate /api/users on Tuesday, observe for a week, then migrate /api/orders the following Tuesday. At no point does any user experience an outage.
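The KV-backed isMigrated() check can be sketched as follows. The MIGRATED_ROUTES namespace name is hypothetical, and the extra parameter stands in for the env binding; only the slice of the KV interface actually used is modelled here, so the logic runs anywhere.

```typescript
// Minimal model of the Workers KV read interface used by this sketch.
interface KVLike {
  get(key: string): Promise<string | null>;
}

// One key per endpoint prefix, e.g. "route:/api/users" -> "1".
// Toggling a route on or off is a KV write, not a redeploy.
async function isMigrated(pathname: string, routes: KVLike): Promise<boolean> {
  const [, first, second] = pathname.split("/");
  const prefix = second ? `/${first}/${second}` : `/${first ?? ""}`;
  return (await routes.get(`route:${prefix}`)) === "1";
}
```

In a Worker, the second argument would be the KV binding from env; the prefix scheme (two path segments) is an assumption to adapt to your routing.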
The pattern works because Cloudflare's network already terminates TLS and handles DNS for your domain. Adding a Worker to the request path is a configuration change, not an infrastructure migration. Your users' experience is unchanged; only the routing logic behind Cloudflare's edge has shifted.
Deploy a Worker that proxies everything to your existing backend before migrating anything. This validates the request path, establishes baseline metrics through Cloudflare's analytics, and gives you the control point every subsequent step depends on.
Gradual deployments and traffic splitting
Once your Worker handles some endpoints natively, you need confidence before routing all traffic through new code paths. Cloudflare's gradual deployments solve this precisely.
Gradual deployments split traffic between Worker versions by percentage. Deploy a new version handling /api/users natively, then route 0.5% of traffic to it while 99.5% continues using the proxy-to-legacy version. Monitor error rates, latency distributions, and response correctness at each stage. If anything looks wrong, roll back instantly; "instantly" here means seconds, not the minutes or hours DNS propagation would require.
Cloudflare's own internal practice follows a staged pattern: 0.05% to 0.5% to 3% to 10% to 25% to 50% to 75% to 100%, with soak time between each stage. Each stage surfaces a different class of problem. At 0.05%, you catch crashes and obvious errors. At 3%, you catch performance regressions visible in p95 latency. At 25%, you catch capacity issues and rate-limiting interactions. At 50%, you catch edge cases in long-tail traffic patterns.
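The staged dial is managed by the platform, but the underlying mechanic is worth understanding: a deterministic percentage split assigns each stable key (a session ID, say) to a fixed bucket, so the same user sees the same version throughout a stage. This is an illustration of the idea, not Cloudflare's implementation.

```typescript
// FNV-1a 32-bit hash: a simple, stable string hash for bucketing.
function bucketOf(key: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    h ^= key.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h % 10000;
}

// percentage 0.5 means 0.5% of keys: 50 of the 10,000 buckets.
function routeToNewVersion(key: string, percentage: number): boolean {
  return bucketOf(key) < percentage * 100;
}
```

Because the assignment is deterministic, raising the dial from 3% to 10% only moves new buckets to the new version; keys already routed there stay there.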
The critical advantage over DNS-based traffic splitting is speed. DNS changes propagate over minutes to hours depending on TTL settings and resolver caching behaviour. Gradual deployments take effect within seconds because the split happens at Cloudflare's edge, not in the DNS layer. Rolling back a DNS change means waiting for caches to expire; rolling back a gradual deployment means clicking a button.
Workers retain up to 100 previous versions for rollback. During migration, this means you can revert to any recent known-good state, not just the immediately preceding version. If version 47 introduced a subtle data corruption bug discovered only at version 52, you can roll back to version 46 while you investigate.
Connecting to existing infrastructure
Zero-downtime migration requires your new infrastructure to communicate with your old infrastructure throughout the transition. You cannot migrate everything simultaneously, so Workers need to reach backends still running on hyperscalers. Cloudflare provides two mechanisms for this, each suited to different scenarios.
Cloudflare Tunnel creates an outbound-only encrypted connection from your existing infrastructure to Cloudflare's edge. Install a lightweight connector (cloudflared) in your AWS VPC, Azure VNet, or GCP VPC, and it establishes a persistent tunnel without requiring any inbound firewall rules. Workers route requests through the tunnel to reach internal services, databases, or APIs that aren't publicly accessible.
The operational benefit during migration is substantial. You need not expose internal services to the public internet, reconfigure security groups, or punch holes in firewalls. The tunnel connector runs alongside your existing infrastructure and can be removed when migration completes. Multiple connectors provide high availability, and failover is automatic.
Workers VPC Services build on Tunnel to provide binding-level access to specific internal services. Rather than giving a Worker access to your entire private network (which creates the SSRF risks Chapter 1 discussed), VPC Services let you bind a Worker to a specific internal endpoint. The Worker accesses env.LEGACY_API.fetch() and the request routes through the tunnel to exactly that service, nothing else.
This distinction matters for security during migration. A Worker handling user requests shouldn't be able to reach your internal monitoring infrastructure or administrative APIs just because a tunnel exists. VPC Services enforce least-privilege access at the binding level, limiting each Worker to the specific backends it needs.
For database connectivity during migration, Hyperdrive provides the bridge. Configure Hyperdrive to connect to your existing PostgreSQL or MySQL database, whether it runs on RDS, Cloud SQL, Azure Database, or self-hosted infrastructure. Workers access the database through a Hyperdrive binding with connection pooling, prepared statement caching, and global connection reuse handled automatically. Your database stays where it is; your compute migrates around it.
Smart Placement during transition
When your Workers connect to backends still running on hyperscalers, latency depends on the distance between Worker execution and backend location. A Worker running in Sydney that queries an RDS instance in us-east-1 pays for a trans-Pacific round trip on every database call.
Smart Placement addresses this automatically. Enable it on Workers that make backend calls, and Cloudflare analyses traffic patterns to determine optimal execution location. Smart Placement runs your Worker near the location handling your heaviest backend traffic, reducing database round trips to single-digit milliseconds while accepting slightly higher latency for users far from that region.
For known, fixed backend locations, explicit placement hints provide immediate optimisation without waiting for Smart Placement to learn patterns. Specify "aws:us-east-1" in your placement configuration, and your Worker executes in the Cloudflare data centre closest to that AWS region from the first request.
As migration progresses and backends move to Cloudflare (D1, R2, Durable Objects), Smart Placement's optimal location shifts. Eventually, when all backends are on Cloudflare, you disable Smart Placement and return to the default model: execution at the edge closest to each user. The transition happens naturally as you migrate backends; no manual placement reconfiguration required.
The phased approach
Combining these capabilities produces a migration architecture where zero downtime is the natural outcome rather than an engineering achievement. The phases overlap in practice, but the logical sequence provides structure.
Phase one: establish the edge. Put Cloudflare in front of your existing infrastructure. This might mean proxying through Workers, or simply enabling Cloudflare as a DNS proxy for your domain. At this stage, nothing changes for your users. You gain Cloudflare's DDoS protection, TLS management, and analytics as immediate benefits while establishing the control point for subsequent phases. If your domain already uses Cloudflare for CDN or security, this phase is already complete.
Phase two: migrate storage incrementally. Enable Sippy on an R2 bucket pointed at your S3 source. From this moment, every object request that reaches R2 either serves from R2 (if already migrated) or transparently fetches from S3, copies to R2, and serves. No application changes required; the R2 bucket behaves identically to S3 from the client's perspective, but objects accumulate in R2 over time. For the long tail of rarely-accessed objects, run Super Slurper to complete the migration in bulk. At no point does any request fail because an object hasn't migrated yet.
Phase three: migrate compute gradually. Deploy Workers handling migrated endpoints while proxying everything else to legacy infrastructure. Use gradual deployments to shift traffic incrementally: 0.5%, 3%, 10%, 25%, 50%, 100%. Enable Smart Placement so Workers execute near your legacy databases during the transition. Each endpoint migrates independently on its own timeline.
Phase four: migrate data selectively. With compute running on Workers and connecting to your existing database via Hyperdrive, evaluate whether data migration is necessary at all. Hyperdrive with an external PostgreSQL database is a valid permanent architecture. If you choose to migrate to D1, run dual writes during the transition: write to both old and new databases, read from the old, then switch reads to the new after validation. Chapter 11 covers D1's horizontal patterns for structuring the target schema.
Phase five: decommission. Remove legacy infrastructure after a validation period with full traffic on Cloudflare. Keep rollback capability (dormant Lambda functions, S3 buckets in read-only mode) for 30 days beyond the point where you're confident. The cost of maintaining idle infrastructure for a month is trivial compared to the cost of discovering you need it and finding it's gone.
Each phase is independently valuable. An organisation that completes only phase one still benefits from improved security and observability. One that reaches phase two saves on egress costs immediately. Phase three delivers latency improvements. You need not commit to the full sequence; each phase stands alone and can be the permanent stopping point if further migration doesn't justify the investment.
Comparing migration approaches across platforms
Migration tooling reveals a platform's architectural assumptions. Cloudflare's approach differs from hyperscaler migration paths in ways that reflect the platform's edge-native design.
Hyperscaler migrations typically involve lift-and-shift tooling designed to move workloads between similar environments. AWS Migration Hub, Azure Migrate, and Google Cloud's migration tools assume you're moving VMs, containers, or databases from one data centre to another. The workload structure stays the same; only the infrastructure underneath changes. This works well for homogeneous migrations (on-premise to cloud, cloud to cloud) but provides little help when the target platform has a fundamentally different execution model.
Cloudflare provides no equivalent lift-and-shift tooling because the concept doesn't apply. You cannot lift a Lambda function and shift it to Workers without understanding how the execution model differs. Instead, Cloudflare provides infrastructure-level migration tools (Sippy, Super Slurper for storage; Hyperdrive for database connectivity; Tunnel for network bridging) that handle the infrastructure layer while you handle the architectural translation.
This difference is honest about the work involved. A Lambda-to-Workers migration is not a configuration change; it's an architectural decision with implications for memory management, execution time, global distribution, and state coordination. Tools that pretend otherwise create false confidence. Cloudflare's tooling handles what can be automated (copying objects, pooling connections, routing traffic) and leaves the architectural decisions where they belong: with the engineering team.
The tradeoff is real. Hyperscaler-to-hyperscaler migrations can be faster for workloads that translate directly. Cloudflare migrations require more thought but produce architectures that exploit the platform's strengths rather than merely reproducing what existed before.
The playbooks that follow apply this zero-downtime architecture to specific migration scenarios. Each playbook's migration process is designed to be zero-downtime by default when you follow the steps in sequence. Where a product offers a distinctive zero-downtime mechanism beyond the general patterns described above (Sippy's transparent fallback for object storage, the dual-write pattern for databases, organic cache population for Redis, and AI Gateway's fallback routing for inference), the playbook covers that technique in detail.
Playbook: S3 to R2
Object storage migration is conceptually simple (copy files from one bucket to another), but it becomes operationally complex at scale with billions of objects, petabytes of data, and applications expecting zero downtime. R2 provides two migration tools for different scenarios.
Super Slurper: complete migration
Super Slurper copies all objects from a source bucket to R2 in a single migration job. Use it when you want everything migrated, can tolerate egress costs, and need migration completed within a predictable timeframe.
Configure Super Slurper through the Cloudflare dashboard: specify source credentials, source bucket, destination R2 bucket, and optional path filters. It handles parallelisation, retries, and progress reporting. Migration that would take weeks with sequential copying completes in hours or days.
Super Slurper supports S3-compatible sources beyond AWS: Google Cloud Storage, MinIO, Backblaze B2, Wasabi, DigitalOcean Spaces.
The process:
- Create destination R2 bucket
- Configure Super Slurper with source credentials and bucket details
- Optionally configure path filters for specific prefixes
- Start migration and monitor progress
- Validate migrated objects match source (spot-check counts and checksums)
- Update application configuration to point to R2
- Maintain source bucket during validation period (weeks, not days)
- Delete source bucket after confidence is established
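The spot-check in the validation step can be sketched as a comparison of the two buckets' listings. ObjectRecord and findMismatches are illustrative names; real records would come from the source and destination list APIs. Note that multipart-uploaded objects can legitimately carry different ETags after migration (see the compatibility notes later in this playbook), so mismatches warrant review rather than automatic failure.

```typescript
// Minimal record shape for a listed object: key plus checksum-style ETag.
interface ObjectRecord {
  key: string;
  etag: string;
}

// Returns a human-readable problem list: missing keys and ETag mismatches.
function findMismatches(source: ObjectRecord[], dest: ObjectRecord[]): string[] {
  const destByKey = new Map<string, string>(
    dest.map((o): [string, string] => [o.key, o.etag]),
  );
  const problems: string[] = [];
  for (const obj of source) {
    const destEtag = destByKey.get(obj.key);
    if (destEtag === undefined) problems.push(`missing: ${obj.key}`);
    else if (destEtag !== obj.etag) problems.push(`etag mismatch: ${obj.key}`);
  }
  return problems;
}
```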
Calculate egress costs before committing to a strategy. AWS charges $0.09/GB for S3 egress; migrating 10 TB costs $900 in egress fees alone, potentially more than several months of R2 storage. For massive buckets, Sippy's on-demand migration may prove more economical than Super Slurper's complete copy.
Sippy: incremental migration
Sippy migrates objects on demand. Configure R2 as a caching layer in front of your source bucket. When an object is requested from R2 and doesn't exist, Sippy fetches it from the source, stores it in R2, and returns it. Frequently accessed objects migrate first; rarely accessed objects migrate only when needed.
The benefit is economics. You pay egress only for objects actually requested, and frequently requested objects incur egress only once. For buckets where 10% of objects receive 90% of requests, Sippy can reduce migration egress costs by an order of magnitude.
The tradeoff is timeline. Migration completes only when all objects have been requested, which might be never. Objects never accessed remain in the source bucket indefinitely, continuing to incur storage costs.
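The economics can be made concrete with a back-of-envelope comparison using the $0.09/GB S3 egress rate quoted earlier. The hot fraction, the share of data that is ever actually requested, is the assumption that decides between the two tools.

```typescript
// S3 egress list price used in the chapter's examples ($/GB).
const S3_EGRESS_PER_GB = 0.09;

// Super Slurper: every byte crosses out of S3 once.
function fullCopyEgressCost(totalGb: number): number {
  return totalGb * S3_EGRESS_PER_GB;
}

// Sippy: only requested objects incur egress, and each only once.
// Objects never requested never migrate and never incur egress.
function sippyEgressCost(totalGb: number, hotFraction: number): number {
  return totalGb * hotFraction * S3_EGRESS_PER_GB;
}
```

For a 10,000 GB bucket where 10% of data is ever requested, the full copy costs roughly $900 in egress while Sippy costs roughly $90, the order-of-magnitude reduction described above.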
Use Sippy when:
- Egress costs for complete migration are prohibitive
- You want to serve frequently accessed content from R2 immediately
- Complete migration isn't required
- Most objects are rarely requested
The process:
- Create destination R2 bucket
- Configure Sippy with source bucket credentials
- Update application to request from R2 (Sippy handles fallback transparently)
- Monitor migration progress as objects copy on access
- Optionally run Super Slurper with "skip existing" to complete remaining objects
Combined strategy: Enable Sippy first to immediately serve frequently accessed objects from R2. After access patterns stabilise, run Super Slurper to migrate the long tail. This minimises egress costs while ensuring complete migration.
After migration: compatibility notes
R2 is S3-compatible, but compatibility isn't identity. Test thoroughly. Common differences:
ETags may differ. R2's ETag calculation matches S3 for single-part uploads, but Sippy may migrate multipart objects with different part sizes, producing different ETags. Applications validating ETags across migration will see mismatches.
Some S3 features aren't supported. S3 Object Lock, S3 Select, Requester Pays, and certain storage classes don't exist in R2. Check the compatibility matrix; missing features require application changes.
Endpoint URLs change. S3 endpoints follow bucket-name.s3.region.amazonaws.com. R2 follows account-id.r2.cloudflarestorage.com or custom domains. Update application configuration and SDK initialisations.
IAM policies don't transfer. R2 uses API tokens with specific permissions. Recreate your access control model using R2's authentication mechanisms.
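The endpoint change is usually a one-line difference at client construction. A sketch, using the R2 endpoint format quoted above; the account ID is a placeholder, and the commented S3Client configuration reflects R2's documented S3-compatible usage but should be checked against current documentation.

```typescript
// Build the S3-compatible R2 endpoint for an account.
function r2Endpoint(accountId: string): string {
  return `https://${accountId}.r2.cloudflarestorage.com`;
}

// With the AWS SDK v3, the migration is confined to client construction
// (shown as a comment to keep this sketch dependency-free):
//
//   const s3 = new S3Client({
//     region: "auto",
//     endpoint: r2Endpoint(ACCOUNT_ID),
//     credentials: { accessKeyId, secretAccessKey },
//   });
//
// All subsequent SDK calls (GetObject, PutObject, ListObjectsV2) are unchanged.
```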
Achieving zero downtime
Zero-downtime object storage migration is possible because requests for unmigrated objects can transparently fall back to the source bucket. The caller never sees a missing object; it sees either an object already in R2 or one fetched on demand from S3. This transparent fallback requires strict sequencing: Sippy must be active on your R2 bucket before any application configuration changes, because once the application points at R2, every request goes there. With Sippy active, unmigrated objects are fetched from S3, copied to R2, and returned in a single operation.
With Sippy active, update your application's S3 endpoint to point at R2. Since R2 supports the S3 API, most applications require only an endpoint URL and credential change; the SDK calls themselves remain identical. From this moment, all reads serve from R2 (directly or via Sippy fallback), and all writes go to R2. The S3 bucket becomes read-only from your application's perspective, serving only as Sippy's fallback source for objects not yet copied.
Run Super Slurper in "skip existing" mode to migrate remaining objects in bulk. Once Super Slurper completes and you've validated object counts and checksums, disable Sippy. Your R2 bucket now contains everything and operates independently. Keep the S3 bucket in read-only mode for 30 days as insurance; the storage cost is minimal compared to the value of having a rollback option.
At no point in this sequence does a request fail because an object hasn't migrated. Sippy handles the transition transparently; Super Slurper completes it efficiently; disabling Sippy finalises it cleanly. The only user-visible change is improved latency as objects serve from Cloudflare's edge rather than a single S3 region.
Playbook: Lambda to Workers
Moving serverless compute from Lambda to Workers requires understanding model differences, not just translating syntax.
The model differences
Geographic distribution: Lambda functions are regional (deploy to us-east-1, execute in us-east-1; global distribution requires deploying to multiple regions and managing routing through Route 53, CloudFront, or API Gateway). Workers are global by default; deploy once and code executes at whichever of Cloudflare's 300+ locations is closest to each user.
Resource limits: Lambda supports up to 10 GB memory and 15-minute execution. Workers have 128 MB memory and 30 seconds of CPU time for HTTP handlers by default, configurable up to 5 minutes. Cron Triggers with hourly or longer intervals get 15 minutes of CPU time. Queue consumers get up to 15 minutes of wall time but the same 5-minute CPU time ceiling as other Workers. This is the most common migration blocker.
Cold starts: Lambda cold starts range from 100ms to several seconds depending on runtime and VPC attachment. An entire ecosystem exists to mitigate them. Workers cold starts are under 5ms, typically imperceptible. Cold start mitigation strategies don't transfer because the problem doesn't exist.
Networking: Lambda connects to VPCs natively through ENI attachment. Workers connect to private resources through Cloudflare Tunnel or VPC Services integration.
Pricing: Lambda charges for GB-seconds (memory multiplied by wall-clock time). Workers charge for CPU time, with I/O wait free. Lambda charges while your function waits for a database response; Workers don't. I/O-heavy workloads typically cost less on Workers.
Assessment questions
Before migrating any Lambda function, answer these questions:
Does it fit Workers' constraints? If the function uses more than 128 MB memory, Workers isn't the right target without architectural changes. If it runs longer than 30 seconds for HTTP requests, you need Workflows, Queues, or a different approach.
What does it access? If the function accesses VPC resources (RDS, ElastiCache, internal services), how will Workers reach them? Cloudflare Tunnel works but adds latency and complexity. If using DynamoDB, will you migrate to D1, use Hyperdrive with an external database, or accept cross-cloud latency?
How is it triggered? HTTP triggers translate directly to Workers fetch handlers. SQS triggers require migrating to Cloudflare Queues or maintaining SQS with a polling Worker. EventBridge, Step Functions, and other AWS-specific triggers require alternative architectures.
What libraries does it use? Some npm packages assume Node.js APIs Workers don't provide: filesystem access, child processes, native modules. Test dependencies in the Workers environment before committing.
What's the quantified benefit? "Lower latency" isn't a benefit; "reducing p95 latency from 180ms to 45ms for European users" is. Without baseline metrics and continuous measurement, you can't know whether migration succeeded. If you can't articulate the expected improvement in numbers, question whether you should migrate at all.
Migration process
The process below follows the zero-downtime architecture described earlier in this chapter. Each step maintains a valid handler for every request; at no point does migrating a Lambda function require downtime or an atomic cutover.
For functions that pass assessment:
1. Translate the handler. Lambda handlers receive event and context objects with AWS-specific structure. Workers handlers receive standard Request objects and env for bindings. The translation is mechanical but requires attention to input parsing and output formatting.
```javascript
// Lambda
export const handler = async (event, context) => {
  const body = JSON.parse(event.body);
  const userId = event.pathParameters.userId;
  // ... business logic ...
  return {
    statusCode: 200,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(result),
  };
};
```

```javascript
// Workers
export default {
  async fetch(request, env) {
    const body = await request.json();
    const url = new URL(request.url);
    const userId = url.pathname.split('/')[2]; // or use a router
    // ... business logic (largely unchanged) ...
    return Response.json(result);
  },
};
```
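The "or use a router" aside deserves a sketch: replacing Lambda's event.pathParameters means extracting named parameters yourself or adopting a router library. The ":name" pattern syntax below is the common convention, shown as an illustration rather than any specific library's API.

```typescript
// Match a pathname against a ":param" pattern, returning named parameters
// on success or null on no match.
function matchRoute(
  pattern: string,
  pathname: string,
): Record<string, string> | null {
  const pat = pattern.split("/").filter(Boolean);
  const path = pathname.split("/").filter(Boolean);
  if (pat.length !== path.length) return null;
  const params: Record<string, string> = {};
  for (let i = 0; i < pat.length; i++) {
    if (pat[i].startsWith(":")) params[pat[i].slice(1)] = path[i];
    else if (pat[i] !== path[i]) return null;
  }
  return params;
}
```

In the handler above, `matchRoute("/api/users/:userId", url.pathname)` would recover the userId that Lambda's pathParameters provided.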
2. Replace AWS SDK calls. DynamoDB becomes D1 or Hyperdrive. S3 becomes R2 bindings. SQS becomes Queue bindings. Secrets Manager becomes Wrangler secrets. Each requires understanding Cloudflare's semantics; similar but not identical.
3. Configure bindings. Resources IAM-attached in Lambda are binding-attached in Workers. Add appropriate bindings to wrangler.toml.
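A sketch of what the resulting configuration might look like. All names and IDs are placeholders, and the exact keys should be verified against current Wrangler documentation.

```toml
# Illustrative wrangler.toml fragment; names and IDs are placeholders.
name = "migrated-api"
main = "src/index.ts"
compatibility_date = "2025-01-01"

# Replaces DynamoDB access
[[d1_databases]]
binding = "DB"
database_name = "app-db"
database_id = "<database-id>"

# Replaces S3 access
[[r2_buckets]]
binding = "ASSETS"
bucket_name = "app-assets"

# Replaces SQS producers
[[queues.producers]]
binding = "JOBS"
queue = "background-jobs"

# Bridge to the legacy PostgreSQL database during the transition
[[hyperdrive]]
binding = "HYPERDRIVE"
id = "<hyperdrive-config-id>"
```

Each binding appears on env in the Worker (env.DB, env.ASSETS, env.JOBS), replacing the IAM-plus-SDK pattern from Lambda.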
4. Test locally. Use wrangler dev to verify behaviour against realistic inputs including edge cases from production logs.
5. Deploy to preview. Test with real Cloudflare infrastructure but without production traffic. Verify integrations, latency, and error handling.
6. Route incrementally. Use gradual deployments to shift traffic to the new Worker version. Start at 0.5%, monitor for a day, increase to 3%, then 10%, 25%, 50%, 100%. Each stage should soak long enough to encounter representative traffic patterns; a few hours catches obvious failures, but a full day catches time-zone-dependent patterns and scheduled jobs. Gradual deployments operate at Cloudflare's edge with instant effect, avoiding the propagation delays that DNS-based routing introduces. For Lambda functions that still call AWS services during the transition, Workers VPC Services provide secure connectivity to your VPC without exposing services publicly, and Smart Placement ensures Workers execute near the AWS region hosting your backends, minimising round-trip latency until those backends migrate to Cloudflare.
7. Monitor and compare. Compare latency distributions, error rates, and costs between versions. Geographic distribution changes latency patterns in expected ways: a Lambda function in us-east-1 serving a European user at 180ms might become a Worker serving from Frankfurt at 15ms, but the same user's requests to a backend in us-east-1 might show similar total latency until data migration completes. Understand which differences are improvements, which are expected consequences of the new architecture, and which indicate bugs.
8. Complete cutover. When metrics confirm the Worker performs as well or better across all traffic stages, route 100% to Workers. Keep Lambda functions deployed but not receiving traffic for rollback.
9. Decommission. After two to four weeks of stable operation, delete Lambda functions. The cost of maintaining dormant functions is minimal compared to having no rollback option.
What the economics look like after migration
Lambda charges for GB-seconds, which is memory allocation multiplied by wall-clock execution time. A function configured at 1 GB running for 500ms costs 0.5 GB-seconds regardless of whether the function spent 490ms waiting for a database response. Workers charge only for CPU milliseconds, the time your code actually executes on a processor rather than wall time. The same function, rewritten as a Worker, might consume 10ms of CPU time while waiting 490ms for a database response, and you pay for 10ms instead of 500ms.
For I/O-heavy workloads (API orchestration, webhook processing, database-backed APIs), this difference compounds. A Lambda function making five external API calls averaging 200ms each has a wall time of at least one second. At 1 GB memory, that's 1 GB-second per invocation. The equivalent Worker, consuming perhaps 15ms of CPU time across those calls, costs a fraction of that for the compute component alone.
Lambda's API Gateway adds $3.50 per million requests for REST APIs ($1.00 for HTTP APIs) on top of compute costs. Workers include HTTP handling in the base pricing; no separate API Gateway charge applies. For an API handling 50 million monthly requests through a REST API, this single line item accounts for $175 per month that disappears after migration.
The gap narrows for compute-intensive workloads. A function spending most of its execution time on computation rather than I/O pays for that CPU time on both platforms. Workers' per-CPU-millisecond pricing exceeds Lambda's GB-second pricing for sustained computation, particularly when Lambda functions are configured with higher memory allocations that include proportionally more CPU power. Model your specific workload rather than assuming Workers are always cheaper; the pricing advantage is architectural, not universal.
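To make the comparison concrete, here is a sketch of the cost model using list prices at the time of writing (verify against the current Lambda and Workers pricing pages; included allowances and the Workers subscription fee are ignored for simplicity):

```typescript
// List prices at the time of writing -- verify before relying on them:
// Lambda: $0.0000166667 per GB-second + $0.20 per million requests.
// Workers (Standard): $0.02 per million CPU-ms + $0.30 per million requests.
const LAMBDA_PER_GB_SECOND = 0.0000166667;
const LAMBDA_PER_MILLION_REQUESTS = 0.2;
const WORKERS_PER_MILLION_CPU_MS = 0.02;
const WORKERS_PER_MILLION_REQUESTS = 0.3;

// Lambda bills memory allocation multiplied by wall-clock time
function lambdaCost(requests: number, wallMs: number, memoryGb: number): number {
  const gbSeconds = requests * (wallMs / 1000) * memoryGb;
  return gbSeconds * LAMBDA_PER_GB_SECOND
    + (requests / 1e6) * LAMBDA_PER_MILLION_REQUESTS;
}

// Workers bill only CPU time, not time spent waiting on I/O
function workersCost(requests: number, cpuMs: number): number {
  return (requests * cpuMs / 1e6) * WORKERS_PER_MILLION_CPU_MS
    + (requests / 1e6) * WORKERS_PER_MILLION_REQUESTS;
}

// The I/O-bound example from above: 1 GB Lambda with 500ms wall time,
// versus a Worker spending 10ms of CPU across the same work.
const monthlyRequests = 10_000_000;
const lambda = lambdaCost(monthlyRequests, 500, 1); // dominated by GB-seconds
const worker = workersCost(monthlyRequests, 10);    // dominated by requests
```

Running this for ten million monthly requests shows the Lambda bill dominated by GB-seconds for time the function spent waiting, while the Workers bill reflects only the CPU actually consumed.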
Playbook: Vercel/Netlify to Workers
Vercel and Netlify provide managed deployment for frameworks like Next.js, SvelteKit, and Astro. Migration to Cloudflare means moving from managed platform to managed infrastructure with more control, more configuration, and different trade-offs.
What changes
Deployment model: Vercel and Netlify infer configuration from your framework (push to git and deployment happens), whereas Cloudflare requires explicit wrangler.toml configuration. The magic is replaced with configuration files you control.
Environment handling: Vercel's environment variables are managed through their dashboard. Cloudflare uses Wrangler secrets for sensitive values and wrangler.toml vars for non-sensitive configuration.
Serverless functions: Vercel Functions and Netlify Functions have their own conventions: file-based routing, specific handler signatures. These become Workers with explicit routing. Translation is straightforward but requires touching every function.
Edge functions: Vercel and Netlify Edge Functions are conceptually similar to Workers but have API differences. The concepts transfer; the syntax doesn't.
Data services: Vercel KV maps to Workers KV. Vercel Postgres migrates to D1 or connects through Hyperdrive. Blobs map to R2. Plan these as part of the overall effort.
The process
1. Choose your framework's Cloudflare path:
- Next.js: `@opennext/cloudflare` or `@cloudflare/next-on-pages` (check current documentation)
- React Router v7 (Remix): Official Cloudflare Vite plugin
- SvelteKit: `@sveltejs/adapter-cloudflare`
- Astro: `@astrojs/cloudflare`
- Nuxt: Nitro `cloudflare` preset
2. Create Cloudflare resources. Before migrating code, create needed infrastructure: KV namespaces, D1 databases or Hyperdrive connections, R2 buckets, Queues.
3. Configure wrangler.toml. Define bindings, build commands, and output configuration. This replaces Vercel's implicit configuration.
```toml
name = "my-app"
compatibility_date = "2025-01-15"
compatibility_flags = ["nodejs_compat"]

[build]
command = "npm run build"

[[kv_namespaces]]
binding = "CACHE"
id = "your-kv-namespace-id"

[[d1_databases]]
binding = "DB"
database_name = "my-app-db"
database_id = "your-d1-database-id"
```
4. Adapt environment variables. Move secrets using wrangler secret put. Move non-secret configuration to wrangler.toml vars or .dev.vars.
5. Adapt API routes and middleware. Vercel API routes become Workers handlers. Request/response model is similar but not identical.
6. Test thoroughly. Deploy to a preview URL. Test every route, API endpoint, authentication flow, edge case. Framework adapters handle most translation, but subtle differences surface in testing.
7. Configure DNS. If your domain is on Cloudflare, configure Workers routes. Otherwise, transfer the domain or serve from a workers.dev subdomain initially.
8. Compare metrics. Run Cloudflare alongside Vercel. Compare Core Web Vitals, time to first byte, error rates. Investigate differences before completing migration.
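As an illustration of the API route translation in step 5, here is a hypothetical Vercel `/api/hello` route rewritten as an explicit route inside a Worker fetch handler. The route table and handler shapes are illustrative, not a fixed convention; framework adapters generate equivalent plumbing for you:

```typescript
// File-based routing (api/hello.ts on Vercel) becomes an explicit route
// table inside a single Worker. Route paths and handlers are illustrative.
type Handler = (request: Request) => Response | Promise<Response>;

const routes: Record<string, Handler> = {
  "/api/hello": () =>
    new Response(JSON.stringify({ message: "hello" }), {
      headers: { "content-type": "application/json" },
    }),
};

// Pure route lookup, testable without a running Worker.
function matchRoute(pathname: string): Handler | undefined {
  return routes[pathname];
}

export default {
  async fetch(request: Request): Promise<Response> {
    const { pathname } = new URL(request.url);
    const handler = matchRoute(pathname);
    return handler
      ? handler(request)
      : new Response("Not found", { status: 404 });
  },
};
```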
Reasons not to migrate
Migration from Vercel or Netlify isn't always beneficial. Consider staying if:
Framework integration is tighter elsewhere. Some Vercel features (automatic ISR, preview deployments with database branching, integrated analytics) don't have direct Cloudflare equivalents. If these are central to your workflow, migration removes capabilities you depend on.
Managed simplicity has value. Vercel and Netlify excel at removing decisions. Cloudflare provides more control but requires more configuration. If your team values the managed experience and the current platform works well, migration adds operational burden without proportionate benefit.
Cost comparison favours the current platform. Don't assume Cloudflare is cheaper. Compare actual costs at your traffic levels, including engineering time for migration.
Migration effort exceeds benefit timeline. If migration takes three months and you're uncertain about the product's future beyond six months, payback may not justify investment.
Migrate when Cloudflare provides specific advantages you need: lower latency through global distribution, Durable Objects for coordination, Workers AI for inference, or demonstrated cost savings at your scale. Don't migrate because it feels like progress.
Playbook: Containers to Cloudflare
Organisations running containers on ECS, Kubernetes, or other orchestration platforms have two migration targets: Workers serve workloads that fit the isolate model, while Cloudflare Containers serve workloads that genuinely need container capabilities.
Deciding the target
Most containerised workloads can run as Workers. Containers often exist because the team knew containers, not because the workload required them.
Migrate to Workers when:
- Memory usage stays under 128 MB
- Execution completes within time limits
- Dependencies run in V8 (JavaScript, TypeScript, WebAssembly)
- No filesystem persistence required between requests
Migrate to Cloudflare Containers when:
- Memory requirements exceed 128 MB
- The workload requires filesystem access
- Dependencies include native binaries that won't compile to WebAssembly
- Runtime isn't JavaScript (Python ML models, Go services, etc.)
Workers should be the default assumption. Containers are for workloads that don't fit.
Container migration process
For workloads targeting Cloudflare Containers:
1. Optimise for cold starts. Container cold starts are 2-10 seconds; faster than traditional cloud containers but slower than Workers' sub-5ms. Minimise image size with slim base images and multi-stage builds. Defer initialisation where possible.
2. Configure the container in wrangler.toml:
```toml
[[containers]]
class_name = "MyContainer"
image = "./Dockerfile"
max_instances = 10
```
3. Create the routing Worker. Workers route requests to Containers, deciding which requests need container processing and which can be handled at the edge.
4. Handle the cold start experience. Unlike Workers, container cold starts are user-visible. Address through loading indicators, optimistic UI updates, or architectural changes that pre-warm containers for predictable traffic.
5. Deploy and test with realistic traffic patterns, paying attention to cold start frequency and duration.
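The routing decision in step 3 can be kept as a small, testable function. A minimal sketch, assuming an illustrative rule (paths under `/render` need the container) and leaving the actual forwarding to the container binding configured above, whose exact API should be checked against current Containers documentation:

```typescript
// Classify each request: serve what fits the isolate model at the edge,
// forward only genuine container work. The /render rule is illustrative.
function needsContainer(pathname: string): boolean {
  return pathname.startsWith("/render");
}

export default {
  async fetch(request: Request): Promise<Response> {
    const { pathname } = new URL(request.url);
    if (needsContainer(pathname)) {
      // In a real deployment, forward to the container instance via the
      // binding created by the [[containers]] configuration.
      return new Response("container work (sketch placeholder)", { status: 501 });
    }
    return new Response("handled at the edge");
  },
};
```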
Containers on Cloudflare serve workloads needing container capabilities deployed globally with Cloudflare's network benefits. They're not a general container platform.
Playbook: database migration
Compute migration often triggers database migration because Workers connecting to RDS in us-east-1 inherit that latency regardless of where the Worker runs. But database migration is not always the answer, and when it is, the target depends on your data model and access patterns.
The fundamental decision: migrate or connect
Before planning how to migrate data, decide whether to migrate it at all. Consider these factors:
Use Hyperdrive (keep your existing database) when:
- Your PostgreSQL or MySQL database works well and you've invested in its schema, indexes, and operational tooling
- Data volume exceeds D1's 10 GB limit per database and sharding adds unwanted complexity
- Your application relies on PostgreSQL-specific features: advanced JSON operators, full-text search with ranking, PostGIS spatial queries, stored procedures, or triggers
- Regulatory requirements mandate specific database platforms or locations
- The database serves multiple applications, not just the workload you're migrating
Hyperdrive provides connection pooling, prepared statement caching, and global connection reuse. Workers connect through Hyperdrive; the database stays where it is. This is a valid permanent architecture, not a migration stepping stone.
Migrate to D1 when:
- You want edge-native data access without cross-region latency
- Your data model fits SQLite's capabilities
- Total data per logical database stays under 10 GB
- You're building new applications or rebuilding existing ones
- Horizontal partitioning (database per tenant, per region, or per entity type) matches your domain
Migrate to KV when:
- Data is key-value shaped with simple access patterns
- Eventual consistency (up to 60 seconds) is acceptable
- You're replacing DynamoDB, Redis, or similar stores used primarily for caching or simple lookups
Schema translation: PostgreSQL and MySQL to D1
D1 runs SQLite. Most SQL translates directly, but PostgreSQL and MySQL features that don't exist in SQLite require application changes.
Data types that need translation:
| PostgreSQL/MySQL | SQLite/D1 | Notes |
|---|---|---|
| SERIAL, AUTO_INCREMENT | INTEGER PRIMARY KEY | SQLite auto-increments INTEGER PRIMARY KEY automatically |
| BOOLEAN | INTEGER | Use 0 and 1; SQLite has no native boolean |
| TIMESTAMP WITH TIME ZONE | TEXT or INTEGER | Store as ISO 8601 strings or Unix timestamps |
| JSON, JSONB | TEXT | SQLite stores JSON as text; use json_extract() for queries |
| ARRAY | TEXT or separate table | No native arrays; serialise or normalise |
| UUID | TEXT or BLOB | Store as string or 16-byte blob |
| ENUM | TEXT with CHECK constraint | No native enums |
| DECIMAL, NUMERIC | REAL or TEXT | REAL for calculations; TEXT for exact decimal preservation |
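Transformation scripts end up encoding the table above as small helpers. A sketch, assuming the common choices (booleans as 0/1, timestamps as ISO 8601 text, arrays serialised to JSON text):

```typescript
// PostgreSQL/MySQL value -> SQLite-compatible value, per the table above.

// BOOLEAN -> INTEGER (SQLite has no native boolean)
function toSqliteBoolean(value: boolean): number {
  return value ? 1 : 0;
}

// TIMESTAMP WITH TIME ZONE -> TEXT as ISO 8601
function toSqliteTimestamp(value: Date): string {
  return value.toISOString();
}

// ARRAY -> TEXT as JSON; read back with json_extract() or JSON.parse
function toSqliteArray<T>(values: T[]): string {
  return JSON.stringify(values);
}
```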
Features that don't translate:
Stored procedures, triggers, and functions don't exist in SQLite. Logic that runs inside the database must move to application code. This is often beneficial: business logic in Workers is testable, version-controlled, and doesn't hide in database definitions. But it requires rewriting, not just schema translation.
Foreign key constraints exist in SQLite but are disabled by default. D1 enables them, but behaviour differs subtly from PostgreSQL. Test constraint violations explicitly; don't assume identical behaviour.
Full-text search exists in SQLite via FTS5, but the syntax and ranking differ from PostgreSQL's tsvector/tsquery. If search quality matters, consider whether D1's FTS meets requirements or whether Vectorize with semantic search better serves the use case.
D1's horizontal model
D1 databases have a 10 GB limit, which isn't a temporary constraint but rather reflects D1's architecture as Durable Objects underneath. The platform assumes many small databases, not one large one.
When to shard from the start:
If your current database exceeds 10 GB, or will exceed it within a year, plan your D1 architecture around multiple databases. Common patterns:
- Database per tenant for multi-tenant SaaS. Each customer's data lives in isolation. The 10 GB limit applies per tenant, not globally. Cross-tenant queries become impossible, which is often a feature for data isolation.
- Database per entity type when different data has different access patterns. Users in one database, transactions in another, audit logs in a third. Each can scale independently.
- Database per time period for append-heavy data. Current month in one database, archives in others. Query routing adds complexity but prevents unbounded growth.
Routing queries to the right database:
Sharded architectures need routing logic. A Worker determines which database handles each request based on tenant ID, entity type, or date range. This logic is straightforward but must be consistent; routing errors corrupt data.
```typescript
function getDatabaseForTenant(env: Env, tenantId: string): D1Database {
  // Bindings named DB_TENANT_1, DB_TENANT_2, etc.
  const dbName = `DB_TENANT_${tenantId}`;
  // Env has no index signature, so widen the type for the dynamic lookup
  const db = (env as unknown as Record<string, D1Database | undefined>)[dbName];
  if (!db) throw new Error(`No database binding for tenant ${tenantId}`);
  return db;
}
```
For hundreds of tenants, binding-per-tenant becomes unwieldy. Store database IDs in a routing table (itself a D1 database or KV) and use the D1 API to connect dynamically.
Data migration process
For databases moving to D1:
1. Export schema and data. Use pg_dump, mysqldump, or equivalent. Export schema separately from data; you'll modify the schema before importing.
2. Translate the schema. Convert data types, remove unsupported features, add SQLite-specific syntax. Test the schema against an empty D1 database before importing data.
3. Transform the data. Convert boolean values, format timestamps, serialise arrays. Write transformation scripts that produce SQLite-compatible INSERT statements or CSV files.
4. Import incrementally. D1 has request size limits. Batch imports into chunks of a few thousand rows. Use transactions for consistency within batches.
```sh
# Split large SQL files for batch import
split -l 1000 data.sql data_chunk_
for chunk in data_chunk_*; do
  wrangler d1 execute my-database --file="$chunk"
done
```
5. Verify row counts and checksums. Compare counts per table. For critical data, compute checksums on source and destination to verify integrity.
6. Run application tests against D1. Schema compatibility doesn't guarantee application compatibility. Test every query path; SQLite's type affinity and NULL handling differ subtly from PostgreSQL.
7. Deploy with dual-write. Initially write to both old and new databases. Read from the old database. This lets you verify D1 handles production write patterns before switching reads.
8. Switch reads, then remove dual-write. Once D1 proves reliable under production load, switch reads to D1. After a validation period, remove writes to the old database.
DynamoDB to Cloudflare
DynamoDB's key-value and document model maps to either KV or D1, depending on how you use it.
DynamoDB as key-value store → KV:
If you use DynamoDB primarily for simple get/put operations with partition keys, KV is the natural target. Access patterns translate directly:
| DynamoDB | Workers KV |
|---|---|
| GetItem | kv.get(key) |
| PutItem | kv.put(key, value) |
| DeleteItem | kv.delete(key) |
KV's eventual consistency (up to 60 seconds for global propagation) differs from DynamoDB's strongly consistent reads option. If your application uses consistent reads, evaluate whether eventual consistency works or whether Durable Objects better fit the access pattern.
DynamoDB with complex queries → D1:
If you use DynamoDB's Query and Scan operations with sort keys, filters, and secondary indexes, D1's SQL model may actually simplify your code. GSIs and LSIs exist because DynamoDB's query model is limited; SQL handles the same access patterns natively.
Translate DynamoDB's single-table design back to normalised tables. The patterns that optimise DynamoDB access (composite keys, overloaded attributes, sparse indexes) don't apply to SQL databases and make schemas harder to understand.
DynamoDB Streams → Queues:
If you use DynamoDB Streams for change data capture, Workers don't have a direct equivalent for D1. Options include:
- Application-level events: publish to Queues when data changes
- Polling patterns: periodically query for changes using timestamps
- External CDC: if the source remains DynamoDB during transition, process streams in Lambda and forward to Cloudflare
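The application-level events option amounts to publishing a change record after every successful write. A sketch, assuming a hypothetical `CHANGES` Queues producer binding and an illustrative event shape:

```typescript
// Application-level change events replacing DynamoDB Streams: after a
// successful write, publish a change record to a queue for downstream
// consumers. Binding name and event shape are illustrative.
interface ChangeEvent {
  table: string;
  op: "insert" | "update" | "delete";
  key: string;
  at: string; // ISO 8601 timestamp
}

function makeChangeEvent(
  table: string,
  op: ChangeEvent["op"],
  key: string,
): ChangeEvent {
  return { table, op, key, at: new Date().toISOString() };
}

interface Env {
  CHANGES: { send(message: unknown): Promise<void> }; // Queues producer
}

async function updateUser(env: Env, id: string): Promise<void> {
  // ...perform the D1 write here, then publish the change event...
  await env.CHANGES.send(makeChangeEvent("users", "update", id));
}
```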
When not to migrate data
Database migration is expensive. The combination of schema translation, data transformation, application changes, and validation often exceeds compute migration effort by a factor of three or more.
Don't migrate data when:
Hyperdrive solves the latency problem. If your concern is Workers connecting to a distant database, Hyperdrive's connection pooling and caching may provide sufficient improvement without migration risk.
The database serves multiple applications. Migrating data used by systems outside your control creates coordination overhead that rarely justifies the benefit.
You're uncertain about D1's fit. Prototype with Hyperdrive first. If access patterns prove D1-compatible and the migration benefit becomes clear, migrate then. Premature data migration creates rollback complexity that compute migration doesn't.
Compliance requires specific platforms. Some regulations mandate specific database technologies or certifications. Verify D1's compliance status before assuming you can migrate.
Zero-downtime database migration
Database migration is the hardest component to achieve without downtime because it involves the one thing you cannot simply proxy or route around: consistency. Two systems writing the same data must agree on what that data is, and achieving agreement during transition requires careful sequencing.
The Hyperdrive bridge pattern provides the foundation. Start by connecting Workers to your existing database through Hyperdrive. At this point, your application reads and writes to the same database it always has; only the compute layer has changed. This state can persist indefinitely. Many production architectures run permanently with Workers connecting to external PostgreSQL through Hyperdrive, and doing so is neither a compromise nor a temporary measure.
If you decide to migrate to D1, the dual-write pattern provides the zero-downtime path. Deploy a Worker version that writes to both the existing database (via Hyperdrive) and the new D1 database, but reads exclusively from the existing database. This ensures D1 accumulates current data without any risk to production reads. Monitor D1 writes for failures, schema mismatches, or constraint violations; any issues affect only the shadow database, not production.
After the dual-write period validates D1's behaviour under production write patterns, backfill historical data that predates the dual-write start. Import this data from database exports, validating row counts and checksums against the source. Once backfill completes and dual writes have run long enough to cover your confidence threshold (typically one to two weeks of representative traffic), switch reads to D1 using a gradual deployment. Start reading from D1 for 1% of traffic while the remaining 99% continues reading from the existing database. Compare responses between the two paths; differences indicate migration bugs.
Escalate the read percentage through the same staged approach used for compute migration: 1% to 5% to 25% to 50% to 100%, with soak time at each stage. Once 100% of reads come from D1 and you're satisfied with the results, remove the dual-write logic and decommission the Hyperdrive connection.
The entire sequence requires no downtime because at every stage, every read and every write has a valid destination. The complexity lies in the dual-write logic and the comparison infrastructure, not in any service interruption.
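The staged read escalation works best when routing is deterministic: hash a stable identifier so each user consistently reads from the same path while the percentage ramps. A minimal sketch (the hash function is illustrative; any stable hash works):

```typescript
// Map a stable identifier to a bucket in 0-99, so the same user always
// lands in the same bucket as the rollout percentage increases.
function hashToPercent(id: string): number {
  let h = 0;
  for (let i = 0; i < id.length; i++) {
    h = (h * 31 + id.charCodeAt(i)) >>> 0;
  }
  return h % 100;
}

// True when this user's reads should come from D1 at the current stage.
function readsFromD1(userId: string, rolloutPercent: number): boolean {
  return hashToPercent(userId) < rolloutPercent;
}
```

At 1% only users in bucket 0 read from D1; at 100% everyone does, and no user flips back and forth between stages.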
Data migration should follow compute migration, not lead it. Prove the compute layer works with Hyperdrive; then evaluate whether native D1 access justifies migration investment.
Playbook: Redis and ElastiCache to KV and Durable Objects
Redis serves as caching layer, session store, rate limiter, pub/sub broker, and general-purpose coordination tool across most hyperscaler architectures. No single Cloudflare product replaces all of these roles. The migration target depends on which role Redis plays in your system.
Choosing the target
Redis usage falls into distinct categories, each mapping to a different Cloudflare primitive. Most production Redis instances serve multiple categories simultaneously, which means migration involves decomposing Redis into separate concerns handled by separate products.
Caching (read-heavy, tolerance for staleness) maps to KV. If you use Redis to cache database query results, API responses, or computed values with TTLs, KV provides the same pattern with global distribution. KV values can be up to 25 MiB, keys up to 512 bytes, and TTLs as short as 60 seconds. The critical difference is consistency: KV is eventually consistent with propagation taking up to 60 seconds globally, whereas Redis returns the most recent write immediately. For caching, eventual consistency is typically acceptable because cache data is inherently stale.
Session storage maps to KV or Durable Objects depending on your consistency requirements. Read-heavy sessions where occasional staleness is tolerable (a user might briefly see an outdated cart count) work well in KV. Sessions requiring strong consistency (a payment flow where the session must reflect the most recent state at every step) need Durable Objects, which provide single-threaded, strongly consistent access per object.
Rate limiting and counters map to Durable Objects. Atomic increment operations, sliding window rate limiters, and any pattern requiring read-modify-write consistency need Durable Objects. KV's eventual consistency makes it unsuitable for accurate counting; two concurrent increments could both read the same value and produce the same result, losing one increment.
Pub/sub maps to Queues or Durable Objects with WebSockets. Redis pub/sub for fan-out messaging translates to Queues for asynchronous distribution or Durable Objects with WebSocket hibernation for real-time broadcast. Neither is a direct replacement; the programming model changes.
The model differences
Redis operates as a single (or clustered) in-memory store with sub-millisecond latency from co-located clients. KV operates as a globally distributed store with single-digit millisecond reads at the edge but eventual consistency and a maximum of one write per second per key. These are fundamentally different performance profiles, and treating KV as "Redis but distributed" leads to architectural mistakes.
The write rate limit deserves emphasis. Redis handles thousands of writes per second to a single key. KV limits writes to one per second per key, returning 429 errors when exceeded. Applications updating the same key frequently (hit counters, real-time dashboards, high-frequency session updates) cannot use KV without architectural changes. Either distribute writes across multiple keys and aggregate on read, or use Durable Objects for write-heavy state.
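The write-distribution workaround looks like this in outline: spread increments across N shard keys, then sum the shards on read. Shard count and key scheme below are illustrative:

```typescript
// Write sharding for KV's one-write-per-second-per-key limit.
const SHARDS = 16;

function shardKeyFor(counter: string, shard: number): string {
  return `${counter}:shard:${shard}`;
}

// Each increment targets a random shard, spreading write load across keys.
function randomShardKey(counter: string): string {
  return shardKeyFor(counter, Math.floor(Math.random() * SHARDS));
}

// Aggregate on read: sum the per-shard values (each fetched from KV).
function totalOf(shardValues: number[]): number {
  return shardValues.reduce((sum, v) => sum + v, 0);
}
```

The read pays for N lookups and the count remains eventually consistent; when that is unacceptable, a Durable Object counter is the right tool.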
Durable Objects provide strong consistency and effectively unlimited write rates per object, but each object lives in a single location. A rate limiter implemented as a Durable Object provides perfect accuracy but requires all requests to route to that object's location. For global rate limiting, this introduces latency that Redis (co-located with the application) avoids. The tradeoff is accuracy versus latency; Durable Objects give you both only when users are near the object's location.
Migration process
1. Audit Redis usage patterns. Categorise every Redis operation in your codebase: pure caching, session reads, session writes, counters, rate limiting, pub/sub, sorted sets, Lua scripts. Each category migrates differently.
2. Migrate caching first. Caching is the lowest-risk migration because cache misses are handled by design. Deploy Workers that check KV before Redis, writing to KV on cache miss. Over time, KV absorbs the read load while Redis handles decreasing traffic. Since cache data is regenerable, there is no data migration; you simply let KV populate organically.
3. Migrate sessions with dual-write. Write new sessions to both KV (or Durable Objects) and Redis simultaneously. Read from Redis during the validation period; compare reads from both systems. Once you are satisfied that the new store handles sessions correctly, switch reads to the new system. Existing Redis sessions expire naturally through TTL; new sessions exist only in the new store.
4. Migrate coordination patterns last. Rate limiters, counters, and distributed locks require Durable Objects and represent the most significant architectural change. Implement new rate limiting in Durable Objects alongside existing Redis rate limiters. Compare decisions between both systems before trusting Durable Objects alone.
Zero-downtime caching migration
The cache migration pattern produces zero downtime by design because caches exist to improve performance, not to store authoritative data. A cache miss is a performance event, not a failure.
Deploy a Worker that reads from KV first. On cache hit, return immediately. On cache miss, read from the authoritative source (database, API, Redis if it is also caching), write the result to KV asynchronously using ctx.waitUntil(), and return. This pattern means KV starts empty and fills organically based on real traffic. Frequently accessed data migrates first; rarely accessed data migrates on demand. No bulk data copy required.
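The read-through pattern above can be sketched against a minimal KV interface so the logic is testable outside a Worker. In a real Worker you would pass `env.CACHE` and wrap the put in `ctx.waitUntil()` rather than awaiting it:

```typescript
// Minimal stand-in for the KV binding interface.
interface KvLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, opts?: { expirationTtl?: number }): Promise<void>;
}

// Read from cache; on miss, load from the authoritative source and
// populate the cache for subsequent requests.
async function cachedFetch(
  kv: KvLike,
  key: string,
  loader: () => Promise<string>,
  ttlSeconds = 300,
): Promise<{ value: string; fromCache: boolean }> {
  const hit = await kv.get(key);
  if (hit !== null) return { value: hit, fromCache: true };
  const value = await loader();
  await kv.put(key, value, { expirationTtl: ttlSeconds }); // ctx.waitUntil() in production
  return { value, fromCache: false };
}
```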
For session migration, the dual-write pattern described in the zero-downtime database migration section applies. Write to both stores, read from the old, validate, then switch reads. Active sessions continue uninterrupted because both stores contain current data. Sessions created before dual-write started expire naturally through TTL; no backfill needed unless session lifetimes are very long.
The only scenario requiring careful handling is the transition from Redis-based rate limiting to Durable Objects. During the transition, run both rate limiters and use the more restrictive result (if either says "rate limited", enforce the limit). This prevents any requests from bypassing rate limiting during migration, at the cost of slightly more aggressive limiting during the overlap period.
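The "more restrictive result" rule reduces to a single conjunction: a request passes only if both limiters allow it. A sketch of the overlap-period decision:

```typescript
// During migration, consult both the Redis limiter and the Durable Object
// limiter; enforce the stricter answer so nothing bypasses rate limiting.
function combinedAllow(redisAllows: boolean, durableObjectAllows: boolean): boolean {
  return redisAllows && durableObjectAllows;
}
```

Once the Durable Object limiter's decisions consistently match Redis's, drop the Redis consultation and the conjunction collapses to the new limiter alone.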
Playbook: SQS, SNS, and Service Bus to Queues
Message queue migration carries a unique risk: losing messages. Unlike compute or storage migration where the worst outcome is degraded performance, queue migration can lose work that was submitted but never processed. The migration approach must guarantee that every message reaches exactly one consumer regardless of which system handles it.
The model differences
SQS uses a pull model: consumers poll for messages, process them, and explicitly delete them. Visibility timeouts hide messages from other consumers during processing. Dead letter queues catch messages that fail repeatedly. FIFO queues guarantee ordering and exactly-once delivery.
Cloudflare Queues uses a push model by default: Cloudflare invokes your Worker's queue() handler with batches of messages. The Worker processes the batch and returns; successful return acknowledges all messages. Individual messages can be retried by calling message.retry(). Pull-based consumption is also available for consumers outside the Workers ecosystem.
The differences that affect migration planning are significant. Queues lacks FIFO ordering guarantees and message deduplication. If your SQS usage relies on either of these, you need application-level workarounds. Queues does support dead letter queues natively: messages that fail after a configurable number of retries (up to 100) route to a designated DLQ rather than being discarded. This aligns with the SQS pattern, though the configuration differs.
Message size also differs: SQS supports up to 1 MiB per message (or larger with S3 offloading via the Extended Client Library), while Queues supports 128 KB. Messages exceeding 128 KB need a claim-check pattern: store the payload in R2 and send the R2 key as the message body.
Handling missing features
Message ordering. Queues delivers messages approximately in order but does not guarantee strict FIFO sequencing. If your application requires strict ordering per entity (all messages for order #1234 processed in sequence), route messages through a Durable Object keyed by entity ID. The Durable Object receives messages, buffers them, and processes them sequentially. This adds latency but guarantees ordering where it matters.
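The per-entity buffering a Durable Object would perform can be sketched as follows, assuming producers attach a monotonically increasing sequence number per entity (inside a real Durable Object, `expected` and `pending` would live in durable storage, not instance fields):

```typescript
// Hold out-of-order messages and release a contiguous run whenever the
// next expected sequence number arrives.
class SequenceBuffer {
  private expected = 1;
  private pending = new Map<number, string>();

  // Returns the messages now safe to process, in order.
  accept(seq: number, body: string): string[] {
    this.pending.set(seq, body);
    const ready: string[] = [];
    while (this.pending.has(this.expected)) {
      ready.push(this.pending.get(this.expected)!);
      this.pending.delete(this.expected);
      this.expected++;
    }
    return ready;
  }
}
```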
Deduplication. Implement idempotency at the consumer level. Include a unique message ID in each message body, check a D1 table or KV namespace before processing, and skip duplicates. This is a best practice regardless of platform.
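The consumer-side check is a few lines once the seen-ID store exists. A sketch, with an in-memory Set standing in for the D1 table or KV namespace:

```typescript
// Skip messages whose ID has already been processed; record new IDs only
// after the handler succeeds, so a failed handler can be retried.
function processOnce(
  seen: Set<string>,
  messageId: string,
  handler: () => void,
): boolean {
  if (seen.has(messageId)) return false; // duplicate: skip
  handler();
  seen.add(messageId);
  return true;
}
```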
Fan-out (SNS equivalent). SQS paired with SNS provides topic-based fan-out: one message published to a topic reaches multiple queues. Cloudflare Queues has no native topic/subscription model. Implement fan-out in the publishing Worker: when a message needs multiple consumers, publish to multiple Queues explicitly. This is more code but provides explicit control over which consumers receive which messages.
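Explicit fan-out means the publisher owns the subscription table that SNS used to manage. A sketch, with an illustrative type-to-queue mapping:

```typescript
// Producer binding shape (matches the Queues send() producer interface).
interface Producer { send(message: unknown): Promise<void>; }

// The subscription table SNS used to own, now explicit in the publisher.
const subscribers: Record<string, string[]> = {
  "order.created": ["FULFILMENT", "ANALYTICS"],
  "order.cancelled": ["FULFILMENT"],
};

function targetsFor(messageType: string): string[] {
  return subscribers[messageType] ?? [];
}

// Publish one logical message to every subscribed queue.
async function fanOut(
  queues: Record<string, Producer>,
  messageType: string,
  body: unknown,
): Promise<void> {
  await Promise.all(
    targetsFor(messageType).map((name) =>
      queues[name].send({ type: messageType, body }),
    ),
  );
}
```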
Migration process
1. Identify message patterns. Catalogue your SQS queues: throughput (messages per second), message sizes, consumer processing times, dead letter queue volumes, FIFO requirements. Queues supports up to 5,000 messages per second per queue; if any single SQS queue exceeds this, you will need multiple Cloudflare Queues with routing logic.
2. Implement the Cloudflare consumer. Write the Worker queue() handler that processes messages in the same way your SQS consumer does. Include DLQ handling if your SQS queues use dead letter queues. Test with representative message payloads.
3. Dual-publish during transition. Modify your producers to publish to both SQS and Queues simultaneously. Only the SQS consumer processes messages during this phase; the Queues consumer logs receipts to D1 for validation. Compare message counts between the two systems over several days. Any discrepancy indicates a producer or delivery issue to investigate before proceeding. For architectures with many producers, consider deploying an intermediary Worker that receives messages via HTTP and publishes to both SQS and Queues. Producers update their endpoint to the Worker; the Worker handles dual-publishing. This localises the migration logic to a single place and simplifies the eventual removal of SQS publishing.
4. Switch consumers. Disable the SQS consumer and enable the Queues consumer as the primary processor. Keep the SQS queue active and accumulating messages as insurance. If the Queues consumer encounters problems, re-enable the SQS consumer to process the backlog.
5. Drain and decommission. After the Queues consumer proves reliable under production load for one to two weeks, stop dual-publishing. Allow the SQS queue to drain naturally (messages hit their retention period), then delete it.
The process above produces zero downtime because queues are inherently asynchronous and consumers already tolerate processing delays. The dual-publish pattern in step 3 guarantees that every message reaches at least one system. Combined with idempotent consumers, no work is lost and no work is processed twice. The overlap period adds cost (you pay for messages in both systems), but eliminates the risk of a message falling between systems during cutover.
Playbook: Step Functions and Durable Functions to Workflows
Workflow orchestration migration is distinctive because orchestrators manage long-running processes that may span hours or days. You cannot simply cut over mid-execution; running Step Function state machines must complete on the old system while new executions start on the new one.
The model differences
Step Functions uses a declarative state machine model: you define states, transitions, and error handling in JSON (Amazon States Language). Each state is a node in a graph; execution follows edges between nodes. The model is visual, which aids understanding but limits expressiveness for complex conditional logic.
Cloudflare Workflows uses an imperative code model: you write a TypeScript class with a run() method containing sequential steps. Each step.do() call persists its result; if the Workflow fails and restarts, completed steps return their persisted results without re-executing. This model handles complex branching, loops, and dynamic logic naturally because it's just code, but loses the visual clarity of state machine diagrams.
The practical differences extend beyond programming model. Step Functions charges per state transition ($0.025 per 1,000 transitions), making deeply nested workflows expensive. Workflows has no per-step charge; you pay for the Worker invocation and step storage. Step Functions' Express Workflows offer lower latency but sacrifice durability. Cloudflare Workflows provides durability at all times.
Step Functions integrates deeply with AWS services through native integrations (invoke Lambda, read from DynamoDB, send to SQS, all without custom code). Workflows integrates with Cloudflare services through bindings and with external services through fetch(). If your Step Functions workflow orchestrates exclusively AWS services, migration requires replacing native integrations with explicit API calls.
Assessment before migration
Map every state machine to a Workflow class. Simple linear sequences translate directly. Parallel execution branches require care: Step Functions' Parallel state runs branches concurrently, while Workflows executes steps sequentially by default. Use Promise.all() within a single step for concurrency, or design multiple cooperating Workflows.
Check step limits. Workflows supports up to 1,024 steps per execution (sleep steps excluded). Step Functions has no explicit step limit. If your state machines have deeply nested loops or very long sequences, verify the step count fits within 1,024.
Evaluate wait patterns. Step Functions' Wait states pause execution for a specified time. Workflows' step.sleep() and step.sleepUntil() provide equivalent capability. Step Functions' Callback pattern (wait for external token) maps to step.waitForEvent(), which pauses the Workflow until an external system sends a named event.
Identify running executions. Before migration, understand how many Step Functions executions are active and their expected completion times. These must run to completion on the old system; they cannot be transferred mid-execution to Workflows.
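The callback-token pattern mentioned above maps onto waitForEvent(): the Workflow pauses on a named event and resumes when an external system delivers a payload. A toy sketch of the mechanics — the in-memory event bus is invented for illustration; the real runtime persists the paused Workflow durably:

```typescript
// Toy event bus modelling step.waitForEvent(): a workflow pauses on a
// named event and resumes when an external system sends that event.
const waiters = new Map<string, (payload: unknown) => void>();

function waitForEvent<T>(name: string): Promise<T> {
  return new Promise((resolve) => waiters.set(name, resolve as (p: unknown) => void));
}

function sendEvent(name: string, payload: unknown): void {
  const resolve = waiters.get(name);
  if (resolve) {
    waiters.delete(name);
    resolve(payload);
  }
}

async function approvalWorkflow(): Promise<string> {
  // ...earlier steps would run here...
  const approval = await waitForEvent<{ approvedBy: string }>("manager-approval");
  return `approved by ${approval.approvedBy}`;
}

const pending = approvalWorkflow();                    // pauses at the event
sendEvent("manager-approval", { approvedBy: "dana" }); // external callback fires
const outcome = await pending;                         // workflow resumes
```

The Step Functions equivalent requires plumbing a task token through the external system; here the event name plays that role.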
Migration process
1. Translate state machines to Workflow classes. Convert each state machine definition to TypeScript. Task states become step.do() calls. Wait states become step.sleep(). Choice states become if/else or switch statements. Parallel states become Promise.all() over concurrent operations within a single step. Map states become loops.
2. Handle AWS-native integrations. Step Functions' direct integrations with DynamoDB, SQS, SNS, and other AWS services become explicit API calls in Workflows. For integrations with Cloudflare services, use bindings (D1, KV, R2, Queues). For integrations with AWS services still in use during migration, use fetch() with appropriate authentication.
3. Test with production-representative inputs. Workflow behaviour under retry conditions matters more than the happy path. Test step failures, timeout handling, and the behaviour of waitForEvent() with delayed and missing events.
4. Run shadow executions. For each new Step Functions execution, trigger a parallel Workflows execution with the same input. The Workflows execution logs its results without affecting production. Compare outcomes between the two systems over one to two weeks.
5. Switch new executions to Workflows. After shadow execution validates correctness, route new executions to Workflows. Existing Step Functions executions continue running to completion on the old system; running executions cannot be migrated mid-flight, and this is a property of durable execution systems rather than a limitation to engineer around. This coexistence period can last days or weeks depending on the duration of your longest-running workflows. If your longest Step Functions execution typically runs for 48 hours, plan for at least 48 hours of coexistence after routing 100% of new executions to Workflows. In practice, add a generous buffer: a workflow that typically runs 48 hours might occasionally run for a week due to retries or external delays.
6. Decommission Step Functions. Once all legacy executions complete (monitor through the Step Functions console or API), disable the state machines. Maintain them in a stopped state for 30 days before deletion.
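Putting the translation rules from step 1 together, a small state machine (Task → Choice → Task) might become something like the following sketch. The `Step` interface is a simplified stand-in for the real Workflows types, and the step bodies and names are illustrative:

```typescript
// Simplified stand-in for the Workflows step interface (illustration only).
interface Step {
  do<T>(name: string, fn: () => Promise<T>): Promise<T>;
}

// A Step Functions machine of Task("score") -> Choice -> Task("approve"/"review")
// translated imperatively: Tasks become step.do(), the Choice becomes if/else.
async function runLoanWorkflow(step: Step, input: { amount: number }): Promise<string> {
  const score = await step.do("score-application", async () => {
    return input.amount <= 10_000 ? 720 : 610; // stand-in for a real scoring call
  });

  if (score >= 650) {
    return step.do("auto-approve", async () => "approved");
  }
  return step.do("manual-review", async () => "queued-for-review");
}

// A pass-through step implementation, enough to exercise the branching logic:
const step: Step = { do: (_name, fn) => fn() };
const small = await runLoanWorkflow(step, { amount: 5_000 });
const large = await runLoanWorkflow(step, { amount: 50_000 });
```

Notice that the Choice state's JSON comparison rules collapse into an ordinary `if`, which is exactly where the imperative model pays off for complex conditions.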
Workflow migration produces zero downtime naturally because each execution is independent and no execution shares state with another. Old executions finish on Step Functions; new executions start on Workflows. At no point does any execution lack a valid orchestrator.
Playbook: AI inference services to Workers AI, Vectorize, and AI Gateway
AI workload migration differs from other migrations because the services being replaced are not functionally equivalent. Workers AI offers a curated set of open-source models; SageMaker, Bedrock, and Azure OpenAI offer hundreds of models including proprietary ones. Migration is less about replacing infrastructure and more about evaluating whether the available models meet your quality requirements.
When Workers AI fits
Workers AI excels for inference workloads where latency and simplicity matter more than model selection breadth. Running Llama 3.3 70B at the edge provides lower latency than routing to a centralised SageMaker endpoint, and the operational overhead is a single binding rather than endpoint management, model versioning, and instance scaling.
The platform provides text generation (Llama, Mistral, Qwen families), embedding generation (BGE family supporting multiple languages), image generation (Flux, Stable Diffusion), speech-to-text (Whisper), text classification, and summarisation. For applications where these models provide sufficient quality, Workers AI eliminates the infrastructure management that SageMaker and Bedrock require.
When it doesn't
If your application depends on GPT-4, Claude, Gemini, or other proprietary models, Workers AI cannot replace them because it serves open-source models only. If you have fine-tuned models trained on domain-specific data, Workers AI does not currently support custom model deployment. If your quality benchmarks require models larger than those available on Workers AI, you will need to continue using external providers.
This is where AI Gateway becomes valuable. AI Gateway provides a unified proxy that routes requests to any supported provider (Workers AI, OpenAI, Anthropic, Azure OpenAI, Bedrock, Google AI Studio, and others) through a single endpoint. Rather than replacing your inference provider, AI Gateway gives you logging, caching, rate limiting, and fallback routing across all providers. Migration to AI Gateway provides operational benefits without requiring model changes.
Vector database migration
Vectorize replaces managed vector databases (Pinecone, Weaviate hosted, OpenSearch with knn plugin, pgvector) for applications within its capability envelope. Vectorize indexes support up to 10 million vectors with up to 1,536 dimensions, which covers most embedding models in common use.
The migration path for vector data is conceptual rather than operational. Re-embed your source documents using the embedding model you will use with Vectorize (the BGE family on Workers AI, or external models through AI Gateway). Insert the new embeddings into a Vectorize index. Compare search quality between old and new systems by running identical queries against both and evaluating relevance.
Re-embedding is necessary because embeddings are model-specific; vectors from one model are meaningless in an index built for another. If you are also changing embedding models during migration (for example, moving from a proprietary embedding API to BGE), quality comparison must account for the model change as well as the infrastructure change.
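One simple way to quantify the quality comparison is top-K overlap: run the same query against both systems and measure how much the two result sets agree. A sketch using Jaccard overlap of result IDs — the document IDs are invented, and any acceptance threshold you set on the score is a judgement call, not a standard:

```typescript
// Compare results from the old vector store and the new Vectorize index by
// measuring how much their top-K result sets overlap (Jaccard similarity).
function topKOverlap(oldIds: string[], newIds: string[]): number {
  const a = new Set(oldIds);
  const b = new Set(newIds);
  let shared = 0;
  for (const id of a) if (b.has(id)) shared++;
  const union = a.size + b.size - shared;
  return union === 0 ? 1 : shared / union;
}

// The same query run against both systems, compared document-ID by document-ID:
const fromOldIndex = ["doc-1", "doc-2", "doc-3", "doc-4"];
const fromVectorize = ["doc-1", "doc-3", "doc-2", "doc-9"];
const overlap = topKOverlap(fromOldIndex, fromVectorize); // 3 shared of 5 distinct
```

Averaged over a representative query set, this gives a single number to track as you tune the new index; low overlap is a prompt for human relevance review, not automatically a failure.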
Migration process
1. Deploy AI Gateway first. Route all existing inference requests through AI Gateway, pointing at your current provider. This adds logging, cost tracking, and caching without changing providers. The immediate benefit is visibility: you now know exactly how many inference requests you serve, at what latency, and at what cost.
2. Evaluate model quality. Run the same prompts through Workers AI models and your current provider. Compare output quality for your specific use case. Automated evaluation (BLEU scores, embedding similarity, classification accuracy) provides quantitative comparison; human evaluation provides qualitative assessment. If Workers AI models are insufficient, keep your current provider behind AI Gateway and skip steps 3 and 4.
3. Configure fallback routing. Set up AI Gateway to route requests to Workers AI with fallback to your current provider. If Workers AI returns an error or times out, the request automatically falls through to the fallback. This gives you the latency benefits of edge inference for successful requests while maintaining reliability through fallback.
4. Shift traffic gradually. Increase the proportion of requests routed to Workers AI as primary, monitoring quality metrics. AI Gateway's analytics show per-provider latency, error rates, and costs. If quality or latency degrades, adjust routing without code changes.
5. Migrate vector search independently. If moving to Vectorize, re-embed your corpus and build a new index. Run comparison queries against both old and new vector stores. Switch when search relevance meets your threshold.
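The gradual shift in step 4 happens in AI Gateway configuration, but the underlying idea is easy to sketch in code: route a deterministic percentage of requests to the new primary, keyed on a stable request attribute so any given user consistently gets the same path. The hash function and key scheme here are illustrative:

```typescript
// Deterministically route a percentage of requests to the new provider,
// keyed on a stable identifier so a given user always takes the same path.
function routeToNewProvider(requestKey: string, percentage: number): boolean {
  let hash = 0;
  for (const ch of requestKey) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // simple rolling hash
  }
  return hash % 100 < percentage;
}

const keys = Array.from({ length: 1000 }, (_, i) => `user-${i}`);
const atTen = keys.filter((k) => routeToNewProvider(k, 10)).length;     // ~10%
const atHundred = keys.filter((k) => routeToNewProvider(k, 100)).length; // all
const stable = routeToNewProvider("user-42", 10) === routeToNewProvider("user-42", 10);
```

Keying on a user or session identifier (rather than random sampling per request) keeps each user's experience consistent during the rollout, which makes quality regressions far easier to attribute.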
Zero-downtime AI migration
AI inference migration has a natural zero-downtime pattern because inference requests are stateless and independent. Each request contains its full context; there is no session state or ordering dependency between requests.
AI Gateway provides the zero-downtime mechanism. Configure it as a proxy in front of your current provider, then gradually add Workers AI as a primary route with your current provider as fallback. At no point does any request fail: if Workers AI is unavailable or returns an error, AI Gateway falls back to the configured alternative. The fallback adds latency (the request must fail on Workers AI before falling through), but no request goes unanswered.
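The fallback mechanism reduces to try-primary-then-fallback. A sketch with mock providers — AI Gateway performs this routing for you at the proxy layer, so this logic lives in configuration rather than your application code:

```typescript
// Mock providers standing in for Workers AI (primary) and the incumbent
// provider (fallback). AI Gateway implements this pattern at the proxy layer.
type Provider = (prompt: string) => Promise<string>;

const primary: Provider = async (prompt) => {
  if (prompt.includes("unsupported")) throw new Error("model error");
  return `primary: ${prompt}`;
};

const fallback: Provider = async (prompt) => `fallback: ${prompt}`;

async function inferWithFallback(prompt: string): Promise<string> {
  try {
    return await primary(prompt); // edge inference when it succeeds
  } catch {
    return fallback(prompt);      // no request goes unanswered
  }
}

const ok = await inferWithFallback("summarise this");
const rescued = await inferWithFallback("unsupported modality");
```

The failure path is where the extra latency mentioned above comes from: the rescued request pays for both attempts.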
For quality-sensitive applications, shadow mode provides additional safety. Route 100% of traffic to your current provider while simultaneously sending copies to Workers AI. Compare outputs without affecting production responses. This reveals quality differences before any user sees a Workers AI response, allowing you to evaluate fit without risk.
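Shadow mode reduces to: answer from the incumbent, record the candidate's answer out of band. A sketch with mock providers and an in-memory diff log — in a real Worker you would issue the shadow call inside ctx.waitUntil() so it never delays the response:

```typescript
// Shadow mode: production traffic is answered by the current provider while
// a copy goes to the candidate; differences are logged, never served.
type ShadowProvider = (prompt: string) => Promise<string>;

const current: ShadowProvider = async (p) => `current answer to: ${p}`;
const candidate: ShadowProvider = async (p) => `candidate answer to: ${p}`;

const diffs: Array<{ prompt: string; match: boolean }> = [];

async function serveWithShadow(prompt: string): Promise<string> {
  const [served, shadow] = await Promise.all([current(prompt), candidate(prompt)]);
  diffs.push({ prompt, match: served === shadow }); // compared offline, not served
  return served; // the user only ever sees the current provider's output
}

const response = await serveWithShadow("what is edge inference?");
```

The diff log is what you review over the evaluation window; only when it shows acceptable agreement (or acceptable divergence, for generative outputs) does the candidate become a serving path.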
The one scenario requiring care is embedding migration. If you change embedding models, existing vectors in your search index become incompatible with new query embeddings. Re-embed the entire corpus before switching query routing. During the re-embedding period, serve queries from the old index; switch to the new index only after re-embedding completes and quality validation passes. This is a batch operation that can run in the background without affecting production search.
Migration anti-patterns
Some migration approaches fail predictably. These aren't risks to manage; they're failure modes to avoid:
Big bang migration. Moving everything simultaneously maximises risk and minimises learning. Migrate incrementally; learn as you go.
Migration without baselines. If you don't measure current performance, you can't know if migration improved anything. "It feels faster" isn't evidence.
Migration without rollback. Every step should be reversible. Deleting source data before validating migration eliminates your recovery path. Keep rollback options open until confidence is established through production operation, not hope.
Migration as goal. "We migrated to Cloudflare" is an activity, not an achievement. Achievements are measurable improvements: reduced latency, lower costs, simpler operations. If you can't articulate what migration improves in quantifiable terms, question whether you should migrate.
Complete migration when partial suffices. Hybrid architectures are valid long-term states. Migration doesn't have to be total to be valuable.
Underestimating integration complexity. Compute migration is often the easy part. Integrations (authentication services, monitoring systems, CI/CD pipelines) frequently take longer than anticipated. Inventory all integrations before estimating effort.
Getting help
Cloudflare provides migration support for enterprise customers; Solutions Architects help assess migration candidates, design sequences, troubleshoot issues, and validate outcomes.
For non-enterprise migrations, Cloudflare's documentation is comprehensive (though sometimes overwhelming), community forum quality varies, and the Discord is active for specific questions. For edge cases, expect to dig through the documentation or draw on community expertise.
Migration is an investment: the hours spent migrating don't ship features; they recreate existing capability on new infrastructure. Invest wisely by quantifying expected benefits, migrating incrementally, maintaining rollback capability, and measuring outcomes. The goal isn't migration but the improvement migration enables.
What comes next
Chapter 26 closes this book by distilling twenty-five chapters into the principles that matter most. Whether you're starting your first Worker or leading a platform migration, these mental models provide the foundation for building well on Cloudflare.