When Does a SaaS Need to Migrate Cloud Platforms?
Many AI SaaS products start on Vercel or Railway and later migrate to AWS or GCP. Common triggers: a Vercel bill above roughly $5,000/month (usually at high traffic volume), compliance requirements (HIPAA, SOC 2, FedRAMP), a need for GPU instances, custom networking requirements, or enterprise customers who require data residency.
Phase 1: Migration Planning (2–4 weeks)
Audit your current infrastructure: list every service with its cost and its dependencies. Map out your target architecture on AWS or GCP. Classify each component: stateless services (easy to migrate), stateful services (databases and other data stores, which need the most care), and external integrations (webhook and callback URLs will need updating). Build a risk register that answers, for each migration step: what breaks if this step fails, and how do you roll back?
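The audit and risk register can start as a small structured inventory. The sketch below is one possible shape, not a prescribed tool: the service names, costs, and the `affected_by` helper are all hypothetical, chosen only to illustrate answering "what breaks if X fails?" from the dependency list.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    STATELESS = "stateless"  # redeploy and repoint
    STATEFUL = "stateful"    # holds data, needs a data migration plan
    EXTERNAL = "external"    # third-party integration, URLs change


@dataclass
class Service:
    name: str
    kind: Kind
    monthly_cost_usd: float
    depends_on: list[str] = field(default_factory=list)


def cost_by_kind(services: list[Service]) -> dict[str, float]:
    """Sum monthly cost per service kind."""
    totals: dict[str, float] = {}
    for s in services:
        totals[s.kind.value] = totals.get(s.kind.value, 0.0) + s.monthly_cost_usd
    return totals


def affected_by(services: list[Service], failed: str) -> set[str]:
    """Return every service that depends, directly or transitively, on `failed`."""
    dependents: dict[str, set[str]] = {}
    for s in services:
        for dep in s.depends_on:
            dependents.setdefault(dep, set()).add(s.name)
    seen: set[str] = set()
    stack = [failed]
    while stack:
        current = stack.pop()
        for name in dependents.get(current, ()):
            if name not in seen:
                seen.add(name)
                stack.append(name)
    seen.discard(failed)  # only matters if the inventory contains a cycle
    return seen


# Hypothetical inventory; replace with the output of your own audit.
INVENTORY = [
    Service("web", Kind.STATELESS, 400, ["postgres", "redis"]),
    Service("worker", Kind.STATELESS, 150, ["postgres"]),
    Service("postgres", Kind.STATEFUL, 300),
    Service("redis", Kind.STATEFUL, 50),
    Service("stripe-webhook", Kind.EXTERNAL, 0, ["web"]),
]

if __name__ == "__main__":
    print(cost_by_kind(INVENTORY))
    print(sorted(affected_by(INVENTORY, "postgres")))
```

Running `affected_by` for each stateful service gives a first draft of the risk register: the larger the returned set, the more carefully that migration step needs a rollback plan.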
Phase 2: Set Up Target Infrastructure
Use Terraform or Pulumi to define your target infrastructure as code. On AWS, a typical setup includes: a VPC with public and private subnets, an ECS or Kubernetes (EKS) cluster for your services, RDS for PostgreSQL (the migration target if you are on Neon or Supabase), ElastiCache for Redis, an Application Load Balancer, Route 53 for DNS, and CloudFront as the CDN. Test everything in staging before touching production.
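Before writing the Terraform or Pulumi definitions, it helps to settle the subnet layout. The following is a minimal sketch using only Python's standard `ipaddress` module; the `10.0.0.0/16` range, the `/20` subnet size, and the availability zone names are assumptions for illustration, and `plan_subnets` is a hypothetical helper rather than part of any IaC tool.

```python
import ipaddress


def plan_subnets(
    vpc_cidr: str, azs: list[str], new_prefix: int = 20
) -> dict[str, dict[str, str]]:
    """Carve one public and one private subnet per availability zone.

    Subnets are allocated in order from the start of the VPC range:
    all public subnets first, then all private subnets.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=new_prefix)  # generator of equal-size blocks
    plan: dict[str, dict[str, str]] = {"public": {}, "private": {}}
    for tier in ("public", "private"):
        for az in azs:
            try:
                plan[tier][az] = str(next(blocks))
            except StopIteration:
                raise ValueError(
                    f"{vpc_cidr} cannot hold {2 * len(azs)} /{new_prefix} subnets"
                ) from None
    return plan


if __name__ == "__main__":
    for tier, subnets in plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b"]).items():
        for az, cidr in subnets.items():
            print(f"{tier:8} {az:12} {cidr}")
```

The resulting CIDR blocks can then be passed as plain strings into whichever infrastructure-as-code tool you choose. Load balancers go in the public subnets, while application containers, RDS, and ElastiCache belong in the private ones.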
Phase 3: Database Migration (The Hard Part)
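Near-zero-downtime database migration strategy (only writes pause briefly during cutover):
- Provision the target RDS PostgreSQL instance and load your schema into it. RDS cannot act as a native read replica of an external database, so it serves as a logical replication target instead
- Enable continuous replication from the current database to RDS (using pglogical or AWS DMS)
- Wait for the target to catch up with the source (replication lag close to zero)
- In a maintenance window: stop writes to the old database, verify the target is fully caught up, point the application at the new database (connection string or DNS alias), then resume writes against the new database
- Keep the old database running for at least 24 hours as a rollback option. Note that writes made to the new database after cutover will not exist on the old one unless you also replicate in the reverse direction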
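The "verify the target is current" step in the list above can be made explicit as a gate that must pass before cutover. The sketch below assumes you have already fetched two PostgreSQL log sequence numbers (LSNs) as strings: the source position from `SELECT pg_current_wal_lsn()` and the replicated position from `confirmed_flush_lsn` in `pg_replication_slots` on the source. The function names and the `max_lag_bytes` threshold are hypothetical, and a real check should also compare row counts or checksums on critical tables.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position.

    An LSN is printed as two hexadecimal numbers: the high and low
    32 bits of a 64-bit position in the write-ahead log.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(source_lsn: str, confirmed_lsn: str) -> int:
    """Bytes of WAL the target has not yet confirmed (never negative)."""
    return max(0, lsn_to_int(source_lsn) - lsn_to_int(confirmed_lsn))


def safe_to_cut_over(
    writes_stopped: bool,
    source_lsn: str,
    confirmed_lsn: str,
    row_counts_match: bool,
    max_lag_bytes: int = 0,
) -> bool:
    """Gate for the cutover step.

    Assumption: background activity (checkpoints, autovacuum) can still
    advance the source LSN after application writes stop, so a small
    non-zero max_lag_bytes may be needed in practice.
    """
    if not writes_stopped:
        return False  # never cut over while the old database accepts writes
    if lag_bytes(source_lsn, confirmed_lsn) > max_lag_bytes:
        return False
    return row_counts_match


if __name__ == "__main__":
    print(lag_bytes("16/B374D848", "16/B374D800"))
    print(safe_to_cut_over(True, "16/B374D848", "16/B374D848", True))
```

Wiring this into a cutover script means the switch to the new database only happens when the gate returns true, which turns the verification from a manual glance at a dashboard into a recorded decision.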
Phase 4: Traffic Migration
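Use weighted routing: send 10% of traffic to the new infrastructure, monitor for errors, then increase to 50% and finally 100%. Rolling back means setting the weights back. With DNS-based weighting the change only takes effect as cached records expire, so keep TTLs short (for example, 60 seconds) during the migration. Vercel Traffic Splitting or Route 53 weighted records can handle this.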
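As a rough illustration of the ramp, the sketch below builds the change batch for a pair of Route 53 weighted CNAME records and decides the next step from an observed error rate. The record names, the 1% error threshold, and the helper functions are assumptions made for this example. The dictionary follows the `ChangeBatch` structure accepted by boto3's `change_resource_record_sets`, but the code only constructs it and makes no AWS calls.

```python
def weighted_change_batch(
    record_name: str,
    old_target: str,
    new_target: str,
    new_percent: int,
    ttl: int = 60,
) -> dict:
    """Build a Route 53 ChangeBatch splitting traffic between two targets.

    Route 53 routes in proportion to each record's weight (0-255), so
    weights of (100 - p) and p send roughly p% of lookups to the new target.
    """
    if not 0 <= new_percent <= 100:
        raise ValueError("new_percent must be between 0 and 100")

    def change(set_id: str, target: str, weight: int) -> dict:
        return {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "CNAME",
                "SetIdentifier": set_id,  # distinguishes the weighted records
                "Weight": weight,
                "TTL": ttl,  # short TTL so weight changes propagate quickly
                "ResourceRecords": [{"Value": target}],
            },
        }

    return {
        "Comment": f"send {new_percent}% of traffic to new infrastructure",
        "Changes": [
            change("old", old_target, 100 - new_percent),
            change("new", new_target, new_percent),
        ],
    }


def next_percent(
    current: int,
    error_rate: float,
    max_error_rate: float = 0.01,
    steps: tuple = (10, 50, 100),
) -> int:
    """Return the next traffic share, or 0 to roll back on elevated errors."""
    if error_rate > max_error_rate:
        return 0
    for step in steps:
        if step > current:
            return step
    return current  # already at the final step


if __name__ == "__main__":
    # Hypothetical hostnames for illustration only.
    batch = weighted_change_batch(
        "app.example.com", "cname.old-host.example", "alb.new-host.example", 10
    )
    print(batch["Comment"])
    print(next_percent(10, error_rate=0.002))
```

To apply a batch, pass it as `ChangeBatch` together with your `HostedZoneId` to `boto3.client("route53").change_resource_record_sets(...)`. Because DNS resolvers cache answers for the TTL, allow at least one TTL period between changing weights and judging the error rate.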