Progressive Delivery
Progressive delivery means rolling out changes to a small subset of users or traffic first, verifying health, then gradually expanding.
If something breaks, only a fraction of users are affected and you can stop or roll back before it gets worse.
The alternative—deploying to 100% of traffic at once—means every bad change is a full-blast incident.
Deployment Strategies
| Strategy | How It Works | Rollback Speed | Complexity |
|---|---|---|---|
| Rolling update | Replace instances one at a time (or in batches); old and new versions run side by side briefly | Moderate — redeploy previous version | Low |
| Blue/green | Run two identical environments; deploy to the inactive one, then switch traffic | Fast — switch traffic back | Medium (two full environments) |
| Canary | Route a small percentage of traffic to the new version; increase gradually if healthy | Fast — route traffic away from canary | Medium-high (traffic splitting, observability) |
| Traffic shifting | Gradually move traffic from old to new (e.g. 1% → 5% → 25% → 100%) with automated health checks | Fast — shift back | High (automation, SLI integration) |
Rolling Updates
The simplest progressive strategy. Your orchestrator (e.g. Kubernetes, ECS) replaces instances in batches.
Each batch is health-checked before the next begins.
- When to use — Default for most services. Good when you have health checks and can tolerate brief mixed-version traffic.
- Watch out for — Schema changes or API contract changes where old and new versions are incompatible. If the deploy fails midway, you have a mixed fleet.
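The batch-and-check loop described above can be sketched in a few lines. This is a minimal illustration, not how Kubernetes or ECS implement it: `rolling_update`, `deploy`, and `is_healthy` are hypothetical names, and the two callbacks stand in for whatever your orchestrator exposes.

```python
from typing import Callable, List


def rolling_update(
    instances: List[str],
    batch_size: int,
    deploy: Callable[[str], None],       # hypothetical: installs the new version
    is_healthy: Callable[[str], bool],   # hypothetical: runs the health check
) -> List[str]:
    """Update instances batch by batch and return the ones that were updated.

    Stops after the first batch that contains an unhealthy instance. The
    instances not returned still run the old version, so a failed run
    leaves a mixed fleet.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    updated: List[str] = []
    for start in range(0, len(instances), batch_size):
        batch = instances[start:start + batch_size]
        for instance in batch:
            deploy(instance)
        updated.extend(batch)
        # Gate: the whole batch must pass before the next batch starts.
        if not all(is_healthy(instance) for instance in batch):
            break
    return updated


if __name__ == "__main__":
    fleet = ["web-1", "web-2", "web-3", "web-4"]
    versions = {name: "v1" for name in fleet}

    def deploy_v2(name: str) -> None:
        versions[name] = "v2"

    # Simulate web-2 failing its health check after the update.
    updated = rolling_update(fleet, 2, deploy_v2, lambda name: name != "web-2")
    print(updated)   # ['web-1', 'web-2']
    print(versions)  # web-1 and web-2 on v2, web-3 and web-4 still on v1
```

The example run stops after the first batch, which is the mixed-fleet situation the "Watch out for" note warns about: rollback then means redeploying the previous version to the instances that were updated.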
Blue/Green Deployments
Two identical environments: “blue” (current) and “green” (new).
Deploy to green, verify, then switch the load balancer or DNS to point at green.
- When to use — When you want zero-downtime cutover and fast rollback. Common for stateless services.
- Watch out for — Database migrations must be backward-compatible, because both versions may read and write during cutover. Running two full environments also costs more, even if only briefly.
- Rollback — Switch traffic back to blue. Green becomes the next deployment target.
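The deploy, switch, and rollback cycle can be modeled as a small state machine. The sketch below is illustrative only: `BlueGreenRouter` and its methods are hypothetical names, and in practice `switch` would update a load balancer target or a DNS record rather than a field in memory.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class BlueGreenRouter:
    """Tracks which of two environments receives live traffic."""

    versions: Dict[str, str]  # environment name -> deployed version
    active: str = "blue"

    @property
    def inactive(self) -> str:
        return "green" if self.active == "blue" else "blue"

    def deploy(self, version: str) -> str:
        """Deploy to the inactive environment; live traffic is untouched."""
        target = self.inactive
        self.versions[target] = version
        return target

    def switch(self) -> None:
        """Cut over to the other environment. Calling it again is the rollback."""
        self.active = self.inactive

    def live_version(self) -> str:
        return self.versions[self.active]


if __name__ == "__main__":
    router = BlueGreenRouter(versions={"blue": "v1", "green": "v0"})
    router.deploy("v2")           # lands on green; users still see v1
    router.switch()               # cutover: users now see v2
    print(router.live_version())  # v2
    router.switch()               # rollback: traffic returns to blue
    print(router.live_version())  # v1
```

Because rollback is the same operation as cutover, it is fast, but only as long as the previous environment is kept running and the data layer still works with both versions.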
Canary Releases
Deploy the new version to a small slice of traffic (e.g. 1-5%). Monitor SLIs (error rate, latency, throughput) for that slice.
If healthy, promote to more traffic. If not, roll back the canary.
- When to use — When you need confidence that a change works under real traffic before full rollout. Especially valuable for high-traffic services.
- What to monitor — Compare canary SLIs against the baseline (the non-canary instances). Look for elevated error rates, latency spikes, or resource consumption changes. See Error Rate and Throughput and Latency Percentiles.
- Automation — Mature teams automate canary analysis: if SLIs degrade beyond a threshold, the canary is automatically rolled back.
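Automated canary analysis boils down to comparing the canary's SLIs with the baseline's over the same time window and acting on the difference. The following is a minimal sketch under stated assumptions: `WindowStats` and `canary_verdict` are hypothetical names, and the thresholds (one percentage point of extra errors, 20% slower p99, 500 requests minimum) are placeholders to tune per service, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class WindowStats:
    """SLIs for one group of instances over one observation window."""

    requests: int
    errors: int
    p99_latency_ms: float

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def canary_verdict(
    baseline: WindowStats,
    canary: WindowStats,
    max_error_rate_increase: float = 0.01,  # assumed threshold
    max_latency_ratio: float = 1.2,         # assumed threshold
    min_requests: int = 500,                # assumed minimum sample size
) -> str:
    """Return "promote", "rollback", or "wait"."""
    # Too little canary traffic to judge either way.
    if canary.requests < min_requests:
        return "wait"
    # Compare against the baseline rather than a fixed number, so a
    # platform-wide blip does not get blamed on the canary.
    if canary.error_rate - baseline.error_rate > max_error_rate_increase:
        return "rollback"
    if canary.p99_latency_ms > baseline.p99_latency_ms * max_latency_ratio:
        return "rollback"
    return "promote"


if __name__ == "__main__":
    baseline = WindowStats(requests=100_000, errors=200, p99_latency_ms=180.0)
    print(canary_verdict(baseline, WindowStats(2_000, 5, 190.0)))   # promote
    print(canary_verdict(baseline, WindowStats(2_000, 60, 190.0)))  # rollback
    print(canary_verdict(baseline, WindowStats(100, 0, 150.0)))     # wait
```

The "wait" outcome matters for a 1-5% slice: with few requests, a single error can swing the canary's error rate, so the sketch refuses to decide until it has a minimum sample.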
Traffic Shifting
A more granular version of canary. Instead of a binary “canary or not,” you shift traffic in controlled increments (1% → 5% → 25% → 50% → 100%) with automated health gates at each step.
- When to use — Critical services where you want maximum control and automated safety.
- Requires — Traffic splitting (service mesh, load balancer rules, or feature flag routing), SLI-based health checks, and automation to advance or roll back.
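The two moving parts listed above, a traffic splitter and a gated step controller, can be sketched as follows. This is an assumption-laden illustration rather than a real mesh or load balancer API: `routes_to_new`, `shift_traffic`, `set_weight`, and `gate_passes` are hypothetical names, and hashing a user ID into 100 buckets is just one possible way to split traffic.

```python
import hashlib
from typing import Callable, Sequence


def routes_to_new(request_key: str, percent_new: int) -> bool:
    """Deterministically decide whether a key (e.g. a user ID) gets the new version.

    The key is hashed into a bucket from 0 to 99; buckets below
    percent_new go to the new version.
    """
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent_new


def shift_traffic(
    steps: Sequence[int],
    set_weight: Callable[[int], None],   # hypothetical: applies the split
    gate_passes: Callable[[int], bool],  # hypothetical: SLI-based health gate
) -> int:
    """Walk through the weight steps; return the final percent on the new version."""
    for percent in steps:
        set_weight(percent)
        if not gate_passes(percent):
            set_weight(0)  # gate failed: shift everything back to the old version
            return 0
    return steps[-1] if steps else 0


if __name__ == "__main__":
    applied = []
    # Simulate the health gate failing once 25% of traffic is on the new version.
    final = shift_traffic([1, 5, 25, 50, 100], applied.append, lambda p: p < 25)
    print(applied)  # [1, 5, 25, 0]
    print(final)    # 0
```

Hashing on a stable key keeps the split sticky: a key whose bucket is below 5 is also below 25, so a user who saw the new version at 5% keeps seeing it as the rollout expands instead of flipping between versions.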
Choosing A Strategy
| Concern | Recommendation |
|---|---|
| Simplest to start | Rolling update |
| Fast rollback, stateless service | Blue/green |
| High-traffic, need real-traffic validation | Canary |
| Maximum safety, willing to invest in automation | Traffic shifting |
Most teams start with rolling updates and move to canary or blue/green as they grow.
The right choice depends on your service’s risk profile, traffic volume, and how much you’ve invested in observability and automation.
See Also
- Feature Flags and Rollback — Decouple deploy from release so you can control exposure without redeploying.
- CI/CD for Applications — How the pipeline triggers and gates these strategies.
- Load and Stress Testing — Validate that new versions don’t regress performance before progressive rollout.
- Infrastructure / redundancy example — Deployment and rollback patterns in the context of infrastructure design.