
Mobile App — Designed in Stages

First published by Atif Alam

You don’t need to design for scale on day one.

Define what you need, build the simplest thing that works, then evolve as traffic and requirements grow.

Here we use a generic mobile app—users, CRUD content, maybe some media upload—as the running example. The same staged thinking works for chat, feeds, payments, or anything else you’re building.

Requirements and Constraints (no architecture yet)


Functional Requirements

  • Core actions (e.g. sign up/login, create/update/delete item, list items, view item)
  • Optional: upload/download media, share, search, notifications

Quality Requirements

  • Latency target (e.g. p95 < 300ms for reads)
  • Availability target (e.g. 99.9% vs 99.99%)
  • Consistency needs (strong vs eventual; per feature)
  • Expected scale (DAU/MAU, QPS, storage growth) — DAU = Daily Active Users, MAU = Monthly Active Users, QPS = Queries (or requests) Per Second
  • Security/compliance (PII, encryption, audit, retention)
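The expected-scale numbers translate directly into capacity targets. A back-of-envelope sketch (the DAU, requests-per-day, and peak factor below are illustrative assumptions, not recommendations):

```python
# Back-of-envelope capacity estimate.
# All inputs are illustrative assumptions -- plug in your own numbers.

def estimate_qps(dau: int, requests_per_user_per_day: int, peak_factor: float = 3.0):
    """Return (average QPS, rough peak QPS) for a daily traffic profile."""
    seconds_per_day = 86_400
    avg_qps = dau * requests_per_user_per_day / seconds_per_day
    return avg_qps, avg_qps * peak_factor

avg, peak = estimate_qps(dau=1_000_000, requests_per_user_per_day=20)
print(f"avg ~{avg:.0f} QPS, peak ~{peak:.0f} QPS")
```

Even at a million DAU, the average is only a few hundred QPS; what usually hurts is the peak multiple and the heaviest single query, not the average.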

Key Entities

  • User, Session/AuthToken
  • Core object: Item / Post / Note
  • Optional: Media, Device, Notification, etc.
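One way to pin these entities down early is a plain data-model sketch; the field names here are illustrative assumptions for a generic notes-style app, not a schema spec:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Illustrative entity sketch -- fields are assumptions, not a schema spec.

@dataclass
class User:
    user_id: str
    email: str
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

@dataclass
class Item:
    item_id: str
    owner_id: str                     # references User.user_id
    title: str
    body: str = ""
    media_key: Optional[str] = None   # object-storage key, if media is attached

note = Item(item_id="i1", owner_id="u1", title="hello")
```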

Primary Use Cases and Access Patterns

  • Most common reads/writes (e.g. “list my items” 10x more frequent than “create item”)
  • Hot keys risk — one key or partition gets most of the traffic and becomes a bottleneck (e.g. user’s data, global config, shared resource like a single “trending” list)
  • Read-heavy vs write-heavy

Given this, start with the simplest MVP architecture that meets requirements, then evolve it as usage grows.

Stage 1 — MVP (simple, correct, not over-engineered)


Goal

Ship fast, keep it reliable, and keep the system understandable.

Components

  • Mobile or Web App
  • Single Backend Service (Monolith API)
    • REST/GraphQL endpoints
    • Handles auth + business logic
  • Single Primary Database (usually relational to start: Postgres/MySQL)
  • Basic Object Storage (only if you have media/files)

Minimal Diagram

Mobile or Web App
|
v
Monolith API (stateless)
|
v
Primary DB (SQL)
(+ Object Storage if media)

Patterns and Concerns (don’t overbuild)

  • Auth: sessions/JWT (simple)
  • Input validation + basic authorization checks
  • Basic indexing on main query paths
  • Basic logging (structured) + simple metrics (requests, errors, latency)
  • Backups for DB
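"Basic indexing on main query paths" means matching indexes to your most frequent queries. A sketch using SQLite (the schema and the "list my items, newest first" query are assumptions for the generic app):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE items (
        item_id    TEXT PRIMARY KEY,
        user_id    TEXT NOT NULL,
        title      TEXT NOT NULL,
        created_at TEXT NOT NULL
    )
""")
# Composite index matching the WHERE + ORDER BY of the hottest read path.
conn.execute("CREATE INDEX idx_items_user_created ON items (user_id, created_at)")

conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [("i1", "u1", "first", "2024-01-01"),
     ("i2", "u1", "second", "2024-01-02"),
     ("i3", "u2", "other", "2024-01-03")],
)

# The planner can now satisfy this query from the index instead of a full scan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT item_id, title FROM items WHERE user_id = ? ORDER BY created_at DESC",
    ("u1",),
).fetchall()
rows = conn.execute(
    "SELECT item_id FROM items WHERE user_id = ? ORDER BY created_at DESC",
    ("u1",),
).fetchall()
```

The same idea applies in Postgres/MySQL; start from the access patterns you listed earlier and index those, not everything.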

Why This Is a Correct MVP

  • One service, one DB → easiest to reason about, easiest to ship
  • Vertical scaling buys you time (bigger instance) before adding complexity

Stage 2 — Growth Phase (users rising, bottlenecks appear)


What Triggers the Growth Phase?

  • API CPU/memory becomes a bottleneck
  • DB starts getting slow (read QPS, connection saturation, heavy indexes)
  • Media delivery becomes expensive or slow
  • You need reliability during deploys, spikes, partial failures

Components to Add (incrementally)

  • Load Balancer
  • Multiple API instances (still one codebase; avoid microservices yet)
  • Cache (Redis/Memcached) for hot reads, sessions, rate limit counters
  • Queue + Worker(s) for slow tasks (emails, image processing, push notifications)
  • CDN (if serving static/media)
  • Read replicas for DB (if read-heavy)

Growth Diagram

                  +------------------+
Mobile App -----> |  Load Balancer   |
                  +------------------+
                     |            |
                     v            v
               API Server    API Server    (horizontally scaled)
                     \            /
                      v          v
                    Cache (Redis)        <-- hot reads, sessions
                          |
                          v
                     Primary DB -------> Read Replicas (optional)

API Server -----> Queue -----> Workers (async jobs)
Object Storage <----- CDN (if media)

Patterns and Concerns to Introduce (practical scaling)

  • Rate limiting (per user/IP, prevent abuse)
  • Timeouts + retries (with backoff; idempotency for writes)
  • Cache strategy (often cache-aside + TTL to start)
  • Basic autoscaling (scale API servers on CPU/QPS/latency)
  • Better observability: dashboards + alerts, p95/p99 latency tracking
  • Blue/green or canary deploys (reduce risk)
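Cache-aside with a TTL is easy to get right with a small wrapper. This sketch uses an in-memory dict in place of Redis, and the function names are illustrative:

```python
import time

# Cache-aside sketch: a dict stands in for Redis; TTL evicts stale entries.
_cache = {}
TTL_SECONDS = 60.0
db_reads = 0   # instrumentation so the example is checkable

def fetch_from_db(key: str) -> str:
    global db_reads
    db_reads += 1
    return f"value-for-{key}"

def get(key: str, now=time.monotonic) -> str:
    entry = _cache.get(key)
    if entry is not None:
        value, expires_at = entry
        if now() < expires_at:
            return value               # cache hit
    value = fetch_from_db(key)         # miss: read through to the DB...
    _cache[key] = (value, now() + TTL_SECONDS)  # ...and populate the cache
    return value

get("user:42")   # miss -> one DB read
get("user:42")   # hit  -> served from cache
```

On writes, delete the key (invalidate) rather than trying to update the cached value in place; the TTL bounds how stale a missed invalidation can get.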

Still Avoid (common over-engineering here)

  • Splitting into many microservices “because scale”
  • Multi-region active-active before you need it
  • Complex event sourcing without a strong reason

Stage 3 — Advanced Scale (very high traffic, global, strict reliability)


What Triggers Advanced Scale?

  • DB becomes the limiting factor even with replicas (write throughput, storage size)
  • You need multi-region latency and availability
  • You have multiple teams; ownership boundaries matter
  • You have “power users” / hotspots / high fanout workloads
  • You need stronger operational guarantees (SLOs, incident load, compliance)

Components (common advanced additions)

Global Traffic Management

  • Global LB / geo routing
  • Multi-region deployment (active-passive or active-active depending on requirements)

Service Decomposition (when justified)

  • Split monolith into a few services by domain (Auth, Core API, Media, Notifications, Search)
  • Add API Gateway if it helps routing, auth enforcement, versioning at scale

Data Scaling

  • Sharding/partitioning of primary DB (hash/range/tenant-based)
  • Dedicated data stores by workload:
    • Search index for search
    • KV store for extremely hot lookups
    • Object storage + CDN for media
  • Separate analytics pipeline/warehouse for reporting
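Hash-based sharding routes each key to a shard deterministically. A minimal sketch (the shard count and key format are assumptions):

```python
import hashlib

NUM_SHARDS = 8   # illustrative; real systems choose this with resharding in mind

def shard_for(key: str) -> int:
    """Map a key (e.g. a user_id) to a shard with a stable hash.

    Python's built-in hash() is randomized per process, so use a
    deterministic digest instead.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

# Sharding by user keeps all of one user's rows on the same shard,
# so "list my items" stays a single-shard query.
```

Note the trade-off: plain hash-mod makes adding shards painful because most keys remap; consistent hashing or a lookup/directory layer softens that.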

Event Backbone

  • Pub/Sub for domain events
  • Stream processing for real-time analytics/feeds (if relevant)
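The pub/sub shape fits in a few lines. This in-memory sketch omits what a real backbone (Kafka, Google Pub/Sub, etc.) adds: durability, ordering guarantees, and consumer groups; the topic names are illustrative:

```python
from collections import defaultdict
from typing import Callable, Dict, List

# In-memory pub/sub sketch -- no durability, no async delivery.
_subscribers: Dict[str, List[Callable]] = defaultdict(list)

def subscribe(topic: str, handler: Callable) -> None:
    _subscribers[topic].append(handler)

def publish(topic: str, event: dict) -> None:
    for handler in _subscribers[topic]:
        handler(event)   # a real broker delivers asynchronously

seen = []
subscribe("item.created", lambda e: seen.append(e["item_id"]))
subscribe("item.created", lambda e: seen.append(f"notify:{e['item_id']}"))
publish("item.created", {"item_id": "i1"})
```

The value is decoupling: the publisher of `item.created` does not know or care that search indexing and notifications both consume it.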

Reliability Infrastructure

  • Circuit breakers / bulkheads
  • Disaster recovery runbooks, chaos testing
  • Stronger secrets management, audit logging
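A circuit breaker stops hammering a failing dependency. A minimal sketch (the thresholds are illustrative, and production code would also add a half-open probe state rather than fully resetting after the cooldown):

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: opens after N consecutive failures,
    fails fast while open, and allows calls again after a cooldown."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None     # cooldown elapsed: let calls through again
            self.failures = 0
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0             # any success resets the count
        return result
```

Wrapping outbound calls this way turns a slow, dying dependency into a fast, contained failure instead of a thread-pool-exhausting one.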

Advanced Diagram (conceptual)

          Global DNS / Geo Routing
                     |
                     v
           Global Load Balancer
                     |
            +--------+--------+
            |                 |
        Region A          Region B
        (active)       (active/standby)
            |                 |
       API Gateway       API Gateway
            |                 |
  +------------------+  +------------------+
  | Core API Service |  | Core API Service |
  +------------------+  +------------------+
            |                 |
       Cache / KV        Cache / KV
            |                 |
   Sharded DB cluster  Sharded DB cluster
   (or primary in one region + replicas)

Media Service        --> Object Storage --> CDN
Notification Service --> Pub/Sub --> workers/push
Search Service       --> Search Index
Analytics            --> Stream --> Warehouse

Patterns and Concerns at This Stage

  • Consistency Strategy explicitly stated: strong consistency for money/auth; eventual consistency acceptable for feeds/counters
  • Idempotency everywhere (especially across retries and async)
  • Saga/Outbox Pattern if you have cross-service workflows
  • Multi-Region Data Strategy: where writes happen, replication model (sync/async), conflict handling if active-active
  • SLO-Driven Ops: error budgets, on-call, alert tuning, capacity planning
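"Idempotency everywhere" usually means a client-supplied key plus a server-side dedupe store. A sketch with an in-memory store standing in for Redis or a DB table (the key scheme and handler name are assumptions):

```python
# Idempotent write sketch: the client sends an Idempotency-Key with the
# request; the server stores the first result and replays it on retries.
_processed = {}
executions = 0   # instrumentation so the example is checkable

def create_item(idempotency_key: str, payload: dict) -> dict:
    global executions
    if idempotency_key in _processed:
        return _processed[idempotency_key]   # retry: replay the stored result
    executions += 1
    result = {"item_id": f"item-{executions}", "title": payload["title"]}
    _processed[idempotency_key] = result     # in production: Redis/DB with a TTL
    return result

first = create_item("key-abc", {"title": "hello"})
retry = create_item("key-abc", {"title": "hello"})   # same key -> same result
```

With this in place, clients and queues can retry freely: a duplicate delivery creates nothing twice.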

Start with an MVP that meets the requirements for your target users and QPS.

As you grow, the first bottlenecks will typically be API scale and DB read load—so you add a load balancer, horizontal API scaling, cache, read replicas, and an async queue.

At very large scale, DB writes and global availability become the next bottlenecks; then you introduce sharding, multi-region deployment, splitting services by domain, and an event backbone where it pays off.

This approach gives you:

  • Start Simple — one service, one DB, ship and learn.
  • Scale Intentionally — add components and patterns when bottlenecks or requirements justify them.
  • Add Complexity Only When Required — avoid over-engineering until you need it.