The 4-Step Design Framework

A repeatable mental model for tackling any system design problem.

Module 12: The 4-Step Design Framework
Track 3: Global Scale (6+ YoE)
You now have the building blocks — scaling, caching, databases, queues, load balancing. But how do you pull them together when faced with a blank whiteboard? This module gives you a structured, repeatable process for designing any system — whether it's a 45-minute interview or a real-world greenfield architecture.
Step 1: Requirements & Scope (5 min)

The biggest failure mode: designing the wrong system. Before drawing any boxes, spend 5 minutes clarifying exactly what you're building and for whom.

Functional Requirements

What does the system do? These are the user-facing features. Be specific:

  • ❌ "Design a URL shortener" → too vague
  • ✅ "Users can create short URLs. Short URLs redirect to the original. Users can see click analytics. URLs expire after 30 days by default. Custom aliases are supported."

Ask clarifying questions: Can users delete URLs? Is authentication required? Are there rate limits? What's the max URL length?

Non-Functional Requirements (NFRs)

How should it behave? These are the quality attributes:

Availability

99.9%? 99.99%? 99.9% = 8.76 hours downtime/year. 99.99% = 52 minutes/year. Big difference in architecture.
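The downtime figures above come from multiplying the hours in a year by the fraction of time the system is allowed to be unavailable. A minimal sketch of that arithmetic, assuming a 365-day year (the function name is illustrative):

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget, in hours, for an availability target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for target in (99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(target)
    print(f"{target}% -> {hours:.2f} h/year ({hours * 60:.1f} min/year)")
```

Each extra "nine" cuts the budget by a factor of ten, which is why the jump from 99.9% to 99.99% usually forces redundancy and automated failover rather than manual recovery.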

Latency

P99 latency target? URL redirect should be <50ms. Analytics dashboard can tolerate 2-3 seconds.
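A P99 target constrains the slowest 1% of requests, which an average hides. The sketch below uses the nearest-rank method on made-up latency samples; the sample values and the function name are assumptions for illustration only.

```python
import math


def percentile(samples_ms, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical redirect latencies: 98 fast requests and 2 slow outliers.
latencies_ms = [10] * 98 + [200, 400]

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)
print(f"mean={mean_ms} ms, P50={p50_ms} ms, P99={p99_ms} ms")
```

With these samples the median is 10 ms and the mean is 15.8 ms, yet P99 is 200 ms, so this hypothetical service would miss a 50 ms P99 target even though most requests are fast.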

Consistency

Strong or eventual? URL creation must be strongly consistent so that two requests never claim the same short code. Analytics can be eventually consistent.

Scale

How many users? Requests/sec? Data volume? This feeds directly into Step 2.


Step 2: Back-of-Envelope Estimation (5 min)

Estimations ground your design in reality. They tell you whether you need 1 server or 1,000, and whether your data fits on a single machine or requires distributed storage.

Key Numbers Every Engineer Should Know

Operation                      Latency
L1 cache reference             0.5 ns
L2 cache reference             7 ns
Main memory reference          100 ns
SSD random read                150 μs
HDD random read                10 ms
Same-datacenter round trip     0.5 ms
Cross-continent round trip     150 ms

Estimation Example: URL Shortener

// Traffic estimation
100M new URLs/month ÷ (30 days × 86,400 sec/day) ≈ 40 URLs/sec (write)
Read:Write ratio = 100:1 → ~4,000 reads/sec
// Storage estimation (5 years)
100M URLs/month × 12 months × 5 years = 6B URLs
Avg record size: 500 bytes (original URL) + 7 bytes (short code) + metadata ≈ 1 KB
6B × 1 KB ≈ 6 TB total storage
// Cache estimation (80/20 rule: 20% of URLs get 80% of traffic)
4,000 reads/sec × 86,400 sec/day ≈ 345M reads/day
Cache 20% of daily reads: 345M × 0.2 × 1 KB ≈ 70 GB
// (upper bound: repeated reads of the same URL are counted separately)
// → Fits in RAM on a single large Redis node (a 256 GB machine leaves headroom)
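The same estimate can be written as a short script so the assumptions are explicit and easy to change. This is a sketch using the inputs stated above; the 30-day month and the rounding of each record to 1,000 bytes are simplifying assumptions.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30  # simplifying assumption

new_urls_per_month = 100_000_000
read_write_ratio = 100
retention_years = 5
bytes_per_record = 1_000  # ~500 B URL + 7 B short code + metadata, rounded up
hot_fraction = 0.2        # 80/20 rule: cache the hottest 20%

# Traffic
writes_per_sec = new_urls_per_month / (DAYS_PER_MONTH * SECONDS_PER_DAY)
reads_per_sec = writes_per_sec * read_write_ratio

# Storage over the retention window
total_urls = new_urls_per_month * 12 * retention_years
storage_tb = total_urls * bytes_per_record / 1e12

# Cache sized from one day of reads
reads_per_day = reads_per_sec * SECONDS_PER_DAY
cache_gb = reads_per_day * hot_fraction * bytes_per_record / 1e9

print(f"writes/sec ~ {writes_per_sec:.0f}, reads/sec ~ {reads_per_sec:.0f}")
print(f"storage ~ {storage_tb:.1f} TB, cache ~ {cache_gb:.0f} GB")
```

Without rounding the write rate up to 40/sec first, the cache estimate comes out slightly lower (about 67 GB instead of 70 GB). That difference is noise for a back-of-envelope estimate; the order of magnitude is what drives the design.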

Step 3: High-Level Design (15 min)

Now draw the architecture. Start with the happy path for the most critical use case, then expand.

The Universal Building Blocks

  • Clients: Web, mobile, API consumers
  • Load Balancer / API Gateway: Entry point, TLS termination, rate limiting, routing
  • Application Servers: Stateless services behind LB
  • Cache: Redis/Memcached for hot reads
  • Database: PostgreSQL (OLTP), Cassandra (write-heavy), etc.
  • Message Queue: Kafka/SQS for async processing
  • Object Storage: S3 for files, images, videos
  • CDN: Cloudflare/CloudFront for static assets and edge caching

API Design First

Before drawing boxes, define the API contract. For a URL shortener:

  • POST /api/shorten {"url": "...", "ttl": 3600} → {"short_url": "..."}
  • GET /{short_code} → 302 Redirect

The API reveals the data flow, which dictates the architecture.
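To make the contract concrete, here is a minimal in-memory sketch of the two endpoints as plain functions. Several details are assumptions not stated above: the ttl is treated as seconds, the short code is 7 random base62 characters, the domain short.example is a placeholder, and expired or unknown codes return 404. A dict stands in for the database.

```python
import secrets
import string
import time

ALPHABET = string.ascii_letters + string.digits  # 62 characters
CODE_LENGTH = 7                                  # matches the 7-byte short code estimate

_store = {}  # short_code -> (original_url, expires_at); stand-in for the database


def shorten(url, ttl=3600, now=None):
    """POST /api/shorten: create a short code that is valid for ttl seconds."""
    now = time.time() if now is None else now
    while True:
        code = "".join(secrets.choice(ALPHABET) for _ in range(CODE_LENGTH))
        if code not in _store:  # retry on the rare collision
            break
    _store[code] = (url, now + ttl)
    return {"short_url": f"https://short.example/{code}"}


def resolve(short_code, now=None):
    """GET /{short_code}: return (status, headers) for the redirect."""
    now = time.time() if now is None else now
    entry = _store.get(short_code)
    if entry is None or entry[1] <= now:
        return 404, {}  # unknown or expired
    return 302, {"Location": entry[0]}
```

Writing the handlers first shows that redirects are a single key lookup on the short code, which is why the read path is a natural fit for a cache in front of the database.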


Step 4: Deep Dive (20 min)

Pick 2-3 components and go deep. This is where you demonstrate technical depth. Common deep-dive areas:

Data Model & Schema

Table schemas, partition keys, indexes. How do you handle queries efficiently? What denormalization is needed?
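For the URL shortener, one possible schema is sketched below using SQLite from the Python standard library so that it runs anywhere. The table name, column names, and index are hypothetical; a production design would pick types and partitioning for its actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE urls (
        short_code   TEXT PRIMARY KEY,   -- redirect lookups hit this key
        original_url TEXT NOT NULL,
        created_at   INTEGER NOT NULL,   -- Unix seconds
        expires_at   INTEGER NOT NULL    -- Unix seconds
    );
    -- Supports a periodic sweep that deletes expired rows.
    CREATE INDEX idx_urls_expires_at ON urls (expires_at);
    """
)


def insert_url(short_code, original_url, created_at, ttl_seconds):
    """Insert a mapping; return False if the short code is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO urls VALUES (?, ?, ?, ?)",
                (short_code, original_url, created_at, created_at + ttl_seconds),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Making short_code the primary key lets the database enforce uniqueness, so two concurrent creations of the same alias cannot both succeed. In a sharded store the same column would be a natural partition key, since every redirect looks up exactly one code.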

Scaling Bottlenecks

What breaks at 10x traffic? Database sharding strategy? Hot partition handling? Cache stampede prevention?
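As one example of cache stampede prevention, the sketch below lets only one caller load a missing key while concurrent callers for the same key wait and reuse the result. The class name and the slow loader are hypothetical, and the sketch omits TTLs and eviction.

```python
import threading
import time


class SingleFlightCache:
    """Cache-aside with a per-key lock so only one caller loads a missing key."""

    def __init__(self, loader):
        self._loader = loader
        self._values = {}
        self._key_locks = {}
        self._guard = threading.Lock()  # protects _key_locks

    def get(self, key):
        if key in self._values:  # fast path: cache hit
            return self._values[key]
        with self._guard:
            lock = self._key_locks.setdefault(key, threading.Lock())
        with lock:
            if key not in self._values:  # re-check after acquiring the lock
                self._values[key] = self._loader(key)
            return self._values[key]


calls = []


def slow_loader(key):
    """Stand-in for an expensive database query."""
    calls.append(key)
    time.sleep(0.05)
    return f"value-for-{key}"


cache = SingleFlightCache(slow_loader)
threads = [threading.Thread(target=cache.get, args=("hot",)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"loader calls for 20 concurrent readers: {len(calls)}")
```

Without the per-key lock, all 20 readers would miss at once and each would hit the database. The same idea extends across processes with a distributed lock or request coalescing at the cache tier.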

Failure Modes

What happens if the database is down? If a datacenter fails? What's the blast radius? How do you detect and recover?
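One common way to limit blast radius when a dependency is down is a circuit breaker: after repeated failures, callers fail fast to a fallback instead of piling up on the broken dependency. The following is a simplified, single-threaded sketch; the class, thresholds, and injectable clock are assumptions for illustration.

```python
import time


class CircuitBreaker:
    """Fail fast to a fallback after repeated failures, then retry after a timeout."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, fallback):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                return fallback()      # open: skip the dependency entirely
            self._opened_at = None     # half-open: allow one trial call
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # (re)open the circuit
            return fallback()
        self._failures = 0             # success closes the circuit
        return result
```

For the URL shortener, the fallback for a redirect might serve from cache only, while the fallback for URL creation might return an error the client can retry. Detection comes from the failure count; recovery comes from the trial call after the timeout.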

Consistency & Concurrency

Race conditions? Distributed transactions? How do you ensure exactly-once processing? What consistency model?
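In practice, exactly-once processing is usually approximated by at-least-once delivery combined with an idempotent consumer that ignores duplicates. The sketch below counts click events for analytics; the message shape and class name are hypothetical, and the in-memory set stands in for a durable deduplication store.

```python
class IdempotentConsumer:
    """Apply each message's effect once, even if the queue redelivers it."""

    def __init__(self):
        self._processed_ids = set()  # assumption: a durable store in a real system
        self.click_counts = {}

    def handle(self, message):
        """Return True if the message was applied, False if it was a duplicate."""
        msg_id = message["id"]
        if msg_id in self._processed_ids:
            return False  # duplicate delivery: skip the side effect
        code = message["short_code"]
        self.click_counts[code] = self.click_counts.get(code, 0) + 1
        self._processed_ids.add(msg_id)
        return True


consumer = IdempotentConsumer()
event = {"id": "evt-1", "short_code": "abc1234"}
consumer.handle(event)
consumer.handle(event)  # redelivery of the same event
print(consumer.click_counts)
```

In a real deployment the side effect and the processed-ID record need to be committed atomically (for example, in the same database transaction); otherwise a crash between the two steps reintroduces duplicates or lost updates.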


Common Anti-Patterns
  • Over-engineering: Designing for 1B users when you have 1,000. Start simple, add complexity only when justified by estimation.
  • Buzzword architecture: Adding Kafka, Redis, Elasticsearch to every design without justification. Each component adds operational cost.
  • Ignoring failure: A design without failure modes is a design that will fail catastrophically. What happens when X dies?
  • Monologue mode: In interviews, driving the design without checking in. In real life, designing without stakeholder alignment.
  • Jumping to solutions: Drawing boxes before understanding requirements. The most common mistake.

Further Reading
  • System Design Interview by Alex Xu — Step-by-step walkthroughs of 15+ real-world designs. (ByteByteGo, 2020)
  • Designing Data-Intensive Applications by Martin Kleppmann — The engineering reference behind every design decision. (O'Reilly, 2017)
  • The System Design Primer — GitHub — Open-source collection of scalability topics and design examples.
  • High Scalability — Case studies of how real companies scale their architectures.

Apply It

Design: File Storage

Put the framework into practice. Design an S3-like distributed file storage system from scratch.
