Applied Design: Social Feed

Module 14: Applied Design — Social Feed

Track 3: Global Scale (6+ YoE)

"Design a social media news feed" is the quintessential system design question. It covers storage, caching, distributed systems, and real-time delivery. The core challenge: when a user opens their app, they expect to see a personalized, ranked timeline of posts from people they follow — assembled in under 200ms.

Step 1: Requirements

Functional Requirements

Users can create posts (text, images, videos)
Users can follow/unfollow other users
Users see a timeline of posts from people they follow
Timeline is ranked by relevance (not just chronological)
Posts appear in followers' timelines within 5 seconds
Support likes, comments, and shares

Non-Functional Requirements

Scale: 500M DAU, 1B total users
Latency: Feed loads in <200ms
Availability: 99.99%
Read-heavy: 100:1 read:write ratio

Step 2: Estimation

// Write traffic

500M DAU, 10% post daily → 50M posts/day → ~600 posts/sec

// Read traffic

500M DAU × 10 feed loads/day = 5B reads/day → ~60K reads/sec

// Fan-out

Avg user has 200 followers

600 posts/sec × 200 followers = 120K fan-out writes/sec

// Celebrity problem: Taylor Swift has 90M followers

// 1 post → 90M fan-out writes; even at the full 120K writes/sec that is 90M / 120K ≈ 750 sec, far past the 5-second delivery target

// Storage

50M posts/day × 365 × 5 years = 91B posts

91B × 1 KB avg = ~91 TB of post text and metadata (media is stored separately in object storage)
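The arithmetic above can be reproduced with a short script. This is only a sketch: the variable names are made up for illustration, and the inputs are the ones stated in this section (500M DAU, 10% posting daily, 10 feed loads per user, 200 followers on average, 1 KB per post). The exact results are slightly lower than the rounded figures used above (about 579 posts/sec rather than 600).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

dau = 500_000_000
posts_per_day = dau // 10          # 10% of DAU post once a day
reads_per_day = dau * 10           # 10 feed loads per user per day
avg_followers = 200

post_qps = posts_per_day / SECONDS_PER_DAY
read_qps = reads_per_day / SECONDS_PER_DAY
fanout_writes_per_sec = post_qps * avg_followers

posts_5y = posts_per_day * 365 * 5
metadata_tb = posts_5y * 1_000 / 1e12   # 1 KB per post, decimal terabytes

print(f"posts/sec          ~ {post_qps:,.0f}")
print(f"feed reads/sec     ~ {read_qps:,.0f}")
print(f"fan-out writes/sec ~ {fanout_writes_per_sec:,.0f}")
print(f"posts in 5 years   = {posts_5y:,}")
print(f"metadata           ~ {metadata_tb:.2f} TB")
```

Note that read_qps / post_qps comes out to 100, which matches the 100:1 read:write ratio in the requirements.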

Step 3: High-Level Design

Post Creation Flow

1. Client → API Gateway → Post Service

2. Post Service → write post to Posts DB (PostgreSQL)

3. Post Service → upload media to Object Storage (S3)

4. Post Service → publish event to Kafka: "new_post"

5. Fan-out Service consumes event

6. Fan-out Service → fetch poster's follower list

7. Fan-out Service → write post_id to each follower's timeline cache
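Steps 5 to 7 can be sketched in a few lines. This is a minimal in-memory illustration, not the real service: plain dictionaries stand in for the follower store and the timeline cache, the Kafka consumer is reduced to a direct function call, and the names `follow`, `fan_out`, and `TIMELINE_CAP` (plus the cap of 800 entries) are assumptions made for this example.

```python
from collections import defaultdict

TIMELINE_CAP = 800  # assumed limit on entries kept per timeline

followers = defaultdict(set)    # author_id -> set of follower ids
timelines = defaultdict(list)   # user_id -> [(timestamp, post_id)], newest first


def follow(follower_id, author_id):
    followers[author_id].add(follower_id)


def fan_out(author_id, post_id, timestamp):
    """Handle one "new_post" event by pushing the post ID to every follower."""
    entry = (timestamp, post_id)
    for follower_id in followers[author_id]:
        timeline = timelines[follower_id]
        timeline.append(entry)
        timeline.sort(reverse=True)   # keep newest first, even if events arrive late
        del timeline[TIMELINE_CAP:]   # drop the oldest entries beyond the cap
    return len(followers[author_id])  # number of timeline writes performed


follow("bob", "alice")
follow("carol", "alice")
fan_out("alice", "p1", timestamp=100)
fan_out("alice", "p2", timestamp=105)
print(timelines["bob"])  # [(105, 'p2'), (100, 'p1')]
```

The return value makes the cost visible: one post costs one write per follower, which is exactly what breaks down for accounts with millions of followers.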

Feed Reading Flow

1. Client → API Gateway → Feed Service

2. Feed Service → read user's pre-computed timeline from Redis

3. Timeline contains a list of post_ids, sorted by score (the post timestamp, or a precomputed rank score)

4. Feed Service → fetch post details from Posts cache/DB (multiget)

5. Feed Service → return assembled feed to client (<200ms)
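The read path above is essentially "fetch IDs, then hydrate". Here is a minimal sketch of steps 2 to 5, with dictionaries standing in for Redis, the post cache, and the Posts DB. The function name `load_feed` and its parameters are hypothetical, and a real multiget would be a single batched call rather than a Python loop.

```python
def load_feed(user_id, timeline_cache, post_cache, posts_db, limit=20):
    """Return hydrated posts for a user's pre-computed timeline."""
    # Step 2-3: read the ordered post IDs from the timeline cache.
    post_ids = timeline_cache.get(user_id, [])[:limit]

    # Step 4: multiget from the post cache first.
    found = {pid: post_cache[pid] for pid in post_ids if pid in post_cache}

    # Fall back to the database for cache misses and backfill the cache.
    for pid in post_ids:
        if pid not in found:
            post = posts_db.get(pid)
            if post is not None:
                found[pid] = post
                post_cache[pid] = post

    # Step 5: preserve timeline order; skip posts that no longer exist.
    return [found[pid] for pid in post_ids if pid in found]


timeline_cache = {"u1": ["p3", "p2", "p1"]}
post_cache = {"p3": {"id": "p3", "text": "cached"}}
posts_db = {"p3": {"id": "p3", "text": "cached"}, "p2": {"id": "p2", "text": "from db"}}

feed = load_feed("u1", timeline_cache, post_cache, posts_db)
print([post["id"] for post in feed])  # ['p3', 'p2']
```

In this example "p1" is in the timeline but missing from both stores (for instance, a deleted post), so it is silently dropped instead of failing the whole feed load.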

Step 4: Deep Dives

Fan-Out: The Core Tradeoff

Approach	How It Works	Pro	Con
Fan-out on Write (Push)	When user posts, write to all followers' timelines immediately	Feed reads are fast (pre-computed)	Celebrity: 90M writes per post
Fan-out on Read (Pull)	When user opens feed, query all followed users' posts and merge	Writes are fast (just store the post)	Slow reads if following 1000+ users
Hybrid ✓	Push for normal users, pull for celebrities (>10K followers)	Best of both worlds	More complex implementation
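To make the pull row of the table concrete, the sketch below merges the per-author post lists of everyone a user follows at read time. It assumes each author's posts are already stored newest first as (timestamp, post_id) pairs; `pull_feed` and `posts_by_author` are illustrative names, not part of any real system.

```python
import heapq
from itertools import islice


def pull_feed(followed_ids, posts_by_author, limit=20):
    """Fan-out on read: k-way merge of each followed author's posts."""
    per_author = [posts_by_author.get(author, []) for author in followed_ids]
    # heapq.merge is lazy, so only about `limit` items are actually consumed.
    merged = heapq.merge(*per_author, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]


posts_by_author = {
    "a": [(30, "a2"), (10, "a1")],
    "b": [(20, "b1")],
}
print(pull_feed(["a", "b"], posts_by_author, limit=2))  # ['a2', 'b1']
```

The merge itself is cheap; the cost in practice comes from fetching one list per followed account, which is why reads slow down for users who follow 1000+ accounts.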

The Hybrid Approach (Twitter/X)

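Twitter has publicly described using a hybrid model of this kind. For normal users (here, fewer than 10K followers; the exact cutoff is a tuning parameter), fan-out on write pushes post IDs into followers' timeline caches. For celebrities (more than 10K followers), posts are not fanned out; they are fetched on demand at read time and merged with the pre-computed timeline. This bounds write amplification, since no single post triggers millions of timeline writes, while still delivering fast reads.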
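A rough sketch of the read side of the hybrid model follows. It assumes the pushed timeline and each celebrity's recent posts are lists of (timestamp, post_id) pairs, newest first. The 10K cutoff comes from this section; every name in the code (`hybrid_feed`, `follower_counts`, `recent_posts`, and so on) is hypothetical.

```python
import heapq
from itertools import islice

CELEBRITY_THRESHOLD = 10_000  # cutoff used in this section; a tunable assumption


def is_celebrity(follower_count):
    return follower_count > CELEBRITY_THRESHOLD


def hybrid_feed(user_id, timelines, following, follower_counts, recent_posts, limit=20):
    """Merge the pushed timeline with posts pulled from followed celebrities."""
    pushed = timelines.get(user_id, [])  # filled by fan-out on write
    pulled = [
        recent_posts.get(author, [])
        for author in following.get(user_id, ())
        if is_celebrity(follower_counts.get(author, 0))
    ]
    merged = heapq.merge(pushed, *pulled, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]


timelines = {"u": [(50, "n2"), (20, "n1")]}          # posts from normal accounts
following = {"u": ["star", "friend"]}
follower_counts = {"star": 90_000_000, "friend": 200}
recent_posts = {"star": [(40, "s1")], "friend": [(50, "n2"), (20, "n1")]}

print(hybrid_feed("u", timelines, following, follower_counts, recent_posts))
# ['n2', 's1', 'n1']
```

The write side mirrors this check: the fan-out consumer would skip the push entirely when `is_celebrity` is true for the author.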

Feed Ranking

Chronological feeds are simple but engagement-poor. Modern feeds use ML-based ranking:

Candidate generation: Collect ~1,000 candidate posts (from timeline cache + celebrity pulls)
Feature extraction: Post age, author affinity, engagement signals, content type, user preferences
Scoring: ML model predicts P(engagement) for each candidate
Re-ranking: Diversity rules (don't show 5 posts from same author in a row), ads insertion, anti-spam filtering
Truncation: Return top 20 posts, paginate with cursor
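The stages above can be sketched end to end with toy components. Everything here is an assumption made for illustration: the `Candidate` fields, the hand-picked weights in `score` (a logistic formula standing in for a trained model), the simple "no more than two in a row" diversity rule, and the integer offset used as a cursor (production cursors are usually opaque tokens).

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    post_id: str
    author_id: str
    age_hours: float
    affinity: float    # 0..1, closeness between viewer and author
    engagement: float  # 0..1, normalized likes/comments/shares


def score(c):
    """Stand-in for the ML model: a logistic score in (0, 1)."""
    z = 2.0 * c.affinity + 1.5 * c.engagement - 0.1 * c.age_hours
    return 1.0 / (1.0 + math.exp(-z))


def rank(candidates, max_run=2):
    """Sort by score, then defer posts that would extend a same-author run."""
    ordered = sorted(candidates, key=score, reverse=True)
    result, deferred = [], []
    for c in ordered:
        tail = result[-max_run:]
        if len(tail) == max_run and all(r.author_id == c.author_id for r in tail):
            deferred.append(c)  # simplification: deferred posts go to the end
        else:
            result.append(c)
    return result + deferred


def page(ranked, cursor=0, size=20):
    """Return one page plus the cursor for the next page (None when done)."""
    end = cursor + size
    return ranked[cursor:end], (end if end < len(ranked) else None)


candidates = [
    Candidate("a1", "a", 1.0, 0.9, 0.9),
    Candidate("a2", "a", 1.0, 0.9, 0.8),
    Candidate("a3", "a", 1.0, 0.9, 0.7),
    Candidate("b1", "b", 1.0, 0.1, 0.1),
]
ranked = rank(candidates)
print([c.post_id for c in ranked])  # ['a1', 'a2', 'b1', 'a3']
```

Here "a3" outscores "b1" but is pushed back so that author "a" does not occupy three consecutive slots, which is the kind of adjustment the re-ranking stage makes after scoring.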

Lessons from the Trenches

Case Study: Twitter's Timeline Architecture

Twitter's fan-out service writes ~4,500 tweets/sec to 500M+ timeline caches. Each timeline is stored as a sorted set in Redis. When you open Twitter, your feed is already pre-computed — the app fetches post IDs from your timeline cache, then hydrates them with post details. For celebrities like Obama or Taylor Swift, their tweets are not fanned out — they're merged on-read. This hybrid approach handles both the "hot celebrity" and "normal user" patterns efficiently.

Takeaway: The hybrid fan-out model is the industry standard. Pre-compute when you can, pull on-demand when you must.

Case Study: Instagram's Ranked Feed

In 2016, Instagram switched from chronological to ML-ranked feeds. The ranking model predicts engagement probability for each candidate post, considering: relationship closeness, post age, content similarity to past engagement, and post type (photo, video, carousel). The switch increased average session time by 10% — users saw more relevant content and scrolled deeper.

Takeaway: Ranked feeds dramatically increase engagement. The ranking model is the secret weapon — invest in ML if your product depends on content discovery.
