The Problem: It’s Not Just a Database
When a Junior Engineer designs Twitter, they draw a Users table and a Tweets table. Then they write this SQL query to build the Home Timeline:
SELECT * FROM tweets
WHERE user_id IN (SELECT following_id FROM followers WHERE user_id = 'ME')
ORDER BY created_at DESC;
The Logic Check: This works for 100 users. At 100 million users, this query overwhelms the database. Twitter is a Read-Heavy system (people read 100x more than they post). If you force the database to run a massive “Join” between the follow graph and the tweets table, and then sort the result, every time a user refreshes their feed, the latency will be seconds, not milliseconds.
The Evolution: Push vs. Pull
To solve this, Twitter moved through distinct architectural phases. Understanding why they switched is the key to mastering System Design.
Phase 1: The “Pull” Model (SQL)
- Logic: Fetch tweets on demand when the user refreshes.
- Failure: As mentioned, huge JOINs are too slow at scale.
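As a minimal sketch of the pull model, the query from the introduction can be run against an in-memory SQLite database. The table layout, the sample rows, the `body` column, and the `pull_timeline` helper are assumptions made for illustration, and a `LIMIT` has been added that the original query does not have.

```python
import sqlite3

# Hypothetical minimal schema; table and column names follow the query in the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE followers (user_id TEXT, following_id TEXT);
    CREATE TABLE tweets (id INTEGER PRIMARY KEY, user_id TEXT,
                         body TEXT, created_at INTEGER);
""")
conn.executemany("INSERT INTO followers VALUES (?, ?)",
                 [("ME", "alice"), ("ME", "bob")])
conn.executemany(
    "INSERT INTO tweets (user_id, body, created_at) VALUES (?, ?, ?)",
    [("alice", "hello", 1), ("carol", "not followed", 2), ("bob", "hi", 3)],
)

def pull_timeline(conn, user_id, limit=20):
    """Build the timeline at read time: resolve follows, filter, sort."""
    rows = conn.execute(
        """
        SELECT body FROM tweets
        WHERE user_id IN (SELECT following_id FROM followers WHERE user_id = ?)
        ORDER BY created_at DESC
        LIMIT ?
        """,
        (user_id, limit),
    )
    return [body for (body,) in rows]

timeline = pull_timeline(conn, "ME")
print(timeline)  # ['hi', 'hello']
```

Every call repeats the follow lookup and the sort, so the cost is paid on each refresh rather than once per tweet.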
Phase 2: The “Push” Model (Fan-Out on Write)
- Logic: We optimize for the Reader, not the Writer.
- Mechanism: When I tweet, the system doesn’t just save the tweet. It immediately looks up all 1,000 of my followers and “pushes” the Tweet ID into their pre-computed cached lists (Redis).
- Result: When a follower opens Twitter, there is zero calculation. The timeline is already waiting in Redis: [Tweet_A, Tweet_B, Tweet_C]. Access is O(1).
- The Catch: This works great until… The Justin Bieber Problem.
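The push model can be sketched in a few lines. This is an illustration only: a plain Python dict of deques stands in for the Redis timeline cache, and the names `followers`, `timelines`, `post_tweet`, `read_timeline`, and the 800-entry cap are all assumptions, not details from Twitter's implementation.

```python
from collections import defaultdict, deque

TIMELINE_LENGTH = 800  # assumed cap on each cached timeline

followers = {"me": ["ann", "raj"]}  # author -> follower ids (sample data)
# In-memory stand-in for the per-user Redis lists.
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LENGTH))

def post_tweet(author, tweet_id):
    """Fan-out on write: one cache write per follower."""
    for follower in followers.get(author, []):
        # Newest first; once the deque is full, the oldest entry falls off the end.
        timelines[follower].appendleft(tweet_id)

def read_timeline(user):
    """Read path: no join and no sort, just return the stored list."""
    return list(timelines[user])

post_tweet("me", "Tweet_A")
post_tweet("me", "Tweet_B")
print(read_timeline("ann"))  # ['Tweet_B', 'Tweet_A']
```

The write cost grows with the author's follower count, which is the trade the push model makes to keep reads cheap.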
The “Justin Bieber” Problem (Celebrity Fan-Out)
If I have 500 followers, my tweet triggers 500 updates to Redis caches, which takes milliseconds. When Justin Bieber (or Elon Musk), with 100+ million followers, tweets, the system has to update 100 million Redis lists at once, which creates a massive “Write Spike.” The backlog clogs the queues, delaying tweets for everyone else.
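A back-of-the-envelope calculation shows the scale of the gap. The throughput of 1,000,000 cache writes per second is an assumption chosen for round numbers, not a measured Redis or Twitter figure, and `fanout_seconds` is a hypothetical helper.

```python
ASSUMED_WRITES_PER_SECOND = 1_000_000  # assumption, not a benchmark

def fanout_seconds(follower_count, writes_per_second=ASSUMED_WRITES_PER_SECOND):
    """Time to push one tweet to every follower's cached timeline."""
    return follower_count / writes_per_second

print(fanout_seconds(500))          # 0.0005 seconds (half a millisecond)
print(fanout_seconds(100_000_000))  # 100.0 seconds for a single tweet
```

Under this assumption, one celebrity tweet occupies the fan-out pipeline for well over a minute, and every other tweet queued behind it waits.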
The Solution: The Hybrid Architecture
The solution is to recognize that not all users are equal. You need a Hybrid approach.
- Normal Users (Push): If you have < 5k followers, your tweets are pushed immediately to your followers’ caches. (Optimized for fast reads).
- Celebrities (Pull): If you have > 1M followers, your tweet is not pushed. It is saved to the database.
- Timeline Construction: When a user requests their timeline, the system does three things:
- Step A: Pulls the pre-computed list of “Normal” tweets from Redis.
- Step B: Queries the “Celebrity” table to see if any followed VIPs have tweeted recently.
- Step C: Merges Step A and Step B in memory and serves the feed.
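The write and read paths above can be sketched together. This is a simplified illustration under several assumptions: plain dicts stand in for Redis and the celebrity table, a single 1M-follower cutoff separates the two groups (the text leaves the range between 5k and 1M unspecified), tweets arrive in timestamp order, and all names are hypothetical.

```python
import heapq
from itertools import islice

CELEBRITY_THRESHOLD = 1_000_000  # assumed single cutoff, taken from "> 1M followers"

follower_counts = {"me": 500, "star": 100_000_000}
followers = {"me": ["ann"]}           # only needed for the push path
following = {"ann": ["me", "star"]}
cached_timelines = {"ann": []}        # stand-in for Redis: (created_at, tweet_id), newest first
celebrity_tweets = {}                 # stand-in for the celebrity table, newest first

def is_celebrity(author):
    return follower_counts[author] > CELEBRITY_THRESHOLD

def post_tweet(author, tweet_id, created_at):
    entry = (created_at, tweet_id)
    if is_celebrity(author):
        # Pull path: store once, no fan-out.
        celebrity_tweets.setdefault(author, []).insert(0, entry)
    else:
        # Push path: write into every follower's cached timeline.
        for follower in followers[author]:
            cached_timelines[follower].insert(0, entry)

def read_timeline(user, limit=20):
    pushed = cached_timelines.get(user, [])                      # Step A
    pulled = [celebrity_tweets.get(author, [])                   # Step B
              for author in following.get(user, [])
              if is_celebrity(author)]
    # Step C: every input is already newest first, so a k-way merge suffices.
    merged = heapq.merge(pushed, *pulled, reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, limit)]

post_tweet("me", "Tweet_A", created_at=1)
post_tweet("star", "Tweet_B", created_at=2)
post_tweet("me", "Tweet_C", created_at=3)
print(read_timeline("ann"))  # ['Tweet_C', 'Tweet_B', 'Tweet_A']
```

Because each source list is already sorted, the merge only does work proportional to the number of tweets actually returned, not the total stored.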
Architecture Diagram: The Hybrid Flow
Here is the logical flow of a Hybrid Timeline system.
graph TD
User[User Posts Tweet] --> Service[Tweet Service]
Service --> DB[(Global Tweet DB)]
Service -- Check Follower Count --> Logic{Is Celebrity?}
Logic -- No (< 5k Followers) --> Fanout[Fan-Out Service]
Fanout --> Cache[(Redis Timeline Cache)]
Fanout -.- note1[Push Model: Updates followers' lists immediately]
Logic -- Yes (> 1M Followers) --> NoOp[Do Nothing / Save to DB Only]
NoOp -.- note2[Pull Model: Wait for read time]
subgraph "Reading the Timeline"
Reader[User Refreshes Feed] --> Load[Timeline Service]
Load --> GetCache[Get from Redis Cache]
Load --> GetCeleb[Query Celebrity Tweets]
GetCache --> Merge[Merge & Sort]
GetCeleb --> Merge
Merge --> Reader
end
Key Takeaway for Architects
System Design is rarely a binary choice between “Use A” and “Use B.” The best architectures use Segmentation. You segment your users (VIPs vs. Normal), you segment your data (Hot vs. Cold), and you apply different logic to each segment to optimize performance.
Twitter (X) is not just a “social network”; it is a high-speed message broker that handles distinct traffic patterns simultaneously.