Designing Robust Rate Limiters

The “DDoS” Misconception #

Most engineers think Rate Limiters are just security tools to stop attackers from DDoS-ing a site. That is partly true, but their primary purpose is Quality of Service (QoS) for legitimate users.

Without a rate limiter, one “Noisy Neighbor” (a script kiddie or a bug in a client’s infinite loop) can consume 100% of your database connections, causing timeouts for your Premium Enterprise users. Architecture is about fairness. A Rate Limiter is the traffic cop that enforces that fairness.

The “Naive” Approach: Fixed Window Counter #

The first attempt at a rate limiter usually looks like this:

  1. User makes a request at 10:00:01.
  2. Increment a Redis key user_123_10:00.
  3. If value > 100, reject.
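The three steps above fit in a few lines. Here is a minimal in-memory sketch, where a plain dict stands in for Redis and the name `allow_request` is illustrative:

```python
import time
from collections import defaultdict
from typing import Optional

WINDOW_SECONDS = 60
LIMIT = 100
counters = defaultdict(int)  # in-memory stand-in for a Redis INCR counter

def allow_request(user_id: str, now: Optional[float] = None) -> bool:
    """Fixed window: one counter per user per minute bucket."""
    now = time.time() if now is None else now
    window = int(now // WINDOW_SECONDS)   # which minute bucket we are in
    key = f"{user_id}_{window}"           # mirrors 'user_123_10:00'
    counters[key] += 1
    return counters[key] <= LIMIT
```

Note that nothing here looks across window boundaries: the counter resets the instant the minute rolls over, which is exactly the flaw discussed next.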

The Logic Flaw: the window boundary. If a user sends 100 requests at 10:00:59 and another 100 requests at 10:01:01, they have sent 200 requests in about two seconds. The counter sees both bursts as “legal” (100 per distinct minute), but your server sees a spike at double the intended rate, which can crash the database.

The “Pro” Logic: The Token Bucket Algorithm #

This is the industry standard (used by Stripe, AWS, and Uber). Imagine a bucket that holds Tokens.

  • Capacity: The bucket holds max 10 tokens.
  • Refill Rate: We add 1 token every second.
  • Consumption: Every API request costs 1 token.

Why this Logic Wins:

  1. Supports Bursts: If the bucket is full, a user can instantly fire 10 requests (Burst). This feels “fast” to the user.
  2. Enforces Limits: Once empty, the user is forced to wait for the refill rate (1 req/sec).
  3. Memory Efficient: You only need to store two integers per user: CurrentTokens and LastRefillTimestamp.
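The three properties above fit in a small class. A minimal sketch, using the "lazy refill" trick (compute tokens owed only when a request arrives, so you really do store just two values); the `now` parameter exists only to make the behavior deterministic in examples:

```python
import time
from typing import Optional

class TokenBucket:
    """Token bucket: stores only CurrentTokens and LastRefillTimestamp."""

    def __init__(self, capacity: int = 10, refill_rate: float = 1.0,
                 now: Optional[float] = None):
        self.capacity = capacity
        self.refill_rate = refill_rate          # tokens added per second
        self.tokens = float(capacity)           # start full -> bursts allowed
        self.last_refill = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Lazy refill: credit the tokens earned since the last request.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With a full bucket, ten calls succeed instantly (the burst), the eleventh fails, and one more token becomes available after one second.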

Architecture Diagram: The Atomic Check #

To implement this at scale, you cannot use simple Database reads/writes because of Race Conditions.

  • Scenario: Two requests come in at the exact same millisecond. Both read “Tokens = 1”. Both subtract 1. Both succeed. The bucket goes to -1.
  • The Fix: Use Redis with Lua Scripts. Lua scripts execute atomically inside Redis. No other command can run while the script is calculating the token math.
graph TD
    Client[Client Request] --> API[API Gateway / Middleware]
    
    API -- "1. Check Limit (User ID)" --> Redis[(Redis Cache)]
    
    subgraph "Redis (Atomic Lua Script)"
    Redis --> Check{Has Tokens?}
    Check -- Yes (Count - 1) --> Allow
    Check -- No (Count = 0) --> Deny
    end
    
    Allow -- "2. Returns OK" --> API
    Deny -- "2. Returns 429 Too Many Requests" --> API
    
    API -- "3a. Forward Request" --> Backend[Backend Service]
    API -- "3b. Return Error" --> Client

Alternative: The Leaky Bucket (Shopify) #

Shopify uses slightly different logic because it has to survive e-commerce “Flash Sales.”

  • The Logic: Requests enter a queue (the bucket) and are processed at a constant rate (the leak).
  • The Trade-off: It smooths out traffic perfectly (protects the database), but it forces users to wait in a queue rather than failing fast.
  • Decision:
    • Use Token Bucket for APIs (User expects an immediate answer).
    • Use Leaky Bucket for background jobs or checkout queues (User is willing to wait).
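The queue-and-drain behavior can be sketched like this (illustrative only, not Shopify's implementation; `leak` is driven manually here where a real system would drain on a timer):

```python
from collections import deque

class LeakyBucket:
    """Requests queue up in the bucket and drain at a constant rate."""

    def __init__(self, capacity: int, leak_rate: float):
        self.capacity = capacity
        self.leak_rate = leak_rate        # requests processed per second
        self.queue = deque()

    def enqueue(self, request) -> bool:
        if len(self.queue) >= self.capacity:
            return False                  # bucket overflows: reject
        self.queue.append(request)
        return True

    def leak(self, elapsed_seconds: float) -> list:
        """Drain up to leak_rate * elapsed requests, in arrival order."""
        n = int(self.leak_rate * elapsed_seconds)
        return [self.queue.popleft() for _ in range(min(n, len(self.queue)))]
```

Notice the trade-off in code: a request that fits in the bucket is never rejected, but it sits in `queue` until the leak reaches it.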

Distributed Rate Limiting #

The problem gets harder when you have 100 API Servers. If you limit locally (in-memory), User A hits Server 1 (Limit: 10) and Server 2 (Limit: 10) and gets double the capacity.

  • The Solution: Use a centralized store like Redis so every server shares one counter.
  • The Cost: Every API request now makes a network round-trip to Redis, which adds latency.
  • The Optimization: “Local Batching.” Allow servers to count 10 requests locally, then sync “I processed 10” to Redis in one go. You lose strict precision (users might go slightly over the limit), but you gain massive performance.
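The local-batching idea can be sketched as follows. `RemoteCounter` is a hypothetical stand-in for the shared Redis counter (a real implementation would use INCRBY); the key point is that `allow` only pays the "network" round-trip once per batch:

```python
BATCH = 10  # local requests counted before syncing to the shared store

class RemoteCounter:
    """Stand-in for Redis: one shared counter across all servers."""
    def __init__(self):
        self.total = 0

    def incr_by(self, n: int) -> int:   # one round-trip per call
        self.total += n
        return self.total

class LocalBatcher:
    """One instance per API server."""
    def __init__(self, remote: RemoteCounter, limit: int):
        self.remote = remote
        self.limit = limit
        self.local = 0         # requests counted since the last sync
        self.known_total = 0   # global total as of the last sync

    def allow(self) -> bool:
        # Approximate check: stale global total + unsynced local count.
        if self.known_total + self.local >= self.limit:
            return False
        self.local += 1
        if self.local >= BATCH:                       # flush the batch
            self.known_total = self.remote.incr_by(self.local)
            self.local = 0
        return True
```

The imprecision is visible in the code: between syncs, each server trusts a stale `known_total`, so a user spread across many servers can briefly exceed the limit by up to `BATCH - 1` requests per server.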

Conclusion #

Building a Rate Limiter in your application code (e.g., inside a Laravel/Express controller) is an architectural anti-pattern. Ideally, this logic should sit in your Infrastructure (API Gateway, Nginx, or Envoy sidecar). Keep your application logic pure; let the infrastructure handle the traffic policing.