View Categories

Uber: Geo-Spatial Architecture – The “Hexagon” Logic

2 min read

The “Nearest Neighbor” Trap #

If you ask a developer to build a “Find Drivers Near Me” feature, their first instinct is usually PostgreSQL + PostGIS.

They write a query like this:

SELECT * FROM drivers 
WHERE ST_Distance(driver_location, user_location) < 5000; -- 5km radius

The Logic Check: This works for a food delivery app with 1,000 drivers. It fails catastrophically for Uber.

Why? Write Velocity.

Uber drivers update their location every 4 seconds. If you have 50,000 active drivers in NYC, that is 12,500 DB writes per second. Relational databases (even with B-Tree or R-Tree indexes) cannot re-index fast enough to keep those “Nearest Neighbor” queries performant. The index locking will kill your CPU.

The Solution: Grid Systems (Bucketing) #

To scale, you must stop thinking in “Coordinates” (Infinite precision) and start thinking in “Buckets” (Discrete IDs).

If you divide the world into a grid, you can just ask: “Which bucket am I in? Who else is in this bucket?”

Google uses S2 (Square Grid).

Uber uses H3 (Hexagonal Grid).

The architectural question is: Why did Uber re-invent the wheel? Why Hexagons?

The Logic: The “Neighbor” Problem #

The problem with Squares (S2/Geohash) is that they are mathematically awkward for radius searches.

  1. ** unequal Distances:** If you are in the center of a square, the distance to the corner neighbor is longer than the distance to the side neighbor ($$d$$vs$$d\sqrt{2}$$). This causes “kinks” in your search radius.
  2. Neighbor Count: A square has 8 neighbors (4 sides, 4 corners).
  3. Hexagon Superiority:
    • Equidistance: A hexagon has 6 neighbors. The distance to the center of all 6 neighbors is identical. This allows Uber to approximate a circle (radius) simply by aggregating “Ring 1” (immediate neighbors) and “Ring 2” (neighbors of neighbors).
    • Smoothing: When visualizing “Surge Pricing,” hexagons allow for smooth gradients between zones. Squares create sharp, artificial blocky edges that look bad on a map.

Architecture Diagram: The H3 Indexing Flow #

Uber doesn’t query a database for “Latitude 40.7128.” They query for “Hexagon ID 8928308280fffff.”

graph TD
    subgraph "Driver Side (Write Path)"
    Driver[Driver App] -- "GPS Ping (Lat/Lon)" --> LB[Load Balancer]
    LB --> LocSvc[Location Service]
    LocSvc -- "Convert Lat/Lon to H3 ID" --> Logic[H3 Library]
    Logic -- "H3 ID: 892830..." --> Redis[(Redis Cluster)]
    
    note1[Key Insight: Redis is sharded by H3 ID, not Driver ID.]
    end

    subgraph "Rider Side (Read Path)"
    Rider[Rider App] -- "Request Ride (Lat/Lon)" --> Dispatch[Dispatch Service]
    Dispatch -- "Calculate H3 ID of Rider" --> Logic2[H3 Library]
    Logic2 -- "Get Drivers in Hex 892830..." --> Redis
    Redis -- "List of Driver IDs" --> Dispatch
    Dispatch -- "Filter & Match" --> Rider
    end

The “Diskless” Sharding Strategy #

Since the location data is ephemeral (we don’t care where a driver was 10 minutes ago, only now), Uber runs this entirely in memory.

  • Ringpop: Uber built a library called Ringpop to handle application-layer sharding.
  • The Logic:
    1. The world is divided into thousands of H3 Hexagons.
    2. Each Node (Server) in the cluster is responsible for a specific set of Hexagons.
    3. When a driver pings, the Load Balancer hashes the H3 ID and routes the ping to the specific server that “owns” that patch of land.
    4. Zero DB Writes: The driver’s location is just updated in that server’s RAM.
    5. Result: “Nearest Neighbor” becomes an O(1) memory lookup, not an O(N) database query.

Real-World Lesson: Dynamic Precision #

H3 is hierarchical. A parent hexagon (Resolution 1) contains 7 child hexagons (Resolution 2).

  • Rural Area: Uber tracks drivers at Resolution 7 (Large Hexagons) because cars are spread out.
  • Manhattan: Uber tracks drivers at Resolution 9 (Tiny Hexagons) to distinguish between a car on 5th Ave vs. 6th Ave.

Conclusion #

Architecture is about choosing the right shape for your data.

  • Delivery App (Small Scale): Use Postgres/PostGIS. It’s accurate and easy.
  • Map/Navigation (Google): Use S2 (Squares). It fits standard map projections better.
  • Ride Sharing (Uber): Use H3 (Hexagons). The equidistance property is critical for radial math and moving assets.