Real-Time Ride-Hailing Architecture: Uber & Grab

This series dives deep into the technical architecture behind the most critical aspect of ride-hailing applications: their real-time capabilities.

Seeing a car move smoothly on a map might seem simple, but behind it lies a massive distributed network: from battery-optimized GPS transport protocols, map gridding algorithms using hexagons (H3), the Kafka backbone processing millions of events per second, the DISCO system for optimal ride matching, to RAMEN — Uber’s real-time notification push network.

All content is synthesized from the official engineering blogs of Uber, Grab, and Lyft.

Series Contents

Executive Summary — The Big Picture of Real-time Ride-Hailing Systems

The Engineering Challenge

Imagine you are an engineer at Uber or Grab. Your system must: Ingest GPS coordinates from millions of drivers every 4 seconds. Store and index all these positions in memory to query them in under 10ms. When a user requests a ride, find and rank the best drivers within a few kilometers, calculate the Estimated Time of Arrival (ETA) based on real-time traffic, and push the ride offer to the driver’s phone instantly, all within 2 seconds. Simultaneously, continuously calculate dynamic pricing (surge pricing) based on the supply-demand ratio in each area, updating every few seconds.

This is not a typical CRUD application. It is one of the most complex distributed systems in the world. ...

GPS Location Ingestion at Scale: gRPC Streaming, MQTT & Kalman Filter in Ride-Hailing

The Challenge: Millions of Drivers, Every 4 Seconds

Answer-first: Uber and Grab handle 1.25 million GPS write operations per second from ~5 million active drivers. HTTP REST fails at this scale due to per-request TCP+TLS handshake overhead. The solution is persistent connections (gRPC streams or MQTT) with Protobuf serialization, Kalman Filter noise reduction, and batched coordinate uploads, cutting network calls by 67% while maintaining sub-200ms end-to-end latency.

Grab has approximately 5 million drivers operating in Southeast Asia. Uber has over 5 million drivers globally. If every driver sends a GPS coordinate every 4 seconds, the system must receive: ...
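The headline numbers in this teaser can be checked with a few lines of arithmetic. This is a minimal sketch: the driver count, reporting interval, and 67% figure come from the text above, while the batch size of 3 coordinates per upload is an assumption chosen because it reproduces that 67% reduction. The function names are hypothetical.

```python
def writes_per_second(active_drivers: int, interval_s: float) -> float:
    """Average GPS writes per second if every driver reports once per interval."""
    return active_drivers / interval_s


def call_reduction(batch_size: int) -> float:
    """Fraction of network calls saved by sending batch_size fixes per upload."""
    return 1 - 1 / batch_size


# 5 million drivers and a 4-second interval are the figures quoted in the text.
rate = writes_per_second(5_000_000, 4)

# Assumption: 3 coordinates per upload, which gives the quoted ~67% saving.
saved = call_reduction(3)

print(f"{rate:,.0f} writes/s, {saved:.0%} fewer calls")
```

Batching reduces the number of network calls, not the number of coordinates the backend must store, so the write rate stays the same.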

H3 Geospatial Indexing: How Uber Finds Nearby Drivers with Hexagonal Spatial Index

The Problem: Finding a Needle in a Haystack

Answer-first: Uber and Grab find the nearest available driver in under 100ms by dividing the Earth’s surface into hexagonal cells (H3 index at Resolution 8, each ~0.74 km²). Instead of calculating the distance to every driver, they look up only the 7 cells around the rider (the rider’s own cell plus its 6 neighbors), reducing millions of comparisons to dozens.

When you tap “Book” on Grab, the system must find the most suitable driver within a radius of a few kilometers. But the system is tracking millions of drivers simultaneously. The naive approach, calculating the distance from you to every driver, is impossible: ...
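The cell-lookup idea can be illustrated without the H3 library. The sketch below swaps in a plain square grid as a stand-in for hexagons, so it scans 9 cells (3×3) rather than H3's 7. It also assumes positions are already given as planar kilometre offsets rather than latitude and longitude. `GridIndex`, `CELL_KM`, and the driver IDs are hypothetical names used only for illustration.

```python
from collections import defaultdict
from math import floor, hypot

CELL_KM = 1.0  # hypothetical square cell size; H3 itself uses hexagons


class GridIndex:
    """Buckets drivers by cell so a lookup touches only nearby cells."""

    def __init__(self):
        self.cells = defaultdict(dict)  # (cx, cy) -> {driver_id: (x_km, y_km)}
        self.where = {}                 # driver_id -> cell it currently sits in

    @staticmethod
    def cell_of(x_km, y_km):
        return (floor(x_km / CELL_KM), floor(y_km / CELL_KM))

    def upsert(self, driver_id, x_km, y_km):
        """Insert or move a driver, removing it from its previous cell."""
        old = self.where.get(driver_id)
        if old is not None:
            self.cells[old].pop(driver_id, None)
        cell = self.cell_of(x_km, y_km)
        self.cells[cell][driver_id] = (x_km, y_km)
        self.where[driver_id] = cell

    def nearby(self, x_km, y_km):
        """Return driver IDs in the rider's cell and its 8 neighbours, nearest first."""
        cx, cy = self.cell_of(x_km, y_km)
        found = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                bucket = self.cells.get((cx + dx, cy + dy), {})
                for driver_id, (px, py) in bucket.items():
                    found.append((hypot(px - x_km, py - y_km), driver_id))
        return [driver_id for _, driver_id in sorted(found)]


index = GridIndex()
index.upsert("d1", 0.2, 0.3)
index.upsert("d2", 1.4, 0.1)
index.upsert("d3", 8.0, 8.0)   # far away, never scanned for this rider
candidates = index.nearby(0.5, 0.5)
```

Only the drivers in the scanned cells are ever compared against the rider, which is the property the hexagonal index relies on as well.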

Apache Kafka & Flink in Ride-Hailing: Event Streaming Architecture at Uber Scale

Why Do We Need Event Streaming?

Millions of events occur every second in a ride-hailing system: Driver A updates their GPS coordinates. Customer B opens the app and requests a ride. Driver C accepts a ride offer and starts moving. Customer D cancels a ride. Surge pricing updates the multiplier in the Downtown area.

If every service called the others directly (synchronous communication), the system would become tightly coupled and fragile: one slow service would bring down the entire chain. The solution is Event Streaming: every event is pushed into a central “pipeline,” and services independently subscribe to the events they care about. ...
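The publish/subscribe shape described above can be sketched in a few lines. This is an in-memory stand-in, not Kafka or its client API: `EventBus`, the topic names, and the event fields are all hypothetical, and delivery here is synchronous, whereas a durable log buffers events so that a slow consumer does not hold up the producer.

```python
from collections import defaultdict


class EventBus:
    """In-memory stand-in for a topic-based event pipeline (not Kafka itself)."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of handler callables

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, event):
        # The producer does not know which services consume the event.
        for handler in self.subscribers[topic]:
            handler(event)


bus = EventBus()
surge_inputs, trip_log = [], []

# Two independent consumers listen to ride requests; only one cares about cancellations.
bus.subscribe("ride.requested", surge_inputs.append)
bus.subscribe("ride.requested", trip_log.append)
bus.subscribe("ride.cancelled", trip_log.append)

bus.publish("ride.requested", {"rider": "B", "zone": "downtown"})
bus.publish("ride.cancelled", {"rider": "D"})
```

Adding a new consumer means one more `subscribe` call; the publishing service does not change, which is the decoupling the paragraph describes.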

Ride-Hailing Dispatch Algorithm: How Uber DISCO & Grab DispatchGym Match Drivers

Every time you tap “Book Ride,” a system makes dozens of decisions in under two seconds: Which driver? What route? What’s the real ETA? This article breaks down exactly how the dispatch algorithm works, from the greedy approach that fails at scale to the bipartite graphs, batched matching, and surge pricing mechanics that power Uber, Lyft, Grab, and Gojek today.

Why a Greedy Dispatch Algorithm Fails (Closest Driver Problem)

The first instinct when designing a matching system is to pair every customer with their nearest driver. However, this greedy approach causes massive losses at system-wide scale: ...
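A two-rider toy case shows why nearest-driver matching can lose to matching a batch of requests together. The ETA matrix below is made-up illustrative data, and the batched version simply brute-forces every assignment, which only works for tiny inputs; it is not a description of how DISCO or DispatchGym solve the problem. The sketch also assumes equal numbers of riders and drivers.

```python
from itertools import permutations

# Hypothetical pickup ETAs in minutes: eta[rider][driver]
eta = [
    [2, 3],   # rider 0 is close to both drivers
    [3, 10],  # rider 1 is close to driver 0 only
]


def greedy_total(eta):
    """Each rider, in arrival order, grabs the nearest driver still free."""
    taken, total = set(), 0
    for row in eta:
        best = min((d for d in range(len(row)) if d not in taken),
                   key=lambda d: row[d])
        taken.add(best)
        total += row[best]
    return total


def batched_total(eta):
    """Try every rider-to-driver assignment and keep the lowest total ETA."""
    n = len(eta)
    return min(sum(eta[r][p[r]] for r in range(n))
               for p in permutations(range(n)))


print(greedy_total(eta), batched_total(eta))  # prints "12 6"
```

Greedy gives rider 0 the 2-minute driver and leaves rider 1 with a 10-minute wait. Looking at both requests together sends rider 0 the 3-minute driver instead, halving the total wait.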

Surge Pricing Algorithm: How Ride-Hailing Engines Calculate Surge Rate in Real Time

Series context: This is Part 5 of the Real-Time Ride-Hailing Architecture series. For location ingestion and geospatial indexing, start at Part 1.

What Is Surge Rate? (And How Is It Calculated?)

Surge rate is the real-time price multiplier (e.g., 2.0×) applied by ride-hailing platforms when ride demand in a geographic zone exceeds available driver supply. It is recalculated every 30–60 seconds per H3 hexagon cell using a demand/supply ratio fed into a lookup table or ML model. ...
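The "ratio fed into a lookup table" step can be sketched as follows. The thresholds and multipliers are invented for illustration and are not any platform's real pricing table. The function name and the choice to treat zero available drivers as one (to avoid dividing by zero) are likewise assumptions.

```python
from bisect import bisect_left

# Hypothetical lookup table: upper bounds of each demand/supply band,
# and the multiplier applied within that band.
THRESHOLDS = [1.0, 1.5, 2.0, 3.0]
MULTIPLIERS = [1.0, 1.2, 1.5, 2.0, 2.5]  # one more entry than THRESHOLDS


def surge_multiplier(open_requests: int, available_drivers: int) -> float:
    """Map one cell's demand/supply ratio to a price multiplier."""
    # Assumption: treat zero drivers as one to avoid division by zero.
    ratio = open_requests / max(available_drivers, 1)
    # bisect_left keeps a ratio of exactly 1.0 (demand equals supply) at 1.0x.
    return MULTIPLIERS[bisect_left(THRESHOLDS, ratio)]


# Per-cell recalculation: cell IDs here are placeholder strings, not real H3 indexes.
cells = {"cell-a": (10, 20), "cell-b": (25, 10), "cell-c": (40, 0)}
surge = {cell: surge_multiplier(req, drv) for cell, (req, drv) in cells.items()}
```

In this sketch, a scheduler would rerun the dictionary comprehension on each recalculation tick with fresh request and driver counts per cell.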

Uber RAMEN: How Ride-Hailing Apps Push Real-Time Notifications to Millions of Devices

The Problem: Pushing Instant Notifications to Millions of Devices

Answer-first: Uber’s RAMEN system maintains persistent gRPC bidirectional streams to every active driver app. When a match is made, the ride offer travels from the matching engine to the driver’s phone in under 100ms, without polling. This is how millions of connections are held open simultaneously without crashing the backend.

When DISCO decides to match you with Driver John Doe, the system must: ...
