8 load balancing algorithms explained in 2 minutes and 5 seconds.

Because you don't have time to read the "documentation" your senior left before switching to that Gen AI startup.

Btw, I keep sharing system design insights like these regularly, so stick around.

And if you're learning system design for the first time, check out layrs.me
- It’s a complete interactive platform where you learn by doing
- 60+ problems available, and you can add your own problems
- AI-assisted feedback + forces you to think from first principles

Alright, back to the algorithms.

[1] Round Robin: The Baseline
Distributes requests sequentially across servers. Simple, predictable, almost no overhead.
Use when: All servers have identical capacity and requests have similar processing costs.
Breaks when: Server capabilities differ or request complexity varies wildly.

[2] Weighted Round Robin: Capacity-Aware Distribution
Assigns weights based on server capacity. A server with weight 3 gets 3x as many requests as a server with weight 1.
Use when: Your infrastructure is heterogeneous (mix of instance types, on-prem + cloud).
Critical for: Gradual rollouts where new servers handle less traffic initially.

[3] Least Connections: Dynamic Session Balancing
Routes to the server with the fewest active connections. Adapts to load in real time.
Use when: Request processing time varies significantly (long-polling, WebSockets, file uploads).
Why it matters: Prevents one server from getting hammered while others sit idle.

[4] Least Response Time: Speed-First Routing
Combines active connections with server response latency. Routes to the fastest available server.
Use when: User experience is critical and server performance varies (multi-region, degraded instances).
Trade-off: Higher overhead from continuous latency monitoring.

[5] IP Hash: Session Persistence
Hashes the client IP so each client is deterministically routed to the same server. Enables stateful sessions without external storage.
Use when: You need session stickiness but can't use cookies or tokens.
Limitation: Uneven distribution if traffic comes from only a few IP ranges (corporate NATs, VPNs).

[6] URL Hash: Content-Based Routing
Hashes the request URL so the same URL is always routed to the same server. Critical for cache efficiency.
Use when: Building CDNs, caching layers, or content-specific processing pipelines.
Why it works: The same content always hits the same cache, maximizing hit rates.
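
If you'd rather see the selection logic in code, here is a rough Python sketch of algorithms [1] to [6]. Treat it as an illustration, not how any real load balancer implements them. The server names, the function names, the naive weight expansion in [2], and the scoring formula in [4] are all assumptions made up for this example.

```python
import hashlib
import itertools

SERVERS = ["app-1", "app-2", "app-3"]  # hypothetical server names


def round_robin(servers):
    """[1] Hand out servers in a fixed, repeating order."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """[2] Repeat each server once per unit of weight, then cycle.

    Naive expansion (an assumption for brevity): picks for a heavy server
    come back to back instead of being interleaved.
    """
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def least_connections(active):
    """[3] Pick the server with the fewest active connections."""
    return min(active, key=active.get)


def least_response_time(active, latency_ms):
    """[4] Pick the lowest score. The formula below is an assumed one:
    (active connections + 1) * recent latency in milliseconds."""
    return min(active, key=lambda s: (active[s] + 1) * latency_ms[s])


def hash_route(key, servers):
    """[5]/[6] Map a key (client IP or request URL) to one server.

    Uses a stable hash so the mapping survives process restarts.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


if __name__ == "__main__":
    rr = round_robin(SERVERS)
    print([next(rr) for _ in range(4)])

    wrr = weighted_round_robin({"big": 3, "small": 1})
    print([next(wrr) for _ in range(4)])

    print(least_connections({"app-1": 5, "app-2": 2, "app-3": 7}))
    print(least_response_time({"app-1": 0, "app-2": 4},
                              {"app-1": 100.0, "app-2": 10.0}))

    # The same key always lands on the same server.
    print(hash_route("203.0.113.7", SERVERS))
    print(hash_route("/static/logo.png", SERVERS))
```

Note that the plain modulo in hash_route remaps most keys whenever the server list changes size, so sticky sessions and cache locality only hold while the pool stays the same.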