Last modified: January 24, 2026

This article is written in: 🇺🇸

Load Balancing in Distributed Systems

Load balancing is central to designing robust distributed systems. It ensures that incoming requests or workloads are equitably distributed across multiple servers or nodes, thereby preventing any single server from becoming a bottleneck. This technique also boosts system resilience, providing higher availability and scalability.

ASCII DIAGRAM: High-Level Load Balancing

         +---------+
         |  Client |
         +----+----+
              |
      (HTTP/TCP Requests)
              v
        +-----+------+
        | Load       |
        | Balancer   |
        +-----+------+
              |
      (Distributes requests)
              v
   +-----------------------+
   |     Server 1 (S1)     |
   +-----------------------+
   |     Server 2 (S2)     |
   +-----------------------+
   |     Server 3 (S3)     |
   +-----------------------+

Significance of Load Balancing

Implementing a load balancer in a distributed system offers multiple advantages: requests are spread evenly so that no single node becomes a bottleneck, the system remains available when individual servers fail, and capacity can be scaled out simply by adding servers behind the balancer.

How Load Balancers Work

Load balancers apply algorithms to decide where each incoming request goes. They typically combine a traffic distribution algorithm with periodic health checks of the backend servers, both of which are described below.

Health Checks
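
A load balancer periodically probes each backend, for example by requesting a dedicated health endpoint or opening a TCP connection, and removes servers that fail the probe from rotation until they recover. The sketch below illustrates the idea in Python; the backend addresses, the /health path, and the timeout are assumptions made for this example, not a reference to any particular load balancer.

```python
import urllib.error
import urllib.request

# Hypothetical backend pool; the /health path and the 2-second timeout are
# assumptions made for this example.
BACKENDS = [
    "http://10.0.0.1:8080",
    "http://10.0.0.2:8080",
    "http://10.0.0.3:8080",
]

def is_healthy(base_url, path="/health", timeout=2.0):
    """Return True if the backend answers the health endpoint with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url + path, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

def healthy_backends(backends):
    """Keep only the servers that currently pass the health check."""
    return [b for b in backends if is_healthy(b)]

if __name__ == "__main__":
    # Servers that fail the probe are left out of the rotation until they recover.
    print(healthy_backends(BACKENDS))
```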

Traffic Distribution Techniques

Below are common methods for distributing requests:

  1. Least Connection
     - Routes new requests to the server with the fewest active connections.
     - Helpful if requests have varying durations, preventing busy servers from becoming overloaded.

  2. Least Response Time
     - Considers both the current number of active connections and the average latency.
     - Aims to pick the server that can respond fastest.

  3. Least Bandwidth
     - Monitors ongoing traffic in Mbps or Gbps and sends new requests to the server with the lowest bandwidth utilization.

  4. Round Robin
     - Sequentially distributes requests across servers in a cyclical order (S1 → S2 → S3 → S1 …).
     - Weighted Round Robin accounts for each server’s capacity, giving a more powerful server a larger share of requests.
     - A minimal selection sketch covering Round Robin and Least Connection appears after this list.

  5. IP Hash
     - Uses a hash of the client’s IP address to pick a server.
     - Ensures the same client IP typically routes to the same server (session persistence), common in Layer 4 load balancing.

  6. Consistent Hashing
     - The hash of the request (e.g., session ID, cache key) maps to a position on a server “ring.”
     - When a server is added or removed, only a small subset of keys or requests is remapped, which helps keep caches consistent.
     - A consistent-hash ring sketch also appears after this list.

  7. Layer 7 Load Balancing
     - Application-level load balancing that inspects HTTP headers, URLs, cookies, etc.
     - Allows content-aware routing (e.g., static file requests go to a specialized cluster, API requests go elsewhere).
     - More resource-intensive but offers fine-grained control.
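
As a concrete follow-up to the Round Robin and Least Connection items above, the following sketch shows how a balancer might select the next server under each strategy; the server names and connection counts are made up for illustration.

```python
import itertools

# Hypothetical backend pool; in a real balancer the connection counts would be
# tracked by the proxy itself rather than hard-coded.
SERVERS = ["S1", "S2", "S3"]
ACTIVE_CONNECTIONS = {"S1": 12, "S2": 4, "S3": 9}

# Round Robin: hand out servers in a fixed cyclical order.
_rr_cycle = itertools.cycle(SERVERS)

def round_robin():
    return next(_rr_cycle)

# Least Connection: pick the server with the fewest active connections.
def least_connection():
    return min(ACTIVE_CONNECTIONS, key=ACTIVE_CONNECTIONS.get)

print([round_robin() for _ in range(4)])  # ['S1', 'S2', 'S3', 'S1']
print(least_connection())                 # 'S2'
```

Consistent hashing is easier to see in code. Below is a minimal consistent-hash ring, assuming MD5 as the hash function and 100 virtual nodes per server; both choices are arbitrary for this example.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent-hash ring: a key is served by the first virtual node
    at or after its hash position, wrapping around the ring."""

    def __init__(self, servers=None, replicas=100):
        self.replicas = replicas   # virtual nodes per server smooth the distribution
        self._ring = []            # sorted list of virtual-node hashes
        self._owners = {}          # virtual-node hash -> server name
        for server in servers or []:
            self.add(server)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add(self, server):
        for i in range(self.replicas):
            h = self._hash(f"{server}#{i}")
            bisect.insort(self._ring, h)
            self._owners[h] = server

    def remove(self, server):
        for i in range(self.replicas):
            h = self._hash(f"{server}#{i}")
            self._ring.remove(h)
            del self._owners[h]

    def get(self, key):
        """Return the server responsible for the given key."""
        h = self._hash(key)
        idx = bisect.bisect(self._ring, h) % len(self._ring)
        return self._owners[self._ring[idx]]

ring = ConsistentHashRing(["S1", "S2", "S3"])
print(ring.get("session-42"))   # the same key always lands on the same server
ring.remove("S2")               # only keys owned by S2's virtual nodes move
print(ring.get("session-42"))
```

Removing a server only remaps the keys that belonged to that server’s virtual nodes, which is exactly the property that keeps most cached data in place during membership changes.
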
ASCII DIAGRAM: Multiple Load Balancing Methods

           +------------------+
           |   Load Balancer  |
           +--------+---------+
                    |
    +---------------+---------------+
    |                               |
    v                               v
+----------------+            +----------------+
|   Server Pool  |            |   Routing via  |
| (LeastConn, RR)|            |   IP/Consistent|
| etc.           |            |   Hash, etc.   |
+----------------+            +----------------+

Load Balancer Resilience

Ironically, load balancers can become a single point of failure if not designed carefully. Various techniques mitigate this risk:

  1. Load Balancer Clustering
     - Multiple load balancers run in active-active or active-passive configurations.
     - A heartbeat mechanism monitors whether a load balancer node has failed.

  2. Active-Passive Pair
     - If the active LB node fails, the passive node takes over, preventing downtime.
     - Usually involves sharing a virtual IP or using DNS-based failover.
     - A minimal heartbeat sketch appears after this list.

  3. DNS-based Load Balancing
     - DNS records (like round-robin DNS) distribute traffic among multiple LB IPs.
     - Can be combined with health checks at the DNS level.
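
To make the active-passive idea concrete, here is a minimal heartbeat sketch in Python. It assumes the passive node can reach the active node on a known TCP port and that "promotion" means claiming a shared virtual IP; the address, port, and thresholds are placeholders, not a real deployment recipe.

```python
import socket
import time

# Hypothetical address of the active load balancer node.
ACTIVE_NODE = ("10.0.0.10", 8080)
MISS_LIMIT = 3          # consecutive failed heartbeats before failover
INTERVAL_SECONDS = 2.0  # time between probes

def heartbeat_ok(addr, timeout=1.0):
    """Return True if a TCP connection to the active load balancer succeeds."""
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True
    except OSError:
        return False

def monitor():
    """Run on the passive node: watch the active node and fail over if it dies."""
    misses = 0
    while True:
        if heartbeat_ok(ACTIVE_NODE):
            misses = 0
        else:
            misses += 1
            if misses >= MISS_LIMIT:
                # In a real setup this is where the passive node would claim
                # the virtual IP or update DNS to take over traffic.
                print("Active LB unreachable; promoting passive node.")
                break
        time.sleep(INTERVAL_SECONDS)

if __name__ == "__main__":
    monitor()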

Best Practices for Load Balancing