Load Balancers in System Design

What is a load balancer?

Load balancing is a technique for distributing incoming traffic or workloads across multiple servers. For this, we use load balancers. A load balancer is a device or software component that acts as an intermediary between clients and a group of servers. Its goal is to distribute workloads evenly so that no single server becomes overloaded. This makes the load balancer one of the critical components of distributed systems: it optimizes resource usage, maximizes performance, and helps ensure high scalability and reliability.

Why do we need a load balancer?

Suppose we have a single-server setup where several clients are sending requests. When the number of requests increases, two critical issues arise:

  • Server overloading: There is a limit to the number of requests a single server can handle. If the number of requests exceeds this limit, the server may become overloaded and unable to function properly.
  • Single point of failure: If the single server goes down for any reason, the entire application will become unavailable to users. This can result in a poor user experience and impact the system's reliability.

We can solve the above problems in two ways:

  • Vertical scaling: We can increase the capacity of the single server by adding more hardware resources. However, there are limits to how much we can increase the capabilities of a single machine.
  • Horizontal scaling: To increase the system capacity on a large scale, we can add more servers to the pool. However, this will bring a new challenge: How to distribute requests evenly across these servers? The answer is: We should use load balancers!

The load balancer will not only help us distribute requests across servers but also increase system capacity, because we can add more servers as the number of requests grows. In addition, the load balancer can continuously check the health of each server. If one of the servers goes offline, it redirects traffic to the remaining available servers. Load balancers therefore help ensure that the service remains available even when one or more servers fail.
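As a sketch of the health-check idea described above, one simple approach is to attempt a TCP connection to each server and drop unresponsive servers from the pool. The backend addresses below are hypothetical placeholders, not part of the original article:

```python
import socket

# Hypothetical backend pool; hosts and ports are placeholders.
BACKENDS = [("10.0.0.1", 8080), ("10.0.0.2", 8080), ("10.0.0.3", 8080)]

def is_healthy(host, port, timeout=1.0):
    """A basic TCP-connect health check: the server is considered
    up if it accepts a connection within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def healthy_backends(backends):
    """Filter the pool down to servers that pass the health check,
    so traffic is only routed to available servers."""
    return [b for b in backends if is_healthy(*b)]
```

Real load balancers typically run such checks periodically in the background and may also use application-level checks (e.g., an HTTP request to a `/health` endpoint) rather than a raw TCP connect.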

How does load balancing work?

How does a system with a load balancer work?

Now there are some critical questions: How are user requests served when there is a load balancer in the system? What are the key steps? Let's walk through the process, starting from the DNS request.

  1. The user device sends a request to a DNS server. 
  2. The DNS server maps the domain name in the request to an IP address.
  3. The DNS server responds to the device with the IP address of the requested domain. Here is one important thing: This is the IP address of the load balancer! Because of this, the web servers are not directly reachable by end users; only the load-balancer layer is visible.
  4. The user device now establishes a connection with the load balancer using the provided IP address, typically over the HTTP(S) protocol.
  5. The load balancer receives the incoming request from the user device. At this point, the load balancer acts as the entry point for the traffic. It then determines which web server should handle the request based on a load-balancing algorithm. This decision depends on factors like server availability, server load, and the logic of the algorithm.
  6. The load balancer forwards the incoming request to the selected web server over its private IP. A private IP address is unreachable from the end device; it is reachable only by servers within the same network and is used for server-to-server communication.
  7. The selected web server processes the request and generates a response. 
  8. The web server then sends the response back to the load balancer.
  9. The load balancer receives the response from the web server and forwards it back to the user's device.
  10. Finally, the user device receives the response from the load balancer. It treats the response as if it originated directly from the requested server.

These steps may be different based on the load balancer configuration. Most of the time, load balancers also perform additional functions like SSL termination, session persistence, traffic monitoring, caching, etc.
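The core of steps 5 through 9 can be sketched with a toy model. Everything here is a placeholder for illustration: the private backend IPs are hypothetical, and `forward_to_backend` stands in for the real private-network hop rather than performing actual proxying:

```python
from itertools import cycle

# Hypothetical private IPs of web servers behind the load balancer;
# the client only ever sees the load balancer's public IP.
PRIVATE_BACKENDS = cycle(["10.0.1.10", "10.0.1.11", "10.0.1.12"])

def forward_to_backend(ip, request):
    # Placeholder for the actual private-network hop; a real load
    # balancer would open a connection to `ip` and proxy the bytes.
    return f"response to '{request}' from {ip}"

def handle_request(request):
    """Models steps 5-9: the load balancer picks a backend
    (simple round robin here), forwards the request over the
    private network, and relays the backend's response."""
    backend_ip = next(PRIVATE_BACKENDS)        # step 5: pick a server
    response = forward_to_backend(backend_ip, request)  # steps 6-8
    return response                            # step 9: relay back
```

Successive calls to `handle_request` rotate through the backend pool, which is exactly the property that keeps any one server from absorbing all the traffic.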

Where do we add a load balancer?

Load balancers can be placed at different points in a system to distribute the workload.

  • Between clients and frontend web servers: This is often the first point of contact between the client and the system. The load balancer receives incoming requests from clients and distributes them across the frontend web servers.
  • Between frontend web servers and backend application servers: In a system with multiple frontend web servers, a load balancer can be used to distribute incoming requests from the web servers to the backend application servers.
  • Between backend application servers and cache servers: Load balancers can be used to distribute requests from the application servers to cache servers, which store frequently accessed data in memory to reduce response times.
  • Between cache servers and database servers: In systems with multiple cache servers, a load balancer can be used to distribute requests from the cache servers to the database servers, which store the actual data. This helps to ensure that the database servers are not overwhelmed with requests.

Types of load balancers

There are two main types of load balancers: software load balancers and hardware load balancers. The main difference between them is the level of customization and scalability they offer.

Software load balancers are implemented as software applications installed on physical servers or virtual machines. They can be configured to meet specific needs and provide greater flexibility in terms of customization. Scaling with software load balancers is also easier because we can add more capacity simply by adding more servers.

Hardware load balancers are physical devices installed in a network or in data centres. They are generally less flexible and offer fewer options for customization. But they are faster and more reliable than software load balancers because they are dedicated hardware devices designed specifically for load balancing.

Overall, the choice between a software or hardware load balancer depends on the specific requirements of a system. Software load balancers are more suitable for systems that require a high level of customization and scalability, while hardware load balancers are better for systems that require high performance and reliability.

Pros and cons of software load balancers

Pros

  • Provide more options for customization and configuration.
  • Scale horizontally to handle more traffic by adding more instances.
  • Cheaper than hardware load balancers because they can be installed on commodity hardware. The best thing is: They do not require the purchase and maintenance of physical hardware.
  • They can be deployed in the cloud for easy scaling and cost savings.
  • Provide a simple interface or API to configure, monitor, and manage multiple load balancer instances from a single control point. This simplifies administration and reduces operational overhead.

Cons

  • There can be some delay when scaling beyond the initial capacity because the software needs to be configured and set up.
  • They introduce an additional layer of processing in the networking stack. Due to this, there can be some performance overhead in terms of throughput, latency and concurrent connections. This performance can also be constrained by the processing capabilities of the host machine.
  • Require regular maintenance and updates to ensure good performance and security.

Examples of software load balancers

  • HAProxy: An open-source TCP and HTTP load balancer.
  • NGINX: An HTTP load balancer with SSL-termination support.
  • Varnish: A caching reverse proxy that can also balance load across backends.
  • Balance: An open-source TCP load balancer.
  • LVS: Linux Virtual Server, which offers layer-4 load balancing.

Pros and cons of hardware load balancers

Pros

  • Offer consistent performance because load-balancing logic runs on specialized hardware. They can handle large concurrent connections, and provide fast throughput and low latency.
  • Built on well-optimized and tested hardware, with an underlying operating system that is optimized for performance and stability. This makes them less prone to failure compared to software load balancers.
  • Increase security, as only authorized personnel can physically access the servers.

Cons

  • Require a higher upfront cost for purchase and maintenance.
  • Struggle to scale beyond a certain number of requests because they are limited by the hardware.
  • Require more human resources to configure and manage, compared to software load balancers.

Examples of hardware load balancers

  • F5 BIG-IP load balancer
  • Cisco Systems Catalyst
  • Barracuda load balancer
  • Coyote Point load balancer
  • Citrix NetScaler

Load balancing algorithms

Load balancers use various load-balancing algorithms to distribute incoming network traffic across multiple servers. It is the responsibility of these algorithms to select a server from the pool of available servers for each incoming request.

Here are some popular load-balancing algorithms. If you want to learn more about these algorithms, you can explore this blog: load balancing algorithms.

  • Round Robin Method: Each request is sequentially distributed to the next server in a circular manner.
  • Least Connections Method: Directs new requests to the server with the fewest active connections.
  • Weighted Round Robin Method: Similar to the Round Robin but there is a weight associated with each server (weight represents the server capacity). So the servers with higher weights receive a larger proportion of the load.
  • Least Response Time Method: Considers the response times of the servers and directs the request to the server with the lowest response time.
  • IP Hash Method: Calculates a hash value based on the client's IP address to select the server from the pool. This will ensure that requests from the same client will be directed to the same server.
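The algorithms above can be sketched in a few lines each. The server names, weights, and connection counts below are made-up example values, and the IP-hash variant uses MD5 purely for illustration (production systems often use consistent hashing instead):

```python
import hashlib
from itertools import cycle

SERVERS = ["s1", "s2", "s3"]  # hypothetical server names

# Round Robin: hand out servers sequentially in a circular manner.
rr = cycle(SERVERS)

def round_robin():
    return next(rr)

# Least Connections: pick the server with the fewest active connections.
active = {"s1": 4, "s2": 1, "s3": 7}  # example connection counts

def least_connections(conns):
    return min(conns, key=conns.get)

# Weighted Round Robin: servers with higher weight appear more
# often in the rotation, so they receive a larger share of the load.
weights = {"s1": 3, "s2": 1, "s3": 1}
weighted_pool = cycle([s for s, w in weights.items() for _ in range(w)])

def weighted_round_robin():
    return next(weighted_pool)

# IP Hash: hash the client's IP so the same client is always
# directed to the same server (useful for sticky sessions).
def ip_hash(client_ip, servers):
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]
```

Note that the IP-hash approach as written remaps most clients whenever a server is added or removed, which is one motivation for consistent hashing in larger deployments.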

Different algorithms have different properties, so the right choice of load-balancing algorithm depends on the characteristics of the workload and the goals of the load-balancing strategy. In addition, load balancers provide various configuration options for choosing the algorithm.

Advantages of load balancing

  • Ensure that the application is always available and can scale as needed. Servers can be added or removed based on the number of requests.
  • Prevent a single server from becoming overloaded with requests.
  • Provide encryption, authentication, and other types of additional support.
  • End users only need to know the address of the load balancer, rather than the addresses of every server in the cluster, providing a layer of abstraction.
  • Minimize server response time and maximize throughput.
  • Load balancers can perform health checks and monitor the servers' request-handling capability to ensure proper functioning. They can also be used to roll out software updates without taking the entire service down, by removing one server at a time.

Critical concepts to explore further

  • What is the difference between Load Balancer and Reverse Proxy?
  • Different Categories of Load Balancing: 1) Layer 4 (L4) load balancer 2) Layer 7 (L7) load balancer 3) Global server load balancing (GSLB)
  • Health check feature of the load balancer.
  • DNS load balancing vs Hardware load balancing
  • The application load balancer in designing several systems
  • Cloud load balancing

Thanks to Navtosh for his contribution in creating the first version of this content. If you have any queries or feedback, please write us at contact@enjoyalgorithms.com. Enjoy learning, Enjoy system design!
