Load Balancing Algorithms

In my previous post, I covered what a load balancer is and why load balancing is required in a modern tech stack. This post looks at the different types of load balancing algorithms you can use to make your application faster and more accessible to your users, improving the overall user experience.

But before we dive into the topic, let's look at a simple architecture that shows where a load balancer can be placed.

Architecture

To take full advantage of scalability and redundancy, load balancers can be placed at three points:

1. Between user and web server

2. Between web server and application server/cache server

3. Between application server and database server

Image: https://www.designgurus.io/course-play/grokking-system-design-fundamentals/doc/introduction-to-load-balancing

What are Load Balancing Algorithms?

A load balancing algorithm is a method for distributing requests efficiently among the available resources, improving overall performance, efficiency and availability. Choosing the right load balancing algorithm helps ensure that no single server is overwhelmed.

Let's dive deeper into the types of load balancing algorithms.

1. Round Robin Algorithm

Round Robin (RR) distributes incoming requests among servers in cyclic order. It assigns the first request to the first server, the next to the second server, then the third, and so on. Once it reaches the last server in the pool, it starts again from the first server.

This algorithm works well for stateless applications where each request can be handled independently. It also works well when the application servers have equal capacity and performance, since it distributes requests evenly among them.

However, note that the RR algorithm has no load awareness: it does not know the capacity or current load of each server. Additionally, in a stateful application, subsequent requests from the same client may be routed to different servers, because RR provides no session affinity.
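To make the cyclic order concrete, here is a minimal Python sketch of round robin selection. The server names are made up for illustration, and a real load balancer would also handle health checks and servers joining or leaving the pool.

```python
from itertools import cycle

# Hypothetical server names, used only for illustration.
servers = ["server-1", "server-2", "server-3"]


def round_robin(pool):
    """Return an iterator that walks the pool in a repeating cyclic order."""
    return cycle(pool)


picker = round_robin(servers)

# Assign seven incoming requests; after the third server the cycle restarts.
assignments = [next(picker) for _ in range(7)]
print(assignments)
```

Notice that the picker never looks at how busy a server is, which is exactly the lack of load awareness described above.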

2. Least Connections

Least Connections (LC) is a dynamic load balancing algorithm that routes each request to the server with the fewest active connections. Suppose server 1 has 10 active connections, server 2 has 35 and server 3 has 5. The LC algorithm sends subsequent requests to server 3 until its connection count catches up with server 1's 10, assuming no connections close in the meantime.

This algorithm works well for applications with long-lived connections, such as stateful applications that keep a session open, where some connections last much longer than others. Compared to the RR algorithm, the LC algorithm takes the number of active connections on each server into account while distributing load, which makes it a better fit for environments where servers have different capacities.

However, the LC algorithm is more complex to implement, since it requires continuous monitoring of active connections. Additionally, the connection count for every server must be tracked and kept up to date, which may introduce overhead.
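The selection rule can be sketched in a few lines of Python. This is only an illustration using the connection counts from the example above; it assumes connections stay open during the run, and the tie-breaking rule (first server listed wins) is my own choice rather than part of the algorithm.

```python
def pick_least_connections(active):
    """Return the server with the fewest active connections.

    Ties go to the server that appears first in the dict (an assumption
    made for this sketch).
    """
    return min(active, key=active.get)


# Connection counts from the example: server 3 is the least loaded.
active = {"server-1": 10, "server-2": 35, "server-3": 5}

chosen = []
for _ in range(6):
    server = pick_least_connections(active)
    active[server] += 1  # assume the new connection stays open
    chosen.append(server)

print(chosen)
```

In this run the first five requests go to server 3; once it also holds 10 connections, the sixth request goes to server 1 because of the tie-breaking rule.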

3. Weighted Round Robin

Weighted Round Robin (WRR) is an enhancement of the Round Robin algorithm. Here, the load balancer divides the load according to the weight assigned to each server. Suppose server 1 has a weight of 0.6, server 2 has a weight of 0.2 and server 3 has a weight of 0.2. With the WRR algorithm, server 1 receives about 60% of the requests, while server 2 and server 3 receive about 20% each.

The WRR algorithm ensures that less powerful servers are not overwhelmed with too many requests, optimizing the overall performance of the application. Additionally, weights can be adjusted to reflect changes in server capacity or the addition of new servers.

However, the WRR algorithm may not always provide optimal load balancing under highly variable load patterns, because the weights are static. Additionally, determining the right weight for each server can introduce overhead in a dynamic environment where server performance fluctuates.
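There are several ways to implement weighted selection. The sketch below uses one of them, a "smooth" weighted round robin that interleaves servers instead of sending a burst to the heaviest one. It is an illustration, not the only approach: I am assuming integer weights 3, 1 and 1, which keep the same 0.6 / 0.2 / 0.2 ratio as the example, and the server names are hypothetical.

```python
from collections import Counter


def smooth_weighted_round_robin(weights):
    """Yield server names in proportion to their integer weights."""
    total = sum(weights.values())
    current = {name: 0 for name in weights}
    while True:
        # Every server earns credit equal to its weight on each round.
        for name, weight in weights.items():
            current[name] += weight
        # The server with the most credit is picked and pays back the total.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        yield chosen


# Integer weights in the same 0.6 / 0.2 / 0.2 ratio (assumed for this sketch).
weights = {"server-1": 3, "server-2": 1, "server-3": 1}

picker = smooth_weighted_round_robin(weights)
sequence = [next(picker) for _ in range(10)]
counts = Counter(sequence)
print(sequence)
print(counts)
```

Over ten requests, server 1 handles six and the other two handle two each, matching the 60/20/20 split.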

4. Weighted Least Connections

Weighted Least Connections (WLC) is a combination of the LC and WRR algorithms. That is, it takes into consideration both the weight of each server and its number of active connections. For example, if server 1 has a weight of 0.6 and 100 active connections, whereas server 2 has a weight of 0.2 but only 2 active connections, subsequent requests will be routed to server 2, because it has far fewer connections relative to its weight. This provides the real-time load awareness that WRR lacks.

The WLC algorithm takes into account both the real-time load and the relative capacity of each server. This ensures a balanced distribution of requests, which in turn leads to optimal utilization of resources.

However, tracking the weight and active connections of each server introduces overhead on the load balancer. It is also a little more complex to implement than WRR or LC.

Note: If your web application experiences highly variable traffic patterns, the WLC algorithm is recommended.
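One common way to combine the two signals is to compare active connections divided by weight and pick the lowest ratio. The sketch below assumes that scoring rule (other formulations exist) and reuses the numbers from the WLC example; the server names and dictionary layout are hypothetical.

```python
def pick_weighted_least_connections(servers):
    """Return the server with the lowest connections-to-weight ratio."""
    return min(
        servers,
        key=lambda name: servers[name]["active"] / servers[name]["weight"],
    )


# Numbers from the example: a heavily loaded big server and an idle small one.
servers = {
    "server-1": {"weight": 0.6, "active": 100},
    "server-2": {"weight": 0.2, "active": 2},
}

choice = pick_weighted_least_connections(servers)
print(choice)  # server-2
```

Server 1 scores roughly 167 (100 / 0.6) while server 2 scores 10 (2 / 0.2), so server 2 wins even though its weight is lower.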

5. IP Hash

IP Hash is a technique where the load balancer assigns client requests to servers based on the client's IP address. Here, the load balancer uses a hashing function to convert the client's IP address to a hash value, and then uses that value to determine which server should receive the request. Because the same IP address always maps to the same server as long as the server pool does not change, session persistence is maintained. I will discuss this in detail in my subsequent posts.
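As a quick preview, the idea can be sketched as hashing the IP address and taking the result modulo the number of servers. The choice of SHA-256, the server names and the sample addresses (taken from the documentation-only IP ranges) are my own assumptions for this illustration.

```python
import hashlib

# Hypothetical server pool, used only for illustration.
servers = ["server-1", "server-2", "server-3"]


def pick_by_ip_hash(client_ip, pool):
    """Map a client IP address to one server in the pool."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap it
    # around the pool size to get a stable index.
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]


first = pick_by_ip_hash("203.0.113.7", servers)
second = pick_by_ip_hash("203.0.113.7", servers)
other = pick_by_ip_hash("198.51.100.23", servers)
print(first, second, other)
```

The same address always lands on the same server here, but with a plain modulo most clients would be remapped if a server were added or removed.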

6. Least Response Time

Least Response Time (LRT) is a dynamic algorithm where requests are sent to the server with the lowest average response time. It aims to have each request processed as quickly as possible, making optimal use of server resources.

In this algorithm, the load balancer constantly monitors the response time of each server and routes requests based on this monitoring. Routing decisions are adjusted as the real-time performance data changes.

The LRT algorithm provides better resource utilization, dynamic load balancing and optimized performance.

However, similar to other algorithms such as WLC, continuously monitoring response times and rebalancing may introduce overhead on the load balancer. Additionally, network latency and constantly changing response times can trigger frequent rebalancing, which may have a negative impact on performance.
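Here is a minimal sketch of the monitoring and selection steps. It assumes the average is kept as an exponential moving average with a smoothing factor `alpha`, which is one possible choice and not something the algorithm prescribes. The class name, method names and timings are all hypothetical.

```python
class LeastResponseTimeBalancer:
    """Track a smoothed response time per server and pick the fastest."""

    def __init__(self, servers, alpha=0.5):
        self.alpha = alpha  # weight given to the newest measurement
        self.avg_ms = {name: None for name in servers}

    def record(self, server, elapsed_ms):
        """Fold a new response time measurement into the running average."""
        previous = self.avg_ms[server]
        if previous is None:
            self.avg_ms[server] = float(elapsed_ms)
        else:
            self.avg_ms[server] = (
                self.alpha * elapsed_ms + (1 - self.alpha) * previous
            )

    def pick(self):
        """Return an unmeasured server if any, else the lowest average."""
        for name, avg in self.avg_ms.items():
            if avg is None:
                return name
        return min(self.avg_ms, key=self.avg_ms.get)


balancer = LeastResponseTimeBalancer(["server-1", "server-2", "server-3"])
balancer.record("server-1", 120)
balancer.record("server-2", 80)
balancer.record("server-3", 200)
before = balancer.pick()  # server-2 has the lowest average so far

balancer.record("server-2", 400)  # server-2 slows down sharply
after = balancer.pick()
print(before, after)
```

A smaller `alpha` makes the average react more slowly to a single slow response, which is one way to dampen the frequent rebalancing mentioned above.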

7. Custom Load

A custom load algorithm lets you define your own metrics and rules for distributing traffic across a pool of servers. This is a highly configurable approach: it lets you tailor your routing mechanism to your requirements and specific criteria, unlike the other algorithms discussed above, which have their own predefined conditions.

A custom load algorithm offers flexibility, adaptability and optimized resource utilization, making it suitable for complex environments.

However, if incorrect metrics or rules are defined, they may create unnecessary bottlenecks, leading to suboptimal load balancing and performance issues that can be hard to track down.
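What a custom rule looks like depends entirely on your system, so treat the following as a made-up example. It assumes each server reports a CPU utilization (0 to 1) and a request queue depth, and it combines them with arbitrary 70/30 weighting. The metric names, the weighting and the normalization are all hypothetical.

```python
def pick_custom(metrics, score):
    """Return the server whose metrics give the lowest score."""
    return min(metrics, key=lambda name: score(metrics[name]))


def example_score(m):
    """Hypothetical rule: 70% CPU utilization, 30% queue depth.

    Queue depth is divided by 100 to bring it roughly onto the same
    0-to-1 scale as CPU utilization (an assumption for this sketch).
    """
    return 0.7 * m["cpu"] + 0.3 * (m["queue"] / 100)


# Made-up metrics reported by three servers.
metrics = {
    "server-1": {"cpu": 0.90, "queue": 10},
    "server-2": {"cpu": 0.40, "queue": 50},
    "server-3": {"cpu": 0.50, "queue": 5},
}

choice = pick_custom(metrics, example_score)
print(choice)  # server-3
```

Server 2 has the lowest CPU usage, but its long queue pushes its score above server 3's. A badly chosen weighting could just as easily favor the wrong server, which is the risk described above.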

Conclusion

Although a load balancer is essential for a contemporary tech stack, selecting a load balancing technique that fits your environment and application complexity is equally important. Researching the nuances of your system and routinely revisiting your criteria for selecting a load balancing method is highly advised.