ServerIron ADX Server Load Balancing Guide
Release 12.0.00
June 15, 2009




How SLB Works
A Brocade ServerIron ADX running SLB software establishes a virtual server that acts as a front-end to physical servers, distributing user service requests among active real servers. SLB packet processing is based on the Network Address Translation (NAT) method. Packets received by the virtual server IP address are translated into the real physical IP address based on the configured distribution metric (for example, “round robin”) and sent to a real server. Packets returned by the real server for the end user are translated by SLB so that the source address is that of the virtual server instead of the real server.
NAT translation is performed for both directions of the traffic flow. Converting virtual services to real services requires IP and TCP checksum modifications.
Port translation is not performed for any virtual port that is bound to a default virtual port.
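The following Python sketch is illustrative only and is not ServerIron code; the packet fields and helper names are hypothetical. It shows the two translation directions described above: the inbound rewrite of the VIP to a real server address, the outbound rewrite of the real server address back to the VIP, and the checksum updates both rewrites require.

def nat_inbound(packet, real_ip):
    """Client -> VIP direction: rewrite the destination to the chosen real server."""
    packet["dst_ip"] = real_ip                        # was the virtual server (VIP) address
    packet["checksum"] = recompute_checksums(packet)  # IP and TCP checksums must be updated
    return packet

def nat_outbound(packet, vip):
    """Real server -> client direction: rewrite the source back to the VIP."""
    packet["src_ip"] = vip                            # hide the real server's address
    packet["checksum"] = recompute_checksums(packet)
    return packet

def recompute_checksums(packet):
    # Stand-in for the IP and TCP checksum modifications mentioned in the text.
    return hash(frozenset((k, v) for k, v in packet.items() if k != "checksum"))

pkt = {"src_ip": "192.168.1.50", "dst_ip": "10.0.0.1", "checksum": 0}  # client -> VIP
print(nat_inbound(pkt, real_ip="10.10.10.2"))                          # destination is now the real server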
Slow-Start Mechanism
When the ServerIron ADX begins sending client requests to a real server that has recently gone online, it allows the server to ramp up by using the slow-start mechanism. The slow-start mechanism allows a server (or a port on the server) to handle a limited number of connections at first and then gradually handle an increasing number of connections until the maximum is reached.
The ServerIron ADX uses two kinds of slow-start mechanisms:
The non-configurable server slow-start mechanism applies to a real server that has just gone online
The configurable port slow-start mechanism applies to individual TCP application ports that have just been activated on a real server
See “Slow-Start Mechanism” for more information.
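The Python sketch below illustrates the general idea of a slow-start ramp; the initial limit, step, and maximum values are assumptions chosen for illustration, not the ServerIron's actual parameters or algorithm.

def slow_start_limit(seconds_online, initial_limit=10, step=10, max_conn=1000):
    """Return the connection cap for a server that has been up for `seconds_online`."""
    limit = initial_limit + step * seconds_online
    return min(limit, max_conn)

for t in (0, 5, 50, 500):
    print(t, slow_start_limit(t))   # the cap ramps from 10 up to the 1000 maximum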
Load-Balancing Predictor
The predictor is the parameter that determines how to balance the client load across servers.
You can fine-tune how traffic is distributed across multiple real servers by selecting one of the following load balancing metrics (predictors):
Least Connections
Sends the request to the real server that currently has the fewest active client connections. For sites where servers have similar performance, the Least Connections option smooths distribution when one server becomes bogged down. For sites where server capacities vary greatly, Least Connections keeps the number of active connections roughly equal across servers; because faster servers process and terminate connections sooner, they receive more new connections than slower servers over time.
NOTE: The Least Connections predictor does not depend on the number of connections to individual ports on a real server but instead depends on the total number of active connections to the server.
The Least Connections predictor can be applied globally, for the entire ServerIron ADX, or locally, per virtual server, as described in “Changing the Load-Balancing Predictor Method”.
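A minimal Python sketch of the selection rule, assuming only per-server connection totals are tracked (the server names and counts are hypothetical):

def least_connections(servers):
    """servers: dict of server name -> total active connection count."""
    return min(servers, key=servers.get)

active = {"server1": 12, "server2": 7, "server3": 9}
print(least_connections(active))   # -> "server2", the server with the fewest connections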
Round Robin
Directs each service request to the next server in the list, treating all servers equally regardless of the number of connections. For example, in a configuration of four servers, the first request is sent to server1, the second request is sent to server2, the third is sent to server3, and so on. After all servers in the list have received one request, assignment begins with server1 again. If a server fails, SLB avoids sending connections to that server and selects the next server instead. The Round Robin predictor can be applied globally, for the entire ServerIron ADX, or locally, per virtual server, as described in “Changing the Load-Balancing Predictor Method”.
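A minimal Python sketch of round-robin selection that skips failed servers (the server names and health map are hypothetical):

import itertools

def round_robin(servers, healthy):
    """Yield the next healthy server for each new request."""
    for server in itertools.cycle(servers):
        if healthy.get(server, False):
            yield server

picker = round_robin(["server1", "server2", "server3", "server4"],
                     healthy={"server1": True, "server2": True,
                              "server3": False, "server4": True})
print([next(picker) for _ in range(5)])   # server3 has failed and is skipped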
Weighted Round Robin
Like the Round Robin predictor, the Weighted Round Robin predictor treats all servers equally regardless of the number of connections or response time. However, it uses a configured weight value that determines how many times each server is selected within a sequence, relative to the weights of the other servers. For example, in a simple configuration with two servers, where the first server has a weight of 4 and the second server has a weight of 2, each selection cycle consists of six connections: Server1, which has a weight of 4, is selected four times, and Server2, which has a weight of 2, is selected only twice.
This cycle will repeat as long as this predictor is in use.
The Weighted Round Robin predictor can be applied globally, for the entire ServerIron ADX, or locally, per virtual server, as described in “Changing the Load-Balancing Predictor Method”.
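The Python sketch below reproduces the two-server example under the assumption that each server is simply selected a number of times equal to its weight per cycle; the exact interleaving the ServerIron uses within a cycle is not shown here.

def weighted_round_robin_cycle(weights):
    """weights: dict of server name -> configured weight. Returns one selection cycle."""
    cycle = []
    for server, weight in weights.items():
        cycle.extend([server] * weight)
    return cycle

print(weighted_round_robin_cycle({"Server1": 4, "Server2": 2}))
# -> ['Server1', 'Server1', 'Server1', 'Server1', 'Server2', 'Server2']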
Weighted and Enhanced Weighted
Assigns a performance weight to each server. Weighted and Enhanced Weighted load balancing are similar to Least Connections, except that servers with a higher weight value receive a larger percentage of connections at a time. You can assign a weight to each real server, and that weight determines the percentage of the current connections that are given to each server.
NOTE: You must configure a weight for any real server that is bound to a VIP that is expected to load balance using the Weighted or Enhanced Weighted predictor.
For example, consider a configuration with five servers whose combined weight is 24, where server1 has a weight of 7, server2 has a weight of 8, and server3 has a weight of 2. Each server's share is its own weight divided by the combined weight: server1 gets 7/24 of the current number of connections, server2 gets 8/24, server3 gets 2/24, and so on. If a new server, server6, is added with a weight of 10, the combined weight becomes 34 and the new server gets 10/34.
If you set the weight so that your fastest server gets 50 percent of the connections, it will hold 50 percent of the connections at any given time. Because that server is faster than the others, it services connections at a higher rate and can therefore complete more than 50 percent of the total connections overall. Thus, the weight is not a fixed ratio but adjusts to server capacity over time.
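The following Python snippet works through the arithmetic above. The weights for server4 and server5 are assumptions chosen only so that the combined weight comes to 24; the text fixes only the weights of server1, server2, and server3.

weights = {"server1": 7, "server2": 8, "server3": 2, "server4": 4, "server5": 3}
total = sum(weights.values())          # 24
for name, w in weights.items():
    print(f"{name}: {w}/{total} of current connections")

weights["server6"] = 10                # adding a new server raises the combined weight to 34
total = sum(weights.values())
print(f"server6: {weights['server6']}/{total}")   # 10/34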
The difference between weighted and enhanced weighted load balancing is the method of distributing the traffic once it is assigned.
Connection Assignments for Weighted Load-Balancing
In weighted load balancing, the traffic is distributed by allocating all of the required connections sequentially: first to the server with the greatest weight, then to the server with the next greatest weight, and so on down the line, until all servers have received their share of connections. The process then repeats.
Table 2.1 shows the distribution pattern for Weighted Load-Balancing in an example configuration with three real servers, A, B, and C. Real server A has a weight of 1, real server B has a weight of 2, and real server C has a weight of 3. The numbers in bold indicate which server receives each new connection.
[Table 2.1: Connection assignments with the weighted predictor (table not reproduced here)]
Table footnote 1: For the weighted predictor, the server load is calculated as server load = connections ÷ server weight. Fractional remainders are rounded down. If there is a tie, the server with the highest weight receives the connection.
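A Python sketch of this selection rule, using the example weights from Table 2.1 (the function name and tuple layout are illustrative). Under this rule the first six connections go to C, C, C, B, B, A, which matches the sequential pattern described above.

def pick_weighted(servers):
    """servers: list of (name, connections, weight) tuples."""
    # Lowest load (connections // weight) wins; ties go to the highest weight.
    return min(servers, key=lambda s: (s[1] // s[2], -s[2]))[0]

servers = [("A", 0, 1), ("B", 0, 2), ("C", 0, 3)]
print(pick_weighted(servers))  # all loads are 0, so the tie goes to C (weight 3)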

Connection Assignments for Enhanced Weighted Load-Balancing
In enhanced weighted load balancing, the traffic is distributed in the same proportions as with weighted load balancing, but the order of distribution differs. The real server with the greatest weight is allocated a connection first, then the next connection goes to the server with the next greatest weight, and so on down the line, until every server has received its first connection. The process repeats, with each real server receiving a connection in sequence, until each real server has received a number of connections equal to its assigned weight.
Table 2.2 shows the distribution pattern for Enhanced Weighted Load-Balancing in the same example configuration: real server A has a weight of 1, real server B has a weight of 2, and real server C has a weight of 3. The numbers in bold indicate which server receives each new connection when the enhanced weighted predictor is configured.
[Table 2.2: Connection assignments with the enhanced weighted predictor (table not reproduced here)]
Table footnote 1: For the enhanced weighted predictor, the server load is calculated as server load = connections × (combined weights ÷ server weight). Fractional remainders are rounded down. If there is a tie, the server with the highest weight receives the connection.
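A Python sketch of this selection rule for the same three servers (the helper names are illustrative). Under this rule the first six connections go to C, B, A, C, B, C, matching the alternating pattern described above, and each server ends the cycle with a number of connections equal to its weight.

def pick_enhanced_weighted(servers):
    """servers: list of (name, connections, weight) tuples."""
    combined = sum(weight for _, _, weight in servers)
    # Lowest load wins; ties go to the server with the highest weight.
    return min(servers, key=lambda s: (s[1] * combined // s[2], -s[2]))[0]

connections = {"A": 0, "B": 0, "C": 0}
weights = {"A": 1, "B": 2, "C": 3}
for _ in range(6):
    chosen = pick_enhanced_weighted([(n, connections[n], weights[n]) for n in connections])
    connections[chosen] += 1
print(connections)  # connections match the 1:2:3 weights: {'A': 1, 'B': 2, 'C': 3}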

Weighted and Enhanced Weighted predictors can be enabled as described in “Changing the Load-Balancing Predictor Method”.
Dynamic Weighted Predictor
TrafficWorks provides a dynamic weighted predictor that enables the ServerIron to make load-balancing decisions using real-time server resource usage information, such as CPU utilization and memory consumption. The ServerIron retrieves this information through SNMP from MIBs available on the application servers.
To achieve this capability, the ServerIron uses a software process named the SNMP manager (also called the SNMP client). This process is different from the SNMP agent process (also known as the SNMP server process) on the ServerIron. A ServerIron can be configured as both an SNMP agent, which allows the ServerIron to be managed through a Network Management System, and an SNMP manager, which facilitates the SNMP-based predictor method. In addition, all of the real servers must run an SNMP agent daemon and support MIBs that can be queried by the ServerIron's SNMP manager.
You can fine-tune how traffic is distributed across these real servers by enabling the Dynamic Weighted predictor on the ServerIron.
The Dynamic Weighted predictor can be applied globally, for the entire ServerIron ADX, or locally, per virtual server, as described in “Changing the Load-Balancing Predictor Method” and “Configuring Dynamic Weighted Predictor”.
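The Python sketch below is conceptual only: it shows a poller in the SNMP-manager role querying each real server's SNMP agent for a resource metric and turning the responses into load-balancing weights. The OID and the snmp_get helper are placeholders for illustration, not a real MIB object or a ServerIron API.

import random

CPU_IDLE_OID = "1.3.6.1.4.1.99999.1.1"   # hypothetical OID used only for illustration

def snmp_get(server_ip, oid):
    """Stand-in for a real SNMP GET; here it just returns a random idle percentage."""
    return random.randint(10, 90)

def poll_weights(server_ips):
    """Return a weight per server derived from its SNMP response value."""
    return {ip: snmp_get(ip, CPU_IDLE_OID) for ip in server_ips}

print(poll_weights(["10.1.1.1", "10.1.1.2", "10.1.1.3"]))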
Dynamic-Weighted Direct
The SNMP response value from a real server is treated as that server's direct performance weight. Direct weighted load balancing is similar to Least Connections, except that servers with a higher weight value receive a larger percentage of connections. You can assign a weight to each real server, and that weight determines the percentage of the current connections that are given to each server.
NOTE: You must configure a weight for any real server that is bound to a VIP that is expected to load balance using a Dynamic-Weighted predictor.
 
For example, consider a configuration with five servers whose combined weight is 24, where server1 has a weight of 7, server2 has a weight of 8, and server3 has a weight of 2. Each server's share is its own weight divided by the combined weight: server1 gets 7/24 of the current number of connections, server2 gets 8/24, server3 gets 2/24, and so on. If a new server, server6, is added with a weight of 10, the combined weight becomes 34 and the new server gets 10/34.
If you set the weight so that your fastest server gets 50 percent of the connections, it will get 50 percent of the connections at a given time. Because this server is faster than the others, it can complete more than 50 percent of the total connections overall because it services the connections at a higher rate. Thus, the weight is not a fixed ratio but adjusts to the server capacity over time.
Dynamic-Weighted Reverse
The SNMP response from each server is regarded as a reverse performance weight. Dynamic-weighted reverse load balancing is similar to dynamic-weighted direct, except that servers with a lower weight value receive a larger percentage of connections. You can assign a weight to each real server, and that weight determines the percentage of the current connections that are given to each server.
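The Python sketch below contrasts the two methods; inverting the response value (1/value) for the reverse case is an assumption used only to illustrate how lower responses can translate into larger shares.

def connection_shares(snmp_values, reverse=False):
    """snmp_values: dict of server -> SNMP response value. Returns each server's share."""
    if reverse:
        # Invert the values so that lower responses translate into higher weights.
        weights = {s: 1.0 / v for s, v in snmp_values.items()}
    else:
        weights = dict(snmp_values)
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}

responses = {"server1": 80, "server2": 20}
print(connection_shares(responses))                # direct: server1 gets the larger share
print(connection_shares(responses, reverse=True))  # reverse: server2 gets the larger share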

Copyright © 2009 Brocade Communications Systems, Inc.