Load Balancing and Scalability (LBS)

Technology today is developing rapidly, and with it the human need for information, much of which is now sought on the internet. The growing number of internet users means a growing load on the paths that serve them, which forces designers to build equipment that can accommodate that many users. Two results of that design effort address the problem of large numbers of internet service users: load balancing and scalability.

Load balancing is the process of distributing load across a group of servers or network devices when requests arrive from users. When a server is accessed, it is burdened with processing its users' requests: the more users there are, the more processing the server must do.

The server opens communication sessions so that its users can consume its services. If only one server carries the load, it cannot serve many users, because its processing capacity is limited. Those limits can come from many things, such as processing power, internet bandwidth, and more.

The ideal solution, then, is to divide the incoming load among several servers, so that serving users is not centered on a single device. This is what is called a load balancing system.

For example, when you access www.detik.com, the web server holding the news documents serves you immediately. It provides what you request by opening a communication session on the HTTP service, port 80; the main page is sent to your PC over that port, and you see it in your browser.

When you click a link on the page, your request is processed by the server again. The web server serves it in whatever way its administrator has configured: directing you into a certain folder, running a script, sending images, playing a sound clip, and so on. Throughout this, the detik.com server is burdened by your request; once the page or service you asked for has loaded, the process is complete and the server is free of that burden again.

If you were the only one accessing www.detik.com, a load balancing system would not be needed; one server would still be enough to serve you. But what if www.detik.com is opened by a large share of Indonesia's internet users, every second of every day, as it is now? A single server may not be able to serve such heavy demand: requests keep coming, and processing never stops.

Generally, internet users do not want to lose even a few seconds when moving between sites and other internet facilities. Once they are connected, every second is valuable; every second of their time matters because it might change their lives drastically.

Beyond this high level of dependency, the cost of an internet connection may also be a factor. Users certainly do not want to spend money in vain just to wait minutes for an internet banking page to open, for example. In essence, internet users are very sensitive to waiting time and smoothness once they are online.

The convenience and smoothness of browsing are supported by many factors: large bandwidth, servers built on current processing technology with plenty of memory, and fast, high-capacity storage media, among others. Given how crucial this smoothness is, internet service providers, web and e-mail providers, e-commerce companies, and other internet facility providers must pay serious attention to the quality of their connections and the reliability of their servers.

Understanding Load Balancing Systems

As explained above, a load balancing system can be built in many ways. It is not tied to a single operating system, nor can it only be built from one kind of device. In general, though, the ways of building a load balancing system fall into three major categories:

  1. DNS round robin
  2. Integrated load balancing
  3. Dedicated load balancing

These three types work in distinct ways, but they lead to the same end result: a system that better guarantees the survival of the network behind it and makes it more scalable.

Load is a central concern in any system expected to handle many simultaneous requests. Load balancing is the process of moving work from a heavily loaded host to a lightly loaded one, with the aims of lowering the average time needed to complete a task and increasing processor utilization.

1. DNS Round-Robin

The simplest way to build a load balancing system is the DNS round robin method. This method is actually a feature of BIND (Berkeley Internet Name Domain), an open source application for building DNS servers that has become a de facto standard. The DNS round robin system relies on neat, orderly name entry combined with a round robin rotation scheme.

As you know, DNS is a naming system for computer devices, built on top of their IP addresses. A device with an IP address can be given a name and then be accessed by that name, provided a DNS server is available.

The naming system has many benefits, such as simply making it easier to access or to process further. You will certainly find it easier to remember specific names than a series of IP address numbers, right?

From this naming system, a simple and inexpensive load balancing system can be created that utilizes the natural properties of the BIND program, namely the round robin rotation system.

In the DNS records that hold this naming information, you can enter several other (canonical) names to be represented by one primary name. Each of those names has its own record pointing to the IP address of a network device. Once the entry process is complete, you have a primary name that represents several other names, each of which in turn represents a network device, such as a server.

Here is the key: when someone accesses the primary name, their resolver contacts the DNS server. On receiving the request, the DNS server looks up the record for the primary name and finds the several other names tied to it. The DNS server then runs its round robin rotation to decide which of those names to hand back to each requester. At this point, load balancing has effectively occurred: the IP addresses of the servers behind the other names are given out to requesters in turn, following the round robin algorithm, so the load divides itself across the servers.
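
The rotation described above can be sketched in a few lines of Python (the IP addresses and the primary name are placeholders, and a real DNS server returns whole record sets rather than single strings):

```python
from itertools import cycle

# Hypothetical IP addresses of the servers behind one primary name.
SERVER_IPS = ["1.1.1.1", "1.1.1.2", "1.1.1.3", "1.1.1.4"]

class RoundRobinDNS:
    """Toy model of a DNS server rotating its answer per query."""
    def __init__(self, ips):
        self._rotation = cycle(ips)

    def resolve(self, name):
        # Each query gets the next IP in the rotation, so clients
        # are spread across the servers over time.
        return next(self._rotation)

dns = RoundRobinDNS(SERVER_IPS)
answers = [dns.resolve("myserver.mydomain.com") for _ in range(8)]
print(answers)  # each IP appears twice, in rotation order
```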

For example, say you have four servers to run your company's website, and your primary domain name is myserver.mydomain.com. You want to put all four into a load balancing system so the load is not centralized. With DNS round robin, all you need to do is enter your four servers into the DNS server in an orderly fashion.

Figure 52. DNS Round Robin

Suppose the servers are named myserver0.mydomain.com through myserver3.mydomain.com. Enter all of their IP addresses, giving each an A record (the record type usually used to describe a host). Then create the primary name and enter the server names under it as CNAME records.

Figure 53. CNAME record

With this configuration, every time a user accesses the primary name, the DNS server hands out the IP information in turn, in order from myserver0.mydomain.com through myserver3.mydomain.com.
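
As a concrete illustration, a zone file for this setup might look like the hypothetical fragment below (names and addresses are placeholders; in practice, round-robin setups also commonly give the primary name one A record per server, which BIND rotates in the same way):

```
; Hypothetical fragment of the mydomain.com zone (illustrative only)
myserver0   IN  A      1.1.1.1
myserver1   IN  A      1.1.1.2
myserver2   IN  A      1.1.1.3
myserver3   IN  A      1.1.1.4

; Alternative common form: several A records under the primary name;
; BIND rotates the order of these answers on each query.
myserver    IN  A      1.1.1.1
myserver    IN  A      1.1.1.2
myserver    IN  A      1.1.1.3
myserver    IN  A      1.1.1.4
```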

This load balancing system is relatively easy and simple to implement, but it has significant weaknesses. A common problem arises when another DNS server on the internet (say, DNS server A) caches its first lookup result.

If the first IP that DNS server A obtains for myserver.mydomain.com is 1.1.1.2, DNS server A never learns the other addresses. Users who resolve through that server therefore never see the load balancing system, and for them load balancing does not work.

Another weakness is that when a server in the system fails, DNS cannot detect it. The failed server keeps receiving requests from outside even though it cannot serve them, and new chaos begins at once.

2. Integrated Load Balancing

Figure 54. Integrated Load Balancing

As the name implies, Integrated load balancing is usually an additional load balancing solution from an application or operating system. Usually applications or operating systems that have this feature are those that have the ability to operate as a server.

Load balancing is not the primary function of such products, so the features, performance, and capabilities on offer are usually fairly simple, aimed at small to medium scale systems, and general rather than specific. Even so, the feature is very useful on the right network.

One such integrated load balancing feature can be found in Microsoft Windows 2000 Advanced Server, where it ships as an add-on. On this operating system, which has strong networking capabilities, you can configure load balancing fairly easily, and the features provided for the purpose are reasonably complete. The load balancing technologies available on Windows 2000 Advanced Server and Windows 2000 Datacenter Server are as follows:

2.1 Network Load Balancing (NLB)

Network Load Balancing is a facility that lets Windows 2000 Advanced Server machines balance load across applications running over IP networks. Applications that run over IP, such as HTTP/HTTPS, FTP, and SMTP, can easily be load balanced with it. Using NLB, you can create a cluster of servers with load balancing for all TCP, UDP, and GRE (Generic Routing Encapsulation) services. The cluster presents what is called a Virtual Server, a central point through which the servers behind it are accessed. With this facility, the services those servers run are more likely to keep running smoothly. It is ideal for front-end services such as web servers, where it reduces problems like server bottlenecks.

2.2 Component Load Balancing (CLB)

This technology provides load balancing for the components that support a piece of software or an application. The applications that can be balanced this way are those whose components use COM+. By load balancing COM+ components across several servers, the application's operation is better guaranteed and scales further to serve its users.

2.3 Server Cluster

With Server Cluster technology, applications and data spread across several separate servers can be combined into a single cluster configuration. All of them can work together to serve users while data integrity is maintained. This technology is usually ideal for back-end applications and databases.

Integrated load balancing is not only found in Windows 2000. If you are an open source user running Apache as your web server, the Backhand module is a dedicated module for giving your servers clustering capability.

To build a more scalable load balancing system on Linux, Linux Virtual Server (LVS) is one application you can use. LVS has become something of a standard for load balancing in the open source world, and its methods and technologies are varied and no less capable than what Windows 2000 offers. For all its capability and simplicity, integrated load balancing has one notable shortcoming: each add-on feature cannot serve servers or devices on other platforms. Microsoft's load balancing feature cannot be used by the Apache web server, and conversely the Apache module cannot be used by Microsoft IIS; likewise, IBM WebSphere's server farm solution cannot be used by systems on other platforms.

3. Dedicated Load Balancing

Figure 55. Dedicated Load Balancing

This method is regarded as true load balancing because the device's work and processing are dedicated entirely to balancing load for the servers or network behind it. In general, it is further divided into three types:

3.1 Load Balancing with Hardware or Switch

This type of load balancing system is built around a chip designed specifically for the job. Such a chip is usually called an ASIC (application-specific integrated circuit): a special microprocessor that executes only specific algorithms and calculations. With an ASIC, load balancing performance is beyond doubt, because only the calculation and logic of load balancing are optimized into it.

This type of load balancer generally takes the form of a switch. In practice, such devices often demand special skill to operate, because their interfaces are less user friendly. Their flexibility is also low: most of the intelligence is embedded in the hardware, which makes adding features and facilities more difficult.

3.2 Load Balancing with Software

Figure 56. Load Balancing with Software

The most prominent advantage of a software load balancing solution is ease of operation; it is more user friendly than configuring a load balancing switch. Another advantage is that when new features or an upgraded version appear, you do not need to replace the whole device.

However, because the logic lives in software, you need a platform for it to run on: a computer with adequate specifications.

The performance of the load balancing process is therefore shaped by the machine as well; it cannot rely on capable software alone. The network card, the amount of RAM, and the size and speed of the storage media all affect how the software performs. Because of this, the performance of the whole load balancing system is harder to predict.

3.3 Load Balancing with a Combination of Hardware and Software

A third way to build a dedicated load balancing system is to combine the two approaches above: load balancing software running on hardware designed specifically to serve it. Special hardware deliberately optimized to support user-friendly, flexible load balancing software makes this type of device popular with today's users. Such a device is often called a black box load balancer.

Optimized hardware running a tuned Linux- or BSD-based platform is the configuration usually used to host the main load balancing software. This configuration yields many benefits for users and manufacturers alike: great flexibility, from using hardware that is always up to date to running an operating system with the latest patches.

As a result, the lifetime of these devices can be much longer than that of an inflexible dedicated switch, and the solution is usually much cheaper than a pure hardware one, or even a software-only one. An important part of any load balancing strategy is the migration policy, which determines when a migration occurs and which processes are migrated.

How Load Balancing Works

Step 1

A domain name request sent from a remote web browser enters the gateway. The request is checked by a load balancing algorithm that uses the gateway's current load statistics to determine which WAN port to use.

Step 2

The reply is sent to the remote web browser. The gateway will direct the browser session to the WAN port with the least traffic.

Step 3

This remote web browser then connects to the specified IP address that has an available WAN port.

Figure 57. Circuit in Load Balancing

1. On Load Balancer

  • The web browser makes a request, which arrives via WAN 1.
  • The domain name request is handed to the authoritative DNS module for processing.
  • The DNS module asks the WAN port monitoring module to supply the IP address with which to answer.
  • The WAN port monitoring module checks the traffic load on WAN 1 and WAN 2.
  • The load balancing algorithm is applied to the request; it honors the gateway user's preferences and the configured load share and load balance type.
  • The algorithm determines that WAN 2 has the least traffic and instructs the DNS module to answer with WAN 2.
  • The gateway's response is sent back via WAN 1 to the source of the DNS request.
  • The web browser receives the response, contacts the IP address returned for the domain name, and retrieves the requested information.
  • The requested information is then carried via WAN 2.
  • Information requested by a web browser can now be reached at a web or FTP server located behind the gateway.
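
The gateway's core decision in the steps above, answering with the WAN port carrying the least traffic, can be sketched as follows (the load figures are hypothetical; a real gateway measures them continuously):

```python
# Hypothetical current traffic per WAN port (e.g. in kbit/s).
wan_load = {"WAN1": 870, "WAN2": 240}

def pick_wan_port(loads):
    """Return the WAN port with the least current traffic."""
    return min(loads, key=loads.get)

# The DNS module would answer with the IP bound to this port.
print(pick_wan_port(wan_load))  # WAN2, since it carries less traffic
```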

2. Migration Process

There are two forms of migration process in load balancing:

  • Remote execution (also called non-preemptive migration): a new process (possibly chosen automatically) is started on a remote host.
  • Pre-emptive migration: a running process is stopped, moved to another node, and resumed there.

Load balancing can be initiated explicitly by the user or implicitly by the system; implicit migration may or may not make use of priority information. Every process transfer incurs overhead, so the granularity of the work being migrated must be weighed against the overhead of migrating it.

In principle, the load balancing method used must meet several criteria:

  1. Low measurement overhead, so measurements can be taken as frequently as possible to capture the most up-to-date system conditions.

  2. The ability to represent the load and the availability of the system's computing resources.

  3. Measurement and regulation that are independent of each other.

In implementing a load balancing strategy, several variations can be distinguished:

  1. Local or global. In local scheduling, each node schedules for itself, including setting the time slice on a single processor. In global scheduling, a central coordination point handles scheduling and decides where each process will run.

  2. Static or dynamic. The static model assumes that all information needed to place a process is available before the program runs; in the dynamic model, placement can change while the system is running. The dynamic model is also known as adaptive or dynamic assignment, and the static model as non-adaptive or one-time assignment.

  3. Optimal or suboptimal. In the optimal model, the strategy is chosen by considering the optimal value for the system as a whole.

  4. Approximation vs. heuristic. The first uses mathematical modeling approaches such as enumeration, graph theory, mathematical programming, or queuing theory; the second uses approaches such as neural networks or genetic algorithms. Within the mathematical models, either a deterministic or a probabilistic model can be chosen.

  5. Distributed or centralized. This concerns which party is responsible for decision making: a central system that makes migration decisions, or decisions spread across a distributed system.

  6. Cooperative or non-cooperative. In the non-cooperative model, each processor makes its decisions without relying on the others.

3. Load Balancing Algorithm

In a load balancing system, the load sharing process has its own techniques and algorithms. Complex load balancing devices usually provide a variety of load sharing algorithms. The goal is to adjust the load sharing to the characteristics of the servers behind it.

In general, the load sharing algorithms that are widely used today are:

  • Round Robin. The simplest algorithm, and the one most widely used by load balancing devices. It hands the load to each server in turn, one after another, forming a round.
  • Ratio. A ratio is a parameter assigned to each server entered into the load balancing system, and load is distributed according to it: a server with a large ratio receives a large share of the load, while a server with a small ratio receives less.
  • Fastest. This algorithm distributes load by favoring the servers with the fastest response. The server in the network that responds fastest takes the load when a request comes in.
  • Least Connection. This algorithm distributes load based on the number of connections each server is currently serving; the server with the fewest active connections receives the next incoming load.
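
As a rough sketch, three of these algorithms can be expressed in a few lines of Python (server names, ratios, and connection counts are made up for illustration; Fastest is omitted because it requires live response-time measurements):

```python
import itertools
import random

servers = ["srv-a", "srv-b", "srv-c"]

# Round Robin: hand the load to each server in turn, forming a round.
rr = itertools.cycle(servers)

def round_robin():
    return next(rr)

# Ratio: expand each server into the pool once per unit of its ratio,
# so a larger ratio receives a proportionally larger share of picks.
ratios = {"srv-a": 3, "srv-b": 1, "srv-c": 1}
weighted_pool = [s for s, w in ratios.items() for _ in range(w)]

def ratio_pick():
    return random.choice(weighted_pool)

# Least Connection: pick the server currently serving the fewest connections.
active_connections = {"srv-a": 12, "srv-b": 3, "srv-c": 7}

def least_connection():
    return min(active_connections, key=active_connections.get)

print([round_robin() for _ in range(4)])  # srv-a, srv-b, srv-c, srv-a
print(least_connection())                 # srv-b (fewest connections)
```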

4. Benefits of Load Balancing

The benefits of load balancing are felt most when your server or network is accessed by many users, or when a critical application on one server suddenly becomes unreachable because that server has failed: with load balancing, the work can be shifted to another server.

In general, the advantages of implementing load balancing are:

4.1 Ensuring service reliability

System reliability means the level of trust in a system to continue to serve users as well as possible. Guaranteed reliability means a level of trust that is always maintained so that users can use the service and do their work smoothly. This is very important for commercial sites.

4.2 Scalability and availability

One server serving thousands of users cannot possibly deliver good service; even a server built on the most sophisticated technology can be overwhelmed. Moreover, one server means one point of failure: if it suddenly dies, the site or service on it goes down with it. With a load balancing system, however, more than one server can stand behind a site or service. If one server is overwhelmed, you can add servers one by one to keep your site running smoothly; you do not need the most sophisticated server available to solve the problem.

The point of failure is also divided: if one server has a problem, the others still provide support, so the site or service you run does not necessarily go down when a single server fails.

4.3 Improved scalability

A scalable, load-balanced tier allows the system to maintain acceptable performance as the load grows, while also improving availability.

4.4 Higher availability

Load balancing allows us to take a server offline for maintenance without losing the running applications.

Computer Network Scaling

1. Two approaches to Scaling Servers:

a. Multiple Smaller Servers:

  • Adding servers for scalability
  • Most commonly done with web servers.

b. Slightly Larger Servers For Additional Internal Resources

  • Addition of processors, memory, and disk space
  • Most commonly done with database servers.

Figure 58. Scaling on the Network

2. Where to Use Scalability

  • On the network
  • On individual servers
  • Ensuring the capacity of a network before scaling by adding servers.

3. Scalability Approach

3.1 Service Provider Application

  • Scale up: replacing a server with a larger server
  • Scale out: adding extra servers.

3.2 Approach

3.2.1 Farming

  • A farm collects all of a site's servers, applications, and data in one dedicated place.
  • Farms host dedicated services (e.g. directory, security, HTTP, mail, database, etc.).

3.2.2 Cloning

Figure 8. Cloning

  1. A service can be cloned onto several replica nodes, each node holding the same software and data.

  2. Cloning offers scalability and availability.

  3. If one is overloaded, a load-balancing system can be used to allocate work between the duplicates.

  4. If one fails, the other will continue the service.
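
Points 3 and 4 above can be sketched as a tiny dispatcher (clone names and health states are hypothetical):

```python
# Health state of each clone: True means it can serve requests.
clones = {"clone-1": True, "clone-2": True, "clone-3": True}

def dispatch(request, order):
    """Send the request to the first healthy clone in the given order;
    when one clone fails, the others continue the service."""
    for name in order:
        if clones[name]:
            return f"{name} handled {request}"
    raise RuntimeError("all clones are down")

clones["clone-1"] = False  # simulate one clone failing
print(dispatch("GET /", ["clone-1", "clone-2", "clone-3"]))
```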

3.2.3 RACS (Reliable Array of Cloned Services)

1. A RACS collects the clones of one dedicated service.

2. Shared-nothing RACS:

  • Each clone keeps its own copy of the data on local storage
  • Updates must be applied to every clone's storage.

3. Shared-disk RACS (cluster):

  • All clones share storage through a storage manager
  • The storage server can tolerate faults.

3.2.4 Partition

Figure 59. Partition
Figure 59. Partition

1. The service grows through:

  • Duplication of hardware and software
  • Partitioning the data among the nodes (by object), e.g. a mail server partitioned by mailbox.

2. Partitioning should be transparent to clients: requests for a partitioned service are routed to the partition holding the relevant data.

3. Partitioning by itself does not increase availability, because each piece of data is stored in only one place.

4. Partitions are therefore usually implemented as packs of two or more nodes that share access to the storage.

4. Scalability Achievement

To see how scalability is achieved, the following discussion compares an existing non-load-balanced solution, which contains a single point of failure at the application tier, with a highly scalable solution that manages throughput and increases availability.

5. Non-Load Balancing Level

Initially, an organization may start with an architecture that is adequate for its early expectations. As load increases, however, the application tier must be adjusted to that load to keep delivering the expected service.

Figure 60. Application Server Level (source: ibm.com)

In the figure, the application tier contains only one application server (Appserver20), which serves requests from clients. If that server becomes overloaded, the application's responsiveness degrades, or the application becomes unavailable altogether.

6. Load Balancing Levels

To improve scalability and availability, an organization may use a load balancer to extend the application tier. In the following example, shown in the figure, two servers are added at the application tier to create a load-balanced cluster, which accesses data from the data tier and provides application access to clients.

Figure 61. Load Balancing Level

The result is a standard load-balanced design. The cluster of machines is assigned a virtual hostname (Appserver20) and a virtual IP address, alongside the individual addresses of Appserver1, Appserver2, and Appserver3.

The load balancer exposes the virtual IP address and hostname on the network and balances incoming requests across the healthy servers in the group. If Appserver1 fails, requests are simply routed to Appserver2 or Appserver3. Depending on the technology used to provide this capability, more servers can be added to the load-balanced cluster as the desired scale grows.

7. Conclusion

Load balancing is the process of distributing load across a group of servers or network devices when requests arrive from users. When a server is accessed, it is burdened with processing its users' requests; the more users there are, the more processing it must do. A load balancing system can be built in many ways: it is not bound to one operating system, nor to a single kind of device. In general, though, the approaches fall into three major categories: DNS round robin, integrated load balancing, and dedicated load balancing. The advantages of implementing load balancing include guaranteed service reliability as well as scalability and availability.

8. QUESTIONS

  1. What is meant by load balancing? Explain!
  2. Explain the three major categories in realizing a load balancing system!
  3. Explain the types of algorithms used in load balancing!
  4. Explain how load balancing works!
  5. State the advantages of using scalability and load balancing.
