Load Balancing and Maintaining Continuity in Failover Mode

How to make your own DNS balancer

There are several global services that provide high availability for data centers. However, for Russian users, the choice is limited. When we decided to build our own balancer, we set ourselves the following tasks:

  • automatically redirect traffic to a backup data center or cloud;
  • check server health indicators when redirecting traffic.

We started with a fork of the open source project Polaris GSLB and found experimentally that it did not suit us. In load tests, server availability checks took longer than the time limits specified in the settings. The problem is that Polaris is written in Python, and its monitoring part relies on threads. How threads behave under heavy load is another story. As for GSLB, we found a way out: we rewrote everything in Golang and cut the resources required for monitoring by a factor of two to three.

We use PowerDNS as the DNS server, with a custom backend written for it. Balancing at the DNS level does not require any special settings. It is also a fast, reliable, and time-tested solution.

We use two balancing algorithms.

  • Weighted Round Robin: an improved version of the Round Robin algorithm. Plain Round Robin distributes the load evenly, on the assumption that all servers have the same computing power. In the weighted version, the user assigns each server a weight according to its processing power, so the load is distributed more flexibly: servers with a larger weight process more requests.
  • Failover group: looks at server weights and always returns the server with the highest priority until it becomes unavailable.
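
The two algorithms above can be sketched in Go roughly as follows. This is only an illustration, not the code of our service: the `Server` type, its fields, and the function names are hypothetical, the weight doubles as the priority for the failover group, and the "smooth" variant of weighted round robin is picked here simply because it is short.

```go
package main

// Server is a hypothetical record for one target in a balancing group.
type Server struct {
	IP     string
	Weight int  // relative capacity (WRR) or priority (failover group)
	Up     bool // latest availability status from monitoring
}

// WRR hands out servers in proportion to their weights using the
// "smooth" weighted round-robin variant (an assumption of this sketch).
type WRR struct {
	servers []Server
	current []int // running score per server
}

func NewWRR(servers []Server) *WRR {
	return &WRR{servers: servers, current: make([]int, len(servers))}
}

// Next returns the IP of the next available server, or "" if none is up.
func (w *WRR) Next() string {
	total, best := 0, -1
	for i, s := range w.servers {
		if !s.Up || s.Weight <= 0 {
			continue // unavailable servers never get traffic
		}
		w.current[i] += s.Weight
		total += s.Weight
		if best == -1 || w.current[i] > w.current[best] {
			best = i
		}
	}
	if best == -1 {
		return ""
	}
	w.current[best] -= total
	return w.servers[best].IP
}

// Failover returns the available server with the highest weight,
// or "" if every server is down.
func Failover(servers []Server) string {
	best := -1
	for i, s := range servers {
		if s.Up && (best == -1 || s.Weight > servers[best].Weight) {
			best = i
		}
	}
	if best == -1 {
		return ""
	}
	return servers[best].IP
}
```

With weights 3 and 1, `Next` returns the first server three times out of every four calls. `Failover` keeps returning the heaviest server until its `Up` flag is cleared.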

How GSLB Works

The service consists of several modules.

  • DNS server

Responds to the client’s query “what’s the IP address of example.com?” with an IP address, for example, 89.22.165.223.

  • Availability Monitoring

Active monitoring polls the host by IP address at the specified time interval. Node availability is checked over HTTP, HTTPS, or TCP.
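
A minimal Go sketch of such polling might look like this. The function names, the "2xx means healthy" rule, and the way results are delivered over a channel are assumptions made for the example, not a description of the production monitor.

```go
package main

import (
	"net"
	"net/http"
	"time"
)

// checkTCP reports whether a TCP connection to addr ("host:port")
// can be opened within the timeout.
func checkTCP(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

// checkHTTP reports whether a GET request to url (http:// or https://)
// completes within the timeout with a 2xx status code.
func checkHTTP(url string, timeout time.Duration) bool {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return resp.StatusCode >= 200 && resp.StatusCode < 300
}

// monitor calls probe once per interval and sends each result to
// results until stop is closed.
func monitor(probe func() bool, interval time.Duration, results chan<- bool, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			select {
			case results <- probe():
			case <-stop:
				return
			}
		case <-stop:
			return
		}
	}
}
```

A caller would start one goroutine per node, for example `go monitor(func() bool { return checkTCP("192.0.2.10:443", 2*time.Second) }, 5*time.Second, results, stop)`, where the address and timings are placeholders.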

When the DNS server receives a query from a client (for example, “what’s the IP address of example.com?”), GSLB already knows the availability status of the data center. The client will never receive the IP address of a node that is down. GSLB decides whether a data center is available based on the check intervals and the number of liveness probes that the user has specified.
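
One common way to turn individual probe results into an up/down decision is to count consecutive results, as in the Go sketch below. The `Health` type and the `Rise` and `Fall` names are hypothetical stand-ins for the user-specified probe counts, and the node is assumed to start in the "down" state.

```go
package main

// Health tracks one node's status from consecutive probe results.
type Health struct {
	Rise, Fall int  // consecutive successes to mark up / failures to mark down
	up         bool // current status; a new node starts as down
	streak     int  // consecutive results that contradict the current status
}

// Observe records one probe result and returns the node's status.
func (h *Health) Observe(ok bool) bool {
	if ok == h.up {
		h.streak = 0 // result agrees with the current status
		return h.up
	}
	h.streak++
	limit := h.Fall
	if ok {
		limit = h.Rise
	}
	if h.streak >= limit {
		h.up = ok
		h.streak = 0
	}
	return h.up
}
```

With `Rise: 2, Fall: 3`, a single failed probe does not take a node out of DNS responses. Only three failures in a row do, and two successes in a row bring it back.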

Data centers can operate in active-active mode, where traffic is distributed evenly between them, or in an active-passive configuration, where one data center is always the main (active) one and the second (standby) one waits to take over traffic.

Case: how network balancing works for a client

The client is a financial company whose website is hosted in two data centers: the first is in an active state, and the second is in a standby state. One of the key requirements for the architecture is reliability. Users should always be able to connect to the site.

The client set up a GSLB balancing rule in #CloudMTS, configured the domain and a target group of two nodes, and selected the Failover Group algorithm.

In normal mode, all traffic goes only to the active data center. If the main data center fails and becomes unavailable, GSLB transfers all traffic to the backup data center.

How to connect GSLB

GSLB improves service resiliency and ensures the availability of data centers wherever they are. The service is not limited to customers of the #CloudMTS cloud: the data centers can be your own sites or sites of other cloud providers.

The setup is simple: define rules in the web interface, click the “Connect” button, and add NS records for the subdomain so that DNS requests are directed to the GSLB servers.

Next, we recommend conducting disaster-recovery testing: shut down one data center and check that the DNS switchover worked correctly.
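
The check at the end of such a test can be automated. The Go sketch below is one possible approach, with a hypothetical `switched` helper. The lookup function is passed in as a parameter with the same signature as the standard `net.LookupHost`, so the real resolver or a stub can be used.

```go
package main

// switched reports whether host now resolves to the standby address
// and no longer resolves to the primary address.
func switched(lookup func(host string) ([]string, error), host, primary, standby string) (bool, error) {
	addrs, err := lookup(host)
	if err != nil {
		return false, err
	}
	foundStandby := false
	for _, a := range addrs {
		if a == primary {
			return false, nil // the downed data center is still being returned
		}
		if a == standby {
			foundStandby = true
		}
	}
	return foundStandby, nil
}
```

In a real test you would call something like `switched(net.LookupHost, "example.com", primaryIP, standbyIP)` after shutting down the main data center. Keep in mind that resolvers cache answers, so the old address may still be returned until the record's TTL expires.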

You can connect GSLB through the website.


For cloud news visit Telegram channel #CloudMTS
