HA cluster for Always On availability groups on MS SQL Server 2022 for Linux using Pacemaker, for hosting a 1C infobase

The fault tolerance of the 1C information base, which should work 24/7, [...] where even the heading “How to finish off a lying person” succinctly, accurately, and with humor conveys the essence of STONITH.

Strictly speaking, fencing is not needed in this particular case. It is most often used on clusters whose nodes have exclusive access to some shared resource, for example iSCSI storage. Also be aware of the pitfalls. Suppose we decide to service one of the servers in the cluster and perform a manual failover, demoting the serviced server to the secondary (slave) role for all important resources. We stop the mssql service and start updating packages; the node goes into a failed state, the fencing timeout is triggered, and the node reboots or shuts down right in the middle of the update. It is therefore worth either increasing the timeout or disabling fencing while a node is under routine maintenance. In this guide, the STONITH setup is meant to demonstrate the possibilities rather than to meet a practical necessity.
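Disabling fencing for the duration of maintenance, as suggested above, could be wrapped in a pair of small shell helpers. This is only a sketch: the function names and the DRY_RUN switch are assumptions made for illustration, while stonith-enabled is the same cluster property that is set later in this guide.

```shell
#!/bin/sh
# Hypothetical helpers for a maintenance window.
# DRY_RUN=1 (the default) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Switch fencing off for the whole cluster before servicing a node.
fencing_off() {
    run crm configure property stonith-enabled=false
}

# Switch fencing back on once the node is healthy again.
fencing_on() {
    run crm configure property stonith-enabled=true
}
```

Call fencing_off before stopping mssql and updating packages, and fencing_on once the node has rejoined the cluster. Setting DRY_RUN=0 makes the helpers execute the commands for real.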

You can view the entire list of available fencing resources like this:

# crm ra list stonith

IPMI is also supported, as is interaction with the proprietary web management interfaces of branded servers such as iLO, IMM, and DRAC. Agents can talk to the APIs of popular hypervisors and container systems such as pve, xenapi, vbox, azure, vmware, libvirt, docker, and so on. You can even fence a node by sending a command to the UPS it is connected to, provided the UPS is on the list of supported ones; as a last resort, you can issue a NUT command that will “turn off” the UPS of the failed node.

In this case the nodes are virtual machines on XCP-ng (a free analogue of Citrix XenServer), which also uses XenAPI to manage virtual machines. In the list of available resources, the agent is listed as ‘fence_xenapi’.

To see the full list of parameters for this resource:

# crm ra info stonith:fence_xenapi

Press ‘q’ to exit parameter view mode.

Create a STONITH resource for mssql-test-1:

# crm configure primitive mssql-test-1.stonish stonith:fence_xenapi \
params \
username="root" \
password="a very complex password" \
session_url="http://<hypervisor IP>" \
plug="8d5140e9-9a20-f2a0-e712-1dee4fbdf067" \
pcmk_host_list=mssql-test-1

Create a STONITH resource for mssql-test-2:

# crm configure primitive mssql-test-2.stonish stonith:fence_xenapi \
params \
username="root" \
password="a very complex password" \
session_url="http://<hypervisor IP>" \
plug="<UUID of the mssql-test-2 VM>" \
pcmk_host_list=mssql-test-2
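The plug value appears to be the UUID of the virtual machine on the hypervisor, so each node needs its own. Assuming shell access to the XCP-ng host, and assuming each VM's name label matches its cluster node name (neither is stated in this guide), the UUID could be looked up with the xe CLI. The helper name vm_uuid is made up for this sketch.

```shell
#!/bin/sh
# Run on the XCP-ng host, not on the cluster nodes.
# Assumption: the VM name label is the same as the cluster node name.
# $1 = VM name label
vm_uuid() {
    # --minimal prints only the value, without field labels
    xe vm-list name-label="$1" params=uuid --minimal
}

# Usage: vm_uuid mssql-test-1
```

The printed value is what goes into the plug parameter of the fencing resource for that node.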

After creating the STONITH resources for each node, constrain them so that they do not run on the node they were created to reboot.

# pcs constraint location mssql-test-1.stonish avoids mssql-test-1=INFINITY
# pcs constraint location mssql-test-2.stonish avoids mssql-test-2=INFINITY
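The two commands above use pcs, while the rest of this guide uses crm. If only crmsh is installed, an equivalent location constraint could look like the sketch below. The constraint IDs (loc-...) are invented for this example, and the resource names follow the node.stonish pattern used above. The function only prints the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Print one crmsh location constraint per node. A score of -inf keeps
# each fencing resource off the node it is meant to fence.
print_fence_constraints() {
    for node in "$@"; do
        echo "crm configure location loc-${node}-stonish ${node}.stonish -inf: ${node}"
    done
}

print_fence_constraints mssql-test-1 mssql-test-2
```

The existing constraints can be inspected afterwards with crm configure show.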

Finally, enable STONITH:

# crm configure property cluster-recheck-interval=2min
# crm configure property start-failure-is-fatal=true
# crm configure property stonith-timeout=60
# crm configure property stonith-action=reboot
# crm configure property concurrent-fencing=true
# crm configure property stonith-enabled=true
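To confirm that a property was actually stored in the cluster configuration, its value can be read back with crm_attribute. The helper below is a minimal sketch; the name check_property and the output format are assumptions for illustration.

```shell
#!/bin/sh
# Compare a cluster property with the value we expect.
# $1 = property name, $2 = expected value
check_property() {
    # --quiet prints only the value of the queried attribute
    actual=$(crm_attribute --type crm_config --name "$1" --query --quiet 2>/dev/null)
    if [ "$actual" = "$2" ]; then
        echo "OK   $1=$actual"
    else
        echo "DIFF $1=${actual:-<unset>} (expected $2)"
        return 1
    fi
}

# Usage: check_property stonith-enabled true
```

Running crm_verify --live-check afterwards is another way to check the live configuration for errors.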

To verify that a STONITH resource is working properly:

# stonith_admin -t 20 --reboot mssql-test-2

If the cluster has a fencing resource configured to manage the mssql-test-2 node, the virtual machine will reboot.
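Before running the reboot test above, it may be worth checking that the cluster actually knows a device able to fence the target node. A minimal sketch, assuming the resource names from this guide: can_fence is a hypothetical helper, and it matches only on the device name because the exact output format of stonith_admin differs between Pacemaker versions.

```shell
#!/bin/sh
# $1 = node to be fenced, $2 = fencing resource expected to handle it
can_fence() {
    # --list prints the devices able to fence the given host
    if stonith_admin --list "$1" 2>/dev/null | grep -qF -- "$2"; then
        echo "$1 can be fenced by $2"
    else
        echo "no registered device matching $2 for $1"
        return 1
    fi
}

# Usage: can_fence mssql-test-2 mssql-test-2.stonish
```

After a test reboot, stonith_admin --history mssql-test-2 shows the fencing actions recorded for that node.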

That, in principle, is all I wanted to say about creating a failover cluster for Always On availability groups.
