How we accelerated the automatic installation of OS on dedicated servers by 25-75%

Hi! My name is Lev, I am a system administrator at Selectel. I've already written about adapting ARM servers to our processes. Back then we ran into many problems, one of which was integrating Ubuntu 22.04 autoinstallation. Now I will share how we made the new autoinstallation process faster, added several features, and sped up the addition of new operating systems.

Use the table of contents if you don't want to read the entire text:

Problems with old autoinstallation
How we tried to make it better
What we got

Problems with old autoinstallation


First, I'll tell you how automatic installation was organized before.

After you ordered a server or changed the OS in its configuration, the following process was launched.

  1. The backend receives a request to reinstall the OS.
  2. The backend, using OpenStack Ironic, sends a command to reboot the server.
  3. The server boots into the network card's PXE environment, receives an address via DHCP and a command to chainload iPXE.
  4. iPXE again requests an address from DHCP. At the same time, it sends information via DHCP options, by which we can uniquely identify the server.
  5. The backend generates the autoinstallation configuration from templates: an iPXE script with kernel parameters; a preseed/kickstart/unattended.xml file with the partitioning and system configuration; a postinstall script with network and driver settings (Intel and Realtek network cards whose drivers have to be added to the images, I'm looking at you); and a client postinstall script.
  6. iPXE receives a one-time link to the boot script and to the Netinstall version of the distribution (except for Windows, where the process is more complicated and requires WinPE), passes the preseed/kickstart file to the installer, the OS is installed from our package mirrors, and the server reboots.
  7. The server boots from disk, the postinstall script is run, the network and drivers are configured, the boot order is changed back to the network card in UEFI.
  8. By calling the webhook, we inform the backend that the OS installation is complete and reboot the server.
  9. The server is ready to work. We notify the backend about this.

This version of autoinstall has been running since about 2018. It has survived a migration to a new backend version, a switch to an updated network scheme, a complete team rotation, and a change in all available operating systems. It has worked quite stably and predictably, although the code has become increasingly difficult to maintain over time.

Even now we use it in some cases that are not yet covered by the new autoinstall system. For example, for installing the OS on ARM servers, as well as for custom-configured servers with some less common RAID controllers. But at some point we realized that it began to impose too many restrictions on us.

In «Introduction to ARM» we tell you how ARM appeared and what it is used for, analyze the configurations of dedicated servers, and compare the characteristics of ARM processors.

Disk order

The templates were configured so that the target disks for OS installation were selected automatically in the order NVMe → SATA SSD → HDD. The chain is logical and in most cases matches the expected behavior. But what if the client wants to install the OS on a SATA SSD and use the NVMe drives for a database? The client or an engineer had to do this manually from an installation image, without automation. With a large number of ordered servers, this ate up a lot of resources.

Additional problems were caused by hardware RAID controllers in custom configurations. Since a hardware RAID is presented to the OS as a single block device, the installer sometimes saw only one disk in orders that requested a RAID1/RAID10 layout.

The autoinstaller couldn't handle this behavior, the OS wouldn't install, and an engineer or administrator had to look through KVM to figure out what exactly had happened. Meanwhile, the client was waiting for the server to be delivered. All of this hurt the user experience, and we had to spend additional resources on fixing errors.

Drivers and package base

There were also problems with Intel and Realtek network cards: the Netinstall version of the installer lacked a driver for the card, so the OS could not be installed. This usually happens when you try to install a relatively old version of the operating system on relatively new hardware.

To fix this, we had to unpack the installer's initramfs, build the required driver from source, put it in the right directory, describe its loading in a pre-script, and pack the initramfs back up. In some cases, for example with Debian, the installer refuses to work with drivers that are not digitally signed. On top of that, after the OS is installed on the server, the driver has to be installed or built again in the postinstall script.
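Roughly, the repacking looks like the sketch below. This is an illustration rather than our exact procedure: the paths, kernel version and module location are placeholders, and it assumes a Debian-style netboot initrd (a gzip-compressed cpio archive) and an r8125 module already built for the target kernel.

# Unpack the installer's initramfs
mkdir initrd-work && cd initrd-work
zcat ../initrd.gz | cpio -idm

# Put the pre-built module where the installer kernel expects it
# (kernel version and module path are placeholders)
install -D -m 0644 ../r8125.ko \
    lib/modules/5.15.0-76-generic/kernel/drivers/net/ethernet/realtek/r8125.ko

# Regenerate module dependencies inside the unpacked tree
depmod -b . 5.15.0-76-generic

# Pack the initramfs back
find . | cpio -o -H newc | gzip -9 > ../initrd.gz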

Part of the postinstall script template that is responsible for installing the driver in a DEB-based OS for motherboards with the X670 chipset:

{% if version in ["18.04", "20.04", "9", "10", "11"] %}
mb=$(dmidecode --string baseboard-product-name)
mb_array=("X670 AORUS ELITE AX" <...> "MAG X670E TOMAHAWK WIFI (MS-7E12)")
if [[ " ${mb_array[@]} " =~ " ${mb} " ]]; then
    # Build and enable the out-of-tree Realtek r8125 driver via DKMS
    apt install -y linux-headers-$(uname -r)
    wget -O /root/r8125-dkms.deb https://<path_to_file>/r8125-dkms.deb
    apt install -y /root/r8125-dkms.deb
    rm -f /root/r8125-dkms.deb
    echo 'r8125' > /etc/modules-load.d/r8125.conf
fi
{% endif %}

Another problem is related to the package mirrors for RHEL-based OSes. Previously, we did not update the installation files immediately after the release of a new minor version, since one of the standard steps of the postinstall script is to update the entire package base of the installed OS. However, this approach has an interesting failure mode: at some point the mirrors stop serving the BaseOS path for the required version.

The problem most often appears after one minor version. For example, if the installer is version 8.1 and the current release is 8.2, everything goes fine. But if the current release is 8.3 or later, everything breaks.

Anaconda error.

Syslog output at the time the error occurred.

Ubuntu 22.04

In version 22.04, the Ubuntu developers dropped debian-installer in favor of subiquity, ended preseed support in favor of cloud-init, and stopped building a Netinstall image of the OS.

Using cloud-init, however, didn't fit our process well: we would have had to completely rewrite all our templates. This was the moment to remember the legacy code and how bloated it had become. When Ubuntu 22.04 turned one year old and still wasn't available to our clients, we decided to speed up the transition to the new process. A small spoiler: adding Ubuntu 24.04 later took only a week.

OS installation time and templates

Some operating systems, for example Windows Server, took a very long time to install – sometimes more than an hour. We wanted to speed up the process. The last, though not the key, nuance: the preseed/kickstart file templates eventually turned into a wall of if-s and crutches (read: non-obvious technical solutions). Making changes to the part responsible for the dynamic softRAID + LVM layout in partman scared the whole team.

How we tried to make it better


One day we got together with the developers and tried to figure out how to make the system better. The main idea: install the OS from a “base image” – that is, prepare the root file system in advance, pack it into an archive, and then unpack it onto the disks and fine-tune it.

Of course, you can't do this “just like that” – you need an external system that will run the control script, partition the disks, generate boot configs, install the bootloader, and so on. We also wanted to automate the building and testing of the base image: that would save us from spending extra time on updating and adding new OSes. I'll tell you about everything in order.

Service OS and control script

We took Arch Linux as a basis again and added a minimal set of packages to the distribution.

We have two management utilities. The first is a service for interacting with the hardware at various stages: for example, running SMART checks, flashing firmware, and managing hardware RAID controllers.

The second is a script that manages the service described above: it launches the various stages in the right order and talks to our backend. Let's see what autoinstallation looks like in the new system.

  1. The backend receives a request to reinstall the OS.
  2. The backend, using OpenStack Ironic, sends a command to reboot the server. In parallel, a service VLAN is configured on the switch – this is necessary for connectivity with our DHCP server.
  3. The server boots into the network card's PXE environment, receives an address via DHCP and a command to chainload iPXE.
  4. iPXE again requests an address from DHCP. At the same time, it sends information via DHCP options, by which we can uniquely identify the server.
  5. The backend generates an iPXE script from a template to load the service OS and passes server identifiers in the kernel arguments.
  6. After booting, a systemd unit starts. It exports environment variables and runs the control script.
  7. The control script receives information about the OS (user password, SSH keys, user-data) from the backend, as well as information about the hardware: disk types and sizes and their partitioning scheme. It then prepares the hardware, partitions the disks, downloads and unpacks the base OS image, installs the bootloader, and generates the cloud-init configuration from templates.
  8. The control script notifies the backend that the installation has completed successfully and reboots the server.
  9. The backend sends a command to change the VLAN from service to client.
  10. Fork in the road. If the server is running in Legacy mode, it tries to boot over the network, fails to find a DHCP server, and falls back to booting from disk. In the case of UEFI, it boots directly from disk.
  11. On the first OS boot, cloud-init runs: it configures the network, hostname, time zone and time synchronization, changes the password of the system user (root/Administrator), adds SSH keys, and executes the client user-data, which can be set in the panel. On UEFI systems, we also change the boot order back so that the server tries to boot over the network first. This is needed so that the client can log in to Rescue mode, and so that after the server lease ends we can wipe the disks and run intermediate autotests.
  12. The server is ready for use.

Next, I'll describe how the control script operates in a bit more detail for the general case (Debian derivatives). There are some differences for Windows Server and RHEL derivatives.

1. Clean the server before installation: disassemble hardware RAID, delete file systems and partition tables, erase the service areas of the disks. This quick erase is used when reinstalling the OS on an already active server, within the same lease. After the server is released, all disks go through Secure Erase to ensure data confidentiality.
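A minimal sketch of the quick-erase step, assuming plain SATA/NVMe disks and mdadm arrays. The device patterns are examples; the real script also talks to hardware RAID controllers through vendor utilities.

# Stop any assembled software RAID arrays
for md in /dev/md*; do
    [ -b "$md" ] && mdadm --stop "$md"
done

for disk in /dev/sd? /dev/nvme?n1; do
    [ -b "$disk" ] || continue
    # Wipe md superblocks on the disk and its partitions
    mdadm --zero-superblock --force "$disk"* 2>/dev/null || true
    # Remove filesystem/RAID/LVM signatures and the partition table
    wipefs --all "$disk"
    sgdisk --zap-all "$disk"
done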

2. We receive from the backend a list of expected equipment (CPU, RAM, GPU, disk type and volume, desired partition scheme).

3. Get a list of physical block devices on the server.

4. Create new block devices via disk controllers. Relevant if the server uses hardware RAID controllers.

5. Compare the information from steps 2 and 3. This is necessary to understand how we will partition the disks and mount the partitions.

  • We create a partition table based on the data from step 2.
  • We create the service partitions: the EFI system partition and cidata. We mark them with the necessary labels and flags.
  • We create the remaining partitions in accordance with the configuration.
  • We assemble partitions into RAID if necessary. Previously, we also assembled partitions into LVM, but abandoned that because of the difficulties with subsequently disassembling and re-creating it.
  • We create file systems on the partitions (a minimal sketch of this step follows the list).
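Here is a heavily simplified sketch of the partitioning step for a two-disk layout with an EFI system partition, a cidata partition, and a root partition mirrored in RAID1. The device names and sizes are illustrative; in reality the layout is driven by the configuration received from the backend.

DISK_A=/dev/nvme0n1
DISK_B=/dev/nvme1n1

for disk in "$DISK_A" "$DISK_B"; do
    # GPT with an ESP, a cidata partition and a partition reserved for RAID
    sgdisk --clear \
        --new=1:0:+512M --typecode=1:ef00 --change-name=1:ESP \
        --new=2:0:+64M  --typecode=2:8300 --change-name=2:cidata \
        --new=3:0:0     --typecode=3:fd00 --change-name=3:root \
        "$disk"
done

# Software RAID1 for the root filesystem
mdadm --create /dev/md0 --run --level=1 --raid-devices=2 \
    --metadata=1.2 "${DISK_A}p3" "${DISK_B}p3"

# File systems; cloud-init's NoCloud datasource finds the CIDATA label
mkfs.vfat -F32 -n EFI "${DISK_A}p1"
mkfs.vfat -F32 -n CIDATA "${DISK_A}p2"
mkfs.ext4 -L root /dev/md0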

6. Mount the created file systems.

7. Download and unpack the base image.

Initially, we copied the base image to RAM and unpacked it from there onto the disks. But at some point we ran into configurations that did not have enough RAM to hold the base image: most Linux-based OSes needed at least 8 GB, and Windows Server needed 32 GB.

Then we ran tests and found that we could unpack the base image on the fly, streaming it straight to disk with `wget -qO- <image_url> | tar -xz -C <mountpoint>`. This method is a bit slower but requires far less RAM. Both variants are sketched below.
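A sketch of both variants; the image URL and mount point are placeholders.

IMAGE_URL="https://storage.example.com/images/ubuntu-22.04-stage2.tar.gz"
TARGET=/mnt/target   # the future root filesystem is already mounted here

# Variant 1: stage the archive in RAM first (fast, but needs enough memory)
wget -q -O /dev/shm/base.tar.gz "$IMAGE_URL"
tar -xzf /dev/shm/base.tar.gz -C "$TARGET"
rm -f /dev/shm/base.tar.gz

# Variant 2: stream the archive straight to disk (slightly slower, minimal RAM)
wget -qO- "$IMAGE_URL" | tar -xz -C "$TARGET"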

8. Generate cloud-init configuration files and put them in cidata.

In our case these are: meta-data with the server UUID and hostname; network-config with the settings of the Internet-facing port; user-data with additional client configuration (I really hope you use it – we had to try hard to make it work properly), which is specified in the OS installation request; and vendor-data, where we configure the time zone and NTP and set passwords and SSH keys.
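A simplified sketch of what ends up on the cidata partition. All values are placeholders; in reality the files are rendered from Jinja2 templates with data from the backend.

CIDATA=/mnt/cidata   # the mounted partition labeled "cidata"

cat > "$CIDATA/meta-data" <<'EOF'
instance-id: 2f1d9a3e-0000-0000-0000-000000000000
local-hostname: example-server
EOF

cat > "$CIDATA/network-config" <<'EOF'
version: 2
ethernets:
  eth0:
    addresses: [203.0.113.10/24]
    gateway4: 203.0.113.1
    nameservers:
      addresses: [203.0.113.1]
EOF

cat > "$CIDATA/vendor-data" <<'EOF'
#cloud-config
timezone: Europe/Moscow
ssh_authorized_keys:
  - ssh-ed25519 AAAA... client@example
EOF

# user-data comes straight from the client request; an empty file is also valid
: > "$CIDATA/user-data"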

9. Chroot into the unpacked system and perform additional configuration: generate mdadm.conf and the ramdisks, generate fstab, install GRUB, change the boot order.
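In bash terms, the chroot step looks roughly like this for a Debian-based image; a sketch assuming the root is on /dev/md0 and mounted at /mnt/target (the real script also covers RHEL-style tooling and Windows separately, and the device names and UEFI entry numbers are examples).

TARGET=/mnt/target

# Make kernel pseudo-filesystems available inside the chroot
mount -t proc proc "$TARGET/proc"
mount --rbind /sys "$TARGET/sys"
mount --rbind /dev "$TARGET/dev"
mount /dev/nvme0n1p1 "$TARGET/boot/efi"   # ESP, device name is an example

# Record the assembled arrays so the initramfs can find the root on boot
mdadm --detail --scan >> "$TARGET/etc/mdadm/mdadm.conf"

# fstab by UUID for the freshly created filesystems
ROOT_UUID=$(blkid -s UUID -o value /dev/md0)
echo "UUID=$ROOT_UUID / ext4 defaults 0 1" >> "$TARGET/etc/fstab"

# Rebuild the ramdisk and install the bootloader from inside the new system
# (--no-nvram: the NVRAM boot entries are managed from the service OS)
chroot "$TARGET" update-initramfs -u -k all
chroot "$TARGET" grub-install --target=x86_64-efi --efi-directory=/boot/efi --no-nvram
chroot "$TARGET" update-grub

# Put the desired entry first in the UEFI boot order (entry numbers are examples)
efibootmgr --bootorder 0001,0000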

10. Wait 60 seconds to ensure that the logs are uploaded. Notify the backend of the installation completion and reboot the server.

If the algorithm runs without deviations, the OS is ready for use within 7-15 minutes.

Assembling the image

To unpack the base image onto disk, it must first be built and tested. For this we use Packer with the QEMU builder and Ansible to fine-tune the OS. The build takes place in two stages.

First stage

We download the minimal ISO of the distribution, launch a VM using QEMU, point the installer at a very simple cloud-init configuration, and configure SSH so that Ansible can connect later. If you are already actively using Packer for your tasks, I most likely won't tell you anything new.

For those hearing about it for the first time: HashiCorp Packer is, according to the README on GitHub, a tool for creating identical machine images for multiple platforms from a single source configuration. What does this mean for us? We can describe an OS image as code, store it in Git, build it on our GitLab runners, and automatically upload it to object storage.

The part of the Packer config that is responsible for the first stage of Ubuntu build:

  dynamic "source" {
    for_each = local.images.ubuntu
    labels   = ["qemu.default"]
    content {
      name             = source.key
      vm_name          = "${source.key}-stage1.qcow2"
      output_directory = "${var.outputs_prefix}/${source.key}/stage1"
      format           = "qcow2"
      iso_url          = var.iso_image_url == "" ? source.value.url : var.iso_image_url
      iso_checksum     = var.iso_image_url == "" ? source.value.checksum : "none"
      boot_command = [
        "<enter>",
        "<wait2m>",
        "yes<enter>"
      ]
      cd_files = [
        "${var.preseeds_prefix}/cloud-init/*",
      ]
      cd_label     = "cidata"
      communicator = "none"
    }
  }

An example of a cloud-init file used for the first stage of an Ubuntu build:

#cloud-config
autoinstall:
 version: 1
 identity:
   username: ubuntu
   password: <password hash was here>
   hostname: ubuntu-server
 locale: en_US.UTF-8
 keyboard:
   layout: us
 storage:
   layout:
     name: direct
 ssh:
   install-server: true
   allow-pw: yes
 late-commands:
   - shutdown now
 user-data:
   runcmd:
     - sed -ie 's/#PermitRootLogin prohibit-password/PermitRootLogin yes/g' /etc/ssh/sshd_config
     - systemctl restart sshd
     - sudo apt-get update && sudo apt-get install --only-upgrade ansible

After the first build stage is complete, we upload the resulting stage1.qcow2 image to object storage. Now, if changes are needed in the second stage, we don't have to reinstall the OS: we take the finished image and apply the new settings. This saves time and the resources of our runners.

This works, however, only until we decide to update the OS version: in that case the installation ISO changes, and we start a full build from scratch again.

Second stage

We boot stage1.qcow2 in a VM and run Ansible. Let's see how our playbook works (an example for Ubuntu 20.04).

  1. We adjust the package mirrors.
  2. We update the package manager cache and all installed packages.
  3. We install linux-image-generic-hwe-20.04, a kernel with additional hardware support. It is needed for working with the newest hardware.
  4. We install fail2ban, cloud-init, mdadm, grub2, dracut, wget and nftables.
  5. We copy the cloud-init configuration, which lists the modules to run after the OS starts. It includes almost all supported modules so that any user-data works correctly.
  6. We copy the UEFI boot-order control script so that after the first launch the bootloader again prioritizes booting over the network.
  7. We copy /etc/default/grub.
  8. We rebuild the initramfs to enable mdadm on first boot.
  9. We copy the basic fail2ban config.
  10. We pack the root file system into an archive and compress it (see the sketch after this list).
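The final step is essentially a tar of the build VM's root filesystem, along these lines (the archive name is arbitrary):

# Run inside the build VM: archive the root filesystem, preserving ownership
# and extended attributes, and skip virtual filesystems
tar --create --gzip \
    --xattrs --numeric-owner --one-file-system \
    --exclude='/proc/*' --exclude='/sys/*' --exclude='/dev/*' \
    --exclude='/run/*' --exclude='/tmp/*' \
    --file=/tmp/ubuntu-20.04-stage2.tar.gz /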

For other OSes these steps differ. For example, for Windows Server we also add drivers for network cards and RAID controllers to the image. We build Proxmox from Debian, so there we need to install more packages and perform certain manipulations with system users and mirrors.

We also upload the resulting stage2 image to the object storage – it will later be picked up by a script during OS installation.

Automation of assembly and testing

When we were actively developing the new version of the autoinstaller, there was no particular need for automation. We tested everything on approximately the same configurations, and the OS was also the same. However, as soon as we started preparing for the release, we realized that testing the new autoinstaller on 100+ server configurations manually was an impossible task. We decided to go to our colleagues from QA and ask for help.

This is how a set of Python scripts was born: “wrappers” around the client and admin APIs that can order all available configurations at once, install the OS on them, check their availability via SSH, and release the servers afterwards. It was a great solution that worked in roughly the same form for a long time, even as the number of OS images grew (there are now 19 of them).

Later, when we started building images for more OSes, we decided to automate their build and deployment. That's how a simple GitLab pipeline with manual jobs for each OS image appeared: it deployed the images to test environments and, after a merge into the master branch, updated them in production. It became more convenient, but testing and debugging still took several days.

The pipeline as it was.

At some point, we got tired of testing servers manually and updated our pipelines. Now an image is built after a commit to a branch and deployed to the test environments. As soon as we've checked that everything works on one configuration, we roll out automatic installation from the test branch to all of our server configurations: we order all available configurations, run Ansible on them, and check that everything is installed as expected.

Next, we check the disk layout, repositories, passwords, SSH keys, OS version, the list of installed packages, and the list of users with interactive sessions. The servers are then released, their disks are wiped, and they are returned to the pool available to clients.
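The checks themselves are Ansible tasks; as a rough bash equivalent of the kind of assertions they make (the host address is a placeholder):

HOST=203.0.113.10

# Disk layout and mount points
ssh "root@$HOST" lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT

# OS version and configured package repositories
ssh "root@$HOST" 'grep PRETTY_NAME /etc/os-release; grep -rh ^deb /etc/apt/sources.list /etc/apt/sources.list.d/ 2>/dev/null'

# Users with an interactive shell
ssh "root@$HOST" 'getent passwd | grep -E "(bash|/bin/sh)$" | cut -d: -f1'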

The pipeline as it is now.

With the new pipeline, we can run all the necessary tests in about 4-6 hours, which is much more pleasant. We also plan to finish nice report generation in GitLab Pages.

What we got


Let's sum it up and see how the auto-installation process looks now.

Selecting disks

Now you can select disks of any type, in any order, for automatic installation. They can be assembled into RAID arrays, and partitions and mount points can be created on them in any order.

During the development of the new system, we had to abandon the automatic addition of disks to LVM: disassembling it does not always work reliably, and we could not reach the required percentage of successful installations with it.

Old disk partitioning interface.

New disk partitioning interface.

Drivers and package base

We largely solved the driver problem by installing a kernel with extended hardware support. So far, we have not needed to install any additional drivers in Linux, but if that becomes necessary, the task is now much simpler: it just needs to be described in Ansible or cloud-init.

In Windows Server, drivers are still mounted into the virtual machine during installation, but now this happens locally, without going over the network. This allowed us to get rid of the Samba server and reduce the load on the service networks. The problem with outdated installation packages on the mirrors was also solved: after the build, the image no longer changes and is simply unpacked.

Adding and testing new OS

Since installation and build now take less time overall, we can update the OS to a new minor version in one commit and test in about an hour.

Note: for minor updates, we test not all configurations, but about 10% of the most popular ones. This allows us to analyze the results faster and remove fewer servers from the pool available to clients during the tests.

If the tests show good results, we immediately update the available images. Major versions and new OS families are a bit more complicated: for them we have to update and add Ansible playbooks, which sometimes takes quite a lot of time, and then run tests on all available configurations. Even so, the process takes much less time than before.

OS installation time

Let's move on to the most interesting part: how much did we manage to speed up the OS installation?

The measurements were taken on the EL10-SSD configuration (Intel Xeon E3-1230v5, 32 GB RAM, 2x480 GB SATA SSD); the results were obtained from the backend logs. In both cases, we measured the time between the backend receiving the reinstallation command and it sending the reboot command after a successful installation. The boot time after that reboot may differ between platforms.

Autoinstallation time.

The graph shows that we managed to speed up autoinstallation by 27-77%, depending on the OS. The installation time is now a fairly constant value, so we can plan our processes better. For example, the testing jobs in our pipelines use a 15-minute launch delay: we are confident the server will be installed within that time, and if it isn't, we should check what exactly broke.

Have you ever encountered optimization of OS autoinstallation? Share your experience in the comments, it will be interesting to read!
