Lessons Learned from the CrowdStrike Global Outage

As you know, on July 19, 2024, there was a serious incident with a software update for CrowdStrike Falcon, a product for protecting computers (Microsoft report, CrowdStrike report). A configuration update caused an out-of-bounds read, a memory-safety error, in the CSagent.sys driver. Since this driver runs at the Windows kernel level, millions of PCs crashed with a BSOD.
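To make the failure mode concrete, here is a minimal sketch in Rust of the defensive pattern that rules out an out-of-bounds read: compare the requested index with the data actually supplied before reading it. The helper name `read_field` and the sample values are illustrative assumptions; this is not CrowdStrike's code.

```rust
/// Hypothetical helper: fetch the field at `index` from the values that a
/// configuration update supplied, without ever reading past the end.
fn read_field<'a>(fields: &[&'a str], index: usize) -> Result<&'a str, String> {
    // `get` returns None for a bad index instead of touching memory
    // outside the slice, so the caller gets an error it can handle.
    fields.get(index).copied().ok_or_else(|| {
        format!(
            "field {} requested, but only {} supplied",
            index,
            fields.len()
        )
    })
}
```

Calling `read_field(&["a", "b", "c"], 3)` yields an `Err` that the caller can log and skip. In a kernel driver, the unchecked equivalent of that read brings down the whole machine.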

The number of victims was officially estimated at 8.5 million. Microsoft explains that this figure is based on the crash reports it received, so the actual number of affected machines is likely higher. Banks and airlines were hit hard, Delta in particular. Insurers estimated the damage from the bug at $5.4 billion.

What is the main problem with “protective software” and Windows kernel drivers?

Kernel Drivers

Much “protective software”, including antivirus products, uses kernel drivers. There are several reasons for this.

First, kernel drivers provide visibility into the processes in the system and can be loaded early enough to detect specific threats such as bootkits and rootkits. In addition, Microsoft provides rich functionality such as system callbacks for process and thread creation, as well as filter drivers that can monitor the creation, deletion, or modification of files. The kernel can call back into drivers so that they can block actions such as file or process creation. Many developers also use drivers to collect network information in the kernel using the NDIS driver class.
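The callback idea can be illustrated with a small user-mode analogy in Rust. This is only a sketch: `Dispatcher`, `FileCreate`, and `Verdict` are hypothetical names invented for the example and do not correspond to any Windows API. The point is the shape of the mechanism, in which registered monitors see an operation before it happens and any one of them can veto it.

```rust
/// What a monitor decides about a pending operation.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Verdict {
    Allow,
    Block,
}

/// Hypothetical stand-in for a notification that a file is about to be created.
struct FileCreate {
    path: String,
}

/// A registry of monitors, loosely analogous to registered driver callbacks.
struct Dispatcher {
    callbacks: Vec<Box<dyn Fn(&FileCreate) -> Verdict>>,
}

impl Dispatcher {
    fn new() -> Self {
        Dispatcher { callbacks: Vec::new() }
    }

    /// Register a monitor that will be consulted on every file creation.
    fn register(&mut self, cb: impl Fn(&FileCreate) -> Verdict + 'static) {
        self.callbacks.push(Box::new(cb));
    }

    /// The operation proceeds only if no registered monitor blocks it.
    fn dispatch(&self, event: &FileCreate) -> Verdict {
        if self.callbacks.iter().any(|cb| cb(event) == Verdict::Block) {
            Verdict::Block
        } else {
            Verdict::Allow
        }
    }
}
```

A monitor registered with `register(|e| if e.path.ends_with(".sys") { Verdict::Block } else { Verdict::Allow })` would make `dispatch` return `Verdict::Block` for any path ending in `.sys` and `Verdict::Allow` otherwise.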


Photo from Denver International Airport, source

The second reason is performance. For example, analyzing and collecting network activity data is faster at the kernel level. That said, there are many scenarios where data collection and analysis can be optimized for out-of-kernel operation, and Microsoft continues to work with developers to improve performance and provide best practices to achieve parity when running outside the kernel.

The third reason for loading into kernel mode is tamper resistance. Security developers want to be sure that their software cannot be disabled by malware, even if the attackers have administrator privileges. They also want their drivers to load as early as possible, so that they can start monitoring system events as soon as possible. For this reason, Windows has a mechanism for starting drivers marked as Early Launch Antimalware (ELAM) very early in the boot process. Specifically, CrowdStrike signed its CSboot driver as ELAM, allowing it to load early in the boot process.

In general, when it comes to kernel drivers, security vendors must find a compromise. Kernel drivers provide the above properties at the cost of robustness. Because kernel drivers run at the most trusted level of Windows, security vendors must carefully balance their needs, such as visibility and tamper resistance, with the risks of running in kernel mode.

Any code running at the kernel level requires careful testing, because it cannot simply fail and restart like a normal user application. This holds for every operating system.

A balance between security and reliability can be achieved by minimizing the amount of code in kernel mode:

An example of security software with a balance between security and reliability, source: Microsoft
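The reliability argument for keeping analysis logic in user mode can be sketched in Rust as follows. This is an analogy under stated assumptions, not a description of any real product: `analyze` is a hypothetical analysis step that crashes on input it cannot handle, and `supervise` plays the role of a supervisor that survives the crash and carries on. In a kernel driver, the same fault would take down the whole system.

```rust
use std::panic;

/// Hypothetical user-mode analysis step. It panics on input it cannot
/// handle, standing in for a bug triggered by a bad content update.
fn analyze(event: &str) -> usize {
    if event.is_empty() {
        panic!("malformed event");
    }
    event.len()
}

/// Supervisor: runs the analysis for each event and survives its crashes.
/// Returns (events handled, crashes recovered from).
fn supervise(events: &[&str]) -> (usize, usize) {
    let mut handled = 0;
    let mut crashes = 0;
    for &event in events {
        // catch_unwind contains the panic to this one call.
        match panic::catch_unwind(|| analyze(event)) {
            Ok(_) => handled += 1,
            // A real supervisor would log the failure and restart the worker.
            Err(_) => crashes += 1,
        }
    }
    (handled, crashes)
}
```

With the events `["open", "", "write"]`, the empty event makes `analyze` panic, but `supervise` still processes the other two and reports one recovered crash.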

Windows provides several approaches to protecting user mode from unauthorized interference: Virtualization-based security (VBS) enclaves and protected processes. Developers can use them to protect key processes in their software.

For tracking events, Windows also offers ETW events and user-mode interfaces such as the Antimalware Scan Interface. These robust mechanisms can be used to reduce the kernel code footprint of security solutions, providing a balance between security and reliability.

Conclusions

You can improve Windows security using built-in tools, features, and settings. Some are set to maximum security by default, while others are not. Microsoft promises to raise the default security level of Windows 11, which currently includes various features and settings such as TPM 2.0, Virtualization-based security (VBS), Hypervisor-protected Code Integrity (HVCI), and others. The list of these features will grow. Microsoft is also moving to Rust as a safer language.

Previously, we talked about how to harden Windows Defender to the maximum.

But the main conclusion is that antivirus products and other “protective software” introduce new attack vectors into the system, since they operate at the kernel level with elevated privileges. Microsoft believes that developers of such software do not always get this right. As a result, the very antivirus that is meant to protect the system can do more harm than good.
