OpenSSH Backdoors

Picture this: a backdoor is discovered in OpenSSH, the maintainers rush out a fix, and security researchers dig into the malware and publish technical analyses. Speculation rages about the attacker's identity and motives, and the tech media rushes to cover the story. It's a landmark incident: a blow to the trust that underpins open source development, and a stark reminder of the risks of supply chain attacks. Brilliant and insidious in equal measure.

If you follow security news, you probably immediately thought of the attack on the liblzma/xz-utils repository earlier this year, whose ultimate goal was a backdoor in OpenSSH. But xz-utils is not the case we will discuss below, because few people remember that the xz-utils backdoor was actually the second widely publicized attempt to backdoor OpenSSH. The first occurred over 22 years ago, in 2002. This article tells the story of that backdoor, and what we can learn from an attack that happened more than two decades ago.

Background

The 2002 attack was pretty simple; the original announcement was the OpenSSH Security Advisory: Trojaned Distribution Files.

At the time, the OpenSSH source code was hosted on ftp.openbsd.org, and the hosted archives were somehow replaced with backdoored versions. It is not exactly clear how this happened, but the attacker managed to replace the .tar.gz files for several releases. Effective server exploits were actively traded in the hacker community back then, so this is not particularly surprising. Fortunately, thanks to mismatched file checksums, the backdoor did not survive for long. For example, when someone tried to build a backdoored version of OpenSSH on FreeBSD, the ports system automatically checked the package checksums, and since it had recorded the checksums of these versions before the backdoor was introduced, it reported a discrepancy. If the attackers had waited for a new release and immediately replaced both the .tar.gz files and the checksum files, they could have been far more successful.
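The check that caught the 2002 tampering is easy to sketch. The digest below is purely illustrative (the ports system of that era actually recorded MD5 sums in each port's distinfo file; SHA-256 is used here), but the mechanism is the same: compare a previously recorded known-good digest against the digest of the freshly downloaded tarball.

```python
import hashlib

# Digest recorded when the release was first fetched, before any tampering.
# (Illustrative value: this is the SHA-256 of the bytes b"hello".)
KNOWN_GOOD_SHA256 = "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"

def verify_tarball(data: bytes, expected_hex: str) -> bool:
    """Return True only if the tarball bytes match the recorded digest."""
    return hashlib.sha256(data).hexdigest() == expected_hex
```

A swapped-out tarball hashes to a different value, so the build aborts before any attacker-supplied code runs, which is exactly why replacing only the .tar.gz files, and not the recorded checksums, doomed the 2002 attack.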

By any measure, it was a simple backdoor, perhaps the simplest you could imagine. Step one: the package build was modified so that running configure compiled and executed files added by the attacker. Step two: the resulting malware connected to a hardcoded IP address in Australia every hour and received a list of commands to execute on the compromised device.

We still don’t know who exactly was behind this backdoor, but the common wisdom (at least among the OpenBSD developers I’ve spoken to) is that it was just a small-scale prank, something that was fairly common in 2002. It wasn’t the first time something like this had happened, of course; Wu-FTPd, the most popular ftpd of the 90s, suffered something similar back in 1993. Still, it was an omen of what was to come. The 2002 incident is an interesting historical event because it shows both similarities with and differences from the modern attempt to backdoor OpenSSH via xz-utils. By examining them, we can learn useful lessons for the future.

Similarities

The obvious similarity between these two events is that they both targeted OpenSSH. And for good reason. OpenSSH is one of the prime targets of vulnerability research, because if you have a vulnerability in OpenSSH for remote unauthenticated access, you essentially have a “master key” to the Internet. However, there are very few truly serious vulnerabilities left in OpenSSH — it’s been about 20 years since the last major discovery. So if you can’t find a vulnerability, the next best thing is to introduce one.

Introducing an exploitable bug (a “bugdoor”) that developers won't notice during code review is probably the best option. It's notable that in both 2002 and 2024 we encountered a backdoor, not a bugdoor. This is likely because exploit development is hard, and server-side exploit development especially so. Given how much effort it takes just to gain the ability to modify the source code, it's not surprising that attackers would prefer the more reliable approach. The counter-argument is that perhaps we never see bugdoors precisely because they're hard to detect, and when they are found, they're written off as ordinary bugs rather than deliberate tampering.

There are other similarities, too. For example, both the 2002 and 2024 attacks targeted build systems. This also makes sense, since build systems are the perfect combination of complexity and flexibility. In fact, most build systems place virtually no limits on what you can do. This is necessary to ensure compatibility across platforms like Linux, macOS, and Windows. Add in support for multiple architectures and older versions, and… you get the idea. The core design principle of build systems is “just make it work,” so they become a complex tangle of directives, rules, variables, and commands. As long as they work correctly, it’s safe to assume that very few people are paying close attention to the contents of their build scripts, including the developers and maintainers themselves. This is the perfect place to insert the first hook of a backdoor hiding in plain sight.

However, what makes OpenSSH an attractive target also makes it a hard target. Everyone uses it, so the chances of someone noticing something is wrong are pretty high. Both attacks were discovered fairly quickly. The 2002 attack was discovered by a developer who noticed that the checksums of the downloaded sources didn’t match, and the 2024 attack was found by a developer after thoroughly investigating a performance issue. The “many eyes” theory of open source security isn’t exactly popular these days, but it does seem that the larger the project, the less vulnerable it is.

The final similarity is that in both cases the attackers remain unknown. This may seem minor, but it is important to note that our usual approaches to attributing an attack do not work well for supply chain attacks. The sample size of such events is small, the attackers’ goals are opaque, and each case is highly individualized. This works to the attackers’ advantage: the attack either succeeds or fails in a way that makes it impossible to determine who is responsible.

Differences

Despite the similarities, the two attacks differ greatly in intent and execution. It’s also interesting to see how much the ecosystem has changed in the meantime: today, everything has to go perfectly for an attack like this to succeed. The xz-utils backdoor was not without its flaws, but many steps were done correctly, and the attackers came much closer to success than in 2002. The main difference lies in the attackers’ motivation and intent.

It is believed that in 2002, all the attackers wanted was to have fun and wreak havoc, and they likely did not care much about getting caught. In fact, if their goal was to brag about the prank later, getting caught would even have been a plus. In 2024, however, the attackers had a specific goal in mind and clearly intended to use the backdoor later to achieve it. In other words, the xz-utils attack was built as a long-term covert capability, while the 2002 attack was more of a “performance” than a “persistent threat.”

One of the key technical differences is that the xz-utils backdoor targeted build artifacts, not the build system itself. At most, the 2002 attack could compromise only the systems that compiled OpenSSH from source. Had the xz attack succeeded, over time every Linux machine running systemd and OpenSSH could have been compromised at any time the attackers chose.

Here, “at any time the attackers chose” is the key detail. The xz-utils backdoor gave the attackers optionality: they could trigger its hidden functionality selectively. Compared to automatically launching a reverse shell, this dramatically reduces the risk of the backdoor being detected. And it was a conscious decision, since the attackers could have taken the other route and compromised every system their code ran on.

It’s also worth noting how indirect the 2024 attack was. Instead of trying to plant a backdoor into OpenSSH itself, the attackers noticed that modern Linux distributions had unexpectedly added a dependency on liblzma to OpenSSH. That was the key to victory: instead of attacking a mature, well-funded project backed by world-renowned security researchers, they targeted an obscure, unfunded library that no one considered critical. As the saying goes, defenders think in lists, attackers think in graphs.

In addition, another small innovation caught my attention. Instead of inserting obfuscated scripts, hiding code in C files (as in the 2002 attack), or downloading the payload over the network, the xz backdoor carried a pre-prepared payload inside a binary test file. This turned out to be an effective approach: no one noticed the payload in the xz-utils source repository until the backdoor was discovered during performance analysis. If there had been no performance penalty and the attackers had been less aggressive in their social engineering, I suspect both the hook and the payload could have remained undetected for a long time.

The final big difference is the attackers’ methodology. In the 2002 attack, the attackers directly attacked the infrastructure that hosted OpenSSH. The xz backdoor, on the other hand, was the culmination of a long social engineering campaign that saw the attacker become a full-fledged member of the core development team. No matter how you look at it, that’s an impressive feat.

Analysis

In the context of these events, there are many things to consider. Supply chain attacks have certainly evolved, but… not as much as expected? Let's set aside the “malicious insider” approach used by the xz-utils attackers and focus on the “attack on infrastructure” approach. If we look at the 2002 attack in detail, at the most fundamental level, there is nothing stopping this attack from being carried out today. With a little bit of finesse and patience, an attack targeting the source code distribution infrastructure is still entirely possible.

My favorite example is zlib. Like xz-utils, it is a compression library. You could say it is a capital-C Compression library, because zlib is everywhere, including in OpenSSH. New releases of its source code are distributed via zlib.net, and the server hosting zlib.net is run by a small company in Michigan called a2hosting.com, where managed VPS prices start at $26.95 per month. This hosting company is particularly fond of cPanel and exim, both of which run on zlib.net as well.

This means that the integrity of the distribution chain for almost everything depends on the integrity of a2hosting.com and the absence of remote vulnerabilities in cPanel or exim. And the situation is not very encouraging, and I haven't even mentioned Pure-FTPd, Apache httpd, or Dovecot (and that's just what's used directly on zlib.net, not considering how a2hosting.com itself could be attacked). Find a vulnerability in any of these projects, or a way to introduce a backdoor, and your chances of backdooring the zlib distribution chain are very good.

In recent years, the situation has improved for zlib, which now uses a dedicated server. For a long time, zlib.net was hosted on a shared server (i.e., anyone could buy hosting on the same machine). This became a running joke among vulnerability researchers: since finding a bug in zlib is extremely difficult, deliberately adding one to the code when the developer announces a new release would probably be the easiest option.

The point of all this is not to criticize zlib. Its engineering is world-class (I had the pleasure of reading its Huffman table code during this analysis), and its maintenance works just fine. This is not a problem unique to zlib, xz-utils, or OpenSSH; everyone is exposed to these risks. The point is that when you look at the overall picture, we are in a really bad place. I am not usually one to exaggerate, but our current exposure to supply chain attacks is quite alarming.

Let's take stock. If you compile OpenSSH from source, you will have code (libraries and executables) from about 5 different distribution packages in your address space. Not too bad. But in practice, most people run systemd-based Linux systems, where OpenSSH ends up with code from about 30 different packages (including our friends xz and zlib) in its address space. This is starting to get worrying.
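One rough way to reproduce this kind of count yourself (a sketch, not the author's actual methodology): list the distinct shared objects mapped into a running process from /proc/&lt;pid&gt;/maps, then ask the package manager which package owns each path (e.g. `dpkg -S` on Debian-based systems). The parsing step looks like this:

```python
def mapped_libraries(maps_text: str) -> set[str]:
    """Extract distinct shared-object paths from /proc/<pid>/maps content."""
    libs = set()
    for line in maps_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        path = fields[-1]  # the mapped file path, if the line has one
        if path.endswith(".so") or ".so." in path:
            libs.add(path)
    return libs

# On a live system (with appropriate permissions):
#   libs = mapped_libraries(open(f"/proc/{sshd_pid}/maps").read())
# then map each path to its owning package with `dpkg -S <path>`.
```

The deduplicated package list, rather than the raw library list, is what gives numbers like "about 30 packages."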

But that's not all. While both the 2002 attack and the xz-utils attack targeted OpenSSH directly, that's not necessary. If you can run a backdoor as root somewhere on the system, you can inject yourself into the sshd process. It's one extra step, but not a very difficult one. On a standard Ubuntu Server 22.04 system, 97 packages run as root after boot (that's 16% of all packages installed by default). On my Linux desktop (where I use remote access via OpenSSH almost every day), a whopping 384 packages run as root.
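The per-process part of such a count can be reconstructed with a short script (again a sketch; the package figures above presumably also map each process's binary back to its owning package, which is an extra dpkg/rpm query per process). The real UID is the second field of the "Uid:" line in /proc/&lt;pid&gt;/status:

```python
from pathlib import Path

def is_root_process(status_text: str) -> bool:
    """Check whether a /proc/<pid>/status 'Uid:' line shows real UID 0."""
    for line in status_text.splitlines():
        if line.startswith("Uid:"):
            # Fields after 'Uid:': real, effective, saved, filesystem UID.
            return line.split()[1] == "0"
    return False

def count_root_processes() -> int:
    """Count live processes whose real UID is root (Linux only)."""
    count = 0
    for entry in Path("/proc").iterdir():
        if not entry.name.isdigit():
            continue
        try:
            if is_root_process((entry / "status").read_text()):
                count += 1
        except OSError:
            pass  # the process exited while we were scanning
    return count
```

Each of those root-owned processes is a potential stepping stone into sshd's address space.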

But is it really possible to protect all of this? Hundreds of projects from teams with different cultures, motivations, funding, experience, and resources? It doesn’t matter if it’s an infrastructure-focused attack like in 2002 or a social engineering attack like in 2024. Either way, I’m not sure we can protect against it with the current approach to operating system design.

To sum it up

Supply chain attacks are as relevant as they were 20 years ago, and our defenses lag further behind than we would like. Frankly, we have largely turned a blind eye to them because of the endless stream of other vulnerabilities that let attackers achieve their goals. But in a world where exploitable vulnerabilities are becoming fewer and fewer, it is unreasonable to assume that attackers will not turn to supply chain attacks more and more often. And we are not ready for that yet.

The solution will inevitably involve reducing the attack surface. This will require a deliberate reduction in the amount of code running in remotely accessible processes or at high privilege levels such as root, and an accelerated adoption of sandboxes. Previously, we considered sandboxing only in the context of dealing with parts of the code that handle untrusted data—for example, image parsers, video decoders, JavaScript engines, and so on. In a world where the code itself is untrusted, not the data, the goal should be to refocus on designing systems where all code is constrained to least privilege, and there are technical means to ensure that this happens.

Fortunately, there has been some progress in this direction, at least for Linux. In Ubuntu 24.04, liblzma can no longer be found in the OpenSSH address space; in Android, almost every process is restricted by a combination of SELinux and seccomp-bpf; and the latest Linux kernels support a promising technology called Landlock, which allows even unprivileged applications to sandbox themselves. Writing a Landlock sandbox for “make” that would prevent the 2002 attack takes about 250 lines of code.

What we learned from the xz-utils backdoor is that there is a greater willingness to invest time, money, and other resources into attacking distribution chains. It feels different now — the stakes have been raised significantly. We have a lot of work ahead of us, and we may have to make radical changes to the way we think about operating system design and application development.

The good news is that there are more and more interested parties on the defense side of supply chain attacks, ready to match the attackers' enthusiasm. This may seem like a small thing, but interest and enthusiasm are a great start, and it is already more than we had 20 years ago.
