Almost a master key to the Internet: The xz utils backdoor
Why everyone is so upset and what we can learn from the near-disaster
Fortunately, it is not a disaster but only a near-disaster: a cleverly hidden backdoor in a very widely used open source software collection (xz utils) was discovered shortly before its broad rollout.
In case you’ve read “xz backdoor” a hundred times by now, but still don’t know exactly what’s behind it, here are the facts:
What is this backdoor and why is everyone so upset?
- xz utils is a collection of compression tools and libraries for Unix-like systems. The collection is installed by default on most Linux distributions.
- Among other things, xz utils is pulled into the server component of SSH on several Linux distributions (indirectly, via the systemd library). SSH is a very widely used service for secure, authenticated remote access to servers. Whether your own SSH daemon loads the xz library can be checked as sketched after this list.
- The backdoor would have made it possible to bypass the authentication required for remote access via SSH. This would have allowed full remote access to the affected servers, including the execution of arbitrary commands or malicious code (remote code execution). That is easily enough for the maximum CVSS score of 10, which is exactly why the vulnerability CVE-2024-3094 was given that score.
- This would have allowed unauthorized remote access to all Linux servers accessible via SSH on the public Internet — a kind of “master key for the Internet”, as Volker Zota and Malte Kirchner put it in the German heiseshow. This is no exaggeration, because according to Shodan, there are around 20 million SSH access points worldwide, 2 million in Germany alone.
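Whether a concrete server’s SSH daemon even loads the xz library can be checked with `ldd`, which resolves a binary’s shared-library dependencies transitively. Here is a minimal Python sketch; the sshd path is an assumption and may differ between distributions:

```python
# Minimal sketch: does the local sshd link (transitively) against liblzma?
# Assumption: sshd is on PATH or at /usr/sbin/sshd; adjust for your distro.
import shutil
import subprocess

sshd = shutil.which("sshd") or "/usr/sbin/sshd"
libs = subprocess.run(["ldd", sshd], capture_output=True, text=True).stdout
print("liblzma linked:", "liblzma" in libs)
```

On the affected distributions, liblzma typically shows up here not as a direct dependency of OpenSSH, but transitively via the systemd library.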
And why only near-disaster?
- Different Linux distributions pull in updated software at different frequencies, depending on the interests of their user group. Those who absolutely want the latest updates as quickly as possible can choose certain developer versions, so-called “unstable” or “rolling release” versions, which usually only developers of Linux operating systems need; the trade-off is that the updates are sometimes not yet fully matured. Kali Linux, a distribution designed for pentesting, also receives “rolling releases”.
- “Stable” versions are used for most applications. Here, new software is only installed after extensive testing, and more attention is paid to ensuring that software updates do not impair system stability.
- The version of xz utils that contained the backdoor was so far only included in some “unstable” Linux distributions. But it was about to be rolled out to the widely used stable versions (within about two weeks, to be precise).
- Andres Freund, a German software developer at Microsoft in San Francisco, discovered the backdoor during maintenance work: failed SSH logins suddenly required much more computing power than before, which struck him as odd. On March 29, he made his discovery public, after first reporting it to the developers of the Linux distributions Debian and Red Hat as part of a “coordinated disclosure”.
How could the backdoor be inserted?
- The xz utils software collection — like so many open source projects on which the entire Internet is built — had been programmed and updated for years by a single developer as a hobby.
- This individual had been suffering from burnout and mental health issues for several years, so he welcomed the help of a stranger (or group of strangers) going by the name “Jia Tan”. “Jia Tan” took on more and more tasks in the project and eventually built in the backdoor.
- The backdoor itself is built in very cleverly: it is not visible in the readable source code, but is injected during the build process, so it only exists in the compiled, executable version of the software.
- In addition, it is a so-called NOBUS (“nobody but us”) backdoor: not everyone can use it, because it requires a secret key. This is why many suspect that an intelligence service is behind the attack. The sketch below illustrates the principle.
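How a secret key turns a backdoor into a NOBUS backdoor can be shown in a few lines. The following Python sketch is purely illustrative and has nothing to do with the actual implant code (which reportedly verified attacker commands with Ed448 signatures; here, Ed25519 from the `cryptography` package stands in):

```python
# Purely illustrative sketch of the NOBUS principle; NOT the actual
# backdoor code. The real implant reportedly used Ed448 signatures;
# Ed25519 from the `cryptography` package stands in here.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Attacker side: generate a key pair once; the private key never leaves them.
attacker_private_key = Ed25519PrivateKey.generate()
ATTACKER_PUBLIC_KEY = attacker_private_key.public_key()  # baked into the backdoor


def backdoor_gate(command: bytes, signature: bytes) -> bool:
    """Accept a command only if it carries a valid attacker signature."""
    try:
        ATTACKER_PUBLIC_KEY.verify(signature, command)
        return True   # signed by the key holder: "us"
    except InvalidSignature:
        return False  # everyone else falls through to normal behavior


# The key holder can open the door ...
cmd = b"id"
assert backdoor_gate(cmd, attacker_private_key.sign(cmd))

# ... anyone else, even knowing exactly how the backdoor works, cannot.
assert not backdoor_gate(cmd, b"\x00" * 64)
```

Even someone who fully understands the gate cannot get through it without the attacker’s private key, which is exactly what makes such a backdoor attractive to a well-resourced actor.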
What do I need to do?
Your Linux servers are very unlikely to be affected unless you have installed one of the developer versions (list here) and updated between March 26 and March 28. You can check whether specific servers are affected with this tool, or with the quick manual check sketched below.
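For a quick manual plausibility check, you can also ask the installed xz binary for its version number; the known backdoored releases are 5.6.0 and 5.6.1. A minimal Python sketch (the tool linked above remains the more thorough option):

```python
# Quick plausibility check only; the dedicated checker tool is more thorough.
# Assumption: the backdoored releases are xz/liblzma 5.6.0 and 5.6.1.
import re
import subprocess

AFFECTED = {"5.6.0", "5.6.1"}

# `xz --version` prints the versions of the xz tool and of liblzma.
output = subprocess.run(
    ["xz", "--version"], capture_output=True, text=True, check=True
).stdout

found = set(re.findall(r"\b\d+\.\d+\.\d+\b", output))
if found & AFFECTED:
    print(f"WARNING: potentially backdoored xz version installed: {found & AFFECTED}")
else:
    print(f"Installed xz versions {found} are not on the known-affected list.")
```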
So what do we learn from this?
The furor over the xz utils backdoor is so great not only because the vulnerability would have been so far-reaching, but also because something like this could happen again at any time in the software ecosystem we have: no one has a magic formula to prevent such disasters in the future.
The curse and blessing of software scalability
The cause of all this misery is actually not a problem at all, but an advantage of software, namely its scalability: once written, software can be duplicated at will, which means that anyone can use it. This is great because, in contrast to engineering disciplines that are often very material-intensive (take the construction of a power plant as an example), a software product can be reproduced at will with almost no additional material input.
Imagine if one person built a proper cooling tower and all other power plants could simply use it. How convenient!
And this is where the problems begin. Let’s cut them into problem slices:
- Complexity of the software supply chain: No software developer writes every line of code themselves; it would be horrendously inefficient if everyone reinvented the wheel. It is much more efficient to simply reuse existing software that already solves the same task. As a result, almost every piece of software is an almost unmanageable pile of software components from different sources, all updated by different people, because software is never finished. (The sketch after this list makes the resulting dependency tangle visible.)
Imagine building your power plant like this. What a bustle of engineers at the cooling tower!
- Great power and responsibility of individuals: Who actually writes all these pieces of software? Especially in the case of very basic Linux operating system software or Internet protocols, the software used is often open source. This means that, in principle, anyone can contribute to it. And it is often small, unassuming hobby projects by individuals that are used extremely widely. Sure, why should you write a detail like a compression tool yourself when one already exists? The phenomenon is so well known that there is an xkcd comic with cult status about it.
A great responsibility (and therefore great power) often rests on the shoulders of these lone, often unknown individuals: what they build into their software is used by millions. Of course, millions can also look at it, because the source code is open and can be tested, checked and reviewed by anyone (the most important argument of open source advocates). But in the end, this is more a matter of hope than a controlled process. Andres Freund, the discoverer of the xz utils backdoor, says himself that his discovery was largely a matter of chance.
So imagine you are building your power plant and inviting anyone who wants to help build it. You simply hope that everyone involved will do a good job, and if not, that the others will notice in time. Forget about the four-eye principle.
- Developers cannot oversee the application scenarios of their software: The extensive and free reuse of software libraries also means that no programmer can foresee what their creations will be used for. You can get a sense of what this means by listening to Dr. David Clark, who helped build the architecture of the early Internet at MIT in the 1970s. In a video recorded in 2011, he says: “When people asked if we could imagine that the Internet would connect systems all over the world, we said — sure! There could be tens of thousands of systems on the Internet! […] If you had asked us back then, we would probably have said that it’s a pretty bad idea to attach parts of the network for the operation of critical military operations or critical infrastructure like the power grid to the public internet. You just don’t do that!”
Translated into our power plant example, this would mean that you not only let everyone help build it — you also don’t tell the individual engineers exactly what they are building. The engineers select material for a wall without knowing whether they are building a garden pavilion or a cooling tower — and you yourself don’t care what material your cooling tower is made of as long as it doesn’t collapse.
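To make the “almost unmanageable pile” from the first bullet concrete, here is a small Python sketch that recursively prints the declared dependency tree of an installed Python package. The package name is just an example; apt, npm or cargo paint the same picture:

```python
# Recursively print the declared dependency tree of an installed Python
# package, to make the depth of typical software supply chains visible.
import re
from importlib.metadata import PackageNotFoundError, requires


def walk(package: str, depth: int = 0, seen: set | None = None) -> None:
    seen = set() if seen is None else seen
    if package.lower() in seen:  # avoid cycles and duplicates
        return
    seen.add(package.lower())
    print("  " * depth + package)
    try:
        deps = requires(package) or []
    except PackageNotFoundError:
        return  # declared but not installed locally
    for dep in deps:
        if "extra ==" in dep:
            continue  # skip optional extras
        # Reduce e.g. "urllib3<3,>=1.21.1; python_version>='3.8'" to "urllib3"
        name = re.split(r"[ ;<>=!(\[]", dep, maxsplit=1)[0]
        walk(name, depth + 1, seen)


# Example package name, assumed to be installed: the HTTP library "requests"
walk("requests")
```

Every line of the output is maintained by someone, somewhere, on their own schedule.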
Who can take responsibility for all this?
The problem is not open source. The problem is not that software is being reused. The problem is not that individuals develop software and make it available free of charge. The problem doesn’t even have to be that this software is then used for, let’s say, critical infrastructure.
The problem starts when nobody can take responsibility for the software used in these critical infrastructures:
- Not the developers of the software, because they are often only pursuing a hobby and cannot possibly foresee all deployment scenarios; they often don’t even know them.
- Not the infrastructure operators, because they can neither control the security of the sheer mass of software commonly used nor hold the developers accountable: If you use software for free, you can’t make demands.
Phew. If nobody can take responsibility for all this — then maybe we really shouldn’t do it that way?
If we built power plants or even just houses the way we build IT systems… then we’d probably all rather sleep in a tent.
If you want to act responsibly, you can probably only use software whose security you can either control yourself or for which you can hold someone else responsible. And yes, this will definitely be much more expensive than the previous approach of “using free software + hoping for the best”.
By the way, this is exactly what the Cyber Resilience Act requires: the distributor of a product is liable for its security — including all software libraries used. Only time will tell what this means for software collections such as xz-utils.
This article was also part of the monthly “Security Briefings for Hard Hats” [in German].