I’m hyperbolizing, but still. Open source was supposed to be the best possible software development model. Even those not invested in the GPL vs BSD war, or more generally in the copyleft vs permissive licenses debate, would consider the open-source model beneficial as a whole.

Off the top of my head:

  • Open-source software allows anyone to inspect it, and to check that it’s not spyware or malware.
  • Open-source software allows anyone to build it, even on operating systems or distributions where native packages aren’t available. (Now there are Flatpaks, and snaps, but not everyone likes them. And AppImages are not optimal either.)
  • Open-source software allows anyone to investigate bugs, and to find and propose fixes.
  • Open-source software allows anyone to develop improvements, add new features, and change a program’s behavior, even if this would require a fork. Developing a derivative version is thus possible.

Now, the reality check:

  • How many people, even if they are software developers, bother to inspect the source code of the thousands of software packages they’re using? The approximate figure is zero.
  • How many people, even if they are software developers, could understand the complexity of today’s software? Consider this:

    Debian 12 … is 1,341,564,204 lines of code. That’s the project’s own estimate. One and a third billion, that is, one and a third thousand million lines of code. For comparison, Google Chrome is about 40 million lines, which is in the same ballpark as the Linux kernel these days.

    Nobody can read the source code of Chrome. Not alone, not as a team. Humans don’t live long enough. Any group that claims to have gone through the code and de-Googlized it is lying: all that’s possible to do is some searches, and try to measure what traffic it emits. A thousand people working for a decade couldn’t read the entire thing. …

    We consider this normal. Everything is like that. It’s just how it is. …

    The world runs on software, produced and consumed on an industrial scale, always getting bigger and more complicated.

    Nobody understands it any more. Nobody can. It’s too big. But it’s the only model of making and selling software we have, so we are trapped in it.

  • The most used feature of open-source software? Building binaries for the OS you’re using!
  • Finally, a small percentage of people (not counting those actively involved in developing the software in question) do try to propose bug fixes after having examined the source code. I’ve done it myself when possible, so that my bug report wouldn’t just add to the pile of unprocessed reports.

So far, so good. If this software development model offers you opportunities that you don’t care about, it shouldn’t hurt, right?

Enter the bad actors

Let’s be reminded of a few recent incidents:

But it’s time for recent as in really recent.

Red Hat in all caps says STOP USAGE OF ANY FEDORA RAWHIDE INSTANCES

Red Hat on Friday warned that a malicious backdoor found in the widely used data compression software library xz may be present in instances of Fedora Linux 40 and in the Fedora Rawhide developer distribution.

The IT giant said the malicious code, which appears to provide remote backdoor access via OpenSSH and systemd at least, is present in xz 5.6.0 and 5.6.1. The vulnerability has been designated CVE-2024-3094. It is rated 10 out of 10 in CVSS severity.

Users of Fedora Linux 40 may have received 5.6.0, depending upon the timing of their system updates, according to Red Hat. And users of Fedora Rawhide, the current development version of what will become Fedora Linux 41, may have received 5.6.1. Fedora 40 and 41 have not been officially released yet; version 40 is due out next month.

Users of other Linux and OS distributions should check to see which version of the xz suite they have installed. The infected versions, 5.6.0 and 5.6.1, were released on February 24 and March 9, respectively, and may not have been incorporated into too many people’s deployments.

CVE-2024-3094: Critical Impact, 10.0 score. No versions of Red Hat Enterprise Linux (RHEL) are affected.
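As a first-pass check of "which version of the xz suite they have installed", here is a minimal Python sketch. It assumes that `xz --version` prints a line containing an `x.y.z` version number, and it only compares that number against the two releases named above. The function names are mine, not part of any tool, and a version match says nothing about whether a given distribution's build actually contains an active backdoor.

```python
import re
import subprocess
from typing import Optional

# The two releases named in the advisory quoted above.
AFFECTED_VERSIONS = {"5.6.0", "5.6.1"}


def parse_xz_version(version_output: str) -> Optional[str]:
    """Extract the first x.y.z version number from `xz --version` output."""
    match = re.search(r"\b(\d+\.\d+\.\d+)", version_output)
    return match.group(1) if match else None


def is_affected(version_output: str) -> bool:
    """Return True if the reported version is one of the two named releases."""
    return parse_xz_version(version_output) in AFFECTED_VERSIONS


def check_installed_xz() -> bool:
    """Run the local `xz --version` and report whether it names an affected release."""
    result = subprocess.run(
        ["xz", "--version"], capture_output=True, text=True, check=True
    )
    return is_affected(result.stdout)
```

Calling `check_installed_xz()` on a machine with xz installed returns `True` only for 5.6.0 or 5.6.1; your distribution's own advisory remains the authoritative source.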

2021: JiaT75 (Jia Tan) creates their GitHub account.

2022: In April 2022, Jia Tan submits a patch via a mailing list. The patch is irrelevant, but the events that follow are. A new persona – Jigar Kumar enters, and begins pressuring for this patch to be merged.

Soon after, Jigar Kumar begins pressuring Lasse Collin to add another maintainer to XZ. In the fallout, we learn a little bit about mental health in open source.

Three days after the emails pressuring Lasse Collin to add another maintainer, JiaT75 makes their first commit to xz: Tests: Created tests for hardware functions. Since this commit, they become a regular contributor to xz (they are currently the second most active). It’s unclear exactly when they became trusted in this repository.

Jigar Kumar is never seen again.

2023: JiaT75 merges their first commit on Jan 7 2023, which gives us good indication into when they fully gain trust.

In March, the primary contact email in Google’s oss-fuzz is updated to be Jia’s, instead of Lasse Collin.

Testing infrastructure that will be used in this exploit is committed. Despite Lasse Collin being attributed as the author for this, Jia Tan committed it, and it was originally written by Hans Jansen in June.

Hans Jansen’s account was seemingly made specifically to create this pull request. There is very little activity before and after. They will later push for the compromised version of XZ to be included in Debian.

In July, a PR was opened in oss-fuzz to disable ifunc for fuzzing builds, due to issues introduced by the changes above. This appears to be deliberate to mask the malicious changes that will be introduced soon.

2024: A pull request for Google’s oss-fuzz is opened that changes the URL for the project from tukaani.org/xz/ to xz.tukaani.org/xz-utils/. tukaani.org is hosted at 5.44.245.25 in Finland, at this hosting company. The xz subdomain, meanwhile, points to GitHub pages. This furthers the amount of control Jia has over the project.

A commit containing the final steps required to execute this backdoor is added to the repository.


A request for the vulnerable version to be included in Debian is opened by Hans.

This request was opened the same week Hans’ Debian account was created. The account created a few similar “update” requests in various low traffic repositories to build credibility, before asking for this one.

A number of other, suspicious, anonymous name+number accounts with little former activity also push for its inclusion, including misoeater91 and krygorin4545. krygorin4545’s PGP key was made 2 days prior to today.

A Fedora contributor states that Jia was pushing for its inclusion in Fedora as it contains “great new features”.

Jia Tan also attempted to get it into Ubuntu days before the beta freeze.

As of 9:00 PM UTC, GitHub has suspended JiaT75’s account. Thanks? They also banned the repository, meaning people can no longer audit the changes made to it without resorting to mirrors. Immensely helpful, GitHub. They also suspended Lasse Collin’s account, which is completely disgraceful.

Nah, this is not disgraceful. Lasse Collin bears a huge responsibility. And maybe not only him. Gullibility is not an excuse.

I don’t know who Jonathan Metzman is (OK, “Security engineer working on open source and Chrome security”), but I don’t like what I see here: xz: Disable ifunc to fix Issue 60259. #10667

jonathanmetzman approved these changes Jul 7, 2023

jonathanmetzman commented Mar 29, 2024

In hindsight, this does not “look good to me” 🙂
We’ve disabled the projects for now, but will try to explore how this PR could have prevented discovery of this issue.

Oh, that’s all?! “In hindsight”… we shouldn’t have given access to the nuclear codes to a Chinese malware factory?

Let’s go again to that Debian Bug report logs – #1068024. Joey Hess has a sensible demand:

I count a minimum of 750 commits or contributions to xz by Jia Tan, who backdoored it.

This includes all 700 commits made after they merged a pull request in Jan 7 2023, at which point they appear to have already had direct push access, which would have also let them push commits with forged authors. Probably a number of other commits before that point as well.

Reverting the backdoored version to a previous version is not sufficient to know that Jia Tan has not hidden other backdoors in it. Version 5.4.5 still contains the majority of those commits.

I’d suggest reverting to 5.3.1. Bearing in mind that there were security fixes after that point for ZDI-CAN-16587 that would need to be reapplied.

Unfortunately, this wouldn’t be feasible, answers Aurelien Jarno:

Note that reverting to such an old version will break packages that use new symbols introduced since then. From a quick look, this is at least:
– dpkg
– erofs-utils
– kmod
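The kind of tally Joey Hess describes can be approximated from a repository's history. The sketch below is only an illustration: it assumes the history was exported with `git log --format='%an%x09%cn'` (author name, a tab, committer name, one commit per line), and the function name is hypothetical. It counts commits per committer and lists commits whose recorded author differs from the committer, as with the test infrastructure attributed to Lasse Collin but committed by Jia Tan.

```python
from collections import Counter

# Assumed input: one commit per line, as produced by
#   git log --format='%an%x09%cn'
# (%an = author name, %cn = committer name, %x09 = a tab character).


def tally_commits(log_text: str):
    """Count commits per committer and collect author/committer mismatches."""
    per_committer = Counter()
    mismatches = []
    for line in log_text.splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        author, committer = line.split("\t", 1)
        per_committer[committer] += 1
        if author != committer:
            mismatches.append((author, committer))
    return per_committer, mismatches
```

A mismatch is not suspicious in itself, since maintainers routinely commit patches written by others. Both fields are also free-form text that anyone with push access can set, so this narrows down what to review rather than proving who wrote what.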

How on Earth was it possible to allow a person whose identity had not been verified to become a trusted contributor to such a package?

Mandatory xkcd reference:

It’s a reference to the 2016 left-pad incident:

Yes. But no. Indeed, it’s absurd to have the following piece of code in an NPM repository, and to create such a stupid dependency:

But it’s equally absurd to retrieve such a thing (or to check for updated versions of it) for every single fucking build of every other package that depends on it! Isn’t anyone using any kind of caches?
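To make the caching point concrete, here is a minimal sketch of what I mean, not a description of how npm or any real registry client works. The `fetch` callable is hypothetical and stands in for whatever downloads a package; the cache simply makes sure an exact name and version pair is downloaded once.

```python
from typing import Callable, Dict, Tuple


class PackageCache:
    """Keep fetched package contents in memory, keyed by (name, version)."""

    def __init__(self, fetch: Callable[[str, str], bytes]):
        # `fetch` is a hypothetical function that downloads from a registry.
        self._fetch = fetch
        self._store: Dict[Tuple[str, str], bytes] = {}

    def get(self, name: str, version: str) -> bytes:
        key = (name, version)
        if key not in self._store:
            # Only the first request for an exact version reaches the network.
            self._store[key] = self._fetch(name, version)
        return self._store[key]
```

With pinned versions and a cache like this (in practice on disk, shared between builds), a package disappearing upstream would not break every build that already has a copy of it.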

The xz package isn’t such a strong dependency, and yet, it kinda is. And it’s been infected by an unknown bad actor, who was undeservedly trusted by some nincompoop!

Are we good now?

No, we aren’t good. We don’t know if this is the only backdoor inserted by the so-called Jia Tan. We also don’t know how many other bad actors are currently trusted contributors to other open-source packages and are waiting for the best time to commit their crap!

I wasn’t aware that open-source software development was based on giving Bart Simpson the authority over who’s to be trusted.

What if something like that happens to the Linux kernel and is only discovered one year later? Given the number of commits to a fast-evolving and too-complex kernel, does anyone really believe that any of the “top tier” kernel maintainers who play a crucial role in the review and merge process, i.e., Linus Torvalds himself, then Greg Kroah-Hartman, Andrew Morton, Ingo Molnar, Thomas Gleixner, are able to fully understand and vet every single commit? I don’t.

Let’s run the following experiment. Let’s say you are a seasoned software developer, albeit not very familiar with the Linux kernel architecture. You must have heard of the recent CVE-2024-1086. The researcher who found the flaw published a highly detailed technical report on the bug: Flipping Pages: An analysis of a new Linux vulnerability in nf_tables and hardened exploitation techniques. Tell me how much time you need to understand the report.

Some people try to take it jocularly:

My fortune-mod was more skeptical:

Next thing, some idiot will say that we should trust AI with that. Y tu mamá también.

LATE EDIT: Some more details on the xz case

A security newsletter has more: Risky Biz News: Supply chain attack in Linuxland. Here’s an important point:

How did nobody else spot the backdoor: While Jia Tan added some code to the XZ Utils project, the bulk was never added to the project’s GitHub repo. The actual backdoor resided in the XZ Utils tarballs. These are TAR archives that are auto-generated whenever developers released a new version of the XZ Utils library. Third-party software projects don’t (usually) pull the source code and compile the whole XZ Utils project. They just pull the tarball. Jia Tan modified lines in the tarball configs to load the backdoor, which was hidden in binary test data files. Developers usually audit the source code, (erroneously) thinking that the tarballs perfectly reflect the code. In reality, you can tamper with tarballs and other release archive files quite easily—and other threat actors have done so in the past. Below is a graph of the backdoor’s multi-part components by Thomas Roccia.
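Since the point above is that release tarballs may not match the repository, one crude way to check is to hash every file in an extracted tarball and in a checkout of the corresponding tag, then list what differs. This is a sketch under my own assumptions: both trees are already on disk, and the directory and function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digests(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        # Ignore the repository's own metadata directory.
        if path.is_file() and ".git" not in relative.parts:
            digests[relative.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_trees(tarball_dir: Path, checkout_dir: Path):
    """Return (files only in the tarball, files whose content differs)."""
    tar = file_digests(tarball_dir)
    repo = file_digests(checkout_dir)
    only_in_tarball = sorted(set(tar) - set(repo))
    differing = sorted(p for p in set(tar) & set(repo) if tar[p] != repo[p])
    return only_in_tarball, differing
```

Usage would be something like `compare_trees(Path("xz-extracted"), Path("xz-checkout"))`. Release tarballs legitimately ship generated files such as configure scripts that are absent from the repository, so a non-empty result is not proof of tampering. It only tells a human which files still need to be read.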

This newsletter provides pointers to other insights. Kevin Beaumont:

Before everybody high fives each other, this is how the backdoor was found: somebody happened to look at why CPU usage had increased in sshd, and did all the research and notification work themselves. By this point the backdoor had been there for a month unnoticed.

I’ve made the joke before that if GCHQ aren’t introducing backdoors and vulns in open source that I want a tax refund. It wasn’t a joke. And it won’t just be GCHQ.

Dominic White:

What makes the #xzbackdoor so amazing to me isn’t the tech – is the long game and social aspects to weaken technical defences. Starting two years before, pressuring an unwell dev with sock puppets to take over maintenance, getting oss-fuzz changes merged to weaken detections and adding legitimate new features to encourage distros to package the new version.

Kevin Beaumont, again:

Another two thoughts on XZ –

– sshd itself has no dependency on the XZ utils library. The streams got crossed in a way I don’t think anybody understood (except the threat actor).

– had that backdoor been performant with sshd, I don’t think anybody would have spotted it.

The way this played out opens a window of opportunity to go back and look at both issues.

One more of Kevin’s:

Also since there’s a lot going on here, up thread I mentioned a 2015 minor bug in Google’s OSS Fuzzer (security testing tool) – the threat actor deliberately introduced the bugged function into XZ, then used that to get an exception in OSS Fuzzer’s code to stop scanning of XZ.

I’ve just been looking at the actual backdoor for a few hours with greater minds than me, it’s incredibly complex – it basically piggy backs RSA key RCE inside sshd as a Trojan horse. Somebody/bodies spent $$ on this.

Dave Anderson came with a bonus:

The poor original maintainer of xz is on it now, and has already found another “fun” thing: https://git.tukaani.org/?p=xz.git;a=commitdiff;h=f9cf4c05edd14dedfe63833f8ccbe41b55823b00. The configure check for enabling the Landlock sandboxing facility was subtly broken, so that Landlock support would never get enabled. The original malicious commit landed around the same timeframe as the main backdoor, also at an abnormal time of day compared to the new maintainer’s historical activity pattern.

Life is so beautiful! Have a look at A few relevant quotes.

But the worst thing is that the xz backdoor was discovered purely by accident, not as part of a review process. This is truly chilling.