So far, I have modified rtld-elf (the FreeBSD runtime linker) to handle simple MachO objects. It can now successfully load and run a trivial MachO executable built on a Mac with an external library dependency (libSystem.B.dylib - the C library). The linker also still handles ELF objects, so I call this monstrosity the "Macho ELF".
★ zoe@haru ~/airyx-freebsd/libexec/dyldᐳ llvm-objdump -d hello
hello: file format Mach-O 64-bit x86-64
Disassembly of section __TEXT,__text:
0000000100000fc0 _start:
100000fc0: 48 89 e5 movq %rsp, %rbp
100000fc3: 48 83 ec 10 subq $16, %rsp
100000fc7: e8 14 00 00 00 callq 20 <_main>
100000fcc: 48 83 c4 10 addq $16, %rsp
100000fd0: 48 89 c7 movq %rax, %rdi
100000fd3: 48 c7 c0 01 00 00 00 movq $1, %rax
100000fda: 0f 05 syscall
100000fdc: 0f 1f 40 00 nopl (%rax)
0000000100000fe0 _main:
100000fe0: 55 pushq %rbp
100000fe1: 48 89 e5 movq %rsp, %rbp
100000fe4: 89 7d fc movl %edi, -4(%rbp)
100000fe7: 48 89 75 f0 movq %rsi, -16(%rbp)
100000feb: b8 7f 00 00 00 movl $127, %eax
100000ff0: 5d popq %rbp
100000ff1: c3 retq
This file corresponds to the trivial C code
int main(int argc, char **argv) {
    return 127;
}
together with the equivalent of the libc crt1.o, which adds _start and handles the exit code by calling the _exit syscall. (I do this because I don't have a libc.dylib yet.)
Dyld (our modified rtld-elf) parses and loads the Mach header, then jumps to the object's entry point (i.e. _start), which was obtained by parsing the LC_MAIN command from the Mach header. The program then exits correctly with the return code (below). This isn't much but is promising. The next step will be to parse dynamic symbols and resolve them in the listed dylibs obtained from the Mach header.
ᐳ ./dyld ~/airyx-freebsd/libexec/dyld/hello
opening main program in direct exec mode
Parsing command-line arguments
argv[1]: '/Users/zoe/airyx-freebsd/libexec/dyld/hello'
MachO object
hdr.magic 0xfeedfacf
hdr.cputype 1000007
hdr.cpusubtype 0x80000003
hdr.filetype 0x2
hdr.ncmds 9
hdr.sizeofcmds 520
hdr.flags 0x85
0. lc.cmd 19 lc.cmdsize 72
segment __PAGEZERO vmaddr 0 size 100000000 fileoff 0 size 0
1. lc.cmd 19 lc.cmdsize 152
segment __TEXT vmaddr 100000000 size 1000 fileoff 0 size 1000
2. lc.cmd 19 lc.cmdsize 72
segment __LINKEDIT vmaddr 100001000 size 1000 fileoff 1000 size c0
3. lc.cmd 22 lc.cmdsize 48
4. lc.cmd 2 lc.cmdsize 24
5. lc.cmd b lc.cmdsize 80
6. lc.cmd e lc.cmdsize 32
7. lc.cmd 2a lc.cmdsize 16
8. lc.cmd 28 lc.cmdsize 24
entry point fc0 stacksize 0
mapbase 801069000 data_vaddr 0 base 0
Overlaying segment 1 @0x801069000 sz 0x1000 5 20012 off 0x0 0
mapbase 801069000 data_vaddr 1000 base 0
Overlaying segment 2 @0x80106a000 sz 0x1000 1 20012 off 0x1000 1000
transferring control to program entry point = 0x801069fc0
ᐳ echo $?
127
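To make the header-walking step concrete, here is a minimal sketch in C of how one might scan the load commands for LC_MAIN. It is not the actual dyld code: the struct layouts follow Apple's <mach-o/loader.h> but are redefined locally so the sketch builds on any system, and find_entryoff is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values and layouts as in Apple's <mach-o/loader.h>, redefined locally. */
#define MH_MAGIC_64 0xfeedfacfu
#define LC_REQ_DYLD 0x80000000u
#define LC_MAIN     (0x28u | LC_REQ_DYLD)

struct mach_header_64 {
    uint32_t magic;
    int32_t  cputype;
    int32_t  cpusubtype;
    uint32_t filetype;
    uint32_t ncmds;
    uint32_t sizeofcmds;
    uint32_t flags;
    uint32_t reserved;
};

struct load_command {
    uint32_t cmd;
    uint32_t cmdsize;
};

struct entry_point_command {
    uint32_t cmd;
    uint32_t cmdsize;
    uint64_t entryoff;   /* file offset of the entry point */
    uint64_t stacksize;
};

/*
 * Hypothetical helper: walk the load commands in an in-memory image and
 * store the LC_MAIN entry offset. Returns 0 on success, -1 otherwise.
 */
static int
find_entryoff(const uint8_t *image, size_t len, uint64_t *entryoff)
{
    struct mach_header_64 hdr;

    assert(image != NULL && entryoff != NULL);
    if (len < sizeof(hdr))
        return -1;
    memcpy(&hdr, image, sizeof(hdr));
    if (hdr.magic != MH_MAGIC_64 || hdr.sizeofcmds > len - sizeof(hdr))
        return -1;

    const uint8_t *p = image + sizeof(hdr);
    const uint8_t *end = p + hdr.sizeofcmds;
    for (uint32_t i = 0; i < hdr.ncmds; i++) {
        struct load_command lc;

        if ((size_t)(end - p) < sizeof(lc))
            return -1;
        memcpy(&lc, p, sizeof(lc));
        if (lc.cmdsize < sizeof(lc) || lc.cmdsize > (size_t)(end - p))
            return -1;
        if (lc.cmd == LC_MAIN &&
            lc.cmdsize >= sizeof(struct entry_point_command)) {
            struct entry_point_command ep;

            memcpy(&ep, p, sizeof(ep));
            *entryoff = ep.entryoff;
            return 0;
        }
        p += lc.cmdsize;   /* commands are packed back to back */
    }
    return -1;
}
```

With the values from the log above, an entry offset of 0xfc0 added to the map base 0x801069000 gives 0x801069fc0, which is the address dyld reports transferring control to.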
Comparing this with the disassembly of the same C code compiled on macOS (below), we can see that there is no _start symbol or preamble corresponding to the ELF CRT. This implies that either the kernel or dyld takes that responsibility. So, let's add the crt1 code above to our dyld as a wrapper to the entry point.
★ zoe@kawa ~/Projects/junkᐳ objdump -d hello
hello: file format mach-o 64-bit x86-64
Disassembly of section __TEXT,__text:
0000000100003f90 <_main>:
100003f90: 55 pushq %rbp
100003f91: 48 89 e5 movq %rsp, %rbp
100003f94: c7 45 fc 00 00 00 00 movl $0, -4(%rbp)
100003f9b: 89 7d f8 movl %edi, -8(%rbp)
100003f9e: 48 89 75 f0 movq %rsi, -16(%rbp)
100003fa2: b8 7e 00 00 00 movl $126, %eax
100003fa7: 5d popq %rbp
100003fa8: c3 retq
The magic below in rtld_start.S seems to work for a trivial case. Happily, rdi contains a pointer to our argc and argv, so we just have to pass these in the right registers. Next: symbol resolution!
.rtld_goto_main: # This symbol exists just to make debugging easier.
/*
* MachO executables don't have _start or the typical crt preamble,
* so we have to set up the stack and handle exiting
*/
movq (%rsp), %r15 # address of obj_main
movl 0x2b0(%r15), %ecx # offset to 'is macho' flag
cmpl $1, %ecx
jne .jump_elf
movq %rsp, %rbp
pushq %rdi # args ptr
movq %rdi, %rsi
addq $8, %rsi # addr of argv
movl (%rdi), %ecx # argc
xor %rdi, %rdi
movl %ecx, %edi
callq *%rax # call entry point
addq $8, %rsp
movq %rax, %rdi # return code
movq $1, %rax # _sys_exit
syscall
.jump_elf:
jmp *%rax # Enter main program
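For readers who find the register shuffling hard to follow, the Mach-O branch of the assembly above can be read roughly as the following C. This is only an illustrative sketch: call_macho_main and sample_main are hypothetical names, `args` plays the role of %rdi, and the real wrapper hands the result to the exit syscall rather than returning it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*macho_main_fn)(int, char **);

/* Stand-in for the Mach-O program's main(), for demonstration only. */
static int
sample_main(int argc, char **argv)
{
    return (argc == 1 && argv[0] != NULL && argv[1] == NULL) ? 127 : 1;
}

/*
 * Hypothetical C rendering of the wrapper. `args` points at argc, and the
 * argv pointers follow immediately after it, 8 bytes further on.
 */
static int
call_macho_main(macho_main_fn entry, uintptr_t *args)
{
    assert(entry != NULL && args != NULL);
    int argc = (int)args[0];            /* movl (%rdi), %ecx */
    char **argv = (char **)(args + 1);  /* addq $8, %rsi */
    return entry(argc, argv);           /* callq *%rax */
}
```

The value returned here corresponds to what the assembly moves into %rdi before issuing the exit syscall.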
MachO objects have full paths to their dependencies, so finding them is easy. I expanded our little test program to the following:
(__TEXT,__text) section
_main:
100003f50: 55 pushq %rbp
100003f51: 48 89 e5 movq %rsp, %rbp
100003f54: 48 83 ec 10 subq $16, %rsp
100003f58: c7 45 fc 00 00 00 00 movl $0, -4(%rbp)
100003f5f: 89 7d f8 movl %edi, -8(%rbp)
100003f62: 48 89 75 f0 movq %rsi, -16(%rbp)
100003f66: 48 8d 3d 2d 00 00 00 leaq 45(%rip), %rdi ## literal pool for: "Hello from MachO!"
100003f6d: e8 08 00 00 00 callq 0x100003f7a ## symbol stub for: _puts
100003f72: 31 c0 xorl %eax, %eax
100003f74: 48 83 c4 10 addq $16, %rsp
100003f78: 5d popq %rbp
100003f79: c3 retq
or this in C:
#include <stdio.h>
int main(int argc, char **argv) {
    puts("Hello from MachO!");
    return 0;
}
Since we are now calling a library function, the program depends on libSystem.
ᐳ llvm-objdump -m --dylibs-used ~/hello
/Users/zoe/hello:
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.0.0)
These dependencies are found in the LC_LOAD_DYLIB commands in the Mach header, and we can extract them during mapping of the object. For now, let's symlink libSystem.B.dylib to /lib/libc.so.7 and try to resolve _puts against our ELF C library, since dyld already knows how to link ELF. One minor complication is that ELF objects expect the main program to have two symbols, __progname and environ, which MachO objects don't have. We can overcome this by creating a tiny shared object to provide them and injecting it as a link dependency from rtld.
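As an illustration of pulling a dependency path out of an LC_LOAD_DYLIB command, here is a small sketch in C. The struct mirrors the dylib_command layout from Apple's <mach-o/loader.h> (flattened and redefined locally); dylib_path and unpack_version are hypothetical helper names, not functions from rtld or Apple's dyld.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LC_LOAD_DYLIB 0xcu

/* Flattened form of struct dylib_command from <mach-o/loader.h>. */
struct dylib_command {
    uint32_t cmd;
    uint32_t cmdsize;
    uint32_t name_offset;            /* path offset from start of command */
    uint32_t timestamp;
    uint32_t current_version;        /* packed as xxxx.yy.zz */
    uint32_t compatibility_version;
};

/*
 * Hypothetical helper: return the NUL-terminated install path stored in
 * an LC_LOAD_DYLIB command, or NULL if the command looks malformed.
 */
static const char *
dylib_path(const uint8_t *cmd, size_t avail)
{
    struct dylib_command dc;

    assert(cmd != NULL);
    if (avail < sizeof(dc))
        return NULL;
    memcpy(&dc, cmd, sizeof(dc));
    if (dc.cmd != LC_LOAD_DYLIB || dc.cmdsize < sizeof(dc) ||
        dc.cmdsize > avail)
        return NULL;
    if (dc.name_offset < sizeof(dc) || dc.name_offset >= dc.cmdsize)
        return NULL;
    /* The path must terminate inside the command itself. */
    if (memchr(cmd + dc.name_offset, '\0',
        dc.cmdsize - dc.name_offset) == NULL)
        return NULL;
    return (const char *)(cmd + dc.name_offset);
}

/* Split a packed version word into major.minor.patch. */
static void
unpack_version(uint32_t v, unsigned *major, unsigned *minor, unsigned *patch)
{
    *major = v >> 16;
    *minor = (v >> 8) & 0xff;
    *patch = v & 0xff;
}
```

The packed version 0x051f0000 unpacks to 1311.0.0, matching the current version that llvm-objdump reports for libSystem.B.dylib.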
So now dyld will load and parse our progname.so and /usr/lib/libSystem.B.dylib (which is actually libc.so.7).
relocating "/Users/zoe/hello"
object /Users/zoe/hello has no run-time symbol table
relocating "/usr/lib/libSystem.B.dylib"
relocating "/Users/zoe/airyx-freebsd/libexec/dyld/progname.so"
doing copy relocations
initializing initial thread local storage
initializing key program variables
"__progname": *0x801475730 <-- 0x7fffffffea92
"environ": *0x801475738 <-- 0x7fffffffe7b0
This will crash though, because we aren't actually binding any MachO stubs yet; our function call jumps off into the abyss. Let's fix that.
After some single stepping in GDB, it looks pretty straightforward. Each external symbol is invoked with a call to an individual address in section __TEXT,__stubs (one per symbol). Each location simply jumps to the address stored in a table in the __DATA section. So ... pre-fill all those table entries with a lookup function which fills in the real address, then jump to the real address. This is handled for ELF by _rtld_bind and for MachO by dyld_stub_binder in Apple's dyld. Keeping with our fake dylib, let's start by crafting a modified _rtld_bind that reads MachO stub names and resolves them in ELF libraries.
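The lazy-binding idea can be modeled in plain C with a table of function pointers. The sketch below is a toy simulation of the mechanism only; every name in it (la_symbol_ptr, resolve_symbol, lazy_binder_slot0, and so on) is made up for illustration, and a real binder works on raw addresses and assembly stubs rather than typed C functions.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*fn_t)(int);

static int real_calls;    /* how often the real function ran */
static int binder_calls;  /* how often the binder ran */

/* Stands in for the pointer table in __DATA: one slot per symbol. */
static fn_t la_symbol_ptr[1];

/* The "library" function the program really wants to call. */
static int
real_function(int x)
{
    real_calls++;
    return x + 1;
}

/* Hypothetical lookup; a real binder searches loaded objects by name. */
static fn_t
resolve_symbol(size_t slot)
{
    (void)slot;
    return real_function;
}

/* Pre-filled into the slot: resolve once, patch the slot, forward. */
static int
lazy_binder_slot0(int x)
{
    binder_calls++;
    fn_t target = resolve_symbol(0);
    assert(target != NULL);
    la_symbol_ptr[0] = target;  /* later calls skip the binder */
    return target(x);
}

/* The "stub": nothing more than an indirect jump through the table. */
static int
call_via_stub(int x)
{
    return la_symbol_ptr[0](x);
}

/* Done once at load time: point every slot at the binder. */
static void
init_stubs(void)
{
    la_symbol_ptr[0] = lazy_binder_slot0;
}
```

After init_stubs(), the first call_via_stub() goes through the binder and patches the slot; every later call lands directly in real_function.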
A few complexities later, I have a working symbol binder!!
This turned out to be tricky because MachO files use a series of linker opcodes to control symbol binding, and I had to implement an interpreter for those opcodes. (See Apple's source code to dyld.) The basic process is that our dyld fills in the address of __stub_helper (in the executable) for all lazy_bind symbols when it initializes. The __stub_helper is then invoked the first time that symbol is used, and calls into our dyld_macho_bind_stub() function which looks like this:
uint64_t
dyld_macho_bind_stub(void **cache, uint64_t slot)
{
    int lib_ordinal = -1;
    int seg_idx = -1;
    char *symbol = NULL;
    uint64_t address = 0;
    uint32_t off = ((const macho_lc_dyld_info_only *)obj_main->dynamic)->lazy_bind_off;
    uint32_t sz = ((const macho_lc_dyld_info_only *)obj_main->dynamic)->lazy_bind_size;
    const uint8_t *start = obj_main->relocbase + off + slot;
    const uint8_t *end = start + sz;
    const uint8_t *p = start;
    int done = 0;
    dbg("lazy_bind linkedit %p off %x sz %x slot %lx start %p end %p",obj_main->linkedit,off,sz,slot,start,end);
    while(!done && (p < end)) {
        uint8_t imm = *p & BIND_IMMEDIATE_MASK;
        uint8_t op = *p & BIND_OPCODE_MASK;
        ++p;
        switch(op) {
        case BIND_OPCODE_DONE:
            dbg("BIND_OPCODE_DONE");
            // these are ignored for lazy binding
            break;
        case BIND_OPCODE_SET_DYLIB_ORDINAL_IMM:
            dbg("BIND_OPCODE_SET_DYLIB_ORDINAL_IMM");
            lib_ordinal = imm;
            break;
...
(This is only partly implemented - lots of opcodes still to handle!) We decode the linker instructions for binding a real address to the stub. The segment, stub address, symbol name string, and dylib entry number are all extracted from the opcodes. Then we call a modified ELF symbol lookup that searches for the required name in all loaded objects and returns the real address.
    rlock_acquire(rtld_bind_lock, &lockstate);
    if (sigsetjmp(lockstate.env, 0) != 0)
        lock_upgrade(rtld_bind_lock, &lockstate);
    /* not local */
    symlook_init(&req, p);
    req.flags = SYMLOOK_IN_PLT;
    req.ventry = NULL;
    req.lockstate = &lockstate;
    res = symlook_default(&req, obj);
    if (res == 0) {
        def = req.sym_out;
        defobj = req.defobj_out;
    }
    if (def == NULL)
        rtld_die();
    if (ELF_ST_TYPE(def->st_info) == STT_GNU_IFUNC)
        target = (Elf_Addr)rtld_resolve_ifunc(defobj, def);
    else
        target = (Elf_Addr)(defobj->relocbase + def->st_value);
    dbg("\"%s\" in \"%s\" ==> %p in \"%s\"",
        name,
        obj->path == NULL ? NULL : basename(obj->path),
        (void *)target,
        defobj->path == NULL ? NULL : basename(defobj->path));
    lock_release(rtld_bind_lock, &lockstate);
    return target;
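Stepping back to the opcode stream for a moment: the part elided above is mostly a matter of decoding a few opcodes and their ULEB128 operands. The following is a self-contained sketch of a decoder for one lazy-bind entry. It handles only the opcodes that show up in the _puts example, the constants are the values from Apple's <mach-o/loader.h>, and struct bind_info, read_uleb128 and decode_lazy_bind are hypothetical names rather than the functions used in this dyld.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BIND_OPCODE_MASK                          0xF0
#define BIND_IMMEDIATE_MASK                       0x0F
#define BIND_OPCODE_DONE                          0x00
#define BIND_OPCODE_SET_DYLIB_ORDINAL_IMM         0x10
#define BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM 0x40
#define BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB   0x70
#define BIND_OPCODE_DO_BIND                       0x90

struct bind_info {            /* hypothetical result record */
    int lib_ordinal;
    int seg_idx;
    uint64_t seg_offset;
    const char *symbol;
};

/* ULEB128: 7 payload bits per byte, low bits first, 0x80 means "more". */
static const uint8_t *
read_uleb128(const uint8_t *p, const uint8_t *end, uint64_t *out)
{
    uint64_t value = 0;
    unsigned shift = 0;

    while (p < end) {
        uint8_t byte = *p++;
        if (shift < 64)
            value |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
        if ((byte & 0x80) == 0) {
            *out = value;
            return p;
        }
    }
    return NULL;              /* ran off the end of the stream */
}

/* Decode up to the first DO_BIND. Returns 0 on success, -1 otherwise. */
static int
decode_lazy_bind(const uint8_t *p, const uint8_t *end, struct bind_info *bi)
{
    assert(p != NULL && end != NULL && bi != NULL);
    bi->lib_ordinal = -1;
    bi->seg_idx = -1;
    bi->seg_offset = 0;
    bi->symbol = NULL;
    while (p < end) {
        uint8_t imm = *p & BIND_IMMEDIATE_MASK;
        uint8_t op = *p & BIND_OPCODE_MASK;
        const uint8_t *nul;

        ++p;
        switch (op) {
        case BIND_OPCODE_DONE:          /* skipped, as in the code above */
            break;
        case BIND_OPCODE_SET_DYLIB_ORDINAL_IMM:
            bi->lib_ordinal = imm;
            break;
        case BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM:
            /* The symbol name follows inline as a C string. */
            nul = memchr(p, '\0', (size_t)(end - p));
            if (nul == NULL)
                return -1;
            bi->symbol = (const char *)p;
            p = nul + 1;
            break;
        case BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB:
            bi->seg_idx = imm;
            p = read_uleb128(p, end, &bi->seg_offset);
            if (p == NULL)
                return -1;
            break;
        case BIND_OPCODE_DO_BIND:
            return (bi->symbol != NULL && bi->seg_idx >= 0) ? 0 : -1;
        default:                        /* opcode not handled here */
            return -1;
        }
    }
    return -1;
}
```

Fed a stream that sets segment 3 at offset 0, dylib ordinal 1 and the name _puts before DO_BIND, the decoder yields the same segment, ordinal and symbol that appear in the debug log for this binary.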
Now that we have the real address, we just stick that into the stub address so the next invocation will go directly to the real routine, and we jump to it. Voila!! A working function call from MachO to ELF! With some less relevant debug output removed, it looks like this:
★ zoe@haru ~/obj.amd64/Users/zoe/airyx-freebsd/amd64.amd64/libexec/dyldᐳ ./dyld ~/hello
...
_rtld_thread_init: done
loading main program
MachO object
hdr.magic 0xfeedfacf
LC_SEGMENT_64 __PAGEZERO vmaddr 0 size 100000000 fileoff 0 size 0
LC_SEGMENT_64 __TEXT vmaddr 100000000 size 4000 fileoff 0 size 4000
section __text addr 100003f50 size 2a offset 3f50 res1 0
section __stubs addr 100003f7a size 6 offset 3f7a res1 0
section __stub_helper addr 100003f80 size 1a offset 3f80 res1 0
section __cstring addr 100003f9a size 12 offset 3f9a res1 0
section __unwind_info addr 100003fac size 48 offset 3fac res1 0
LC_SEGMENT_64 __DATA_CONST vmaddr 100004000 size 4000 fileoff 4000 size 4000
section __got addr 100004000 size 8 offset 4000 res1 1
LC_SEGMENT_64 __DATA vmaddr 100008000 size 4000 fileoff 8000 size 4000
section __la_symbol_ptr addr 100008000 size 8 offset 8000 res1 2
section __data addr 100008008 size 8 offset 8008 res1 0
LC_SEGMENT_64 __LINKEDIT vmaddr 10000c000 size 4000 fileoff c000 size 110
LC_DYLD_INFO_ONLY
LC_SYMTAB sym off c068 count 5 string offset c0c8 size 48
LC_DYSYMTAB indirect offset c0b8 count 3
LC_LOAD_DYLINKER name=/usr/lib/dyld
LC_MAIN entry point 3f50 stacksize 0
LC_LOAD_DYLIB /usr/lib/libSystem.B.dylib cur ver = 051f0000 compat ver = 00010000
...
/Users/zoe/hello: base 0x801069000 sz 10000 vbase 0 tsz 4000 entry 0x80106cf50 reloc 0x801069000
No AT_EXECPATH or direct exec
obj_main path /Users/zoe/hello
macho_fixup_stubs reloc 0x801071000 --> 0x80106cf90
/Users/zoe/hello valid_hash_sysv 0 valid_hash_gnu 0 dynsymcount 3
lm_init("(null)")
loading LD_PRELOAD libraries
loading needed objects
loading "/usr/lib/libSystem.B.dylib"
Ignoring d_tag 1879048185 = 0x6ffffff9
/usr/lib/libSystem.B.dylib valid_hash_sysv 1 valid_hash_gnu 1 dynsymcount 3278
0x801079000 .. 0x801472fff: /usr/lib/libSystem.B.dylib
loading "/Users/zoe/airyx-freebsd/libexec/dyld/progname.so"
Ignoring d_tag 1879048185 = 0x6ffffff9
/Users/zoe/airyx-freebsd/libexec/dyld/progname.so valid_hash_sysv 1 valid_hash_gnu 1 dynsymcount 7
0x801473000 .. 0x801476fff: /Users/zoe/airyx-freebsd/libexec/dyld/progname.so
checking for required versions
initializing initial thread local storage offsets
relocating "/Users/zoe/hello"
object /Users/zoe/hello has no run-time symbol table
relocating "/usr/lib/libSystem.B.dylib"
relocating "/Users/zoe/airyx-freebsd/libexec/dyld/progname.so"
doing copy relocations
initializing initial thread local storage
initializing key program variables
"__progname": *0x801476730 <-- 0x7fffffffea82
"environ": *0x801476738 <-- 0x7fffffffe7a0
"__elf_aux_vector": *0x8014729e0 <-- 0x7fffffffe8b8
...
enforcing main obj relro
transferring control to program entry point = 0x80106cf50
lazy_bind linkedit 0x801075000 off c020 sz 10 slot 0 start 0x801075020 end 0x801075030
BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB
BIND_OPCODE_DONE
BIND_OPCODE_SET_DYLIB_ORDINAL_IMM
BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM
BIND_OPCODE_DO_BIND
binding symbol _puts in segment 3 at 801071000
"_puts" in "hello" ==> 0x80120ce90 in "libSystem.B.dylib"
"strlen" in "libSystem.B.dylib" ==> 0x801231750 in "libSystem.B.dylib"
reloc_jmpslot: *0x80124aa00 = 0x801231750
"_fstat" in "libSystem.B.dylib" ==> 0x801235bc0 in "libSystem.B.dylib"
reloc_jmpslot: *0x80124b148 = 0x801235bc0
"__sys_fstat" in "libSystem.B.dylib" ==> 0x801236770 in "libSystem.B.dylib"
reloc_jmpslot: *0x80124c530 = 0x801236770
"malloc" in "libSystem.B.dylib" ==> 0x80119a650 in "libSystem.B.dylib"
reloc_jmpslot: *0x80124aa18 = 0x80119a650
"isatty" in "libSystem.B.dylib" ==> 0x801235890 in "libSystem.B.dylib"
reloc_jmpslot: *0x80124b548 = 0x801235890
"tcgetattr" in "libSystem.B.dylib" ==> 0x801236120 in "libSystem.B.dylib"
reloc_jmpslot: *0x80124b5d0 = 0x801236120
"_ioctl" in "libSystem.B.dylib" ==> 0x8012367d0 in "libSystem.B.dylib"
reloc_jmpslot: *0x80124ab48 = 0x8012367d0
"memchr" in "libSystem.B.dylib" ==> 0x801234d60 in "libSystem.B.dylib"
reloc_jmpslot: *0x80124aa80 = 0x801234d60
"_write" in "libSystem.B.dylib" ==> 0x8012366f0 in "libSystem.B.dylib"
reloc_jmpslot: *0x80124ae80 = 0x8012366f0
Hello from MachO!
I appreciate the notes on the gory details. I am contemplating something similar and this is a nice example to wrap my head around.