@0xdevalias
Last active July 17, 2024 17:10
Some notes, tools, and techniques for reverse engineering macOS binaries

Reverse Engineering on macOS

Some notes, tools, and techniques for reverse engineering macOS binaries.

Reverse Engineering Tools

Binary Ninja

Ghidra

  • https://ghidra-sre.org/
    • A software reverse engineering (SRE) suite of tools developed by NSA's Research Directorate in support of the Cybersecurity mission

Hex-Rays IDA

  • https://hex-rays.com/
    • https://hex-rays.com/ida-free/
      • This (completely!) free version of IDA offers a privileged opportunity to see IDA in action. This light but powerful tool can quickly analyze binary code samples, and users can save and look closer at the analysis results.

    • https://hex-rays.com/ida-home/
      • IDA Home was introduced thanks to the experience Hex-Rays has gained over the years, to offer hobbyists a solution that combines speed and reliability with the level of quality and responsiveness of support that any professional reverse engineer should expect.

    • https://hex-rays.com/ida-pro/
      • IDA Pro as a disassembler is capable of creating maps of a program's execution to show the binary instructions that are actually executed by the processor in a symbolic representation (assembly language). Advanced techniques have been implemented into IDA Pro so that it can generate assembly language source code from machine-executable code and make this complex code more human-readable.

        The debugging feature augments IDA with dynamic analysis. It supports multiple debugging targets and can handle remote applications. Its cross-platform debugging capability enables instant debugging, easy connection to both local and remote processes, support for 64-bit systems, and new connection possibilities.

    • https://www.hex-rays.com/products/ida/debugger/mac/
    • https://hex-rays.com/products/ida/news/8_3/

radare2

Frida / etc

  • https://frida.re/
    • Dynamic instrumentation toolkit for developers, reverse-engineers, and security researchers.

    • Scriptable: Inject your own scripts into black box processes. Hook any function, spy on crypto APIs or trace private application code, no source code needed. Edit, hit save, and instantly see the results. All without compilation steps or program restarts.

      Portable: Works on Windows, macOS, GNU/Linux, iOS, watchOS, tvOS, Android, FreeBSD, and QNX. Install the Node.js bindings from npm, grab a Python package from PyPI, or use Frida through its Swift bindings, .NET bindings, Qt/Qml bindings, Go bindings, or C API. We also have a scalable footprint.

      Free: Frida is and will always be free software (free as in freedom). We want to empower the next generation of developer tools, and help other free software developers achieve interoperability through reverse engineering.

      Battle-tested: We are proud that NowSecure is using Frida to do fast, deep analysis of mobile apps at scale. Frida has a comprehensive test-suite and has gone through years of rigorous testing across a broad range of use-cases.

  • https://github.com/frida
  • https://github.com/Ch0pin/medusa
    • medusa: Binary instrumentation framework based on FRIDA

    • MEDUSA is an extensible and modularized framework that automates processes and techniques practiced during the dynamic analysis of Android and iOS Applications.

  • https://github.com/rsenet/FriList
    • Collection of useful FRIDA Mobile Scripts

    • Categories: Observer, Security Bypass, Static Analysis, Specific Software, Other

Reversing C++ Binaries

Unsorted

C++ vtables

std::string

  • https://shaharmike.com/cpp/std-string/
    • Exploring std::string

    • Every C++ developer knows that std::string represents a sequence of characters in memory. It manages its own memory, and is very intuitive to use. Today we’ll explore std::string as defined by the C++ Standard, and also by looking at 4 major implementations.

    • One particular optimization found its way to pretty much all implementations: small objects optimization (aka small buffer optimization). Simply put, Small Object Optimization means that the std::string object has a small buffer for small strings, which saves dynamic allocations.

    • Recent GCC versions use a union of buffer (16 bytes) and capacity (8 bytes) to store small strings. Since reserve() is mandatory (more on this later), the internal pointer to the beginning of the string either points to this union or to the dynamically allocated string.

    • clang is by far the smartest and coolest. While std::string has the size of 24 bytes, it allows strings up to 22 bytes(!!) with no allocation. To achieve this libc++ uses a neat trick: the size of the string is not saved as-is but rather in a special way: if the string is short (< 23 bytes) then it stores size() * 2. This way the least significant bit is always 0. The long form always bitwise-ors the LSB with 1, which in theory might have meant unnecessarily larger allocations, but this implementation always rounds allocations to be of form 16*n - 1 (where n is an integer). By the way, the allocated string is actually of form 16*n, the last character being '\0'.

  • https://tastycode.dev/memory-layout-of-std-string/
    • Memory Layout of std::string

    • Discover how std::string is represented in the most popular C++ Standard Libraries, such as MSVC STL, GCC libstdc++, and LLVM libc++.

    • In this post of the Tasty C++ series we’ll look inside std::string, so that you can work more effectively with C++ strings, taking advantage of the C++ Standard Library you are using while avoiding its pitfalls.

    • In C++ Standard Library, std::string is one of the three contiguous containers (together with std::array and std::vector). This means that a sequence of characters is stored in a contiguous area of the memory and an individual character can be efficiently accessed by its index at O(1) time. The C++ Standard imposes more requirements on the complexity of string operations, which we will briefly focus on later in this post.

    • If we are talking about the C++ Standard, it’s important to remember that it doesn’t impose an exact implementation of std::string, nor does it specify the exact size of std::string. In practice, as we’ll see, the most popular implementations of the C++ Standard Library allocate 24 or 32 bytes for the same std::string object (excluding the data buffer). On top of that, the memory layout of string objects is also different, which is a result of a tradeoff between optimal memory and CPU utilization, as we’ll also see below.

    • For people just starting to work with strings in C++, std::string is usually associated with three data fields:

      • Buffer – the buffer where string characters are stored, allocated on the heap.
      • Size – the current number of characters in the string.
      • Capacity – the maximum number of characters the buffer can hold, i.e. the size of the buffer.

      In C++ terms, this picture could be expressed as the following class:

      class TastyString {
        char *    m_buffer;     //  string characters
        size_t    m_size;       //  number of characters
        size_t    m_capacity;   //  m_buffer size
      };
      

      This representation takes 24 bytes and is very close to the production code.

  • https://stackoverflow.com/questions/5058676/stdstring-implementation-in-gcc-and-its-memory-overhead-for-short-strings
    • std::string implementation in GCC and its memory overhead for short strings

    • At least with GCC 4.4.5, which is what I have handy on this machine, std::string is a typedef for std::basic_string<char>, and basic_string is defined in /usr/include/c++/4.4.5/bits/basic_string.h. There's a lot of indirection in that file, but what it comes down to is that nonempty std::strings store a pointer to one of these:

      struct _Rep_base
      {
        size_type       _M_length;
        size_type       _M_capacity;
        _Atomic_word        _M_refcount;
      };
      

      Followed in-memory by the actual string data. So std::string is going to have at least three words of overhead for each string, plus any overhead for having a higher capacity than length (probably not, depending on how you construct your strings -- you can check by asking the capacity() method).

      There's also going to be overhead from your memory allocator for doing lots of small allocations; I don't know what GCC uses for C++, but assuming it's similar to the dlmalloc allocator it uses for C, that could be at least two words per allocation, plus some space to align the size to a multiple of at least 8 bytes.
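When a libc++ std::string turns up in a memory dump, the short/long encoding quoted above from shaharmike.com can be decoded by hand. The Python sketch below assumes a 64-bit little-endian target and the 24-byte object that post describes. The field order (size byte first for short strings; capacity, size, pointer for long strings) and the name `decode_libcxx_string` are assumptions made for illustration, so verify them against the libc++ version actually linked into the binary.

```python
import struct

def decode_libcxx_string(raw: bytes) -> dict:
    """Decode a 24-byte libc++ std::string object (assumed layout, see lead-in)."""
    if len(raw) != 24:
        raise ValueError("expected exactly 24 bytes")

    # The least significant bit of the first byte distinguishes the two forms.
    is_long = raw[0] & 1

    if not is_long:
        # Short form: the first byte holds size() * 2, characters follow inline.
        size = raw[0] >> 1
        return {"long": False, "size": size, "data": raw[1:1 + size]}

    # Long form (assumed order): capacity with the flag bit set, size, heap pointer.
    cap_word, size, ptr = struct.unpack("<QQQ", raw)
    return {"long": True, "size": size, "capacity": cap_word & ~1, "pointer": ptr}

# Hypothetical sample: "hello" stored inline (5 * 2 = 0x0A in the first byte).
short_raw = bytes([0x0A]) + b"hello" + bytes(18)
print(decode_libcxx_string(short_raw))
```

For the long form the characters live on the heap, so the returned pointer still has to be followed in the dump or debugger to read the actual string.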

std::vector

Universal (Fat) Binaries

  • https://developer.apple.com/documentation/apple-silicon/building-a-universal-macos-binary
    • Building a Universal macOS Binary

    • Create macOS apps and other executables that run natively on both Apple silicon and Intel-based Mac computers.

    • https://developer.apple.com/documentation/apple-silicon/building-a-universal-macos-binary#Update-the-Architecture-List-of-Custom-Makefiles
      • To create a universal binary for your project, merge the resulting executable files into a single executable binary using the lipo tool.

      • lipo -create -output universal_app x86_app arm_app

    • https://developer.apple.com/documentation/apple-silicon/building-a-universal-macos-binary#Determine-Whether-Your-Binary-Is-Universal
      • Determine Whether Your Binary Is Universal

        To users, a universal binary looks no different than a binary built for a single architecture. When you build a universal binary, Xcode compiles your source files twice, once for each architecture. After linking the binaries for each architecture, Xcode then merges the architecture-specific binaries into a single executable file using the lipo tool. If you build the source files yourself, you must call lipo as part of your build scripts to merge your architecture-specific binaries into a single universal binary.

        To see the architectures present in a built executable file, run the lipo or file command-line tools. When running either tool, specify the path to the actual executable file, not to any intermediate directories such as the app bundle. For example, the executable file of a macOS app is in the Contents/MacOS/ directory of its bundle. When running the lipo tool, include the -archs parameter to see the architectures.

      • % lipo -archs /System/Applications/Mail.app/Contents/MacOS/Mail
        x86_64 arm64
      • To obtain more information about each architecture, pass the -detailed_info argument to lipo.

    • https://developer.apple.com/documentation/apple-silicon/building-a-universal-macos-binary#Specify-the-Launch-Behavior-of-Your-App
      • Specify the Launch Behavior of Your App

        For universal binaries, the system prefers to execute the slice that is native to the current platform. On an Intel-based Mac computer, the system always executes the x86_64 slice of the binary. On Apple silicon, the system prefers to execute the arm64 slice when one is present. Users can force the system to run the app under Rosetta translation by enabling the appropriate option from the app’s Get Info window in the Finder.

        If you never want users to run your app under Rosetta translation, add the LSRequiresNativeExecution key to your app’s Info.plist file. When that key is present and set to YES, the system prevents your app from running under translation. In addition, the system removes the Rosetta translation option from your app’s Get Info window. Don’t include this key until you verify that your app runs correctly on both Apple silicon and Intel-based Mac computers.

        If you want to prioritize one architecture, without preventing users from running your app under translation, add the LSArchitecturePriority key to your app’s Info.plist file. The value of this key is an ordered array of strings, which define the priority order for selecting an architecture.

  • https://ss64.com/osx/lipo.html
    • lipo: Create or operate on a universal file: convert a universal binary to a single architecture file, or vice versa.

    • lipo produces one output file, and never alters the input file.

    • lipo can: list the architecture types in a universal file; create a single universal file from one or more input files; thin out a single universal file to one specified architecture type; and extract, replace, and/or remove architecture types from the input file to create a single new universal output file.

  • https://github.com/konoui/lipo
    • LIPO: This lipo is designed to be compatible with macOS lipo, a utility for creating Universal Binaries, also known as Fat Binaries.
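To see roughly what `lipo -archs` reads, the fat header at the start of a universal binary can be parsed directly. This is a minimal Python sketch, not a lipo replacement: it assumes the 32-bit `fat_header` / `fat_arch` layout from `<mach-o/fat.h>` (big-endian on disk), only knows the two CPU types mentioned above, and ignores the 64-bit fat variant. The function name `list_fat_archs` and the sample header are made up for illustration.

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # 32-bit universal header, stored big-endian on disk

# cputype values from <mach/machine.h>; only the two relevant here are listed.
CPU_TYPES = {0x01000007: "x86_64", 0x0100000C: "arm64"}

def list_fat_archs(data: bytes) -> list:
    """Return (arch_name, offset, size) for each slice described in a fat header."""
    magic, nfat_arch = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a 32-bit fat header (thin Mach-O or another file type)")

    slices = []
    for i in range(nfat_arch):
        # struct fat_arch: cputype, cpusubtype, offset, size, align (5 x 32-bit)
        cputype, _subtype, offset, size, _align = struct.unpack_from(">5I", data, 8 + i * 20)
        # Unknown CPU types are reported as their raw hex value.
        slices.append((CPU_TYPES.get(cputype, hex(cputype)), offset, size))
    return slices

# Synthetic two-slice header (hypothetical offsets and sizes) for demonstration.
header = struct.pack(">II", FAT_MAGIC, 2)
header += struct.pack(">5I", 0x01000007, 3, 0x4000, 0x1000, 14)
header += struct.pack(">5I", 0x0100000C, 0, 0x8000, 0x1000, 14)
print(list_fat_archs(header))
```

On a real file, pass the first few kilobytes of the executable inside `Contents/MacOS/`, not the app bundle path. Java class files start with the same four magic bytes, so a match on the magic alone does not prove the file is a universal binary.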

Reverse Engineering Audio VST Plugins

Compiler Optimisations

Fast Division / Modulus

  • https://binary.ninja/2023/09/15/3.5-expanded-universe.html#moddiv-deoptimization
    • Mod/Div Deoptimization

    • One of the many things compilers do that can make reverse engineering harder is use a variety of algorithmic optimizations, in particular for modulus and division calculations. Instead of implementing them with the native CPU instructions, they will use shifts and multiplications with magic constants that, when operating on a fixed integer size, have the same effect as a native division instruction.

      There are several ways to try to recover the original division, which is far more intuitive and easier to reason about.

  • https://lemire.me/blog/2020/02/26/fast-divisionless-computation-of-binomial-coefficients/
    • Fast divisionless computation of binomial coefficients

    • We would prefer to avoid divisions entirely. If we assume that k is small, then we can just use the fact that we can always replace a division by a known value with a shift and a multiplication. All that is needed is that we precompute the shift and the multiplier. If there are few possible values of k, we can precompute it with little effort.

    • I provide a full portable implementation complete with some tests. Though I use C, it should work as-is in many other programming languages. It should only take tens of CPU cycles to run. It is going to be much faster than implementations relying on divisions.

    • Another trick that you can put to good use is that the binomial coefficient is symmetric: you can replace k by n–k and get the same value. Thus if you can handle small values of k, you can also handle values of k that are close to n. That is, the above function will also work for n smaller than 100 and k larger than 90, if you just replace k by n–k.

    • Is that the fastest approach? Not at all. Because n is smaller than 100 and k smaller than 10, we can precompute (memoize) all possible values. You only need an array of 1000 values. It should fit in 8kB without any attempt at compression. And I am sure you can make it fit in 4kB with a little bit of compression effort. Still, there are instances where relying on a precomputed table of several kilobytes and keeping them in cache is inconvenient. In such cases, the divisionless function would be a good choice.

    • Alternatively, if you are happy with approximations, you will find floating-point implementations.

    • https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/blob/master/2020/02/26/binom.c
    • https://github.com/dmikushin/binom/blob/master/include/binom.h
    • https://github.com/bmkessler/fastdiv
    • https://github.com/jmtilli/fastdiv/blob/master/fastdiv.c
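As a concrete instance of the multiply-and-shift pattern described above, here is a small Python sketch. The constant `0xCCCCCCCD` with a total right shift of 35 is the pair commonly seen in compiled code for unsigned 32-bit division by 10; it is not taken from the linked posts. `recover_divisor` is a hypothetical helper showing one simple way to guess the original divisor from such a pair. It is a heuristic that assumes the magic constant was rounded up and the divisor is small relative to `2**shift`, and it does not cover the longer add-and-shift sequences compilers emit for some divisors.

```python
def div10_magic(n: int) -> int:
    """Unsigned 32-bit n / 10 computed with a multiply and a shift, no division."""
    if not 0 <= n < 2**32:
        raise ValueError("n must fit in an unsigned 32-bit integer")
    return (n * 0xCCCCCCCD) >> 35

def recover_divisor(magic: int, shift: int) -> int:
    """Guess the original divisor as ceil(2**shift / magic) (heuristic, see lead-in)."""
    return -(-(1 << shift) // magic)

# Check the magic-number version against ordinary integer division.
for n in (0, 9, 10, 99, 12345, 2**32 - 1):
    assert div10_magic(n) == n // 10

print(recover_divisor(0xCCCCCCCD, 35))
```

When reading disassembly, the shift is often split across instructions (for example, taking the high 32 bits of a 64-bit product and then shifting by 3), so the total shift has to be added up before applying the estimate.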

Unsorted

  • https://github.com/mroi/apple-internals
    • Apple Internals: This repository provides tools and information to help understand and analyze the internals of Apple’s operating system platforms.

    • https://mroi.github.io/apple-internals/
      • Collected knowledge about the internals of Apple’s platforms.

        Sorted by keyword, abbreviation, or codename.

  • https://opensource.apple.com/source/objc4/
  • https://github.com/smx-smx/ezinject
    • Modular binary injection framework, successor of libhooker

    • ezinject is a lightweight and flexible binary injection framework. It can be thought of as a lightweight and less featured version of frida.

      Its main goal is to load a user module (.dll, .so, .dylib) inside a target process. These modules can augment ezinject by providing additional features, such as hooks, scripting languages, RPC servers, and so on. They can also be written in multiple languages such as C, C++, Rust, etc... as long as the ABI is respected.

      NOTE: ezinject core is purposely small, and only implements the "kernel-mode" (debugger) features it needs to run the "user-mode" program, aka the user module.

      It requires no dependencies other than the OS C library (capstone is optionally used only by user modules)

      Porting ezinject is simple: No assembly code is required other than a few inline assembly statements, and an abstraction layer separates the implementations for multiple OSes.

  • https://github.com/evelyneee/ellekit
    • ElleKit: yet another tweak injector / tweak hooking library for darwin systems

    • What this is

      • A C function hooker that patches memory pages directly
      • An Objective-C function hooker
      • An arm64 assembler
      • A JIT inline assembly implementation for Swift
      • A Substrate and libhooker API reimplementation
  • http://diaphora.re/
    • Diaphora: A Free and Open Source Program Diffing Tool

    • Diaphora (διαφορά, Greek for 'difference') version 3.0 is the most advanced program diffing tool (working as an IDA plugin) available as of today (2023). It was first released during SyScan 2015 and has been actively maintained since then: it has been ported to every single minor version of IDA from 6.8 to 8.3.

      Diaphora supports versions of IDA >= 7.4 because the code only runs in Python 3.X (Python 3.11 was the last version tested).

    • https://github.com/joxeankoret/diaphora
      • Diaphora, the most advanced Free and Open Source program diffing tool.

      • Diaphora has many of the most common program diffing (bindiffing) features you might expect, like:

        • Diffing assembler.
        • Diffing control flow graphs.
        • Porting symbol names and comments.
        • Adding manual matches.
        • Similarity ratio calculation.
        • Batch automation.
        • Call graph matching calculation.
        • Dozens of heuristics based on graph theory, assembler, bytes, functions' features, etc...

        However, Diaphora also has many features that are unique, not available in any other public tool. The following is a non-exhaustive list of unique features:

        • Ability to port structs, enums, unions and typedefs.
        • Potentially fixed vulnerabilities detection for patch diffing sessions.
        • Support for compilation units (finding and diffing compilation units).
        • Microcode support.
        • Parallel diffing.
        • Pseudo-code based heuristics.
        • Pseudo-code patches generation.
        • Diffing pseudo-codes (with syntax highlighting!).
        • Scripting support (for both the exporting and diffing processes).

See Also

My StackOverflow/etc answers

  • https://stackoverflow.com/questions/46802472/recursively-find-hexadecimal-bytes-in-binary-files/77706906#77706906
    • Recursively searching through binary files for hex strings (with potential wildcards) using radare2's rafind2
    • Crossposted: https://twitter.com/_devalias/status/1738458619958751630
    • SEARCH_DIRECTORY="./path/to/bins"
      GREP_PATTERN='\x5B\x27\x21\x3D\xE9'
      
      # Remove all instances of '\x' from PATTERN for rafind2
      # Eg. Becomes 5B27213DE9
      PATTERN="${GREP_PATTERN//\\x/}"
      
      grep -rl "$GREP_PATTERN" "$SEARCH_DIRECTORY" | while read -r file; do
        echo "$file:"
        rafind2 -x "$PATTERN" "$file"
      done
    • SEARCH_DIRECTORY="./path/to/bins"
      PATTERN='5B27213DE9'
      
      # Using find
      find "$SEARCH_DIRECTORY" -type f -exec sh -c 'output=$(rafind2 -x "$1" "$2"); [ -n "$output" ] && echo "$2:" && echo "$output"' sh "$PATTERN" {} \;
      
      # Using fd
      fd --type f --exec sh -c 'output=$(rafind2 -x "$1" "$2"); [ -n "$output" ] && (echo "$2:"; echo "$output")' sh "$PATTERN" {} "$SEARCH_DIRECTORY"
    • ⇒ time ./test-grep-and-rafind2
      # ..snip..
      ./test-grep-and-rafind2  7.33s user 0.19s system 99% cpu 7.578 total
      
      ⇒ time ./test-find-and-rafind2
      # ..snip..
      ./test-find-and-rafind2  3.24s user 0.72s system 98% cpu 4.041 total
      
      ⇒ time ./test-fd-and-rafind2
      # ..snip..
      ./test-fd-and-rafind2  3.85s user 1.04s system 488% cpu 1.002 total
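Where radare2 is not installed, the same kind of search can be approximated in plain Python. The sketch below is a stand-in for the scripts above, not a wrapper around rafind2: it handles exact byte sequences only (no wildcards), reads each file fully into memory, and all function names are made up for illustration. `grep_pattern_to_bytes` mirrors the `${GREP_PATTERN//\\x/}` step by stripping the `\x` prefixes before decoding the hex.

```python
import os
import tempfile

def grep_pattern_to_bytes(grep_pattern: str) -> bytes:
    r"""Turn a '\x5B\x27'-style pattern into raw bytes by stripping each '\x'."""
    return bytes.fromhex(grep_pattern.replace("\\x", ""))

def find_offsets(data: bytes, needle: bytes) -> list:
    """Return every offset at which needle occurs in data (overlaps included)."""
    offsets, start = [], data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets

def search_tree(root: str, needle: bytes) -> dict:
    """Map each file under root that contains needle to its list of match offsets."""
    hits = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                found = find_offsets(fh.read(), needle)
            if found:
                hits[path] = found
    return hits

# Demonstration against a throwaway directory containing one sample file.
needle = grep_pattern_to_bytes(r"\x5B\x27\x21\x3D\xE9")
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "sample.bin"), "wb") as fh:
        fh.write(b"\x00\x00" + needle + b"\xFF")
    for path, offsets in search_tree(root, needle).items():
        print(path, [hex(o) for o in offsets])
```

For large binaries, memory-mapping the file or reading it in overlapping chunks would be kinder to memory than `fh.read()`.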

My Other Related Deepdive Gists and Projects
