programming086/reverse-engineering-macos.md

## reverse-engineering-macos.md

      
    Raw
  

              reverse-engineering-macos.md
            
          
    Reverse Engineering on macOS

Some notes, tools, and techniques for reverse engineering macOS binaries.
Table of Contents


Reverse Engineering Tools

Binary Ninja
Ghidra
Hex-Rays IDA
radare2
Frida / etc


Reversing C++ Binaries
Unsorted

C++ vtables
std::string
std::vector


Universal (Fat) Binaries
Reverse Engineering Audio VST Plugins
Compiler Optimisations

Fast Division / Modulus


Unsorted
See Also

My StackOverflow/etc answers
My Other Related Deepdive Gist's and Projects


Reverse Engineering Tools

Binary Ninja


https://binary.ninja/


Binary Ninja is an interactive decompiler, disassembler, debugger, and binary analysis platform built by reverse engineers, for reverse engineers. Developed with a focus on delivering a high-quality API for automation and a clean and usable GUI, Binary Ninja is in active use by malware analysts, vulnerability researchers, and software developers worldwide. Decompile software built for many common architectures on Windows, macOS, and Linux for a single price, or try out our limited (but free!) Cloud version.


https://binary.ninja/free/


There are two ways to try Binary Ninja for free! Binary Ninja Cloud supports all architectures, but requires you to upload your binaries. Binary Ninja Free is a downloadable app that runs locally, but has architecture restrictions. Neither free option supports our powerful API / Plugin ecosystem.


https://cloud.binary.ninja/


Binary Ninja Cloud is our free, online reverse engineering tool.


https://sidekick.binary.ninja/


Sidekick Makes Reverse Engineering Easy
Don't open that binary alone! Take Sidekick, your AI-powered assistant, with you. Sidekick can help answer your questions about the binary, recover structures, name things, describe and comment code, find points of interest, and much more.


https://binary.ninja/blog/

https://binary.ninja/2024/02/28/4.0-dorsai.html


4.0: Dorsai


https://binary.ninja/2023/09/15/3.5-expanded-universe.html


3.5: Expanded Universe


https://binary.ninja/2023/05/03/3.4-finally-freed.html


3.4: Finally Freed


https://binary.ninja/2023/01/18/3.3-the-bytes-must-flow.html


3.3: The Bytes Must Flow


https://binary.ninja/2022/10/28/3.2-released.html


3.2 Release


https://binary.ninja/2022/05/31/3.1-the-performance-release.html


3.1 The Performance Release


https://binary.ninja/2022/01/27/3.0-the-next-chapter.html


3.0 The Next Chapter


https://docs.binary.ninja/guide/


User Guide


https://docs.binary.ninja/guide/types/


There's so many things to learn about working with Types in Binary Ninja that we've organized it into several sections!


Basic Type Editing: Brief overview of the basics


https://docs.binary.ninja/guide/types/basictypes.html


Basic Type Editing
The biggest culprit of bad decompilation is often missing type information. Therefore, some of the most important actions you can take while reverse engineering is renaming symbols/variables, applying types, and creating new types to apply.


Working with Types: Interacting with types in disassembly and decompilation


https://docs.binary.ninja/guide/types/type.html


Working with Types, Structures, and Symbols in Decompilation
There are two main ways to interact with types in decompilation or disassembly. The first is to use the types view, and the second is to take advantage of the smart structures workflow or otherwise annotate types directly in a disassembly or IL view.


Importing/Exporting Types: How to import or export types from header files, archives, or other BNDBs


https://docs.binary.ninja/guide/types/typeimportexport.html


Importing Type Information
Type information can be imported from a variety of sources. If you have header files, you can import a header. If your types exist in an existing BNDB, you can use import from a bndb. With the introduction of type archives we recommend migrating away from importing via BNDB to type archives as they allow types to remain synced between different databases.


https://docs.binary.ninja/guide/types/typeimportexport.html#import-bndb-file


Import BNDB File
The Import BNDB File feature imports types from a previous BNDB into your currently open file. In addition, it will apply types for matching symbols in functions and variables. Import BNDB will not port symbols from a BNDB with symbols to one without -- the names must already match. Matching functions and porting symbols is beyond the scope of this feature.


https://docs.binary.ninja/guide/types/typeimportexport.html#import-header-file


Import Header File
If you already have a collection of headers containing types you want to use, you can import them directly. You can specify the compiler flags that would be used if a compiler were compiling a source file that uses this header.


After specifying the file(s) and flag(s), pressing Preview will give a list of all the types and functions defined in the file(s). You may check or uncheck the box next to any of the types/functions to control whether they will be imported to your analysis.


https://docs.binary.ninja/guide/types/typeimportexport.html#finding-system-headers


Finding System Headers
Since you need to specify the include paths for system headers, you will need to deduce them for the target platform of your analysis. Here are a few tricks that may help


Systems with GCC/Clang (macOS, Linux, etc)
On these systems, you can run a command to print the default search path for compilation:
gcc -Wp,-v -E -
clang -Wp,-v -E -

For the directories printed by this command, you should include them with -isystem<path> in the order specified.


⇒ gcc -Wp,-v -E -
clang -cc1 version 15.0.0 (clang-1500.3.9.4) default target x86_64-apple-darwin23.4.0
ignoring nonexistent directory "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/local/include"
ignoring nonexistent directory "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/Library/Frameworks"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/15.0.0/include
 /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include
 /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include
 /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/Library/Frameworks (framework directory)
End of search list.


-isystem/usr/local/include
-isystem/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/15.0.0/include
-isystem/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include
-isystem/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include


Additionally, several types of containers for type information are documented here:


Debug Info: Debug Info can provide additional type information (examples include DWARF and PDB files)


https://docs.binary.ninja/guide/types/debuginfo.html


Debug Info
Debug Info is a mechanism for importing types, function signatures, and data variables from either the original binary (eg. an ELF compiled with DWARF) or a supplemental file (eg. a PDB).
Currently debug info plugins are limited to types, function signatures, and data variables, but in the future will include line number information, comments, local variables, and possibly more.


Type Libraries: Type Libraries contain types from commonly-used dynamic libraries


https://docs.binary.ninja/guide/types/typelibraries.html


Type Libraries
Type Libraries are collections of type information (structs, enums, function types, etc.), corresponding to specific dynamic libraries that are imported into your analysis. You can browse and import them in the Types View.


Most of your usage of Type Libraries will be performed automatically by Binary Ninja when you analyze a binary. They are automatically imported based on the libraries that your binary uses. Any library functions or global variables your binary references will have their type signature imported, and any structures those functions and variables reference are imported as well.


Platform Types: Types that automatically apply to a platform


https://docs.binary.ninja/guide/types/platformtypes.html


Platform Types
Binary Ninja pulls type information from a variety of sources. The highest-level source are the platform types loaded for the given platform (which includes operating system and architecture). There are two sources of platform types. The first are shipped with the product in a binary path. The second location is in your user folder and is intended for you to put custom platform types.


Platform types are used to define types that should be available to all programs available on that particular platform. They are only for global common types.


Type Archives: How you can use type archives to share types between analysis databases


https://docs.binary.ninja/guide/types/typearchives.html


Type Archives
Type Archives are files you can use to share types between analysis databases. You can create them and manage their contents in Types View.


Signature Libraries: Signature libraries are used to match names of functions with signatures for code that is statically compiled


https://docs.binary.ninja/dev/annotation.html#signature-library


Signature Library
While many signatures are built-in and require no interaction to automatically match functions, you may wish to add or modify your own. First, install the SigKit plugin from the plugin manager.


Once the signature matcher runs, it will print a brief report to the console detailing how many functions it matched and will rename matched functions.


To generate a signature library for the currently-open binary, use Tools > Signature Library > Generate Signature Library. This will generate signatures for all functions in the binary that have a name attached to them. Note that functions with automatically-chosen names such as sub_401000 will be skipped. Once it's generated, you'll be prompted where to save the resulting signature library.


Platform Types: Platform types are base types that apply to all binaries on a particular platform


https://docs.binary.ninja/guide/types/platformtypes.html


Additionally, make sure to see the applying annotations section of the developer guide for information about using the API with types and covering the creation of many of the items described below.


https://docs.binary.ninja/guide/cpp.html


Working with C++ Types and Virtual Function Tables


https://docs.binary.ninja/guide/cpp.html#virtual-function-tables


Virtual Function Tables
Virtual functions are implemented by compilers using a virtual function table structure that is pointed to by instances of the class. The layout of these structures is compiler dependent, so when reverse engineering a C++ binary these structures must be created to match the in-memory layout used by the binary.
One of the most common tasks when reversing a C++ binary is discovering which functions a virtual function call can resolve to. Binary Ninja provides a "propagate data variable references" option in the Create Structure dialog to help with this. When this is enabled, pointers found in the data section that are part of any instance of the structure will be considered as cross references of the structure field itself. This allows you to click on the name of a virtual function and see which functions it can potentially call in the cross references view.


https://blog.trailofbits.com/2017/02/13/devirtualizing-c-with-binary-ninja/


Devirtualizing C++ with Binary Ninja


https://github.com/trailofbits/binjascripts/tree/master/vtable-navigator
https://gist.github.com/joshwatson/661d763f35f545bb6ee0f37731a79ec7#file-vtable-navigator-py


Binary Ninja IL Example: Navigating to a Virtual Function Based on an Indirect Call


https://docs.binary.ninja/guide/objectivec.html


Objective-C (Beta)
Recent version of Binary Ninja ship with an additional plugin for assisting with Objective-C analysis. It provides both a workflow and a plugin command for enhancing Objective-C binary analysis.


https://github.com/Vector35/workflow_objc


Binary Ninja plugin & workflow to help analyze Objective-C code


https://docs.binary.ninja/guide/debugger.html


Debugger
Binary Ninja Debugger is a plugin that can debug executables on Windows, Linux, and macOS, and more!
The debugger plugin is shipped with Binary Ninja. It is open-source under an Apache License 2.0. Bug reports and pull requests are welcome!


https://github.com/Vector35/debugger


Binary Ninja debugger


https://docs.binary.ninja/dev/bnil-overview.html


Binary Ninja Intermediate Language: Overview


https://docs.binary.ninja/dev/bnil-overview.html#reading-il


Reading IL
All of the various ILs (with the exception of the SSA forms) are intended to be easily human-readable and look much like pseudo-code. There is some shorthand notation that is used throughout the ILs, though, explained below


https://docs.binary.ninja/dev/uidf.html


User Informed Data Flow
Binary Ninja now implements User-Informed DataFlow (UIDF) to improve the static reverse engineering experience of our users. This feature allows users to set the value of a variable and have the internal dataflow engine propagate it through the control-flow graph of the function. Besides constant values, Binary Ninja supports various PossibleValueSet states as containers to help inform complex variable values.


https://docs.binary.ninja/dev/workflows.html


Binary Ninja Workflows Documentation


Binary Ninja Workflows is an analysis orchestration framework which simplifies the definition and execution of a computational binary analysis pipeline. The extensible pipeline accelerates program analysis and reverse engineering of binary blobs at various levels of abstraction. Workflows supports hybridized execution models, where the ordering of activities in the pipeline can be well-known and procedural, or dynamic and reactive. Currently, the core Binary Ninja analysis is made available as a procedural model and is the aggregate of both module and function-level analyses.


https://github.com/Vector35/binaryninja-api/tree/dev/examples/workflows
I saw a note somewhere that suggested this feature would allow implementing deoptimisers / similar (eg. fast modulo / division, etc) that could simplify the view of the decompiled output


https://github.com/Vector35/binaryninja-api


Public API, examples, documentation and issues for Binary Ninja


https://github.com/Vector35/binaryninja-api/wiki/Ninjas-In-Training


Ninjas In Training
List of resources for folks beginning their journey into reverse engineering. If appropriate, resources are labelled as B for Beginner, I for Intermediate, and A for Advanced. Feel free to join the slack and hop in the #ninjas-in-training channel for specific questions.


Some interesting/relevant issues I have run into:

Vector35/binaryninja-api#5262


[docs] Improve 'Finding System Headers' examples for C++


Vector35/binaryninja-api#5257


Support parsing C++ templates (at least concrete specializations)


Vector35/binaryninja-api#3305


Templatized types (in some form) in BN's type system


Vector35/binaryninja-api#4551


Demangled type names reference missing types


Vector35/binaryninja-api#2735


macOS Type Libraries


Vector35/binaryninja-api#2736


iOS Type Libraries


Vector35/binaryninja-api#3017


Show vtable function calls as normal function call


Vector35/binaryninja-api#3017 (comment)


With the addition of __data_var_refs (see: C++ Types user docs) this is effectively solved. The next step is to automate the parsing and symbolizing of both MSVC (#3930) and Itanium RTTI (#3857).


Vector35/binaryninja-api#3930


MSVC RTTI analysis


Vector35/binaryninja-api#3857


GCC/Clang RTTI analysis


Vector35/binaryninja-api#4959


Incorrect typeinfo_name_for definitions in mach-o binaries


https://github.com/Vector35/official-plugins


Official Binary Ninja Plugins


https://github.com/Vector35/community-plugins


Binary Ninja Community Plugins


https://github.com/Vector35/community-themes


Binary Ninja Community Themes


https://www.youtube.com/@Vector35

https://www.youtube.com/playlist?list=PLCVV6Y9LmwOgqqT5obf0OmN9fp5495bLr


Binary Ninja Basics


https://www.youtube.com/playlist?list=PLCVV6Y9LmwOhma0VEJDO07uYxFqtEmkrT


Training


Ghidra


https://ghidra-sre.org/


A software reverse engineering (SRE) suite of tools developed by NSA's Research Directorate in support of the Cybersecurity mission


Hex-Rays IDA


https://hex-rays.com/

https://hex-rays.com/ida-free/


This (completely!) free version of IDA offers a privilege opportunity to see IDA in action. This light but powerful tool can quickly analyze the binary code samples and users can save and look closer at the analysis results.


https://hex-rays.com/ida-home/


IDA Home was introduced thanks to the experience Hex-Rays has been gaining throughout the years to propose hobbyists a solution that combines rapidity, reliability with the levels of quality and responsiveness of support that any professional reverse engineers should expect.


https://hex-rays.com/ida-pro/


IDA Pro as a disassembler is capable of creating maps of their execution to show the binary instructions that are actually executed by the processor in a symbolic representation (assembly language). Advanced techniques have been implemented into IDA Pro so that it can generate assembly language source code from machine-executable code and make this complex code more human-readable.
The debugging feature augmented IDA with the dynamic analysis. It supports multiple debugging targets and can handle remote applications. Its cross-platform debugging capability enables instant debugging, easy connection to both local and remote processes and support for 64-bit systems and new connection possibilities.


https://www.hex-rays.com/products/ida/debugger/mac/


This writeup is now deprecated. Please see this resource instead.


https://www.hex-rays.com/wp-content/static/tutorials/mac_debugger_primer2/mac_debugger_primer2.pdf


Debugging Mac OSX Applications with IDA Pro


https://hex-rays.com/products/ida/news/8_3/


radare2


https://www.radare.org/n/

https://www.radare.org/doc/r2papi/


r2papi
Typescript APIs for radare2


The r2papi module implements a set of idiomatic and high-level APIs that are based on top of the minimalistic r2pipe API.


https://www.radare.org/doc/r2papi/interfaces/r2pipe.R2Pipe.html


Interface R2Pipe
Generic interface to interact with radare2, abstracts the access to the associated instance of the tool, which could be native via rlang or remote via pipes or tcp/http.


https://www.radare.org/doc/r2papi/classes/shell.R2Shell.html


Class R2Shell
Class that interacts with the r2ai plugin (requires rlang-python and r2i r2pm packages to be installed). Provides a way to script the interactions with different language models using javascript from inside radare2.


https://www.radare.org/doc/r2papi/classes/ai.R2AI.html


Class that interacts with the r2ai plugin (requires rlang-python and r2i r2pm packages to be installed). Provides a way to script the interactions with different language models using javascript from inside radare2.


https://github.com/radareorg/radare2


Radare2: Libre Reversing Framework for Unix Geeks


UNIX-like reverse engineering framework and command-line toolset


r2 is a complete rewrite of radare. It provides a set of libraries, tools and plugins to ease reverse engineering tasks. Distributed mostly under LGPLv3, each plugin can have different licenses (see r2 -L, rasm2 -L, ...).
The radare project started as a simple command-line hexadecimal editor focused on forensics. Today, r2 is a featureful low-level command-line tool with support for scripting with the embedded Javascript interpreter or via r2pipe.
r2 can edit files on local hard drives, view kernel memory, and debug programs locally or via a remote gdb server. r2's wide architecture support allows you to analyze, emulate, debug, modify, and disassemble any binary.


https://github.com/radareorg/radare2#popular-plugins


Using the r2pm tool you can browse and install many plugins and tools that use radare2.

esilsolve: The symbolic execution plugin, based on esil and z3
iaito: The official Qt graphical interface
keystone Assembler instructions using the Keystone library
r2ai Run a Language Model in localhost with Llama inside r2!
r2dec: A decompiler based on r2 written in JS, accessed with the pdd command
r2diaphora: Diaphora's diffing engine working on top of radare2
r2frida: The frida io plugin. Start r2 with r2 frida://0 to use it
r2ghidra: The native ghidra decompiler plugin, accessed with the pdg command
r2papi High level api on top of r2pipe
r2pipe Script radare2 from any programming language
r2poke Integration with GNU/Poke for extended binary parsing capabilities
r2yara Run Yara from r2 or use r2 primitives from Yara
radius2: A fast symbolic execution engine based on boolector and esil


https://github.com/radareorg/radare2/blob/master/doc/macos.md


https://github.com/radareorg/iaito


Official QT frontend of radare2


https://github.com/radareorg/radare2-r2pipe


Access radare2 via pipe from any programming language!


The r2pipe APIs are based on a single r2 primitive found behind r_core_cmd_str() which is a function that accepts a string parameter describing the r2 command to run and returns a string with the result.
The decision behind this design comes from a series of benchmarks with different libffi implementations and resulted that using the native API is more complex and slower than just using raw command strings and parsing the output.
As long as the output can be tricky to parse, it's recommended to use the JSON output and deserializing them into native language objects which results much more handy than handling and maintaining internal data structures and pointers.


https://github.com/radareorg/radare2-book


Radare2 official book


https://book.rada.re/


https://r2wiki.readthedocs.io/en/latest/


Radare2 wiki
This is an ongoing work in progress and reflects various material obtained while stuying how to use radare2.


https://r2wiki.readthedocs.io/en/latest/tools/radare2/


radare2 – Advanced command-line hexadecimal editor, disassembler and debugger


https://r2wiki.readthedocs.io/en/latest/tools/rabin2/


rabin2 – Binary program info extractor


This program allows you to get information about ELF/PE/MZ and CLASS files in a simple way.


https://r2wiki.readthedocs.io/en/latest/tools/rafind2/


rafind2 – advanced command-line byte pattern search in files


rafind2 is a program to find byte patterns in files


https://r2wiki.readthedocs.io/en/latest/tools/radiff2/


radiff2 – unified binary diffing utility


radiff2 implements many binary diffing algorithms for data and code.


After installing xdot, you can graph the difference between two binaries. Syntax is, radiff2 -g function_name binary1 binary | xdot -
Yellow indicates some offsets doesnt match, grey is perfect match and red shows a strong difference


https://r2wiki.readthedocs.io/en/latest/tools/rarun2/


rarun2 — radare2 utility to run programs in exotic environments


This program is used as a launcher for running programs with different environment, arguments, permissions, directories and overridden default file descriptors.


You can preload r2 inside a process. This is similar to r2frida but native implementation
Example: rarun2 r2preload=yes program=/bin/cat followed by the kill command that rarun2 generates


https://r2wiki.readthedocs.io/en/latest/tools/rasm2/


rasm2 – radare2 assembler and disassembler tool


This tool uses r_asm to assemble and disassemble files or hexpair strings. It supports a large list of architectures which can be listed using the -L flag.


https://r2wiki.readthedocs.io/en/latest/home/valid-arch-cpu/


Valid architecture and cpu
rasm2 -L (list of valid architectures and bits)


https://r2wiki.readthedocs.io/en/latest/tools/rahash2/

rahash2 – block based hashing utility


Rahash2 allows you to calculate, check and show the hash values of each block of a target file. The block size is 32768 bytes by default. It's allowed to hash from stdin using '-' as a target file. You can compare against a known hash and get the result in the exit status.
You can hash big files by hashing each block and later determine what part of it has been modified. Useful for filesystem analysis.
This command can be used to calculate hashes of a certain part of a file or a command line passed string.


WARNING - Do not try to use rahash2 on a big file as it attempts to load the entire file in memory first.


https://r2wiki.readthedocs.io/en/latest/tools/rax2/


rax2 – radare base converter


This command allows you to convert values between positive and negative integer, float, octal, binary and hexadecimal values.


https://r2wiki.readthedocs.io/en/latest/tools/r2pm/


r2pm (radare2 package manager)


Allows to install, update, uninstall and discover plugins and tools that can be used with radare2.


https://r2wiki.readthedocs.io/en/latest/analysis/macho/


To debug a MachO file using r2, run r2 with sudo. Otherwise you need to sign r2


https://r2wiki.readthedocs.io/en/latest/analysis/ios/
https://r2wiki.readthedocs.io/en/latest/analysis/android/
https://r2wiki.readthedocs.io/en/latest/analysis/go/


https://github.com/monosource/radare2-explorations


A book on learning radare2


https://monosource.gitbooks.io/radare2-explorations/content/introduction.html


The goal of this book is to accommodate the reader with radare2, which is quickly becoming a bread & butter tool for any reverse engineer, malware analyst or biweekly CTF player. It is not meant to replace the Radare2 Book, but rather to complement it.


https://github.com/securisec/r2wiki/


Radare2 Wiki
The goal of this wiki is to make a searchable collection of documents which can be used to find various use cases and help regarding using r2.


https://radare2.securisec.com/ (broken?)


https://github.com/radareorg/r2ghidra


r2ghidra
This is an integration of the Ghidra decompiler for radare2. It is solely based on the decompiler part of Ghidra, which is written entirely in C++, so Ghidra itself is not required at all and the plugin can be built self-contained.


https://github.com/radareorg/r2ai


local language model for radare2


https://github.com/radareorg/radare2-skel


Radare2 Script Skeletons
This repository contains directories that can be used as an skeleton or template to start writing your projects that use radare2.
Radare2 can be extended in many ways:

Scripts
Plugins
Programs

And it is possible to use almost any languages to do so. Those templates include the .vscode and build files too. so you can quickly start doing real work!


https://github.com/radareorg/radare2-testbins

Frida / etc


https://frida.re/


Dynamic instrumentation toolkit for developers, reverse-engineers, and security researchers.


Scriptable
Inject your own scripts into black box processes. Hook any function, spy on crypto APIs or trace private application code, no source code needed. Edit, hit save, and instantly see the results. All without compilation steps or program restarts.
Portable
Works on Windows, macOS, GNU/Linux, iOS, watchOS, tvOS, Android, FreeBSD, and QNX. Install the Node.js bindings from npm, grab a Python package from PyPI, or use Frida through its Swift bindings, .NET bindings, Qt/Qml bindings, Go bindings, or C API. We also have a scalable footprint.
Free
Frida is and will always be free software (free as in freedom). We want to empower the next generation of developer tools, and help other free software developers achieve interoperability through reverse engineering.
Battle-tested
We are proud that NowSecure is using Frida to do fast, deep analysis of mobile apps at scale. Frida has a comprehensive test-suite and has gone through years of rigorous testing across a broad range of use-cases.


https://github.com/frida


Dynamic instrumentation toolkit for developers, reverse-engineers, and security researchers.


https://github.com/frida/frida


Clone this repo to build Frida


https://github.com/frida/frida-core


Frida core library intended for static linking into bindings


https://github.com/frida/frida-gum


Cross-platform instrumentation and introspection library written in C


This library is consumed by frida-core through its JavaScript bindings, GumJS.


https://github.com/frida/frida-node


Frida Node.js bindings


https://github.com/frida/frida-python


Frida Python bindings


https://github.com/frida/frida-swift


Frida Swift bindings


https://github.com/frida/frida-rust


Frida Rust bindings


https://github.com/Ch0pin/medusa


medusa
Binary instrumentation framework based on FRIDA


MEDUSA is an extensible and modularized framework that automates processes and techniques practiced during the dynamic analysis of Android and iOS Applications.


https://github.com/rsenet/FriList


Collection of useful FRIDA Mobile Scripts


Observer
Security Bypass
Static Analysis
Specific Software
Other


Reversing C++ Binaries

Unsorted


http://www.max-sperling.bplaced.net/?cat=80


Category: Data structures


http://www.max-sperling.bplaced.net/?cat=28


Category: Debugging & Memory


C++ vtables


See also:

https://docs.binary.ninja/guide/cpp.html#virtual-function-tables


Virtual Function Tables
Virtual functions are implemented by compilers using a virtual function table structure that is pointed to by instances of the class. The layout of these structures is compiler dependent, so when reverse engineering a C++ binary these structures must be created to match the in-memory layout used by the binary.


https://clang.llvm.org/doxygen/VTableBuilder_8cpp_source.html
https://shaharmike.com/cpp/

https://shaharmike.com/cpp/vtable-part1/


C++ vtables - Part 1 - Basics


In this mini post-series we’ll explore how clang implements vtables & RTTI. In this part we’ll start with some basic classes and later on cover multiple inheritance and virtual inheritance.


NonVirtualClass has a size of 1 because in C++ classes can’t have zero size. However, this is not important right now.


VirtualClass’s size is 8 on a 64 bit machine. Why? Because there’s a hidden pointer inside it pointing to a vtable. vtables are static translation tables, created for each virtual-class.


Here’s what we learned from the above:


Even though the classes have no data members, there’s a hidden pointer to a vtable;
vtable for p1 and p2 is the same. vtables are static data per-type;
d1 and d2 inherit a vtable-pointer from Parent which points to Derived’s vtable;
All vtables point to an offset of 16 (0x10) bytes into the vtable. We’ll also discuss this later.


Note: we’re looking at demangled symbols. If you really want to know, _ZTV is a prefix for vtable, _ZTS is a prefix for type-string (name) and _ZTI is for type-info.


Here's Parent's vtable layout:


Address
Value
Meaning


0x400ba8
0x0
top_offset (more on this later)


0x400bb0
0x400b78
Pointer to typeinfo for Parent (also part of the above memory dump)


0x400bb8
0x400aa0
Pointer to Parent::Foo()^1^. Parent's _vptr points here.


0x400bc0
0x400a90
Pointer to Parent::FooNotOverridden()^2^


Here's Derived's vtable layout:


Address
Value
Meaning


0x400b40
0x0
top_offset (more on this later)


0x400b48
0x400b90
Pointer to typeinfo for Derived (also part of the above memory dump)


0x400b50
0x400a80
Pointer to Derived::Foo()^3^. Derived's _vptr points here.


0x400b58
0x400a90
Pointer to Parent::FooNotOverridden() (same as Parent's)


Remember that the vtable pointer in Derived pointed to a +16 bytes offset into the vtable? The 3rd pointer is the address of the first method pointer. Want the 3rd method? No problem - add 2 * sizeof(void*) to vtable-pointer. Want the typeinfo record? jump to the pointer before.


https://shaharmike.com/cpp/vtable-part2/


C++ vtables - Part 2 - Multiple Inheritance


The world of single-parent inheritance hierarchies is simpler for the compiler.
As we saw in Part 1, each child class extends its parent vtable by appending entries for each new virtual method.
In this post we will cover multiple inheritance, which complicates things even when only inheriting from pure-interfaces.


TODO: add summary of relevant bits here


https://shaharmike.com/cpp/vtable-part3/


C++ vtables - Part 3 - Virtual Inheritance


TODO: add summary of relevant bits here


https://shaharmike.com/cpp/vtable-part4/


C++ vtables - Part 4 - Compiler-Generated Code


TODO: add summary of relevant bits here


https://web.mit.edu/tibbetts/Public/inside-c/www/


The Secret Life of C++: What Your Compiler Doesn't Want You To Know


C++ is filled with strange and wonderful features, with a few more added in C++11. We will explore in detail how these features are implemented under the covers, in terms of the assembly code generated. Features to be explored include construction and destruction, copying, references, virtual methods, method dispatch, object layout, exceptions, templates, anonymous functions, captures, and more.


Hour One: The Not So Secret Life of C, and crash course in x86_64 assembly


https://web.mit.edu/tibbetts/Public/inside-c/www/c-primer.html


Hour Two


References


https://web.mit.edu/tibbetts/Public/inside-c/www/references.html


Symbol Mangling


https://web.mit.edu/tibbetts/Public/inside-c/www/mangling.html


Objects, Methods, Inheritance, Copying, References, Methods


https://web.mit.edu/tibbetts/Public/inside-c/www/objects.html


Runtime Time Type Information and Casting


https://web.mit.edu/tibbetts/Public/inside-c/www/rtti.html


Hour Three


Virtual Inheritance Review


https://web.mit.edu/tibbetts/Public/inside-c/www/virtual-inheritance.html


Initializing Global Objects


https://web.mit.edu/tibbetts/Public/inside-c/www/initializing-globals.html


Exceptions


https://web.mit.edu/tibbetts/Public/inside-c/www/exceptions.html


Hour Four


Syntactic Sugar


https://web.mit.edu/tibbetts/Public/inside-c/www/sugar.html


Templates


https://web.mit.edu/tibbetts/Public/inside-c/www/templates.html


Anonymous Functions, Captures


https://web.mit.edu/tibbetts/Public/inside-c/www/closures.html


https://hardwear.io/usa-2023/presentation/analyzing-decompiled-c++vtables-and-objects-in-GCC-binaries.pdf


Static Analysis of C++ Virtual Tables (from GCC)
James Rowley, Marcus Engineering, LLC
Hardwear.io USA 2023


https://ww2.ii.uj.edu.pl/~kapela/pn/listLectures.php


Structures and C++


https://ww2.ii.uj.edu.pl/~kapela/pn/print-one-lecture.php?lectureNumber=8


Virtual function arrays
The implementation of virtual function tables depends on the base classes of a given class. It is also often compiler specific
A good demo can be found at:


https://ww2.ii.uj.edu.pl/~kapela/pn/cpp_vtable.html


VTable Notes on Multiple Inheritance in GCC C++ Compiler v4.0.1


https://refspecs.linuxbase.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/cxxclasses.html


Chapter 12. C++ Class Representations


std::string


https://shaharmike.com/cpp/std-string/


Exploring std::string


Every C++ developer knows that std::string represents a sequence of characters in memory. It manages its own memory, and is very intuitive to use. Today we’ll explore std::string as defined by the C++ Standard, and also by looking at 4 major implementations.


One particular optimization found its way to pretty much all implementations: small objects optimization (aka small buffer optimization). Simply put, Small Object Optimization means that the std::string object has a small buffer for small strings, which saves dynamic allocations.


Recent GCC versions use a union of buffer (16 bytes) and capacity (8 bytes) to store small strings. Since reserve() is mandatory (more on this later), the internal pointer to the beginning of the string either points to this union or to the dynamically allocated string.


clang is by-far the smartest and coolest. While std::string has the size of 24 bytes, it allows strings up to 22 bytes(!!) with no allocation. To achieve this libc++ uses a neat trick: the size of the string is not saved as-is but rather in a special way: if the string is short (< 23 bytes) then it stores size() * 2. This way the least significant bit is always 0. The long form always bitwise-ors the LSB with 1, which in theory might have meant unnecessarily larger allocations, but this implementation always rounds allocations to be of form 16*n - 1 (where n is an integer). By the way, the allocated string is actually of form 16*n, the last character being '\0'


https://tastycode.dev/memory-layout-of-std-string/


Memory Layout of std::string


Discover how std::string is represented in the most popular C++ Standard Libraries, such as MSVC STL, GCC libstdc++, and LLVM libc++.


In this post of Tasty C++ series we’ll look inside of std::string, so that you can more effectively work with C++ strings and take advantage and avoid pitfalls of the C++ Standard Library you are using.


In C++ Standard Library, std::string is one of the three contiguous containers (together with std::array and std::vector). This means that a sequence of characters is stored in a contiguous area of the memory and an individual character can be efficiently accessed by its index at O(1) time. The C++ Standard imposes more requirements on the complexity of string operations, which we will briefly focus on later in this post.


If we are talking about the C++ Standard, it’s important to remember that it doesn’t impose exact implementation of std::string, nor does it specify the exact size of std::string. In practice, as we’ll see, the most popular implementations of the C++ Standard Library allocate 24 or 32 bytes for the same std::string object (excluding the data buffer). On top of that, the memory layout of string objects is also different, which is a result of a tradeoff between optimal memory and CPU utilization, as we’ll also see below.


For people just starting to work with strings in C++, std::string is usually associated with three data fields:

Buffer – the buffer where string characters are stored, allocated on the heap.
Size – the current number of characters in the string.
Capacity – the max number of character the buffer can fit, a size of the buffer.

Talking C++ language, this picture could be expressed as the following class:
class TastyString {
  char *    m_buffer;     //  string characters
  size_t    m_size;       //  number of characters
  size_t    m_capacity;   //  m_buffer size
}

This representation takes 24 bytes and is very close to the production code.


https://stackoverflow.com/questions/5058676/stdstring-implementation-in-gcc-and-its-memory-overhead-for-short-strings


std::string implementation in GCC and its memory overhead for short strings


At least with GCC 4.4.5, which is what I have handy on this machine, std::string is a typdef for std::basic_string<char>, and basic_string is defined in /usr/include/c++/4.4.5/bits/basic_string.h. There's a lot of indirection in that file, but what it comes down to is that nonempty std::strings store a pointer to one of these:
struct _Rep_base
{
  size_type       _M_length;
  size_type       _M_capacity;
  _Atomic_word        _M_refcount;
};

Followed in-memory by the actual string data. So std::string is going to have at least three words of overhead for each string, plus any overhead for having a higher capacity than length (probably not, depending on how you construct your strings -- you can check by asking the capacity() method).
There's also going to be overhead from your memory allocator for doing lots of small allocations; I don't know what GCC uses for C++, but assuming it's similar to the dlmalloc allocator it uses for C, that could be at least two words per allocation, plus some space to align the size to a multiple of at least 8 bytes.


std::vector


http://www.max-sperling.bplaced.net/?p=4983


Layout of std::vector (libstdc++)


Universal (Fat) Binaries


https://developer.apple.com/documentation/apple-silicon/building-a-universal-macos-binary


Building a Universal macOS Binary


Create macOS apps and other executables that run natively on both Apple silicon and Intel-based Mac computers.


https://developer.apple.com/documentation/apple-silicon/building-a-universal-macos-binary#Update-the-Architecture-List-of-Custom-Makefiles


To create a universal binary for your project, merge the resulting executable files into a single executable binary using the lipo tool.


lipo -create -output universal_app x86_app arm_app


https://developer.apple.com/documentation/apple-silicon/building-a-universal-macos-binary#Determine-Whether-Your-Binary-Is-Universal


Determine Whether Your Binary Is Universal
To users, a universal binary looks no different than a binary built for a single architecture. When you build a universal binary, Xcode compiles your source files twice—once for each architecture. After linking the binaries for each architecture, Xcode then merges the architecture-specific binaries into a single executable file using the lipo tool. If you build the source files yourself, you must call lipo as part of your build scripts to merge your architecture-specific binaries into a single universal binary.
To see the architectures present in a built executable file, run the lipo or file command-line tools. When running either tool, specify the path to the actual executable file, not to any intermediate directories such as the app bundle. For example, the executable file of a macOS app is in the Contents/MacOS/ directory of its bundle. When running the lipo tool, include the -archs parameter to see the architectures.


% lipo -archs /System/Applications/Mail.app/Contents/MacOS/Mail
x86_64 arm64


To obtain more information about each architecture, pass the -detailed_info argument to lipo.


https://developer.apple.com/documentation/apple-silicon/building-a-universal-macos-binary#Specify-the-Launch-Behavior-of-Your-App


Specify the Launch Behavior of Your App
For universal binaries, the system prefers to execute the slice that is native to the current platform. On an Intel-based Mac computer, the system always executes the x86_64 slice of the binary. On Apple silicon, the system prefers to execute the arm64 slice when one is present. Users can force the system to run the app under Rosetta translation by enabling the appropriate option from the app’s Get Info window in the Finder.
If you never want users to run your app under Rosetta translation, add the LSRequiresNativeExecution key to your app’s Info.plist file. When that key is present and set to YES, the system prevents your app from running under translation. In addition, the system removes the Rosetta translation option from your app’s Get Info window. Don’t include this key until you verify that your app runs correctly on both Apple silicon and Intel-based Mac computers.
If you want to prioritize one architecture, without preventing users from running your app under translation, add the LSArchitecturePriority key to your app’s Info.plist file. The value of this key is an ordered array of strings, which define the priority order for selecting an architecture.


https://ss64.com/osx/lipo.html


lipo
Create or operate on a universal file: convert a universal binary to a single architecture file, or vice versa.


lipo produces one output file, and never alters the input file.


lipo can: list the architecture types in a universal file; create a single universal file from one or more input files; thin out a single universal file to one specified architecture type; and extract, replace, and/or remove architectures types from the input file to create a single new universal output file.


https://github.com/konoui/lipo


LIPO
This lipo is designed to be compatible with macOS lipo, which is a utility for creating Universal Binary as known as Fat Binary.


Reverse Engineering Audio VST Plugins


See Also

Reverse engineering Serum patch format (0xdevalias' gist)


https://reverseengineering.stackexchange.com/questions/15572/x64dbg-how-to-debug-a-dll-called-from-an-application


If you do not have a source, the you can patch the dll at the entry point with 0xcc for break point or 0xEB 0xFE (jmp 0x0) for endless loop. In the case of break point opcode, you will trigger debugger on execution.


https://www.reddit.com/r/HowToHack/comments/rpwcgj/reverse_engineer_audio_vst_plugins_with_x64dbg/


I'm curious on how the process looks like when reverse engineering Audio VST Plugins? They're *.dll-files and therefore not as straightforward as *.exe-file applications which means that you can't attach to them directly via x64dbg for example, but you have to attach the debugger to the host or DAW I guess.


I’ve reverse engineered a vst that was no longer able to be purchased and such was unusable. I loaded it in FL Studio in 32 bit mode and used x32dbg. Also used IDA to map out functions and such. Good luck


Just write a DLL loader and call the function you're interested in debugging.


I'd explicitly load the library with LoadLibrary, get the function pointer with GetProcAddress and then call the function to maximize flexibility and control.
Depending on what the debugger you use can actually do and whether it can work around ASLR, you can put the breakpoint at the main program (where it's not randomized) at the function call rather than in the dll which may be anywhere in the address space.


http://www.reteam.org/board/showthread.php?t=4020


Debugging VST Audio Plug In´s


For educational purposes I am trying to reverse engineer VST audio plug ins. Basically these are .dll files hosted by an audio application.
How can I debug them? I tried OllyDBG to load the .dll which works fine. When I do "Call DLL Export" and choose "VSTPluginMain" which is the entry point for every VST plugin, i get an access violation.
I also tried to load the vst host (Ableton Live) and trying to start the VST, but this source is not loaded in ollyDBG. I am a bit confused. Can you point me in the correct direction?
I need a technique to debug these plugin .dll´s.


Get the plugin SDK from Steinberg and study it. You'll notice that you need to pass address of your own audioMasterCallback function to VSTPluginMain().
Either make your own VST host, or find one which you can debug.


I have found the minihost in the Steinberg SDK. I used it to load a vst and now I can debug my host and inside ollyDbg I can now debug the vst.


https://steinbergmedia.github.io/vst3_dev_portal/


VST 3 Developer Portal


https://steinbergmedia.github.io/vst3_doc/
https://steinbergmedia.github.io/vst3_dev_portal/pages/Versions/Index.html


Change History


https://github.com/steinbergmedia/vst3sdk


VST 3 Plug-In SDK


https://github.com/steinbergmedia/vst3sdk#adding-vst2-version


Adding VST2 version
The VST 2 SDK is not part anymore of the VST 3 SDK, you have to use an older version of the SDK and copy the vst2sdk folder into the VST_SDK folder. In order to build a VST2 version of the plug-in and a VST3 at the same time, you need to copy the VST2 folder into the VST3 folder


steinbergmedia/vst3sdk#51


Make the VST2 SDK available


Would it possible to make the VST2 SDK available on GitHub as a separate repo? This would make certail users' lives a lot easier.


VST2 SDK is not anymore supported. https://www.steinberg.net/en/newsandevents/news/newsdetail/article/vst-2-coming-to-an-end-4727.html


there is already a reverse engineered VST2 header since 2006 called VeSTige. It allows you to build a to-spec VST2 plugin.
It doesn't have any of the SDK features of course, but that's not really the point here (the SDK is probably the worst part about all of VST*, that and being completely closed to any revisions).
You can find examples of projects using vestige all over the place https://github.com/x42/lv2vst/blob/master/include/vestige.h?rgh-link-date=2021-02-24T18%3A31%3A13Z


steinbergmedia/vst3sdk#73


A plugin from Github that I am working with (the GLSL Plugin) says that "The VST2 SDK can be obtained from the vstsdk3610_11_06_2018_build_37 (or older) VST3 SDK..." However, there seems to be no VST_2 in steinbergmedia / vst3dk's code, files, etc. Are they missing, or are they not supposed to be in there?


VST2 SDK is not anymore available.


https://github.com/steinbergmedia/vst3_public_sdk/commits/master/source

Looks like we can get a number of the older builds straight from the source; they've just removed the git tags to make them slightly harder to find
Though from a quick skim the oldest seems to be vstsdk367_03_03_2017_build_352, which I believe already had the VST2SDK removed


https://github.com/logicomacorp/WaveSabre/tree/master/Vst3.x


You can download the last version of the VST3 SDK that includes the full VST2 SDK here.
Unpack the contents of the "VST3 SDK" subfolder from that archive into this
folder. This means that folders like "plugininterface" and "public.sdk" should
be located next to this file.


We can see other downloads here:

https://web.archive.org/web/*/http://www.steinberg.net/sdk_downloads/*

vst_sdk2_3.zip
vst_sdk2_4_rev2.zip
vstsdk360_22_11_2013_build_100.zip
vstsdk365_28_08_2015_build_66.zip
vstsdk365_12_11_2015_build_67.zip
vstsdk366_21_06_2016_build_58.zip
vstsdk366_27_06_2016_build_61.zip


https://github.com/search?type=code&q=%22Category++++%3A+VST+2.x+Interfaces%22+%22Filename++++%3A+pluginterfaces%2Fvst2.x%22

"Category    : VST 2.x Interfaces" "Filename    : pluginterfaces/vst2.x"


Looks like someone already did that and committed it here!

https://github.com/skarab/INVOT_TEIH_MARTX/tree/main/music/WaveSabre/Vst3.x

https://github.com/skarab/INVOT_TEIH_MARTX/blob/main/music/WaveSabre/Vst3.x/public.sdk/source/vst3stdsdk.cpp

VSTSDK3 Version: 3.6.6


https://github.com/skarab/INVOT_TEIH_MARTX/tree/main/music/WaveSabre/Vst3.x/pluginterfaces/vst2.x

VSTSDK2 Version: 2.4


https://github.com/phil-monroe/midiplug/tree/master/SDKs/vst3

https://github.com/phil-monroe/midiplug/blob/master/SDKs/vst3/public.sdk/source/vst3stdsdk.cpp

VSTSDK3 Version: 3.6.0


https://github.com/phil-monroe/midiplug/tree/master/SDKs/vst3/pluginterfaces/vst2.x

VSTSDK2 Version: 2.4


https://github.com/juce-framework/JUCE/blob/master/BREAKING_CHANGES.md#version-540


The VST2 SDK can be obtained from the vstsdk3610_11_06_2018_build_37 (or older) VST3 SDK or JUCE version 5.3.2


https://github.com/juce-framework/JUCE/commit/a3566b8c1edac29997490143982cd2b2eb089ffa


Removed the embedded VST2 SDK


https://github.com/juce-framework/JUCE/commits/master/modules/juce_audio_processors/format_types/VST3_SDK

History for VST3_SDK


https://github.com/juce-framework/JUCE/commits/master/modules/juce_audio_processors/format_types/VST3_SDK/pluginterfaces/vst2.x

History for VST3_SDK/pluginterfaces/vst2.x


https://github.com/gyohng/NewJucePluginTemplate/blob/main/JUCE/modules/juce_core/system/juce_StandardHeader.h#L26

JUCE Version: 7.0.5
https://github.com/gyohng/NewJucePluginTemplate/tree/main/JUCE/modules/juce_audio_processors/format_types/VST3_SDK

VSTSDK3 Version: 3.6.13 (08.04.2019)


https://github.com/gyohng/NewJucePluginTemplate/tree/main/JUCE/modules/juce_audio_processors/format_types/VST3_SDK/pluginterfaces/vst2.x

VSTSDK2 Version: 2.4


https://github.com/bgillock/JUCE/blob/master/modules/juce_core/system/juce_StandardHeader.h#L26

JUCE Version: 7.0.2
https://github.com/bgillock/JUCE/tree/master/modules/juce_audio_processors/format_types/VST3_SDK

VSTSDK3 Version: 3.6.13 (08.04.2019)
Seems to have some good change history with other versions as well though:

https://github.com/bgillock/JUCE/commits/master/modules/juce_audio_processors/format_types/VST3_SDK

VSTSDK3 Versions: 3.6.9, 3.6.12, 3.6.13
VSTSDK2 Versions: 2.4

https://github.com/bgillock/JUCE/commit/a3566b8c1edac29997490143982cd2b2eb089ffa#diff-17a7663f3628938364d97d98abbaa7a94c9f96175a26134f6985f8548862f475


Workaround
----------
1. The VST2 SDK can be obtained from the vstsdk3610_11_06_2018_build_37 (or
older) VST3 SDK or JUCE version 5.3.2. You should put the VST2 SDK in your
header search paths.


https://github.com/bgillock/DanteJUCEDemo/blob/master/modules/juce_core/system/juce_StandardHeader.h#L26

JUCE Version: 7.0.2
https://github.com/bgillock/DanteJUCEDemo/tree/master/modules/juce_audio_processors/format_types/VST3_SDK

VSTSDK3 Version: 3.6.13 (08.04.2019)
https://github.com/bgillock/DanteJUCEDemo/tree/master/modules/juce_audio_processors/format_types/VST3_SDK/pluginterfaces/vst2.x

VSTSDK2 Version: 2.4


https://github.com/johndpope/PianoRoller/blob/master/JuceLibraryCode/modules/juce_core/system/juce_StandardHeader.h#L26

JUCE Version: 5.4.1
https://github.com/johndpope/PianoRoller/tree/master/JuceLibraryCode/modules/juce_audio_processors/format_types/VST3_SDK

VSTSDK3 Version: 3.6.9
https://github.com/johndpope/PianoRoller/tree/master/JuceLibraryCode/modules/juce_audio_processors/format_types/VST3_SDK/pluginterfaces/vst2.x

VSTSDK2 Version: 2.4


https://github.com/pac-dev/protoplug/blob/master/Frameworks/JuceModules/juce_core/system/juce_StandardHeader.h#L26

JUCE Version: 5.3.2
https://github.com/pac-dev/protoplug/tree/master/Frameworks/JuceModules/juce_audio_processors/format_types/VST3_SDK

VSTSDK3 Version: 3.6.9
https://github.com/pac-dev/protoplug/tree/master/Frameworks/JuceModules/juce_audio_processors/format_types/VST3_SDK/pluginterfaces/vst2.x

VSTSDK2 Version: 2.4


https://github.com/steinbergmedia/vst3projectgenerator


VST 3 Project Generator


https://github.com/steinbergmedia/vst3_example_plugin_hello_world


Hello World VST 3
Hello World VST 3 example plug-in


This is a simple Hello World VST 3 FX plug-in to demonstrate how to use the VST 3 SDK as an external project.


https://github.com/steinbergmedia/vstgui


VSTGUI
A user interface toolkit mainly for audio plug-ins


VSTGUI is a user interface toolkit mainly for audio plug-ins (VST, AAX, AudioUnit, etc...) and it is designed for working well with VST 3 plug-ins and its wrappers like AU, AAX, ...


https://github.com/steinbergmedia/vst3_public_sdk


VST 3 Implementation Helper Classes And Examples


https://github.com/steinbergmedia/vst3_c_api


VST3 C API


The C API header of the VST3 API


This repository contains the VST3 C API.
It is automatically generated out of the C++ VST3 API (See https://github.com/steinbergmedia/vst3_c_api_generator)


https://github.com/steinbergmedia/vst3_pluginterfaces


VST 3 SDK Interfaces
Here are located all VST 3 interfaces definitions (including VST Component/Controller, UI, Test).


https://github.com/pongasoft/jamba


A lightweight VST2/3 framework


pongasoft/jamba#16


Remove support for VST2


https://github.com/pongasoft/jamba/commit/30776f4a0034123f26575a53317aca59440ba4b3


https://github.com/pongasoft/vst24-hello-world


This project contains a "Hello World" style application for building a VST 2.4 plugin


https://github.com/pongasoft/vst24-hello-world/commit/9355bbb2adebb940c288855d06784caa1350a96f


removed vst2 files for licensing reasons


https://github.com/pongasoft/vst24-hello-world/blob/master/A_Beginners_Guide_To_VST2.4.md


A Beginner's Guide to VST 2.4


Why VST 2.4?
It is fair to wonder why (as of this writing in early 2018), I chose to create a project about VST 2.4. The reason is actually quite simple. When steinberg released 3.0, they decided to make it not backward compatible. 2.4 is the last version released of the 2.x branch. Due to the immense popularity of the format, and the fact that (at least at the time), the 3.x branch was not bringing enough to justify porting all the plugins to the new version, the format has continued to strive. Even despite steinberg officially ending its support at the end of 2013, the format has continued to strive.


https://github.com/pongasoft/vst24-hello-world/blob/master/A_Beginners_Guide_To_VST2.4.md#vst-sdk


For licensing reasons, you need to download the VST SDK from steinberg (3.6.8 as of 2018/01/01)
You can no longer download the VST 2.4 SDK and instead you have to download the VST 3 SDK, but it contains the 2.4 version. Also, although the VST3 SDK is open source (under a dual licensing including GPL3), version 2.4 is explicitly excluded so you need to get it yourself.


Anatomy of the SDK
The entire VST 2.4 SDK is comprised of 2 files: aeffect.h and aeffectx.h located under VST2_SDK/pluginterfaces/vst2.x in the downloaded archive. There is no dependency on anything else so in order to build and compile a basic VST 2.4 plugin, you only need to include aeffectx.h since it includes the other file.
aeffectx.h contains extensions added after version 1.0.


Plugin Lifecycle
Based entirely on convention, the host will locate the plugin executable (packaging and location depends on platform). The host will then look for a function with the following signature (by convention):
AEffect *VSTPluginMain(audioMasterCallback vstHostCallback)
or if you prefer with more readable/meaningful types:
typedef audioMasterCallback VSTHostCallback;
typedef AEffect VSTPlugin;
VSTPlugin *VSTPluginMain(VSTHostCallback vstHostCallback);
This function, written by the plugin developer acts as a factory of plugins:

it receives the host callback which is used by the plugin to communicate with the host (for example to get the sample rate (opCode audioMasterGetSampleRate))
it returns an instance of the AEffect (aka VSTPlugin) structure defined by the API

This structure contains information that the host requires, like the number of inputs and outputs as well as 5 function pointers which defines the callbacks that the host will use to interact with the plugin.


https://github.com/pongasoft/vst3-again-sample


Example of a self contained VST3/VST2 plugin that is not part of the SDK


This project is exactly the again example that ships part of the VST3 SDK but self contained and depending on the SDK (vs being part of the SDK). As a result it can be used as a starting point to build other plugins.


Note 2020/01
This project was created in 2018 using the VST3 SDK 3.6.9 which includes VST2. More recent versions of the SDK have removed VST2 support. Although this project is still valid as long as you use 3.6.9, you should check Jamba which offers a very easy way to bootstrap a blank self contained plugin which depends on the SDK. Jamba also offers a lot of additional features.


https://github.com/R-Tur/VST_SDK_2.4


CMake project for VST SDK 2.4


https://bitbucket.org/r-tur/vst_sdk_2.4


https://github.com/ekenberg/vstminihost


VST audio plugin GUI minihost for Linux


This is the minihost example from Steinberg VSTSDK-2.4, with added support for building on Linux


https://www.image-line.com/fl-studio-learning/fl-studio-online-manual/html/plugins/Minihost%20Modular.htm


Minihost Modular is a modular environment for hosting/interconnecting VST/AU plugins based on a custom modular engine especially developed for this purpose. As a standalone, Minihost Modular can be used as an advanced VST or AU host with modular routing with some sequencing recording/playback capabilities. As a VST or AU plugin, Minihost Modular can be used to extend the capabilities of your existing DAW software with its powerful modular, recallable, environment. Minihost Modular bares some similarities to FL Studio's Patcher but has an extended capability as a self contained host.


https://github.com/blurkk/symbiosis-au-vst-2


Symbiosis is a developer tool for adapting Mac OS X VST plug-ins to the Audio Unit (AU) standard.


https://github.com/juce-framework


JUCE is an open-source cross-platform C++ application framework for desktop and mobile applications, including VST, VST3, AU, AUv3, LV2 and AAX audio plug-ins.


https://github.com/juce-framework/JUCE/tree/6.0.4/modules


Compiler Optimisations

Fast Division / Modulus


https://binary.ninja/2023/09/15/3.5-expanded-universe.html#moddiv-deoptimization


Mod/Div Deoptimization


One of the many things compilers do that can make reverse engineering harder is use a variety of algorithmic optimizations, in particular for modulus and division calculations. Instead of implementing them with the native CPU instructions, they will use shifts and multiplications with magic constants that when operating on a fixed integer size has the same effect as a native division instruction.
There are several ways to try to recover the original division which is far more intuitive and easer to reason about.


https://lemire.me/blog/2020/02/26/fast-divisionless-computation-of-binomial-coefficients/


Fast divisionless computation of binomial coefficients


We would prefer to avoid divisions entirely. If we assume that k is small, then we can just use the fact that we can always replace a division by a known value with a shift and a multiplication. All that is needed is that we precompute the shift and the multiplier. If there are few possible values of k, we can precompute it with little effort.


I provide a full portable implementation complete with some tests. Though I use C, it should work as-is in many other programming languages. It should only take tens of CPU cycles to run. It is going to be much faster than implementations relying on divisions.


Another trick that you can put to good use is that the binomial coefficient is symmetric: you can replace k by n–k and get the same value. Thus if you can handle small values of k, you can also handle values of k that are close to n. That is, the above function will also work for n is smaller than 100 and k larger than 90, if you just replace k by n–k.


Is that the fastest approach? Not at all. Because n is smaller than 100 and k smaller than 10, we can precompute (memoize) all possible values. You only need an array of 1000 values. It should fit in 8kB without any attempt at compression. And I am sure you can make it fit in 4kB with a little bit of compression effort. Still, there are instances where relying on a precomputed table of several kilobytes and keeping them in cache is inconvenient. In such cases, the divisionless function would be a good choice.


Alternatively, if you are happy with approximations, you will find floating-point implementations.


https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/blob/master/2020/02/26/binom.c
https://github.com/dmikushin/binom/blob/master/include/binom.h
https://github.com/bmkessler/fastdiv
https://github.com/jmtilli/fastdiv/blob/master/fastdiv.c


Unsorted


https://github.com/mroi/apple-internals


Apple Internals
This repository provides tools and information to help understand and analyze the internals of Apple’s operating system platforms.


https://mroi.github.io/apple-internals/


Collected knowledge about the internals of Apple’s platforms.
Sorted by keyword, abbreviation, or codename.


https://opensource.apple.com/source/objc4/


Source Browser - objc4


objc4-274


https://opensource.apple.com/source/objc4/objc4-274/
https://opensource.apple.com/source/objc4/objc4-274/objc-exports.auto.html


objc-exports


# Functions and variables explicitly exported from ObjC.
# GrP 2002-2-4
# Note that some commonly used functions are *not* listed in the 
# ObjC headers (e.g. objc_flush_caches())


https://opensource.apple.com/source/objc4/objc4-274/libobjc.order.auto.html


libobjc.order


objc4-818.2


https://opensource.apple.com/source/objc4/objc4-818.2/libobjc.order.auto.html


libobjc.order


https://opensource.apple.com/source/objc4/objc4-818.2/unexported_symbols.auto.html


unexported_symbols


https://github.com/smx-smx/ezinject


Modular binary injection framework, successor of libhooker


ezinject is a lightweight and flexible binary injection framework. it can be thought as a lightweight and less featured version of frida.
It's main and primary goal is to load a user module (.dll, .so, .dylib) inside a target process. These modules can augment ezinject by providing additional features, such as hooks, scripting languages, RPC servers, and so on. They can also be written in multiple languages such as C, C++, Rust, etc... as long as the ABI is respected.
NOTE: ezinject core is purposedly small, and only implements the "kernel-mode" (debugger) features it needs to run the "user-mode" program, aka the user module.
It requires no dependencies other than the OS C library (capstone is optionally used only by user modules)
Porting ezinejct is simple: No assembly code is required other than a few inline assembly statements, and an abstraction layer separates multiple OSes implementations.


https://github.com/evelyneee/ellekit


ElleKit
yet another tweak injector / tweak hooking library for darwin systems


What this is

A C function hooker that patches memory pages directly
An Objective-C function hooker
An arm64 assembler
A JIT inline assembly implementation for Swift
A Substrate and libhooker API reimplementation


http://diaphora.re/


Diaphora
A Free and Open Source Program Diffing Tool


Diaphora (διαφορά, Greek for 'difference') version 3.0 is the most advanced program diffing tool (working as an IDA plugin) available as of today (2023). It was released first during SyScan 2015 and has been actively maintained since this year: it has been ported to every single minor version of IDA since 6.8 to 8.3.
Diaphora supports versions of IDA >= 7.4 because the code only runs in Python 3.X (Python 3.11 was the last version being tested).


https://github.com/joxeankoret/diaphora


Diaphora, the most advanced Free and Open Source program diffing tool.


Diaphora has many of the most common program diffing (bindiffing) features you might expect, like:

Diffing assembler.
Diffing control flow graphs.
Porting symbol names and comments.
Adding manual matches.
Similarity ratio calculation.
Batch automation.
Call graph matching calculation.
Dozens of heuristics based on graph theory, assembler, bytes, functions' features, etc...

However, Diaphora has also many features that are unique, not available in any other public tool. The following is a non extensive list of unique features:

Ability to port structs, enums, unions and typedefs.
Potentially fixed vulnerabilities detection for patch diffing sessions.
Support for compilation units (finding and diffing compilation units).
Microcode support.
Parallel diffing.
Pseudo-code based heuristics.
Pseudo-code patches generation.
Diffing pseudo-codes (with syntax highlighting!).
Scripting support (for both the exporting and diffing processes).


See Also

My StackOverflow/etc answers


https://stackoverflow.com/questions/46802472/recursively-find-hexadecimal-bytes-in-binary-files/77706906#77706906

Recursively searching through binary files for hex strings (with potential wildcards) using radare2's rafind2
Crossposted: https://twitter.com/_devalias/status/1738458619958751630

SEARCH_DIRECTORY="./path/to/bins"
GREP_PATTERN='\x5B\x27\x21\x3D\xE9'

# Remove all instances of '\x' from PATTERN for rafind2
# Eg. Becomes 5B27213DE9
PATTERN="${GREP_PATTERN//\\x/}"

grep -rl "$GREP_PATTERN" "$SEARCH_DIRECTORY" | while read -r file; do
  echo "$file:"
  rafind2 -x "$PATTERN" "$file"
done


SEARCH_DIRECTORY="./path/to/bins"
PATTERN='5B27213DE9'

# Using find
find "$SEARCH_DIRECTORY" -type f -exec sh -c 'output=$(rafind2 -x "$1" "$2"); [ -n "$output" ] && echo "$2:" && echo "$output"' sh "$PATTERN" {} \;

# Using fd
fd --type f --exec sh -c 'output=$(rafind2 -x "$1" "$2"); [ -n "$output" ] && (echo "$2:"; echo "$output")' sh "$PATTERN" {} "$SEARCH_DIRECTORY"


⇒ time ./test-grep-and-rafind2
# ..snip..
./test-grep-and-rafind2  7.33s user 0.19s system 99% cpu 7.578 total

⇒ time ./test-find-and-rafind2
# ..snip..
./test-find-and-rafind2  3.24s user 0.72s system 98% cpu 4.041 total

⇒ time ./test-fd-and-rafind2
# ..snip..
./test-fd-and-rafind2  3.85s user 1.04s system 488% cpu 1.002 total


My Other Related Deepdive Gist's and Projects


Reverse Engineering Golang (0xdevalias' gist)
GitHub Copilot CLI API (0xdevalias' gist)

Decompilation Outputs


https://github.com/0xdevalias/chatgpt-source-watch : Analyzing the evolution of ChatGPT's codebase through time with curated archives and scripts.

Reverse engineering ChatGPT's frontend web app + deep dive explorations of the code (0xdevalias' gist)


Deobfuscating / Unminifying Obfuscated Web App Code (0xdevalias' gist)
Reverse Engineering Webpack Apps (0xdevalias' gist)
Reverse Engineered Webpack Tailwind-Styled-Component (0xdevalias' gist)
React Server Components, Next.js v13+, and Webpack: Notes on Streaming Wire Format (__next_f, etc) (0xdevalias' gist))
Fingerprinting Minified JavaScript Libraries / AST Fingerprinting / Source Code Similarity / Etc (0xdevalias' gist)
Bypassing Cloudflare, Akamai, etc (0xdevalias' gist)
Debugging Electron Apps (and related memory issues) (0xdevalias gist)
devalias' Beeper CSS Hacks (0xdevalias' gist)
Address	Value	Meaning
0x400ba8	0x0	`top_offset` (more on this later)
0x400bb0	0x400b78	Pointer to `typeinfo for Parent` (also part of the above memory dump)
0x400bb8	0x400aa0	Pointer to `Parent::Foo()`^1^. `Parent`'s _vptr points here.
0x400bc0	0x400a90	Pointer to `Parent::FooNotOverridden()`^2^
Address	Value	Meaning
0x400b40	0x0	`top_offset` (more on this later)
0x400b48	0x400b90	Pointer to `typeinfo for Derived` (also part of the above memory dump)
0x400b50	0x400a80	Pointer to `Derived::Foo()`^3^. `Derived`'s _vptr points here.
0x400b58	0x400a90	Pointer to `Parent::FooNotOverridden()` (same as `Parent`'s)