@obfusk
Last active December 15, 2023 15:08
HOWTO: diff & fix APKs for Reproducible Builds

NB: assumes signed APK from upstream named upstream-release.apk, and unsigned APK from F-Droid CI named fdroiddata-ci.apk.

NB: also assumes a working directory in which it is okay to create temporary files & directories.

Links

Prerequisites

Tools

$ apt install aapt apksigner bat dexdump dos2unix unzip

Install reproducible-apk-tools; you can either use the scripts directly from the git repo (e.g. zipinfo.py) or install the repro-apk Python package and use them as subcommands (e.g. repro-apk zipinfo).

Install apksigcopier.

Download fix-dexdump.sh.

bash functions

bdiff() {
  # unified diff with coloured output via bat
  # (NB: on Debian/Ubuntu the apt package installs the binary as batcat)
  diff -Naur "$@" | bat -p -l diff
}
diff2c() {
  # diff the output of 2 different commands run on the same file
  # (commands are left unquoted on purpose so e.g. 'zipinfo.py -e' is split into words)
  local cmd_a="$1" cmd_b="$2" file="$3"
  diff -Naur <( $cmd_a "$file" ) <( $cmd_b "$file" ) | bat -p -l diff
}
diff2f() {
  # diff the output of the same command run on 2 different files
  local cmd="$1" file_a="$2" file_b="$3"
  diff -Naur <( $cmd "$file_a" ) <( $cmd "$file_b" ) | bat -p -l diff
}

Preliminary investigation

apksigcopier

# if this prints OK and does not show an error, we're good :)
$ apksigcopier compare upstream-release.apk --unsigned fdroiddata-ci.apk && echo OK

zipinfo.py

NB: it is expected for only upstream's signed APK to have v1 (JAR) signature files: META-INF/MANIFEST.MF, META-INF/*.SF, and META-INF/*.RSA (or .DSA/.EC).

$ diff2f 'zipinfo.py -e' upstream-release.apk fdroiddata-ci.apk
[...]

NB: if upstream's APK has a file named META-INF/BNDLTOOL.RSA, it was almost certainly built from an AAB by bundletool, which will not work with reproducible builds; ensure upstream directly builds an APK, not an app bundle.

NB: we're using zipinfo.py -e here because, unlike the original zipinfo, it also shows the CRC32, thus allowing us to see when a file differs in contents only.
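
Since the v1 (JAR) signature files are expected to appear only in upstream's APK, it can help to filter them out before comparing lists of entry names. A minimal sketch, assuming entry names arrive one per line on stdin; the function name `without_v1_signature_files` is hypothetical and not part of reproducible-apk-tools:

```shell
# read ZIP entry names on stdin (one per line) and print only those that are
# NOT v1 (JAR) signature files; the patterns follow the NB above
without_v1_signature_files() {
  local name
  while IFS= read -r name; do
    case "$name" in
      META-INF/MANIFEST.MF|META-INF/*.SF|META-INF/*.RSA|META-INF/*.DSA|META-INF/*.EC)
        ;;                          # expected in the signed APK only: skip
      *)
        printf '%s\n' "$name" ;;    # everything else is kept as-is
    esac
  done
}
```

One way to feed it is `unzip -Z1 upstream-release.apk | without_v1_signature_files | sort`, then doing the same for fdroiddata-ci.apk and comparing the two lists. NB: this also hides META-INF/BNDLTOOL.RSA, so check for that file separately.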

diff-zip-meta.py

If the ZIP contents are equal, you can diff the ZIP metadata using diff-zip-meta.py.

Unpacking the APKs

NB: creates directories x and y.

$ unzip -q -d x upstream-release.apk
$ unzip -q -d y fdroiddata-ci.apk
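
With both APKs unpacked, a recursive diff gives a quick overview of which files need attention. This is only a sketch; the function name `changed_files` is hypothetical and it relies on the output format of GNU diff:

```shell
# list files that differ between, or exist in only one of, two unpacked APKs
changed_files() {
  local dir_a="$1" dir_b="$2"
  # -r: recurse into subdirectories, -q: only report whether files differ;
  # LC_ALL=C keeps the messages and the sort order stable
  LC_ALL=C diff -rq "$dir_a" "$dir_b" | LC_ALL=C sort
}
```

Running `changed_files x y` prints lines like "Files x/foo and y/foo differ" and "Only in x/META-INF: ..."; the v1 signature files are expected to show up as "Only in x/META-INF".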

Make sure the same commit and build method/options are used

NB: first of all: please ensure the same commit is used for both builds.

Second: try to avoid differences in build method/options that may affect reproducibility.

Usually, it's fine for upstream to build using Android Studio (instead of invoking gradle directly, as F-Droid and CI builds do), though this can cause differences. Using bundletool and AABs (instead of building APKs directly) will almost certainly not work with reproducible builds.

NB: if you're certain you're building from the same commit but still seeing differences in AndroidManifest.xml, res/*.xml, or resources.arsc, it's likely something is different about the build method, options, and/or configuration, which should be addressed before trying to fix other issues.

ZIP ordering differences

Solution (upstream): either build using the CLI (not Android Studio) or use Android Gradle plugin 7.1.X or later.

Link: Bug: Android Studio builds have non-deterministic ZIP ordering.

Google Issue Tracker: non-deterministic order of ZIP entries in APK makes builds not reproducible.

Differences in specific files

Differing AndroidManifest.xml files (address first)

App Manifest compiled to Android binary XML.

NB: if these files are not the same, something is definitely wrong; are the APKs really built from the same commit and using the same build method/options?

$ diff2f 'dump-axml.py' x/AndroidManifest.xml y/AndroidManifest.xml
[...]

Differing .xml files in res/ (address first)

App resources compiled to Android binary XML.

NB: if these files are not the same, something is definitely wrong; are the APKs really built from the same commit and using the same build method/options?

$ diff2f 'dump-axml.py' x/res/foo.xml y/res/foo.xml
[...]

Differing resources.arsc (address first)

Android package resource table.

NB: if these files are not the same (and you're using Android Gradle plugin 3.4.X or later), something is definitely wrong; are the APKs really built from the same commit and using the same build method/options?

$ diff2f 'dump-arsc.py' x/resources.arsc y/resources.arsc
[...]

Solution for ordering differences (upstream): use Android Gradle plugin 3.4.X or later.

Link: Reproducible APK tools.

Google Issue Tracker: resources.arsc built with non-determism, prevents reproducible APK builds.

Differing .dex files (can be hard to fix)

Java/Kotlin classes compiled to Android bytecode.

NB: these differences can be hard to fix, depending on what caused them, so please don't spend a lot of time trying to make .dex files equal when there are e.g. differences in .xml files or resources.arsc, as the build will never be reproducible if those are not fixed.

# repeat as needed for classes2.dex etc.
$ dexdump -a -d -f -h x/classes.dex > x/classes.dex.dump
$ dexdump -a -d -f -h y/classes.dex > y/classes.dex.dump
# make the diff a lot smaller :)
$ fix-dexdump.sh x/*.dex.dump y/*.dex.dump
# repeat as needed for classes2.dex etc.
$ bdiff x/classes.dex.dump y/classes.dex.dump
[...]
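
For APKs with many classesN.dex files, the "repeat as needed" steps above can be sketched as a loop that only dumps the files that actually differ. The function names are hypothetical; the dexdump options are the ones used above:

```shell
# print the names of classes*.dex files that differ between two unpacked APKs
differing_dex() {
  local dir_a="$1" dir_b="$2" dex name
  for dex in "$dir_a"/classes*.dex; do
    [ -e "$dex" ] || continue               # no .dex files at all
    name="${dex##*/}"                       # strip the directory part
    cmp -s "$dex" "$dir_b/$name" 2>/dev/null || printf '%s\n' "$name"
  done
}

# dump only the differing .dex files, writing NAME.dump next to each of them
dump_differing_dex() {
  local dir_a="$1" dir_b="$2" name
  for name in $(differing_dex "$dir_a" "$dir_b"); do
    dexdump -a -d -f -h "$dir_a/$name" > "$dir_a/$name.dump"
    dexdump -a -d -f -h "$dir_b/$name" > "$dir_b/$name.dump"
  done
}
```

After `dump_differing_dex x y`, run fix-dexdump.sh on the dumps and diff each pair as shown above.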

Ensure that the same JDK (usually that means OpenJDK 11) is used for both builds.

Sometimes this works (but we're not sure why):

--- a/app/build.gradle
+++ b/app/build.gradle
     compileOptions {
-        sourceCompatibility JavaVersion.VERSION_1_8
-        targetCompatibility JavaVersion.VERSION_1_8
+        sourceCompatibility JavaVersion.VERSION_11
+        targetCompatibility JavaVersion.VERSION_11
     }
     kotlinOptions {
-        jvmTarget = '1.8'
+        jvmTarget = '11'
     }

Links:

Google Issue Tracker:

Differing .so files (can be hard to fix)

Compiled native code.

Easily affected by differences in build environment; using a build environment that resembles the F-Droid buildserver/CI as closely as possible -- e.g. using the same Debian version, etc. -- should reduce differences.

NB: these can be some of the hardest differences to fix, so please don't spend a lot of time trying to e.g. make build paths equal when there are differences in .xml files or resources.arsc, as the build will never be reproducible if those are not fixed.

Links:

Differing assets/dexopt/baseline.prof (caused by .dex)

Compiled baseline profile.

NB: these should only differ when the .dex files do (as a result of the .prof file containing a checksum of the corresponding .dex files); any differences in these files should disappear when the .dex files are made equal.

$ diff2f 'dump-baseline.py' x/assets/dexopt/baseline.prof y/assets/dexopt/baseline.prof
[...]

Differing assets/dexopt/baseline.profm (easy to fix)

Compiled baseline profile metadata.

NB: these may also differ as a result of .dex file differences, so please make sure those are equal first.

$ diff2f 'dump-baseline.py' x/assets/dexopt/baseline.profm y/assets/dexopt/baseline.profm
[...]

Solution (upstream): sort baseline.profm in build.gradle using com.android.tools.profgen.

Link: Bug: baseline.profm not deterministic.

Google Issue Tracker: Non-stable assets/dexopt/baseline.profm when rerun with --rerun-tasks.

Differences caused by LF vs CRLF (easy to fix)

NB: this most commonly affects META-INF/services/* files, but can affect other files as well, e.g. .css/.html/.js/.txt.

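These can have line ending differences if e.g. upstream's APK was built on Windows. If the diff of a pair of these files shows removed and added lines that look identical, like this, it's an LF vs CRLF issue (the only difference is an invisible CR at the end of the line):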

$ bdiff x/META-INF/services/c6.l y/META-INF/services/c6.l
-y5.a
+y5.a

In which case using unix2dos on the file from the APK built on Linux should fix it:

# diff should be empty now
$ bdiff x/META-INF/services/c6.l <( unix2dos < y/META-INF/services/c6.l )
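
To find out which files are affected before deciding what to pass to fix-newlines, something like the following may help. This is a sketch with hypothetical function names, assuming GNU grep; note that the second function ignores every CR byte, not just those at line ends:

```shell
# list text files under a directory that contain at least one CR byte
crlf_files() {
  local dir="$1" cr
  cr="$(printf '\r')"
  # -r: recursive, -l: print file names only, -I: skip binary files
  LC_ALL=C grep -rlI "$cr" "$dir" | LC_ALL=C sort
}

# succeed if two files are equal once all CR bytes are removed
# (compares CRC + size via cksum, which is good enough for a quick check)
same_ignoring_cr() {
  [ "$(tr -d '\r' < "$1" | cksum)" = "$(tr -d '\r' < "$2" | cksum)" ]
}
```

For example, `crlf_files x/META-INF/services` lists candidates in upstream's APK, and `same_ignoring_cr x/META-INF/services/c6.l y/META-INF/services/c6.l` confirms that line endings are the only difference.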

Solution (fdroiddata):

srclibs:
  - reproducible-apk-tools@v0.2.3
postbuild:
  - $$reproducible-apk-tools$$/inplace-fix.py --zipalign fix-newlines $$OUT$$
    'META-INF/services/*'

Google Issue Tracker: newline differences between building on Windows vs Linux make builds not reproducible.

Differing .json file from the AboutLibraries Gradle plugin (easy to fix)

Link: Embedded timestamps: AboutLibraries Gradle plugin.

Differing .png files (easy to fix)

PNG optimisation/generation is often not reproducible.

Links:

If all else fails, try diffoscope

https://try.diffoscope.org/

$ apt install diffoscope
$ diffoscope --text diff.txt --text-color always upstream-release.apk fdroiddata-ci.apk
$ less -R diff.txt