Created June 12, 2015 19:07
Better Dependency Hygiene With Private Dependencies On JVM

A common pain when working on large projects is the diamond dependency. Consider a widely used library such as ASM. I want to build a big application that reuses many powerful libraries, but unfortunately many of my desired dependencies themselves depend on different, incompatible versions of ASM. Since ASM does not appear in any API I touch, my code compiles fine, but at runtime a class loader resolves only one version of a class with a given name, which leads to binary incompatibility errors at runtime.

OSGi bundles address a related problem, but they appear to be a heavyweight solution that has proven too cumbersome to use in practice. Here we propose a lighter-weight approach that pays off with each additional project that adopts it.

Private dependencies are implemented by a build tool plug-in. In the build definition, where dependencies are declared, one can label a jar dependency as private. A private dependency means that the classes in that jar may appear only in the signatures of private or package-private methods. If they appear in public or protected method signatures, the build fails. If the condition is met, the build tool mangles the class names so that the SHA-1 hash of the private dependency jar becomes the prefix of the package name, and on publishing it merges all the classes from that jar into the current jar, removing any duplicate classes. This hashing approach has the benefit that classes that shared a package before mangling still share a package afterwards, so package-private access between them keeps working.
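The name mangling described above can be sketched in a few lines of Java. This is a minimal illustration, not the plug-in itself: the class `PrivateDepMangler`, its method names, and the leading `p` in the prefix (added only so the package segment never starts with a digit) are assumptions made for this example, and the "jar" is a stand-in byte array rather than a real file.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; the proposal does not name any classes or methods.
final class PrivateDepMangler {

    /** Hex-encoded SHA-1 of the jar's bytes. */
    static String sha1Hex(byte[] jarBytes) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(jarBytes)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is one of the digests every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Prepends the jar hash as a package prefix.
     * Assumption: a leading "p" keeps the new segment a valid Java identifier.
     */
    static String mangle(String className, String jarSha1) {
        return "p" + jarSha1 + "." + className;
    }

    public static void main(String[] args) {
        // Stand-in for the contents of the private dependency's jar.
        byte[] fakeJar = "stand-in for asm.jar".getBytes(StandardCharsets.UTF_8);
        String hash = sha1Hex(fakeJar);

        // Both classes came from org.objectweb.asm, and both end up in the
        // same mangled package because the prefix depends only on the jar.
        System.out.println(mangle("org.objectweb.asm.ClassReader", hash));
        System.out.println(mangle("org.objectweb.asm.ClassWriter", hash));
    }
}
```

Because the prefix is a function of the jar bytes alone, any project that privately depends on the same jar would derive the same mangled names.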

Examples:

  1. A has a private dependency on B and C v2.0. B also has a normal, public dependency on C v1.0.

When A is being compiled, its dependency on C v2.0 is mangled so that it does not collide with the jar for C v1.0 that comes from transitively resolving the dependencies of B. The same holds if A has the normal public dependency and B has the private one.

  2. A has a private dependency on B and C v2.0, and B has a private dependency on C v2.0.

In this case, A and B depend on the same private version. When A resolves B, it gets a jar that includes C with the hash of the v2.0 jar prepended to the package. A computes the same hash from its own dependency on C v2.0, so when the jar for A is published it includes one mangled copy of B and one mangled copy of C (since both A and B mangle C in the same way).
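The second example can be sketched as a merge keyed by mangled class name. Everything here is hypothetical: jars are modeled as in-memory maps from class name to bytecode, `HASHC` and `HASHB` are placeholders for real SHA-1 prefixes, and the sketch assumes that classes B has already mangled are carried over unchanged.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the publish-time merge; not a real jar tool.
final class MergeSketch {

    /** Merges class entries (name -> bytecode); the first copy of a name wins. */
    @SafeVarargs
    static Map<String, byte[]> merge(Map<String, byte[]>... jars) {
        Map<String, byte[]> merged = new LinkedHashMap<>();
        for (Map<String, byte[]> jar : jars) {
            for (Map.Entry<String, byte[]> e : jar.entrySet()) {
                merged.putIfAbsent(e.getKey(), e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Placeholder for the mangled name of a class from C v2.0.
        String mangledC = "pHASHC.c.Util";

        // A's own class plus the copy of C that A mangled itself.
        Map<String, byte[]> ownA = new LinkedHashMap<>();
        ownA.put("a.App", new byte[] {1});
        ownA.put(mangledC, new byte[] {2});

        // B as seen by A: B's mangled class plus B's mangled copy of C.
        Map<String, byte[]> mangledB = new LinkedHashMap<>();
        mangledB.put("pHASHB.b.Service", new byte[] {3});
        mangledB.put(mangledC, new byte[] {2});

        // Both sides derived the same name for C, so only one copy survives.
        Map<String, byte[]> jarA = merge(ownA, mangledB);
        System.out.println(jarA.keySet());
    }
}
```

The merged jar holds three entries rather than four, because the two copies of C share one mangled name.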

Costs and Benefits

The downside of this approach is that a jar may carry several versions of core dependencies. This problem could become more acute with Scala, which generally produces much larger binaries than Java. The bloat is mitigated to the degree that dependencies converge on a canonical version, since identical jars hash to the same prefix and their mangled classes are merged into a single copy.

A major benefit is that we don't need total buy-in from the ecosystem to see improvement. Each additional dependency that is marked private can no longer cause runtime binary exceptions for downstream consumers. This may encourage library authors to encapsulate their dependencies better, improving the modularity of libraries and avoiding re-exporting dependencies in public APIs. With this tool, we can reuse code more freely without the fear of creating unsatisfiable builds for downstream consumers. Such fear is often used as justification to avoid reuse, which wastes engineering effort.

Implementation

This could be implemented as an sbt plugin. The plugin would have to verify that the public and protected APIs of the current project do not include types from private dependencies. This could be done on the class files after normal compilation, so that it works with both Java and Scala code. If the public and protected APIs do not include the private dependency, the plugin walks all the class files and mangles the names of the private dependency's classes; otherwise it fails the build.
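The API check could look roughly like the following Java sketch. It uses reflection on loaded classes as a stand-in for inspecting class files directly, and it only covers method parameter and return types; fields, constructors, generics, and supertypes are left out. The class `ApiLeakCheck`, the demo classes, and the use of `java.util.regex` as the "private dependency" package are all assumptions made for illustration.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical checker; a real plug-in would read class files instead.
final class ApiLeakCheck {

    /** Lists public/protected methods of cls whose signature uses a type under privatePackage. */
    static List<String> leaks(Class<?> cls, String privatePackage) {
        List<String> found = new ArrayList<>();
        for (Method m : cls.getDeclaredMethods()) {
            int mod = m.getModifiers();
            boolean exposed = Modifier.isPublic(mod) || Modifier.isProtected(mod);
            if (m.isSynthetic() || !exposed) {
                continue; // private and package-private methods may use the dependency
            }
            List<Class<?>> types = new ArrayList<>(Arrays.asList(m.getParameterTypes()));
            types.add(m.getReturnType());
            for (Class<?> t : types) {
                if (elementType(t).getName().startsWith(privatePackage + ".")) {
                    found.add(cls.getName() + "#" + m.getName() + " exposes " + t.getName());
                }
            }
        }
        return found;
    }

    /** Unwraps array types so that Foo[] is treated like Foo. */
    static Class<?> elementType(Class<?> t) {
        while (t.isArray()) {
            t = t.getComponentType();
        }
        return t;
    }

    // Demo: java.util.regex plays the role of the private dependency.
    static class Leaky {
        public Pattern compile(String regex) { return Pattern.compile(regex); }
    }

    static class Clean {
        public boolean matches(String regex, String s) { return helper(regex).matcher(s).matches(); }
        private Pattern helper(String regex) { return Pattern.compile(regex); }
    }

    public static void main(String[] args) {
        System.out.println("Leaky: " + leaks(Leaky.class, "java.util.regex"));
        System.out.println("Clean: " + leaks(Clean.class, "java.util.regex"));
    }
}
```

In this sketch a non-empty result would correspond to failing the build, and an empty one to proceeding with the name mangling.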

@johnynek

Unfortunately I posted this not while logged in. :/ So I'm forking it, but I originally wrote this crap.
