@trishume
Last active January 4, 2017 23:32
PackageJsonForCompilers/Esy and Nix

This is a response to a tweet by @jordwalke asking me about Nix and PackageJsonForCompilers. Twitter is too short form to answer well, so I wrote up some thoughts here. I might turn this into a blog post at some point.

Nix

Nix is a package manager, but also a meta-build system and a deployment system. It is built around a global /nix/store/ folder where everything is stored, keyed by a hash of its description and the descriptions of all its dependencies. This way you can have multiple versions of any package, and also different copies of the same version compiled against different versions of its dependencies or with different compile flags. All dependencies must be fully specified, and the way building works enforces this. Because of that, completed binaries can be served from a "cache" with no issue, which gives you the best of both source-based and binary package managers.
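
To make the keying idea concrete, here is a minimal Python sketch. It is not how Nix actually computes store paths; the function name store_key, the JSON encoding, and the 32-character truncation are all assumptions made for illustration. It only shows why two builds of the same version end up in different places when a dependency differs.

```python
import hashlib
import json

def store_key(name, version, flags, dep_keys):
    """Hypothetical key: hash of a package description plus its dependencies' keys."""
    description = json.dumps(
        {"name": name, "version": version,
         "flags": sorted(flags), "deps": sorted(dep_keys)},
        sort_keys=True,
    )
    digest = hashlib.sha256(description.encode("utf-8")).hexdigest()[:32]
    return f"{digest}-{name}-{version}"

# Two versions of a dependency give two distinct keys.
openssl_old = store_key("openssl", "1.0.2", [], [])
openssl_new = store_key("openssl", "1.1.0", [], [])

# The same version of curl, built against each of them, also gets distinct keys,
# because the dependency's key is part of what is hashed.
curl_old = store_key("curl", "7.52", ["--with-ssl"], [openssl_old])
curl_new = store_key("curl", "7.52", ["--with-ssl"], [openssl_new])
```

Since a dependency's key feeds into the key of everything built on top of it, a change anywhere in the graph changes every key downstream, and that is what makes a shared binary cache safe.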

The way it handles build vs. runtime dependencies is automatic. The "derivation" for a package is itself a thing that gets built into the Nix store, and its dependencies are all the build-time dependencies of the package. The derivation is then used to build the result (in a temporary build directory that is deleted afterwards, but can be kept). The result is then scanned for the hashes that appear in the store paths of the build dependencies; anything that shows up is a runtime dependency. Derivations are created from "Nix expressions", written in a nice small language for writing configs and defining packages.
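
The scanning step can be sketched in a few lines of Python. This is a toy model, not Nix's implementation: the function runtime_deps, the in-memory dict of output files, and the example store paths are hypothetical.

```python
def runtime_deps(output_files, build_dep_paths):
    """Return the build dependencies whose hash appears in any output file.

    output_files: dict of file name -> file contents as bytes (toy stand-in
    for the build result on disk).
    build_dep_paths: store paths assumed to look like /nix/store/<hash>-<name>.
    """
    found = set()
    for path in build_dep_paths:
        # Take the <hash> part of the last path component.
        digest = path.rsplit("/", 1)[-1].split("-", 1)[0].encode()
        if any(digest in blob for blob in output_files.values()):
            found.add(path)
    return found

# Hypothetical store paths for a compiler and a library.
gcc = "/nix/store/aaaa1111-gcc"
openssl = "/nix/store/bbbb2222-openssl"

# The built binary embeds the library's path (for example in an rpath),
# but nothing from the compiler.
outputs = {"bin/app": b"\x7fELF...rpath=/nix/store/bbbb2222-openssl/lib..."}

deps = runtime_deps(outputs, [gcc, openssl])
```

In this example only the openssl path is reported, so the compiler stays a build-time dependency and does not need to be shipped with the result.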

Does it have all of PJC's advantages? Yes.

Some things taken from the PJC readme:

Cross compilation

Yes Nix can do this, and if the underlying builder doesn't support it, it can transparently farm builds out to a machine of the right type while still running commands on your machine.

Symlinks

Nix can use these since the file paths are deterministic, always /nix/store/hash-blah/

Parallelize builds. Recover builds.

Nix knows all dependencies and auto-parallelizes. There's an option to make it keep build artifact folders.
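
As a rough illustration of why a complete dependency graph is enough to parallelize, the sketch below groups packages into "waves" that could be built at the same time. It uses Python's standard graphlib module (3.9+); the package names and the build_waves helper are made up, and a real scheduler would start each build as soon as its own dependencies finish rather than wait for a whole wave.

```python
from graphlib import TopologicalSorter

def build_waves(deps):
    """Group packages into waves; everything in one wave can build in parallel.

    deps maps each package to the set of packages it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all packages whose deps are done
        waves.append(ready)
        sorter.done(*ready)
    return waves

# Hypothetical project: app needs two libraries, which both need libc.
graph = {
    "app": {"libA", "libB"},
    "libA": {"libc"},
    "libB": {"libc"},
    "libc": set(),
}
waves = build_waves(graph)
```

Here libA and libB land in the same wave, since neither depends on the other.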

Export your entire project to a Makefile or shell script that can run on any network isolated machine, even if that machine does not have node/npm/opam/crate installed etc.

Nix can't do this, but it can export the transitive closure of dependencies of anything, either source or result, to another machine. That other machine must have Nix, but Nix runs on many OSs and is lightweight. Once you have the closure you don't need the internet to build. Or you can just ship the closure of the build product instead of the source.
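
"Transitive closure" here just means everything reachable by following references. A small Python sketch, with a hypothetical references table standing in for what the store records about each path:

```python
def closure(root, references):
    """Return every path reachable from root by following references."""
    seen = set()
    stack = [root]
    while stack:
        path = stack.pop()
        if path in seen:
            continue
        seen.add(path)
        stack.extend(references.get(path, ()))
    return seen

# Hypothetical reference table: each path lists the paths it refers to.
refs = {
    "app": ["openssl", "zlib"],
    "openssl": ["glibc"],
    "zlib": ["glibc"],
    "glibc": [],
    "unrelated-tool": ["glibc"],
}
to_ship = closure("app", refs)
```

Copying exactly the paths in to_ship to another machine is enough for app to work there; unrelated-tool is left behind because nothing in the closure refers to it.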

pjc is not a build system for any particular compiler. It is a "meta build system", that helps all your individual packages' build systems work in harmony together, and then exposes an approachable human interface to that process.

Same for Nix. It already has setups for building and using the package ecosystems of C++, Node, Ruby, Rust, OCaml, Emacs, Python and some more languages.

pjc is not a package manager. It only requires that package sources be located entirely on disk before it begins working. It follows the npm directory structure convention, but this isn't central to pjc. What is central is that pjc tolerates multiple versions of packages existing simultaneously.

Nix is similar. It can fetch packages from the internet, but you have to specify the exact hash of the download result. There are tools like npm2nix that turn package.json files into Nix specifications of the correct versions and packages to download and build. Nix requires that you specify all versions exactly, but these tools can use solvers to generate Nix files that specify versions that should work together.
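
The "exact hash of the download result" rule can be pictured as below. This is only a sketch of the idea in Python; verify_download and HashMismatch are hypothetical names, and SHA-256 is an assumption chosen for the example.

```python
import hashlib

class HashMismatch(Exception):
    """Raised when fetched bytes do not match the pinned hash."""

def verify_download(data: bytes, expected_sha256: str) -> bytes:
    """Accept downloaded bytes only if they hash to the pinned value."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise HashMismatch(f"expected {expected_sha256}, got {actual}")
    return data

# The package definition pins the hash of the tarball it expects.
tarball = b"pretend this is left-pad-1.1.3.tgz"
pinned = hashlib.sha256(tarball).hexdigest()

ok = verify_download(tarball, pinned)  # passes

try:
    verify_download(tarball + b" (tampered)", pinned)
    tampered_rejected = False
except HashMismatch:
    tampered_rejected = True
```

Pinning the hash is what lets a network fetch be treated as deterministic: either you get the exact bytes you asked for, or the build stops.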

Package managers like yarn could make use of pjc packages' package.json fields in order to perform a more optimal installation (deduping more optimally).

Maybe package managers could do this with the same information from Nix too?

pjc build traverses the dependency graph and automatically runs each package.json's build scripts with a perfectly constructed environment. Each package build will see the right environment variables. Among others, it will see a PATH augmented with binaries built by its dependencies, along with any other variables its dependencies want it to see.

Nix makes sure to isolate things so that they can only use dependencies that are specified properly, so you never mess up. It passes information through environment variables to build scripts.
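
One way to picture the isolation is that the build environment is constructed from scratch from the declared dependencies, instead of being inherited from your shell. The Python sketch below is an illustration only: build_env, the store paths, the placeholder HOME value, and the exported variable are all assumptions, not what Nix or pjc literally do.

```python
def build_env(dep_paths, exported=None):
    """Build a clean environment from declared dependencies only.

    Starts from an empty dict rather than os.environ, so undeclared
    tools on the developer's own PATH are invisible to the build.
    """
    env = {
        "PATH": ":".join(f"{p}/bin" for p in dep_paths),
        "HOME": "/nonexistent",  # placeholder so builds can't read user config
    }
    env.update(exported or {})  # variables that dependencies want to expose
    return env

# Hypothetical dependency store paths and an exported variable.
env = build_env(
    ["/nix/store/aaaa1111-ocaml", "/nix/store/bbbb2222-make"],
    exported={"OCAMLFIND_CONF": "/nix/store/cccc3333-findlib/etc/findlib.conf"},
)
```

A dict like this could be handed to subprocess.run(..., env=env) so the build script sees nothing else.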

pjc also automatically prepares an ocamlfind directory structure lib/bin/doc for each dependency to install itself into. pjc tells you where this is located by setting another special environment variable. (It also generates an ocamlfind.conf).

Nix doesn't do this, but I think you could write a Nix expression that could.

Builds out of source, installs out of source.

Nix can build out of source if you tell it to, or use a "binary cache" if you specify one. This is great, and symlinking still works.

Allows cleaning up of build artifacts trivially, merely by deleting a single _build directory.

Nix does this by default, but you can override that.

pjc can be invoked with arbitrary shell commands

The way development works with Nix is that in a repo you have a default.nix file which describes the dependencies and how to build the project. In that folder you can run nix-shell, which transparently installs the dependencies if you don't have them and puts them in the environment. It also defines some commands that you can use to compile the project the same way Nix would if you asked it to build default.nix. You can now run arbitrary shell commands, do dev work, and build the project by running the build tool directly or through Nix.

Things Nix has that PJC doesn't.

  • A global cache: I saw this mentioned on Twitter but I don't see how it would work properly based on the spec. How do you plan on separating builds of the same package against different versions of the dependencies and with different compile flags?
  • Binary downloads: It's all well and good to always build from source for small projects, but it's really nice when you scale up to be able to install things super quickly, while still being able to customize the build and source if you want.
  • Enforcement: Nix tries really hard to make it difficult to screw up. Your things will be deterministic, nearly guaranteed.
  • It exists right now: Nix already exists and has thousands of things packaged for it and works with many language ecosystems.
  • An OS: NixOS extends Nix to OS configuration and allows you to really easily deploy software packaged with Nix in a deterministic way. Way better than Chef/Puppet.
  • CI and remote builds: Nix can build things transparently on remote machines. It also has a CI server (Hydra) which can distribute builds to a cluster, do tests and act as a binary cache. Imagine each unique version/configuration of your code being built only once ever across the entire company on any machine.

Commentary

I think PJC has a lot of great ideas, but Nix already does a lot of the things that it wants to do, today.

One way to harness this is to implement PJC on Nix. I think the first implementation should do so, to leverage the existing ecosystem to the fullest extent and make things easier.

Or, you can go all in and reimagine PJC as a utility to work with Nix, discarding the parts of the spec that are easy to do directly with Nix, and focusing on being an easy workflow with npm2nix and nix for building packages in development. Nix has a really great story for production builds, but incremental development workflow is where it could use some scripts to make everything tie together nicely.

@jordwalke

A global cache: I saw this mentioned on Twitter but I don't see how it would work properly based on the spec. How do you plan on separating builds of the same package against different versions of the dependencies and with different compile flags?

Right now esy does have a global cache and takes into account the hash of resolved dependencies in order to compute the hash of the build package.

Binary downloads: It's all well and good to always build from source for small projects, but it's really nice when you scale up to be able to install things super quickly, while still being able to customize the build and source if you want.

Our thinking was that the way we cache allows instant downloads simply as a special case of the global cache. If you can just write a tool that pre-floods the local cache with the right artifacts ahead of time, then "building" is instant. It's all about getting the cache keys right.
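
The "binary download as a special case of the cache" idea can be sketched like this in Python. The names build, run_build, and the plain dict used as a cache are hypothetical; this is not esy's or Nix's code, just the shape of the argument.

```python
def build(key, cache, run_build):
    """Return (artifact, did_build). Skip the build when the key is cached."""
    if key in cache:
        return cache[key], False
    artifact = run_build()
    cache[key] = artifact
    return artifact, True

def slow_compile():
    # Stand-in for an expensive compile step.
    return b"compiled bytes"

# Cold cache: the package is actually built.
local_cache = {}
_, built_cold = build("abc123-mylib-1.0", local_cache, slow_compile)

# Cache pre-flooded by some other tool (say, filled from a CI server):
# the same call returns instantly without building.
flooded_cache = {"abc123-mylib-1.0": b"compiled bytes"}
artifact, built_warm = build("abc123-mylib-1.0", flooded_cache, slow_compile)
```

This only works if the key captures everything that affects the output, which is why getting the cache keys right is the whole game.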

Enforcement: Nix tries really hard to make it difficult to screw up. Your things will be deterministic, nearly guaranteed.

esy currently scrubs environment variables and isolates package builds pretty well, but I'm sure there's a ton of edge cases we're missing that nix already handles.

@trishume
Author

trishume commented Jan 4, 2017

Also, is there a "local project" workflow which does not pollute the global environment (aside from caches)?

The Nix store basically is a cache. Environments are actually also items that get built into the Nix store, they are basically just a directory full of symlinks. There is a separate command line tool for managing these "profiles" including building them, updating them and allowing you to roll back to previous versions. On NixOS this is used for managing the entire system, but on other systems this can just manage the "default environment". There's a tool called "nix-env" that allows you to easily manage this environment. nix-env -i package is like npm install -g.
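
A toy Python model of profiles and rollback, to show why rolling back is cheap. The Profile class and its methods are hypothetical, and a dict stands in for the directory of symlinks; real profiles live on disk and have more bookkeeping than this.

```python
class Profile:
    """Toy model: each generation maps a name to a store path."""

    def __init__(self):
        self.generations = [{}]  # generation 0 is the empty environment
        self.current = 0

    def install(self, name, store_path):
        # Copy the current generation and add one entry; old ones stay intact.
        new_gen = dict(self.generations[self.current])
        new_gen[name] = store_path
        self.generations.append(new_gen)
        self.current = len(self.generations) - 1

    def rollback(self):
        # Point back at the previous generation; nothing is rebuilt.
        if self.current > 0:
            self.current -= 1

    def environment(self):
        return self.generations[self.current]

profile = Profile()
profile.install("git", "/nix/store/aaaa1111-git")
profile.install("ripgrep", "/nix/store/bbbb2222-ripgrep")
before = dict(profile.environment())
profile.rollback()
after = dict(profile.environment())
```

Because the store paths themselves never change, switching generations only changes which set of links is active.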

When you tell Nix to build an expression file or directory for development, it places a symlink called ./result in your current directory, pointing to the file or folder produced by the build.

Comment above

Okay so it sounds like esy is basically you trying to rewrite Nix. Nix does basically the same things, except it's been worked on for 10 years now and has a big community already. It has patches and fixes for all sorts of nondeterminism, like unix file creation time attributes and current home directory. It even has an option to do builds in a chroot, or even a VM (which it can create for you).

Side note: one of the cool things about the global cache is that Nix can quickly create containers and VMs that just mount your entire /nix/store/ from the host as read-only and use things from there, so that the VM image can be <1 MB and created instantly. The NixOS Hydra uses this for running tests, like creating a game server and having other VMs join it and play, as an integration test.

@jordwalke

Thanks for the replies! I think the truth is that the majority of the work we've done in esy is actually just integration to make ocamlfind/opam, and package.json work well together, as well as exporting to Makefile. Sounds like each of those could be tools built on top of nix, using nix to power the workflow at the core. We're not at all opposed to doing that, but we would definitely need some nix expertise.

One cool thing about esy is that it is built in a way that puts export first - meaning, when you build a project in the standard development workflow, it generates a Makefile - the same Makefile that you would generate when exporting the project to a CI host - and then runs that makefile locally. I wonder if there's a benefit to making the "export" feature first class, and not bolted on. I also wonder if there's downsides to it.

Either way, it would be cool if you could check out the example project, and esy implementation of pjc, and let us know if you think there's a faster way for us to achieve the desired workflow - given that we have very limited nix experience. If you're interested in helping us port everything over to nix (or know someone who is), we're definitely interested. We hang out in the #packageManagement channel of http://discord.gg/reasonml
