@rehno-lindeque
Last active June 3, 2018 13:15
Experimenting with deferred builds in Nix
# NixOS module fragment; `pkgs` comes from the module arguments.
{ pkgs, ... }:

let
  # Build a systemd service that realises `package` at runtime instead of during a nixos rebuild.
  makeDeferredBuildService = package:
    let
      # Depend on the .drv file itself, but not on its built outputs, so that the system closure
      # does not force the build.
      drvPath = builtins.unsafeDiscardOutputDependency package.drvPath;
      outPath = builtins.unsafeDiscardStringContext package.outPath;
      drvName = package.name;
      gcrootDir = "/nix/var/nix/gcroots/deferred-builds";
      deferredBuildScript =
        with pkgs;
        writeScript "${drvName}-deferred-build" (''
          #!${stdenv.shell}
        ''
        +
        # Realise the build output. Since this could take hours or even days, it is not done during the
        # course of a normal nixos rebuild. Note that if a previous output for this service already exists,
        # it will end up being used in the meantime.
        ''
          output=$(${nix}/bin/nix-store --realise ${drvPath})
        ''
        +
        # Create a garbage collector root for the newly built output so that it will not be gc'd.
        # If a gcroot for a previous output exists, it is replaced, so the old output is now orphaned.
        # `ln -n` replaces the symlink itself instead of following it into the directory it points to.
        #
        # TODO: possibly use system.activationScripts to clean up these gcroots for services that have been
        #       completely removed.
        #
        ''
          mkdir -p ${gcrootDir}
          # Remember the previous output (not used yet).
          previous_model=$(readlink ${gcrootDir}/${drvName})
          ln -sfn "$output" ${gcrootDir}/${drvName}
        ''
        );
    in {
      wantedBy = [ "multi-user.target" ];
      serviceConfig = {
        ExecStart =
          # Take an exclusive lock in order to build only one model at a time.
          # Otherwise, it seems likely we would start thrashing due to the high memory requirements of training.
          #
          # An alternative might be to set --max-jobs 1 and/or --option build-max-jobs 1.
          #
          "${pkgs.utillinux}/bin/flock /var/lock/deferred-builds.lock -c ${deferredBuildScript}";
        Type = "simple";
        Restart = "on-failure";
        RestartSec = "5min";
      };
    };
in
{
  # Both example builds end with `exit 1`, so they fail on purpose and exercise the Restart=on-failure path.
  systemd.services.example-1 =
    makeDeferredBuildService
      (pkgs.runCommand "example-1" {} "echo 'Long running build (1)' ; sleep 10 ; echo 'done (1)' ; exit 1");
  systemd.services.example-2 =
    makeDeferredBuildService
      (pkgs.runCommand "example-2" {} "echo 'Long running build (2)' ; sleep 10 ; echo 'done (2)' ; exit 1");
}
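
The TODO about cleaning up stale gcroots could be approached roughly as follows. This is only a sketch, not something tested as part of this experiment: the `activeNames` list and the `deferredBuildGcroots` attribute name are made up for illustration, and it assumes every gcroot symlink is named after its derivation, as in the build script above.

```nix
{ lib, pkgs, ... }:

let
  gcrootDir = "/nix/var/nix/gcroots/deferred-builds";

  # Hypothetical: names of the deferred builds that are still configured.
  activeNames = [ "example-1" "example-2" ];
in
{
  # Runs on every activation and drops gcroots whose service no longer exists.
  system.activationScripts.deferredBuildGcroots = ''
    if [ -d ${gcrootDir} ]; then
      for link in ${gcrootDir}/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        name=$(${pkgs.coreutils}/bin/basename "$link")
        case " ${lib.concatStringsSep " " activeNames} " in
          *" $name "*) ;;                                  # still configured: keep the root
          *) ${pkgs.coreutils}/bin/rm -f "$link" ;;        # removed: let the output be gc'd
        esac
      done
    fi
  '';
}
```

Removing the symlink only makes the old output eligible for collection; it is actually deleted the next time the garbage collector runs.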