Skip to content

Instantly share code, notes, and snippets.

@isaacs
Created December 9, 2019 06:20
Show Gist options
  • Save isaacs/584fd14e68bffbcfd8f0c86b05d74ff7 to your computer and use it in GitHub Desktop.
Save isaacs/584fd14e68bffbcfd8f0c86b05d74ff7 to your computer and use it in GitHub Desktop.

tree reification

For the tree:

a
+-- b
|   +-- c
|   +-- d
|   |   +-- e
|   +-- f
+-- x
    +-- y
        +-- z

To turn it into:

a
+-- b'
|   +-- c'
|   +-- d
|   |   +-- e'
|   +-- f
+-- x
    +-- y'
        +-- z

That is, update b, c, e, and y, but not d, x, or z.

Considerations

  • Windows will fail with EPERM when trying to move or delete a folder with current (or even recent) file operations happening inside it.
  • Nodes with bundleDependencies also bundle the meta-deps of their bundled deps. But, we don't know what those are from the manifest, and it's possible that there's overlap between bundled metadeps and regular dependencies, meaning that there can be extraneous nodes in the ideal tree once all the tarballs are unpacked.
  • This is exceedingly time-sensitive.
  • Users generally expect that a failed install should not make their app unusable.

Fast and Dirty Approach

This is the fastest and most efficient way to proceed, but does not admit rollbacks very easily in any way that is EPERM friendly on Windows.

step 1: clear away excess

Scan actualTree inventory. For each node:

  • If not in idealTree, rimraf path, remove from actualTree (so we don't check its children).
  • If in ideal tree and identical resolved/integrity, leave it alone.
  • If in ideal tree and different, and no unbundled children, rimraf path, remove from actualTree
  • If in ideal tree and different and has unbundled children,
    • rimraf package contents
    • rimraf bundled children and remove them from tree

step 2: lay the path

mkdir every path in ideal tree that does not exist in actualTree

step 3: extract bundlers

  1. Group all nodes with bundles by depth.
  2. for depth = 0, for all nodes at that depth with bundleDependencies,
    1. If no nodes at that level with bundles still in the tree, done.
    2. extract all nodes with bundles at that depth level in parallel
    3. call loadActual with a new Arborist on each one, and move each depth=1 child node into the ideal tree.
  3. if any bundle deps unpacked, prune tree
  4. pacote.extract all remaining nodes in parallel
  5. rm any leaf node folders not extracted into

step 4: extract all others

Extract all remaining packages in parallel.

Their folders already exist, so it's fine to just dump them all into place.

Staged Rollback-able Process

This algorithm avoids ever renaming a directory, or removing a directory with recent writes (except in the case of failure rollbacks), so as to minimize the chances of hitting Windows file-locking EPERM issues.

It is very safe, and somewhat disk-inefficient.

step 1: move shallowest nodes to be replaced out of the way

a
+-- b (.b-original)
|   +-- c
|   +-- d
|   |   +-- e
|   +-- f
+-- x
    +-- y (.y-original)
        +-- z

Fail: rename each .${name}-original to ${name}

step 2: unpack sparse tree of nodes changing

a
+-- b (.b-original)
|   +-- c
|   +-- d
|   |   +-- e
|   +-- f
+-- b'
|   +-- c'
|   +-- d (empty)
|       +-- e'
+-- x
    +-- y (.y-original)
    |   +-- z
    +-- y'

To maximize parallelization while minimizing unnecessary fetches for bundled deps and meta-deps:

  1. mkdirp all the leaf nodes in the tree in parallel
  2. Group all nodes with bundles by depth.
  3. for depth = 0, for all nodes at that depth with bundleDependencies,
    1. If no nodes at that level with bundles still in the tree, done.
    2. extract all nodes with bundles at that depth level in parallel
    3. call loadActual with a new Arborist on each one, and move each depth=1 child node into the ideal tree.
  4. if any bundle deps unpacked, prune tree
  5. pacote.extract all remaining nodes in parallel
  6. rm any leaf node folders not extracted into

Fail: remove sparse tree, fail step 1

step 3: move unchanging nodes into sparse tree

a
+-- b (.b-original)
|   +-- c
|   +-- d (empty)
|       +-- e
+-- b'
|   +-- c'
|   +-- d
|   |   +-- e'
|   +-- f
+-- x
    +-- y (.y-original)
    +-- y'
        +-- z

This actually means that we move each unchanging node's contents (other than node_mdules) into the new location. (Maybe we ought to only ever move files, not directories?)

Fail: move unchanging nodes back to staged tree, fail step 2

Windows Consideration! Extremely easy for a failure in this step to lead to EPERM in the rollback, if we try to rimraf the sparse tree before we're fully moved out of it.

step 4: rimraf staged original (now sparse) nodes

a
+-- b'
|   +-- c'
|   +-- d
|   |   +-- e'
|   +-- f
+-- x
    +-- y'
        +-- z

Fail: report failure as a warning and instruct user to rm -rf the garbage directory.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment