@ehaynes99
Last active August 31, 2022 14:34
Workspaces Module Resolution

Workspaces were originally introduced by yarn. Starting with version 7, npm supports them as well.

What is it?

Workspaces are the package manager's first-class support for monorepos of npm packages. A monorepo is simply a collection of related packages that share a single git repo. Sharing a repo can eliminate a lot of redundancy in the project setup and makes it easier to work with interdependencies during development.

What does a workspaces project look like?

They're relatively simple. They consist of:

  • a parent directory whose package.json contains a list (or wildcard pattern) defining subprojects
  • one or more subprojects containing a package.json file similar to a regular npm package.
// <root>/package.json
{
  "workspaces": [
    "package-a", // `./package-a` is a subproject
    "package-b", // `./package-b` is a subproject
    "libs/*"     // each subdirectory of `./libs` is a subproject
  ]
}
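
As a rough illustration of how a root package.json like this maps to subprojects, here is a minimal sketch that expands the `workspaces` entries into directories. It is not how npm or yarn implement it: it only handles a trailing `/*` wildcard, and the `lib-x` and `lib-y` directory names are made up for the example.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Build a throwaway project on disk. `lib-x` and `lib-y` are hypothetical names.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'workspaces-'));
for (const dir of ['package-a', 'package-b', 'libs/lib-x', 'libs/lib-y']) {
  fs.mkdirSync(path.join(root, dir), { recursive: true });
  fs.writeFileSync(
    path.join(root, dir, 'package.json'),
    JSON.stringify({ name: path.basename(dir), version: '1.0.0' })
  );
}
fs.writeFileSync(
  path.join(root, 'package.json'),
  JSON.stringify({ name: 'root', private: true, workspaces: ['package-a', 'package-b', 'libs/*'] })
);

// Simplified expansion: an entry is either a literal directory or `<dir>/*`.
function listSubprojects(rootDir) {
  const manifest = JSON.parse(fs.readFileSync(path.join(rootDir, 'package.json'), 'utf8'));
  const found = [];
  for (const pattern of manifest.workspaces) {
    if (pattern.endsWith('/*')) {
      const parent = pattern.slice(0, -2);
      for (const entry of fs.readdirSync(path.join(rootDir, parent), { withFileTypes: true })) {
        if (entry.isDirectory()) found.push(`${parent}/${entry.name}`);
      }
    } else {
      found.push(pattern);
    }
  }
  // Only directories that contain a package.json count as subprojects.
  return found.filter((dir) => fs.existsSync(path.join(rootDir, dir, 'package.json'))).sort();
}

const subprojects = listSubprojects(root);
console.log(subprojects);
// [ 'libs/lib-x', 'libs/lib-y', 'package-a', 'package-b' ]
```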

Why use them?

For starters, they allow for a single definition of the various project-level configurations like linting, gitignore, test configuration, etc. Additionally, all devDependencies can live in the root package.json, so they are not duplicated amongst the subprojects. However, the most important reason is the way they deal with node_modules and the resulting impact on developing with subproject interdependencies.

TL;DR

With workspaces, your project will hoist node_modules (including symlinks to subprojects) and install them at the top level like this:

.
├── node_modules
│   ├── package-b -> ../packages/package-b/
│   └── some-library
└── packages
    ├── package-a
    └── package-b

Traditional npm link based alternatives created a structure like this:

.
├── package-a
│   └── node_modules
│       ├── package-b -> ../../package-b/
│       └── some-library
└── package-b
    └── node_modules
        └── some-library

Long-winded description of the main problem with alternatives

Monorepos existed before the workspaces feature was added to package managers. The main problem with developing packages this way was handling interdependencies in the local environment.

How interdependencies work in production

First, let's look at what we actually want as an end result when we publish interdependent packages. Say we have 2 packages: package-a and package-b. package-a depends on package-b. Furthermore, package-b depends on a third-party library, some-library. The file structure of these as separate packages in your dev environment would look something like:

.
├── package-a
│   └── node_modules
│       └── package-b
└── package-b
    └── node_modules
        └── some-library

If we publish both of our packages, some application that wants to use package-a would add it as a dependency:

"dependencies": {
  "package-a": "^1.0.0"
}

When we npm install, its node_modules would look like this:

some-application
└── node_modules
    ├── package-a
    ├── package-b
    └── some-library

npm flattens all modules into one directory. The project depended directly on package-a, so that gets installed. package-a depends on package-b, so that gets installed too. Finally, package-b depends on some-library, so we install that as well, and the result is that all of those are peers in node_modules. This is the desired effect when the library is released.
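
To make the flattening concrete, here is a toy sketch that walks a dependency graph and collects every transitive dependency into one flat list. The `dependencies` table is a hand-written stand-in for the example packages, and the sketch ignores versions entirely, so it is only a model of the idea, not of npm's actual installer.

```javascript
// Hypothetical dependency graph mirroring the example above (no versions).
const dependencies = {
  'some-application': ['package-a'],
  'package-a': ['package-b'],
  'package-b': ['some-library'],
  'some-library': [],
};

// Collect every direct and transitive dependency as peers in one flat list.
function flatNodeModules(app) {
  const installed = new Set();
  const queue = [...dependencies[app]];
  while (queue.length > 0) {
    const name = queue.shift();
    if (installed.has(name)) continue; // already installed once, skip duplicates
    installed.add(name);
    queue.push(...dependencies[name]);
  }
  return [...installed].sort();
}

const installedPackages = flatNodeModules('some-application');
console.log(installedPackages);
// [ 'package-a', 'package-b', 'some-library' ]
```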


How we want the development environment to work

Since these are closely related, we want to be able to work on both at the same time. Every time something changes in package-b, we need to pick up these changes in package-a to be able to test it. Using separate projects, we would need to publish package-b to an npm repository, then update the dependency in package-a. This can get really unwieldy, so ideally we have a way to test the changes before publishing anything. This is where workspaces come in.

To describe the advantage of workspaces, let's first examine the problems of the alternatives. There are numerous ways of achieving this behavior: npm link, yalc, npm local paths, etc. These all function in roughly the same way. In package-a, instead of downloading each new version from the npm repository, we symlink the dependency within node_modules:

.
├── package-a
│   └── node_modules
│       └── package-b -> ../../package-b/
└── package-b
    └── node_modules
        └── some-library

Now, when we change something in package-b, we can rebuild it, and that change will immediately be available in package-a. Everything works, and they all lived happily ever after.

Just kidding.

This works... sometimes. The main problem with this strategy is transitive dependency conflicts. As all of the above methods use a similar solution, they all suffer from the same problem.


Let's say we really want to live on the edge and have package-a depend on some-library as well. If we publish our libraries like this, in production, the end result is exactly the same:

some-application
└── node_modules
    ├── package-a
    ├── package-b
    └── some-library

npm detects the duplicate dependency, and it's installed just like before*. However, in our DEVELOPMENT environment, we now have this:

.
├── package-a
│   └── node_modules
│       ├── package-b -> ../../package-b/
│       └── some-library
└── package-b
    └── node_modules
        └── some-library

Now, we have 2 distinct copies of some-library on disk. Let's assume that they're the exact same version, and are thus completely identical copies. Disk is cheap these days, so what's the problem?

Well, in this situation, code that imports some-library that's part of package-a will load the version at package-a/node_modules/some-library, and code that's part of package-b will load the version at package-b/node_modules/some-library. Sometimes, this is no problem. The imported code is the same, and the behavior is the same. However, the actual VALUES exported from some-library are distinct.

To illustrate, say that some-library consists of just one file:

// some-library/index.js

export const cache = {}

package-b references the cache value, e.g.

// package-b/index.js

import { cache } from 'some-library'

export const getCached = (key) => {
  const value = cache[key]
  return `${key} -> ${value}`
}

And finally, package-a refers to getCached:

// package-a/index.js

import { cache } from 'some-library'
import { getCached } from 'package-b'

cache['foo'] = 'oof'

console.log('cached value:', getCached('foo'))

So what happens when we run package-a/index.js?

$ node index.js 
cached value: foo -> undefined

What happened? Step by step:

  • package-a imported cache from package-a/node_modules/some-library
  • package-a set the cache value for 'foo' to 'oof'
  • package-b imported a DIFFERENT value of cache from package-b/node_modules/some-library
  • package-b read the cache value for 'foo' from that separate object, which resulted in undefined.
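
If you want to see this without setting up real packages, the following sketch rebuilds the linked layout in a temp directory and runs it. It assumes CommonJS (`require`/`exports`) instead of the ES module syntax above so that it works as a single standalone script, and it assumes node's default symlink handling (no `--preserve-symlinks`).

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// realpathSync so the paths match what node reports after resolving symlinks.
const root = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'linked-')));

function writeFile(relativePath, source) {
  const file = path.join(root, relativePath);
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, source);
}

// Two identical copies of some-library, one per package.
const librarySource = 'exports.cache = {};\n';
writeFile('package-a/node_modules/some-library/index.js', librarySource);
writeFile('package-b/node_modules/some-library/index.js', librarySource);

writeFile('package-b/index.js', [
  "const { cache } = require('some-library');",
  'exports.getCached = (key) => `${key} -> ${cache[key]}`;',
  '',
].join('\n'));

writeFile('package-a/index.js', [
  "const { cache } = require('some-library');",
  "const { getCached } = require('package-b');",
  "cache['foo'] = 'oof';",
  "module.exports = getCached('foo');",
  '',
].join('\n'));

// Stand-in for `npm link`: package-a/node_modules/package-b -> package-b
fs.symlinkSync(
  path.join(root, 'package-b'),
  path.join(root, 'package-a/node_modules/package-b'),
  'junction' // only matters on Windows; ignored elsewhere
);

const linkedResult = require(path.join(root, 'package-a/index.js'));
console.log('cached value:', linkedResult);
// cached value: foo -> undefined
```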

While this is a contrived example, there are many real cases where code depends on the identity of a value that a library exports, and duplicate copies cause cryptic issues that are hard to track down. Some examples:

  • anything used as a Map key
  • Symbols
  • instanceof checks
  • demonic Java features shoehorned into TypeScript... err, I mean, @Decorators and all of the reflect-metadata mess on which they rely
  • TypeScript class type definitions
  • Static members in classes
  • React Context objects
  • The entire React 16.x render tree

The worst part about this is that it sort of works. You don't get warnings or errors or failures to start. The code runs, but doesn't work correctly.
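
A few of the cases in that list can be simulated in a single file. In this sketch, calling `evaluateSomeLibrary` twice stands in for node evaluating two separate copies of the same library source. The `HttpError` class and `TOKEN` symbol are hypothetical exports invented for the illustration.

```javascript
// Stand-in for evaluating some-library's source once per copy on disk.
function evaluateSomeLibrary() {
  class HttpError extends Error {}
  const TOKEN = Symbol('token');
  return { HttpError, TOKEN };
}

const copyInPackageA = evaluateSomeLibrary();
const copyInPackageB = evaluateSomeLibrary();

// package-b throws an error, package-a tries to recognize it.
const error = new copyInPackageB.HttpError('boom');
const instanceofSameCopy = error instanceof copyInPackageB.HttpError;
const instanceofAcrossCopies = error instanceof copyInPackageA.HttpError;

// Symbols (and anything else used as a Map key) are distinct per copy too.
const symbolsMatch = copyInPackageA.TOKEN === copyInPackageB.TOKEN;
const registry = new Map([[copyInPackageA.TOKEN, 'registered']]);
const lookupAcrossCopies = registry.get(copyInPackageB.TOKEN);

console.log({ instanceofSameCopy, instanceofAcrossCopies, symbolsMatch, lookupAcrossCopies });
// {
//   instanceofSameCopy: true,
//   instanceofAcrossCopies: false,
//   symbolsMatch: false,
//   lookupAcrossCopies: undefined
// }
```

Nothing here throws, which is exactly why these bugs are hard to spot.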

How does the workspaces structure resolve this?

The answer lies in how node resolves dependencies. When node resolves an import of a package name, it does a hierarchical traversal. Starting from the directory that contains the importing file, it looks for a node_modules directory. If one exists and contains the dependency, it loads the module from there. If not, it steps up a directory and repeats until it reaches the filesystem root. Thus, in the workspaces layout shown earlier, when packages/package-a/index.js imports some-library, it searches:

  • <root>/packages/package-a/node_modules/some-library - doesn't exist
  • <root>/packages/node_modules/some-library - doesn't exist
  • <root>/node_modules/some-library - Found it!
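
The lookup order above can be sketched as a small function that walks up from the importing file. This is a simplification of node's real algorithm (it leaves out global folders and other special cases), and `/repo` is a made-up stand-in for `<root>`. It uses `path.posix` so the output is the same on every platform.

```javascript
const path = require('node:path').posix;

// List every location node would check for `name`, nearest directory first.
function candidateDirs(fromFile, name) {
  const candidates = [];
  let dir = path.dirname(fromFile);
  while (true) {
    candidates.push(path.join(dir, 'node_modules', name));
    const parent = path.dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  return candidates;
}

const searched = candidateDirs('/repo/packages/package-a/index.js', 'some-library');
console.log(searched);
// [
//   '/repo/packages/package-a/node_modules/some-library',
//   '/repo/packages/node_modules/some-library',
//   '/repo/node_modules/some-library',
//   '/node_modules/some-library'
// ]
```

The first candidate that exists on disk wins, so the later entries are never checked once `/repo/node_modules/some-library` is found.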

Similarly, when package-b imports some-library, the same process happens, and the same (and only) copy on disk is resolved. Node caches each module by its resolved file path, so it is only loaded once, and all subsequent imports get the same in-memory reference.

Similarly, when package-a imports package-b:

  • <root>/packages/package-a/node_modules/package-b - doesn't exist
  • <root>/packages/node_modules/package-b - doesn't exist
  • <root>/node_modules/package-b -> <root>/packages/package-b - This is a symlink to the package-b subproject, but that's fine. It's loaded just as if it were installed here from an npm repository.
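
To check that the hoisted layout really fixes the cache example, this sketch builds the workspaces structure by hand in a temp directory: one copy of some-library at the top level, plus a symlink to package-b. As a simplification it uses CommonJS and creates the files directly instead of running a package manager, so treat it as a model of the resulting layout only.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// realpathSync so the paths match what node reports after resolving symlinks.
const root = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'hoisted-')));

function writeFile(relativePath, source) {
  const file = path.join(root, relativePath);
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, source);
}

// The single, hoisted copy of some-library.
writeFile('node_modules/some-library/index.js', 'exports.cache = {};\n');

writeFile('packages/package-b/index.js', [
  "const { cache } = require('some-library');",
  'exports.getCached = (key) => `${key} -> ${cache[key]}`;',
  '',
].join('\n'));

writeFile('packages/package-a/index.js', [
  "const { cache } = require('some-library');",
  "const { getCached } = require('package-b');",
  "cache['foo'] = 'oof';",
  "module.exports = getCached('foo');",
  '',
].join('\n'));

// <root>/node_modules/package-b -> <root>/packages/package-b
fs.symlinkSync(
  path.join(root, 'packages/package-b'),
  path.join(root, 'node_modules/package-b'),
  'junction' // only matters on Windows; ignored elsewhere
);

const hoistedResult = require(path.join(root, 'packages/package-a/index.js'));
console.log('cached value:', hoistedResult);
// cached value: foo -> oof
```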
@jcardali

I assume this is a solved case, but what happens if package-a and package-b have different versions of dependency-c? I guess that would live in the respective packages node_modules, as opposed to top-level one?

@ehaynes99
Author

ehaynes99 commented Feb 18, 2022

If there is a conflict between the versions, it will:

  • scan all of the packages to find the most common version
  • install the most common version in <root>/node_modules
  • for the packages using other versions, it will be installed in their respective node_modules

So e.g. if package-a and package-c have the same version*, but package-b uses an incompatible version, it will end up with a structure like:

.
├── node_modules
│   └── some-library  <-- 2.0.0
└── packages
    ├── package-a
    ├── package-b
    │   └── node_modules
    │       └── some-library   <-- 1.0.0
    └── package-c

If they are equally common, it looks like it hoists the newer version.

* technically not "same version", but rather "compatible versions". E.g. if one package declares ^2.0.1 and another declares ^2.0.3, then version 2.0.3 (or any newer 2.x release) satisfies both.
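
To make "compatible versions" concrete, here is a hand-rolled sketch of caret matching. It is deliberately simplified: it assumes plain MAJOR.MINOR.PATCH versions with a major of 1 or higher, and it is not the matching code that npm itself uses.

```javascript
// Simplified caret check: same major version, and not older than the minimum.
// Assumes MAJOR >= 1 and no prerelease tags.
function satisfiesCaret(version, range) {
  const [major, minor, patch] = version.split('.').map(Number);
  const [minMajor, minMinor, minPatch] = range.replace(/^\^/, '').split('.').map(Number);
  if (major !== minMajor) return false;
  if (minor !== minMinor) return minor > minMinor;
  return patch >= minPatch;
}

const ranges = ['^2.0.1', '^2.0.3'];
const satisfiesBoth = (version) => ranges.every((range) => satisfiesCaret(version, range));

console.log(satisfiesBoth('2.0.3')); // true
console.log(satisfiesBoth('2.0.2')); // false, too old for ^2.0.3
console.log(satisfiesBoth('2.4.0')); // true
console.log(satisfiesBoth('3.0.0')); // false, different major version
```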
