Why we dropped Lerna from PouchDB
We dropped Lerna from our monorepo architecture in PouchDB 6.0.0. I got a question about this from @reconbot, so I thought I'd explain our reasoning.
First off, I don't want this post to be read as "Lerna sucks, don't use Lerna." We started out using Lerna, but eventually outgrew it once we had written our own custom tooling. Lerna is still a great idea if you're getting started with monorepos (monorepi?).
Second off, it's good to understand why you might want a monorepo in the first place:
- Contributions are easier (no more "this issue is in the wrong repo, please post it here")
- Cross-package changes are easier (just make one git commit)
- Testing is easier (test the whole thing, don't make separate test suites or copy tests)
Third off, it's worth understanding why you might want separate packages instead of sub-modules (e.g. require('lodash.uniq') instead of require('lodash/uniq')). In the case of PouchDB, 99% of the reason we wanted separate packages was because the main pouchdb module has a dependency on leveldown, which is a native module that takes foreeeeeever to npm install and may actually fail on certain architectures. Ditto sqlite3. Combine that with the fact that we have lots of different adapters that you can use to mix-and-match your own custom build, and the fact that plugins frequently want to re-use little bits of functionality from PouchDB core (such as the ajax() module), and publishing separate packages suddenly makes a lot of sense.
OK, so here's the deal with Lerna. Basically Lerna has three different steps:
- lerna bootstrap, which links all of your sub-packages together so you can easily test them without a lot of npm linking. Lerna does this by creating a separate node_modules for each sub-package, then inserting pseudo-packages that simply require() the real package. (E.g. packages/a/node_modules/b/index.js will simply contain module.exports = require('../../../b');.) This is a neat trick that avoids a lot of npm linking (which in my experience can be very faily w.r.t. circular dependencies and have surprising side effects due to symbolic links).
- lerna run, which you would normally use to run your build inside each sub-package, e.g. lerna run build will run npm run build inside each sub-package.
- lerna publish, which publishes all packages to npm and does some other magic to update git tags, etc.
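To make the bootstrap trick concrete, here is a rough sketch of the stub approach. This is not Lerna's actual code: linkWithStub and the throwaway packages/a and packages/b layout are made up for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper (not a Lerna API): make `depName` requirable from
// `pkgName` by writing a one-line stub into pkgName's own node_modules.
function linkWithStub(packagesDir, pkgName, depName) {
  const stubDir = path.join(packagesDir, pkgName, 'node_modules', depName);
  fs.mkdirSync(stubDir, { recursive: true });
  // Relative path from the stub's folder back to the real package,
  // normalized to forward slashes so it is a valid require() specifier.
  const target = path
    .relative(stubDir, path.join(packagesDir, depName))
    .split(path.sep)
    .join('/');
  fs.writeFileSync(
    path.join(stubDir, 'index.js'),
    `module.exports = require(${JSON.stringify(target)});\n`
  );
  return target;
}

// Demo in a temp folder: package "a" depends on package "b".
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'stub-demo-'));
const packagesDir = path.join(root, 'packages');
fs.mkdirSync(path.join(packagesDir, 'a'), { recursive: true });
fs.mkdirSync(path.join(packagesDir, 'b'), { recursive: true });
fs.writeFileSync(path.join(packagesDir, 'b', 'index.js'), "module.exports = 'hello from b';\n");
fs.writeFileSync(path.join(packagesDir, 'a', 'index.js'), "module.exports = require('b');\n");

const stubTarget = linkWithStub(packagesDir, 'a', 'b');
const resultFromA = require(path.join(packagesDir, 'a'));
console.log(stubTarget, resultFromA);
```

No symlinks are involved: "a" finds "b" through an ordinary file in its own node_modules, and that file forwards to the real package.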
Let's dissect each one of those steps:
- For lerna bootstrap, we were actually using this in PouchDB, and this was the main benefit we were getting out of Lerna.
- For lerna run, we were originally using it to run Rollup in each sub-package, but quickly realized that with ~30 packages, spawning a separate Node process for each one (i.e. doing npm run build 30 times) was too slow. It made more sense to just write one big build.js script that built each sub-package inside of a single Node process. I can't remember the speedup, but it was something like 60 seconds vs 5 seconds (those numbers are completely made up).
- For lerna publish, we actually don't use Lerna's "independent" mode (I originally wrote that this is what Babel uses; correction: Babel uses "locked" mode, see comment below). Independent mode would mean that every sub-package would have its own semver and would get updated accordingly when its dependencies got updated, but we figured this would be way too complicated for PouchDB users, and it was simpler to just lock everything to a single version. Therefore we didn't really need lerna publish. We could just run npm publish in a loop, and that was good enough (along with a script to update the version number in every package.json, which is equally easy to write).
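For a sense of how little code the "lock everything to one version and publish in a loop" approach needs, here is a minimal sketch. It is not PouchDB's actual release script: the function names and the pkg-one/pkg-two demo packages are invented, and the demo swaps the real npm publish call for a recorder so nothing touches the registry.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { execSync } = require('child_process');

// List every sub-package directory that contains a package.json.
function listPackageDirs(packagesDir) {
  return fs.readdirSync(packagesDir)
    .map((name) => path.join(packagesDir, name))
    .filter((dir) => fs.existsSync(path.join(dir, 'package.json')))
    .sort();
}

// Lock every sub-package to the same version number.
function setVersionEverywhere(packagesDir, version) {
  for (const dir of listPackageDirs(packagesDir)) {
    const file = path.join(dir, 'package.json');
    const pkg = JSON.parse(fs.readFileSync(file, 'utf8'));
    pkg.version = version;
    fs.writeFileSync(file, JSON.stringify(pkg, null, 2) + '\n');
  }
}

// "npm publish in a loop". `run` is injectable (an assumption of this
// sketch) so the demo below can avoid really publishing anything.
function publishAll(
  packagesDir,
  run = (dir) => execSync('npm publish', { cwd: dir, stdio: 'inherit' })
) {
  for (const dir of listPackageDirs(packagesDir)) {
    run(dir);
  }
}

// Demo against two fake packages in a temp folder.
const demoPackages = fs.mkdtempSync(path.join(os.tmpdir(), 'publish-demo-'));
for (const name of ['pkg-one', 'pkg-two']) {
  fs.mkdirSync(path.join(demoPackages, name));
  fs.writeFileSync(
    path.join(demoPackages, name, 'package.json'),
    JSON.stringify({ name, version: '0.0.0' })
  );
}

setVersionEverywhere(demoPackages, '6.0.0');
const published = [];
publishAll(demoPackages, (dir) => published.push(path.basename(dir)));
const versions = listPackageDirs(demoPackages).map(
  (dir) => JSON.parse(fs.readFileSync(path.join(dir, 'package.json'), 'utf8')).version
);
console.log(published, versions);
```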
So that leaves us with lerna bootstrap. After talking with Stephan Boennemann, though, and reading his Alle proposal, I realized we could avoid it entirely by simply renaming the packages/ folder to packages/node_modules. Because of how the require() algorithm works, any reference to e.g. require('pouchdb-ajax') from within packages/node_modules/pouchdb will resolve to the sibling package in that same folder: require() just walks up the file tree until it finds a node_modules folder with a sub-folder that matches the package name. This cuts out the lerna bootstrap step, which shaved about 30 seconds off of our npm install time (which is huge when we have dozens of Travis builds).
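You can check the walk-up behavior for yourself with a throwaway folder. The sketch below builds a tiny packages/node_modules layout in a temp directory; the two index.js files are stand-ins I made up, not the real PouchDB sources.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// realpathSync because require() reports real paths, and the temp folder
// may sit behind a symlink on some systems.
const root = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'alle-demo-')));
const modulesDir = path.join(root, 'packages', 'node_modules');

function writeFile(file, contents) {
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, contents);
}

// Two sibling "packages" living directly inside packages/node_modules.
writeFile(
  path.join(modulesDir, 'pouchdb-ajax', 'index.js'),
  "module.exports = 'stand-in for pouchdb-ajax';\n"
);
writeFile(
  path.join(modulesDir, 'pouchdb', 'index.js'),
  "module.exports = require('pouchdb-ajax');\n"
);

// Ask Node where 'pouchdb-ajax' resolves to when required from inside
// packages/node_modules/pouchdb ...
const resolved = require.resolve('pouchdb-ajax', {
  paths: [path.join(modulesDir, 'pouchdb')],
});
// ... and actually load it through the sibling, with no linking step at all.
const loaded = require(path.join(modulesDir, 'pouchdb'));

console.log(path.relative(root, resolved));
console.log(loaded);
```

There is no bootstrap step anywhere in there: the folder name alone is what makes the sibling visible.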
Using the "Alle" model also allowed us to move all of the sub-packages' dependencies up to the top-level package.json, which worked around a current Greenkeeper limitation: it doesn't work with monorepos if your dependencies are declared anywhere but the package.json in the repo root. But by having our dependency versions declared at the top level (and a script to update the sub-packages' package.jsons right before publishing), we can continue using Greenkeeper like normal. (Again, this works because of how require() works; it just keeps walking up until it finds the right node_modules.) As an added bonus, we don't have to try to coordinate the versions of dependencies used by sub-packages (which is a real problem we ran into).
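The "update the sub-packages right before publishing" script could look something like the sketch below. This is a guess at the shape, not our real script: fillDependencyVersions is a made-up name, and it assumes each sub-package lists its dependency names with a placeholder version, which may not match how you track them.

```javascript
// Hypothetical helper: fill in a sub-package's dependency versions from the
// top-level package.json right before publishing.
function fillDependencyVersions(rootPkg, subPkg, siblingNames) {
  const dependencies = {};
  for (const name of Object.keys(subPkg.dependencies || {})) {
    if (siblingNames.includes(name)) {
      // First-party packages are locked to the monorepo's single version.
      dependencies[name] = rootPkg.version;
    } else if (rootPkg.dependencies && rootPkg.dependencies[name]) {
      // Third-party versions are declared once, at the top level.
      dependencies[name] = rootPkg.dependencies[name];
    } else {
      throw new Error(`${name} is not declared in the top-level package.json`);
    }
  }
  return { ...subPkg, version: rootPkg.version, dependencies };
}

// Demo with invented package data.
const rootPkg = {
  name: 'monorepo-root',
  version: '6.0.0',
  dependencies: { 'some-lib': '^1.2.3' },
};
const subPkg = {
  name: 'pkg-one',
  dependencies: { 'some-lib': '*', 'pkg-two': '*' },
};
const readyToPublish = fillDependencyVersions(rootPkg, subPkg, ['pkg-one', 'pkg-two']);
console.log(JSON.stringify(readyToPublish, null, 2));
```

Because every version comes from one place, two sub-packages can never end up publishing with different versions of the same dependency.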
So that's it! We switched to the "Alle" model because it worked better for us. On the other hand, I wouldn't discourage anybody from using Lerna, because it provides a lot of good out-of-the-box tools for working with monorepos, and if you have a more complex setup than ours (e.g. you're using independent versioning), then it can save you a lot of boilerplate. And even if you find you can speed up your builds by removing Lerna and going custom, there's no reason not to start with a Lerna-style build system.