conartist6/post.md

## post.md

I'm working on a brand new build system, and I want to take a moment to address the obvious question, which is: OMGWTFBBQ why another one!?
The answer is in the trees.
The new tool, Macromé (pronounced "macro-may", imported as macrome), is powerful because it performs in-place builds within existing directory tree structures, so that many types of data about the same underlying unit stay colocated.
For example, a React component expressed as a tree-structured directory (a.k.a. pod, or unit) and built by macrome might look like this:
[projectRoot]
├─ src/
│ └─ my-component/       # my-component is a unit
│    ├─ __tests__/
│    │  ├─ unit.spec.js
│    │  └─ integration.spec.js
│    ├─ index.ts
│    ├─ index.js         # generated from index.ts
│    ├─ index.js.map     # generated from index.ts
│    ├─ index.d.ts       # generated from index.ts
│    ├─ index.module.scss
│    ├─ index.css           # generated from index.module.scss
│    └─ index.module.json   # generated from index.module.scss
└─ macrome.config.js

Why is this a pain?

Your IDE may end up with many open tabs labelled index.*. That's ugly. Some IDEs will make it look better than others. It may also be more difficult to use your IDE to search only in non-generated files. But IDEs can always adapt, and relatively freely so, based on the needs of their users.
Why is this good?

There are several reasons. Let's look at them!
Better discoverability

Having a single directory with every kind of data relating to a particular code unit facilitates discovery of all relevant parts of a system for first-time readers. Implementing engineers and code reviewers can both use the file structure to eliminate part of a class of bugs whose root cause is a failure to discover relevant data that should have been updated.
Many projects now embrace this theory of organization and its benefits, but they stop short of applying it to their generated files -- their build assets -- which stand to benefit just as much. In fact build assets benefit more, because build output is frequently hidden in memory (see: ts-node), squashed in a huge file (see: webpack), mangled (see: uglify), or sourcemapped out of sight. This is a crying shame because the built output is your product, not the sources. Engineers should be seeing built output, and they should understand it. Having that insight can enable engineers to approach more code problems with build-based solutions, which ultimately facilitate the management of more complex and feature-complete codebases.
A quick case study: if you've ever tried to ship an HTML component as an npm package, you've probably run up against some difficulties. The tutorials I've seen skate right over shipping stylesheets, which is a shame because that's the hardest part. I won't spend a lot of time discussing this particular problem, in part because there is no one correct solution -- but for those interested I've written up a gist that shows how CSS modules can be used to achieve a desirable built result. If you looked at the example, perhaps you learned something about how CSS modules work? Or perhaps seeing the structure of the data made you question how certain cases (like nested SCSS selectors) are handled? Good! Now you get it. That's discoverability.
Easier builds

Having a single directory permits files to reference each other using relative paths. Transforming such relative paths correctly during a build is much easier. In my-component above, for example, a build step applied to index.ts might find the line import styles from './index.module.scss'. The build system here might convert this into: import styles from './index.module.json'. Code to make such a transformation is simple, and has no concerns like configuration of a directory in which to expect other build output.
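
To make that concrete, here is a minimal sketch of such a transformation. It is not macrome's implementation: the function name, the suffix table, and the regex-based approach are all assumptions for illustration, and a real build step would more likely work on a parsed AST than on raw text.

```typescript
// Hypothetical table of source suffixes and the generated suffixes that replace them.
// Only the .module.scss -> .module.json pair comes from the example in this post.
const GENERATED_SUFFIXES: Array<{ from: string; to: string }> = [
  { from: ".module.scss", to: ".module.json" },
];

// Rewrites relative import specifiers that point at a file with a generated counterpart.
// Bare specifiers such as 'react' never match the pattern and are left alone.
function rewriteRelativeImports(source: string): string {
  const importPattern = /(from\s+)(['"])(\.{1,2}\/[^'"]+)\2/g;
  return source.replace(
    importPattern,
    (match: string, prefix: string, quote: string, specifier: string): string => {
      for (const suffix of GENERATED_SUFFIXES) {
        if (specifier.slice(-suffix.from.length) === suffix.from) {
          const stem = specifier.slice(0, -suffix.from.length);
          return prefix + quote + stem + suffix.to + quote;
        }
      }
      return match;
    },
  );
}
```

Note that nothing here needs to know where build output lives: because the generated file sits next to its source, swapping the suffix is the whole job.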
Better testing

By producing as much of your build output as possible in place, your unit tests can become more valuable. You can combine the benefits of testing a unit in isolation -- rapid test evaluation and feedback -- with the benefits of testing built output. Testing built output means the test code and environment will more closely resemble the application code and environment, making the test a better predictor of future bugs. When a test fails and emits a stack trace you can see real filenames and correct line numbers.
Source control

In addition to the reasons above, there's one other great benefit of built output in place. You have the option of checking the built output into source control. There are benefits and drawbacks to doing this. Drawbacks include additional overhead for your source control system to manage, and some distortion of naively-computed statistics like lines of code changed. There's also the possibility of checking in incorrect or stale built output, though macrome attempts to solve that problem for you (with macrome check). BUT there are tremendous benefits as well.
When altering the build itself, for example, you'll get a whole diff showing changes to the built output. It can be very helpful to scan this diff. Are the changes what you expected? This is yet another layer of luxury for developers attempting to make build changes.
A second benefit to checking in the built assets is that doing so will likely allow the resultant git commit to be used as an ad-hoc npm package. This is tremendously good news for contributors to your repository. Users who fork your codebase can push changes to their fork, at which point testing their changes (or sharing the changes with others for experimentation) is trivial. For contributor octocat, using such a package can be as simple as going to dependencies in package.json and changing "great-pkg": "1.2.3" to "great-pkg": "octocat/great-pkg#ab3d8f0". This saves a good deal of work that would otherwise be involved, like changing the package name (e.g. by namespacing it) in order to publish a release, a change that is then annoying to keep out of the PR.
Macrome

Ok, perhaps I've convinced you now that it is wise to structure your code by its functionality. What about this calls for a whole new build system? For the record the existing tool that macrome is most closely related to is eslint, and there is much that macrome has intentionally borrowed or stolen from it.


Cleanup: existing build systems usually only know how to clean up their built output by deleting the build target directory. Or perhaps they assume that every file of a certain type is temporary. If all your sources are TypeScript it's safe to rm *.js. Of course, maybe it isn't. This is likely to lead to one of two outcomes: accidentally deleting a file you had work in, or failing to delete a file which is no longer being built. I've experienced both. Macrome marks all files it generates with a leading annotative comment that it recognizes -- that way it can always clean up after itself.
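
The idea behind annotation-based cleanup can be sketched in a few lines. The marker string, the FileEntry shape, and both function names below are hypothetical; this post does not show the actual comment macrome writes, so treat this only as an illustration of the principle.

```typescript
// Hypothetical marker; the real annotation macrome writes is not shown in this post.
const GENERATED_MARKER = "@generated-by-example";

interface FileEntry {
  path: string;
  content: string;
}

// True only when the file *starts* with a block comment that contains the marker.
// A marker appearing later in the file does not count, so hand-written files that
// merely mention it are never treated as generated.
function hasGeneratedHeader(content: string): boolean {
  const header = /^\s*\/\*[\s\S]*?\*\//.exec(content);
  return header !== null && header[0].indexOf(GENERATED_MARKER) !== -1;
}

// Returns the paths that are safe to delete during cleanup.
function selectForCleanup(files: FileEntry[]): string[] {
  return files
    .filter((file) => hasGeneratedHeader(file.content))
    .map((file) => file.path);
}
```

Because the decision is made per file from its contents, it does not matter which directory a file lives in or what extension it has: a hand-written index.js next to a generated index.js.map is left alone.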


Watcher longevity: If you're developing on a project you can just run macrome watch and forget about it, even as you move freely through your repo's history. Again this is tremendously useful when creating build systems that are designed to change. It's easy enough for you to remember to restart your build watcher after every change you make to the build configuration, but how about others working in your codebase? Do you need to remind everyone that they'll need to restart their build? Will they remember to take the right steps every time they do a rebase or checkout that involves a build change? Macrome ensures that you needn't worry: any changes to macrome.config.js or its build steps (generators) are being watched so that built output will update as necessary.


Indexing: Macrome also makes it easy to create indexes or lists. For example, if you had a routes/ directory, macrome could help you generate a routes/index.js which would likely be implemented to emit a line for each route file found. Other tools offer some of this functionality, but many are also implemented to walk the directory tree at runtime, which will be slower, opaque to static analysis, usually limited to a single directory, and could quietly produce unexpected results.
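
As a rough sketch of what the body of such an index generator might do, the function below turns a list of file names into the text of routes/index.js. The function name and the exact export line it emits are assumptions, not macrome's API, and it assumes each route file has a default export.

```typescript
// Builds the text of a hypothetical routes/index.js from the file names found
// in routes/. Emits one re-export line per route file, in sorted order.
function renderRoutesIndex(fileNames: string[]): string {
  const lines = fileNames
    // Only .js route files are indexed; the index itself is skipped.
    .filter((fileName) => /\.js$/.test(fileName) && fileName !== "index.js")
    .sort()
    .map((fileName) => {
      const base = fileName.replace(/\.js$/, "");
      // Replace characters that are not valid in an identifier, e.g. user-profile -> user_profile.
      const identifier = base.replace(/[^A-Za-z0-9_$]/g, "_");
      return "export { default as " + identifier + " } from './" + fileName + "';";
    });
  return lines.join("\n") + "\n";
}
```

Since the result is an ordinary static file, bundlers and type checkers can follow the imports, which is exactly what a runtime directory walk cannot offer.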


Code modding: A special case of an in-place build is one where the outputs overwrite the inputs. Macrome allows this, and is well suited to it since each generator is able to configure a specific set of paths for which it is relevant. This is ideal since it creates a self-contained modification which is already set up to be run repeatedly. This is how code mods should be developed. In an ideal world, instead of running a mod once and then manually tweaking the results (fixing edge cases, resolving merge conflicts, etc.) you would continue tweaking and improving the mod until you were ready to apply it permanently, at which time you or even CI would do so. The mod can then stay in the codebase so that other developers can apply it to their changes if necessary, preventing further bugs that can emerge during manual conflict resolution. For fast initial development of your mod you can transform file.js => file.modded.js and thus leverage the ability of macrome watch to continuously rebuild all modded files as the mod itself is developed.