@rektide
Last active August 29, 2015 14:10
bottom-up-vs-top-down-engineering

My feedback from the perspective of a framework developer is quite different. I found the tone and attitude towards express.js to be concerning and somewhat offensive. Here was a developer blaming the framework he chose for poor architecture when they never bothered to actually learn how the framework works in the first place. Everything that followed originated from this basic lack of understanding.

Express is designed from a classic framework-developer perspective, exposing primarily a 'consumption-only' API that allows developers to assemble complex routing structures.

Express uses a very simple router logic which is at the core of how express works, so let’s examine that first (my knowledge of express is somewhat dated but I think the principle is still the same). Express keeps a hash of the various HTTP methods (GET, POST, etc.) and for each method, an array of middlewares. Each middleware is just a function with signature function(req, res, next) and a regular expression used to decide when to execute the middleware based on the incoming request path.

Express doesn't give any feedback on this router logic. Its API docs do not advertise any means to look inside the middleware to see the state of composition. Express doesn't advertise any means of telling a request handler anything about the state of affairs. So: Express can be simple, might be simple, but it's not apparently simple.

The router logic is pretty simple. When a request comes in, the routing table is looked up using the HTTP method, which returns an array of middlewares. The array is then iterated over by matching the request path to the regular expression assigned to each middleware. When you add generic middleware to express (that is, not path-specific), those are added to the same array but without any path requirements. When a middleware is invoked, it can end the workflow by not calling the next() callback (or by calling it with an error). If next() is called, the next matching middleware is executed. This is how all your app.use() middlewares are called before and after the route handler.
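The dispatch loop described above can be modeled in a few lines. This is a toy sketch under stated assumptions, not Express's actual source: `createRouter`, `add`, and `handle` are hypothetical names invented for illustration, and a `null` pattern stands in for generic middleware with no path requirement.

```javascript
// Toy model of the router described above; not Express's real implementation.
// createRouter, add, and handle are hypothetical names.
function createRouter() {
  const table = {}; // HTTP method -> ordered array of { pattern, fn }

  return {
    // pattern is a RegExp, or null for generic middleware that matches every path
    add(method, pattern, fn) {
      (table[method] = table[method] || []).push({ pattern, fn });
    },

    handle(req, res, done) {
      const stack = table[req.method] || [];
      let i = 0;
      // next() resumes the scan at the following entry; next(err) ends the workflow.
      function next(err) {
        if (err) return done(err);
        while (i < stack.length) {
          const layer = stack[i++];
          if (layer.pattern === null || layer.pattern.test(req.path)) {
            return layer.fn(req, res, next);
          }
        }
        done(); // nothing left to match
      }
      next();
    },
  };
}

const router = createRouter();
const calls = [];
router.add("GET", null, (req, res, next) => { calls.push("logger"); next(); });
router.add("GET", /^\/users$/, (req, res, next) => { calls.push("users"); res.body = "user list"; });
router.add("GET", null, (req, res, next) => { calls.push("after"); next(); });

const res = {};
router.handle({ method: "GET", path: "/users" }, res, () => calls.push("done"));
console.log(calls, res.body);
```

The route handler never calls next(), so the trailing middleware and the `done` callback are never reached; that is the "end the workflow" case from the paragraph above.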

Express's logic is simple, but it's not exposed. It's internalized, compartmentalized, enclosed and composed within Express. Express won't share information about its state with you. This makes Express very hard to learn from, and lends some sympathy to Netflix's confusion: if Express provided a means for the runtime to see what was occurring inside of Express, there would be a stronger case against Netflix for not knowing.

While this is not a design pattern I would choose, it is simple and elegant. It keeps the entire framework codebase minimal and consistent with its architecture. It is really all about middleware, some with filters (e.g. path handlers).

The post criticizes express for storing the routing table in an array vs a hash or tree. As it turns out, the complexity tradeoff between iterating over an array (which tends to be short given most application routing requirements) and walking a tree makes arrays a much better choice. I’ll also point out that Restify, the alternative framework Netflix listed as their new choice, does the same thing (though instead of recursive calls, it uses a for loop). The only router I know that uses a fancy tree is director and that design significantly handicapped its feature set and usefulness.

Matching routes to requests is tricky because developers like to use everything for defining their routes. This includes simple strings, regular expressions, wildcards, and path parameters. You can’t store these in a hash because you cannot look up a regular expression match based on a string. You have to iterate somewhere. A hash will only work if you limit routes to static string values (and if that’s the case, why use a router at all).
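A small sketch makes the hash limitation concrete. The route tables and the `lookup` function below are made up for illustration; the point is only that an exact-string key finds static routes directly, while pattern routes have to be tested one at a time.

```javascript
// Static paths can live in a Map, but patterns have to be tested one by one.
// All route names and paths here are hypothetical.
const staticRoutes = new Map([
  ["/about", "about page"],
  ["/contact", "contact page"],
]);

const patternRoutes = [
  { pattern: /^\/users\/(\d+)$/, name: "user by id" },
  { pattern: /^\/files\/.*$/, name: "file wildcard" },
];

function lookup(path) {
  // Exact-string lookup: works only for static routes.
  if (staticRoutes.has(path)) return staticRoutes.get(path);
  // No key derived from "/users/42" will find the RegExp that matches it,
  // so the pattern routes must be iterated.
  for (const route of patternRoutes) {
    if (route.pattern.test(path)) return route.name;
  }
  return null;
}

console.log(lookup("/about"), lookup("/users/42"), lookup("/nope"));
```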

Yup. And moreover, if you want something other than linear routing, just use a router module that does that.

The criticism about allowing routes to repeat in the express array shows that even after doing all this work, the Netflix team still doesn’t understand how express works. I am only pointing this out because they made a public statement putting express down without acknowledging the reasons. The middleware architecture requires repeating the same path multiple times in the array because that’s how the matching works. It also allows powerful chaining of small actions on a route without having to collect them all in a single function (e.g. pre and post route processors).

To give a more concrete example, a common pattern I find is a chain of: static filesystem handler, asset compiler handler, static filesystem handler. Starting from a cold state, an app will miss on the first static serve. The compiler will find the asset and install it, and the duplicated static handler will serve the now-installed asset.

The Less compiler, for example, is frequently used in this manner.
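Here is a rough sketch of that static, compiler, static chain. It assumes in-memory Maps in place of a real filesystem and a fake "compile" step; `serveStatic`, `compileAsset`, and `run` are hypothetical names, not the API of any real middleware.

```javascript
// In-memory stand-ins for the filesystem; every name here is hypothetical.
const compiledDir = new Map();                                 // compiled assets "on disk"
const sources = new Map([["/site.css", "a { color: red }"]]);  // uncompiled sources

function serveStatic(req, res, next) {
  if (compiledDir.has(req.path)) {
    res.log.push("static hit");
    res.body = compiledDir.get(req.path);
    return; // served: end the chain
  }
  res.log.push("static miss");
  next(); // fall through to the next handler
}

function compileAsset(req, res, next) {
  if (sources.has(req.path)) {
    compiledDir.set(req.path, "/* compiled */ " + sources.get(req.path));
    res.log.push("compiled");
  }
  next(); // the compiler installs the asset but does not serve it
}

// The same handler appears twice on purpose.
const chain = [serveStatic, compileAsset, serveStatic];

function run(path) {
  const req = { path };
  const res = { log: [], body: null };
  let i = 0;
  const next = () => { if (i < chain.length) chain[i++](req, res, next); };
  next();
  return res;
}

const cold = run("/site.css"); // first request: miss, compile, hit
const warm = run("/site.css"); // later requests: served by the first handler
console.log(cold.log, warm.log);
```

From a cold start the request falls through to the second copy of the static handler; once the asset is installed, the first copy serves it and the rest of the chain never runs.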

The express design, and for that matter all the other frameworks I am familiar with except my own, allow adding conflicting routes. This is not a bug but an outcome of their extremely flexible route matching support allowing regular expressions and wildcards. You cannot compare two regular expressions to decide if and how much they overlap, and in which order they should be compared. This is a limitation coming directly from the feature set. It is a simple tradeoff.

In hapi, we limit the types of routes you can add so that we can enforce strict limits on conflicts. We also worked hard to ensure that the order in which routes are added doesn’t matter. Routes are matched based on a deterministic algorithm that sorts the table on every addition. This was a very important feature for us working in a large team where people might not be aware of the routes added by others, or where the order in which plugins are loaded can cause unexpected production issues. These are all decisions we made based on months of hands-on experience building applications on top of express and director.
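To illustrate the idea (and only the idea), here is a sketch of a table that restricts routes to literal and `{param}` segments, rejects conflicts, and re-sorts on every addition. This is not hapi's actual algorithm or API: `createTable`, the key scheme, and the "literals before params" ordering are all assumptions made for the example.

```javascript
// Sketch only: NOT hapi's actual algorithm. createTable and its methods are
// hypothetical. Routes are limited to literal and {param} segments so that
// any two of them can be compared.
const PARAM = "\uffff"; // placeholder that sorts after every literal segment

function createTable() {
  const routes = [];

  // "/users/{id}" and "/users/{name}" reduce to the same key, so they conflict.
  const keyOf = (method, path) =>
    method + " " + path.split("/").map((s) => (s.startsWith("{") ? PARAM : s)).join("/");

  return {
    add(method, path) {
      const key = keyOf(method, path);
      if (routes.some((r) => r.key === key)) {
        throw new Error("conflicting route: " + method + " " + path);
      }
      routes.push({ method, path, key });
      // Re-sort on every addition. Keys are unique, so the order is total
      // and does not depend on the order of add() calls.
      routes.sort((x, y) => (x.key < y.key ? -1 : x.key > y.key ? 1 : 0));
    },
    match(method, path) {
      const parts = path.split("/");
      const hit = routes.find((r) => {
        const segs = r.path.split("/");
        return r.method === method && segs.length === parts.length &&
          segs.every((s, i) => (s.startsWith("{") ? parts[i] !== "" : s === parts[i]));
      });
      return hit ? hit.path : null;
    },
    paths: () => routes.map((r) => r.path),
  };
}

const a = createTable();
a.add("GET", "/users/{id}");
a.add("GET", "/users/me");

const b = createTable();
b.add("GET", "/users/me");
b.add("GET", "/users/{id}");

console.log(a.paths(), b.paths()); // same order in both tables
console.log(a.match("GET", "/users/me"), a.match("GET", "/users/42"));
```

Because the route vocabulary is restricted, two routes can be compared at registration time, which is exactly what free-form regular expressions do not allow.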

The Netflix post does take responsibility for failing to understand how the framework they chose works. But that admission does not excuse criticizing the framework as inept. The express architecture has worked well for many people. It has known limitations which is why there are so many other frameworks to choose from. One isn’t superior to the other without the context of your use case and requirements. There is no “best”.

Much of the failure here, as I see it, is the process of frameworkization itself: Express, in attempting to compose a solution, created resistance to gaining insight into the operationality of its construct. Express was framed to be a computational module one inserts into HTTP that "just runs" based off a number of inputs provided to it: it was not intended to be operated, or to have external agents watching or following its sequencing of operations.

And this is really the critical fault. Composition creates systems which attempt to provide things "for consumers," which are resistant to bottom-up engineering and re-engineering. But this aspect of composition, what we see as good top-down engineering, is the enemy of operationality: we need baser matter available that we can watch. Frameworks are not just for people "using" them: the framing needs to go to lengths to also allow people to see and understand the behaviors in a directly experienceable fashion. Express, by talking about its concepts but not providing programmatically observable means, created a gap where problems such as Netflix's will always naturally emerge.

It's not enough to write consumer-facing libraries and frameworks. It's not enough to solve problems, such as Express's chosen problem of assembling HTTP handlers, in a way that stably, reliably creates good top-down engineering praxis. Netflix certainly found themselves in a gap where they had incidentally fallen outside of that praxis, and some of the fault lies with them. But the findability, the discoverability of that situation was greatly hampered by Express being designed with such top-down, consumer-facing API-centrism. Had Express exposed more of its runtime, had Express offered more bottom-up places where its workings could be observed, there would be a technical, programmatic means to create the kinds of safeguards that you boast of in hapi, and it's more likely we'd have created a culture of observability and operationizability where these bottom-up hooks were used in common practice to monitor and check Express's running.
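As a sketch of what such bottom-up hooks might look like, consider a handler stack that can be both inspected and watched. Everything here is hypothetical: `createObservableStack`, `inspect`, `onDispatch`, and `findDuplicates` are invented for the example and are not Express APIs.

```javascript
// Hypothetical sketch: an Express-like stack that can be inspected and watched.
// None of these names are Express APIs.
function createObservableStack() {
  const stack = [];    // ordered { method, path, name, fn }
  const watchers = []; // callbacks notified on every handler invocation

  return {
    add(method, path, fn) {
      stack.push({ method, path, name: fn.name || "<anonymous>", fn });
    },
    // Read-only snapshot of the current composition.
    inspect() {
      return stack.map(({ method, path, name }) => ({ method, path, name }));
    },
    onDispatch(cb) { watchers.push(cb); },
    handle(req, res) {
      let i = 0;
      const next = () => {
        while (i < stack.length) {
          const layer = stack[i++];
          if (layer.method === req.method && layer.path === req.path) {
            watchers.forEach((cb) => cb({ index: i - 1, name: layer.name }));
            return layer.fn(req, res, next);
          }
        }
      };
      next();
    },
  };
}

// A safeguard built purely on the inspection hook: flag handlers registered
// more than once for the same method and path.
function findDuplicates(app) {
  const counts = new Map();
  for (const { method, path, name } of app.inspect()) {
    const key = method + " " + path + " " + name;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].filter(([, n]) => n > 1).map(([key, n]) => ({ key, count: n }));
}

const app = createObservableStack();
function serveIndex(req, res, next) { res.hits = (res.hits || 0) + 1; next(); }
for (let n = 0; n < 3; n++) app.add("GET", "/", serveIndex); // accidental re-registration

const seen = [];
app.onDispatch((event) => seen.push(event.index));
const response = {};
app.handle({ method: "GET", path: "/" }, response);
console.log(findDuplicates(app), seen, response.hits);
```

With a snapshot like `inspect()` available, a repeated-route check becomes a few lines anyone can run at startup or on a timer, rather than something discovered only after profiling.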

The initiative to write software that 'knows its purpose' and which is opinionated is very strong. A huge amount of developer mindshare has gone into the productization of software libraries and frameworks. But ultimately I think Engelbart and many others give us a good basis for finding other motivators: software isn't just running machines. Software is a model, and we need to remain intimately connected not just with how we plug in to the models we consume; we also need awareness and operationizability of the models we build on top of. Express.js was engineered in opposition to that augmentative construction: it was designed as a solution in the middle, one we were told to rely on, one we sold ourselves as solving our problems.
