@wesleytodd
Last active September 16, 2017
Architecting Node Apps At A Startup

Startups move fast. Part of moving fast is being willing to make mistakes. At StreamMe we have moved fast and made plenty of mistakes. We have also worked hard at finding ways to move past those mistakes. Throughout this process we have found that the most important thing is not AVOIDING mistakes, but setting up a system that makes them easy to rectify when you have time.

One of the best things you can do to avoid the kind of mistake that takes months of work to dig yourself out of is setting a good foundation. The right kind of foundation will allow you to explore without encumbering your entire stack or application.

In this talk I will go through some of the key mistakes we made at StreamMe and what we have learned, hopefully getting everyone to a place where they feel they could start a new Node project at a startup that will last for more than a month.

Build your apps without dependencies

As your application grows there will be infrastructure pieces and configuration that change, often based on the server environment (dev, pre-production, production). Build your integrations with these in such a way that they are easily de-coupled. There are a few ways we mitigate these issues:

  • Automate: Jenkins, Salt, Vagrant
  • Use third-party services: Google Cloud Storage, Mandrill
  • Proxy: our development proxy
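To give a flavor of the proxy idea, here is a minimal sketch of routing by path prefix the way a development proxy might (the route table and ports are made up, not our actual setup). In practice you would hand the chosen target to something like node-http-proxy:

```javascript
// Illustrative development-proxy route table; not our real service map
var routes = {
	'/chat': 'http://localhost:4001',
	'/api': 'http://localhost:4002'
};

function targetFor(url) {
	// Find the first route prefix the url starts with
	var prefix = Object.keys(routes).filter(function (p) {
		return url.indexOf(p) === 0;
	})[0];

	// Fall back to the main web app when nothing matches
	return prefix ? routes[prefix] : 'http://localhost:4000';
}

module.exports = targetFor;
```

This keeps the service map in one checked-in file, so every developer's proxy routes requests the same way.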

Runnable

To help differentiate between production requirements and the local development environment we use a pattern and module called runnable. It relies on having two entry points to the application: one for development that uses development configs and opens the HTTP port, and a second that loads prod configs and registers with our service discovery after starting the application. To show what it does, here is how we started using this pattern before the helper package:

// index.js
var express = require('express');

var main = module.exports = function (config, done) {

	// Kick off the application
	var app = express();
	app.get('/', function (req, res) {
		res.end('Hello Meetup! Our static asset host is ' + config.staticHost);
	});

	// Start the server; the listen callback receives no arguments,
	// so report startup failures via the server 'error' event
	var server = app.listen(config.port, function () {
		done(null, server);
	});
	server.on('error', done);

	return server;
};

if (require.main === module) {
	main({
		port: 4000,
		staticHost: '//static1-pds.stream.me'
	}, function (err, server) {
		if (err) {
			return console.error(err);
		}

		console.log('Starting server on port ' + server.address().port);
	});
}

#! /usr/bin/env node
// bin/app
var consul = require('@streamme/consul');

// Load your config from the file system or a config provider
var config = {
	staticHost: '//static1.stream.me'
};

require('../')(config, function(err, server) {
	// Register with consul,
	// which returns the shutdown method
	var shutdownConsul = consul({
		name: 'app',
		port: server.address().port
	}, function() {
		console.log('server online');
	});

	function shutdown() {
		shutdownConsul();
		process.nextTick(function() {
			server.close(function() {
				process.nextTick(function() {
					process.exit(0);
				});
			});
		});
	}

	// On shutdown signals, clean things up
	process.on('SIGTERM', shutdown);
	process.on('SIGINT', shutdown);
	process.on('uncaughtException', function (e) {
		console.error(e);
		shutdown();
	});
});

You can run either of these files with node and you will get a running service. The key difference is that the bin script loads dynamic configs and then registers with Consul (our service discovery) afterwards. With the runnable module, index.js becomes even simpler:

module.exports = runnable(function (config, done) {
	// Same stuff here as before
}, [{
	port: 4000,
	staticHost: '//static1-pds.stream.me'
}, function (err, server) {
	if (err) {
		return console.error(err);
	}

	console.log('Starting server on port ' + server.address().port);
}]);
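For intuition, here is a rough sketch of what a runnable-style helper could look like. This is an assumption about its shape, not the actual module's internals; in particular, the third argument (whether the wrapping module is the main module) is something the real helper may detect on its own:

```javascript
// Sketch of a runnable-style helper (hypothetical implementation).
// isMain would be require.main === module, evaluated in the caller.
function runnable(main, devArgs, isMain) {
	if (isMain) {
		// Executed directly (`node index.js`): start with bundled dev args
		main.apply(null, devArgs);
	}

	// Always hand main back so a bin script can call it with prod config
	return main;
}

module.exports = runnable;
```

Under this sketch, index.js exports `runnable(fn, devArgs, require.main === module)`, and bin/app simply requires the export and calls it with production config.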

Also, try to make sure your apps have no environment or external dependencies during development. That means all development configuration should be bundled into the source code repository, and your persistence layers and infrastructure should be either optional or easy to run separately for development.
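A small sketch of that idea, with dev defaults committed to the repo and the environment only consulted in production (the keys and env vars here are illustrative):

```javascript
// Dev defaults live in source control so a fresh clone just runs;
// the config keys here are examples, not our real schema.
function loadConfig(env) {
	var defaults = {
		port: 4000,
		redis: null // persistence is optional in development
	};

	if (env === 'production') {
		// Only production reaches out to the environment for real values
		return Object.assign({}, defaults, {
			port: Number(process.env.PORT) || defaults.port,
			redis: process.env.REDIS_URL || null
		});
	}

	return defaults;
}

module.exports = loadConfig;
```

With something like this, `node index.js` needs nothing installed beyond the repo itself.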

Packages

One of the most important concepts in the Node ecosystem is packages. It is at the heart of what we all love about working with Node. Functionality can be broken down into small chunks and published to npm. When you need to do something specific you can just look for the best package on npm, and one npm i later you are moving on with your app.

Now that you are setting up your application as a set of SOA apps, you still want to be able to share some code between all those shiny new apps, because DRY is still a thing. So, like every good Node developer, we decided to make packages out of our shared code. We created a directory called modules and threw a bunch of directories in it. To use the packages we took three approaches:

  1. Just requiring the path: require('../../../modules/log')
  2. Symlinking into node_modules: require('log') // returns code at /modules/log via /node_modules/log
  3. File-based package installs: require('log') // returns code at /modules/log via /node_modules/log, installed with npm

All three of those have serious problems, some shared by all of them, some unique. The first option means you always have to figure out how many ../ to add; that seems like a silly problem, but man did it frustrate me. It also meant you couldn't just grep for usages of the package with grep -rn "require('log'", and because we had packages required from other packages, we could never be fully sure what was using a package when we went to change it. Lastly, it means you cannot support different versions of the same package in different apps.

Number two solves the silly frustration of the ../'s, but it means you either need to share the unique namespace of npm or prefix your internal packages. We took the prefix approach, so all of our modules started with v- (don't ask about the choice, it is a running joke because it really makes no sense). This means the modules that started as require('../../modules/log') became require('v-log'), a lot easier to type but way more confusing when trying to find the code. It also shares the problem with #1 that all apps require the same version of the code.

Number three was arguably the best solution because it used npm to do the installs, which avoided the custom symlinking, and made it so that all you had to do to know where a module came from was look at the package.json. The issue was that while developing a feature you STILL had to symlink the code, or risk making your edits in the node_modules copy and forgetting to apply them to the actual source file.
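For reference, the file-based install from option three is just npm's file: protocol in package.json (the path here is illustrative); npm copies the directory into node_modules on install, which is exactly where the stale-copy problem above comes from:

```json
{
  "dependencies": {
    "log": "file:modules/log"
  }
}
```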

Setup a local NPM

The apps in the monolith-SOA app shared a single package.json and all accessed a shared modules directory. That modules directory held everything from models to tooling, caching libs to database queries and front-end helpers. When a developer needed something in an application, they would just require it and use it.

There will come a point where you have some code shared between your applications, probably on day 2. In the past this posed a hard decision: should you set up and maintain your own internal npm registry? I say in the past because it is now easy to set up and run thanks to npm Inc's npm On-Site.

This also addresses a large security issue that was recently publicized with the left-pad incident. Because you are running a local copy, you reduce your risk of falling victim to malicious or vanishing packages. A local npm registry in combination with npm shrinkwrap means you can rely on reproducible builds and secure code.
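Pointing a project at an internal registry is just an .npmrc entry in the repo (the URL below is a placeholder for wherever your registry actually runs), and then npm shrinkwrap pins the resolved dependency tree on top of that:

```ini
; .npmrc (registry URL is a placeholder for your internal host)
registry=http://npm.internal.example:8080
```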

Mature packages

  • in the app
  • moved to a packaged directory
  • published to internal npm

Unused?

The biggest issue with all of these options actually comes down not to the issues I mentioned above, though. The real issue came when we wanted to update Node to 4.x across our stack. It turns out that from 0.10.x to 4.x there were a ton of incompatible modules, and those modules were integrated into all of these apps.

Architecture Styles

New features will happen, and their development should not be hindered by legacy code and practices. There are three major methodologies for structuring your web application:

  • Monolith Applications
  • Service Oriented Architecture
  • Microservices

Monolith applications are when your entire application runs as a single application on your server. This usually means separate features all share code, persistence and dependencies.

Service Oriented Architecture (SOA) is the practice of breaking your application into smaller applications based on features and responsibilities. There are 4 main tenets of an SOA application:

  1. Boundaries are explicit
  2. Services are autonomous
  3. Services share schema and contract, not class
  4. Service compatibility is based on policy

Microservices are generally considered a subset of SOA and, as the name suggests, break your services down into even smaller units than traditional SOA. There are even people who go so far as to write single functions as services.

Our first try at SOA

The first application we wrote as a company was actually two monolithic PHP applications. We relatively quickly realized it was difficult to build out new, unrelated features. We also wanted to introduce some new technologies, Go and Node, and obviously you cannot write a Node application inside of a PHP project. So those became our first services in an SOA approach.

Then something happened that any startup might face: we had to pivot. We were lucky; we had the finances to spend time re-writing the application for our new site, and a development team with 2 years of experience working together. So we set out to write the SOA of our dreams.

This is where we made the next mistake. Instead of following all 4 of the tenets of SOA, we only really succeeded at #4 and parts of #2. We had multiple services that ran as separate executables in production and talked to each other via HTTP and REST, but they shared code and persistence layers, and often had unclear boundaries.

Benefits

  • New technologies can easily be added to your stack
  • Changes can be rolled out slowly and safely without having to test EVERY feature all at once
  • You can update versions of dependencies independently on an app basis
  • Service outages and errors do not take down your whole application

The service-going-down point is a pretty big deal. An example we experienced recently was when someone decided to DDoS us by spamming chat messages from over 4000 accounts spread across hundreds of host IPs. The DDoS mostly hit the 2-3 of our services related to chat. Chat clearly started having issues under such a high load, but the rest of the site just kept on chugging and experienced no downtime.

Git

This is a silly one, but one that can seriously make or break your workflow. Should you set up your entire application in one repo, or create multiple, one for each of your services? I don't believe there is an absolutely right answer here, but there are definite pros and cons to both approaches.

Multiple repos

Pros:

  • Separation: it is actually difficult for a developer to cross an application boundary
  • Cleaner git history and Pull Requests, and a lower likelihood of merge conflicts
  • You only need to have the code you work on

Cons:

  • If you are working on something that touches multiple apps, it is easy to get errors because of a version mismatch
  • No easy single way to track corresponding features across the projects
  • Way more complexity

Single Repo

Pros:

  • Super easy to track related changes across the whole application
  • A branch actually contains everything you need to run a given feature
  • If Google can do it so can you

Cons:

  • Big merges are a major pain
  • Promotes the idea that you can rely on other parts of the repo being there, and that you can use them
  • It is pretty easy to break the build for your entire company
  • The repo gets REALLY big

Not sure we made a mistake on this one

I bring this one up because I think we made the right decision for our team, but we have done both. We went the route of a single repo when we pivoted, because we had multiple repos in our previous app.
