@sam-github
Last active August 29, 2015 14:01
build/deploy tooling spike

Build/Deploy

TL;DR

  1. slc build: We need a tool that will build an app and its deps into a tarball or git branch. High priority.
  2. slc publish: We need a tool that will publish an app tarball or git branch to a deploy server (use git protocol first, other protocols later). Medium priority (you can do this with a few git commands if you use git).
  3. slc deploy, prepare, export: We need three small tools, one for each stage of what it takes to make an app run on a server, they can be run individually, or all together for a complete "heroku-like" deploy solution: (High priority)
    1. A network receiver of an app publish (using git protocol first, others later)
    2. A CLI to prep the received app for running (npm rebuild/install)
    3. A CLI to export the app's config to the system process manager (upstart, systemd, etc.)
  4. slc run: Needs signal handling and log aggregation, and perhaps a plugin system for features like a /health endpoint.
  5. node foreman: Contains useful code which should be refactored into modules.

References:

Terms:

  • URL: description of a node package's source or destination. Could be . or some other local fs path, a tarball, a checked out git repo, a git URL, an HTTP reference to a tarball, etc. Which specific URLs we support first for source and destination depend on use-case priorities.
  • node package: package.json and the direct source, deps present only as specs in the package.json
  • node archive: node package with dependencies built in preparation for deploy
  • preparer: a tool that takes a node archive, and prepares it to be runnable (mostly npm rebuild and post-install hooks)
  • exporter: a tool that generates configuration for a runnable application, and registers it with a process manager (such as upstart)
  • deployer: a tool that receives a node archive, and prepares and exports it

Building a node application archive

  • Input: node package URL
  • Output: node archive URL
  • Tool: slc build (and/or grunt rule and/or yeoman generator?)
  • REQ: 1-4,12
  • ESTIMATE: 5 days

Build steps are described in (a). Note the error in (a) claiming Heroku does this wrong; in fact, Heroku deals with archived applications intelligently and correctly, see (b).

(a) advocates committing archived deps to development branches. This is wrong: the archive may be committed to a "deployment only" git branch or packaged as a tarball, but it should not be committed to a development branch.

What this tool should do is mostly discussed in the second section of (c), but missing is a description of:

  • automation of npm interactions: npm install, strip of binaries, shrinkwrap, what npm scripts should be run (if any), etc. TBD
  • automation of custom build steps: front-end build output (minified src, compiled sass, etc.) needs to be archived along with the npm/package.json dependencies. This is often done with grunt and/or bower, or npm scripts. We should drive this in the build, or ensure that an 'slc build' run after the front-end build will archive its output, or perhaps implement the 'slc build' as a grunt plugin. TBD
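Assuming the archive form is a tarball, the heart of such a build is a short npm-and-tar sequence. The function name and arguments below are illustrative, not the real CLI:

```shell
# Hedged sketch of an "slc build" equivalent that produces a tarball archive.
build() {   # usage: build <package-dir> <archive.tgz>
  ( cd "$1" &&
    npm install --production &&   # vendor runtime deps into node_modules
    npm shrinkwrap                # pin the resolved dependency tree
  ) || return 1
  tar -C "$1" -czf "$2" .         # the resulting "node archive"
}
```

The custom front-end build steps discussed above would have to run before this, so their output lands in the directory being archived.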

Imaging a node application archive

  • Input: node archive URL
  • Output: virtual image

I'm not sure if this is a tool we should prioritize, but it's a tool I think we could write. We could take a node archive, and bind it into a docker image, or a vmware image, or ... Basically, generate output that can be fairly directly imported into a virtualizer.

Publishing a node application archive

  • Input: node archive or node package (+) URL
  • Output: deployer URL to publish to
  • Tool: slc publish, or ... Ops Dashboard (++)
  • REQ: 4,8
  • ESTIMATE: 2 days

PaaS deployers usually use git to accept applications; I'm not sure there are any other generic protocols for publishing to a remote deployer. Custom infrastructure in a company might use scp or sftp, wrapped in scripts or tools that trigger running the app after it's published. We could publish tarballs to an HTTP URL, assuming there are people out there who use HTTP to publish to a deployer; I'm not sure there are.

Common cases to prioritize are just these:

  • git (local/remote) to git remote: ex. github to openshift
  • git (local/remote) to tarball (local): ex. git remote to custom tooling
  • tarball (local/remote) to git remote: artifactory to git PaaS

This may seem trivial, but it's unnecessarily painful to take an artifactory HTTP URL to a tarball and publish it to an openshift git repo. Even pulling a remote git branch (archived) from github and pushing it to openshift would require a tedious series of git commands.

These processes are worth automating, IMO.
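For the tarball-to-git-remote case, the automation boils down to roughly this git sequence. All paths and the branch name below are stand-ins; a real slc publish would take source and destination URLs:

```shell
# Hedged sketch: publish a local tarball to a git "deploy" remote.
set -e
tmp=$(mktemp -d)
# stand-in for the deployer's receive repo (would be a remote URL)
git init --bare --quiet "$tmp/deploy.git"
# stand-in for a built archive
mkdir "$tmp/archive" && echo '{"name":"app"}' > "$tmp/archive/package.json"
tar -C "$tmp/archive" -czf "$tmp/app.tgz" .
# unpack into a throwaway work tree, commit, push
mkdir "$tmp/work" && tar -C "$tmp/work" -xzf "$tmp/app.tgz"
cd "$tmp/work"
git init -q && git add -A
git -c user.email=ci@example.com -c user.name=ci commit -qm "deploy"
git push -q "$tmp/deploy.git" HEAD:refs/heads/deploy
```

This is exactly the tedium worth hiding behind a single command.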

(+) Note that heroku cleverly supports both archived apps and unarchived app packages. Archives are superior for production and reproducibility, but raw packages are useful for PoC, dev, quick tests, etc. It's easy to support both in the deployer.

(++) I don't suggest we prioritize this, but it's worth noting that if we built this into a dashboard, you could theoretically provide the slc publish input and output URLs as part of an application's configuration, and run the slc publish in the Ops backend on user request. If we did lots more work, we could even provide Dashboard tools to provision destinations (spin up EC2 instances, etc.). I don't think we should do this... Ryan points out that there are lots of companies out there trying to provide high-level dashboards for PaaSes; we could partner with one or more of them, helping them make their dashboards work well with node application archives, and possibly helping them implement heroku-like deployers for node.

Deploying a node application archive

  • Input: (pushed) git, ?
  • Output: running application
  • Tool: slc deploy, slc prepare, slc export
  • REQ: 5-7,10
  • ESTIMATE:
    • receive: 2 days to repurpose substack/cicada
    • prepare: 1 day
    • export: 2-4 days, depending on how much configuration we go for, and system support (should be mostly pulling code out of node-foreman)

Involves 3 steps:

  1. receive: probably with git protocol
  2. prepare: see (b), basically checkout/untar, then npm rebuild/npm install
  3. export: configuring the process manager to run the app
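The prepare step (2) could be sketched like this, assuming a tarball archive (the function name is illustrative):

```shell
# Hedged sketch of the "prepare" stage for a tarball node archive.
prepare() {   # usage: prepare <archive.tgz> <dest-dir>
  mkdir -p "$2"
  tar -C "$2" -xzf "$1"      # unpack the node archive
  ( cd "$2" && npm rebuild ) # recompile native addons for this host
}
```

For a git-received archive, the untar line becomes a checkout; the npm rebuild stays the same.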

PaaS providers already run deployers, and it's unlikely we need to do anything there, certainly not for Heroku. If we find OpenShift does require overriding how it prepares apps for injection, we may want a stand-alone CLI implementing the prepare step, particularly the sequence and choices described in (b).

Existing corporate dev-ops infrastructure probably has receive and run tooling already set up, but as with PaaS providers, if they don't have node-specific experience, they may benefit from the same stand-alone "prepare" CLI.

For companies without existing infrastructure, they have a few options:

  • docker: use dokku: "docker-based heroku in 100 lines of bash", which implements a git receive, prepare, and run inside docker. Ryan speaks well of the code (though we haven't tried it), and says the reason it is so Heroku-like is that it reuses a lot of Heroku's open-source buildpack code.

  • vanilla linux: we should build a tool that combines the git receive implementation of substack/cicada, the prepare CLI (to be written), and the process manager configuration compiler from NodeFly/node-foreman to achieve a light-weight "heroku equivalent" for linux machines that we can recommend to customers who are looking for a deployment solution.

As a matter of good architecture, the preparation would be implemented as a node module, with an API and a CLI.

Running a node application in deployment

System process managers include sysV-init, upstart, systemd, launchd, and Windows services.

Process management starts with the system, and requires system configuration.

The system configuration can be hand-written, but is mostly tedious boiler-plate. Unfortunately, it is not entirely boiler-plate... there are choices to be made with regard to logging, daemonization, runners (node version, slc run, foreman?, etc.), pid-file handling/generation, ulimits (nfiles, memory, core, etc.), and so on.
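For example, the exporter might emit an upstart job along these lines; the paths, limits, and the use of slc run are illustrative choices, not a fixed format:

```
# /etc/init/my-app.conf -- illustrative upstart job an exporter might emit
description "my-app (generated)"
start on runlevel [2345]
stop on runlevel [016]
respawn
limit nofile 8192 8192
env NODE_ENV=production
setuid deploy
chdir /srv/my-app
exec /usr/local/bin/slc run .
```

Each stanza above corresponds to one of the choices listed: respawn (restart policy), limit (ulimits), env (configuration), exec (runner).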

There are various tools that will export a configuration. They generally lack equivalent support across the managers, and their configurability varies in usefulness.

Possibilities are:

  • ddollar/foreman: commonly used, partially abandoned, often cloned. The original exports to init.d or upstart.
  • NodeFly/node-foreman: a foreman clone in node (we own the copyright) that compiles a Procfile and some configuration into upstart or systemd configurations.
  • unitech/pm2: has a few varieties of startup scripts it can install.

I suggest we not leave this up to third-party utilities, and instead have a more definitive solution. In particular, we need to ensure our solution integrates well with a git-protocol deployer (see previous), and is sufficiently customizable that it allows the use of strong-supervisor, supports configuration of log aggregation, etc.

Since node-foreman currently contains some code to do this, we should enhance and refactor it to support our use-cases.

Ryan suggests node-foreman (npm package name is actually just foreman) should be re-written by extracting/re-writing the most useful features as modules and then re-composing them into a couple different tools:

  • strong-foreman (Procfile + .env + log tagger + nodemon)
  • slc export (Procfile + .env + service exporter)
  • ... and the log tagger should be added to strong-supervisor

** ESTIMATE: ... node-foreman refactor **

** TODO: agent reporting system limit "metrics" **

Note on logging

In-app logging has many options for a logger, we should recommend one of:

  • winston
  • bunyan
  • console
  • visionmedia/debug
  • ... any other candidates?

Logging (depending on logger) can be configured to multiple destinations:

  • stdout/err
  • syslog
  • file
  • fifo (allowing smart file redirection)

and some allow run-time reconfiguration of log levels.

Configuring the logger to write to stdout/err is the most flexible approach; it allows the logging destination to be configured at deploy time, but introduces a protocol question: who adds the timestamps, log levels, pids, and cluster worker IDs? And is the output allowed to be multi-line? Stack traces on error usually are.

We should extend supervisor to redirect the output of its workers (including error stack dumps) to a configurable destination, and make sure that it is flexible enough to accommodate reasonable variations in log output format.
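As an illustration of the protocol question, a supervisor-side line tagger might look like this. The function and tag format are hypothetical, not strong-supervisor's actual behavior:

```javascript
// Hedged sketch: tag each line of a worker's output with worker id and pid,
// so multi-line output (stack traces) stays attributable after aggregation.
function tagLines(chunk, workerId, pid) {
  return chunk
    .split('\n')
    .filter(function (line) { return line.length > 0; })
    .map(function (line) {
      return '[' + workerId + ':' + pid + '] ' + line;
    })
    .join('\n') + '\n';
}
```

A real implementation would be a stream transform on each worker's stdout/stderr, with buffering for chunks that split mid-line.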

** ESTIMATE: backlogged, supervisor logging ** ** ESTIMATE: 1 day, exporter implementing configurable logging strategy **

Note on supervisors and process managers

System process managers often contain features for daemonization, logging, restart on failure, start/stop, delivering signals or running scripts on request, process listing, log viewing, cpu/memory usage, etc.

A number of node supervisors have veered into system process manager territory in terms of features.

Why node process managers are not as useful as they appear

One thing I don't like about pm2 and similar tools is that they seem to be reimplementing system features. If they are doing it to build a dev-time tool, as node-foreman does, that makes some sense. But if they are targeting the tool for deployment time... it doesn't make so much sense.

If we did decide to implement process-management features, it would make sense only as a manager that attached to the Ops Dashboard, and that we used to get a toe-hold on machines, so that we could then control, deploy, and configure them, pushing apps dynamically, etc. A kind of super-clustering, super-supervising, super-deployer.

In the absence of those features, I don't see process managers as useful node deployment tools.

Why node supervisors are useful

Deployment-time decisions should be made at deploy time, not in development. This is best done with some kind of supervisor, that can optionally run an application with monitoring (like strong-agent), with clustering (like strong-cluster-control), with log aggregation (strong-supervisor implementation in progress), pidfile generation, etc.

Monitoring is not a system feature.

Clustering is a feature that could be done by the system, but node has the ability to use multiple cores with cluster; forcing that into the system is not very dynamic, and also forces the use of a reverse proxy/load balancer. Clustering and restart is a feature that belongs in a node supervisor.

Logging could be done by the system, but upstart doesn't do it, and log aggregation works better when it's cluster-worker/node aware.

FWIW, I regret adding daemonization and logging to strong-supervisor.

We need a blog post that more clearly articulates why strong-supervisor is a reasonable choice of tool, despite being apparently feature-poor compared to forever, pm2, etc.

We also need to add a few features that it really is missing:

  1. signal handling: TBD, requirements to be driven by systemd/upstart script
  2. health: TBD
  3. log aggregation: TBD (but in progress)

Exposing a /health HTTP route is necessary for integration with EC2, Heroku, and many in-house load-balancer setups. It can't be done well in a worker, because workers don't know the state of the cluster. It's a reasonable thing to add to strong-supervisor, though there are limits to how application-specific the health check can be. At least basic cluster status (number of workers, restarts, whether it's shutting down or starting up, etc.) can be published at a known route (optionally, of course), and possibly some kind of application health can be determined from whether the workers are listening, whether they respond to a message sent on the cluster bus, or whether they are sending a heartbeat status on the cluster bus.

I'd like to see some kind of "plugin" system for strong-supervisor... so each supervisor can call an arbitrary piece of code before forking the workers. Then we could provide plugins for health, strong-mq, strong-cluster-store, signal handling, etc. I don't want to build too many optional and opinionated features into it!

** ESTIMATE: signal handling ** ** ESTIMATE: health ** ** ESTIMATE: strong-supervisor plugin system **

Note on restart-in-place

strong-supervisor and other supervisors allow code to be updated underneath an app, and the workers to be gradually restarted. I have mixed feelings about this. For an app composed purely of resources loaded into memory, this could work; maybe it's even a good idea. But if your app has resources on disk that are loaded at run-time, such as a web app and/or its assets, it would be awful if new versions of pieces of the web app started to be served before the running node app had restarted!

What's our stance on this?

restart-in-place may mean that configuration of the process manager gets more complex, and might make zero-downtime harder. Perhaps it's a choice made by the app. Perhaps on-disk resources are loaded once at app startup when NODE_ENV is production, but not after?

My feeling on zero-downtime is that if you actually care, you should have multiple machines behind a load-balancer. Otherwise, any kind of operational activity (scheduled or unfortunate) other than a simple app update is still going to cause downtime.

Note on graceful close

REQ: 9

It's common for people to suggest that node apps should close gracefully by:

  1. Not accepting new connections
  2. Allowing current connections to "finish gracefully"

(1) is easy: see server.close().

(2) is not so easy, though the spanishdict blog post below seems to have the best suggestion: express middleware. It's hard to believe this isn't out there already. I've been asked several times (Dream11, etc.) how to do this, and have had to wave my hands. I'd like to be more of a subject-matter expert on this!

In particular, node http connections usually use keep-alive, so they will stay open as long as a client is using them. A client doing an API polling loop could keep the connection open until the app exits... I expect some of the approaches discussed below need wrapping up into a module, and we need to blog about them.

Random resources on this issue:

[1:07:59 PM] Sam Roberts: ben, how to gracefully close an http server? The best I can see is to keep an array of all open connections (manually, ugh), then when I want to close, call .setTimeout(0) on all the connections.
[1:08:25 PM] Sam Roberts: I take that back, probably need .setTimeout(1), because 0 disables a timeout
[1:09:20 PM] Ben Noordhuis: how graceful is graceful? you can call res.end() on all responses, then server.close()
[1:10:04 PM] Ben Noordhuis: but that will wait until all pending data has been flushed to the client. if there are stragglers (or malicious slowloris clients) that can take a while
[1:10:38 PM] Sam Roberts: graceful, as in, avoid client seeing the connection reset, if possible.
[1:11:41 PM] Sam Roberts: I don't see any way to set keep-alive to false, which would be the nicest way for clients that are actively using that http connection. and for clients that are not using them actively, they run the risk of a RST if we close our side. unavoidable.
[1:11:41 PM] Ben Noordhuis: right. res.end() or setTimeout(1) would do that
[1:12:20 PM] Ben Noordhuis: yes. just server.close() will make the server stop accepting new connections, then you can wait for the existing ones to die off naturally
[1:13:16 PM] Ben Noordhuis: you probably want to put a mechanism in place that responds with 503 to new requests on existing connections
[1:14:16 PM] Sam Roberts: this is a cluster scenario, so I want them to keep doing http with the server, but to make a new connection first.
[1:14:41 PM] Ben Noordhuis: is the scenario a rolling upgrade?
[1:16:03 PM] Sam Roberts: rolling restart, single restart, or a decrease in number of concurrent workers (but server hasn't errored, its still good, we just want it to go away, and not stay FOREVER because there is an active http client keeping a connection open)
[1:16:42 PM] Sam Roberts: would also happen with a loadbalancer in front of a number of node instances, where we want to take one instance down
[1:18:23 PM] Ben Noordhuis: right. then you want a mix of the above: server.close(), a reasonable timeout, etc.
[1:20:13 PM] Sam Roberts: http://blog.argteam.com/coding/hardening-node-js-for-production-part-3-zero-downtime-deployments-with-nginx/, these folks are using middleware to start sending 502. hm, not so different from your 503 suggestion.
[1:21:03 PM] Ben Noordhuis: 503 > 502 in more than one way
[1:23:25 PM] Ben Noordhuis: i think if you unset both Content-Length and Transfer-Encoding in the final response, node will auto-close the socket (because there is no way for the client to otherwise know when the end of the response is)
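The approach discussed above (server.close() plus persuading keep-alive clients off the connection with a 503) can be sketched as follows; the function names are illustrative, not an existing module's API:

```javascript
// Hedged sketch of graceful close: stop accepting new connections, answer
// in-flight requests on existing connections with 503 + "Connection: close",
// and put a deadline on stragglers.
var shuttingDown = false;

function handle(req, res, appHandler) {
  if (shuttingDown) {
    // force the client off this keep-alive connection; a client or
    // load balancer will reconnect elsewhere
    res.setHeader('Connection', 'close');
    res.statusCode = 503;
    return res.end('shutting down');
  }
  appHandler(req, res);
}

function shutdown(server, timeoutMs) {
  shuttingDown = true;
  server.close();                  // stop accepting new connections
  setTimeout(function () {
    process.exit(0);               // deadline for slow or idle clients
  }, timeoutMs).unref();
}
```

In express, handle() would be a piece of middleware installed first, per the blog post linked above.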

Note on when and how configuration is bound to an application

Configuration should not be present in a node application archive. The more configuration is bound in, the less choice there will be over deployment environment. Even deploying the same app to staging and then production could become impossible.

The tool flow is:

  1. slc build: creates an archive
  2. slc publish: publishes an archive
  3. slc deploy: prepares an archive, and exports to process manager

At step 2 or 3, configuration can be bound. Step 2 would involve re-writing the archive to contain configuration. Step 3 is the almost-last moment, and has the advantage that the runner could know what the configuration should be on the machine it is running on. But that may require the deployer to know things about the application, which introduces the problem of the deployer configuration needing to change as the application changes, which in turn raises the question of where the deployer gets its configuration... StrongOps has run into this; it's nasty.

We should support configuration being bound by the deployer (probably in the form of a .env file, possibly with a URL to a .env file), and while that can include application-specific configuration, we should recommend a better way.

Applications should use tools like nconf-redis to pull their configuration. In this model, the only application-specific configuration that the exporter needs to bind to the application is the URL of the configuration server and some identifying token. In particular, note how rolling restart from strongops (or ssh/CLI, through the supervisor or the process manager) allows natural restart and pickup of new configuration on change.
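A sketch of that split, with hypothetical names throughout (resolveConfig, CONFIG_URL, and CONFIG_TOKEN are illustrative, and the nconf-redis lines in the comment are indicative only, not verified API):

```javascript
// Hedged sketch: the deployer binds only a config-server URL and a token
// (e.g. via the .env file); the app pulls everything else at startup.
function resolveConfig(env) {
  if (!env.CONFIG_URL || !env.CONFIG_TOKEN) {
    throw new Error('deployer must bind CONFIG_URL and CONFIG_TOKEN');
  }
  return {
    configUrl: env.CONFIG_URL,  // e.g. redis://config.internal:6379
    token: env.CONFIG_TOKEN     // identifies this app to the config server
  };
}

// With nconf-redis the app would then do roughly (unverified sketch):
//   var nconf = require('nconf');
//   nconf.use('redis', { /* host/port from configUrl */ });
//   var dbHost = nconf.get('db:host');
```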

Other than that, all configuration should be application non-specific... where logs go, where pid file goes, which supervisor or node version to use, etc.

Use Cases

Jenkins, to Artifactory, to staging, then production

Build steps/jobs would be:

  • ci-build: checkout, slc build, upload to artifactory
  • ci-test: npm test, code metrics, etc.
  • ci-promote-to-staging: slc publish $ART_URL git@staging
  • ci-promote-to-production: slc publish ... git@production

This requires routine Jenkins job configuration to trigger build steps on appropriate criteria, such as git commits, passing of automated tests, manual promotion by QA or OPS, etc.

It also requires a staging and production environment to exist and be provisioned with a deployer ready to accept a publish (we will provide deployers for linux with redhat/systemd and ubuntu/upstart; git deployers for heroku, openshift, docker, etc. are already standard in the PaaS world).

StrongOps

Discussed at length by Ryan and Sam, we think we could deploy strongops if we moved it to nconf-redis for config, had output log aggregation, and something that we could git push the code to.

An interesting wrinkle is when you have multiple apps in a single repo that don't all run on the same machine, as we do. I can't recall what we decided about that... though I think it's a bit unusual.

Related projects (unreviewed)


bajtos commented May 20, 2014

Note on when and how configuration is bound to an application

Before we decide on which approach is best, can we clarify the requirements first? Here are a few items to start the discussion:

  1. Change tracking & version control. All configurations should be kept in an SCM solution. It should be possible to revert to an older config revision and review differences between configuration revisions.
  2. Whenever a deployment is performed, it must be clear what config version (staging/production) and revision (r20 or r30 in svn terms) was applied.
  3. The relationship between app code revisions and config revisions must be documented and tracked. When a new revision of the app introduces a new service as a dependency, all config versions (staging/production) must be updated. There should be a mechanism preventing deployment of new app revision with old config revision.
  4. Access control. Sensitive information like credentials must be kept private, accessible by a few designated people only. (E.g. credentials to production database servers should not live in the main dev repository.)
