12 factor app configuration vs leaking environment variables
App configuration in environment variables: for and against
For (some of these as per the 12 factor principles)
1) they are easy to change between deploys without changing any code
2) unlike config files, there is little chance of them being checked
into the code repo accidentally
3) unlike custom config files, or other config mechanisms such as Java
System Properties, they are "a language- and OS-agnostic standard"
(http://12factor.net/config)
4) because the key and value both have to be plain text, it discourages
adding more complicated things as config settings when they really ought
not to be config at all. Look at any mongoid.yml for example. Multi-level
config hashes are a code smell (my opinion)
Against:
1) Environment variables are 'exported by default', making it easy to
do silly things like sending database passwords to Airbrake. Sure we could
introduce code to filter them out, but it's another thing we need to
remember to update every time we add one - not robust in the face of
code changes. Better not to put them there in the first place
2) It provides the "illusion of security": env vars are really no more
secure than files, in that if you can read someone's files you can also
(quite easily in Linux) read the environment variables of their running
processes. This is not to say that files are better, just that they
don't pretend to be.
3) in some respect it's just deferring the problem: in order to start
your production instance those config variables still need to be read
from some source so they can be added to the environment, and 98% of the
time that source will be a local file.
4) if you restart an app by sending it a signal (e.g. SIGHUP) from an
unrelated shell that causes it to re-exec itself, it will still have the
environment of the original process. So for example, you can't update
config in environment variables and do a Unicorn "zero downtime" restart.
This can cause confusion
5) There is no single place in which to look to find out what settings are
accepted/required: even successfully starting the app doesn't mean that some code
path somewhere won't dereference an unset env var sometime later. We don't pass
parameters into modules using arbitrarily-named and undeclared globals, so
why is it OK to pass params into the main program that way?
My argument:
is that what we're really asking for is a configuration source that
a) lives outside the project. This requirement could be met by
environment variables or a file in /etc or even a request to a web
server - see e.g. the AWS instance metadata service
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AESDG-chapter-instancedata.html
but in any case it should be difficult to accidentally merge production
config values into the project version control.
b) can be easily and reliably read using any of a variety of languages
(including shell scripts and the like) without complicated parsing code
or library dependencies (a minimal sketch follows after this list)
c) has limits on its expressiveness, so that people aren't trying to add
code that wants hashes and dates and lists and stuff like that as config
values
d) ideally, makes it hard to accidentally send the configuration values
to our collaborators and external services
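
To make (b) and (c) concrete, here's a minimal sketch (the path and keys are
made up): a flat KEY=value file outside the project that a shell script can
read without any parsing library, and that any other language can read in a
few lines, with no room for nested structures.

    # Hypothetical /etc/myapp/config -- one KEY=value per line, nothing nested:
    #
    #   DATABASE_URL=postgres://db.internal:5432/myapp
    #   AIRBRAKE_KEY=abc123
    #
    def read_flat_config(path)
      File.readlines(path).each_with_object({}) do |line, config|
        line = line.strip
        next if line.empty? || line.start_with?("#")  # skip blanks and comments
        key, value = line.split("=", 2)               # values may contain "="
        config[key] = value
      end
    end

    CONFIG = read_flat_config("/etc/myapp/config")
    # A shell script can read the same file with e.g.
    #   grep '^DATABASE_URL=' /etc/myapp/config
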
Is that a fair representation of the arguments for/against, or am I
missing something?
@telent

telent commented Mar 24, 2014

@simplybusiness/developers if I tag this gist does the team get notified? All comments welcome

@madmax25

madmax25 commented Oct 21, 2014

I agree with the arguments for and against. I found the idea originally on 12factor and it surprised me that it was suggested. I considered it closely; my main concern was that on my local development machine it seemed like it would be a mess to keep up with different sets of configuration for different apps being deployed locally.

Not to mention, it would require a change from using files to using the environment (which I probably would have wanted to manage by sourcing files anyway), which brought me back to: what is the point and necessity of using the environment, what problem does it solve?

The most compelling argument by 12factor is that if your source is immediately viewable, you won't compromise any credentials. I agree with this philosophy, but don't agree with the use of the environment to solve the problem.

@rdj

rdj commented Nov 7, 2014

Given how easily environment variables are disclosed and how likely they are to just be specified in a file someplace anyway, I was very surprised to see this recommendation among 12 factor's other very common sense approaches.

@ifeltsweet

ifeltsweet commented May 14, 2015

Absolutely agreed with this. I have pretty much created the same list while debating config files vs environment variables. Most of the time, as you said, environment variables are still declared somewhere in a file (/etc/profile, ~/.bashrc), so why don't we just create a config file and read it from within our processes?

Some advantages of this are:

  • You don't have to put your sensitive config into process.env, $_ENV, etc., which prevents accidental exposure of such information via 3rd party code or debugging.
  • You have tight control over which processes will read this file

And placing that config file will be no different than placing one for environment variables if you use something like Chef and encrypted data bags. You keep app source code free of sensitive data and only your Chef server has access to encrypted data bags.

@axelmagn

axelmagn commented Jul 28, 2015

A while ago I wrote this library as a proof of concept to solve this problem. It's a bit of an incomplete solution (I'd rather have it be a library that sits on top of a generic hash), but I like the compromise of storing most of the config in a markup file, and having an informal syntax for specifying variables which should be read from the environment instead (as well as specifying the environment key and any default). That way your local dev environment can be the sane default, while production servers need to be explicitly configured and managed.
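
Roughly, the pattern might look like this (made-up placeholder syntax, not the library's actual one): most values live in the YAML file, and a value written as an environment reference is resolved from ENV, falling back to a sane local default.

    require "yaml"

    # ${NAME:-default} placeholders (invented convention) in config/app.yml are
    # resolved from the environment; anything else is taken literally, e.g.
    #   db_url: ${DATABASE_URL:-postgres://localhost/myapp_dev}
    ENV_REF = /\A\$\{(?<name>[A-Z0-9_]+)(?::-(?<default>.*))?\}\z/

    def resolve(value)
      return value unless value.is_a?(String) && (match = value.match(ENV_REF))
      ENV.fetch(match[:name], match[:default])
    end

    raw    = YAML.safe_load(File.read("config/app.yml"))
    CONFIG = raw.transform_values { |v| resolve(v) }
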

@rranelli

rranelli commented Jul 30, 2015

(this is probably old, but I will comment anyway.)

One of the main reasons I use environment variables is the orthogonality between each configuration value. I am able to start a process with a slightly different environment at will (and in a reliable/reproducible way). I can't figure out how to do the same if I am reading configuration files.

I find this possibility invaluable in the kind of debugging I do often.

@daniel-barlow

daniel-barlow commented Feb 12, 2016

> (this is probably old ....)

It is, but I happened to check back :-)

> One of the main reasons I use environment variables is the orthogonality between each configuration value. I am able to start a process with a slightly different environment at will (and in a reliable/reproducible way). I can't figure out how to do the same if I am reading configuration files.

It's a fair question. If you take Unix conventions as your model, your process would accept command-line arguments and use them to override defaults which it read from the config file. Made-up syntax here ...

$ ruby myapp.rb --config-file=/etc/myapp.cfg
$ ruby myapp.rb --config-file=./myapp-test.cfg
$ ruby myapp.rb --config-file=./myapp-test.cfg --with-db=pg://user:pass@some.other.host:5432/dbname

Obviously there are also precedents for using the environment for overrides instead of command-line args (e.g. make does this extensively) which would make that approach more suitable sometimes. Maybe it's a bit hard to get access to the command-line in a Rails app, for example. Also you're unlikely to run into some of the issues with env vars if you're using them to pass parameters to a short-lived debugging session rather than to a production server.
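
A rough sketch of that precedence (invented option names and file format): built-in defaults are overridden by the config file, which is overridden in turn by command-line arguments; an ENV lookup could slot into the same merge if you prefer make-style overrides.

    require "optparse"
    require "yaml"

    # Precedence: defaults < config file < command line (option names invented).
    defaults    = { "db_url" => "pg://localhost/myapp_dev" }
    overrides   = {}
    config_path = "/etc/myapp.cfg"

    OptionParser.new do |opts|
      opts.on("--config-file=PATH") { |path| config_path = path }
      opts.on("--with-db=URL")      { |url|  overrides["db_url"] = url }
    end.parse!(ARGV)

    file_config = File.exist?(config_path) ? YAML.safe_load(File.read(config_path)) : {}
    CONFIG = defaults.merge(file_config || {}).merge(overrides)
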

If you've weighed up the issues and the environment genuinely works best for your use case, it's not my intention to stop you. I'm just hoping that people will think about this stuff instead of cargo-culting it :-)

@sam-github

sam-github commented Feb 16, 2016

I agree with your distillation into the general purpose at the bottom, but I don't think it's really an "env vs file" black and white choice. Env can lead to file. It's not unreasonable to have an env var that points at /etc/some-file.ini, or to point to a config server: redis://some.host:port/name-of-this-config. In the latter, config is treated like a backing service, but the app doesn't pre-know its config names, they are certainly not checked in, and they are easy to maintain across multiple instances of the app that should share config. And even if you go "pure env"... well, that env will be in a file... perhaps just a shell script, possibly in a .env file (env $(cat .env) /some/app).
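
For what it's worth, that "env points at config" idea might look roughly like this (MYAPP_CONFIG and the key layout are made up): one variable names either a file path or a redis:// URL, and the app loads everything else from that source.

    require "uri"
    require "yaml"

    # MYAPP_CONFIG (made-up name) is either a local file path or a redis:// URL
    # naming a hash on a config server; everything else comes from that source.
    source = ENV.fetch("MYAPP_CONFIG", "/etc/myapp/config.yml")

    CONFIG =
      if source.start_with?("redis://")
        require "redis"                                  # assumes the redis gem
        uri = URI.parse(source)
        key = uri.path.delete_prefix("/")                # e.g. "name-of-this-config"
        Redis.new(host: uri.host, port: uri.port || 6379).hgetall(key)
      else
        YAML.safe_load(File.read(source))
      end
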

@iolloyd

iolloyd commented Feb 20, 2016

The main point of the separation of state declaration in the environment is to have that outside of the application itself. Whether one adds these values directly into the environment using /etc/profile or similar, or reads the values in from an external source, is not really the primary discussion; the primary point is that values which change depending on the environment should only be available in that environment.

@omadawn

omadawn commented Feb 23, 2016

On the "they are likely to just be specified in a file someplace anyway" argument in the context of 12 factor the intent is that they aren't stored anywhere that the server using them can fetch them from other than an environment variable. Though one of my biggest complaints about 12 factor is that it doesn't address where they are actually stored.

The idea is that the deployment tool is able to reach out and fetch both your application build and the information the app needs to connect to its persistent store (database location/port, credentials, etc.); it is then able to push not only the built app but also the creds to the host running the app. If that system is compromised by some exploit in, say, ntp or some other process, there is no way for the attacker to fetch these credentials and then use them to connect back to the application store (other than potential exploits allowing them to read the application's environment variables).

Remember, with this architecture if you are sshing into the server and restarting a process you are doing it wrong. Hosts do not have real identities and are not maintained in this architecture. If there is a problem with an application running on node x492323498 then you simply delete it and spin up another instance using the same tool which deployed the app the first time.

Both of these ideas I wholly concur with. Though I'm still not so sure environment variables are the way to go, I'm having trouble coming up with compelling arguments that show files as being MORE secure.

@pajtai

pajtai commented Mar 14, 2016

Even the argument that production configs should be stored outside of VC has some points against it.
Let's say the VC is a private repo. Would you rather have your production configs stored in your private repo with known reader access, or have them in some free-for-all state outside of the known VC? People are going to store the configs somewhere, because you have to safeguard them in order to restart prod. Also, you're going to need staging and local configs.
Seems like storing everything in the VC or a submodule allows you to control reader access the same way you control reader access to VC. If it's just "outside" of VC, everyone's going to do it differently, and that can at times be a greater security risk.

@johnculviner

johnculviner commented May 4, 2016

Coming from a config file for each environment (and maybe a base one with the stuff in common), I'm finding environment variables to be cumbersome for anything more than hello world with MY_DB_CONNECTION type stuff. Almost tempted to put JSON in a config variable, but that also sounds disgusting.

@pfeiferbit

pfeiferbit commented Jul 14, 2016

I'm surprised that Docker or its substitutes (LXC, Rocket) and complements (Vagrant, Chef, Puppet, etc.) were not mentioned in this discussion.

For us developers the difference between reading a config file and reading environment vars is negligible.
But when you think about dockerizing your app it is far easier if the container by itself is already runnable and does not require some hidden dependency in addition to the Dockerfile, such as a shared directory containing a config file.

The Dockerfile (or Vagrantfile, or Apache VHost, etc.) would then become the configuration file for a specific deployment. The env vars set by those are typically isolated to the application process. While still inspectable, there is usually no accidental leak that could affect some other process.

@andreasevers

andreasevers commented Sep 28, 2016

Perhaps take a look at Spring Cloud Config server.
https://cloud.spring.io/spring-cloud-config/

@abiacco

abiacco commented Nov 21, 2016

As an admin and someone who traditionally has supported use of config files, I looked into using env variables briefly when we had a JRuby app. I never found overwhelming evidence in their favor.
We're starting up a new infrastructure now and are seriously considering externally stored credentials using credstash or hashicorp vault. Haven't found any showstoppers yet, but we're still early in the process.

@helpermethod

helpermethod commented May 30, 2017

👍 Spot-on analysis. Still, people are obsessed with env variables because they are used to them and because of the 12 factor app cargo cult.

@ghost

ghost commented Nov 7, 2017

Ok adding here in support of envars:

What I've proposed to our dev team which was well received:

  • Still use a config file, but that config file solely defines WHAT environment variables the application is expecting. The config file is the same, regardless of the environment the code is being deployed to. This config file is committed to source control and is used as a self-documenting way of defining the expected config.

  • DO NOT export envars into the actual shell environment. Instead, only pre-load them at runtime and spawn a child process which is the app itself. Therefore the envars are locked into the application process. The key here is using envdir from daemontools, which can run on any platform without any special config (not platform specific). The source of envdir values is a node-specific, locked-down directory holding the values needed. This directory is the same on every node across our company: /etc/mycorpname/envdir/

  • This can be further locked down by gpg encrypting the envdir source by using gpgenv as well. However we have not explored this yet.

The above methodology actually alleviates some of your arguments against (#1 and #5 are directly slashed). However there is always the chicken-egg scenario. In order to unlock your secrets, you need a secret. That originating secret in my scenario is the envdir directory itself... which is carefully locked down and lives outside of the source. It is unified across all nodes though due to the same envdir path being used, regardless of the node, environment, or application.
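
A sketch of how the first two bullets might fit together (config/required_env.txt is a made-up name; envdir is the real daemontools tool): the committed file declares only the names, the values live under /etc/mycorpname/envdir, and the app fails fast at boot if anything declared is missing, which also gives the single place to look that argument #5 asks for.

    # config/required_env.txt (committed, identical in every environment)
    # declares the names only, one per line:
    #
    #   DATABASE_URL
    #   AIRBRAKE_KEY
    #
    # The values are supplied at runtime without exporting them to the parent
    # shell:
    #   envdir /etc/mycorpname/envdir ruby app.rb
    #
    # Fail fast at boot if anything declared is missing.
    required = File.readlines("config/required_env.txt").map(&:strip).reject(&:empty?)
    missing  = required.reject { |name| ENV.key?(name) }
    abort "Missing environment variables: #{missing.join(', ')}" unless missing.empty?
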

@pizzapanther

pizzapanther commented Dec 7, 2017

One of the organizational problems I've seen with ENVs is when there are too many of them. When someone starts up a new environment for staging or production, they just copy all the ENVs from another environment and don't bother going through each one to see if it is valid or safe to copy. So you end up with a testing environment that can send real money. 🤑 What fun! Been there!

This can happen with configuration in code too but at least it gets checked in and there is usually a better review process.

So I usually go for a hybrid approach. Have most configurations committed and tracked in code in different files, one for development, production, etc. This makes it really clear which configurations are more important. For any configurations that can't be committed to code, use ENVs, or a secure key store if you can handle the extra process complexity.

This approach can still have holes but at least the surface area is reduced.
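
A minimal sketch of that hybrid (file names, keys and the APP_ENV variable are made up): the per-environment files are committed and reviewed, and only the genuinely secret values are pulled from ENV or a key store.

    require "yaml"

    # Committed and reviewed, one file per environment: config/development.yml,
    # config/production.yml, etc. (names invented).
    env    = ENV.fetch("APP_ENV", "development")
    CONFIG = YAML.safe_load(File.read("config/#{env}.yml"))

    # Only the secrets come from the environment (or a secure key store).
    CONFIG["database_password"] = ENV.fetch("DATABASE_PASSWORD")
    CONFIG["payment_api_key"]   = ENV.fetch("PAYMENT_API_KEY")
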

@teamextension

teamextension commented Feb 17, 2018

We created Config, a SaaS for managing configuration files. We use an environment variable just to tell us what environment we are in, and use that information to pull the correct configuration file. This is all done at deployment time. The environment can come from the system itself, or be taken from an Ansible variable. You can achieve a similar effect with Git, but Config specializes in configuration files, and seamlessly handles commonality and differences between environments.

@lwm

lwm commented Jul 28, 2018

lol, a SaaS for managing configuration.
