@bear
Last active May 24, 2016 22:55
python configuration blog post

Python Application Configuration

(or how I grew to love environment variables)

“Every program and every privileged user of the system should operate using the least amount of privilege necessary to complete the job.” -- Jerome Saltzer, “Protection and the control of information sharing in Multics”, Communications of the ACM, 1974

Application configuration is possibly a more divisive topic for any given group of developers than asking which editor to use or what full-stack means -- there are dozens of ways to set the running state of your application at startup. To be clear, this discussion does not cover the sub-category of secrets management -- while secrets are often passed using the same methods outlined below, they should be discussed separately because they call for tighter restrictions.

For Python you have a fixed set of options. Some would quibble about my use of the word “fixed”, but if you examine the other methods, they all devolve into a small number of ways to pass the configuration to the application:

  • From the command line
  • By way of a configuration (or settings) file
  • Via environment variables using either a source’d settings file or from the process manager

All of these have their place within the configuration management toolbox and are useful; they also carry some pros and cons, which I’ll discuss.

From the command line

The command line is where most operations and sys-admin folk live, and it’s here that arguments and parameters are omnipresent and will never go away. Using parameters lets your application be self-documenting by way of --help. They also allow for flexibility across your different environments by way of templates.
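As a minimal sketch of the self-documenting point, the standard library's argparse module builds the --help text from the parameter definitions themselves. The program name myservice and the options --listen-port and --log-level are made up for illustration; they are not from any real service.

```python
import argparse


def build_parser():
    # Hypothetical options; the names and defaults are illustrative only.
    parser = argparse.ArgumentParser(
        prog="myservice",
        description="Example service configured from the command line.",
    )
    parser.add_argument(
        "--listen-port", type=int, default=8080,
        help="TCP port to listen on (default: %(default)s)",
    )
    parser.add_argument(
        "--log-level", default="INFO",
        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
        help="logging verbosity (default: %(default)s)",
    )
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--listen-port", "9000"])
print(args.listen_port, args.log_level)
```

Calling build_parser().print_help() (or passing --help on a real command line) prints the usage text generated from those definitions, so the documentation cannot drift from the options the program actually accepts.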

Some of the reasons to not use command line parameters are shell escaping, parameter value size and complexity. You will more than likely need to pass in some item that has values that cannot be properly entered on the command line without some form of escaping. You will also run across size issues as some environments limit the maximum size allowed for a single command.

The most significant negative is that they are visible to anyone on the server using the basic “list all processes” command ps ax. This shows the full command line, including any passwords or resource endpoints you may need but don’t want to share.

These restrictions almost always lead to storing the command and its parameters in a script and, as above, this is both a positive and a negative. The plus side is that you can now use a template or some other external method to inject the values into your script, e.g. cloud-init or consul-template. The downside is that, as with command line params, anyone with read access to the script can inspect the file and discover its contents, which leads us to our next possible solution.

Configuration or settings file

Ok, so you worked through the issues with using command line parameters and you may have even moved them to a startup script, but you didn't like the visibility issue. That leads you to using a settings file which is then referenced during your application's startup. Everything is now perfect, right? Let's walk through that option and find out.

With the configuration or settings file, you may not have any size or escaping issue to worry about as your application is now tasked with reading the file properly, but you still have the visibility issue. Any user or process could potentially dump the contents of the settings file and, if it has (or can obtain) write permissions, can now alter your settings file. Ensuring that your application is the only process that can read the settings file is crucial -- really your application should not even have write permissions as the settings file should be inserted into its environment. This type of service environment can be found with any server that allows for cloud-init -- you can then have the file pushed to the server during startup and stored in a read-only filesystem. This approach mirrors what I list later as my favourite.
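One way to enforce the "only my application can read it" rule from inside the application is to refuse to load a settings file that group or other users can access. This is a rough sketch, assuming a POSIX system and an INI-style file; load_settings is a hypothetical helper name, and the permission policy is my own choice rather than anything mandated by configparser.

```python
import configparser
import os
import stat


def load_settings(path):
    """Read an INI settings file, refusing one that other users can access."""
    parser = configparser.ConfigParser()
    with open(path) as handle:
        # Check the file we actually opened, not whatever the path points at later.
        mode = os.fstat(handle.fileno()).st_mode
        # Any group/other permission bit means the file is not private to its owner.
        if mode & (stat.S_IRWXG | stat.S_IRWXO):
            raise PermissionError(
                "%s is accessible by group/other (mode %o)"
                % (path, stat.S_IMODE(mode))
            )
        parser.read_file(handle)
    return parser
```

With a file created as mode 0600 this returns the parsed settings; after a chmod 644 the same call raises PermissionError, which turns a quiet misconfiguration into a loud startup failure.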

Environment variables

So, that leaves us with environment variables!

I can already hear everyone firing up their twitters...

But Wait! Won’t they appear in the process list also!

But Wait! I don’t want to have 50 environment variables, what are we a CGI app?!?

The first one is a good point and I’m just going to ignore the second as a courtesy to everyone reading who may have survived CGI web applications :)

So let’s clear up the first concern by running a test on the two ways you can set environment variables and have them available to your application.

First, you can set an environment variable directly

    FOO=bar python -c "import os, time; print(os.environ.get('FOO')); time.sleep(10)"

And secondly as part of a bootstrap script

    #!/bin/bash
    export FOO=bar
    python -c "import os, time; print(os.environ.get('FOO')); time.sleep(10)"

Running either of the above will let you look at the command line while the script is in the time.sleep() call using the ps ax command:

    Bear    4321    0.0    0.0    2392959    676    /Users/bear/bin/python -c import os, time; print(os.environ.get('FOO')); time.sleep(10)

Or by running cat /proc/4321/cmdline

    /Users/bear/bin/python -c import os, time; print(os.environ.get('FOO')); time.sleep(10)

Notice that in both cases the FOO=bar assignment does not appear anywhere in the output. You would see the same thing for environment variables set via a calling script or by your process manager du-jour (shout out to team runit!).
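The same experiment can be sketched from Python: a parent launches a child with a value passed only through the environment, and the argument list handed to the operating system (which is what ps displays) never contains that value. The helper name run_child is hypothetical and the snippet assumes Python 3.7 or later for capture_output.

```python
import os
import subprocess
import sys

# The child only reports the value of FOO that it received.
CHILD_CODE = "import os; print(os.environ.get('FOO'))"


def run_child(value):
    """Start a child with FOO set in its environment, not on its command line."""
    command = [sys.executable, "-c", CHILD_CODE]
    env = dict(os.environ, FOO=value)
    result = subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )
    return command, result.stdout.strip()


command, seen = run_child("bar")
print("argument list passed to the OS:", command)
print("value seen by the child:", seen)
```

The printed argument list contains the interpreter path, -c and the code string, while the value itself travels separately in the env mapping.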

Now realistically you can never prevent the root user (or anyone who can attain root privileges) from inspecting a running process, but if that is the case you have more pressing matters to worry about! What you can do is ensure that your process manager lets you drop privileges for any service being run, and that you set the minimum permissions needed on whatever file supplies the command line or environment variables, like this runit example:

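    #!/bin/sh
    # Restrict PATH to the service directory; setuidgid must be reachable from it.
    PATH='/home/myservice'
    # Send stderr to stdout so the log service captures both.
    exec 2>&1
    # Drop to the unprivileged myservice user before starting the service.
    exec setuidgid myservice /home/myservice/service "$MYSERVICE_CONFIG_FILE"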

Now the above is a contrived example, sure, but we are using the setuidgid command to run the service as the unprivileged myservice user, setting an explicit PATH environment and then also telling the service to start with an explicit configuration file. The value of $MYSERVICE_CONFIG_FILE will have been set via the environment var options of runit, which I'm hand-waving about right now.

Summary

Like all things, in a production environment you will end up with a mix of the above for your application. In some of the production systems I’ve managed, it was very common to use a configuration file to store the developer defaults. The code would then check for the presence of the environment variable ENV and, if found, retrieve any key values from the environment or from tools like consul or etcd, using a combination of host or role as a unique key for the environment needed.
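A rough sketch of that layering, assuming the developer defaults have already been read into a dictionary: the keys DATABASE_URL and LOG_LEVEL and the function name load_config are invented for illustration, and the consul/etcd lookup is left out entirely.

```python
import os

# Hypothetical developer defaults; in practice these would come from a settings file.
DEFAULTS = {"DATABASE_URL": "sqlite:///dev.db", "LOG_LEVEL": "DEBUG"}


def load_config(defaults, environ=None):
    """Start from defaults, then apply environment overrides when ENV is set."""
    environ = os.environ if environ is None else environ
    config = dict(defaults)
    # Only consult the environment when ENV marks this as a deployed environment.
    if environ.get("ENV"):
        config["ENV"] = environ["ENV"]
        for key in defaults:
            if key in environ:
                config[key] = environ[key]
    return config


print(load_config(DEFAULTS, {"ENV": "production", "LOG_LEVEL": "WARNING"}))
```

Passing the environment in as a parameter keeps the function easy to test; with no ENV present the developer defaults come back unchanged.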

One thing to be very aware of when you are dealing with Python application frameworks like Django or Flask is that they often have debug modes that will gladly output everything about the environment they are being run within! This is one of the reasons you see the examples here using services being run in user-space -- we take advantage of the operating system's (hopefully) robust user isolation tools to reduce the amount of information leaked.

Thankfully there are already a few implementations of the above practices -- see Flask Environments and Django Environ.

I would make sure to read about the Flask debugger and the Django debugger -- both of these add enhancements to the underlying Werkzeug debugger.

The ultimate goal is to make sure that things that are secret or sensitive are passed into your application with as little visibility as possible and that anywhere those values "rest" the access to them is constrained.

The final thing to realize is that, like code, environments need to be tested -- make sure to add sanity checks that run post-deploy and look for the sort of secrets you don’t want leaked, so you will at least get alerted when something glitches. Which, as we all know, will happen.
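One possible shape for such a sanity check, as a sketch only: given some text captured after a deploy (a log excerpt, an error page body) and the names of environment variables that hold sensitive values, report which of those values appear verbatim. The function name find_leaked_secrets and the variable names DB_PASSWORD and API_KEY are hypothetical, and how the text is collected is out of scope here.

```python
import os


def find_leaked_secrets(text, secret_names, environ=None):
    """Return the names of secrets whose values appear verbatim in text."""
    environ = os.environ if environ is None else environ
    leaked = []
    for name in secret_names:
        value = environ.get(name)
        # Skip unset or empty values; an empty string would match any text.
        if value and value in text:
            leaked.append(name)
    return leaked


# Illustrative values only; a real check would read the deployed environment.
sample_env = {"DB_PASSWORD": "hunter2", "API_KEY": "abc123"}
sample_page = "DEBUG: connecting with password=hunter2"
print(find_leaked_secrets(sample_page, ["DB_PASSWORD", "API_KEY"], sample_env))
```

A non-empty result is the signal to raise an alert. Reporting only the variable names, not the values, keeps the check itself from becoming another leak.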

@pdurbin

pdurbin commented May 24, 2016

should be "its contents"

@pdurbin

pdurbin commented May 24, 2016

you have a "not not"

@pdurbin

pdurbin commented May 24, 2016

should be "its environment"

@bear
Author

bear commented May 24, 2016

thanks! (fixed)
