Virtualenv's `bin/activate` is Doing It Wrong

I'm a Python programmer and frequently work with the excellent virtualenv tool by Ian Bicking.

Virtualenv is a great tool on the whole but there is one glaring problem: the activate script that virtualenv provides as a convenience to enable its functionality requires you to source it with your shell to invoke it. The activate script sets some environment variables in your current environment and defines for you a deactivate shell function which will (attempt to) help you to undo those changes later.

This pattern is abhorrently wrong and un-unix-y. activate should instead do what ssh-agent does, and launch a sub-shell or sub-command with a modified environment.

Problems

The approach of modifying the user's current environment suffers from a number of problems:

  • It breaks if you don't use a supported shell.
  • A separate activate script must be maintained for each supported shell syntax.
  • What do you do if you use no shell at all (i.e., you run programs in a virtualenv from a GUI)?
  • If the deactivate script fails to un-set an environment variable, it may contaminate other environments.
  • If you want to edit deactivate or any other function sourced into your environment, you have to kill your shell and re-source the script to see the changes take effect.
  • If you change the current directory from one to another virtual environment and forget to carefully deactivate and activate as you do so, you may end up using libraries from or making changes in the wrong one!

Virtualenv's activate suffers from a number of other warts as well:

  • You can't simply run the script; you have to learn and employ your shell's "source this script" builtin. Many non-experts frequently stumble over this distinction. Doing away with the recommendation to source a shell script should make virtualenv easier to use.

    # This file must be used with "source bin/activate" *from bash*
    # you cannot run it directly
    
  • In an attempt to preserve the user's old environment, it declares _OLD_VIRTUAL_PATH, _OLD_VIRTUAL_PYTHONHOME, and _OLD_VIRTUAL_PS1, and must define how to restore them upon deactivation. If you happen to want to modify activate to override more variables specific to your environment, you have to do the same.

  • Its default means to display whether or not a virtual environment is currently active (modifying the user's PS1 variable) is fragile. On Debian and Ubuntu boxes it becomes confusing if one enters a subshell, or uses a tool like screen or tmux.

  • It is not executable, and not meant to be used as an executable, yet it lives in a directory named bin.

Doing It Right

Entering and exiting a virtual environment should be like using ssh to connect to another machine. When you're done, a simple exit should restore you to your original, unmodified environment.

An example of a program that does this the Right Way is ssh-agent. In order to communicate the socket that it uses to other programs, it must set some variables (SSH_AUTH_SOCK and SSH_AGENT_PID) in the environment. It provides an option to do what virtualenv does, but the better way is to simply ask ssh-agent to launch your command for you, with a modified environment. ssh-agent $SHELL will launch a sub-shell for you with its environment already modified appropriately for ssh-agent. Most Debian and Ubuntu machines even launch X11 this way; see /etc/X11/Xsession.d/90x11-common_ssh-agent.

Another advantage to the subshell approach is that it is far simpler than the hoops virtualenv jumps through to activate and deactivate an environment. There's no need to set _OLD_ variables since the former environment is restored automatically. There's no need for a deactivate function.

Finally, employing a prompt context variable instead of messing with PS1 would allow the user to define how that information is presented.

A better activate: "inve"

To differentiate, I'm calling this approach "inve" as in "inside this virtual environment, ..." I'll happily take name suggestions.

Launching a subcommand with a modified environment

How do we make an executable like ssh-agent that launches a subcommand with a modified environment? Easy. Call this my_launcher:

#!/bin/sh
export MY_VAR=xyz
exec "$@"

Calling "my_launcher firefox" will launch firefox with MY_VAR set to 'xyz' in its environment. The environment where "my_launcher" is called from will not be disturbed.
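
To see that the calling environment really is left alone, here is a small self-contained sketch. It writes the launcher to a temporary directory and uses `sh -c` in place of firefox; the names `tmpdir`, `child_value`, and `parent_value` exist only for this demo.

```sh
# Demo only: create my_launcher in a throwaway directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/my_launcher" <<'EOF'
#!/bin/sh
export MY_VAR=xyz
exec "$@"
EOF
chmod +x "$tmpdir/my_launcher"

# Make sure the variable is not already set in this shell.
unset MY_VAR

# The launched command sees MY_VAR...
child_value=$("$tmpdir/my_launcher" sh -c 'printf %s "$MY_VAR"')
# ...but the shell that called the launcher does not.
parent_value=${MY_VAR-unset}

echo "child saw:  $child_value"
echo "parent has: $parent_value"

rm -rf "$tmpdir"
```

The launcher replaces itself with the requested command via `exec`, so no extra process lingers and the command's exit status is passed straight back to the caller.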

Simplifying activate

Let's now examine bin/activate to see what we can throw away if we assume that the system takes care of restoring the environment for us when we exit. We don't need the deactivate shell function at all. We don't need any _OLD_ variables. We don't mess with the prompt. What's left?

export VIRTUAL_ENV="/home/mike/var/virtualenvs/myvirtualenv"
export PATH="$VIRTUAL_ENV/bin:$PATH"
unset PYTHONHOME

That's it. Three lines, down from 76. Down from 187 if you count all variants for other shells.

Wrap this with the launcher technique above, call it inve, and ./bin/inve $SHELL spawns a new subshell in the active virtualenv. What if you want a no-argument invocation to default to spawning an activated shell? This is the entire script:

#!/bin/sh
export VIRTUAL_ENV="/home/mike/var/virtualenvs/myvirtualenv"
export PATH="$VIRTUAL_ENV/bin:$PATH"
unset PYTHONHOME
exec "${@:-$SHELL}"

Now bin/inve does what bin/activate should. By the way: this works for all shells. bash, zsh, csh, fish, ksh, and anything else, with one script.
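
As a quick check of the idea, the following sketch builds a throwaway directory that merely looks like a virtualenv and drops the script into its `bin/`. The stub command `inve_demo_tool` and the variable names are invented for this demo; no real virtualenv is involved. The script here unsets `PYTHONHOME`, which is the variable the Python interpreter itself reads.

```sh
# Demo only: a fake "virtualenv" holding one stub command.
venv=$(mktemp -d)
mkdir "$venv/bin"
printf '#!/bin/sh\necho "hello from $VIRTUAL_ENV"\n' > "$venv/bin/inve_demo_tool"

# The inve script, with this demo's path filled in for VIRTUAL_ENV.
cat > "$venv/bin/inve" <<EOF
#!/bin/sh
export VIRTUAL_ENV="$venv"
export PATH="\$VIRTUAL_ENV/bin:\$PATH"
unset PYTHONHOME
exec "\${@:-\$SHELL}"
EOF
chmod +x "$venv/bin/inve_demo_tool" "$venv/bin/inve"

# Inside the environment, the stub is found on PATH by name alone.
inside=$("$venv/bin/inve" inve_demo_tool)
# Outside, the calling shell's PATH was never touched.
outside=$(command -v inve_demo_tool || echo "not found")

echo "inside:  $inside"
echo "outside: $outside"

rm -rf "$venv"
```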

More hacks

Re-enabling current environment modification

Some users source bin/activate from within their own shell scripts, which I don't find quite as offensive.

ssh-agent also supports this style of use. It too has to deal with the syntax differences between shells to do so. It's not hard to enable this; here's one proposal.

#!/bin/sh

# As above, do what's needed to activate
export VIRTUAL_ENV="/home/mike/var/virtualenvs/myvirtualenv"
export PATH="$VIRTUAL_ENV/bin:$PATH"
unset PYTHONHOME

# If the first argument is -s or -c, do what ssh-agent does
if [ "$1" = "-s" ]; then cat <<- DONE
    export VIRTUAL_ENV="$VIRTUAL_ENV";
    export PATH="$PATH";
    unset PYTHONHOME;
DONE
elif [ "$1" = "-c" ]; then cat <<- DONE
    setenv VIRTUAL_ENV "$VIRTUAL_ENV";
    setenv PATH "$PATH";
    unsetenv PYTHONHOME;
DONE

# Otherwise, launch a shell or subcommand
else
    exec "${@:-$SHELL}"
fi

Now inve supports the same -s and -c options that ssh-agent does. Where one might previously have written a script like this:

#!/bin/sh
. ./activate
... (commands) ...

One would now write instead:

#!/bin/sh
eval `./inve -s`
... (commands) ...

Or, for csh:

#!/bin/csh
eval `./inve -c`
... (commands) ...

Unfortunately, I don't know if this "eval the output of a command" technique works for all possible shells.
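
For Bourne-style shells, at least, the technique can be exercised without a real virtualenv. In this sketch `emit_sh_activation` is a hypothetical stand-in for `./inve -s`, and `/tmp/example-venv` is a made-up path that need not exist. Quoting the command substitution keeps the emitted newlines intact when `eval` reads them.

```sh
# Hypothetical stand-in for `./inve -s`: print sh-syntax activation commands.
emit_sh_activation() {
    printf 'export VIRTUAL_ENV="%s";\n' "$1"
    # $PATH stays literal here; it is expanded later, when the text is eval'd.
    printf 'export PATH="%s/bin:$PATH";\n' "$1"
    printf 'unset PYTHONHOME;\n'
}

# Apply the emitted commands to the *current* shell.
eval "$(emit_sh_activation /tmp/example-venv)"

echo "VIRTUAL_ENV=$VIRTUAL_ENV"
echo "first PATH entry: ${PATH%%:*}"
```

Unlike the subshell form, this changes the shell that runs it, so it is best kept to scripts, where the modified environment ends with the script.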

A system-level inve

I find it convenient to employ a "system-level" inve script that lives in my system $PATH, that I can run from anywhere within any virtual environment, and without specifying the full path to 'ENV/bin/inve'. This goes against the intention that "virtualenvs are self-sufficient once created" so I'm not advocating this technique be used instead of ENV/bin/inve.

#!/bin/sh

# inve
#
# usage: inve [COMMAND [ARGS]]
#
# For use with Ian Bicking's virtualenv tool. Attempts to find the root of
# a virtual environment. Then, executes COMMAND with ARGS in the context of
# the activated environment. If no COMMAND is given, inve defaults to a
# subshell.

# First, locate the root of the current virtualenv
while [ "$PWD" != "/" ]; do
    # Stop here if this is the root of a virtualenv
    if [ \
        -x bin/python \
        -a -e lib/python*/site.py \
        -a -e include/python*/Python.h ]
    then
        break
    fi
    cd ..
done
if [ "$PWD" = "/" ]; then
    echo "Could not activate: no virtual environment found." >&2
    exit 1
fi

# Activate
export VIRTUAL_ENV="$PWD"
export PATH="$VIRTUAL_ENV/bin:$PATH"
unset PYTHONHOME
exec "${@:-$SHELL}"

Until an inve-like script gets created in virtualenv bin/ directories, this system-level script will allow you to immediately use the subshell technique with all existing virtualenvs. If ever the inve script does land in virtualenv's bin/, this system level script could be simply a helper that searches for and invokes ENV/bin/inve:

# Locate the root of the current virtualenv
... (same as above) ...
# Activate
exec bin/inve "$@"
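
The search step can also be written as a function that does not change directory, which makes it easy to try on its own. This is a sketch under a relaxed assumption: `find_venv_root` and `has_python_lib` (names invented here) treat any directory with an executable `bin/python` and at least one `lib/python*` directory as a virtualenv root, and do not look for `Python.h`.

```sh
# Return success if DIR contains at least one lib/python* directory.
has_python_lib() {
    for d in "$1"/lib/python*; do
        [ -d "$d" ] && return 0
    done
    return 1
}

# Print the nearest enclosing virtualenv root of DIR (default: current dir).
find_venv_root() {
    dir=$(cd "${1:-.}" && pwd) || return 1
    while [ "$dir" != "/" ]; do
        if [ -x "$dir/bin/python" ] && has_python_lib "$dir"; then
            printf '%s\n' "$dir"
            return 0
        fi
        dir=$(dirname "$dir")
    done
    return 1
}

# Demo only: a fake virtualenv with a nested source directory.
demo=$(mktemp -d)
mkdir -p "$demo/venv/bin" "$demo/venv/lib/python3" "$demo/venv/src/pkg" "$demo/plain"
printf '#!/bin/sh\n' > "$demo/venv/bin/python"
chmod +x "$demo/venv/bin/python"

found=$(find_venv_root "$demo/venv/src/pkg")
if find_venv_root "$demo/plain" >/dev/null; then
    plain_result=found
else
    plain_result=none
fi

echo "root from nested dir: $found"
echo "result outside a virtualenv: $plain_result"

rm -rf "$demo"
```

Looping over the glob avoids handing several matches to a single `[ -e ... ]` test, which fails when more than one `lib/python*` directory exists.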

Don't mess with my prompt

But what about the prompt? Build a PS1 that does the right thing everywhere without needing to be modified to suit a particular purpose. I tend to have a function that collects all the context info this way, in my .bashrc:

function ps1_context {
    # For any of these bits of context that exist, display them and append
    # a space.
    virtualenv=`basename "$VIRTUAL_ENV"`
    for v in "$debian_chroot" "$virtualenv" "$PS1_CONTEXT"; do
        echo -n "${v:+$v }"
    done
}

export PS1="$(ps1_context)"'\u@\h:\w\$ '

This lets the user control their PS1 and it works everywhere, no matter how many subshells or screen sessions you're nested into. This is the only piece that has to be customized per-shell.
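
A variation on the above, offered as a sketch rather than a drop-in replacement: single-quoting the `$(ps1_context)` part defers the call until bash draws each prompt, so the context stays current even if `VIRTUAL_ENV` changes within a running shell. It also skips `basename` when no environment is active and uses `printf` instead of the less portable `echo -n`. The `venv_name` variable is introduced here for illustration.

```sh
ps1_context() {
    # Name of the active virtualenv, or empty if there is none.
    venv_name=${VIRTUAL_ENV:+$(basename "$VIRTUAL_ENV")}
    # Print each non-empty piece of context followed by a space.
    for v in "$debian_chroot" "$venv_name" "$PS1_CONTEXT"; do
        printf '%s' "${v:+$v }"
    done
}

# Single quotes: bash expands $(ps1_context) every time it shows the prompt.
PS1='$(ps1_context)\u@\h:\w\$ '
```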

Conclusion

While activate is intended only as a convenience and is not necessary to work within a virtual environment, most of the programmers I know treat it as a black box and never do without it. I suspect that, in part, the complexity of the script is what prevents more programmers from avoiding it.

Perhaps the worst part about a popular, useful tool like virtualenv using this antipattern is that many other programmers are adopting it as normative and using it for their own work. virtualenvwrapper and dustinlacewell/capn are two examples. Stop doing this, everyone!

I've taken this rant to the virtualenv maintainers and now I'm working on a patch that might eventually get accepted upstream! :-D Just need a bit more time to improve it.

https://github.com/pypa/virtualenv/issues/247
https://groups.google.com/d/topic/python-virtualenv/XzI8GStvKDw/discussion

I think virtualenv is goodish, but you're right that it's not Unix enough.

A particular gripe is that . bin/activate seems to break when called from a shell script running with

set -eu

I'd like to be able to use -eu as it makes bash scripting a lot less error-prone, but virtualenv's activate doesn't want to know.

I don't really feel like activate is a real problem. Most of the time, you just end up spawning a bash before entering a virtualenv, and that's it. If you're lazy:
alias inve='bash --init-file '

Btw, you often end up automating the use of virtualenvs (e.g. using the excellent fabric library) and dealing with a single source is way easier than using a subshell and pipe once-more-escaped commands to it.

I agree. But the real problem I have with virtualenv is how it works on a non-interactive shell. What if I write a Python program using a virtualenv, and then I want to execute it from init.d, supervisor, upstart, cron, etc.? Those can't activate.

The trick I have been using is to hardcode the full path to the virtualenv python binary. For example in a cron I will

*/5 * * * * /home/username/.virtualenv/project/bin/python /home/username/src/project/program.py

In a perfect world it would look something more like this

*/5 * * * * virtualenv_activate projectname -e /home/username/src/project/program.py

activate also modifies $LD_LIBRARY_PATH. inve needs to set that too.

EDIT:

Standard activate does not modify $LD_LIBRARY_PATH. I just remembered that's something I patched in. Where I work, we use the virtual Python environment to be the general container for all the stuff our application needs. We include gevent in our virtual environment, so we also put its dependencies (like libevent, needed by our version of gevent) in the virtual environment under the $VIRTUAL_ENV/lib. It's worked out well for us. It lets us deploy new virtual environment snapshots with libraries upgraded as needed. The virtual environment is the atomic unit of deployment, complete with all libraries and their dependencies. I wonder why this isn't common practice.

Not to be a pain, but to speak up for the great unwashed, please don't forget about Windows in the drive to be more Unix-y. It's possible some of the problems come from Windows compatibility as activate (well, activate.bat) is an executable on Windows.

This is pretty nice, I always put my virtual env in _venv so I wrote an alias that would do something like:
alias venv='source _venv/bin/activate'
but I like this much better.

One thing I do think would be nice is modifying the shell prompt, for example the way virtualenv prepends to the user's prompt after activating. It's nice having that visual reminder so that you know which shells are actively using which virtual environment.

ssh-agent(1) doesn't start a sub-shell. It generates environment variables that you source into your current environment. dircolors(1) does the same thing. I would argue that the pattern of modifying your environment in place is very consistent with how UNIX programmers have solved this problem. Sub-shells are simply an alternative.

You make a lot of great points about the simplicity of solving this problem with sub-shells, and I completely agree that it should be made available as an option -- but nothing about modifying your environment in place is "abhorrently wrong and un-unix-y".

An incremental name suggestion for 'inve' - 'intove' is a little more evocative.

Thanks for a great perspective, code and attention to UX issues that benefit everyone when fixed. :)

Thanks! I've been super annoyed having to switch from fishfish to bash to use virtualenv. :3

Wow, almost a year after I wrote this thing it has been receiving a lot of attention on Hacker News (link) and Reddit (link). I'm very grateful for all the feedback and intend to incorporate as much as I can.

The trick I have been using is to hardcode the full path to the virtualenv python binary. For example in a cron I will

*/5 * * * * /home/username/.virtualenv/project/bin/python /home/username/src/project/program.py

You are not forced to do this.
If your program was properly installed inside the virtualenv (as it should be), then its shebang should have been replaced by the full path of the virtualenv's python interpreter. This was made to answer this problem.

Standard activate does not modify $LD_LIBRARY_PATH. I just remembered that's something I patched in. Where I work, we use the virtual Python environment to be the general container for all the stuff our application needs. We include gevent in our virtual environment, so we also put its dependencies (like libevent, needed by our version of gevent) in the virtual environment under the $VIRTUAL_ENV/lib.

Yeah, we do the exact same thing.

I wonder why this isn't common practice.

Because this isn't really working all that well:

  • If you use pip to provision your venvs: pip only handles runtime dependencies. It only ensures that every dependency is present after installation, so the installation order of your dependencies is not guaranteed to match your dependency graph, and you cannot rely on pip to build source packages. You are limited to distributing binaries. And since pip was made to handle source packages, it provides nothing to support prebuilt binaries for multiple platforms, which forces you to handle this either at the bootstrap layer (i.e., multiple repositories for different platforms, plus a distributed pip.conf) or at the package-maintainer level (the package contains all platforms, and setup.py picks which one to install at installation time).
  • Relying on LD_LIBRARY_PATH is kind of hackish since it hooks directly into library resolution in ld.so, thus affecting your dynamic loader (program execution) AND ld (the linker). This means that any attempt to build something in your virtualenv cannot be trusted.

Doesn't seem to work for me on the fish shell. Creates a subshell just fine but doesn't actually "activate" the virtualenv.

I absolutely agree with this idea; it makes a lot more sense conceptually. If you're developing for a server, you should basically be creating a local version of that server, and a local version of ssh is more analogous. I use virtualenvwrapper, which is simpler to use, but it just abstracts away that complication rather than fixing it.

@honza You should check that, because it does work on fish for me without further hacks.

Do you know Kenneth Reitz's autoenv?
One simply "cd " and BANG!, virtualenv activated.
Then, "cd .." and BANG!, virtualenv deactivated.

The picture speaks for itself... Please give it a try:

https://github.com/kennethreitz/autoenv

To be fair, I only give auto-activating a virtualenv as an example of using capn which is a generic directory-based hooking mechanism. It doesn't really "use" the anti-pattern described here. @caruccio, that might interest you based on your comment, though autoenv looks nice.

That said, I agree with what has been said here. I wonder, like @tclancy how much of this has to do with Windows.

@e000 in fish (or fishfish) you can do . virtualenv/bin/activate.fish

This is a very elegant approach. I used it to create a simple activate/workon script for environments created with the conda tool, which is part of the Anaconda scientific Python distribution. Check it out: https://gist.github.com/mangecoeur/5161488

I can only get this to work with bash not zsh

After an ubuntu upgrade, my virtualenvs broke, I got tired of that setup, and I saw how this could be useful for making a virtualenvwrapper that doesn't even have to touch the shell from which it is invoked...

So, I made a rewrite of virtualenvwrapper in pure python... and I unimaginatively called it "invewrapper" :)

https://github.com/berdario/invewrapper

It works on bash, zsh, fish... and it mostly works on windows too! (there's a bug right now that affects sitepackages_dir and related commands)

Please let me have some feedback :)

A year after reading this, I'm still chewing on it. I think I completely agree, and I don't know why it hasn't gained wider adoption.

@datagrok, do you attend PyCon? If so, you should really do at least a lightning talk on this. It's subtly and elegantly revolutionary IMO.

@lyndsysimon have you tried invewrapper? it could be a way for it to gain wider adoption :D

I've done a bit more work on this--you can see it in my feature/subshell-activate branch of virtualenv. It's not completely working but I think I have a good proof-of-concept. I've implemented a means to do the "activation" in a subshell (the way I like it) without breaking workflow at all for users who prefer to source bin/activate or one of its flavors. (I reimplement the latter in terms of the former.) I'll be pushing more commits whenever I have time.

Also, I just submitted a PyCon 2014 talk proposal based on this work.

Thanks all for your feedback, recommendations, alternative proposals, and encouragement.

@datagrok - I hope to catch your talk; PyCon's time and place is actually convenient for once. :)

@apreche take a look here:

http://rosettacode.org/wiki/Multiline_shebang

I start a lot of my scripts this way....

#!/bin/sh
if "true" : '''\'
then

# set up your execution environment
source /some/stuff

exec python "$0" "$@"

exit 127
fi
'''

print "now we're in Python"

@tsal ah, my proposal was declined. I'll try to boil it down into a lightning talk and submit that when I'm there though. Oh, and will try to finish up my proposed changes to virtualenv, that too. But @berdario's invewrapper looks pretty awesome, he might have gone and eliminated the need for any of my changes! We'll see :)

@datagrok wrote:

Unfortunately, I don't know if this "eval the output of a command" technique works for all possible shells.

Yes it can work. But with many quirks depending on each shell: for example, you must avoid any use of \n and consecutive spaces if you want the code to work even if the user forgot to use quotes around your command (eval "$(xxx)" vs eval $(xxx)).
I'm using this technique for the launch of my prompt engine named angel-PS1 which is portable (on various shells: bash, zsh, dash, ksh, fish, tcsh).

Any thoughts on why this doesn't work with zsh, or what the fix will be to make it more portable?

Mark: Virtualenv the right way.

@Apreche it looks like pew in ve_name your_command might be the answer you're looking for (confirmed for shell scripts, not sure about Python scripts or modules just yet) - see https://stackoverflow.com/questions/22018185/how-can-i-use-pew-in-a-bash-python-fabric-sh-script

Very important note: the provided inve will fail if you do not have the python-dev package installed, as there will not be include/python*/Python.h

@datagrok You should make a note of this or modify the script to function without python-dev.

I've added a pull request in https://github.com/pypa/virtualenv/issues/581 to introduce login/logout shell logic. It is a first step that could be expanded later if merged.

There's a problem using this with oh-my-zsh: the stock ~/.zshrc overwrites the $PATH variable when you launch the new shell. Appending something like this to your ~/.zshrc will fix the problem:

if [ -n "$VIRTUAL_ENV" ]; then
    export PATH="$VIRTUAL_ENV/bin:$PATH"
fi
