tacaswell/overlays.md

## overlays.md

      
Hot-fixing and extending conda environments

Introduction

We deploy root-owned conda environments, which are the basis of the data
collection and analysis environments.  On one hand, because these environments
are owned by root they are write-protected, which ensures that users cannot
accidentally break them; on the other hand, because they are write-protected
they cannot be upgraded or extended.  While we want to run with a stable,
standard, well-understood software environment, we do need the ability to
upgrade and extend it for both development and time-critical hot-fixes.
There are a number of possible solutions to this:

1. use sudo to edit the environment in place (via conda, pip, or "by hand")
2. create / clone the conda environment into user space
3. use $PYTHONPATH and pip install --prefix to create "overlay" directories

Historically, we have primarily gone with options 1 and 2; however, they have
significant downsides.  Modifying the environment requires operations with
elevated privileges, and it is very hard to track what has been done after the
fact.
This document lays out using "overlays" as a technique to both locally replace
already installed packages and to add new packages for development.
General theory of operation

When you run import foo, Python goes through a process to find and load the
requested module.  An early step uses the import path to search disk locations
for the requested module.  This path can be accessed via sys.path.
Installation tools typically place files in directories that Python searches by
default, conventionally site-packages.  In addition to being directly
controllable from inside a Python process, the entries in sys.path can be
controlled via the PYTHONPATH environment variable, whose entries are searched
before site-packages.  When searching for an import, Python stops looking as
soon as it finds the module, which lets you effectively shadow a module by
putting your copy's location earlier in the path.
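The shadowing behaviour can be seen in a small, self-contained experiment. The sketch below is illustrative only: the module name shadow_demo and the two temporary directories are made up for the demonstration, with one directory standing in for the root-owned site-packages and the other for a user-writable overlay.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def which_copy_wins() -> str:
    """Create two copies of a module and report which one Python imports."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp) / "base"        # stands in for the root-owned site-packages
        overlay = Path(tmp) / "overlay"  # stands in for a user-writable overlay
        for directory, label in ((base, "base"), (overlay, "overlay")):
            directory.mkdir()
            # "shadow_demo" is a hypothetical module name used only here
            (directory / "shadow_demo.py").write_text(f"ORIGIN = {label!r}\n")

        saved_path = list(sys.path)
        # The overlay is listed first, so it is searched first.
        sys.path[:0] = [str(overlay), str(base)]
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("shadow_demo")
            return module.ORIGIN
        finally:
            # Undo our changes so the demo leaves the process as it found it.
            sys.path[:] = saved_path
            sys.modules.pop("shadow_demo", None)


if __name__ == "__main__":
    print(which_copy_wins())
```

Both directories contain a module of the same name, but only the copy in the directory that appears earlier in sys.path is loaded.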
Taken together we can now do two things:

install modules someplace we can write to as an unprivileged user
use PYTHONPATH to tell Python to find our modules there

Location, location, location

From Python's point of view these extra files can be anywhere; however, as a
matter of policy we are going to use the location
/some/path/overlays/{env_name}/

as the prefix which means we will have to add the path
/some/path/overlays/{env_name}/lib/{python_version}/site-packages

to the PYTHONPATH.
Similarly, if the package contains anything that will be run from the shell, then
/some/path/overlays/{env_name}/bin

needs to be added to PATH by any mechanism.
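As a minimal sketch of how these locations fit together, the helper below builds the three paths from an overlay root and an environment name. The function name overlay_dirs is hypothetical, and it assumes the usual POSIX layout in which {python_version} expands to a directory such as python3.11; other platforms or interpreters may lay out the prefix differently.

```python
import sys
from pathlib import PurePosixPath


def overlay_dirs(overlay_root, env_name, version_info=sys.version_info):
    """Return the prefix, site-packages, and bin directories of an overlay.

    Assumes the POSIX layout <prefix>/lib/pythonX.Y/site-packages.
    """
    prefix = PurePosixPath(overlay_root) / env_name
    python_dir = f"python{version_info[0]}.{version_info[1]}"
    return {
        "prefix": str(prefix),                                                # for pip --prefix
        "site_packages": str(prefix / "lib" / python_dir / "site-packages"),  # for PYTHONPATH
        "bin": str(prefix / "bin"),                                           # for PATH
    }


if __name__ == "__main__":
    # "/some/path/overlays" and "my_env" are placeholders, not real locations.
    for name, path in overlay_dirs("/some/path/overlays", "my_env").items():
        print(f"{name}: {path}")
```

Computing the paths from the running interpreter's version keeps the PYTHONPATH entry in step with the Python that will actually import from it.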
Install a new package for development

To install a new package into our overlay directory, we use pip's --prefix
flag:
$ conda activate {env_name}
$ pip install --prefix=/some/path/overlays/{env_name} ...
Any dependencies that are already installed in the host environment will be
picked up (conda provides the metadata that pip needs to agree a package is
installed) and any missing dependencies will be installed alongside your
requested package.  All standard pip command line flags and arguments should
work as expected.
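To check ahead of time whether a dependency is already visible as installed, the standard library's importlib.metadata can read the same kind of distribution metadata. This is a rough sketch rather than pip's own logic, and the helper name installed_version is made up for illustration.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # No metadata found on the current import path.
        return None


if __name__ == "__main__":
    # "numpy" is only an example distribution name.
    print(installed_version("numpy"))
```

Run it inside the activated environment; a None result suggests pip would install that dependency into the overlay.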
To access the packages you need to arrange for the site-packages directory in
the overlay to be added to the PYTHONPATH / sys.path.
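One way to arrange this, sketched here under the assumption that you launch Python from another Python process, is to prepend the overlay's site-packages directory to PYTHONPATH in the child's environment. The helper names env_with_overlay and child_sys_path are hypothetical; setting the variable in a shell profile or launcher script achieves the same effect.

```python
import os
import subprocess
import sys


def env_with_overlay(overlay_site_packages, base_env=None):
    """Return a copy of the environment with the overlay first on PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH", "")
    # Keep any existing entries, but put the overlay ahead of them.
    parts = [overlay_site_packages] + ([existing] if existing else [])
    env["PYTHONPATH"] = os.pathsep.join(parts)
    return env


def child_sys_path(env):
    """Start a fresh interpreter with `env` and return its sys.path entries."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

Printing child_sys_path(env_with_overlay(...)) is a quick way to confirm that the overlay directory appears in the new process's sys.path ahead of the environment's own site-packages.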
Upgrade an existing package

If we want to upgrade an existing package using this technique, the above will
fail because, as part of the installation process, pip will (rather sensibly)
attempt to uninstall any existing versions of the package.  Because our host
environment requires elevated privileges to modify, the uninstall fails with a
permissions error.  To upgrade a package we additionally need the -I
(--ignore-installed) flag, which tells pip to ignore any information about
already installed packages and so avoids the permissions error.  However, this
also means that pip is no longer aware of the already installed dependencies.
To avoid re-installing all of the dependencies along with the target package,
we use the --no-deps flag to tell pip not to install them.  Thus:
$ conda activate {env_name}
$ pip install \
    -I --no-deps \
    --prefix=/some/path/overlays/{env_name} \
    ...
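After an upgrade like this, two copies of the package exist, so it is worth checking which one Python will actually import. The sketch below uses importlib.util.find_spec for that; the helper names module_origin and is_from_overlay are hypothetical, and the check only makes sense once the overlay's site-packages is on PYTHONPATH.

```python
import importlib.util
from pathlib import Path


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def is_from_overlay(name, overlay_prefix):
    """Report whether `name` would be imported from under `overlay_prefix`."""
    origin = module_origin(name)
    if origin is None or origin in ("built-in", "frozen"):
        return False
    # True when the overlay prefix is one of the file's parent directories.
    return Path(overlay_prefix).resolve() in Path(origin).resolve().parents


if __name__ == "__main__":
    # "json" and the overlay path are placeholders for your package and prefix.
    print(module_origin("json"))
    print(is_from_overlay("json", "/some/path/overlays/my_env"))
```

If is_from_overlay returns False for a package you just upgraded, the overlay directory is most likely missing from PYTHONPATH or listed after the host environment.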