@acroz
Forked from imrehg/00_README.md
Last active August 12, 2020 19:22
Custom Python environment

Custom Python versions on Faculty Platform

The following environments let you use a different Python version on Faculty Platform servers from those shipped in the default Python2 and Python3 Conda environments (Python 2.7.16 and 3.6.10 respectively, at the time of writing). They are intended as a quick fix while the Platform implements support for newer Python versions.

There are 3 environments included:

  • Python 3.7 - Minimal: this is a barebones environment that is faster to install and can serve as a basis when not all of the default dependencies are required
  • Python 3.7 - Default packages: an environment that installs the default packages found on the platform, plus PyTorch
  • Python 3.7 - GPU default packages: the same as the previous environment, except that it installs the GPU-enabled versions of the TensorFlow and PyTorch libraries

Installation

We recommend adding these environments to the Faculty Knowledge Centre to facilitate easy use across multiple projects. To do this:

  1. Create a new, empty project in which to set up the environments.
  2. In this project, create three environments, one for each of the scripts below. The numbered sections that follow give more detailed instructions.
  3. Publish the three environments to the Knowledge Centre as three separate environments. Users can then copy them from the Knowledge Centre into any project where they would like to use them.

1. Python 3.7 - Minimal

  • Create a new environment with the name "Python 3.7 - Minimal" and the description contained in 01_minimal_description.md below. Make sure to copy the descriptions in "Raw" mode - the Knowledge Centre will render the Markdown formatting.
  • Paste the content of 01_minimal_script.sh in the "Script" section. The environment should save automatically.
  • The environment can now be published to the Knowledge Centre with the "Publish" button in the upper right corner of the screen.

2. Python 3.7 - Default packages

  • Create a new environment with the name "Python 3.7 - Default packages" and the description contained in 02_default_packages_description.md below. Make sure to copy the descriptions in "Raw" mode - the Knowledge Centre will render the Markdown formatting.
  • Paste the content of 02_default_packages_script.sh in the "Script" section. The environment should save automatically.
  • The environment can now be published to the Knowledge Centre with the "Publish" button in the upper right corner of the screen.

3. Python 3.7 - GPU default packages

  • Create a new environment with the name "Python 3.7 - GPU default packages" and the description contained in 03_gpu_default_packages_description.md below. Make sure to copy the descriptions in "Raw" mode - the Knowledge Centre will render the Markdown formatting.
  • Paste the content of 03_gpu_default_packages_script.sh in the "Script" section. The environment should save automatically.
  • The environment can now be published to the Knowledge Centre with the "Publish" button in the upper right corner of the screen.

01_minimal_description.md

This environment installs an additional conda environment running Python 3.7 on a server. None of the default packages normally provided by the platform (e.g. numpy, scikit-learn) are installed. This makes it faster to install, but if you want the full set of default dependencies, use the "Python 3.7 - Default packages" or "Python 3.7 - GPU default packages" environments.

To use this environment, add it to your project and customise it as necessary according to the "Customisations" section below - any dependencies you need can be specified by editing the environment script inside your project.

Using this environment

When this environment is applied to a server, terminals will default to using the newly created conda environment.

When using Jupyter, you have to manually change the kernel of any notebooks you are using. To do this, use the "Kernel > Change kernel" menu option in Jupyter notebooks.

In jobs, you will have to manually activate the environment in your job script. The recommended way to do this is to write a wrapper script like the one below which first runs conda activate Custom-3.7 then runs the actual Python (or other) process. Then, configure your job to run this script with bash scriptname.sh.

#!/bin/bash
conda activate Custom-3.7
python myscript.py
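
If a job needs to pass arguments through to the Python process, the wrapper above can be generalised. The following is a minimal sketch rather than part of the environment scripts: `run_in_env` is a hypothetical helper name, and it assumes that `conda activate` is available in the job's shell in the same way as in the wrapper above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the environment scripts):
# activate a conda environment, then run Python with the given arguments.
# Usage: run_in_env ENV_NAME SCRIPT [ARGS...]
run_in_env() {
    local env_name="$1"
    shift
    # Stop here if activation fails, rather than silently running the
    # job with a different interpreter.
    conda activate "${env_name}" || return 1
    python "$@"
}

# Example call, e.g. as the last line of a job wrapper script
# (the --flag argument is a placeholder):
# run_in_env Custom-3.7 myscript.py --flag
```

Returning early when activation fails means a job stops with an error instead of quietly falling back to whichever Python happens to be first on the PATH.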

Customisations

PYTHON_VERSION sets the version of Python to be installed. Not all Python libraries support all Python versions, and not all combinations of Python version and script have been tested, so check for yourself that the libraries you need work properly with the Python version you choose.

CONDA_ENV is the name of the new Conda environment (not to be confused with a Faculty environment) in which the new Python version is set up. Faculty uses Conda to create isolated Python environments on the platform servers, which can differ in their Python version, their installed packages, etc. The Conda environment name is used to refer to these. By default the Platform currently ships with two environments, Python2 and Python3; this script creates a third, called Custom-3.7 in its current form. You can change this value to something more natural for your use case.

If you want extra packages installed in the environment, the script has marked sections where you can add them to the package lists for conda install and pip install.

Note that with these scripts, you cannot use the Faculty environments' "Python" section to define your Python packages (with Pip and Conda), as that section applies to the Faculty-provided Python environments only. Instead, set your Python package dependencies in the script's marked sections as described above.
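
The script's configuration comments state that the name may only contain ASCII letters, numbers, and the separators `-`, `.` and `_`. Before changing CONDA_ENV, a candidate name can be checked against that rule. This is a minimal sketch; `is_valid_env_name` is a hypothetical helper and is not part of the environment script.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the name is non-empty and consists
# solely of ASCII letters, digits, hyphens, periods, and underscores.
is_valid_env_name() {
    # Force the C locale so that A-Z and a-z cover ASCII letters only.
    local LC_ALL=C
    case "$1" in
        "" | *[!A-Za-z0-9._-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Example:
# is_valid_env_name "Custom-3.7" && echo "name is fine"
```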

01_minimal_script.sh

# Configuration
# The Python version to install (such as 3, or 3.8, or 3.8.1, for example)
PYTHON_VERSION=3.7
# The name of the new kernel in Jupyter
# Kernel names can only contain ASCII letters and numbers,
# and these separators: - . _ (hyphen, period, and underscore).
CONDA_ENV="Custom-${PYTHON_VERSION}"
# End of configuration
# Stop on any error
set -o errexit
/opt/anaconda/bin/conda create -n "${CONDA_ENV}" -y python=${PYTHON_VERSION}
# Install packages from conda
/opt/anaconda/bin/conda install -n "${CONDA_ENV}" -y \
nb_conda_kernels==2.1.1
# Add extra Conda packages here, don't forget the \ at the end of the line for newlines
/opt/anaconda/envs/${CONDA_ENV}/bin/pip install --upgrade pip
# If installing any packages with pip, uncomment the following line and
# replace PACKAGEA, ... with the package names
# /opt/anaconda/envs/${CONDA_ENV}/bin/pip install PACKAGEA PACKAGEB
# Export relevant environment variables
# https://docs.faculty.ai/how_to/environment_variables.html#setting-environment-variables-through-faculty-environments
SHARED_ENV_SCRIPT="/etc/faculty_environment.d/conda.sh"
echo "export CONDA_ENV=${CONDA_ENV}" >> "${SHARED_ENV_SCRIPT}"
# Activate the conda environment in interactive shells automatically
sudo tee /etc/profile.d/zz-path.sh <<'END'
if [ "$BASH" ] && [ "$BASH" != "/bin/sh" ] && \
[ "$FACULTY_IS_INTERACTIVE" = "1" ]; then
conda activate "${CONDA_ENV}"
fi
END
# Restart Jupyter to pick up the new conda environment
if [ "$FACULTY_IS_INTERACTIVE" = "1" ]; then
sudo sv restart jupyter
fi
conda activate "${CONDA_ENV}"
echo "Python version:"
python -V
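
The script ends by printing the Python version so that it can be checked by eye. A stricter variant could compare the reported version with the requested PYTHON_VERSION. The sketch below is an illustration only: `version_matches` is a hypothetical helper, and it treats a requested version such as 3.7 as a prefix, so that 3.7.6 matches but 3.8.1 does not.

```shell
#!/bin/bash
# Hypothetical helper: does the actual version fall under the requested one?
# Usage: version_matches WANTED ACTUAL
version_matches() {
    local wanted="$1" actual="$2"
    case "${actual}" in
        # Exact match, or the requested version followed by further components.
        "${wanted}" | "${wanted}".*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example, to be run with the new environment active:
# actual="$(python -c 'import platform; print(platform.python_version())')"
# version_matches "${PYTHON_VERSION}" "${actual}" \
#     || echo "Unexpected Python version: ${actual}" >&2
```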

02_default_packages_description.md

This environment installs an additional conda environment running Python 3.7 on a server, with all the packages provided by the platform in the default Python2 and Python3 environments, plus PyTorch. If you want a minimal environment that's faster to install, use the "Python 3.7 - Minimal" environment, or if you need dependencies with GPU support, use "Python 3.7 - GPU default packages".

To use this environment, add it to your project and customise it as necessary according to the "Customisations" section below - any dependencies you need can be specified by editing the environment script inside your project.

Using this environment

When this environment is applied to a server, terminals will default to using the newly created conda environment.

When using Jupyter, you have to manually change the kernel of any notebooks you are using. To do this, use the "Kernel > Change kernel" menu option in Jupyter notebooks.

In jobs, you will have to manually activate the environment in your job script. The recommended way to do this is to write a wrapper script like the one below which first runs conda activate Custom-3.7 then runs the actual Python (or other) process. Then, configure your job to run this script with bash scriptname.sh.

#!/bin/bash
conda activate Custom-3.7
python myscript.py
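
A job can only activate the environment on a server where it has been installed, so a wrapper may want to check for it first and fail with a clear message. The following is a sketch under stated assumptions: `env_exists` is a hypothetical helper, and it assumes that `conda env list` prints each environment's name in the first column of its output.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a conda environment with this name exists.
# Assumes `conda env list` prints the environment name in the first column.
env_exists() {
    conda env list | awk -v name="$1" '
        $1 == name { found = 1 }
        END { exit !found }
    '
}

# Example:
# env_exists Custom-3.7 || { echo "Custom-3.7 is not installed" >&2; exit 1; }
```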

Customisations

PYTHON_VERSION sets the version of Python to be installed. Not all Python libraries support all Python versions, and not all combinations of Python version and script have been tested, so check for yourself that the libraries you need work properly with the Python version you choose.

CONDA_ENV is the name of the new Conda environment (not to be confused with a Faculty environment) in which the new Python version is set up. Faculty uses Conda to create isolated Python environments on the platform servers, which can differ in their Python version, their installed packages, etc. The Conda environment name is used to refer to these. By default the Platform currently ships with two environments, Python2 and Python3; this script creates a third, called Custom-3.7 in its current form. You can change this value to something more natural for your use case.

If you want extra packages installed in the environment, the script has marked sections where you can add them to the package lists for conda install and pip install.

Note that with these scripts, you cannot use the Faculty environments' "Python" section to define your Python packages (with Pip and Conda), as that section applies to the Faculty-provided Python environments only. Instead, set your Python package dependencies in the script's marked sections as described above.
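
As an alternative to editing the long package lists in place, extra pip packages could be installed through a small helper that does nothing when no packages are given. This is only a sketch: `install_extra` is a hypothetical helper that is not in the scripts, and PACKAGEA and PACKAGEB are placeholders for real package names.

```shell
#!/bin/bash
# Hypothetical helper: install extra packages with a given pip executable.
# Usage: install_extra PIP_BIN [PACKAGE...]
install_extra() {
    local pip_bin="$1"
    shift
    # Nothing to do when no extra packages are requested.
    [ "$#" -gt 0 ] || return 0
    "${pip_bin}" install "$@"
}

# Example, using the pip that belongs to the new conda environment:
# install_extra "/opt/anaconda/envs/${CONDA_ENV}/bin/pip" PACKAGEA 'PACKAGEB==1.2.3'
```

Passing the pip path explicitly keeps the install targeted at the new conda environment rather than whichever pip is first on the PATH.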

02_default_packages_script.sh

# Configuration
# The Python version to install (such as 3, or 3.8, or 3.8.1, for example)
PYTHON_VERSION=3.7
# The name of the new kernel in Jupyter
# Kernel names can only contain ASCII letters and numbers,
# and these separators: - . _ (hyphen, period, and underscore).
CONDA_ENV="Custom-${PYTHON_VERSION}"
# End of configuration
# Stop on any error
set -o errexit
/opt/anaconda/bin/conda create -n "${CONDA_ENV}" -y python=${PYTHON_VERSION}
# Install packages that are installed by default in the Faculty user images
/opt/anaconda/bin/conda install -n "${CONDA_ENV}" -y \
anaconda \
psycopg2 \
plotly \
flask \
'gunicorn==19.9.0' \
gevent \
greenlet \
nb_conda_kernels==2.1.1
# Add extra Conda packages here, don't forget the \ at the end of the line for newlines
# Install PyTorch from the pytorch conda channel
/opt/anaconda/bin/conda install -n "${CONDA_ENV}" -y -c pytorch pytorch
# pip-installed dependencies
/opt/anaconda/envs/${CONDA_ENV}/bin/pip install --upgrade pip
/opt/anaconda/envs/${CONDA_ENV}/bin/pip install \
ipykernel \
boto3 \
google-cloud-storage \
google-cloud-bigquery \
pandas-gbq \
sqlalchemy \
psycopg2 \
mysqlclient \
lens \
tensorflow~=1.14.0 \
graphviz \
nbdime \
sherlockml \
faculty \
mlflow==1.7.2 \
mlflow-faculty==0.5.0 \
faculty-models
# Add extra pip packages here, don't forget the \ at the end of the line for newlines
# Export relevant environment variables
# https://docs.faculty.ai/how_to/environment_variables.html#setting-environment-variables-through-faculty-environments
SHARED_ENV_SCRIPT="/etc/faculty_environment.d/conda.sh"
echo "export CONDA_ENV=${CONDA_ENV}" >> "${SHARED_ENV_SCRIPT}"
# Activate the conda environment in interactive shells automatically
sudo tee /etc/profile.d/zz-path.sh <<'END'
if [ "$BASH" ] && [ "$BASH" != "/bin/sh" ] && \
[ "$FACULTY_IS_INTERACTIVE" = "1" ]; then
conda activate "${CONDA_ENV}"
fi
END
# Restart Jupyter to pick up the new conda environment
if [ "$FACULTY_IS_INTERACTIVE" = "1" ]; then
sudo sv restart jupyter
fi
conda activate "${CONDA_ENV}"
echo "Python version:"
python -V
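
One detail of the script above is easy to miss: the heredoc that writes /etc/profile.d/zz-path.sh uses a quoted delimiter ('END'), so `${CONDA_ENV}` is written to the file literally and only expanded later, when a shell reads that file. This is presumably why the script also exports CONDA_ENV through SHARED_ENV_SCRIPT. The standalone sketch below shows the difference between quoted and unquoted delimiters; DEMO_ENV is a stand-in variable used only for this illustration.

```shell
#!/bin/bash
DEMO_ENV="Custom-3.7"

# Quoted delimiter: the text is kept literally, so the variable would be
# expanded later, by whichever shell reads the generated file.
quoted="$(cat <<'END'
conda activate "${DEMO_ENV}"
END
)"

# Unquoted delimiter: the variable is expanded now, while the text is written.
unquoted="$(cat <<END
conda activate "${DEMO_ENV}"
END
)"

printf '%s\n' "${quoted}" "${unquoted}"
# prints:
#   conda activate "${DEMO_ENV}"
#   conda activate "Custom-3.7"
```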

03_gpu_default_packages_description.md

This environment installs an additional conda environment running Python 3.7 on a server, with all the packages provided by the platform in the default Python2 and Python3 environments, plus PyTorch, using the GPU-enabled versions of TensorFlow and PyTorch. If you want a minimal environment that's faster to install, use the "Python 3.7 - Minimal" environment, or if you want the same environment without GPU support, use "Python 3.7 - Default packages".

To use this environment, add it to your project and customise it as necessary according to the "Customisations" section below - any dependencies you need can be specified by editing the environment script inside your project.

Using this environment

When this environment is applied to a server, terminals will default to using the newly created conda environment.

When using Jupyter, you have to manually change the kernel of any notebooks you are using. To do this, use the "Kernel > Change kernel" menu option in Jupyter notebooks.

In jobs, you will have to manually activate the environment in your job script. The recommended way to do this is to write a wrapper script like the one below which first runs conda activate Custom-3.7 then runs the actual Python (or other) process. Then, configure your job to run this script with bash scriptname.sh.

#!/bin/bash
conda activate Custom-3.7
python myscript.py

Customisations

PYTHON_VERSION sets the version of Python to be installed. Not all Python libraries support all Python versions, and not all combinations of Python version and script have been tested, so check for yourself that the libraries you need work properly with the Python version you choose.

CONDA_ENV is the name of the new Conda environment (not to be confused with a Faculty environment) in which the new Python version is set up. Faculty uses Conda to create isolated Python environments on the platform servers, which can differ in their Python version, their installed packages, etc. The Conda environment name is used to refer to these. By default the Platform currently ships with two environments, Python2 and Python3; this script creates a third, called Custom-3.7 in its current form. You can change this value to something more natural for your use case.

If you want extra packages installed in the environment, the script has marked sections where you can add them to the package lists for conda install and pip install.

Note that with these scripts, you cannot use the Faculty environments' "Python" section to define your Python packages (with Pip and Conda), as that section applies to the Faculty-provided Python environments only. Instead, set your Python package dependencies in the script's marked sections as described above.

03_gpu_default_packages_script.sh

# Configuration
# The Python version to install (such as 3, or 3.8, or 3.8.1, for example)
PYTHON_VERSION=3.7
# The name of the new kernel in Jupyter
# Kernel names can only contain ASCII letters and numbers,
# and these separators: - . _ (hyphen, period, and underscore).
CONDA_ENV="Custom-${PYTHON_VERSION}"
# End of configuration
# Stop on any error
set -o errexit
/opt/anaconda/bin/conda create -n "${CONDA_ENV}" -y python=${PYTHON_VERSION}
# Install packages that are installed by default in the Faculty user images
/opt/anaconda/bin/conda install -n "${CONDA_ENV}" -y \
anaconda \
psycopg2 \
plotly \
flask \
'gunicorn==19.9.0' \
gevent \
greenlet \
nb_conda_kernels==2.1.1 \
cudatoolkit
# Add extra Conda packages here, don't forget the \ at the end of the line for newlines
# Install PyTorch from the pytorch conda channel
/opt/anaconda/bin/conda install -n "${CONDA_ENV}" -y -c pytorch pytorch
# pip-installed dependencies
/opt/anaconda/envs/${CONDA_ENV}/bin/pip install --upgrade pip
/opt/anaconda/envs/${CONDA_ENV}/bin/pip install \
ipykernel \
boto3 \
google-cloud-storage \
google-cloud-bigquery \
pandas-gbq \
sqlalchemy \
psycopg2 \
mysqlclient \
lens \
tensorflow-gpu~=1.14.0 \
graphviz \
nbdime \
sherlockml \
faculty \
mlflow==1.7.2 \
mlflow-faculty==0.5.0 \
faculty-models
# Add extra pip packages here, don't forget the \ at the end of the line for newlines
# Export relevant environment variables
# https://docs.faculty.ai/how_to/environment_variables.html#setting-environment-variables-through-faculty-environments
SHARED_ENV_SCRIPT="/etc/faculty_environment.d/conda.sh"
echo "export CONDA_ENV=${CONDA_ENV}" >> "${SHARED_ENV_SCRIPT}"
# Activate the conda environment in interactive shells automatically
sudo tee /etc/profile.d/zz-path.sh <<'END'
if [ "$BASH" ] && [ "$BASH" != "/bin/sh" ] && \
[ "$FACULTY_IS_INTERACTIVE" = "1" ]; then
conda activate "${CONDA_ENV}"
fi
END
# Restart Jupyter to pick up the new conda environment
if [ "$FACULTY_IS_INTERACTIVE" = "1" ]; then
sudo sv restart jupyter
fi
conda activate "${CONDA_ENV}"
echo "Python version:"
python -V
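
After applying the GPU environment to a GPU server, it can be worth confirming that the GPU build of PyTorch actually sees a device. The following is a minimal sketch, not part of the script: `torch_sees_gpu` is a hypothetical helper, and it assumes the new conda environment is active so that `python` resolves to its interpreter.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if PyTorch reports a usable CUDA device.
# Returns 0 if a GPU is visible, 1 if not, and 2 if Python or the import fails.
torch_sees_gpu() {
    local result
    result="$(python -c 'import torch; print(torch.cuda.is_available())')" || return 2
    [ "${result}" = "True" ]
}

# Example, to be run in a terminal with the environment active:
# torch_sees_gpu && echo "GPU available" || echo "No GPU visible to PyTorch"
```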