@mgbckr
Last active December 23, 2022 09:49
Dockerfile with JupyterLab and custom environment based on micromamba and mamba

⚠️ ATTENTION: The container is intended to run in ROOTLESS mode!

The Dockerfile provides a project container that contains Jupyter and provides a conda environment specified by environment.yml. The container uses mamba to install the environment. Within the container, you can install new packages with mamba.

Set up

  • create a custom environment.yml

  • build the container

    docker build -t PROJECT_NAME:VERSION .
  • run

    • Jupyter:

      # Jupyter will be available on port `27101`
      docker run --name CONTAINER_NAME -v "${PWD}":/workspace -p 27101:8888 -it PROJECT_NAME:VERSION start.sh jupyter
    • zsh

      docker run --name CONTAINER_NAME -v "${PWD}":/workspace -it PROJECT_NAME:VERSION
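
The run step above can be wrapped in a small helper so the two modes share one definition. This is a minimal sketch, not part of the gist: the values `myproject:0.1`, `myproject`, the function name `run_container`, and the `DRY_RUN` switch are hypothetical and only illustrate the idea. By default it prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical example values; replace them with your own image and container names.
IMAGE="myproject:0.1"
CONTAINER_NAME="myproject"
HOST_PORT=27101   # host port mapped to Jupyter's port 8888 inside the container

# Assemble the `docker run` invocation for a mode ("jupyter" or "zsh").
# With DRY_RUN=1 (the default in this sketch) the command is only printed.
run_container() {
    mode="${1:-zsh}"
    set -- docker run --name "$CONTAINER_NAME" -v "${PWD}:/workspace"
    case "$mode" in
        jupyter) set -- "$@" -p "${HOST_PORT}:8888" -it "$IMAGE" start.sh jupyter ;;
        zsh)     set -- "$@" -it "$IMAGE" ;;
        *)       echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Calling `run_container jupyter` prints the Jupyter command from the list above; `DRY_RUN=0 run_container jupyter` would actually start the container.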

Notes

  • run with --gpus all to support GPUs; if you are running in rootless mode (which you should!), make sure no-cgroups = true is set in /etc/nvidia-container-runtime/config.toml (source). Also, in environment.yml you need to pin a CUDA build of pytorch, e.g., via pytorch=*=*cuda11.7*; otherwise mamba installs the CPU version (see bug report).
  • --shm-size=128gb increases the shared memory size, which might be necessary when fitting and training larger models
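
The rootless GPU prerequisite from the notes above can be checked before starting a container. This is a rough sketch under stated assumptions: the function name `has_no_cgroups` is made up for illustration, and the check is a simple text match on the config file rather than a real TOML parse.

```shell
#!/bin/sh
# Sketch: succeed if the NVIDIA container runtime config sets `no-cgroups = true`.
# The default path is the one named in the notes; pass another path to override it.
has_no_cgroups() {
    config="${1:-/etc/nvidia-container-runtime/config.toml}"
    # Match an uncommented `no-cgroups = true` line, ignoring surrounding spaces.
    grep -Eq '^[[:space:]]*no-cgroups[[:space:]]*=[[:space:]]*true' "$config"
}

# Example use before starting a GPU container (names are placeholders):
#   has_no_cgroups || echo "set no-cgroups = true first" >&2
#   docker run --gpus all --shm-size=128gb --name CONTAINER_NAME \
#       -v "${PWD}":/workspace -it PROJECT_NAME:VERSION
```

Once inside the container, `python -c "import torch; print(torch.cuda.is_available())"` is a quick way to see whether the CUDA build of pytorch was actually installed.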

TODO

  • I am actually not sure whether mamba/micromamba is a good idea at this point (but conda is just too slow :(). I have already run into a couple of issues.

  • replace mamba with micromamba as soon as the following bug is fixed (we are not using conda because it is tremendously, unbelievably, horrendously slow when installing pytorch!!!): mamba-org/mamba#2167 (comment)

  • maybe support user mode like the Jupyter Docker Stack containers (or replace this with Jupyter Stack containers?)

  • secure Jupyterlab!

  • container musings:

    • podman might be able to actually map local users as we wanted to -.- ... maybe it can also do this with root though
    • what about singularity? how do podman and singularity compare?
    • easy way to make an image user specific linked by Singularity Jupyter Image
    • Docker
      • I also tried running with a command like this in rootless mode (with the correct user and group ids, i.e., the same as the calling user) but it messed up my file permissions, changing them to some weird number:

        docker run -it --rm \
            -p 8889:8888 \
            --user root \
            -e NB_USER="mgbckr" -e NB_UID=1001 -e NB_GID=1001 \
            -e CHOWN_HOME=yes -e CHOWN_HOME_OPTS='-R' \
            -w "/home/mgbckr" \
            -v "${PWD}:/home/mgbckr/work" jupyter/base-notebook
      • running Docker with --user also didn't get me anywhere ... I was still seeing things as root and the scripts didn't work anymore
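
For the "secure JupyterLab" item in the TODO list, one possible direction is to replace the empty token with a random one. This is only a sketch: the function name `generate_token` and the `JUPYTER_TOKEN` variable are assumptions for illustration, and start.sh as published would need to be changed to pass the token on instead of `''`.

```shell
#!/bin/sh
# Sketch: generate a random 48-character hex token from /dev/urandom.
generate_token() {
    head -c 24 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Possible use (hypothetical plumbing, not in the gist):
#   TOKEN="$(generate_token)"
#   docker run -e JUPYTER_TOKEN="$TOKEN" -p 27101:8888 -it PROJECT_NAME:VERSION start.sh jupyter
# and, in a modified start.sh, pass it on instead of the empty token:
#   jupyter-lab --allow-root --NotebookApp.token="${JUPYTER_TOKEN}" --ip="*"
```

The token then has to be supplied when opening JupyterLab in the browser, which keeps other users on the host from reaching the server on the published port.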

Dockerfile

# ATTENTION: This container is intended to run in ROOTLESS mode:
# https://docs.docker.com/engine/security/rootless/
# TODO:
# * check this for better approach to run in user space rather than root if rootless mode is not an option:
# https://github.com/jupyter/docker-stacks/blob/main/base-notebook/Dockerfile
# * secure Jupyter Lab
ARG BASE_IMAGE=ubuntu:22.04
FROM ${BASE_IMAGE}
# install packages
RUN apt-get -y update --allow-releaseinfo-change \
    && apt-get -y autoremove \
    && apt-get clean \
    && apt-get install -y \
        fonts-liberation \
        locales \
        wget curl \
        unzip \
        # - bzip2 is necessary to extract the micromamba executable.
        bzip2 \
        # - pandoc is used to convert notebooks to html files
        #   it's not present in aarch64 ubuntu image, so we install it here
        pandoc \
        git \
        vim \
        zsh \
    && rm -fr /var/lib/apt/lists/* /tmp/* /var/tmp/* \
    && echo "en_US.UTF-8 UTF-8" > /etc/locale.gen \
    && locale-gen
# TODO: not really sure why I need this
ENV LC_ALL=en_US.UTF-8 \
    LANG=en_US.UTF-8 \
    LANGUAGE=en_US.UTF-8
# setup oh-my-zsh
RUN sh -c "$(wget https://raw.github.com/ohmyzsh/ohmyzsh/master/tools/install.sh -O -)"
# install micromamba
ENV MAMBA_ROOT_PREFIX=/opt/mamba
# NOTE: because of some weird bug,
# we need to use a mamba installation
# rather than micromamba directly
# source: https://github.com/mamba-org/mamba/issues/2167#issuecomment-1355533652
#RUN curl -Ls https://micro.mamba.pm/api/micromamba/linux-64/latest | tar -xvj bin/micromamba \
# && /bin/micromamba shell init -s zsh -p ${MAMBA_ROOT_PREFIX} \
# && echo "alias mamba=micromamba" >> ~/.zshrc
# WORKAROUND:
RUN curl -Ls https://micro.mamba.pm/api/micromamba/linux-64/latest | tar -xvj bin/micromamba \
    && /bin/micromamba create --yes -n mamba mamba -c conda-forge
# switch from micromamba to the now installed mamba distribution
ENV MAMBA_ROOT_PREFIX=${MAMBA_ROOT_PREFIX}/envs/mamba
ENV MAMBA_EXEC=${MAMBA_ROOT_PREFIX}/bin/mamba
# init mamba
RUN ${MAMBA_EXEC} init zsh \
    && ${MAMBA_EXEC} config --set auto_activate_base false
# setup jupyter environment
COPY environment-jupyter.yml .
RUN ${MAMBA_EXEC} env create -f environment-jupyter.yml -n jupyter
# setup custom environment
ARG ENV_NAME="project"
ARG ENV_YAML="environment.yml"
COPY ${ENV_YAML} .
# TODO: migrate to `micromamba create --yes` as soon as the above-mentioned bug is fixed:
# source: https://github.com/mamba-org/mamba/issues/2167#issuecomment-1355533652
# NOTE: use ${ENV_YAML} here so that a custom file name passed via --build-arg is honored
RUN ${MAMBA_EXEC} env create -f ${ENV_YAML} -n ${ENV_NAME} \
    && ${MAMBA_EXEC} install -p "${MAMBA_ROOT_PREFIX}/envs/${ENV_NAME}" --yes ipykernel -c conda-forge \
    && ${MAMBA_ROOT_PREFIX}/envs/${ENV_NAME}/bin/python -m ipykernel install --name ${ENV_NAME} \
    && echo "mamba activate ${ENV_NAME}" >> ~/.zshrc
# set zsh prompt
#RUN echo "export PROMPT='[${ENV_NAME}] $PROMPT'" >> ~/.zshrc
# set workspace directory
WORKDIR /workspace
# expose jupyter port (not really needed; just for documentation)
EXPOSE 8888
# get start script
COPY start.sh /usr/local/bin/
RUN chmod +x /usr/local/bin/*
CMD ["/bin/bash", "-c", "/usr/local/bin/start.sh"]

environment-jupyter.yml

name: jupyter
channels:
- conda-forge
dependencies:
- python=3.10
- jupyterlab

environment.yml

name: project
channels:
- conda-forge
- pytorch
- nvidia
dependencies:
- python=3.9
- pandas
- openpyxl
- seaborn
- scikit-learn
- networkx
- kaggle
- pillow
- pydicom
# necessary due to bug in mamba: https://github.com/mamba-org/mamba/issues/1617#issuecomment-1099962432
- pytorch=*=*cuda11.6*
- torchvision
- pytorch-cuda=11.6
# - pip:
# - git+https://github.com/blaze/dask.git#egg=dask[complete]
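
start.sh

#!/bin/bash
# Dispatch on the first argument:
#   (none), "shell", "zsh" -> interactive zsh
#   "jupyter"              -> JupyterLab on port 8888
case "${1:-zsh}" in
    shell|zsh)
        echo "Starting zsh"
        zsh
        ;;
    jupyter)
        echo "Starting jupyter lab"
        # NOTE: the empty token disables authentication (see TODO: secure JupyterLab)
        /opt/mamba/envs/jupyter/bin/jupyter-lab --allow-root --NotebookApp.token='' --ip="*"
        ;;
    *)
        # report unknown commands instead of exiting silently with success
        echo "Unknown command: $1 (expected: shell, zsh, or jupyter)" >&2
        exit 1
        ;;
esac
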
turian commented Dec 22, 2022

@mgbckr what is your environment-jupyter.yml

mgbckr commented Dec 22, 2022

@turian I attached it, have a look. Nothing fancy really :)
