@daniel-k-feinberg
Last active February 8, 2024 20:58
Dev Container for Reproducible Data Analysis

Containerized & Reproducible Data Analysis Environment

About

This gist contains Docker and Dev Container configurations for a data analysis environment with R, Python, RStudio Server, JupyterLab, Quarto, and renv.

Acknowledgements

Many thanks to the following people and organizations for sharing their code and tutorials openly. Please see below for a non-exhaustive list:

Of course, I must also acknowledge the many open source contributors who helped build the software necessary for this workflow. Without their generosity, this would not be possible. Many thanks!

{
    "name": "Data Analysis Env",
    "build": {
        "dockerfile": "Dockerfile",
        // Update VARIANT to pick a specific R version: 4, 4.1, 4.2
        // More info: https://github.com/rocker-org/devcontainer-images/pkgs/container/devcontainer%2Fgeospatial
        "args": { "VARIANT": "4.2" }
    },
    // Install Dev Container Features. More info: https://containers.dev/features
    "features": {
        "ghcr.io/rocker-org/devcontainer-features/quarto-cli:1": {
            "installTinyTex": true
        },
        // Install JupyterLab and IRkernel.
        // More info: https://github.com/rocker-org/devcontainer-templates/tree/main/src/r-ver
        "ghcr.io/rocker-org/devcontainer-features/r-rig:1": {
            "version": "none",
            "installJupyterlab": true,
            "vscodeRSupport": "full",
            "installDevTools": true,
            "installRMarkdown": true,
            "installRadian": true,
            "installVscDebugger": true
        },
        // Install R dependencies
        "ghcr.io/rocker-org/devcontainer-features/r-packages:1": {
            "packages": "downlit"
        },
        // Install Apt packages.
        "ghcr.io/rocker-org/devcontainer-features/apt-packages:1": {
            "packages": "curl, nano, neofetch, tmux, libxt6"
        }
    },
    "customizations": {
        "vscode": {
            "extensions": [
                // infra tools
                "donjayamanne.githistory",
                "GitHub.codespaces",
                "ms-azuretools.vscode-docker",
                "GitHub.copilot",
                "GitHub.copilot-chat",
                "peakchen90.open-html-in-browser",
                "tomoki1207.pdf",
                "mechatroner.rainbow-csv",
                // language support
                "quarto.quarto",
                "ms-toolsai.jupyter",
                "ms-toolsai.jupyter-renderers",
                "ms-python.python",
                "ms-python.vscode-pylance",
                "REditorSupport.r",
                "RDebugger.r-debugger",
                "Ikuyadeu.r-pack",
                "DavidAnson.vscode-markdownlint",
                // theming
                "GitHub.github-vscode-theme",
                "cosmicsarthak.cosmicsarthak-neon-theme",
                "hassanoof.theme",
                "PKief.material-icon-theme",
                // vscode behavior
                "vsls-contrib.codetour",
                "dqisme.sync-scroll"
            ]
        }
    },
    // Forward Jupyter and RStudio ports
    "forwardPorts": [8787, 8888],
    "portsAttributes": {
        "8787": {
            "label": "Rstudio",
            "requireLocalPort": true,
            "onAutoForward": "ignore"
        },
        "8888": {
            "label": "Jupyter",
            "requireLocalPort": true,
            "onAutoForward": "ignore"
        }
    },
    // Use 'postAttachCommand' to run commands after the container is started.
    "postAttachCommand": "R -q -e 'renv::repair() ; renv::restore()'"
    // Uncomment to connect as root instead. More info: https://aka.ms/dev-containers-non-root
    // "remoteUser": "root"
}
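
The Dev Container config above is JSON with `//` comments, so a plain JSON parser will reject it. As a rough sketch (not part of the original gist), the snippet below strips line comments and then lists the configured features and forwarded ports, which can be handy for a quick sanity check before rebuilding the container. The helper names `strip_jsonc_comments` and `summarize` and the abbreviated `SAMPLE` string are illustrative assumptions. The stripper only handles `//` line comments, not `/* */` blocks or trailing commas, which is enough for the config shown here.

```python
import json


def strip_jsonc_comments(text: str) -> str:
    """Remove // line comments that sit outside of JSON strings."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            # Copy string contents verbatim, tracking backslash escapes.
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line; the newline itself is kept.
            end = text.find("\n", i)
            i = len(text) if end == -1 else end
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def summarize(config_text: str) -> dict:
    """Return the feature IDs and forwarded ports declared in the config."""
    config = json.loads(strip_jsonc_comments(config_text))
    return {
        "features": sorted(config.get("features", {})),
        "ports": config.get("forwardPorts", []),
    }


# Abbreviated, hypothetical sample mirroring the structure of the config above.
SAMPLE = """
{
  "name": "Data Analysis Env",
  // More info: https://containers.dev/features
  "features": {
    "ghcr.io/rocker-org/devcontainer-features/quarto-cli:1": { "installTinyTex": true }
  },
  "forwardPorts": [8787, 8888]
}
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

To run it against the real file, read the file's text and pass it to `summarize` in place of `SAMPLE`.
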
# Pre-built Dev Container Image for R. More info: https://github.com/rocker-org/devcontainer-images/pkgs/container/devcontainer%2Fgeospatial
# Available R version: 4, 4.1, 4.2
ARG VARIANT="4.2"
FROM ghcr.io/rocker-org/devcontainer/geospatial:${VARIANT}
RUN install2.r --error --skipinstalled -n -1 \
        trackdown \
        gptstudio \
    && rm -rf /tmp/downloaded_packages \
    && R -q -e 'remotes::install_github("https://github.com/dcomtois/summarytools/tree/0-8-9")'
# Install Python packages
COPY requirements.txt /tmp/pip-tmp/
RUN python3 -m pip --disable-pip-version-check --no-cache-dir install -r /tmp/pip-tmp/requirements.txt
pybryt
pylint
datascience
otter-grader
numpy
pandas
scipy
folium>=0.9.1
matplotlib
ipywidgets>=7.0.0
bqplot
nbinteract>=0.0.12
okpy
scikit-learn
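
Once the container is built, it can be useful to confirm that the Python packages listed above were actually installed. The following is a minimal sketch, not part of the original gist: it parses requirement lines with a simple regular expression and looks up each installed version through the standard library's `importlib.metadata`. The function names `parse_requirements` and `installed_versions` are hypothetical. The sketch only reports what is installed and does not check versions against specifiers such as `>=0.9.1`.

```python
import re
from importlib import metadata

# A package name followed by an optional version specifier, e.g. "folium>=0.9.1".
REQ_LINE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(.*?)\s*$")


def parse_requirements(text: str) -> list[tuple[str, str]]:
    """Return (name, specifier) pairs, skipping blank lines and # comments."""
    reqs = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        match = REQ_LINE.match(line)
        if match:
            reqs.append((match.group(1), match.group(2)))
    return reqs


def installed_versions(reqs: list[tuple[str, str]]) -> dict:
    """Map each package name to its installed version, or None if missing."""
    report = {}
    for name, _spec in reqs:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    # Assumes the script runs from the directory containing requirements.txt.
    with open("requirements.txt", encoding="utf-8") as handle:
        requirements = parse_requirements(handle.read())
    for package, version in installed_versions(requirements).items():
        print(f"{package}: {version or 'NOT INSTALLED'}")
```
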