https://www.anaconda.com/blog/a-faster-conda-for-a-growing-community
conda install -c conda-forge --solver=libmamba ...
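The blog post above introduces the conda-libmamba-solver behind that --solver=libmamba flag. A sketch of making it the default instead of passing the flag every time (assuming conda 22.11+, where the solver setting replaced the earlier experimental_solver):
$ conda install -n base conda-libmamba-solver
$ conda config --set solver libmamba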
From the official documentation:
- Mamba is a fast, robust, and cross-platform package manager.
- It runs on Windows, OS X and Linux (ARM64 and PPC64LE included) and is fully compatible with conda packages and supports most of conda’s commands.
`mamba`: a Python-based CLI conceived as a drop-in replacement for conda, offering higher speed and more reliable environment solutions
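Since mamba mirrors conda's CLI, the conda commands in these notes should work unchanged with mamba. A quick sketch (assuming mamba is installed, e.g. via Miniforge; myenv is just an example name):
$ mamba create -n myenv python=3.10
$ mamba install -n myenv dask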
- Code review of https://github.com/MrPowers/mack (to learn Python, the tooling, Delta Lake and concepts like SCD2)
- Powerful and flexible package manager
- a free minimal installer for conda
- a small, bootstrap version of Anaconda with only `conda`, Python, the packages they depend on, and a small number of other useful packages (e.g., `pip`, `zlib`)
- Use `conda install` to install additional conda packages from the Anaconda repository
- Miniconda Docker images
From Installation:
The fastest way to obtain conda is to install Miniconda, a mini version of Anaconda that includes only conda and its dependencies.
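On macOS, one way to get Miniconda is the Homebrew cask (an assumption based on the /usr/local/Caskroom/miniconda/... environment location in the transcripts below):
$ brew install --cask miniconda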
$ conda
usage: conda [-h] [-V] command ...
conda is a tool for managing and deploying applications, environments and packages.
...
From docker-miniconda:
Anaconda is the leading open data science platform powered by Python. The open source version of Anaconda is a high performance distribution and includes over 100 of the most popular Python packages for data science. Additionally, it provides access to over 720 Python and R packages that can easily be installed using the conda dependency and environment manager, which is included in Anaconda.
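A minimal sketch of running the continuumio/miniconda3 image interactively:
$ docker pull continuumio/miniconda3
$ docker run -i -t continuumio/miniconda3 /bin/bash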
Use `conda update` to upgrade:
$ conda update conda
Collecting package metadata (current_repodata.json): done
Solving environment: done
# All requested packages already installed.
- docker-miniconda and Usage
- Databricks AutoML
$ cd /Users/jacek/dev/sandbox/dask
It's common to name the environments after the project.
$ conda create --help
usage: conda create [-h] [--clone ENV] (-n ENVIRONMENT | -p PATH) [-c CHANNEL] [--use-local] [--override-channels]
[--repodata-fn REPODATA_FNS] [--strict-channel-priority] [--no-channel-priority] [--no-deps | --only-deps] [--no-pin]
[--copy] [-C] [-k] [--offline] [-d] [--json] [-q] [-v] [-y] [--download-only] [--show-channel-urls] [--file FILE]
[--no-default-packages] [--solver {classic} | --experimental-solver {classic}] [--dev]
[package_spec ...]
Create a new conda environment from a list of specified packages. To use the newly-created environment, use 'conda activate envname'. This command requires either the -n NAME or -p PREFIX option.
...
Target Environment Specification:
-n ENVIRONMENT, --name ENVIRONMENT
Name of environment.
...
$ conda create -n dask-sandbox dask
Collecting package metadata (current_repodata.json): done
Solving environment: done
## Package Plan ##
environment location: /usr/local/Caskroom/miniconda/base/envs/dask-sandbox
added / updated specs:
- dask
The following NEW packages will be INSTALLED:
...
Proceed ([y]/n)? y
Downloading and Extracting Packages
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate dask-sandbox
#
# To deactivate an active environment, use
#
# $ conda deactivate
$ conda activate dask-sandbox
FIXME How to `conda activate dask-sandbox` anytime we `cd` to the directory (like `pyenv local dask-sandbox` would do)?
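One possible answer, as a sketch: a cd hook in the shell that activates whatever environment a .conda-env file names (.conda-env is my own hypothetical convention, not a conda feature):
# in ~/.zshrc or ~/.bashrc
cd() {
  builtin cd "$@" || return
  # .conda-env holds an environment name, e.g. dask-sandbox
  if [ -f .conda-env ]; then
    conda activate "$(cat .conda-env)"
  fi
}
pyenv-virtualenv (listed at the end of these notes) can also auto-activate environments via pyenv local.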
Let's use a modern Python interface: Jupyter Notebook.
$ conda install --help
usage: conda install [-h] [--revision REVISION] [-n ENVIRONMENT | -p PATH] [-c CHANNEL] [--use-local] [--override-channels]
[--repodata-fn REPODATA_FNS] [--strict-channel-priority] [--no-channel-priority] [--no-deps | --only-deps]
[--no-pin] [--copy] [-C] [-k] [--offline] [-d] [--json] [-q] [-v] [-y] [--download-only] [--show-channel-urls]
[--file FILE] [--solver {classic} | --experimental-solver {classic}] [--force-reinstall]
[--freeze-installed | --update-deps | -S | --update-all | --update-specs] [-m] [--clobber] [--dev]
[package_spec ...]
Installs a list of packages into a specified conda environment.
...
Install the package 'scipy' into the currently-active environment::
conda install scipy
Install a list of packages into an environment, myenv::
conda install -n myenv scipy curl wheel
Install a specific version of 'python' into an environment, myenv::
conda install -p path/to/myenv python=3.7.13
$ conda install -n dask-sandbox jupyter notebook
Collecting package metadata (current_repodata.json): done
Solving environment: done
## Package Plan ##
environment location: /usr/local/Caskroom/miniconda/base/envs/dask-sandbox
added / updated specs:
- jupyter
- notebook
...
Proceed ([y]/n)? y
Downloading and Extracting Packages
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
$ jupyter notebook
That opens http://localhost:8888/tree.
Followed 10 Minutes to Dask and got my very first Dask app up and running! Yay!
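For the record, a minimal sketch of that first app (the chunked-array example from 10 Minutes to Dask), run with the dask-sandbox environment active:
$ python - <<'EOF'
import dask.array as da

# A lazy 10_000 x 10_000 array split into 1_000 x 1_000 chunks;
# nothing is computed until .compute() is called.
x = da.random.random((10_000, 10_000), chunks=(1_000, 1_000))
print(x.mean().compute())
EOF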
From Itai's #1DatabrickAWeek - week 29:
- You can run Python wheel tasks on the Databricks platform
- A Python wheel is a way to package a project's components into a single file that can be installed on a target system (similar to JAR files in the JVM world)
- With Databricks Jobs, you can now run Python wheel tasks on clusters (similar to running an Apache Spark JAR or a notebook), providing the package name, entry point, and parameters.
- You can define these tasks through the UI (Jobs) or through the REST API (Jobs API 2.1).
- Deploy Production Pipelines Even Easier With Python Wheel Tasks
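A minimal sketch of building such a wheel locally (assuming a project whose pyproject.toml declares a console-script entry point; uploading the wheel and wiring it into a job is a separate Databricks step):
$ pip install build
$ python -m build --wheel
# the wheel lands in dist/, e.g. dist/<project>-<version>-py3-none-any.whl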
- Databricks CLI eXtensions (dbx)
- Great Expectations (GX)
- poetry
- https://github.com/davidhalter/jedi
- https://tox.wiki
- flake8
- Black
- pyenv lets you easily switch between multiple versions of Python.
- pyenv-virtualenv - a pyenv plugin to manage virtual environments created by `virtualenv` or Anaconda
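A sketch of the pyenv-virtualenv flow hinted at in the FIXME above (assuming both tools are installed and pyenv virtualenv-init is wired into the shell; the Python version is just an example):
$ pyenv install 3.11.9
$ pyenv virtualenv 3.11.9 dask-sandbox
$ cd /Users/jacek/dev/sandbox/dask
$ pyenv local dask-sandbox   # writes .python-version; auto-activates on cd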
pip install git+https://github.com/ibis-project/ibis.git#egg=ibis-framework[pandas,dask,postgres]
python3 -bb -m pytest tests/fugue