The conda ecosystem currently lacks a way to specify optional dependencies. The PyPI ecosystem has supported them for a while now, and adding them to conda would serve two goals: lowering the friction for people converting packages from PyPI to conda, and making the functionality available to conda packages in its own right.
There are two types of optional dependencies (in PyPI, but also in general):
- Optional dependency groups based on a specifier. These are additional dependencies that are supposed to be pulled in when additional parameters are given. For example, the ibis package on PyPI has multiple optional dependency groups such as mysql, postgres, or pandas. When the user (or another package) tries to install ibis[mysql], the extra dependencies are automatically pulled in.
- Conditional dependencies. These are environment markers in PyPI / pip land and are added based on certain conditions, such as the operating system ("Windows", "macOS" or "Linux"), the GLIBC version, or the Python version.
For both dependency types there are workarounds in the conda ecosystem.
For optional dependency groups, one package per group can be created, with the group name appended to the package name after a dash. This can be done relatively easily with multiple outputs from a recipe.
For example, ibis and ibis-mysql could be modeled as follows:
ibis 1.0.1 h12345:
  - python: >=3.4
  - rich: >=1.0
ibis-mysql 1.0.1 h432123:
  # exact dependency on "parent" package so that versions are tied together
  - ibis: ==1.0.1 h12345
  - mysql: >=5.0
  - mysql-adapter: 1.2.3
Now any package could depend on ibis-mysql, or even on both ibis-mysql and ibis-pandas, to get the same effect as ibis[mysql, pandas]. However, this is not elegant because we create empty packages only to ship some metadata. It is also wasteful in terms of repodata.json size, which balloons without added benefit.
However, internally, the solver should continue modelling optional dependency groups like this.
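As a rough illustration of that internal modelling, the sketch below expands a bracketed spec into the dash-separated package names described above. The function name expand_extras and the parsing rules are assumptions made for this example. It ignores version constraints and is not how any existing conda solver parses specs.

```python
import re


def expand_extras(spec: str) -> list[str]:
    """Expand 'name[a, b]' into ['name-a', 'name-b'].

    A bare 'name' (or an empty bracket list) yields ['name'].
    Hypothetical helper: version constraints are not handled.
    """
    match = re.fullmatch(r"\s*([A-Za-z0-9_.-]+)\s*(?:\[([^\]]*)\])?\s*", spec)
    if match is None:
        raise ValueError(f"cannot parse spec: {spec!r}")
    name, extras = match.group(1), match.group(2)
    if extras is None:
        return [name]
    groups = [part.strip() for part in extras.split(",") if part.strip()]
    # Each group maps to the "<name>-<group>" metadata package.
    return [f"{name}-{group}" for group in groups] or [name]


print(expand_extras("ibis[mysql, pandas]"))  # ['ibis-mysql', 'ibis-pandas']
```

Because each dash-separated package pins its parent exactly, requesting ibis-mysql is enough to pull in ibis itself.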
Conditional dependencies can be modelled in the conda world by adding virtual packages to the run dependencies. In the conda-forge community we have started to observe two noarch packages per version that require either __unix or __win. This is used to model conditional optional dependencies for Windows or Unix. A given package would do the following:
mypkg 1.0 v1:
  __win: >=0
  numpy: >=2
  pywin32: 1.2
mypkg 1.0 v2:
  __unix: >=0
  numpy: >=2
The solver will now pick only the one package that matches the current platform and thus add in the optional dependency. Again, this is wasteful in terms of repodata.json size and package disk space, as we could express the same with metadata only.
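The selection between the two builds can be pictured with a small sketch. It assumes a simplified model in which a build is usable only when every virtual package it depends on (a name starting with a double underscore) is present on the target system. The pick_variant helper and the data layout are hypothetical, and version constraints are ignored.

```python
def pick_variant(variants: dict[str, dict[str, str]], virtual_packages: set[str]):
    """Return the first build whose virtual-package requirements are all present."""
    for build, depends in variants.items():
        required = [name for name in depends if name.startswith("__")]
        if all(name in virtual_packages for name in required):
            return build
    return None  # no build is installable on this platform


# The two noarch builds of mypkg 1.0 from the example above.
variants = {
    "v1": {"__win": ">=0", "numpy": ">=2", "pywin32": "1.2"},
    "v2": {"__unix": ">=0", "numpy": ">=2"},
}

print(pick_variant(variants, {"__unix", "__linux"}))  # v2
print(pick_variant(variants, {"__win"}))              # v1
```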
Now we would like to express these two modes for conda in repodata.json:
"mypkg-1.0-v1.conda": {
  "dependencies": [
    "numpy >=2"
  ],
  "optional_dependencies": {
    "mysql": ["mysql >=5.0", "mysql-adapter 1.2.3"]
  },
  "conditional_dependencies": {
    "__win": ["pywin32"],
    "python<=3.10": ["future_annotations"]
  }
}
The new optional_dependencies and conditional_dependencies dictionaries can be used to inject additional dependencies for a given package.
When asked for mypkg[mysql], the solver will return mypkg plus the extra dependencies, just like previously with mypkg-mysql.
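A minimal sketch of that lookup, assuming the record layout proposed above, could look like this. The dependencies_for helper is hypothetical, and raising on an unknown group is one possible design choice rather than specified behaviour.

```python
def dependencies_for(record: dict, extras: list[str]) -> list[str]:
    """Collect the base dependencies plus those of each requested group."""
    deps = list(record.get("dependencies", []))
    optional = record.get("optional_dependencies", {})
    for extra in extras:
        if extra not in optional:
            raise KeyError(f"unknown optional dependency group: {extra}")
        deps.extend(optional[extra])
    return deps


# Record for mypkg, mirroring the proposed repodata.json entry.
record = {
    "dependencies": ["numpy >=2"],
    "optional_dependencies": {
        "mysql": ["mysql >=5.0", "mysql-adapter 1.2.3"],
    },
}

print(dependencies_for(record, ["mysql"]))
# ['numpy >=2', 'mysql >=5.0', 'mysql-adapter 1.2.3']
```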
The conditional dependencies are more interesting, as they should result in solver "branches": each condition gets a positive and a negative entry, and every combination of those entries becomes one variant.
For example, the solver should have 4 combinations of mypkg internally:
[__win, python<=3.10] -> [pywin32, future_annotations]
[!__win, python<=3.10] -> [future_annotations]
[__win, !python<=3.10] -> [pywin32]
[!__win, !python<=3.10] -> []
Most of these would be discarded very early on (e.g. __win is never true when resolving for a Unix system). This schema would allow for proper resolution though, especially for Python versions.
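The branch expansion can be sketched as follows. This is only an illustration of the combinatorics: the enumerate_branches helper is hypothetical, conditions are treated as opaque strings, and a real solver would prune impossible branches rather than materialise all of them. The branches come out in a different order than listed above, but the set is the same.

```python
from itertools import product


def enumerate_branches(conditional: dict[str, list[str]]):
    """Return one (conditions, extra_dependencies) pair per true/false combination."""
    conditions = list(conditional)
    branches = []
    for flags in product([True, False], repeat=len(conditions)):
        # "!cond" marks a condition that does not hold in this branch.
        label = [c if on else f"!{c}" for c, on in zip(conditions, flags)]
        extra = [dep for c, on in zip(conditions, flags) if on for dep in conditional[c]]
        branches.append((label, extra))
    return branches


conditional_dependencies = {
    "__win": ["pywin32"],
    "python<=3.10": ["future_annotations"],
}

for label, extra in enumerate_branches(conditional_dependencies):
    print(label, "->", extra)
```

With n conditions this produces 2^n branches, which is why discarding impossible ones early matters.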
"optional_depends": [
{
"if": ["__win"],
"requires": "winapi"
},
{
"feature": "feature_a",
"requires": "qt-main"
},
{
"feature": "feature_b",
"requires": "sphinx >=7",
"if": ["python 3.11.*"]
},
{
"feature": "foobar",
"requires": "python >=3.8"
},
{
"feature": "foobar",
"requires": "sphinx >=7",
"if": ["python 3.11.*"]
}
]
}
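One way to read the flat optional_depends list is sketched below. The semantics are assumptions, since the text does not spell them out: an entry without a "feature" key always applies, an entry with one applies only when that feature was requested, and all "if" conditions must hold. Conditions are again treated as opaque strings that the caller has already evaluated. The select_requirements helper is hypothetical.

```python
def select_requirements(entries: list[dict], features: set[str], satisfied: set[str]) -> list[str]:
    """Pick the 'requires' of every entry whose feature and conditions match."""
    selected = []
    for entry in entries:
        feature = entry.get("feature")
        if feature is not None and feature not in features:
            continue  # feature-gated entry that was not requested
        if not all(cond in satisfied for cond in entry.get("if", [])):
            continue  # at least one condition does not hold
        selected.append(entry["requires"])
    return selected


optional_depends = [
    {"if": ["__win"], "requires": "winapi"},
    {"feature": "feature_a", "requires": "qt-main"},
    {"feature": "feature_b", "requires": "sphinx >=7", "if": ["python 3.11.*"]},
    {"feature": "foobar", "requires": "python >=3.8"},
    {"feature": "foobar", "requires": "sphinx >=7", "if": ["python 3.11.*"]},
]

print(select_requirements(optional_depends, {"foobar"}, {"python 3.11.*"}))
# ['python >=3.8', 'sphinx >=7']
```

Under this reading, a single list covers both optional dependency groups and conditional dependencies, and an entry may combine the two.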