@0xbe7a
Last active January 30, 2024 21:45

Proposal: Introduction of Feature Sets

Objective

The aim is to introduce a feature set mechanism in the pixi package manager. This mechanism will enable clear, conflict-free management of dependencies tailored to specific environments, while also maintaining the integrity of fixed lockfiles.

Motivating Example: Test Dependencies and Multiple Python Versions

Consider a scenario where a project needs to be tested across multiple Python versions, each requiring a different set of dependencies. In this case, defining separate feature sets for each Python version (like py39, py310, etc.) allows for easy switching between environments without conflicts. Similarly, for development purposes, a test feature set can include dependencies necessary for testing and linting, which are not required in the production environment.

Design Considerations

  1. Non-Combinatorial: To ensure the dependency resolution process remains manageable, the solution should avoid a combinatorial explosion of dependency sets.
  2. Single Feature Activation: The design should allow only one feature set to be active at any given time, simplifying the resolution process and preventing conflicts.
  3. Fixed Lockfiles: It's crucial to preserve fixed lockfiles for consistency and predictability. Solutions must ensure reliability not just for authors but also for end-users, particularly at the time of lockfile creation.
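
To make the first two considerations concrete, here is a minimal sketch (not part of pixi; both function names are hypothetical) that counts how many dependency sets would have to be solved and locked if features could be freely combined, compared with allowing at most one active feature set.

```python
def combinable_sets(n_features: int) -> int:
    """Distinct dependency sets if any subset of features may be active."""
    return 2 ** n_features


def single_active_sets(n_features: int) -> int:
    """Distinct dependency sets if at most one feature set is active.

    The extra 1 accounts for the base set with no feature set active.
    """
    return n_features + 1


for n in (3, 5, 10):
    print(n, combinable_sets(n), single_active_sets(n))
```

With free combination the count doubles with every added feature, while single activation grows by one per feature, which is what keeps a fixed lockfile tractable.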

Proposed Solution

Feature Set Definitions

Introduce feature sets in the pixi.toml configuration file, with each set comprising dependencies specific to a given environment or use case. For instance, a test feature set may include dependencies like pytest and pre-commit, essential for development but not for production.

pixi.toml Example:

[features]
test = ["pytest", "pre-commit"]
py39 = ["py39"]
py310 = ["py310"]

[dependencies]
requests = "*"
pytest = { version = ">= 1.2", optional = true }
pre-commit = { version = ">= 2", optional = true }
py39 = { package = "python", version = "3.9", optional = true}
py310 = { package = "python", version = "3.10", optional = true}

Lockfile Structure

Within the pixi.lock file, a package may now include an additional feature field, specifying the feature set to which it belongs. This structure ensures clarity and prevents unnecessary duplication of dependencies across different environments.
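
As a rough illustration of how such a field could be consumed, the sketch below filters lock entries by the active feature set. The entry layout, the `feature` key, the version numbers, and the `packages_for` helper are all assumptions made for this example; the actual pixi.lock schema is not specified in this proposal.

```python
from typing import Optional

# Hypothetical, simplified lock entries (illustrative versions only).
LOCK = [
    {"name": "requests", "version": "2.31.0"},
    {"name": "pytest", "version": "7.4.3", "feature": "test"},
    {"name": "python", "version": "3.9.18", "feature": "py39"},
    {"name": "python", "version": "3.10.13", "feature": "py310"},
]


def packages_for(lock: list, active: Optional[str]) -> list:
    """Return base packages plus those tagged with the active feature set."""
    return [p for p in lock if p.get("feature") in (None, active)]


for package in packages_for(LOCK, "py39"):
    print(package["name"], package["version"])
```

Entries without a `feature` key are treated as always installed, so shared dependencies appear once rather than once per feature set.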

Feature Set Activation

Users can manually activate the desired feature set via command line or configuration. This approach guarantees a conflict-free environment by allowing only one feature set to be active at a time.

Command Configuration

Commands defined in pixi.toml can specify the supported feature sets. For example, a testing command can be linked to the test feature set, ensuring it runs with the correct dependencies.

Command Configuration Examples:

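# Example 1: a task that requires the test feature set
[tasks.test]
cmd = "pytest"
feature_set = ["test"]

# Example 2 (alternative definition): a task that supports several Python feature sets
[tasks.test]
cmd = "pytest"
feature_set = ["py39", "py310"]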
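
One way to read the `feature_set` key is as an allow-list that is checked before a task runs. The sketch below assumes that interpretation; the `TASKS` dictionary, the `test-matrix` task name, and the `command_for` helper are hypothetical and only mirror the two task examples above.

```python
# Hypothetical in-memory mirror of the two task examples.
TASKS = {
    "test": {"cmd": "pytest", "feature_set": ["test"]},
    "test-matrix": {"cmd": "pytest", "feature_set": ["py39", "py310"]},
}


def command_for(task_name: str, active_feature_set: str) -> str:
    """Return the task's command if the active feature set is supported."""
    task = TASKS[task_name]
    if active_feature_set not in task["feature_set"]:
        raise ValueError(
            f"task {task_name!r} does not support feature set {active_feature_set!r}"
        )
    return task["cmd"]


print(command_for("test-matrix", "py310"))  # pytest
```

Rejecting unsupported feature sets up front would keep a task from silently running against the wrong dependencies.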

Benefits

  • Simplicity: Clear and straightforward dependency management is achieved by making each feature set mutually exclusive.
  • Consistency: The solution upholds the principle of fixed lockfiles, ensuring stable and predictable dependency management across different project stages.

Drawbacks

In the proposed feature set mechanism, users can activate only one feature set at a time. This design decision simplifies the dependency resolution process and prevents conflicts. However, it can be limiting in certain scenarios:

Scenario with Orthogonal Feature Sets

  • Orthogonal feature sets are sets that could theoretically be combined because they don't interfere with each other. For example, one set might pertain to linting tools, while another might specify a Python version.
  • The limitation arises when a user wants to activate multiple such orthogonal sets simultaneously. For instance, they might want to run linting tools (lint feature set) under a specific Python version (py39 feature set).
  • In the current design, they would have to define a new feature set that combines both sets of dependencies. This becomes cumbersome if there are many such orthogonal sets, as it requires defining every possible combination.
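
The cost of this workaround can be sketched by enumerating the combined feature sets a user would have to write by hand. The group names and the `combined_feature_sets` helper are hypothetical; only `py39`, `py310`, `lint`, and `test` come from the examples in this proposal.

```python
from itertools import product

# Each group holds feature sets that are orthogonal to the other groups.
orthogonal_groups = {
    "python": ["py39", "py310"],
    "tooling": ["lint", "test"],
}


def combined_feature_sets(groups: dict) -> list:
    """List every combined feature set needed to cover all pairings."""
    return ["-".join(combo) for combo in product(*groups.values())]


print(combined_feature_sets(orthogonal_groups))
```

Two groups of two already require four hand-written sets, and each additional group multiplies that number by its size.
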
@pavelzw

pavelzw commented Nov 23, 2023

I like the

[feature.test.dependencies]
pre-commit = "*"

syntax. This way we could also easily incorporate activation scripts, etc.:

[feature.test.activation]
scripts = ["env-vars-testing.sh"]

@ruben-arts

I like the idea of creating environments from a set of features. Here are a few syntax ideas for defining those environments.
Note that this is incompatible with choosing features from the CLI, since we can't lock what isn't defined in pixi.toml.

Defining multiple environments:

[dependencies]
numpy = "*"

[feature.py39.dependencies]
python = "3.9"

[feature.py310.dependencies]
python = "3.10"

[feature.test.dependencies]
pytest = "*"

[environments]
py39 = ["default", "py39"]
py310 = ["default", "py310"]
test39 = ["py39", "test"]

The CLI would look like:

pixi run -e py39 python foo.py
pixi run -e test39 pytest
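
A minimal sketch of how an environment could be assembled from its features, using the tables from the example above. The dictionaries and the `environment_dependencies` helper are hypothetical, and raising on conflicting specs is an assumption of this sketch; a real solver would more likely combine the constraints.

```python
# The nameless [dependencies] table is modelled as the "default" feature.
FEATURES = {
    "default": {"numpy": "*"},
    "py39": {"python": "3.9"},
    "py310": {"python": "3.10"},
    "test": {"pytest": "*"},
}

ENVIRONMENTS = {
    "py39": ["default", "py39"],
    "py310": ["default", "py310"],
    "test39": ["py39", "test"],
}


def environment_dependencies(env_name: str) -> dict:
    """Merge the dependency tables of every feature listed for the environment."""
    deps = {}
    for feature in ENVIRONMENTS[env_name]:
        for package, spec in FEATURES[feature].items():
            if package in deps and deps[package] != spec:
                raise ValueError(f"conflicting specs for {package!r} in {env_name!r}")
            deps[package] = spec
    return deps


print(environment_dependencies("test39"))
```

Because every environment is declared in the manifest, each merged dependency set is known up front and can be locked.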

How the "default" environment should be overwritten by multiple features is not super clear to me yet. Here is an idea:

[environments]
# Rename it to the "main" environment and use the nameless configuration as "default"
main = ["default", "py39"]
py310 = ["default", "py310"]
test39 = ["main", "test"]

Defining environment-specific configuration

Setting the platforms and system-requirements can be really important for supporting multiple ways of running the project without having to support all machines at all times. This matters especially if you are developing a project that will run on a different machine in production (ML and robotics).

[feature.cuda.dependencies]
pytorch-cuda = "*"

[environments.cuda]
# cuda can only work on these platforms (hypothetically)
platforms = ["osx-64", "osx-arm64", "linux-64"]
features = ["default", "cuda"]
system-requirements = { cuda = "12.0" }

Long name problem

This is not per se an opinion, just a list of ideas for how we could name the tables:

# Option 1
[target.linux-64.feature.test.activation]
scripts = ["env_vars.sh"]

# Option 2: More like Cargo.toml
[target.'cfg(feature=test, linux-64)'.activation]
scripts = ["env_vars.sh"]

# Option 3: Have a way to configure/predefine names
[target]
linux-cuda = {platform=["linux-64"], feature="cuda"}
[target.linux-cuda.activation]
scripts = ["env_vars.sh"]

Defining tasks

We should also have an ergonomic way to overwrite and define tasks in combination with environments.
Would we define them per environment or per feature? I think it would be hard to integrate the "environments" in the nameless task configuration.

[tasks]
train = "python train.py"

[feature.cuda.tasks]
# Overwrites the nameless "train" task if cuda feature is in the environment
train = "python train.py --gpu"

[feature.test.tasks]
test = "pytest"
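
The override behaviour described in the comments of this example could be sketched as follows. The dictionaries and the `tasks_for` helper are hypothetical, and letting later features win on a name clash is an assumption of this sketch rather than a settled rule.

```python
BASE_TASKS = {"train": "python train.py"}

FEATURE_TASKS = {
    "cuda": {"train": "python train.py --gpu"},
    "test": {"test": "pytest"},
}


def tasks_for(features: list) -> dict:
    """Start from the nameless tasks, then apply each feature's tasks in order."""
    tasks = dict(BASE_TASKS)
    for feature in features:
        tasks.update(FEATURE_TASKS.get(feature, {}))
    return tasks


print(tasks_for(["cuda", "test"]))
```

An environment without the cuda feature keeps the plain `train` task, while one that includes it picks up the `--gpu` variant.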

@pavelzw

pavelzw commented Nov 23, 2023

[target.'cfg(feature=test, linux-64)'.activation]

I personally am not really a fan of the cfg syntax 😅 looks a bit odd

Would we define them per environment or per feature?

I would suggest per feature since otherwise you would need to redefine it for every environment:

[environments]
py39 = ["py39"]
py39test = ["test", "py39"]
py310 = ["py310"]
py310test = ["test", "py310"]
py311 = ["py311"]
py311test = ["test", "py311"]

[feature.py39.dependencies]
python = "3.9.*"

[feature.py310.dependencies]
python = "3.10.*"

[feature.py311.dependencies]
python = "3.11.*"

[feature.test.dependencies]
pre-commit = "*"
pytest = "*"

[feature.test.tasks]
test = "pytest --color"

If this were per environment, we would need to duplicate

[environment.<env-name>.tasks]
test = "pytest --color"

for each environment.

If you really want to have a specific task per environment, you could create a 1-to-1 mapping of features to environments.

@pavelzw

pavelzw commented Nov 24, 2023

Another idea that came to my mind which could be useful:

A use case that occurs quite often for me is that I want my dev dependencies to be a strict superset of my prod dependencies. This way, one can make sure that the tests one ran actually matter in production and don't produce different results just because the prod environment was solved differently.

I'm not sure what the best way to integrate this into pixi.toml is. Does anybody have suggestions?
Maybe something like this?

[environments]
py39 = ["py39"]
py39test = ["test", "py39"]
py310 = ["py310"]
py310test = ["test", "py310"]
py311 = ["py311"]
py311test = ["test", "py311"]
prod = ["py311"]
prod-test = [{environment = "prod"}, "test"]
# or alternatively
# prod-test = ["environment:prod", "test"]
# or
# prod = [{constraints = "env:prod-test"}, "py311"]
# prod-test = ["py311", "test"]

[feature.py39.dependencies]
python = "3.9.*"

[feature.py310.dependencies]
python = "3.10.*"

[feature.py311.dependencies]
python = "3.11.*"

[feature.test.dependencies]
pre-commit = "*"
pytest = "*"

[feature.test.tasks]
test = "pytest --color"
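
The superset property described above can be stated as a check on two solved environments. This is only a sketch: the `is_strict_superset` helper, the name-to-version dictionaries, and the version numbers are illustrative assumptions, not pixi behaviour.

```python
def is_strict_superset(dev_locked: dict, prod_locked: dict) -> bool:
    """True if dev contains every prod package at the identical version, plus extras."""
    same_versions = all(
        dev_locked.get(name) == version for name, version in prod_locked.items()
    )
    return same_versions and len(dev_locked) > len(prod_locked)


# Illustrative solve results (name -> locked version).
prod = {"python": "3.11.6", "numpy": "1.26.2"}
prod_test = {"python": "3.11.6", "numpy": "1.26.2", "pytest": "7.4.3"}
drifted = {"python": "3.11.6", "numpy": "1.26.0", "pytest": "7.4.3"}

print(is_strict_superset(prod_test, prod))
print(is_strict_superset(drifted, prod))
```

The `drifted` case is the situation to avoid: the test environment was solved to a different numpy than production, so passing tests would say less about the prod environment.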

@majidaldo

Is this the official discussion thread? I've created "hydraconda", so I have much to contribute.

@pavelzw

pavelzw commented Jan 30, 2024

There were also some discussions in prefix-dev/pixi#584.

By now, quite a lot is already implemented in the latest pixi build on main (not released yet; check the build artifacts).

@ruben-arts

ruben-arts commented Jan 30, 2024

Yeah, by the end of this week we should have the feature released in an MVP state. If you have additional ideas, I think it's better to start a PR on the design docs or a discussion on pixi's discussion board.
