@nicwaller
Created February 24, 2016 16:39
Run Sets. A new way of thinking about Chef runs.

Environment workflow

Problem

The status quo today is that we do not use dependency constraints on our internal cookbooks. This inhibits our ability to make changes because all changes must be backward-compatible. And even when our changes are backward-compatible, sometimes we promote cookbooks in the wrong order and end up with a failure.

Solution

We need to use two kinds of constraint.

  • Equality constraint specified in the environment (so that we can be 100% sure about which cookbooks are being executed in that environment)
  • Dependency constraint specified in cookbooks (to ensure correct execution of each cookbook)

Now this poses a problem: how do we choose equality constraints that satisfy all dependency constraints? Here's what happens if we don't plan for this.

  • Start with two cookbooks, A and B. A depends on B ~> 0.5.0
  • Make a new cookbook C. C depends on B >= 2.0
  • A and C have conflicting constraints on B. There is no valid solution.
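The conflict above can be checked mechanically. This is only a sketch: it uses RubyGems' `Gem::Requirement` as a stand-in for Chef's own constraint handling, reads A's constraint as the pessimistic operator `~>`, and assumes a hypothetical list of released versions of B.

```ruby
require "rubygems"

# Dependency constraints from the example above.
a_needs_b = Gem::Requirement.new("~> 0.5.0") # A depends on B ~> 0.5.0
c_needs_b = Gem::Requirement.new(">= 2.0")   # C depends on B >= 2.0

# Hypothetical releases of B available on the Chef server (an assumption).
b_versions = %w[0.5.0 0.5.3 0.6.0 1.0.0 2.0.0 2.1.0].map { |v| Gem::Version.new(v) }

# Versions of B that each cookbook would accept on its own.
ok_for_a = b_versions.select { |v| a_needs_b.satisfied_by?(v) }
ok_for_c = b_versions.select { |v| c_needs_b.satisfied_by?(v) }

# A single equality pin for B must appear in both lists.
ok_for_both = ok_for_a & ok_for_c

puts "B versions acceptable to A: #{ok_for_a.join(', ')}"
puts "B versions acceptable to C: #{ok_for_c.join(', ')}"
puts "B versions acceptable to both: #{ok_for_both.empty? ? 'none' : ok_for_both.join(', ')}"
```

No release of B can be below 0.6.0 and at least 2.0 at the same time, so no equality pin for B works while A and C are in the same set of constraints.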

Solution redux

Berkshelf is a dependency solver for cookbooks. It can help. You might have heard about the environment cookbook pattern. Before we go there, I need to clarify some terminology.

What are environments?

We use the word environment for two distinct ideas.

  1. An Environment is a combination of compute resources, application artifacts, and configuration that together form a deployment of your application. You probably have separate environments for end users, developers, and maybe even QA.
  2. A Chef environment, or environment file, defines a set of cookbook version constraints. It is also commonly used to set attributes.

These ideas need to be decoupled.

Imagine you have ten application cookbooks that all depend on a shared nodeapp cookbook. You want to refactor the nodeapp cookbook, but refactoring would introduce breaking changes. If you are tied to the idea that every application must use the same version of the cookbook, then you are unable to make a breaking change. Any breaking change would require all application cookbooks to be promoted simultaneously. This is nearly impossible.

Because the word environment is tied very strongly to the first idea, I will introduce a new term, run set, which refers to the set of equality constraints present in a Chef environment file. Defining additional run sets necessarily means creating additional Chef environment files. Any given Chef run will always use exactly one run set.
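To make the term concrete, here is a rough sketch of a run set as plain data, with a check that its equality pins satisfy each cookbook's dependency constraints. The cookbook names `app_one` and `app_two`, all version numbers, and the helper `run_set_violations` are hypothetical, and `Gem::Requirement` again stands in for Chef's constraint logic.

```ruby
require "rubygems"

# Dependency constraints as each cookbook's metadata might declare them (hypothetical).
dependencies = {
  "app_one" => { "nodeapp" => "~> 2.0" }, # already moved to the refactored nodeapp
  "app_two" => { "nodeapp" => "~> 1.2" }  # still relies on the old nodeapp
}

# Two run sets: each is the set of equality pins from one Chef environment file.
new_run_set    = { "nodeapp" => "2.0.0", "app_one" => "1.4.0" }
legacy_run_set = { "nodeapp" => "1.3.0", "app_two" => "0.9.1" }

# Returns a list of violations; an empty list means the run set is consistent.
def run_set_violations(run_set, dependencies)
  violations = []
  run_set.each_key do |cookbook|
    (dependencies[cookbook] || {}).each do |dep, constraint|
      pinned = run_set[dep]
      if pinned.nil?
        violations << "#{cookbook} depends on #{dep}, which is not pinned"
      elsif !Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(pinned))
        violations << "#{cookbook} needs #{dep} #{constraint}, but #{dep} is pinned at #{pinned}"
      end
    end
  end
  violations
end

puts run_set_violations(new_run_set, dependencies).inspect
puts run_set_violations(legacy_run_set, dependencies).inspect

# Forcing the legacy application onto the refactored nodeapp breaks its constraint.
mixed = legacy_run_set.merge("nodeapp" => "2.0.0")
puts run_set_violations(mixed, dependencies).inspect
```

Each run set is consistent on its own, which is the point: the two applications no longer have to agree on a single nodeapp version, as long as a given Chef run uses exactly one of the run sets.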

I will use an uppercase E when referring to an Environment that is a collection of compute resources.

There are two ways to approach run sets.

  1. The traditional approach is to use one run set for your Environment. This is a comfortable, conservative way of reasoning. It's easy to comprehend because every Chef node in your Environment is assigned to the same Chef environment. You cannot have multiple run sets with this approach.
  2. A radical new approach is to have multiple run sets available within a single Environment, with a

TODO: Are environments the only way to pin the run set? What if we use an environment cookbook for that? Then we would not pin anything in the environment. We would have to be extremely careful that every node gets an environment cookbook; otherwise it would end up with no constraints. So you would probably want to assign run set cookbooks at the role level, as an easy way to verify that every node has a specified run set. What about the weird hybrid middle ground where we use actual environment cookbooks to define the run set? This would allow us to use the Chef environment construct as originally intended. That's a lot nicer for attribute sharing! Using cookbook metadata comes at the cost of needing to include EVERY cookbook on the Chef server as part of the Chef run. If you don't do that, you could end up using unwanted newer versions because of transitive dependencies. I'm not sure how slow this would be. It would result in a bit of attribute pollution for sure. There would be less pollution if everybody followed my style guide.

TODO: figure out how slow it is to depend on all cookbooks, but not actually include any of them. 60 seconds?

Run Sets with Berkshelf

Why an environment cookbook? It depends on the scope of the solving problem. GROUP

Alternative: reduce the satisfiability problem by shrinking the size of the environment, maybe down to the size of an application. This changes the cardinality of a node: previously a node was guaranteed to be in exactly one environment, but it can have more than one application. If we shrink the environment down to the size of an application, then multiple environments can run on a single instance (in separate processes). The result is heterogeneous Chef runs on a single instance, each with its own set of cookbooks and constraints: a "GROUP".
