Create a gist now

Instantly share code, notes, and snippets.

Exploring Solutions for CHEF-3694

Solving CHEF-3694

Problem(s)

Resource Naming Collisions

The main problem here is that something has to happen when the user creates two resources with the same type and name:

log "danger" do
  message "deleting all the backups!"
end

log "danger" do
  message "restarting a service, you might get paged"
end

Currently, Chef will copy the first resource, omitting some elements, but this is pretty confusing for non-power users (especially the question about whether not_if and only_if guards get copied). It also violates an implicit assumption that resources are unique by type and name:

file "/tmp/whatever" do
  content rand.to_s
  notifies :write, "log[danger]"
end

What message does this print? How would you get it to print the other one?

Resource Namespacing

Another problem we may want to address at the same time as CHEF-3694 is that there is a single namespace for resources (the resource collection). The example above (with the two log resources) is silly because they're right next to each other, but as it stands now, they could be in different cookbooks (perhaps community cookbooks for different pieces of software, written by different authors and combined by the user). In the current implementation, you probably won't notice unless you're trying to notify the duplicated resource and your notifications are going to the wrong resource. Depending on how the primary question (what to do with resource collisions) is answered, we may want to do something to reduce the "collision domain" for resources.

Multiple Resource Actions at Different Times

The current ("resource cloning") behavior is the way it is in order to provide a feature like "notifications as a recipe-level construct." From the original ticket:

service "monkeypants" do
  supports :reload => true
  action :enable
end

# configure "monkeypants"

notify :start, resources(:service => "monkeypants"), :immediately

Given that you can now notify resources where the notify-ee is defined after the notify-er resource, the above example would be better written as action :enable, :start on the service resource, which would be declared after the templates, etc. that configure it.

There are more complex cases where this feature is handy. For example:

service "mysql" do
  action :stop
end

execute "mv /var/lib/mysql /mnt/mysql"

service "mysql" do
  action :start
end

This code is in the EC2 recipe for mysql. The mysql service may be configured with particular stop/start/restart commands, etc., so this can only work correctly when the service[mysql] resource knows about those settings.

SRP Side Discussion

If we look at this problem with our OO-design hat on, we'll see that this problem looks like a violation of the single responsibility principle that's baked deeply into Chef. When we declare a resource in a recipe, we're doing two things:

  1. Creating a data structure describing some configurable aspect of the system.
  2. Adding an item to Chef's ordered list of configuration actions to take.

We see this in resource declarations, where we specify an action on the resource itself; we also see this in the resource collection, where we have an ordered list of resources combined with their associated actions. To use the cooking analogy, we've combined the ingredients list with the preparation.

Thought Experiment

Let's imagine we've completely separated the ingredients from the recipe. It could look like this:

# "ingredients.rb"

file "/tmp/foo" do
  content "blah"
  mode "0644"
  # action :nothing (implicit)
end

execute "run-a-command" do
  # action :nothing (implicit)
end

service "apache2" do
  restart_command "sv 1 apache2"
  # action :nothing (implicit)
end

# "recipe.rb"

converge(:create, "file[/tmp/foo]").and_if_changed do
  # These replace notifications
  converge(:execute, "execute[run-a-command]")
  converge(:reloade, "service[apache2]
end

This has some neat properties, but has big downsides:

  • For the 80% (90% ?) use case, it's more verbose than Chef is now, since you only need this level of control over ordering in a small subset of Chef use cases.
  • It breaks all Chef code ever.

Though I'd personally enjoy exploring this design space, these two downsides are too enormous to seriously consider implementing the above. That said, we can implement Chef-as-it-is-now on top of these ideas.

Proposed Solution

Separate resource declaration from adding action to the ordered list:

  • Internally separate resource declaration from append-a-resource-to-the-todo-list. The current resource collection will be split into a "resource set" and "action list."
  • Normal resource declarations create a resource, add it to the "resource set", and insert a [resource, action] pair into the "action list." action :nothing could be a shortcut for not inserting the resource into the action list, or there could be some other means to accomplish this, such as a generic declare_resource method.
  • Add a DSL method for appending a [resource, action] pair into the "action list." This replaces the functionality provided by resource cloning.

This leaves the question of what to do when conflicting resources are declared. I am strongly in favor or this raising an error; however, looking at the CHEF-3694 warnings in our own Chef runs, this could be considered overly strict. Furthermore, for heavy users of community cookbooks, it could be annoying to deal with conflicts between different externally sourced cookbooks. If desired, this could be mitigated by namespacing resources by cookbook. Duplicating a resource within a cookbook would be an error, while duplicating a resource from a different cookbook would work but possibly emit a warning. Resource lookup (for notifications, etc.) would prefer a resource from the current cookbook in the case of multiples. Additional syntax would be added to fully qualify a resource for the purpose of notifying a resource from an outside cookbook when duplicates are present.

Justification

There's obviously a much quick-n-dirtier solution to the bug-as-reported, which is to create a notification resource. Here's why I think we should do a deeper refactor:

  • As a matter of philosophy plus pragmatism, I want to embrace reprogrammability in a way that is stable, supportable, and has a lower barrier to entry than we have now. Reprogrammability in Chef currently involves lots of magic, code duplication, and (ab)use of internal code that can easily be broken by an innocent refactor.
  • Splitting the resource collection's responsibilities into two components opens lots of opportunities to fix issues and limitations in the existing code:
    • The run_action hack is useful for some circumstances, but its limitations can lead to running entire recipes during the compile phase, for example to install build tools, database libraries, and then a database gem. To some degree this problem could be addressed by splitting cookbooks into multiple components, but this is frustrated by the way run_lists are composed (you can't make a role insert some items before and some items after the rest of the run_list). By exposing the action list for user manipulation, it's easy to write code to insert {resource, action} pairs in other locations.
    • Nested Chef runs (recipe_eval in the deploy resource, LWRPs with inline compilation enabled) can access resources in the main resource set, but have their own action list. This means we can make inline compilation the default for LWRPs with no downsides.
  • Design quality. We solve the problems in CHEF-3694 by making the internal and external models of resource creation and action ordering consistent with their intended use and design.
@jtimberman

It is worth noting that for some use cases, resource cloning is desirable. Take the example of Opscode's mysql cookbook, "server_ec2" recipe which moves the mysql data directory to ephemeral storage. The recipe does this:

service "mysql" do
  action :stop
end

execute "mv /var/lib/mysql /mnt/mysql"

service "mysql" do
  action :start
end

The only difference between the instances of service[mysql] here is the action.

@danielsdeleo
Owner

Important note about this use case:

it is very likely that service[mysql] is also defined from the mysql::server recipe.

So, the action :start after the execute[mv ...] could be done with a notification, but the action :stop at the top, probably not.

@tmatilai

Unrelated to the problem, but the message attribute of the log resource seems to be totally undocumented feature. =)

@danielsdeleo
Owner

Re: message attribute to log, that's a new thing in 10.18. I'll make a note to get it documented.

@btm
btm commented Jan 24, 2013

Joshua, the only reason resource cloning is desirable in a situation like that is when you're defining additional attributes on the resource.

mysql/recipes/default.rb:

service "mysql" do
  service_name "super_mysqld"
  action :nothing
end

my_app/recipes/default.rb:

include_recipe "mysql"

service "mysql" do
  action :stop
end

execute "mv /var/lib/mysql /mnt/mysql"

service "mysql" do
  action :start
end

Provided the resource is already defined, this use case (CHEF-26) would be better served IMHO with a top-level notification command

my_app/recipes/default.rb:

include_recipe "mysql"
# service[mysql] defined in above recipe

notify :stop, "service[mysql]", :immediately

execute "mv /var/lib/mysql /mnt/mysql"

notify :start, "service[mysql]", :immediately
@danielsdeleo
Owner

The message attribute is now documented on docs.opscode.com

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment