The main problem here is that something has to happen when the user creates two resources with the same type and name:
log "danger" do
  message "deleting all the backups!"
end
log "danger" do
  message "restarting a service, you might get paged"
end
Currently, Chef will copy the first resource, omitting some elements,
but this is pretty confusing for non-power users (especially the
question about whether not_if and only_if guards get copied). It
also violates an implicit assumption that resources are unique by type
and name:
file "/tmp/whatever" do
  content rand.to_s
  notifies :write, "log[danger]"
end
What message does this print? How would you get it to print the other one?
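To make the ambiguity concrete, here is a minimal sketch in plain Ruby. It is a toy model, not Chef's actual implementation: ToyCollection, declare, and lookup are hypothetical names, and the only assumption is that a notification target is resolved from a "type[name]" string.

```ruby
# Toy model only: ToyCollection, declare, and lookup are hypothetical names,
# not part of Chef. Resources are plain hashes.
class ToyCollection
  def initialize
    @resources = []
  end

  # Append a resource; nothing stops a second one with the same type and name.
  def declare(type, name, attrs = {})
    @resources << { type: type, name: name }.merge(attrs)
  end

  # Return every resource matching a "type[name]" string.
  def lookup(spec)
    type, name = spec.match(/\A(\w+)\[(.+)\]\z/).captures
    @resources.select { |r| r[:type] == type.to_sym && r[:name] == name }
  end
end

collection = ToyCollection.new
collection.declare(:log, "danger", message: "deleting all the backups!")
collection.declare(:log, "danger", message: "restarting a service, you might get paged")

matches = collection.lookup("log[danger]")
puts matches.length                    # prints 2
matches.each { |r| puts r[:message] }  # both messages are candidates
```

The string "log[danger]" matches two candidates, and the notifier has no way to say which one it means.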
Another problem we may want to address at the same time as CHEF-3694 is that there is a single namespace for resources (the resource collection). The example above (with the two log resources) is silly because they're right next to each other, but as it stands now, they could be in different cookbooks (perhaps community cookbooks for different pieces of software, written by different authors and combined by the user). In the current implementation, you probably won't notice unless you're trying to notify the duplicated resource and your notifications are going to the wrong resource. Depending on how the primary question (what to do with resource collisions) is answered, we may want to do something to reduce the "collision domain" for resources.
The current ("resource cloning") behavior is the way it is in order to provide a feature like "notifications as a recipe-level construct." From the original ticket:
service "monkeypants" do
  supports :reload => true
  action :enable
end
# configure "monkeypants"
notify :start, resources(:service => "monkeypants"), :immediately
Given that you can now notify resources where the notify-ee is defined
after the notify-er resource, the above example would be better written
as action [:enable, :start] on the service resource, which would be
declared after the templates, etc. that configure it.
There are more complex cases where this feature is handy. For example:
service "mysql" do
  action :stop
end
execute "mv /var/lib/mysql /mnt/mysql"
service "mysql" do
  action :start
end
This code is in the EC2 recipe for mysql. The mysql service may be
configured with particular stop/start/restart commands, etc., so this
can only work correctly when the service[mysql] resource knows about
those settings.
If we look at this problem with our OO-design hat on, it looks like a violation of the single responsibility principle, and one that's baked deeply into Chef. When we declare a resource in a recipe, we're doing two things:
- Creating a data structure describing some configurable aspect of the system.
- Adding an item to Chef's ordered list of configuration actions to take.
We see this in resource declarations, where we specify an action on the resource itself; we also see this in the resource collection, where we have an ordered list of resources combined with their associated actions. To use the cooking analogy, we've combined the ingredients list with the preparation.
Let's imagine we've completely separated the ingredients from the recipe. It could look like this:
# "ingredients.rb"
file "/tmp/foo" do
  content "blah"
  mode "0644"
  # action :nothing (implicit)
end
execute "run-a-command" do
  # action :nothing (implicit)
end
service "apache2" do
  restart_command "sv 1 apache2"
  # action :nothing (implicit)
end
# "recipe.rb"
converge(:create, "file[/tmp/foo]").and_if_changed do
  # These replace notifications
  converge(:execute, "execute[run-a-command]")
  converge(:reload, "service[apache2]")
end
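For readers wondering how the chained and_if_changed call might behave, here is a rough sketch in plain Ruby. Everything in it is an assumption made for illustration: SYSTEM stands in for the real machine, converge takes the desired file content as a third argument (in the imagined syntax it would come from "ingredients.rb"), and actions without a desired value are treated as always running.

```ruby
# Toy model: SYSTEM, RAN, Outcome, and this converge method are hypothetical
# stand-ins that mimic the imagined recipe.rb syntax; none of this is Chef.
SYSTEM = { "file[/tmp/foo]" => nil }  # pretend current file content
RAN = []                              # actions that actually did something

Outcome = Struct.new(:changed) do
  # Run the block only when the preceding converge changed something.
  def and_if_changed
    yield if changed
    self
  end
end

# Apply an action and report a change only when the stored value differs.
# Actions without a desired value (execute, reload) always count as run.
def converge(action, spec, desired = nil)
  changed = desired.nil? || SYSTEM[spec] != desired
  SYSTEM[spec] = desired unless desired.nil?
  RAN << [action, spec] if changed
  Outcome.new(changed)
end

# Converging twice shows the nested actions only fire on the first pass.
2.times do
  converge(:create, "file[/tmp/foo]", "blah").and_if_changed do
    converge(:execute, "execute[run-a-command]")
    converge(:reload, "service[apache2]")
  end
end

RAN.each { |action, spec| puts "#{action} #{spec}" }
```

On the second pass the file already has the desired content, so neither the file action nor the nested actions are recorded.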
This has some neat properties, but has big downsides:
- For the 80% (90% ?) use case, it's more verbose than Chef is now, since you only need this level of control over ordering in a small subset of Chef use cases.
- It breaks all Chef code ever.
Though I'd personally enjoy exploring this design space, these two downsides are too enormous to seriously consider implementing the above. That said, we can implement Chef-as-it-is-now on top of these ideas.
Separate resource declaration from adding action to the ordered list:
- Internally separate resource declaration from append-a-resource-to-the-todo-list. The current resource collection will be split into a "resource set" and "action list."
- Normal resource declarations create a resource, add it to the "resource set", and insert a [resource, action] pair into the "action list." action :nothing could be a shortcut for not inserting the resource into the action list, or there could be some other means to accomplish this, such as a generic declare_resource method.
- Add a DSL method for appending a [resource, action] pair into the "action list." This replaces the functionality provided by resource cloning.
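The split described above can be sketched in a few lines of plain Ruby. This is only an illustration under assumptions: RunContext, declare, and converge are hypothetical names rather than existing Chef APIs, and the start_command value is a placeholder.

```ruby
# Sketch under assumptions: Resource, RunContext, declare, and converge are
# hypothetical names for the proposed split, not existing Chef APIs.
Resource = Struct.new(:type, :name, :attrs) do
  def to_s
    "#{type}[#{name}]"
  end
end

class RunContext
  attr_reader :resource_set, :action_list

  def initialize
    @resource_set = {}  # "type[name]" => Resource (the "ingredients")
    @action_list  = []  # ordered [resource, action] pairs (the "preparation")
  end

  # A normal declaration: store the resource and, unless the action is
  # :nothing, queue a [resource, action] pair.
  def declare(type, name, action: :nothing, **attrs)
    resource = Resource.new(type, name, attrs)
    @resource_set[resource.to_s] = resource
    @action_list << [resource, action] unless action == :nothing
    resource
  end

  # The new DSL method: queue another action for an already declared resource.
  def converge(action, spec)
    @action_list << [@resource_set.fetch(spec), action]
  end
end

ctx = RunContext.new
# start_command is a placeholder value for illustration.
ctx.declare(:service, "mysql", action: :stop, start_command: "/etc/init.d/mysql start")
ctx.declare(:execute, "mv /var/lib/mysql /mnt/mysql", action: :run)
ctx.converge(:start, "service[mysql]")

ctx.action_list.each { |resource, action| puts "#{action} #{resource}" }
```

Both the :stop and :start entries point at the same service[mysql] object, so the second action sees the same settings as the first without any cloning.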
This leaves the question of what to do when conflicting resources are declared. I am strongly in favor of this raising an error; however, looking at the CHEF-3694 warnings in our own Chef runs, this could be considered overly strict. Furthermore, for heavy users of community cookbooks, it could be annoying to deal with conflicts between different externally sourced cookbooks. If desired, this could be mitigated by namespacing resources by cookbook. Duplicating a resource within a cookbook would be an error, while duplicating a resource from a different cookbook would work but possibly emit a warning. Resource lookup (for notifications, etc.) would prefer a resource from the current cookbook in the case of multiples. Additional syntax would be added to fully qualify a resource for the purpose of notifying a resource from an outside cookbook when duplicates are present.
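One way the cookbook-namespaced lookup could work is sketched below in plain Ruby. The class name, the DuplicateResourceError class, the "cookbook::type[name]" qualified form, and the cookbook names "backup" and "webapp" are all hypothetical; the proposal does not specify a syntax. Raising on an ambiguous unqualified lookup from a third cookbook is also my assumption.

```ruby
# Sketch under assumptions: NamespacedResourceSet, DuplicateResourceError, and
# the "cookbook::type[name]" qualified form are hypothetical, not Chef syntax.
class DuplicateResourceError < StandardError; end

class NamespacedResourceSet
  def initialize
    @by_cookbook = Hash.new { |hash, cookbook| hash[cookbook] = {} }
  end

  # Declaring the same type[name] twice inside one cookbook is an error;
  # a clash with another cookbook only produces a warning on stderr.
  def declare(cookbook, spec, resource)
    if @by_cookbook[cookbook].key?(spec)
      raise DuplicateResourceError, "#{spec} already declared in #{cookbook}"
    end

    others = @by_cookbook.select { |name, set| name != cookbook && set.key?(spec) }.keys
    warn "#{spec} in #{cookbook} duplicates #{others.join(', ')}" unless others.empty?
    @by_cookbook[cookbook][spec] = resource
  end

  # Prefer the current cookbook; accept "cookbook::type[name]" to be explicit.
  def lookup(current_cookbook, spec)
    if spec.include?("::")
      cookbook, spec = spec.split("::", 2)
      return @by_cookbook.fetch(cookbook).fetch(spec)
    end
    return @by_cookbook[current_cookbook][spec] if @by_cookbook[current_cookbook].key?(spec)

    candidates = @by_cookbook.values.select { |set| set.key?(spec) }
    raise KeyError, "#{spec} not found" if candidates.empty?
    raise KeyError, "#{spec} is ambiguous, qualify it with a cookbook" if candidates.length > 1
    candidates.first[spec]
  end
end

set = NamespacedResourceSet.new
set.declare("backup", "log[danger]", "deleting all the backups!")
set.declare("webapp", "log[danger]", "restarting a service, you might get paged")

puts set.lookup("webapp", "log[danger]")          # the webapp cookbook's own resource
puts set.lookup("webapp", "backup::log[danger]")  # explicitly the backup cookbook's
```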
There's obviously a much quick-n-dirtier solution to the bug-as-reported, which is to create a notification resource. Here's why I think we should do a deeper refactor:
- As a matter of philosophy plus pragmatism, I want to embrace reprogrammability in a way that is stable, supportable, and has a lower barrier to entry than we have now. Reprogrammability in Chef currently involves lots of magic, code duplication, and (ab)use of internal code that can easily be broken by an innocent refactor.
- Splitting the resource collection's responsibilities into two components opens lots of opportunities to fix issues and limitations in the existing code:
  - The run_action hack is useful for some circumstances, but its limitations can lead to running entire recipes during the compile phase, for example to install build tools, database libraries, and then a database gem. To some degree this problem could be addressed by splitting cookbooks into multiple components, but this is frustrated by the way run_lists are composed (you can't make a role insert some items before and some items after the rest of the run_list). By exposing the action list for user manipulation, it's easy to write code to insert {resource, action} pairs in other locations.
  - Nested Chef runs (recipe_eval in the deploy resource, LWRPs with inline compilation enabled) can access resources in the main resource set, but have their own action list. This means we can make inline compilation the default for LWRPs with no downsides.
- Design quality. We solve the problems in CHEF-3694 by making the internal and external models of resource creation and action ordering consistent with their intended use and design.
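A rough sketch of the two sub-points above (inserting pairs at arbitrary positions, and nested runs that share the resource set but keep their own action list) might look like this in plain Ruby. ActionList, append, insert_before, and nested are hypothetical names, resources are represented only by their "type[name]" strings, and the resource names are placeholders.

```ruby
# Sketch under assumptions: ActionList, append, insert_before, and nested are
# hypothetical names; resources are represented by "type[name]" strings.
class ActionList
  attr_reader :pairs

  def initialize(resource_set)
    @resource_set = resource_set
    @pairs = []
  end

  # Queue an action at the end of the list.
  def append(spec, action)
    @resource_set.fetch(spec)  # raises KeyError for undeclared resources
    @pairs << [spec, action]
  end

  # Insert a pair ahead of the first queued action for another resource.
  def insert_before(target_spec, spec, action)
    @resource_set.fetch(spec)
    index = @pairs.index { |queued_spec, _action| queued_spec == target_spec }
    raise KeyError, "#{target_spec} is not queued" if index.nil?
    @pairs.insert(index, [spec, action])
  end

  # A nested run shares the resource set but gets a fresh, empty list.
  def nested
    ActionList.new(@resource_set)
  end
end

# Placeholder resources for illustration.
resource_set = {
  "package[build-tools]" => {},
  "template[app.conf]" => {},
  "service[app]" => {}
}

main = ActionList.new(resource_set)
main.append("template[app.conf]", :create)
main.append("service[app]", :start)
main.insert_before("template[app.conf]", "package[build-tools]", :install)

inner = main.nested
inner.append("service[app]", :restart)

main.pairs.each { |spec, action| puts "main:  #{action} #{spec}" }
inner.pairs.each { |spec, action| puts "inner: #{action} #{spec}" }
```

The nested list can reference service[app] from the shared set, yet nothing it queues leaks into the main list.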
It is worth noting that for some use cases, resource cloning is desirable. Take the example of the "server_ec2" recipe in Opscode's mysql cookbook, which moves the mysql data directory to ephemeral storage. The recipe stops service[mysql], moves the data directory with an execute resource, and then starts service[mysql] again, as in the mysql example shown earlier.
The only difference between the instances of service[mysql] here is the action.