@lamont-granquist
Last active May 8, 2019 21:42
New Chef APIs

Notifications from Resources "Bubble Up"

Released in Chef 12.9.41 via PR #4741, core Chef now has a feature that has been available in Poise for a while: notifications from within resources will now notify resources in outer run contexts. This means you can write a recipe with a service resource and send a notification to it from a resource that you write.

Notifications will bubble up from arbitrarily nested resources, so users that write resources that wrap resources which wrap your resource will still find the service resource in your default recipe.

At the same time, the resource collection's #find() and #lookup() methods and the more commonly used DSL method resources("service[ntpd]") have been changed to match this behavior. Disabling this behavior of notifications has not been found to be necessary in practice so far, so it has not been implemented. There are also internal #find_local() and #lookup_local() methods on the resource collection itself, which are used internally to keep CHEF-3694-style resource cloning from finding identical resources in outer scopes (which would have been terrible -- if you were worried about us making resource cloning any worse, you can be assured we did not).

For :immediate notifications the semantics are obviously that the notified resource will always run immediately after the notifying resource. For :delayed notifications the semantics are that the notification will fire when the delayed notifications are run in the run_context of the notified resource. So in the case of a service resource in recipe code, the delayed notifications will run at the end of the chef-client run. If a resource was declared inside of a custom resource, and was notified by a sub-resource (most likely a much less common use case) then the delayed notification would run at the end of the compile-converge-notification sequence created by the custom resource. You would see the delayed notification action occur as the last action of the outer resource running.

This most likely deprecates the design pattern of not declaring use_inline_resources on providers in order for the provider code to execute in the outer run_context and be able to notify resources in the outer run context. That pattern has been broken for some time, though, because calling any of those resources from within a use_inline_resources style resource would create a sub-resource_collection and would break the use of the not-use_inline_resources provider.

A simple example might be in thingd/recipes/default.rb to declare a service resource:

service "thingd" do
  action :start
end

And then have a resource in thingd/resources/thingd_config.rb which drops a config file and signals a restart:

action :create do
  template "/etc/thing.d/#{new_resource.name}" do
    source "thing.userconf.erb"
    variables(options: new_resource.options)
    notifies :restart, "service[thingd]", :immediately
  end
end
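
The lookup rule behind bubble-up can be sketched in plain Ruby. This is a toy model under stated assumptions, not Chef's implementation: ToyRunContext and its methods are hypothetical names, and only the idea (check the local collection first, then walk outward through the parents) comes from the text above.

```ruby
# Toy model of nested run contexts (hypothetical class, not Chef internals).
class ToyRunContext
  attr_reader :parent

  def initialize(parent = nil)
    @parent = parent
    @resources = {}
  end

  def depth
    parent ? parent.depth + 1 : 0
  end

  def declare(name)
    @resources[name] = "#{name} (depth #{depth})"
  end

  # Like find_local: only looks at this context's own collection.
  def find_local(name)
    @resources[name]
  end

  # Like the bubbling find: try locally, then each outer context in turn.
  def find(name)
    find_local(name) || (parent && parent.find(name)) ||
      raise(KeyError, "#{name} not found in any run context")
  end
end

root   = ToyRunContext.new
middle = ToyRunContext.new(root)
inner  = ToyRunContext.new(middle)

root.declare("service[thingd]")

puts inner.find("service[thingd]")               # found two levels up, in the root context
puts inner.find_local("service[thingd]").inspect # nil: nothing declared locally
```

The split between find and find_local mirrors why the local variants exist: a caller that must not see outer scopes simply never walks the parent chain.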

notifying_block API

This API was released in 12.10.24 via PR #4741 and was also adapted from Poise. It allows the creation of a new sub-resource-collection in-line in recipe or resource code, which can be used to group resources together. Just like a custom resource, the sub-resources inside the block will get their own use_inline_resources-style run_context, their own resource_collection and their own delayed notification phase. The behavior of bubble-up notifications is identical, both for resources inside the block which send notifications to resources in an outer scope, and for resources in the block which receive notifications from sub-resources that are called.

This should be useful to solve the problem of "notifications delayed to the end of the recipe". Since the notifying_block creates its own compile-converge-notification sequence, if you declare a service resource inside the block and send it delayed notifications, they will run at the end of the block, not at the end of the chef-client run. This covers the case where you need to drop two config files to configure a service: you cannot immediately notify the service when it is only half-configured, but you want it to be configured before execution continues to later recipe code.

Note, however, that notifying_block converges all the resources as it executes and it executes at "compile-time" so it effectively forces all its resources to run at compile time. In order to be useful, it most likely needs to be lazy'd via wrapping it in a ruby_block:

ruby_block "resource group" do
  block do
    notifying_block do
      service "thingd" do
        action :nothing
      end
      template "/etc/thing.d/config.conf" do
        source "config.conf.erb"
        notifies :restart, "service[thingd]", :delayed
      end
      template "/etc/thing.d/config.acls" do
        source "config.acls.erb"
        notifies :restart, "service[thingd]", :delayed
      end
    end
  end
end

There is PR #4982 still outstanding to wrap this use-case with better sugar.
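
The timing of delayed notifications inside a block can be illustrated with a small plain-Ruby simulation. This is a sketch, not how Chef is implemented: ToyContext is a hypothetical class, the block here is queued as a step (like the ruby_block-wrapped form above), and notifications are queued in the context of the notifier, which is a simplification that only holds because the service lives in the same block.

```ruby
# Toy model (hypothetical names): a context converges its steps in order, then
# fires the delayed notifications queued in it, each target at most once.
class ToyContext
  def initialize(log)
    @log = log
    @steps = []
    @delayed = []
  end

  def resource(name, notifies: nil)
    @steps << lambda do
      @log << "converge #{name}"
      @delayed << notifies if notifies && !@delayed.include?(notifies)
    end
  end

  # A nested block gets its own context, which is converged in place.
  def notifying_block(&block)
    @steps << lambda do
      inner = ToyContext.new(@log)
      block.call(inner)
      inner.converge
    end
  end

  def converge
    @steps.each(&:call)
    @delayed.each { |target| @log << "restart #{target}" }
  end
end

log = []
run = ToyContext.new(log)
run.notifying_block do |b|
  b.resource("template[config.conf]", notifies: "service[thingd]")
  b.resource("template[config.acls]", notifies: "service[thingd]")
end
run.resource("execute[uses thingd]")
run.converge

puts log
```

In this model the restart lands after both templates but before the later execute step, which is the ordering the two-config-file case needs.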

Better Dynamic Resource Construction via declare_resource and build_resource

These were released in chef-client 11.10.0 via PR #1241, so they cannot be considered "new" in any way, but they need to get a bump for better visibility. These are the methods that are internally used by chef to build a resource object (build_resource) and to build a resource object and insert it into the resource_collection (declare_resource).

The build_resource command can be used to clean up code which looks like this:

  @env_dir = Chef::Resource::Directory.new("/etc/thing.d", run_context)
  @env_dir.owner(new_resource.owner)
  @env_dir.group(new_resource.group)
  @env_dir.mode(00755)
  @env_dir

Into code which looks like this:

# note that you should probably still not be doing this and should be using declared resources in a sub-resource collection so that you
# get proper output of the actions taken and proper reporting and notification, and do not have to handle setting your own
# resource's updated_by_last_action state manually based on the update state of this manually built resource.
@env_dir = build_resource(:directory, "/etc/thing.d") do
  owner new_resource.owner
  group new_resource.group
  mode 00755
end

It should be obvious that this is preferable stylistically, produces less visual clutter and looks more like a recipe declaration of a directory resource. There is also a crucial difference which is not important in this example: we are now constructing a resource from the symbol :directory and not from the class Chef::Resource::Directory, which reduces tight coupling and winds up taking a different codepath through the dynamic Chef::ResourceResolver. In the next example this is important:

package new_resource.product_name do
  source local_path || new_resource.package_source
  provider value_for_platform_family(
    'debian'  => Chef::Provider::Package::Dpkg,
    'rhel'    => Chef::Provider::Package::Rpm,
    'windows' => Chef::Provider::Package::Windows,
  )
end

Should be better written as:

package_resource = value_for_platform_family(
  'debian'  => :dpkg_package,
  'rhel'    => :rpm_package,
  'windows' => :windows_package,
)
declare_resource(package_resource, new_resource.product_name) do
  source local_path || new_resource.package_source
end

This is actually somewhat important because of what happens when chef-client constructs the resource. In the first case, when chef parses the package statement it constructs a package resource based on no more information than the distro that it is running on. It cannot yet parse the block, so the contents of the block and the provider line are completely opaque to it. On RHEL it will construct a Chef::Resource::YumPackage and on Debian it will construct a Chef::Resource::AptPackage. It will then go on to fill out the values of that resource and will eventually parse the provider line, and subsequently the provider will be constructed based on what the user requested. On RHEL this results in a Chef::Resource::YumPackage wrapping a Chef::Provider::Package::Rpm and on Debian it will result in a Chef::Resource::AptPackage wrapping a Chef::Provider::Package::Dpkg. Both of these technically work since all the package resources inherit from Chef::Resource::Package and there is no property declared on the rpm resource which isn't also declared on the yum resource. However, the output of chef-client will appear to be wrong and you will see yum_package[whatever] (while the RPM provider is correctly running) and apt_package[whatever] (while the Dpkg provider is correctly running). This output mismatch is a feature and not a bug since it hints that you're doing something incorrect.

In the latter case, since we pass :dpkg_package or :rpm_package, what we wind up with is a Chef::Resource::RpmPackage around a Chef::Provider::Package::Rpm on RHEL and a Chef::Resource::DpkgPackage around a Chef::Provider::Package::Dpkg on Debian, which produces correct output, is semantically correct and gets validation correct for unique properties of those providers.

The shape of the API should also indicate that we're writing better code. Tight coupling (google that term) through hardcoded classes is much less flexible and it should hopefully be obvious that we're offering the information to chef that it needs earlier so that it can make better and more correct decisions (the symbol argument to declare_resource occurs before construction of the resource, while the provider argument in the block happens after the resource is constructed).
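
The difference between deciding early (from the symbol) and overriding late (from inside the block) can be shown with a toy model. Everything here is hypothetical: ToyResource, toy_declare and the provider table are illustration-only names, and the "pretend we are on RHEL" rule stands in for the real resolver.

```ruby
# Toy model only: these names are hypothetical and not Chef's resolver.
TOY_DEFAULT_PROVIDERS = { yum_package: :yum, rpm_package: :rpm }.freeze

class ToyResource
  attr_reader :type, :name

  def initialize(type, name)
    @type = type
    @name = name
    @provider = TOY_DEFAULT_PROVIDERS.fetch(type)
  end

  # DSL-style accessor: sets the provider when given a value, else reads it.
  def provider(value = nil)
    @provider = value if value
    @provider
  end

  def to_s
    "#{type}[#{name}] (#{provider} provider)"
  end
end

# The type is resolved BEFORE the block runs; the block cannot change it.
def toy_declare(type, name, &block)
  resolved = type == :package ? :yum_package : type # pretend we are on RHEL
  resource = ToyResource.new(resolved, name)
  resource.instance_eval(&block) if block
  resource
end

late  = toy_declare(:package, "thing") { provider :rpm }
early = toy_declare(:rpm_package, "thing")

puts late   # label says yum_package even though the rpm provider is set
puts early  # label and provider agree
```

The late form ends up with a mismatched label because the only information available at construction time was the generic :package symbol.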

The use of the provider argument inside of the block should be considered a not-best-practice and probably deprecated-ish. The problem is that there is so much code that uses it, and it still does have a use in the service provider (which necessarily has to do late-binding of the provider and has a model of having one Chef::Resource::Service mapping onto many Providers), so it's unlikely to be really deprecated any time soon. But just because Chef doesn't yell a deprecation warning at you doesn't mean it should be used.

Better Resource Collection Manipulation APIs

Also released in 12.10.24 via PR #4834 is a better resource_collection API which largely replaces and deprecates the functionality of chef-rewind. Noah Kantrowitz has also released a blog post about this API and chef-rewind's deprecation.

The goal of the API design was to add a complete set of CRUD operators around manipulating the resource_collection, along with upsert-like (create-or-update) convenience methods. The shape of the API is similar to declare_resource and build_resource and uses the same method signature:

# returns a resource, but does not place it on the resource collection
build_resource(:file, "/tmp/foo") do
  content "bar"
end  
  
# returns a resource, and has added it to the resource collection.
#
# in CRUD terms this is "Create"
declare_resource(:file, "/tmp/foo") do
  content "bar"
end

# edits (and returns) a resource if it finds it on the resource collection.  fails if the resource has not been declared.
#
# in CRUD terms this is "Update"
edit_resource!(:file, "/tmp/foo") do
  content "bar"
end

# edits (and returns) a resource if it finds it on the resource collection.  creates the resource if it has not been declared.
# the block is applied to the resource every time this runs.
#
# in CRUD terms this is "Upsert"
edit_resource(:file, "/tmp/foo") do
  content "bar"
end

# finds the resource if it exists.  fails if the resource has not been declared.  this is read-only so does not take a block.
#
# in CRUD terms this is "Read"
find_resource!(:file, "/tmp/foo")

# finds the resource if it exists, creates it if it does not.
# in contrast to edit_resource() this only ever applies the block once, if the resource does not already exist.
#
# in CRUD terms this is...  i dunno???
find_resource(:file, "/tmp/foo") do
  content "bar"
end

# delete the resource if it exists.  fails if the resource has not been declared.
#
# in CRUD terms this is "Delete"
delete_resource!(:file, "/tmp/foo")

# delete the resource if it exists, does not fail if the resource has already been removed.
#
# i think of this as an idempotent assertion of CRUD's "Delete".
delete_resource(:file, "/tmp/foo")

Note that find_resource! and resources() overlap with identical functionality and just a different shape of the API. Either one may continue to be used, there is no preference for either form.
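
The semantic differences between these helpers (block applied every time vs. only on creation, failing vs. idempotent) are easy to lose in a list of comments, so here is a toy model in plain Ruby. ToyCollection is a hypothetical class whose method names merely mirror the helpers above; resources are plain hashes, and unlike the real resource collection, declaring the same name twice simply overwrites the entry.

```ruby
# Toy stand-in for the resource collection (hypothetical, not Chef code).
class ToyCollection
  NotFound = Class.new(StandardError)

  def initialize
    @resources = {}
  end

  # "Create": build the resource, apply the block, store it.
  def declare_resource(type, name)
    res = { type: type, name: name }
    yield res if block_given?
    @resources["#{type}[#{name}]"] = res
  end

  # "Read": fails when the resource was never declared.
  def find_resource!(type, name)
    key = "#{type}[#{name}]"
    @resources.fetch(key) { raise NotFound, key }
  end

  # "Upsert": create if missing, and apply the block on every call.
  def edit_resource(type, name)
    res = @resources["#{type}[#{name}]"] || declare_resource(type, name)
    yield res if block_given?
    res
  end

  # Find-or-create: the block only runs when the resource is first created.
  def find_resource(type, name, &block)
    @resources["#{type}[#{name}]"] || declare_resource(type, name, &block)
  end

  # "Delete": fails when there is nothing to delete.
  def delete_resource!(type, name)
    find_resource!(type, name)
    @resources.delete("#{type}[#{name}]")
  end

  # Idempotent delete: silently does nothing when already gone.
  def delete_resource(type, name)
    @resources.delete("#{type}[#{name}]")
  end
end

collection = ToyCollection.new
collection.find_resource(:file, "/tmp/foo") { |r| r[:content] = "first" }
collection.find_resource(:file, "/tmp/foo") { |r| r[:content] = "second" } # block skipped
after_find = collection.find_resource!(:file, "/tmp/foo")[:content]

collection.edit_resource(:file, "/tmp/foo") { |r| r[:content] = "third" }  # block applied
after_edit = collection.find_resource!(:file, "/tmp/foo")[:content]

collection.delete_resource(:file, "/tmp/foo")
collection.delete_resource(:file, "/tmp/foo") # second delete is a no-op

puts after_find, after_edit
```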

There is also a new helper with_run_context which has been added to allow manipulating resource collections outside of the current one. This is mostly intended to be used with the sugar of the :root and :parent arguments:

# can be placed into a custom resource or a use-inline-resources style provider and the service resource
# will be created in the outermost 'recipe' run_context -- no matter how deeply the resource is called from other
# resources (which could be called from other resources, etc, etc)
with_run_context :root do
  service "thingd" do
    action :nothing
  end
end

The delayed_action helper

Scheduled to be released in Chef 12.16.x via PR #5443 and PR #5446, this lets us create a resource and at the same time send a :delayed notification to the resource, which can be useful for constructing accumulators.

It is basically a short hand for this kind of abuse of the log resource:

service "thingd" do
  action :nothing
end
log "restart thingd in my delayed notification block" do
  notifies :start, "service[thingd]", :delayed
end

Which we can now do without the intervention of the log resource and without causing that log resource to always fire:

service "thingd" do
  action :nothing
  delayed_action :start
end

You still have to give the action :nothing in order to suppress the resource from firing as a normal part of the converge phase.

Easy injection into the Recipe DSL

There were various issues involved in injecting methods to be used into the Recipe DSL which made it somewhat unpleasant (see what chef-sugar does and see the discussion in #4674). It was also somewhat confusing as to which methods (like shell_out) were being injected into different contexts. As a result of this the lazy_module_include helper was written (to avoid the need to inject methods into multiple different target modules and have module inclusion into the DSL modules work more like object inheritance -- long story) and then the various DSL contexts were organized into three different contexts, Chef::DSL::Recipe, Chef::DSL::Core and Chef::DSL::Universal. These DSLs were released in 12.11.18 via PR #4942. All of those files contain a fairly self-explanatory comment block about how they are organized and what they affect:

    # Part of a family of DSL mixins.
    #
    # Chef::DSL::Recipe mixes into Recipes and LWRP Providers.
    #   - this does not target core chef resources and providers.
    #   - this is restricted to recipe/resource/provider context where a resource collection exists.
    #   - cookbook authors should typically include modules into here.
    #
    # Chef::DSL::Core mixes into Recipes, LWRP Providers and Core Providers
    #   - this adds cores providers on top of the Recipe DSL.
    #   - this is restricted to recipe/resource/provider context where a resource collection exists.
    #   - core chef authors should typically include modules into here.
    #
    # Chef::DSL::Universal mixes into Recipes, LWRP Resources+Providers, Core Resources+Providers, and Attributes files.
    #   - this adds resources and attributes files.
    #   - do not add helpers which manipulate the resource collection.
    #   - this is for general-purpose stuff that is useful nearly everywhere.
    #   - it also pollutes the namespace of nearly every context, watch out.
    #

As per the comment this makes it easy to inject a module into the Recipe DSL used by cookbook authors:

module ACMERecipeHelpers
  def acme_warn_foo
    Chef::Log.warn "foo"
  end
end
Chef::DSL::Recipe.send(:include, ACMERecipeHelpers)

Keep in mind that, as with all methods injected into the global DSL, if you pick a simple name like foo and someone else picks a simple name like foo, and both your cookbooks are included in the run_list, you'll get into a fight. So it's appropriate to consider rubbing a bit of namespacing on the method names that you inject.
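
The collision is ordinary Ruby module behavior and can be reproduced without Chef. In this sketch ToyRecipeDSL is a hypothetical stand-in for Chef::DSL::Recipe, and the two helper modules are made-up cookbooks; the class is defined after both includes so the result does not depend on how a given Ruby version propagates later includes.

```ruby
# Hypothetical stand-in for Chef::DSL::Recipe.
module ToyRecipeDSL; end

# Two unrelated cookbooks that both inject a helper called foo.
module CookbookAHelpers
  def foo
    "foo from cookbook A"
  end

  # A namespaced name cannot be clobbered by accident.
  def acme_a_foo
    "foo from cookbook A"
  end
end

module CookbookBHelpers
  def foo
    "foo from cookbook B"
  end
end

ToyRecipeDSL.send(:include, CookbookAHelpers)
ToyRecipeDSL.send(:include, CookbookBHelpers)

# Anything that mixes in the DSL sees whichever foo was included last.
class ToyRecipe
  include ToyRecipeDSL
end

recipe = ToyRecipe.new
puts recipe.foo        # cookbook B silently wins
puts recipe.acme_a_foo # the namespaced helper is unaffected
```

Nothing warns about the override, which is why a prefix on injected method names is cheap insurance.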

Also keep in mind that Chef::DSL::Recipe is the least dangerous DSL to inject methods into. If you use Chef::DSL::Universal you will inject methods into all core chef providers and resources and attribute files -- convenient for a future implementation of chef-sugar, but if you abuse this in your own cookbooks, particularly with simply-named methods, you may get into a fight with a minor release of chef-client in the future and we will be unlikely to consider it breaking (chef-sugar is integrated into our own CI processes now so that we can be more assured that we don't break that gem before we release new versions of the client).

Putting it Together: Writing Delayed Accumulators using edit_resource

We can now put together how to write a delayed accumulator using the new APIs. In order to accumulate state into a single template resource which updates only once at the end of the chef-client run, we can do the following:

action :add do
  with_run_context :root do
    edit_resource(:template, "/tmp/aliases") do |new_resource|
      source 'aliases.erb'
      variables[:aliases] ||= {}
      variables[:aliases][new_resource.address] ||= []
      variables[:aliases][new_resource.address] += new_resource.recipients
      action :nothing
      delayed_action :create
    end
  end
end

There is a fairly delicate issue involved in the edit_resource ... do |new_resource| call, which was recently introduced in PR#5441 and is scheduled to be released in Chef 12.16.x. edit_resource runs its block on every invocation, but it only creates a new resource the first time it is called, and that resource creates a closure over its wrapping resource at that point, so a bare 'new_resource' will be bound to the first time your custom resource was called. If you want to pass your resource's properties to the resource you are editing/constructing, then you MUST call edit_resource this way, and MUST fully declare your properties as new_resource.property_name. This issue is directly related to Ruby language scoping rules. You can also simply introduce a local variable like nr_address = new_resource.address at the top of your action and bypass this entire problem.
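
The scoping difference between a bare new_resource and a block parameter named new_resource can be modeled in plain Ruby. This is only one way to reproduce the behavior described above: ToyTemplate, ToyAction and toy_edit_resource are hypothetical, and the delegation through method_missing is an assumption of this sketch rather than a statement about Chef's internals.

```ruby
# Toy resource that remembers who created it and forwards unknown method
# calls (such as new_resource) to that creator. Hypothetical, not Chef code.
class ToyTemplate
  attr_reader :seen

  def initialize(creator)
    @creator = creator
    @seen = []
  end

  def method_missing(name, *args)
    @creator.respond_to?(name) ? @creator.public_send(name, *args) : super
  end

  def respond_to_missing?(name, include_private = false)
    @creator.respond_to?(name, include_private) || super
  end
end

# Stand-in for one invocation of a custom resource's action.
ToyAction = Struct.new(:new_resource)

TOY_STORE = {}

# Creates the resource on first use, then runs the block against it every time,
# passing the CURRENT caller's new_resource as the block argument.
def toy_edit_resource(key, caller_action, &block)
  res = (TOY_STORE[key] ||= ToyTemplate.new(caller_action))
  res.instance_exec(caller_action.new_resource, &block)
  res
end

first  = ToyAction.new("alice@example.com")
second = ToyAction.new("bob@example.com")

# Bare new_resource is a method call, resolved through the first creator.
[first, second].each do |act|
  toy_edit_resource("implicit", act) { seen << new_resource }
end

# With a block parameter, new_resource is a local supplied by each caller.
[first, second].each do |act|
  toy_edit_resource("explicit", act) { |new_resource| seen << new_resource }
end

puts TOY_STORE["implicit"].seen.inspect
puts TOY_STORE["explicit"].seen.inspect
```

In the implicit form the second caller's value never arrives, which is the accumulator bug the block parameter avoids.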

Putting it Together: Writing Delayed Accumulators using find_resource

Find resource can also be used to write delayed accumulators:

action :create do
  r = with_run_context :root do
    find_resource(:template, "/tmp/aliases") do
      source 'aliases.erb'
      variables[:aliases] = {}
      action :nothing
      delayed_action :create
    end
  end
  r.variables[:aliases][address] ||= []
  r.variables[:aliases][address] += recipients
end

That is entirely equivalent to the edit_resource form. Since find_resource only ever evaluates the block once (when it does not find the resource and builds it the first time), the issues over the closure over the new_resource go away, but the price is that you have to save the resource to a variable and manipulate it directly. Note that the return value of with_run_context is guaranteed to be the value of the last expression in the block it is passed.

Putting it Together: Eager Accumulators and Replacing Definitions with Resources

The differences between definitions and resources are really that definitions are a block of code that can manipulate the current run_context and which runs at compile_time. Since we now have the ability to write a resource which manipulates its :parent run_context, and we have a pattern from chef_gem for how to force a resource to run at compile_time, we can mimic the functionality of definitions. It might seem like there is little advantage to this, but while definitions look superficially like resources, the parameter parsing they use is antiquated at best and completely broken at worst. They do no validation at all of their arguments, and since they use method_missing to evaluate their parameters it is impossible to write a definition that takes a parameter that overlaps with a method that is injected into Object or Kernel. This is problematic since, for example, the Timeout class injects Kernel#timeout into Ruby, which makes it impossible to write a definition that takes a timeout parameter.
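
The method_missing problem is reproducible in a few lines of plain Ruby. ToyDefinitionParams is a hypothetical collector, not the real definition implementation, and Kernel#format stands in for timeout here because format is defined by Kernel in every Ruby version, whereas whether timeout is injected depends on the Ruby in use.

```ruby
# Toy parameter collector in the style of definitions (hypothetical class):
# any unknown method call is recorded as a parameter.
class ToyDefinitionParams
  attr_reader :params

  def initialize
    @params = {}
  end

  def method_missing(name, *args)
    @params[name] = args.first
  end
end

collector = ToyDefinitionParams.new
collector.instance_eval do
  owner "root"   # no such method anywhere, so method_missing records it
  format "json"  # Kernel#format already exists, so it is called instead
end

puts collector.params.keys.inspect # prints [:owner]
```

The format parameter is silently lost with no error, which is exactly the kind of failure that property declarations with validation rule out.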

While it might be possible to remove the use of method_missing to populate the params hash of definitions and to use properties in order to obtain validation for definitions, in the immortal words of Noah Kantrowitz, "How is this not a resource?".

We can show this by taking a simplified multipackage definition implementation:

define :multipackage do
  package_names = [ params[:name] ].flatten
  t = begin
        resources(:package => "collected packages")
      rescue Chef::Exceptions::ResourceNotFound
        package "collected packages" do
          package_name Array.new
        end
      end
  t.package_name += package_names
end

And converting it to a resource:

provides :multipackage
resource_name :multipackage

property :package_name, [String, Array], coerce: proc { |x| [x].flatten }

default_action :install

action :install do
  r = with_run_context :parent do
    find_resource(:package, "collected packages") do |new_resource|
      package_name []
    end
  end
  r.package_name += package_name
end

def after_created
  run_action(:install)
  action :nothing # don't run twice
end

This is different from the prior "delayed accumulator" examples since there is no delayed_action call and no delayed notification taking place. There is also considerable magic occurring in the after_created method, but this is just the same magic used for years in the chef_gem resource to force it to run at compile_time.

The net result of this is that while this custom resource gets its own run_context and resource_collection, we break out of that with the with_run_context helper to manipulate our parent resource_collection.

When using this pattern you get an 'eager' accumulator that installs early. The first time this multipackage resource is encountered in recipe code, the package resource will get built right there. Subsequent multipackage calls in later recipe code will edit that resource at compile time. In this way later recipes in the run_list can inject packages into the resource we just created. At the same time in the compile phase the package resource will have been created "early" in the resource_collection so it will converge "early". Later recipes can rely on the fact that the packages they wanted installed will be installed.

If we had used a delayed accumulator pattern for this resource all the packages would be installed at the very end of the chef-client run, which is likely not what you want.

At the same time if we attempted to use the :root run_context the problem is that we may have code structured like this:

recipes/default.rb:

# these calls, of course, might be in different cookbooks in the run_list
multipackage "git"
my_resource "that does some stuff"

resources/my_resource.rb:

action :doit do
  multipackage "nmap"
end

What would happen if we attempted to place a single package resource in the :root run context would be (the compile/converge phase of the multipackage resource itself has been omitted since it doesn't add any clarity):

  1. multipackage "git" runs at compile_time and places package [ "git" ] onto the :root resource_collection
  2. my_resource "that does some stuff" constructs the my_resource resource and places it onto the :root resource_collection
  3. the package [ "git" ] action is executed and the git package is installed
  4. the my_resource action is run
     a. the compile phase of my_resource action is run and the package resource is found and "nmap" is added to its package_name
     b. the converge phase of my_resource action is run and nothing happens

The problem here is that in #3 we've already converged the package resource we're building in the root context, but we add a package to it in #4a when a later resource is converging.
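
The four steps above can be replayed as a tiny plain-Ruby timeline. This is a deliberately crude sketch: the hash and lambdas are hypothetical stand-ins for the package resource and the root resource_collection, with comments mapping each line back to the numbered steps.

```ruby
installed = []

# Step 1: multipackage "git" builds the shared package resource at compile time.
package = { names: ["git"] }

# Step 2: the root collection now holds the package resource and my_resource.
root_collection = [
  # Step 3: the package resource converges and installs what it knows about.
  lambda { installed.concat(package[:names]) },
  # Step 4a: my_resource's action runs later and edits the already-converged package.
  lambda { package[:names] += ["nmap"] },
]

root_collection.each(&:call)

never_installed = package[:names] - installed
puts "installed: #{installed.inspect}"
puts "never installed: #{never_installed.inspect}"
```

The package resource ends up listing nmap, but its one and only converge has already happened, so nmap is never installed in this run.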

TL;DR: Either write a delayed accumulator with the former pattern that runs once very late, or write an eager accumulator with this pattern that may run several times (but later resources can count on the actions they need having taken place).

HEADSUP: use_inline_resources will likely become the only way to write resources

The old way of writing providers that do not declare use_inline_resources will probably go away in Chef 13, and old providers that depend on that behavior will break. We will definitely be making the behavior of use_inline_resources the default in Chef 13 (see Issue #3123).

The problem with introducing a dont_use_inline_resources flag for backwards compatibility is that the behavior is fundamentally broken. It breaks notifications, which is why use_inline_resources was built in Chef 11 (see COOK-1385 for one example out of dozens in the old bug tracker). Plus, if the intent is to rely on not constructing a new resource collection, then the resource will be broken when called from non-recipe code in a use_inline_resources style provider.

The pattern of including a service resource in a recipe and then notifying it from a provider is the only reason I'm aware of favoring the old way of doing things, and that now works by default after the bubble-up notification patch. It is likely that many old providers that couldn't work with use_inline_resources now simply work correctly with use_inline_resources with zero or little porting work.

EDIT: in Chef 13 use_inline_resources is now on by default for all resources. By using action :whatever do ... end you will get a sub-resource collection. The way to "defeat" this is to use def action_whatever ... end, but never do that.

Question to Ponder: What use are Definitions any more?

I don't think they have much of a use. If you can't be arsed to write a resource and want something simpler, then you should most likely just inject a method into the DSL (i.e. like the ssh_known_hosts cookbook does now). That bypasses the ugly method_missing params mangling that definitions do. If you want to have something that has a 'shape' of a resource and want to pass properties to your function and validation sounds like a good thing to you, then you probably want to go with a resource and use the tools here to make it work like a definition.

Since it's easier for me (and anyone else facing this issue) to rewrite a definition as a resource than to fix definitions to have properties and remove the method_missing construction in definitions, it's fairly unlikely that definitions will ever get fixed.

Since definitions are mostly just a (kinda kludgy) way to create a method on the resources DSL with some sugar around the method taking a block with parameters... "there is no there there".
