juliandunn/chef-client-run-internals.md

## chef-client-run-internals.md

      
A Quick Tour of Chef Client Run Internals

Dan DeLeo appeared on the FoodFightShow some time ago to walk through "what a Chef run really does". I expanded on these remarks in my personal investigation.
/usr/bin/chef-client


bin/chef-client creates a new Chef::Application::Client (a subclass of Chef::Application, which sets up common things such as loggers for chef-client, chef-solo, knife, etc.). From there, jump to:


lib/chef/client.rb


The application classes create a new Chef::Client object, which runs initialize().


Look at do_run: that is where the whole run happens, inside a big mutex shared by all Chef runs to ensure that only one Chef run happens at a time.


A run_context is the object that holds state for the current Chef run. It resembles a "God object", which is generally considered an antipattern (not sure why); Dan says they have done what they can to delegate functionality to other objects.


setup_run_context is where the cookbook compilation happens, using Chef::RunContext::CookbookCompiler, which takes the expanded run list.


The compilation looks like this:

Any Ruby is evaluated first,
then cookbook files are loaded in this order:

Libraries
Attributes, in cookbook order

Within a cookbook, default.rb is loaded first, and then all the remaining attributes files in lexical sort order.
There is no relationship between attributes file X.rb and recipe X.rb! Very important.


LWRPs (providers first, then the resources)
Resource Definitions: mostly superseded by LWRPs. One big problem is duplicate definitions: the later definition wins!
Recipes
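
The load order above can be sketched in a few lines of plain Ruby. This is a toy illustration, not Chef's code: `COMPILE_PHASES` and `attribute_load_order` are hypothetical names invented here, and the real work is done by Chef::RunContext::CookbookCompiler.

```ruby
# Toy sketch only: these names are hypothetical, not Chef's own API.
# Load phases in the order listed in the notes above.
COMPILE_PHASES = %i[libraries attributes lwrps definitions recipes].freeze

# Order one cookbook's attribute files: default.rb first, then the
# remaining files in lexical sort order.
def attribute_load_order(filenames)
  default, rest = filenames.partition { |f| f == "default.rb" }
  default + rest.sort
end

puts attribute_load_order(["tuning.rb", "default.rb", "apache.rb"]).inspect
# prints ["default.rb", "apache.rb", "tuning.rb"]
```

Note that nothing in this ordering looks at recipe names, which matches the point that attributes file X.rb and recipe X.rb are unrelated.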


Defined resources are placed into the resource collection, which is an ordered hash.


Chef::Runner is then passed the compiled run_context and convergence happens.


Notifications:

Immediate notifications cause the action on the notifee to run as soon as the notifier's action is complete.
Delayed notifications are collected until the end of the Chef run, where they are fired one by one.
In Chef 10, if an unhandled exception occurred, the delayed notifications were dropped on the floor. Chef 11 is smart enough not to do that anymore. (CHEF-581)
Even if a delayed notification fails, Chef 11 just continues and adds the failure to a list of things that tell you what went wrong.
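
As a rough model of the Chef 11 behaviour described above, here is a toy runner. Everything in it is an assumption made for illustration: `ToyRunner` and the hash-based resource format are invented here and are not how Chef::Runner is implemented.

```ruby
# Toy model of immediate vs. delayed notifications. ToyRunner and the
# hash-based resources are hypothetical, not Chef classes.
class ToyRunner
  attr_reader :log, :errors

  def initialize
    @log = []      # names of actions, in the order they ran
    @delayed = []  # notifees queued for the end of the run
    @errors = []   # failures raised by delayed notifications
  end

  # resources: array of { name:, notifies: [[target, timing], ...], fail: }
  def converge(resources)
    resources.each { |res| run(res) }
  ensure
    # Chef 11 style: delayed notifications fire even if the run blew up.
    run_delayed
  end

  private

  def run(res)
    raise "#{res[:name]} failed" if res[:fail]
    @log << res[:name]
    (res[:notifies] || []).each do |target, timing|
      if timing == :immediately
        run(target) # runs right after the notifier's action
      else
        @delayed << target unless @delayed.include?(target)
      end
    end
  end

  def run_delayed
    @delayed.each do |target|
      begin
        run(target)
      rescue StandardError => e
        @errors << e.message # keep going, remember what went wrong
      end
    end
  end
end

restart = { name: "service[apache2]" }
reload  = { name: "execute[reload]" }
runner = ToyRunner.new
runner.converge([
  { name: "template[a]", notifies: [[restart, :delayed], [reload, :immediately]] },
  { name: "template[b]", notifies: [[restart, :delayed]] }
])
puts runner.log.inspect
# prints ["template[a]", "execute[reload]", "template[b]", "service[apache2]"]
```

The restart is queued twice but runs once, at the end; the immediate reload runs directly after the template that triggered it.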


Event Subsystem


There were two motivators: why-run needed instrumentation, and the existing output was obtuse (only timestamps and info messages), which made it hard to track down what was going wrong.
The event subsystem lets you define points of interest inside the Chef code (e.g. the cookbook compiler publishes an event after each load step).
lib/chef/event_dispatch/base.rb is the base class for this.
If you inherit from this class, you'll start getting events and can do what you wish with them -- behold, the nyancat formatter!
Another example is ChefSpec, which overrides almost all output.
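
The pattern is a base class full of no-op callbacks that subscribers override selectively. The sketch below imitates it with stand-ins: `ToyEventBase`, `ToyDispatcher`, `CollectingFormatter` and the event names are all assumptions made here, not the actual contents of lib/chef/event_dispatch/base.rb.

```ruby
# Stand-in for the event base class: every event is a no-op by default,
# so subclasses only override the events they care about.
# All names here are hypothetical.
class ToyEventBase
  def library_load_start(file_count); end
  def recipe_load_complete; end
  def resource_updated(resource_name); end
end

# Forwards each published event to every registered subscriber.
class ToyDispatcher
  def initialize(*subscribers)
    @subscribers = subscribers
  end

  def publish(event, *args)
    @subscribers.each { |s| s.public_send(event, *args) }
  end
end

# A formatter that only cares about updated resources.
class CollectingFormatter < ToyEventBase
  attr_reader :lines

  def initialize
    @lines = []
  end

  def resource_updated(resource_name)
    @lines << "updated #{resource_name}"
  end
end

formatter = CollectingFormatter.new
events = ToyDispatcher.new(formatter)
events.publish(:library_load_start, 3)       # ignored: inherited no-op
events.publish(:resource_updated, "file[a]") # handled by the override
puts formatter.lines.inspect
# prints ["updated file[a]"]
```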

What is converge_by for in LWRPs?


converge_by makes why-run work.
It's just a wrapper around a why_run? check: in why-run mode it only reports what would be done; otherwise it runs the block.
It saves you from having to set the updated status yourself (old style: new_resource.updated_by_last_action(true)).
The general idea for any provider: if you need to change something, wrap the change in converge_by (...).
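
A minimal sketch of that idea, assuming a stripped-down provider: `ToyProvider` is hypothetical, and this simplified version only marks itself updated on a real run, which may not match how Chef treats the updated status in why-run mode.

```ruby
# Toy provider showing the shape of converge_by. ToyProvider is a
# hypothetical class, not Chef::Provider.
class ToyProvider
  attr_reader :messages

  def initialize(why_run:)
    @why_run = why_run
    @updated = false
    @messages = []
  end

  def updated?
    @updated
  end

  # In why-run mode, only describe the change. Otherwise run the block
  # and record that something was updated, so callers never have to
  # set the flag themselves.
  def converge_by(description)
    if @why_run
      @messages << "Would #{description}"
    else
      yield
      @messages << description
      @updated = true
    end
  end
end

created = false
dry = ToyProvider.new(why_run: true)
dry.converge_by("create /tmp/example") { created = true }
puts dry.messages.inspect
# prints ["Would create /tmp/example"]
puts created
# prints false
```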

The Node Object - what is it, actually?


At its core it's a model object - a bunch of data about your node


Rails ActiveRecord-style pattern for interacting with the API


The class that holds the data has methods like save and load, etc.


node.save --> the responsibility for talking to the server is built into the same object as the data.


Properties of the data? It looks like a large hash but it's more complicated than that.


A node has attributes; they are implemented by the node attribute class, which deals with all the precedence and merging.


When you call a method that the node doesn't know about, it falls into a method_missing hook that looks the name up in the attributes.


node[] is an element reference operator that will go directly into the attribute class


node.chef_environment, node.name, etc. are methods on the node object itself. You're in for a sad surprise if you try node['chef_environment'], because the element reference operator goes directly to the attribute class.
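
The difference between real methods, method_missing lookups and the element reference operator can be reproduced with a toy class. `ToyNode` is hypothetical and uses a plain hash where Chef uses its attribute class with precedence and merging.

```ruby
# Toy node: name and chef_environment are real methods; everything else
# lives in an attribute store. ToyNode is hypothetical, not Chef::Node.
class ToyNode
  attr_reader :name
  attr_accessor :chef_environment

  def initialize(name, attributes = {})
    @name = name
    @chef_environment = "_default"
    @attributes = attributes # plain hash standing in for the attribute class
  end

  # Element reference goes straight to the attributes, nowhere else.
  def [](key)
    @attributes[key.to_s]
  end

  # Unknown methods fall through to an attribute lookup.
  def method_missing(symbol, *args)
    key = symbol.to_s
    return @attributes[key] if args.empty? && @attributes.key?(key)
    super
  end

  def respond_to_missing?(symbol, include_private = false)
    @attributes.key?(symbol.to_s) || super
  end
end

node = ToyNode.new("web1", "platform" => "ubuntu")
puts node.platform                     # prints ubuntu (via method_missing)
puts node["platform"]                  # prints ubuntu (via [])
puts node.chef_environment             # prints _default (a real method)
puts node["chef_environment"].inspect  # prints nil (not an attribute)
```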


What are good ways and bad ways to extend resources and providers?


Using an existing resource or provider as a component is great: you get its functionality wholesale.
File handling: we've componentized that already.
In some situations you need to subclass, because we haven't broken things up into components (yet).
Last resort... monkey patch.
Accessing private internals via send? (Ugh. You can work around privacy with send, but go back a little later and see whether there's a different way, or whether that thing should be private at all.)

For training… what else do people want to know?


How does the DSL work? (Ruby metaprogramming magic, instance_eval) -- Franklin's "Fundamentals of Ruby" webcast series goes into this.
How run_context's load_recipe creates a new Chef::Recipe
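
To give a flavour of the instance_eval approach, here is a toy recipe DSL. `DslRecipe` and `DslResource` are hypothetical names, and the sketch defines `package` and `service` explicitly; it is not a description of how Chef::Recipe resolves resource names.

```ruby
# Toy DSL in the spirit of a recipe. DslResource and DslRecipe are
# hypothetical classes, not Chef's.
class DslResource
  attr_reader :type, :name

  def initialize(type, name)
    @type = type
    @name = name
    @action = :nothing
  end

  # Getter and setter in one, as DSL-style attributes usually are.
  def action(value = nil)
    @action = value unless value.nil?
    @action
  end
end

class DslRecipe
  attr_reader :resources

  def initialize
    @resources = [] # resources in declaration order
  end

  def package(name, &block)
    declare(:package, name, &block)
  end

  def service(name, &block)
    declare(:service, name, &block)
  end

  # Evaluate recipe source with self set to this recipe object, so bare
  # calls like `package "x"` resolve to the methods above.
  def from_string(source)
    instance_eval(source)
    self
  end

  private

  def declare(type, name, &block)
    resource = DslResource.new(type, name)
    # The block body runs with self set to the resource.
    resource.instance_eval(&block) if block
    @resources << resource
    resource
  end
end

recipe = DslRecipe.new.from_string(<<~RECIPE)
  package "apache2"
  service "apache2" do
    action :start
  end
RECIPE
puts recipe.resources.map { |r| "#{r.type}[#{r.name}] #{r.action}" }.inspect
# prints ["package[apache2] nothing", "service[apache2] start"]
```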