@dbarrosop
Last active December 28, 2016 10:16

Philosophy

Although this might look similar to other configuration management systems, its philosophy is a bit different. Some of its key aspects are:

  1. Data is key. Data should be easy to gather, process and make available to potential consumers.
  2. YAML is not a programming language; it's just a means to define simple data and glue together different building blocks.
  3. It's easier to write, troubleshoot and debug simple Python than complex YAML.
  4. It's easier to write, troubleshoot and debug simple Python than complex Jinja2 templates.

Inventory

The inventory is just a replaceable Python script that adheres to some standards. It's supposed to be simple and provide generic information like groups/roles, devices and a mapping between the two. It also provides connectivity information (hostname, username, password, OS, etc.). More specific data is gathered later on by modules (see the Modules section).
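
As a rough illustration of the kind of data such a script could return, here is a minimal sketch. The interface is not defined yet, so everything here is an assumption: the `Host` and `Inventory` classes, their field names and the `load_inventory` entry point are hypothetical. Only `groups()` and `hosts_in_group()` are borrowed from the module example in the Modes of operation section.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Connectivity information for one device (field names are illustrative)."""
    name: str
    hostname: str
    username: str
    password: str
    os: str


@dataclass
class Inventory:
    """Generic inventory: devices, groups and the mapping between the two."""
    hosts: dict = field(default_factory=dict)        # host name -> Host
    membership: dict = field(default_factory=dict)   # group name -> list of host names

    def groups(self):
        # Group names in a stable order.
        return sorted(self.membership)

    def hosts_in_group(self, group):
        # Unknown groups simply yield no hosts.
        return [self.hosts[name] for name in self.membership.get(group, [])]


def load_inventory():
    """Hypothetical entry point a replaceable inventory script could expose."""
    hosts = {
        "spine1": Host("spine1", "10.0.0.1", "admin", "secret", "eos"),
        "leaf1": Host("leaf1", "10.0.0.11", "admin", "secret", "junos"),
    }
    membership = {"spines": ["spine1"], "leafs": ["leaf1"]}
    return Inventory(hosts=hosts, membership=membership)
```

Any script returning the same shape of data (a static file reader, a CMDB query, etc.) could replace this one without modules noticing.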

Modules

Modules provide reusable functionality and are the basic building blocks.

Writing Modules

To write modules, a base class will be provided with some basic functionality. A module can then implement either or both of two methods, run_scoped and run_global, to provide its functionality.

Modes of operation

Modules can run in two modes:

  • global - Tasks run in global mode are run only once and their data is made available to everybody. Tasks in the pre and post sections are global. To provide global functionality, a module has to implement the run_global method.
  • scoped - Tasks run in scoped mode are run per device and their data is only made available to other tasks within the same scope or to other global tasks. Tasks defined in the tasks section are scoped. To provide scoped functionality, a module has to implement the run_scoped method.

A module can combine run_global and run_scoped to easily provide both functionalities. For example:

class MyModule(BaseClassYetToProvide):

    def run_scoped(self, host):
        # Runs once per device.
        do_something(host)

    def run_global(self):
        # Runs once; here it simply applies the scoped logic to every host.
        for group in self.inventory.groups():
            for host in self.inventory.hosts_in_group(group):
                self.run_scoped(host)

A module doesn't need to overload both methods. It might make sense for some modules to only run in scoped mode (for example, a module that loads configuration on a device) or only in global mode (for example, a module that processes data).

Modules

  • facts_yaml - Reads a directory structure with hostfiles and groupfiles and provides data. Provides similar functionality to other configuration management systems.
  • napalm_facts - Gathers facts from live devices using napalm getters.
  • network_services - Based on a directory structure $service/$vendor/template.j2 defines a set of services that can be mapped to devices and groups.
  • napalm_configure - Provides configuration capabilities based on napalm.
  • ip_fabric - Reads a file defining a topology, correlates the definition with data from the inventory and generates all the necessary data for the deployment.
  • template - Provides generic jinja2 functionality.
  • napalm_validate - Provides the validate functionality of napalm.

Commits

Brigade doesn't have the notion of commit or dry-run. However, modules that perform changes should provide a commit argument to dictate if you want to perform the change or not.
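
A minimal sketch of what honoring a commit argument could look like in a module that performs changes. `FakeDevice` and `configure` are hypothetical names, and the device is just an in-memory stand-in; a real module would talk to the device through something like napalm instead.

```python
import difflib


class FakeDevice:
    """Stand-in for a real device connection; purely illustrative."""

    def __init__(self, running_config):
        self.running_config = running_config


def configure(device, candidate, commit=False):
    """Return the diff; only apply the candidate when commit is True."""
    diff = "\n".join(
        difflib.unified_diff(
            device.running_config.splitlines(),
            candidate.splitlines(),
            fromfile="running",
            tofile="candidate",
            lineterm="",
        )
    )
    # Without commit the caller still gets the diff, which works as a dry run.
    if commit and diff:
        device.running_config = candidate
    return diff
```

Defaulting commit to False is a design choice of this sketch: a change only happens when the runbook asks for it explicitly.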

Data

All modules have access to all the data collected or generated previously. Because tasks are either global or scoped, data availability follows these rules:

  1. All data provided by the inventory is available to everybody.
  2. Data gathered by pre tasks is available to all subsequent tasks.
  3. Data gathered by scoped tasks defined in the tasks section is available to:
    1. The subsequent tasks for that device.
    2. All the tasks in the post section.
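
The rules above could be modeled roughly as follows. This is only a sketch under assumptions: the `DataStore` class, its method names and the "scoped" key are hypothetical and not part of any agreed design.

```python
class DataStore:
    """Keeps global data and per-device scoped data apart."""

    def __init__(self, inventory_data):
        # Rule 1: inventory data is visible to everybody.
        self._global = dict(inventory_data)
        # Host name -> data gathered by scoped tasks for that host.
        self._scoped = {}

    def set_global(self, key, value):
        # Rule 2: data from pre tasks is visible to all subsequent tasks.
        self._global[key] = value

    def set_scoped(self, host, key, value):
        # Rule 3: data from scoped tasks belongs to a single device.
        self._scoped.setdefault(host, {})[key] = value

    def view_for_host(self, host):
        """What a scoped task sees: global data plus its own device's data."""
        view = dict(self._global)
        view.update(self._scoped.get(host, {}))
        return view

    def view_for_global(self):
        """What a post task sees: global data plus every device's scoped data."""
        view = dict(self._global)
        view["scoped"] = {host: dict(data) for host, data in self._scoped.items()}
        return view
```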

Example of data flow:

  1. The inventory could provide some basic information: devices, groups, parameters to connect to the devices, etc. Something simple, generic and very fast to run.
  2. A pre task could read specialized information to deploy an IP fabric, a WAN network or something else and make it consumable by everybody. Some sanity checks could be performed here as well. In addition, further data could be gathered from live devices and made available throughout the entirety of the runbook. For example, you could read a prescriptive topology file, compare it with LLDP information and compute a fabric configuration (interfaces, IPs, BGP sessions, etc.). The idea is to generate/munge data and make it consumable.
  3. During the tasks execution phase, modules defined there could use all the data gathered so far. Individual devices could expand on it and collect more data if they need it: things like OS version, interface names, etc. Things that might be useful to pick the right configuration.
  4. Finally, a post task could process the result, log the result somewhere, validate the deployment, etc...
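
The flow described in these four steps could be driven by a loop as small as the one below. This is a sketch, not the planned implementation: the `run` function, the callable-based runbook and the three example tasks are all hypothetical, and for brevity it keeps everything in a single dictionary instead of enforcing which task may see which data. Running all tasks for one host before moving to the next is also an assumption.

```python
def run(runbook, hosts, data):
    """Run pre tasks once, scoped tasks once per host, then post tasks once.

    `runbook` maps "pre", "tasks" and "post" to lists of callables.
    Global callables take (data); scoped callables take (host, data).
    """
    for task in runbook.get("pre", []):
        task(data)
    for host in hosts:
        for task in runbook.get("tasks", []):
            task(host, data)
    for task in runbook.get("post", []):
        task(data)
    return data


def load_topology(data):
    # pre: read a (here hard-coded) prescriptive topology and share it.
    data["topology"] = {"leaf1": ["spine1"], "leaf2": ["spine1"]}


def collect_uplinks(host, data):
    # tasks: per-device work that builds on data produced by the pre task.
    data.setdefault("per_host", {}).setdefault(host, {})["uplinks"] = data["topology"][host]


def summarize(data):
    # post: process the result of the whole run.
    data["summary"] = sorted(data["per_host"])
```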

Runbooks

Runbooks are YAML files that glue all the building blocks together. Runbooks are not code and thus there are no if statements or for loops. Because all modules have access to all the data, there is no need for dynamic variables in the runbook. Modules can still register data, but that's only useful so other methods know where to find that data.

The only exception is the when clause. It accepts a string, and the task is only executed if evaluating that string returns True.
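
One possible reading of this is sketched below, assuming the string is a Python expression evaluated with eval() against the data collected so far. The runbook structure (shown as the dictionary a YAML loader might produce), the `should_run` helper, the `notify_slack` module and the `dry_run` variable are all hypothetical.

```python
def should_run(task, data):
    """Return True when the task has no `when` clause or the clause evaluates truthy."""
    expression = task.get("when")
    if expression is None:
        return True
    # Assumption: the expression is evaluated with eval() against collected data.
    # Builtins are emptied to limit what the expression can reach, but eval()
    # is still not a sandbox, so runbooks have to be trusted.
    return bool(eval(expression, {"__builtins__": {}}, dict(data)))


# A runbook as it might look after loading the YAML (structure is hypothetical).
runbook = {
    "post": [
        {"module": "napalm_validate"},
        {"module": "notify_slack", "when": "not dry_run"},
    ]
}
```

With `{"dry_run": True}` as data only napalm_validate would be selected, which matches the use case of skipping notifications while developing or testing.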

In addition, modules can also tweak their behavior via CLI options and/or the environment.

@dbarrosop
Author

dbarrosop commented Dec 25, 2016

Regarding naming: I know some people wanted to talk about that. I was thinking of the following names:

  • tactic. A module that provides some functionality
  • planning phase. The pre section.
  • execution phase. The tasks section.
  • analysis phase. The post section.
  • plan. The runbook/playbook, the yaml describing what to do.

And our motto: "I love it when a plan comes together" (@GGabriele is probably too young to get this one ;) )

So a plan consists of a set of tactics divided in the planning, execution and analysis phases. We could also add aliases like cleaning (alias for analysis) or intel (alias for planning) so people can categorize their tasks more accordingly. @jedelman8 mentioned something along these lines but I am not sure it's useful other than "it looks good xD"

Thoughts?

@ogenstad

I would vote for not using the Behave way of describing things. While I haven't used it, a formal definition seems clearer. It's easier to just look at it and see which parts are parameters and values. From my point of view it also makes it easier to interchange the format (JSON/YAML) or to go through some other system's API.

Regarding modules in another language than Python. Does anyone here want/need that or is it just a cool feature? I would also vote for the wrapper module written in Python if needed.

Perhaps another way to have it stand out a bit would be to allow the controller to run from a Windows box. Having said that I don't know if Napalm (or all drivers) currently runs on Windows.

The name analysis implies verification of what happened. This step could also do things like log the job or close a change in a system like ServiceNow. Something like paperwork might be broader: recon -> execution -> paperwork. But it's also nice if it's fairly clear what the things mean so you don't have to read a manual just to understand. Perhaps analysis is better.

tactic sounds wrong for a module though. Something like operation would be better.

Regarding the motto, they did a remake remember? 😄

@GGabriele

GGabriele commented Dec 26, 2016

> And our motto: "I love it when a plan comes together" (@GGabriele is probably too young to get this one ;) )

I had to Google that lol

Regarding Behave, it may be something nice to add later on if both your assumptions are right @dbarrosop, but I would still start with YAML/JSON. From a user perspective, netengs had to learn and deal with Python for scripting and NAPALM, then with YAML for Ansible, then with something new called Behave for Brigade? I don't know if they would see this as a simplification at first.

@ktbyers

ktbyers commented Dec 27, 2016

I agree with @dbarrosop and think that outputting JSON is a mistake. This is one of the reasons Ansible is a pain to debug/troubleshoot. It makes sense for server people, where they were shipping the Ansible modules to remote machines. It probably doesn't make sense here.

I am against doing the behave-like language.

Can we get rid of the conditionals?

> The only exceptions is the when clause. This accepts a string and the task will only be executed if the string eval is True.

The above is the road to adding more programming constructs i.e. if we add conditionals, then shortly we will need loops, exception handlers.

I am inclined to say that we should do one of two things: no programming constructs (although I don't know how we necessarily pull this off) or full programming (i.e. you can just embed straight Python in a Brigade playbook and the entire section will be eval()'d as Python code).

@ktbyers

ktbyers commented Dec 27, 2016

You can ignore my comment on 'when'. After thinking about this some more, I am starting to like the above.

I do think we probably need to specify/consider how variables (facts/inventory) are going to work.

I also think we might want to work through a few concrete examples and see how what we proposed would actually work (and whether it is meaningfully better than other existing tools).

@dbarrosop
Author

Regarding when, I think we need at least that one to signal some tasks to be skipped or not. For example, if you want to slack/mail someone as part of your "plan" but don't want to do it when developing/testing/dry-running. I would like to keep it as simple as possible but I think we need at least a simple way of selecting which tasks to run.

> I also think we might want to work through a few concrete examples and see how what we proposed would actually work (and whether it is meaningfully better than other existing tools).

I agree. I think we can start writing an MVP. I am going to start writing a POC. It will just implement the behavior we discussed here. It's not going to be modular and it's not going to be pretty, but I hope I can write it relatively quickly. In the meantime, we can discuss the "naming". Once we have the POC, and if we like the idea, we can start deciding how to make it more modular and then how to write the inventory, modules, etc.

I created a new gist for the naming and pasted my proposal, @ogenstad, can you add your comment there as well, please?

https://gist.github.com/dbarrosop/33a2b68b1afa337c1ef9b1588a3fced0
