As Chef Metal approaches 1.0, we've landed a huge configuration and driver interface improvement intended to enable:
- A standard way to specify credentials and keys that keeps them out of recipes and allows them to be used in multiple places
- External commands (like "metal execute") that can look up information and manage a node independent of the original Metal recipe
- Environmental and directory-specific configuration
- Easy use of the drivers in test-kitchen and knife
This post covers the Driver interface: how provisioning programs like the machine resource or kitchen use it, and how driver implementors write to it.
Metal's machine resource has not changed, but the way you specify drivers has changed significantly. You are no longer encouraged to specify the driver name and credentials in the recipe (though it is still possible); instead, it is preferred to configure this information through Chef configuration files and the environment.
In a recipe, you use the machine and machine_batch resources to manipulate machines:
machine 'webserver' do
  recipe 'apache'
end
You'll notice you don't specify where the machine should be or what OS it should have. This configuration happens in the environment, configuration files or in recipes, described below.
(There are many things you can do with the machine resource, but we'll cover that in another post.)
chef-metal drivers are generally named with a chef-metal- prefix. To install one, just use gem install chef-metal-docker (for instance). Fog and Vagrant come pre-installed when you install chef-metal.
To specify where the machine should be (AWS, Vagrant, etc.), you need a driver. There are several drivers out there, including:
- Fog (which connects with AWS EC2, OpenStack, DigitalOcean and SoftLayer)
- IBM VSphere
- Vagrant (VirtualBox and VMWare Fusion)
- LXC
- Docker
- Raw SSH (with a list of already-provisioned servers)
(Note: as of this writing, only Fog and Vagrant are up to date with the new Driver interface, but that will change very quickly.)
The driver you want is specified by a URL. The first part of the URL, the scheme, identifies the Driver class that will be used. The rest of the URL uniquely identifies the account or location the driver will work with. Some examples of driver URLs:
- fog:AWS:default: connect to the AWS default profile (in ~/.aws/config)
- fog:AWS:123514512344: connect to the AWS account # 123514512344
- vagrant: a vagrant directory located in the default location (<configuration directory>/vms)
- vagrant:~/machinetest: a vagrant directory at ~/machinetest
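The scheme/remainder split described above can be sketched in a few lines of plain Ruby. The helper name parse_driver_url is an assumption made for illustration; it is not a chef-metal method.

```ruby
# Split a driver URL into its scheme and the driver-specific remainder.
# parse_driver_url is a hypothetical helper, not part of chef-metal.
def parse_driver_url(driver_url)
  # Limit the split to 2 parts so colons in the remainder are preserved
  scheme, rest = driver_url.split(':', 2)
  { :scheme => scheme, :rest => rest }
end

parse_driver_url('fog:AWS:default')        # scheme "fog", rest "AWS:default"
parse_driver_url('vagrant:~/machinetest')  # scheme "vagrant", rest "~/machinetest"
parse_driver_url('vagrant')                # scheme "vagrant", rest nil
```

Splitting on the first colon only is what lets the remainder carry its own colons, as in fog:AWS:default.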
To set the driver that will be used by default, you can place the following in your Chef or Knife config (such as .chef/knife.rb):
local_mode true
log_level :debug
driver 'vagrant:~/machinetest'
You can also set the CHEF_DRIVER environment variable:
CHEF_DRIVER=fog:AWS:default chef-client -z my_cluster.rb
Driver options contain the credentials and other information needed to connect to the driver.
To specify driver_options, you can put this in your Chef config:
driver 'fog:AWS:default'
driver_options :aws_profile => 'jkeiser_work'
If you alternate between many drivers, you can also set options that are "glued" to a specific driver by putting this in your Chef config:
drivers({
  'fog:AWS:123445315112' => {
    :driver_options => {
      :aws_profile => 'jkeiser_work'
    }
  }
})
machine_options can be attached to a specific driver in the same way. We'll talk about those more later.
There will be easier ways to specify this as Chef profiles and configuration evolve in the near future.
Machine options can be specified in Chef configuration or in recipes. In Chef config, it looks like this:
driver 'vagrant:'
# This will apply to all machines that don't override it
machine_options :vagrant_options => {
  :bootstrap_options => {
    'vm.box' => 'precise64'
  }
}
And with the with_machine_options directive to affect multiple machines:
with_driver 'vagrant:'
with_machine_options :vagrant_options => {
  'vm.box' => 'precise64'
}
machine 'webserver' do
  recipe 'apache'
end
machine 'database' do
  recipe 'mysql'
end
Or directly on the machines:
machine 'webserver' do
  driver 'vagrant:'
  machine_options :vagrant_options => {
    'vm.box' => 'precise64'
  }
  recipe 'apache'
end
This sort of mixing of physical and logical location is often not advisable, but there are situations where it's expedient or even required, so it's supported.
NOTE: with_machine_options can also take a do block that will apply to all machines inside it.
As before, you can even attach options to specific drivers (defaults for specific drivers and accounts can be useful):
drivers({
  'fog:AWS:123445315112' => {
    :driver_options => {
      :aws_profile => 'jkeiser_work'
    },
    :machine_options => {
      :bootstrap_options => {
        :region => 'us-east-1'
      }
    }
  },
  'vagrant:/Users/jkeiser/vms' => {
    :machine_options => {
      :vagrant_options => {
        'vm.box' => 'precise64'
      }
    }
  }
})
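To make the idea of options "glued" to a driver concrete, here is a rough sketch of how such a lookup could work over a plain hash. The helper options_for_driver and the precedence it applies (per-driver entries override the global ones, with a shallow merge) are assumptions for illustration; chef-metal's own resolution may differ.

```ruby
# Hypothetical lookup: combine global options with the options attached to
# one driver URL. Per-driver values are assumed to win over global ones.
def options_for_driver(config, driver_url)
  per_driver = (config[:drivers] || {})[driver_url] || {}
  {
    :driver_options  => (config[:driver_options]  || {}).merge(per_driver[:driver_options]  || {}),
    :machine_options => (config[:machine_options] || {}).merge(per_driver[:machine_options] || {})
  }
end

config = {
  :driver_options => { :aws_profile => 'default' },
  :drivers => {
    'fog:AWS:123445315112' => {
      :driver_options  => { :aws_profile => 'jkeiser_work' },
      :machine_options => { :bootstrap_options => { :region => 'us-east-1' } }
    }
  }
}

# The AWS account picks up its own profile and region
resolved = options_for_driver(config, 'fog:AWS:123445315112')
# Any other driver URL falls back to the global options
fallback = options_for_driver(config, 'vagrant:')
```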
You can set the CHEF_PROFILE environment variable to identify the profile you want to load.
In Chef config:
profiles({
  'default' => {
  },
  'dev' => {
    :driver => 'vagrant:',
    :machine_options => {
      :vagrant_options => {
        'vm.box' => 'precise64'
      }
    }
  },
  'test' => {
    :driver => 'fog:AWS:test',
    :machine_options => {
      :bootstrap_options => {
        :flavor_id => 'm1.small'
      }
    }
  },
  'staging' => {
    :driver => 'fog:AWS:staging',
    :driver_options => {
      :bootstrap_options => {
        :flavor_id => 'm1.small'
      }
    }
  }
})
This will get better tooling and more integrated Chef support in the future, but it is a good start. You can set the current profile using the CHEF_PROFILE environment variable:
CHEF_PROFILE=dev chef-client -z my_cluster.rb
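As a rough illustration of profile selection, the sketch below picks one entry out of a profiles hash based on CHEF_PROFILE. The helper select_profile and the fallback to a profile named 'default' are assumptions; the actual profile handling lives in Cheffish and may behave differently.

```ruby
# Hypothetical profile selection over a plain hash of profiles.
# env is injectable so the lookup can be exercised without touching ENV.
def select_profile(profiles, env = ENV)
  name = env['CHEF_PROFILE'] || 'default'   # fallback name is an assumption
  profiles.fetch(name) { raise ArgumentError, "Unknown profile '#{name}'" }
end

profiles = {
  'default' => {},
  'dev'     => { :driver => 'vagrant:' },
  'test'    => { :driver => 'fog:AWS:test' }
}

dev_profile = select_profile(profiles, 'CHEF_PROFILE' => 'dev')
# dev_profile[:driver] is 'vagrant:'
```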
The Driver interface is a set of four objects that allow provisioning programs to communicate with drivers:
- Driver: Represents a "machine warehouse"--an AWS account, a set of vagrant machines, a PXE machine registry. You can ask it for new machines, power machines on and off, and get rid of machines you are no longer using.
- Machine: Represents a ready, connectable machine. The machine interface lets you run commands, upload and download files, and converge recipes. This is returned by Driver methods that create and connect to machines.
- MachineSpec: Represents the saved information about a Machine. Drivers use this to save information about how to locate and manipulate individual machines (like the AWS instance ID, PXE MAC address, or Vagrantfile location).
- ActionHandler: This is how Metal communicates back to the host provisioning program (like the machine resource, test-kitchen, or knife/metal command line). Metal primarily uses it to report the actions it performs and their progress, so that the host can print pretty output informing the user.
When you need to access a new PXE or cloud service, you need to write a new Driver. (For cloud services, often modifying chef-metal-fog will be sufficient rather than creating a whole new driver.)
Every driver instance must be identified uniquely by a URL. This generally describes where the list of servers lives. For cloud providers this will generally be an account or a server. For VMs and containers it will either be a directory or global to the machine.
Example URLs from real drivers:
fog:AWS:1231241212 # account ID (canonical)
fog:AWS:myprofile # profile in ~/.aws/config
fog:AWS # implies default profile
vagrant:/Users/jkeiser/vms # path to vagrant vm (canonical)
vagrant:~/vms # path to vagrant vm (non-canonical)
vagrant # implies <chef config dir>/vms
The bit before the colon--the scheme--is the identifier for your driver gem. Some of these URLs are canonical and some are not. When you create a driver with one of these URLs, the driver_url on the resulting driver must be the canonical URL. For example, ChefMetal.driver_for_url("fog:AWS").driver_url would equal "fog:AWS:12312412312" (or whatever your account is). This is important because the canonical URL is saved along with each machine's location data and may be used by different people on different workstations with different profile names.
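For a path-based driver, canonicalization can be as simple as expanding the path. The sketch below follows the vagrant examples above; canonical_vagrant_url and its config_dir parameter (standing in for the Chef configuration directory) are hypothetical names, not the chef-metal-vagrant implementation.

```ruby
# Hypothetical canonicalization for vagrant-style driver URLs.
# config_dir stands in for the Chef configuration directory.
def canonical_vagrant_url(driver_url, config_dir)
  _scheme, path = driver_url.split(':', 2)
  # A bare "vagrant" or "vagrant:" URL implies <config dir>/vms
  path = File.join(config_dir, 'vms') if path.nil? || path.empty?
  # Expanding the path resolves "~" and relative segments
  "vagrant:#{File.expand_path(path)}"
end

canonical_vagrant_url('vagrant', '/etc/chef')                    # "vagrant:/etc/chef/vms"
canonical_vagrant_url('vagrant:/Users/jkeiser/vms', '/etc/chef') # already canonical, returned unchanged
```

A URL such as vagrant:~/vms expands against the current user's home directory, which is exactly why the expanded form, not the shorthand, is the one worth saving.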
To instantiate the driver, you must implement Driver.from_url. This method's job is to canonicalize the URL, and to make an instance of the Driver. For example:
require 'chef_metal/driver'
class MyDriver < ChefMetal::Driver
  def self.from_url(url, config)
    MyDriver.new(url, config)
  end
  def initialize(url, config)
    super(url, config)
  end
  # Everything after the scheme is the URL of the cloud service itself
  def cloud_url
    _scheme, rest = driver_url.split(':', 2)
    rest
  end
  def the_ultimate_cloud
    TheUltimateCloud.connect(cloud_url, driver_config['username'], driver_config['password'])
  end
end
As you can see in the previous example, driver_config is where credential information is passed to your driver. It ultimately comes from config[:driver_config] passed to the from_url method. For example, our hypothetical driver could allow the user to specify this in their Chef config:
driver 'mydriver:http://the_ultimate_server.com:8080'
driver_config :username => 'me', :password => 'mypassword'
This is the standard place for users to put credentials. It is a freeform hash, so you should document what keys you expect users to put there to help you connect.
Please feel free to work with any files or environment variables that tools for that service typically support (like ~/.aws/config), so that you can share configuration with standard tools for that cloud/VM/whatever.
allocate_machine is the first method called when creating a machine. Its job is to reserve the machine, and to return quickly. It may start the machine spinning up in the background, but it should not block waiting for that to happen.
allocate_machine takes an action_handler, machine_spec, and a machine_options argument. action_handler is where the method should report any changes it makes. machine_spec.location will contain the current known machine information, loaded from persistent storage (like from the node). machine_options contains the desired options for creating the machine. Both machine_spec.location and machine_options are freeform hashes owned by the driver. You should document what options the user can pass in your driver's documentation.
By the time the method is finished, the machine should be reserved and its information stored in machine_spec.location. If it is not feasible to do this quickly, then it is acceptable to defer this to ready_machine.
def allocate_machine(action_handler, machine_spec, machine_options)
  if machine_spec.location
    if !the_ultimate_cloud.server_exists?(machine_spec.location['server_id'])
      # It doesn't really exist
      action_handler.perform_action "Machine #{machine_spec.location['server_id']} does not really exist. Recreating ..." do
        machine_spec.location = nil
      end
    end
  end
  if !machine_spec.location
    action_handler.perform_action "Creating server #{machine_spec.name} with options #{machine_options}" do
      private_key = get_private_key('bootstrapkey')
      server_id = the_ultimate_cloud.create_server(machine_spec.name, machine_options, :bootstrap_ssh_key => private_key)
      machine_spec.location = {
        'driver_url' => driver_url,
        'driver_version' => MyDriver::VERSION,
        'server_id' => server_id,
        'bootstrap_key' => 'bootstrapkey'
      }
    end
  end
end
In all methods, you should wrap any substantive changes in action_handler.perform_action. Progress can be reported with action_handler.report_progress. NOTE: action_handler.perform_action will not actually execute the block if the user passed --why-run to chef-client. Why Run mode is intended to simulate the actions it would perform, but not actually perform them.
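The why-run contract is easiest to see with a small stand-in. The class below is not ChefMetal::ActionHandler; it is a self-contained sketch, with hypothetical names, of the behavior described above: perform_action records the description and runs its block only when not in why-run mode.

```ruby
# Stand-in for an action handler; NOT the ChefMetal::ActionHandler class.
class SketchActionHandler
  attr_reader :log

  def initialize(why_run)
    @why_run = why_run
    @log = []
  end

  # Record the action; execute the block only when not in why-run mode.
  def perform_action(description)
    if @why_run
      @log << "Would #{description}"
    else
      yield
      @log << description
    end
  end

  # Progress messages are recorded regardless of mode.
  def report_progress(description)
    @log << description
  end
end

created = []
handler = SketchActionHandler.new(true) # simulate --why-run
handler.perform_action('create server web1') { created << 'web1' }
# created is still empty here, because the block was skipped
```

This is why driver code should put every real change inside the block: anything done outside it would still happen during a why-run.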
If you notice the user wants the machine to be different than it is now--for example, to have more RAM or disk or processing power--you should either safely move the data over to a new instance, or warn the user that you cannot fulfill their desire.
You'll notice the service is passed a private key for bootstrap. This is the bootstrap key, and in our example, TheUltimateCloud will allow you to ssh to the machine with the root user using that private key after it is bootstrapped. (Several cloud services already work this way.)
The issue here is that the user needs to be able to pass you these keys. chef-metal introduces the configuration variables :private_keys and :private_key_paths to allow users to tell us about their keys. We then refer to the keys by name (rather than path) in drivers, and look them up from configuration.
Here is what the get_private_key method looks like:
def get_private_key(name)
  if config[:private_keys] && config[:private_keys][name]
    if config[:private_keys][name].is_a?(String)
      # A string is treated as the path to the key file
      IO.read(config[:private_keys][name])
    else
      # Otherwise assume a key object that responds to to_pem
      config[:private_keys][name].to_pem
    end
  elsif config[:private_key_paths]
    config[:private_key_paths].each do |private_key_path|
      Dir.entries(private_key_path).each do |key|
        ext = File.extname(key)
        if ext == '' || ext == '.pem'
          key_name = key[0..-(ext.length+1)]
          if key_name == name
            return IO.read("#{private_key_path}/#{key}")
          end
        end
      end
    end
    # No matching key file was found in any of the paths
    nil
  end
end
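The file-name rule used when searching :private_key_paths (a file matches if it has no extension or a .pem extension, and the rest of its name equals the key name) can be isolated into a small predicate. key_file_matches? is a hypothetical helper written for illustration only.

```ruby
# Does a file in a key directory match the requested key name?
# Mirrors the rule above: no extension, or a .pem extension.
def key_file_matches?(filename, name)
  ext = File.extname(filename)
  return false unless ext == '' || ext == '.pem'
  # Strip the extension (if any) and compare the remaining name
  File.basename(filename, ext) == name
end

key_file_matches?('bootstrapkey.pem', 'bootstrapkey') # true
key_file_matches?('bootstrapkey', 'bootstrapkey')     # true
key_file_matches?('bootstrapkey.pub', 'bootstrapkey') # false
```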
ready_machine is the other half of the machine creation story. This method will do what it needs to bring the machine up. When the method finishes, the machine must be warm and connectable. ready_machine returns a Machine object. An example:
def ready_machine(action_handler, machine_spec, machine_options)
  server_id = machine_spec.location['server_id']
  if the_ultimate_cloud.machine_status(server_id) == 'stopped'
    action_handler.perform_action "Powering up machine #{server_id}" do
      the_ultimate_cloud.power_on(server_id)
    end
  end
  if the_ultimate_cloud.machine_status(server_id) != 'ready'
    action_handler.perform_action "wait for machine #{server_id}" do
      the_ultimate_cloud.wait_for_machine_to_have_status(server_id, 'ready')
    end
  end
  # Return the Machine object
  machine_for(machine_spec, machine_options)
end
ready_machine takes the same arguments as allocate_machine, and machine_spec.location will contain any information that was placed in allocate_machine.
The Machine object contains a lot of the complexity of connecting to and configuring a machine once it is ready. Happily, most of the work is already done for you here.
require 'chef_metal/transport/ssh_transport'
require 'chef_metal/convergence_strategy/install_cached'
require 'chef_metal/machine/unix_machine'
def machine_for(machine_spec, machine_options)
  server_id = machine_spec.location['server_id']
  hostname = the_ultimate_cloud.get_hostname(server_id)
  ssh_options = {
    :auth_methods => ['publickey'],
    # get_private_key returns the key contents, so pass them as :key_data
    :key_data => [ get_private_key('bootstrapkey') ],
  }
  transport = ChefMetal::Transport::SSHTransport.new(hostname, ssh_options, {}, config)
  convergence_strategy = ChefMetal::ConvergenceStrategy::InstallCached.new(machine_options[:convergence_options])
  ChefMetal::Machine::UnixMachine.new(machine_spec, transport, convergence_strategy)
end
WindowsMachine and WinRMTransport are also available for Windows machines. You can look at how these are instantiated in the chef-metal-vagrant driver.
The destroy_machine function is fairly straightforward:
def destroy_machine(action_handler, machine_spec, machine_options)
  if machine_spec.location
    server_id = machine_spec.location['server_id']
    action_handler.perform_action "Destroy machine #{server_id}" do
      the_ultimate_cloud.destroy_machine(server_id)
      machine_spec.location = nil
    end
  end
end
Same with stop_machine:
def stop_machine(action_handler, machine_spec, machine_options)
  if machine_spec.location
    server_id = machine_spec.location['server_id']
    action_handler.perform_action "Power off machine #{server_id}" do
      the_ultimate_cloud.power_off(server_id)
    end
  end
end
The connect_to_machine method should return the Machine object for a machine, without spinning it up. Because of how we coded ready_machine, we can just do this:
def connect_to_machine(machine_spec, machine_options)
  machine_for(machine_spec, machine_options)
end
Drivers are automatically loaded based on their driver URL. The way Metal does this is by extracting the scheme from the URL, and then doing require 'chef_metal/driver_init/schemename'. So for our driver to load when driver is set to mydriver:http://theultimatecloud.com:80, we need to create a file named chef_metal/driver_init/mydriver.rb that looks like this:
require 'chef_metal_mydriver/mydriver'
ChefMetal.register_driver_class("mydriver", ChefMetalMyDriver::MyDriver)
After this require, chef-metal will call ChefMetalMyDriver::MyDriver.from_url('mydriver:http://theultimatecloud.com:80', config) and will have a driver!
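The register-then-look-up flow can be sketched without chef-metal at all. SketchMetal and SketchDriver below are hypothetical stand-ins that only mirror the shape of register_driver_class and from_url; the sketch leaves out the require of the driver_init file that the real loader performs first.

```ruby
# Minimal stand-in for scheme-based driver lookup; all names are hypothetical.
module SketchMetal
  @driver_classes = {}

  def self.register_driver_class(scheme, klass)
    @driver_classes[scheme] = klass
  end

  def self.driver_for_url(driver_url, config)
    # The scheme is everything before the first colon
    scheme = driver_url.split(':', 2).first
    klass = @driver_classes[scheme]
    raise ArgumentError, "No driver registered for scheme '#{scheme}'" unless klass
    klass.from_url(driver_url, config)
  end
end

class SketchDriver
  attr_reader :driver_url, :config

  def self.from_url(driver_url, config)
    new(driver_url, config)
  end

  def initialize(driver_url, config)
    @driver_url = driver_url
    @config = config
  end
end

SketchMetal.register_driver_class('mydriver', SketchDriver)
driver = SketchMetal.driver_for_url('mydriver:http://theultimatecloud.com:80', {})
```

Keying the registry on the scheme is what lets one entry point serve every installed driver gem.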
For users to actually use your gem, you need to release it on rubygems.org, and people will do gem install chef-metal-mydriver. Instructions for publishing a gem are available on rubygems.org.
By default Chef Metal provides parallelism on top of your driver by calling allocate_machine and ready_machine in parallel threads. But many providers have interfaces that let you spin up many machines at once. If you have one of these, you can implement the allocate_machines method. It takes the action_handler you know and love, plus a specs_and_options hash (keys are machine_spec objects and values are machine_options), and a parallelizer object you can optionally use to run multiple Ruby blocks in parallel.
def allocate_machines(action_handler, specs_and_options, parallelizer)
  private_key = get_private_key('bootstrapkey')
  servers = []
  server_names = []
  specs_and_options.each do |machine_spec, machine_options|
    if !machine_spec.location
      servers << [ machine_spec.name, machine_options, :bootstrap_ssh_key => private_key]
      server_names << machine_spec.name
    end
  end
  # Tell the cloud API to spin them all up at once
  action_handler.perform_action "Allocating servers #{server_names.join(',')} from the cloud" do
    the_ultimate_cloud.create_servers(servers)
  end
end
You can also implement ready_machines, destroy_machines and stop_machines.
There are many programs that could benefit from creating and manipulating machines with Metal. For example, the machine and machine_batch resources in Chef recipes, test-kitchen, and knife all use the Metal Driver interface for provisioning. This is an explanation of how the Driver interface is used.
The fundamental bit of Metal is the configuration you pass in. This is a hash, with symbol keys for the important top level things:
{
  :driver => 'fog:AWS:default',
  :driver_options => { <credentials here, if you must> },
  :machine_options => { <options here> },
  :chef_server_url => 'https://api.opscode.com/organizations/myorg',
  :node_name => 'jkeiser', # Client or username to connect to Chef server
  :client_key => '/Users/jkeiser/.chef/keys/jkeiser.pem'
}
To get the Chef config, you can use this code:
require 'chef/config'
require 'chef/knife'
require 'chef/config_fetcher'
require 'chef/application' # for Chef::Application.fatal!
require 'cheffish'
chef_config = begin
  Chef::Config.config_file = Chef::Knife.locate_config_file
  config_fetcher = Chef::ConfigFetcher.new(Chef::Config.config_file, Chef::Config.config_file_jail)
  if Chef::Config.config_file.nil?
    Chef::Log.warn("No config file found or specified on command line, using command line options.")
  elsif config_fetcher.config_missing?
    Chef::Log.warn("Did not find config file: #{Chef::Config.config_file}, using command line options.")
  else
    config_content = config_fetcher.read_config
    config_file_path = Chef::Config.config_file
    begin
      Chef::Config.from_string(config_content, config_file_path)
    rescue Exception => error
      Chef::Log.fatal("Configuration error #{error.class}: #{error.message}")
      # Show only the backtrace lines that point into the config file
      filtered_trace = error.backtrace.grep(/#{Regexp.escape(config_file_path)}/)
      filtered_trace.each {|line| Chef::Log.fatal(" " + line )}
      Chef::Application.fatal!("Aborting due to error in '#{config_file_path}'", 2)
    end
  end
  Cheffish.profiled_config # This adds support for Chef profiles
end
This will handle everything including environment variables.
You may also want to do this:
Chef::Config.local_mode true
If you have your own configuration mechanism, you can either merge it with the Chef config using Cheffish::MergedConfig.new(my_config, chef_config), or just pass it directly and ignore Chef.
If you want to work with local mode (spin up a chef-zero server), you will need to spin it up. You can use this code to do that:
require 'chef/application'
Chef::Application.setup_server_connectivity
The Chef server URL will be in Chef::Config.chef_server_url.
Use this code to stop it when you are done with it:
Chef::Application.destroy_server_connectivity
ActionHandler is how Metal communicates back to your application. It will report progress and tell you when it updates things, so that you can print that information to the user (whether it be to the console or to a UI). To create an ActionHandler, you subclass ChefMetal::ActionHandler (from chef_metal/action_handler) and implement its reporting methods.
MachineSpec is the way you communicate the persisted state of a machine to Metal (including save and load).
MachineSpec has a save() method that saves the machine location data (like its instance ID or Vagrantfile) to persistent storage for later retrieval. For chef-client, this location is a Chef node. For other applications, you may prefer to store this sort of persistent data elsewhere (test-kitchen has its own server state storage). To do that, you will override MachineSpec and implement the save method (as well as create a method to instantiate YourMachineSpec by loading it back in). For example, with a storage object of your own (my_storage here):
require 'chef_metal/machine_spec'
class MyMachineSpec < ChefMetal::MachineSpec
  # Loads node (which is a hash with a bunch of attributes including 'name')
  def initialize(name, my_storage)
    @node = my_storage.load(name) || { 'name' => name }
    super(@node)
    @my_storage = my_storage
  end
  # Globally unique identifier for this machine. For Chef, we use
  # <chef_server_url>/nodes/#{name}. Does not have to be a URL.
  def id
    "#{@my_storage.url}/#{name}"
  end
  def save(action_handler)
    # much-vaunted idempotence
    if @my_storage.node_is_different(name, @node)
      action_handler.perform_action "save #{name} to storage" do
        @my_storage.save(@node)
      end
    end
  end
end
In many Chef-centric cases, storing the nodes in the Chef server is all you need. If you are OK with that, you can just use the ChefMachineSpec to take care of saving and loading:
require 'cheffish'
require 'chef_metal/chef_machine_spec'
chef_server = Cheffish.chef_server_for(config)
machine_spec = ChefMetal::ChefMachineSpec.new(machine_name, chef_server)
When you want to work with machines, you need a driver. There are two principal reasons to get a driver. First, for connect, stop and destroy type operations, you may want to work with an existing machine, defined by a machine_spec. Second, to create a desired machine (allocate_machine and ready_machine), you will want to create a driver straight from configuration or from a driver URL.
To get a driver from config:
require 'chef_metal'
driver = ChefMetal.driver_for_url(chef_config[:driver], chef_config)
To get a driver from a machine spec:
if machine_spec.driver_url
  driver = ChefMetal.driver_for_url(machine_spec.driver_url, chef_config)
end
To create a machine, you do this:
machine_options = ChefMetal.config_for_url(driver.driver_url, chef_config)
driver.allocate_machine(action_handler, machine_spec, machine_options)
driver.ready_machine(action_handler, machine_spec, machine_options)
To create several machines at once, build a specs_and_options hash and call the batch methods:
driver = ChefMetal.driver_for_url(chef_config[:driver], chef_config)
machine_options = ChefMetal.config_for_url(driver.driver_url, chef_config)
machine_options = Cheffish::MergedConfig.new(machine_options, { :convergence_options => { :chef_server => Cheffish.default_chef_server(chef_config) } })
specs_and_options = {}
machine_specs.each do |machine_spec|
  specs_and_options[machine_spec] = machine_options
end
driver.allocate_machines(action_handler, specs_and_options)
driver.ready_machines(action_handler, specs_and_options)
NOTE: if you have specific options for each individual machine, you can use Cheffish::MergedConfig.new({ :machine_options => new_options }, machine_options) instead of machine_options inside the loop.
The remaining operations work on machines that already exist:
driver.connect_to_machine(machine_spec, machine_options)
driver.destroy_machine(action_handler, machine_spec, machine_options)
driver.stop_machine(action_handler, machine_spec, machine_options)
driver.destroy_machines(action_handler, specs_and_options)
driver.stop_machines(action_handler, specs_and_options)