@rpedde
Created December 3, 2013 22:37
docs for rpcdaemon. maybe merge into the rpcdaemon repo?

RPCDAEMON

Overview

Currently, OpenStack Havana does not have built-in support for highly available virtual routers or DHCP services. In the existing Havana release, virtual routers and DHCP services are scheduled to a single Quantum network node, and are not rescheduled on network node failure.

Since virtual routers and DHCP services are normally scheduled approximately evenly, the failure of a single Quantum network node could cause IP addressing and routing failure on a share of networks inversely proportional to the number of Quantum network nodes in use (roughly one in N networks for N nodes). Because this is generally an unacceptable risk in production environments, most production deployments of OpenStack have traditionally either used the "old-style" nova network driver in HA mode instead of Quantum, or used Quantum with provider networks so as to externalize these services for higher availability.

This has the unfortunate consequence of reducing the utility of software defined networking, which is frequently one of the most compelling features of OpenStack itself. While the Quantum project itself will likely find solutions to the problem, production requirements dictated that a solution be found sooner. To attempt to solve some of these issues, we have developed a service that monitors topology changes in a running OpenStack cluster and automatically makes changes to the networking configuration to maintain availability of services even in the event of Quantum network node failure.

Theory of Operation

The RPCDaemon is a Python-based daemon that subscribes to the nova message bus and watches for events that it should act on. Three plugins are currently implemented:

  • DHCPAgent: Implements high availability in DHCP services

  • L3Agent: Implements high availability in virtual routers

  • Dump: Simple plugin to dump message traffic. This is typically only used for development or troubleshooting purposes.

DHCPAgent

The operation of the DHCPAgent plugin is simple to describe. At periodic intervals, DHCP services are removed from any Quantum DHCP agent that is no longer reporting itself as available. In addition, DHCP services are provisioned on every Quantum DHCP agent node that doesn't already have them provisioned.

In addition, when a DHCP enabled network is removed, the DHCPAgent plugin ensures that DHCP services are deprovisioned on all Quantum DHCP agent nodes.

The operational effect of these actions is that when creating new DHCP enabled networks, DHCP servers appear on every Quantum network node, rather than on a single Quantum network node. While this slightly increases DHCP traffic, because each DHCP discovery request receives multiple offers, it does so safely, as the OpenStack DHCP implementation uses DHCP reservations to ensure virtual machines always boot with predictable IP addresses.

Because of this, DHCP requests can continue to be serviced by other available network nodes, even in the event of catastrophic failure of a single network node.
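The periodic reconciliation described above can be sketched as a pure function. This is only an illustration of the logic, not the plugin's actual code: the function name, the data shapes, and the agent and network identifiers are all hypothetical, and a real implementation would obtain this state from Quantum rather than from dictionaries.

```python
def plan_dhcp_changes(agents, networks, hosting):
    """Return (to_remove, to_add) lists of (network_id, agent_id) pairs.

    agents:   dict of agent_id -> True if the agent reports itself available
    networks: iterable of DHCP enabled network ids
    hosting:  dict of network_id -> set of agent ids currently serving DHCP
    (All names and shapes here are assumptions made for illustration.)
    """
    live = {agent for agent, alive in agents.items() if alive}
    to_remove, to_add = [], []
    for net in sorted(networks):
        current = hosting.get(net, set())
        # Remove DHCP service from agents that are no longer available.
        to_remove += [(net, agent) for agent in sorted(current - live)]
        # Provision DHCP on every live agent that does not have it yet.
        to_add += [(net, agent) for agent in sorted(live - current)]
    return to_remove, to_add


agents = {"node1": True, "node2": False, "node3": True}
hosting = {"net-a": {"node1", "node2"}}
removals, additions = plan_dhcp_changes(agents, ["net-a"], hosting)
print(removals)   # [('net-a', 'node2')]
print(additions)  # [('net-a', 'node3')]
```

After one pass, every live agent serves every DHCP enabled network, which is why the loss of any single node leaves DHCP available.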

L3Agent

The L3Agent plugin also runs periodically, but is only interested in virtual routers that are currently assigned to L3 agents that have become inactive. If the L3Agent plugin observes a "down" L3 agent that Quantum believes is hosting a virtual router, then the L3Agent plugin deprovisions the virtual router from that node and reprovisions it on another active Quantum L3 agent node.

This reprovisioning action does not occur immediately, and there will be some network interruption while the virtual router is migrated. However, the corrective action happens without intervention, and any network outage is transient. While not perfect, this does allow a higher availability of virtual routing, and may be acceptable for some production workloads.
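The rescheduling decision can be sketched in the same way. The text above only says a router is moved to "another active" L3 agent, so the round-robin choice of target below is an assumption, as are the function name, data shapes, and identifiers.

```python
import itertools


def plan_router_moves(agents, router_host):
    """Return a list of (router_id, old_agent, new_agent) moves.

    agents:      dict of agent_id -> True if the L3 agent is active
    router_host: dict of router_id -> agent_id that Quantum believes hosts it
    (Hypothetical shapes; target selection is round-robin by assumption.)
    """
    active = sorted(agent for agent, alive in agents.items() if alive)
    if not active:
        return []  # no active agent to move routers to
    targets = itertools.cycle(active)
    moves = []
    for router, host in sorted(router_host.items()):
        # Only routers on inactive (or unknown) agents are of interest.
        if not agents.get(host, False):
            moves.append((router, host, next(targets)))
    return moves


agents = {"l3-a": True, "l3-b": False, "l3-c": True}
router_host = {"r1": "l3-b", "r2": "l3-a", "r3": "l3-b"}
moves = plan_router_moves(agents, router_host)
print(moves)  # [('r1', 'l3-b', 'l3-a'), ('r3', 'l3-b', 'l3-c')]
```

Routers on healthy agents (r2 here) are left alone, so only the networks behind the failed agent see an interruption.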

Configuration

While the configuration options are not currently exposed by the Rackspace Private Cloud cookbooks, the following is a description of the configuration values in the rpcdaemon configuration file (typically located at /etc/rpcdaemon.conf).

The configuration file is a Python ConfigParser INI-style file. There is one section for general daemon settings, followed by a configuration section for each plugin.

Daemon Options

General daemon options are specified in the Daemon section of the configuration file. Available options include:

  • plugins: Comma separated list of plugins to load. Valid options include L3Agent, DHCPAgent, and Dump.

  • rpchost: Kombu connection URL for the OpenStack message server. In the case of rabbitmq, an IP address is sufficient. See the Kombu documentation (http://kombu.readthedocs.org/en/latest/userguide/connections.html) for more information on Kombu connection URLs.

  • pidfile: Location of the daemon pid file.

  • logfile: Location of the log file.

  • loglevel: Verbosity of logging. Valid options include DEBUG, INFO, WARNING, ERROR, and CRITICAL.
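As an illustration, a Daemon section using these options might be read as follows. The section and option names come from the list above; every value (the address, the file paths) is a placeholder assumption, and the sketch uses the Python 3 configparser module rather than whatever the daemon itself imports.

```python
import configparser

# Placeholder values only; the paths and the address are assumptions.
SAMPLE = """
[Daemon]
plugins = L3Agent, DHCPAgent
rpchost = 192.0.2.10
pidfile = /var/run/rpcdaemon.pid
logfile = /var/log/rpcdaemon.log
loglevel = INFO
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)
daemon = parser["Daemon"]

# "plugins" is a comma separated list, so split it and trim whitespace.
plugins = [name.strip() for name in daemon["plugins"].split(",") if name.strip()]

print(plugins)             # ['L3Agent', 'DHCPAgent']
print(daemon["rpchost"])   # 192.0.2.10
print(daemon["loglevel"])  # INFO
```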

L3Agent Options

L3Agent options are specified in the L3Agent section of the configuration file. The L3Agent logs to the logfile specified in the Daemon section, but the log level of the L3Agent can be configured independently of the daemon itself. Available configuration options include:

  • conffile: Path to the quantum (or neutron) configuration file.

  • loglevel: Verbosity of logging.

DHCPAgent Options

DHCPAgent options are specified in the DHCPAgent section of the configuration file. Like the L3Agent, logs will also be sent to the logfile specified in the Daemon section, while the log level is independently configurable. The DHCPAgent takes the same configuration options as the L3Agent, namely:

  • conffile: Path to the quantum (or neutron) configuration file.

  • loglevel: Verbosity of logging.
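The following sketch shows what independently configurable log levels could look like in practice. The conffile path and the logger names are hypothetical, and mapping the loglevel string onto the standard logging module is an assumption about how such a setting might be applied, not a description of the daemon's internals.

```python
import configparser
import logging

# Placeholder values; the conffile path is an assumption.
SAMPLE = """
[Daemon]
loglevel = WARNING

[L3Agent]
conffile = /etc/quantum/quantum.conf
loglevel = DEBUG

[DHCPAgent]
conffile = /etc/quantum/quantum.conf
loglevel = INFO
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)

levels = {}
for section in ("Daemon", "L3Agent", "DHCPAgent"):
    # Translate the level name (e.g. "DEBUG") into logging's numeric value.
    level = getattr(logging, parser[section]["loglevel"].upper())
    # Hypothetical logger naming scheme: one logger per section.
    logging.getLogger("rpcdaemon." + section).setLevel(level)
    levels[section] = level

print(levels)  # {'Daemon': 30, 'L3Agent': 10, 'DHCPAgent': 20}
```

Here the L3Agent logs at DEBUG and the DHCPAgent at INFO while the daemon itself stays at WARNING, all writing to the same logfile.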

Dump Options

Unsurprisingly, the Dump plugin options are specified in the Dump section of the configuration file. In daemon mode, the Dump plugin will log to the logfile specified in the Daemon section, and although the log level is configurable, dumped messages are emitted at DEBUG level, so any other loglevel setting is essentially useless.

The Dump plugin is most useful when running in foreground mode. See the Command Line Options section for more information.

Available options:

  • loglevel: any valid loglevel verbosity, but should be DEBUG as explained previously.

  • queue: queue to dump. Typically quantum or neutron to view network related messages.

Command Line Options

The RPCDaemon currently understands only two command-line options:

  • -d: don't detach (run in foreground). When running in foreground, a pidfile is not dropped, the default log level is set to DEBUG, and the daemon logs to stderr rather than the specified logfile. This is most useful for running the Dump plugin, but can be helpful in development mode as well.

  • -c : path to configuration file. The default configuration file path is /usr/local/etc/rpcdaemon.conf, but init scripts on packaged versions of RPCDaemon pass -c /etc/rpcdaemon.conf.
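A minimal sketch of how these two options could be parsed with argparse is shown below. The flags and the default path are taken from the list above; the attribute names (foreground, config) and the choice of argparse are assumptions made for illustration.

```python
import argparse


def parse_args(argv):
    """Parse the two documented options (attribute names are hypothetical)."""
    parser = argparse.ArgumentParser(prog="rpcdaemon")
    parser.add_argument("-d", dest="foreground", action="store_true",
                        help="don't detach (run in foreground)")
    parser.add_argument("-c", dest="config",
                        default="/usr/local/etc/rpcdaemon.conf",
                        help="path to configuration file")
    return parser.parse_args(argv)


# What a packaged init script would pass, plus -d for foreground mode.
args = parse_args(["-d", "-c", "/etc/rpcdaemon.conf"])
print(args.foreground, args.config)  # True /etc/rpcdaemon.conf

# With no arguments, the documented default path applies.
defaults = parse_args([])
print(defaults.foreground, defaults.config)  # False /usr/local/etc/rpcdaemon.conf
```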

rpedde commented Dec 4, 2013

forgot to doc queue_expire in neutronagent plugins to set x-expires headers. default is 60
timeout (default to 20s)