@gnuoy
Created June 16, 2020 09:52

Bug 1882113

This assumes that the masakari charms smoke test was run using the openstack provider.

Check that all the hosts in the segment are online and none is on maintenance (you may have to install python-masakariclient into your client venv):

$ openstack segment host list 08b44816-01ce-4470-a047-98114f906a84
+--------------------------------------+------------------------------------------------------+---------+--------------------+----------+----------------+--------------------------------------+
| uuid                                 | name                                                 | type    | control_attributes | reserved | on_maintenance | failover_segment_id                  |
+--------------------------------------+------------------------------------------------------+---------+--------------------+----------+----------------+--------------------------------------+
| 3b5719ca-4d7e-4546-a1b5-5165f7a8d623 | juju-39ea71-zaza-93849497e9a4-18.project.serverstack | COMPUTE | SSH                | False    | False          | 08b44816-01ce-4470-a047-98114f906a84 |
| 7b696e27-128c-4923-9e11-1ffdb085bf9c | juju-39ea71-zaza-93849497e9a4-17.project.serverstack | COMPUTE | SSH                | False    | False          | 08b44816-01ce-4470-a047-98114f906a84 |
| 04352f07-034c-456c-8760-2c6681a9785c | juju-39ea71-zaza-93849497e9a4-16.project.serverstack | COMPUTE | SSH                | False    | False          | 08b44816-01ce-4470-a047-98114f906a84 |
+--------------------------------------+------------------------------------------------------+---------+--------------------+----------+----------------+--------------------------------------+

And all nova services are up and enabled:

$ openstack compute service list
+----+----------------+------------------------------------------------------+----------+---------+-------+----------------------------+
| ID | Binary         | Host                                                 | Zone     | Status  | State | Updated At                 |
+----+----------------+------------------------------------------------------+----------+---------+-------+----------------------------+
|  1 | nova-scheduler | juju-39ea71-zaza-93849497e9a4-15                     | internal | enabled | up    | 2020-06-16T08:30:22.000000 |
|  2 | nova-conductor | juju-39ea71-zaza-93849497e9a4-15                     | internal | enabled | up    | 2020-06-16T08:30:27.000000 |
|  3 | nova-compute   | juju-39ea71-zaza-93849497e9a4-16.project.serverstack | nova     | enabled | up    | 2020-06-16T08:30:27.000000 |
|  4 | nova-compute   | juju-39ea71-zaza-93849497e9a4-17.project.serverstack | nova     | enabled | up    | 2020-06-16T08:30:29.000000 |
|  5 | nova-compute   | juju-39ea71-zaza-93849497e9a4-18.project.serverstack | nova     | enabled | up    | 2020-06-16T08:30:25.000000 |
+----+----------------+------------------------------------------------------+----------+---------+-------+----------------------------+

Find the hypervisor hosting zaza-test-instance-failover:

$ openstack server show zaza-test-instance-failover -f value -c 'OS-EXT-SRV-ATTR:host'
juju-39ea71-zaza-93849497e9a4-17.project.serverstack

Disable the other hypervisors:

$ openstack compute service set juju-39ea71-zaza-93849497e9a4-18.project.serverstack nova-compute --disable
$ openstack compute service set juju-39ea71-zaza-93849497e9a4-16.project.serverstack nova-compute --disable
$ openstack compute service list
+----+----------------+------------------------------------------------------+----------+----------+-------+----------------------------+
| ID | Binary         | Host                                                 | Zone     | Status   | State | Updated At                 |
+----+----------------+------------------------------------------------------+----------+----------+-------+----------------------------+
|  1 | nova-scheduler | juju-39ea71-zaza-93849497e9a4-15                     | internal | enabled  | up    | 2020-06-16T08:49:02.000000 |
|  2 | nova-conductor | juju-39ea71-zaza-93849497e9a4-15                     | internal | enabled  | up    | 2020-06-16T08:49:08.000000 |
|  3 | nova-compute   | juju-39ea71-zaza-93849497e9a4-16.project.serverstack | nova     | disabled | up    | 2020-06-16T08:49:08.000000 |
|  4 | nova-compute   | juju-39ea71-zaza-93849497e9a4-17.project.serverstack | nova     | enabled  | up    | 2020-06-16T08:49:00.000000 |
|  5 | nova-compute   | juju-39ea71-zaza-93849497e9a4-18.project.serverstack | nova     | disabled | up    | 2020-06-16T08:49:07.000000 |
+----+----------------+------------------------------------------------------+----------+----------+-------+----------------------------+
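
The list of hypervisors to disable can also be derived from the service listing rather than typed by hand. The following is only a sketch: it assumes `openstack compute service list -f json` emits one object per row keyed by the column titles shown in the table (`Binary`, `Host`, `Status`), and `hosts_to_disable` is a hypothetical helper name, not part of zaza or the OpenStack client.

```python
import json


def hosts_to_disable(service_list_json, keep_host):
    """Return enabled nova-compute hosts other than keep_host, sorted by name."""
    services = json.loads(service_list_json)
    return sorted(
        svc["Host"]
        for svc in services
        if svc["Binary"] == "nova-compute"
        and svc["Host"] != keep_host
        and svc["Status"] == "enabled"
    )


# Rows copied from the first service listing in these notes (some columns
# omitted). The JSON key names are an assumption about the -f json output.
SAMPLE_SERVICES = json.dumps([
    {"ID": 1, "Binary": "nova-scheduler",
     "Host": "juju-39ea71-zaza-93849497e9a4-15", "Status": "enabled"},
    {"ID": 3, "Binary": "nova-compute",
     "Host": "juju-39ea71-zaza-93849497e9a4-16.project.serverstack", "Status": "enabled"},
    {"ID": 4, "Binary": "nova-compute",
     "Host": "juju-39ea71-zaza-93849497e9a4-17.project.serverstack", "Status": "enabled"},
    {"ID": 5, "Binary": "nova-compute",
     "Host": "juju-39ea71-zaza-93849497e9a4-18.project.serverstack", "Status": "enabled"},
])

GUEST_HOST = "juju-39ea71-zaza-93849497e9a4-17.project.serverstack"
TO_DISABLE = hosts_to_disable(SAMPLE_SERVICES, GUEST_HOST)
```

Each name in `TO_DISABLE` would then be passed to `openstack compute service set <host> nova-compute --disable`, as in the commands above.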

Simulate failure of the hypervisor hosting the guest:

$ python3 -c "import zaza.openstack.configure.masakari; zaza.openstack.configure.masakari.simulate_compute_host_failure('nova-compute/1', 'zaza-93849497e9a4')"

The guest has nowhere to move to, so it will eventually end up in an error state:

$ openstack server show zaza-test-instance-failover -f value -c 'OS-EXT-STS:vm_state'
error
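
Because the guest only reaches the error state "eventually", it can help to poll for it instead of re-running the command by hand. This is a rough sketch, not something taken from zaza: `wait_for_state` and `server_vm_state` are hypothetical helpers, the timeout and interval defaults are arbitrary, and the only CLI call used is the `openstack server show` command shown above.

```python
import subprocess
import time


def server_vm_state(server):
    """Return OS-EXT-STS:vm_state for a server, using the same CLI call as above."""
    result = subprocess.run(
        ["openstack", "server", "show", server,
         "-f", "value", "-c", "OS-EXT-STS:vm_state"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def wait_for_state(get_state, wanted, timeout=600.0, interval=5.0, sleep=time.sleep):
    """Poll get_state() until it returns wanted; raise TimeoutError if it never does."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state == wanted:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"last state {state!r}, wanted {wanted!r}")
        sleep(interval)
```

A call such as `wait_for_state(lambda: server_vm_state("zaza-test-instance-failover"), "error")` would block until the guest reports the error state or the timeout expires. Passing the state getter as a callable keeps the polling loop independent of the CLI.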

There is a running notification that is not processed:

$ openstack notification list
+--------------------------------------+----------------------------+----------+--------------+--------------------------------------+----------------------------------------------------------------------------+
| notification_uuid                    | generated_time             | status   | type         | source_host_uuid                     | payload                                                                    |
+--------------------------------------+----------------------------+----------+--------------+--------------------------------------+----------------------------------------------------------------------------+
| 983b423c-e4d0-43a4-a826-d042ac338668 | 2020-06-16T08:50:41.000000 | running  | COMPUTE_HOST | 7b696e27-128c-4923-9e11-1ffdb085bf9c | {'event': 'STOPPED', 'cluster_status': 'OFFLINE', 'host_status': 'NORMAL'} |
| 4629ed09-63ef-4965-8119-3bc19ef7f373 | 2020-06-16T08:33:36.000000 | finished | COMPUTE_HOST | 3b5719ca-4d7e-4546-a1b5-5165f7a8d623 | {'event': 'STOPPED', 'cluster_status': 'OFFLINE', 'host_status': 'NORMAL'} |
| 77003944-71a5-4486-9a12-f640ced3e311 | 2020-06-16T08:11:30.000000 | finished | COMPUTE_HOST | 7b696e27-128c-4923-9e11-1ffdb085bf9c | {'event': 'STOPPED', 'cluster_status': 'OFFLINE', 'host_status': 'NORMAL'} |
| 5888480f-b181-4a95-907a-65164e797ba9 | 2020-06-16T07:52:36.000000 | finished | COMPUTE_HOST | 3b5719ca-4d7e-4546-a1b5-5165f7a8d623 | {'event': 'STARTED', 'cluster_status': 'ONLINE', 'host_status': 'NORMAL'}  |
| 4af024b8-9965-492f-8ab2-72f98a3f6b43 | 2020-06-16T07:52:25.000000 | finished | COMPUTE_HOST | 3b5719ca-4d7e-4546-a1b5-5165f7a8d623 | {'event': 'STARTED', 'cluster_status': 'ONLINE', 'host_status': 'NORMAL'}  |
| ec633036-4953-4596-b7f5-e627cc6a979b | 2020-06-16T07:48:32.000000 | finished | COMPUTE_HOST | 3b5719ca-4d7e-4546-a1b5-5165f7a8d623 | {'event': 'STOPPED', 'cluster_status': 'OFFLINE', 'host_status': 'NORMAL'} |
+--------------------------------------+----------------------------+----------+--------------+--------------------------------------+----------------------------------------------------------------------------+

Simulate recovery and bring all hosts back from Nova's point of view:

$ openstack compute service set juju-39ea71-zaza-93849497e9a4-16.project.serverstack nova-compute --enable
$ openstack compute service set juju-39ea71-zaza-93849497e9a4-17.project.serverstack nova-compute --enable
$ openstack compute service set juju-39ea71-zaza-93849497e9a4-18.project.serverstack nova-compute --enable
$ openstack compute service list
+----+----------------+------------------------------------------------------+----------+---------+-------+----------------------------+
| ID | Binary         | Host                                                 | Zone     | Status  | State | Updated At                 |
+----+----------------+------------------------------------------------------+----------+---------+-------+----------------------------+
|  1 | nova-scheduler | juju-39ea71-zaza-93849497e9a4-15                     | internal | enabled | up    | 2020-06-16T09:00:32.000000 |
|  2 | nova-conductor | juju-39ea71-zaza-93849497e9a4-15                     | internal | enabled | up    | 2020-06-16T09:00:38.000000 |
|  3 | nova-compute   | juju-39ea71-zaza-93849497e9a4-16.project.serverstack | nova     | enabled | up    | 2020-06-16T09:00:38.000000 |
|  4 | nova-compute   | juju-39ea71-zaza-93849497e9a4-17.project.serverstack | nova     | enabled | up    | 2020-06-16T09:00:39.000000 |
|  5 | nova-compute   | juju-39ea71-zaza-93849497e9a4-18.project.serverstack | nova     | enabled | up    | 2020-06-16T09:00:37.000000 |
+----+----------------+------------------------------------------------------+----------+---------+-------+----------------------------+

Stop and start the server to clear the error state and bring it back online:

$ openstack server stop zaza-test-instance-failover
$ openstack server start zaza-test-instance-failover
$ openstack server show zaza-test-instance-failover -f value -c 'OS-EXT-SRV-ATTR:host'
juju-39ea71-zaza-93849497e9a4-17.project.serverstack
$ openstack server show zaza-test-instance-failover -f value -c 'OS-EXT-STS:vm_state'
active

The notification is still in a 'running' state, and the failed host (7b696e27-128c-4923-9e11-1ffdb085bf9c) is still marked as on maintenance in masakari:

$ openstack notification list
+--------------------------------------+----------------------------+----------+--------------+--------------------------------------+----------------------------------------------------------------------------+
| notification_uuid                    | generated_time             | status   | type         | source_host_uuid                     | payload                                                                    |
+--------------------------------------+----------------------------+----------+--------------+--------------------------------------+----------------------------------------------------------------------------+
| 983b423c-e4d0-43a4-a826-d042ac338668 | 2020-06-16T08:50:41.000000 | running  | COMPUTE_HOST | 7b696e27-128c-4923-9e11-1ffdb085bf9c | {'event': 'STOPPED', 'cluster_status': 'OFFLINE', 'host_status': 'NORMAL'} |
| 4629ed09-63ef-4965-8119-3bc19ef7f373 | 2020-06-16T08:33:36.000000 | finished | COMPUTE_HOST | 3b5719ca-4d7e-4546-a1b5-5165f7a8d623 | {'event': 'STOPPED', 'cluster_status': 'OFFLINE', 'host_status': 'NORMAL'} |
| 77003944-71a5-4486-9a12-f640ced3e311 | 2020-06-16T08:11:30.000000 | finished | COMPUTE_HOST | 7b696e27-128c-4923-9e11-1ffdb085bf9c | {'event': 'STOPPED', 'cluster_status': 'OFFLINE', 'host_status': 'NORMAL'} |
| 5888480f-b181-4a95-907a-65164e797ba9 | 2020-06-16T07:52:36.000000 | finished | COMPUTE_HOST | 3b5719ca-4d7e-4546-a1b5-5165f7a8d623 | {'event': 'STARTED', 'cluster_status': 'ONLINE', 'host_status': 'NORMAL'}  |
| 4af024b8-9965-492f-8ab2-72f98a3f6b43 | 2020-06-16T07:52:25.000000 | finished | COMPUTE_HOST | 3b5719ca-4d7e-4546-a1b5-5165f7a8d623 | {'event': 'STARTED', 'cluster_status': 'ONLINE', 'host_status': 'NORMAL'}  |
| ec633036-4953-4596-b7f5-e627cc6a979b | 2020-06-16T07:48:32.000000 | finished | COMPUTE_HOST | 3b5719ca-4d7e-4546-a1b5-5165f7a8d623 | {'event': 'STOPPED', 'cluster_status': 'OFFLINE', 'host_status': 'NORMAL'} |
+--------------------------------------+----------------------------+----------+--------------+--------------------------------------+----------------------------------------------------------------------------+

$ openstack segment host list 08b44816-01ce-4470-a047-98114f906a84
+--------------------------------------+------------------------------------------------------+---------+--------------------+----------+----------------+--------------------------------------+
| uuid                                 | name                                                 | type    | control_attributes | reserved | on_maintenance | failover_segment_id                  |
+--------------------------------------+------------------------------------------------------+---------+--------------------+----------+----------------+--------------------------------------+
| 3b5719ca-4d7e-4546-a1b5-5165f7a8d623 | juju-39ea71-zaza-93849497e9a4-18.project.serverstack | COMPUTE | SSH                | False    | False          | 08b44816-01ce-4470-a047-98114f906a84 |
| 7b696e27-128c-4923-9e11-1ffdb085bf9c | juju-39ea71-zaza-93849497e9a4-17.project.serverstack | COMPUTE | SSH                | False    | True           | 08b44816-01ce-4470-a047-98114f906a84 |
| 04352f07-034c-456c-8760-2c6681a9785c | juju-39ea71-zaza-93849497e9a4-16.project.serverstack | COMPUTE | SSH                | False    | False          | 08b44816-01ce-4470-a047-98114f906a84 |
+--------------------------------------+------------------------------------------------------+---------+--------------------+----------+----------------+--------------------------------------+
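
The two listings can be cross-referenced to show which on-maintenance hosts are still tied to a running notification. The sketch below is illustrative only: it assumes both commands are run with `-f json` and produce objects keyed by the column names shown in the tables, `hosts_blocked_by_notifications` is a hypothetical name, and the sample data is a subset of the rows in these notes.

```python
def hosts_blocked_by_notifications(hosts, notifications):
    """Map each on-maintenance host name to the running notification UUIDs for it."""
    running = {}
    for note in notifications:
        if note["status"] == "running":
            running.setdefault(note["source_host_uuid"], []).append(
                note["notification_uuid"]
            )
    return {
        host["name"]: running[host["uuid"]]
        for host in hosts
        # Accept either a JSON boolean or the string "True" for on_maintenance.
        if str(host["on_maintenance"]).lower() == "true" and host["uuid"] in running
    }


# Subset of the rows listed in these notes; key names are assumed to match
# the table column names in the -f json output.
SAMPLE_HOSTS = [
    {"uuid": "3b5719ca-4d7e-4546-a1b5-5165f7a8d623",
     "name": "juju-39ea71-zaza-93849497e9a4-18.project.serverstack",
     "on_maintenance": False},
    {"uuid": "7b696e27-128c-4923-9e11-1ffdb085bf9c",
     "name": "juju-39ea71-zaza-93849497e9a4-17.project.serverstack",
     "on_maintenance": True},
]
SAMPLE_NOTIFICATIONS = [
    {"notification_uuid": "983b423c-e4d0-43a4-a826-d042ac338668",
     "status": "running",
     "source_host_uuid": "7b696e27-128c-4923-9e11-1ffdb085bf9c"},
    {"notification_uuid": "4629ed09-63ef-4965-8119-3bc19ef7f373",
     "status": "finished",
     "source_host_uuid": "3b5719ca-4d7e-4546-a1b5-5165f7a8d623"},
]

BLOCKED = hosts_blocked_by_notifications(SAMPLE_HOSTS, SAMPLE_NOTIFICATIONS)
```

With the sample rows, the only blocked host is the -17 hypervisor, held by notification 983b423c-e4d0-43a4-a826-d042ac338668.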

Try to bring the host out of maintenance mode in masakari:

$ openstack segment host update 08b44816-01ce-4470-a047-98114f906a84 7b696e27-128c-4923-9e11-1ffdb085bf9c --on_maintenance False
ConflictException: 409: Client Error for url: https://172.20.0.101:15868/v1/e9c8633535844bdda019eef939887381/segments/08b44816-01ce-4470-a047-98114f906a84/hosts/7b696e27-128c-4923-9e11-1ffdb085bf9c, Host 7b696e27-128c-4923-9e11-1ffdb085bf9c can't be updated as it is in-use to process notifications.
