This document contains the steps to reproduce Neutron bug 1887405 using devstack on Victoria.
I used two Rackspace public cloud VMs. The VMs had a private network (192.168.23.0/24) in addition to their public IPs, but that was probably unnecessary.
Perform these steps on both nodes, starting as root:
useradd -s /bin/bash -d /opt/stack -m stack
echo "stack ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
su - stack
git clone https://opendev.org/openstack/devstack
cd devstack
On node 1 (controller+compute), create local.conf in /opt/stack/devstack. Be sure to set HOST_IP to the node's public IP.
[[local|localrc]]
HOST_IP=<PUBLIC_IP>
LOGFILE=/opt/stack/logs/stack.sh.log
ADMIN_PASSWORD=redacted
DATABASE_PASSWORD=redacted
RABBIT_PASSWORD=redacted
SERVICE_PASSWORD=redacted
# Neutron
FIXED_RANGE=10.4.128.0/20
FLOATING_RANGE=192.168.23.128/25
Q_USE_SECGROUP=True
Q_AGENT=linuxbridge
IP_VERSION=4
# This was required to avoid errors when creating instances. It may not be
# necessary in other environments.
[[post-config|$NOVA_CONF]]
[libvirt]
cpu_mode=host-model
Run stack.sh on node 1.
On node 2, create local.conf in /opt/stack/devstack. Be sure to set HOST_IP to the public IP of this node and SERVICE_HOST to the public IP of node 1 (controller+compute).
local.conf:
[[local|localrc]]
HOST_IP=<PUBLIC_IP>
LOGFILE=/opt/stack/logs/stack.sh.log
ADMIN_PASSWORD=redacted
DATABASE_PASSWORD=redacted
RABBIT_PASSWORD=redacted
SERVICE_PASSWORD=redacted
DATABASE_TYPE=mysql
SERVICE_HOST=<CONTROLLER+COMPUTE_PUBLIC_IP>
MYSQL_HOST=$SERVICE_HOST
RABBIT_HOST=$SERVICE_HOST
GLANCE_HOSTPORT=$SERVICE_HOST:9292
ENABLED_SERVICES=n-cpu,q-agt,c-vol,placement-client
NOVA_VNC_ENABLED=True
NOVNCPROXY_URL="http://$SERVICE_HOST:6080/vnc_lite.html"
VNCSERVER_LISTEN=$HOST_IP
VNCSERVER_PROXYCLIENT_ADDRESS=$VNCSERVER_LISTEN
# Neutron
FIXED_RANGE=10.4.128.0/20
FLOATING_RANGE=192.168.23.128/25
Q_USE_SECGROUP=True
Q_AGENT=linuxbridge
IP_VERSION=4
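Both local.conf files above contain angle-bracket placeholders, so a quick check before running stack.sh can catch a value that was never filled in. The following is a minimal sketch and not part of devstack: the function name check_local_conf is hypothetical, and it only looks for leftover <...> values and for a HOST_IP that looks like an IPv4 address.

```shell
# Hypothetical helper (not part of devstack): sanity-check a local.conf.
# Usage: check_local_conf [path]   (defaults to /opt/stack/devstack/local.conf)
check_local_conf() {
    conf="${1:-/opt/stack/devstack/local.conf}"

    # Any VAR=<SOMETHING> line means a placeholder was left in place.
    # Matching lines are printed to stderr with their line numbers.
    if grep -nE '^[A-Za-z_]+=<[^>]*>$' "$conf" >&2; then
        echo "unreplaced placeholder(s) in $conf" >&2
        return 1
    fi

    # HOST_IP must be present and look like an IPv4 address.
    if ! grep -qE '^HOST_IP=[0-9]+(\.[0-9]+){3}$' "$conf"; then
        echo "HOST_IP is missing or not an IPv4 address in $conf" >&2
        return 1
    fi
}
```

Running check_local_conf on each node before stack.sh returns non-zero if HOST_IP or SERVICE_HOST still holds its placeholder.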
Wait until node 1 has finished building and then run stack.sh.
Wait until node 2 has finished building.
On node 1 (controller+compute):
/opt/stack/devstack/tools/discover_hosts.sh
# Source credentials
source ~/devstack/openrc admin
# Create secgroups
openstack security group create server
openstack security group create client
# Create server security group rules
openstack security group rule create --remote-group client --dst-port 9092 --protocol tcp server
# Create client security group rules
openstack security group rule create --egress --remote-group server --dst-port 9092 --protocol tcp client
# Save the client secgroup's ID
CLIENT_SG_ID=$(openstack security group show client -c id -f value)
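This ID is later substituted into plugin.py.patch with sed. If the variable ends up empty, the patched condition never matches and the delay silently never fires. A small guard can rule that out. This is a sketch only: is_uuid is a hypothetical helper name, and it assumes the ID is printed in the usual lowercase dashed UUID form.

```shell
# Hypothetical helper: succeed only if the argument is a canonical
# lowercase UUID such as 3f1c2a9e-5b7d-4c1a-9e2f-0a1b2c3d4e5f.
is_uuid() {
    printf '%s\n' "$1" |
        grep -qE '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
}

# Usage, after CLIENT_SG_ID has been set as above:
#   is_uuid "$CLIENT_SG_ID" || echo "CLIENT_SG_ID is not set correctly" >&2
```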
These patches only make the race condition easier to trigger. The neutron-server
patch delays port creation for the "client" VM, and the neutron-linuxbridge-agent
patch lengthens the window in which the condition can occur.
Perform these steps on node 1 (the combined controller+compute node).
cat << EOF > ~/plugin.py.patch
--- /opt/stack/neutron/neutron/plugins/ml2/plugin.py 2020-07-17 18:19:15.821827538 +0000
+++ /opt/stack/neutron/neutron/plugins/ml2/plugin.py-patched 2020-07-17 18:19:05.142035026 +0000
@@ -1400,6 +1400,16 @@
     @db_api.retry_if_session_inactive()
     def create_port(self, context, port):
         self._before_create_port(context, port)
+        client_sg_id = 'CLIENT_SG_ID'
+        sleeptime = 8
+        try:
+            if client_sg_id in port['port']['security_groups']:
+                import time
+                LOG.info("neutron_diag: sleeping {}s before creating port".format(sleeptime))
+                time.sleep(sleeptime)
+                LOG.info("neutron_diag: done sleeping {}s".format(sleeptime))
+        except TypeError:
+            pass
         result, mech_context = self._create_port_db(context, port)
         return self._after_create_port(context, result, mech_context)
EOF
sed -i "s/CLIENT_SG_ID/${CLIENT_SG_ID}/" ~/plugin.py.patch
patch -b -i ~/plugin.py.patch /opt/stack/neutron/neutron/plugins/ml2/plugin.py
sudo systemctl restart devstack@q-svc.service
cat << EOF > ~/securitygroups_rpc.py.patch
--- /opt/stack/neutron/neutron/agent/securitygroups_rpc.py-orig 2020-07-16 15:33:43.055891510 +0000
+++ /opt/stack/neutron/neutron/agent/securitygroups_rpc.py 2020-07-16 15:34:24.979023281 +0000
@@ -148,6 +148,11 @@
                 devices.update(devices_info['devices'])
                 security_groups.update(devices_info['security_groups'])
                 security_group_member_ips.update(devices_info['sg_member_ips'])
+                import time
+                sleeptime = 10
+                LOG.info("neutron_diag: sleeping {}s after security_group_info_for_devices".format(sleeptime))
+                time.sleep(sleeptime)
+                LOG.info("neutron_diag: done sleeping {}s".format(sleeptime))
         else:
             devices = self.plugin_rpc.security_group_rules_for_devices(
                 self.context, list(device_ids))
EOF
patch -b -i ~/securitygroups_rpc.py.patch /opt/stack/neutron/neutron/agent/securitygroups_rpc.py
sudo systemctl restart devstack@q-agt.service
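After both patches are applied, it can help to confirm that the diagnostic code is actually in place, and to have a way back to the unpatched files afterwards. The helpers below are a sketch with hypothetical names. They assume the "neutron_diag" marker string used in the patches above and GNU patch's default behaviour of saving the original file with an .orig suffix when -b is given.

```shell
# Hypothetical helpers for the two patched files.

# Succeed if FILE contains the "neutron_diag" marker added by the patches.
has_diag_marker() {
    grep -q 'neutron_diag' "$1"
}

# Restore FILE from the FILE.orig backup that `patch -b` leaves behind
# (assumes the default backup suffix was not overridden).
revert_patch() {
    if [ -f "$1.orig" ]; then
        mv "$1.orig" "$1"
    else
        echo "no backup found: $1.orig" >&2
        return 1
    fi
}

# Usage:
#   has_diag_marker /opt/stack/neutron/neutron/plugins/ml2/plugin.py
#   revert_patch /opt/stack/neutron/neutron/agent/securitygroups_rpc.py
```

After reverting a file, the corresponding service (devstack@q-svc.service or devstack@q-agt.service) needs to be restarted again for the change to take effect.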
Be sure to change the hypervisor names in --availability-zone to match your environment's hostnames.
openstack server create --flavor m1.nano --image cirros-0.5.1-x86_64-disk --network private --security-group client --availability-zone nova:cfarquhar-devstack-master02 client01 > /dev/null & \
openstack server create --flavor m1.nano --image cirros-0.5.1-x86_64-disk --network private --security-group server --availability-zone nova:cfarquhar-devstack-master01 server01 > /dev/null &
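Because both commands above run in the background with their output discarded, a failed create is easy to miss. One possible way to keep the concurrency while still seeing failures is sketched below. run_both is a hypothetical helper, not part of any OpenStack tooling: it starts two command strings at once, waits for both, and returns non-zero if either failed.

```shell
# Hypothetical helper: run two shell command strings concurrently.
# Returns 0 only if both commands exit successfully.
run_both() {
    sh -c "$1" > /dev/null &
    pid1=$!
    sh -c "$2" > /dev/null &
    pid2=$!

    rc=0
    # `wait PID` returns the exit status of that background job.
    wait "$pid1" || rc=1
    wait "$pid2" || rc=1
    return "$rc"
}

# Usage (pass each full `openstack server create ...` command as one string):
#   run_both "openstack server create ... client01" \
#            "openstack server create ... server01" \
#       || echo "at least one server create failed" >&2
```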
sudo ipset list
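The source does not say what to look for in this output. Presumably the set members are compared against the instances' fixed IPs. Under that assumption, and assuming the usual ipset list layout in which each member address is printed on its own line after "Members:", a check could be sketched as follows. ipset_has_member and CLIENT_IP are hypothetical names.

```shell
# Hypothetical helper: read `ipset list` output on stdin and succeed if
# the given address appears as a member line of any set.
ipset_has_member() {
    # -x matches whole lines only, -F treats the address as a literal string,
    # so 10.4.128.5 does not match 10.4.128.50.
    grep -qxF "$1"
}

# Usage (CLIENT_IP is a placeholder for client01's fixed IP):
#   sudo ipset list | ipset_has_member "$CLIENT_IP" || echo "missing from ipset"
```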