#!/usr/bin/env bash
#
# Use BGP+EVPN for VXLAN with CloudStack instead of Multicast
#
# Place this file on all KVM hypervisors at /usr/share/modifyvxlan.sh
#
# More information about BGP and EVPN with FRR: https://vincent.bernat.ch/en/blog/2017-vxlan-bgp-evpn
#

DSTPORT=4789

# We bind our VXLAN tunnel IP(v4) on Loopback device 'lo'
DEV="lo"

usage() {
    echo "Usage: $0: -o <op>(add | delete) -v <vxlan id> -p <pif> -b <bridge name> (-6)"
}

localAddr() {
    local FAMILY=$1

    if [[ -z "$FAMILY" || $FAMILY == "inet" ]]; then
        ip -4 -o addr show scope global dev ${DEV} | awk 'NR==1 {gsub("/[0-9]+", "") ; print $4}'
    fi

    if [[ "$FAMILY" == "inet6" ]]; then
        ip -6 -o addr show scope global dev ${DEV} | awk 'NR==1 {gsub("/[0-9]+", "") ; print $4}'
    fi
}

addVxlan() {
    local VNI=$1
    local PIF=$2
    local VXLAN_BR=$3
    local FAMILY=$4
    local VXLAN_DEV=vxlan${VNI}
    local ADDR=$(localAddr ${FAMILY})

    echo "local addr for VNI ${VNI} is ${ADDR}"

    if [[ ! -d /sys/class/net/${VXLAN_DEV} ]]; then
        ip -f ${FAMILY} link add ${VXLAN_DEV} type vxlan id ${VNI} local ${ADDR} dstport ${DSTPORT} nolearning
        ip link set ${VXLAN_DEV} up
        sysctl -qw net.ipv6.conf.${VXLAN_DEV}.disable_ipv6=1
    fi

    if [[ ! -d /sys/class/net/$VXLAN_BR ]]; then
        ip link add name ${VXLAN_BR} type bridge
        ip link set ${VXLAN_BR} up
        sysctl -qw net.ipv6.conf.${VXLAN_BR}.disable_ipv6=1
    fi

    bridge link show|grep ${VXLAN_BR}|awk '{print $2}'|grep "^${VXLAN_DEV}\$" > /dev/null
    if [[ $? -gt 0 ]]; then
        ip link set ${VXLAN_DEV} master ${VXLAN_BR}
    fi
}

deleteVxlan() {
    local VNI=$1
    local PIF=$2
    local VXLAN_BR=$3
    local FAMILY=$4
    local VXLAN_DEV=vxlan${VNI}

    ip link set ${VXLAN_DEV} nomaster
    ip link delete ${VXLAN_DEV}

    ip link set ${VXLAN_BR} down
    ip link delete ${VXLAN_BR} type bridge
}

OP=
VNI=
FAMILY=inet
option=$@

while getopts 'o:v:p:b:6' OPTION
do
    case $OPTION in
    o)  oflag=1
        OP="$OPTARG"
        ;;
    v)  vflag=1
        VNI="$OPTARG"
        ;;
    p)  pflag=1
        PIF="$OPTARG"
        ;;
    b)  bflag=1
        BRNAME="$OPTARG"
        ;;
    6)
        FAMILY=inet6
        ;;
    ?)  usage
        exit 2
        ;;
    esac
done

if [[ "$oflag$vflag$pflag$bflag" != "1111" ]]; then
    usage
    exit 2
fi

lsmod|grep ^vxlan >& /dev/null
if [[ $? -gt 0 ]]; then
    modprobe=`modprobe vxlan 2>&1`
    if [[ $? -gt 0 ]]; then
        echo "Failed to load vxlan kernel module: $modprobe"
        exit 1
    fi
fi

#
# Add a lockfile to prevent this script from running twice on the same host
# this can cause a race condition
#
LOCKFILE=/var/run/cloud/vxlan.lock

(
    flock -x -w 10 200 || exit 1
    if [[ "$OP" == "add" ]]; then
        addVxlan ${VNI} ${PIF} ${BRNAME} ${FAMILY}

        if [[ $? -gt 0 ]]; then
            exit 1
        fi
    elif [[ "$OP" == "delete" ]]; then
        deleteVxlan ${VNI} ${PIF} ${BRNAME} ${FAMILY}
    fi
) 200>${LOCKFILE}
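To make the address lookup in `localAddr` easier to follow, here is a small sketch that runs the same awk program against canned input instead of a live `ip` call. The two sample lines only imitate the one-line format of `ip -o addr show`. The addresses and the `extract_addr` helper name are invented for illustration.

```shell
# Hypothetical samples in the format of `ip -o addr show scope global dev lo`.
# The addresses are made up for this example.
sample_v4='1: lo    inet 10.255.255.1/32 scope global lo\       valid_lft forever preferred_lft forever'
sample_v6='1: lo    inet6 2001:db8::1/128 scope global \       valid_lft forever preferred_lft forever'

# Same awk program as in localAddr(): look at the first line only,
# strip the "/<prefix length>" part and print the fourth field (the address).
extract_addr() {
    awk 'NR==1 {gsub("/[0-9]+", "") ; print $4}'
}

printf '%s\n' "$sample_v4" | extract_addr
printf '%s\n' "$sample_v6" | extract_addr
```

Because only the first line is used, the first global address on the loopback device is the one that ends up as the VXLAN `local` address.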
@wido I've posted a question on the mailing list. Thanks for checking it out later when you're free :)
I cc'ed you in there also.
Has any attempt been made to integrate this into cloudstack proper? If so, what were the issues?
I can see this obviously needs to do things like disable multicast for EVPN where cloudstack enables it in the provided script, but I'd think that could be fairly easily resolved. Ideas for resolution would be some sort of additional setting, or duplicating the vxlan protocol in cloudstack into a new protocol like "vxlan-evpn" which will simply pass a flag into the modifyvxlan.sh script to alter behavior.
I'd be interested in taking this on, but I'd like to know what kind of prior feedback there may have been.
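The "pass a flag into modifyvxlan.sh" idea could look roughly like the sketch below. It is not taken from CloudStack: the `-e` and `-a` options, the `build_vxlan_cmd` name and the VNI-to-multicast-group mapping are all hypothetical. The function only prints the `ip link add` command it would choose, so it can be tried without root.

```shell
# Sketch only: prints the command instead of executing it.
# The -e/-a options, the function name and the group mapping are
# hypothetical and not part of CloudStack or the script above.
DSTPORT=4789

build_vxlan_cmd() {
    local mode=multicast vni= pif= addr= opt group
    OPTIND=1
    while getopts 'v:p:a:e' opt; do
        case $opt in
            v) vni=$OPTARG ;;
            p) pif=$OPTARG ;;
            a) addr=$OPTARG ;;
            e) mode=evpn ;;
            *) return 2 ;;
        esac
    done
    [ -n "$vni" ] || return 2

    if [ "$mode" = evpn ]; then
        # EVPN: no multicast group, learning disabled, BGP fills the FDB.
        [ -n "$addr" ] || return 2
        echo "ip link add vxlan${vni} type vxlan id ${vni} local ${addr} dstport ${DSTPORT} nolearning"
    else
        # Multicast: illustrative mapping of the VNI into 239.0.0.0/8.
        [ -n "$pif" ] || return 2
        group="239.$(( (vni >> 16) % 256 )).$(( (vni >> 8) % 256 )).$(( vni % 256 ))"
        echo "ip link add vxlan${vni} type vxlan id ${vni} group ${group} ttl 10 dev ${pif}"
    fi
}

build_vxlan_cmd -v 100 -p eth0
build_vxlan_cmd -e -v 100 -a 10.255.255.1
```

A single switch like this would keep one script for both modes. A separate "vxlan-evpn" protocol would instead decide on the CloudStack side which script or flag to use.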
Good suggestion! I have opened a Pull Request to at least add this script to the main repository: apache/cloudstack#9778
I have been using this script for five years now without any issues; it just works as expected.
Hi @wido, thank you for the great work on the script, it works flawlessly. VMs in Isolated or L2 guest networks on different hypervisors can reach each other without issues. But I have run into a challenge and I need guidance. When I create a Guest network of type Shared with public IP addresses (with a VLAN/VNI), the VMs can't reach the internet or other physical machines in the same public IP range. Before the change from the VLAN to the VXLAN isolation method this was working fine.
What routers are you using? And they are the gateway for the shared network, correct? You would need to check if they learn the proper EVPN information through BGP. It seems that you might have an issue there.
This does not seem to be a script or CloudStack issue; it is probably a local networking (config) issue.
Could be many, many, many things.
I use FRR as a router and BGP route reflector, and VMs hosted on different KVM servers can communicate with each other regardless of the underlying network. Internet access only works with the Isolated network type, since those VMs are behind a VR. On a Shared network with public IP addresses the VMs can still communicate with each other, but they can't break out to the internet.
The VMs in this shared network with a public address, can they reach their gateway? Or from the gateway router, can you reach (ping) these VMs?
No, they cannot ping the gateway, but they can ping each other. I am trying to figure out how the VMs attached to br-vxlan2 break out of that bridge to the internet.
This is then an issue on your local network with the EVPN config on those gateways. Could be many things:
- BGP to gateway routers not working properly
- Incorrect route targets for the VNIs
- Policy config issue
- etc
- etc
Again, this has nothing to do with this particular script.
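For anyone working through a checklist like the one above, the sketch below prints a few read-only commands that can help narrow things down for a single VNI. It assumes FRR with `vtysh` is installed on the host being checked and that devices follow the `vxlan<VNI>` naming from the script. The `evpn_checks` helper is hypothetical, and VNI 2 is only a guess based on the `br-vxlan2` bridge name mentioned earlier.

```shell
# Prints (does not run) a few read-only checks for one VNI.
# Assumptions: FRR with vtysh on this host, vxlan<VNI> device naming.
evpn_checks() {
    local vni=$1
    [ -n "$vni" ] || { echo "usage: evpn_checks <vni>" >&2; return 2; }
    cat <<EOF
vtysh -c 'show bgp l2vpn evpn summary'
vtysh -c 'show evpn vni ${vni}'
vtysh -c 'show evpn mac vni ${vni}'
ip -d link show vxlan${vni}
bridge fdb show dev vxlan${vni}
EOF
}

evpn_checks 2
```

The first three commands show whether the BGP EVPN sessions are up and which MAC addresses FRR knows for the VNI. The last two show what the kernel has configured and learned. If the gateway's MAC address is missing from those tables on the hypervisor, the problem is likely on the BGP/EVPN side rather than in the script.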
Yes, this script should still work with 4.19, no problem at all.
Could you maybe ask this question on the CloudStack mailing list? I can get back to it there. You can cc me in the e-mail :-)