@scyto
Last active June 16, 2024 12:30

Enable Dual Stack (IPv4 and IPv6) OpenFabric Routing

This will result in an IPv4 and IPv6 routable mesh network that can survive any one node failure or any one cable failure. All the steps in this section must be performed on each node.

Note for Ceph: do not dual stack. Use either IPv4 or IPv6 addresses for all the monitors, MDS and daemons. Despite the docs implying it is OK, my findings on Quincy are that it is funky.

this gist is part of this series

Create Loopback interfaces

Doing this means we don't have to give each thunderbolt interface a manual IPv6 or IPv4 address, and these addresses stay constant no matter what. Add the following to each node using nano /etc/network/interfaces

This should go under the auto lo section, and for each node the X should be 1, 2 or 3 depending on the node.

auto lo:0
iface lo:0 inet static
        address 10.0.0.8X/32
        
auto lo:6
iface lo:6 inet static
        address fc00::8X/128

so on the first node it would look something like this:

...
auto lo
iface lo inet loopback
 
auto lo:0
iface lo:0 inet static
        address 10.0.0.81/32

auto lo:6
iface lo:6 inet static
        address fc00::81/128
...

Also add this as the last line of the interfaces file:

# This must be the last line in the file
post-up /usr/bin/systemctl restart frr.service

Save file, repeat on each node.
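
The stanzas above differ only in the node number, so they can be generated instead of typed by hand. This is a minimal sketch, assuming a hypothetical helper named loopback_stanzas (not part of the original gist) whose output you review and paste under the auto lo section yourself:

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not from the gist): print the
# loopback stanzas for node number $1, using the addressing shown above.
loopback_stanzas() {
    node="$1"
    case "$node" in
        1|2|3) ;;
        *) echo "usage: loopback_stanzas <1|2|3>" >&2; return 1 ;;
    esac
    cat <<EOF
auto lo:0
iface lo:0 inet static
        address 10.0.0.8${node}/32

auto lo:6
iface lo:6 inet static
        address fc00::8${node}/128
EOF
}
```

For example, loopback_stanzas 2 prints the two stanzas for the second node. The helper only prints text; it does not modify /etc/network/interfaces.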

Enable IPv4 and IPv6 forwarding

  1. use nano /etc/sysctl.conf to open the file
  2. uncomment #net.ipv6.conf.all.forwarding=1 (remove the # symbol)
  3. uncomment #net.ipv4.ip_forward=1 (remove the # symbol)
  4. save the file
  5. reboot, or run sysctl -p to apply the settings without rebooting
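
The same uncommenting can be scripted, which helps when repeating it on every node. This is a sketch only, assuming a hypothetical helper named enable_forwarding and assuming the two settings are present in the file as commented-out lines, as described in the steps above:

```shell
#!/bin/sh
# Hypothetical helper: uncomment the two forwarding settings in a sysctl
# file (default /etc/sysctl.conf). Lines that are not commented-out copies
# of these two settings are left untouched.
enable_forwarding() {
    conf="${1:-/etc/sysctl.conf}"
    tmp="$(mktemp)" || return 1
    if sed -E \
        -e 's/^#[[:space:]]*(net\.ipv4\.ip_forward=1)/\1/' \
        -e 's/^#[[:space:]]*(net\.ipv6\.conf\.all\.forwarding=1)/\1/' \
        "$conf" > "$tmp"; then
        # Write back through the original file so its permissions are kept.
        cat "$tmp" > "$conf"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

Run as root, enable_forwarding && sysctl -p would edit the file and then load it; sysctl -p reads /etc/sysctl.conf by default.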

FRR Setup

Install FRR

Install Free Range Routing (FRR) with apt install frr

Enable the fabricd daemon

  1. edit the frr daemons file (nano /etc/frr/daemons) to change fabricd=no to fabricd=yes
  2. save the file
  3. restart the service with systemctl restart frr
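
The edit in step 1 is a one-line substitution, so it can also be done non-interactively. A minimal sketch, assuming a hypothetical helper named enable_fabricd and assuming the daemons file contains the line fabricd=no exactly as described above:

```shell
#!/bin/sh
# Hypothetical helper: switch fabricd=no to fabricd=yes in the FRR daemons
# file (default /etc/frr/daemons). Other daemon lines are left as they are.
enable_fabricd() {
    daemons="${1:-/etc/frr/daemons}"
    tmp="$(mktemp)" || return 1
    if sed 's/^fabricd=no$/fabricd=yes/' "$daemons" > "$tmp"; then
        # Write back through the original file so ownership and mode are kept.
        cat "$tmp" > "$daemons"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

Run as root, enable_fabricd && systemctl restart frr covers steps 1 to 3 in one go.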

Configure OpenFabric (perform on all nodes)

  1. enter the FRR shell with vtysh
  2. optionally show the current config with show running-config
  3. enter the configure mode with configure
  4. Apply the configuration below (it is possible to cut and paste this into the shell instead of typing it manually; you may need to press return to set the last !. Also check there were no errors in response to the pasted text.).

Note: the X should be the number of the node you are working on, so the net line ends in 0001.00, 0002.00 or 0003.00 (for example, 49.0000.0000.0001.00 on the first node).

ip forwarding
ipv6 forwarding
!
interface en05
ip router openfabric 1
ipv6 router openfabric 1
exit
!
interface en06
ip router openfabric 1
ipv6 router openfabric 1
exit
!
interface lo
ip router openfabric 1
ipv6 router openfabric 1
openfabric passive
exit
!
router openfabric 1
net 49.0000.0000.000X.00
exit
!

  5. you may need to press return after the last ! to get to a new line - if so, do this

  6. exit the configure mode with the command end

  7. save the config with write memory

  8. check the configuration applied correctly with show running-config - note the order of the items will be different to how you entered them and that's ok. (If you made a mistake, I found the easiest way was to edit /etc/frr/frr.conf - but be careful if you do that.)

  9. use the command exit to leave the FRR shell

  10. repeat steps 1 to 9 on the other 2 nodes

  11. once you have configured all 3 nodes, issue the command vtysh -c "show openfabric topology"; if you did everything right you will see:

Area 1:
IS-IS paths to level-2 routers that speak IP
Vertex               Type         Metric Next-Hop             Interface Parent
pve1                                                                  
10.0.0.81/32         IP internal  0                                     pve1(4)
pve2                 TE-IS        10     pve2                 en06      pve1(4)
pve3                 TE-IS        10     pve3                 en05      pve1(4)
10.0.0.82/32         IP TE        20     pve2                 en06      pve2(4)
10.0.0.83/32         IP TE        20     pve3                 en05      pve3(4)

IS-IS paths to level-2 routers that speak IPv6
Vertex               Type         Metric Next-Hop             Interface Parent
pve1                                                                  
fc00::81/128         IP6 internal 0                                     pve1(4)
pve2                 TE-IS        10     pve2                 en06      pve1(4)
pve3                 TE-IS        10     pve3                 en05      pve1(4)
fc00::82/128         IP6 internal 20     pve2                 en06      pve2(4)
fc00::83/128         IP6 internal 20     pve3                 en05      pve3(4)

IS-IS paths to level-2 routers with hop-by-hop metric
Vertex               Type         Metric Next-Hop             Interface Parent
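
Rather than reading the table by eye, the output can be checked for the six loopback prefixes used in this guide. This is a sketch under the assumption that the addressing above (10.0.0.81-83 and fc00::81-83) is used unchanged; check_topology is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read "show openfabric topology" output on stdin and
# report any expected loopback prefix that does not appear in it.
check_topology() {
    topo="$(cat)"
    missing=0
    for prefix in 10.0.0.81/32 10.0.0.82/32 10.0.0.83/32 \
                  fc00::81/128 fc00::82/128 fc00::83/128; do
        if ! printf '%s\n' "$topo" | grep -qF -- "$prefix"; then
            echo "missing: $prefix" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Typical use would be vtysh -c "show openfabric topology" | check_topology, which exits non-zero and names the missing prefixes if a node has not joined the fabric.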

Now you should be able to ping each node from every node across the thunderbolt mesh using IPv4 or IPv6 as you see fit.
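
A quick way to run that check from any node is to loop over all six loopback addresses. A minimal sketch, assuming the addressing from this guide and assuming the system's ping accepts both IPv4 and IPv6 addresses (recent iputils versions do); mesh_addresses and ping_mesh are hypothetical helper names:

```shell
#!/bin/sh
# Hypothetical helper: print the loopback addresses used in this guide.
mesh_addresses() {
    for n in 1 2 3; do
        echo "10.0.0.8$n"
        echo "fc00::8$n"
    done
}

# Hypothetical helper: ping each address once and report the result.
# PING can be overridden (for example to ping6 on older systems).
ping_mesh() {
    failed=0
    for addr in $(mesh_addresses); do
        if ${PING:-ping} -c 1 -W 2 "$addr" >/dev/null 2>&1; then
            echo "ok   $addr"
        else
            echo "FAIL $addr"
            failed=1
        fi
    done
    return "$failed"
}
```

Calling ping_mesh prints one line per address and returns non-zero if any address did not answer. The node's own loopbacks are included, which is harmless.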

@vdovhanych

vdovhanych commented Jun 10, 2024

FWIW

I have dropped the restart of the frr service via my if-up script, I went back to the /etc/network/interfaces post up script, mainly due to one of the interfaces not coming back when I had some script checking interface state during boot, it disturbs something for the thunderbolt interface during initialization.

I put this in my interfaces file; a 5 second sleep was enough and the openfabric network always comes back up:
post-up sleep 5 && /usr/bin/systemctl restart frr.service
Keep in mind that even though the gist says to put the post-up as the last line in the interfaces file, it needs to come before source /etc/network/interfaces.d/* if you have that line. Otherwise, the post-up command won't be executed.

here is the full interfaces file for anyone interested.

iface enp114s0 inet manual

iface wlo1 inet manual

auto vmbr0
iface vmbr0 inet static
	address 10.30.10.8/24
	gateway 10.30.10.1
	bridge-ports enp114s0
	bridge-stp off
	bridge-fd 0

allow-hotplug en05
iface en05 inet manual
       mtu 1500

allow-hotplug en06
iface en06 inet manual
        mtu 1500

auto lo
iface lo inet loopback

auto lo:0
iface lo:0 inet static
        address 10.0.0.8/32

post-up sleep 5 && /usr/bin/systemctl restart frr.service

source /etc/network/interfaces.d/*
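
The ordering described in this comment (post-up before the source line) can be checked with a few lines of awk. This is only a sketch; postup_before_source is a hypothetical helper name and the patterns assume the lines look like the ones in the file above:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the interfaces file has a post-up
# line that restarts frr.service and it appears before any "source" or
# "source-directory" line (or there is no such line at all).
postup_before_source() {
    file="${1:-/etc/network/interfaces}"
    awk '
        /^[ \t]*post-up[ \t].*frr\.service/    { if (!p) p = NR }
        /^[ \t]*source(-directory)?[ \t]/      { if (!s) s = NR }
        END { exit !(p && (!s || p < s)) }
    ' "$file"
}
```

For example, postup_before_source || echo "post-up is missing or placed after source" flags the layout that this comment warns about.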

@SteveKnowless

Following up to my last post I was able to hack around this by adding the following to my crontab:

@reboot sleep 60 && /usr/bin/systemctl restart frr.service

Has worked so far but it's not a good fix.

Thank you for this suggestion. This was the only way I was able to get both IPv4 and IPv6 Routes to come back up after reboot. Anything I tried to restart the frr.service in /etc/network/interfaces would come back with an error processing the line.

@travisw3

(Quoting @vdovhanych's comment above in full: post-up sleep 5 && /usr/bin/systemctl restart frr.service, placed before source /etc/network/interfaces.d/*.)

Thank you this fixed it for me

@ronindesign

ronindesign commented Jun 12, 2024

Even with auto en05/auto en06 in /etc/network/interfaces, neither interface comes up after reboot.

I've also tried adding post-up sleep 5 && /usr/bin/systemctl restart frr.service, but nothing seems to start the en05/en06 interfaces on startup.

Everything works just fine after I manually bring up the interfaces using ifup en05/ifup en06

I don't see any errors or other reasons why en05/en06 aren't coming up. Am I supposed to add a script to call ifup en05/ifup en06 or is /usr/bin/systemctl restart frr.service supposed to also do this? Any log I can check for errors?

EDIT: Oh, I forgot, the pve-en05.sh scripts are supposed to bring up the interfaces. I have this in system log on start up:

kernel: thunderbolt 0-1: new host found, vendor=0x8086 device=0x1
kernel: thunderbolt 0-1: Intel Corp. virtual302
kernel: thunderbolt-net 0-1.0 en05: renamed from thunderbolt0
(udev-worker)[1031]: en05: Process '/usr/local/bin/pve-en05.sh' failed with exit code 89.

@joostvdl

I'm also having trouble with IPv4, and I have posted an issue on the FRR Slack channel. A systemctl restart of frr.service fixes it temporarily (until the next boot). I also tried:
/usr/lib/frr/frr-reload.py --reload /etc/frr/frr.conf
That showed no errors explaining why it isn't working, but the reload did work. So it looks like a timing issue with when the restart of frr.service happens.
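
If the cause really is timing, one thing to try instead of a fixed sleep is waiting until the thunderbolt interfaces exist before restarting FRR. This is an untested sketch of that idea, not something confirmed in this thread; wait_for_iface and the SYSFS_NET override are hypothetical names, and the interface names en05 and en06 are taken from the gist:

```shell
#!/bin/sh
# Hypothetical helper: wait up to $2 seconds (default 30) for network
# interface $1 to appear under /sys/class/net. SYSFS_NET exists only so the
# directory can be overridden when trying the function out.
wait_for_iface() {
    iface="$1"
    timeout="${2:-30}"
    base="${SYSFS_NET:-/sys/class/net}"
    waited=0
    while [ ! -e "$base/$iface" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

It could be used as wait_for_iface en05 && wait_for_iface en06 && /usr/bin/systemctl restart frr.service. Note that this only checks that the interface exists, not that its link is up.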
