@scyto
Last active May 15, 2024 20:28

Enable Dual Stack (IPv4 and IPv6) OpenFabric Routing

This will result in an IPv4 and IPv6 routable mesh network that can survive any one node failure or any one cable failure. All the steps in this section must be performed on each node.

Note: for Ceph, do not dual stack - use either IPv4 or IPv6 addresses for all the monitors, MDS, and daemons. Despite the docs implying dual stack is OK, my findings on Quincy are that it is funky...

This gist is part of this series.

Create Loopback interfaces

Doing this means we don't have to give each Thunderbolt interface a manual IPv6 or IPv4 address, and these addresses stay constant no matter what. Add the following to each node using nano /etc/network/interfaces.

This should go under the auto lo section; on each node, replace X with 1, 2, or 3 depending on the node.

auto lo:0
iface lo:0 inet static
        address 10.0.0.8X/32
        
auto lo:6
iface lo:6 inet static
        address fc00::8X/128

So on the first node it would look something like this:

...
auto lo
iface lo inet loopback
 
auto lo:0
iface lo:0 inet static
        address 10.0.0.81/32

auto lo:6
iface lo:6 inet static
        address fc00::81/128
...

Also add this as the last line of the interfaces file:

# This must be the last line in the file
post-up /usr/bin/systemctl restart frr.service

Save the file and repeat on each node.
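
To apply the loopback changes without rebooting, the stock Proxmox ifupdown2 tooling can reload the file - a minimal sketch (note the post-up line will only succeed once frr is installed in the FRR section below):

ifreload -a      # reload /etc/network/interfaces (ifupdown2, standard on Proxmox)
ip addr show lo  # verify the new loopback addresses are present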

Enable IPv4 and IPv6 forwarding

  1. use nano /etc/sysctl.conf to open the file
  2. uncomment #net.ipv6.conf.all.forwarding=1 (remove the # symbol)
  3. uncomment #net.ipv4.ip_forward=1 (remove the # symbol)
  4. save the file
  5. reboot, or apply the settings immediately as shown below
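
After uncommenting, the two lines should read as below, and sysctl -p will apply them without a reboot:

net.ipv6.conf.all.forwarding=1
net.ipv4.ip_forward=1

sysctl -p   # apply /etc/sysctl.conf immediately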

FRR Setup

Install FRR

Install Free Range Routing (FRR) with apt install frr

Enable the fabricd daemon

  1. edit the frr daemons file (nano /etc/frr/daemons) to change fabricd=no to fabricd=yes
  2. save the file
  3. restart the service with systemctl restart frr
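
If you prefer not to edit the file by hand, a one-liner sketch that does the same edit (assuming the stock /etc/frr/daemons layout):

sed -i 's/^fabricd=no/fabricd=yes/' /etc/frr/daemons
systemctl restart frr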

Configure OpenFabric (perform on all nodes)

  1. enter the FRR shell with vtysh
  2. optionally show the current config with show running-config
  3. enter the configure mode with configure
  4. Apply the below configuration (it is possible to cut and paste this into the shell instead of typing it manually; you may need to press return to set the last !. Also check there were no errors in response to the pasted text).

Note: X should be the number of the node you are working on, so the net line would be, for example, 49.0000.0000.0001.00, 49.0000.0000.0002.00, or 49.0000.0000.0003.00. The hostname line should also match the node you are on.

ip forwarding
ipv6 forwarding
!
frr version 8.5.2
frr defaults traditional
hostname pve1
service integrated-vtysh-config
!
interface en05
ip router openfabric 1
ipv6 router openfabric 1
exit
!
interface en06
ip router openfabric 1
ipv6 router openfabric 1
exit
!
interface lo
ip router openfabric 1
ipv6 router openfabric 1
openfabric passive
exit
!
router openfabric 1
net 49.0000.0000.000X.00
exit
!

  1. you may need to press return after the last ! to get to a new line - if so, do this

  2. exit the configure mode with the command end

  3. save the config with write memory

  4. confirm the configuration applied correctly with show running-config - note the order of the items will be different from how you entered them, and that's OK. (If you made a mistake, I found the easiest fix was to edit /etc/frr/frr.conf - but be careful if you do that.)

  5. use the command exit to leave setup

  6. repeat these steps on the other two nodes

  7. once you have configured all 3 nodes, issue the command vtysh -c "show openfabric topology" - if you did everything right you will see:

Area 1:
IS-IS paths to level-2 routers that speak IP
Vertex               Type         Metric Next-Hop             Interface Parent
pve1                                                                  
10.0.0.81/32         IP internal  0                                     pve1(4)
pve2                 TE-IS        10     pve2                 en06      pve1(4)
pve3                 TE-IS        10     pve3                 en05      pve1(4)
10.0.0.82/32         IP TE        20     pve2                 en06      pve2(4)
10.0.0.83/32         IP TE        20     pve3                 en05      pve3(4)

IS-IS paths to level-2 routers that speak IPv6
Vertex               Type         Metric Next-Hop             Interface Parent
pve1                                                                  
fc00::81/128         IP6 internal 0                                     pve1(4)
pve2                 TE-IS        10     pve2                 en06      pve1(4)
pve3                 TE-IS        10     pve3                 en05      pve1(4)
fc00::82/128         IP6 internal 20     pve2                 en06      pve2(4)
fc00::83/128         IP6 internal 20     pve3                 en05      pve3(4)

IS-IS paths to level-2 routers with hop-by-hop metric
Vertex               Type         Metric Next-Hop             Interface Parent

Now you should be able to ping each node from every node across the Thunderbolt mesh, using IPv4 or IPv6 as you see fit.
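
As a quick check from node 1, the other nodes' loopback addresses (as assigned above) should answer over both protocols:

ping -c 3 10.0.0.82
ping -c 3 10.0.0.83
ping -c 3 fc00::82
ping -c 3 fc00::83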

@vdovhanych

vdovhanych commented Apr 12, 2024

I've set up a cluster with quite a lot of help from these gists. Just wanted to mention this here for anyone else dealing with the network being broken after a node reboot.

post-up /usr/bin/systemctl restart frr.service was not working for me in /etc/network/interfaces, and when I ran ifreload -a it complained about it and couldn't resolve the line.

I created a file restart-frr in /etc/network/if-up.d/ with a command to restart the service as an sh script:

#!/bin/sh

# Check if the interface is either en05 or en06
if [ "$IFACE" = "en05" ] || [ "$IFACE" = "en06" ]; then
    # Restart the frr service
    /usr/bin/systemctl restart frr.service
fi

Make the file executable (see below) and it will run after an interface comes up; it checks for en05 or en06 and runs the restart command. The check is needed because everything in that folder runs for every interface that comes up, so without it the restart would run once per interface.
I tested by rebooting one node, and the network came back online.
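
For reference, making the script executable (path as described above):

chmod +x /etc/network/if-up.d/restart-frr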

@flx-666

flx-666 commented Apr 20, 2024

I faced issues getting en05/en06 up after reboot; I ended up adding auto en05 and auto en06 in all my interfaces config files.
frr did not start properly either until I added this line before the one that restarts the service:

post-up /usr/bin/systemctl reset-failed frr.service

So, my /etc/network/interfaces files end like this:
auto en05
allow-hotplug en05
iface en05 inet manual
mtu 65520

iface en05 inet6 manual
mtu 65520

auto en06
allow-hotplug en06
iface en06 inet manual
mtu 65520

iface en06 inet6 manual
mtu 65520

#source /etc/network/interfaces.d/*
post-up /usr/bin/systemctl reset-failed frr.service
post-up /usr/bin/systemctl restart frr.service

Hope this helps.

BTW, I wonder if I should have just added the reset-failed command in the script /etc/network/if-up.d/restart-frr?

@vdovhanych

BTW, I wonder if I should have just added the reset-failed command in the script /etc/network/if-up.d/restart-frr?

If you did what I described in my post about setting up a simple script to restart the frr service, you shouldn't need anything else in /etc/network/interfaces. That said, if it's working for you as you described and you have the post-up lines to restart the service in /etc/network/interfaces, I would just get rid of /etc/network/if-up.d/restart-frr if you have it. Otherwise it will only restart the service multiple times (maybe that is what was failing the service too). Also, the script I have in /etc/network/if-up.d/ runs after either of en05/en06 comes up; it won't run if the interfaces are not up.

My /etc/network/interfaces looks like this:

<default proxmox configuration above>

# thunderbolt network configuration
allow-hotplug en05
iface en05 inet manual
        mtu 65520

allow-hotplug en06
iface en06 inet manual
        mtu 65520

auto lo
iface lo inet loopback

auto lo:0
iface lo:0 inet static
        address 10.0.0.81/32

source /etc/network/interfaces.d/*

And then i have what was described in my previous post.

@flx-666

flx-666 commented Apr 21, 2024

@vdovhanych thanks for your answer!
I added the reset-failed command in the restart-frr script, removed all entries about it in /etc/network/interfaces, and now everything is running smoothly :)
Routing seems to come back quicker, probably because I don't restart the service multiple times.

However, I did have to add the auto en0X entries in /etc/network/interfaces to have the interfaces come up at reboot.

Thanks a lot for your input!

@thaynes43

thaynes43 commented May 7, 2024

Hello,

Thank you for the gist, it got me up and running in no time. However, like others I am struggling to get the TB network back up after a restart. With the post-up command in the interfaces file I am able to see the correct topology, but I get this when I ping another node:

root@pve01:~# ping 10.0.0.83
PING 10.0.0.83 (10.0.0.83) 56(84) bytes of data.
From 10.0.0.81 icmp_seq=1 Destination Host Unreachable
From 10.0.0.81 icmp_seq=2 Destination Host Unreachable

If I delete that line and add vdovhanych's script instead, frr does not run on startup. Each path leads to me running this manually:

 /usr/bin/systemctl restart frr.service

I will keep troubleshooting; just wondering if anyone has overcome this "Destination Host Unreachable" issue. Here is my interfaces file - I've been playing around with and without the post-up lines vs the script:

auto lo
iface lo inet loopback

# Begin thunderbolt edits

auto lo:0
iface lo:0 inet static
        address 10.0.0.81/32

auto lo:6
iface lo:6 inet static
        address fc00::81/128

# End thunderbolt edits

iface enp2s0f0np0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.0.34/24
        gateway 192.168.0.1
        bridge-ports enp2s0f0np0
        bridge-stp off
        bridge-fd 0

iface enp87s0 inet manual

iface enp90s0 inet manual

iface enp2s0f1np1 inet manual

iface wlp91s0 inet manual

# Begin thunderbolt edits

auto en05
allow-hotplug en05
iface en05 inet manual
        mtu 65520

iface en05 inet6 manual
        mtu 65520

auto en06
allow-hotplug en06
iface en06 inet manual
        mtu 65520

iface en06 inet6 manual
        mtu 65520

# End thunderbolt edits

source /etc/network/interfaces.d/*

# TB last line
post-up /usr/bin/systemctl reset-failed frr.service
post-up /usr/bin/systemctl restart frr.service

@thaynes43

Following up on my last post: I was able to hack around this by adding the following to my crontab:

@reboot sleep 60 && /usr/bin/systemctl restart frr.service

It has worked so far, but it's not a good fix.
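
A possibly cleaner alternative to the cron entry, untested here and only a sketch, would be a systemd override that orders frr after the network is up (this may still not help if the Thunderbolt links come up later than network-online.target):

# systemctl edit frr.service, then add:
[Unit]
After=network-online.target
Wants=network-online.target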

@thaynes43

thaynes43 commented May 10, 2024

Some weirdness with OpenFabric's config: when I use vtysh to show the config, it's missing the ip router line for en05 on two nodes:

interface en05
 ipv6 router openfabric 1
exit
!

However, in /etc/frr/frr.conf it is present:

interface en05
 ip router openfabric 1
 ipv6 router openfabric 1
exit
!

Update - I was able to fix this by removing the post-up commands from the interfaces file. They were crashing out the scripts for plugging in the cables.

@Allistah

Allistah commented May 10, 2024

Update - I was able to fix this by removing the post-up commands from the interfaces file. They were crashing out the scripts for plugging in the cables.

Just wanted to say thank you for posting these updates - and thanks to the gist author for everything. I have a single NUC 13 Pro and another one on the way. I'll only be able to set up Thunderbolt networking between the two NUCs until I can get the third one, but that's fine. I will also be only using IPv4. I'll put all of the lines of config in there but comment them out until I have all three nodes online. Really looking forward to setting this up - thank you in advance!

@JamesTurland

I have this working on 3x MS-01, thanks for the detailed instructions. I can only do it over IPv6 though; I don't see IPv4. Not a problem I guess, but any reason why?

@rogly

rogly commented May 15, 2024

For what it's worth, I had to manually run ifup lo for things to start working; after running that once on each node I was able to see the topology populate and ping between nodes. Also, if you are using a firewall at the cluster level, make sure you add appropriate rules for the mesh network.

@vdovhanych

I have this working on 3x MS-01, thanks for the detailed instructions. I can only do it over IPv6 though; I don't see IPv4. Not a problem I guess, but any reason why?

@JamesTurland There is nothing special needed for the IPv4 configuration. I would double-check that you configured everything properly. Also, check that the IPv4 address is present on the lo interface on all nodes (a different IP in the same subnet for each node, of course); check with ip a. If you see it there, then I would go through frr.conf on all the nodes. If not, take a look at /etc/network/interfaces on all nodes again. Also, check that you enabled IPv4 forwarding. Nothing else comes to mind that would prevent you from using IPv4.
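
A quick way to run those checks on each node (a sketch; adjust the addresses to your own):

ip a show lo                      # is the IPv4 loopback address present?
sysctl net.ipv4.ip_forward        # is IPv4 forwarding on (should be 1)?
vtysh -c "show running-config"    # is openfabric configured for both address families?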
