
@scyto
Last active June 20, 2024 22:22
Thunderbolt Networking Setup

Thunderbolt Networking

this gist is part of this series

NOTE: FOR THIS TO BE RELIABLE ON NODE RESTARTS YOU WILL NEED PROXMOX KERNEL 6.2.16-14-pve OR HIGHER

That kernel contains fixes for issues I raised with the thunderbolt / thunderbolt-net maintainers (I will take everyone's thanks now, lol)

Install LLDP - this is great for seeing which nodes can see which.

  • install lldpd (which provides lldpctl) with apt install lldpd

Load Kernel Modules

  • add the thunderbolt and thunderbolt-net kernel modules (this must be done on all nodes - yes, I know it can sometimes work without, but the thunderbolt-net one has interesting behaviour, so do as I say and add both ;-)
    1. nano /etc/modules and add the modules at the bottom of the file, one on each line
    2. save using Ctrl+X, then y, then Enter
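
If you would rather script this step than edit by hand, here is a minimal sketch. The helper name ensure_module is my own invention, not something Proxmox or Debian ships; it only appends a module name when that exact line is not already in the file, so it should be safe to re-run.

```shell
#!/bin/sh
# ensure_module FILE MODULE
# Hypothetical helper: append MODULE to FILE only if that exact line is missing.
ensure_module() {
    file=$1
    module=$2
    # -x matches the whole line, -F treats the name as a fixed string,
    # so "thunderbolt" does not count as present just because
    # "thunderbolt-net" is.
    if ! grep -qxF "$module" "$file" 2>/dev/null; then
        printf '%s\n' "$module" >> "$file"
    fi
}

# Usage on a node (as root):
#   ensure_module /etc/modules thunderbolt
#   ensure_module /etc/modules thunderbolt-net
```

After the next boot, lsmod | grep thunderbolt should list both modules.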

Prepare /etc/network/interfaces

Doing this means we don't have to give each thunderbolt interface a manual IPv6 address, and that these addresses stay constant no matter what. Add the following to each node using nano /etc/network/interfaces

If you see any sections called thunderbolt0 or thunderbolt1 delete them at this point.

Now add the following (note we will set IP addresses in the UI):

allow-hotplug en05
iface en05 inet manual
       mtu 65520

iface en05 inet6 manual
        mtu 65520

allow-hotplug en06
iface en06 inet manual
        mtu 65520

iface en06 inet6 manual
        mtu 65520

If you see any thunderbolt sections, delete them from the file before you save it.
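
A quick way to check for leftovers, as a rough sketch (find_tb_stanzas is a made-up helper name): it prints, with line numbers, any auto / allow-hotplug / iface line that still names a thunderboltN interface, and returns non-zero when there are none.

```shell
#!/bin/sh
# find_tb_stanzas FILE
# Hypothetical helper: list stanza lines that still reference thunderboltN.
find_tb_stanzas() {
    grep -nE '^[[:space:]]*(auto|allow-hotplug|iface)[[:space:]]+thunderbolt[0-9]+' "$1"
}

# Usage:
#   find_tb_stanzas /etc/network/interfaces || echo "no thunderbolt stanzas left"
```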

Rename Thunderbolt Connections

This is needed as proxmox doesn't recognize the thunderbolt interface name. There are various methods to do this. This method was selected after trial and error because:

  • the thunderboltX naming is not fixed to a port (it seems to be based on sequence you plug the cables in)
  • the MAC address of the interfaces changes with most cable insertion and removal events

  1. use the udevadm monitor command to find your device IDs when you insert and remove each TB4 cable. Yes, you can use other ways to do this; I recommend this one as it is a great way to understand what udev does - the command proved more useful to me than the syslog or the lspci command for troubleshooting thunderbolt issues and behaviours. In my case my two PCI paths are 0000:00:0d.2 and 0000:00:0d.3; if you bought the same hardware these will be the same on all 3 units. Otherwise, don't assume your PCI device paths will be the same as mine.

  2. create a link file using nano /etc/systemd/network/00-thunderbolt0.link and enter the following content:

[Match]
Path=pci-0000:00:0d.2
Driver=thunderbolt-net
[Link]
MACAddressPolicy=none
Name=en05

  3. create a second link file using nano /etc/systemd/network/00-thunderbolt1.link and enter the following content:

[Match]
Path=pci-0000:00:0d.3
Driver=thunderbolt-net
[Link]
MACAddressPolicy=none
Name=en06
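
The steps above can be sketched in shell. Treat this as an illustration only: both function names are hypothetical, and the sample udev event line in the usage comment is made up to match my PCI paths, so yours will differ. The first function pulls the last PCI address out of each udevadm monitor line, the second prints a .link file in the same shape as the two above.

```shell
#!/bin/sh
# pci_addr_from_devpath: read udev event lines on stdin and print the last
# PCI address (domain:bus:device.function) found on each line.
pci_addr_from_devpath() {
    while IFS= read -r line; do
        printf '%s\n' "$line" |
            grep -oE '[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]' |
            tail -n 1
    done
}

# render_link_file PCI_ADDR NAME: print a .link file for that port.
render_link_file() {
    printf '[Match]\nPath=pci-%s\nDriver=thunderbolt-net\n[Link]\nMACAddressPolicy=none\nName=%s\n' "$1" "$2"
}

# Usage (illustrative event line, not captured from a real system):
#   echo 'KERNEL[100.000000] add /devices/pci0000:00/0000:00:0d.2/domain0/0-0/0-1/0-1.0/net/thunderbolt0 (net)' \
#       | pci_addr_from_devpath
#   render_link_file 0000:00:0d.2 en05 > /etc/systemd/network/00-thunderbolt0.link
```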

Set Interfaces to UP on reboots and cable insertions

This section ensures that the interfaces will be brought up at boot or cable insertion with whatever settings are in /etc/network/interfaces - this shouldn't need to be done, it seems like a bug in the way thunderbolt networking is handled (I assume this is Debian-wide but haven't checked).

  1. create a udev rule to detect cable insertion using nano /etc/udev/rules.d/10-tb-en.rules with the following content:
ACTION=="move", SUBSYSTEM=="net", KERNEL=="en05", RUN+="/usr/local/bin/pve-en05.sh"
ACTION=="move", SUBSYSTEM=="net", KERNEL=="en06", RUN+="/usr/local/bin/pve-en06.sh"
  2. save the file

  3. create the first script referenced above using nano /usr/local/bin/pve-en05.sh with the following content:

#!/bin/bash

# this brings the renamed interface up and reprocesses any settings in /etc/network/interfaces for the renamed interface
/usr/sbin/ifup en05

save the file and then

  4. create the second script referenced above using nano /usr/local/bin/pve-en06.sh with the following content:
#!/bin/bash

# this brings the renamed interface up and reprocesses any settings in /etc/network/interfaces for the renamed interface
/usr/sbin/ifup en06

and save the file

  5. make both scripts executable with chmod +x /usr/local/bin/*.sh
  6. Reboot (restarting networking, init 1 and init 3 are not good enough, so reboot)
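
Once the node is back up, a rough check that the rename and the MTU from /etc/network/interfaces actually took effect could look like the sketch below. The helper names are hypothetical; the only real command used is ip -o link show dev IFACE, whose one-line output includes the "mtu N" field.

```shell
#!/bin/sh
# mtu_of: read `ip -o link show dev IFACE` output on stdin, print the MTU.
mtu_of() {
    sed -n 's/.* mtu \([0-9][0-9]*\).*/\1/p'
}

# check_tb_mtu IFACE: succeed only if IFACE exists and reports MTU 65520.
check_tb_mtu() {
    [ "$(ip -o link show dev "$1" 2>/dev/null | mtu_of)" = "65520" ]
}

# Usage after the reboot:
#   for i in en05 en06; do
#       if check_tb_mtu "$i"; then echo "$i ok"; else echo "$i NOT ok"; fi
#   done
```

An interface with no cable attached may not exist at all, so a "NOT ok" on an unplugged port is expected.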

Enabling IP Connectivity

proceed to the next gist

@JamesTurland

Hi peeps; for anyone like me with a non-unique PCI device, you can use the MAC address to match, and the easiest place to get it is "ip link show". Also, don't use quotes around the address; see below for an example:

[Match]
MACAddress=aa:bb:cc:dd:ee:ff
Driver=thunderbolt-net
[Link]
MACAddressPolicy=none
Name=en06

Great thread peeps. Clay.

Unfortunately my MAC and permanent MAC change on each boot... Go figure.

@damitjimii

@JamesTurland maybe the whole path? Or maybe look at the udevadm info -ap /whole pci path + /net/thunderbolt0 to see if there is something static in one of the path/device levels.

@scyto
Author

scyto commented Jun 3, 2024

Unfortunately my MAC and permanent MAC change on each boot... Go figure.

yes thats what I observed too, also the thunderbolt0 and thunderbolt1 names are not consistent, whichever port comes up first with cable will get 0 and the next port gets 1 - so this can change based on order cables are plugged in or whatever weird race condition on the bus / kernel happens.....

@scyto
Author

scyto commented Jun 3, 2024

@damitjimii forget the dmesg output it will lead you astray - what do you see in the udevadm tool for each port - thats the key because you are creating match conditions based on the udev meta data for those paths. By querying the various paths you see in the monitor output with udevadm to explicitly inspect the paths you can try and find other consistent identifiers to do match on.
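
To make that concrete, here is a small sketch of the kind of inspection I mean. match_keys is a hypothetical helper; it trims the (long) udevadm info -a attribute walk down to the per-device headers and the KERNEL(S) / SUBSYSTEM(S) / DRIVER(S) lines, which are the usual candidates for a match condition.

```shell
#!/bin/sh
# match_keys: reduce `udevadm info -a` output (stdin) to the device headers
# and the KERNEL(S)/SUBSYSTEM(S)/DRIVER(S) lines.
match_keys() {
    grep -E '^[[:space:]]*(looking at|KERNELS?==|SUBSYSTEMS?==|DRIVERS?==)'
}

# Usage on a node, while the cable is plugged in:
#   udevadm info -a -p /sys/class/net/thunderbolt0 | match_keys
```

Run it for each port and compare: anything that stays the same across reboots and re-plugs for one port, and differs between ports, is worth trying as a match.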

@luilegeant

luilegeant commented Jun 4, 2024

@uvalleza & all: Quick summary from my last few days using 3 um790 pro from minisforum:

  • Context: Ubuntu 24.04 LTS on 3 "nuc" with AMD cpu. (I was told: USB4 doesn't necessarily mean Thunderbolt 3, the spec is pick & choose; on top of that, until recently Thunderbolt was an Intel-only feature)
  • The speed I get on a direct link is around ~12Gbits, like you do
  • frr needs a "reload" after boot (no need to play with the wires, unless the logs show invalid config for usb port x); see the code snippet below
  • Downgrading the bios/uefi from 1.09 to 1.07 gave me fewer "invalid config for usb x" kind of errors and more reliability after reboots (I still have to unplug and replug some wires sometimes); also, it seems that my speed went from 10-11gbits to 12-13 gbits, but I can't really confirm. => what bios/uefi version are you running?
  • I had to stop using encrypted boot drives as it required me to unplug the thunderbolt links to let the hdmi work, then replug it all.
  • cables are indeed placed the same way as yours
  • my frr setup seems to be working, but once I remove one of the links, the speed is about 2Mbits (yes, megabits) when it needs to do 1 more hop through the 2 other thunderbolt links => I have yet to figure out that part

To auto-reload the frr configuration after reboot (required otherwise it fails to see the thunderbolt links and I get 3 independent nodes that don't see each other via vtysh -c "show openfabric topology")
Requirement: have your interfaces renamed (see "tbt" in script) as explained in the first post by scyto (don't use hyphen in interface names, it wasn't working for me)

#!/bin/sh
# Delayed start script to tell frr to reload ensuring that it sees thunderbolt links towards other nodes.
# condition: is there any tbt network interface and frr service up
COUNTER=0
while [ ${COUNTER} -lt 5 ]; do
	sleep 1;
	TEST=$(ip a | grep ": tbt" | grep "UP" | awk 'BEGIN { ORS=""}; {print $2}')
	if [ ${#TEST} -ge 2 ]; then
		TEST_SVC=$(service frr status | grep "active (running)")
		if [ ${#TEST_SVC} -ge 2 ]; then
			service frr reload;
			echo "frr service reload request sent"
			exit 0;
		fi
	fi
	COUNTER=$((COUNTER+1));
done
echo "Failed to request frr service reload: request NOT sent"
exit 1;

[Unit]
After=network.target

[Service]
ExecStart=/usr/local/bin/restart-frr.sh

[Install]
WantedBy=default.target

Note: The script is called restart, but after some testing, I realised that reload was enough.
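
The wait-then-reload pattern in the script can also be written as a small generic retry loop. This is only a sketch of the same idea under my own naming (retry is not an existing command); systemctl is-active --quiet and service frr reload are the real commands it would wrap.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND...: run COMMAND until it succeeds, at most
# ATTEMPTS times, sleeping DELAY seconds between failed attempts.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        n=$((n + 1))
        # Only sleep if another attempt is coming.
        if [ "$n" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Usage: wait up to ~5 seconds for frr to be running, then reload it.
#   retry 5 1 systemctl is-active --quiet frr && service frr reload
```

Note this version only waits for the service, not for a tbt interface to be UP, so it is not a drop-in replacement for the script above.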

To all: thank you for sharing your experience, its a great help & motivation to figure out what's going sideways 😄

@nicedevil007

you have a typo in your gist. to find it much easier I just marked it with "###" around it

allow-hotplug en05
iface en05 inet manual
       mtu 65520

iface en05 ###inet6### manual
        mtu 65520

allow-hotplug en06
iface en06 inet manual
        mtu 65520

iface en06 inet6 manual
        mtu 65520

should look like this, or am I wrong?

allow-hotplug en05
iface en05 inet manual
       mtu 65520

iface en05 inet5 manual
        mtu 65520

allow-hotplug en06
iface en06 inet manual
        mtu 65520

iface en06 inet6 manual
        mtu 65520

@scyto
Author

scyto commented Jun 10, 2024

you have a typo in your gist. to find it much easier I just marked it with "###" around it

iface en05 ###inet6### manual
        mtu 65520

should look like this, or am I wrong?


iface en05 inet5 manual
        mtu 65520

It's not a typo and you are wrong :-) - inet6 is how you refer to the IPv6 settings (see https://wiki.debian.org/NetworkConfiguration)

I do note that on my current running system the inet6 has been removed - I suspect this is a version difference / done during an upgrade, as you can see here; I assume it is now an implied setting if not mentioned.

For reference this is my current running interfaces file off of one node (node 3), note i have things here that are not required - like my ipv6 bridge to the LAN and my wireless lan settings :-)


auto lo
iface lo inet loopback

auto lo:0
iface lo:0 inet static
        address 10.0.0.83/32

auto lo:6
iface lo:6 inet static
        address fc00::83/128

iface enp86s0 inet manual

auto enp87s0
iface enp87s0 inet manual

iface wlo1 inet manual

allow-hotplug en05
iface en05 inet manual
        mtu 65520

allow-hotplug en06
iface en06 inet manual
        mtu 65520

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.83/24
        gateway 192.168.1.1
        bridge-ports enp86s0
        bridge-stp off
        bridge-fd 0

iface vmbr0 inet6 static
        address 2600:dead:beef:1::83/64
        gateway 2600:dead:beef:1::1



post-up /usr/bin/systemctl restart frr.service

@scyto
Author

scyto commented Jun 11, 2024

I guess it was one glass of wine too much.... inet5... have to lol about my self, sry for missleading anyone else that read my part. but now I have to figure out why my TB network isn't finding all devices :(

hehe, don't worry about it, I saw how you arrived at your assumption, quite logical. Asking questions is how we all learn, including me; for example, your question meant I just learnt that on my system the inet6 isn't required. I know it was at one point - I was messing around with also having fixed IPs on those interfaces (not the loopbacks) and there the inet6 was absolutely required.

I hesitate to edit the gist to remove the inet6 in case it breaks someone (there is no harm having it), but if you find it works without it let me know and I will edit - fewer things to put in config files is good...

.... oh, I know, it might have been Proxmox that put those in originally because I was using the UI to do things with IPv6.... not sure....

@nicedevil007

So you guess it is no longer necessary to use the iface lo:6?
Yesterday I was able to get it up and running with IPv6 but I would love to just use IPv4 ofc... so much easier to understand (at least for me).

@nicedevil007

@scyto

Tip

Today I figured out the best way to make sure everything comes up and works again. (Because I didn't know how to troubleshoot in the past, I reinstalled my whole NUC cluster about 4-5 times.... which led me to my own private gitea repo, where I can copy-paste most of the commands more easily than from here.)

If you don't mind, I want to post it here so you can use it or even change parts of your commands. Some parts are taken from other users' ideas here.

Caution

This is done with the Intel NUCs that @scyto is using! I took the same IP-addresses/interface names.
It is all done with ONLY IPv4. No need for IPv6.

How to get Thunderbolt Network up and running

Main idea is from here, but I like to be able to copy-paste a bit more comfortably.

Assumptions

This manual was used on Intel NUC 13th generation with 2 TB4 Ports.

On all Nodes

Optional package to track which node can see which other one.

apt install -y lldpd

Mandatory packages.

apt install -y lsb-release
# run as root; sudo is not installed on a default Proxmox node
curl -s https://deb.frrouting.org/frr/keys.gpg | tee /usr/share/keyrings/frrouting.gpg > /dev/null
FRRVER="frr-stable"
echo deb '[signed-by=/usr/share/keyrings/frrouting.gpg]' https://deb.frrouting.org/frr \
     $(lsb_release -s -c) $FRRVER | tee -a /etc/apt/sources.list.d/frr.list
apt update
apt install -y frr

Add kernel modules.

# remove empty lines
sed '/^$/d' /etc/modules > temp.txt && mv temp.txt /etc/modules

# add modules
tee -a /etc/modules <<EOF
thunderbolt
thunderbolt-net
EOF

Add the thunderbolt links.

tee -a /etc/systemd/network/00-thunderbolt0.link <<EOF
[Match]
Path=pci-0000:00:0d.2
Driver=thunderbolt-net
[Link]
MACAddressPolicy=none
Name=en05
EOF

tee -a /etc/systemd/network/00-thunderbolt1.link <<EOF
[Match]
Path=pci-0000:00:0d.3
Driver=thunderbolt-net
[Link]
MACAddressPolicy=none
Name=en06
EOF

Automatic setup of interface to be up after reboot or cable insertion.

tee -a /etc/udev/rules.d/10-tb-en.rules <<EOF
ACTION=="move", SUBSYSTEM=="net", KERNEL=="en05", RUN+="/usr/local/bin/pve-en05.sh"
ACTION=="move", SUBSYSTEM=="net", KERNEL=="en06", RUN+="/usr/local/bin/pve-en06.sh"
EOF

tee -a /usr/local/bin/pve-en05.sh <<EOF
#!/bin/bash
/usr/sbin/ifup en05
EOF

tee -a /usr/local/bin/pve-en06.sh <<EOF
#!/bin/bash
/usr/sbin/ifup en06
EOF

chmod +x /usr/local/bin/pve-en05.sh
chmod +x /usr/local/bin/pve-en06.sh

Enable IPv4 forwarding.

sed -i "s/\#net.ipv4.ip_forward\=1/net.ipv4.ip_forward\=1/" /etc/sysctl.conf
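
Note that the sed line only edits the file; the setting is picked up at the next boot (or by sysctl -p). A rough, re-runnable variant that also reports whether the line was really there to uncomment might look like this. enable_ip_forward is a hypothetical helper name.

```shell
#!/bin/sh
# enable_ip_forward FILE: uncomment "#net.ipv4.ip_forward=1" in FILE.
# Hypothetical helper; FILE would be /etc/sysctl.conf on a node.
enable_ip_forward() {
    work=$(mktemp) || return 1
    sed 's/^#[[:space:]]*net\.ipv4\.ip_forward=1/net.ipv4.ip_forward=1/' "$1" > "$work" &&
        cat "$work" > "$1"
    rc=$?
    rm -f "$work"
    # Succeed only if an active (uncommented) setting is now present.
    [ "$rc" -eq 0 ] && grep -q '^net\.ipv4\.ip_forward=1' "$1"
}

# Usage:
#   enable_ip_forward /etc/sysctl.conf && sysctl -p   # sysctl -p loads the file now
#   sysctl net.ipv4.ip_forward                        # should report 1
```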

Presetup and configuration of FRR.

sed -i "s/fabricd=no/fabricd=yes/" /etc/frr/daemons
systemctl restart frr

Make sure interface is coming up!
Idea coming from here => https://gist.github.com/scyto/67fdc9a517faefa68f730f82d7fa3570?permalink_comment_id=5077802#gistcomment-5077802

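# the delimiter is quoted ('EOF') so that ${COUNTER}, $(...) and $2 are written
# to the file literally instead of being expanded while the file is created
tee -a /usr/local/bin/restart-frr.sh <<'EOF'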
#!/bin/sh
# Delayed start script to tell frr to reload ensuring that it sees thunderbolt links towards other nodes.
# condition: is there any tbt network interface and frr service up
COUNTER=0
while [ ${COUNTER} -lt 5 ]; do
        sleep 1;
        TEST=$(ip a | grep ": en0" | grep "UP" | awk 'BEGIN { ORS=""}; {print $2}')
        if [ ${#TEST} -ge 2 ]; then
                TEST_SVC=$(service frr status | grep "active (running)")
                if [ ${#TEST_SVC} -ge 2 ]; then
                        service frr reload;
                        echo "frr service reload request sent"
                        exit 0;
                fi
        fi
        COUNTER=$((COUNTER+1));
done
echo "Failed to request frr service reload: request NOT sent"
exit 1;
EOF
chmod +x /usr/local/bin/restart-frr.sh

# create systemd service and make it autoboot
tee -a /etc/systemd/system/frr-restarter.service <<EOF
[Unit]
After=network.target

[Service]
ExecStart=/usr/local/bin/restart-frr.sh

[Install]
WantedBy=default.target
EOF

systemctl daemon-reload
systemctl enable frr-restarter
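
A note on writing scripts through a heredoc: if the delimiter is unquoted (<<EOF), the shell expands ${COUNTER}, $(...) and $2 at the moment the file is written, so the file on disk no longer contains them; quoting the delimiter (<<'EOF') writes the text exactly as typed. A minimal demonstration with a throwaway variable:

```shell
#!/bin/sh
COUNTER=5

# Unquoted delimiter: ${COUNTER} is expanded while the text is being written.
unquoted=$(cat <<EOF
while [ ${COUNTER} -lt 5 ]; do
EOF
)

# Quoted delimiter: the text is written exactly as typed.
quoted=$(cat <<'EOF'
while [ ${COUNTER} -lt 5 ]; do
EOF
)

printf '%s\n%s\n' "$unquoted" "$quoted"
```

If in doubt, cat /usr/local/bin/restart-frr.sh after creating it and check that the $ expressions are still there.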

Different settings per Node!

Adjust /etc/network/interfaces. Remove any section that belongs to an automatically added thunderbolt0 or thunderbolt1 interface.

Node 1

sed -i '/iface lo inet loopback/a\
\
auto lo:0\niface lo:0 inet static\n        address 10.0.0.81/32' /etc/network/interfaces
sed -i '/^source \/etc\/network\/interfaces\.d\/\*$/d' /etc/network/interfaces
sed '${/^$/d;}' /etc/network/interfaces > temp.txt && mv temp.txt /etc/network/interfaces

tee -a /etc/network/interfaces <<EOF
auto en05
allow-hotplug en05
iface en05 inet manual
       mtu 65520

auto en06
allow-hotplug en06
iface en06 inet manual
       mtu 65520
EOF

Node 2

sed -i '/iface lo inet loopback/a\
\
auto lo:0\niface lo:0 inet static\n        address 10.0.0.82/32' /etc/network/interfaces
sed -i '/^source \/etc\/network\/interfaces\.d\/\*$/d' /etc/network/interfaces
sed '${/^$/d;}' /etc/network/interfaces > temp.txt && mv temp.txt /etc/network/interfaces

tee -a /etc/network/interfaces <<EOF
allow-hotplug en05
iface en05 inet manual
       mtu 65520

allow-hotplug en06
iface en06 inet manual
       mtu 65520
EOF

Node 3

sed -i '/iface lo inet loopback/a\
\
auto lo:0\niface lo:0 inet static\n        address 10.0.0.83/32' /etc/network/interfaces
sed -i '/^source \/etc\/network\/interfaces\.d\/\*$/d' /etc/network/interfaces
sed '${/^$/d;}' /etc/network/interfaces > temp.txt && mv temp.txt /etc/network/interfaces

tee -a /etc/network/interfaces <<EOF
allow-hotplug en05
iface en05 inet manual
       mtu 65520

allow-hotplug en06
iface en06 inet manual
       mtu 65520
EOF
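
The three node blocks only really differ in the loopback address. As a sketch, assuming the 10.0.0.81 to 10.0.0.83 scheme used above (loopback_stanza is a hypothetical helper name), the per-node stanza can be generated from the node number:

```shell
#!/bin/sh
# loopback_stanza NODE: print the lo:0 stanza for node 1, 2 or 3,
# following the 10.0.0.8N/32 scheme used above.
loopback_stanza() {
    case $1 in
        1|2|3) ;;
        *) echo "node must be 1, 2 or 3" >&2; return 1 ;;
    esac
    printf 'auto lo:0\niface lo:0 inet static\n        address 10.0.0.8%s/32\n' "$1"
}

# Usage: loopback_stanza 2
```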

Open VTYSH CLI.

vtysh

Enter config mode.

configure

Node 1

ip forwarding
!
interface en05
ip router openfabric 1
exit
!
interface en06
ip router openfabric 1
exit
!
interface lo
ip router openfabric 1
openfabric passive
exit
!
router openfabric 1
net 49.0000.0000.0001.00
exit
!

end
write memory
exit

# Doublecheck correct config
vtysh -c "show running-config"

Node 2

ip forwarding
!
interface en05
ip router openfabric 1
exit
!
interface en06
ip router openfabric 1
exit
!
interface lo
ip router openfabric 1
openfabric passive
exit
!
router openfabric 1
net 49.0000.0000.0002.00
exit
!

end
write memory
exit

# Doublecheck correct config
vtysh -c "show running-config"

Node 3

ip forwarding
!
interface en05
ip router openfabric 1
exit
!
interface en06
ip router openfabric 1
exit
!
interface lo
ip router openfabric 1
openfabric passive
exit
!
router openfabric 1
net 49.0000.0000.0003.00
exit
!

end
write memory
exit

# Doublecheck correct config
vtysh -c "show running-config"
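
Likewise, the FRR configs for the three nodes are identical apart from the NET. A small sketch that prints the per-node commands in the same order as above, so you can compare them or paste them into the vtysh configure session (both function names are my own, not part of FRR):

```shell
#!/bin/sh
# openfabric_net NODE: print the NET used above for a given node number.
openfabric_net() {
    printf '49.0000.0000.%04d.00\n' "$1"
}

# frr_node_config NODE: print the per-node config commands.
frr_node_config() {
    printf 'ip forwarding\n!\n'
    for ifc in en05 en06; do
        printf 'interface %s\nip router openfabric 1\nexit\n!\n' "$ifc"
    done
    printf 'interface lo\nip router openfabric 1\nopenfabric passive\nexit\n!\n'
    printf 'router openfabric 1\nnet %s\nexit\n!\n' "$(openfabric_net "$1")"
}

# Usage: frr_node_config 3
```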

Time for the reboot.

/sbin/reboot

Debugging

# shows the actual configuration
vtysh -c "show running-config"
# shows all links
vtysh -c "show openfabric topology"
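
Beyond the two vtysh commands, a simple reachability check of the other nodes' loopbacks can help. This is a sketch assuming the 10.0.0.81 to 10.0.0.83 addresses from above; peer_addrs is a hypothetical helper.

```shell
#!/bin/sh
# peer_addrs SELF: print the loopback addresses of the other two nodes,
# where SELF is the last octet of this node's own address (81, 82 or 83).
peer_addrs() {
    for octet in 81 82 83; do
        [ "$octet" = "$1" ] || printf '10.0.0.%s\n' "$octet"
    done
}

# Usage on node 1:
#   for a in $(peer_addrs 81); do
#       if ping -c 1 -W 1 "$a" >/dev/null; then echo "$a reachable"; else echo "$a unreachable"; fi
#   done
```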

@scyto
Author

scyto commented Jun 20, 2024

@scyto
nice work, can you fork, its a little long for a comment...?

@scyto
Author

scyto commented Jun 20, 2024

and you don't need to wait for the link to come up - just have the restart at the bottom of the /etc/network/interfaces file and it should 'just work'. I have no idea why you are seeing issues unless your hardware is fundamentally different.
