this gist is part of this series
This fixes issues I raised with the thunderbolt / thunderbolt-net maintainers (I will take everyone's thanks now, lol)
- install lldpctl with
apt install lldpd
- add the thunderbolt and thunderbolt-net kernel modules (this must be done on all nodes - yes I know it can sometimes work without, but the thunderbolt-net one has interesting behaviour so do as I say - add both ;-)
- open the modules file with
nano /etc/modules
and add both modules at the bottom of the file, one on each line
- save using Ctrl+X, then y, then Enter
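If you would rather script this step than edit the file by hand, the following is a minimal sketch. The helper name add_module is hypothetical (it is not part of any package), and it assumes that appending plain module names to /etc/modules is all that is required. It only appends a module when the exact line is not already there, so it should be safe to re-run on every node.

```shell
#!/bin/bash
# add_module FILE MODULE: append MODULE to FILE unless an identical line exists.
# (add_module is a hypothetical helper name used for illustration only.)
add_module() {
    local file="$1" module="$2"
    # -x matches whole lines only, -F treats the name literally, -q prints nothing
    if grep -qxF -- "$module" "$file" 2>/dev/null; then
        return 0
    fi
    # If the file is non-empty and lacks a trailing newline, add one first
    if [ -s "$file" ] && [ -n "$(tail -c1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
    printf '%s\n' "$module" >> "$file"
}

# Usage on each node (as root):
#   add_module /etc/modules thunderbolt
#   add_module /etc/modules thunderbolt-net
```

The whole-line match matters here: a plain substring search for thunderbolt would also match thunderbolt-net and skip the first module.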
Doing this means we don't have to give each thunderbolt interface a manual IPv6 address, and that these addresses stay constant no matter what.
Add the following to each node using nano /etc/network/interfaces
If you see any sections called thunderbolt0 or thunderbolt1, delete them at this point.
Now add the following (note we will set IP addresses in the UI):
allow-hotplug en05
iface en05 inet manual
    mtu 65520

iface en05 inet6 manual
    mtu 65520

allow-hotplug en06
iface en06 inet manual
    mtu 65520

iface en06 inet6 manual
    mtu 65520
If you see any thunderbolt sections, delete them from the file before you save it.
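To double-check that no stale thunderbolt0 / thunderbolt1 stanzas are left behind, something like the sketch below can list them. The function name find_tb_stanzas is made up for this example, and the pattern assumes stanzas start with auto, allow-hotplug or iface followed by a thunderboltN name; adjust it if your file looks different.

```shell
#!/bin/bash
# find_tb_stanzas FILE: print (with line numbers) any stanza lines that still
# reference a thunderboltN interface. Prints nothing if the file is clean.
# (find_tb_stanzas is a hypothetical helper name.)
find_tb_stanzas() {
    local file="$1"
    grep -nE '^[[:space:]]*(auto|allow-hotplug|iface)[[:space:]]+thunderbolt[0-9]+' "$file"
}

# Usage:
#   find_tb_stanzas /etc/network/interfaces
```

No output means there is nothing left to delete; the en05 / en06 stanzas are deliberately not matched.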
The renaming below is needed as Proxmox doesn't recognize the thunderbolt interface name. There are various methods to do this. This method was selected after trial and error because:
- the thunderboltX naming is not fixed to a port (it seems to be based on the sequence you plug the cables in)
- the MAC address of the interfaces changes with most cable insertion and removal events
- use the udevadm monitor command to find your device IDs when you insert and remove each TB4 cable. Yes, you can use other ways to do this; I recommend this one as it is a great way to understand what udev does - the command proved more useful to me than the syslog or the lspci command for troubleshooting thunderbolt issues and behaviours. In my case my two PCI paths are 0000:00:0d.2 and 0000:00:0d.3; if you bought the same hardware these will be the same on all 3 units. Don't assume your PCI device paths will be the same as mine.
- create a link file using
nano /etc/systemd/network/00-thunderbolt0.link
and enter the following content:
[Match]
Path=pci-0000:00:0d.2
Driver=thunderbolt-net
[Link]
MACAddressPolicy=none
Name=en05
- create a second link file using
nano /etc/systemd/network/00-thunderbolt1.link
and enter the following content:
[Match]
Path=pci-0000:00:0d.3
Driver=thunderbolt-net
[Link]
MACAddressPolicy=none
Name=en06
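Since your PCI paths may differ from mine, the two link files above can also be generated from whatever udevadm monitor showed you. This is only a sketch: tb_link_file is a hypothetical helper, and it assumes the address has the usual domain:bus:device.function shape (for example 0000:00:0d.2). It prints the same content as the files above, just with your values filled in.

```shell
#!/bin/bash
# tb_link_file PCI_ADDR NAME: print a systemd .link file that renames the
# thunderbolt-net interface found at PCI_ADDR to NAME.
# (tb_link_file is a hypothetical helper name used for illustration.)
tb_link_file() {
    local pci_addr="$1" name="$2"
    # Expect domain:bus:device.function, e.g. 0000:00:0d.2
    if ! printf '%s\n' "$pci_addr" \
        | grep -qE '^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$'; then
        echo "unexpected PCI address: $pci_addr" >&2
        return 1
    fi
    cat <<EOF
[Match]
Path=pci-$pci_addr
Driver=thunderbolt-net

[Link]
MACAddressPolicy=none
Name=$name
EOF
}

# Usage (as root), with the paths from this gist:
#   tb_link_file 0000:00:0d.2 en05 > /etc/systemd/network/00-thunderbolt0.link
#   tb_link_file 0000:00:0d.3 en06 > /etc/systemd/network/00-thunderbolt1.link
```

The address check is only there to catch copy-and-paste mistakes before a bad file is written.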
This section ensures that the interfaces will be brought up at boot or cable insertion with whatever settings are in /etc/network/interfaces - this shouldn't need to be done, it seems like a bug in the way thunderbolt networking is handled (I assume this is Debian-wide but haven't checked).
- create a udev rule to detect cable insertion using
nano /etc/udev/rules.d/10-tb-en.rules
with the following content:
ACTION=="move", SUBSYSTEM=="net", KERNEL=="en05", RUN+="/usr/local/bin/pve-en05.sh"
ACTION=="move", SUBSYSTEM=="net", KERNEL=="en06", RUN+="/usr/local/bin/pve-en06.sh"
- save the file
- create the first script referenced above using
nano /usr/local/bin/pve-en05.sh
with the following content:
#!/bin/bash
# this brings the renamed interface up and reprocesses any settings in /etc/network/interfaces for the renamed interface
/usr/sbin/ifup en05
save the file and then
- create the second script referenced above using
nano /usr/local/bin/pve-en06.sh
with the following content:
#!/bin/bash
# this brings the renamed interface up and reprocesses any settings in /etc/network/interfaces for the renamed interface
/usr/sbin/ifup en06
and save the file
- make both scripts executable with
chmod +x /usr/local/bin/*.sh
- Reboot (restarting networking, init 1 and init 3 are not good enough, so reboot)
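After the reboot it is worth confirming that the rename and the MTU actually took effect. A rough sketch follows, assuming the kernel exposes the interface under /sys/class/net/NAME with an mtu file (standard on Linux); check_tb_iface is a hypothetical helper, and its optional third argument exists only so the check can be pointed at a different sysfs root.

```shell
#!/bin/bash
# check_tb_iface NAME WANT_MTU [SYSFS_ROOT]: report whether interface NAME
# exists and has the expected MTU. SYSFS_ROOT defaults to /sys.
# (check_tb_iface is a hypothetical helper name.)
check_tb_iface() {
    local name="$1" want_mtu="$2" sysfs="${3:-/sys}"
    local dir="$sysfs/class/net/$name"
    local mtu
    if [ ! -d "$dir" ]; then
        # Rename did not happen, or the cable is not plugged in
        echo "$name: missing"
        return 1
    fi
    mtu=$(cat "$dir/mtu")
    if [ "$mtu" != "$want_mtu" ]; then
        echo "$name: mtu $mtu (expected $want_mtu)"
        return 1
    fi
    echo "$name: ok (mtu $mtu)"
}

# Usage:
#   check_tb_iface en05 65520
#   check_tb_iface en06 65520
```

A "missing" result with the cable plugged in points at the .link files; a wrong MTU points at the udev rule or the ifup scripts not having run.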
@luilegeant Seems like something normal then, but as of yesterday those notifications have gone away for me after doing the fix in the below comment. The thing for me is that my connection was never broken. I did encounter that in the beginning, but once I ran systemctl restart frr.service / unplugged and replugged, everything worked. At this point, I suspect that the post-up /usr/bin/systemctl restart frr.service line in /etc/network/interfaces was not running successfully, but I am unsure.
https://gist.github.com/scyto/4c664734535da122f4ab2951b22b2085?permalink_comment_id=5021706#gistcomment-5021706
Oh and last thing, I did have to go back and fix the order of how everything was connected. I followed the below.
using the numbers printed on the case of the intel13 nucs connect cables as follows (this is important):
My current errors are now the below.... Would like to see if the above works for you and if you start reporting similar issues like me.
May 23 06:34:10 pve fabricd[1074]: [QBAZ6-3YZR3] OpenFabric: Could not find two T0 routers
May 23 06:41:12 pve fabricd[1074]: [QBAZ6-3YZR3] OpenFabric: Could not find two T0 routers
May 23 06:44:55 pve systemd[1]: Starting apt-daily-upgrade.service - Daily apt upgrade and clean activities...
May 23 06:44:55 pve systemd[1]: apt-daily-upgrade.service: Deactivated successfully.
May 23 06:44:55 pve systemd[1]: Finished apt-daily-upgrade.service - Daily apt upgrade and clean activities.
May 23 06:47:55 pve fabricd[1074]: [NBV6R-CM3PT] OpenFabric: Needed to resync LSPDB using CSNP!
May 23 06:48:55 pve fabricd[1074]: [QBAZ6-3YZR3] OpenFabric: Could not find two T0 routers
May 23 06:55:37 pve fabricd[1074]: [QBAZ6-3YZR3] OpenFabric: Could not find two T0 routers
May 23 07:01:47 pve pmxcfs[1196]: [dcdb] notice: data verification successful
May 23 07:02:20 pve fabricd[1074]: [NBV6R-CM3PT] OpenFabric: Needed to resync LSPDB using CSNP!
May 23 07:03:20 pve fabricd[1074]: [QBAZ6-3YZR3] OpenFabric: Could not find two T0 routers
May 23 07:09:53 pve fabricd[1074]: [QBAZ6-3YZR3] OpenFabric: Could not find two T0 routers
May 23 07:16:36 pve fabricd[1074]: [NBV6R-CM3PT] OpenFabric: Needed to resync LSPDB using CSNP!
May 23 07:17:01 pve CRON[480274]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
May 23 07:17:01 pve CRON[480275]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
May 23 07:17:01 pve CRON[480274]: pam_unix(cron:session): session closed for user root
May 23 07:17:36 pve fabricd[1074]: [QBAZ6-3YZR3] OpenFabric: Could not find two T0 routers
May 23 07:24:17 pve fabricd[1074]: [QBAZ6-3YZR3] OpenFabric: Could not find two T0 routers
May 23 07:30:51 pve fabricd[1074]: [NBV6R-CM3PT] OpenFabric: Needed to resync LSPDB using CSNP!
May 23 07:31:51 pve fabricd[1074]: [QBAZ6-3YZR3] OpenFabric: Could not find two T0 routers
May 23 07:38:38 pve fabricd[1074]: [QBAZ6-3YZR3] OpenFabric: Could not find two T0 routers
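If you want to compare against your own nodes, a quick way to see how often each fabricd message shows up is to count them. This is just a sketch that assumes the log lines have the same shape as the ones pasted above (fabricd[PID]: [CODE] message); fabricd_summary is a name I made up.

```shell
#!/bin/bash
# fabricd_summary: read syslog/journal lines on stdin and print a count per
# distinct fabricd message, most frequent first (count<TAB>message).
# (fabricd_summary is a hypothetical helper name.)
fabricd_summary() {
    # Keep only fabricd lines, strip everything up to and including the
    # "[CODE] " tag, then count identical messages.
    grep -F 'fabricd[' \
        | sed -E 's/^.*fabricd\[[0-9]+\]: \[[A-Z0-9-]+\] //' \
        | sort | uniq -c | sort -rn \
        | awk '{ n = $1; $1 = ""; sub(/^ /, ""); print n "\t" $0 }'
}

# Usage:
#   journalctl --since today | fabricd_summary
```

Lines from other services (cron, apt, pmxcfs) are ignored, so the output only reflects the OpenFabric messages.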
Just out of curiosity, are you getting similar speeds to the below? I notice multiple people with MS-01s are getting higher, so I am wondering if it's a limitation of the USB4 on the UM790.
root@pve2:~# iperf3 -c 10.0.0.81 -B 10.0.0.82
Connecting to host 10.0.0.81, port 5201
[ 5] local 10.0.0.82 port 58585 connected to 10.0.0.81 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 1.39 GBytes 12.0 Gbits/sec 12 2.37 MBytes
[ 5] 1.00-2.00 sec 1.40 GBytes 12.0 Gbits/sec 0 2.50 MBytes
[ 5] 2.00-3.00 sec 1.39 GBytes 12.0 Gbits/sec 0 2.50 MBytes
[ 5] 3.00-4.00 sec 1.39 GBytes 11.9 Gbits/sec 0 2.50 MBytes
[ 5] 4.00-5.00 sec 1.39 GBytes 12.0 Gbits/sec 0 2.50 MBytes
[ 5] 5.00-6.00 sec 1.39 GBytes 12.0 Gbits/sec 0 2.50 MBytes
[ 5] 6.00-7.00 sec 1.39 GBytes 11.9 Gbits/sec 0 2.50 MBytes
[ 5] 7.00-8.00 sec 1.39 GBytes 11.9 Gbits/sec 0 2.50 MBytes
[ 5] 8.00-9.00 sec 1.39 GBytes 11.9 Gbits/sec 0 2.50 MBytes
[ 5] 9.00-10.00 sec 1.39 GBytes 12.0 Gbits/sec 0 2.50 MBytes
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 13.9 GBytes 12.0 Gbits/sec 12 sender
[ 5] 0.00-10.00 sec 13.9 GBytes 12.0 Gbits/sec receiver