Hello,
I have a problem with my Proxmox Intel NUC cluster.
It consists of 3 Intel NUC 13 Pro machines and uses the 2 Thunderbolt ports on each to form a ring network for Ceph communication.
Proxmox/Ceph – Full Mesh HCI Cluster w/ Dynamic Routing
To handle the network routing, I am using FRR with OSPF.
To expose the Thunderbolt ports as network devices, I added the kernel modules "thunderbolt" and "thunderbolt-net" to "/etc/modules"
and renamed the interfaces from "thunderbolt0" / "thunderbolt1" to "en05" / "en06" to make them visible in the Proxmox GUI.
Connect 2 hosts via thunderbolt 3
To do so, I created 2 files in "/etc/systemd/network/": "10-thunderbolt0.link" and "10-thunderbolt1.link".
Example: 10-thunderbolt0.link
Code:
[Match]
Path=pci-0000:00:0d.2
Driver=thunderbolt-net
[Link]
MACAddressPolicy=none
MACAddress=02:89:12:b5:35:cf
Name=en05
In general it works! The network runs at 10 Gbit speed, writing to the NVMe Ceph storage is super fast, and moving a VM to another node works great.
BUT!
Problem:
Every time I reboot a node (all show the same behavior), the network interfaces en05 and en06 (aka Thunderbolt ports) are down.
Code:
5: en05: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 02:bf:2f:cf:19:a1 brd ff:ff:ff:ff:ff:ff
6: en06: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 02:ae:57:53:99:83 brd ff:ff:ff:ff:ff:ff
OK, I thought running "ip link set en05 up" and "ip link set en06 up" at startup would solve the problem, but no.
Code:
5: en05: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 02:bf:2f:cf:19:a1 brd ff:ff:ff:ff:ff:ff
inet6 fe80::bf:2fff:fecf:19a1/64 scope link
valid_lft forever preferred_lft forever
6: en06: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
link/ether 02:ae:57:53:99:83 brd ff:ff:ff:ff:ff:ff
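The outputs above show three different states: administratively down (no UP flag), up without carrier (NO-CARRIER), and up with carrier (LOWER_UP). A minimal sketch for telling them apart from the flags field of `ip -o link show`; `link_state` is a hypothetical helper name used only for illustration, not part of any tool:

```shell
#!/bin/sh
# link_state LINE: classify one line of `ip -o link show` output by its flags.
# Hypothetical helper for illustration only.
link_state() {
    # Extract the text between '<' and '>' (the interface flags).
    flags=$(printf '%s\n' "$1" | sed -n 's/^[^<]*<\([^>]*\)>.*$/\1/p')
    case ",$flags," in
        *,NO-CARRIER,*) echo "up-no-carrier" ;;  # admin up, no link to the peer
        *,LOWER_UP,*)   echo "up" ;;             # admin up and carrier present
        *,UP,*)         echo "up-unknown" ;;     # admin up, carrier not reported
        *)              echo "admin-down" ;;     # UP flag missing
    esac
}
```

On a node this could be called as `link_state "$(ip -o link show dev en05)"`. It only reports the state; it does not explain why the link stays without carrier.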
It turns out it matters in which order you enable the connection on each node:
Code:
does work!
on Nuc1:              -------> on Nuc2:
ip link set en05 up            ip link set en06 up

does not work
on Nuc1:              -------> on Nuc2:
ip link set en06 up            ip link set en05 up
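To make the ordering effect easier to observe, one could poll the carrier state after each "ip link set ... up". This is a rough sketch, not a fix: `wait_carrier` is a hypothetical helper, and the overridable `SYSFS_NET` variable is an assumption added so the function can be tried against a fake directory. It relies on the kernel's `/sys/class/net/<iface>/carrier` file, which reads "1" when the link is up.

```shell
#!/bin/sh
# Root of the per-interface sysfs tree; overridable for dry runs (assumption).
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# wait_carrier IFACE SECONDS: poll IFACE's carrier file once per second.
# Returns 0 as soon as it reads "1", 1 if SECONDS elapse first.
wait_carrier() {
    iface=$1
    remaining=$2
    while :; do
        # The read fails while the interface is administratively down or
        # missing; treat that the same as "no carrier".
        state=$(cat "$SYSFS_NET/$iface/carrier" 2>/dev/null || echo 0)
        [ "$state" = "1" ] && return 0
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
}
```

For example, `ip link set en05 up; wait_carrier en05 10 || echo "en05: no carrier after 10s"` on one node, then the matching port on the neighbor, shows which order actually brings the link up.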
The behavior is the same for every node!
But that's not all: to actually establish a TCP connection or ping another node,
after every reboot I have to change the IP address of en05 / en06 and restart the networking service (and then change it back to its original value).
Code:
systemctl restart networking.service
After that it works just fine and as intended.
I don't know what causes this; I found this workaround (by accident) to "solve" the problem.
Does anybody know why I have to reassign IP addresses to communicate over "en05" and "en06"?
Documentation
Proxmox version: 8.0.3
Frr version: 8.4.2
Code:
# FRR config, router ID different on each device
ip forwarding
!
router ospf
 ospf router-id 0.0.0.1
 log-adjacency-changes
exit
!
interface lo
 ip ospf area 0
exit
!
interface en05
 ip ospf area 0
 ip ospf network point-to-point
exit
!
interface en06
 ip ospf area 0
 ip ospf network point-to-point
exit
!