Thunderbolt Network packets going out over public network

Allister

I have a 3-node cluster running Ceph over a Thunderbolt 4 mesh/ring network. That Thunderbolt network is 10.0.0.80/29.

I'm seeing denied packets on my firewall coming from the other Proxmox interface (the 10.16.0.0 network); you can see the denies in the screenshot below. How can I make packets for the Thunderbolt network stay on that network and not go out over the other network interface? If anyone has any info on this, I'd really appreciate some insight. Here is the network config from the first node (with a quick routing check sketched after it):

Code:
auto lo
iface lo inet loopback

auto lo:0
iface lo:0 inet static
        address 10.0.0.81/32

iface enp86s0 inet manual
iface wlo1 inet manual

allow-hotplug en05
iface en05 inet manual
        mtu 65520

allow-hotplug en06
iface en06 inet manual
        mtu 65520

auto vmbr0
iface vmbr0 inet static
        address 10.16.30.21/24
        gateway 10.16.30.1
        bridge-ports enp86s0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 10,20,35

source /etc/network/interfaces.d/*

Screenshot 2024-08-23 at 3.06.45 PM.png
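
A quick way to see how the host would route traffic toward another node's Thunderbolt address is ip route get, which prints the interface and source address the kernel would pick (a sketch; 10.0.0.82 is just an example peer from the /29):

Code:
# which interface/source address would be used to reach a Thunderbolt peer?
ip route get 10.0.0.82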
 
Looking at your configuration, there is no other way to reach 10.0.0.83 from this host than to use the only external interface you have: the 10.16.30.0/24 network, using the default gateway 10.16.30.1.

I'm not familiar with this special TB network you are referring to, and you don't have a 10.0.0.80/29 network connected to this host at all; you only have a loopback interface with the address 10.0.0.81/32. Should one of those en05 or en06 interfaces represent that TB network? You don't have any IP network configured on those interfaces.
 
Thanks for taking the time to reply. I've followed the gist for setting this up here:
https://gist.github.com/scyto/76e94832927a89d977ea989da157e9dc

Maybe the config could be improved so that nothing tries to go out over the public (10.16.0.0) network. Any ideas on what I might try to accomplish that?
 
There is a long list of comments about the configurations in that link, but I take it that an FRR routing daemon is taking care of routing between the 10.0.0.x loopback addresses (I think you are not using IPv6 for the connectivity, as you don't have "inet6" in your network configuration).

Can you show the outputs of the following commands:

Code:
ip a
ip route
ip -6 route
systemctl status frr
 
Here are the outputs of what you requested:

Code:
root@pve-node01:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet 10.0.0.81/32 scope global lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: enp86s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr0 state UP group default qlen 1000
    link/ether 48:21:0b:5a:59:75 brd ff:ff:ff:ff:ff:ff
3: wlo1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether dc:46:28:8d:57:3e brd ff:ff:ff:ff:ff:ff
    altname wlp0s20f3
4: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 48:21:0b:5a:59:75 brd ff:ff:ff:ff:ff:ff
    inet 10.16.30.21/24 scope global vmbr0
       valid_lft forever preferred_lft forever
    inet6 fe80::4a21:bff:fe5a:5975/64 scope link
       valid_lft forever preferred_lft forever
13: en05: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 02:13:cf:c4:68:ff brd ff:ff:ff:ff:ff:ff
    inet6 fe80::13:cfff:fec4:68ff/64 scope link
       valid_lft forever preferred_lft forever
33: en06: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 65520 qdisc pfifo_fast state DOWN group default qlen 1000
    link/ether 02:b9:1d:95:e9:e0 brd ff:ff:ff:ff:ff:ff
34: veth801i0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
    link/ether fe:cb:fe:a7:64:98 brd ff:ff:ff:ff:ff:ff link-netnsid 1
35: veth903i0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
    link/ether fe:59:fa:2a:4a:52 brd ff:ff:ff:ff:ff:ff link-netnsid 2
37: tap504i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master fwbr504i0 state UNKNOWN group default qlen 1000
    link/ether be:67:06:86:ac:a8 brd ff:ff:ff:ff:ff:ff
38: fwbr504i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 0a:25:5f:da:2b:29 brd ff:ff:ff:ff:ff:ff
39: fwpr504p0@fwln504i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
    link/ether ba:ae:a7:62:d7:fc brd ff:ff:ff:ff:ff:ff
40: fwln504i0@fwpr504p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr504i0 state UP group default qlen 1000
    link/ether 0a:25:5f:da:2b:29 brd ff:ff:ff:ff:ff:ff
41: tap101i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master fwbr101i0 state UNKNOWN group default qlen 1000
    link/ether 62:fe:68:90:64:45 brd ff:ff:ff:ff:ff:ff
42: fwbr101i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether d2:bd:ba:3c:cd:90 brd ff:ff:ff:ff:ff:ff
43: fwpr101p0@fwln101i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
    link/ether 32:94:b6:ee:4f:e3 brd ff:ff:ff:ff:ff:ff
44: fwln101i0@fwpr101p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr101i0 state UP group default qlen 1000
    link/ether d2:bd:ba:3c:cd:90 brd ff:ff:ff:ff:ff:ff
root@pve-node01:~#

Code:
root@pve-node01:~# ip route
default via 10.16.30.1 dev vmbr0 proto kernel onlink
10.0.0.82 nhid 10 via 10.0.0.82 dev en05 proto openfabric metric 20 onlink
10.0.0.83 nhid 10 via 10.0.0.82 dev en05 proto openfabric metric 20 onlink
10.16.30.0/24 dev vmbr0 proto kernel scope link src 10.16.30.21
root@pve-node01:~#

Code:
root@pve-node01:~# ip -6 route
fe80::/64 dev vmbr0 proto kernel metric 256 pref medium
fe80::/64 dev en05 proto kernel metric 256 pref medium
root@pve-node01:~#

Code:
root@pve-node01:~# systemctl status frr
● frr.service - FRRouting
     Loaded: loaded (/lib/systemd/system/frr.service; enabled; preset: enabled)
     Active: active (running) since Fri 2024-08-23 13:55:02 PDT; 24h ago
       Docs: https://frrouting.readthedocs.io/en/latest/setup.html
    Process: 93523 ExecStart=/usr/lib/frr/frrinit.sh start (code=exited, status=0/SUCCESS)
   Main PID: 93532 (watchfrr)
     Status: "FRR Operational"
      Tasks: 17 (limit: 74130)
     Memory: 23.5M
        CPU: 1min 4.718s
     CGroup: /system.slice/frr.service
             ├─93532 /usr/lib/frr/watchfrr -d -F traditional zebra bgpd staticd bfdd fabricd
             ├─93549 /usr/lib/frr/zebra -d -F traditional -A 127.0.0.1 -s 90000000
             ├─93554 /usr/lib/frr/bgpd -d -F traditional -A 127.0.0.1
             ├─93561 /usr/lib/frr/staticd -d -F traditional -A 127.0.0.1
             ├─93564 /usr/lib/frr/bfdd -d -F traditional -A 127.0.0.1
             └─93567 /usr/lib/frr/fabricd -d -F traditional -A 127.0.0.1


Aug 24 13:57:05 pve-node01 fabricd[93567]: [NBV6R-CM3PT] OpenFabric: Needed to resync LSPDB using CSNP!
Aug 24 13:58:05 pve-node01 fabricd[93567]: [QBAZ6-3YZR3] OpenFabric: Could not find two T0 routers
Aug 24 14:07:42 pve-node01 fabricd[93567]: [QBAZ6-3YZR3] OpenFabric: Could not find two T0 routers
Aug 24 14:11:04 pve-node01 fabricd[93567]: [QBAZ6-3YZR3] OpenFabric: Could not find two T0 routers
Aug 24 14:11:17 pve-node01 fabricd[93567]: [NBV6R-CM3PT] OpenFabric: Needed to resync LSPDB using CSNP!
Aug 24 14:12:17 pve-node01 fabricd[93567]: [QBAZ6-3YZR3] OpenFabric: Could not find two T0 routers
Aug 24 14:22:05 pve-node01 fabricd[93567]: [QBAZ6-3YZR3] OpenFabric: Could not find two T0 routers
Aug 24 14:25:19 pve-node01 fabricd[93567]: [QBAZ6-3YZR3] OpenFabric: Could not find two T0 routers
Aug 24 14:25:50 pve-node01 fabricd[93567]: [NBV6R-CM3PT] OpenFabric: Needed to resync LSPDB using CSNP!
Aug 24 14:26:51 pve-node01 fabricd[93567]: [QBAZ6-3YZR3] OpenFabric: Could not find two T0 routers
root@pve-node01:~#
 
Code:
10.0.0.82 nhid 10 via 10.0.0.82 dev en05 proto openfabric metric 20 onlink
10.0.0.83 nhid 10 via 10.0.0.82 dev en05 proto openfabric metric 20 onlink

So there is the "openfabric" protocol running, and this node has learnt 10.0.0.82 and 10.0.0.83 via en05 interface, which is up in the "ip addr" output.

en06 is not up however.

Looking at https://gist.github.com/scyto/67fdc...malink_comment_id=5093125#gistcomment-5093125, there is apparently OSI addressing in use on the en05 and en06 interfaces.

Have you configured and connected all the nodes and TB interfaces according to the instructions?
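
For comparison, the per-node FRR config in that gist-style setup is roughly shaped like the sketch below (not your actual config; the NET value is illustrative and must be unique per node). The point is that both en05 and en06 need their ip router openfabric lines, and lo is included as a passive interface so the 10.0.0.8x address gets advertised:

Code:
! /etc/frr/frr.conf (sketch of a gist-style openfabric setup; values illustrative)
interface en05
 ip router openfabric 1
!
interface en06
 ip router openfabric 1
!
interface lo
 ip router openfabric 1
 openfabric passive
!
router openfabric 1
 net 49.0000.0000.0001.00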
 
I believe I have. Just some background on the connections; also, when I get home later, I'll post my config files.

These three nodes are configured in a ring as follows:

Node 1, TB Port 1 (en05) —> Node 2, TB Port 2 (en06)
Node 2, TB Port 1 (en05) —> Node 3, TB Port 2 (en06)
Node 3, TB Port 1 (en05) —> Node 1, TB Port 2 (en06)
 
I think I found the problem: the en06 interface was missing from the vtysh config on node 3. I've added it and rebooted node 3, then node 1, and all three nodes now show the following ip route info:

Code:
root@pve-node01:~# ip route
default via 10.16.30.1 dev vmbr0 proto kernel onlink
10.0.0.82 nhid 10 via 10.0.0.83 dev en06 proto openfabric metric 20 onlink
10.0.0.83 nhid 10 via 10.0.0.83 dev en06 proto openfabric metric 20 onlink
10.16.30.0/24 dev vmbr0 proto kernel scope link src 10.16.30.21
root@pve-node01:~#

Code:
root@pve-node02:~# ip route
default via 10.16.30.1 dev vmbr0 proto kernel onlink
10.0.0.81 nhid 10 via 10.0.0.83 dev en05 proto openfabric metric 20 onlink
10.0.0.83 nhid 10 via 10.0.0.83 dev en05 proto openfabric metric 20 onlink
10.16.30.0/24 dev vmbr0 proto kernel scope link src 10.16.30.22
root@pve-node02:~#

Code:
root@pve-node03:~# ip route
default via 10.16.30.1 dev vmbr0 proto kernel onlink
10.0.0.81 nhid 13 via 10.0.0.81 dev en05 proto openfabric metric 20 onlink
10.0.0.82 nhid 14 via 10.0.0.82 dev en06 proto openfabric metric 20 onlink
10.16.30.0/24 dev vmbr0 proto kernel scope link src 10.16.30.23
root@pve-node03:~#

I do wonder whether I'll still see packets going out the other interface. Things work fine: Ceph is installed and running on the 10.0.0.80/29 network without any problems. I was just surprised to see those packets leave the other interface.

Is there anything else you'd like to see or have suggestions for?
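
In case it helps anyone following along, a couple of checks that correspond to what's shown above (a sketch; the show command names are assumed from FRR's fabricd):

Code:
# openfabric adjacencies and computed topology as FRR sees them
vtysh -c "show openfabric topology"
vtysh -c "show openfabric interface"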
 
It's interesting that only one machine uses both en05 and en06 at the same time. If I reboot that one, it seems to move to another node. Not sure if this is expected when the nodes are connected in a ring. Now that I think about it, it prevents the traffic from looping around and around (I think).
 
I'm not familiar with the openfabric protocol, but yes, it looks like it split the ring topology between node1 and node2.

Whatever process is sending those packets (in your screenshot) is using an incorrect source address: all the source addresses are 10.16.x.x, not the loopback addresses 10.0.0.x. Ceph is also unknown territory for me, but maybe someone else can comment on whether the relevant processes need to be separately configured to bind to the loopback address rather than the Ethernet address.
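
For reference, Ceph's choice of network is driven by the public_network / cluster_network settings (on Proxmox usually in /etc/pve/ceph.conf). A sketch of pinning both to the Thunderbolt subnet, assuming that is the intent; note that changing the public network after the monitors exist generally means re-creating them, so treat this as a pointer rather than a recipe:

Code:
# /etc/pve/ceph.conf (excerpt, sketch only)
[global]
        # keep Ceph client and replication traffic on the Thunderbolt /29
        public_network  = 10.0.0.80/29
        cluster_network = 10.0.0.80/29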
 
