Hello,
I'm facing some strange, random networking issues where LXCs on my PVE cluster are unable to communicate.
For instance, sometimes 10.0.10.51, which is an LXC, cannot reach 10.0.1.23, which is one of my switches.
When this occurs, I see no traffic at all coming in on the gateway (OPNsense; I made a packet capture and nothing shows up there), meaning the traffic is either not leaving the LXC or not leaving the network bridge. I believe I also tried a packet capture on the PVE host of the LXC and did not see any traffic on vmbr10 either...
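For what it's worth, the capture on the PVE host was roughly along these lines (run while the drop was active; 10.0.10.51 is the LXC, 10.0.1.23 the switch):
Bash:
# on the PVE host: watch the bridge the LXC sits on
tcpdump -ni vmbr10 host 10.0.10.51 and host 10.0.1.23
# and one layer down, on the VLAN interface feeding that bridge
tcpdump -ni vmbr1000.10 host 10.0.10.51 and host 10.0.1.23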
I notice this often thanks to my uptime-kuma instance running on this LXC, and I can't really understand why: there is a timeout (60 s) during which Uptime Kuma is unable to either ping or curl (HTTP) the switch, and without me doing anything it starts working again a few minutes later...
The LXC in question is an Ubuntu jammy container attached with a static IP to vmbr10; the PVE host is running v8.1.3 on kernel 6.5.11-7-pve.
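For completeness, the container's network definition is essentially of this shape (container ID and MAC omitted, so treat the exact line as illustrative):
Bash:
# /etc/pve/lxc/<CTID>.conf (excerpt)
net0: name=eth0,bridge=vmbr10,gw=10.0.10.1,ip=10.0.10.51/24,type=veth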
While this is occurring, I can reproduce it over SSH inside the LXC and confirm communication is indeed down. During the same window I was able to SSH onto my OPNsense gateway and confirm it can ping or curl the switch with no problem, so if OPNsense were receiving the packets from the LXC, it would pass them along correctly...
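The kind of thing I check from inside the LXC while it is down is roughly this:
Bash:
# inside the LXC during an outage
ping -c 3 10.0.10.1        # can I still reach the gateway on the same bridge?
ping -c 3 10.0.1.23        # the switch behind the gateway
ip neigh show 10.0.10.1    # is the gateway's ARP/neighbour entry still REACHABLE?
ip route get 10.0.1.23     # which interface and next hop are actually used?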
Uptime Kuma is running inside Docker inside the LXC, and I believe I have similar issues within Docker networking itself (some requests time out between my traefik instance and the gitea container, for example...), but that seems unrelated since it stays within Docker itself...
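On that side I have only done basic checks so far, something like this (assuming the containers are literally named traefik and gitea and share a user-defined network):
Bash:
docker network ls
docker inspect -f '{{json .NetworkSettings.Networks}}' traefik
docker inspect -f '{{json .NetworkSettings.Networks}}' gitea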
The host is an 8365U, so powerful enough; it sits around 30% CPU usage, with no swapping thanks to the 32 GB of RAM I added. It is quite busy running around 100 containers total, some in LXCs, some in VMs, but overall there is no slowness or anything besides these random network dropouts.
I recently tried to increase ulimit -n 99999 (it was 1024 everywhere), but it doesn't seem to make any difference... Any idea?
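What I changed looks roughly like this (the lxc.prlimit line is my attempt at raising the limit for the container itself, and <CTID> is a placeholder; not sure it is the right knob):
Bash:
# on the PVE host, for the container (placeholder CTID)
echo 'lxc.prlimit.nofile: 99999' >> /etc/pve/lxc/<CTID>.conf
# inside the LXC, for anything going through pam_limits
echo '* soft nofile 99999' >> /etc/security/limits.conf
echo '* hard nofile 99999' >> /etc/security/limits.conf
# verify after a restart / re-login
ulimit -n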
Here is my /etc/network/interfaces:
Bash:
auto lo
iface lo inet loopback

auto enp1s0
iface enp1s0 inet manual
    mtu 9000
#eth0

auto enp2s0
iface enp2s0 inet manual
    mtu 9000
#eth1

auto enp3s0
iface enp3s0 inet manual
    mtu 9000
#eth2

auto enp4s0
iface enp4s0 inet manual
    mtu 9000
#eth3

auto enp5s0
iface enp5s0 inet manual
    mtu 9000
#eth4

auto enp6s0
iface enp6s0 inet manual
    mtu 9000
#eth5

iface enx00e04c534458 inet manual

auto bond1
iface bond1 inet manual
    bond-slaves enp5s0 enp6s0
    bond-miimon 100
    bond-mode balance-xor
    bond-xmit-hash-policy layer3+4
    mtu 9000
#LAGG_WAN

auto bond0
iface bond0 inet manual
    bond-slaves enp1s0 enp2s0 enp3s0 enp4s0
    bond-miimon 100
    bond-mode balance-xor
    bond-xmit-hash-policy layer3+4
    mtu 9000
#LAGG_Switch

auto vmbr1000
iface vmbr1000 inet manual
    bridge-ports bond0
    bridge-stp on
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 1-4094
    mtu 9000
#Bridge All VLANs to SWITCH

auto vmbr2000
iface vmbr2000 inet manual
    bridge-ports bond1
    bridge-stp on
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 1-4094
    mtu 9000
#Bridge WAN

auto vmbr1000.10
iface vmbr1000.10 inet manual
    mtu 9000
#VMs

auto vmbr1000.99
iface vmbr1000.99 inet manual
    mtu 9000
#VMs

auto vmbr10
iface vmbr10 inet static
    address 10.0.10.9/24
    gateway 10.0.10.1
    bridge-ports vmbr1000.10
    bridge-stp off
    bridge-fd 0
    post-up ip rule add from 10.0.10.0/24 table 10Server prio 1
    post-up ip route add default via 10.0.10.1 dev vmbr10 table 10Server
    post-up ip route add 10.0.10.0/24 dev vmbr10 table 10Server
    mtu 9000

auto vmbr99
iface vmbr99 inet static
    address 10.0.99.9/24
    gateway 10.0.99.1
    bridge-ports vmbr1000.99
    bridge-stp off
    bridge-fd 0
    post-up ip rule add from 10.0.99.0/24 table 99Test prio 1
    post-up ip route add default via 10.0.99.1 dev vmbr99 table 99Test
    post-up ip route add 10.0.99.0/24 dev vmbr99 table 99Test
    mtu 9000
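While a drop is happening I also look at the bond and bridge state directly on the host, roughly like this:
Bash:
# bond health: are all slaves up, any recent link flaps?
cat /proc/net/bonding/bond0
# is the LXC's veth still enslaved to vmbr10, and does the bridge know the MACs?
bridge link show | grep vmbr10
bridge fdb show br vmbr10 | head
# VLAN membership on the vlan-aware bridge side
bridge vlan show dev bond0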
I do have the proper tables created, I believe:
Bash:
root@pve:~ # cat /etc/iproute2/rt_tables.d/200_10Server.conf
200 10Server
root@pve:~ # cat /etc/iproute2/rt_tables.d/204_99Test.conf
204 99Test
root@pve:~ #
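And as far as I can tell the rules and routes that end up active match what the post-up lines should create; this is roughly how I check them:
Bash:
ip rule show                    # expect the prio 1 rules for 10.0.10.0/24 and 10.0.99.0/24
ip route show table 10Server    # expect: default via 10.0.10.1 dev vmbr10, plus the connected route
ip route show table 99Test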