Hi,
I have the same issue here.
I installed Proxmox on my NUC 11 two months ago.
My setup is simple:
1 NIC with this config:
Code:
auto lo
iface lo inet loopback

iface enp88s0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.10.6/24
        gateway 192.168.10.1
        bridge-ports enp88s0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
A few weeks ago I ran into some weird behaviour (network instability involving a VM hosted on Proxmox that acts as the DHCP server).
Today the same issue again: some clients lost connectivity because DHCP requests went unanswered. Proxmox itself is up and reachable (I can ping it), and the VMs behind it show no warning messages.
I tried rebooting the VM, etc., but nothing helped until I rebooted the Proxmox host completely. I still don't understand why this happens (it can take a few days or weeks to appear), and I saw it on both 7.3 and 7.4 (two different NUCs).
Please find below the only lines that appear every time I hit this issue:
Code:
Apr 2 00:50:25 nuc11 kernel: [624233.977334] x86/split lock detection: #AC: CPU 0/KVM/1069 took a split_lock trap at address: 0xfffff8074401e643
Apr 2 01:00:25 nuc11 kernel: [624834.025564] x86/split lock detection: #AC: CPU 0/KVM/1069 took a split_lock trap at address: 0xfffff8074401e643
Apr 2 01:02:25 nuc11 kernel: [624954.027989] x86/split lock detection: #AC: CPU 0/KVM/1069 took a split_lock trap at address: 0xfffff8074401e643
.........
Apr 7 10:12:53 nuc11 kernel: [1089987.046048] x86/split lock detection: #AC: CPU 0/KVM/1069 took a split_lock trap at address: 0xfffff8074401e643
Apr 7 11:46:53 nuc11 kernel: [1095627.455498] x86/split lock detection: #AC: CPU 0/KVM/1069 took a split_lock trap at address: 0xfffff8074401e643
Apr 7 11:52:53 nuc11 kernel: [1095987.477464] igc 0000:58:00.0 enp88s0: NIC Link is Down
Apr 7 11:52:53 nuc11 kernel: [1095987.477493] vmbr0: port 1(enp88s0) entered disabled state
Apr 7 11:56:47 nuc11 kernel: [1096220.987351] nfs: server 192.168.2.13 not responding, timed out
Apr 7 11:59:47 nuc11 kernel: [1096401.017130] nfs: server 192.168.2.13 not responding, timed out
Apr 7 11:59:52 nuc11 kernel: [1096406.141064] nfs: server 192.168.2.13 not responding, timed out
Apr 7 12:00:23 nuc11 kernel: [1096436.830444] igc 0000:58:00.0 enp88s0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Apr 7 12:00:23 nuc11 kernel: [1096436.830709] vmbr0: port 1(enp88s0) entered blocking state
Apr 7 12:00:23 nuc11 kernel: [1096436.830714] vmbr0: port 1(enp88s0) entered forwarding state
Apr 7 12:06:53 nuc11 kernel: [1096827.529892] x86/split lock detection: #AC: CPU 0/KVM/1069 took a split_lock trap at address: 0xfffff8074401e643
Apr 7 12:20:53 nuc11 kernel: [1097667.577582] x86/split lock detection: #AC: CPU 0/KVM/1069 took a split_lock trap at address: 0xfffff8074401e643
Apr 7 12:37:13 nuc11 kernel: [1098647.569034] device tap100i1 entered promiscuous mode
Apr 7 12:37:13 nuc11 kernel: [1098647.608016] vmbr0: port 11(fwpr100p1) entered blocking state
Apr 7 12:37:13 nuc11 kernel: [1098647.608019] vmbr0: port 11(fwpr100p1) entered disabled state
Apr 7 12:37:13 nuc11 kernel: [1098647.608071] device fwpr100p1 entered promiscuous mode
Apr 7 12:37:13 nuc11 kernel: [1098647.608445] vmbr0: port 11(fwpr100p1) entered blocking state
Apr 7 12:37:13 nuc11 kernel: [1098647.608446] vmbr0: port 11(fwpr100p1) entered forwarding state
Apr 7 12:37:13 nuc11 kernel: [1098647.625312] fwbr100i1: port 1(fwln100i1) entered blocking state
Apr 7 12:37:13 nuc11 kernel: [1098647.625315] fwbr100i1: port 1(fwln100i1) entered disabled state
Apr 7 12:37:13 nuc11 kernel: [1098647.625364] device fwln100i1 entered promiscuous mode
Apr 7 12:37:13 nuc11 kernel: [1098647.625389] fwbr100i1: port 1(fwln100i1) entered blocking state
Apr 7 12:37:13 nuc11 kernel: [1098647.625390] fwbr100i1: port 1(fwln100i1) entered forwarding state
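In case it helps anyone debugging the same thing: since the log shows the igc NIC link going down and back up on its own, it may be worth checking how often the link has flapped and what the driver/firmware versions are (the igc driver for I225/I226 NICs has had link-stability fixes in newer kernels). A sketch of the checks I would run; the interface name `enp88s0` is from my config above, adjust it to yours:

```shell
# Count how often the igc link has dropped since boot:
journalctl -k | grep -c "enp88s0: NIC Link is Down"

# Current link state, speed and duplex (compare with the switch port):
ethtool enp88s0 | grep -E "Speed|Duplex|Link detected"

# Driver and firmware versions of the igc NIC:
ethtool -i enp88s0
```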
I am not able to understand or explain these logs (forwarding, blocking, etc. point me toward spanning tree on the network side, but it can't be that, since nothing was changed on the network and there is no L2 issue anywhere else).
If someone has an idea about the root cause, I am interested. I have already checked the network side (switch, etc.), so I think the issue is on the server side.
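One note on the recurring `x86/split lock detection` line: as far as I understand, it is usually a harmless warning about a guest doing atomic operations across cache lines, not a crash, and it may be unrelated to the link drops. If you want to silence it to rule it out, the kernel has a documented `split_lock_detect=` boot parameter. A sketch, assuming a default Proxmox install booting via GRUB (back up `/etc/default/grub` first):

```shell
# Check whether the parameter is already set on the running kernel:
cat /proc/cmdline

# Add split_lock_detect=off to GRUB_CMDLINE_LINUX_DEFAULT in
# /etc/default/grub, e.g.:
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet split_lock_detect=off"
# then regenerate the GRUB config and reboot:
update-grub
```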