[SOLVED] Network Issues after updating

Kaoplo

New Member
Jul 3, 2024
Hello Proxmox Community,

I recently moved from Proxmox VE 7 to 8 with a clean install and did not retain any previous applications, configs, etc. I am now running into an issue where the system drops its network connection every 30 minutes or so. The only fixes I know of right now are restarting the node or waiting ~30 minutes, neither of which is ideal. This has to be an issue with my network configuration or with Proxmox VE 8, since I am unable to replicate it on other distributions.
I have looked at the dmesg and journalctl logs, but neither gives any hint as to why this could be happening. If requested, I can post them here, along with any other files that could help solve the issue.

My system is a late 2012 Mac Mini. It's a weird machine to run Proxmox on, but it has been working fine up until now.

UPDATE:
The issue is caused by `intel_iommu` being enabled by default on kernels 6.8 and above.

SOLUTION:
For GRUB users:
Edit `/etc/default/grub` with your preferred text editor and add `intel_iommu=off` to the end of `GRUB_CMDLINE_LINUX_DEFAULT`, inside the quotes. Then run `update-grub` and reboot your node.
For Systemd-boot users:
The kernel commandline needs to be placed as one line in /etc/kernel/cmdline. To apply your changes, run proxmox-boot-tool refresh, which sets it as the option line for all config files in loader/entries/proxmox-*.conf.
Append `intel_iommu=off` to the end of the existing line in `/etc/kernel/cmdline` (the file must stay a single line), then run `proxmox-boot-tool refresh` and reboot your node.
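
As a rough sketch, both edits can be scripted so that the parameter is only added once. The helper names `add_grub_param` and `add_cmdline_param` are made up for this example and are not part of Proxmox; the `sed` expressions assume that `GRUB_CMDLINE_LINUX_DEFAULT` is a single double-quoted line and that `/etc/kernel/cmdline` holds exactly one line. Back up both files before trying anything like this.

```shell
# Hypothetical helpers (illustrative names, not part of Proxmox).

# Append PARAM inside the quotes of GRUB_CMDLINE_LINUX_DEFAULT in FILE,
# unless that line already contains it.
add_grub_param() {
    local file="$1" param="$2"
    grep -E '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param" && return 0
    sed -i -E "s|^(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*)\"|\1 ${param}\"|" "$file"
}

# Append PARAM to the end of the first (and only) line of FILE,
# unless the file already contains it. No new line is created.
add_cmdline_param() {
    local file="$1" param="$2"
    grep -qF -- "$param" "$file" && return 0
    sed -i "1 s|\$| ${param}|" "$file"
}

# Usage (as root), GRUB:
#   add_grub_param /etc/default/grub intel_iommu=off && update-grub
# Usage (as root), systemd-boot:
#   add_cmdline_param /etc/kernel/cmdline intel_iommu=off && proxmox-boot-tool refresh
```

Running either helper a second time leaves the file unchanged, so it should be safe to re-run after a failed attempt.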

Solutions are courtesy of gfngfn256 and AverageMarcus
 
Hi,

Could you please post the network configuration `cat /etc/network/interfaces` and the output of `ip a` and `lspci` commands?
 
Hey there,

Here's the contents of `/etc/network/interfaces`:
Code:
auto lo
iface lo inet loopback

iface enp1s0f0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.11/24
        gateway 192.168.1.1
        bridge-ports enp1s0f0
        bridge-stp off
        bridge-fd 0


source /etc/network/interfaces.d/*

Output of `ip a`:
Code:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: enp1s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr0 state UP group default qlen 1000
    link/ether a8:20:66:11:63:a3 brd ff:ff:ff:ff:ff:ff
3: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether a8:20:66:11:63:a3 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.11/24 scope global vmbr0
       valid_lft forever preferred_lft forever
4: veth100i0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr100i0 state UP group default qlen 1000
    link/ether fe:9d:c2:a0:c8:54 brd ff:ff:ff:ff:ff:ff link-netnsid 0
5: fwbr100i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether de:42:29:7a:b5:d6 brd ff:ff:ff:ff:ff:ff
6: fwpr100p0@fwln100i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
    link/ether 46:f3:9f:98:c6:2d brd ff:ff:ff:ff:ff:ff
7: fwln100i0@fwpr100p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr100i0 state UP group default qlen 1000
    link/ether de:42:29:7a:b5:d6 brd ff:ff:ff:ff:ff:ff
8: veth101i0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr101i0 state UP group default qlen 1000
    link/ether fe:cc:2f:62:06:20 brd ff:ff:ff:ff:ff:ff link-netnsid 1
9: fwbr101i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether f6:be:b7:bf:1c:b4 brd ff:ff:ff:ff:ff:ff
10: fwpr101p0@fwln101i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
    link/ether 32:a7:29:f3:4b:4c brd ff:ff:ff:ff:ff:ff
11: fwln101i0@fwpr101p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr101i0 state UP group default qlen 1000
    link/ether f6:be:b7:bf:1c:b4 brd ff:ff:ff:ff:ff:ff

And output of `lspci`:
Code:
00:00.0 Host bridge: Intel Corporation 3rd Gen Core processor DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04)
00:16.0 Communication controller: Intel Corporation 7 Series/C216 Chipset Family MEI Controller #1 (rev 04)
00:1a.0 USB controller: Intel Corporation 7 Series/C216 Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 7 Series/C216 Chipset Family High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 7 Series/C216 Chipset Family PCI Express Root Port 1 (rev c4)
00:1c.1 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 2 (rev c4)
00:1c.2 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 3 (rev c4)
00:1d.0 USB controller: Intel Corporation 7 Series/C216 Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation HM77 Express Chipset LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 7 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 7 Series/C216 Chipset Family SMBus Controller (rev 04)
01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM57766 Gigabit Ethernet PCIe (rev 01)
01:00.1 SD Host controller: Broadcom Inc. and subsidiaries BCM57765/57785 SDXC/MMC Card Reader (rev 01)
02:00.0 Network controller: Broadcom Inc. and subsidiaries BCM4331 802.11a/b/g/n (rev 02)
03:00.0 FireWire (IEEE 1394): LSI Corporation FW643 [TrueFire] PCIe 1394b Controller (rev 08)
04:00.0 PCI bridge: Intel Corporation DSL3510 Thunderbolt Controller [Cactus Ridge 4C 2012] (rev 03)
05:00.0 PCI bridge: Intel Corporation DSL3510 Thunderbolt Controller [Cactus Ridge 4C 2012] (rev 03)
05:03.0 PCI bridge: Intel Corporation DSL3510 Thunderbolt Controller [Cactus Ridge 4C 2012] (rev 03)
05:04.0 PCI bridge: Intel Corporation DSL3510 Thunderbolt Controller [Cactus Ridge 4C 2012] (rev 03)
05:05.0 PCI bridge: Intel Corporation DSL3510 Thunderbolt Controller [Cactus Ridge 4C 2012] (rev 03)
05:06.0 PCI bridge: Intel Corporation DSL3510 Thunderbolt Controller [Cactus Ridge 4C 2012] (rev 03)
06:00.0 System peripheral: Intel Corporation DSL3510 Thunderbolt Controller [Cactus Ridge 4C 2012] (rev 03)

Thank you
 
I have the same problem. Since there has been no further activity in this topic, I will share my information.

The Broadcom NetXtreme BCM57766 is in my Mac Mini Late 2012
Proxmox version 8

The problems started after the last kernel upgrade. The connection drops for short periods of 10 to 60 seconds, about 30 times a day (according to monitoring).

Kernel: 6.8.8-2-pve
pve-manager/8.2.4/faa83925c9641325 (running kernel: 6.8.8-2-pve)

Error messages (short version; there are many more entries with the same tg3 message):
Jul 13 19:28:44 proxmox kernel: tg3 0000:01:00.0 enp1s0f0: 0x00006920: 0x00000000, 0x00000000, 0x00000001, 0x00000000
Jul 13 19:28:44 proxmox kernel: tg3 0000:01:00.0 enp1s0f0: 0x00007000: 0x08000188, 0x00000000, 0x00000000, 0x000000c4
Jul 13 19:28:44 proxmox kernel: tg3 0000:01:00.0 enp1s0f0: 0x00007010: 0xd1c00001, 0x02408200, 0x000500db, 0x03000a00
Jul 13 19:28:44 proxmox kernel: tg3 0000:01:00.0 enp1s0f0: 0x00007020: 0x00000000, 0x00000000, 0x00000406, 0x10004000
Jul 13 19:28:44 proxmox kernel: tg3 0000:01:00.0 enp1s0f0: 0x00007030: 0x00010000, 0x000000c8, 0x000c0030, 0x00000000
Jul 13 19:28:44 proxmox kernel: tg3 0000:01:00.0 enp1s0f0: 0x00007500: 0x00000000, 0x00000000, 0x00000080, 0x00000000
Jul 13 19:28:44 proxmox kernel: tg3 0000:01:00.0 enp1s0f0: 0: Host status block [00000001:0000000b:(0000:0015:0000):(0000:0012)]
Jul 13 19:28:44 proxmox kernel: tg3 0000:01:00.0 enp1s0f0: 0: NAPI info [0000000b:0000000b:(01ff:0012:01ff):0000:(00dc:0000:0000:0000)]
Jul 13 19:28:44 proxmox kernel: tg3 0000:01:00.0 enp1s0f0: 1: Host status block [00000001:0000009d:(0000:0000:0000):(0015:0000)]
Jul 13 19:28:44 proxmox kernel: tg3 0000:01:00.0 enp1s0f0: 1: NAPI info [0000009d:0000009d:(0000:0000:01ff):0015:(0015:0015:0000:0000)]
Jul 13 19:28:44 proxmox kernel: tg3 0000:01:00.0 enp1s0f0: 2: Host status block [00000001:00000014:(03ff:0000:0000):(0000:0000)]
Jul 13 19:28:44 proxmox kernel: tg3 0000:01:00.0 enp1s0f0: 2: NAPI info [00000014:00000014:(0000:0000:01ff):03ff:(01ff:01ff:0000:0000)]
Jul 13 19:28:44 proxmox kernel: tg3 0000:01:00.0 enp1s0f0: Link is down
Jul 13 19:28:44 proxmox kernel: vmbr0: port 1(enp1s0f0) entered disabled state
Jul 13 19:28:48 proxmox kernel: tg3 0000:01:00.0 enp1s0f0: Link is up at 1000 Mbps, full duplex
Jul 13 19:28:48 proxmox kernel: tg3 0000:01:00.0 enp1s0f0: Flow control is off for TX and off for RX
Jul 13 19:28:48 proxmox kernel: tg3 0000:01:00.0 enp1s0f0: EEE is disabled
Jul 13 19:28:48 proxmox kernel: vmbr0: port 1(enp1s0f0) entered blocking state
Jul 13 19:28:48 proxmox kernel: vmbr0: port 1(enp1s0f0) entered forwarding state
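
To put a number on the dropouts, the kernel log can be tallied per day. A minimal sketch, assuming the default `journalctl` short format shown above (month and day in the first two fields); `link_down_per_day` is a made-up helper name:

```shell
# Hypothetical helper: reads kernel log lines on stdin and prints
# "<month> <day> <count>" for every day with tg3 "Link is down" events.
link_down_per_day() {
    awk '/tg3 .*Link is down/ { n[$1 " " $2]++ }
         END { for (d in n) print d, n[d] }'
}

# Usage:
#   journalctl -k | link_down_per_day
```

Note that `journalctl -k` only covers the current boot, and the order of the output lines is not sorted.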



/etc/network/interfaces
auto lo
iface lo inet loopback

iface enp1s0f0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 10.0.100.205/24
        gateway 10.0.100.254
        bridge-ports enp1s0f0
        bridge-stp off
        bridge-fd 0



root@proxmox:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host noprefixroute
valid_lft forever preferred_lft forever
2: enp1s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr0 state UP group default qlen 1000
link/ether 68:5b:35:b3:55:7a brd ff:ff:ff:ff:ff:ff
3: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 68:5b:35:b3:55:7a brd ff:ff:ff:ff:ff:ff
inet 10.0.100.205/24 scope global vmbr0
valid_lft forever preferred_lft forever
inet6 fe80::6a5b:35ff:feb3:557a/64 scope link
valid_lft forever preferred_lft forever
4: veth101i0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr101i0 state UP group default qlen 1000
link/ether fe:a9:6c:5b:ba:fb brd ff:ff:ff:ff:ff:ff link-netnsid 0
5: fwbr101i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 8e:7f:72:91:d7:ad brd ff:ff:ff:ff:ff:ff
6: fwpr101p0@fwln101i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
link/ether a6:e8:9a:a3:6c:bb brd ff:ff:ff:ff:ff:ff
7: fwln101i0@fwpr101p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr101i0 state UP group default qlen 1000
link/ether 8e:7f:72:91:d7:ad brd ff:ff:ff:ff:ff:ff
8: veth102i0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr102i0 state UP group default qlen 1000
link/ether fe:4b:9d:3d:b9:5c brd ff:ff:ff:ff:ff:ff link-netnsid 1
9: fwbr102i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 82:2c:35:b5:81:d8 brd ff:ff:ff:ff:ff:ff
10: fwpr102p0@fwln102i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
link/ether c6:7b:46:53:0d:08 brd ff:ff:ff:ff:ff:ff
11: fwln102i0@fwpr102p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr102i0 state UP group default qlen 1000
link/ether 82:2c:35:b5:81:d8 brd ff:ff:ff:ff:ff:ff
12: tap104i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master fwbr104i0 state UNKNOWN group default qlen 1000
link/ether aa:39:e1:e2:4f:6c brd ff:ff:ff:ff:ff:ff
13: fwbr104i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether ea:e6:45:50:85:7f brd ff:ff:ff:ff:ff:ff
14: fwpr104p0@fwln104i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
link/ether 8e:e4:60:59:ae:db brd ff:ff:ff:ff:ff:ff
15: fwln104i0@fwpr104p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr104i0 state UP group default qlen 1000
link/ether ea:e6:45:50:85:7f brd ff:ff:ff:ff:ff:ff
16: veth105i0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr105i0 state UP group default qlen 1000
link/ether fe:28:00:46:ad:bf brd ff:ff:ff:ff:ff:ff link-netnsid 2
17: fwbr105i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 1a:56:0d:e1:c6:25 brd ff:ff:ff:ff:ff:ff
18: fwpr105p0@fwln105i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
link/ether 66:b6:b7:c4:e4:4e brd ff:ff:ff:ff:ff:ff
19: fwln105i0@fwpr105p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr105i0 state UP group default qlen 1000
link/ether 1a:56:0d:e1:c6:25 brd ff:ff:ff:ff:ff:ff
20: tap106i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master fwbr106i0 state UNKNOWN group default qlen 1000
link/ether da:f0:f8:5b:23:a6 brd ff:ff:ff:ff:ff:ff
21: fwbr106i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 5a:41:ef:41:1a:2a brd ff:ff:ff:ff:ff:ff
22: fwpr106p0@fwln106i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
link/ether de:77:6a:e8:1e:09 brd ff:ff:ff:ff:ff:ff
23: fwln106i0@fwpr106p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr106i0 state UP group default qlen 1000
link/ether 5a:41:ef:41:1a:2a brd ff:ff:ff:ff:ff:ff
24: veth107i0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr107i0 state UP group default qlen 1000
link/ether fe:c0:9b:41:02:82 brd ff:ff:ff:ff:ff:ff link-netnsid 3
25: fwbr107i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether ce:eb:b1:26:af:ca brd ff:ff:ff:ff:ff:ff
26: fwpr107p0@fwln107i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
link/ether 9e:0a:9e:57:89:86 brd ff:ff:ff:ff:ff:ff
27: fwln107i0@fwpr107p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr107i0 state UP group default qlen 1000
link/ether ce:eb:b1:26:af:ca brd ff:ff:ff:ff:ff:ff
28: veth108i0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr108i0 state UP group default qlen 1000
link/ether fe:e9:83:68:22:3b brd ff:ff:ff:ff:ff:ff link-netnsid 4
29: fwbr108i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether b2:65:16:7f:e4:05 brd ff:ff:ff:ff:ff:ff
30: fwpr108p0@fwln108i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
link/ether 02:c5:01:29:e9:9b brd ff:ff:ff:ff:ff:ff
31: fwln108i0@fwpr108p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr108i0 state UP group default qlen 1000
link/ether b2:65:16:7f:e4:05 brd ff:ff:ff:ff:ff:ff
32: tap109i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master fwbr109i0 state UNKNOWN group default qlen 1000
link/ether fe:48:7e:48:19:84 brd ff:ff:ff:ff:ff:ff
33: fwbr109i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 7e:88:91:17:4a:98 brd ff:ff:ff:ff:ff:ff
34: fwpr109p0@fwln109i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
link/ether 62:57:75:82:b3:f1 brd ff:ff:ff:ff:ff:ff
35: fwln109i0@fwpr109p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr109i0 state UP group default qlen 1000
link/ether 7e:88:91:17:4a:98 brd ff:ff:ff:ff:ff:ff
37: enx000acd302d93: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 00:0a:cd:30:2d:93 brd ff:ff:ff:ff:ff:ff
root@proxmox:~#
 
Forgot the lspci output:

root@proxmox:~# lspci
00:00.0 Host bridge: Intel Corporation 3rd Gen Core processor DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04)
00:16.0 Communication controller: Intel Corporation 7 Series/C216 Chipset Family MEI Controller #1 (rev 04)
00:1a.0 USB controller: Intel Corporation 7 Series/C216 Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 7 Series/C216 Chipset Family High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 7 Series/C216 Chipset Family PCI Express Root Port 1 (rev c4)
00:1c.1 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 2 (rev c4)
00:1c.2 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 3 (rev c4)
00:1d.0 USB controller: Intel Corporation 7 Series/C216 Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation HM77 Express Chipset LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 7 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 7 Series/C216 Chipset Family SMBus Controller (rev 04)
01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM57766 Gigabit Ethernet PCIe (rev 01)
01:00.1 SD Host controller: Broadcom Inc. and subsidiaries BCM57765/57785 SDXC/MMC Card Reader (rev 01)
02:00.0 Network controller: Broadcom Inc. and subsidiaries BCM4331 802.11a/b/g/n (rev 02)
03:00.0 FireWire (IEEE 1394): LSI Corporation FW643 [TrueFire] PCIe 1394b Controller (rev 08)
04:00.0 PCI bridge: Intel Corporation DSL3510 Thunderbolt Controller [Cactus Ridge 4C 2012] (rev 03)
05:00.0 PCI bridge: Intel Corporation DSL3510 Thunderbolt Controller [Cactus Ridge 4C 2012] (rev 03)
05:03.0 PCI bridge: Intel Corporation DSL3510 Thunderbolt Controller [Cactus Ridge 4C 2012] (rev 03)
05:04.0 PCI bridge: Intel Corporation DSL3510 Thunderbolt Controller [Cactus Ridge 4C 2012] (rev 03)
05:05.0 PCI bridge: Intel Corporation DSL3510 Thunderbolt Controller [Cactus Ridge 4C 2012] (rev 03)
05:06.0 PCI bridge: Intel Corporation DSL3510 Thunderbolt Controller [Cactus Ridge 4C 2012] (rev 03)
06:00.0 System peripheral: Intel Corporation DSL3510 Thunderbolt Controller [Cactus Ridge 4C 2012] (rev 03)
 
I have the same problem
Looking more closely at the OP's problem and yours, I notice you have two things in common. Firstly, and obviously, you both seem to use the BCM57766 NIC. But I also notice that you are both using firewall bridges for your VMs (or possibly some of them), since you both have fwbr... devices listed. Is this intentional? If it's not, then maybe go into your VMs and disable the firewall. My thinking is that this may be causing a freeze/lockup on the NIC. Give it a try and maybe you will have success. If not, you are probably going to have to live with an earlier working kernel.

Good luck.
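
To see which guests have the firewall flag set on a network device, something like the following might help. It is a sketch under assumptions: it expects guest configs under `/etc/pve/qemu-server` and `/etc/pve/lxc` and looks for `firewall=1` on `netN:` lines; `guests_with_firewall` is a name made up for this example.

```shell
# Hypothetical helper: print the config files of guests that have
# firewall=1 on at least one netN: line. PVE_DIR defaults to /etc/pve.
guests_with_firewall() {
    local pve_dir="${1:-/etc/pve}"
    grep -lE '^net[0-9]+:.*firewall=1' \
        "$pve_dir"/qemu-server/*.conf "$pve_dir"/lxc/*.conf 2>/dev/null
}

# Usage:
#   guests_with_firewall
```

The file names in the output are the guest IDs, which should match the numbers in the fwbr... device names.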
 
I was facing this same problem while using a Mac Mini. I had the problem both with a fresh install of PVE8 and with upgrades from PVE7->8 on multiple machines.

I was using kernel version 6.8.8-2-pve after the upgrade, and this ended up causing the network driver (I think) to crash at some point, cutting the host off from the network and the rest of the Proxmox cluster.

I tried downgrading my kernel to 6.8.4-2-pve, which appeared to prevent the crashing. I was still seeing frequent network dropouts for short periods of time, but unlike with 6.8.8-2-pve the machines became available again afterwards.

I have now downgraded to the kernel version I was using before I upgraded - 5.15.158-1-pve - and so far I haven't seen any network dropouts or crashes. It's only been about an hour, so it's still too early to say for sure, but it's already more reliable than 6.8.4-2-pve.

This is what I ran on each of my hosts:

Bash:
apt install pve-kernel-5.15.158-1-pve && \
proxmox-boot-tool kernel pin 5.15.158-1-pve && \
proxmox-boot-tool refresh && \
proxmox-boot-tool kernel list # Confirm the old kernel is pinned before rebooting

Note: If you installed PVE8 fresh, you won't be able to pull the pve-kernel-5.15.158-1-pve package unless you (temporarily) add the following to your /etc/apt/sources.list:

Code:
deb http://download.proxmox.com/debian/pve bullseye pve-no-subscription

Hopefully this helps someone else. :)
 
Thanks everyone for the replies!
I am also currently using kernel 6.8.8-2-pve, and managed to catch an error during a network dropout while it was connected to a screen:
Code:
[791.761742] tg3 0000:01:00.0 enp1s0f0: NETDEV WATCHDOG: CPU: 3: transmit queue 0 timed out 5632 ms
This seems to support Marcus' suspicion that this is a driver-related issue. I have downgraded my kernel and will continue to monitor. If this solves the issue long term, I will mark this thread as solved, in case someone else with the same hardware runs into it.
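
A small filter can pull these watchdog events out of the kernel log, which may make monitoring easier. This is only a sketch; `watchdog_timeouts` is a hypothetical helper, and the pattern is based solely on the single message quoted above, so other kernels may word it differently.

```shell
# Hypothetical helper: reads kernel log lines on stdin and prints the
# transmit queue and timeout of every NETDEV WATCHDOG message it finds.
watchdog_timeouts() {
    sed -n -E 's/.*NETDEV WATCHDOG:.*transmit queue ([0-9]+) timed out ([0-9]+) ms.*/queue \1: \2 ms/p'
}

# Usage:
#   dmesg | watchdog_timeouts
```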
 
Unfortunately I'm on holiday now, but in two weeks I'll try another kernel.
Thanks for the advice, I will report back with the outcome.
 
I have just switched one of my hosts to the 6.5.13-5-pve kernel. It's still pretty early to be sure, but so far I haven't seen any network dropouts. This is slightly different from my previous testing, though, as I first moved all VMs off this host, so it could be that the issue doesn't occur without any VMs. If it remains stable for the next few hours, I'll move one of my VMs back onto it and continue to monitor.

Does anyone know where I can find what changes there are between 6.5.xxx and 6.8.xxx of the kernel?
 
Not exactly, but as a (very) general guideline, AFAIK PVE kernel is basically built around Ubuntu's - so maybe look here & here to get some idea.

Please note that the changes listed in those posts (AFAIK) are from THEIR previous versions - not the 6.5 kernels.

For general info on PVE 8.2 which uses the 6.8 kernel (default) see here.
 
Quick update - It's been over 5 hours now with the 6.5.13-5-pve kernel (including having a VM running on there for most of it) and I haven't seen any evidence of the network issue.

My next plan is to move the VM off again and test with 6.8.8-2-pve taking into consideration these two notes from the PVE 8 change notes:

Kernel: intel_iommu now defaults to on


The intel_iommu parameter defaults to on in the kernel 6.8 series. Enabling IOMMU can cause problems with older hardware, or systems with not up to date BIOS, due to bugs in the BIOS.

The issue can be fixed by explicitly disabling intel_iommu on the kernel commandline (intel_iommu=off) following the reference documentation.


Kernel: Broadcom Infiniband driver issue


The bnxt_re module causes issues with some Broadcom NICs, which have their Infiniband functionality enabled. As Infiniband is not used in most deployments, simply preventing the module from loading mitigates the issue. Create a file /etc/modprobe.d/bnxt-re-blacklist.conf containing:

blacklist bnxt_re


Afterwards make sure to update the initramfs with update-initramfs -k all -u.

Alternatively you can also install Broadcom's niccli utility and the corresponding dkms module to disable the Infiniband functionality of the NIC permanently. See the relevant post in our community forum.
I'm not sure if either of these things are what we're experiencing but they _sound_ similar so I'll try each in turn and see what resolves the problem and report back. :)
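
Before blacklisting anything, it may be worth checking which driver is actually in play. The tg3 log lines posted earlier in this thread suggest the BCM57766 is driven by `tg3`, so `bnxt_re` may not even be loaded. A small sketch (the helper name `module_loaded` is made up for this example; it only reads `/proc/modules`):

```shell
# Hypothetical helper: succeed if kernel module NAME is listed in the
# modules file (defaults to /proc/modules, first column = module name).
module_loaded() {
    local name="$1" modules="${2:-/proc/modules}"
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$modules"
}

# Usage:
#   module_loaded bnxt_re && echo "bnxt_re is loaded"
#   module_loaded tg3     && echo "tg3 is loaded"
```

If `bnxt_re` is not loaded, the blacklist step is unlikely to change anything on this hardware.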
 
IOMMU grouping defaulting to on is definitely the issue. I used to have it enabled on PVE7 and had the same problem. I didn't mention it because I ran a script that shows the IOMMU groups and it returned nothing; I guess the script was broken or I did something wrong. If there's a way to disable IOMMU grouping on the newer kernel, I'm pretty sure that would solve the issue.
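
For reference, a minimal sketch of such a listing script. It assumes the usual sysfs layout `/sys/kernel/iommu_groups/<group>/devices/<pci-address>`; the function name `list_iommu_groups` is made up here. If the directory is empty the function prints nothing, which usually means the IOMMU is not active on the running kernel rather than that the script is broken.

```shell
# Hypothetical helper: print one "group N: <pci-address>" line per device.
# ROOT defaults to the sysfs IOMMU group directory.
list_iommu_groups() {
    local root="${1:-/sys/kernel/iommu_groups}" dev group
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue        # glob did not match: no groups exist
        group="${dev#"$root"/}"          # e.g. 3/devices/0000:01:00.0
        printf 'group %s: %s\n' "${group%%/*}" "${dev##*/}"
    done
}

# Usage:
#   list_iommu_groups
```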
 
Yeah, that seems like it so far :)

If you edit /etc/default/grub, add intel_iommu=off to the GRUB_CMDLINE_LINUX_DEFAULT property, then run update-grub and reboot, that seems to fix it.

2 hours so far with no issues. I'll report back in the morning to be sure.
 
I did the same, hopefully this will fully solve the issue. I'm gonna keep monitoring and update the forum post with the solution tomorrow if this does end up working.
Thanks everyone for the help!
 
This will work if you are actually using the GRUB bootloader. However, if you are using systemd-boot, you need to follow the docs:
The kernel commandline needs to be placed as one line in /etc/kernel/cmdline. To apply your changes, run proxmox-boot-tool refresh, which sets it as the option line for all config files in loader/entries/proxmox-*.conf.
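
Whichever bootloader is in use, the result can be checked after the reboot by looking at the running kernel's command line. A minimal sketch (`param_active` is a made-up helper; I believe `proxmox-boot-tool status` reports which bootloader is set up, but check its output on your own node):

```shell
# Hypothetical helper: succeed if PARAM appears as a whole word on the
# kernel command line (defaults to /proc/cmdline of the running kernel).
param_active() {
    local param="$1" cmdline="${2:-/proc/cmdline}"
    tr ' ' '\n' < "$cmdline" | grep -qxF -- "$param"
}

# Usage (after the reboot):
#   param_active intel_iommu=off && echo "IOMMU is disabled on the command line"
```

If the check fails, the change was probably written for the bootloader that the node does not actually boot from.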