Proxmox Network Blocking state

GroupPolicy

New Member
Aug 7, 2024
Hi everyone,



I'm running PVE 8.2.7 on my desktop PC, which has a single NIC.



The problem is that I see the log messages below on my PVE host, especially during the daily backup to my Backup Server (which runs on another device). Yesterday I also connected to a virtual Win10 PC via Guacamole (from outside); the connection dropped a few times and the host itself became unreachable, so I had to power the PVE host off and on again. Do you have any idea what the problem is here?



Sep 26 12:30:11 homeserver kernel: vmbr0: port 12(fwpr106p0) entered blocking state
Sep 26 12:30:11 homeserver kernel: vmbr0: port 12(fwpr106p0) entered forwarding state
Sep 26 12:30:11 homeserver kernel: fwbr106i0: port 1(fwln106i0) entered blocking state
Sep 26 12:30:11 homeserver kernel: fwbr106i0: port 1(fwln106i0) entered disabled state
Sep 26 12:30:11 homeserver kernel: fwln106i0: entered allmulticast mode
Sep 26 12:30:11 homeserver kernel: fwln106i0: entered promiscuous mode
Sep 26 12:30:11 homeserver kernel: fwbr106i0: port 1(fwln106i0) entered blocking state
Sep 26 12:30:11 homeserver kernel: fwbr106i0: port 1(fwln106i0) entered forwarding state
Sep 26 12:30:11 homeserver kernel: fwbr106i0: port 2(tap106i0) entered blocking state
Sep 26 12:30:11 homeserver kernel: fwbr106i0: port 2(tap106i0) entered disabled state
Sep 26 12:30:11 homeserver kernel: tap106i0: entered allmulticast mode
Sep 26 12:30:11 homeserver kernel: fwbr106i0: port 2(tap106i0) entered blocking state
Sep 26 12:30:11 homeserver kernel: fwbr106i0: port 2(tap106i0) entered forwarding state
 
Hello GroupPolicy!

These are just informational messages about the state of your bridge ports (the state names come from STP, the Spanning Tree Protocol), and as long as the ports end up in the "forwarding state", this is usually not a problem. The connectivity issues you're describing are a problem, of course.
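
If you want to double-check the state of the bridge ports yourself, something like this should work on the PVE host (plain iproute2, nothing extra needed; tap106i0 is just the example device from your log):

# Show every port attached to the bridges and its current state
bridge link show

# Or read the state of a single port directly from sysfs (0 = disabled, 3 = forwarding)
cat /sys/class/net/tap106i0/brport/state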

  • Could you post how you configured your network and VM (e.g. cat /etc/network/interfaces and qm config <windows-vmid>)?
  • Does your syslog contain a message about a link going down, or any messages from your NIC driver directly? (see the example command after this list)
  • What is the output of lspci -nnk | grep -A2 -E "(Network|Ethernet)"?
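
As a starting point for the syslog question, something along these lines should surface any link state changes from today (narrow it down further with your NIC's interface or driver name once you know them):

# Kernel messages about link state changes since midnight
journalctl -k --since today | grep -iE 'link (is )?(up|down)'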
 
Hello Dakralex

Thanks for the answer. It's a little strange to me that I only see these messages during the backup task. I also checked which VM the messages refer to; it seems to be only VMs that are powered off. As for your second question: yes, I did find a link-down (or similar) message, and I've added it below as well.
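
In case it helps: the number in the interface name should be the VMID, so I looked it up like this (106 here is taken from the log lines above):

# The fwpr106p0 / fwbr106i0 / tap106i0 interfaces belong to VMID 106
qm status 106
qm config 106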

You can see the requested information below.

auto lo
iface lo inet loopback

iface enp0s31f6 inet manual

auto vmbr0
iface vmbr0 inet manual
        bridge-ports enp0s31f6
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

auto vmbr0.1
iface vmbr0.1 inet static
        address 192.168.1.150/24
        gateway 192.168.1.1
#Management

auto vmbr0.10
iface vmbr0.10 inet manual
#Office

auto vmbr0.20
iface vmbr0.20 inet manual
#IoT

auto vmbr0.30
iface vmbr0.30 inet manual
#Guest

auto vmbr0.40
iface vmbr0.40 inet manual
#Media

auto vmbr0.50
iface vmbr0.50 inet manual
#Lab

auto vmbr0.60
iface vmbr0.60 inet manual
#Rdp


--------------------------------------------------

root@homeserver:~# qm config 112
agent: 1
balloon: 4096
boot: order=scsi0;net0
cores: 2
cpu: x86-64-v2-AES
description: Remote Access Bilgisayar%C4%B1%0A%0A%0AVLAN-60
machine: pc-i440fx-8.1
memory: 8192
meta: creation-qemu=8.1.5,ctime=1722120287
name: Windows-10-Lab
net0: virtio=BC:24:11:A2:4B:25,bridge=vmbr0,firewall=1,tag=60
numa: 0
onboot: 1
ostype: win10
scsi0: SSD:vm-112-disk-0,cache=writeback,iothread=1,size=100G
scsihw: virtio-scsi-single
smbios1: uuid=58dc4358-1b34-4883-b5e1-5889d0b27787
sockets: 1
startup: order=10,up=15,down=15
tags: test
vmgenid: 5f88629f-c644-4115-b4b6-6acb56287926


----------------------------------------------------------------------------------------------------------

Sep 24 14:36:46 homeserver kernel: e1000e 0000:00:1f.6 enp0s31f6: NETDEV WATCHDOG: CPU: 6: transmit queue 0 timed out 9846 ms
Sep 24 14:36:46 homeserver kernel: e1000e 0000:00:1f.6 enp0s31f6: Reset adapter unexpectedly
Sep 24 14:36:46 homeserver kernel: vmbr0: port 1(enp0s31f6) entered disabled state
Sep 24 14:36:50 homeserver kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Sep 24 14:36:50 homeserver kernel: vmbr0: port 1(enp0s31f6) entered blocking state
Sep 24 14:36:50 homeserver kernel: vmbr0: port 1(enp0s31f6) entered forwarding state
Sep 24 14:36:52 homeserver pvestatd[1611]: got timeout
Sep 24 14:36:53 homeserver pvestatd[1611]: status update time (9.606 seconds)
Sep 24 14:36:55 homeserver pvestatd[1611]: got timeout
Sep 24 15:17:01 homeserver CRON[553593]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Sep 24 15:17:01 homeserver CRON[553594]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Sep 24 15:17:01 homeserver CRON[553593]: pam_unix(cron:session): session closed for user root
Sep 24 15:55:04 homeserver kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang: TDH <95> TDT <7e> next_to_use <7e> next_to_clean <94> buffer_info[next_to_clean]: time_stamp <1cb326c1c> next_to_watch <95> jiffies <1cb327180> next_to_watch.status <0> MAC Status <40080083> PHY Status <796d> PHY 1000BASE-T Status <3800> PHY Extended Status <3000> PCI Status <10>
Sep 24 15:55:06 homeserver kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang: TDH <95> TDT <7e> next_to_use <7e> next_to_clean <94> buffer_info[next_to_clean]: time_stamp <1cb326c1c> next_to_watch <95> jiffies <1cb327941> next_to_watch.status <0> MAC Status <40080083> PHY Status <796d> PHY 1000BASE-T Status <3800> PHY Extended Status <3000> PCI Status <10>
Sep 24 15:55:08 homeserver kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang: TDH <95> TDT <7e> next_to_use <7e> next_to_clean <94> buffer_info[next_to_clean]: time_stamp <1cb326c1c> next_to_watch <95> jiffies <1cb328140> next_to_watch.status <0> MAC Status <40080083> PHY Status <796d> PHY 1000BASE-T Status <3800> PHY Extended Status <3000> PCI Status <10>
Sep 24 15:55:10 homeserver kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang: TDH <95> TDT <7e> next_to_use <7e> next_to_clean <94> buffer_info[next_to_clean]: time_stamp <1cb326c1c> next_to_watch <95> jiffies <1cb328901> next_to_watch.status <0> MAC Status <40080083> PHY Status <796d> PHY 1000BASE-T Status <3800>

-----------------------------------------------------------------------------


And the last one:


root@homeserver:~# lspci -nnk | grep -A2 -E "(Network|Ethernet)"
00:14.3 Network controller [0280]: Intel Corporation Alder Lake-S PCH CNVi WiFi [8086:7af0] (rev 11)
Subsystem: Intel Corporation Alder Lake-S PCH CNVi WiFi [8086:4090]
Kernel driver in use: iwlwifi
--
00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (17) I219-LM [8086:1a1c] (rev 11)
Subsystem: Dell Ethernet Connection (17) I219-LM [1028:0c6f]
Kernel driver in use: e1000e
Kernel modules: e1000e
 
Your network and VM configurations look fine. It seems like you're suffering from a problem similar to the one described in this post [0] (as far as I can tell, most people there have the same NIC as you). Have you tried the solutions from there, i.e. disabling some or all hardware offloading, if your NIC supports that ([1] and/or [2])? There is a rough sketch of what that looks like below the links.

[0] https://forum.proxmox.com/threads/e1000e-eno1-detected-hardware-unit-hang.59928/
[1] https://forum.proxmox.com/threads/e1000e-eno1-detected-hardware-unit-hang.59928/#post-378352
[2] https://forum.proxmox.com/threads/e1000-driver-hang.58284/page-8#post-390709
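
For reference, this is roughly what disabling offloading looks like; which feature names your NIC actually supports can be listed with ethtool -k, so treat the flags below as an example rather than the definitive set:

# List the offloading features the NIC currently has enabled
ethtool -k enp0s31f6

# Temporarily turn off TCP/generic segmentation offload (reverts on reboot)
ethtool -K enp0s31f6 tso off gso off

# To make it persistent, add a post-up line to the bridge stanza in /etc/network/interfaces:
#   post-up /usr/sbin/ethtool -K enp0s31f6 tso off gso off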
 
Honestly, I haven't tried disabling some or all of the hardware offloading yet, because I read somewhere that performance suffers afterwards. What's your take on that? Does it really affect the network speed of the PVE host?
 
I can only give you anecdotal advice here, as I don't have your exact NIC, but from what I've heard from others it usually doesn't make much of a difference, especially with 1 Gbit NICs. And of course it isn't a permanent change, so you could verify or disprove that claim by running a performance test (e.g. with iperf) once with offloading enabled and once with it disabled.
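
A quick way to run such a before/after comparison, assuming iperf3 is installed on the PVE host and on another machine in the same network (the IP below is just a placeholder):

# On the other machine: start an iperf3 server
iperf3 -s

# On the PVE host: run a 30-second test against it, once with offloading on and once with it off
iperf3 -c 192.168.1.151 -t 30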

As far as others are concerned, this does solve the problem more often than not. If it doesn't, it would be interesting to see the output of ethtool -S enp0s31f6 after some of these hardware unit hangs have happened, to look at the statistics of the NIC itself.
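
If it comes to that, filtering the statistics for error-looking counters makes the output easier to scan; the exact counter names vary by driver, so this grep is only a rough filter:

# Dump NIC statistics and keep only counters that look like errors, drops or restarts
ethtool -S enp0s31f6 | grep -iE 'err|drop|timeout|restart'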
 
Thanks a lot, Daniel. I will also check ethtool -S enp0s31f6.
 
