Proxmox Network Blocking state

GroupPolicy

Hi Everyone



I'm running Proxmox VE 8.2.7 on my desktop PC, and it has a single NIC.



The problem is that I see the log messages below on my PVE host, especially during the daily backup to my backup server (on another device). Also, yesterday I connected to a virtual Win10 PC via Guacamole (from outside); the connection dropped a few times and the host computer became unreachable, so I had to power-cycle the PVE host. Do you have any idea about this issue? What's the problem here?



Sep 26 12:30:11 homeserver kernel: vmbr0: port 12(fwpr106p0) entered blocking state
Sep 26 12:30:11 homeserver kernel: vmbr0: port 12(fwpr106p0) entered forwarding state
Sep 26 12:30:11 homeserver kernel: fwbr106i0: port 1(fwln106i0) entered blocking state
Sep 26 12:30:11 homeserver kernel: fwbr106i0: port 1(fwln106i0) entered disabled state
Sep 26 12:30:11 homeserver kernel: fwln106i0: entered allmulticast mode
Sep 26 12:30:11 homeserver kernel: fwln106i0: entered promiscuous mode
Sep 26 12:30:11 homeserver kernel: fwbr106i0: port 1(fwln106i0) entered blocking state
Sep 26 12:30:11 homeserver kernel: fwbr106i0: port 1(fwln106i0) entered forwarding state
Sep 26 12:30:11 homeserver kernel: fwbr106i0: port 2(tap106i0) entered blocking state
Sep 26 12:30:11 homeserver kernel: fwbr106i0: port 2(tap106i0) entered disabled state
Sep 26 12:30:11 homeserver kernel: tap106i0: entered allmulticast mode
Sep 26 12:30:11 homeserver kernel: fwbr106i0: port 2(tap106i0) entered blocking state
Sep 26 12:30:11 homeserver kernel: fwbr106i0: port 2(tap106i0) entered forwarding state
 
Hello GroupPolicy!

These are just informational messages from the bridge code about STP (Spanning Tree Protocol) port states; as long as the ports end up in the "forwarding state", they are usually not a problem. The connectivity issues you're describing are a problem, of course.
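
If you want to double-check those port states yourself, something along these lines should work (a minimal sketch; bridge is part of iproute2 and available on a standard PVE install, and the port name is taken from your log, so it will change as VMs start and stop):

# show all bridge ports and their current state (forwarding, disabled, ...)
bridge link show

# or read a single port's state via sysfs (3 = forwarding)
cat /sys/class/net/vmbr0/brif/fwpr106p0/state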

  • Could you post how you configured your network and VM (e.g. cat /etc/network/interfaces and qm config <windows-vmid>)?
  • Does your syslog contain a message about a link going down, or any messages coming directly from your NIC driver?
  • What is the output of lspci -nnk | grep -A2 -E "(Network|Ethernet)"?
 
Hello Dakralex

Thanks for the answer. It's a little strange to me: I see these messages only during the backup task. I also checked the VM mentioned in the messages, and it seems they only appear for VMs that are powered off. For your second question, yes, I did see a link-down (or similar) message; I've added it below as well.

You can see the requested information below.

auto lo
iface lo inet loopback

iface enp0s31f6 inet manual

auto vmbr0
iface vmbr0 inet manual
        bridge-ports enp0s31f6
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

auto vmbr0.1
iface vmbr0.1 inet static
        address 192.168.1.150/24
        gateway 192.168.1.1
#Management

auto vmbr0.10
iface vmbr0.10 inet manual
#Office

auto vmbr0.20
iface vmbr0.20 inet manual
#IoT

auto vmbr0.30
iface vmbr0.30 inet manual
#Guest

auto vmbr0.40
iface vmbr0.40 inet manual
#Media

auto vmbr0.50
iface vmbr0.50 inet manual
#Lab

auto vmbr0.60
iface vmbr0.60 inet manual
#Rdp


--------------------------------------------------

root@homeserver:~# qm config 112
agent: 1
balloon: 4096
boot: order=scsi0;net0
cores: 2
cpu: x86-64-v2-AES
description: Remote Access Bilgisayar%C4%B1%0A%0A%0AVLAN-60
machine: pc-i440fx-8.1
memory: 8192
meta: creation-qemu=8.1.5,ctime=1722120287
name: Windows-10-Lab
net0: virtio=BC:24:11:A2:4B:25,bridge=vmbr0,firewall=1,tag=60
numa: 0
onboot: 1
ostype: win10
scsi0: SSD:vm-112-disk-0,cache=writeback,iothread=1,size=100G
scsihw: virtio-scsi-single
smbios1: uuid=58dc4358-1b34-4883-b5e1-5889d0b27787
sockets: 1
startup: order=10,up=15,down=15
tags: test
vmgenid: 5f88629f-c644-4115-b4b6-6acb56287926


----------------------------------------------------------------------------------------------------------

Sep 24 14:36:46 homeserver kernel: e1000e 0000:00:1f.6 enp0s31f6: NETDEV WATCHDOG: CPU: 6: transmit queue 0 timed out 9846 ms
Sep 24 14:36:46 homeserver kernel: e1000e 0000:00:1f.6 enp0s31f6: Reset adapter unexpectedly
Sep 24 14:36:46 homeserver kernel: vmbr0: port 1(enp0s31f6) entered disabled state
Sep 24 14:36:50 homeserver kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Sep 24 14:36:50 homeserver kernel: vmbr0: port 1(enp0s31f6) entered blocking state
Sep 24 14:36:50 homeserver kernel: vmbr0: port 1(enp0s31f6) entered forwarding state
Sep 24 14:36:52 homeserver pvestatd[1611]: got timeout
Sep 24 14:36:53 homeserver pvestatd[1611]: status update time (9.606 seconds)
Sep 24 14:36:55 homeserver pvestatd[1611]: got timeout
Sep 24 15:17:01 homeserver CRON[553593]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Sep 24 15:17:01 homeserver CRON[553594]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Sep 24 15:17:01 homeserver CRON[553593]: pam_unix(cron:session): session closed for user root
Sep 24 15:55:04 homeserver kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang: TDH <95> TDT <7e> next_to_use <7e> next_to_clean <94>buffer_info[next_to_clean]: time_stamp <1cb326c1c> next_to_watch <95> jiffies <1cb327180> next_to_watch.status <0>MAC Status <40080083>PHY Status <796d>PHY 1000BASE-T Status <3800>PHY Extended Status <3000>PCI Status <10>
Sep 24 15:55:06 homeserver kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang: TDH <95> TDT <7e> next_to_use <7e> next_to_clean <94>buffer_info[next_to_clean]: time_stamp <1cb326c1c> next_to_watch <95> jiffies <1cb327941> next_to_watch.status <0>MAC Status <40080083>PHY Status <796d>PHY 1000BASE-T Status <3800>PHY Extended Status <3000>PCI Status <10>
Sep 24 15:55:08 homeserver kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang: TDH <95> TDT <7e> next_to_use <7e> next_to_clean <94>buffer_info[next_to_clean]: time_stamp <1cb326c1c> next_to_watch <95> jiffies <1cb328140> next_to_watch.status <0>MAC Status <40080083>PHY Status <796d>PHY 1000BASE-T Status <3800>PHY Extended Status <3000>PCI Status <10>
Sep 24 15:55:10 homeserver kernel: e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang: TDH <95> TDT <7e> next_to_use <7e> next_to_clean <94>buffer_info[next_to_clean]: time_stamp <1cb326c1c> next_to_watch <95> jiffies <1cb328901> next_to_watch.status <0>MAC Status <40080083>PHY Status <796d>PHY 1000BASE-T Status <3800>

-----------------------------------------------------------------------------


And the last one:


root@homeserver:~# lspci -nnk | grep -A2 -E "(Network|Ethernet)"
00:14.3 Network controller [0280]: Intel Corporation Alder Lake-S PCH CNVi WiFi [8086:7af0] (rev 11)
Subsystem: Intel Corporation Alder Lake-S PCH CNVi WiFi [8086:4090]
Kernel driver in use: iwlwifi
--
00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (17) I219-LM [8086:1a1c] (rev 11)
Subsystem: Dell Ethernet Connection (17) I219-LM [1028:0c6f]
Kernel driver in use: e1000e
Kernel modules: e1000e
 
Your network and VM configuration look fine. It seems you're suffering from a problem similar to the one described in this post [0] (as far as I can tell, most people there have the same NIC as you). Have you tried the solutions from there (disabling some or all hardware offloading, if your NIC supports that, [1] and/or [2])?

[0] https://forum.proxmox.com/threads/e1000e-eno1-detected-hardware-unit-hang.59928/
[1] https://forum.proxmox.com/threads/e1000e-eno1-detected-hardware-unit-hang.59928/#post-378352
[2] https://forum.proxmox.com/threads/e1000-driver-hang.58284/page-8#post-390709
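
To give you an idea, disabling offloading could look roughly like this (a sketch, not taken verbatim from the linked posts; enp0s31f6 is your NIC, and the exact set of offloads worth disabling varies):

# show the current offload settings of the NIC
ethtool -k enp0s31f6

# temporarily disable segmentation/receive offloads (resets on reboot)
ethtool -K enp0s31f6 tso off gso off gro off

To make it survive a reboot, you could add a post-up line to the NIC stanza in /etc/network/interfaces (the ethtool path may differ on your system):

iface enp0s31f6 inet manual
        post-up /usr/sbin/ethtool -K enp0s31f6 tso off gso off gro off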
 
Honestly, I haven't tried disabling some or all hardware offloading yet, because I read something along the lines of "performance is affected after that". What's your take on this? Does it really affect the network speed of the PVE host?
 
I can only give you anecdotal advice here, as I don't have your exact NIC, but from what I've heard from others it usually doesn't make much of a difference, especially with 1 Gbit NICs. And of course, it's not a permanent change: you could verify or disprove that claim by running a performance test (e.g. iperf) once with offloading enabled and once with it disabled.

As far as others are concerned, it does solve this problem more often than not. If it doesn't, it would be interesting to see the output of ethtool -S enp0s31f6 after some of these hardware unit hangs have happened, to look at the statistics of the NIC itself.
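
A rough sketch of such a comparison (iperf3 needs to be installed on both ends, e.g. via apt install iperf3, and <backup-server-ip> is a placeholder):

# on the other machine (e.g. the backup server), start an iperf3 server
iperf3 -s

# on the PVE host, measure throughput with offloading still enabled
iperf3 -c <backup-server-ip> -t 30

# then disable offloading and run the same test again
ethtool -K enp0s31f6 tso off gso off gro off
iperf3 -c <backup-server-ip> -t 30

# and after one of the hangs, dump the NIC statistics
ethtool -S enp0s31f6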
 
Thanks a lot, Daniel. I will also check ethtool -S enp0s31f6.
 
I have a slightly different problem, which seems to concern the network stack in general.

I have a standard vmbr0 bridge and an enp6s0 LAN interface (just a standard installation with no changes, no VLANs, etc.).

For a couple of days now (nothing has changed in the network settings in the meantime) I can no longer access the LAN or the Internet from my Proxmox host.
When I reboot the Proxmox server (version 8.3.3), I can access the network from the Proxmox host for a while (e.g. ping our router), but it stops pinging after about 5 seconds. I have no related entries in messages or syslog at that time (although I do see the same blocking/forwarding entries as above beforehand, and it complains that it cannot reach the Proxmox Backup Server). When I detach and re-attach the network cable, I can ping for another 5 seconds and then it stops again. iptables is empty, by the way.
This also happens when no LXC containers or VMs are started.
From within the LXC containers, however, I can still access the LAN and the Internet (but not the Proxmox host itself) without any problems, so the NIC and the network environment in general are still working.

I also tried a USB LAN card as network interface, and I have exactly the same behaviour.
Maybe I should also note that I have two NVIDIA cards in the system (RTX 3060 and RTX 4060), which I use for LLMs inside LXC containers.

Any hints from your side on where to look?
 
After a lot of attempts with different adapters, cables, and switches, I finally changed the IP of vmbr0. After that it worked with all network cards; I do not know why.
I can also confirm that there was no duplicate IP.
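
In case it helps someone: changing the bridge IP can be done by editing the vmbr0 stanza in /etc/network/interfaces and reloading the network. A minimal sketch with placeholder addresses (not my actual values):

auto vmbr0
iface vmbr0 inet static
        # pick a new, unused address in the same subnet
        address 192.168.1.151/24
        gateway 192.168.1.1
        bridge-ports enp6s0
        bridge-stp off
        bridge-fd 0

# apply the change (ifupdown2 is the default on current PVE)
ifreload -a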