Networking slow to come up on VMs, intermittent routing issues on some

Skamanda

New Member
Feb 18, 2026
I'm having an issue with the VMs on my PVE server that may well be two issues, or could be just one. I'm spinning up a VM to run more tests as I type this, but I wanted to get the ball rolling on a fix here while I wait.

I first noticed the issue when working with some Ubuntu VMs, where it presented as the network being slow to come up. In some cases, if I opened the console to the VM and ran a ping to anywhere, everything would wake up and behave nicely after a few request timeouts. In others, it would be stubborn and not come online for 15-30 minutes, even with a ping running. On the Linux VMs it doesn't seem to become unstable after coming online, but those systems don't see a ton of traffic, and there are no error messages in any logs on either the PVE host or the VMs that would let me confirm it.

This week, I've been trying to spin up a Windows 11 desktop for a contractor to remote into, and it's been hit by a severe enough variant of that issue that I ended up digging into it further. On the Windows VM (using the virtio network driver, though the same issue was present when I tested with both E1000 and RTL8139), the issue arises after using the network: it will see the internet and let me access a website or two, but it won't stay connected long enough to activate the copy of Windows.

In troubleshooting I noticed that the Windows VM can ping the PVE host by IP just fine, but it can't ping any of the other VMs running on PVE, anything else on the local network, or anything beyond the gateway. The odd part is, it still gets DHCP from the network, which I can observe by changing the IP of its static lease at the DHCP server and refreshing it on the VM. It will obtain the new IP even when Windows is unable to route beyond the PVE host.

This seems to be a routing issue, but as I'm fairly new to Proxmox, the inconsistent nature of it, and the fact that it doesn't throw any errors on the host or the VMs, is throwing me for a loop. Any help someone could provide would be greatly appreciated.

Info on my environment, from requested fields in someone else's post for a similar issue:
- PVE version: 8.4.0
- Cluster configuration: Single PVE server
- Network configuration: Everything is the defaults at install, aside from bridging the two physical ports on the server
- VM OS: Ubuntu 24.04, and Windows 11 Pro
- VM network configuration: Defaults, other than manual IP configuration on Ubuntu. DHCP on Windows, with static lease configured on DHCP server
- VM logs during the issue: No errors present
- Network traces during the issue: Traceroute doesn't get past the PVE server; the PVE server itself pings, but nothing beyond it responds to pings or passes traffic, not even the other VMs
- What is the state of the VM during the issue? Can you access it via a console? Can you run troubleshooting commands?: I can access the VMs via console just fine, but network connectivity beyond that is nil
 
- Network configuration: Everything is the defaults at install, aside from bridging the two physical ports on the server
Welcome, @Skamanda

That sounds like it could be the reason. Are these bridges on the same network, by chance?
What are the network settings, especially for these bridges?
 
The output of cat /etc/network/interfaces would give more details :)

Of course not as a screenshot, but as text in the CODE tags (using this </> button above).
 
Code:
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!

auto lo
iface lo inet loopback

auto eno8303
iface eno8303 inet manual
#Left port on back of server

auto eno8403
iface eno8403 inet manual
#Right port on back of server

auto bond0
iface bond0 inet manual
        bond-slaves eno8303 eno8403
        bond-miimon 100
        bond-mode balance-rr

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.253/24
        gateway 192.168.1.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0

source /etc/network/interfaces.d/*

There are no additional files in /etc/network/interfaces.d/
 
I admit I'm not very experienced with Proxmox networking, so at the moment I'm just comparing your config with the similar one at
https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#sysadmin_network_bond
- section "Example: Use a bond as the bridge port"

I can see you have auto eno8303 and auto eno8403, while the quoted docs don't use auto for the slave interfaces.

I don't know if it makes a substantial difference, but you could make a backup of the original file (just in case) and modify the working file according to the docs to check if this solves the issue.
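If it helps, here is a minimal sketch of just the stanzas that would change, using your interface names and keeping everything else in the file as-is:

Code:
iface eno8303 inet manual
#Left port on back of server

iface eno8403 inet manual
#Right port on back of server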

P.S. Or maybe the switch doesn't like your bond-mode?...

I mean this fragment:

"If your switch supports the LACP (IEEE 802.3ad) protocol, then we recommend using the corresponding bonding mode (802.3ad). Otherwise you should generally use the active-backup mode."
(I'm bolding the second sentence).
 
The Problem: bond-mode balance-rr
Round-robin (balance-rr / mode 0) alternates packets across both NICs on a per-packet basis. This means:
1. Packet 1 goes out eno8303 (with the bond's MAC address)
2. Packet 2 goes out eno8403 (with the same MAC address)
3. Your switch sees the same source MAC arriving on two different physical ports
4. The switch's MAC table flaps: it constantly updates its forwarding table and sends some return traffic to the wrong port

This is exactly what produces your symptoms: intermittent connectivity, DHCP sometimes working (it's broadcast), unicast traffic to anything beyond the PVE host being unreliable, and no errors logged anywhere.

balance-rr requires switch-side port-channel/EtherChannel configuration to work correctly. Without it, the switch is confused about which port leads to your server.
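Before changing anything, you can confirm what the bond is actually doing on the PVE host. These are standard kernel/iproute2 views of the bond state (output will vary with your setup):

Code:
# Show the bond's current mode, its slaves, and per-slave link status
cat /proc/net/bonding/bond0

# Quick overview of link state for the slaves, bond and bridge
ip -br link show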

Change this one line in /etc/network/interfaces:

Code:
 bond-mode balance-rr

to one of the following, depending on your needs:

#1 bond-mode active-backup (start with that one to confirm the issue)
Code:
bond-mode active-backup
No switch changes needed: one NIC handles all traffic, and the other takes over if the first fails. You lose load balancing but gain immediate reliability.
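Optionally, you can also tell active-backup which NIC to prefer when both links are up via bond-primary. This is just a sketch; whether eno8303 is the right choice here is an assumption on my part:

Code:
bond-mode active-backup
bond-primary eno8303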

#2 802.3ad / LACP (if your switch supports it)
Code:
  bond-mode 802.3ad
  bond-xmit-hash-policy layer3+4

This gives you both redundancy and load balancing, but you must also configure an LACP/LAG group on your switch for those two ports. Without the switch-side config, this won't come up at all.

#3 balance-alb (load balancing without switch config)

Code:
  bond-mode balance-alb

Adaptive load balancing, no switch config needed, provides some TX and RX load balancing. However, it works by rewriting MAC addresses, which can occasionally cause issues with bridged VMs. It's worth trying if you want more than active-backup without touching the switch.

Apply the Change

Code:
  # Edit the file
  nano /etc/network/interfaces

Code:
  # Apply without reboot
  ifreload -a

If all your VMs immediately get stable networking, you've confirmed the root cause. Then decide if you want to set up LACP on your switch for the bandwidth benefit of 802.3ad.
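A quick way to verify the change took effect and the path is healthy afterwards (a sketch using the gateway from your config; adjust as needed):

Code:
# Confirm the bond is in the expected mode and both slaves show MII Status: up
grep -E 'Bonding Mode|Slave Interface|MII Status' /proc/net/bonding/bond0

# Confirm the host still reaches the gateway over vmbr0
ping -c 3 192.168.1.1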
 
#1 bond-mode active-backup (start with that one to confirm the issue)
Code:
bond-mode active-backup
No switch changes needed: one NIC handles all traffic, and the other takes over if the first fails. You lose load balancing but gain immediate reliability.

If all your VMs immediately get stable networking, you've confirmed the root cause. Then decide if you want to set up LACP on your switch for the bandwidth benefit of 802.3ad.
I tried active-backup, to at least make it simple. No joy.

[Attached screenshot: ping output showing "PING: transmit failed. General failure."]

It can ping the PVE host, but not the gateway, and nothing outside...
 
"PING: transmit failed. General failure." is a very different error than a timeout. This means Windows couldn't even send the packet, the network stack itself is failing locally, before anything hits the wire. This shifts focus to the VM side.

On the Windows VM, can you run:
ipconfig /all
route print

This will show whether the VM has a valid IP, correct subnet mask, and a default gateway in its routing table. A missing/wrong gateway or subnet mask would cause "General failure" on anything outside the immediate network.
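For reference, a healthy result should include a default route entry in the IPv4 route table that looks roughly like this (the addresses and metric here are only illustrative, based on your 192.168.1.0/24 subnet):

Code:
IPv4 Route Table
===========================================================================
Active Routes:
Network Destination        Netmask          Gateway       Interface  Metric
          0.0.0.0          0.0.0.0      192.168.1.1    192.168.1.x      25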

Also try resetting the Windows network stack:

netsh winsock reset
netsh int ip reset
ipconfig /release
ipconfig /renew
(Requires reboot after the first two.)

And one more thing: on the PVE host, I need to see the output of:
cat /proc/sys/net/bridge/bridge-nf-call-iptables
iptables -L -n -v
pve-firewall status
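If bridge-nf-call-iptables comes back as 1 and iptables shows rules that could match bridged traffic, one quick, reversible diagnostic is to turn it off temporarily and see whether the VMs' behaviour changes. This is only a test sketch, not a permanent fix:

Code:
# Temporarily stop passing bridged traffic through iptables
sysctl -w net.bridge.bridge-nf-call-iptables=0

# Put it back once you've observed the effect
sysctl -w net.bridge.bridge-nf-call-iptables=1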

There are likely two issues stacked on top of each other:
1. The Windows VM's network stack is in a bad state (causing "General failure")
2. The PVE bridge/firewall may also be filtering traffic (which would explain the Linux VMs having the same slow-to-connect behavior)

The PVE host diagnostics will help sort out whether the Linux VM issue and the Windows VM issue share a common cause on the host side, or if Windows has its own separate problem.
 
"PING: transmit failed. General failure." is a very different error than a timeout. This means Windows couldn't even send the packet, the network stack itself is failing locally, before anything hits the wire. This shifts focus to the VM side.

ipconfig /release
ipconfig /renew
Looks like I had a stale DHCP lease that collided with another device. The release and renew cleared the jam, and that PC seems to be stable. I'll continue testing and comment further if there are any more issues, but that may have gotten this solved, thanks!
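(Side note for anyone who lands here with the same symptoms: one way to check whether an address is already claimed on the LAN before assigning it is duplicate address detection with arping from the PVE host. The interface and IP below are just examples for this setup; arping is in the iputils-arping package if it isn't installed already.)

Code:
# Duplicate Address Detection: exits non-zero if another host already answers for the IP
arping -D -I vmbr0 -c 3 192.168.1.50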