Host losing network when starting Windows 2025 VM

FrankWest

New Member
May 3, 2025
Hi,

I have quite a strange issue. I have a host in my lab with several Linux VMs, and that works perfectly fine. However, once I create a Windows Server 2025 VM and run it, the host loses its network connectivity and I have to reboot it in order to get access again. I've tried everything I could find: disabling ASPM in the kernel, disabling hardware offloading on the NIC. During the Windows setup it works fine, but as soon as the VM is fully up and running, the host loses its network connectivity after a random period of time. As soon as the Windows VM is shut down, the Proxmox host is stable again.

The NIC is an 'AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion]'. The Proxmox version is 8.4.1.
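For anyone hitting this later, a quick way to see what the atlantic driver reports when the link drops. This is a sketch: the interface name enp1s0 below is just an example, check yours with `ip link`.

```shell
# Collect AQC107/atlantic diagnostics into one file (interface name enp1s0
# is an example; adjust it to match your host).
LOG=/tmp/aqc107-diag.txt
{
  echo "== kernel messages =="
  dmesg 2>/dev/null | grep -iE 'atlantic|aquantia' | tail -n 50
  echo "== driver/firmware =="
  ethtool -i enp1s0 2>/dev/null
  echo "== offload settings =="
  ethtool -k enp1s0 2>/dev/null | grep -E 'tcp-segmentation|generic-(segmentation|receive)|checksum'
} > "$LOG"
cat "$LOG"
```

Run it right after the host drops off the network (from the local console) and the kernel-message section should show whether the driver reset or errored out.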

Any ideas?

Frank.
 
So if you take away its NIC, it doesn't bring down the host? Are you passing through the AQC107? How about just e1000e on vmbr0?
 
Without a virtual NIC present on the Windows VM, the host is stable. As soon as I give the VM a virtual NIC (even an e1000), the host loses its connectivity after a while. I'm not doing any passthrough. The host doesn't crash, so I can still access it on the console and reboot it from there.
 
What if you create a new bridge and put the VM on that, just to see what it does? It would probably work, but it's something to build from.

Also, make a quick additional Win10 or Win11 VM. Does that crash vmbr0?
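If it helps, a second test bridge in /etc/network/interfaces could look roughly like this. It's a sketch of a port-less bridge, so the VM's virtual NIC never touches the AQC107 at all, which isolates whether the virtual NIC alone is the trigger:

```
# /etc/network/interfaces -- hypothetical isolated test bridge with no
# physical port; VM traffic on it never reaches the AQC107
auto vmbr1
iface vmbr1 inet manual
        bridge-ports none
        bridge-stp off
        bridge-fd 0
```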
 
A new bridge also crashes the host's networking. So does a new VM.

As a last resort I replaced the AQC107-based host NIC with an X540-AT2-based NIC, and now the system has been stable for more hours than ever before. Let's monitor it over the next couple of days, but so far so good.
 
This sounds like an MTU issue. Is the Windows VM hogging the NIC with the same MTU it inherited from the host?
I would be interested to know what MTU setting you have (had?) on that NIC.

Thanks.
/etc/network/interfaces doesn't contain an MTU parameter, so it uses the default of 1500. The Windows VM also has the default MTU setting, since I didn't change that either. The NIC had an MTU of 1500. After I moved the Windows VM to another host, that host was stable; no issues. As soon as I moved the VM back to the original host, it lost its network connection, this time after about half an hour.
 
Maybe you have your Proxmox host IP on a specific VLAN (eth.X), and the VM on a non-VLAN-aware bridge with the same VLAN?
No, I haven't. The Proxmox host is in the same subnet (without a VLAN tag) as the VM. If it were a VLAN issue, it should happen immediately, not at random after a while (up to several hours, but most of the time within half an hour).

With the Intel NIC, the host is still stable, so it looks like it's solved. The only thing I changed after swapping the NICs was the NIC names in the /etc/network/interfaces file (the old NIC had one port and the new NIC has two ports).
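For reference, the change amounts to pointing the bridge at the new port name, roughly like this (interface names and addresses below are examples, not the real config):

```
# /etc/network/interfaces -- after the swap, vmbr0 just points at the
# first port of the two-port X540-AT2 (names/addresses are examples)
iface enp5s0f0 inet manual
iface enp5s0f1 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.10/24
        gateway 192.168.1.1
        bridge-ports enp5s0f0
        bridge-stp off
        bridge-fd 0
```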
 
/etc/network/interfaces doesn't contain an MTU parameter, so it uses the default of 1500. The Windows VM also has the default MTU setting, since I didn't change that either. The NIC had an MTU of 1500.
As a test, to see if my theory holds true (the Windows VM hogging the NIC through the same MTU as the host NIC itself), you could try setting a lower MTU on the VM. This can be done in the GUI from the VM hardware Network Device setting with Advanced checked. Alternatively, if possible, you could try a greater MTU value on the NIC itself, but this may not be possible with your HW/NW/infrastructure.
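For the record, the equivalent outside the GUI is the mtu option on the VM's net device in /etc/pve/qemu-server/&lt;vmid&gt;.conf; a sketch (the MAC address shown is a placeholder):

```
# net0 line with an explicit lower MTU for the test
net0: virtio=BC:24:11:00:00:01,bridge=vmbr0,mtu=1350
```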

I'm not sure if you are prepared to test it, but if you are, I would be most grateful for the results.
 
You could start with any value lower than the standard 1500, let's say 1350 to begin with?

One other thought that comes to mind, rereading your OP: maybe the Windows VM is somehow "power sleeping" the NIC. Have you gone into the Windows Device Manager and checked the "Allow the computer to turn off this device to save power" setting and the other energy-saving options?

BTW, what virtual NIC adapter have you given that Windows VM? Have you tried others?
 
I didn't look at the power settings of Windows, but that shouldn't bring down the entire host with all the VMs on it, should it? Also, that doesn't explain why replacing the host NIC solves the issue. I've tried all types of NICs for the Windows guest, but that didn't help.

To me it looks like a compatibility issue between the AQC107 NIC and Proxmox.
 
I didn’t look at the power settings of Windows, but that shouldn't bring down the entire host with all VMs on it, should it? Also, that doesn’t explain why replacing the host NIC solves the issue.
You are correct in assuming that "normally" it should not, but that Windows VM should not be bringing down the host network at all, which it has. It is probably a combination of that AQC107 NIC and virtualization via Proxmox.

In one case I have seen Windows VM power management cause issues on a host node, but agreeably it is rare.
Since your issue appears to happen "after a certain amount of time", power management comes to mind. I would really turn off any power-management/energy-saving settings for that NIC in the Windows VM's Device Manager. You could also try disabling any sleep/hibernation functions on that Windows client.
Anyway, I'm trying to see what has caused your issue; this could help other users, so we must cover all the basics!

To me it looks like a compatibility issue between the AQC107 NIC and Proxmox.
+ Windows!

Some more things to ponder:
  • Has the firmware on that AQC107 NIC been updated?
  • Are you running a 10Gb network? Could the NIC be overheating?


I thank you for your patience & preparedness in testing.
 
Looks like you have nailed it down: Windows VMs crash the networking stack in Proxmox with that NIC.
What emulations have you tried for the VMs?
Do you have local terminal access, or do you have to console or SSH in?
I'm wondering if you could also have `systemctl restart networking` run via cron every 15 minutes or so when you change the VM NIC, in case you lose access.
(Also, does that work to get vmbr0 back at all? If not, the VM is sending something weird to the hardware and crashing the card, which should show up in the kernel log.)
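A minimal sketch of that cron watchdog: ping the gateway, and restart networking if it stops answering. The gateway IP 192.168.1.1 is an assumption; substitute your own.

```shell
# Write a watchdog script that restarts networking when the gateway
# stops answering (gateway IP is a placeholder -- replace it).
cat > /tmp/net-watchdog.sh <<'EOF'
#!/bin/sh
GATEWAY="192.168.1.1"
if ! ping -c 3 -W 2 "$GATEWAY" >/dev/null 2>&1; then
    logger "net-watchdog: gateway unreachable, restarting networking"
    systemctl restart networking
fi
EOF
chmod +x /tmp/net-watchdog.sh
# Then in root's crontab:  */15 * * * * /tmp/net-watchdog.sh
```

Even if the restart doesn't recover the link, the logger line at least timestamps each drop in the journal.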
 
I did some more testing by swapping the Intel NIC back for the AQC107, but after following all the troubleshooting steps (changing the MTU, disabling power management settings), nothing helps. The host keeps losing network connectivity. Performing `systemctl restart networking` doesn't help either. Nor does disconnecting and reconnecting the network cable. I need to reboot the host in order to get connectivity back.

It's a 10Gb NIC; no overheating as far as I can tell. The firmware wasn't updated, and no firmware update is available. The host was an ESXi host before, with the same AQC107 NIC, and that one was rock stable. I've tried all types of virtual NICs, but all of them caused the host to lose connectivity.

I’ve reverted back to the Intel NIC and all is fine again.
 