Proxmox 8/8.1, high cpu on host, idle in guest

neko_code

Good night/evening everyone!

I searched this forum for this issue but couldn't find a solution that helped.
My issue: the Windows 11 guest reports a nearly idle CPU, while the host shows 100-600% CPU usage for that VM. The Windows VM has GPU passthrough. Ballooning is disabled. I checked the Windows 11 stats for IO/network usage, but it is very low, below 4% while doing nothing.
I tried disabling all software, including remote desktop (Moonlight).

In some cases the host shows the guest's CPU usage dropping below 50%, but after a minute it climbs back up.
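To narrow down where the 100-600% goes, it may help to break the QEMU process down by thread, since vCPU threads and I/O threads show up separately on the host. The following is a minimal sketch, not a Proxmox tool: it samples `/proc/<pid>/task/<tid>/stat` twice and prints per-thread CPU usage. The function names are made up for illustration, and the pid file path assumes VM ID 101 (the same path used in the strace command further down).

```python
import os
import time

# Assumed pid file location for VM 101 on the Proxmox host.
PID_FILE = "/var/run/qemu-server/101.pid"


def parse_stat_line(line):
    """Return (thread_name, cpu_ticks) from one /proc/<pid>/task/<tid>/stat line.

    The name is wrapped in parentheses and may contain spaces (QEMU names
    vCPU threads like "CPU 0/KVM"), so split on the last ')'.
    """
    head, _, tail = line.rpartition(")")
    name = head.split("(", 1)[1]
    fields = tail.split()
    # tail starts at field 3 (state); utime and stime are fields 14 and 15.
    utime, stime = int(fields[11]), int(fields[12])
    return name, utime + stime


def read_thread_ticks(pid):
    """Map thread id -> (name, cumulative CPU ticks) for every thread of pid."""
    ticks = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as f:
                ticks[int(tid)] = parse_stat_line(f.read())
        except FileNotFoundError:
            continue  # thread exited between listdir() and open()
    return ticks


def thread_cpu_percent(pid, interval=5.0):
    """Sample twice and return [(percent, tid, name)], busiest thread first."""
    hz = os.sysconf("SC_CLK_TCK")
    before = read_thread_ticks(pid)
    time.sleep(interval)
    after = read_thread_ticks(pid)
    usage = []
    for tid, (name, ticks) in after.items():
        if tid in before:
            delta = ticks - before[tid][1]
            usage.append((100.0 * delta / hz / interval, tid, name))
    return sorted(usage, reverse=True)


# Only runs on a host where the assumed pid file actually exists.
if __name__ == "__main__" and os.path.exists(PID_FILE):
    with open(PID_FILE) as f:
        qemu_pid = int(f.read().strip())
    for percent, tid, name in thread_cpu_percent(qemu_pid):
        print(f"{percent:6.1f}%  {tid:>8}  {name}")
```

If the busy threads are the vCPU threads, the load is coming from inside the guest even when Task Manager shows it as idle; if it is the main thread or an I/O thread, the emulation side is the more likely place to look.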

pveversion output:
Code:
proxmox-ve: 8.1.0 (running kernel: 6.5.11-7-pve)
pve-manager: 8.1.3 (running version: 8.1.3/b46aac3b42da5d15)
proxmox-kernel-helper: 8.1.0
pve-kernel-5.15: 7.4-6
proxmox-kernel-6.5: 6.5.11-7
proxmox-kernel-6.5.11-7-pve-signed: 6.5.11-7
proxmox-kernel-6.2.16-20-pve: 6.2.16-20
proxmox-kernel-6.2: 6.2.16-20
proxmox-kernel-6.2.16-14-pve: 6.2.16-14
pve-kernel-5.15.116-1-pve: 5.15.116-1
pve-kernel-5.15.53-1-pve: 5.15.53-1
ceph-fuse: 16.2.11+ds-2
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown: residual config
ifupdown2: 3.2.0-1+pmx7
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.0.7
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.1.0
libpve-guest-common-perl: 5.0.6
libpve-http-server-perl: 5.0.5
libpve-network-perl: 0.9.5
libpve-rs-perl: 0.8.7
libpve-storage-perl: 8.0.5
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve4
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.1.2-1
proxmox-backup-file-restore: 3.1.2-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.2
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.3
proxmox-widget-toolkit: 4.1.3
pve-cluster: 8.0.5
pve-container: 5.0.8
pve-docs: 8.1.3
pve-edk2-firmware: 4.2023.08-2
pve-firewall: 5.0.3
pve-firmware: 3.9-1
pve-ha-manager: 4.0.3
pve-i18n: 3.1.5
pve-qemu-kvm: 8.1.2-6
pve-xtermjs: 5.3.0-3
qemu-server: 8.0.10
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.2-pve1
I don't understand why it still lists pve-kernel-5.15, though (update: apt autoremove got rid of it).

Windows 11 version: 10.0.22000
VirtIO driver version: 0.1.240 (updated from 0.1.171 yesterday, hoping the issue would go away; it didn't)

Host specs:
i7-13700, 128 GB RAM at 2666 MHz, RTX 4080. The Intel integrated GPU is passed through to another Linux VM (with no issues). I can post more specs, but I don't think the hardware is the reason.

VM specs:
Code:
args: -uuid 00000000-0000-0000-0000-000000000101 -machine hpet=off -rtc driftfix=slew -global kvm-pit.lost_tick_policy=discard -cpu 'host,-hypervisor,+kvm_pv_unhalt,+kvm_pv_eoi,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_reset,hv_synic,hv_stimer,hv_vpindex,hv_runtime,hv_relaxed,kvm=off,hv_vendor_id=intel'
balloon: 0
bios: ovmf
boot: order=virtio0;net0
cores: 12
cpu: host,flags=+pcid
efidisk0: data-2:vm-101-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
hostpci0: 0000:01:00,pcie=1,x-vga=1
machine: pc-q35-8.1
memory: 42000
meta: creation-qemu=7.0.0,ctime=1663465040
name: win11
net0: virtio=76:03:84:99:4B:3F,bridge=vmbr0,tag=70
numa: 0
ostype: other
scsihw: virtio-scsi-single
smbios1: uuid=58741d44-5349-4bff-975c-7bc584445f37
sockets: 1
tablet: 0
tpmstate0: data-2:vm-101-disk-1,size=4M,version=v2.0
vga: none
virtio0: data-2:vm-101-disk-2,iothread=1,size=128G
vmgenid: 5cf688dd-9de3-45f4-8ac5-7d169327e336

I tried removing args completely to check whether it was the cause, but host CPU usage stayed high.

What I have tried so far, none of which helped:
- Changing the CPU type to kvm64, and to some other specific CPU types
- Setting the -machine no-hpet option and clock skew options
- Changing VirtIO SCSI to VirtIO SCSI Single
- Removing the PCI device (GPU passthrough)
- Changing disk parameters (with and without iothread)
- Changing the machine type from pc-q35-7.1 to 8.1 (I can't switch to a different type because Proxmox says I should use q35 for passthrough)
- Lowering the core count (I don't use NUMA because it's a single-CPU board)
- Lowering the RAM for that VM
- Changing the OS type in the Proxmox settings to "Other"
- Disabling trackpad tracking


One thing I did notice is that the strace output shows a lot of errors on futex calls:
Code:
# strace -c -p $(cat /var/run/qemu-server/101.pid)
strace: Process 17587 attached
strace: Process 17587 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 78.57    2.233183          16    132816           ppoll
  7.04    0.200061           6     30691           read
  6.90    0.196116           2     80337           ioctl
  6.35    0.180434           1     99245     10854 futex
  0.95    0.026892           0     34287           write
  0.19    0.005493           0      8347           recvmsg
  0.00    0.000076           2        34           accept4
  0.00    0.000023           0       170           sendmsg
  0.00    0.000015           0        34           close
  0.00    0.000006           0        68           fcntl
  0.00    0.000004           0        34           getsockname
------ ----------- ----------- --------- --------- ----------------
100.00    2.842303           7    386063     10854 total
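To put the futex errors in proportion, here is a small sketch that parses the `strace -c` table above and prints the share of failed calls per syscall. The parser is an illustration written for this output layout only (the function name is made up). Two caveats worth keeping in mind: failed futex calls are often just EAGAIN or ETIMEDOUT from ordinary lock contention, so the count alone may not point to a fault, and `strace -p` without `-f` attaches only to the given thread, so the vCPU threads are probably not included in this summary.

```python
SAMPLE = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 78.57    2.233183          16    132816           ppoll
  7.04    0.200061           6     30691           read
  6.90    0.196116           2     80337           ioctl
  6.35    0.180434           1     99245     10854 futex
  0.95    0.026892           0     34287           write
  0.19    0.005493           0      8347           recvmsg
  0.00    0.000076           2        34           accept4
  0.00    0.000023           0       170           sendmsg
  0.00    0.000015           0        34           close
  0.00    0.000006           0        68           fcntl
  0.00    0.000004           0        34           getsockname
------ ----------- ----------- --------- --------- ----------------
100.00    2.842303           7    386063     10854 total
"""


def parse_strace_summary(text):
    """Parse the table printed by `strace -c` into a list of dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have 5 columns, or 6 when the "errors" column is filled in.
        if len(parts) not in (5, 6) or not parts[0].replace(".", "").isdigit():
            continue  # header, separator, or "strace: ..." status line
        if parts[-1] == "total":
            continue
        rows.append({
            "syscall": parts[-1],
            "seconds": float(parts[1]),
            "calls": int(parts[3]),
            "errors": int(parts[4]) if len(parts) == 6 else 0,
        })
    return rows


for row in parse_strace_summary(SAMPLE):
    failed = 100.0 * row["errors"] / row["calls"]
    print(f'{row["syscall"]:<12} {row["calls"]:>8} calls  {failed:5.1f}% failed')
```

For this capture, roughly 11% of the futex calls returned an error (10854 of 99245), and no other syscall reported any.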

The host is Debian 12 with Proxmox on top, running several VMs, and only the Windows 11 VM behaves like this.
Screenshot 2024-01-04 at 21.07.28.png


Host grub parameters:
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nomodeset intel_iommu=on iommu=pt initcall_blacklist=sysfb_init pcie_acs_override=downstream,multifunction video=simplefb:off video=vesafb:off video=efifb:off video=vesa:off disable_vga=1"

These parameters are required for GPU passthrough so that the host doesn't use my GPU, which avoids problems with the VM (taken from the docs, if I recall correctly).
I have another Linux-based VM (Debian 12/6.15 kernel) with exactly the same GPU parameters (no device sharing; I have to shut down the Windows VM to use the Linux VM), and it has no such issue, so I don't think this is related to GPU passthrough.

Any ideas are welcome :)
 
Did you shut down all VMs or reboot after upgrading "pve-qemu-kvm 8.1.2-5" to "pve-qemu-kvm 8.1.2-6"? This sounds like the bug in "pve-qemu-kvm 8.1.2-5" that was fixed in your "pve-qemu-kvm 8.1.2-6", but VMs keep using the old "pve-qemu-kvm 8.1.2-5" until you restart them.
 
I had to shut down the whole node to replace the fan, so yes.
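For anyone who wants to verify this without a full reboot: a rough sketch of one way to spot VMs that are still running a QEMU binary from before a package upgrade. It relies on the general Linux behavior that `/proc/<pid>/exe` gains a " (deleted)" suffix once the file the process was started from has been replaced on disk; it is not an official Proxmox check, the function names are invented for illustration, and the pid directory is assumed to be `/var/run/qemu-server`.

```python
import os

# Assumed directory where Proxmox keeps one <vmid>.pid file per running VM.
PID_DIR = "/var/run/qemu-server"


def binary_replaced(exe_link):
    """True if a /proc/<pid>/exe target shows the on-disk binary was replaced."""
    # The kernel appends " (deleted)" when the original file has been unlinked,
    # which typically happens to the old binary during a package upgrade.
    return exe_link.endswith(" (deleted)")


def stale_qemu_vms(pid_dir=PID_DIR):
    """Return [(vmid, pid)] for VMs whose QEMU binary no longer exists on disk."""
    stale = []
    for entry in sorted(os.listdir(pid_dir)):
        if not entry.endswith(".pid"):
            continue
        with open(os.path.join(pid_dir, entry)) as f:
            pid = int(f.read().strip())
        try:
            exe = os.readlink(f"/proc/{pid}/exe")
        except OSError:
            continue  # leftover pid file or insufficient permissions
        if binary_replaced(exe):
            stale.append((entry[:-4], pid))
    return stale


# Only runs on a host where the assumed pid directory exists.
if __name__ == "__main__" and os.path.isdir(PID_DIR):
    for vmid, pid in stale_qemu_vms():
        print(f"VM {vmid} (pid {pid}) is still running a replaced QEMU binary")
```

An empty result suggests every running VM was started from the binary currently installed, which matches the situation after a full node shutdown.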
 
Turns out it was Windows 11's fault (lol, not surprised). When I cloned it (using Clonezilla) to a barebones configuration, I saw the same weird behavior: Task Manager showed around 1-4% CPU usage while the whole OS was very laggy. I had to reinstall the whole OS (currently 22h3), and now it runs smoothly with no CPU spikes. I don't know what was going on; once again, Task Manager didn't show any process with high IO/network/CPU usage.

Sorry for the inconvenience; I thought it was related to the VM host software.
 
