Hello there, I'm new here, and I'm starting off with a problem I haven't managed to find an answer to, though maybe I just can't search well enough.
For a few weeks, I've been haunted by strange Win11 VM freezes. What happens is the most awkward OS freeze I've ever seen. The sound just goes silent: no "machine-gun fire", no 1 s loops, just silence. However, the mouse still works, and I can even switch between windows. If I have Task Manager open, it stops updating, but I can still click through its windows. If I'm in a game at the time, the game stays playable: I can still walk around, but interacting with items doesn't work. I can alt-tab to another window, and sometimes it's even possible to open another website in the web browser. But all of this eventually leads to a complete freeze/black screen. If the VM is still responding, I can open the Start Menu and click Restart/Shutdown, but all that does is show the corresponding screen and then freeze completely. All I can really do is use Stop from the Proxmox UI.
This issue started happening at random; I can't pin it to any change in hardware, software, or package updates in Proxmox itself. Sometimes it happens after 15 minutes of VM runtime, sometimes not for 4 days. For the few months before this, however, it wasn't happening at all.
I managed to roughly correlate the freezes with these messages in the Syslog:
Code:
Sep 07 22:24:03 pve kernel: kvm: vcpu 0: requested 31256 ns lapic timer period limited to 200000 ns
Sep 07 22:26:08 pve pvedaemon[36808]: <root@pam> successful auth for user 'root@pam'
Sep 07 22:32:54 pve kernel: kvm: vcpu 0: requested 31256 ns lapic timer period limited to 200000 ns
Sep 07 22:33:45 pve kernel: kvm: vcpu 0: requested 31256 ns lapic timer period limited to 200000 ns
Sep 07 22:38:33 pve smartd[1188]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 35 to 36
Sep 07 22:38:34 pve smartd[1188]: Device: /dev/sde [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 36 to 38
Sep 07 22:41:08 pve pvedaemon[36536]: <root@pam> successful auth for user 'root@pam'
Sep 07 22:51:14 pve pveproxy[93223]: worker exit
Sep 07 22:51:14 pve pveproxy[1567]: worker 93223 finished
Sep 07 22:51:14 pve pveproxy[1567]: starting 1 worker(s)
Sep 07 22:51:14 pve pveproxy[1567]: worker 114331 started
Sep 07 22:52:11 pve kernel: kvm: vcpu 0: requested 93752 ns lapic timer period limited to 200000 ns
Sep 07 22:56:09 pve pvedaemon[36536]: <root@pam> successful auth for user 'root@pam'
Sep 07 22:59:51 pve kernel: kvm: vcpu 0: requested 31256 ns lapic timer period limited to 200000 ns
Sep 07 23:01:16 pve pveproxy[1567]: worker 96015 finished
Sep 07 23:01:16 pve pveproxy[1567]: starting 1 worker(s)
Sep 07 23:01:16 pve pveproxy[1567]: worker 115870 started
Sep 07 23:45:32 pve kernel: kvm: vcpu 0: requested 62496 ns lapic timer period limited to 200000 ns
Sep 07 23:45:33 pve kernel: kvm: vcpu 0: requested 62496 ns lapic timer period limited to 200000 ns
Sep 07 23:46:36 pve pveproxy[119861]: worker exit
Sep 07 23:46:36 pve pveproxy[1567]: worker 119861 finished
Sep 07 23:46:36 pve pveproxy[1567]: starting 1 worker(s)
Sep 07 23:46:36 pve pveproxy[1567]: worker 123028 started
Sep 07 23:51:00 pve kernel: kvm: vcpu 0: requested 31256 ns lapic timer period limited to 200000 ns
These two are from the two latest freezes.
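In case it helps anyone compare against their own logs, here is a rough Python sketch that pulls only the `lapic timer period limited` kernel lines out of a Syslog excerpt like the one above, so their timestamps can be lined up with freeze times. The function name `parse_lapic_warnings` and the `year` default are my own assumptions (the Syslog lines carry no year), and the month parsing assumes an English locale. It is only a filter for this one message format, not a diagnosis.

```python
import re
from datetime import datetime

# Matches kernel lines of the form seen in the excerpt, e.g.:
# Sep 07 22:24:03 pve kernel: kvm: vcpu 0: requested 31256 ns lapic timer period limited to 200000 ns
LAPIC_RE = re.compile(
    r"^(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}) \S+ kernel: kvm: vcpu (?P<vcpu>\d+): "
    r"requested (?P<req>\d+) ns lapic timer period limited to (?P<limit>\d+) ns$"
)


def parse_lapic_warnings(lines, year=2023):
    """Return (timestamp, vcpu, requested_ns, limit_ns) for each matching line.

    `year` is an assumption: the log lines themselves do not include one.
    """
    events = []
    for line in lines:
        m = LAPIC_RE.match(line.strip())
        if not m:
            continue  # skip pvedaemon, pveproxy, smartd and other unrelated lines
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        events.append((ts, int(m["vcpu"]), int(m["req"]), int(m["limit"])))
    return events


if __name__ == "__main__":
    sample = [
        "Sep 07 22:24:03 pve kernel: kvm: vcpu 0: requested 31256 ns lapic timer period limited to 200000 ns",
        "Sep 07 22:26:08 pve pvedaemon[36808]: <root@pam> successful auth for user 'root@pam'",
        "Sep 07 22:32:54 pve kernel: kvm: vcpu 0: requested 31256 ns lapic timer period limited to 200000 ns",
    ]
    for ts, vcpu, req, limit in parse_lapic_warnings(sample):
        print(ts.isoformat(), f"vcpu={vcpu}", f"requested={req}ns", f"limit={limit}ns")
```

Feeding it the saved Syslog text line by line gives a short list of timestamps that can be checked against the moments the VM froze.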
As you can see on the graph, once this happens all work in the VM inevitably halts, and traffic and usage drop to a minimum.
I recently upgraded Proxmox from 7 to 8, mainly because the freezes were already happening on 7. For the same reason, I changed the Q35 machine version from 5.1 to 8.0.
The VM config:
Code:
args: -cpu 'host,hv_vapic,hv_stimer,hv_time,hv_synic,hv_vpindex,+invtsc,-hypervisor'
bios: ovmf
boot: order=scsi0;net0;ide0
cores: 12
cpu: host,hidden=1,flags=+md-clear;+pcid;+spec-ctrl;+ssbd;-ibpb;-virt-ssbd;-amd-ssbd;-amd-no-ssb;+pdpe1gb;+hv-tlbflush;+hv-evmcs;+aes
cpulimit: 12
cpuunits: 200
efidisk0: local-lvm:vm-101-disk-0,efitype=4m,size=4M
hostpci0: 0000:03:00,pcie=1,x-vga=1
ide0: none,media=cdrom
localtime: 1
machine: pc-q35-8.0
memory: 32768
meta: creation-qemu=7.2.0,ctime=1682936634
name: kinsenka
net0: virtio=FE:69:70:FD:CA:E2,bridge=vmbr0,firewall=1
numa: 0
ostype: win11
scsi0: tenjikubotan:vm-101-disk-0,iothread=1,size=900G,ssd=1
scsi1: asobi:vm-101-disk-0,backup=0,iothread=1,size=450G,ssd=1
scsi2: data:vm-101-disk-0,backup=0,iothread=1,size=700G
scsihw: virtio-scsi-single
smbios1: uuid=38069bd2-b3dc-4e5e-bca1-325661889236
sockets: 1
tpmstate0: local-lvm:vm-101-disk-2,size=4M,version=v2.0
usb0: host=1e7d:300c
usb1: host=258a:0027
usb2: host=24c6:592a
usb4: host=03f0:2b17
vmgenid: 1a197821-7e79-42c8-b5f1-3cb336482f82
And the pveversion output:
Code:
proxmox-ve: 8.0.2 (running kernel: 6.2.16-12-pve)
pve-manager: 8.0.4 (running version: 8.0.4/d258a813cfa6b390)
proxmox-kernel-helper: 8.0.3
pve-kernel-5.15: 7.4-6
proxmox-kernel-6.2.16-12-pve: 6.2.16-12
proxmox-kernel-6.2: 6.2.16-12
pve-kernel-5.15.116-1-pve: 5.15.116-1
pve-kernel-5.15.111-1-pve: 5.15.111-1
pve-kernel-5.15.107-1-pve: 5.15.107-1
pve-kernel-5.15.102-1-pve: 5.15.102-1
ceph-fuse: 16.2.11+ds-2
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.25-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.5
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.8
libpve-guest-common-perl: 5.0.4
libpve-http-server-perl: 5.0.4
libpve-rs-perl: 0.8.5
libpve-storage-perl: 8.0.2
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.2-1
proxmox-backup-file-restore: 3.0.2-1
proxmox-kernel-helper: 8.0.3
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.0.6
pve-cluster: 8.0.3
pve-container: 5.0.4
pve-docs: 8.0.4
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.8-2
pve-ha-manager: 4.0.2
pve-i18n: 3.0.5
pve-qemu-kvm: 8.0.2-5
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.7
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.12-pve1
I probably don't need to add that I'm a total newbie with this. I was also wondering whether this might be an issue with Windows itself, but it's not throwing any error or bluescreen, and there's nothing in the Event Viewer/Windows logs either.