Freezing of virtual Windows clients on boot

Martec

I'm seeing odd behavior on some of my Windows 10 and Windows Server 2016/2019 guests.
When the guest is restarted, it does not come up and hangs at boot (at the Windows spinning circle).
This usually happens after about a month of guest uptime.
If I press Reset on the Proxmox server, the guest hangs at boot again.
If I press Stop and then start the guest, it boots normally, and I can reboot or reset it without the boot problem.
After a month (especially around Microsoft Patch Day) it happens again.
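
The stop-then-start workaround described above can also be run from the host shell instead of the GUI. The following is only a minimal sketch: the function name cold_restart and the QM override variable are made up for illustration, and VMID 102 is simply the guest whose config is posted below. Note that `qm stop` is a hard stop, not a clean guest shutdown.

```shell
#!/usr/bin/env bash
# Sketch of the "Stop, then Start" workaround from the host CLI.
# QM can be overridden for dry runs; by default the real `qm` tool is used.
QM="${QM:-qm}"

# cold_restart <vmid>: hard-stop the VM, then start it again.
# Unlike Reset, this ends the QEMU process and starts a fresh one.
cold_restart() {
    local vmid="$1"
    "$QM" stop "$vmid" && "$QM" start "$vmid"
}

# Usage (run as root on the PVE host):
#   cold_restart 102
```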

I have seen it on different hardware platforms (Epyc, Xeon E).
The virtual machines are usually always online. The Windows 10 guests restart after Windows updates and antivirus updates.
It began about a year ago and may correlate with the new kernel (5.15) or virtio versions. (Current kernel: Linux pve 5.15.39-1-pve)
I installed the virtio 0.1.221 guest drivers and tools. The language is German, so I can't use the installer.
It got better with the newer kernels and virtio drivers. But we use German ISOs, so I run into problems when installing the newest ones.
The virtio driver for W10 is 100.91.104.22100 (21.05.2022).

root@pve:~# cat /etc/pve/qemu-server/102.conf
Code:
agent: 1,fstrim_cloned_disks=1
balloon: 0
bios: ovmf
bootdisk: virtio0
cores: 2
efidisk0: local-zfs:vm-102-disk-0,size=1M
ide2: none,media=cdrom
machine: pc-q35-5.1
memory: 4000
name: OIP01
net0: e1000=XX:XX:XX:XX:XX:XX,bridge=vmbr0
numa: 0
onboot: 1
ostype: win10
scsihw: virtio-scsi-pci
smbios1: uuid=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx
sockets: 1
virtio0: local-zfs:vm-102-disk-1,discard=on,mbps_wr=200,size=90G
vmgenid: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx

My guess is that it has to do with the storage driver, but I'm not sure. Has anyone experienced the same behavior?
Maybe someone already has a hint or a solution?
 
Maybe it's the same behaviour. I can't tell because there is too little information about it.
I use different versions, but it seems to have begun with PVE 7.

root@pve:~# pveversion -v
Code:
proxmox-ve: 7.2-1 (running kernel: 5.15.39-1-pve)
pve-manager: 7.2-7 (running version: 7.2-7/d0dd0e85)
pve-kernel-5.15: 7.2-6
pve-kernel-helper: 7.2-6
pve-kernel-5.11: 7.0-10
pve-kernel-5.4: 6.4-5
pve-kernel-5.3: 6.1-6
pve-kernel-5.15.39-1-pve: 5.15.39-1
pve-kernel-5.15.35-2-pve: 5.15.35-5
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-4-pve: 5.11.22-9
pve-kernel-5.4.128-1-pve: 5.4.128-2
pve-kernel-5.4.106-1-pve: 5.4.106-1
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.10-1-pve: 5.3.10-1
ceph-fuse: 15.2.16-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-3
libpve-storage-perl: 7.2-7
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.5-1
proxmox-backup-file-restore: 2.2.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-2
pve-docs: 7.2-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.5-1
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 6.2.0-11
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.4-pve1
 
Thank you, I will try that.
It takes some time until it occurs again, so hopefully it won't happen in the next 3 months. :)

Kind regards
 
Hello, Proxmox users!

We experience the same problem on Windows 2016, Windows 2019 and Windows 2022 guests.

At some point in January 2022, after Windows updates, this "Windows won't boot" behaviour started to show up.

It last happened yesterday on proxmox-ve: 7.3-1 (running kernel: 5.15.102-1-pve); we are also using virtio 0.1.221 drivers on the faulty VM.

It happened on AMD EPYC 7513 and on Xeon Gold 6230 platforms, so it does not seem to be platform related.


(A bonus is that sometimes the VM won't start at all after a STOP, and a recovery is needed.)

I have seen that the related bug https://bugzilla.proxmox.com/show_bug.cgi?id=3933 is still open.

So far I have seen multiple possibilities:

- a problem with virtio?
- recreate the EFI disks?
- change the parameter /sys/module/kvm/parameters/ignore_msrs?
- change the command-line args option: -cpu '...'
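
As a starting point for the ignore_msrs idea in the list above, the current value of the parameter can be read on the host before changing anything. This is only a sketch: the function name show_ignore_msrs and the KVM_PARAM override variable are made up for illustration, and nothing in this thread confirms that changing the parameter fixes the boot hang.

```shell
#!/usr/bin/env bash
# Path of the KVM module parameter named in the list above.
# KVM_PARAM can be overridden, e.g. to test on a machine without KVM.
KVM_PARAM="${KVM_PARAM:-/sys/module/kvm/parameters/ignore_msrs}"

# show_ignore_msrs: print the current value (Y or N on a KVM host).
show_ignore_msrs() {
    if [ -r "$KVM_PARAM" ]; then
        printf 'ignore_msrs=%s\n' "$(cat "$KVM_PARAM")"
    else
        printf 'ignore_msrs=unknown\n'
        return 1
    fi
}

# To change it at runtime (as root, lost on the next host reboot):
#   echo Y > /sys/module/kvm/parameters/ignore_msrs
# To make it persistent, a module option is commonly used:
#   echo "options kvm ignore_msrs=1" > /etc/modprobe.d/kvm.conf
```

Running show_ignore_msrs before and after a change makes it easier to tell which setting was active when a hang occurred.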

Even though we are long-time Proxmox users, we must admit that we are a bit stuck on how to attack this problem (which seems to be Microsoft related).

How should we investigate? What is the consensus on how to solve this? What would be the best workaround?

Thanks.

pve-manager/7.3-6/723bb6ec (running kernel: 5.15.102-1-pve)
root@proxmox1:/mnt/pve/Proxmox/template/iso# pveversion -v
proxmox-ve: 7.3-1 (running kernel: 5.15.102-1-pve)
pve-manager: 7.3-6 (running version: 7.3-6/723bb6ec)
pve-kernel-helper: 7.3-8
pve-kernel-5.15: 7.3-3
pve-kernel-5.15.102-1-pve: 5.15.102-1
pve-kernel-5.15.30-2-pve: 5.15.30-3
ceph-fuse: 15.2.16-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.3-2
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-2
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-6
libpve-storage-perl: 7.3-2
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.3.3-1
proxmox-backup-file-restore: 2.3.3-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.1-1
proxmox-widget-toolkit: 3.5.5
pve-cluster: 7.3-2
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20221111-1
pve-firewall: 4.2-7
pve-firmware: 3.6-4
pve-ha-manager: 3.5.1
pve-i18n: 2.8-3
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.9-pve1

boot: order=virtio0;ide2;ide3
cores: 4
efidisk0: ssd1:vm-20001-disk-1,efitype=4m,pre-enrolled-keys=1,size=1M
ide2: none,media=cdrom
ide3: none,media=cdrom
machine: pc-i440fx-6.2
memory: 8192
meta: creation-qemu=6.2.0,ctime=1669631494
name: FFFFF-ADDC
net0: virtio=22:D9:47:8D:DB:84,bridge=vmbr0,firewall=1,tag=20
numa: 0
onboot: 1
ostype: win11
scsihw: virtio-scsi-pci
smbios1: uuid=a471a91a-81cf-40e9-b8e9-b316e20d7b5b
sockets: 1
startup: order=1,up=30
tpmstate0: ssd1:vm-20001-disk-3,size=4M,version=v2.0
virtio0: ssd1:vm-20001-disk-4,size=64G
vmgenid: 327a751e-d642-4d59-96bf-31ec0eb2347b
 
Same here.
Windows Server 2019, 2022 and Windows 10 VMs show the "spinning icon" (not spinning here) after the reboot that follows a Windows update. Rebooting reproduces exactly this state. The machine must be stopped and will then boot fine on the next start.

We are on a fairly recent PVE version.
pve-manager/7.4-4/4a8501a8 (running kernel: 6.2.11-2-pve)
root@pve-4-5-rz.wo.priv:~ # pveversion -v
proxmox-ve: 7.4-1 (running kernel: 6.2.11-2-pve)
pve-manager: 7.4-4 (running version: 7.4-4/4a8501a8)
pve-kernel-5.15: 7.4-3
pve-kernel-6.2.11-2-pve: 6.2.11-2
pve-kernel-5.15.107-2-pve: 5.15.107-2
ceph: 17.2.6-pve1
ceph-fuse: 17.2.6-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx4
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4-3
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.4-1
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.6
libpve-storage-perl: 7.4-3
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.2-1
proxmox-backup-file-restore: 2.4.2-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.7.0
pve-cluster: 7.3-3
pve-container: 4.4-4
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-2
pve-firewall: 4.3-2
pve-firmware: 3.6-5
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-2
qemu-server: 7.4-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
 
Same here: two of our Windows 10 Pro VMs hang on the boot screen (Windows logo + loading circle dots) until we manually "stop" the VM and then switch it on again. After that, the machine boots fine and the Windows updates run through automatically and without errors.

Does anyone know if there is a simple solution (script etc.) to, let's say, recognize that the guest tools are not loading and automatically switch a VM off and on again?
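
One possible shape for such a script, as a rough sketch rather than a tested solution: ping the QEMU guest agent from the host and, if it never answers, do a stop followed by a start. It assumes the guest agent is installed in Windows and enabled in the VM options (agent: 1). The names watch_vm, agent_alive, TRIES and DELAY are made up for illustration, and the timings are arbitrary. A guest that is still installing updates during boot may also stay silent for a long time, so the timeout should be generous.

```shell
#!/usr/bin/env bash
# Sketch: cold-restart a VM whose guest agent stays silent after boot.
QM="${QM:-qm}"          # overridable for dry runs
TRIES="${TRIES:-10}"    # hypothetical default: number of ping attempts
DELAY="${DELAY:-30}"    # hypothetical default: seconds between attempts

# agent_alive <vmid>: succeeds if the guest agent answers a ping.
agent_alive() {
    "$QM" guest cmd "$1" ping >/dev/null 2>&1
}

# watch_vm <vmid>: wait for the agent; if it never answers,
# hard-stop the VM and start it again.
watch_vm() {
    local vmid="$1" i
    for ((i = 0; i < TRIES; i++)); do
        if agent_alive "$vmid"; then
            echo "vm $vmid: agent responding"
            return 0
        fi
        sleep "$DELAY"
    done
    echo "vm $vmid: agent silent, doing stop/start"
    "$QM" stop "$vmid" && "$QM" start "$vmid"
}

# Usage (as root on the PVE host, e.g. from a cron job or systemd timer):
#   watch_vm 102
```

Because `qm stop` is a hard stop, this should only be used on guests where an interrupted boot is known to be safe.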
 
We have that in two different organizations of our group; Server 2019 and Server 2022 are affected, and the problem also occurs in the newest PVE version.
We think it might have to do with Windows fast boot (because a reboot does not fix it but a poweroff / start does).
We are in the process of switching fast boot off inside Windows on a few of our virtual machines.
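
For anyone who wants to try the same thing from the PVE host, here is a minimal sketch. It assumes the QEMU guest agent is running inside the Windows guest, and it uses `powercfg /h off`, which disables hibernation and with it Fast Startup. The function name disable_fast_startup and the QM override variable are made up for illustration, and whether fast boot is really the cause is still unconfirmed.

```shell
#!/usr/bin/env bash
# Sketch: turn off hibernation (and thereby Fast Startup) in a Windows
# guest through the QEMU guest agent, from the PVE host.
QM="${QM:-qm}"   # overridable for dry runs

# disable_fast_startup <vmid>: runs powercfg inside the guest.
# The guest must be booted and its agent must be responding.
disable_fast_startup() {
    local vmid="$1"
    "$QM" guest exec "$vmid" -- powercfg.exe /h off
}

# Usage (as root on the PVE host):
#   disable_fast_startup 102
```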
 