Hey there,
we have this on all Windows VMs installed on PVE 6.x after the upgrade to 7.x,
and also for VMs installed on 7.x.
Can you post a VM config?
We have tested a few things. This seems to be an issue with Hyper-V VMs imported into Proxmox on 7.1-4. We have been importing Gen 2 Hyper-V VMs into Proxmox and also importing from another KVM-based hypervisor using qm importdisk. The KVM-based hypervisor imports work fine; the Hyper-V imports have been causing trouble only on 7.1-4. The Hyper-V imports to 6.4-13 are not having this issue.
I am worried that upgrading our cluster to 7.1-4 is now going to make this issue appear. Windows VMs get stuck on a black screen and the only solution is to stop/start them. This is not ideal, as it means we are having to fix this manually in the wee hours of the morning before the scheduled reboots take effect.
I think this is an issue with UEFI / EFI, but I am not 100% sure at this point; 7.1-4 seems to be the only thing that is different and showing these issues.
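For context, our Hyper-V import workflow is roughly the sketch below. The VM ID 200, the disk path and the reuse of the VM_SSD storage name are only placeholders for illustration, not copied from a specific host:
Code:
# create an empty VM shell for the imported Gen 2 (UEFI) guest
qm create 200 --name hyperv-import --memory 8192 --cores 4 --ostype win10 --bios ovmf
# Gen 2 guests boot via UEFI, so give the VM an EFI vars disk on the same storage
qm set 200 --efidisk0 VM_SSD:1
# convert and import the exported VHDX onto the VM_SSD storage
qm importdisk 200 /mnt/export/server.vhdx VM_SSD --format raw
# attach the imported (currently unused) disk and boot from it
# (the exact volume name depends on the storage type)
qm set 200 --virtio0 VM_SSD:vm-200-disk-1 --boot order=virtio0
# when a VM hangs on the black screen, the only recovery we have found is a hard stop/start
qm stop 200
qm start 200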
Can you try with:
Same here for our Windows VMs. Some reboot, some don't.
Windows Server 2019 pc-i440fx-5.2 (SeaBIOS), VirtIO SCSI
Windows Server 2022 pc-i440fx-6.0 (OVMF UEFI), VirtIO SCSI
machine: pc-q35-6.1
Windows Server 2019 Datacenter (1809)
agent: 1
balloon: 4096
bootdisk: virtio0
cores: 4
cpu: Westmere
cpuunits: 2048
ide0: none,media=cdrom
localtime: 1
memory: 8192
name: WEHY-SM
net0: virtio=8A:81:1A:36:82:C8,bridge=vmbr0,firewall=1,link_down=1,tag=102
net1: virtio=6A:44:2E:95:F3:09,bridge=vmbr0,firewall=1,tag=108
numa: 1
onboot: 1
ostype: win10
protection: 1
scsihw: virtio-scsi-single
smbios1: uuid=4778cce2-9222-46f0-870d-7c1230f07d84
sockets: 2
startup: order=400,up=60
virtio0: VM_SSD:vm-115-disk-1,discard=on,size=64G
virtio1: VM_SSD:vm-115-disk-0,discard=on,size=96G
vmgenid: 4f27a838-0055-473c-8c4d-52727aac1525
Windows Server 2016 Standard (1607)
agent: 1
balloon: 8192
bootdisk: virtio0
cores: 4
cpu: Westmere
ide0: none,media=cdrom
memory: 16384
name: FC-01
net0: virtio=6A:6B:6A:51:6D:37,bridge=vmbr0,firewall=1,tag=2000
numa: 1
onboot: 1
ostype: win10
protection: 1
scsihw: virtio-scsi-pci
smbios1: uuid=a98ebbed-ac84-4418-a083-b7a37ac4555d
sockets: 2
startup: order=200,up=60
virtio0: VM_HDD:vm-106-disk-0,discard=on,size=1000G
vmgenid: ffa8c494-81e3-42ad-8818-cb6a8800b434
Windows Server 2019 Datacenter (1809)
agent: 1
boot: order=scsi0;net0
cores: 1
cpu: Westmere
ide0: none,media=cdrom
memory: 4098
name: AC-W01
net0: virtio=5A:42:EC:03:A5:C6,bridge=vmbr0,firewall=1,tag=30
numa: 1
onboot: 1
ostype: win10
protection: 1
scsi0: VM_HDD:vm-149-disk-0,discard=on,size=64G
scsihw: virtio-scsi-pci
smbios1: uuid=d26bb6a2-e672-43c0-9bdb-f23599355c64
sockets: 2
startup: order=600,up=60
vmgenid: 2265db06-3391-41f0-8367-d585199d6ef5
Windows Server 2019 Standard (1809)
agent: 1
balloon: 4096
bootdisk: scsi0
cores: 2
cpu: Westmere
ide2: none,media=cdrom
memory: 6144
name: AC-D2
net0: virtio=EA:C7:ED:F1:B4:14,bridge=vmbr0,firewall=1,tag=1300
numa: 1
onboot: 1
ostype: win10
protection: 1
sata0: none,media=cdrom
scsi0: VM_HDD:vm-119-disk-0,discard=on,size=64G
scsihw: virtio-scsi-pci
smbios1: uuid=417f381c-ae1a-4593-8f24-5a7ff613e40c
sockets: 2
startup: order=200,up=60
vmgenid: f3a1bc1a-a35b-43bc-8a81-ef623efc966d
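(For anyone comparing notes: the configs above are the kind of dump you get from qm config, and the package listing that follows is presumably the output of pveversion -v. The VM ID below is just one of the examples above.)
Code:
# dump the configuration of a single VM, e.g. the Windows Server 2019 guest above
qm config 115
# list the installed Proxmox VE package versions, as shown below
pveversion -v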
proxmox-ve: 7.1-1 (running kernel: 5.13.19-3-pve)
pve-manager: 7.1-10 (running version: 7.1-10/6ddebafe)
pve-kernel-helper: 7.1-8
pve-kernel-5.13: 7.1-6
pve-kernel-5.4: 6.4-12
pve-kernel-5.3: 6.1-6
pve-kernel-5.0: 6.0-11
pve-kernel-5.13.19-3-pve: 5.13.19-7
pve-kernel-5.4.162-1-pve: 5.4.162-2
pve-kernel-5.4.143-1-pve: 5.4.143-1
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph: 15.2.15-pve1
ceph-fuse: 15.2.15-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-6
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-2
libpve-guest-common-perl: 4.0-3
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.0-15
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-1
openvswitch-switch: 2.15.0+ds1-2
proxmox-backup-client: 2.1.4-1
proxmox-backup-file-restore: 2.1.4-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-5
pve-cluster: 7.1-3
pve-container: 4.1-3
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-4
pve-ha-manager: 3.3-3
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.0-3
pve-xtermjs: 4.16.0-1
qemu-server: 7.1-4
smartmontools: 7.2-pve2
spiceterm: 3.2-2
swtpm: 0.7.0~rc1+2
vncterm: 1.7-1
zfsutils-linux: 2.1.2-pve1
Same story on many Windows VMs in our cluster (Windows Server 2012/2016/2019). NFS storage and SCSI disks.
Hi Tuxis,
@weehooey Great work on summarizing all the info. Since there was no official bug report yet, I created one here:
https://bugzilla.proxmox.com/show_bug.cgi?id=3933
We are seeing the same behaviour since upgrading to 7.x. At first, we thought it was Windows Updates, since those always break something and you usually only reboot a Windows Server when you are applying updates.
A sample of the machines that have hung can be found in the VM configs posted above.
Hey @Emilien
Sure, we can try that. One of the VMs that froze has been changed to cpu: host.
Some of these VMs have been running on the same hardware for over two years and only switched to cpu: Westmere
about a year ago. This same cluster was only updated to 7.x this year when the hanging started.
Additionally, some people are reporting the issue on Windows Server 2012; I have not seen any reports of it on Windows Server 2008.
It appears that some interaction between PVE 7.x and Windows happens once the VM has been running for a while.
At this point, the issue seems to arise somewhere between 5 days and ~22 days of uptime.
cpu: Westmere
I don't think the latest versions of Windows like this old CPU much...
Can you give it a try with cpu: host?
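For anyone who wants to try this suggestion, a minimal sketch (assuming VM ID 115 from the configs above; the new CPU model is only picked up when QEMU is relaunched, not on a reboot from inside the guest):
Code:
# switch the virtual CPU type from the emulated Westmere to host passthrough
qm set 115 --cpu host
# stop and start the VM so QEMU is relaunched with the new CPU model
qm shutdown 115
qm start 115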
For me (in different clusters) the problem occurs on hosts with different physical CPUs (always Intel) and with different virtual CPUs in the VMs. All VMs in my cluster have: cpu: host
Xeon(R) CPU E5-26xx (v2)