PVE 6 VM boots single thread

Hey all,
Honestly, I'm burned out and haven't done much investigation into this yet. I have a couple-month-old PVE 6 installation at a customer location. Tonight I did updates: Windows updates on a single Server 2019 VM followed by a guest OS shutdown, then PVE updates via "apt-get update; apt-get dist-upgrade" and a host reboot.

After the host came back, I booted the VM. Fifteen minutes later (I'm running ZFS RAID10 with ZIL/L2ARC on a 4C/8T Xeon with 32GB DDR4 and a 16GB ARC), I noticed there was still no Windows logo; the console was still showing the Proxmox logo with the UEFI boot text at the top. top from a shell showed 100% CPU on the kvm process, so it was running single-threaded. After half an hour I stopped the VM and started it again. Seeing the same 100% CPU on kvm, I was about ready to cry myself to sleep, having sold Proxmox to my employer as the single best virtualization option no matter the situation (single host, multi-host HA on shared storage, etc.). It eventually booted and started running multi-threaded.

My actual question: normally I can boot this VM on this host in under 30 seconds. Is there a bug of some sort causing the UEFI BIOS to run single-threaded? Or does it always run single-threaded, and something was simply much heavier about this particular boot? Once it got past the Proxmox logo the VM was extremely fast and responsive, using more than 100% CPU, which indicates a multi-threaded process.
Thanks for any feedback/opinions! For now I'm not going to bother with forensics and will let it ride. The VM is running beautifully and outperforming every VMware environment I've ever seen.
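For anyone else who hits this, a quick way to confirm whether the spike really is a single QEMU thread is the per-thread view in top; VMID 100 below is just an example:
Code:
# PVE stores the QEMU PID in /var/run/qemu-server/<VMID>.pid.
# -H lists individual threads: one thread pinned near 100% while the
# rest sit idle confirms a single-threaded phase such as OVMF boot.
top -H -p "$(cat /var/run/qemu-server/100.pid)"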
 
Hi,

It would be helpful to know the exact software versions you are using. Could you please post the output of the following commands, wrapped in [code]your output here[/code] tags?
Code:
pveversion -v
qm config VMID
 
pveversion -v
Code:
proxmox-ve: 6.1-2 (running kernel: 5.3.18-2-pve)
pve-manager: 6.1-7 (running version: 6.1-7/13e58d5e)
pve-kernel-5.3: 6.1-5
pve-kernel-helper: 6.1-5
pve-kernel-5.0: 6.0-11
pve-kernel-5.3.18-2-pve: 5.3.18-2
pve-kernel-5.3.13-2-pve: 5.3.13-2
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.3-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.14-pve1
libpve-access-control: 6.0-6
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.0-12
libpve-guest-common-perl: 3.0-3
libpve-http-server-perl: 3.0-4
libpve-storage-perl: 6.1-4
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 3.2.1-1
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-3
pve-cluster: 6.1-4
pve-container: 3.0-19
pve-docs: 6.1-6
pve-edk2-firmware: 2.20191127-1
pve-firewall: 4.0-10
pve-firmware: 3.0-5
pve-ha-manager: 3.0-8
pve-i18n: 2.0-4
pve-qemu-kvm: 4.1.1-3
pve-xtermjs: 4.3.0-1
qemu-server: 6.1-6
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.3-pve1

qm config 100
Code:
agent: 1
balloon: 5120
bios: ovmf
bootdisk: scsi0
cores: 6
cpu: host
efidisk0: local-zfs:vm-100-disk-1,size=128K
ide0: none,media=cdrom
machine: q35
memory: 8192
name: edc1
net0: virtio=0A:5E:D0:77:1E:A0,bridge=vmbr0
numa: 0
ostype: win10
parent: preupdates
scsi0: local-zfs:vm-100-disk-0,discard=on,iothread=1,size=250G,ssd=1
scsi1: local-zfs:vm-100-disk-2,discard=on,iothread=1,size=2T,ssd=1
scsi2: local-zfs:vm-100-disk-3,discard=on,iothread=1,size=100G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=941dffd8-1471-425a-a861-bb2b2b179484
sockets: 1
vga: virtio
vmgenid: 94b94999-8feb-4514-880f-9ef9d1283019
 
Your settings look OK. As far as I know, UEFI is single-threaded. Normally, CPU utilization should not be high during this phase, so being single-threaded should not be a problem by itself. So far I have been unable to reproduce your unusually high CPU utilization while UEFI was active. Please give us an update if this happens again.
 
Hello,
We are experiencing the same extremely slow boot on this server again. The CPU is pinned at 100% for the kvm process, using one full core. The boot has been running for ~15 minutes so far; I'm looking through all the logs I can and not finding anything useful.
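For anyone in the same spot, these are the kinds of host-side checks worth running while the VM is stuck. VMID 100 is just an example, and pidstat comes from the sysstat package:
Code:
journalctl -b --since "30 min ago"   # recent host messages around the boot
qm status 100 --verbose              # PVE's view of the running VM
# Per-thread CPU, sampled every second; shows which QEMU thread is spinning.
pidstat -t -p "$(cat /var/run/qemu-server/100.pid)" 1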
Thanks
 
Do you have a backup of your VM from before the Windows updates? It may well be that one of them is the cause of the long boot times. If this turns out to be true, the next step would be to find out which one exactly.
 
Do you have a backup of your VM from before the Windows updates? It may well be that one of them is the cause of the long boot times. If this turns out to be true, the next step would be to find out which one exactly.
I have a snapshot of the VM; unfortunately we are running ZFS and cannot create a clone from a snapshot. I found another forum post that had a workaround: create a ZFS clone of the snapshot, create a VM (without installing an OS), and dd if=*clone* of=/dev/zvol/rpool/data/*vm destination disk*. It looked like it was going to work, but unfortunately I just haven't had the time to dig into this. Coronavirus has me doing more VPN/RDP setups than I've ever wanted to do. I sincerely hope to find a cause/resolution for this issue, as Proxmox VE is a product I recommend over all others for environments of any capacity. Hopefully I'll have some info soon!
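Spelled out, that workaround would look roughly like this. It's only a hedged sketch: the snapshot name "preupdates" comes from the config above, and vm-101-disk-0 stands in for the hypothetical destination disk of a freshly created, empty VM of at least the same size:
Code:
# Clone the snapshot to a temporary zvol, then block-copy it onto the
# empty disk of the new VM.
zfs clone rpool/data/vm-100-disk-0@preupdates rpool/data/restore-clone
dd if=/dev/zvol/rpool/data/restore-clone of=/dev/zvol/rpool/data/vm-101-disk-0 bs=1M status=progress
zfs destroy rpool/data/restore-clone   # the clone can go once the copy finishes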
 
I have a snapshot of the VM; unfortunately we are running ZFS and cannot create a clone from a snapshot.

Why not? I do that all the time.
I found another forum post that had a workaround: create a ZFS clone of the snapshot, create a VM (without installing an OS), and dd if=*clone* of=/dev/zvol/rpool/data/*vm destination disk*

Cloning is OK, but why dd? Just create a VM, take the name of the resulting disk image, delete that zvol manually, clone the source snapshot to the new disk's name, and fire it up.
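A minimal sketch of that approach, again assuming the "preupdates" snapshot from this thread and a hypothetical new VM 101 created with matching settings (OVMF, etc.) on the same pool:
Code:
zfs destroy rpool/data/vm-101-disk-0   # drop the empty zvol PVE created for the new VM
zfs clone rpool/data/vm-100-disk-0@preupdates rpool/data/vm-101-disk-0
qm start 101
# Note: the clone stays dependent on the source snapshot unless promoted (zfs promote).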
 
I'd forgotten to write back for some time. It's definitely a Windows problem: updates run at a decent speed, but DISM operations take hours. I've got an instance of Windows 10 on PVE 5.4 doing this, as well as the Server 2019 VM on PVE 6.1 referenced above. Just how things are now, I guess; Microsoft will never fix the problem. Thanks for the help though!
 
That's not ideal, but at least you know what's going on. Thank you for the update!
 
