VM takes too long to start

Inglebard

May 20, 2016
Hi,

We bought a brand new server.

The server has 500 GB of RAM, 2x Intel(R) Xeon(R) Gold 6226R CPUs, and an RTX A4500.

We have 2 VMs on it. VM1 works properly.

VM2 is a Windows Server 2022 guest with 480 GB of RAM, 28 cores, and PCI passthrough to get access to the RTX A4500.
The VM takes too long to start:
swtpm_setup: Not overwriting existing state file.
TASK ERROR: start failed: command '/usr/bin/kvm -id 101 -name 'WIN2022RDS,debug-threads=on' -no-shutdown -chardev 'socket,id=qmp,path=/var/run/qemu-server/101.qmp,server=on,wait=off' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/101.pid -daemonize -smbios 'type=1,uuid=4ea8162d-4282-4229-91c1-47d85be97eda' -drive 'if=pflash,unit=0,format=raw,readonly=on,file=/usr/share/pve-edk2-firmware//OVMF_CODE_4M.secboot.fd' -drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,size=540672,file=/dev/pve/vm-101-disk-0' -smp '28,sockets=1,cores=28,maxcpus=28' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vga none -nographic -no-hpet -cpu 'kvm64,enforce,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vendor_id=proxmox,hv_vpindex,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep' -m 491520 -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg -device 'vmgenid,guid=bea74088-785d-4ab8-95db-43f82036ef53' -device 'usb-tablet,id=tablet,bus=ehci.0,port=1' -device 'vfio-pci,host=0000:af:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,multifunction=on' -device 'vfio-pci,host=0000:af:00.1,id=hostpci0.1,bus=ich9-pcie-port-1,addr=0x0.1' -device 'usb-host,hostbus=1,hostport=12,id=usb0' -device 'usb-host,hostbus=1,hostport=10,id=usb1' -chardev 'socket,id=tpmchar,path=/var/run/qemu-server/101.swtpm' -tpmdev 'emulator,id=tpmdev,chardev=tpmchar' -device 'tpm-tis,tpmdev=tpmdev' -chardev 'socket,path=/var/run/qemu-server/101.qga,server=on,wait=off,id=qga0' -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:6ed62213e89a' -drive 
'file=/dev/pve/vm-101-disk-1,if=none,id=drive-ide0,format=raw,cache=none,aio=io_uring,detect-zeroes=on' -device 'ide-hd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,rotation_rate=1,bootindex=100' -drive 'if=none,id=drive-ide2,media=cdrom,aio=io_uring' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=101' -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 'file=/dev/pve/vm-101-disk-3,if=none,id=drive-scsi0,discard=on,format=raw,cache=none,aio=io_uring,detect-zeroes=unmap' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,rotation_rate=1' -netdev 'type=tap,id=net0,ifname=tap101i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=52:F5:7A:18:78:0C,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=102' -rtc 'driftfix=slew,base=localtime' -machine 'type=pc-q35-7.0+pve0' -global 'kvm-pit.lost_tick_policy=discard'' failed: got timeout

I think the issue is related to the amount of RAM combined with PCI passthrough.
When the VM does start, I have to wait about 7 minutes before the Proxmox BIOS screen appears.
It seems to fill all of the VM's RAM before booting. If I halve the amount of RAM, I have to wait about 3 minutes before the Proxmox BIOS screen appears.
When I remove the PCI passthrough, the boot is instant.

Is this normal behavior? How can I speed up the start and avoid the timeout?
 

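Yes, this is normal. With PCIe passthrough, the complete 480 GB of memory you gave that VM has to be allocated when the VM starts, because the passed-through device can access guest memory directly and the memory therefore cannot be handed out lazily.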
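The timings reported in the question fit this explanation. As a rough sketch (assuming the "7 min" and "3 min" figures are approximate wall-clock times, and that the full size is the 491520 MiB passed to QEMU via -m), the implied average allocation rate can be estimated like this:

```python
# Rough allocation throughput implied by the timings in the question.
# Assumptions: 491520 MiB is the -m value from the failing command line,
# and the ~7 min / ~3 min waits are approximate wall-clock measurements.

MIB_PER_GIB = 1024


def alloc_rate_gib_per_s(mem_mib: int, seconds: float) -> float:
    """Average rate at which guest memory became ready, in GiB/s."""
    return mem_mib / MIB_PER_GIB / seconds


full = alloc_rate_gib_per_s(491520, 7 * 60)       # VM with all of its RAM
half = alloc_rate_gib_per_s(491520 // 2, 3 * 60)  # same VM with RAM halved

print(f"full: {full:.2f} GiB/s, half: {half:.2f} GiB/s")
```

Both cases come out at roughly 1.1 to 1.3 GiB/s, so the wait scales more or less linearly with the amount of RAM, which is what you would expect if all of it is allocated up front.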

Add the following two settings, each on its own line, to /etc/pve/qemu-server/101.conf so that the VM uses 1 GB hugepages, and see if it helps:

hugepages: 1024
numa: 1
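To get a feel for why larger pages might help, here is a small sketch that counts how many pages the host has to set up for this VM at each of the standard x86-64 page sizes. The 491520 MiB figure is the -m value from the failing command line. Whether page count is really the bottleneck here is an assumption, not something the thread confirms.

```python
# Number of pages needed to back the VM's memory at each page size.
# 491520 MiB comes from the "-m 491520" argument in the error output;
# 4 KiB, 2 MiB and 1 GiB are the standard x86-64 page sizes.

VM_MEM_MIB = 491520
KIB_PER_MIB = 1024

PAGE_SIZES_KIB = {"4 KiB": 4, "2 MiB": 2 * 1024, "1 GiB": 1024 * 1024}


def page_count(mem_mib: int, page_kib: int) -> int:
    """Return how many pages of size page_kib cover mem_mib exactly."""
    total_kib = mem_mib * KIB_PER_MIB
    if total_kib % page_kib:
        raise ValueError("memory size is not a multiple of the page size")
    return total_kib // page_kib


pages = {name: page_count(VM_MEM_MIB, kib) for name, kib in PAGE_SIZES_KIB.items()}
for name, count in pages.items():
    print(f"{name:>5} pages: {count:,}")
```

With 1 GiB pages the host deals with 480 pages instead of roughly 126 million 4 KiB pages, which is the kind of reduction the hugepages setting is aiming for.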

PS: For such a resource-heavy VM, you might also consider setting the CPU type to host and adjusting the virtual processor topology (e.g. the number of sockets) to better match the physical one, if you want the best performance.
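As a rough illustration, the corresponding lines in /etc/pve/qemu-server/101.conf might look like the fragment below. The split into 2 sockets of 14 cores each is only an assumption: it keeps the 28 vCPUs from the original post and mirrors the two physical CPUs, but it is not a value given in this thread.

```
cpu: host
sockets: 2
cores: 14
```

In a Proxmox VM config, cores is counted per socket, so this still gives the guest 28 vCPUs in total.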