One of 20 KVM VMs does not start reliably

roadrunner_rad

Aug 6, 2019
Hi support team,

We are running several KVM VMs on a two-node cluster. One of them does not start reliably.
Starting it from the command line (in a script) with "qm start" often ends with:


TASK ERROR: start failed: command '/usr/bin/kvm -id 200 -name xen5-bkup -chardev 'socket,id=qmp,path=/var/run/qemu-server/200.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/200.pid -daemonize -smbios 'type=1,uuid=270034e8-e942-47ca-a03b-2ff765dae193' -smp '8,sockets=2,cores=4,maxcpus=8' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vnc unix:/var/run/qemu-server/200.vnc,password -cpu kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce -m 81920 -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'vmgenid,guid=9e96020f-b4c7-4180-8258-61131d28557d' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'vfio-pci,host=0000:85:00.0,id=hostpci1,bus=pci.0,addr=0x11' -device 'VGA,id=vga,bus=pci.0,addr=0x2' -chardev 'socket,path=/var/run/qemu-server/200.qga,server,nowait,id=qga0' -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:d78e46f69bf' -drive 'if=none,id=drive-ide1,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.0,unit=1,drive=drive-ide1,id=ide1,bootindex=200' -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 'file=/mnt/vmstore/images/200/vm-200-disk-0.qcow2,if=none,id=drive-scsi0,discard=on,format=qcow2,cache=none,aio=native,detect-zeroes=unmap' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap200i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=EE:C8:CB:05:02:72,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' -machine 'type=pc+pve1'' 
failed: got timeout

Kernel version

Linux 5.0.21-5-pve #1 SMP PVE 5.0.21-10 (Wed, 13 Nov 2019 08:27:10 +0100)

PVE Manager Version

pve-manager/6.0-15/52b91481



Starting it from the web UI works most of the time.

The difference between this VM and the others is that it has much more RAM assigned (80 GB) and has one PCI card (Emulex) passed through.

Any idea how to proceed and find out more about this issue?

Regards
Joachim
 
More than 300 GB of memory is free (the machine has 512 GB). I will give the setting a try and report back.
Thanks for your hint.
 
OK, I tested a bit. I need to set:

qm set 200 --hugepages 2
qm set 200 --numa 1

Then the VM starts every time (20 attempts, all successful).
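
For anyone who wants to repeat this kind of start test, here is a minimal sketch of such a loop. It is not the exact script used above. The function name start_test and the QM override variable are made up for illustration, and it uses "qm stop" (a hard stop) between attempts, so it should only be run against a VM where that is acceptable.

```shell
#!/bin/sh
# Sketch: start and stop a VM repeatedly and count failed starts.
# QM is a hypothetical override for dry runs; by default the real qm CLI is used.
QM="${QM:-qm}"

start_test() {
    vmid="$1"       # VM ID, e.g. 200
    attempts="$2"   # number of start attempts
    failed=0
    i=1
    while [ "$i" -le "$attempts" ]; do
        if $QM start "$vmid" >/dev/null 2>&1; then
            # Start succeeded: hard-stop the VM again before the next attempt.
            $QM stop "$vmid" >/dev/null 2>&1
        else
            # Start failed (for example with "got timeout").
            failed=$((failed + 1))
        fi
        i=$((i + 1))
    done
    # Print the number of failed starts.
    echo "$failed"
}
```

Calling "start_test 200 20" prints how many of the 20 start attempts failed. Replacing "qm stop" with "qm shutdown" would be gentler on the guest, at the cost of a slower test.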

So can you please explain a bit more about the root cause, and what "hugepages 2" means?

Thanks
Joachim
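
As far as the VM config documentation goes, the hugepages option takes a page size in MB (2 or 1024, or "any"), so "hugepages 2" should mean the guest RAM is backed by 2 MB huge pages. This does not explain the root cause of the timeout. The sketch below only illustrates the arithmetic for this VM (-m 81920). The function name pages_needed is made up for illustration.

```shell
#!/bin/sh
# Sketch: number of huge pages needed to back a given amount of guest RAM.
# pages_needed is a hypothetical helper, not part of any Proxmox tool.
pages_needed() {
    mem_mb="$1"    # guest memory in MB (the value passed to -m)
    page_mb="$2"   # huge page size in MB (2 or 1024)
    # Round up so a partial page still counts as one page.
    echo $(( (mem_mb + page_mb - 1) / page_mb ))
}
```

For this VM, "pages_needed 81920 2" gives 40960 pages of 2 MB each. On the host, "grep Huge /proc/meminfo" shows how many huge pages are currently configured and free.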
 
