[SOLVED] Starting qm fails with "got timeout"

cmonty14

Hi,
I have defined a storage of type "RBD"; the underlying pool contains only SSD drives.
All RBD images are available:
Code:
root@ld3955:~# rbd ls -l ssd
NAME             SIZE PARENT FMT PROT LOCK
vm-100-disk-0   1 MiB          2
vm-100-disk-1  20 GiB          2      excl
vm-101-disk-0  20 GiB          2      excl
vm-101-disk-1   1 MiB          2
vm-112-disk-0   4 MiB          2      excl
vm-112-disk-1  20 GiB          2      excl
vm-113-disk-0   4 MiB          2      excl
vm-113-disk-1  40 GiB          2      excl
vm-114-disk-0   4 MiB          2      excl
vm-114-disk-1  25 GiB          2      excl
vm-115-disk-0   4 MiB          2
vm-115-disk-1  25 GiB          2      excl
vm-116-disk-0   4 MiB          2
vm-116-disk-1  25 GiB          2      excl
vm-117-disk-0   4 MiB          2      excl
vm-117-disk-1 201 GiB          2      excl
vm-118-disk-0   4 MiB          2      excl
vm-118-disk-1  35 GiB          2      excl
vm-118-disk-2  10 GiB          2
vm-118-disk-3 150 GiB          2      excl
vm-118-disk-4  10 GiB          2      excl
vm-119-disk-0   4 MiB          2      excl
vm-119-disk-1 100 GiB          2      excl
vm-120-disk-0  50 GiB          2      excl
vm-120-disk-1   4 MiB          2      excl
vm-122-disk-0   4 MiB          2      excl
vm-122-disk-1  50 GiB          2      excl
vm-123-disk-0   4 MiB          2      excl
vm-123-disk-1  25 GiB          2
vm-123-disk-2  25 GiB          2      excl
vm-124-disk-0  40 GiB          2      excl
vm-124-disk-1   4 MiB          2
vm-126-disk-0  20 GiB          2      excl
vm-127-disk-0  20 GiB          2      excl
vm-127-disk-1 400 GiB          2
vm-127-disk-2 700 GiB          2
vm-127-disk-3   4 MiB          2
vm-131-disk-0   4 MiB          2      excl
vm-131-disk-1  50 GiB          2      excl
vm-191-disk-0  60 GiB          2      excl
vm-191-disk-1   1 MiB          2
vm-192-disk-0  60 GiB          2      excl
vm-192-disk-1   1 MiB          2      excl
vm-193-disk-0   1 MiB          2      excl
vm-193-disk-1  60 GiB          2      excl
vm-194-disk-0   1 MiB          2      excl
vm-194-disk-1  60 GiB          2      excl
vm-195-disk-0   1 MiB          2      excl
vm-195-disk-1  60 GiB          2      excl
vm-200-disk-0   4 GiB          2
vm-204-disk-0   1 GiB          2
vm-204-disk-1   1 GiB          2
vm-205-disk-0   4 GiB          2
vm-206-disk-0   1 GiB          2
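
To see at a glance which images in a listing like the one above carry an exclusive lock, the output can be filtered on the last column. This is only a sketch: the helper name locked_images is made up for illustration, and it assumes the LOCK column is the last field and reads "excl", as in the listing above.

```shell
# Hypothetical helper: reads "rbd ls -l <pool>" output on stdin and
# prints the names of all images whose last column (LOCK) is "excl".
locked_images() {
    awk '$NF == "excl" { print $1 }'
}

# Example use on the node (pool name "ssd" taken from the post):
#   rbd ls -l ssd | locked_images
```

An image of a stopped VM that still shows up here is a candidate for a lock that was not released cleanly.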

When I try to start VMs 112, 123 and 126, the following error is displayed:
Code:
root@ld3955:~# qm start 126
start failed: command '/usr/bin/kvm -id 126 -name vm126-lvewiki -chardev 'socket,id=qmp,path=/var/run/qemu-server/126.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/126.pid -daemonize -smbios 'type=1,uuid=15c0ce84-1bb3-43e7-8a6e-0d843cc8e107' -drive 'if=pflash,unit=0,format=raw,readonly,file=/usr/share/pve-edk2-firmware//OVMF_CODE.fd' -drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,file=rbd:hdd/vm-126-disk-0:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/images.keyring' -smp '4,sockets=1,cores=4,maxcpus=4' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vnc unix:/var/run/qemu-server/126.vnc,password -cpu host,+kvm_pv_unhalt,+kvm_pv_eoi -m 2048 -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'VGA,id=vga,bus=pci.0,addr=0x2' -chardev 'socket,path=/var/run/qemu-server/126.qga,server,nowait,id=qga0' -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:80c575d38d2d' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 'file=rbd:ssd/vm-126-disk-0:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/fast_images.keyring,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap126i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=6E:68:F4:C5:3C:21,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' -netdev 'type=tap,id=net1,ifname=tap126i1,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=F2:69:73:87:43:F4,netdev=net1,bus=pci.0,addr=0x13,id=net1,bootindex=301' -machine 'type=pc'' failed: got timeout

Many other VMs can be started without issues.

How can I analyse the root cause of this failure?
I assume the issue is related to the virtual disks of the affected VMs, but I don't know how to check this.

Please advise.

Thanks
 
Thanks.
I removed the locks with qm unlock <vmid>, and after that the affected VMs start without problems.
 
qm unlock only removes a pending lock on the VM configuration, not locks on the storage side. My guess is that an RBD lock was not properly released (killed client, network problem), so you had to wait a bit until RBD noticed that the client holding the lock had vanished and released the lock again.
Just an (educated) guess though.
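
To check the storage side directly, one could look at the lock holders and watchers of each disk image of an affected VM. The following is a rough sketch, not an official procedure: the function name show_rbd_locks is made up, the pool name and VM ID are passed in as arguments, and it assumes the usual vm-<vmid>-disk-<n> image naming seen in the listing above. Note that the start command in the first post references images in both the hdd and the ssd pool for VM 126, so both may be worth checking.

```shell
# Hypothetical helper: for every disk image of a VM in the given pool,
# print the current lock holders and the clients watching the image.
# Usage: show_rbd_locks <pool> <vmid>
show_rbd_locks() {
    pool=$1
    vmid=$2
    # "rbd ls" prints one image name per line; keep only this VM's disks.
    rbd ls "$pool" | grep "^vm-${vmid}-disk-" | while read -r image; do
        echo "== ${pool}/${image} =="
        # Lock holders (locker, lock ID, client address).
        rbd lock ls "${pool}/${image}" </dev/null
        # Watchers: clients that currently have the image open.
        rbd status "${pool}/${image}" </dev/null
    done
}

# Example use on the node:
#   show_rbd_locks ssd 126
#   qm config 126 | grep '^lock:'   # config lock, the one qm unlock clears
```

If a lock is listed for an image of a stopped VM and the locker's address belongs to a client that no longer exists, that would be consistent with the guess above.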