qm start fails with timeout error, but vm is actually started

blargling

After taking the latest updates, whenever I start a VM, either via the CLI or the GUI, the task times out. However, the VM actually starts. This would be just a minor annoyance, but it appears to be preventing me from live-migrating my VMs, because the qm start on the target node is not reported as successful. Any ideas?

Example of a simple start that succeeds despite reporting a timeout:

root@proxmox01:~# qm start 111
start failed: command '/usr/bin/kvm -id 111 -chardev 'socket,id=qmp,path=/var/run/qemu-server/111.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -vnc unix:/var/run/qemu-server/111.vnc,x509,password -pidfile /var/run/qemu-server/111.pid -daemonize -name zm01 -smp 'sockets=1,cores=2' -nodefaults -boot 'menu=on' -vga cirrus -cpu kvm64,+lahf_lm,+x2apic,+sep -k en-us -m 2048 -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -drive 'file=/mnt/pve/vmstore01/template/iso/ubuntu-12.04.4-server-amd64.iso,if=none,id=drive-ide2,media=cdrom,aio=native' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -drive 'file=gluster://heartbeat/vmstore01/images/111/vm-111-disk-1.qcow2,if=none,id=drive-virtio0,format=qcow2,cache=none,aio=native' -device 'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap111i0,script=/var/lib/qemu-server/pve-bridge,vhost=on' -device 'virtio-net-pci,mac=C2:B1:8D:B6:42:5E,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300'' failed: got timeout
root@proxmox01:~# ps -ef | grep zm01
root 7664 1 4 01:06 ? 00:00:14 /usr/bin/kvm -id 111 -chardev socket,id=qmp,path=/var/run/qemu-server/111.qmp,server,nowait -mon chardev=qmp,mode=control -vnc unix:/var/run/qemu-server/111.vnc,x509,password -pidfile /var/run/qemu-server/111.pid -daemonize -name zm01 -smp sockets=1,cores=2 -nodefaults -boot menu=on -vga cirrus -cpu kvm64,+lahf_lm,+x2apic,+sep -k en-us -m 2048 -device piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=tablet,bus=uhci.0,port=1 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -drive file=/mnt/pve/vmstore01/template/iso/ubuntu-12.04.4-server-amd64.iso,if=none,id=drive-ide2,media=cdrom,aio=native -device ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200 -drive file=gluster://heartbeat/vmstore01/images/111/vm-111-disk-1.qcow2,if=none,id=drive-virtio0,format=qcow2,cache=none,aio=native -device virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=100 -netdev type=tap,id=net0,ifname=tap111i0,script=/var/lib/qemu-server/pve-bridge,vhost=on -device virtio-net-pci,mac=C2:B1:8D:B6:42:5E,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300
 
Here's the additional info (sorry, it's been a while since I've visited the forums; I forgot this is always needed). Good call on testing a non-GlusterFS VM: a VM with storage on a local directory does NOT have the issue. I suspect the problem is GlusterFS 3.5 versus 3.4 (I recently upgraded the nodes to 3.5).


Output from pveversion:

proxmox-ve-2.6.32: 3.2-126 (running kernel: 2.6.32-29-pve)
pve-manager: 3.2-4 (running version: 3.2-4/e24a91c1)
pve-kernel-2.6.32-27-pve: 2.6.32-121
pve-kernel-2.6.32-24-pve: 2.6.32-111
pve-kernel-2.6.32-28-pve: 2.6.32-124
pve-kernel-2.6.32-25-pve: 2.6.32-113
pve-kernel-2.6.32-29-pve: 2.6.32-126
pve-kernel-2.6.32-26-pve: 2.6.32-114
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.5-1
pve-cluster: 3.0-12
qemu-server: 3.1-16
pve-firmware: 1.1-3
libpve-common-perl: 3.0-18
libpve-access-control: 3.0-11
libpve-storage-perl: 3.0-19
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-6
vzctl: 4.0-1pve5
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.7-8
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.0-1

Here's the VM config:

root@proxmox01:~# qm config 111
balloon: 512
boot: cdn
bootdisk: virtio0
cores: 2
ide2: vmstore01:iso/ubuntu-12.04.4-server-amd64.iso,media=cdrom,size=679M
memory: 2048
name: zm01
net0: virtio=C2:B1:8D:B6:42:5E,bridge=vmbr0
ostype: l26
sockets: 1
virtio0: vmstore01:111/vm-111-disk-1.qcow2,format=qcow2,cache=none,size=12G
 
I've been pulling my hair out over the same issue, and I'm also running GlusterFS 3.5. I just started playing with Proxmox this week, so GlusterFS 3.5 was the natural choice. Was a workaround ever introduced for this?
 
I've actually switched to testing Ceph instead of GlusterFS. I hit other issues with GlusterFS even after rolling back to 3.4, such as I/O errors being reported to a VM when one of the Gluster nodes was rebooted even though quorum still existed, resulting in the VM remounting its root filesystem read-only and essentially taking the VM down.
 
The cause of this problem is a bug in gfapi (the Gluster API).
The VM takes longer to start than the 30-second timeout Proxmox allows while waiting for the start action to complete.
A workaround for this bug is to reduce the value of /proc/sys/net/ipv4/tcp_syn_retries from 5 to 3 (on all cluster nodes), so that the maximum time spent trying to reach an unresponsive cluster node stays below 30 seconds.
With a value of 5, it can take about a minute before the connection attempt is abandoned; with a value of 3, it takes at most roughly 20 seconds.
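Those numbers follow from TCP's exponential SYN backoff: the initial retransmission timeout is 1 second on Linux and doubles after each unanswered SYN, so the total wait before a connect attempt is abandoned is roughly the sum of those intervals. A quick sketch of the arithmetic (the 1-second initial RTO is the Linux default, not anything Proxmox-specific):

```shell
#!/bin/sh
# Approximate total seconds a connect() waits before giving up, for a given
# tcp_syn_retries value: the initial SYN waits 1 s, and each retry doubles
# the previous wait.
syn_timeout() {
  retries=$1
  total=0
  rto=1
  i=0
  while [ "$i" -le "$retries" ]; do
    total=$((total + rto))
    rto=$((rto * 2))
    i=$((i + 1))
  done
  echo "$total"
}

syn_timeout 5   # 63 -> about a minute with the default of 5 retries
syn_timeout 3   # 15 -> comfortably under Proxmox's 30 s start timeout
```

To apply the workaround at runtime, run sysctl -w net.ipv4.tcp_syn_retries=3 on each node; add net.ipv4.tcp_syn_retries = 3 to /etc/sysctl.conf to make it persist across reboots.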

A bug report has been filed in the Red Hat Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1054694 please support it!
 
