After Proxmox 1.9 upgrade - Timeout starting VM after scheduled offline backup

redly

Hi,

after upgrading to Proxmox 1.9, I get the following error during a scheduled backup:

Oct 20 03:54:20 prox01 qm[258091]:
VM 102 start failed: command
'/usr/bin/kvm -monitor unix:/var/run/qemu-server/102.mon,server,nowait
-vnc unix:/var/run/qemu-server/102.vnc,password
-pidfile /var/run/qemu-server/102.pid
-daemonize
-usbdevice tablet
-name TSRV1
-smp sockets=2,cores=2
-nodefaults
-boot menu=on,order=ca
-vga cirrus
-tdf
-localtime
-rtc-td-hack
-k de
-drive if=ide,index=2,media=cdrom
-drive file=/var/lib/vz/images/102/vm-102-disk-1.raw,if=virtio,index=0,cache=none,boot=on
-m 4096
-netdev type=tap,id=vlan0d0,ifname=tap102i0d0,script=/var/lib/qemu-server/bridge-vlan,vhost=on
-device virtio-net-pci,romfile=,mac=FA:00:D6:0A:1B:A0,netdev=vlan0d0'
failed: timeout at /usr/share/perl5/PVE/Storage.pm line 350.

The offline backup itself completes correctly, but when Proxmox then tries to start the VM again, the above error occurs.
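For what it's worth, the VM starts fine when I do it by hand afterwards, so the image itself seems intact. A minimal way to reproduce and watch a start attempt (qm is the stock PVE command-line tool, VMID 102 as above):

# start the VM manually and check its state
qm start 102
qm status 102
# follow syslog during the start attempt
tail -f /var/log/syslog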

Does anyone have any ideas?
 
Yes, there are two RAID sets in the server:
1. 2x 300 GB RAID1 (Proxmox system, LVM VG pve): VM 101 (70 GB) and VM 102 (20 GB)
2. 4x 600 GB RAID10 (LVM VG RAID1): VM 103 (500 GB) and VM 104 (200 GB)

The backup is written to an attached 250 GB USB disk.
At the moment only VM 101 and VM 102 are backed up.
The backup (and subsequent start) of VM 101 is done first
and works without problems. Both VMs run Windows Server 2003 (W2k3) and
have virtio drivers installed.
The Proxmox upgrade was from version 1.6 to 1.9, and the second
RAID set was installed at the same time. The server is an HP DL380 G5, and
all HDDs are SAS.
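For reference, the schedule itself is the stock vzdump cron job. On PVE 1.x it lives in /etc/cron.d/vzdump; if I remember the 1.x options right, the entry for this job looks something like the following (times, dump directory and mail target are illustrative guesses, not copied from the machine):

# /etc/cron.d/vzdump: nightly stop-mode backup of VMs 101 and 102 (illustrative)
50 2 * * * root vzdump --quiet --stop --dumpdir /backup --mailto root 101 102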

Some log file info that may be related:

prox01:/var/log# cat ./vzdump/qemu-102.log
Oct 21 02:55:54 INFO: Starting Backup of VM 102 (qemu)
Oct 21 02:55:55 INFO: running
Oct 21 02:55:55 INFO: status = running
Oct 21 02:55:55 INFO: backup mode: stop
Oct 21 02:55:55 INFO: ionice priority: 7
Oct 21 02:55:55 INFO: stopping vm
Oct 21 03:05:50 INFO: creating archive '/backup/vzdump-qemu-102-2011_10_21-02_55_54.tar'
Oct 21 03:05:50 INFO: adding '/backup/vzdump-qemu-102-2011_10_21-02_55_54.tmp/qemu-server.conf' to archive ('qemu-server.conf')
Oct 21 03:05:50 INFO: adding '/var/lib/vz/images/102/vm-102-disk-1.raw' to archive ('vm-disk-virtio0.raw')
Oct 21 03:34:25 INFO: Total bytes written: 19664761344 (10.93 MiB/s)
Oct 21 03:34:25 INFO: archive file size: 18.31GB
Oct 21 03:34:25 INFO: delete old backup '/backup/vzdump-qemu-102-2011_10_20-03_19_07.tar'
Oct 21 03:34:27 INFO: restarting vm
Oct 21 03:37:11 INFO: Finished Backup of VM 102 (00:41:17)
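As a plausibility check, the logged throughput matches the duration of the archive phase (03:05:50 to 03:34:25, i.e. 1715 seconds):

19664761344 bytes / 1048576 = ~18754 MiB
18754 MiB / 1715 s = ~10.93 MiB/s

So the 10.93 MiB/s in the log is simply the sustained write rate to the USB disk.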




prox01:/var/log# cat ./messages
Oct 21 02:55:54 prox01 kernel: device tap101i0d0 entered promiscuous mode
Oct 21 02:55:54 prox01 kernel: vmbr0: port 4(tap101i0d0) entering forwarding state
Oct 21 03:00:20 prox01 kernel: kvm D ffff8805288c8640 0 438727 1 0x00000000
Oct 21 03:00:20 prox01 kernel: ffff8804b97ede58 0000000000000086 0000000000000000 ffff8803d9021000
Oct 21 03:00:20 prox01 kernel: ffff8804b97eddd8 000000000000f788 ffff8804b97edfd8 ffff8805288c8640
Oct 21 03:00:20 prox01 kernel: ffff88052ca8ee20 ffff8805288c8c08 ffff88052f808880 ffff8805288c8c08
Oct 21 03:00:20 prox01 kernel: Call Trace:
Oct 21 03:00:20 prox01 kernel: [<ffffffff81186939>] ? do_sync_read+0xf9/0x140
Oct 21 03:00:20 prox01 kernel: [<ffffffff814f4e05>] rwsem_down_failed_common+0x95/0x1d0
Oct 21 03:00:20 prox01 kernel: [<ffffffff814f4f63>] rwsem_down_write_failed+0x23/0x30
Oct 21 03:00:20 prox01 kernel: [<ffffffff812632e3>] call_rwsem_down_write_failed+0x13/0x20
Oct 21 03:00:20 prox01 kernel: [<ffffffff814f4472>] ? down_write+0x32/0x40
Oct 21 03:00:20 prox01 kernel: [<ffffffff811546d7>] sys_munmap+0x47/0x80
Oct 21 03:00:20 prox01 kernel: [<ffffffff8100b302>] system_call_fastpath+0x16/0x1b
Oct 21 03:00:20 prox01 kernel: kvm D ffff880529e4b0e0 0 25380 1 0x00000000
Oct 21 03:00:20 prox01 kernel: ffff8801d4599da0 0000000000000086 0000000000000001 0000000044a299e0
Oct 21 03:00:20 prox01 kernel: 0000000000000001 000000000000f788 ffff8801d4599fd8 ffff880529e4b0e0
Oct 21 03:00:20 prox01 kernel: ffff880529311420 ffff880529e4b6a8 0000000000986000 ffff880529e4b6a8
Oct 21 03:00:20 prox01 kernel: Call Trace:
Oct 21 03:00:20 prox01 kernel: [<ffffffff810aea12>] ? futex_wake+0x122/0x140
Oct 21 03:00:20 prox01 kernel: [<ffffffff814f4e05>] rwsem_down_failed_common+0x95/0x1d0
Oct 21 03:00:20 prox01 kernel: [<ffffffff814f4f96>] rwsem_down_read_failed+0x26/0x30
Oct 21 03:00:20 prox01 kernel: [<ffffffff812632b4>] call_rwsem_down_read_failed+0x14/0x30
Oct 21 03:00:20 prox01 kernel: [<ffffffff814f44a4>] ? down_read+0x24/0x30
Oct 21 03:00:20 prox01 kernel: [<ffffffff8106cf3d>] exit_mm+0x3d/0x160
Oct 21 03:00:20 prox01 kernel: [<ffffffff8106ed86>] do_exit+0x196/0x980
Oct 21 03:00:20 prox01 kernel: [<ffffffff81186ebf>] ? vfs_write+0x13f/0x1a0
Oct 21 03:00:20 prox01 kernel: [<ffffffff8106f667>] sys_exit+0x17/0x20
Oct 21 03:00:20 prox01 kernel: [<ffffffff8100b302>] system_call_fastpath+0x16/0x1b
Oct 21 03:00:20 prox01 kernel: kvm D ffff8805288134e0 0 25381 1 0x00000000
Oct 21 03:00:20 prox01 kernel: ffff88028b4dfe58 0000000000000086 ffff88028b4dfd78 ffffffff810aea12
Oct 21 03:00:20 prox01 kernel: 0000000000006325 000000000000f788 ffff88028b4dffd8 ffff8805288134e0
Oct 21 03:00:20 prox01 kernel: ffff8801006f8880 ffff880528813aa8 ffff88028b4dfd78 ffff880528813aa8
Oct 21 03:00:20 prox01 kernel: Call Trace:
Oct 21 03:00:20 prox01 kernel: [<ffffffff810aea12>] ? futex_wake+0x122/0x140
Oct 21 03:00:20 prox01 kernel: [<ffffffff810b1030>] ? do_futex+0xe0/0xbe0
Oct 21 03:00:20 prox01 kernel: [<ffffffff81083e4f>] ? send_signal+0x4f/0x90
Oct 21 03:00:20 prox01 kernel: [<ffffffff81083f00>] ? do_send_sig_info+0x70/0x90
Oct 21 03:00:20 prox01 kernel: [<ffffffff814f4e05>] rwsem_down_failed_common+0x95/0x1d0
Oct 21 03:00:20 prox01 kernel: [<ffffffff814f4f63>] rwsem_down_write_failed+0x23/0x30
Oct 21 03:00:20 prox01 kernel: [<ffffffff812632e3>] call_rwsem_down_write_failed+0x13/0x20
Oct 21 03:00:20 prox01 kernel: [<ffffffff814f4472>] ? down_write+0x32/0x40
Oct 21 03:00:20 prox01 kernel: [<ffffffff811546d7>] sys_munmap+0x47/0x80
Oct 21 03:00:20 prox01 kernel: [<ffffffff8100b302>] system_call_fastpath+0x16/0x1b
Oct 21 03:00:20 prox01 kernel: kvm D ffff880529311420 0 25384 1 0x00000000
Oct 21 03:00:20 prox01 kernel: ffff880106fe5da0 0000000000000086 0000000000000001 0000000047c2e9e0
Oct 21 03:00:20 prox01 kernel: 0000000000000001 000000000000f788 ffff880106fe5fd8 ffff880529311420
Oct 21 03:00:20 prox01 kernel: ffff8805288134e0 ffff8805293119e8 0000000000986000 ffff8805293119e8
Oct 21 03:00:20 prox01 kernel: Call Trace:
Oct 21 03:00:20 prox01 kernel: [<ffffffff810aea12>] ? futex_wake+0x122/0x140
Oct 21 03:00:20 prox01 kernel: [<ffffffff814f4e05>] rwsem_down_failed_common+0x95/0x1d0
Oct 21 03:00:20 prox01 kernel: [<ffffffff814f4f96>] rwsem_down_read_failed+0x26/0x30
Oct 21 03:00:20 prox01 kernel: [<ffffffff812632b4>] call_rwsem_down_read_failed+0x14/0x30
Oct 21 03:00:20 prox01 kernel: [<ffffffff814f44a4>] ? down_read+0x24/0x30
Oct 21 03:00:20 prox01 kernel: [<ffffffff8106cf3d>] exit_mm+0x3d/0x160
Oct 21 03:00:20 prox01 kernel: [<ffffffff8106ed86>] do_exit+0x196/0x980
Oct 21 03:00:20 prox01 kernel: [<ffffffff81186ebf>] ? vfs_write+0x13f/0x1a0
Oct 21 03:00:20 prox01 kernel: [<ffffffff8106f667>] sys_exit+0x17/0x20
Oct 21 03:00:20 prox01 kernel: [<ffffffff8100b302>] system_call_fastpath+0x16/0x1b
Oct 21 03:00:20 prox01 kernel: kvm D ffff8801a67b53e0 0 25385 1 0x00000000
Oct 21 03:00:20 prox01 kernel: ffff88012fb65da0 0000000000000086 0000000000000001 000000004862f9e0
Oct 21 03:00:20 prox01 kernel: 0000000000000001 000000000000f788 ffff88012fb65fd8 ffff8801a67b53e0
Oct 21 03:00:20 prox01 kernel: ffff8802f00500c0 ffff8801a67b59a8 0000000000986000 ffff8801a67b59a8
Oct 21 03:00:20 prox01 kernel: Call Trace:
Oct 21 03:00:20 prox01 kernel: [<ffffffff810aea12>] ? futex_wake+0x122/0x140
Oct 21 03:00:20 prox01 kernel: [<ffffffff814f4e05>] rwsem_down_failed_common+0x95/0x1d0
Oct 21 03:00:20 prox01 kernel: [<ffffffff814f4f96>] rwsem_down_read_failed+0x26/0x30
Oct 21 03:00:20 prox01 kernel: [<ffffffff812632b4>] call_rwsem_down_read_failed+0x14/0x30
Oct 21 03:00:20 prox01 kernel: [<ffffffff814f44a4>] ? down_read+0x24/0x30
Oct 21 03:00:20 prox01 kernel: [<ffffffff8106cf3d>] exit_mm+0x3d/0x160
Oct 21 03:00:20 prox01 kernel: [<ffffffff8106ed86>] do_exit+0x196/0x980
Oct 21 03:00:20 prox01 kernel: [<ffffffff8106f667>] sys_exit+0x17/0x20
Oct 21 03:00:20 prox01 kernel: [<ffffffff8100b302>] system_call_fastpath+0x16/0x1b
Oct 21 03:00:20 prox01 kernel: qm D ffff880529f3d160 0 25376 25373 0x00000000
Oct 21 03:00:20 prox01 kernel: ffff88007be05ce0 0000000000000082 0000000000000000 ffff88007be05c28
Oct 21 03:00:20 prox01 kernel: ffffffff8112f44f 000000000000f788 ffff88007be05fd8 ffff880529f3d160
Oct 21 03:00:20 prox01 kernel: ffff88052cabe3c0 ffff880529f3d728 ffff88052f808a80 ffff880529f3d728
Oct 21 03:00:20 prox01 kernel: Call Trace:
Oct 21 03:00:20 prox01 kernel: [<ffffffff8112f44f>] ? __rmqueue+0x3cf/0x480
Oct 21 03:00:20 prox01 kernel: [<ffffffff81144499>] ? zone_statistics+0x99/0xc0
Oct 21 03:00:20 prox01 kernel: [<ffffffff81195a97>] ? __link_path_walk+0x387/0x1260
Oct 21 03:00:20 prox01 kernel: [<ffffffff814f4e05>] rwsem_down_failed_common+0x95/0x1d0
Oct 21 03:00:20 prox01 kernel: [<ffffffff814f4f96>] rwsem_down_read_failed+0x26/0x30
Oct 21 03:00:20 prox01 kernel: [<ffffffff812632b4>] call_rwsem_down_read_failed+0x14/0x30
Oct 21 03:00:20 prox01 kernel: [<ffffffff814f44a4>] ? down_read+0x24/0x30
Oct 21 03:00:20 prox01 kernel: [<ffffffff8114f1b3>] access_process_vm+0x53/0x200
Oct 21 03:00:20 prox01 kernel: [<ffffffff811ef197>] proc_pid_cmdline+0x77/0x120
Oct 21 03:00:20 prox01 kernel: [<ffffffff811eff4a>] proc_info_read+0xaa/0xf0
Oct 21 03:00:20 prox01 kernel: [<ffffffff81187118>] vfs_read+0xc8/0x1a0
Oct 21 03:00:20 prox01 kernel: [<ffffffff811872f5>] sys_read+0x55/0x90
Oct 21 03:00:20 prox01 kernel: [<ffffffff8100b302>] system_call_fastpath+0x16/0x1b
Oct 21 03:02:03 prox01 kernel: vmbr0: port 3(tap102i0d0) entering disabled state
Oct 21 03:02:03 prox01 kernel: vmbr0: port 3(tap102i0d0) entering disabled state
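If I read the traces right, several kvm threads and the qm process are stuck in uninterruptible sleep ('D' state) waiting on a read/write semaphore; the paths through exit_mm and sys_munmap suggest the process's mmap_sem. Should it happen again, the blocked tasks can be inspected on demand (standard procps and sysrq; 'w' dumps uninterruptible tasks to the kernel log):

# list processes in D state together with what they are waiting on
ps -eo pid,stat,wchan:30,comm | awk '$2 ~ /D/'
# dump the kernel stacks of all blocked tasks to dmesg
echo 1 > /proc/sys/kernel/sysrq
echo w > /proc/sysrq-trigger
dmesg | tail -n 100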
 
The error message comes from syslog. The first one above is from October 20th;
the following corresponds with the other log file listings from October 21st.

The VM has to be started by hand every morning. It then works the whole
day without any problems, is shut down for the backup, the tar file is written
to the USB disk, and the VM apparently receives the startup command but does not start.
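One thing worth ruling out before the manual start (a guess on my part, not a confirmed cause) is leftover state from the failed attempt, e.g. a stale pidfile or a process still holding the disk image:

# check for leftovers from the failed start
ls -l /var/run/qemu-server/102.pid /var/run/qemu-server/102.mon
# show any process that still has the disk image open
fuser -v /var/lib/vz/images/102/vm-102-disk-1.raw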

prox01:/var/log# cat ./syslog
Oct 21 02:55:55 prox01 qm[25374]: VM 102 sending 'shutdown'
Oct 21 02:55:56 prox01 qm[25376]: VM 102 waiting until stopped
Oct 21 03:02:03 prox01 kernel: vmbr0: port 3(tap102i0d0) entering disabled state
Oct 21 03:02:03 prox01 kernel: vmbr0: port 3(tap102i0d0) entering disabled state
Oct 21 03:03:13 prox01 ntpd[4898]: Deleting interface #32 tap102i0d0, fe80::c809:42ff:fe61:4025#123, interface stats: received=0, sent=0, dropped=0, active_time=64500 secs
Oct 21 03:34:28 prox01 qm[41562]: VM 102 start
Oct 21 03:34:59 prox01 qm[41562]: VM 102 start failed:
command '/usr/bin/kvm -monitor unix:/var/run/qemu-server/102.mon,server,nowait
-vnc unix:/var/run/qemu-server/102.vnc,password
-pidfile /var/run/qemu-server/102.pid
-daemonize
-usbdevice tablet
-name TSRV1
-smp sockets=2,cores=2
-nodefaults
-boot menu=on,order=ca
-vga cirrus
-tdf
-localtime
-rtc-td-hack
-k de -drive if=ide,index=2,media=cdrom
-drive file=/var/lib/vz/images/102/vm-102-disk-1.raw,if=virtio,index=0,cache=none,boot=on
-m 4096 -netdev type=tap,id=vlan0d0,ifname=tap102i0d0,script=/var/lib/qemu-server/bridge-vlan,vhost=on
-device virtio-net-pci,romfile=,mac=FA:00:D6:0A:1B:A0,netdev=vlan0d0'
failed: timeout at /usr/share/perl5/PVE/Storage.pm line 350.
Oct 21 03:37:11 prox01 vzdump[1005146]: INFO: Finished Backup of VM 102 (00:41:17)
Oct 21 03:37:15 prox01 postfix/cleanup[42276]: 552D14063F: message-id=<20111021013714.552D14063F@prox01.localdomain>
Oct 21 03:37:28 prox01 postfix/smtp[42278]: 552D14063F: to=<*****.****@*****.***>, relay=192.168.0.100[192.168.0.100]:25, delay=14, delays=1.1/1.1/0.12/12, dsn=2.6.0, status=sent (250 2.6.0 <20111021013714.552D14063F@prox01.localdomain> Queued mail for delivery)
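The start is issued at 03:34:28 and fails at 03:34:59, so something inside the start path runs into a roughly 30-second timeout. To see which command that is, the code around the line reported in the error message can be inspected directly:

# show the code around the timeout reported at Storage.pm line 350
sed -n '340,360p' /usr/share/perl5/PVE/Storage.pm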
 
OK, the latest kernel is installed now.
We will see whether the problem is gone.
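For the record, the kernel and package versions actually in use can be verified after the reboot:

# confirm the running kernel and PVE package versions
uname -r
pveversion -v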

Many thanks!