I have been using Proxmox since version 1 and am a huge fan; it has been very solid for us.
We have 36 host servers, with approximately 100 KVM and 20 OpenVZ instances running in production.
Recently I upgraded all hosts to version 3.4, and it has been good except for one issue: KVM guests seem to stop at random during backups.
All of our KVM guests run from local storage and back up to an NFS server.
Here is an example of what the backup log looks like when a KVM guest stops:
INFO: starting new backup job: vzdump --mailnotification always --quiet 1 --mailto sysadmin@ivenue.com --mode snapshot --compress lzo --storage dump02_iv --all 1
INFO: Starting Backup of VM 1001159 (qemu)
INFO: status = running
INFO: update VM 1001159: -lock backup
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating archive '/mnt/pve/dump02_iv/dump/vzdump-qemu-1001159-2015_05_23-00_00_01.vma.lzo'
INFO: started backup task '18f3adab-a5c1-4b1d-9d22-491a20c84a25'
INFO: status: 0% (451149824/68719476736), sparse 0% (8048640), duration 3, 150/147 MB/s
INFO: status: 1% (874840064/68719476736), sparse 0% (14073856), duration 6, 141/139 MB/s
INFO: status: 2% (1452015616/68719476736), sparse 0% (32710656), duration 10, 144/139 MB/s
INFO: status: 3% (2550530048/68719476736), sparse 0% (597819392), duration 14, 274/133 MB/s
INFO: status: 5% (3713662976/68719476736), sparse 2% (1404854272), duration 17, 387/118 MB/s
INFO: status: 8% (6175916032/68719476736), sparse 5% (3633430528), duration 20, 820/77 MB/s
INFO: status: 14% (10286596096/68719476736), sparse 11% (7632240640), duration 23, 1370/37 MB/s
INFO: status: 19% (13143179264/68719476736), sparse 14% (10289602560), duration 26, 952/66 MB/s
INFO: status: 23% (16331440128/68719476736), sparse 19% (13314412544), duration 29, 1062/54 MB/s
INFO: status: 29% (19967115264/68719476736), sparse 24% (16939245568), duration 32, 1211/3 MB/s
ERROR: VM 1001159 not running
INFO: aborting backup job
ERROR: VM 1001159 not running
ERROR: Backup of VM 1001159 failed - VM 1001159 not running
INFO: Backup job finished with errors
TASK ERROR: job errors
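In case it helps anyone compare their own logs against this one, here is a rough Python sketch that pulls the last progress line before the first ERROR out of a vzdump task log. It only assumes the line layout shown in the log above. The function name `last_status_before_error` and the dictionary keys are made up for illustration, and I am assuming the two rates in each status line are read/write throughput.

```python
import re

# Matches progress lines in the layout shown in the log above, e.g.
# "INFO: status: 29% (19967115264/68719476736), sparse 24% (16939245568), duration 32, 1211/3 MB/s"
STATUS_RE = re.compile(
    r"INFO: status: (\d+)% \((\d+)/(\d+)\), sparse (\d+)% \((\d+)\), "
    r"duration (\d+), (\d+)/(\d+) MB/s"
)


def last_status_before_error(log_text):
    """Return the last progress sample seen before the first ERROR line.

    Returns None if the log contains no ERROR line. The key names are
    illustrative; the read/write meaning of the two rates is an assumption.
    """
    last = None
    for line in log_text.splitlines():
        match = STATUS_RE.search(line)
        if match:
            percent, done, total, _, sparse, duration, read, write = map(
                int, match.groups()
            )
            last = {
                "percent": percent,
                "bytes_done": done,
                "bytes_total": total,
                "sparse_bytes": sparse,
                "duration_s": duration,
                "read_mbps": read,
                "write_mbps": write,
            }
        elif line.startswith("ERROR:"):
            result = dict(last) if last else {}
            result["error"] = line[len("ERROR:"):].strip()
            return result
    return None


SAMPLE = """\
INFO: status: 23% (16331440128/68719476736), sparse 19% (13314412544), duration 29, 1062/54 MB/s
INFO: status: 29% (19967115264/68719476736), sparse 24% (16939245568), duration 32, 1211/3 MB/s
ERROR: VM 1001159 not running
INFO: aborting backup job
"""

if __name__ == "__main__":
    print(last_status_before_error(SAMPLE))
```

Running this over the failed task logs from several hosts would show whether the guests always stop at a similar point, for example while the second rate has dropped to a few MB/s as it did here.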
Does anyone else have this issue, or know of a solution?
I don't have a Proxmox subscription for updates.
I've seen the issue both on vanilla Proxmox 3.4 installed from the ISO without any updates and on Proxmox upgraded with the non-subscription updates.
# pveversion
pve-manager/3.4-1/3f2d890e (running kernel: 2.6.32-37-pve)
root@ivclds51:~# pveversion -v
proxmox-ve-2.6.32: 3.3-147 (running kernel: 2.6.32-37-pve)
pve-manager: 3.4-1 (running version: 3.4-1/3f2d890e)
pve-kernel-2.6.32-37-pve: 2.6.32-147
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-2
pve-cluster: 3.0-16
qemu-server: 3.3-20
pve-firmware: 1.1-3
libpve-common-perl: 3.0-24
libpve-access-control: 3.0-16
libpve-storage-perl: 3.0-31
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.1-12
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1