Hey All,
I have a bit of an issue with our vzdump backup. It works fine on all VMs except one running Ubuntu 6.06.2.
When the backup kicks off it should suspend the VM and then resume it, but when I check the VM in the morning it is stopped.
All logs indicate that the VM has been resumed by Proxmox, but this is not the case.
Here are the logs:
Apr 10 22:50:01 PROXVE01 /USR/SBIN/CRON[4116]: (root) CMD (vzdump --quiet --node 1 --suspend --storage Backup --mailto it@holdcroft.com 113)
Apr 10 22:50:01 PROXVE01 /USR/SBIN/CRON[4115]: (root) CMD (test -x /usr/lib/atsar/atsa1 && /usr/lib/atsar/atsa1)
Apr 10 22:50:01 PROXVE01 /USR/SBIN/CRON[4118]: (root) CMD (/usr/share/vzctl/scripts/vpsreboot)
Apr 10 22:50:01 PROXVE01 /USR/SBIN/CRON[4117]: (root) CMD (/usr/share/vzctl/scripts/vpsnetclean)
Apr 10 22:50:02 PROXVE01 vzdump[4116]: INFO: trying to get global lock - waiting...
.....
Apr 11 00:21:55 PROXVE01 vzdump[3269]: INFO: Finished Backup of VM 116 (00:07:15)
Apr 11 00:21:55 PROXVE01 vzdump[3269]: INFO: Backup job finished successfuly
Apr 11 00:21:56 PROXVE01 vzdump[4116]: INFO: got global lock
Apr 11 00:21:56 PROXVE01 vzdump[4116]: INFO: starting new backup job: vzdump --quiet --node 1 --suspend --storage Backup --mailto it@holdcroft.com 113
Apr 11 00:21:56 PROXVE01 vzdump[4116]: INFO: Starting Backup of VM 113 (qemu)
Apr 11 00:21:57 PROXVE01 postfix/pickup[4735]: 6BC163A47D2: uid=0 from=<root>
Apr 11 00:21:57 PROXVE01 postfix/cleanup[5682]: 6BC163A47D2: message-id=<20110410232157.6BC163A47D2@PROXVE01.localdomain>
Apr 11 00:21:57 PROXVE01 postfix/qmgr[1938]: 6BC163A47D2: from=<root@PROXVE01.localdomain>, size=6097, nrcpt=1 (queue active)
Apr 11 00:21:58 PROXVE01 postfix/smtp[5685]: 6BC163A47D2: to=<it@holdcroft.com>, relay=vpop.holdcroft.com[192.168.0.5]:25, delay=1.2, delays=0.51/0.06/0.08/0.53, dsn=2.0.0, status=sent (250 2.0.0 OK)
Apr 11 00:21:58 PROXVE01 postfix/qmgr[1938]: 6BC163A47D2: removed
Apr 11 00:21:58 PROXVE01 qm[5692]: VM 113 suspend
Apr 11 00:22:24 PROXVE01 pvemirror[2079]: starting cluster syncronization
Apr 11 00:22:25 PROXVE01 pvemirror[2079]: syncing templates
Apr 11 00:22:25 PROXVE01 pvemirror[2079]: cluster syncronization finished (0.79 seconds (files 0.00, config 0.00))
....
Apr 11 02:25:13 PROXVE01 qm[7685]: VM 113 resume
Apr 11 02:25:13 PROXVE01 kernel: vmbr0: port 2(tap113i0d0) entering disabled state
Apr 11 02:25:13 PROXVE01 kernel: vmbr0: port 2(tap113i0d0) entering disabled state
Apr 11 02:25:14 PROXVE01 vzdump[4116]: INFO: Finished Backup of VM 113 (02:03:18)
Apr 11 02:25:14 PROXVE01 vzdump[4116]: INFO: Backup job finished successfuly
Apr 11 02:25:16 PROXVE01 postfix/pickup[6418]: A24C43A47D2: uid=0 from=<root>
Apr 11 02:25:16 PROXVE01 postfix/cleanup[7697]: A24C43A47D2: message-id=<20110411012516.A24C43A47D2@PROXVE01.localdomain>
Apr 11 02:25:16 PROXVE01 postfix/qmgr[1938]: A24C43A47D2: from=<root@PROXVE01.localdomain>, size=3943, nrcpt=1 (queue active)
Apr 11 02:25:16 PROXVE01 postfix/smtp[7699]: A24C43A47D2: to=<it@holdcroft.com>, relay=vpop.holdcroft.com[192.168.0.5]:25, delay=2.1, delays=1.9/0.03/0.06/0.12, dsn=2.0.0, status=sent (250 2.0.0 OK)
Apr 11 02:25:16 PROXVE01 postfix/qmgr[1938]: A24C43A47D2: removed
Apr 11 02:25:24 PROXVE01 pvemirror[2079]: starting cluster syncronization
Apr 11 02:25:24 PROXVE01 pvemirror[2079]: syncing templates
Apr 11 02:25:24 PROXVE01 pvemirror[2079]: cluster syncronization finished (0.25 seconds (files 0.00, config 0.00))
Apr 11 02:25:29 PROXVE01 ntpd[2090]: Deleting interface #14 tap113i0d0, fe80::f899:dbff:fee6:9910#123, interface stats: received=0, sent=0, dropped=0, active_time=242100 secs
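For reference, the window between the suspend and resume entries can be pulled out of the host syslog with a short script. This is only a rough sketch: the function name `suspend_windows` and the `year` parameter are made up for illustration (syslog lines carry no year), and the regex assumes the `qm[pid]: VM <id> suspend|resume` line format seen in the log above.

```python
import re
from datetime import datetime

# Matches lines such as "Apr 11 00:21:58 PROXVE01 qm[5692]: VM 113 suspend"
QM_EVENT = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+qm\[\d+\]:\s+"
    r"VM\s+(\d+)\s+(suspend|resume)\s*$"
)


def suspend_windows(lines, year=2011):
    """Return {vmid: [(suspend_time, resume_time), ...]} from syslog lines.

    The year is an assumption supplied by the caller, because the
    syslog timestamp format does not include one.
    """
    pending = {}   # vmid -> time of the last unmatched suspend
    windows = {}   # vmid -> list of (suspend, resume) pairs
    for line in lines:
        m = QM_EVENT.match(line)
        if not m:
            continue  # not a qm suspend/resume line
        stamp, vmid, action = m.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        if action == "suspend":
            pending[vmid] = when
        elif vmid in pending:
            # resume with a matching earlier suspend: record the pair
            windows.setdefault(vmid, []).append((pending.pop(vmid), when))
    return windows


# The two qm lines from the host log above.
sample = [
    "Apr 11 00:21:58 PROXVE01 qm[5692]: VM 113 suspend",
    "Apr 11 02:25:13 PROXVE01 qm[7685]: VM 113 resume",
]
for vmid, spans in suspend_windows(sample).items():
    for start, end in spans:
        print(f"VM {vmid} suspended for {end - start}")
```

This only reports what the host logged; it says nothing about whether the guest actually came back up after the resume.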
And the logs from the VM:
Apr 6 23:07:50 han-pdc slapd[8082]: conn=5792 op=1 SRCH base="" scope=0 deref=0 filter="(objectClass=*)"
Apr 6 23:07:50 han-pdc slapd[8082]: conn=5792 op=1 SRCH attr=supportedControl
Apr 7 06:22:54 han-pdc syslogd 1.4.1#17ubuntu7.1: restart.
Apr 7 06:22:54 han-pdc nscd: nss_ldap: could not connect to any LDAP server as (null) - Can't contact LDAP server
Apr 7 06:22:54 han-pdc nscd: nss_ldap: could not connect to any LDAP server as (null) - Can't contact LDAP server
Apr 7 06:22:54 han-pdc kernel: Inspecting /boot/System.map-2.6.15-23-server
Apr 7 06:22:54 han-pdc kernel: Loaded 23140 symbols from /boot/System.map-2.6.15-23-server.
PVE Version:
PROXVE01:~# pveversion -v
pve-manager: 1.8-15 (pve-manager/1.8/5754)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.8-32
pve-kernel-2.6.32-4-pve: 2.6.32-32
qemu-server: 1.1-30
pve-firmware: 1.0-11
libpve-storage-perl: 1.0-17
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-11
vzprocps: 2.0.11-2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.14.0-3
ksm-control-daemon: 1.0-5
Backups are stored to a locally mounted backup folder (/var/lib/vz/backup).
Image files are stored on a secondary disk, mounted on a folder (in Linux) and added to Proxmox as NFS (/var/lib/vz/DataStore/). This is an LVM volume, whereas the backups are just stored on ext3 (I think; it depends on what file system Proxmox VE uses).
I have tried changing the backup method to suspend, stop and snapshot, but I get the same result each time.
I have checked ACPI on the Linux VM and tried installing it as a package, but this did not resolve the issue.
We have a WinXP VM running on the same node that backs up to the same place fine.
Any ideas, guys?