vzdump does not work all the time when there is a filesystem using xfs

bread-baker

Yesterday I added a new disk and set up an XFS file system on it to use for backing up data.

We have about 15 OpenVZ containers. Most run squeeze. The containers are not stored on the XFS file system.

On our nightly backup, 12 of the containers were backed up successfully and 3 were not.

From the vzdump mail we got this [note that the storage is XFS]:
Code:
 vzdump --quiet --snapshot --compress --storage pvebackup --mailto fbcadmin@fantinibakery.com --all


  101        err      00:00:46     0.00MB  -
  102        ok       00:07:41     1.53GB  /bkup/pvebackup/vzdump-openvz-102-2011_07_12-23_00_48.tgz
  104        err      00:01:01     0.00MB  -
  114        ok       00:02:15      331MB  /bkup/pvebackup/vzdump-openvz-114-2011_07_12-23_09_30.tgz
  128        ok       00:01:45      228MB  /bkup/pvebackup/vzdump-openvz-128-2011_07_12-23_11_45.tgz
  129        ok       00:01:24      177MB  /bkup/pvebackup/vzdump-openvz-129-2011_07_12-23_13_30.tgz
  149        ok       00:00:55      169MB  /bkup/pvebackup/vzdump-openvz-149-2011_07_12-23_14_54.tgz
  169        err      00:01:02     0.00MB  -
  175        ok       00:02:24      532MB  /bkup/pvebackup/vzdump-openvz-175-2011_07_12-23_16_51.tgz
.. and more ..

101 is etch, the rest are squeeze.
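
To pull the failed CTIDs out of a summary like the one above, a small awk filter might be enough. This is only a sketch: it assumes the column layout shown in the mail (CTID first, status second), and `failed_ctids` is a made-up helper name, not part of vzdump.

```shell
# Print the CTIDs whose status column reads "err" in a vzdump summary.
# Assumes the layout shown in the mail: CTID, status, duration, size, file.
failed_ctids() {
    awk '$1 ~ /^[0-9]+$/ && $2 == "err" { print $1 }'
}

# Example with three rows copied from the mail:
printf '%s\n' \
    '  101        err      00:00:46     0.00MB  -' \
    '  102        ok       00:07:41     1.53GB  /bkup/pvebackup/vzdump-openvz-102-2011_07_12-23_00_48.tgz' \
    '  104        err      00:01:01     0.00MB  -' | failed_ctids
```

For the three sample rows this prints 101 and 104, one per line.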

From the detail part of the mail, for 104's failure:
Code:
  104: Jul 12 23:08:29 INFO: Starting Backup of VM 104 (openvz)
  104: Jul 12 23:08:29 INFO: CTID 104 exist mounted running
  104: Jul 12 23:08:29 INFO: status = CTID 104 exist mounted running
  104: Jul 12 23:08:29 INFO: mode failure - unable to detect lvm volume group
  104: Jul 12 23:08:29 INFO: trying 'suspend' mode instead
  104: Jul 12 23:08:29 INFO: backup mode: suspend
  104: Jul 12 23:08:29 INFO: ionice priority: 7
  104: Jul 12 23:08:29 INFO: starting first sync /var/lib/vz/private/104/ to /bkup/pvebackup/vzdump-openvz-104-2011_07_12-23_08_29.tmp
  104: Jul 12 23:09:28 INFO: Number of files: 25103
  104: Jul 12 23:09:28 INFO: Number of files transferred: 20268
  104: Jul 12 23:09:28 INFO: Total file size: 711835936 bytes
  104: Jul 12 23:09:28 INFO: Total transferred file size: 708472007 bytes
  104: Jul 12 23:09:28 INFO: Literal data: 708472007 bytes
  104: Jul 12 23:09:28 INFO: Matched data: 0 bytes
  104: Jul 12 23:09:28 INFO: File list size: 552311
  104: Jul 12 23:09:28 INFO: File list generation time: 0.028 seconds
  104: Jul 12 23:09:28 INFO: File list transfer time: 0.000 seconds
  104: Jul 12 23:09:28 INFO: Total bytes sent: 710001609
  104: Jul 12 23:09:28 INFO: Total bytes received: 417774
  104: Jul 12 23:09:28 INFO: sent 710001609 bytes  received 417774 bytes  11939821.56 bytes/sec
  104: Jul 12 23:09:28 INFO: total size is 711835936  speedup is 1.00
  104: Jul 12 23:09:28 INFO: first sync finished (59 seconds)
  104: Jul 12 23:09:28 INFO: suspend vm
  104: Jul 12 23:09:28 INFO: Setting up checkpoint...
  104: Jul 12 23:09:28 INFO:     suspend...
  104: Jul 12 23:09:28 INFO: Can not suspend container: Invalid argument
  104: Jul 12 23:09:28 INFO: Error: unsupported fs type xfs
  104: Jul 12 23:09:28 INFO: Checkpointing failed
  104: Jul 12 23:09:30 ERROR: Backup of VM 104 failed - command 'vzctl --skiplock chkpnt 104 --suspend' failed with exit code 16

OK, so I set up a new backup storage area, which uses ext3, and got this:
Code:
  vzdump --quiet --snapshot --compress --storage vzbkp --mailto fbcadmin@fantinibakery.com 104 169
  
  104: Jul 13 09:45:02 INFO: Starting Backup of VM 104 (openvz)
  104: Jul 13 09:45:02 INFO: CTID 104 exist mounted running
  104: Jul 13 09:45:02 INFO: status = CTID 104 exist mounted running
  104: Jul 13 09:45:02 INFO: mode failure - unable to detect lvm volume group
  104: Jul 13 09:45:02 INFO: trying 'suspend' mode instead
  104: Jul 13 09:45:02 INFO: backup mode: suspend
  104: Jul 13 09:45:02 INFO: ionice priority: 7
  104: Jul 13 09:45:02 INFO: starting first sync /var/lib/vz/private/104/ to /vzbkp/vzdump-openvz-104-2011_07_13-09_45_02.tmp
  104: Jul 13 09:45:43 INFO: Number of files: 25103
  104: Jul 13 09:45:43 INFO: Number of files transferred: 20268
  104: Jul 13 09:45:43 INFO: Total file size: 716086963 bytes
  104: Jul 13 09:45:43 INFO: Total transferred file size: 712723034 bytes
  104: Jul 13 09:45:43 INFO: Literal data: 712723034 bytes
  104: Jul 13 09:45:43 INFO: Matched data: 0 bytes
  104: Jul 13 09:45:43 INFO: File list size: 552275
  104: Jul 13 09:45:43 INFO: File list generation time: 0.013 seconds
  104: Jul 13 09:45:43 INFO: File list transfer time: 0.000 seconds
  104: Jul 13 09:45:43 INFO: Total bytes sent: 714252831
  104: Jul 13 09:45:43 INFO: Total bytes received: 417485
  104: Jul 13 09:45:43 INFO: sent 714252831 bytes  received 417485 bytes  17220971.47 bytes/sec
  104: Jul 13 09:45:43 INFO: total size is 716086963  speedup is 1.00
  104: Jul 13 09:45:43 INFO: first sync finished (41 seconds)
  104: Jul 13 09:45:43 INFO: suspend vm
  104: Jul 13 09:45:43 INFO: Setting up checkpoint...
  104: Jul 13 09:45:43 INFO:     suspend...
  104: Jul 13 09:45:43 INFO: Can not suspend container: Invalid argument
  104: Jul 13 09:45:43 INFO: Error: unsupported fs type xfs
  104: Jul 13 09:45:43 INFO: Checkpointing failed
  104: Jul 13 09:45:44 ERROR: Backup of VM 104 failed - command 'vzctl --skiplock chkpnt 104 --suspend' failed with exit code 16

It is strange that some containers get backed up and others do not.
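
Since the error mentions "unsupported fs type xfs", one thing that could be compared between a failing and a working container is which XFS mounts each one sees. A minimal sketch, assuming a mount table in /proc/mounts format; `xfs_mounts` is a hypothetical helper and the device names in the sample are invented:

```shell
# List mount points whose filesystem type is xfs.
# Input is a mount table in /proc/mounts format:
#   device mountpoint fstype options dump pass
# Pass a file name, or "-" to read from stdin (default: /proc/mounts).
xfs_mounts() {
    awk '$3 == "xfs" { print $2 }' "${1:-/proc/mounts}"
}

# Example with a made-up two-line mount table:
printf '%s\n' \
    '/dev/sda1 / ext3 rw 0 0' \
    '/dev/sdb1 /bkup xfs rw 0 0' | xfs_mounts -
```

On the sample input this prints /bkup. To look from inside a container, the output of `vzctl exec 104 cat /proc/mounts` could be piped into `xfs_mounts -`.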

Does anyone have a suggestion to try?

Otherwise I'll need to reformat the 1 TB drive and put the 500 GB+ of data back.
 
Note: if I use --stop, the backup works:
Code:
vzdump --stop --storage pvebackup 104
INFO: starting new backup job: vzdump --stop --storage pvebackup 104
INFO: Starting Backup of VM 104 (openvz)
INFO: CTID 104 exist mounted running
INFO: status = CTID 104 exist mounted running
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: stopping vm
INFO: Stopping container ...
INFO: Container was stopped
INFO: Container is unmounted
INFO: creating archive '/bkup/pvebackup/vzdump-openvz-104-2011_07_13-10_23_08.tar'
INFO: Total bytes written: 718653440 (686MiB, 413MiB/s)
INFO: archive file size: 685MB
INFO: delete old backup '/bkup/pvebackup/vzdump-openvz-104-2011_07_06-23_11_16.tgz'
INFO: restarting vm
INFO: Starting container ...
INFO: Container is mounted
INFO: Adding IP address(es): 10.100.100.146
INFO: Setting CPU units: 1000
INFO: Setting CPUs: 1
INFO: Set hostname: fbcadmin.fantinibakery.com
INFO: File resolv.conf was modified
INFO: Setting quota ugidlimit: 0
INFO: Container start in progress...
INFO: vm is online again after 8 seconds
INFO: Finished Backup of VM 104 (00:00:08)
INFO: Backup job finished successfuly
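
As a stopgap, the containers that fail in suspend mode could be retried in stop mode after the nightly run. The following is only a sketch: `retry_stopped` and the `DRY_RUN` switch are invented for illustration, and the only vzdump options used are the ones from the command above. Keep in mind that stop mode shuts the container down for the duration of the backup (8 seconds for 104 here).

```shell
# Retry the given CTIDs in stop mode, one at a time.
# Usage: retry_stopped STORAGE CTID...
# With DRY_RUN=1 (the default in this sketch) the commands are only printed.
retry_stopped() {
    storage=$1
    shift
    for ctid in "$@"; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "vzdump --stop --storage $storage $ctid"
        else
            vzdump --stop --storage "$storage" "$ctid" ||
                echo "backup of CT $ctid failed again" >&2
        fi
    done
}

# The three containers that failed in the nightly run:
retry_stopped pvebackup 101 104 169
```

Setting `DRY_RUN=0` would run the real commands.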
 
OK, the moral of the story is: do not use XFS for any partition on the system.
So I'll reformat /bkup to ext3 and restore its data.
 

I am not sure if that helps - seems to be a container related problem. What kernel version do you run?
 
It has worked for a year.

Last night I changed the drive to ext3, and we are currently running on what was the secondary. The system which had the XFS mount is now the secondary. As we only run the backup on the primary, I'll need to switch back on the weekend and have the backup run.
The backup results from early this morning:
vzdump backup status (proxmox4.fantinibakery.com) : backup successful

So I'll test this on the current system: make and mount a small XFS partition, then try to back up 2 of the containers which failed on the other.
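
A test XFS mount does not need a spare partition; a loop-backed image file may be enough. This is a rough sketch under assumptions: the paths in `IMG` and `MNT` are hypothetical, the 512 MB size is arbitrary, the real run needs root, and `DRY_RUN=1` (the default here) only prints the steps.

```shell
# Sketch: create and mount a loop-backed XFS filesystem for testing.
# IMG and MNT are hypothetical paths; adjust before a real run (as root).
IMG=${IMG:-/tmp/xfs-test.img}
MNT=${MNT:-/mnt/xfs-test}

# Run a command, or just print it when DRY_RUN=1 (default).
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_xfs_test() {
    run dd if=/dev/zero of="$IMG" bs=1M count=512   # 512 MB backing file
    run mkfs.xfs "$IMG"                             # format the image as XFS
    run mkdir -p "$MNT"
    run mount -o loop "$IMG" "$MNT"                 # mount via a loop device
}

setup_xfs_test
```

With the test filesystem mounted, the same vzdump command that failed before can be run against one of the affected containers.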
 
Using a different system with an XFS filesystem mounted, vzdump works. So I cannot reproduce the issue.

When the original system is the primary again [with DRBD we use just one primary], I'll try the test again.