vzdump umount and lvremove steps is very slow

andrea

Member
Sep 18, 2009
36
0
6
Roma, Italy
www.mediatouch.it
From some days we noticed the umount and lvremove steps takes very long to perform, with no apprent reason. From a time point of view the vzdump of the same group of OpenVZ containers (sized approximately 1GB each) completed in about 30 minutes now takes about 30-40 minutes for each container.

Looking at the log of the containers dump, we noticed the backup completes very fast, however the umont and lvremove steps take about 30 minutes, which is very strange:

Oct 12 08:06:11 INFO: Total bytes written: 1059358720 (1011MiB, 18MiB/s)
Oct 12 08:06:11 INFO: archive file size: 1010MB
Oct 12 08:06:12 INFO: delete old backup '/media/extstorage/vzdump/vzdump-openvz-106-2010_09_20-06_43_59.tar'
Oct 12 08:37:00 INFO: Logical volume "vzsnap-proxmox1-0" successfully removed
Oct 12 08:37:00 INFO: Finished Backup of VM 106 (00:31:57)

This holds true with all the dumps:

Oct 12 07:23:32 INFO: Total bytes written: 592445440 (565MiB, 18MiB/s)
Oct 12 07:23:32 INFO: archive file size: 565MB
Oct 12 07:23:32 INFO: delete old backup '/media/extstorage/vzdump/vzdump-openvz-105-2010_09_20-06_42_20.tar'
Oct 12 08:05:03 INFO: Logical volume "vzsnap-proxmox1-0" successfully removed
Oct 12 08:05:03 INFO: Finished Backup of VM 105 (00:42:08)

Both /media and /mnt are in PRUNEPATHS of updatedb.conf. Lvs command dsiplay an occupation of 2-3% for vzsnap snapshot. All applications running in the container are also affected, responding slowly until vzdump completes.

Option used in vzdump command (launched form command line):
--dumpdir=/path/to/external/storage (physical external disk)
--maxfiles 7
--snapshot
--all
--bwlimit 20480
--exclude (some unwanted containers)

I'm clueless at the moment. :confused:

Andrea
 
Oct 12 07:23:32 INFO: delete old backup '/media/extstorage/vzdump/vzdump-openvz-105-2010_09_20-06_42_20.tar'

Obviously above command is quite slow - So it seems your external storage is simply slow (/media/extstorage/)? Please post the output of:

# pveperf /media/extstorage/vzdump
 
Dietmar:

# pveperf /media/extstorage/vzdump
CPU BOGOMIPS: 10400.28
REGEX/SECOND: 459861
HD SIZE: 2750.67 GB (/dev/mapper/backups)
BUFFERED READS: 30.02 MB/sec
AVERAGE SEEK TIME: 18.03 ms
FSYNCS/SECOND: 122.33

Tried also to delete a old single vzdump .tar file residing on the same external storage (about 750MB):

# date; rm -fr vzdump-openvz-201*; date

Tue Oct 12 10:17:03 CEST 2010
Tue Oct 12 10:17:04 CEST 2010

More or less 1 second.
 
Thanks for the hint. The test made:

1. dump using the external sotrage as dump dir (on a different folder):

# vzdump --dumpdir=/media/extstorage/vzdumptest --maxfiles 1 --snapshot --bwlimit 20480 101

Oct 12 11:09:43 INFO: Starting Backup of VM 101 (openvz)
Oct 12 11:09:43 INFO: CTID 101 exist mounted running
Oct 12 11:09:43 INFO: status = CTID 101 exist mounted running
Oct 12 11:09:43 INFO: backup mode: snapshot
Oct 12 11:09:43 INFO: bandwidth limit: 20480 KB/s
Oct 12 11:09:43 INFO: ionice priority: 7
Oct 12 11:09:43 INFO: creating lvm snapshot of /dev/mapper/pve-data ('/dev/pve/vzsnap-proxmox1-0')
Oct 12 11:09:44 INFO: Logical volume "vzsnap-proxmox1-0" created
Oct 12 11:09:45 INFO: creating archive '/media/extstorage/vzdumptest/vzdump-openvz-101-2010_10_12-11_09_43.tar'
Oct 12 11:10:58 INFO: Total bytes written: 925265920 (883MiB, 16MiB/s)
Oct 12 11:10:58 INFO: archive file size: 882MB
Oct 12 11:40:21 INFO: Logical volume "vzsnap-proxmox1-0" successfully removed
Oct 12 11:40:21 INFO: Finished Backup of VM 101 (00:30:38)


2. dump using the server internal disk array (hardware RAID10)

# vzdump --dumpdir=/delendo --maxfiles 1 --snapshot --bwlimit 20480 101

Oct 12 11:46:53 INFO: Starting Backup of VM 101 (openvz)
Oct 12 11:46:53 INFO: CTID 101 exist mounted running
Oct 12 11:46:53 INFO: status = CTID 101 exist mounted running
Oct 12 11:46:53 INFO: backup mode: snapshot
Oct 12 11:46:53 INFO: bandwidth limit: 20480 KB/s
Oct 12 11:46:53 INFO: ionice priority: 7
Oct 12 11:46:53 INFO: creating lvm snapshot of /dev/mapper/pve-data ('/dev/pve/vzsnap-proxmox1-0')
Oct 12 11:46:54 INFO: Logical volume "vzsnap-proxmox1-0" created
Oct 12 11:46:55 INFO: creating archive '/delendo/vzdump-openvz-101-2010_10_12-11_46_53.tar'
Oct 12 11:48:07 INFO: Total bytes written: 925265920 (883MiB, 14MiB/s)
Oct 12 11:48:07 INFO: archive file size: 882MB
Oct 12 12:20:44 INFO: Logical volume "vzsnap-proxmox1-0" successfully removed
Oct 12 12:20:44 INFO: Finished Backup of VM 101 (00:33:51)

Time looks quite similar. Then tried to remove both dumps:

From external storage:
# time rm -fr /media/extstorage/vzdumptest/vzdump-openvz-101*
real 0m0.082s
user 0m0.000s
sys 0m0.080s

From server raid10 disk array:
# time rm -fr /delendo/vzdump-openvz-101*
real 0m0.318s
user 0m0.000s
sys 0m0.320s
 
Tested, with:

# vzdump --dumpdir=/media/extstorage/vzdumptest --maxfiles 1 --snapshot --ionice 2 --bwlimit 20480 101

Not much difference:

Oct 12 15:17:17 INFO: Starting Backup of VM 101 (openvz)
Oct 12 15:17:17 INFO: CTID 101 exist mounted running
Oct 12 15:17:17 INFO: status = CTID 101 exist mounted running
Oct 12 15:17:18 INFO: backup mode: snapshot
Oct 12 15:17:18 INFO: bandwidth limit: 20480 KB/s
Oct 12 15:17:18 INFO: ionice priority: 2
Oct 12 15:17:18 INFO: creating lvm snapshot of /dev/mapper/pve-data ('/dev/pve/vzsnap-proxmox1-0')
Oct 12 15:17:19 INFO: Logical volume "vzsnap-proxmox1-0" created
Oct 12 15:17:20 INFO: creating archive '/media/extstorage/vzdumptest/vzdump-openvz-101-2010_10_12-15_17_17.tar'
Oct 12 15:18:22 INFO: Total bytes written: 925276160 (883MiB, 17MiB/s)
Oct 12 15:18:22 INFO: archive file size: 882MB
Oct 12 15:44:19 INFO: Logical volume "vzsnap-proxmox1-0" successfully removed
Oct 12 15:44:19 INFO: Finished Backup of VM 101 (00:27:02)
 
# pveperf

CPU BOGOMIPS: 10400.28
REGEX/SECOND: 449780
HD SIZE: 94.49 GB (/dev/mapper/pve-root)
BUFFERED READS: 154.28 MB/sec
AVERAGE SEEK TIME: 11.68 ms
FSYNCS/SECOND: 1231.74
DNS EXT: 61.63 ms
DNS INT: 1.12 ms

Last night we tested a vzdump using "--ionice 0" in a time window where containers are not under load. Dump took more than two hours, following the log of the first dumped system, about two minutes to make the dump and about 20 minutes for the rest:

Oct 12 21:15:02 INFO: Starting Backup of VM 101 (openvz)
Oct 12 21:15:02 INFO: CTID 101 exist mounted running
Oct 12 21:15:02 INFO: status = CTID 101 exist mounted running
Oct 12 21:15:02 INFO: backup mode: snapshot
Oct 12 21:15:02 INFO: ionice priority: 0
Oct 12 21:15:02 INFO: creating lvm snapshot of /dev/mapper/pve-data ('/dev/pve/vzsnap-proxmox1-0')
Oct 12 21:15:06 INFO: Logical volume "vzsnap-proxmox1-0" created
Oct 12 21:15:06 INFO: creating archive '/media/extstorage/vzdump/vzdump-openvz-101-2010_10_12-21_15_02.tar'
Oct 12 21:16:19 INFO: Total bytes written: 925286400 (883MiB, 14MiB/s)
Oct 12 21:16:19 INFO: archive file size: 882MB
Oct 12 21:16:20 INFO: delete old backup '/media/extstorage/vzdump/vzdump-openvz-101-2010_09_29-06_15_02.tar'
Oct 12 21:36:26 INFO: Logical volume "vzsnap-proxmox1-0" successfully removed
Oct 12 21:36:26 INFO: Finished Backup of VM 101 (00:21:24)
 
I launched manually a vzdump:

# vzdump --dumpdir=/media/extstorage/vzdumptest --maxfiles 1 --snapshot --ionice 0 101

Looking at the folder where vzdump is done, I noticed the folder ".tmp" that appears to stay there even after the messages of tar completed:

....
....
INFO: creating archive '/media/extstorage/vzdumptest/vzdump-openvz-101-2010_10_13-10_32_21.tar'
INFO: Total bytes written: 925337600 (883MiB, 14MiB/s)
INFO: archive file size: 882MB

The folder "vzdumptest" shows the tar file but not the log. Log will appear after the umont and lvremove, in the meantime (about 20 minutes) vzdump is trying to complete.
 
To debug that, we need to know what causes the delay - umount or lvremove. (I suggest you simple add some debug statements to /usr/share/perl5/PVE/VZDump/QemuServer.pm)

Code:
--- VZDump::QemuServer.pm    2010-10-12 10:38:48.000000000 +0200
+++ VZDump::QemuServer-debug.pm    2010-10-13 10:59:20.000000000 +0200
@@ -438,15 +438,20 @@
     my ($self, $task, $vmid) = @_;
 
    foreach my $di (@{$task->{disks}}) {
        if ($di->{snapshot_mount}) {
+          $self->loginfo ("start umount");
           $self->cmd_noerr ("umount $di->{mountpoint}");
+          $self->loginfo ("end umount");
        }
 
        if ($di->{cleanup_lvm}) {
        if (-b $di->{snapdev}) {
            if ($di->{type} eq 'block') {
+             $self->loginfo ("start snapshot_free");
              $self->snapshot_free ($di->{volid}, $di->{snapname}, $di->{snapdev}, 1);
+             $self->loginfo ("end snapshot_free");
            } elsif ($di->{type} eq 'file') {
              $self->snapshot_free ($di->{path}, $di->{snapname}, $di->{snapdev}, 1);
            }
 
Ok, searched for the corresponding lines in OpenVZ.pm but I was unable to find them. I'm not sure I correctly understood where to do the changes. I did them in QemuServer.pm but none of the messages added was logged when issuing:

# vzdump --dumpdir=/media/extstorage/vzdumptest --maxfiles 1 --snapshot --ionice 0 101
 
pls add logs to the forum, no pastbin please. (add as *.txt attachment).
 

About

The Proxmox community has been around for many years and offers help and support for Proxmox VE, Proxmox Backup Server, and Proxmox Mail Gateway.
We think our community is one of the best thanks to people like you!

Get your subscription!

The Proxmox team works very hard to make sure you are running the best software and getting stable updates and security enhancements, as well as quick enterprise support. Tens of thousands of happy customers have a Proxmox subscription. Get yours easily in our online shop.

Buy now!