We have an extra server that we have been using to restore our vzdump backups as part of our backup validation process.
The performance of qmrestore was quite poor, and we spent a lot of time trying to figure out why.
The solution to this problem was mentioned here (thanks, kefear!):
http://forum.proxmox.com/threads/4830-Poor-performance-with-qmrestore?p=27545#post27545
We tried everything we could think of; the only thing that made the restore process faster was editing qmrestore to add "oflag=direct" to the dd command:
exec 'dd', 'bs=256K', "of=$path", "oflag=direct";
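For context, here is a minimal standalone sketch (not the actual qmrestore code) of the kind of pipeline that exec line sits in: tar streams the raw disk image to stdout and a forked dd writes it to the target volume with oflag=direct. The $archive, $member and $path variables are placeholders I made up for illustration, not qmrestore internals.
Code:
use strict;
use warnings;

# Placeholders, not qmrestore internals: the backup archive, the disk image
# member inside it, and the destination LV device node.
my ($archive, $member, $path) = @ARGV;

# Fork a dd writer; the parent gets a pipe connected to the child's STDIN.
my $pid = open(my $dd, '|-');
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: dd reads the image from STDIN and writes it to the device.
    # 'oflag=direct' bypasses the page cache, which is what made the
    # restore so much faster for us.
    exec 'dd', 'bs=256K', "of=$path", 'oflag=direct'
        or die "exec dd failed: $!";
}

# Parent: stream the raw image out of the gzipped tar archive into dd.
open(my $tar, '-|', 'tar', '-xOzf', $archive, $member)
    or die "tar failed: $!";
binmode($tar);
binmode($dd);

my $buf;
while (read($tar, $buf, 256 * 1024)) {
    print {$dd} $buf or die "write to dd failed: $!";
}
close($tar);
close($dd) or warn "dd exited with status $?\n";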
It was mentioned in the other thread that this might have a negative impact on the performance of other VMs on the system.
While this is true, I agree with kefear: if I am restoring, something bad has already happened and I want the issue fixed ASAP, even if it impacts performance a bit.
4802 seconds (~79 MB/sec) to restore without oflag=direct
921 seconds (~414 MB/sec) to restore with oflag=direct
I would gladly trade 15 minutes of slowness to reduce my downtime from 80 minutes (4802 seconds) to 15 minutes (921 seconds).
In our case I doubt it would cause a slowdown anyway; even at 414 MB/sec our I/O is not saturated, and uncompressing the tar file is the bottleneck.
I feel qmrestore should always use oflag=direct, or at least offer an option to turn it on via a command-line switch or a configuration file.
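For what it's worth, making it optional would only take a couple of lines around the dd invocation. A rough sketch, assuming a hypothetical --direct switch ($opt_direct and the option name are my own invention, not an existing qmrestore option):
Code:
use strict;
use warnings;
use Getopt::Long;

# Hypothetical: only append oflag=direct when a (made-up) --direct switch
# is given on the command line.
my $opt_direct = 0;
GetOptions('direct!' => \$opt_direct) or die "bad options\n";

my $path   = shift @ARGV;                     # destination device
my @dd_cmd = ('dd', 'bs=256K', "of=$path");
push @dd_cmd, 'oflag=direct' if $opt_direct;

exec @dd_cmd or die "exec dd failed: $!";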
Without oflag=direct:
Code:
INFO: restore QemuServer backup '/backup1/vzdump-qemu-101-2011_08_29-22_00_06.tgz' using ID 1101
INFO: extracting 'qemu-server.conf' from archive
INFO: extracting 'vm-disk-virtio0.raw' from archive
INFO: Rounding up size to full physical extent 32.00 GB
INFO: Logical volume "vm-1101-disk-1" created
INFO: new volume ID is 'DRBD0:vm-1101-disk-1'
INFO: restore data to '/dev/drbdvg0/vm-1101-disk-1' (34359738368 bytes)
INFO: 0+995464 records in
INFO: 0+995464 records out
INFO: 34359738368 bytes (34 GB) copied, 455.777 s, 75.4 MB/s
INFO: extracting 'vm-disk-virtio1.raw' from archive
INFO: Rounding up size to full physical extent 250.00 GB
INFO: Logical volume "vm-1101-disk-2" created
INFO: new volume ID is 'DRBD0:vm-1101-disk-2'
INFO: restore data to '/dev/drbdvg0/vm-1101-disk-2' (268435456000 bytes)
INFO: 0+4083496 records in
INFO: 0+4083496 records out
INFO: 268435456000 bytes (268 GB) copied, 2889.31 s, 92.9 MB/s
INFO: extracting 'vm-disk-virtio2.raw' from archive
INFO: Rounding up size to full physical extent 100.00 GB
INFO: Logical volume "vm-1101-disk-3" created
INFO: new volume ID is 'DRBD0:vm-1101-disk-3'
INFO: restore data to '/dev/drbdvg0/vm-1101-disk-3' (107374182400 bytes)
INFO: 2+1798722 records in
INFO: 2+1798722 records out
INFO: 107374182400 bytes (107 GB) copied, 1456.95 s, 73.7 MB/s
INFO: restore QemuServer backup '/backup1/vzdump-qemu-101-2011_08_29-22_00_06.tgz' successful
With oflag=direct:
Code:
INFO: restore QemuServer backup '/backup1/vzdump-qemu-101-2011_08_29-22_00_06.tgz' using ID 3101
INFO: extracting 'qemu-server.conf' from archive
INFO: extracting 'vm-disk-virtio0.raw' from archive
INFO: Rounding up size to full physical extent 32.00 GB
INFO: Logical volume "vm-3101-disk-1" created
INFO: new volume ID is 'DRBD0:vm-3101-disk-1'
INFO: restore data to '/dev/drbdvg0/vm-3101-disk-1' (34359738368 bytes)
INFO: 0+1203040 records in
INFO: 0+1203040 records out
INFO: 34359738368 bytes (34 GB) copied, 129.033 s, 266 MB/s
INFO: extracting 'vm-disk-virtio1.raw' from archive
INFO: Rounding up size to full physical extent 250.00 GB
INFO: Logical volume "vm-3101-disk-2" created
INFO: new volume ID is 'DRBD0:vm-3101-disk-2'
INFO: restore data to '/dev/drbdvg0/vm-3101-disk-2' (268435456000 bytes)
INFO: 0+4707404 records in
INFO: 0+4707404 records out
INFO: 268435456000 bytes (268 GB) copied, 576.755 s, 465 MB/s
INFO: extracting 'vm-disk-virtio2.raw' from archive
INFO: Rounding up size to full physical extent 100.00 GB
INFO: Logical volume "vm-3101-disk-3" created
INFO: new volume ID is 'DRBD0:vm-3101-disk-3'
INFO: restore data to '/dev/drbdvg0/vm-3101-disk-3' (107374182400 bytes)
INFO: 0+2102632 records in
INFO: 0+2102632 records out
INFO: 107374182400 bytes (107 GB) copied, 245.053 s, 438 MB/s
INFO: restore QemuServer backup '/backup1/vzdump-qemu-101-2011_08_29-22_00_06.tgz' successful