KVM machines very slow / unreachable during vmtar backup

Code:
...
FSYNCS/SECOND: 21.73
...

Looks quite bad; this is the reason. It should be between 1000 and 5000. You have no RAID controller with RAID cache enabled (with BBU).

More details about your storage config?
 
Our VMs are on LVM, on an Areca 1222 with RAID cache enabled and BBU, 8x 1TB SAS drives. It's quite fast. Is there a way to run pveperf on a physical volume? We have no mounted filesystems from this storage on the PVE host.
 
No, pveperf needs a filesystem. But if you want to test, just create an LVM volume, format it with ext3, and mount it. Then run the test.
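A rough sketch (the volume group name "array", the size, and the mount point are assumptions; adjust them to your setup):

Code:
lvcreate -L 100G -n test array      # carve a 100 GB test LV out of the VG
mkfs.ext3 /dev/array/test           # format it with ext3
mkdir -p /mnt/test
mount /dev/array/test /mnt/test
pveperf /mnt/test                   # benchmark the mounted filesystem
umount /mnt/test                    # clean up afterwards
lvremove -f /dev/array/test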
 
Here you go:

Code:
CPU BOGOMIPS: 44756.82
REGEX/SECOND: 862539
HD SIZE: 98.43 GB (/dev/mapper/array-test)
BUFFERED READS: 464.61 MB/sec
AVERAGE SEEK TIME: 9.18 ms
FSYNCS/SECOND: 3595.76
DNS EXT: 48.39 ms
DNS INT: 41.49 ms (praece.com)
 
Speed is great, as expected with your hardware. The bottleneck seems to be somewhere else. By the way, did you change the elevator? Which kernel do you run?

Can you reduce the backup speed to 10000 to get another result?
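In /etc/vzdump.conf that would be something like this (a sketch; the value is in KB/s):

Code:
# /etc/vzdump.conf
bwlimit: 10000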
 
proxmox-ve-2.6.32: 1.9-53

I changed /sys/block/sda/queue/scheduler to noop, which seemed to help system performance a bit in general. The elevator for the other drives is cfq, as they are single SATA drives (boot/pve and backup).
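For reference, the runtime switch is just this (it does not persist across reboots):

Code:
echo noop > /sys/block/sda/queue/scheduler   # switch sda to the noop elevator
cat /sys/block/sda/queue/scheduler           # the active elevator is shown in [brackets]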

I will change backup speed to 10k and check resulting system load tomorrow.

Any other ideas?

Mike
 
Also, even if 10k improves it, the situation will still be quite poor, as backups will take a very, very long time. If scaling bwlimit down further helps, should we expect a faster backup target disk to let us bring bwlimit back up?
 
VERY interesting result from last night: on a hunch, rather than moving bwlimit down to 10000, I changed our size parameter (the maximum snapshot diff size) from 32768 to 8192. We had increased the size parameter a while ago because we were hitting the default maximum on some VMs that used to be quite I/O-intensive during the night.
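For reference, the change amounts to this (assuming the option is set in /etc/vzdump.conf rather than on the vzdump command line; the value is the LVM snapshot size in MB):

Code:
# /etc/vzdump.conf -- "size" caps the LVM snapshot diff, in MB
size: 8192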

Interestingly, this reduced load by 2.5x (resulting in acceptable performance) throughout the entire backup period.

I believe we've found our culprit. I will re-increase bwlimit tonight (to 50000) and see what happens to load. I don't know much about LVM internals, but apparently snapshot size has a dramatic effect on performance.
 
Yes, LVM snapshots are not the performance winner, but so far they work great in almost 99% of cases.

For big KVM guests (especially Windows) I tend to use a separate disk (e.g. 4 GB) for swap, and I exclude this one from backups. The snapshot then stays smaller during the backup. I also use cache=unsafe for this swap disk.

Example in a config file:

Code:
virtio0: local:126/vm-126-disk-3.raw,cache=unsafe,backup=no
 
I don't have an /etc/vzdump.conf, so everything is at defaults I guess, but I'm still having I/O problems.
It only happens on our PVE 1.9 (vzdump 1.2-16) machine. Can I downgrade vzdump only (to vzdump 1.2-14)?
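I guess something like this would do it, if 1.2-14 is still in the repository (untested):

Code:
apt-get install vzdump=1.2-14              # install the specific older version
echo vzdump hold | dpkg --set-selections   # keep it from being upgraded again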
 
Or, perhaps, do any of the Proxmox folks have an idea of what may have changed between these versions to cause load issues?
 
Yes, something definitely changed. We are using OpenVZ guests, but the experience is very similar: huge load while doing snapshot backups. There was even a 5-minute hang at the start of each backup (when creating the snapshot), but that has been gone for 2-3 weeks now. Only the huge load remains.
 
Just to note, doing suspend backups for OpenVZ guests is also great: just a minimal downtime, and you do not need an LVM snapshot.
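A minimal sketch, assuming container VMID 101 and the vzdump 1.x flag syntax:

Code:
vzdump --suspend 101   # online rsync first, then a short suspend for the final sync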
 
Tom, do you know what changes were made in the script? Or is it perhaps a kernel change that caused this? Many of us love LVM snapshot backups, but the performance regression has been huge.
 
Unlike the other user (atran), I don't have enough data / test machines to correlate this perfectly with a specific version. All of our machines showing major load during backup are running the newest pvetest kernel 2.6.32, built Nov 24.
 
Did you test with cfq? ionice only works with the cfq scheduler.
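For example (a generic illustration; the PID is made up):

Code:
ionice -c3 -p 1234   # idle I/O class for PID 1234; only honored by the cfq scheduler
ionice -p 1234       # query the current class/priority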
 
We have tested with cfq; noop actually improves the situation over cfq, so ionice on the source, at least, doesn't seem to be causing the issue.
 
Indeed, the other hosts that use 2.6.35 don't have this problem (they also use deadline).
My problem began when I downgraded from 2.6.35 (with PVE 1.8) to the recommended 2.6.32 (with PVE 1.9).

Problem machine:

Code:
pve-manager: 1.9-26 (pve-manager/1.9/6567)
running kernel: 2.6.32-6-pve
proxmox-ve-2.6.32: 1.9-50
pve-kernel-2.6.32-4-pve: 2.6.32-33
pve-kernel-2.6.32-6-pve: 2.6.32-50
qemu-server: 1.1-32
pve-firmware: 1.0-14
libpve-storage-perl: 1.0-19
vncterm: 0.9-2
vzctl: 3.0.29-3pve1
vzdump: 1.2-16
vzprocps: 2.0.11-2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.15.0-1
ksm-control-daemon: 1.0-6
 