High server load during backup creation

So, even after updating to Proxmox v3.2 and rebooting so that all changes take effect, running a backup to an NFS share still causes high load (~3.0) inside a virtual machine. I have not monitored that machine for more detailed problems occurring within it, but even the high load should not occur. Is there anything I can provide to help you track down the problem?

Code:
root@proxmox:~# pveversion -v
proxmox-ve-2.6.32: 3.2-121 (running kernel: 2.6.32-27-pve)
pve-manager: 3.2-1 (running version: 3.2-1/1933730b)
pve-kernel-2.6.32-27-pve: 2.6.32-121
pve-kernel-2.6.32-25-pve: 2.6.32-113
pve-kernel-2.6.32-26-pve: 2.6.32-114
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.5-1
pve-cluster: 3.0-12
qemu-server: 3.1-15
pve-firmware: 1.1-2
libpve-common-perl: 3.0-14
libpve-access-control: 3.0-11
libpve-storage-perl: 3.0-19
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-6
vzctl: 4.0-1pve5
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.7-4
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.2-1
 
Your original post said load went over 6 and now you observe 3. Seems like an improvement.

Backups have an impact on disk IO; the size of that impact depends on your system's IO capabilities. Your IO subsystem can only go so fast, and when you add a backup process to that IO you should expect a higher load average. There is no free lunch.

Be careful, looks like backups are causing VMs to crash now, KVM fails with a segmentation fault.
http://forum.proxmox.com/threads/18069-After-update-to-3-2-VM-crashing-during-backup
 
As I have stated several times: IO that happens on the VM host system should never, never affect the IO or load of the guest system. My server should be able to handle this without problems. Data from pveperf:
Code:
root@proxmox:~# pveperf
CPU BOGOMIPS:      31997.08
REGEX/SECOND:      839138
HD SIZE:           94.49 GB (/dev/mapper/pve-root)
BUFFERED READS:    111.14 MB/sec
AVERAGE SEEK TIME: 7.90 ms
FSYNCS/SECOND:     2278.53
DNS EXT:           18.31 ms
DNS INT:           0.96 ms (my.domain)
The CPU is an Intel Xeon E5504, the VMs are stored on a RAID 5 with hot spare, and the backup goes over NFS to an external server, at an average speed of 50 MB/s according to the backup log. The load is above 1.5 during the first three of the five hours the backup runs, while there is no other extraordinary usage on the server. During the last few weeks, while the backup was disabled, the load was never higher than 1.1, and even that was reached only very rarely...
 
As I stated some times: IO that happens on the VM host system should never, never affect the IO or load of the guest system

The totality of the disk IO will cause the load average of the host and the VMs to go up. What you think 'should never' happen is a wrong assumption on your part.
You have a finite amount of IO to/from your VM disk.
When the backup process is reading from your VM disk, that uses some of the total IO available to your VM disk.

Your pveperf says that you can read at 111MB/sec. If your VM is reading 60MB/sec and the backup is reading at 51MB/sec, you are using 100% of your total IO.
If either the backup or the VM wants to (can) read more than that, then load average goes up. There is no way to avoid this unless you have infinite IO available.
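
As a rough sketch of that arithmetic: the snippet below treats the pveperf buffered-read figure as the whole read budget, which is a simplification, and the helper name io_headroom is only illustrative.

```shell
#!/bin/sh
# io_headroom TOTAL BACKUP: print the read bandwidth (MB/s) left for the
# guests once the backup takes its share. Both inputs are whole MB/s.
io_headroom() {
    echo $(( $1 - $2 ))
}

# Figures from this thread: ~111 MB/s buffered reads, ~51 MB/s backup.
echo "Headroom with backup at 51 MB/s: $(io_headroom 111 51) MB/s"

# The same budget if the backup were throttled to a lower rate.
echo "Headroom with backup at 35 MB/s: $(io_headroom 111 35) MB/s"
```

Anything the guests want to read beyond that headroom has to wait, and waiting on disk is what shows up as load.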

The load average is just one metric that gives you an idea of what is going on. It is OK for it to go up, and perfectly normal for it to go up when IO-intensive tasks such as backups are taking place.
Are you having a performance problem, or do you just see a higher load average and 'think' there is a problem?
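
On Linux the load average also counts tasks sitting in uninterruptible sleep (state D, usually waiting on disk), so it can rise without the CPU being busy. A minimal sketch for checking how many tasks are in that state right now; the helper name count_dstate is made up for this example.

```shell
#!/bin/sh
# count_dstate: read one process state per line on stdin and print how
# many of them are in uninterruptible sleep (state starts with "D").
count_dstate() {
    awk '/^D/ { n++ } END { print n + 0 }'
}

# Feed it the state column from ps, if ps is available on this system.
if command -v ps >/dev/null 2>&1; then
    echo "Tasks in D state: $(ps -eo state= | count_dstate)"
fi
```

Run inside the guest while the backup is active, a consistently non-zero count would suggest the load is IO wait rather than CPU work.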

If the backup is causing issues, because it is using too much IO, then you need to limit how fast the backup process reads OR you need to get a faster IO system.
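
For the first option, vzdump has a bwlimit setting. The sketch below assumes the value is given in KB/s (check man vzdump on your version); the 35 MB/s figure and VM ID 101 are placeholders, and the script only prints the lines instead of changing anything.

```shell
#!/bin/sh
# mb_to_kb MB: convert a whole MB/s figure to the KB/s value assumed
# for vzdump's bwlimit setting.
mb_to_kb() {
    echo $(( $1 * 1024 ))
}

limit_kb=$(mb_to_kb 35)   # 35 MB/s is only an example figure

# Line that could go into /etc/vzdump.conf to apply the limit to all jobs:
echo "bwlimit: ${limit_kb}"

# Equivalent one-off invocation (VM ID 101 is a placeholder), printed, not run:
echo "vzdump 101 --bwlimit ${limit_kb}"
```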

All that being said, I am not happy with the recent changes to the backups in Proxmox either. I have not put the latest version of Proxmox into production yet, so I do not know first hand whether there is an issue in 3.2. I did test KVM 1.7 (what's in the latest version of Proxmox) a while back, and it had greatly improved the performance of disk IO inside the VM while backups were taking place. You can see my benchmarks on the pve-devel mailing list archives here.
 
Hi e100

Just to know: have you run the same I/O test without a KVM live backup in progress, to compare the benchmarks?
If so, can you post the results?

Best regards
Cesar
 
Cesar, I have done lots of tests with IO inside the VM when backups are not being performed. The limitation, in my experience, is not the throughput, only IOPS. I can easily get 2000MB/sec sequential (bus speed of my RAID card), but when doing random IO the operations per second are limited by the speed of the CPU. KVM only has a single thread for IO; that's the limitation. If my memory is correct, most of my systems top out at 30k IOPS, even on the servers with 22 SSDs that can do 60k+ IOPS.
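
To relate the two numbers: throughput is roughly IOPS times block size. A quick sketch, assuming 4 KiB random IO (the block size is an assumption, not something I measured here, and the helper name is illustrative):

```shell
#!/bin/sh
# iops_to_mib IOPS BLOCK_KIB: approximate throughput in MiB/s for a given
# IOPS figure and block size in KiB (integer division, rounds down).
iops_to_mib() {
    echo $(( $1 * $2 / 1024 ))
}

# 30k IOPS at an assumed 4 KiB block size:
echo "30000 IOPS x 4 KiB = $(iops_to_mib 30000 4) MiB/s"

# 60k IOPS at the same assumed block size:
echo "60000 IOPS x 4 KiB = $(iops_to_mib 60000 4) MiB/s"
```

So an IOPS ceiling can cap random IO far below the sequential figure.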

Eric
 
Wow e100!!! (You are a master, and very noble for helping so many people.)
Many thanks for your answer

Best regards
Cesar
 
Your pveperf says that you can read at 111MB/sec, if your VM is reading 60MB/sec and the backup is reading at 51MB/sec you are using 100% of your total IO.
Even though I don't monitor my VM all the time, I cannot think of anything happening within it that would cause a steady 60 MB/s of I/O over three hours. As this happens during the night, there should not be anything running that hard. But let's see, I pulled the handbrake down to 35MB/s...
 
