High server load during backup creation

Both things are fixable by using a temporary storage on the local hard disk (as LVM does).
Not with every setup ;)
If one VM uses more space than there is free space available, it can get a bit tricky. Especially with expensive/fast storage, you may not have oversized by that much.

esco
 
I really wonder why somebody claims such nonsense? Both things are fixable by using a temporary storage on the local hard disk (as LVM does).
Sorry for asking so bluntly, but are you really claiming that everything is alright and there can be no problem, even when we have one? Of course, not every one of us is a paying subscriber, but how can you be so unhelpful with a problem like this, which seems to affect some users seriously?
 
Sorry for asking so bluntly, but are you really claiming that everything is alright and there can be no problem, even when we have one?

Please re-read my post carefully. I just told you that the issue is 'fixable', and also described a way to solve the issue. So feel free to send a patch to implement that functionality.
 
Please re-read my post carefully. I just told you that the issue is 'fixable', and also described a way to solve the issue. So feel free to send a patch to implement that functionality.
Sorry, even though I am a computer science student, I have no time to dig into your code. Telling people who have a problem with your software that they can fix it themselves if they want to see it fixed sounds odd. If I wanted to do it myself, there would be no need to use your software...
 
What? I thought the snapshot becomes invalid if it runs out of space, while the original volume stays accessible?

But even with a high write load, the snapshot can be smaller than the complete virtual disk during the backup. Only if you have to write to the whole disk during the backup (I think that is a rare case ;) ) would you need a snapshot with the same size as the disk.

esco
 
I really wonder why somebody claims such nonsense? Both things are fixable by using a temporary storage on the local hard disk (as LVM does).

My fast storage is on LVM, so you are proposing that, to keep the speed up, you would create an LVM volume to use as temporary storage so that write speed is not limited by the backup device.
Sounds like you want to re-invent what LVM snapshots do.

As I have said before I like the concept behind KVM Live Backup, but it is (currently) very flawed.
Having VMs hang because a backup device failed is unacceptable.
Limiting write speed is unacceptable.

Do you plan to fix these problems?
If so, when can we expect a fix?
 
What? I thought the snapshot becomes invalid if it runs out of space, while the original volume stays accessible?

Yes, the snapshot becomes unusable.

But even with a high write load, the snapshot can be smaller than the complete virtual disk during the backup. Only if you have to write to the whole disk during the backup (I think that is a rare case ;) ) would you need a snapshot with the same size as the disk.

You can implement exactly the same thing with the new backup method. I think a simple (mmapped) ring buffer can do it - I will try that when I touch the backup code next time.
 
Sounds like you want to re-invent what LVM snapshots do.

You seem to be totally unaware of all the problems with LVM (search the forum).

Do you plan to fix these problems?
If so, when can we expect a fix?

I think you should use a fast local disk for such backups, and use a hook script to transfer the result to slow storage.
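
For example, a minimal hook script along these lines could move the finished dump to slower storage after each backup. This is only a sketch: the script path and the /mnt/slow-backup mount point are placeholders, and the phase argument and TARFILE variable follow the example hook script shipped in /usr/share/doc/pve-manager/examples/vzdump-hook-script.pl, so check that file for the exact interface of your version:

  #!/bin/sh
  # hypothetical hook script, e.g. saved as /usr/local/bin/vzdump-move-hook.sh
  # and enabled with "script: /usr/local/bin/vzdump-move-hook.sh" in /etc/vzdump.conf

  phase="$1"   # job-start, backup-start, backup-end, log-end, ...

  if [ "$phase" = "backup-end" ]; then
      # in this phase vzdump sets $TARFILE to the path of the finished dump;
      # /mnt/slow-backup is an assumed mount point for the slow storage
      mv "$TARFILE" /mnt/slow-backup/ || exit 1
  fi

  exit 0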
 
Please re-read my post carefully. I just told you that the issue is 'fixable', and also described a way to solve the issue. So feel free to send a patch to implement that functionality.

Dietmar,

Many of us here are not great developers like you and could not even attempt to fix these issues in KVM Live Backup.
That is what upsets us: you took a perfectly good, working feature away and replaced it with a flawed one.

What is wrong with allowing a choice between LVM snapshots and KVM Live Backup?

I also have a hypothesis as to why KVM Live Backup increases load more than LVM Snapshots did.
KVM Live Backup needs to move data around inside the KVM process itself.
All of this data movement increases cache misses, especially for the virtual server, or maybe just the additional memory copies in the KVM process are to blame.
Now I have no idea how to prove or disprove this, but this is what I suspect is causing the increased load numerous people are complaining about.
What is your opinion on this hypothesis?
 
Now I have no idea how to prove or disprove this, but this is what I suspect is causing the increased load numerous people are complaining about.
What is your opinion on this hypothesis?

The new backup avoids additional read/write cycles, so you need to provide a test case to support your claims. Maybe it is just a small bug somewhere, and we can fix it fast if we have a test case.

Note: We can only fix things if we have a test case.
 
Maybe I did not quite explain my hypothesis.
I agree that KVM Live Backup avoids additional IO-related read/write cycles; that is the advantage it has.
But since it packs all of the data movement into the KVM process itself, it has a negative impact on the CPU performance of the KVM process.

With LVM, the kernel is doing the extra IO for the COW to make the snapshot work.
Some other process is reading from the snapshot and writing out the backup file.
All of this is likely happening on different CPUs/cores from the ones where the KVM process is running, too.


With KVM Live Backup the COW happens in the KVM process itself.
The KVM Process is also doing the reading from the disk, not some other process.
The KVM Process is then sending the read data to the actual backup process.
Putting all of those things into the process running the VM itself has to have a negative impact on the operation of the VM; there is no free lunch here.
KVM is suddenly moving around massive amounts of data that it normally does not touch, so there must be a negative impact on the CPU cache where the KVM process is running.
Cache misses increase in the KVM Process and load rises.

Essentially, KVM Live Backup reduces IO at the cost of CPU efficiency, and that is what is causing the increased load people are seeing/complaining about.
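
A rough way to test this hypothesis would be to compare cache-miss counters for the VM's kvm process with and without a running backup, e.g. with perf. This is only a sketch; it assumes the perf tool is installed and that VM 100 is the guest being backed up:

  # pid of the kvm process for VM 100 (Proxmox writes it to a pidfile)
  pid=$(cat /var/run/qemu-server/100.pid)

  # count cache references/misses for 60 seconds; run once while idle and once during a backup
  perf stat -e cache-references,cache-misses -p "$pid" sleep 60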
 
Essentially, KVM Live Backup reduces IO at the cost of CPU efficiency, and that is what is causing the increased load people are seeing/complaining about.

So why is it fast then when they use local storage as target?
 
So why is it fast then when they use local storage as target?
Less latency is a good explanation.

No one is complaining about the speed of the backup. I am pointing out how the new backup process has a negative impact on the performance of the guest VM.

I have not noticed the KVM live backup being much, if any, faster than the LVM snapshot method. I have noticed KVM live backup having an extremely negative impact on performance.
 
You seem to be totally unaware of all the problems with LVM (search the forum).
You seem to be unwilling to admit that your invention is not perfect. Your cure for the LVM problem creates a new set of problems.
One user reported that the new method corrupted his guest filesystem; LVM snapshots never did this, as far as I am aware.
The LVM snapshot method never caused the guest to stall when a backup had issues; the new method does.
Numerous people have observed increased load with the new method, and I predict this will not be resolved with patches to KVM Live Backup.

I think you should use a fast local disk for such backups, and use a hook script to transfer the result to slow storage.
This was not necessary with LVM snapshot backup. Yet somehow you expect me to believe that this new method is vastly superior.

Both methods are flawed and have their issues; neither is a perfect solution in all situations. One can stall the VM when things go wrong where the other does not, one uses more disk IO than the other, and one requires writing external scripts to avoid decreased IO performance while the other does not.

This is why I advocate allowing users to pick the method that fits their needs the best.

Maybe at some point in the future the new method will be much better and LVM snapshot will not be needed. In the meantime people need to be able to reliably make backups without risking a stalled VM or causing the VM to run slow.
 
You seem to be unwilling to admit that your invention is not perfect.

Sorry, I can't remember claiming that the new backup is perfect.

Maybe at some point in the future the new method will be much better and LVM snapshot will not be needed. In the meantime people need to be able to reliably make backups without risking a stalled VM or causing the VM to run slow.

If somebody finds a bug, he should try to provide a test case to reproduce it. We can then try to fix it. Maintaining old code forever is not an option.
 
hi,

Did you install the system with the bare-metal ISO or on top of Debian Wheezy?
I ran into this problem after installing on Debian Wheezy...
Maybe, for some other reason, you don't have the right scheduler set...

Check cat /sys/block/YOURDISK/queue/scheduler, where YOURDISK = sda etc...
It should say cfq. Any other scheduler can cause the problems you have...

To the sysadmins here: can I change the wiki?
The description of how to install Proxmox via Debian Wheezy is perfect. I just miss this VERY IMPORTANT step at the end (full commands below):

  1. echo cfq > /sys/block/DISKS/queue/scheduler for each of your disks
  2. find GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub
  3. and add "... elevator=cfq"
  4. run update-grub

Otherwise you will for sure run into the same problem as above...
This would actually also happen with LVM snapshots...
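
For example, on a node where the disk is sda (a placeholder; adjust the device names for your setup), the whole change might look like this:

  # switch the running system to cfq (repeat for every disk)
  echo cfq > /sys/block/sda/queue/scheduler
  cat /sys/block/sda/queue/scheduler    # the active scheduler is shown in [brackets]

  # make it permanent: in /etc/default/grub set, for example,
  #   GRUB_CMDLINE_LINUX_DEFAULT="quiet elevator=cfq"
  # then regenerate the grub config
  update-grub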

regards
philipp



 
