Proxmox backup speed extremely slow for VMs stored on Ceph?

SanderM

Member
Oct 21, 2016
I have a new Proxmox cluster with Ceph storage for VMs and NFS storage for backups.
Ceph performance is excellent (limited only by the 10G network), and NFS performance reaches 1 Gbit/s (also the network limit).

Now, my problem is that backing up VMs stored on Ceph is extremely slow. The backup process reports an average of 36 MB/s.

But it's not Ceph, nor my NFS server that's slow. I did some "dd" tests and all is perfectly fine in terms of speed.
Also:

- If I clone a VM stored on Ceph to a new VM (also stored on the same Ceph storage), I get good speeds: 400 MB/s
- If I back up a VM stored on local disk, the backup process reports about 117 MB/s



So, as you can see, NFS performance is good, and Ceph performance is also good.
But still, VMs stored on Ceph have an extremely slow backup speed.

Any suggestions?
 
The system has 192 GB of RAM and I'm only running 5 VMs on this node, which have 8 GB assigned. So RAM shouldn't be a problem.
During backup the kvm process seems to be between 40 and 100% CPU according to 'top' (but the VM is using CPU too, of course).

I already tried disabling compression, to rule out LZO. Without compression the speed is exactly the same.
 
So, is there a different way to make backups and increase the performance? At this speed the backup process will take longer than 24 hours if I move all VMs to this kind of setup.
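To put numbers on that, here is a rough sketch of the backup-window arithmetic. The two rates are the ones reported earlier in this thread; the total data size and the function name `backup_hours` are made up for illustration, and the calculation assumes VMs are backed up one after another at a constant rate.

```python
def backup_hours(size_gb: float, rate_mb_s: float) -> float:
    """Hours needed to copy size_gb (decimal GB) at rate_mb_s (decimal MB/s)."""
    if rate_mb_s <= 0:
        raise ValueError("rate must be positive")
    return size_gb * 1000 / rate_mb_s / 3600


# Rates reported in this thread; TOTAL_GB is a hypothetical example value.
TOTAL_GB = 4000
for label, rate in [("VM on Ceph", 36), ("VM on local disk", 117)]:
    print(f"{label}: {backup_hours(TOTAL_GB, rate):.1f} h for {TOTAL_GB} GB")
```

With that hypothetical total, the Ceph rate lands beyond a 24-hour window while the local-disk rate stays well inside it.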
 
Okay, so I have to write my own script for this? Or is there something ready-made already?

I just tried that, but my VM froze. After some more testing it turns out that VMs running CloudLinux freeze as soon as I try to create a snapshot. Is this a known issue?
 
So, what's the status of this at this time? We had been seeing bad backup performance even before Ceph, but since moving to Ceph it is abysmal. Any news on when this will work without homebrew scripting, in other words, with the standard QEMU backup method? Is there something in the works to make the block size adjustable according to needs? It seems that a 64 KB block size for backups doesn't make sense, but I don't claim to be an expert, I'm just asking. Anyway, this is getting more and more urgent: most of our customers are now on Ceph, and with larger VM sizes the backups are creeping into the daytime instead of finishing during the night like they used to.

Thanks in advance
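For what it's worth, here is a back-of-the-envelope sketch of why a small block size could matter. It assumes (and this is only an assumption, nothing in this thread confirms it) that the backup reads one 64 KB block at a time and waits for each read to complete before issuing the next. Under that model throughput is simply block size divided by per-read latency. The function names are hypothetical.

```python
def serial_read_rate_mb_s(block_bytes: int, latency_ms: float) -> float:
    """MB/s when blocks are read strictly one at a time, one round trip each."""
    return block_bytes / (latency_ms / 1000) / 1e6


def implied_latency_ms(block_bytes: int, rate_mb_s: float) -> float:
    """Per-read latency (ms) that would produce rate_mb_s under the same model."""
    return block_bytes / (rate_mb_s * 1e6) * 1000


BLOCK_64K = 64 * 1024        # block size mentioned in this thread
BLOCK_4M = 4 * 1024 * 1024   # hypothetical larger block, for comparison only

lat = implied_latency_ms(BLOCK_64K, 36)  # 36 MB/s is the rate reported above
print(f"implied latency per 64 KB read: {lat:.2f} ms")
print(f"same latency, 4 MB reads: {serial_read_rate_mb_s(BLOCK_4M, lat):.0f} MB/s")
```

In this model the 36 MB/s figure corresponds to a little under 2 ms per read, and larger reads at the same latency would not stay latency-bound; in practice they would hit the network or disk limit long before the number the model prints.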
 
Someone from the Proxmox team needs to answer this, and with a more valid answer than "this is by design in qemu". What we need to know is what is being done about it. You can't talk "Enterprise" and leave something like this alone. No offense to anyone, it's just that this is touted as an Enterprise product, just as RHEV is, which means those who develop it need to see these issues and solve them, not just say "it's by design" and leave people with abysmal backup speed.
 
AFAIK, this is a QEMU issue, not a Proxmox issue.
It is QEMU that behaves like this "by design", not Proxmox.
 
I think I'm being thoroughly misunderstood.

The point isn't whether or not it's a QEMU problem. The point is that Proxmox IS Proxmox because of what it does and how it does it, and this is accomplished by means of the software components it uses. QEMU is an integral part of Proxmox, so if QEMU has a problem then so does Proxmox, and so do all projects that rely on it. Make no mistake, Proxmox definitely relies on it.

You can't build a piece of software that relies on other people's software and then say "it's not our problem" when you integrated their software into your stack. That would be like Ubuntu saying they don't support the Linux kernel, although it's the engine of the distro, just because it isn't their project or their software specifically. What's more, you can't take people's money for a subscription to your software repository, saying it's "Enterprise ready" software, and then leave those components less than "Enterprise ready".

It's fine for you to say "As far as I know", and I'm OK with that from those who are, like me, users and not developers of the product. But that doesn't solve the issue, which is why I'm patiently waiting for a Proxmox staffer to answer this and outline what is being done to resolve it.

If it's a feature for the QEMU project it clearly needs to be changed for a project that uses Ceph as a supported storage backend to their virtualization product.

This is not meant to be combative, please don't misunderstand, this is meant to ask someone to urgently outline what's being done to correct this as it is most definitely an issue.
 
Hello,

Any news on this?
New QEMU version in Proxmox 5? Still 64 KB blocks?
Our backups now take about 4 days for each server (more than 3 TB on each), and it's getting worse as the data grows.
Does anyone have a workaround for this?

greetings
Philipp
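
For reference, the figures in the post above (about 4 days for roughly 3 TB) translate into an effective rate as sketched below. It uses decimal units and treats "more than 3 TB" as exactly 3 TB, so the real rate would be somewhat higher; the function name is made up for illustration.

```python
def effective_rate_mb_s(size_tb: float, days: float) -> float:
    """Average MB/s needed to move size_tb (decimal TB) in the given number of days."""
    seconds = days * 24 * 3600
    return size_tb * 1e6 / seconds


# Figures from the post above: about 3 TB per server, about 4 days per backup.
print(f"{effective_rate_mb_s(3, 4):.1f} MB/s")
```

That works out to well under the 36 MB/s reported at the start of this thread.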