Proxmox / Ceph / Backups & Replica Policy

Hello everyone!

We've recently upgraded our backbone to 50G and have made some interesting findings in our 3-node cluster. We're running the latest Proxmox 8.3 with Ceph 18.2.
The Ceph VM pool is configured with 3x replication across all 3 nodes (so one copy resides on each node).

When we run backups (both LXC and KVM), Ceph reads the VM image from the OSD that is primary for each placement group. This primary may reside on either the local server or one of the other two.

To prevent this behaviour, we've now set rbd_read_from_replica_policy to localize. With the default policy, reads always go to the primary OSD of a placement group; with localize, reads go to the replica closest to the client (as determined by the CRUSH map), i.e. the server the VM is running on.

For a 3-node cluster with 3x replication this eliminates network usage for reads during backups (all reads are served locally). On our bigger clusters (20-50 nodes) we see noticeably lower network usage during backups.
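To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It is only a simplified model, not a measurement: it assumes replicas are spread uniformly with at most one copy per node, that the client sits on one of the cluster nodes, and that only a copy on the same node counts as "local". The function name remote_read_fraction is made up for this illustration.

```python
from fractions import Fraction


def remote_read_fraction(nodes: int, replicas: int, policy: str) -> Fraction:
    """Expected share of reads that leave the client's node (simplified model)."""
    if nodes < 1 or replicas < 1:
        raise ValueError("nodes and replicas must be positive")
    copies = min(replicas, nodes)  # assumption: at most one copy per node
    if policy == "default":
        # Reads go to the primary, which is on the client's node with chance 1/nodes.
        local = Fraction(1, nodes)
    elif policy == "localize":
        # Reads can use any copy, so a local read is possible with chance copies/nodes.
        local = Fraction(copies, nodes)
    else:
        raise ValueError(f"unsupported policy in this sketch: {policy}")
    return 1 - local


for n in (3, 20, 50):
    d = remote_read_fraction(n, 3, "default")
    l = remote_read_fraction(n, 3, "localize")
    print(f"{n:>2} nodes: default {float(d):.0%} remote, localize {float(l):.0%} remote")
```

Under these assumptions the 3-node case goes from two thirds of reads crossing the network to none, while the gain shrinks as the node count grows relative to the replica count. Rack-level or other CRUSH locality is not modelled here.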

Question: Why is this setting set to default and not localize? @fabian (sorry for tagging you directly here but we're doing awesome playing ping-pong together) ;-)

Cheerio

Florian
 
the default one is probably better at distributing the load across disks, but I am not a ceph expert. @aaron ? ;)
 
Please file a feature request at https://bugzilla.proxmox.com, ideally with some numbers that you have seen in your cluster(s).
We can then think about either making this the default or easy to enable/disable from the Proxmox VE tooling.
 
Does the change affect (increase/decrease) the performance of the VM (bandwidth/throughput)?
 
I'll be able to supply a post-mortem here soon. I've been testing this quite extensively in our test lab (development center).
 
@fstrankowski I'm looking into adding this option to the Proxmox GUI. Just to be sure, how do you set the value?

"ceph config set client.admin rbd_read_from_replica_policy localize"

?
Code:
rbd config pool set POOLNAME rbd_read_from_replica_policy localize
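
For anyone wanting to script this across several pools, a minimal Python sketch follows. It only wraps the rbd command shown above; the helper names policy_command and apply_policy are made up for this example, the policy list reflects the values I know of for this option (default, balance, localize), and it assumes the rbd CLI is on the PATH of the machine running it. By default it only prints the command instead of executing it.

```python
import shlex
import subprocess

# Values accepted for rbd_read_from_replica_policy as far as I'm aware.
VALID_POLICIES = ("default", "balance", "localize")


def policy_command(pool: str, policy: str) -> list[str]:
    """Build the 'rbd config pool set' command for one pool."""
    if policy not in VALID_POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    return ["rbd", "config", "pool", "set", pool,
            "rbd_read_from_replica_policy", policy]


def apply_policy(pool: str, policy: str, dry_run: bool = True) -> list[str]:
    """Print the command (dry run) or execute it via the rbd CLI."""
    cmd = policy_command(pool, policy)
    if dry_run:
        print(shlex.join(cmd))
    else:
        subprocess.run(cmd, check=True)  # raises if rbd exits non-zero
    return cmd


# POOLNAME is a placeholder, as in the command above.
apply_policy("POOLNAME", "localize")
```

Running it with dry_run=False would apply the setting for real, so it is worth checking the printed command first.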

Regarding the post-mortem: I had to delay my work on that because I have to deal with lots of other stuff with higher in-house priority at the moment. Hopefully I'll be able to prepare something within Q4/2025. I didn't forget you guys ;)
 
Ah, you can also set it on the pool, great :)
If you have a pull request for Proxmox, please be so kind as to link it here so I can review/improve it before Proxmox has a chance to merge it.
I'd add the option to the Ceph pool configuration UI, because it's set on a per-pool basis and not globally.

Ceph -> Pool -> <Poolname> -> Advanced Config

That's where I would put it.
 