Hi everyone,
During testing we are seeing behaviour we haven't seen mentioned anywhere else, and we hope somebody here might have ideas on how to solve it.
Proxmox is connected to an external 5-node Ceph cluster; each node has 3 HDD OSDs with DB/WAL on SSD. Proxmox is running version 8, and we have seen this issue on both Ceph 19.2.3 and 20.2.1.
Attaching a disk to a VM and running fio tests gives about 13K IOPS, 50 MiB/s throughput and ~20 ms latency. However, if we change the pg_num on the pool, or if we add a host (and its OSDs), performance degrades: we consistently see less than 2K IOPS, less than 10 MiB/s throughput and over 100 ms latency.
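For reference, the fio workload we run inside the VM against the RBD-backed disk looks roughly like this (device path, block size and queue depth are illustrative, not our exact job file):

fio --name=randwrite-test --filename=/dev/sdb --direct=1 --rw=randwrite \
    --bs=4k --iodepth=32 --numjobs=4 --runtime=120 --time_based --group_reporting

The pg_num change that triggers the degradation is just the usual pool setting (pool name and target value are examples):

ceph osd pool set vm-pool pg_num 256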
We of course wait until all PGs are active+clean, and have also waited several hours in case there was some behind-the-scenes housekeeping/balancing.
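To be explicit about "active+clean": before re-running fio we check the cluster state from a Ceph node with commands along these lines (the balancer check only applies if the balancer module is enabled):

ceph -s
ceph pg stat
ceph balancer status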
Creating a new pool on Ceph, or simply attaching a new disk, gives us the expected performance again (13K IOPS, 50 MiB/s, <20 ms latency).
We assumed this was due to some sort of fragmentation of data on the "old" disk, so we tried migrating it to local storage on the Proxmox host and then back to the Ceph RBD storage, but that did not change the fio results.
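The migration was done with the usual Proxmox disk move, roughly like this (VM ID, disk slot and storage names are examples, not our actual IDs):

qm disk move 100 scsi0 local-lvm --delete
qm disk move 100 scsi0 ceph-rbd --delete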
SSD emulation and discard are enabled on the disk, and no cache is configured.
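For completeness, the relevant disk line in the VM config looks like this (VM ID, storage name and size are examples), with no cache option set:

scsi0: ceph-rbd:vm-100-disk-0,discard=on,size=100G,ssd=1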
Any ideas what could cause this?