Some OSDs slow after upgrade from Proxmox 6 to 7 and from Ceph Nautilus to Pacific

hivehive

Oct 8, 2022
I've been searching for weeks, maybe months, with no positive results so far. I upgraded my 3-node cluster from Proxmox 6 to 7 and Ceph from Nautilus to Octopus to Pacific. I also converted all OSDs to BlueStore. My containers and VMs are all very slow on reads and writes, which led me to check:

root@pmox1:~# ceph osd perf
osd  commit_latency(ms)  apply_latency(ms)
  8                 703                703
  6                   7                  7
  0                   3                  3
  1                   3                  3
 11                   9                  9
 10                 550                550
  9                 229                229
  7                   5                  5
  5                   6                  6
  4                   4                  4
  3                   3                  3
  2                   3                  3

The high commit/apply latency on OSDs 8, 10, and 9 seems to be the culprit. All OSDs are SSDs. The high-latency OSDs are each on a different node. I have tried deleting and recreating these OSDs, which didn't help. I've found various posts about snaptrim running wild; that didn't seem to be the case with my cluster, but for fun I set nosnaptrim, which didn't change anything either.

All three drives with this issue are consumer-grade Samsung 870 1TB SSDs. All the other drives are Crucial SSDs (also consumer grade).
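In case it helps anyone compare their own cluster, here is a rough sketch of how the outliers can be picked out of the `ceph osd perf` text output. It is only an illustration: the function names `parse_osd_perf` and `slow_osds` are made up for this post, the sample text is copied from the output above, and the "10 times the median" threshold is an arbitrary assumption, not any kind of Ceph guideline.

```python
from statistics import median

# Sample text copied from the `ceph osd perf` output in this post.
SAMPLE = """\
osd commit_latency(ms) apply_latency(ms)
8 703 703
6 7 7
0 3 3
1 3 3
11 9 9
10 550 550
9 229 229
7 5 5
5 6 6
4 4 4
3 3 3
2 3 3
"""


def parse_osd_perf(text):
    """Return {osd_id: (commit_ms, apply_ms)} from plain-text `ceph osd perf` output."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip the header and any line that is not exactly three integers.
        if len(parts) != 3 or not all(p.isdigit() for p in parts):
            continue
        osd, commit_ms, apply_ms = map(int, parts)
        result[osd] = (commit_ms, apply_ms)
    return result


def slow_osds(latencies, factor=10):
    """Return OSD ids whose commit latency exceeds `factor` times the median.

    `factor` is an arbitrary threshold chosen for illustration only.
    """
    # Clamp the baseline to 1 ms so an all-zero cluster does not flag everything.
    baseline = max(median(c for c, _ in latencies.values()), 1)
    return sorted(osd for osd, (c, _) in latencies.items() if c > factor * baseline)


if __name__ == "__main__":
    print(slow_osds(parse_osd_perf(SAMPLE)))
```

With the numbers above, this flags OSDs 8, 9, and 10, which are exactly the three Samsung drives.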

Any assistance anyone can offer would be greatly appreciated. I'm of course willing to replace the drives, but it seems strange that the issue only surfaced after the upgrade, and I'd really like to understand the problem.
 
