Some OSDs slow after upgrade from 6 to 7 and from nautilus to pacific

hivehive · Oct 8, 2022

I've been searching for weeks, maybe months, with no positive results so far. I upgraded my 3-node cluster from Proxmox 6 to 7 and Ceph from Nautilus to Octopus to Pacific. I also converted all OSDs to BlueStore. Reads and writes in my containers and VMs are all very slow, which led me to check:


root@pmox1:~# ceph osd perf
osd  commit_latency(ms)  apply_latency(ms)
  8                 703                703
  6                   7                  7
  0                   3                  3
  1                   3                  3
 11                   9                  9
 10                 550                550
  9                 229                229
  7                   5                  5
  5                   6                  6
  4                   4                  4
  3                   3                  3
  2                   3                  3

OSDs 8, 9 and 10 seem to be the culprit: their commit and apply latencies are in the hundreds of milliseconds, while every other OSD is in the single digits. All OSDs are SSDs, and the three with high latency are each on a different node. I have tried deleting and recreating these OSDs, and it didn't help. I've found various posts about snaptrim running wild. That didn't seem to be the case with my cluster, but for fun I set nosnaptrim anyway, which didn't change anything either.
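In case it helps anyone compare their own numbers, here is a minimal Python sketch for picking the outliers out of plain-text `ceph osd perf` output. It assumes the three-column layout shown above. The function names `parse_osd_perf` and `slow_osds` and the 100 ms threshold are my own illustrative choices, not anything that ships with Ceph.

```python
# Sample taken from the `ceph osd perf` output posted above.
SAMPLE = """\
osd  commit_latency(ms)  apply_latency(ms)
  8                 703                703
  6                   7                  7
  0                   3                  3
  1                   3                  3
 11                   9                  9
 10                 550                550
  9                 229                229
  7                   5                  5
  5                   6                  6
  4                   4                  4
  3                   3                  3
  2                   3                  3
"""


def parse_osd_perf(text):
    """Return {osd_id: (commit_ms, apply_ms)} from plain-text output."""
    perf = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip the header and any line that is not exactly three integers.
        if len(parts) != 3 or not all(p.isdigit() for p in parts):
            continue
        osd_id, commit_ms, apply_ms = map(int, parts)
        perf[osd_id] = (commit_ms, apply_ms)
    return perf


def slow_osds(perf, threshold_ms=100):
    """Return sorted OSD ids whose commit or apply latency exceeds the threshold.

    The 100 ms default is an arbitrary cut-off chosen for illustration.
    """
    return sorted(
        osd for osd, (commit_ms, apply_ms) in perf.items()
        if max(commit_ms, apply_ms) > threshold_ms
    )


print(slow_osds(parse_osd_perf(SAMPLE)))  # [8, 9, 10]
```

To run it against a live cluster, the output of `ceph osd perf` could be piped in and read from standard input instead of `SAMPLE`. A single snapshot can be misleading, so it is probably worth sampling a few times under load.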

All three drives with this issue are consumer-grade Samsung 870 1TB SSDs. All the other drives are Crucial SSDs (also consumer grade).

Any assistance anyone can offer would be greatly appreciated. I'm of course willing to replace the drives, but it seems strange that the issue only surfaced after the upgrade, and I'd really like to understand the problem.
