Proxmox with Ceph - Disk crash rate is too high

viisking

Member
Jul 6, 2021
My cluster has 5 HP DL380 G9 servers, each with a P440ar controller, 3x 4TB Seagate HDDs, and 3x 4TB Samsung QVO SSDs for Ceph.
I have had quite a lot of OSD crashes in recent weeks (~15%), on both HDDs and SSDs. Does anyone have experience with this?
 

I think QLC NAND SSDs are not well suited to Ceph. Do you have any other SSDs to test?
 
Did you find a solution?

I get the same errors with Samsung SM863a drives connected to a Broadcom/LSI 9305 HBA.

It happens randomly on 4 nodes as soon as I boot kernel 5.11 or 5.13, but 5.4 seems fine.