[SOLVED] CEPH OSD delay in service startup after reboot.

itNGO

We are investigating an issue where, after a node reboot in a 3-node cluster, the OSDs are slow to start.
We have to set the "noout" flag to keep Ceph from rebalancing data, because the OSDs take several minutes to come up.
They always start on their own and work normally after that.
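
For reference, the noout handling around a planned reboot could be sketched roughly like this. `ceph osd set noout` and `ceph osd unset noout` are the standard Ceph CLI commands; the helper names `noout_command` and `run`, and the dry-run default, are only assumptions made for this illustration.

```python
import subprocess


def noout_command(enable: bool) -> list:
    """Build the ceph CLI call that sets or clears the cluster-wide noout flag."""
    return ["ceph", "osd", "set" if enable else "unset", "noout"]


def run(cmd: list, dry_run: bool = True) -> int:
    """Run a command, or only print it when dry_run is True (hypothetical helper)."""
    if dry_run:
        print("would run:", " ".join(cmd))
        return 0
    # check=True raises CalledProcessError if the ceph CLI reports a failure.
    return subprocess.run(cmd, check=True).returncode


if __name__ == "__main__":
    # Before rebooting the node: stop Ceph from marking its OSDs out.
    run(noout_command(True))
    # ... reboot the node and wait until all of its OSDs are up again ...
    # Afterwards: clear the flag so normal recovery behaviour is restored.
    run(noout_command(False))
```

With `dry_run=False` the commands are actually executed, so that should only be done on a node with a working Ceph admin keyring.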

Is this delay normal? We are on PVE 7.2-4 with Ceph 16.2.9. With 4 OSDs per node, it takes 10 minutes until the last OSD is available again.

Has anyone else seen this?
 
Does this happen only after a reboot, or also when you restart an OSD?
Could you provide the journal from boot until all OSDs are up and running, as well as the Ceph log (/var/log/ceph/ceph.log) and the OSD logs for that timeframe?
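
A rough sketch of how the requested information could be gathered. `journalctl -b` (current boot), `-u` (single unit) and `--no-pager` are standard journalctl options. The unit name `ceph-osd@<id>.service` and the per-OSD log file name `ceph-osd.<id>.log` are assumed defaults and may differ on your setup; the function names are made up for this example.

```python
import subprocess


def journal_command(osd_id=None) -> list:
    """journalctl call for the current boot, optionally limited to one OSD unit."""
    cmd = ["journalctl", "-b", "--no-pager"]
    if osd_id is not None:
        # Assumed unit naming: ceph-osd@<id>.service
        cmd += ["-u", f"ceph-osd@{osd_id}.service"]
    return cmd


def restart_command(osd_id) -> list:
    """systemctl call to restart one OSD, to compare against the reboot case."""
    return ["systemctl", "restart", f"ceph-osd@{osd_id}.service"]


def osd_log_path(osd_id, log_dir: str = "/var/log/ceph") -> str:
    """Assumed default location of the log file for a single OSD."""
    return f"{log_dir}/ceph-osd.{osd_id}.log"


def save_journal(path: str, osd_id=None) -> None:
    """Write the journal of the current boot to a file so it can be attached."""
    result = subprocess.run(
        journal_command(osd_id), capture_output=True, text=True, check=True
    )
    with open(path, "w") as fh:
        fh.write(result.stdout)
```

For example, `save_journal("boot-journal.txt")` would dump the whole boot, and `save_journal("osd3-journal.txt", 3)` only the entries of OSD 3.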
 
We updated all nodes yesterday to 7.2-7 with the latest kernel from the enterprise repository.
The reboot went without issues, and all OSDs came up on their own after a few minutes.

So I guess this is no longer an issue.
 
