ceph-volume@.service also needs a ceph-after-pve-cluster.conf

gurubert

The systemd unit ceph-volume@.service activates the local OSDs. To do that it needs the ceph.conf.
On Proxmox nodes, /etc/ceph/ceph.conf is a symlink to /etc/pve/ceph.conf and is therefore only available once pve-cluster.service is running.

Please add a ceph-after-pve-cluster.conf drop-in to /lib/systemd/system/ceph-volume@.service.d, as is already done for the other Ceph service units in the pve-manager package.
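
For illustration, such a drop-in would presumably need little more than an ordering dependency. This is a minimal sketch, assuming it mirrors the drop-ins shipped for the other Ceph units; the exact contents in pve-manager may differ:

```ini
# /lib/systemd/system/ceph-volume@.service.d/ceph-after-pve-cluster.conf
# Assumed contents, not copied from the pve-manager package.
# After= only orders startup: ceph-volume@ instances wait until
# pve-cluster.service has started, so /etc/pve/ceph.conf is readable.
[Unit]
After=pve-cluster.service
```

Until the packaged fix arrives, the same file could probably be placed under /etc/systemd/system/ceph-volume@.service.d/ as a local override, followed by `systemctl daemon-reload`.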
 
We discovered this a few weeks ago and the fix will be in one of the next versions.
 
  • Like
Reactions: gurubert
Activating the network on these nodes takes approximately 3 minutes, which is longer than the timeout of the ceph-volume services.
This is how we discovered it.
Wow. Are there by any chance Mellanox NICs involved? If so, try updating the firmware: https://network.nvidia.com/support/firmware/mlxup-mft/

Download the mlxup binary for Linux x64, make it executable, and run it. This considerably cut down the time needed to bring the network up for a customer who ran into the same Ceph volume activation issue as you.