[SOLVED] ceph startup script not working

walter.egosson

Hi!
My configuration (before the upgrade to Proxmox 5 on Debian stretch):
- 3 Proxmox nodes running Debian jessie
- Proxmox installed on top of Debian jessie
- 2 hard drives per node as OSDs = 6 OSDs total

Today we upgraded our "Proxmox 4 + Ceph Hammer" to "Proxmox 5 + Ceph Luminous" following this guide: Upgrade from 4.x to 5.x (in-place upgrade)
Everything went perfectly, but whenever a node is rebooted:

1/ the ceph main service does not start:

# systemctl status ceph

● ceph.service - PVE activate Ceph OSD disks
Loaded: loaded (/etc/systemd/system/ceph.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Fri 2019-12-27 15:50:49 EAT; 1h 5min ago
Process: 9590 ExecStart=/usr/sbin/ceph-disk --log-stdout activate-all (code=exited, status=0/SUCCESS)
Main PID: 9590 (code=exited, status=0/SUCCESS)
CPU: 179ms

déc. 27 15:50:49 srv-virt-3 systemd[1]: Starting PVE activate Ceph OSD disks...
déc. 27 15:50:49 srv-virt-3 systemd[1]: Started PVE activate Ceph OSD disks.


2/ the Ceph OSDs are not mounted

3/ the Ceph cluster goes HEALTH_WARN
It looks like the issue comes from our Ceph main service (available in /etc/init.d/ceph or via systemctl), since it does all the mounting and starts all the Ceph subservices. What went wrong?

We could easily fix it by doing the mounting and starting everything manually, but is there a better way to solve this?
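For reference, the manual workaround mentioned above might look roughly like this. The device name, OSD id and mount point below are assumptions based on a default Ceph layout, not taken from the thread, and will differ per node:

```shell
# Hedged sketch of the manual workaround (run as root on the affected node).
# Check "ceph-disk list" first and substitute your own device/OSD id.

# Try to activate everything ceph-disk knows about in one go:
ceph-disk --log-stdout activate-all

# Or, per OSD, mount the data partition and start the daemon by hand:
mount /dev/sdb1 /var/lib/ceph/osd/ceph-0   # hypothetical device and OSD id
systemctl start ceph-osd@0                 # start the OSD daemon for id 0
```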
 
Did you run through the Ceph upgrade guides as well?
There have been some major changes between Hammer and Luminous, one of them being that the OSDs no longer run as the root user.
https://pve.proxmox.com/wiki/Ceph_Hammer_to_Jewel
https://pve.proxmox.com/wiki/Ceph_Jewel_to_Luminous

Hi!
Sorry for the late answer. We found out that the source was a weird, useless systemd unit that was supposed to do the job (/etc/systemd/system/ceph.service).
In fact, systemctl start ceph calls that unit, which does nothing, so the OSDs do not come up and the filesystems are not mounted.

The trick was deleting that unit file so that the /etc/init.d/ceph script is called instead. Now everything is OK!
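Sketched as commands, that fix would be roughly the following (run as root on each node; the stop/disable steps before the deletion are an assumption for cleanliness, the thread only mentions deleting the file):

```shell
# Remove the stale oneshot unit so systemd falls back to the SysV script.
systemctl stop ceph.service
systemctl disable ceph.service
rm /etc/systemd/system/ceph.service
systemctl daemon-reload            # make systemd forget the deleted unit

# "ceph" now resolves to the unit generated from /etc/init.d/ceph:
systemctl start ceph
systemctl status ceph
```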

maybe a bug?
 
Sorry for the late answer. We found out that the source was a weird, useless systemd unit that was supposed to do the job (/etc/systemd/system/ceph.service).
In fact, systemctl start ceph calls that unit, which does nothing, so the OSDs do not come up and the filesystems are not mounted.
That's why I asked. For such reasons, we provide upgrade guides.
https://pve.proxmox.com/wiki/Ceph_Hammer_to_Jewel#Start_the_daemon

maybe a bug?
Well, even if it is, Ceph Hammer, Jewel and Proxmox VE 4 are EoL.
https://pve.proxmox.com/wiki/FAQ
https://docs.ceph.com/docs/master/releases/archived-index/
 

Hi Alwin,
We followed the guide you mentioned and everything went fine.

I find it strange that on a native Debian 9 + Ceph Luminous + Proxmox 5 setup, the Ceph service is correctly launched by the /etc/init.d/ceph script,

but following the guide, the ceph.service unit is a template copied from /usr/share/doc/pve-manager/examples/ceph.service to /etc/systemd/system/ceph.service and contains something like:

[Unit]
Description=PVE activate Ceph OSD disks
After=pve-cluster.service
Requires=pve-cluster.service

[Service]
ExecStart=/usr/sbin/ceph-disk --log-stdout activate-all
Type=oneshot

[Install]
WantedBy=multi-user.target


Which did not work for us, so we had to delete that /etc/systemd/system/ceph.service and fall back to the /etc/init.d/ceph script to make things work.
 
ExecStart=/usr/sbin/ceph-disk --log-stdout activate-all
With Luminous, ceph-disk is/was responsible for activating all OSDs (identifying, mounting). And around that time the switch from SysV to systemd was made. The init script should definitely go, as only systemd will be used in later Ceph releases.
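To illustrate (a hedged sketch, not from the thread: ceph-volume only ships with Luminous and later, and the migration path depends on your release and OSD layout):

```shell
# Inspect what ceph-disk can see, then (re)activate every prepared OSD:
ceph-disk list
ceph-disk --log-stdout activate-all

# Newer Ceph releases drop ceph-disk in favour of ceph-volume, whose
# systemd-native equivalent is (assuming LVM-based OSDs):
ceph-volume lvm activate --all
```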
 
You can edit the thread title and set the solved prefix.
 
