[SOLVED] ceph upgraded then: Failed to restart ceph.service: Unit ceph.service not found.

RobFantini

I just did this
Code:
# apt dist-upgrade
Reading package lists... Done
Building dependency tree     
Reading state information... Done
Calculating upgrade... Done
The following packages will be upgraded:
  ceph ceph-base ceph-common ceph-fuse ceph-mds ceph-mgr ceph-mon ceph-osd libcephfs2 librados2 libradosstriper1 librbd1 librgw2 python-ceph-argparse
  python-cephfs python-rados python-rbd python-rgw
18 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/54.1 MB of archives.
After this operation, 2,172 kB disk space will be freed.
Do you want to continue? [Y/n]

that seemed to work


But then, not a minute later, all 3 OSDs on that node were down. I set noout.
Code:
# ceph -s
  cluster:
    id:     220b9a53-4556-48e3-a73c-28deff665e45
    health: HEALTH_WARN
            noout flag(s) set

  services:
    mon: 3 daemons, quorum pve3,pve10,pve15 (age 44h)
    mgr: pve15(active, since 12d), standbys: pve3, pve10
    mds: cephfs:1 {0=pve2=up:active} 2 up:standby
    osd: 21 osds: 18 up (since 7d), 17 in (since 7d)
         flags noout

  data:
    pools:   3 pools, 288 pgs
    objects: 1.71M objects, 6.3 TiB
    usage:   18 TiB used, 47 TiB / 65 TiB avail
    pgs:     288 active+clean

  io:
    client:   14 KiB/s rd, 7.9 MiB/s wr, 3 op/s rd, 367 op/s wr

Code:
# systemctl restart ceph.service
Failed to restart ceph.service: Unit ceph.service not found.

Any advice?
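For what it's worth, on Nautilus there is no monolithic ceph.service; each daemon gets its own templated systemd unit, grouped under targets. A minimal sketch (the OSD id 3 and host pve3 below are just illustrative examples, not taken from this cluster):

```shell
# There is no single ceph.service on Nautilus; daemons are managed through
# systemd targets and templated per-daemon units, e.g. (not run here):
#   systemctl restart ceph.target          # all Ceph daemons on this node
#   systemctl restart ceph-osd.target      # all OSDs on this node
#   systemctl restart ceph-osd@3.service   # one OSD

# Helper showing how the per-daemon unit names are built (illustrative):
unit_for() {
  local daemon="$1" id="$2"
  echo "ceph-${daemon}@${id}.service"
}

unit_for osd 3    # prints: ceph-osd@3.service
unit_for mon pve3 # prints: ceph-mon@pve3.service
```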
 
Code:
# ceph start ceph.target
no valid command found; 10 closest matches:
osd crush weight-set reweight-compat <item> <float[0.0-]> [<float[0.0-]>...]
osd crush weight-set reweight <poolname> <item> <float[0.0-]> [<float[0.0-]>...]
osd crush weight-set rm-compat
osd crush weight-set rm <poolname>
osd crush weight-set create <poolname> flat|positional
osd crush weight-set create-compat
osd crush weight-set dump
osd crush weight-set ls
osd crush get-device-class <ids> [<ids>...]
osd crush class ls-osd <class>
Error EINVAL: invalid command
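The EINVAL here is because start is not a ceph CLI subcommand: ceph.target is a systemd unit, so it has to be driven through systemctl. A sketch of the translation (not run against a live cluster; the helper is purely illustrative):

```shell
# ceph.target is a systemd unit, not a ceph subcommand; the working form is:
#   systemctl start ceph.target
fix_cmd() {
  # map the mistaken "ceph start <target>" onto its systemd equivalent
  echo "systemctl start $1"
}

fix_cmd ceph.target   # prints: systemctl start ceph.target
```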
 
Rebooting the node seemed to fix it.

Question: why does a normal upgrade cause Ceph to break?

I have 6 more nodes to upgrade, and do not mind trying attempted fixes.
 

What version were you going from -> to?

From your Ceph output, it doesn't look like there were any issues; Ceph OSDs don't actually update until you restart the OSD process.

In some recent versions the OSDs didn't reconnect to the new MONs until they were running the new version.
 

We upgraded from 14.2.4.1 to 14.2.5.

I just noticed that after the reboot the OSDs are down on the upgraded node, so we may have run into that issue of the OSDs needing upgraded MONs.


What should I do next?
 
But the restart did not seem to work, see above.

I can do another node soon, can you tell me which command to use?

You replied just saying the OSDs are fine and you just needed to refresh the screen.

If it is still an issue, what do

ceph -s
&
ceph health detail


show?
 
Also, after each upgrade there is a lot of backfilling. I do not recall that on some other upgrades; this upgrade may be doing something different?

So be sure to check ceph -s and wait until that settles before upgrading the next node.

Only one node had major backfills.

I think it is best to start with a node that runs a mon; then the OSDs will have a mon of the same version.
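A sketch of how one might script that "wait until it settles" step between nodes; the helper name and the polling interval are illustrative, and the live commands assume the ceph CLI is available on the node:

```shell
# Decide, from the "pgs:" summary in `ceph -s`, whether every PG is
# active+clean (pure string check, so it can run anywhere):
all_clean() {
  local total="$1" summary="$2"
  [ "$summary" = "${total} active+clean" ]
}

# Against the status earlier in this thread: "pgs: 288 active+clean"
all_clean 288 "288 active+clean" && echo "safe to upgrade the next node"

# On a live node one would poll, roughly:
#   until ceph -s | grep -q 'HEALTH_OK'; do sleep 30; done
```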
 

You should ideally always upgrade all the MONs first, as seen here: https://docs.ceph.com/docs/master/install/upgrading-ceph/
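To make the order concrete, a sketch of the per-daemon sequence from that doc (mons first, then mgrs, OSDs, and MDSs); the loop below only prints what a script would restart, it does not touch any services:

```shell
# Restart order per the Ceph upgrade docs: monitors first, then managers,
# OSDs, and metadata servers.
upgrade_order() {
  echo "mon mgr osd mds"
}

for d in $(upgrade_order); do
  # on a real node this would be: systemctl restart ceph-${d}.target
  echo "would restart ceph-${d}.target"
done
```

Between each step, confirm the daemons rejoined (e.g. ceph mon stat after the mons, ceph -s after the OSDs) before moving on.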
 
