Proxmox Ceph OSD down

riccardo.ragni

New Member
Sep 2, 2023
Hi, I could not get the OSDs back up after upgrading Ceph from Pacific to Quincy. Please see the output below:


Code:
root@PVE1:~# ceph osd stat
6 osds: 1 up (since 3h), 4 in (since 112m); epoch: e1064415
root@PVE1:~# ^C
root@PVE1:~# ceph osd df
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP    META     AVAIL    %USE   VAR   PGS  STATUS
 0    hdd  0.87279   1.00000  894 GiB  508 GiB  507 GiB   5 KiB  787 MiB  386 GiB  56.83  0.98    0    down
 3    hdd  0.87279         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0    0    down
 1    hdd  0.87279   1.00000  894 GiB  526 GiB  525 GiB   9 KiB  799 MiB  368 GiB  58.81  1.01    0    down
 4    hdd  0.87279   1.00000  894 GiB  492 GiB  492 GiB   3 KiB  756 MiB  401 GiB  55.10  0.95    0    down
 2    hdd  0.87279         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0    0    down
 5    hdd  0.87279   1.00000  894 GiB  556 GiB  556 GiB   9 KiB  855 MiB  337 GiB  62.25  1.07   71      up
                       TOTAL  3.5 TiB  2.0 TiB  2.0 TiB  28 KiB  3.1 GiB  1.5 TiB  58.25
MIN/MAX VAR: 0.95/1.07  STDDEV: 2.66
root@PVE1:~# ^C
root@PVE1:~# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME      STATUS  REWEIGHT  PRI-AFF
-1         5.23672  root default
-3         1.74557      host PVE1
 0    hdd  0.87279          osd.0    down   1.00000  1.00000
 3    hdd  0.87279          osd.3    down         0  1.00000
-5         1.74557      host pve2
 1    hdd  0.87279          osd.1    down   1.00000  1.00000
 4    hdd  0.87279          osd.4    down   1.00000  1.00000
-7         1.74557      host pve3
 2    hdd  0.87279          osd.2    down         0  1.00000
 5    hdd  0.87279          osd.5      up   1.00000  1.00000
root@PVE1:~# ~^C
root@PVE1:~#
root@PVE1:~#
root@PVE1:~# systemctl start ceph-osd@2
Job for ceph-osd@2.service failed because the control process exited with error code.
See "systemctl status ceph-osd@2.service" and "journalctl -xe" for details.
root@PVE1:~# systemctl start ceph-osd@0
root@PVE1:~# systemctl start ceph-osd@1
Job for ceph-osd@1.service failed because the control process exited with error code.
See "systemctl status ceph-osd@1.service" and "journalctl -xe" for details.
root@PVE1:~# systemctl start ceph-osd@2
Job for ceph-osd@2.service failed because the control process exited with error code.
See "systemctl status ceph-osd@2.service" and "journalctl -xe" for details.
root@PVE1:~# systemctl start ceph-osd@3
root@PVE1:~# systemctl start ceph-osd@4
Job for ceph-osd@4.service failed because the control process exited with error code.
See "systemctl status ceph-osd@4.service" and "journalctl -xe" for details.
root@PVE1:~# systemctl start ceph-osd@5
Job for ceph-osd@5.service failed because the control process exited with error code.
See "systemctl status ceph-osd@5.service" and "journalctl -xe" for details.
root@PVE1:~# systemctl start ceph-osd@0
root@PVE1:~# ^C
root@PVE1:~# ceph osd set noout^C
root@PVE1:~# ceph status
  cluster:
    id:     02a6e31e-a89e-4c52-9672-224c24a63104
    health: HEALTH_WARN
            mons are allowing insecure global_id reclaim
            3 osds down
            2 hosts (4 osds) down
            all OSDs are running octopus or later but require_osd_release < octopus
            Reduced data availability: 129 pgs inactive
            Degraded data redundancy: 285964/428946 objects degraded (66.667%), 71 pgs degraded, 71 pgs undersized

  services:
    mon: 3 daemons, quorum PVE1,pve2,pve3 (age 0.221722s)
    mgr: PVE1(active, since 2h)
    osd: 6 osds: 1 up (since 3h), 4 in (since 116m)

  data:
    pools:   2 pools, 129 pgs
    objects: 142.98k objects, 558 GiB
    usage:   2.0 TiB used, 1.5 TiB / 3.5 TiB avail
    pgs:     44.961% pgs unknown
             55.039% pgs not active
             285964/428946 objects degraded (66.667%)
             71 undersized+degraded+peered
             58 unknown

root@PVE1:~# ^C
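
One thing visible above is that I started typing ceph osd set noout but aborted it. My understanding is that this flag is worth setting before restarting OSDs, so the cluster does not start rebalancing in the meantime:

Code:
# Keep down OSDs from being marked out (and so avoid rebalancing) while restarting
ceph osd set noout
# ... restart / repair the OSDs ...
# Re-enable normal behaviour afterwards
ceph osd unset noout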
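
The failed units only point me at the journal, so the real startup error should be in there. This is what I plan to check next, with osd.2 as an example of a failing unit:

Code:
# Show the most recent log lines of the failing OSD unit
journalctl -u ceph-osd@2.service --no-pager -n 50
# And the unit status itself
systemctl status ceph-osd@2.service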
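
What worries me most is this line from ceph status: "all OSDs are running octopus or later but require_osd_release < octopus". If I read the upgrade notes correctly, Quincy OSDs can refuse to start while the require_osd_release flag in the OSD map is still below octopus. Is the fix simply to raise the flag on the monitors, i.e. something like the following (I have not run it yet, so please correct me if this is wrong)?

Code:
# My assumption: raise the minimum OSD release recorded in the OSD map
# ("octopus" first; "quincy" once all OSDs are up again)
ceph osd require-osd-release octopus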