unable to find a keyring - need help to recover ceph

GoZippy

Can anyone point me in the right direction to fix this?

ceph-osd --check-wants-journal
Code:
root@node2:/var/lib/ceph/osd/ceph-1# ceph-osd --check-wants-journal
2022-04-14T20:55:43.210-0500 7f71cd669f00 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-admin/keyring: (2) No such file or directory
2022-04-14T20:55:43.214-0500 7f71cd669f00 -1 AuthRegistry(0x558eb8d39340) no keyring found at /var/lib/ceph/osd/ceph-admin/keyring, disabling cephx
2022-04-14T20:55:43.214-0500 7f71cd669f00 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-admin/keyring: (2) No such file or directory
2022-04-14T20:55:43.214-0500 7f71cd669f00 -1 AuthRegistry(0x7fff5ac25250) no keyring found at /var/lib/ceph/osd/ceph-admin/keyring, disabling cephx
failed to fetch mon config (--no-mon-config to skip)
root@node2:/var/lib/ceph/osd/ceph-1# cd ..
root@node2:/var/lib/ceph/osd# ls
ceph-1
root@node2:/var/lib/ceph/osd# cd ceph-1/
root@node2:/var/lib/ceph/osd/ceph-1# ls
block  ceph_fsid  fsid  keyring  ready  require_osd_release  type  whoami
root@node2:/var/lib/ceph/osd/ceph-1#


ceph -s
Code:
root@node2:~# ceph -s
  cluster:
    id:     cfa7f7e5-64a7-48dd-bd77-466ff1e77bbb
    health: HEALTH_WARN
            1 filesystem is degraded
            insufficient standby MDS daemons available
            1 MDSs report slow metadata IOs
            6 osds down
            2 hosts (2 osds) down
            Reduced data availability: 512 pgs inactive, 63 pgs down, 29 pgs peering
            Degraded data redundancy: 49931/470465 objects degraded (10.613%), 19 pgs degraded, 30 pgs undersized
            121 pgs not deep-scrubbed in time
            121 pgs not scrubbed in time
            256 slow ops, oldest one blocked for 261608 sec, osd.1 has slow ops
  services:
    mon: 1 daemons, quorum node2 (age 3d)
    mgr: node2(active, since 5h)
    mds: 1/1 daemons up
    osd: 14 osds: 4 up (since 3d), 10 in (since 9d)
 
  data:
    volumes: 0/1 healthy, 1 recovering
    pools:   6 pools, 512 pgs
    objects: 156.82k objects, 606 GiB
    usage:   719 GiB used, 212 GiB / 932 GiB avail
    pgs:     76.172% pgs unknown
             23.828% pgs not active
             49931/470465 objects degraded (10.613%)
             390 unknown
             63  down
             29  peering
             19  undersized+degraded+peered
             11  undersized+peered
pveversion -v
Code:
root@node2:~# pveversion -v
proxmox-ve: 7.1-1 (running kernel: 5.13.19-6-pve)
pve-manager: 7.1-12 (running version: 7.1-12/b3c09de3)
pve-kernel-helper: 7.1-14
pve-kernel-5.13: 7.1-9
pve-kernel-5.11: 7.0-10
pve-kernel-5.4: 6.4-4
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-4-pve: 5.13.19-9
pve-kernel-5.13.19-1-pve: 5.13.19-3
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.4.124-1-pve: 5.4.124-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
ceph: 16.2.7
ceph-fuse: 16.2.7
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-7
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-5
libpve-guest-common-perl: 4.1-1
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.1-1
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-2
proxmox-backup-client: 2.1.5-1
proxmox-backup-file-restore: 2.1.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-7
pve-cluster: 7.1-3
pve-container: 4.1-4
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-6
pve-ha-manager: 3.3-3
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.1-2
pve-xtermjs: 4.16.0-1
qemu-server: 7.1-4
smartmontools: 7.2-pve2
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.4-pve1
ceph osd df tree
Code:
root@node2:~# ceph osd df tree
ID   CLASS  WEIGHT    REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS  TYPE NAME 
 -1         11.82256         -      0 B      0 B      0 B      0 B      0 B      0 B      0     0    -          root default
 -5          0.90970         -  932 GiB  719 GiB  718 GiB    3 KiB  1.2 GiB  212 GiB  77.22  1.00    -              host node2
  1    hdd   0.90970   1.00000  932 GiB  719 GiB  718 GiB    3 KiB  1.2 GiB  212 GiB  77.22  1.00  122      up          osd.1
 -7          0.90970         -      0 B      0 B      0 B      0 B      0 B      0 B      0     0    -              host node3
  2    hdd   0.90970         0      0 B      0 B      0 B      0 B      0 B      0 B      0     0    0    down          osd.2
 -9          0.90970         -      0 B      0 B      0 B      0 B      0 B      0 B      0     0    -              host node4
  3    hdd   0.90970         0      0 B      0 B      0 B      0 B      0 B      0 B      0     0    0    down          osd.3
-13                0         -      0 B      0 B      0 B      0 B      0 B      0 B      0     0    -              host node5
-11                0         -      0 B      0 B      0 B      0 B      0 B      0 B      0     0    -              host node6
-15          0.90970         -      0 B      0 B      0 B      0 B      0 B      0 B      0     0    -              host node7
  6    hdd   0.90970   1.00000      0 B      0 B      0 B      0 B      0 B      0 B      0     0    0    down          osd.6
-17          0.90970         -      0 B      0 B      0 B      0 B      0 B      0 B      0     0    -              host node8
  7    hdd   0.90970         0      0 B      0 B      0 B      0 B      0 B      0 B      0     0    0    down          osd.7
-19          6.36436         -      0 B      0 B      0 B      0 B      0 B      0 B      0     0    -              host node900
  8    hdd   0.90919   1.00000      0 B      0 B      0 B      0 B      0 B      0 B      0     0    0    down          osd.8
  9    hdd   0.90919   1.00000      0 B      0 B      0 B      0 B      0 B      0 B      0     0    0    down          osd.9
 10    hdd   0.90919   1.00000      0 B      0 B      0 B      0 B      0 B      0 B      0     0    0    down          osd.10
 11    hdd   0.90919   1.00000      0 B      0 B      0 B      0 B      0 B      0 B      0     0    0    down          osd.11
 12    hdd   0.90919   1.00000      0 B      0 B      0 B      0 B      0 B      0 B      0     0    6      up          osd.12
 13    hdd   0.90919   1.00000      0 B      0 B      0 B      0 B      0 B      0 B      0     0    7      up          osd.13
 14    hdd   0.90919   1.00000      0 B      0 B      0 B      0 B      0 B      0 B      0     0   13      up          osd.14
 -3          0.90970         -      0 B      0 B      0 B      0 B      0 B      0 B      0     0    -              host stack1
  0    hdd   0.90970   1.00000      0 B      0 B      0 B      0 B      0 B      0 B      0     0    0    down          osd.0
  5                0         0      0 B      0 B      0 B      0 B      0 B      0 B      0     0    0    down  osd.5     
                         TOTAL  932 GiB  719 GiB  718 GiB  3.9 KiB  1.2 GiB  212 GiB  77.22                               
MIN/MAX VAR: 0/1.00  STDDEV: 73.25
ceph-osd --check-wants-journal
Code:
root@node2:~# ceph-osd --check-wants-journal
2022-04-17T21:43:37.600-0500 7fef16ef9f00 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-admin/keyring: (2) No such file or directory
2022-04-17T21:43:37.600-0500 7fef16ef9f00 -1 AuthRegistry(0x55c9f642b340) no keyring found at /var/lib/ceph/osd/ceph-admin/keyring, disabling cephx
2022-04-17T21:43:37.604-0500 7fef16ef9f00 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-admin/keyring: (2) No such file or directory
2022-04-17T21:43:37.604-0500 7fef16ef9f00 -1 AuthRegistry(0x7ffd73a32310) no keyring found at /var/lib/ceph/osd/ceph-admin/keyring, disabling cephx
failed to fetch mon config (--no-mon-config to skip)
root@node2:~#

Code:
root@node2:/var/lib/ceph/osd# ls
ceph-1
root@node2:/var/lib/ceph/osd#

It looks like an update (or something similar) broke Ceph: it is now looking for the keyring in the default location, /var/lib/ceph/osd/ceph-admin/keyring, instead of the Proxmox bootstrap location for the cluster name and OSD ID.

How did this happen? Any ideas on where to patch things to get it live again?

/var/lib/ceph/osd/ceph-1 is correct, I think. Each node has its own OSDs, so I'm not sure where the map has gone wrong.
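
One possible reading of the path in the error, offered as a sketch and not a confirmed diagnosis: the default OSD keyring location seems to follow the pattern /var/lib/ceph/osd/$cluster-$id/keyring, and both commands above were run without an OSD ID. Assuming the ID then falls back to "admin" and the cluster name to "ceph", the path in the log is what you would expect. The helper `osd_keyring_path` below is hypothetical (it is not a Ceph tool) and only mimics that expansion:

```shell
#!/bin/sh
# Hypothetical helper, not part of Ceph: mimics how the default OSD keyring
# path appears to be built, i.e. /var/lib/ceph/osd/$cluster-$id/keyring.
osd_keyring_path() {
    id="${1:-admin}"      # assumption: the ID falls back to "admin" when none is given
    cluster="${2:-ceph}"  # assumption: the default cluster name is "ceph"
    printf '/var/lib/ceph/osd/%s-%s/keyring\n' "$cluster" "$id"
}

osd_keyring_path      # no ID given, as in the commands above
osd_keyring_path 1    # ID of the OSD that lives on node2
```

The first call prints the ceph-admin path from the error message; the second prints the ceph-1 path that actually exists on node2. If this reading is right, passing the ID explicitly (ceph-osd -i 1 --check-wants-journal) should make the command look in /var/lib/ceph/osd/ceph-1 instead. That is untested here, and it would not by itself explain why the other OSDs are down.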
 