[SOLVED] Help! Ceph access totally broken

This seems to work, sort of.


Code:
# ceph -n mon. --keyring /var/lib/ceph/mon/ceph-pvecloud01/keyring -s
2025-02-21T17:45:17.161+0100 76a6658006c0 -1 auth: unable to find a keyring on /etc/ceph/ceph.mon..keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
2025-02-21T17:45:17.161+0100 76a6658006c0 -1 AuthRegistry(0x76a660065920) no keyring found at /etc/ceph/ceph.mon..keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin, disabling cephx
  cluster:
    id:     de7290f6-faac-4cfa-8569-502fae22c3ca
    health: HEALTH_WARN
            all OSDs are running squid or later but require_osd_release < squid
            1 subtrees have overcommitted pool target_size_bytes
            6 daemons have recently crashed
            too many PGs per OSD (320 > max 250)

  services:
    mon: 3 daemons, quorum pvecloud01,pvecloud02,pvecloud03 (age 7h)
    mgr: pvecloud01(active, since 7h), standbys: pvecloud02, pvecloud03
    osd: 15 osds: 15 up (since 7h), 15 in (since 9d)

  data:
    pools:   44 pools, 1601 pgs
    objects: 5.43M objects, 21 TiB
    usage:   60 TiB used, 44 TiB / 105 TiB avail
    pgs:     1601 active+clean

  io:
    client:   920 KiB/s rd, 41 MiB/s wr, 74 op/s rd, 659 op/s wr

The warning messages were already known beforehand.

What you need to do is recreate the missing config files.

To list all keys, use this: # ceph -n mon. --keyring /var/lib/ceph/mon/ceph-pvecloud01/keyring auth ls

1. Create /etc/pve/ceph.conf and fill in the needed settings
2. Link the file into /etc/ceph: # ln -s /etc/pve/ceph.conf /etc/ceph/ceph.conf
3. Restore the keyrings:

/etc/pve/ceph/ceph.client.crash.keyring
/etc/pve/priv/ceph.client.admin.keyring
/etc/pve/priv/ceph.mon.keyring

4. Copy admin keyring to /etc/pve/priv/ceph/{storage_name}.keyring
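
The keyring steps above could be scripted roughly like this. It is only a sketch, assuming the monitor keyring from the command above still authenticates and that `ceph auth get <entity> -o <file>` is used to write each keyring. The helper names `export_keyring` and `restore_client_keyrings` and the `MON_KEYRING` variable are made up for illustration. The mon keyring and the per-storage copy from step 4 are left out on purpose.

```shell
#!/bin/sh
# Sketch only: helper names and MON_KEYRING are illustrative assumptions.
MON_KEYRING="${MON_KEYRING:-/var/lib/ceph/mon/ceph-pvecloud01/keyring}"

# Export one entity's key into a keyring file, authenticating as mon.
export_keyring() {
    entity="$1"
    dest="$2"
    ceph -n mon. --keyring "$MON_KEYRING" auth get "$entity" -o "$dest"
}

# Restore the client keyrings from the list above under a given root
# (normally /etc/pve). Stops at the first failing step.
restore_client_keyrings() {
    root="${1:-/etc/pve}"
    mkdir -p "$root/ceph" "$root/priv" &&
    export_keyring client.crash "$root/ceph/ceph.client.crash.keyring" &&
    export_keyring client.admin "$root/priv/ceph.client.admin.keyring"
}
```

Calling `restore_client_keyrings /tmp/keyring-test` first and inspecting the result before writing to /etc/pve is probably the safer order.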

Check that all nodes have the same files. Then try restarting one MON to see whether it starts up.
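
A rough way to do that check on each node, as a sketch: /etc/pve is shared across the cluster, so the per-node part is mainly the /etc/ceph/ceph.conf link. The helper names `conf_link_ok` and `restart_mon` are hypothetical, and the monitor ID is assumed to be the node name as in the status output above.

```shell
#!/bin/sh
# Sketch only: conf_link_ok and restart_mon are hypothetical helpers.

# Succeeds if $1 is a symlink that resolves to the existing file $2.
conf_link_ok() {
    link="${1:-/etc/ceph/ceph.conf}"
    target="${2:-/etc/pve/ceph.conf}"
    [ -L "$link" ] && [ -e "$target" ] &&
        [ "$(readlink -f "$link")" = "$(readlink -f "$target")" ]
}

# Restart a single monitor and print whether it is active afterwards.
restart_mon() {
    unit="ceph-mon@$1.service"
    systemctl restart "$unit" && systemctl is-active "$unit"
}
```

For example: `conf_link_ok && restart_mon pvecloud01`. Restart only one monitor at a time so the other two keep quorum.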

Check systemd:
1. Does it have the needed links in /etc/systemd/system/{ceph-mds.target.wants,ceph-mgr.target.wants,ceph-mon.target.wants,ceph.target.wants}?
2. Does /etc/systemd/system/multi-user.target.wants/ contain the ceph-volume services for the OSDs?
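
Both systemd checks can be done by simply listing those directories. A minimal sketch, with hypothetical helper names `list_wants` and `list_ceph_volume_units`, and assuming the OSD units there are named `ceph-volume@...`:

```shell
#!/bin/sh
# Sketch only: list_wants and list_ceph_volume_units are hypothetical helpers.

# Print the unit links in each Ceph-related *.target.wants directory under a
# systemd root (normally /etc/systemd/system) and flag missing directories.
list_wants() {
    root="${1:-/etc/systemd/system}"
    for target in ceph-mds ceph-mgr ceph-mon ceph; do
        dir="$root/$target.target.wants"
        if [ -d "$dir" ]; then
            echo "$dir:"
            ls -1 "$dir"
        else
            echo "$dir: MISSING"
        fi
    done
}

# Print the ceph-volume units enabled under multi-user.target.wants, if any.
list_ceph_volume_units() {
    root="${1:-/etc/systemd/system}"
    ls -1 "$root/multi-user.target.wants" 2>/dev/null | grep '^ceph-volume@'
}
```

Comparing the output with a healthy node shows quickly which links are missing.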

After that, the system should run as before.
 

I was able to copy the old admin key from /etc/pve/priv/ceph/{storage_name}.keyring back to /etc/pve/priv/ceph.client.admin.keyring
Et voilà: everything worked just fine. :cool:

Thanks a lot to everyone, and especially @Nemesiz!