[CEPH] auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied

fcarucci

New Member
May 13, 2023
Hello, I'm pretty new to Ceph and have a 3-node cluster.
On one of the nodes I keep getting these warnings in the log:

Code:
Mar 07 14:59:02 pve-ceph1 ceph-crash[974]: WARNING:ceph-crash:post /var/lib/ceph/crash/2024-03-06T03:47:35.113588Z_ae2458c5-5cb9-4b85-9afe-1b2efe7f8600 as client.admin failed: 2024-03-07T14:59:02.088-0800 7639492346c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
Mar 07 14:59:02 pve-ceph1 ceph-crash[974]: 2024-03-07T14:59:02.096-0800 7639492346c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
Mar 07 14:59:02 pve-ceph1 ceph-crash[974]: 2024-03-07T14:59:02.096-0800 7639492346c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
Mar 07 14:59:02 pve-ceph1 ceph-crash[974]: 2024-03-07T14:59:02.096-0800 7639492346c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
Mar 07 14:59:02 pve-ceph1 ceph-crash[974]: 2024-03-07T14:59:02.096-0800 7639492346c0 -1 monclient: keyring not found

The keyring file is there:
Code:
root@pve-ceph1:~# ls -la /etc/pve/priv/ceph.client.admin.keyring
-rw------- 1 root www-data 151 Feb 29 14:22 /etc/pve/priv/ceph.client.admin.keyring

The cluster seems otherwise fine and healthy.
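
For context on the listing above: mode -rw------- (0600) grants read access to the file's owner only, here root. A process running as any other user gets "(13) Permission denied" even though the file exists. The sketch below decodes an octal mode into per-class read access. The function name mode_allows_read is made up for illustration, and which user ceph-crash actually runs as on a given node is an assumption to verify locally.

```shell
# Return success (0) if an octal permission mode grants read access to the
# given class (owner, group or other), and non-zero otherwise.
# Hypothetical helper for illustration; not part of Ceph or Proxmox.
mode_allows_read() (
    mode=$1
    class=$2
    case $class in
        owner) bit=256 ;;   # octal 0400
        group) bit=32 ;;    # octal 0040
        other) bit=4 ;;     # octal 0004
        *) echo "unknown class: $class" >&2; exit 2 ;;
    esac
    # The leading 0 makes the shell read the mode string as an octal number.
    [ $(( 0$mode & $bit )) -ne 0 ]
)

# Possible use with GNU stat on the node:
#   mode_allows_read "$(stat -c '%a' /etc/pve/priv/ceph.client.admin.keyring)" other \
#       || echo "not readable by other users"
```

With the mode from the listing (600), only the owner class passes; group and other do not.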

How do I clear these warnings? What kind of information do you need to diagnose the problem?

Thanks!
 
Thanks for your help!

Code:
[global]
         auth_client_required = cephx
         auth_cluster_required = cephx
         auth_service_required = cephx
         cluster_network = 10.0.20.4/24
         err_to_syslog = true
         fsid = 5ce42d57-4371-475a-94fb-eac8acefe72e
         mon_allow_pool_delete = true
         mon_allow_pool_size_one = false
         mon_cluster_log_file_level = info
         mon_cluster_log_to_file = false
         mon_host = 10.0.20.3 10.0.20.1 10.0.20.4
         ms_bind_ipv4 = true
         ms_bind_ipv6 = false
         osd_deep_scrub_interval = 1209600
         osd_pool_default_min_size = 2
         osd_pool_default_size = 4
         osd_scrub_begin_hour = 23
         osd_scrub_end_hour = 7
         osd_scrub_sleep = 0.1
         public_network = 10.0.20.4/24

[client]
         keyring = /etc/pve/priv/$cluster.$name.keyring

[mds]
         keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mds.pve-1]
         host = pve
         mds_standby_for_name = pve

[mds.pve-2]
         host = pve
         mds_standby_for_name = pve

[mds.pve-3]
         host = pve
         mds_standby_for_name = pve

[mds.pve-ceph1-1]
         host = pve-ceph1
         mds_standby_for_name = pve

[mds.pve-ceph1-2]
         host = pve-ceph1
         mds_standby_for_name = pve

[mds.pve-ceph1-3]
         host = pve-ceph1
         mds_standby_for_name = pve

[mds.pve2-1]
         host = pve2
         mds_standby_for_name = pve

[mds.pve3-1]
         host = pve3
         mds_standby_for_name = pve

[mon.pve]
         debug_mon = 0/5
         public_addr = 10.0.20.1

[mon.pve-ceph1]
         public_addr = 10.0.20.4

[mon.pve3]
         debug_mon = 0/5
         public_addr = 10.0.20.3

Feel free to point out anything wrong in this .conf.
 
This is a known issue that will be fixed in one of the next versions. It's spammy, but all it does is prevent crash information from being submitted upstream.
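
Until a fix lands, one way to keep these lines from drowning out everything else when reading the journal is to filter them. This is only a sketch: the function name drop_keyring_noise is made up, and the patterns are copied from the log lines posted in this thread. Note that it also drops the WARNING lines that have the keyring message embedded in them.

```shell
# Read log text on stdin and drop the repeated keyring messages.
# Hypothetical helper; the two patterns are taken verbatim from the log above.
drop_keyring_noise() {
    grep -v \
        -e 'auth: unable to find a keyring on ' \
        -e 'monclient: keyring not found'
}

# Possible use on a node:
#   journalctl -u ceph-crash --since today | drop_keyring_noise
```

grep exits with status 1 when every line was filtered out, which in this case simply means there was nothing else in the log.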
 
Hi @fabian, thanks for the answer. Any chance you could point us to the issue so that we can follow its resolution?
Regards
 
Hey @fabian is this still an issue?

I'm running the following version:

Code:
pveversion -v
proxmox-ve: 8.2.0 (running kernel: 6.8.8-2-pve)
pve-manager: 8.2.4 (running version: 8.2.4/faa83925c9641325)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.8-2
proxmox-kernel-6.8.8-2-pve-signed: 6.8.8-2
ceph: 18.2.2-pve1
ceph-fuse: 18.2.2-pve1
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx8
intel-microcode: 3.20240531.1
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.7
libpve-cluster-perl: 8.0.7
libpve-common-perl: 8.2.1
libpve-guest-common-perl: 5.1.3
libpve-http-server-perl: 5.1.0
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.9
libpve-storage-perl: 8.2.3
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.2.4-1
proxmox-backup-file-restore: 3.2.4-1
proxmox-firewall: 0.4.2
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.6
proxmox-widget-toolkit: 4.2.3
pve-cluster: 8.0.7
pve-container: 5.1.12
pve-docs: 8.2.2
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.1
pve-firewall: 5.0.7
pve-firmware: 3.12-1
pve-ha-manager: 4.0.5
pve-i18n: 3.2.2
pve-qemu-kvm: 9.0.0-3
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.1
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.4-pve1

I still seem to have this issue on my cluster; I get the following errors every 10 minutes or so:

Code:
ceph-crash[1714]: [errno 13] RADOS permission denied (error connecting to the cluster)
ceph-crash[1714]: WARNING:ceph-crash:post /var/lib/ceph/crash/2024-08-07T09:08:01.169014Z_17b51a51-5d78-4c3e-b983-97ab75ec0ac6 as client.admin failed: 2024-09-03T09:11:30.141+1000 7688f80006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
ceph-crash[1714]: 2024-09-03T09:11:30.152+1000 7688f80006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
ceph-crash[1714]: 2024-09-03T09:11:30.153+1000 7688f80006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
ceph-crash[1714]: 2024-09-03T09:11:30.153+1000 7688f80006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
ceph-crash[1714]: 2024-09-03T09:11:30.153+1000 7688f80006c0 -1 monclient: keyring not found

I checked /var/lib/ceph/mds and there's nothing in that folder.
Is there anything else I can do to troubleshoot this? I've looked at other threads, and they either have no answer or were never resolved.
 
Yes, this should be fixed now. Can you try restarting the ceph-crash service?
 
@fabian I just tried this and here's the log straight after restarting it

Code:
Sep 04 08:39:32  systemd[1]: Stopping ceph-crash.service - Ceph crash dump collector...
Sep 04 08:39:32  ceph-crash[1714]: *** Interrupted with signal 15 ***
Sep 04 08:39:32  systemd[1]: ceph-crash.service: Deactivated successfully.
Sep 04 08:39:32  systemd[1]: Stopped ceph-crash.service - Ceph crash dump collector.
Sep 04 08:39:32  systemd[1]: ceph-crash.service: Consumed 2h 24min 25.264s CPU time.
Sep 04 08:39:32  systemd[1]: Started ceph-crash.service - Ceph crash dump collector.
Sep 04 08:39:32  ceph-crash[3267938]: INFO:ceph-crash:pinging cluster to exercise our key
Sep 04 08:39:32  ceph-crash[3267941]: 2024-09-04T08:39:32.484+1000 7d41554006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
Sep 04 08:39:32  ceph-crash[3267941]: 2024-09-04T08:39:32.486+1000 7d41554006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
Sep 04 08:39:32  ceph-crash[3267941]: 2024-09-04T08:39:32.487+1000 7d41554006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
Sep 04 08:39:32  ceph-crash[3267941]: 2024-09-04T08:39:32.487+1000 7d41554006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
Sep 04 08:39:32  ceph-crash[3267941]: 2024-09-04T08:39:32.487+1000 7d41554006c0 -1 monclient: keyring not found
Sep 04 08:39:32  ceph-crash[3267941]: [errno 13] RADOS permission denied (error connecting to the cluster)
Sep 04 08:39:32  ceph-crash[3267938]: INFO:ceph-crash:monitoring path /var/lib/ceph/crash, delay 600s
Sep 04 08:39:33  ceph-crash[3267938]: WARNING:ceph-crash:post /var/lib/ceph/crash/2024-08-07T09:07:53.831553Z_2f7b806e-6d40-413c-8fc0-d89e71206551 as client.crash failed: Error EINVAL: Traceback (most recent call last):
Sep 04 08:39:33  ceph-crash[3267938]:   File "/usr/share/ceph/mgr/mgr_module.py", line 1811, in _handle_command
Sep 04 08:39:33  ceph-crash[3267938]:     return CLICommand.COMMANDS[cmd['prefix']].call(self, cmd, inbuf)
Sep 04 08:39:33  ceph-crash[3267938]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Sep 04 08:39:33  ceph-crash[3267938]:   File "/usr/share/ceph/mgr/mgr_module.py", line 474, in call
Sep 04 08:39:33  ceph-crash[3267938]:     return self.func(mgr, **kwargs)
Sep 04 08:39:33  ceph-crash[3267938]:            ^^^^^^^^^^^^^^^^^^^^^^^^
Sep 04 08:39:33  ceph-crash[3267938]: TypeError: Module.do_post() missing 1 required positional argument: 'inbuf'
Sep 04 08:39:33  ceph-crash[3267938]: WARNING:ceph-crash:post /var/lib/ceph/crash/2024-08-07T09:07:53.831553Z_2f7b806e-6d40-413c-8fc0-d89e71206551 as client.crash. failed: 2024-09-04T08:39:33.368+1000 7c378aa006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash..keyring: (13) Permission denied
Sep 04 08:39:33  ceph-crash[3267938]: 2024-09-04T08:39:33.379+1000 7c378aa006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash..keyring: (13) Permission denied
Sep 04 08:39:33  ceph-crash[3267938]: 2024-09-04T08:39:33.381+1000 7c378aa006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash..keyring: (13) Permission denied
Sep 04 08:39:33  ceph-crash[3267938]: 2024-09-04T08:39:33.381+1000 7c378aa006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash..keyring: (13) Permission denied
Sep 04 08:39:33  ceph-crash[3267938]: 2024-09-04T08:39:33.381+1000 7c378aa006c0 -1 monclient: keyring not found
Sep 04 08:39:33  ceph-crash[3267938]: [errno 13] RADOS permission denied (error connecting to the cluster)
Sep 04 08:39:33  ceph-crash[3267938]: WARNING:ceph-crash:post /var/lib/ceph/crash/2024-08-07T09:07:53.831553Z_2f7b806e-6d40-413c-8fc0-d89e71206551 as client.admin failed: 2024-09-04T08:39:33.598+1000 741fc18006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
Sep 04 08:39:33  ceph-crash[3267938]: 2024-09-04T08:39:33.609+1000 741fc18006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
Sep 04 08:39:33  ceph-crash[3267938]: 2024-09-04T08:39:33.610+1000 741fc18006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
Sep 04 08:39:33  ceph-crash[3267938]: 2024-09-04T08:39:33.610+1000 741fc18006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
Sep 04 08:39:33  ceph-crash[3267938]: 2024-09-04T08:39:33.610+1000 741fc18006c0 -1 monclient: keyring not found
Sep 04 08:39:33  ceph-crash[3267938]: [errno 13] RADOS permission denied (error connecting to the cluster)
Sep 04 08:39:34  ceph-crash[3267938]: WARNING:ceph-crash:post /var/lib/ceph/crash/2024-08-07T09:08:01.169014Z_17b51a51-5d78-4c3e-b983-97ab75ec0ac6 as client.crash failed: Error EINVAL: Traceback (most recent call last):
Sep 04 08:39:34  ceph-crash[3267938]:   File "/usr/share/ceph/mgr/mgr_module.py", line 1811, in _handle_command
Sep 04 08:39:34  ceph-crash[3267938]:     return CLICommand.COMMANDS[cmd['prefix']].call(self, cmd, inbuf)
Sep 04 08:39:34  ceph-crash[3267938]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Sep 04 08:39:34  ceph-crash[3267938]:   File "/usr/share/ceph/mgr/mgr_module.py", line 474, in call
Sep 04 08:39:34  ceph-crash[3267938]:     return self.func(mgr, **kwargs)
Sep 04 08:39:34  ceph-crash[3267938]:            ^^^^^^^^^^^^^^^^^^^^^^^^
Sep 04 08:39:34  ceph-crash[3267938]: TypeError: Module.do_post() missing 1 required positional argument: 'inbuf'
Sep 04 08:39:34  ceph-crash[3267938]: WARNING:ceph-crash:post /var/lib/ceph/crash/2024-08-07T09:08:01.169014Z_17b51a51-5d78-4c3e-b983-97ab75ec0ac6 as client.crash. failed: 2024-09-04T08:39:34.488+1000 7844154006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash..keyring: (13) Permission denied
Sep 04 08:39:34  ceph-crash[3267938]: 2024-09-04T08:39:34.498+1000 7844154006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash..keyring: (13) Permission denied
Sep 04 08:39:34  ceph-crash[3267938]: 2024-09-04T08:39:34.500+1000 7844154006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash..keyring: (13) Permission denied
Sep 04 08:39:34  ceph-crash[3267938]: 2024-09-04T08:39:34.500+1000 7844154006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash..keyring: (13) Permission denied
Sep 04 08:39:34  ceph-crash[3267938]: 2024-09-04T08:39:34.500+1000 7844154006c0 -1 monclient: keyring not found
Sep 04 08:39:34  ceph-crash[3267938]: [errno 13] RADOS permission denied (error connecting to the cluster)
Sep 04 08:39:34  ceph-crash[3267938]: WARNING:ceph-crash:post /var/lib/ceph/crash/2024-08-07T09:08:01.169014Z_17b51a51-5d78-4c3e-b983-97ab75ec0ac6 as client.admin failed: 2024-09-04T08:39:34.715+1000 79d5c18006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
Sep 04 08:39:34  ceph-crash[3267938]: 2024-09-04T08:39:34.725+1000 79d5c18006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
Sep 04 08:39:34  ceph-crash[3267938]: 2024-09-04T08:39:34.727+1000 79d5c18006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
Sep 04 08:39:34  ceph-crash[3267938]: 2024-09-04T08:39:34.727+1000 79d5c18006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
Sep 04 08:39:34  ceph-crash[3267938]: 2024-09-04T08:39:34.727+1000 79d5c18006c0 -1 monclient: keyring not found
Sep 04 08:39:34  ceph-crash[3267938]: [errno 13] RADOS permission denied (error connecting to the cluster)
 
Hello!

Could you please post the output of the following commands?

  1. pveversion

  2. ceph --version

  3. ls -alhu /etc/pve/ceph

  4. ls -alhu /var/lib/ceph/crash

  5. ls -alhu /var/lib/ceph/crash/posted

  6. stat /etc/pve/priv/ceph.client.admin.keyring
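
To gather all six outputs in one go, something like the following could be used. It is a rough sketch: run_labeled and collect_ceph_crash_diagnostics are made-up names, and the commands themselves are exactly the ones listed above. Each command is printed before its output, and a failing command does not stop the rest.

```shell
# Print the command line, then its output (stdout and stderr combined).
# A non-zero exit status is reported instead of aborting the collection.
run_labeled() {
    printf '$ %s\n' "$*"
    "$@" 2>&1 || printf '(exit status %s)\n' "$?"
    printf '\n'
}

# Hypothetical wrapper around the six commands requested above.
# It is only defined here; call it on the node, e.g.
#   collect_ceph_crash_diagnostics > diagnostics.txt
collect_ceph_crash_diagnostics() {
    run_labeled pveversion
    run_labeled ceph --version
    run_labeled ls -alhu /etc/pve/ceph
    run_labeled ls -alhu /var/lib/ceph/crash
    run_labeled ls -alhu /var/lib/ceph/crash/posted
    run_labeled stat /etc/pve/priv/ceph.client.admin.keyring
}
```

The output file name diagnostics.txt is just an example. Review the result for anything sensitive before posting it.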
 
Hi @Max Carrara,

Code:
pve-manager/8.2.4/faa83925c9641325 (running kernel: 6.8.8-2-pve)

Code:
ceph version 18.2.2 (e9fe820e7fffd1b7cde143a9f77653b73fcec748) reef (stable)

Code:
ls -alhu /etc/pve/ceph
total 512
drwxr-xr-x 2 root www-data  0 Jun 19 15:26 .
drwxr-xr-x 2 root www-data  0 Jan  1  1970 ..
-rw-r----- 1 root www-data 63 Jun 19 15:26 ceph.client.crash.keyring

Code:
ls -alhu /var/lib/ceph/crash
total 20K
drwxr-xr-x  5 ceph ceph 4.0K Sep  4 22:02 .
drwxr-x--- 14 ceph ceph 4.0K Sep  3 09:11 ..
drwx------  2 ceph ceph 4.0K Aug  7 19:07 2024-08-07T09:07:53.831553Z_2f7b806e-6d40-413c-8fc0-d89e71206551
drwx------  2 ceph ceph 4.0K Aug  7 19:08 2024-08-07T09:08:01.169014Z_17b51a51-5d78-4c3e-b983-97ab75ec0ac6
drwxr-xr-x  5 ceph ceph 4.0K Jun 19 15:26 posted

Code:
ls -alhu /var/lib/ceph/crash/posted
total 20K
drwxr-xr-x 5 ceph ceph 4.0K Sep  5 08:40 .
drwxr-xr-x 5 ceph ceph 4.0K Sep  4 22:02 ..
drwx------ 2 ceph ceph 4.0K Jul  5 09:08 2024-07-04T23:08:02.887774Z_ace1ad4e-2c81-4b87-81af-ea6302457785
drwx------ 2 ceph ceph 4.0K Aug  7 19:08 2024-08-07T09:08:00.936326Z_60a41562-026e-4b50-8c7f-b597e66f38a7
drwx------ 2 ceph ceph 4.0K Aug  7 19:08 2024-08-07T09:08:02.272533Z_ff19c064-30cf-4994-b32f-c73faea8ccbe

Code:
stat /etc/pve/priv/ceph.client.admin.keyring
  File: /etc/pve/priv/ceph.client.admin.keyring
  Size: 151             Blocks: 1          IO Block: 4096   regular file
Device: 0,47    Inode: 32          Links: 1
Access: (0600/-rw-------)  Uid: (    0/    root)   Gid: (   33/www-data)
Access: 2024-06-19 15:26:56.000000000 +1000
Modify: 2024-06-19 15:26:56.000000000 +1000
Change: 2024-06-19 15:26:56.000000000 +1000
 Birth: -

Thanks!
 
Thanks a bunch! So, everything seems to be in place, but the two recent crashes in /var/lib/ceph/crash aren't being moved to /var/lib/ceph/crash/posted, as the logs you posted earlier confirm ...

The lines saying auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied are usually harmless; I've tried to quiet those down a little in one of my patches from back then, but maybe I should revisit this. The ceph and ceph-crash CLIs will try to authenticate via their preferred methods first before trying other keyrings (e.g. the one in /etc/pve/ceph); the error logging isn't as "proper" from Ceph's side here in that case. But I digress.
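
To illustrate where the paths in those messages come from: the ceph.conf posted earlier in this thread sets keyring = /etc/pve/priv/$cluster.$name.keyring in its [client] section, so each identity that is tried is substituted for $name. The sketch below only mimics that substitution in shell; expand_keyring_path is a made-up helper, not how Ceph implements it, and it assumes the default cluster name ceph.

```shell
# Expand $cluster and $name in a keyring path template.
# Hypothetical helper that imitates the substitution for illustration only;
# values containing '|', '&' or '\' would confuse the sed expressions.
expand_keyring_path() (
    template=$1
    cluster=$2
    name=$3
    printf '%s\n' "$template" |
        sed -e 's|[$]cluster|'"$cluster"'|g' \
            -e 's|[$]name|'"$name"'|g'
)

# Template copied from the [client] section posted in this thread.
# Single quotes keep the shell from expanding $cluster and $name itself.
template='/etc/pve/priv/$cluster.$name.keyring'

expand_keyring_path "$template" ceph client.admin
expand_keyring_path "$template" ceph client.crash.
```

The two calls print /etc/pve/priv/ceph.client.admin.keyring and /etc/pve/priv/ceph.client.crash..keyring, matching the paths that appear in the log.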

Anyhow, the TypeError: Module.do_post() missing 1 required positional argument: 'inbuf' coming from Python is very unexpected, I haven't seen this anywhere else so far (though I'll keep my eyes open).

Just for my own curiosity, could you also attach the contents of /var/lib/ceph/crash/2024-* as .tar.gz or similar? The following command should do the trick:
Bash:
tar cvzf ceph-crash.tar.gz /var/lib/ceph/crash/2024-*/

Please check if there are any sensitive logs in either of those two directories first, though. If you don't want to share those logs, the crash metadata would still be nice to have:

Bash:
cat /var/lib/ceph/crash/2024-*/meta

Thanks a lot!
 
Hey @Max Carrara,

Huge thanks for looking into this. An interesting find when tarring the crash files: the files in both crash dumps are 0 bytes in size and contain no information, no text or anything. See the screenshot here:

1725576411061.png

Running cat /var/lib/ceph/crash/2024-*/meta also results in nothing.
 

Attachments

  • ceph-crash.tar.gz (292 bytes)

Very interesting find, thank you very much!

I was able to reproduce the behaviour of ceph-crash - I copied one of the posted crash log directories from /var/lib/ceph/crash/posted to its parent directory, renamed it (just changed the timestamp at the beginning a little), then deleted and re-created its done, log and meta files with touch.

That way the error appears for me too.
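
For anyone who wants to try the same thing, the steps described above could look roughly like this. It is a sketch under assumptions: make_empty_crash_entry is a made-up name, both paths are supplied by the caller, and it is best run on a test setup rather than a production node.

```shell
# Copy an existing crash entry to a new location and replace its
# done, log and meta files with empty ones, as described above.
# Hypothetical helper; src and dest are chosen by the caller.
make_empty_crash_entry() (
    src=$1    # an existing entry, e.g. one under /var/lib/ceph/crash/posted
    dest=$2   # new entry path in the parent directory, with a changed timestamp
    cp -R -p "$src" "$dest" || exit 1
    for f in done log meta; do
        rm -f "$dest/$f"
        touch "$dest/$f"
    done
)
```

The original entry is left untouched; only the copy ends up with 0-byte files.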

Since there's nothing in either of your two unposted crash log directories, I'd say it's safe to just remove them. Whatever happened there happened repeatedly in quick succession (according to the ls outputs you posted above) and was about a month ago now, so if there was any persisting problem, I assume you'd have noticed by now.
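
Before removing anything, it may help to confirm which entries are actually empty. The sketch below lists crash directories (other than posted) whose meta file is missing or 0 bytes. The function name find_empty_crash_entries is made up, the default path is the one from this thread, and the function only prints; it deletes nothing.

```shell
# List unposted crash entries whose meta file is missing or empty.
# Hypothetical helper; it prints paths and never modifies anything.
find_empty_crash_entries() (
    crash_dir=${1:-/var/lib/ceph/crash}
    for entry in "$crash_dir"/*/; do
        # If the glob matched nothing, the literal pattern is not a directory.
        if [ ! -d "$entry" ]; then
            continue
        fi
        entry=${entry%/}
        # Entries under "posted" were already submitted; skip that directory.
        if [ "${entry##*/}" = posted ]; then
            continue
        fi
        # -s is true only for files that exist and are larger than 0 bytes.
        if [ ! -s "$entry/meta" ]; then
            printf '%s\n' "$entry"
        fi
    done
)

# Possible use on the affected node:
#   find_empty_crash_entries /var/lib/ceph/crash
```

Anything it prints can then be inspected by hand and removed if it really contains nothing.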

Now, here's the interesting thing: It baffles me how those empty entries were created anyway. Maybe the crashes occurred (in whatever component that was affected) in such quick succession that the creation of those entries was interrupted..? That's the only thing I can think of at least. Perhaps whatever crashed was restarted too quickly by systemd, crashed again, was restarted again before the crash logger could finish. (Or something similar, really just speculating here.)

Since there are posted entries from around the same time of the invalid ones: Could you tar and share those, perhaps? Though, just wanna mention that you should check if there's any sensitive information in there. If there is, please redact it before posting.

If there's anything interesting in those logs, it would warrant further investigation.

Thanks a lot for sharing! Should this happen again at some point, please let me know.
 
Oh, I also wanna add: Since ceph-crash chokes and dies because it received nothing where it expected something, I'll open an issue upstream. In my opinion, empty entries in those directories should be handled more gracefully, e.g. by creating a new crash entry that reports the fact that there was an empty crash entry. The empty crash entry should then be removed in order to let business continue as usual.

Edit: Done - https://tracker.ceph.com/issues/67931
 