syslog error in ceph after upgrade to 8.1

Recently I upgraded the server from Proxmox 7.x to 8.1.x and found this error message in syslog. Kindly advise: is this some kind of error from the upgrade?

2023-12-07T18:35:26.242453+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.236+0530 7f83573ff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash.pve-3.keyring: (13) Permission denied
2023-12-07T18:35:26.242468+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.236+0530 7f83573ff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash.pve-3.keyring: (13) Permission denied
2023-12-07T18:35:26.242479+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.236+0530 7f83573ff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash.pve-3.keyring: (13) Permission denied
2023-12-07T18:35:26.242490+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.236+0530 7f83573ff6c0 -1 monclient: keyring not found
2023-12-07T18:35:26.242504+05:30 pve-3 ceph-crash[2036]: [errno 13] RADOS permission denied (error connecting to the cluster)
2023-12-07T18:35:26.313037+05:30 pve-3 ceph-crash[2036]: WARNING:ceph-crash:post /var/lib/ceph/crash/2023-11-25T17:51:38.807915Z_02ab151a-7b8d-4b87-a775-4b317b3795d4 as client.crash failed: 2023-12-07T18:35:26.304+0530 7f5591cff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash.keyring: (13) Permission denied
2023-12-07T18:35:26.313058+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.308+0530 7f5591cff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash.keyring: (13) Permission denied
2023-12-07T18:35:26.313073+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.308+0530 7f5591cff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash.keyring: (13) Permission denied
2023-12-07T18:35:26.313084+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.308+0530 7f5591cff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash.keyring: (13) Permission denied
2023-12-07T18:35:26.313095+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.308+0530 7f5591cff6c0 -1 monclient: keyring not found
2023-12-07T18:35:26.313113+05:30 pve-3 ceph-crash[2036]: [errno 13] RADOS permission denied (error connecting to the cluster)
2023-12-07T18:35:26.384380+05:30 pve-3 ceph-crash[2036]: WARNING:ceph-crash:post /var/lib/ceph/crash/2023-11-25T17:51:38.807915Z_02ab151a-7b8d-4b87-a775-4b317b3795d4 as client.admin failed: 2023-12-07T18:35:26.376+0530 7f65442ff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
2023-12-07T18:35:26.384400+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.376+0530 7f65442ff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
2023-12-07T18:35:26.384415+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.376+0530 7f65442ff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
2023-12-07T18:35:26.384427+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.376+0530 7f65442ff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
2023-12-07T18:35:26.384437+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.376+0530 7f65442ff6c0 -1 monclient: keyring not found
2023-12-07T18:35:26.384453+05:30 pve-3 ceph-crash[2036]: [errno 13] RADOS permission denied (error connecting to the cluster)
2023-12-07T18:37:01.282589+05:30 pve-3 pmxcfs[3477]: [status] notice: received log
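
The excerpt repeats the same failure for several keyring paths. A minimal sketch for listing each distinct path once, assuming the log text is piped on stdin; the function name failed_keyrings is hypothetical and not part of Ceph or Proxmox:

```shell
# Hypothetical helper: print each distinct keyring path that could not be opened.
# Reads syslog/journal text on stdin.
failed_keyrings() {
    # Keep only the "unable to find a keyring on <path>" part of each line,
    # strip the fixed prefix, then de-duplicate.
    grep -o 'unable to find a keyring on [^:]*' \
        | sed 's/^unable to find a keyring on //' \
        | sort -u
}

# Possible usage on a node (assumes the ceph-crash unit logs to the journal):
# journalctl -u ceph-crash --no-pager | failed_keyrings
```

For the excerpt above this gives three paths under /etc/pve/priv: ceph.client.crash.pve-3.keyring, ceph.client.crash.keyring and ceph.client.admin.keyring.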
 
What does your ceph.conf look like, and what's the output of: ls -lisah /etc/pve/priv | grep ceph
Which Ceph version are you running, and did you also upgrade Ceph according to the docs? PVE 8 requires Ceph 17.
 
In my case:

root@proxmox1:~# ls -lisah /etc/pve/priv | grep ceph
36 0 drwx------ 2 root www-data 0 May 24 2016 ceph
38 512 -rw------- 1 root www-data 63 May 24 2016 ceph.client.admin.keyring
138096 512 -rw------- 1 root www-data 214 May 24 2016 ceph.mon.keyring


Config:

[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
cluster_network = 172.28.30.0/24
filestore_xattr_use_omap = true
fsid = aa0c6258-2b48-4888-a3a1-2e94b0896b82
mon_allow_pool_delete = true
mon_host = 172.28.29.13 172.28.29.12 172.28.29.11
mon_osd_full_ratio = 0.95
mon_osd_nearfull_ratio = 0.45
ms_bind_ipv4 = true
osd_journal_size = 5120
osd_max_backfills = 1
osd_pool_default_min_size = 1
osd_recovery_max_active = 1
public_network = 172.28.29.0/24

[client]
keyring = /etc/pve/priv/$cluster.$name.keyring

[mds]
keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mon.proxmox1]
public_addr = 172.28.29.11

[mon.proxmox2]
public_addr = 172.28.29.12

[mon.proxmox3]
public_addr = 172.28.29.13

Yes, Ceph was upgraded to 17.2.7 some time before updating from Proxmox 7.x to 8.1.3, and there were no problems.
 
root@pve-3:~# ls -lisah /etc/pve/priv | grep ceph
35 0 drwx------ 2 root www-data 0 Dec 7 2022 ceph
37 512 -rw------- 1 root www-data 151 Dec 6 2022 ceph.client.admin.keyring
43965 512 -rw------- 1 root www-data 228 Dec 6 2022 ceph.mon.keyring


config:

[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
cluster_network = 192.168.20.110/24
fsid = ee114244-adc0-4a1a-acd3-8c2c59f5b2e8
mon_allow_pool_delete = true
mon_host = 10.10.20.110 10.10.20.111 10.10.20.112
ms_bind_ipv4 = true
ms_bind_ipv6 = false
osd_pool_default_min_size = 2
osd_pool_default_size = 3
public_network = 10.10.20.110/24

[client]
keyring = /etc/pve/priv/$cluster.$name.keyring

[mon.pve-1]
public_addr = 10.10.20.110

[mon.pve-2]
public_addr = 10.10.20.111

[mon.pve-3]
public_addr = 10.10.20.112

root@pve-3:~# ceph --version
ceph version 17.2.7 (e303afc2e967a4705b40a7e5f76067c10eea0484) quincy (stable)



I have not upgraded Ceph; it was installed with version 7.2 in December 2022. The only change I made was upgrading the server to version 8.1.3 last week; nothing else changed.
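
The keyring paths in the syslog messages follow from the [client] keyring setting in the config above: Ceph expands $cluster to the cluster name and $name to the client's full name (type.id). A small sketch of that substitution, assuming the default cluster name ceph; expand_keyring is a hypothetical helper for illustration, not a Ceph command:

```shell
# Hypothetical helper: expand the $cluster and $name metavariables in a
# keyring template the way the paths in the log appear to be built.
# Usage: expand_keyring TEMPLATE CLUSTER NAME
expand_keyring() {
    template="$1"
    cluster="$2"
    name="$3"
    # \$ matches a literal dollar sign in the template.
    printf '%s\n' "$template" \
        | sed -e 's|\$cluster|'"$cluster"'|g' -e 's|\$name|'"$name"'|g'
}

# The template is single-quoted so the shell does not expand it itself.
expand_keyring '/etc/pve/priv/$cluster.$name.keyring' ceph client.crash
expand_keyring '/etc/pve/priv/$cluster.$name.keyring' ceph client.admin
```

The two calls print /etc/pve/priv/ceph.client.crash.keyring and /etc/pve/priv/ceph.client.admin.keyring, which are two of the paths reported as "Permission denied" in the log.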
 

Solved by restarting ceph-crash:

Code:
systemctl restart ceph-crash

EDIT: Never mind, it didn't work.

 
I don't know if this solves the underlying problem, but syslog is clean now after:

0. Made a backup of /var/lib/ceph/crash just in case I had to recover the files
1. deleted everything inside /var/lib/ceph/crash
2. created "posted" directory inside /var/lib/ceph/crash
3. restarted ceph-crash

Again, I'm not sure if this is the way to go but now there are no error logs in syslog
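
The steps above can be sketched as a small shell function. This is only an illustration of the listed steps, not a confirmed fix. The function name reset_crash_dir and the backup location are assumptions, and the post does not say which owner or mode the recreated "posted" directory was given:

```shell
# Hypothetical helper covering steps 0-2; both paths are parameters so the
# function can be tried on a copy first.
# Usage: reset_crash_dir CRASH_DIR BACKUP_DIR
reset_crash_dir() {
    crash_dir="$1"
    backup_dir="$2"

    # Step 0: back up the crash directory, preserving ownership and modes.
    mkdir -p "$backup_dir" && cp -a "$crash_dir/." "$backup_dir/" || return 1

    # Step 1: delete everything inside the crash directory (GNU find).
    find "$crash_dir" -mindepth 1 -delete || return 1

    # Step 2: recreate the "posted" subdirectory.
    mkdir "$crash_dir/posted"
}

# On a real node, as root, followed by step 3 (backup path is an assumption):
# reset_crash_dir /var/lib/ceph/crash /root/ceph-crash-backup \
#     && systemctl restart ceph-crash
```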

Update:
The syslog errors appeared again 5 days ago (6th June), so deleting those files does not help.
Any help on this matter would be appreciated

Thanks
 
Did you solve the problem?
 
Is it advisable to upgrade Ceph from Quincy to Reef while this error is present? Any advice? I have a critical production cluster: will upgrading directly have any impact, compared with troubleshooting first?

 
