syslog error in ceph after upgrade to 8.1

I recently upgraded the server from Proxmox 7.x to 8.1.x and found these error messages in syslog. Please advise: is this an error caused by the upgrade?

2023-12-07T18:35:26.242453+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.236+0530 7f83573ff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash.pve-3.keyring: (13) Permission denied
2023-12-07T18:35:26.242468+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.236+0530 7f83573ff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash.pve-3.keyring: (13) Permission denied
2023-12-07T18:35:26.242479+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.236+0530 7f83573ff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash.pve-3.keyring: (13) Permission denied
2023-12-07T18:35:26.242490+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.236+0530 7f83573ff6c0 -1 monclient: keyring not found
2023-12-07T18:35:26.242504+05:30 pve-3 ceph-crash[2036]: [errno 13] RADOS permission denied (error connecting to the cluster)
2023-12-07T18:35:26.313037+05:30 pve-3 ceph-crash[2036]: WARNING:ceph-crash:post /var/lib/ceph/crash/2023-11-25T17:51:38.807915Z_02ab151a-7b8d-4b87-a775-4b317b3795d4 as client.crash failed: 2023-12-07T18:35:26.304+0530 7f5591cff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash.keyring: (13) Permission denied
2023-12-07T18:35:26.313058+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.308+0530 7f5591cff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash.keyring: (13) Permission denied
2023-12-07T18:35:26.313073+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.308+0530 7f5591cff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash.keyring: (13) Permission denied
2023-12-07T18:35:26.313084+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.308+0530 7f5591cff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.crash.keyring: (13) Permission denied
2023-12-07T18:35:26.313095+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.308+0530 7f5591cff6c0 -1 monclient: keyring not found
2023-12-07T18:35:26.313113+05:30 pve-3 ceph-crash[2036]: [errno 13] RADOS permission denied (error connecting to the cluster)
2023-12-07T18:35:26.384380+05:30 pve-3 ceph-crash[2036]: WARNING:ceph-crash:post /var/lib/ceph/crash/2023-11-25T17:51:38.807915Z_02ab151a-7b8d-4b87-a775-4b317b3795d4 as client.admin failed: 2023-12-07T18:35:26.376+0530 7f65442ff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
2023-12-07T18:35:26.384400+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.376+0530 7f65442ff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
2023-12-07T18:35:26.384415+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.376+0530 7f65442ff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
2023-12-07T18:35:26.384427+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.376+0530 7f65442ff6c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
2023-12-07T18:35:26.384437+05:30 pve-3 ceph-crash[2036]: 2023-12-07T18:35:26.376+0530 7f65442ff6c0 -1 monclient: keyring not found
2023-12-07T18:35:26.384453+05:30 pve-3 ceph-crash[2036]: [errno 13] RADOS permission denied (error connecting to the cluster)
2023-12-07T18:37:01.282589+05:30 pve-3 pmxcfs[3477]: [status] notice: received log
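
The log above repeats the same few messages many times, so it can help to collapse it into a list of the keyring paths that were actually denied. A minimal sketch, assuming the messages keep the exact wording shown above; the function name denied_keyrings is made up for illustration:

```shell
#!/bin/sh
# Summarise which keyring paths were refused with "(13) Permission denied".
# Reads syslog-style lines on stdin and prints "count path" per unique path.
# Assumption: the message text matches the log lines quoted in this thread.
denied_keyrings() {
    grep -o 'unable to find a keyring on [^:]*: (13) Permission denied' \
        | sed -e 's/^unable to find a keyring on //' \
              -e 's/: (13) Permission denied$//' \
        | sort | uniq -c | awk '{print $1, $2}'
}

# Possible usage on the affected node:
#   journalctl -u ceph-crash --since today | denied_keyrings
```

For the excerpt above this should list three paths: the client.crash.pve-3, client.crash and client.admin keyrings under /etc/pve/priv.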
 
What does your ceph.conf look like, and what is the output of: ls -lisah /etc/pve/priv | grep ceph
Which Ceph version are you running, and did you also upgrade Ceph according to the docs? PVE 8 requires Ceph 17.
 
In my case:

root@proxmox1:~# ls -lisah /etc/pve/priv | grep ceph
36 0 drwx------ 2 root www-data 0 May 24 2016 ceph
38 512 -rw------- 1 root www-data 63 May 24 2016 ceph.client.admin.keyring
138096 512 -rw------- 1 root www-data 214 May 24 2016 ceph.mon.keyring


Config:

[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
cluster_network = 172.28.30.0/24
filestore_xattr_use_omap = true
fsid = aa0c6258-2b48-4888-a3a1-2e94b0896b82
mon_allow_pool_delete = true
mon_host = 172.28.29.13 172.28.29.12 172.28.29.11
mon_osd_full_ratio = 0.95
mon_osd_nearfull_ratio = 0.45
ms_bind_ipv4 = true
osd_journal_size = 5120
osd_max_backfills = 1
osd_pool_default_min_size = 1
osd_recovery_max_active = 1
public_network = 172.28.29.0/24

[client]
keyring = /etc/pve/priv/$cluster.$name.keyring

[mds]
keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mon.proxmox1]
public_addr = 172.28.29.11

[mon.proxmox2]
public_addr = 172.28.29.12

[mon.proxmox3]
public_addr = 172.28.29.13

Yes, Ceph was upgraded to 17.2.7 some time before updating from Proxmox 7.x to 8.1.3, and there were no problems.
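
For reference, the keyring line in the [client] section is a template: Ceph substitutes $cluster (default "ceph") and $name (the client name, such as client.admin) before looking for the file. The sketch below only imitates that substitution in plain shell to show how the paths in the syslog messages come about; expand_keyring is a made-up helper, not a Ceph command:

```shell
#!/bin/sh
# Imitate how the keyring template from [client] is expanded.
# $1 = template, $2 = cluster name, $3 = client name
# Hypothetical helper for illustration only; Ceph does this internally.
expand_keyring() {
    printf '%s\n' "$1" \
        | sed -e 's|\$cluster|'"$2"'|g' -e 's|\$name|'"$3"'|g'
}

# The three client names that appear in the ceph-crash log in this thread
for name in client.crash.pve-3 client.crash client.admin; do
    expand_keyring '/etc/pve/priv/$cluster.$name.keyring' ceph "$name"
done
```

Only the client.admin path exists in the ls output posted here, and it is readable by root only, which fits the "(13) Permission denied" messages if ceph-crash is not running as root (that last part is an assumption, not something shown in the thread).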
 
root@pve-3:~# ls -lisah /etc/pve/priv | grep ceph
35 0 drwx------ 2 root www-data 0 Dec 7 2022 ceph
37 512 -rw------- 1 root www-data 151 Dec 6 2022 ceph.client.admin.keyring
43965 512 -rw------- 1 root www-data 228 Dec 6 2022 ceph.mon.keyring


config:

[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
cluster_network = 192.168.20.110/24
fsid = ee114244-adc0-4a1a-acd3-8c2c59f5b2e8
mon_allow_pool_delete = true
mon_host = 10.10.20.110 10.10.20.111 10.10.20.112
ms_bind_ipv4 = true
ms_bind_ipv6 = false
osd_pool_default_min_size = 2
osd_pool_default_size = 3
public_network = 10.10.20.110/24

[client]
keyring = /etc/pve/priv/$cluster.$name.keyring

[mon.pve-1]
public_addr = 10.10.20.110

[mon.pve-2]
public_addr = 10.10.20.111

[mon.pve-3]
public_addr = 10.10.20.112

root@pve-3:~# ceph --version
ceph version 17.2.7 (e303afc2e967a4705b40a7e5f76067c10eea0484) quincy (stable)



I did not upgrade Ceph; it was installed with version 7.2 last December (2022). The only change I made was upgrading the server to version 8.1.3 last week; nothing else changed.
 

Solved by restarting ceph-crash:

Code:
systemctl restart ceph-crash

EDIT: Never mind, it didn't work.

 
I don't know if this solves the underlying problem, but syslog is clean now after I:

0. Made a backup of /var/lib/ceph/crash in case I had to recover the files
1. Deleted everything inside /var/lib/ceph/crash
2. Created a "posted" directory inside /var/lib/ceph/crash
3. Restarted ceph-crash

Again, I'm not sure this is the way to go, but there are no more errors in syslog.
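
For anyone who wants to repeat steps 0 to 2 exactly, they could look roughly like this. It is only a sketch of the workaround described above, not a confirmed fix; the function name reset_crash_dir and the backup location are assumptions, and the restart in step 3 is left as a comment so nothing is restarted by accident:

```shell
#!/bin/sh
# Sketch of steps 0-2 above. Hypothetical helper, run as root at your own risk.
# $1 = crash directory, $2 = backup destination (must not exist yet)
reset_crash_dir() {
    [ -d "$1" ] || return 1
    [ -e "$2" ] && return 1
    # 0. back up the whole crash directory, preserving ownership and times
    cp -a "$1" "$2" || return 1
    # 1. delete everything inside the crash directory (GNU find)
    find "$1" -mindepth 1 -delete || return 1
    # 2. recreate the "posted" subdirectory
    #    (it is created with the current user's ownership; adjust if needed)
    mkdir -p "$1/posted"
}

# Possible usage:
#   reset_crash_dir /var/lib/ceph/crash /root/ceph-crash-backup
# 3. then restart the service:
#   systemctl restart ceph-crash
```

This only clears the crash reports that ceph-crash was trying to post; it does not change the keyring permissions that the error messages complain about.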

Update:
The syslog errors appeared again 5 days ago (6th June) so deleting those files does not help.
Any help on this matter would be appreciated

Thanks
 
Did you solve the problem?