Hi,
today one of my clusters randomly shut down completely.
Within a second, 5 OSDs crashed.
I can't see all OSDs in lsblk anymore; two OSDs show up as completely empty, unused disks.
The network is fine.
All disks are identical and 9 months old. I doubt that 5 drives failed at the same time.
With 5 failed drives the data couldn't be rebalanced, so the rebalance stopped too.
I cannot get the OSDs active again, not even with a reboot.
At the moment I am recovering all machines from backup, but this is really annoying to say the least. Ceph should be robust enough. I can't see why the disks are missing.
What happened here, and how can I get my Ceph back, not necessarily the data, but a working system again?
I tried "ceph-volume simple scan" and "ceph-volume simple scan /dev/nvme0n1" (the latter failed with the error: Argument is not a directory or device which is required to scan).
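For reference, this is roughly what I ran, plus the "lvm" variants that, as far as I understand, would be the matching commands if the OSDs were created as LVM OSDs (the Proxmox default). This is only a sketch of what I believe should apply, not something I have verified on this cluster:

```shell
# What I tried ("simple" mode is for old ceph-disk style OSDs):
ceph-volume simple scan
ceph-volume simple scan /dev/nvme0n1
# -> error: Argument is not a directory or device which is required to scan

# If the OSDs were created with ceph-volume lvm (Proxmox default),
# these should be the corresponding commands (sketch, unverified here):
ceph-volume lvm list        # show which LVs ceph-volume still knows about
ceph-volume lvm activate --all   # mount the OSD tmpfs dirs and start the OSDs
```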
root@pve2:~# pveversion -v
proxmox-ve: 7.2-1 (running kernel: 5.15.35-2-pve)
pve-manager: 7.2-4 (running version: 7.2-4/ca9d43cc)
pve-kernel-5.15: 7.2-4
pve-kernel-helper: 7.2-4
pve-kernel-5.13: 7.1-9
pve-kernel-5.11: 7.0-10
pve-kernel-5.15.35-2-pve: 5.15.35-5
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-3-pve: 5.13.19-7
pve-kernel-5.13.19-1-pve: 5.13.19-3
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-4-pve: 5.11.22-9
ceph: 16.2.7
ceph-fuse: 16.2.7
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-2
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-2
libpve-storage-perl: 7.2-4
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.12-1
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.3-1
proxmox-backup-file-restore: 2.2.3-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-1
pve-container: 4.2-1
pve-docs: 7.2-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.4-2
pve-ha-manager: 3.3-4
pve-i18n: 2.7-2
pve-qemu-kvm: 6.2.0-10
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.4-pve1
root@pve2:~#
2022-06-21T13:59:19.189009+0200 mon.pve1 (mon.0) 3233822 : cluster [DBG] osdmap e4685: 10 total, 5 up, 7 in
2022-06-21T13:59:19.189733+0200 mon.pve1 (mon.0) 3233823 : cluster [DBG] mgrmap e88: pve3(active, since 6M), standbys: pve5, pve1
2022-06-21T13:59:19.189943+0200 mon.pve1 (mon.0) 3233824 : cluster [WRN] Health check failed: 2/5 mons down, quorum pve1,pve2,pve5 (MON_DOWN)
2022-06-21T13:59:19.190926+0200 mon.pve1 (mon.0) 3233825 : cluster [DBG] osd.5 reported immediately failed by osd.3
2022-06-21T13:59:19.190981+0200 mon.pve1 (mon.0) 3233826 : cluster [INF] osd.5 failed (root=default,host=pve3) (connection refused reported by osd.3)
2022-06-21T13:59:19.190999+0200 mon.pve1 (mon.0) 3233827 : cluster [DBG] osd.5 reported immediately failed by osd.3
2022-06-21T13:59:19.191016+0200 mon.pve1 (mon.0) 3233828 : cluster [DBG] osd.5 reported immediately failed by osd.3
2022-06-21T13:59:19.191031+0200 mon.pve1 (mon.0) 3233829 : cluster [DBG] osd.5 reported immediately failed by osd.3
2022-06-21T13:59:19.191050+0200 mon.pve1 (mon.0) 3233830 : cluster [DBG] osd.5 reported immediately failed by osd.3
2022-06-21T13:59:19.191071+0200 mon.pve1 (mon.0) 3233831 : cluster [DBG] osd.5 reported immediately failed by osd.3
2022-06-21T13:59:19.191104+0200 mon.pve1 (mon.0) 3233832 : cluster [DBG] osd.5 reported immediately failed by osd.3
2022-06-21T13:59:19.191119+0200 mon.pve1 (mon.0) 3233833 : cluster [DBG] osd.5 reported immediately failed by osd.3
2022-06-21T13:59:19.191134+0200 mon.pve1 (mon.0) 3233834 : cluster [DBG] osd.5 reported immediately failed by osd.3
2022-06-21T13:59:19.191157+0200 mon.pve1 (mon.0) 3233835 : cluster [DBG] osd.6 reported immediately failed by osd.3
2022-06-21T13:59:19.191167+0200 mon.pve1 (mon.0) 3233836 : cluster [INF] osd.6 failed (root=default,host=pve4) (connection refused reported by osd.3)
2022-06-21T13:59:19.191181+0200 mon.pve1 (mon.0) 3233837 : cluster [DBG] osd.6 reported immediately failed by osd.3
2022-06-21T13:59:19.191194+0200 mon.pve1 (mon.0) 3233838 : cluster [DBG] osd.6 reported immediately failed by osd.3
2022-06-21T13:59:19.191214+0200 mon.pve1 (mon.0) 3233839 : cluster [DBG] osd.6 reported immediately failed by osd.3
2022-06-21T13:59:19.191228+0200 mon.pve1 (mon.0) 3233840 : cluster [DBG] osd.6 reported immediately failed by osd.3
2022-06-21T13:59:19.191242+0200 mon.pve1 (mon.0) 3233841 : cluster [DBG] osd.6 reported immediately failed by osd.3
2022-06-21T13:59:19.191256+0200 mon.pve1 (mon.0) 3233842 : cluster [DBG] osd.6 reported immediately failed by osd.3
2022-06-21T13:59:19.191269+0200 mon.pve1 (mon.0) 3233843 : cluster [DBG] osd.6 reported immediately failed by osd.3
2022-06-21T13:59:19.191283+0200 mon.pve1 (mon.0) 3233844 : cluster [DBG] osd.6 reported immediately failed by osd.3
2022-06-21T13:59:19.191303+0200 mon.pve1 (mon.0) 3233845 : cluster [DBG] osd.5 reported failed by osd.3
2022-06-21T13:59:19.193156+0200 mon.pve1 (mon.0) 3233846 : cluster [DBG] osd.5 reported immediately failed by osd.2
2022-06-21T13:59:19.193208+0200 mon.pve1 (mon.0) 3233847 : cluster [DBG] osd.5 reported immediately failed by osd.2
2022-06-21T13:59:19.193259+0200 mon.pve1 (mon.0) 3233848 : cluster [DBG] osd.5 reported immediately failed by osd.2
2022-06-21T13:59:19.193285+0200 mon.pve1 (mon.0) 3233849 : cluster [DBG] osd.5 reported immediately failed by osd.8
2022-06-21T13:59:19.193328+0200 mon.pve1 (mon.0) 3233850 : cluster [DBG] osd.5 reported immediately failed by osd.8
2022-06-21T13:59:19.193359+0200 mon.pve1 (mon.0) 3233851 : cluster [DBG] osd.5 reported immediately failed by osd.8
2022-06-21T13:59:19.193401+0200 mon.pve1 (mon.0) 3233852 : cluster [DBG] osd.5 reported immediately failed by osd.2
2022-06-21T13:59:19.193425+0200 mon.pve1 (mon.0) 3233853 : cluster [DBG] osd.5 reported immediately failed by osd.2
2022-06-21T13:59:19.193466+0200 mon.pve1 (mon.0) 3233854 : cluster [DBG] osd.5 reported immediately failed by osd.2
2022-06-21T13:59:19.193493+0200 mon.pve1 (mon.0) 3233855 : cluster [DBG] osd.5 reported immediately failed by osd.8
2022-06-21T13:59:19.193539+0200 mon.pve1 (mon.0) 3233856 : cluster [DBG] osd.5 reported immediately failed by osd.8
2022-06-21T13:59:19.193579+0200 mon.pve1 (mon.0) 3233857 : cluster [DBG] osd.5 reported immediately failed by osd.8
2022-06-21T13:59:19.193634+0200 mon.pve1 (mon.0) 3233858 : cluster [DBG] osd.5 reported immediately failed by osd.2
2022-06-21T13:59:19.193663+0200 mon.pve1 (mon.0) 3233859 : cluster [DBG] osd.5 reported immediately failed by osd.2
2022-06-21T13:59:19.193708+0200 mon.pve1 (mon.0) 3233860 : cluster [DBG] osd.5 reported immediately failed by osd.2
2022-06-21T13:59:19.193736+0200 mon.pve1 (mon.0) 3233861 : cluster [DBG] osd.5 reported immediately failed by osd.8
2022-06-21T13:59:19.193776+0200 mon.pve1 (mon.0) 3233862 : cluster [DBG] osd.5 reported immediately failed by osd.8
2022-06-21T13:59:19.193840+0200 mon.pve1 (mon.0) 3233863 : cluster [DBG] osd.5 reported immediately failed by osd.8
2022-06-21T13:59:19.193918+0200 mon.pve1 (mon.0) 3233864 : cluster [DBG] osd.6 reported immediately failed by osd.2
2022-06-21T13:59:19.193959+0200 mon.pve1 (mon.0) 3233865 : cluster [DBG] osd.6 reported immediately failed by osd.2
2022-06-21T13:59:19.193990+0200 mon.pve1 (mon.0) 3233866 : cluster [DBG] osd.6 reported immediately failed by osd.2
2022-06-21T13:59:19.194045+0200 mon.pve1 (mon.0) 3233867 : cluster [DBG] osd.6 reported immediately failed by osd.8
2022-06-21T13:59:19.194100+0200 mon.pve1 (mon.0) 3233868 : cluster [DBG] osd.6 reported immediately failed by osd.8
2022-06-21T13:59:19.194159+0200 mon.pve1 (mon.0) 3233869 : cluster [DBG] osd.6 reported immediately failed by osd.8
2022-06-21T13:59:19.194189+0200 mon.pve1 (mon.0) 3233870 : cluster [DBG] osd.6 reported immediately failed by osd.2
2022-06-21T13:59:19.194238+0200 mon.pve1 (mon.0) 3233871 : cluster [DBG] osd.6 reported immediately failed by osd.2
2022-06-21T13:59:19.194266+0200 mon.pve1 (mon.0) 3233872 : cluster [DBG] osd.6 reported immediately failed by osd.2
2022-06-21T13:59:19.194300+0200 mon.pve1 (mon.0) 3233873 : cluster [DBG] osd.6 reported immediately failed by osd.8
2022-06-21T13:59:19.194320+0200 mon.pve1 (mon.0) 3233874 : cluster [DBG] osd.6 reported immediately failed by osd.8
2022-06-21T13:59:19.194369+0200 mon.pve1 (mon.0) 3233875 : cluster [DBG] osd.6 reported immediately failed by osd.8
2022-06-21T13:59:19.194450+0200 mon.pve1 (mon.0) 3233876 : cluster [DBG] osd.6 reported immediately failed by osd.2
2022-06-21T13:59:19.194480+0200 mon.pve1 (mon.0) 3233877 : cluster [DBG] osd.6 reported immediately failed by osd.2
2022-06-21T13:59:19.194532+0200 mon.pve1 (mon.0) 3233878 : cluster [DBG] osd.6 reported immediately failed by osd.2
2022-06-21T13:59:19.194567+0200 mon.pve1 (mon.0) 3233879 : cluster [DBG] osd.6 reported immediately failed by osd.8
2022-06-21T13:59:19.194624+0200 mon.pve1 (mon.0) 3233880 : cluster [DBG] osd.6 reported immediately failed by osd.8
2022-06-21T13:59:19.194670+0200 mon.pve1 (mon.0) 3233881 : cluster [DBG] osd.6 reported immediately failed by osd.8
2022-06-21T13:59:19.194795+0200 mon.pve1 (mon.0) 3233882 : cluster [DBG] osd.5 reported failed by osd.2
2022-06-21T13:59:19.197543+0200 mon.pve1 (mon.0) 3233883 : cluster [WRN] Health detail: HEALTH_WARN 2/5 mons down, quorum pve1,pve2,pve5; 2 osds down; Reduced data availability: 41 pgs inactive; Degraded data redundancy: 616560/2297379 objects degraded (26.838%), 92 pgs degraded, 170 pgs undersized; 104 pgs not deep-scrubbed in time; 104 pgs not scrubbed in time; 1 daemons have recently crashed
2022-06-21T13:59:19.197556+0200 mon.pve1 (mon.0) 3233884 : cluster [WRN] [WRN] MON_DOWN: 2/5 mons down, quorum pve1,pve2,pve5
2022-06-21T13:59:19.197561+0200 mon.pve1 (mon.0) 3233885 : cluster [WRN] mon.pve3 (rank 2) addr [v2:10.20.15.5:3300/0,v1:10.20.15.5:6789/0] is down (out of quorum)
2022-06-21T13:59:19.197566+0200 mon.pve1 (mon.0) 3233886 : cluster [WRN] mon.pve4 (rank 3) addr [v2:10.20.15.6:3300/0,v1:10.20.15.6:6789/0] is down (out of quorum)
2022-06-21T13:59:19.197570+0200 mon.pve1 (mon.0) 3233887 : cluster [WRN] [WRN] OSD_DOWN: 2 osds down
2022-06-21T13:59:19.197575+0200 mon.pve1 (mon.0) 3233888 : cluster [WRN] osd.4 (root=default,host=pve3) is down
2022-06-21T13:59:19.197593+0200 mon.pve1 (mon.0) 3233889 : cluster [WRN] osd.7 (root=default,host=pve4) is down
Jun 21 16:45:02 pve2 ceph-osd[85907]: 2022-06-21T16:45:02.922+0200 7fbc5bf80f00 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-1/keyring: (2) No such file or directory
Jun 21 16:45:02 pve2 ceph-osd[85907]: 2022-06-21T16:45:02.922+0200 7fbc5bf80f00 -1 AuthRegistry(0x5560492cca40) no keyring found at /var/lib/ceph/osd/ceph-1/keyring, disabling cephx
Jun 21 16:45:02 pve2 ceph-osd[85907]: 2022-06-21T16:45:02.922+0200 7fbc5bf80f00 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-1/keyring: (2) No such file or directory
Jun 21 16:45:02 pve2 ceph-osd[85907]: 2022-06-21T16:45:02.922+0200 7fbc5bf80f00 -1 AuthRegistry(0x7fffc6dcb4c0) no keyring found at /var/lib/ceph/osd/ceph-1/keyring, disabling cephx
Jun 21 16:45:02 pve2 ceph-osd[85907]: failed to fetch mon config (--no-mon-config to skip)
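The keyring error above suggests /var/lib/ceph/osd/ceph-1 is empty, i.e. the tmpfs that ceph-volume normally populates on activation was never mounted. These are the checks I would expect to narrow that down, assuming a BlueStore/LVM OSD (a sketch, not verified output):

```shell
# Is the OSD data dir actually empty (missing keyring, fsid, block symlink)?
ls -la /var/lib/ceph/osd/ceph-1

# Does ceph-volume still recognize the OSD's logical volume and metadata?
ceph-volume lvm list

# Are the Ceph volume groups / logical volumes visible to LVM at all?
vgs
lvs
```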