I run a small single-node Ceph cluster (deployed by cephadm, not via Proxmox) for home file storage. It was running bare-metal, and I attempted a physical-to-virtual migration to a Proxmox VM, passing the PCIe HBA that connects all the disks through to the VM. After doing so, all of my PGs show as "unknown". Initially after a boot, the OSDs appear to be up, but after a while they go down; I assume some sort of timeout in the OSD start process. The systemd units (and podman containers) are still running and appear to be healthy, and I don't see anything alarming in their logs. I'm relatively new to Ceph, so I don't really know where to go from here. Can anyone provide any guidance?
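For reference, this is roughly how I've been checking that the daemons themselves are still alive (the fsid is my cluster's, taken from the outputs below):
Code:
# confirm the podman containers are up
podman ps --format '{{.Names}} {{.Status}}' | grep ceph

# spot-check one OSD unit and skim its recent journal
systemctl status ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.3.service
journalctl -u ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.3.service -n 200 --no-pager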
Below are the outputs of ceph -s, ceph osd df, ceph pg stat, and systemctl | grep ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b.
ceph -s
Code:
  cluster:
    id:     768819b0-a83f-11ee-81d6-74563c5bfc7b
    health: HEALTH_WARN
            Reduced data availability: 545 pgs inactive
            139 pgs not deep-scrubbed in time
            17 slow ops, oldest one blocked for 1668 sec, mon.fileserver has slow ops

  services:
    mon: 1 daemons, quorum fileserver (age 28m)
    mgr: fileserver.rgtdvr(active, since 28m), standbys: fileserver.gikddq
    osd: 17 osds: 5 up (since 116m), 5 in (since 10m)

  data:
    pools:   3 pools, 545 pgs
    objects: 1.97M objects, 7.5 TiB
    usage:   7.7 TiB used, 1.4 TiB / 9.1 TiB avail
    pgs:     100.000% pgs unknown
             545 unknown
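From what I've read, 100% "unknown" PGs usually means the mgr isn't receiving PG stats from the OSDs, rather than the data being gone, so one thing I'm considering is failing the active mgr so the standby takes over and rebuilds its view. A sketch of what I'd run (I haven't tried it yet):
Code:
# fail over from the active mgr (fileserver.rgtdvr) to the standby (fileserver.gikddq)
ceph mgr fail fileserver.rgtdvr

# then watch whether PG states repopulate
ceph -s
ceph pg stat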
ceph osd df
Code:
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP    META     AVAIL    %USE   VAR   PGS  STATUS
 0  hdd    1.81940         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0    0  down
 1  hdd    3.63869         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0    0  down
 3  hdd    1.81940         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0  112  down
 4  hdd    1.81940         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0  117  down
 5  hdd    3.63869         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0    0  down
 6  hdd    3.63869         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0    0  down
 7  hdd    1.81940         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0    0  down
 8  hdd    1.81940         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0  106  down
20  hdd    1.81940         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0  115  down
21  hdd    1.81940         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0   94  down
22  hdd    1.81940         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0   98  down
23  hdd    1.81940         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0  109  down
24  hdd    1.81940   1.00000  1.8 TiB  1.6 TiB  1.6 TiB   4 KiB  3.0 GiB  186 GiB  90.00  1.06  117  up
25  hdd    1.81940   1.00000  1.8 TiB  1.6 TiB  1.6 TiB  10 KiB  2.8 GiB  220 GiB  88.18  1.04  114  up
26  hdd    1.81940   1.00000  1.8 TiB  1.5 TiB  1.5 TiB   9 KiB  2.8 GiB  297 GiB  84.07  0.99  109  up
27  hdd    1.81940   1.00000  1.8 TiB  1.4 TiB  1.4 TiB   7 KiB  2.5 GiB  474 GiB  74.58  0.88   98  up
28  hdd    1.81940   1.00000  1.8 TiB  1.6 TiB  1.6 TiB  10 KiB  3.0 GiB  206 GiB  88.93  1.04  115  up
                       TOTAL  9.1 TiB  7.7 TiB  7.7 TiB  42 KiB   14 GiB  1.4 TiB  85.15
MIN/MAX VAR: 0.88/1.06  STDDEV: 5.65
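The twelve down OSDs are also marked out (REWEIGHT of 0). To catch what's happening when one of them drops, my plan is to query a daemon directly over its admin socket from inside its container, roughly like this (using osd.3 as the example):
Code:
# get a shell inside the osd.3 container
cephadm enter --name osd.3

# then, inside the container, ask the daemon for its own view of its state
ceph daemon osd.3 status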
ceph pg stat
Code:
545 pgs: 545 unknown; 7.5 TiB data, 7.7 TiB used, 1.4 TiB / 9.1 TiB avail
systemctl | grep ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b
Code:
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@alertmanager.fileserver.service loaded active running Ceph alertmanager.fileserver for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@ceph-exporter.fileserver.service loaded active running Ceph ceph-exporter.fileserver for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@crash.fileserver.service loaded active running Ceph crash.fileserver for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@grafana.fileserver.service loaded active running Ceph grafana.fileserver for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@mgr.fileserver.gikddq.service loaded active running Ceph mgr.fileserver.gikddq for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@mgr.fileserver.rgtdvr.service loaded active running Ceph mgr.fileserver.rgtdvr for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@mon.fileserver.service loaded active running Ceph mon.fileserver for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.0.service loaded active running Ceph osd.0 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.1.service loaded active running Ceph osd.1 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.20.service loaded active running Ceph osd.20 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.21.service loaded active running Ceph osd.21 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.22.service loaded active running Ceph osd.22 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.23.service loaded active running Ceph osd.23 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.24.service loaded active running Ceph osd.24 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.25.service loaded active running Ceph osd.25 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.26.service loaded active running Ceph osd.26 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.27.service loaded active running Ceph osd.27 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.28.service loaded active running Ceph osd.28 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.3.service loaded active running Ceph osd.3 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.4.service loaded active running Ceph osd.4 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.5.service loaded active running Ceph osd.5 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.6.service loaded active running Ceph osd.6 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.7.service loaded active running Ceph osd.7 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.8.service loaded active running Ceph osd.8 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@prometheus.fileserver.service loaded active running Ceph prometheus.fileserver for 768819b0-a83f-11ee-81d6-74563c5bfc7b
system-ceph\x2d768819b0\x2da83f\x2d11ee\x2d81d6\x2d74563c5bfc7b.slice loaded active active Slice /system/ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b.target loaded active active Ceph cluster 768819b0-a83f-11ee-81d6-74563c5bfc7b
I've attached the mon and osd.3 logs.
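If longer excerpts would help, I can pull more from the journald units, e.g.:
Code:
journalctl -u ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@mon.fileserver.service --since "2 hours ago" --no-pager > mon.log
journalctl -u ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.3.service --since "2 hours ago" --no-pager > osd.3.log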