Hello All,
I am new to Proxmox and could use your help. I've done some troubleshooting but don't know what to do next; hoping someone can provide some guidance.
I was running a 4-host, 30-guest Proxmox 6.4 HCI cluster with Ceph Octopus for about a month, and it was working pretty well. I upgraded to Proxmox 7.0 and that went OK. Once I upgraded Ceph to Pacific, though, things went off the rails. Now the CephFS is completely inaccessible: of the 4 OSDs, only 1 is up (2 in, 3 down).
Whenever I try to bring up any of the down OSDs, it crashes immediately.
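In case it helps, this is how I've been pulling the crash details so far (osd.0 below is just an example id, substitute the OSD that's crashing for you):

```shell
# List the crashes Ceph has recorded (crash module, available since Nautilus)
ceph crash ls

# Show the backtrace for one crash; <crash-id> is a placeholder for an id
# taken from the first column of `ceph crash ls`
ceph crash info <crash-id>

# Tail the systemd journal for the crashing OSD daemon (osd.0 as an example)
journalctl -u ceph-osd@0 -n 200 --no-pager
```

Happy to post the backtraces from any of these if that would help.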
Here's some output that I'd appreciate you kind souls reviewing:
ceph -s
ceph health detail will be posted in a followup message due to post size limitations.
root@sin-dc-196m-hv-cpu-001:~# ceph -s
  cluster:
    id:     9f0013c5-930e-4dc4-b359-02e8f0af74ad
    health: HEALTH_WARN
            1 filesystem is degraded
            3 MDSs report slow metadata IOs
            1 osds down
            2 hosts (2 osds) down
            1 nearfull osd(s)
            Reduced data availability: 193 pgs inactive
            4 pool(s) nearfull
            8 daemons have recently crashed

  services:
    mon: 4 daemons, quorum sin-dc-196m-hv-cpu-001,sin-dc-196m-hv-cpu-002,sin-dc-196m-hv-cpu-003,sin-dc-196m-hv-cpu-004 (age 27h)
    mgr: sin-dc-196m-hv-cpu-004(active, since 43h), standbys: sin-dc-196m-hv-cpu-001, sin-dc-196m-hv-cpu-002, sin-dc-196m-hv-cpu-003
    mds: 3/3 daemons up, 1 standby
    osd: 4 osds: 1 up (since 45h), 2 in (since 43h)

  data:
    volumes: 0/1 healthy, 1 recovering
    pools:   4 pools, 193 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:     100.000% pgs unknown
             193 unknown
The ceph -s output on the other three hosts (sin-dc-196m-hv-cpu-002, -003, and -004) is identical, so I've omitted it here.