I'm getting repeated kernel panics on one of my nodes (see the picture below).
I'm guessing that one of the Ceph OSDs on this node got corrupted somehow, and XFS is now really unhappy. That leaves me with two problems:
1) Should I even try to repair the XFS filesystem?
2) Ceph isn't rebalancing properly, even though it recognizes those OSDs as down/out.
And, of course, the root question would be "What happened???" but I have no answer for that.
Curiously, pveproxy/pvemanager are completely dead on this system; the server is half-dead - it responds to ping but not much else.