Hello,
We are having some strange issues in our Proxmox cluster. Our nodes seem to kernel panic at random, without rhyme or reason. We can't find anything in the logs that explains why this is happening (if there are any non-standard log locations, we'd love to know about them!). Our storage back-end is Ceph, and we currently run Proxmox VE 5.4-13 with 8 nodes. There doesn't seem to be a pattern tied to any particular node, as 3-4 of them have had the issue. A reboot seems to bring the affected node back into the cluster without issues. We've run memtests on the nodes, and there doesn't seem to be a problem with memory. Resources seem to be sufficient (usage under 65% on each node). Any help is appreciated, as we're kind of at a loss here with nodes just randomly dying. We had two nodes die yesterday, one around 8:25 AM CST and the other around 4:10 PM CST.