We've run into this problem over the last couple of months: three full crashes since January.
We're using AMD EPYC CPUs (76** series) in a 4-node cluster. The crashes are random every time, never at a particular time or anything like that.
Going by the errors (same as above), I would assume there is something wrong...