I have multiple AMD EPYC-based systems on different motherboards, all brand new and all having the same issue.
They keep crashing randomly: the system just reboots.
Cooling and temps are good, power is proper (sine wave UPS), and the RAM is brand new ECC.
They all run the main OS off NVMe.
The BMC does not report any faults or error events.
The systems all run different workloads, generally fairly light, but the crashes do happen while under load.
I am unable to see any issues in the error logs, but maybe I am not looking in the right place.
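For anyone checking the same thing, here is a minimal sketch of the kind of log scan I mean. It assumes a Linux host and kernel log text as input, and the pattern list is my own guess at strings the kernel commonly prints for hardware errors and stuck CPUs, not an exhaustive or authoritative set. The function name `find_suspect_lines` is just illustrative.

```python
import re

# Assumed patterns: substrings the Linux kernel commonly prints for
# hardware errors and stuck CPUs. This list is a guess, not exhaustive.
PATTERNS = [
    re.compile(r"\[Hardware Error\]", re.IGNORECASE),
    re.compile(r"machine check", re.IGNORECASE),
    re.compile(r"soft lockup", re.IGNORECASE),
    re.compile(r"hard lockup", re.IGNORECASE),
    re.compile(r"rcu.*stall", re.IGNORECASE),
]


def find_suspect_lines(lines):
    """Return (line_number, text) pairs for lines matching any pattern.

    `lines` is any iterable of strings, e.g. an open log file or sys.stdin.
    Line numbers are 1-based.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(pattern.search(line) for pattern in PATTERNS):
            hits.append((number, line.rstrip("\n")))
    return hits
```

One way to use it is to pass `sys.stdin` to `find_suspect_lines` and pipe in the kernel log from the boot before the crash, for example from `journalctl -k -b -1`. That only works if the journal is persistent, and a hard reset may leave nothing logged at all.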
A few weeks ago I saw errors along the lines of "core XX not responding" in the CLI.
I am going to be reaching out to all the motherboard manufacturers. This happens on boards from ASRock, Gigabyte, and Supermicro, using EPYC 7301 and 7351P CPUs.
I will keep this thread updated with what I find. If you have any information about this issue, or can point me to a place that does, it will be appreciated.