Hello everyone.
I have a small cluster of two HP mini PCs running v7.3.
Both have the latest Debian updates and the latest BIOS firmware.
One of the devices, a ProDesk with an i5-9500T CPU, has been behaving strangely, and this only started recently.
It's difficult for me to explain the problem...
If I leave it without video or keyboard, just power and LAN, after a random period of time it loses network (the LEDs on the LAN card turn off) and it gets really hot (I assume from the CPU). After a while it reboots, I get access via SSH for a few seconds, and then the cycle repeats.
I took it out of the "rack" and brought it to my desk, with a monitor and keyboard attached, and this doesn't happen. It worked with no interruption for 24h. If I unplug the monitor and keyboard, it does the same thing it did in the rack.
I also tried just turning off the monitor, and it worked fine until I turned the monitor back on. When I turned it on, there was no output; I pressed Enter on the keyboard, and the system rebooted.
I can't find anything wrong in any log file. Everything looks okay: no error messages, nothing.
The logs just stop at the moment of the reboot...
This machine worked with no issue for almost 2 years, and besides the regular updates, nothing changed.
Also strange, and not sure if it's related: the web UI is just a white screen on this machine. I can control it from the other one in the cluster, though. I guess this is a separate issue.
Can you help me troubleshoot the situation?
Where do I start?