Found my NUC running Proxmox in an unresponsive state today (first time ever, after 2 weeks of use).
On reboot I see these errors:
mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 6: xxx
mce: [Hardware Error]: TSC 0 ADDR fef1ce80 MISC xxx
mce: [Hardware Error]: PROCESSOR 0:a0660 TIME xxx SOCKET 0
APIC 0 microcode ca
(see attached pic - https://i.imgur.com/LYsQyyN.png)
The box booted and seems normal so far, but I see those errors on every boot message log.
A quick memory test did not show any problems.
rasdaemon -f and journalctl -f show no obvious problems.
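Since journalctl -f only follows new messages, the boot-time lines may be easier to find by filtering the kernel log of the current boot. A minimal sketch, assuming a systemd journal is available; the helper name `mce_lines` is made up for illustration:

```shell
# Hypothetical helper: keep only the kernel machine-check lines from stdin.
mce_lines() {
    grep -E 'mce: \[Hardware Error\]'
}

# Usage (not run here): kernel messages from the current boot only.
#   journalctl -k -b | mce_lines
```

If this prints nothing, the messages may have been logged during an earlier boot; `journalctl -k -b -1` selects the previous one.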
==========================
root@pve:~# numactl --hardware
available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11
node 0 size: 64036 MB
node 0 free: 42377 MB
node distances:
node 0
0: 10
root@pve:~# ras-mc-ctl --errors
No Memory errors.
No PCIe AER errors.
No Extlog errors.
No MCE errors.
===========================
root@pve:~# ras-mc-ctl --errors
No Memory errors.
No PCIe AER errors.
No Extlog errors.
No MCE errors.
root@pve:~# numactl --hardware
available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11
node 0 size: 64036 MB
node 0 free: 42345 MB
node distances:
node 0
0: 10
================
I run an Intel NUC 7 BXNUC10i7FNH.
Here is my CPU info: https://pastebin.com/MpXedi1h
Has anybody had experience with such errors? Bad RAM, motherboard?
Can it be benign?
Thx in advance!