I got a new server and ran into issues while loading and adding new VMs.
At first it froze, but after upgrading Proxmox it now restarts automatically instead, and 'Start all VMs and Containers' shows up in the task list.
The provider tested the hardware and didn't find any problems, but I can see mce: [Hardware Error] in the logs below.
Is this a CPU hardware error or something else?
Thanks in advance.
syslog
Code:
Jul 23 15:37:51 E2S rasdaemon[712]: cpu 01:rasdaemon: mce_record store: 0x55d31a9a7518
Jul 23 15:37:51 E2S kernel: [44481.960270] mce: [Hardware Error]: Machine check events logged
Jul 23 15:37:51 E2S rasdaemon[712]: rasdaemon: register inserted at db
Code:
ras-mc-ctl --summary
No Memory errors.
No PCIe AER errors.
No Extlog errors.
MCE records summary:
2 Instruction CACHE Level-0 Instruction-Fetch Error errors
1 Internal parity error errors
Code:
ras-mc-ctl --errors
No Memory errors.
No PCIe AER errors.
No Extlog errors.
MCE events:
1 2020-07-19 03:43:06 +0200 error: Instruction CACHE Level-0 Instruction-Fetch Error, mcg mcgstatus=0, mci Corrected_error Error_enabled, mcgcap=0x00000c0e, status=0x9400004000040150, addr=0x1ffff9c8e93c0, tsc=0x199d94e3f312c, walltime=0x5f13a52a, cpu=0x00000001, cpuid=0x000906ec, apicid=0x00000002
2 2020-07-19 03:55:10 +0200 error: Internal parity error, mcg mcgstatus=0, mci Corrected_error Error_enabled, mcgcap=0x00000c0e, status=0x9000004000010005, tsc=0x19c37efad6712, walltime=0x5f13a7fe, cpu=0x00000001, cpuid=0x000906ec, apicid=0x00000002
3 2020-07-23 15:37:51 +0200 error: Instruction CACHE Level-0 Instruction-Fetch Error, mcg mcgstatus=0, mci Corrected_error Error_enabled, mcgcap=0x00000c0e, status=0x9400004000040150, addr=0x974d56e7, tsc=0x91c13254a62a, walltime=0x5f1992af, cpu=0x00000001, cpuid=0x000906ec, apicid=0x00000002
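For anyone cross-checking these records, here is a small sketch of how the `cpuid` and `walltime` fields can be decoded. It assumes `cpuid` is the CPUID leaf-1 EAX signature and `walltime` is a Unix timestamp in hex; the function names are my own, not part of rasdaemon.

```python
from datetime import datetime, timezone


def decode_cpuid_signature(sig: int) -> dict:
    """Split a CPUID leaf-1 EAX signature into family, model and stepping."""
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF

    # The extended family is only added when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model is only used for base family 0x6 or 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return {"family": family, "model": model, "stepping": stepping}


def decode_walltime(hex_seconds: str) -> datetime:
    """Interpret a hex walltime field as seconds since the Unix epoch (UTC)."""
    return datetime.fromtimestamp(int(hex_seconds, 16), tz=timezone.utc)


if __name__ == "__main__":
    sig = decode_cpuid_signature(0x000906EC)
    print({name: hex(value) for name, value in sig.items()})
    print(decode_walltime("0x5f13a52a").isoformat())
```

With the values from the first record this gives family 0x6, model 0x9e, stepping 0xc, and 01:43:06 UTC on 2020-07-19, which lines up with the 03:43:06 +0200 timestamp shown by ras-mc-ctl.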
Code:
pveversion -v
proxmox-ve: 6.2-1 (running kernel: 5.4.44-2-pve)
pve-manager: 6.2-10 (running version: 6.2-10/a20769ed)
pve-kernel-5.4: 6.2-4
pve-kernel-helper: 6.2-4
pve-kernel-5.4.44-2-pve: 5.4.44-2
pve-kernel-5.4.41-1-pve: 5.4.41-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.4
libpve-access-control: 6.1-2
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.1-5
libpve-guest-common-perl: 3.1-1
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.2-5
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.2-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-9
pve-cluster: 6.1-8
pve-container: 3.1-11
pve-docs: 6.2-5
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-2
pve-firmware: 3.1-1
pve-ha-manager: 3.0-9
pve-i18n: 2.1-3
pve-qemu-kvm: 5.0.0-11
pve-xtermjs: 4.3.0-1
qemu-server: 6.2-10
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.4-pve1
I also just found this in the logs:
Code:
Jul 23 16:30:10 E2S kernel: smpboot: CPU0: Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz (family: 0x6, model: 0x9e, stepping: 0xc)
Jul 23 16:30:10 E2S kernel: mce: [Hardware Error]: Machine check events logged
Jul 23 16:30:10 E2S kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: be00000000800400
Jul 23 16:30:10 E2S kernel: mce: [Hardware Error]: TSC 0 ADDR 63de0dd1 MISC 63de0dd1
Jul 23 16:30:10 E2S kernel: mce: [Hardware Error]: PROCESSOR 0:906ec TIME 1595514604 SOCKET 0 APIC 0 microcode d6
...
Jul 23 16:30:10 E2S kernel: .... node #0, CPUs: #1
Jul 23 16:30:10 E2S kernel: mce: [Hardware Error]: Machine check events logged
Jul 23 16:30:10 E2S kernel: mce: [Hardware Error]: CPU 1: Machine Check: 0 Bank 3: be00000000800400
Jul 23 16:30:10 E2S kernel: mce: [Hardware Error]: TSC 0 ADDR 63de0dd1 MISC 63de0dd1
Jul 23 16:30:10 E2S kernel: mce: [Hardware Error]: PROCESSOR 0:906ec TIME 1595514604 SOCKET 0 APIC 2 microcode d6
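To compare the bank status values in this post, a minimal sketch that splits a 64-bit machine check status word into its architectural flag bits. The bit positions follow the IA32_MCi_STATUS layout as I understand it from the Intel SDM; the function name and dictionary keys are hypothetical, and this is not a replacement for a full decoder such as rasdaemon or mcelog.

```python
def decode_mci_status(status: int) -> dict:
    """Extract the architectural flags and error codes from an MCi_STATUS value."""

    def bit(position: int) -> bool:
        return bool((status >> position) & 1)

    return {
        "valid": bit(63),            # VAL: the register holds a logged error
        "overflow": bit(62),         # OVER: an earlier error was overwritten
        "uncorrected": bit(61),      # UC: hardware did not correct the error
        "enabled": bit(60),          # EN: reporting was enabled for this error
        "misc_valid": bit(59),       # MISCV: the MISC register is valid
        "addr_valid": bit(58),       # ADDRV: the ADDR register is valid
        "context_corrupt": bit(57),  # PCC: processor context may be corrupted
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_code": status & 0xFFFF,
    }


if __name__ == "__main__":
    # Status values copied from the ras-mc-ctl records and the boot-time log.
    for value in (0x9400004000040150, 0x9000004000010005, 0xBE00000000800400):
        decoded = decode_mci_status(value)
        flags = [name for name, flag in decoded.items() if flag is True]
        print(f"{value:#018x} mca_code={decoded['mca_code']:#06x} flags={flags}")
```

Decoded this way, the two status values from the rasdaemon records have UC clear, which agrees with the Corrected_error label, while the boot-time value be00000000800400 has both UC and PCC set.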