I'm getting these kernel dumps after running KVM guests on the server for a while.
I've replaced all the memory modules, and the motherboard has the latest BIOS. This is a new install, so I don't know whether this only happens on 4.4.35-2 or on previous versions, too.
Can this be a hardware error?
Code:
[ 7234.424652] NMI watchdog: Watchdog detected hard LOCKUP on cpu 6
[ 7234.424669] Modules linked in:
[ 7234.424690] ebt_ip binfmt_misc ebtable_filter ebtables nfsv3 ip_set ip6table_filter ip6_tables iptable_filter ip_tables x_tables softdog nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nfnetlink_log nfnetlink zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) spl(O) zavl(PO) dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c ipmi_ssif intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd snd_pcm snd_timer snd soundcore pcspkr joydev input_leds sb_edac edac_core i2c_i801 lpc_ich mei_me mei ioatdma ipmi_si ipmi_msghandler 8250_fintek
[ 7234.424737] shpchp wmi mac_hid vhost_net vhost macvtap macvlan autofs4 btrfs xor raid6_pq hid_generic ixgbe(O) vxlan ip6_udp_tunnel udp_tunnel usbkbd usbmouse usbhid ahci isci libahci hid libsas igb(O) scsi_transport_sas dca ptp pps_core megaraid_sas fjes
[ 7234.424758] CPU: 6 PID: 8900 Comm: kvm Tainted: P O 4.4.35-2-pve #1
[ 7234.424760] Hardware name: Supermicro X9DRH-7TF/7F/iTF/iF/X9DRH-7TF/7F/iTF/iF, BIOS 3.2a 06/18/2016
[ 7234.424762] 0000000000000086 000000004245fa45 ffff88207fc85b90 ffffffff813f9523
[ 7234.424765] 0000000000000000 0000000000000000 ffff88207fc85ba8 ffffffff8113bfbf
[ 7234.424768] ffff882038a88000 ffff88207fc85be0 ffffffff81184eb8 0000000000000001
[ 7234.424771] Call Trace:
[ 7234.424772] <NMI> [<ffffffff813f9523>] dump_stack+0x63/0x90
[ 7234.424787] [<ffffffff8113bfbf>] watchdog_overflow_callback+0xbf/0xd0
[ 7234.424791] [<ffffffff81184eb8>] __perf_event_overflow+0x88/0x1d0
[ 7234.424793] [<ffffffff81185a84>] perf_event_overflow+0x14/0x20
[ 7234.424797] [<ffffffff8100c6a1>] intel_pmu_handle_irq+0x1e1/0x490
[ 7234.424803] [<ffffffff811cee7c>] ? vunmap_page_range+0x20c/0x330
[ 7234.424806] [<ffffffff811cefb1>] ? unmap_kernel_range_noflush+0x11/0x20
[ 7234.424809] [<ffffffff814c6dbe>] ? ghes_copy_tofrom_phys+0x11e/0x2a0
[ 7234.424814] [<ffffffff8105a23b>] ? native_apic_msr_write+0x2b/0x30
[ 7234.424817] [<ffffffff8105a08d>] ? x2apic_send_IPI_self+0x1d/0x20
[ 7234.424821] [<ffffffff810058dd>] perf_event_nmi_handler+0x2d/0x50
[ 7234.424825] [<ffffffff810325d6>] nmi_handle+0x66/0x120
[ 7234.424827] [<ffffffff81032b40>] default_do_nmi+0x40/0x100
[ 7234.424830] [<ffffffff81032ce2>] do_nmi+0xe2/0x130
[ 7234.424834] [<ffffffff8185e751>] end_repeat_nmi+0x1a/0x1e
[ 7234.424838] [<ffffffff814067e5>] ? delay_tsc+0x25/0x50
[ 7234.424841] [<ffffffff814067e5>] ? delay_tsc+0x25/0x50
[ 7234.424843] [<ffffffff814067e5>] ? delay_tsc+0x25/0x50
[ 7234.424845] <<EOE>> [<ffffffff814066ff>] __delay+0xf/0x20
[ 7234.424877] [<ffffffffc056eb2b>] wait_lapic_expire+0x12b/0x130 [kvm]
[ 7234.424892] [<ffffffffc0552a28>] kvm_arch_vcpu_ioctl_run+0x608/0x1460 [kvm]
[ 7234.424906] [<ffffffffc054ca0a>] ? kvm_arch_vcpu_load+0x5a/0x220 [kvm]
[ 7234.424918] [<ffffffffc0539eca>] kvm_vcpu_ioctl+0x31a/0x5e0 [kvm]
[ 7234.424923] [<ffffffff81222d02>] do_vfs_ioctl+0x2d2/0x4b0
[ 7234.424926] [<ffffffff8118ae6b>] ? fire_user_return_notifiers+0x3b/0x50
[ 7234.424930] [<ffffffff81003360>] ? exit_to_usermode_loop+0xb0/0xd0
[ 7234.424932] [<ffffffff81222f59>] SyS_ioctl+0x79/0x90
[ 7234.424934] [<ffffffff81003c38>] ? syscall_return_slowpath+0x98/0x110
[ 7234.424937] [<ffffffff8185c276>] entry_SYSCALL_64_fastpath+0x16/0x75
Code:
~# pveversion -v
proxmox-ve: 4.4-79 (running kernel: 4.4.35-2-pve)
pve-manager: 4.4-12 (running version: 4.4-12/e71b7a74)
pve-kernel-4.4.35-1-pve: 4.4.35-77
pve-kernel-4.4.35-2-pve: 4.4.35-79
lvm2: 2.02.116-pve3
corosync-pve: 2.4.0-1
libqb0: 1.0-1
pve-cluster: 4.0-48
qemu-server: 4.0-108
pve-firmware: 1.1-10
libpve-common-perl: 4.0-91
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-73
pve-libspice-server1: 0.12.8-1
vncterm: 1.2-1
pve-docs: 4.4-3
pve-qemu-kvm: 2.7.1-1
pve-container: 1.0-93
pve-firewall: 2.0-33
pve-ha-manager: 1.0-40
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-1
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-8
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.8-pve14~bpo80