With the latest Proxmox 5.3-1 I am running into a kernel bug that halts all KVM guests. The host is an Intel NUC 8 (NUC8i7BEH). Below are the syslog excerpt and the pveversion output.
Code:
Mar 26 00:21:09 pmox2 kernel: [630425.268008] BUG: unable to handle kernel paging request at ffffffffc17edb60
Mar 26 00:21:09 pmox2 kernel: [630425.268452] IP: vmx_handle_exit+0x200/0x1560 [kvm_intel]
Mar 26 00:21:09 pmox2 kernel: [630425.268882] PGD 4de60e067 P4D 4de60e067 PUD 4de610067 PMD 854fdd067 PTE 854d73061
Mar 26 00:21:09 pmox2 kernel: [630425.269319] Oops: 0003 [#3] SMP PTI
Mar 26 00:21:09 pmox2 kernel: [630425.269757] Modules linked in: tcp_diag inet_diag nfsv3 nfs_acl nfs lockd grace fscache ip_set ip6table_filter ip6_tables iptable_filter softdog nfnetlink_log nfnetlink dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c snd_hda_codec_hdmi intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_hda_codec_realtek snd_hda_codec_generic kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc zfs(PO) aesni_intel zunicode(PO) aes_x86_64 crypto_simd zavl(PO) glue_helper cryptd icp(PO) intel_cstate arc4 wmi_bmof intel_wmi_thunderbolt snd_soc_skl snd_soc_skl_ipc snd_hda_ext_core iwlmvm snd_soc_sst_dsp snd_soc_sst_ipc snd_soc_acpi mac80211 snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine ir_rc6_decoder snd_hda_intel iwlwifi btusb i915 btrtl btbcm snd_hda_codec
Mar 26 00:21:09 pmox2 kernel: [630425.272339] btintel snd_hda_core pcspkr intel_rapl_perf rtsx_pci_ms drm_kms_helper bluetooth memstick snd_hwdep drm ecdh_generic i2c_algo_bit snd_pcm cfg80211 fb_sys_fops syscopyarea snd_timer sysfillrect snd sysimgblt soundcore mei_me intel_pch_thermal mei shpchp wmi rc_rc6_mce ir_lirc_codec lirc_dev ite_cir rc_core video mac_hid acpi_pad zcommon(PO) znvpair(PO) spl(O) vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq rtsx_pci_sdmmc e1000e(O) ptp pps_core i2c_i801 rtsx_pci ahci libahci
Mar 26 00:21:09 pmox2 kernel: [630425.275059] CPU: 6 PID: 15012 Comm: kvm Tainted: P D O 4.15.18-12-pve #1
Mar 26 00:21:09 pmox2 kernel: [630425.276014] Hardware name: Intel(R) Client Systems NUC8i7BEH/NUC8BEB, BIOS BECFL357.86A.0041.2018.0719.1931 07/19/2018
Mar 26 00:21:09 pmox2 kernel: [630425.277001] RIP: 0010:vmx_handle_exit+0x200/0x1560 [kvm_intel]
Mar 26 00:21:09 pmox2 kernel: [630425.277998] RSP: 0018:ffffbf2f838cbd18 EFLAGS: 00010286
Mar 26 00:21:09 pmox2 kernel: [630425.278996] RAX: ffffffffc17edb60 RBX: ffff99be75f28000 RCX: 00000000000000e0
Mar 26 00:21:09 pmox2 kernel: [630425.280000] RDX: 00000000000000ef RSI: 00000000000000ef RDI: ffff99be75f28000
Mar 26 00:21:09 pmox2 kernel: [630425.281007] RBP: ffffbf2f838cbdc0 R08: 0000000000000000 R09: ffff99be90890000
Mar 26 00:21:09 pmox2 kernel: [630425.282020] R10: 0000000000000001 R11: 0000000000000000 R12: 000613033f431951
Mar 26 00:21:09 pmox2 kernel: [630425.283042] R13: 0000000000000000 R14: ffff99be678152b0 R15: ffff99be75f2c280
Mar 26 00:21:09 pmox2 kernel: [630425.284071] FS: 00007f98a3bff700(0000) GS:ffff99c43dd80000(0000) knlGS:0000000000000000
Mar 26 00:21:09 pmox2 kernel: [630425.285105] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 26 00:21:09 pmox2 kernel: [630425.286141] CR2: ffffffffc17edb60 CR3: 0000000547382004 CR4: 00000000003626e0
Mar 26 00:21:09 pmox2 kernel: [630425.287192] Call Trace:
Mar 26 00:21:09 pmox2 kernel: [630425.288257] ? kvm_arch_vcpu_ioctl_run+0x95c/0x16d0 [kvm]
Mar 26 00:21:09 pmox2 kernel: [630425.289332] kvm_vcpu_ioctl+0x339/0x620 [kvm]
Mar 26 00:21:09 pmox2 kernel: [630425.290400] ? kvm_vcpu_ioctl+0x339/0x620 [kvm]
Mar 26 00:21:09 pmox2 kernel: [630425.291464] do_vfs_ioctl+0xa6/0x620
Mar 26 00:21:09 pmox2 kernel: [630425.292561] ? kvm_on_user_return+0x70/0xa0 [kvm]
Mar 26 00:21:09 pmox2 kernel: [630425.293661] SyS_ioctl+0x79/0x90
Mar 26 00:21:09 pmox2 kernel: [630425.294763] ? exit_to_usermode_loop+0xa5/0xd0
Mar 26 00:21:09 pmox2 kernel: [630425.295864] do_syscall_64+0x73/0x130
Mar 26 00:21:09 pmox2 kernel: [630425.296955] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Mar 26 00:21:09 pmox2 kernel: [630425.298058] RIP: 0033:0x7f98b3ebe017
Mar 26 00:21:09 pmox2 kernel: [630425.299163] RSP: 002b:00007f98a3bfc538 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Mar 26 00:21:09 pmox2 kernel: [630425.300299] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f98b3ebe017
Mar 26 00:21:09 pmox2 kernel: [630425.301425] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000018
Mar 26 00:21:09 pmox2 kernel: [630425.302561] RBP: 0000000000000000 R08: 00007f98a64c3ac0 R09: 000000000000ffff
Mar 26 00:21:09 pmox2 kernel: [630425.303684] R10: 00007f98cc8dc000 R11: 0000000000000246 R12: 00007f98a6662000
Mar 26 00:21:09 pmox2 kernel: [630425.304819] R13: 00007f98cc8db000 R14: 0000000000000000 R15: 00007f98a6662000
Mar 26 00:21:09 pmox2 kernel: [630425.305956] Code: 00 b8 ff 01 00 00 0f 79 d0 0f 96 c0 84 c0 0f 84 5e fe ff ff be ff 01 00 00 bf 12 08 00 00 e8 f8 1f ff ff e9 4a fe ff ff ba 04 44 <00> 00 0f 78 d0 41 80 be ea 59 00 00 00 48 89 45 b0 0f 85 4c fe
Mar 26 00:21:09 pmox2 kernel: [630425.307187] RIP: vmx_handle_exit+0x200/0x1560 [kvm_intel] RSP: ffffbf2f838cbd18
Mar 26 00:21:09 pmox2 kernel: [630425.308414] CR2: ffffffffc17edb60
Mar 26 00:21:09 pmox2 kernel: [630425.309641] ---[ end trace 205d4ecee86f6aff ]---
pveversion -v
Code:
proxmox-ve: 5.3-1 (running kernel: 4.15.18-12-pve)
pve-manager: 5.3-11 (running version: 5.3-11/d4907f84)
pve-kernel-4.15: 5.3-3
pve-kernel-4.15.18-12-pve: 4.15.18-35
pve-kernel-4.15.18-10-pve: 4.15.18-32
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-3
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-47
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-12
libpve-storage-perl: 5.0-39
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-3
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
proxmox-widget-toolkit: 1.0-23
pve-cluster: 5.0-33
pve-container: 2.0-35
pve-docs: 5.3-3
pve-edk2-firmware: 1.20181023-1
pve-firewall: 3.0-18
pve-firmware: 2.0-6
pve-ha-manager: 2.0-8
pve-i18n: 1.0-9
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 2.12.1-2
pve-xtermjs: 3.10.1-2
qemu-server: 5.0-47
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo1
Once this bug occurs I can still SSH into the Proxmox host, but all KVM guests are unresponsive. After I reboot the host, all guests are back to normal until the bug resurfaces.
Any suggestions on how to resolve this issue?
Thank you in advance,