My host has crashed completely twice today, each time while a guest was busy compiling and installing a number of software packages (that load may be unrelated to the actual problem).
Before this started happening, I ran Proxmox updates and rebooted the host. The updates included a move from pve-kernel-5.4.78-1-pve to pve-kernel-5.4.78-2-pve. After that, I tried to upgrade my MacPorts packages inside my Big Sur VM, but about 20 minutes in the host died.
Both times Proxmox logged the same general protection fault in ebtables-restore before the host died completely and stopped writing logs. Here's the first incident:
Code:
Dec 16 13:10:01 proxmox systemd[1]: Started Proxmox VE replication runner.
Dec 16 13:10:28 proxmox kernel: [ 2477.109806] general protection fault: 0000 [#1] SMP PTI
Dec 16 13:10:28 proxmox kernel: [ 2477.116235] CPU: 13 PID: 16588 Comm: ebtables-restor Tainted: P O 5.4.78-2-pve #1
Dec 16 13:10:28 proxmox kernel: [ 2477.141017] RIP: 0010:__kmalloc_node+0x19d/0x330
Dec 16 13:10:28 proxmox kernel: [ 2477.147086] Code: 75 0e 4d 89 f9 41 f6 47 0b 04 0f 84 ef fe ff ff 4c 89 ff e8 25 f8 01 00 49 89 c1 e9 df fe ff ff 41 8b 41 20 49 8b 39 4c 01 d0 <48> 8b 18 48 89 c1 49 33 99 70 01 00 00 4c 89 d0 48 0f c9 48 31 cb
Dec 16 13:10:28 proxmox kernel: [ 2477.171446] RAX: 5c691476e7e1ef27 RBX: 0000000000000000 RCX: 0000000000000000
Dec 16 13:10:28 proxmox kernel: [ 2477.183486] RBP: ffff9fc231d7fbd0 R08: ffff8bd35f970040 R09: ffff8bcb5f407b80
Dec 16 13:10:28 proxmox kernel: [ 2477.195246] R13: 0000000000000008 R14: 00000000ffffffff R15: ffff8bcb5f407b80
Dec 16 13:10:28 proxmox kernel: [ 2477.212439] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 16 13:10:28 proxmox kernel: [ 2477.223792] Call Trace:
Dec 16 13:10:28 proxmox kernel: [ 2477.234838] __vmalloc_node_range+0xd4/0x270
Dec 16 13:10:28 proxmox kernel: [ 2477.245659] ? translate_table+0x5a0/0x710 [ebtables]
Dec 16 13:10:28 proxmox kernel: [ 2477.256102] do_replace_finish+0x232/0x730 [ebtables]
Dec 16 13:10:28 proxmox kernel: [ 2477.266130] ? __vmalloc_node_range+0x1eb/0x270
Dec 16 13:10:28 proxmox kernel: [ 2477.275773] do_ebt_set_ctl+0x69/0x80 [ebtables]
Dec 16 13:10:28 proxmox kernel: [ 2477.285091] ip_setsockopt+0x66/0x90
Dec 16 13:10:28 proxmox kernel: [ 2477.293961] sock_common_setsockopt+0x1a/0x20
Dec 16 13:10:28 proxmox kernel: [ 2477.302542] __x64_sys_setsockopt+0x24/0x30
Dec 16 13:10:28 proxmox kernel: [ 2477.310705] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Dec 16 13:10:28 proxmox kernel: [ 2477.318667] Code: ff ff ff c3 48 8b 15 25 04 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b1 0f 1f 80 00 00 00 00 49 89 ca b8 36 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f6 03 0c 00 f7 d8 64 89 01 48
Dec 16 13:10:28 proxmox kernel: [ 2477.339310] RAX: ffffffffffffffda RBX: 0000000000000e38 RCX: 00007fbaec9a9a6a
Dec 16 13:10:28 proxmox kernel: [ 2477.347877] RBP: 0000562fe2455150 R08: 0000000000000e38 R09: 0000562fe24551d0
Dec 16 13:10:28 proxmox kernel: [ 2477.356114] R13: 00007fbaecaa9468 R14: 0000562fe2453750 R15: 0000562fe2455fd0
Dec 16 13:10:28 proxmox kernel: [ 2477.360137] hwmon_vid coretemp vfio_pci vfio_virqfd irqbypass vfio_iommu_type1 vfio ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zlua(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) hid_logitech_hidpp hid_logitech_dj usbmouse hid_generic usbkbd usbhid hid uas usb_storage raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear xhci_pci ahci isci ehci_pci xhci_hcd i2c_i801 libahci lpc_ich libsas e1000e ehci_hcd scsi_transport_sas wmi
Dec 16 13:10:28 proxmox kernel: [ 2477.615334] RIP: 0010:__kmalloc_node+0x19d/0x330
Dec 16 13:10:28 proxmox kernel: [ 2477.639440] RSP: 0018:ffff9fc231d7fb90 EFLAGS: 00010206
Dec 16 13:10:28 proxmox kernel: [ 2477.645396] RAX: 5c691476e7e1ef27 RBX: 0000000000000000 RCX: 0000000000000000
Dec 16 13:10:28 proxmox kernel: [ 2477.657557] RBP: ffff9fc231d7fbd0 R08: ffff8bd35f970040 R09: ffff8bcb5f407b80
Dec 16 13:10:28 proxmox kernel: [ 2477.669778] R13: 0000000000000008 R14: 00000000ffffffff R15: ffff8bcb5f407b80
Dec 16 13:10:28 proxmox kernel: [ 2477.687526] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 16 13:10:28 proxmox pve-firewall[6736]: status update error: ebtables_restore_cmdlist: got signal 11
Dec 16 13:18:04 proxmox systemd-modules-load[1411]: Inserted module 'vfio'
Dec 16 13:18:04 proxmox kernel: [ 0.000000] microcode: microcode updated early to revision 0x718, date = 2019-05-21
And the second:
Code:
Dec 16 15:37:01 proxmox systemd[1]: Started Proxmox VE replication runner.
Dec 16 15:37:15 proxmox kernel: [ 8404.719136] CPU: 20 PID: 9752 Comm: ebtables-restor Tainted: P O 5.4.78-2-pve #1
Dec 16 15:37:15 proxmox kernel: [ 8404.768147] RSP: 0018:ffffb65653833b90 EFLAGS: 00010202
Dec 16 15:37:15 proxmox kernel: [ 8404.786276] RBP: ffffb65653833bd0 R08: ffff94c7dfb30040 R09: ffff94c7df407b80
Dec 16 15:37:15 proxmox kernel: [ 8404.803820] FS: 00007fc0af2df740(0000) GS:ffff94c7dfb00000(0000) knlGS:0000000000000000
Dec 16 15:37:15 proxmox kernel: [ 8404.826467] Call Trace:
Dec 16 15:37:15 proxmox kernel: [ 8404.842788] vmalloc+0x4c/0x50
Dec 16 15:37:15 proxmox kernel: [ 8404.858314] do_replace_finish+0x232/0x730 [ebtables]
Dec 16 15:37:15 proxmox kernel: [ 8404.873081] do_replace+0x15f/0x1e0 [ebtables]
Dec 16 15:37:15 proxmox kernel: [ 8404.887084] ip_setsockopt+0x66/0x90
Dec 16 15:37:15 proxmox kernel: [ 8404.900174] __sys_setsockopt+0xcc/0x180
Dec 16 15:37:15 proxmox kernel: [ 8404.912539] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Dec 16 15:37:15 proxmox kernel: [ 8404.932768] RSP: 002b:00007ffff5545278 EFLAGS: 00000206 ORIG_RAX: 0000000000000036
Dec 16 15:37:15 proxmox kernel: [ 8404.949667] RBP: 00005654ba2a3150 R08: 0000000000000e38 R09: 00005654ba2a31d0
Dec 16 15:37:15 proxmox kernel: [ 8404.961913] Modules linked in: veth 8021q garp mrp ebtable_filter ebtables ip6table_raw ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables iptable_raw xt_mac xt_NFLOG xt_limit ipt_REJECT nf_reject_ipv4 xt_physdev xt_addrtype xt_multiport xt_conntrack xt_set xt_tcpudp xt_comment xt_mark ip_set_hash_net ip_set iptable_filter iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bpfilter softdog nfnetlink_log nfnetlink dm_crypt algif_skcipher af_alg intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd vhost_net btusb glue_helper vhost btrtl tap rapl btbcm drm_vram_helper btintel intel_cstate ttm pcspkr bluetooth drm_kms_helper vendor_reset(O) ipmi_ssif ecdh_generic joydev input_leds ecc hid_magicmouse drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt mei_me mei ioatdma dca mac_hid ipmi_si ipmi_devintf ipmi_msghandler nct6775 hwmon_vid
Dec 16 15:37:15 proxmox kernel: [ 8405.217041] RIP: 0010:__kmalloc_node+0x19d/0x330
Dec 16 15:37:15 proxmox kernel: [ 8405.240771] RSP: 0018:ffffb65653833b90 EFLAGS: 00010202
Dec 16 15:37:15 proxmox kernel: [ 8405.264982] R10: 3b053d0a7e892d04 R11: ffffb655c0000000 R12: 0000000000000dc0
Dec 16 15:37:15 proxmox kernel: [ 8405.277079] FS: 00007fc0af2df740(0000) GS:ffff94c7dfb00000(0000) knlGS:0000000000000000
Dec 16 15:37:15 proxmox kernel: [ 8405.288781] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 16 15:37:15 proxmox pve-firewall[7646]: status update error: ebtables_restore_cmdlist: got signal 11
Dec 16 15:37:17 proxmox pvedaemon[7702]: <root@pam> successful auth for user 'root@pam'
Dec 16 15:37:17 proxmox kernel: [ 8406.806612] kvm [30317]: vcpu29, guest rIP: 0xffffff800d05dd75 ignored rdmsr: 0x3f9
Dec 16 15:37:17 proxmox kernel: [ 8406.829684] kvm [30317]: vcpu29, guest rIP: 0xffffff800d05dd92 ignored rdmsr: 0x630
Dec 16 17:25:06 proxmox systemd-modules-load[1409]: Inserted module 'vfio'
Dec 16 17:25:06 proxmox kernel: [ 0.000000] microcode: microcode updated early to revision 0x718, date = 2019-05-21
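It's really interesting to me that the crash was identical both times: the fault is at __kmalloc_node+0x19d/0x330 in each log, and both call traces pass through do_replace_finish+0x232/0x730 [ebtables].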
pveversion: pve-manager/6.3-3/eee5f901 (running kernel: 5.4.78-2-pve)
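For anyone wanting to check the "identical both times" observation mechanically, here is a minimal Python sketch that pulls the faulting RIP symbol and the reliable call-trace frames out of two log excerpts and compares them. The function name `crash_signature`, the regular expressions, and the shortened `FIRST`/`SECOND` excerpts (a few lines copied from the two logs above) are my own choices for illustration, not part of any Proxmox or kernel tooling.

```python
import re

# Faulting instruction pointer, e.g. "RIP: 0010:__kmalloc_node+0x19d/0x330"
RIP_RE = re.compile(r"RIP: 0010:(\S+)")
# Call-trace frame after the "[ timestamp]" prefix; group 1 is the optional "? "
# marker the kernel puts on frames it is not sure about.
FRAME_RE = re.compile(r"\]\s+(\?\s+)?([A-Za-z_]\w*\+0x[0-9a-f]+/0x[0-9a-f]+)")

def crash_signature(log_text):
    """Return (set of RIP symbols, list of reliable frames) for a log excerpt."""
    rips = set(RIP_RE.findall(log_text))
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match and match.group(1) is None:  # skip "?" frames
            frames.append(match.group(2))
    return rips, frames

# Shortened excerpts taken from the two incidents in this post.
FIRST = """\
Dec 16 13:10:28 proxmox kernel: [ 2477.141017] RIP: 0010:__kmalloc_node+0x19d/0x330
Dec 16 13:10:28 proxmox kernel: [ 2477.234838] __vmalloc_node_range+0xd4/0x270
Dec 16 13:10:28 proxmox kernel: [ 2477.245659] ? translate_table+0x5a0/0x710 [ebtables]
Dec 16 13:10:28 proxmox kernel: [ 2477.256102] do_replace_finish+0x232/0x730 [ebtables]
"""
SECOND = """\
Dec 16 15:37:15 proxmox kernel: [ 8404.842788] vmalloc+0x4c/0x50
Dec 16 15:37:15 proxmox kernel: [ 8404.858314] do_replace_finish+0x232/0x730 [ebtables]
Dec 16 15:37:15 proxmox kernel: [ 8405.217041] RIP: 0010:__kmalloc_node+0x19d/0x330
"""

rips_a, frames_a = crash_signature(FIRST)
rips_b, frames_b = crash_signature(SECOND)
print("same RIP:", rips_a == rips_b, sorted(rips_a))
print("shared frames:", sorted(set(frames_a) & set(frames_b)))
```

To run it against full logs, paste each incident into its own string (or read it from a file) in place of the shortened excerpts.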