Hi,
I've upgraded (dist-upgrade) to PVE 8.2.4 today that comes with kernel 6.8.8-2-pve. After a reboot that I always perform after PVE dist-upgrade, PVE didn't come back up. I hooked it up to my TV (HDMI) and noticed that the only messages visible were:
Code:
Booting Proxmox VE GNU/Linux
Loading Linux 6.8.8-2-pve ...
Loading initial ramdisk ...
Found volume group "pve" using metadata type lvm 32 logical volume(s) in volume group "pve" now active
/dev/mapper/pve-root: recovering journal
/dev/mapper/pve-root: clean, 112422/3932160 files, 8756405/15728640
after which the screen turned black. Not a good sign. I booted the previous kernel (6.5.13-5-pve) via advanced boot options and have now pinned that kernel using proxmox-boot-tool kernel pin 6.5.13-5-pve. The system is functional again, but it may become unstable again with the next dist-upgrade.
I exported the logs using journalctl and noticed the following kernel panics while booting the 6.8.8-2-pve kernel:
Code:
Jun 29 14:49:14 pve pve-guests[1784]: <root@pam> starting task UPID:pve:000006F9:00000699:668002CA:qmstart:200:root@pam:
Jun 29 14:49:14 pve pve-guests[1785]: start VM 200: UPID:pve:000006F9:00000699:668002CA:qmstart:200:root@pam:
Jun 29 14:49:14 pve kernel: Console: switching to colour dummy device 80x25
Jun 29 14:49:14 pve kernel: i915 0000:00:02.0: MDEV: Unregistering
Jun 29 14:49:16 pve kernel: platform INT3515:01: deferred probe pending: Serial bus multi instantiate pseudo device driver: Error creating i2c-client, idx 0
Jun 29 14:49:18 pve chronyd[1527]: Selected source 185.51.192.62 (2.debian.pool.ntp.org)
Jun 29 14:49:18 pve chronyd[1527]: System clock TAI offset set to 37 seconds
Jun 29 14:49:22 pve kernel: BUG: kernel NULL pointer dereference, address: 0000000000000038
Jun 29 14:49:22 pve kernel: #PF: supervisor read access in kernel mode
Jun 29 14:49:22 pve kernel: #PF: error_code(0x0000) - not-present page
Jun 29 14:49:22 pve kernel: PGD 0 P4D 0
Jun 29 14:49:22 pve kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Jun 29 14:49:22 pve kernel: CPU: 3 PID: 1806 Comm: kvm Tainted: P O 6.8.8-2-pve #1
Jun 29 14:49:22 pve kernel: Hardware name: Gigabyte Technology Co., Ltd. Z490M/Z490M, BIOS F2 03/26/2020
Jun 29 14:49:22 pve kernel: RIP: 0010:__memcg_slab_post_alloc_hook+0x9e/0x230
Jun 29 14:49:22 pve kernel: Code: 03 05 4e eb 69 01 48 8b 50 08 49 89 c6 f6 c2 01 0f 85 75 01 00 00 0f 1f 44 00 00 49 8b 06 f6 c4 08 b8 00 00 00 00 4c 0f 44 f0 <49> 8b 46 38 48 83 f8 03 77 20 8b 55 c4 31 c9 4c 89 fe 4c 89 f7 e8
Jun 29 14:49:22 pve kernel: RSP: 0018:ffffad3a010038f8 EFLAGS: 00010246
Jun 29 14:49:22 pve kernel: RAX: 0000000000000000 RBX: ffff8aee180ea780 RCX: 0000000000000001
Jun 29 14:49:22 pve kernel: RDX: dead000000000100 RSI: ffff8aee03eec240 RDI: 0000000000000cc0
Jun 29 14:49:22 pve kernel: RBP: ffffad3a01003940 R08: ffffad3a01003970 R09: 0000000000000000
Jun 29 14:49:22 pve kernel: R10: ffff8aee180ea780 R11: 0000000000000000 R12: ffff8aee03eec240
Jun 29 14:49:22 pve kernel: R13: 0000000000000000 R14: 0000000000000000 R15: ffff8aee001fd500
Jun 29 14:49:22 pve kernel: FS: 00007be436c9f300(0000) GS:ffff8af59f180000(0000) knlGS:0000000000000000
Jun 29 14:49:22 pve kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 29 14:49:22 pve kernel: CR2: 0000000000000038 CR3: 000000013c1e2002 CR4: 00000000003726f0
Jun 29 14:49:22 pve kernel: Call Trace:
Jun 29 14:49:22 pve kernel: <TASK>
Jun 29 14:49:22 pve kernel: ? show_regs+0x6d/0x80
Jun 29 14:49:22 pve kernel: ? __die+0x24/0x80
Jun 29 14:49:22 pve kernel: ? page_fault_oops+0x176/0x500
Jun 29 14:49:22 pve kernel: ? do_user_addr_fault+0x2f9/0x6b0
Jun 29 14:49:22 pve kernel: ? exc_page_fault+0x83/0x1b0
Jun 29 14:49:22 pve kernel: ? asm_exc_page_fault+0x27/0x30
Jun 29 14:49:22 pve kernel: ? __memcg_slab_post_alloc_hook+0x9e/0x230
Jun 29 14:49:22 pve kernel: kmem_cache_alloc_lru+0x3b9/0x420
Jun 29 14:49:22 pve kernel: ? __d_alloc+0x34/0x250
Jun 29 14:49:22 pve kernel: __d_alloc+0x34/0x250
Jun 29 14:49:22 pve kernel: d_alloc+0x1a/0x90
Jun 29 14:49:22 pve kernel: d_alloc_parallel+0x5a/0x3e0
Jun 29 14:49:22 pve kernel: ? generic_permission+0x39/0x240
Jun 29 14:49:22 pve kernel: __lookup_slow+0x5c/0x130
Jun 29 14:49:22 pve kernel: lookup_one_len+0xa3/0xb0
Jun 29 14:49:22 pve kernel: start_creating.part.0+0x89/0x1a0
Jun 29 14:49:22 pve kernel: __debugfs_create_file+0x97/0x230
Jun 29 14:49:22 pve kernel: debugfs_create_file+0x29/0x40
Jun 29 14:49:22 pve kernel: kvm_dev_ioctl+0x7e0/0xa20 [kvm]
Jun 29 14:49:22 pve kernel: __x64_sys_ioctl+0xa0/0xf0
Jun 29 14:49:22 pve kernel: x64_sys_call+0xa68/0x24b0
Jun 29 14:49:22 pve kernel: do_syscall_64+0x81/0x170
Jun 29 14:49:22 pve kernel: ? __x64_sys_ioctl+0xbb/0xf0
Jun 29 14:49:22 pve kernel: ? kvm_vm_ioctl_check_extension_generic+0x54/0x220 [kvm]
Jun 29 14:49:22 pve kernel: ? kvm_dev_ioctl+0x31a/0xa20 [kvm]
Jun 29 14:49:22 pve kernel: ? do_syscall_64+0x8d/0x170
Jun 29 14:49:22 pve kernel: ? __x64_sys_ioctl+0xbb/0xf0
Jun 29 14:49:22 pve kernel: ? syscall_exit_to_user_mode+0x89/0x260
Jun 29 14:49:22 pve kernel: ? do_syscall_64+0x8d/0x170
Jun 29 14:49:22 pve kernel: ? syscall_exit_to_user_mode+0x89/0x260
Jun 29 14:49:22 pve kernel: ? do_syscall_64+0x8d/0x170
Jun 29 14:49:22 pve kernel: ? __count_memcg_events+0x6f/0xe0
Jun 29 14:49:22 pve kernel: ? count_memcg_events.constprop.0+0x2a/0x50
Jun 29 14:49:22 pve kernel: ? handle_mm_fault+0xad/0x380
Jun 29 14:49:22 pve kernel: ? do_user_addr_fault+0x343/0x6b0
Jun 29 14:49:22 pve kernel: ? irqentry_exit_to_user_mode+0x7e/0x260
Jun 29 14:49:22 pve kernel: ? irqentry_exit+0x43/0x50
Jun 29 14:49:22 pve kernel: ? clear_bhb_loop+0x15/0x70
Jun 29 14:49:22 pve kernel: ? clear_bhb_loop+0x15/0x70
Jun 29 14:49:22 pve kernel: ? clear_bhb_loop+0x15/0x70
Jun 29 14:49:22 pve kernel: entry_SYSCALL_64_after_hwframe+0x78/0x80
Jun 29 14:49:22 pve kernel: RIP: 0033:0x7be44198cc5b
Jun 29 14:49:22 pve kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b 04 25 28 00 00
Jun 29 14:49:22 pve kernel: RSP: 002b:00007fff2684a940 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Jun 29 14:49:22 pve kernel: RAX: ffffffffffffffda RBX: 000000000000ae01 RCX: 00007be44198cc5b
Jun 29 14:49:22 pve kernel: RDX: 0000000000000000 RSI: 000000000000ae01 RDI: 000000000000000b
Jun 29 14:49:22 pve kernel: RBP: 0000000000000000 R08: 00007be441a61c68 R09: 00000006194b09f2
Jun 29 14:49:22 pve kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00006194b0777790
Jun 29 14:49:22 pve kernel: R13: 00006194aebef54a R14: 0000000000000000 R15: 00006194aebed31f
Jun 29 14:49:22 pve kernel: </TASK>
Jun 29 14:49:22 pve kernel: Modules linked in: ebtable_filter ebtables ip6table_raw ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables iptable_raw xt_NFLOG xt_limit xt_mac ipt_REJECT nf_reject_ipv4 xt_set xt_physdev xt_addrtype xt_multiport xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_comment xt_tcpudp xt_mark iptable_filter ip_set_hash_net ip_set scsi_transport_iscsi nf_tables bluetooth ecdh_generic ecc msr softdog binfmt_misc bonding tls nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common intel_tcc_cooling vhost_net vhost x86_pkg_temp_thermal vhost_iotlb tap intel_powerclamp kvmgt coretemp mdev kvm_intel crct10dif_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 mei_pxp mei_hdcp aesni_intel crypto_simd zfs(PO) cryptd cmdlinepart intel_pmc_core rapl spi_nor intel_vsec joydev mei_me input_leds pmt_telemetry intel_cstate pcspkr gigabyte_wmi wmi_bmof intel_wmi_thunderbolt mtd ee1004 mei intel_pch_thermal serial_multi_instantiate
Jun 29 14:49:22 pve kernel: spl(O) pmt_class acpi_tad acpi_pad mac_hid i915 drm_buddy ttm drm_display_helper cec rc_core i2c_algo_bit kvm nfsd auth_rpcgss vfio_pci nfs_acl vfio_pci_core lockd irqbypass grace vfio_iommu_type1 vfio iommufd sunrpc efi_pstore dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq hid_generic usbkbd usbhid hid dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c xhci_pci xhci_pci_renesas i2c_i801 spi_intel_pci crc32_pclmul e1000e xhci_hcd ahci spi_intel i2c_smbus libahci video wmi pinctrl_cannonlake
Jun 29 14:49:22 pve kernel: CR2: 0000000000000038
Jun 29 14:49:22 pve kernel: ---[ end trace 0000000000000000 ]---
Jun 29 14:49:22 pve kernel: RIP: 0010:__memcg_slab_post_alloc_hook+0x9e/0x230
Jun 29 14:49:22 pve kernel: Code: 03 05 4e eb 69 01 48 8b 50 08 49 89 c6 f6 c2 01 0f 85 75 01 00 00 0f 1f 44 00 00 49 8b 06 f6 c4 08 b8 00 00 00 00 4c 0f 44 f0 <49> 8b 46 38 48 83 f8 03 77 20 8b 55 c4 31 c9 4c 89 fe 4c 89 f7 e8
Jun 29 14:49:22 pve kernel: RSP: 0018:ffffad3a010038f8 EFLAGS: 00010246
Jun 29 14:49:22 pve kernel: RAX: 0000000000000000 RBX: ffff8aee180ea780 RCX: 0000000000000001
Jun 29 14:49:22 pve kernel: RDX: dead000000000100 RSI: ffff8aee03eec240 RDI: 0000000000000cc0
Jun 29 14:49:22 pve kernel: RBP: ffffad3a01003940 R08: ffffad3a01003970 R09: 0000000000000000
Jun 29 14:49:22 pve kernel: R10: ffff8aee180ea780 R11: 0000000000000000 R12: ffff8aee03eec240
Jun 29 14:49:22 pve kernel: R13: 0000000000000000 R14: 0000000000000000 R15: ffff8aee001fd500
Jun 29 14:49:22 pve kernel: FS: 00007be436c9f300(0000) GS:ffff8af59f180000(0000) knlGS:0000000000000000
Jun 29 14:49:22 pve kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 29 14:49:22 pve kernel: CR2: 0000000000000038 CR3: 000000013c1e2002 CR4: 00000000003726f0
Jun 29 14:49:22 pve kernel: note: kvm[1806] exited with irqs disabled
Jun 29 14:49:35 pve systemd[1]: systemd-fsckd.service: Deactivated successfully.
Jun 29 14:49:35 pve kernel: general protection fault, probably for non-canonical address 0x723d61479d627007: 0000 [#2] PREEMPT SMP NOPTI
Jun 29 14:49:35 pve kernel: CPU: 3 PID: 1861 Comm: ip6tables-save Tainted: P D O 6.8.8-2-pve #1
Jun 29 14:49:35 pve kernel: Hardware name: Gigabyte Technology Co., Ltd. Z490M/Z490M, BIOS F2 03/26/2020
Jun 29 14:49:35 pve kernel: RIP: 0010:kmem_cache_alloc_lru+0xd9/0x420
Jun 29 14:49:35 pve kernel: Code: 50 08 48 83 78 10 00 48 8b 38 0f 84 5c 02 00 00 48 85 ff 0f 84 53 02 00 00 41 8b 46 28 49 8b 9e b8 00 00 00 49 8b 36 48 01 f8 <48> 33 18 48 89 c1 48 89 f8 48 0f c9 48 31 cb 48 8d 8a 00 20 00 00
Jun 29 14:49:35 pve kernel: RSP: 0018:ffffad3a0607f9e0 EFLAGS: 00010202
Jun 29 14:49:35 pve kernel: RAX: 723d61479d627007 RBX: 66132e8df6a0c419 RCX: 0000000000000000
Jun 29 14:49:35 pve kernel: RDX: 0000000003b52003 RSI: 000000000003d100 RDI: 723d61479d626fa7
Jun 29 14:49:35 pve kernel: RBP: ffffad3a0607fa40 R08: 0000000000000000 R09: 0000000000000000
Jun 29 14:49:35 pve kernel: R10: ffff8aee004483c0 R11: ffffffffcec9c7ce R12: 0000000000000cc0
Jun 29 14:49:35 pve kernel: R13: ffff8aee03eec680 R14: ffff8aee001fd500 R15: 0000000000000cc0
Jun 29 14:49:35 pve kernel: FS: 00007444cec41b80(0000) GS:ffff8af59f180000(0000) knlGS:0000000000000000
Jun 29 14:49:35 pve kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 29 14:49:35 pve kernel: CR2: 00005dc74676c008 CR3: 0000000109ba8005 CR4: 00000000003726f0
Jun 29 14:49:35 pve kernel: Call Trace:
Jun 29 14:49:35 pve kernel: <TASK>
Jun 29 14:49:35 pve kernel: ? show_regs+0x6d/0x80
Jun 29 14:49:35 pve kernel: ? die_addr+0x37/0xa0
Jun 29 14:49:35 pve kernel: ? exc_general_protection+0x1db/0x480
Jun 29 14:49:35 pve kernel: ? asm_exc_general_protection+0x27/0x30
Jun 29 14:49:35 pve kernel: ? kmem_cache_alloc_lru+0xd9/0x420
Jun 29 14:49:35 pve kernel: ? __d_alloc+0x34/0x250
Jun 29 14:49:35 pve kernel: __d_alloc+0x34/0x250
Jun 29 14:49:35 pve kernel: d_alloc+0x1a/0x90
Jun 29 14:49:35 pve kernel: d_alloc_parallel+0x5a/0x3e0
Jun 29 14:49:35 pve kernel: ? sprintf+0x5e/0x90
Jun 29 14:49:35 pve kernel: __lookup_slow+0x5c/0x130
Jun 29 14:49:35 pve kernel: walk_component+0x117/0x190
Jun 29 14:49:35 pve kernel: ? inode_permission+0x74/0x1b0
Jun 29 14:49:35 pve kernel: link_path_walk.part.0.constprop.0+0x245/0x3c0
Jun 29 14:49:35 pve kernel: ? path_init+0x298/0x3d0
Jun 29 14:49:35 pve kernel: path_openat+0xaf/0x1190
Jun 29 14:49:35 pve kernel: ? __memcg_slab_post_alloc_hook+0x18e/0x230
Jun 29 14:49:35 pve kernel: ? __mod_memcg_lruvec_state+0x87/0x140
Jun 29 14:49:35 pve kernel: do_filp_open+0xaf/0x170
Jun 29 14:49:35 pve kernel: ? __pfx_proc_put_link+0x10/0x10
Jun 29 14:49:35 pve kernel: ? __pfx_kfree_link+0x10/0x10
Jun 29 14:49:35 pve kernel: do_sys_openat2+0xb3/0xe0
Jun 29 14:49:35 pve kernel: __x64_sys_openat+0x6c/0xa0
Jun 29 14:49:35 pve kernel: x64_sys_call+0x189a/0x24b0
Jun 29 14:49:35 pve kernel: do_syscall_64+0x81/0x170
Jun 29 14:49:35 pve kernel: ? do_user_addr_fault+0x21e/0x6b0
Jun 29 14:49:35 pve kernel: ? irqentry_exit_to_user_mode+0x7e/0x260
Jun 29 14:49:35 pve kernel: ? irqentry_exit+0x43/0x50
Jun 29 14:49:35 pve kernel: ? clear_bhb_loop+0x15/0x70
Jun 29 14:49:35 pve kernel: ? clear_bhb_loop+0x15/0x70
Jun 29 14:49:35 pve kernel: ? clear_bhb_loop+0x15/0x70
Jun 29 14:49:35 pve kernel: entry_SYSCALL_64_after_hwframe+0x78/0x80
Jun 29 14:49:35 pve kernel: RIP: 0033:0x7444ced3af01
Jun 29 14:49:35 pve kernel: Code: 75 57 89 f0 25 00 00 41 00 3d 00 00 41 00 74 49 80 3d ea 26 0e 00 00 74 6d 89 da 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 93 00 00 00 48 8b 54 24 28 64 48 2b 14 25
Jun 29 14:49:35 pve kernel: RSP: 002b:00007ffe182784c0 EFLAGS: 00000202 ORIG_RAX: 0000000000000101
Jun 29 14:49:35 pve kernel: RAX: ffffffffffffffda RBX: 0000000000080000 RCX: 00007444ced3af01
Jun 29 14:49:35 pve kernel: RDX: 0000000000080000 RSI: 00007444cee43a96 RDI: 00000000ffffff9c
Jun 29 14:49:35 pve kernel: RBP: 00007444cee43a96 R08: 0000000000000008 R09: 0000000000000001
Jun 29 14:49:35 pve kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 00007444cee43a96
Jun 29 14:49:35 pve kernel: R13: 00005dc745aed93f R14: 0000000000000001 R15: 0000000000000000
Jun 29 14:49:35 pve kernel: </TASK>
Jun 29 14:49:35 pve kernel: Modules linked in: ebtable_filter ebtables ip6table_raw ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables iptable_raw xt_NFLOG xt_limit xt_mac ipt_REJECT nf_reject_ipv4 xt_set xt_physdev xt_addrtype xt_multiport xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_comment xt_tcpudp xt_mark iptable_filter ip_set_hash_net ip_set scsi_transport_iscsi nf_tables bluetooth ecdh_generic ecc msr softdog binfmt_misc bonding tls nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common intel_tcc_cooling vhost_net vhost x86_pkg_temp_thermal vhost_iotlb tap intel_powerclamp kvmgt coretemp mdev kvm_intel crct10dif_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 mei_pxp mei_hdcp aesni_intel crypto_simd zfs(PO) cryptd cmdlinepart intel_pmc_core rapl spi_nor intel_vsec joydev mei_me input_leds pmt_telemetry intel_cstate pcspkr gigabyte_wmi wmi_bmof intel_wmi_thunderbolt mtd ee1004 mei intel_pch_thermal serial_multi_instantiate
Jun 29 14:49:35 pve kernel: spl(O) pmt_class acpi_tad acpi_pad mac_hid i915 drm_buddy ttm drm_display_helper cec rc_core i2c_algo_bit kvm nfsd auth_rpcgss vfio_pci nfs_acl vfio_pci_core lockd irqbypass grace vfio_iommu_type1 vfio iommufd sunrpc efi_pstore dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq hid_generic usbkbd usbhid hid dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c xhci_pci xhci_pci_renesas i2c_i801 spi_intel_pci crc32_pclmul e1000e xhci_hcd ahci spi_intel i2c_smbus libahci video wmi pinctrl_cannonlake
Jun 29 14:49:35 pve kernel: ---[ end trace 0000000000000000 ]---
I've attached a boot log of both 6.8.8-2-pve (contains multiple kernel panics/traces) and 6.5.13-5-pve (successful boot).
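Since the full traces are long, here is a rough sketch of how a saved boot log could be condensed to the headline lines of each oops. It assumes the log was exported as plain text, for example with journalctl -k -b -1 (the boot offset is an assumption; journalctl --list-boots shows which boots are available, provided the journal is persistent). The function name summarize_oops and the file name boot.log are placeholders made up for illustration.

```shell
# Print only the headline lines of each kernel oops found in a saved log.
# Assumed input: a plain-text export such as
#   journalctl -k -b -1 > boot.log
# summarize_oops and boot.log are hypothetical names, not part of any tool.
summarize_oops() {
    # Keep the fault description, the oops counter, the faulting process,
    # and the kernel-mode instruction pointer (code segment 0010).
    grep -E 'kernel: (BUG:|Oops:|general protection fault|CPU: [0-9]+ PID:|RIP: 0010:)' "$1"
}

# Example call (commented out so the snippet has no side effects):
# summarize_oops boot.log
```

Run against the 6.8.8-2-pve log, this should reduce each trace to a handful of lines, which makes it easier to see that both faults occur in the slab allocation path (__memcg_slab_post_alloc_hook and kmem_cache_alloc_lru).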
My system boots up with the following modules added to /etc/modules:
Code:
# Modules required for PCI passthrough
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
# Modules required for Intel GVT
kvmgt
exngt
vfio-mdev
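One check that might narrow things down is whether every module named in that file still exists for the kernel being booted. The sketch below is only an illustration: check_modules is a made-up helper name, and it relies on modinfo returning a non-zero exit status when a module cannot be found for the running kernel.

```shell
# Report, for each module named in a modules file, whether modinfo can
# resolve it for the currently running kernel.
# check_modules is a hypothetical helper name used for illustration.
check_modules() {
    while IFS= read -r line || [ -n "$line" ]; do
        # Drop comments, then take the first word (the module name).
        mod=$(printf '%s\n' "${line%%#*}" | awk '{print $1}')
        [ -z "$mod" ] && continue
        if modinfo "$mod" >/dev/null 2>&1; then
            echo "ok: $mod"
        else
            echo "MISSING: $mod"
        fi
    done < "$1"
}

# Example call (commented out so the snippet has no side effects):
# check_modules /etc/modules
```

Comparing the output under 6.5.13-5-pve and 6.8.8-2-pve would show whether any listed name resolves on one kernel but not the other. That would not prove it is the cause of the traces above, but it would be a starting point.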
Any idea what the issue might be, or how I can further debug this?
Thanks in advance!