BUG: kernel NULL pointer dereference, address: 0000000000000008

ctrlshifti

New Member
Jun 9, 2024
Hello. My hypervisor keeps shutting down because of a kernel bug while running Proxmox Virtual Environment 8.2.4.

$ pveversion -v:
Code:
proxmox-ve: 8.2.0 (running kernel: 6.8.8-1-pve)
pve-manager: 8.2.4 (running version: 8.2.4/faa83925c9641325)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.8-1
proxmox-kernel-6.8.8-1-pve-signed: 6.8.8-1
proxmox-kernel-6.8.4-3-pve-signed: 6.8.4-3
amd64-microcode: 3.20230808.1.1~deb12u1
ceph-fuse: 16.2.11+ds-2
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx8
intel-microcode: 3.20231114.1~deb12u1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.7
libpve-cluster-perl: 8.0.7
libpve-common-perl: 8.2.1
libpve-guest-common-perl: 5.1.3
libpve-http-server-perl: 5.1.0
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.9
libpve-storage-perl: 8.2.2
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.2.4-1
proxmox-backup-file-restore: 3.2.4-1
proxmox-firewall: 0.4.2
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.6
proxmox-widget-toolkit: 4.2.3
pve-cluster: 8.0.7
pve-container: 5.1.12
pve-docs: 8.2.2
pve-edk2-firmware: not correctly installed
pve-esxi-import-tools: 0.7.1
pve-firewall: 5.0.7
pve-firmware: 3.12-1
pve-ha-manager: 4.0.5
pve-i18n: 3.2.2
pve-qemu-kvm: 8.1.5-6
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.1
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.4-pve1

CPU (1 socket):

Code:
Intel(R) Xeon(R) E-2136 CPU @ 3.30GHz

Log from journalctl:
Code:
Jun 25 11:52:38  kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
Jun 25 11:52:38  kernel: #PF: supervisor write access in kernel mode
Jun 25 11:52:38  kernel: #PF: error_code(0x0002) - not-present page
Jun 25 11:52:38  kernel: PGD 0 P4D 0
Jun 25 11:52:38  kernel: Oops: 0002 [#1] PREEMPT SMP PTI
Jun 25 11:52:38  kernel: CPU: 6 PID: 1339 Comm: kvm Tainted: P           O       6.8.4-3-pve #1
Jun 25 11:52:38  kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./E3C242D4U2-2T, BIOS L2.36A 05/06/2024
Jun 25 11:52:38  kernel: RIP: 0010:blk_flush_complete_seq+0x291/0x2d0
Jun 25 11:52:38  kernel: Code: 0f b6 f6 49 8d 56 01 49 c1 e6 04 4d 01 ee 48 c1 e2 04 49 8b 4e 10 4c 01 ea 48 39 ca 74 2b 48 8b 4b 50 48 8b 7b 48 48 8d 73 48 <48> 89 4f 08 48 89 39 49 8b 4e 18 49 89 76 18 48 89 53 48 48 89 4b
Jun 25 11:52:38  kernel: RSP: 0018:ffffa75b01a139c0 EFLAGS: 00010046
Jun 25 11:52:38  kernel: RAX: 0000000000000000 RBX: ffff8e539b1a0000 RCX: ffff8e539b1a0048
Jun 25 11:52:38  kernel: RDX: ffff8e539260e420 RSI: ffff8e539b1a0048 RDI: 0000000000000000
Jun 25 11:52:38  kernel: RBP: ffffa75b01a13a00 R08: 0000000000000000 R09: 0000000000000000
Jun 25 11:52:38  kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000029801
Jun 25 11:52:38  kernel: R13: ffff8e539260e400 R14: ffff8e539260e410 R15: ffff8e539a32c448
Jun 25 11:52:38  kernel: FS:  0000787fe74006c0(0000) GS:ffff8e5aceb00000(0000) knlGS:0000000000000000
Jun 25 11:52:38  kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 25 11:52:38  kernel: CR2: 0000000000000008 CR3: 0000000139aaa004 CR4: 00000000003726f0
Jun 25 11:52:38  kernel: Call Trace:
Jun 25 11:52:38  kernel:  <TASK>
Jun 25 11:52:38  kernel:  ? show_regs+0x6d/0x80
Jun 25 11:52:38  kernel:  ? __die+0x24/0x80
Jun 25 11:52:38  kernel:  ? page_fault_oops+0x176/0x500
Jun 25 11:52:38  kernel:  ? md_submit_bio+0x63/0xb0
Jun 25 11:52:38  kernel:  ? do_user_addr_fault+0x2f9/0x6b0
Jun 25 11:52:38  kernel:  ? exc_page_fault+0x83/0x1b0
Jun 25 11:52:38  kernel:  ? asm_exc_page_fault+0x27/0x30
Jun 25 11:52:38  kernel:  ? blk_flush_complete_seq+0x291/0x2d0
Jun 25 11:52:38  kernel:  ? __blk_mq_alloc_requests+0x3e7/0x450
Jun 25 11:52:38  kernel:  ? wbt_wait+0x33/0x100
Jun 25 11:52:38  kernel:  blk_insert_flush+0xce/0x220
Jun 25 11:52:38  kernel:  blk_mq_submit_bio+0x641/0x750
Jun 25 11:52:38  kernel:  __submit_bio+0xb3/0x1c0
Jun 25 11:52:38  kernel:  submit_bio_noacct_nocheck+0x2b7/0x390
Jun 25 11:52:38  kernel:  submit_bio_noacct+0x1f3/0x650
Jun 25 11:52:38  kernel:  ? ext4_file_write_iter+0x380/0x7e0
Jun 25 11:52:38  kernel:  submit_bio+0xb2/0x110
Jun 25 11:52:38  kernel:  md_super_write+0xcf/0x110
Jun 25 11:52:38  kernel:  write_sb_page+0x148/0x300
Jun 25 11:52:38  kernel:  filemap_write_page+0x5b/0x70
Jun 25 11:52:38  kernel:  md_bitmap_unplug+0x99/0x200
Jun 25 11:52:38  kernel:  flush_bio_list+0x108/0x110 [raid1]
Jun 25 11:52:38  kernel:  raid1_unplug+0x3c/0xf0 [raid1]
Jun 25 11:52:38  kernel:  __blk_flush_plug+0xbe/0x130
Jun 25 11:52:38  kernel:  blk_finish_plug+0x31/0x50
Jun 25 11:52:38  kernel:  io_submit_sqes+0x549/0x680
Jun 25 11:52:38  kernel:  __do_sys_io_uring_enter+0x57c/0xbf0
Jun 25 11:52:38  kernel:  ? vfs_read+0x255/0x390
Jun 25 11:52:38  kernel:  __x64_sys_io_uring_enter+0x22/0x40
Jun 25 11:52:38  kernel:  x64_sys_call+0x20b9/0x24b0
Jun 25 11:52:38  kernel:  do_syscall_64+0x81/0x170
Jun 25 11:52:38  kernel:  ? ksys_read+0xe6/0x100
Jun 25 11:52:38  kernel:  ? syscall_exit_to_user_mode+0x86/0x260
Jun 25 11:52:38  kernel:  ? do_syscall_64+0x8d/0x170
Jun 25 11:52:38  kernel:  ? do_syscall_64+0x8d/0x170
Jun 25 11:52:38  kernel:  ? do_syscall_64+0x8d/0x170
Jun 25 11:52:38  kernel:  ? common_interrupt+0x54/0xb0
Jun 25 11:52:38  kernel:  entry_SYSCALL_64_after_hwframe+0x78/0x80
Jun 25 11:52:38  kernel: RIP: 0033:0x787ff467cb95
Jun 25 11:52:38  kernel: Code: 00 00 00 44 89 d0 41 b9 08 00 00 00 83 c8 10 f6 87 d0 00 00 00 01 8b bf cc 00 00 00 44 0f 45 d0 45 31 c0 b8 aa 01 00 00 0f 05 <c3> 66 2e 0f 1f 84 00 00 00 00 00 41 83 e2 02 74 c2 f0 48 83 0c 24
Jun 25 11:52:38  kernel: RSP: 002b:0000787fe73fafa8 EFLAGS: 00000246 ORIG_RAX: 00000000000001aa
Jun 25 11:52:38  kernel: RAX: ffffffffffffffda RBX: 0000608b718130f0 RCX: 0000787ff467cb95
Jun 25 11:52:38  kernel: RDX: 0000000000000000 RSI: 0000000000000004 RDI: 000000000000001f
Jun 25 11:52:38  kernel: RBP: 0000608b718130f8 R08: 0000000000000000 R09: 0000000000000008
Jun 25 11:52:38  kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000608b718131e0
Jun 25 11:52:38  kernel: R13: 0000000000000001 R14: 0000608b716ca9b8 R15: 0000000000000000
Jun 25 11:52:38  kernel:  </TASK>
Jun 25 11:52:38  kernel: Modules linked in: veth ebtable_filter ebtables ip6table_raw ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables iptable_raw xt_mac ipt_REJECT nf_reject_ipv4 xt_mark xt_set xt_physdev xt_addrtype xt_comment xt_multiport xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_tcpudp iptable_filter ip_set_hash_net ip_set softdog nf_tables sunrpc binfmt_misc nfnetlink_log bonding nfnetlink tls intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd ipmi_ssif rapl acpi_ipmi cmdlinepart jc42 spi_nor intel_cstate i2c_algo_bit mei_me wmi_bmof mtd ee1004 ipmi_si 8250_dw ipmi_devintf intel_pmc_core mei intel_pch_thermal ie31200_edac intel_vsec ipmi_msghandler pmt_telemetry acpi_tad pmt_class mac_hid isofs zfs(PO) spl(O) vhost_net vhost vhost_iotlb tap efi_pstore dmi_sysfs
Jun 25 11:52:38  kernel:  ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 raid1 ixgbe xhci_pci nvme xhci_pci_renesas xfrm_algo i2c_i801 nvme_core crc32_pclmul spi_intel_pci intel_lpss_pci ahci dca xhci_hcd spi_intel i2c_smbus intel_lpss mdio libahci idma64 nvme_auth video wmi pinctrl_cannonlake
Jun 25 11:52:38  kernel: CR2: 0000000000000008
Jun 25 11:52:38  kernel: ---[ end trace 0000000000000000 ]---
Jun 25 11:52:38  kernel: RIP: 0010:blk_flush_complete_seq+0x291/0x2d0
Jun 25 11:52:38  kernel: Code: 0f b6 f6 49 8d 56 01 49 c1 e6 04 4d 01 ee 48 c1 e2 04 49 8b 4e 10 4c 01 ea 48 39 ca 74 2b 48 8b 4b 50 48 8b 7b 48 48 8d 73 48 <48> 89 4f 08 48 89 39 49 8b 4e 18 49 89 76 18 48 89 53 48 48 89 4b
Jun 25 11:52:38  kernel: RSP: 0018:ffffa75b01a139c0 EFLAGS: 00010046
Jun 25 11:52:38  kernel: RAX: 0000000000000000 RBX: ffff8e539b1a0000 RCX: ffff8e539b1a0048
Jun 25 11:52:38  kernel: RDX: ffff8e539260e420 RSI: ffff8e539b1a0048 RDI: 0000000000000000
Jun 25 11:52:38  kernel: RBP: ffffa75b01a13a00 R08: 0000000000000000 R09: 0000000000000000
Jun 25 11:52:38  kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000029801
Jun 25 11:52:38  kernel: R13: ffff8e539260e400 R14: ffff8e539260e410 R15: ffff8e539a32c448
Jun 25 11:52:38  kernel: FS:  0000787fe74006c0(0000) GS:ffff8e5aceb00000(0000) knlGS:0000000000000000
Jun 25 11:52:38  kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 25 11:52:38  kernel: CR2: 0000000000000008 CR3: 0000000139aaa004 CR4: 00000000003726f0
Jun 25 11:52:38  kernel: note: kvm[1339] exited with irqs disabled
Jun 25 11:52:38  kernel: note: kvm[1339] exited with preempt_count 1
Jun 25 11:52:38  kernel: ------------[ cut here ]------------
Jun 25 11:52:38  kernel: WARNING: CPU: 6 PID: 1339 at kernel/exit.c:820 do_exit+0x8dd/0xae0
Jun 25 11:52:38  kernel: Modules linked in: veth ebtable_filter ebtables ip6table_raw ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables iptable_raw xt_mac ipt_REJECT nf_reject_ipv4 xt_mark xt_set xt_physdev xt_addrtype xt_comment xt_multiport xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_tcpudp iptable_filter ip_set_hash_net ip_set softdog nf_tables sunrpc binfmt_misc nfnetlink_log bonding nfnetlink tls intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd ipmi_ssif rapl acpi_ipmi cmdlinepart jc42 spi_nor intel_cstate i2c_algo_bit mei_me wmi_bmof mtd ee1004 ipmi_si 8250_dw ipmi_devintf intel_pmc_core mei intel_pch_thermal ie31200_edac intel_vsec ipmi_msghandler pmt_telemetry acpi_tad pmt_class mac_hid isofs zfs(PO) spl(O) vhost_net vhost vhost_iotlb tap efi_pstore dmi_sysfs
Jun 25 11:52:38  kernel:  ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 raid1 ixgbe xhci_pci nvme xhci_pci_renesas xfrm_algo i2c_i801 nvme_core crc32_pclmul spi_intel_pci intel_lpss_pci ahci dca xhci_hcd spi_intel i2c_smbus intel_lpss mdio libahci idma64 nvme_auth video wmi pinctrl_cannonlake
Jun 25 11:52:38  kernel: CPU: 6 PID: 1339 Comm: kvm Tainted: P      D    O       6.8.4-3-pve #1
Jun 25 11:52:38  kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./E3C242D4U2-2T, BIOS L2.36A 05/06/2024
Jun 25 11:52:38  kernel: RIP: 0010:do_exit+0x8dd/0xae0
Jun 25 11:52:38  kernel: Code: e9 42 f8 ff ff 48 8b bb e0 09 00 00 31 f6 e8 9a e0 ff ff e9 ee fd ff ff 4c 89 ee bf 05 06 00 00 e8 08 3a 01 00 e9 6e f8 ff ff <0f> 0b e9 9c f7 ff ff 0f 0b e9 55 f7 ff ff 48 89 df e8 0d 2f 14 00
Jun 25 11:52:38  kernel: RSP: 0018:ffffa75b01a13ec8 EFLAGS: 00010282
Jun 25 11:52:38  kernel: RAX: 0000000000000000 RBX: ffff8e539cafa940 RCX: 0000000000000000
Jun 25 11:52:38  kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Jun 25 11:52:38  kernel: RBP: ffffa75b01a13f20 R08: 0000000000000000 R09: 0000000000000000
Jun 25 11:52:38  kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff8e539a068d80
Jun 25 11:52:38  kernel: R13: 0000000000000009 R14: ffff8e539a063180 R15: 0000000000000000
Jun 25 11:52:38  kernel: FS:  0000787fe74006c0(0000) GS:ffff8e5aceb00000(0000) knlGS:0000000000000000
Jun 25 11:52:38  kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 25 11:52:38  kernel: CR2: 0000000000000008 CR3: 0000000139aaa004 CR4: 00000000003726f0
Jun 25 11:52:38  kernel: Call Trace:
Jun 25 11:52:38  kernel:  <TASK>
Jun 25 11:52:38  kernel:  ? show_regs+0x6d/0x80
Jun 25 11:52:38  kernel:  ? __warn+0x89/0x160
Jun 25 11:52:38  kernel:  ? do_exit+0x8dd/0xae0
Jun 25 11:52:38  kernel:  ? report_bug+0x17e/0x1b0
Jun 25 11:52:38  kernel:  ? handle_bug+0x46/0x90
Jun 25 11:52:38  kernel:  ? exc_invalid_op+0x18/0x80
Jun 25 11:52:38  kernel:  ? asm_exc_invalid_op+0x1b/0x20
Jun 25 11:52:38  kernel:  ? do_exit+0x8dd/0xae0
Jun 25 11:52:38  kernel:  ? do_exit+0x72/0xae0
Jun 25 11:52:38  kernel:  ? _printk+0x60/0x90
Jun 25 11:52:38  kernel:  make_task_dead+0x83/0x170
Jun 25 11:52:38  kernel:  rewind_stack_and_make_dead+0x17/0x20
Jun 25 11:52:38  kernel: RIP: 0033:0x787ff467cb95
Jun 25 11:52:38  kernel: Code: 00 00 00 44 89 d0 41 b9 08 00 00 00 83 c8 10 f6 87 d0 00 00 00 01 8b bf cc 00 00 00 44 0f 45 d0 45 31 c0 b8 aa 01 00 00 0f 05 <c3> 66 2e 0f 1f 84 00 00 00 00 00 41 83 e2 02 74 c2 f0 48 83 0c 24
Jun 25 11:52:38  kernel: RSP: 002b:0000787fe73fafa8 EFLAGS: 00000246 ORIG_RAX: 00000000000001aa
Jun 25 11:52:38  kernel: RAX: ffffffffffffffda RBX: 0000608b718130f0 RCX: 0000787ff467cb95
Jun 25 11:52:38  kernel: RDX: 0000000000000000 RSI: 0000000000000004 RDI: 000000000000001f
Jun 25 11:52:38  kernel: RBP: 0000608b718130f8 R08: 0000000000000000 R09: 0000000000000008
Jun 25 11:52:38  kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000608b718131e0
Jun 25 11:52:38  kernel: R13: 0000000000000001 R14: 0000608b716ca9b8 R15: 0000000000000000
Jun 25 11:52:38  kernel:  </TASK>
Jun 25 11:52:38  kernel: ---[ end trace 0000000000000000 ]---
 
Hi, this particular NULL pointer dereference error with RIP pointing to blk_flush_complete_seq ...
Code:
Jun 25 11:52:38  kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
[...]
Jun 25 11:52:38  kernel: RIP: 0010:blk_flush_complete_seq+0x291/0x2d0
... should be fixed in proxmox-kernel-6.8.8-1-pve and higher, see [1] for more information.

According to the pveversion -v output, the node was running kernel 6.8.8-1 when you ran the command. According to the crash message, however, the node was still running the older, affected kernel 6.8.4-3-pve at the time of the crash:
Code:
Jun 25 11:52:38  kernel: CPU: 6 PID: 1339 Comm: kvm Tainted: P           O       6.8.4-3-pve #1
So can you please try kernel 6.8.8-1 instead? If you still see crashes on that kernel, please let me know and attach the crash information too.
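If it helps, the check described above (which kernel was running when the oops happened, and which function RIP points to) can be sketched in a few lines of Python. This is only a rough illustration, not an official tool: the helper names `summarize_oops`, `release_key`, and `is_affected` are made up for this example, the regular expressions assume the log format shown in this thread, and the plain numeric tuple comparison is a simplification that only works for releases shaped like `6.8.4-3-pve`.

```python
import re

# Assumption: 6.8.8-1-pve is the first fixed kernel, as stated in the reply above.
FIXED_KERNEL = "6.8.8-1-pve"

# Kernel release at the end of an oops "CPU: ... Comm: ... Tainted: ..." line,
# e.g. "... Tainted: P           O       6.8.4-3-pve #1".
RELEASE_RE = re.compile(r"(\d+\.\d+\.\d+-\d+-pve)\s+#\d+\s*$")
# Kernel-mode RIP line (code segment 0010), e.g. "RIP: 0010:some_function+0x291/0x2d0".
RIP_RE = re.compile(r"RIP: 0010:(\w+)\+0x")


def release_key(release):
    """Turn '6.8.4-3-pve' into a comparable tuple such as (6, 8, 4, 3)."""
    return tuple(int(n) for n in re.findall(r"\d+", release))


def summarize_oops(lines):
    """Return (kernel_release, rip_symbol); a field is None if it was not found."""
    release = symbol = None
    for line in lines:
        if release is None and " Comm: " in line:
            match = RELEASE_RE.search(line)
            if match:
                release = match.group(1)
        if symbol is None:
            match = RIP_RE.search(line)
            if match:
                symbol = match.group(1)
    return release, symbol


def is_affected(release, fixed=FIXED_KERNEL):
    """True if the release sorts before the first fixed kernel."""
    return release_key(release) < release_key(fixed)


# Two lines copied from the journalctl output in this thread.
crash_log = [
    "Jun 25 11:52:38  kernel: CPU: 6 PID: 1339 Comm: kvm Tainted: P           O       6.8.4-3-pve #1",
    "Jun 25 11:52:38  kernel: RIP: 0010:blk_flush_complete_seq+0x291/0x2d0",
]
release, symbol = summarize_oops(crash_log)
print(release, symbol, is_affected(release))
# prints: 6.8.4-3-pve blk_flush_complete_seq True
```

The point of reading the release from the crash log itself, rather than from pveversion -v, is that the log records the kernel that was actually running at the time of the crash.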

[1] https://forum.proxmox.com/threads/random-6-8-4-2-pve-kernel-crashes.145760/page-8#post-674842
[2] https://forum.proxmox.com/threads/b...-address-0000000000000000.102741/#post-442398
 
You're correct. After the crash, I performed an apt upgrade, which is probably why pveversion shows a newer kernel. However, I have reverted back to version 6.8.8-1. I'll inform you if the error occurs again.
 
It's been 18 days and everything has been running smoothly. Before, this error occurred about every 3 days. The problem is now solved. Thanks.
 