BUG: kernel NULL pointer dereference, address: 0000000000000830

hrocha

New Member
Jul 7, 2023
I am running a server with:
Linux server01 5.4.174-2-pve #1 SMP PVE 5.4.174-2 (Thu, 10 Mar 2022 15:58:44 +0100) x86_64 GNU/Linux

root@vgwppestr1:~# pveversion -v
proxmox-ve: 6.4-1 (running kernel: 5.4.174-2-pve)
pve-manager: 6.4-14 (running version: 6.4-14/15e2bf61)
pve-kernel-5.4: 6.4-15
pve-kernel-helper: 6.4-15
pve-kernel-5.4.174-2-pve: 5.4.174-2
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.1.5-pve2~bpo10+1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown2: 3.0.0-1+pve4~bpo10
libjs-extjs: 6.0.1-10
libknet1: 1.22-pve2~bpo10+1
libproxmox-acme-perl: 1.1.0
libproxmox-backup-qemu0: 1.1.0-1
libpve-access-control: 6.4-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.4-4
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.2-3
libpve-storage-perl: 6.4-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-3
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.1.13-2
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.6-2
pve-cluster: 6.4-1
pve-container: 3.3-6
pve-docs: 6.4-2
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-4
pve-firmware: 3.3-2
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-6
pve-xtermjs: 4.7.0-3
qemu-server: 6.4-2
smartmontools: 6.6-1
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.7-pve1



Yesterday the server ran into the problem below and the VM's services stopped responding, although it was still possible to ping the VM and access it via SSH:

Jul 6 15:24:01 server01 kernel: [532261.481067] BUG: kernel NULL pointer dereference, address: 0000000000000830
Jul 6 15:24:01 server01 kernel: [532261.481070] #PF: supervisor read access in kernel mode
Jul 6 15:24:01 server01 kernel: [532261.481071] #PF: error_code(0x0000) - not-present page
Jul 6 15:24:01 server01 kernel: [532261.481072] PGD 0 P4D 0
Jul 6 15:24:01 server01 kernel: [532261.481075] Oops: 0000 [#1] SMP PTI
Jul 6 15:24:01 server01 kernel: [532261.481077] CPU: 1 PID: 432 Comm: zvol Tainted: P O 5.4.174-2-pve #1
Jul 6 15:24:01 server01 kernel: [532261.481078] Hardware name: /, BIOS 5.12 04/16/2020
Jul 6 15:24:01 server01 kernel: [532261.481123] RIP: 0010:dbuf_read_impl.constprop.33+0x90/0x6e0 [zfs]
Jul 6 15:24:01 server01 kernel: [532261.481125] Code: 28 49 8b 76 58 48 8b 80 88 00 00 00 48 89 85 18 ff ff ff 48 83 fe ff 0f 84 2f 04 00 00 49 8b 46 60 48 85 c0 0f 84 36 02 00 00 <48> 8b 50 30 48 0f ba e2 27 0f 83 12 02 00 00 41 80 7e 68 00 0f 84
Jul 6 15:24:01 server01 kernel: [532261.481126] RSP: 0018:ffffa32cc9cebb18 EFLAGS: 00010206
Jul 6 15:24:01 server01 kernel: [532261.481128] RAX: 0000000000000800 RBX: ffff90c1c57169c8 RCX: 0000000000000001
Jul 6 15:24:01 server01 kernel: [532261.481129] RDX: 0000000000000001 RSI: 00000000005bd810 RDI: ffff90c23bbd4130
Jul 6 15:24:01 server01 kernel: [532261.481129] RBP: ffffa32cc9cebc08 R08: 0000000000000002 R09: ffff90c017a84440
Jul 6 15:24:01 server01 kernel: [532261.481130] R10: ffff90c25f4c9800 R11: 0000000000000000 R12: ffff90c215036660
Jul 6 15:24:01 server01 kernel: [532261.481131] R13: 000000000000000a R14: ffff90c1c5716900 R15: ffff90c1c57169a8
Jul 6 15:24:01 server01 kernel: [532261.481132] FS: 0000000000000000(0000) GS:ffff90c265a80000(0000) knlGS:0000000000000000
Jul 6 15:24:01 server01 kernel: [532261.481133] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 6 15:24:01 server01 kernel: [532261.481134] CR2: 0000000000000830 CR3: 00000001fec0a005 CR4: 00000000003626e0
Jul 6 15:24:01 server01 kernel: [532261.481135] Call Trace:
Jul 6 15:24:01 server01 kernel: [532261.481173] ? arc_space_consume+0x4f/0x120 [zfs]
Jul 6 15:24:01 server01 kernel: [532261.481207] ? dbuf_create+0x404/0x580 [zfs]
Jul 6 15:24:01 server01 kernel: [532261.481211] ? _cond_resched+0x19/0x30
Jul 6 15:24:01 server01 kernel: [532261.481213] ? down_read+0x12/0xa0
Jul 6 15:24:01 server01 kernel: [532261.481246] dbuf_read+0x1b2/0x510 [zfs]
Jul 6 15:24:01 server01 kernel: [532261.481283] dmu_tx_check_ioerr+0x68/0xd0 [zfs]
Jul 6 15:24:01 server01 kernel: [532261.481319] dmu_tx_count_write+0xf2/0x1b0 [zfs]
Jul 6 15:24:01 server01 kernel: [532261.481356] dmu_tx_hold_write_by_dnode+0x3a/0x50 [zfs]
Jul 6 15:24:01 server01 kernel: [532261.481401] zvol_write+0x182/0x4e0 [zfs]
Jul 6 15:24:01 server01 kernel: [532261.481404] ? __switch_to+0x3c7/0x490
Jul 6 15:24:01 server01 kernel: [532261.481410] taskq_thread+0x2f7/0x4e0 [spl]
Jul 6 15:24:01 server01 kernel: [532261.481413] ? wake_up_q+0x80/0x80
Jul 6 15:24:01 server01 kernel: [532261.481459] ? zvol_is_zvol_impl+0x40/0x40 [zfs]
Jul 6 15:24:01 server01 kernel: [532261.481461] kthread+0x120/0x140
Jul 6 15:24:01 server01 kernel: [532261.481466] ? task_done+0xb0/0xb0 [spl]
Jul 6 15:24:01 server01 kernel: [532261.481467] ? kthread_park+0x90/0x90
Jul 6 15:24:01 server01 kernel: [532261.481469] ret_from_fork+0x35/0x40
Jul 6 15:24:01 server01 kernel: [532261.481471] Modules linked in: tcp_diag inet_diag ebtable_filter ebtables ip_set ip6table_raw sctp binfmt_misc bonding softdog ip6table_filter ip6_tables iptable_filter xt_MASQUERADE iptable_nat iptable_mangle iptable_raw bpfilter nfnetlink_log nfnetlink dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel mei_hdcp kvm irqbypass i915 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel drm_kms_helper crypto_simd cryptd glue_helper rapl intel_cstate drm fb_sys_fops pcspkr syscopyarea sysfillrect sysimgblt mei_me intel_pch_thermal mei acpi_pad mac_hid zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost tap nf_nat_pptp nf_conntrack_pptp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi psmouse usbhid hid parport_pc ppdev lp nfsd auth_rpcgss parport nfs_acl
Jul 6 15:24:01 server01 kernel: [532261.481500] lockd grace sunrpc ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear dm_mirror dm_region_hash dm_log i2c_i801 igb xhci_pci ahci i2c_algo_bit dca xhci_hcd libahci video
Jul 6 15:24:01 server01 kernel: [532261.481513] CR2: 0000000000000830
Jul 6 15:24:01 server01 kernel: [532261.481515] ---[ end trace d167a867e3329686 ]---

To restore normal VM operation, it was necessary to restart the server.

Is this a kernel bug? What action is recommended?
 
Hello,

Proxmox VE 6.x is already EOL [0], so upgrading to the latest available version is recommended.

From the kernel log above, this looks like a memory issue, but I'm not sure yet. I would check the syslog around the time of the crash. You can use journalctl, as in the command below, to extract the log for a specific date and time window:

Code:
journalctl --since "2023-07-06 15:00" --until "2023-07-06 15:30" > /tmp/Syslog.log

You may have to change the time/date in the above command.
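
Once the log is saved to a file, it can help to pull just the oops report out of the surrounding messages. The following is a minimal Python sketch, not an official Proxmox tool: the function names `extract_oops_blocks` and `faulting_function` are made up for illustration, and the start and end markers are simply the ones visible in the kernel log posted above, so other kinds of crashes may need different patterns.

```python
import re

# Markers copied from the kernel log quoted in this thread (assumption:
# other oops types may start with a different "BUG:" message).
START = re.compile(r"BUG: kernel NULL pointer dereference")
END = re.compile(r"---\[ end trace [0-9a-f]+ \]---")


def extract_oops_blocks(lines):
    """Return a list of oops reports, each one a list of log lines."""
    blocks, current = [], None
    for line in lines:
        line = line.rstrip("\n")
        if current is None:
            # Not inside a report yet: wait for the start marker.
            if START.search(line):
                current = [line]
        else:
            current.append(line)
            # The end marker closes the current report.
            if END.search(line):
                blocks.append(current)
                current = None
    if current:
        # The report was cut off before its end marker; keep what we have.
        blocks.append(current)
    return blocks


def faulting_function(block):
    """Return the symbol named on the RIP line, or None if there is none."""
    for line in block:
        match = re.search(r"RIP: [0-9a-f]+:([\w.]+)\+0x", line)
        if match:
            return match.group(1)
    return None
```

For example, passing the lines of /tmp/Syslog.log to `extract_oops_blocks` and then calling `faulting_function` on each block gives the function the kernel was executing when it faulted, which is a useful search term when looking for known bugs.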


[0] https://pve.proxmox.com/pve-docs/chapter-pve-faq.html
 
