I am running a server with:
Linux server01 5.4.174-2-pve #1 SMP PVE 5.4.174-2 (Thu, 10 Mar 2022 15:58:44 +0100) x86_64 GNU/Linux
root@vgwppestr1:~# pveversion -v
proxmox-ve: 6.4-1 (running kernel: 5.4.174-2-pve)
pve-manager: 6.4-14 (running version: 6.4-14/15e2bf61)
pve-kernel-5.4: 6.4-15
pve-kernel-helper: 6.4-15
pve-kernel-5.4.174-2-pve: 5.4.174-2
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.1.5-pve2~bpo10+1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown2: 3.0.0-1+pve4~bpo10
libjs-extjs: 6.0.1-10
libknet1: 1.22-pve2~bpo10+1
libproxmox-acme-perl: 1.1.0
libproxmox-backup-qemu0: 1.1.0-1
libpve-access-control: 6.4-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.4-4
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.2-3
libpve-storage-perl: 6.4-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-3
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.1.13-2
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.6-2
pve-cluster: 6.4-1
pve-container: 3.3-6
pve-docs: 6.4-2
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-4
pve-firmware: 3.3-2
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-6
pve-xtermjs: 4.7.0-3
qemu-server: 6.4-2
smartmontools: 6.6-1
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.7-pve1
Yesterday the server hit the problem below. The VM services stopped responding, although the VM could still be pinged and accessed over SSH:
Jul 6 15:24:01 server01 kernel: [532261.481067] BUG: kernel NULL pointer dereference, address: 0000000000000830
Jul 6 15:24:01 server01 kernel: [532261.481070] #PF: supervisor read access in kernel mode
Jul 6 15:24:01 server01 kernel: [532261.481071] #PF: error_code(0x0000) - not-present page
Jul 6 15:24:01 server01 kernel: [532261.481072] PGD 0 P4D 0
Jul 6 15:24:01 server01 kernel: [532261.481075] Oops: 0000 [#1] SMP PTI
Jul 6 15:24:01 server01 kernel: [532261.481077] CPU: 1 PID: 432 Comm: zvol Tainted: P O 5.4.174-2-pve #1
Jul 6 15:24:01 server01 kernel: [532261.481078] Hardware name: /, BIOS 5.12 04/16/2020
Jul 6 15:24:01 server01 kernel: [532261.481123] RIP: 0010:dbuf_read_impl.constprop.33+0x90/0x6e0 [zfs]
Jul 6 15:24:01 server01 kernel: [532261.481125] Code: 28 49 8b 76 58 48 8b 80 88 00 00 00 48 89 85 18 ff ff ff 48 83 fe ff 0f 84 2f 04 00 00 49 8b 46 60 48 85 c0 0f 84 36 02 00 00 <48> 8b 50 30 48 0f ba e2 27 0f 83 12 02 00 00 41 80 7e 68 00 0f 84
Jul 6 15:24:01 server01 kernel: [532261.481126] RSP: 0018:ffffa32cc9cebb18 EFLAGS: 00010206
Jul 6 15:24:01 server01 kernel: [532261.481128] RAX: 0000000000000800 RBX: ffff90c1c57169c8 RCX: 0000000000000001
Jul 6 15:24:01 server01 kernel: [532261.481129] RDX: 0000000000000001 RSI: 00000000005bd810 RDI: ffff90c23bbd4130
Jul 6 15:24:01 server01 kernel: [532261.481129] RBP: ffffa32cc9cebc08 R08: 0000000000000002 R09: ffff90c017a84440
Jul 6 15:24:01 server01 kernel: [532261.481130] R10: ffff90c25f4c9800 R11: 0000000000000000 R12: ffff90c215036660
Jul 6 15:24:01 server01 kernel: [532261.481131] R13: 000000000000000a R14: ffff90c1c5716900 R15: ffff90c1c57169a8
Jul 6 15:24:01 server01 kernel: [532261.481132] FS: 0000000000000000(0000) GS:ffff90c265a80000(0000) knlGS:0000000000000000
Jul 6 15:24:01 server01 kernel: [532261.481133] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 6 15:24:01 server01 kernel: [532261.481134] CR2: 0000000000000830 CR3: 00000001fec0a005 CR4: 00000000003626e0
Jul 6 15:24:01 server01 kernel: [532261.481135] Call Trace:
Jul 6 15:24:01 server01 kernel: [532261.481173] ? arc_space_consume+0x4f/0x120 [zfs]
Jul 6 15:24:01 server01 kernel: [532261.481207] ? dbuf_create+0x404/0x580 [zfs]
Jul 6 15:24:01 server01 kernel: [532261.481211] ? _cond_resched+0x19/0x30
Jul 6 15:24:01 server01 kernel: [532261.481213] ? down_read+0x12/0xa0
Jul 6 15:24:01 server01 kernel: [532261.481246] dbuf_read+0x1b2/0x510 [zfs]
Jul 6 15:24:01 server01 kernel: [532261.481283] dmu_tx_check_ioerr+0x68/0xd0 [zfs]
Jul 6 15:24:01 server01 kernel: [532261.481319] dmu_tx_count_write+0xf2/0x1b0 [zfs]
Jul 6 15:24:01 server01 kernel: [532261.481356] dmu_tx_hold_write_by_dnode+0x3a/0x50 [zfs]
Jul 6 15:24:01 server01 kernel: [532261.481401] zvol_write+0x182/0x4e0 [zfs]
Jul 6 15:24:01 server01 kernel: [532261.481404] ? __switch_to+0x3c7/0x490
Jul 6 15:24:01 server01 kernel: [532261.481410] taskq_thread+0x2f7/0x4e0 [spl]
Jul 6 15:24:01 server01 kernel: [532261.481413] ? wake_up_q+0x80/0x80
Jul 6 15:24:01 server01 kernel: [532261.481459] ? zvol_is_zvol_impl+0x40/0x40 [zfs]
Jul 6 15:24:01 server01 kernel: [532261.481461] kthread+0x120/0x140
Jul 6 15:24:01 server01 kernel: [532261.481466] ? task_done+0xb0/0xb0 [spl]
Jul 6 15:24:01 server01 kernel: [532261.481467] ? kthread_park+0x90/0x90
Jul 6 15:24:01 server01 kernel: [532261.481469] ret_from_fork+0x35/0x40
Jul 6 15:24:01 server01 kernel: [532261.481471] Modules linked in: tcp_diag inet_diag ebtable_filter ebtables ip_set ip6table_raw sctp binfmt_misc bonding softdog ip6table_filter ip6_tables iptable_filter xt_MASQUERADE iptable_nat iptable_mangle iptable_raw bpfilter nfnetlink_log nfnetlink dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel mei_hdcp kvm irqbypass i915 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel drm_kms_helper crypto_simd cryptd glue_helper rapl intel_cstate drm fb_sys_fops pcspkr syscopyarea sysfillrect sysimgblt mei_me intel_pch_thermal mei acpi_pad mac_hid zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost tap nf_nat_pptp nf_conntrack_pptp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi psmouse usbhid hid parport_pc ppdev lp nfsd auth_rpcgss parport nfs_acl
Jul 6 15:24:01 server01 kernel: [532261.481500] lockd grace sunrpc ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear dm_mirror dm_region_hash dm_log i2c_i801 igb xhci_pci ahci i2c_algo_bit dca xhci_hcd libahci video
Jul 6 15:24:01 server01 kernel: [532261.481513] CR2: 0000000000000830
Jul 6 15:24:01 server01 kernel: [532261.481515] ---[ end trace d167a867e3329686 ]---
To restore normal VM operation, I had to restart the server.
Is this a kernel bug? What action is recommended?
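In case it helps with triage, here is a minimal Python sketch (my own, not part of any Proxmox or OpenZFS tooling) that cross-checks the faulting address against the registers in the trace above. The function name parse_oops is hypothetical. The three input lines are copied from the log with the syslog prefix trimmed. As far as I can tell, the bytes marked <48> 8b 50 30 in the Code: line decode to mov rdx, [rax+0x30], so the fault address should equal RAX + 0x30.

```python
import re

# Three lines copied from the trace above (syslog prefix trimmed).
OOPS = """\
BUG: kernel NULL pointer dereference, address: 0000000000000830
RIP: 0010:dbuf_read_impl.constprop.33+0x90/0x6e0 [zfs]
RAX: 0000000000000800 RBX: ffff90c1c57169c8 RCX: 0000000000000001
"""

def parse_oops(text):
    """Return (fault_address, rax, faulting_symbol) parsed from oops text."""
    fault = int(re.search(r"address: ([0-9a-f]+)", text).group(1), 16)
    rax = int(re.search(r"RAX: ([0-9a-f]+)", text).group(1), 16)
    symbol = re.search(r"RIP: \S+:(\S+)\+0x", text).group(1)
    return fault, rax, symbol

fault, rax, symbol = parse_oops(OOPS)
# Offset between the faulting address and the base register RAX.
offset = fault - rax
print(f"{symbol}: fault at {fault:#x}, RAX={rax:#x}, offset={offset:#x}")
```

The offset comes out as 0x30, which matches the displacement in the faulting instruction. So the read went through a small, non-NULL but invalid pointer (0x800) inside dbuf_read_impl in the zfs module, rather than through a literal NULL.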