Hi all,
I have been troubleshooting kernel panics on Proxmox 7.0-13 for the past week, but I couldn't pin down the cause because I had done a lot of configuration and setup before I noticed them. Today I reinstalled Proxmox from scratch and was able to show that everything works fine until I involve this mirrored ZFS pool. I thought my issues were with VM configuration, but I was just able to cause a kernel panic on a nearly fresh install without a VM running at all.
My specs:
- Dell Optiplex 5060
- Intel i7-8700 & 64 GB (4x16GB) Crucial DDR4 2666 RAM
- SanDisk X600 M.2 2280 SATA 128GB (Proxmox ext4 disk)
- 2x Intel S3700 (the HP branded version, device model# MK0400GCTZA) (ZFS pool disks for VM disk storage)
- Intel E1G44ET NIC
What I did on the fresh install:
- Added the no-subscription repository
- Fully updated the system & rebooted
- Set up my network bonds & bridges
- Wiped the 2 S3700s, created a mirrored ZFS pool with lz4 compression and ashift=12, and confirmed the pool shows up as healthy
- Created a single Debian 11 server VM with mostly default settings, using local-lvm for storage rather than the ZFS pool. I installed the OS, rebooted the VM, and confirmed it was fully working: it could mount an NFS share from my NAS and transfer data to & from it.
- Within the Debian VM, I unmounted the NFS share and issued the command "shutdown now" as root
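For anyone who wants to reproduce this, the pool creation above and the "move disk" task that shows up in the log below correspond roughly to the commands this sketch prints. It is not a record of what was actually typed. The device paths are placeholders, the pool name is assumed to match the storage ID tank1 seen in the log, and the helper names zpool_create_mirror and qm_move_disk are made up for illustration. Registering the pool as a Proxmox storage is left out. The script only builds and prints the commands; it does not run them.

```python
import shlex


def zpool_create_mirror(pool, disks, ashift=12, compression="lz4"):
    """Return the argv for creating a mirrored pool. Nothing is executed."""
    if len(disks) < 2:
        raise ValueError("a mirror needs at least two disks")
    return [
        "zpool", "create",
        "-o", f"ashift={ashift}",            # pool property: 2^12 = 4096-byte sectors
        "-O", f"compression={compression}",  # property set on the pool's root dataset
        pool, "mirror", *disks,
    ]


def qm_move_disk(vmid, disk, storage):
    """Return the argv for moving a VM disk to another storage."""
    return ["qm", "move_disk", str(vmid), disk, storage]


# Placeholder device paths: substitute the real /dev/disk/by-id/ names.
DISKS = [
    "/dev/disk/by-id/ata-EXAMPLE-DISK-A",
    "/dev/disk/by-id/ata-EXAMPLE-DISK-B",
]

commands = [
    zpool_create_mirror("tank1", DISKS),    # pool name is an assumption
    qm_move_disk(900, "virtio0", "tank1"),  # VM ID, disk and storage as in the log
]

for argv in commands:
    print(shlex.join(argv))
```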
Code (syslog from moving the VM's disk to the tank1 storage):
Nov 04 23:44:18 hyper pvedaemon[1392]: <root@pam> move disk VM 900: move --disk virtio0 --storage tank1
Nov 04 23:44:18 hyper pvedaemon[1392]: <root@pam> starting task UPID:hyper:00001C85:00031B2D:6184C4B2:qmmove:900:root@pam:
Nov 04 23:44:44 hyper kernel: BUG: kernel NULL pointer dereference, address: 0000000000000088
Nov 04 23:44:44 hyper kernel: #PF: supervisor read access in kernel mode
Nov 04 23:44:44 hyper kernel: #PF: error_code(0x0000) - not-present page
Nov 04 23:44:44 hyper kernel: PGD 0 P4D 0
Nov 04 23:44:44 hyper kernel: Oops: 0000 [#1] SMP PTI
Nov 04 23:44:44 hyper kernel: CPU: 0 PID: 7363 Comm: qemu-img Tainted: P O 5.11.22-5-pve #1
Nov 04 23:44:44 hyper kernel: Hardware name: Dell Inc. OptiPlex 5060/0654JC, BIOS 1.14.0 07/22/2021
Nov 04 23:44:44 hyper kernel: RIP: 0010:workingset_activation+0x4d/0xa0
Nov 04 23:44:44 hyper kernel: Code: 14 c5 e0 0b 75 92 0f 1f 44 00 00 48 8b 47 38 48 83 e0 fc 48 0f 44 05 9a ed cc 01 48 63 8a 00 9d 02 00 4c 8b 84 c8 00 0b 00 00 <49> 3b 90 88 00 00 00 75 35 48 8b 07 4c 89 c7 48 c1 e8 10 83 e0 01
Nov 04 23:44:44 hyper kernel: RSP: 0018:ffffa246c3643c88 EFLAGS: 00010282
Nov 04 23:44:44 hyper kernel: RAX: ffff8924a19cf068 RBX: ffffc558595232c0 RCX: 0000000000000000
Nov 04 23:44:44 hyper kernel: RDX: ffff8933c07d6000 RSI: 0000000500000000 RDI: ffffc558595232c0
Nov 04 23:44:44 hyper kernel: RBP: ffffa246c3643c88 R08: 0000000000000000 R09: ffff8924dbd6c880
Nov 04 23:44:44 hyper kernel: R10: 0000000000000001 R11: ffff8924dbd6c9f8 R12: ffff893380228f40
Nov 04 23:44:44 hyper kernel: R13: ffff892606d04608 R14: ffffa246c3643e60 R15: ffffa246c3643e38
Nov 04 23:44:44 hyper kernel: FS: 00007f6264ae6700(0000) GS:ffff893380200000(0000) knlGS:0000000000000000
Nov 04 23:44:44 hyper kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 04 23:44:44 hyper kernel: CR2: 0000000000000088 CR3: 000000010c776004 CR4: 00000000003706f0
Nov 04 23:44:44 hyper kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 04 23:44:44 hyper kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Nov 04 23:44:44 hyper kernel: Call Trace:
Nov 04 23:44:44 hyper kernel: mark_page_accessed+0x181/0x1f0
Nov 04 23:44:44 hyper kernel: generic_file_buffered_read+0x230/0x4a0
Nov 04 23:44:44 hyper kernel: generic_file_read_iter+0xdf/0x140
Nov 04 23:44:44 hyper kernel: blkdev_read_iter+0x4a/0x60
Nov 04 23:44:44 hyper kernel: new_sync_read+0x10d/0x190
Nov 04 23:44:44 hyper kernel: vfs_read+0x15a/0x1c0
Nov 04 23:44:44 hyper kernel: __x64_sys_pread64+0x93/0xc0
Nov 04 23:44:44 hyper kernel: do_syscall_64+0x38/0x90
Nov 04 23:44:44 hyper kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Nov 04 23:44:44 hyper kernel: RIP: 0033:0x7f627777c917
Nov 04 23:44:44 hyper kernel: Code: 08 89 3c 24 48 89 4c 24 18 e8 05 f4 ff ff 4c 8b 54 24 18 48 8b 54 24 10 41 89 c0 48 8b 74 24 08 8b 3c 24 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 04 24 e8 35 f4 ff ff 48 8b
Nov 04 23:44:44 hyper kernel: RSP: 002b:00007f6264ae1680 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
Nov 04 23:44:44 hyper kernel: RAX: ffffffffffffffda RBX: 00007f6266aec000 RCX: 00007f627777c917
Nov 04 23:44:44 hyper kernel: RDX: 0000000000200000 RSI: 00007f6266aec000 RDI: 0000000000000004
Nov 04 23:44:44 hyper kernel: RBP: 00007f6266ded840 R08: 0000000000000000 R09: 00000000ffffffff
Nov 04 23:44:44 hyper kernel: R10: 0000000353dff400 R11: 0000000000000293 R12: 0000000000000000
Nov 04 23:44:44 hyper kernel: R13: 00005618e9a364c8 R14: 00005618e9a4ed90 R15: 0000000000802000
Nov 04 23:44:44 hyper kernel: Modules linked in: tcp_diag inet_diag veth ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter bpfilter bonding tls softdog nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm i915 irqbypass crct10dif_pclmul ghash_clmulni_intel aesni_intel drm_kms_helper crypto_simd cryptd glue_helper cec rc_core dell_wmi fb_sys_fops rapl syscopyarea dell_smbios mei_hdcp sysfillrect dcdbas intel_cstate sysimgblt mei_me pcspkr intel_pch_thermal dell_wmi_sysman efi_pstore ee1004 mei dell_wmi_descriptor dell_wmi_aio wmi_bmof sparse_keymap mac_hid acpi_pad zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi drm sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio
Nov 04 23:44:44 hyper kernel: libcrc32c crc32_pclmul intel_lpss_pci i2c_i801 intel_lpss xhci_pci igb xhci_pci_renesas ahci idma64 i2c_algo_bit e1000e i2c_smbus dca libahci xhci_hcd virt_dma wmi video pinctrl_cannonlake
Nov 04 23:44:44 hyper kernel: CR2: 0000000000000088
Nov 04 23:44:44 hyper kernel: ---[ end trace 7ffa6e57016e7357 ]---
Nov 04 23:44:44 hyper kernel: RIP: 0010:workingset_activation+0x4d/0xa0
Nov 04 23:44:44 hyper kernel: Code: 14 c5 e0 0b 75 92 0f 1f 44 00 00 48 8b 47 38 48 83 e0 fc 48 0f 44 05 9a ed cc 01 48 63 8a 00 9d 02 00 4c 8b 84 c8 00 0b 00 00 <49> 3b 90 88 00 00 00 75 35 48 8b 07 4c 89 c7 48 c1 e8 10 83 e0 01
Nov 04 23:44:44 hyper kernel: RSP: 0018:ffffa246c3643c88 EFLAGS: 00010282
Nov 04 23:44:44 hyper kernel: RAX: ffff8924a19cf068 RBX: ffffc558595232c0 RCX: 0000000000000000
Nov 04 23:44:44 hyper kernel: RDX: ffff8933c07d6000 RSI: 0000000500000000 RDI: ffffc558595232c0
Nov 04 23:44:44 hyper kernel: RBP: ffffa246c3643c88 R08: 0000000000000000 R09: ffff8924dbd6c880
Nov 04 23:44:44 hyper kernel: R10: 0000000000000001 R11: ffff8924dbd6c9f8 R12: ffff893380228f40
Nov 04 23:44:44 hyper kernel: R13: ffff892606d04608 R14: ffffa246c3643e60 R15: ffffa246c3643e38
Nov 04 23:44:44 hyper kernel: FS: 00007f6264ae6700(0000) GS:ffff893380200000(0000) knlGS:0000000000000000
Nov 04 23:44:44 hyper kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 04 23:44:44 hyper kernel: CR2: 0000000000000088 CR3: 000000010c776004 CR4: 00000000003706f0
Nov 04 23:44:44 hyper kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 04 23:44:44 hyper kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Nov 04 23:44:46 hyper kernel: general protection fault, probably for non-canonical address 0xefa33951ffff892a: 0000 [#2] SMP PTI
Nov 04 23:44:46 hyper kernel: CPU: 3 PID: 513 Comm: dbuf_evict Tainted: P D O 5.11.22-5-pve #1
Nov 04 23:44:46 hyper kernel: Hardware name: Dell Inc. OptiPlex 5060/0654JC, BIOS 1.14.0 07/22/2021
Nov 04 23:44:46 hyper kernel: RIP: 0010:arc_buf_destroy+0x1c/0x110 [zfs]
Nov 04 23:44:46 hyper kernel: Code: 07 00 0f 1f 40 00 e9 f4 fe ff ff 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 4c 8b 27 48 8b 05 ec f9 28 00 <49> 39 84 24 f8 00 00 00 0f 84 95 00 00 00 49 8b 4c 24 10 49 8b 54
Nov 04 23:44:46 hyper kernel: RSP: 0018:ffffa246c1727e08 EFLAGS: 00010282
Nov 04 23:44:46 hyper kernel: RAX: ffffffffc0c8abe0 RBX: 28f5c28f5c28f5c3 RCX: 0000000000000000
Nov 04 23:44:46 hyper kernel: RDX: ffffffffffffe000 RSI: ffff892a344aa180 RDI: ffff892ac30ff204
Nov 04 23:44:46 hyper kernel: RBP: ffffa246c1727e30 R08: 000000000000000b R09: 788af778fc6207b2
Nov 04 23:44:46 hyper kernel: R10: 0000000000000000 R11: ffffa246c1727e60 R12: efa33951ffff892a
Nov 04 23:44:46 hyper kernel: R13: ffff892ac4e3be00 R14: ffff8924918b6300 R15: ffffffffc0d6b0a0
Nov 04 23:44:46 hyper kernel: FS: 0000000000000000(0000) GS:ffff8933802c0000(0000) knlGS:0000000000000000
Nov 04 23:44:46 hyper kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 04 23:44:46 hyper kernel: CR2: 0000561dff3dd398 CR3: 0000000d3cc10005 CR4: 00000000003706e0
Nov 04 23:44:46 hyper kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 04 23:44:46 hyper kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Nov 04 23:44:46 hyper kernel: Call Trace:
Nov 04 23:44:46 hyper kernel: dbuf_destroy+0x31/0x460 [zfs]
Nov 04 23:44:46 hyper kernel: ? _cond_resched+0x1a/0x50
Nov 04 23:44:46 hyper kernel: dbuf_evict_one+0x10a/0x140 [zfs]
Nov 04 23:44:46 hyper kernel: dbuf_evict_thread+0x12d/0x1e0 [zfs]
Nov 04 23:44:46 hyper kernel: ? dbuf_evict_one+0x140/0x140 [zfs]
Nov 04 23:44:46 hyper kernel: thread_generic_wrapper+0x79/0x90 [spl]
Nov 04 23:44:46 hyper kernel: ? __thread_exit+0x20/0x20 [spl]
Nov 04 23:44:46 hyper kernel: kthread+0x12b/0x150
Nov 04 23:44:46 hyper kernel: ? set_kthread_struct+0x50/0x50
Nov 04 23:44:46 hyper kernel: ret_from_fork+0x22/0x30
Nov 04 23:44:46 hyper kernel: Modules linked in: tcp_diag inet_diag veth ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter bpfilter bonding tls softdog nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm i915 irqbypass crct10dif_pclmul ghash_clmulni_intel aesni_intel drm_kms_helper crypto_simd cryptd glue_helper cec rc_core dell_wmi fb_sys_fops rapl syscopyarea dell_smbios mei_hdcp sysfillrect dcdbas intel_cstate sysimgblt mei_me pcspkr intel_pch_thermal dell_wmi_sysman efi_pstore ee1004 mei dell_wmi_descriptor dell_wmi_aio wmi_bmof sparse_keymap mac_hid acpi_pad zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi drm sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio
Nov 04 23:44:46 hyper kernel: libcrc32c crc32_pclmul intel_lpss_pci i2c_i801 intel_lpss xhci_pci igb xhci_pci_renesas ahci idma64 i2c_algo_bit e1000e i2c_smbus dca libahci xhci_hcd virt_dma wmi video pinctrl_cannonlake
Nov 04 23:44:46 hyper kernel: ---[ end trace 7ffa6e57016e7358 ]---
Nov 04 23:44:47 hyper kernel: RIP: 0010:workingset_activation+0x4d/0xa0
Nov 04 23:44:47 hyper kernel: Code: 14 c5 e0 0b 75 92 0f 1f 44 00 00 48 8b 47 38 48 83 e0 fc 48 0f 44 05 9a ed cc 01 48 63 8a 00 9d 02 00 4c 8b 84 c8 00 0b 00 00 <49> 3b 90 88 00 00 00 75 35 48 8b 07 4c 89 c7 48 c1 e8 10 83 e0 01
Nov 04 23:44:47 hyper kernel: RSP: 0018:ffffa246c3643c88 EFLAGS: 00010282
Nov 04 23:44:47 hyper kernel: RAX: ffff8924a19cf068 RBX: ffffc558595232c0 RCX: 0000000000000000
Nov 04 23:44:47 hyper kernel: RDX: ffff8933c07d6000 RSI: 0000000500000000 RDI: ffffc558595232c0
Nov 04 23:44:47 hyper kernel: RBP: ffffa246c3643c88 R08: 0000000000000000 R09: ffff8924dbd6c880
Nov 04 23:44:47 hyper kernel: R10: 0000000000000001 R11: ffff8924dbd6c9f8 R12: ffff893380228f40
Nov 04 23:44:47 hyper kernel: R13: ffff892606d04608 R14: ffffa246c3643e60 R15: ffffa246c3643e38
Nov 04 23:44:47 hyper kernel: FS: 0000000000000000(0000) GS:ffff8933802c0000(0000) knlGS:0000000000000000
Nov 04 23:44:47 hyper kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 04 23:44:47 hyper kernel: CR2: 0000561dff3dd398 CR3: 0000000121970004 CR4: 00000000003706e0
Nov 04 23:44:47 hyper kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 04 23:44:47 hyper kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Nov 04 23:45:00 hyper systemd[1]: Starting Proxmox VE replication runner...
Nov 04 23:45:01 hyper systemd[1]: pvesr.service: Succeeded.
Nov 04 23:45:01 hyper systemd[1]: Finished Proxmox VE replication runner.
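Since the log above is long, here is a minimal sketch that boils it down to one entry per oops: when it happened, which process hit it, and the faulting function. It assumes the journal line format shown above ("Mon DD HH:MM:SS host kernel: ..."). The function name summarize_oopses and the regexes are my own and not part of any tool. The sample lines are copied from the log.

```python
import re

# "Nov 04 23:44:44 hyper kernel: <message>"
LINE = re.compile(r"^(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}) \S+ kernel: (?P<msg>.*)$")
COMM = re.compile(r"CPU: \d+ PID: (?P<pid>\d+) Comm: (?P<comm>\S+)")
# 0010 is the kernel code segment; user-space RIP lines (0033) are ignored.
RIP = re.compile(r"RIP: 0010:(?P<sym>\S+)(?: \[(?P<mod>\w+)\])?")
START = ("BUG:", "general protection fault")


def summarize_oopses(lines):
    """Collect one dict per kernel oops found in journal-style lines."""
    reports, cur = [], None
    for raw in lines:
        m = LINE.match(raw.strip())
        if not m:
            continue  # not a kernel message (e.g. pvedaemon, systemd)
        msg = m["msg"]
        if msg.startswith(START):
            cur = {"time": m["ts"], "what": msg, "pid": None,
                   "comm": None, "rip": None, "module": None}
            reports.append(cur)
        elif cur is None:
            continue  # register dump repeated after "end trace"
        elif "end trace" in msg:
            cur = None
        elif (c := COMM.search(msg)):
            cur["pid"], cur["comm"] = int(c["pid"]), c["comm"]
        elif cur["rip"] is None and (r := RIP.search(msg)):
            cur["rip"], cur["module"] = r["sym"], r["mod"]
    return reports


SAMPLE = """\
Nov 04 23:44:44 hyper kernel: BUG: kernel NULL pointer dereference, address: 0000000000000088
Nov 04 23:44:44 hyper kernel: CPU: 0 PID: 7363 Comm: qemu-img Tainted: P O 5.11.22-5-pve #1
Nov 04 23:44:44 hyper kernel: RIP: 0010:workingset_activation+0x4d/0xa0
Nov 04 23:44:44 hyper kernel: RIP: 0033:0x7f627777c917
Nov 04 23:44:44 hyper kernel: ---[ end trace 7ffa6e57016e7357 ]---
Nov 04 23:44:44 hyper kernel: RIP: 0010:workingset_activation+0x4d/0xa0
Nov 04 23:44:46 hyper kernel: general protection fault, probably for non-canonical address 0xefa33951ffff892a: 0000 [#2] SMP PTI
Nov 04 23:44:46 hyper kernel: CPU: 3 PID: 513 Comm: dbuf_evict Tainted: P D O 5.11.22-5-pve #1
Nov 04 23:44:46 hyper kernel: RIP: 0010:arc_buf_destroy+0x1c/0x110 [zfs]
Nov 04 23:44:46 hyper kernel: ---[ end trace 7ffa6e57016e7358 ]---
"""

reports = summarize_oopses(SAMPLE.splitlines())
for r in reports:
    where = r["rip"] + (f" [{r['module']}]" if r["module"] else "")
    print(f"{r['time']}  {r['comm']} (PID {r['pid']})  {where}")
```

Run against the full log, this should list the two oopses: the first in qemu-img inside the kernel's page cache code, and the second two seconds later in the ZFS dbuf_evict thread.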
Troubleshooting steps I have taken so far:
- Prior to installing Proxmox on this system with this exact hardware configuration, I did the following:
  - Fully updated the BIOS.
  - Ran 24 hours of memtest, which passed with no errors.
  - Installed both Ubuntu Server 20.04 and FreeBSD 13 on this machine to test it out. In both of those configurations, I formatted these S3700s with ZFS, filled them with data or used them as root disks, and tested them with no issues that I noticed.
  - Data-tested all 3 SSDs and ran long SMART tests on them, which passed with no errors.
- After getting the kernel panic above, I did the following:
  - Ran long SMART tests on the drives again, which passed again.
  - Checked the health of the zpool, which shows as online with no known data errors in "zpool status" and as healthy in the Proxmox UI.
Thank you and I appreciate your time if you read all this!