Hi!
I have a new Proxmox server installed on a Tuxedo mini server (https://www.tuxedocomputers.com/en/TUXEDO-Nano-Pro-Gen12.tuxedo).
After a while of setting up VMs, the server becomes unresponsive to the extent that I can no longer do anything and have to turn it off with the power switch.
This has happened twice, and the only thing I see in dmesg is:
[ 9326.421441] BUG: kernel NULL pointer dereference, address: 0000000000000510
[ 9326.421473] #PF: supervisor write access in kernel mode
[ 9326.421486] #PF: error_code(0x0002) - not-present page
[ 9326.421500] PGD 0 P4D 0
[ 9326.421511] Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 9326.421534] CPU: 2 PID: 119 Comm: ksmd Tainted: P O 6.5.11-8-pve #1
[ 9326.421554] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./4X4-7040 Series/D5, BIOS P1.00 09/20/2023
[ 9326.421583] RIP: 0010:ksm_scan_thread+0x35c/0x2060
[ 9326.421602] Code: 82 f2 04 00 00 48 8b 03 48 89 df 49 89 45 00 e8 4a d6 ff ff 48 8b 43 10 48 89 de 48 8b 3d ec d9 3b 03 48 83 2d b4 d9 3b 03 01 <48> 83 a8 10 05 00 00 01 48 c7 43 10 00 00 00 00 e8 3f b0 00 00 49
[ 9326.421651] RSP: 0018:ffffae1300577e18 EFLAGS: 00010212
[ 9326.421665] RAX: 0000000000000000 RBX: ffff9c7348ba6200 RCX: 0000000000000000
[ 9326.421684] RDX: 0000000000000000 RSI: ffff9c7348ba6200 RDI: ffff9c7340222b00
[ 9326.421705] RBP: ffffae1300577ee0 R08: 0000000000000000 R09: 0000000000000000
[ 9326.421727] R10: 0000000000000001 R11: 0000000000000000 R12: ffff9c734006d800
[ 9326.421751] R13: ffff9c7340ba6940 R14: 00007f67677c0000 R15: ffffcfed481f7000
[ 9326.421773] FS: 0000000000000000(0000) GS:ffff9c823e880000(0000) knlGS:0000000000000000
[ 9326.421802] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9326.421818] CR2: 0000000000000510 CR3: 000000010c234000 CR4: 0000000000750ee0
[ 9326.421836] PKRU: 55555554
[ 9326.421845] Call Trace:
[ 9326.421854] <TASK>
[ 9326.421866] ? show_regs+0x6d/0x80
[ 9326.421886] ? __die+0x24/0x80
[ 9326.421897] ? page_fault_oops+0x176/0x500
[ 9326.421916] ? srso_alias_return_thunk+0x5/0x7f
[ 9326.421940] ? psi_task_switch+0xd3/0x240
[ 9326.421961] ? do_user_addr_fault+0x31d/0x6a0
[ 9326.421977] ? exc_page_fault+0x83/0x1b0
[ 9326.421998] ? asm_exc_page_fault+0x27/0x30
[ 9326.422025] ? ksm_scan_thread+0x35c/0x2060
[ 9326.422038] ? ksm_scan_thread+0x346/0x2060
[ 9326.422055] ? __pfx_ksm_scan_thread+0x10/0x10
[ 9326.422068] kthread+0xef/0x120
[ 9326.422080] ? __pfx_kthread+0x10/0x10
[ 9326.422097] ret_from_fork+0x44/0x70
[ 9326.422111] ? __pfx_kthread+0x10/0x10
[ 9326.422125] ret_from_fork_asm+0x1b/0x30
[ 9326.422146] </TASK>
[ 9326.422152] Modules linked in: tcp_diag inet_diag veth ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter bpfilter nf_tables softdog bonding tls sunrpc binfmt_misc nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common edac_mce_amd snd_hda_codec_realtek amdgpu kvm_amd snd_hda_codec_generic ledtrig_audio kvm mt7921e snd_hda_codec_hdmi mt7921_common amdxcp btusb iommu_v2 irqbypass mt76_connac_lib btrtl drm_buddy crct10dif_pclmul snd_hda_intel btbcm gpu_sched polyval_clmulni mt76 btintel snd_intel_dspcfg polyval_generic drm_suballoc_helper snd_intel_sdw_acpi ghash_clmulni_intel drm_ttm_helper btmtk ttm aesni_intel snd_hda_codec mac80211 bluetooth drm_display_helper crypto_simd snd_hda_core cryptd cec snd_hwdep ecdh_generic snd_pcm rc_core ecc rapl pcspkr snd_timer cfg80211 drm_kms_helper k10temp snd ipmi_devintf i2c_algo_bit ccp soundcore libarc4 ipmi_msghandler amd_pmc joydev input_leds mac_hid zfs(PO) spl(O) vhost_net vhost vhost_iotlb tap drm efi_pstore dmi_sysfs
[ 9326.422238] ip_tables x_tables autofs4 btrfs blake2b_generic xor hid_generic usbkbd usbmouse usbhid raid6_pq simplefb uas usb_storage dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c xhci_pci nvme xhci_pci_renesas crc32_pclmul thunderbolt xhci_hcd nvme_core ahci ehci_pci r8169 i2c_piix4 i2c_hid_acpi libahci ehci_hcd nvme_common video realtek i2c_hid wmi hid
[ 9326.422460] CR2: 0000000000000510
[ 9326.422469] ---[ end trace 0000000000000000 ]---
[ 9326.547114] RIP: 0010:ksm_scan_thread+0x35c/0x2060
[ 9326.547127] Code: 82 f2 04 00 00 48 8b 03 48 89 df 49 89 45 00 e8 4a d6 ff ff 48 8b 43 10 48 89 de 48 8b 3d ec d9 3b 03 48 83 2d b4 d9 3b 03 01 <48> 83 a8 10 05 00 00 01 48 c7 43 10 00 00 00 00 e8 3f b0 00 00 49
[ 9326.547144] RSP: 0018:ffffae1300577e18 EFLAGS: 00010212
[ 9326.547152] RAX: 0000000000000000 RBX: ffff9c7348ba6200 RCX: 0000000000000000
[ 9326.547161] RDX: 0000000000000000 RSI: ffff9c7348ba6200 RDI: ffff9c7340222b00
[ 9326.547170] RBP: ffffae1300577ee0 R08: 0000000000000000 R09: 0000000000000000
[ 9326.547179] R10: 0000000000000001 R11: 0000000000000000 R12: ffff9c734006d800
[ 9326.547187] R13: ffff9c7340ba6940 R14: 00007f67677c0000 R15: ffffcfed481f7000
[ 9326.547196] FS: 0000000000000000(0000) GS:ffff9c823e880000(0000) knlGS:0000000000000000
[ 9326.547626] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9326.548004] CR2: 0000000000000510 CR3: 000000067d388000 CR4: 0000000000750ee0
[ 9326.548376] PKRU: 55555554
[ 9326.548739] note: ksmd[119] exited with irqs disabled
Does anyone know what could be wrong?