Proxmox suddenly stops responding, services still run for 30-60min

gigagames

Hey guys,
I have some weird problems with my Proxmox system.

At a random time it suddenly stops working correctly:
- The web page of the Proxmox server is still reachable
- On the Summary page no additional data is shown in the graphs
Screenshot_20240915_171038.png
I force-rebooted the system at around 17:10, so between ~16:50 and 17:10 the server was still running and all services were still responding.

- The server and all containers are shown with a '?'
- An SSH connection to the server isn't possible
- Opening the console is not possible
- But all services run just fine and are accessible

Restarting is only possible by pressing the power button or cutting the power.

If I leave the server running, 30-60 minutes later none of the services are reachable anymore.

In the latest logs I have the following. Can this be the reason? How can I fix it?


Sep 15 16:50:53 iris kernel: BUG: unable to handle page fault for address: 0000040000000000
Sep 15 16:50:53 iris kernel: #PF: supervisor write access in kernel mode
Sep 15 16:50:53 iris kernel: #PF: error_code(0x0002) - not-present page
Sep 15 16:50:53 iris kernel: PGD 0 P4D 0
Sep 15 16:50:53 iris kernel: Oops: 0002 [#1] PREEMPT SMP NOPTI
Sep 15 16:50:53 iris kernel: CPU: 2 PID: 1116 Comm: pvestatd Tainted: P O 6.8.12-1-pve #1
Sep 15 16:50:53 iris kernel: Hardware name: Default string Default string/Default string, BIOS 5.27 09/28/2023
Sep 15 16:50:53 iris kernel: RIP: 0010:__rb_insert_augmented+0x60/0x1e0
Sep 15 16:50:53 iris kernel: Code: f6 02 01 0f 84 ae 00 00 00 48 8b 57 08 49 89 fc 48 39 c2 0f 84 e4 00 00 00 48 89 53 10 48 89 5f 08 48 85 d2 74 07 48 8d 43 01 <48> 89 02 48 8b 13 48 89 17 4c 89 23 48 83 fa 03 76 76 48 83 e2 fc
Sep 15 16:50:53 iris kernel: RSP: 0018:ffffad3a097f3bb0 EFLAGS: 00010206
Sep 15 16:50:53 iris kernel: RAX: ffff931090cc11c1 RBX: ffff931090cc11c0 RCX: ffff9314a53b3900
Sep 15 16:50:53 iris kernel: RDX: 0000040000000000 RSI: ffff93108584d628 RDI: ffff93108fd74700
Sep 15 16:50:53 iris kernel: RBP: ffffad3a097f3bd0 R08: ffff93108fd74710 R09: 0000000000000000
Sep 15 16:50:53 iris kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff93108fd74700
Sep 15 16:50:53 iris kernel: R13: ffffffffa43b80a0 R14: ffff93108584d628 R15: ffffad3a097f3d88
Sep 15 16:50:53 iris kernel: FS: 0000762187851b80(0000) GS:ffff9317dfb00000(0000) knlGS:0000000000000000
Sep 15 16:50:53 iris kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 15 16:50:53 iris kernel: CR2: 0000040000000000 CR3: 000000010e46a005 CR4: 0000000000f72ef0
Sep 15 16:50:53 iris kernel: PKRU: 55555554
Sep 15 16:50:53 iris kernel: Call Trace:
Sep 15 16:50:53 iris kernel: <TASK>
Sep 15 16:50:53 iris kernel: ? show_regs+0x6d/0x80
Sep 15 16:50:53 iris kernel: ? __die+0x24/0x80
Sep 15 16:50:53 iris kernel: ? page_fault_oops+0x176/0x4f0
Sep 15 16:50:53 iris kernel: ? do_user_addr_fault+0x2ed/0x650
Sep 15 16:50:53 iris kernel: ? exc_page_fault+0x83/0x1b0
Sep 15 16:50:53 iris kernel: ? asm_exc_page_fault+0x27/0x30
Sep 15 16:50:53 iris kernel: ? __pfx_vma_interval_tree_augment_rotate+0x10/0x10
Sep 15 16:50:53 iris kernel: ? __rb_insert_augmented+0x60/0x1e0
Sep 15 16:50:53 iris kernel: vma_interval_tree_insert_after+0x97/0xc0
Sep 15 16:50:53 iris kernel: copy_process+0x205a/0x2570
Sep 15 16:50:53 iris kernel: kernel_clone+0xbd/0x440
Sep 15 16:50:53 iris kernel: ? wp_page_reuse+0x95/0xc0
Sep 15 16:50:53 iris kernel: __do_sys_clone+0x66/0xa0
Sep 15 16:50:53 iris kernel: __x64_sys_clone+0x25/0x40
Sep 15 16:50:53 iris kernel: x64_sys_call+0x1d0e/0x24b0
Sep 15 16:50:53 iris kernel: do_syscall_64+0x81/0x170
Sep 15 16:50:53 iris kernel: ? __count_memcg_events+0x6f/0xe0
Sep 15 16:50:53 iris kernel: ? count_memcg_events.constprop.0+0x2a/0x50
Sep 15 16:50:53 iris kernel: ? handle_mm_fault+0xad/0x380
Sep 15 16:50:53 iris kernel: ? do_user_addr_fault+0x337/0x650
Sep 15 16:50:53 iris kernel: ? irqentry_exit_to_user_mode+0x7e/0x260
Sep 15 16:50:53 iris kernel: ? irqentry_exit+0x43/0x50
Sep 15 16:50:53 iris kernel: ? exc_page_fault+0x94/0x1b0
Sep 15 16:50:53 iris kernel: entry_SYSCALL_64_after_hwframe+0x78/0x80
Sep 15 16:50:53 iris kernel: RIP: 0033:0x762187963293
Sep 15 16:50:53 iris kernel: Code: 00 00 00 00 00 66 90 64 48 8b 04 25 10 00 00 00 45 31 c0 31 d2 31 f6 bf 11 00 20 01 4c 8d 90 d0 02 00 00 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 89 c2 85 c0 75 2c 64 48 8b 04 25 10 00 00
Sep 15 16:50:53 iris kernel: RSP: 002b:00007ffc0789d4e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
Sep 15 16:50:53 iris kernel: RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 0000762187963293
Sep 15 16:50:53 iris kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
Sep 15 16:50:53 iris kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Sep 15 16:50:53 iris kernel: R10: 0000762187851e50 R11: 0000000000000246 R12: 0000000000000001
Sep 15 16:50:53 iris kernel: R13: 00007ffc0789d600 R14: 00007ffc0789d680 R15: 0000762187b8a020
Sep 15 16:50:53 iris kernel: </TASK>
Sep 15 16:50:53 iris kernel: Modules linked in: udp_diag nfsv3 nfs_acl dm_snapshot cfg80211 veth tcp_diag inet_diag ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter rpcsec_gss_krb>
Sep 15 16:50:53 iris kernel: snd_hda_core sha256_ssse3 sha1_ssse3 aesni_intel ttm crypto_simd cryptd snd_hwdep snd_pcm drm_display_helper rapl intel_pmc_core snd_timer intel_cstate cmdlinepart cec pcspkr wmi_bmof snd rc_core sp>
Sep 15 16:50:53 iris kernel: CR2: 0000040000000000
Sep 15 16:50:53 iris kernel: ---[ end trace 0000000000000000 ]---
Sep 15 16:50:53 iris kernel: RIP: 0010:__rb_insert_augmented+0x60/0x1e0
Sep 15 16:50:53 iris kernel: Code: f6 02 01 0f 84 ae 00 00 00 48 8b 57 08 49 89 fc 48 39 c2 0f 84 e4 00 00 00 48 89 53 10 48 89 5f 08 48 85 d2 74 07 48 8d 43 01 <48> 89 02 48 8b 13 48 89 17 4c 89 23 48 83 fa 03 76 76 48 83 e2 fc
Sep 15 16:50:53 iris kernel: RSP: 0018:ffffad3a097f3bb0 EFLAGS: 00010206
Sep 15 16:50:53 iris kernel: RAX: ffff931090cc11c1 RBX: ffff931090cc11c0 RCX: ffff9314a53b3900
Sep 15 16:50:53 iris kernel: RDX: 0000040000000000 RSI: ffff93108584d628 RDI: ffff93108fd74700
Sep 15 16:50:53 iris kernel: RBP: ffffad3a097f3bd0 R08: ffff93108fd74710 R09: 0000000000000000
Sep 15 16:50:53 iris kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff93108fd74700
Sep 15 16:50:53 iris kernel: R13: ffffffffa43b80a0 R14: ffff93108584d628 R15: ffffad3a097f3d88
Sep 15 16:50:53 iris kernel: FS: 0000762187851b80(0000) GS:ffff9317dfb00000(0000) knlGS:0000000000000000
Sep 15 16:50:53 iris kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 15 16:50:53 iris kernel: CR2: 0000040000000000 CR3: 000000010e46a005 CR4: 0000000000f72ef0
Sep 15 16:50:53 iris kernel: PKRU: 55555554
 

Attachment: journalctl.txt (606.6 KB)

Hi,
sounds like it might be a kernel issue. You could try the newer kernel from the test repository: https://pve.proxmox.com/wiki/Package_Repositories#sysadmin_test_repo
You can temporarily enable the repository (e.g. via the Repositories section in the UI), run apt update, then apt install proxmox-kernel-6.8. Afterwards, disable the repository again and run apt update once more.
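
On the command line, those steps could look roughly like the sketch below. It assumes Proxmox VE 8 on Debian 12 (bookworm), which is what the 6.8.12-1-pve kernel in the log suggests. The file name in TEST_LIST and the run helper are made up for this example. By default it only prints the commands (DRY_RUN=1); set DRY_RUN=0 and run it as root to actually execute them.

```shell
#!/bin/sh
# Sketch under assumptions: Proxmox VE 8 on Debian 12 "bookworm", run as root.
# TEST_LIST is a hypothetical file name chosen for this example.
# With DRY_RUN=1 (the default) the steps are only printed, not executed.
set -eu

DRY_RUN="${DRY_RUN:-1}"
TEST_LIST="${TEST_LIST:-/etc/apt/sources.list.d/pvetest-temp.list}"
REPO_LINE="deb http://download.proxmox.com/debian/pve bookworm pvetest"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_test_kernel() {
    # 1. Temporarily enable the test repository.
    run sh -c "echo '$REPO_LINE' > '$TEST_LIST'"
    run apt update
    # 2. Install the newer 6.8 kernel build.
    run apt install proxmox-kernel-6.8
    # 3. Disable the test repository again and refresh the package index.
    run rm -f "$TEST_LIST"
    run apt update
}

install_test_kernel
```

After the install, reboot and check uname -r to confirm that the new kernel is actually running.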

Alternatively, you can try booting into an older kernel.
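
For the older-kernel route, here is a rough sketch, assuming proxmox-boot-tool manages your boot entries (PVE 7.2 or later). OLD_KERNEL is a placeholder variable for a version string you copy from the list output; it is not taken from your system. As written it only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch under assumptions: boot entries are managed by proxmox-boot-tool.
# With DRY_RUN=1 (the default) the commands are only printed, not executed.
set -eu

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Show the kernels that are installed and can be selected.
list_kernels() {
    run proxmox-boot-tool kernel list
}

# Pin one installed version for the next boot only.
# $1 is a version string copied from the list output.
pin_kernel_once() {
    run proxmox-boot-tool kernel pin "$1" --next-boot
}

list_kernels
# Only pin when a version was supplied, e.g. OLD_KERNEL=<version> sh thisscript.sh
if [ -n "${OLD_KERNEL:-}" ]; then
    pin_kernel_once "$OLD_KERNEL"
fi
```

Pinning for the next boot only means a plain reboot brings you back to the default kernel if the older one does not help.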

I'd also make sure to have the latest CPU microcode installed: https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#sysadmin_firmware_cpu
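
To see which microcode revision is currently loaded, and to compare it before and after the update, something like this should do. It only reads the microcode field from /proc/cpuinfo; the function name is made up for the example.

```shell
#!/bin/sh
# Sketch: show the microcode revision the CPU is currently running.
# Takes an optional file argument so it can also be tried on saved output;
# defaults to /proc/cpuinfo on the live system.
microcode_revisions() {
    # The "microcode" field is repeated per logical CPU; collapse identical lines.
    grep '^microcode' "${1:-/proc/cpuinfo}" | sort -u
}

microcode_revisions
```

If the revision is unchanged after installing the microcode package and rebooting, the package was probably not picked up at boot.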
 
