I run Proxmox on a PC Engines APU2C4 (firmware v4.10.0.1) and get kernel crashes every few days. Sometimes the system becomes unresponsive and only a hard reset helps; other times I can still SSH into it or log in via the serial console.
Here is a stack trace I was able to get:
Code:
[29606.593925] kvm: Corrupted page table at address e1
[29606.598818] PGD 0 P4D 0
[29606.601368] Bad pagetable: 79c8 [#1] SMP NOPTI
[29606.605841] CPU: 1 PID: 1102 Comm: kvm Not tainted 5.0.21-1-pve #1
[29606.612024] Hardware name: PC Engines apu2/apu2, BIOS v4.10.0.1 09/10/2019
[29606.618965] RIP: 0010:eventfd_poll+0xa/0x70
[29606.623154] Code: 83 c0 00 00 00 48 8d 98 40 ff ff ff 48 3d 60 cd 50 a8 75 9a 5b 41 5c 41 5d 41 5e 41 5f 5d c3 90 0f 1f 44 00 00 55 48 89 e5 53 <48> 8b 9f c8 00 00 00 48 85 f6 74 1c 48 8b 06 48 8d 4b 08 48 85 c0
[29606.641993] RSP: 0018:ffffb2e4c09a7a88 EFLAGS: 00010282
[29606.647255] RAX: ffffffffa7104310 RBX: ffffa069ec8e3400 RCX: ffffa06a3abd8400
[29606.654449] RDX: ffffa06a3abd8438 RSI: ffffb2e4c09a7c00 RDI: 0000000000000019
[29606.661589] RBP: ffffb2e4c09a7a90 R08: ffffa06a3abd8401 R09: 0000000000000045
[29606.668778] R10: ffffa06a3abd8400 R11: 000000000000000c R12: 0000000000000019
[29606.675914] R13: 0000000000000000 R14: 0000000000000000 R15: ffffa069ec8e3424
[29606.683048] FS: 00007f7b91e2edc0(0000) GS:ffffa06aaaa80000(0000) knlGS:0000000000000000
[29606.691147] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[29606.696896] CR2: 00000000000000e1 CR3: 00000000c3354000 CR4: 00000000000406e0
[29606.704046] Call Trace:
[29606.706518] do_sys_poll+0x252/0x530
[29606.710104] ? __skb_datagram_iter+0x6e/0x2c0
[29606.714542] ? poll_select_copy_remaining+0x1b0/0x1b0
[29606.719597] ? poll_select_copy_remaining+0x1b0/0x1b0
[29606.724658] ? poll_select_copy_remaining+0x1b0/0x1b0
[29606.729720] ? poll_select_copy_remaining+0x1b0/0x1b0
[29606.734825] ? _copy_from_user+0x3e/0x60
[29606.738812] ? wake_up_q+0x80/0x80
[29606.742225] ? ktime_get_ts64+0x46/0xe0
[29606.746078] __x64_sys_ppoll+0xbd/0x120
[29606.749926] do_syscall_64+0x5a/0x110
[29606.753598] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[29606.758667] RIP: 0033:0x7f7b9e70e916
[29606.762294] Code: 7c 24 08 e8 5c 7e 01 00 41 b8 08 00 00 00 4c 8b 54 24 18 48 89 da 41 89 c1 48 8b 74 24 10 48 8b 7c 24 08 b8 0f 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 28 44 89 cf 89 44 24 08 e8 86 7e 01 00 8b 44
[29606.781067] RSP: 002b:00007ffe2b6a5200 EFLAGS: 00000293 ORIG_RAX: 000000000000010f
[29606.788709] RAX: ffffffffffffffda RBX: 00007ffe2b6a5220 RCX: 00007f7b9e70e916
[29606.795887] RDX: 00007ffe2b6a5220 RSI: 0000000000000050 RDI: 00007f7b4d1e1400
[29606.803017] RBP: 00007ffe2b6a5290 R08: 0000000000000008 R09: 0000000000000000
[29606.810251] R10: 0000000000000000 R11: 0000000000000293 R12: 00007f7b91b53580
[29606.817413] R13: 00007f7b91b53580 R14: 00007ffe2b6a528c R15: 00005611fea091f0
[29606.824609] Modules linked in: cfg80211 veth vfio_pci vfio_virqfd vfio_iommu_type1 vfio ebtable_filter ebtables ip_set ip6table_filter ip6_tables iptable_filter bpfilter 8021q garp mrp dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c sp5100_tco amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel leds_apu aesni_intel aes_x86_64 crypto_simd cryptd glue_helper snd_pcm snd_timer snd soundcore pcspkr fam15h_power k10temp ccp leds_gpio mac_hid nfnetlink_log nfnetlink vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 sdhci_pci cqhci ahci sdhci i2c_piix4 libahci igb i2c_algo_bit dca gpio_keys
[29606.890304] ---[ end trace 7cd79a73330e8d36 ]---
[29606.894940] RIP: 0010:eventfd_poll+0xa/0x70
[29606.899134] Code: 83 c0 00 00 00 48 8d 98 40 ff ff ff 48 3d 60 cd 50 a8 75 9a 5b 41 5c 41 5d 41 5e 41 5f 5d c3 90 0f 1f 44 00 00 55 48 89 e5 53 <48> 8b 9f c8 00 00 00 48 85 f6 74 1c 48 8b 06 48 8d 4b 08 48 85 c0
[29606.917976] RSP: 0018:ffffb2e4c09a7a88 EFLAGS: 00010282
[29606.923218] RAX: ffffffffa7104310 RBX: ffffa069ec8e3400 RCX: ffffa06a3abd8400
[29606.930396] RDX: ffffa06a3abd8438 RSI: ffffb2e4c09a7c00 RDI: 0000000000000019
[29606.937550] RBP: ffffb2e4c09a7a90 R08: ffffa06a3abd8401 R09: 0000000000000045
[29606.944690] R10: ffffa06a3abd8400 R11: 000000000000000c R12: 0000000000000019
[29606.951825] R13: 0000000000000000 R14: 0000000000000000 R15: ffffa069ec8e3424
[29606.958997] FS: 00007f7b91e2edc0(0000) GS:ffffa06aaaa80000(0000) knlGS:0000000000000000
[29606.967085] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[29606.972831] CR2: 00000000000000e1 CR3: 00000000c3354000 CR4: 00000000000406e0
[31979.496121] BUG: unable to handle kernel paging request at ffffb2e4c09a7cd8
[31979.503106] #PF error: [normal kernel read fault]
[31979.507810] PGD 12a550067 P4D 12a550067 PUD 12a551067 PMD 1299a4067 PTE 0
[31979.514668] Oops: 0000 [#2] SMP NOPTI
[31979.518337] CPU: 0 PID: 1251 Comm: kvm Tainted: G D 5.0.21-1-pve #1
[31979.525906] Hardware name: PC Engines apu2/apu2, BIOS v4.10.0.1 09/10/2019
[31979.532818] RIP: 0010:__wake_up_common+0x41/0x140
[31979.537585] Code: ec 18 4d 85 c9 74 0a 41 f6 01 04 0f 85 a2 00 00 00 48 8b 43 08 48 83 c3 08 48 8d 78 e8 48 8d 47 18 48 39 c3 0f 84 cc 00 00 00 <48> 8b 47 18 4c 89 45 c8 4d 89 cd 45 31 e4 89 4d d0 89 75 d4 4c 8d
[31979.556377] RSP: 0018:ffffb2e4c1343dc8 EFLAGS: 00010097
[31979.561662] RAX: ffffb2e4c09a7cd8 RBX: ffffa06a9f61b590 RCX: 0000000000000000
[31979.568823] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffb2e4c09a7cc0
[31979.575973] RBP: ffffb2e4c1343e08 R08: 0000000000000001 R09: 0000000000000000
[31979.583178] R10: ffffa06a3a956200 R11: 0000000000000000 R12: ffffa06a9f61b590
[31979.590327] R13: ffffa06a9f61b580 R14: ffffa06a3aa9ae00 R15: 0000000000000008
[31979.597503] FS: 00007f7b8e1ff700(0000) GS:ffffa06aaaa00000(0000) knlGS:0000000000000000
[31979.605624] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[31979.611420] CR2: ffffb2e4c09a7cd8 CR3: 00000000c3354000 CR4: 00000000000406f0
[31979.618587] Call Trace:
[31979.621065] __wake_up_locked_key+0x1b/0x20
[31979.625257] eventfd_write+0xd3/0x270
[31979.628965] ? wake_up_q+0x80/0x80
[31979.632428] __vfs_write+0x1b/0x40
[31979.635850] vfs_write+0xab/0x1b0
[31979.639173] ksys_write+0x5c/0xd0
[31979.642500] __x64_sys_write+0x1a/0x20
[31979.646273] do_syscall_64+0x5a/0x110
[31979.649949] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[31979.655048] RIP: 0033:0x7f7b9e7f24a7
[31979.658648] Code: 44 00 00 41 54 49 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 fb fc ff ff 4c 89 e2 48 89 ee 89 df 41 89 c0 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 34 fd ff ff 48
[31979.677514] RSP: 002b:00007f7b8e1fa340 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[31979.685149] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f7b9e7f24a7
[31979.692295] RDX: 0000000000000008 RSI: 00005611feb265f8 RDI: 000000000000000d
[31979.699446] RBP: 00005611feb265f8 R08: 0000000000000000 R09: 0000000000000014
[31979.706642] R10: 000000000000001d R11: 0000000000000293 R12: 0000000000000008
[31979.713781] R13: 0000000000000000 R14: 0000000000000001 R15: 00005611fe5c22d0
[31979.720933] Modules linked in: cfg80211 veth vfio_pci vfio_virqfd vfio_iommu_type1 vfio ebtable_filter ebtables ip_set ip6table_filter ip6_tables iptable_filter bpfilter 8021q garp mrp dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c sp5100_tco amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel leds_apu aesni_intel aes_x86_64 crypto_simd cryptd glue_helper snd_pcm snd_timer snd soundcore pcspkr fam15h_power k10temp ccp leds_gpio mac_hid nfnetlink_log nfnetlink vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 sdhci_pci cqhci ahci sdhci i2c_piix4 libahci igb i2c_algo_bit dca gpio_keys
[31979.786576] CR2: ffffb2e4c09a7cd8
[31979.789903] ---[ end trace 7cd79a73330e8d37 ]---
[31979.794565] RIP: 0010:eventfd_poll+0xa/0x70
[31979.798803] Code: 83 c0 00 00 00 48 8d 98 40 ff ff ff 48 3d 60 cd 50 a8 75 9a 5b 41 5c 41 5d 41 5e 41 5f 5d c3 90 0f 1f 44 00 00 55 48 89 e5 53 <48> 8b 9f c8 00 00 00 48 85 f6 74 1c 48 8b 06 48 8d 4b 08 48 85 c0
[31979.817616] RSP: 0018:ffffb2e4c09a7a88 EFLAGS: 00010282
[31979.822895] RAX: ffffffffa7104310 RBX: ffffa069ec8e3400 RCX: ffffa06a3abd8400
[31979.830062] RDX: ffffa06a3abd8438 RSI: ffffb2e4c09a7c00 RDI: 0000000000000019
[31979.837191] RBP: ffffb2e4c09a7a90 R08: ffffa06a3abd8401 R09: 0000000000000045
[31979.844325] R10: ffffa06a3abd8400 R11: 000000000000000c R12: 0000000000000019
[31979.851587] R13: 0000000000000000 R14: 0000000000000000 R15: ffffa069ec8e3424
[31979.858732] FS: 00007f7b8e1ff700(0000) GS:ffffa06aaaa00000(0000) knlGS:0000000000000000
[31979.866841] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[31979.872662] CR2: ffffb2e4c09a7cd8 CR3: 00000000c3354000 CR4: 00000000000406f0
Any suggestions on what might be causing this?