kvm: Corrupted page table at address e1

sebastian_de

Sep 20, 2019
I run Proxmox on a PC Engines APU2C4 (firmware v4.10.0.1) and get kernel crashes every few days. Sometimes the system becomes unresponsive and only a hard reset helps; sometimes I can still SSH into it or log in via the serial console.

Here is a stack trace I was able to get:
Code:
[29606.593925] kvm: Corrupted page table at address e1
[29606.598818] PGD 0 P4D 0
[29606.601368] Bad pagetable: 79c8 [#1] SMP NOPTI
[29606.605841] CPU: 1 PID: 1102 Comm: kvm Not tainted 5.0.21-1-pve #1
[29606.612024] Hardware name: PC Engines apu2/apu2, BIOS v4.10.0.1 09/10/2019
[29606.618965] RIP: 0010:eventfd_poll+0xa/0x70
[29606.623154] Code: 83 c0 00 00 00 48 8d 98 40 ff ff ff 48 3d 60 cd 50 a8 75 9a 5b 41 5c 41 5d 41 5e 41 5f 5d c3 90 0f 1f 44 00 00 55 48 89 e5 53 <48> 8b 9f c8 00 00 00 48 85 f6 74 1c 48 8b 06 48 8d 4b 08 48 85 c0
[29606.641993] RSP: 0018:ffffb2e4c09a7a88 EFLAGS: 00010282
[29606.647255] RAX: ffffffffa7104310 RBX: ffffa069ec8e3400 RCX: ffffa06a3abd8400
[29606.654449] RDX: ffffa06a3abd8438 RSI: ffffb2e4c09a7c00 RDI: 0000000000000019
[29606.661589] RBP: ffffb2e4c09a7a90 R08: ffffa06a3abd8401 R09: 0000000000000045
[29606.668778] R10: ffffa06a3abd8400 R11: 000000000000000c R12: 0000000000000019
[29606.675914] R13: 0000000000000000 R14: 0000000000000000 R15: ffffa069ec8e3424
[29606.683048] FS:  00007f7b91e2edc0(0000) GS:ffffa06aaaa80000(0000) knlGS:0000000000000000
[29606.691147] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[29606.696896] CR2: 00000000000000e1 CR3: 00000000c3354000 CR4: 00000000000406e0
[29606.704046] Call Trace:
[29606.706518]  do_sys_poll+0x252/0x530
[29606.710104]  ? __skb_datagram_iter+0x6e/0x2c0
[29606.714542]  ? poll_select_copy_remaining+0x1b0/0x1b0
[29606.719597]  ? poll_select_copy_remaining+0x1b0/0x1b0
[29606.724658]  ? poll_select_copy_remaining+0x1b0/0x1b0
[29606.729720]  ? poll_select_copy_remaining+0x1b0/0x1b0
[29606.734825]  ? _copy_from_user+0x3e/0x60
[29606.738812]  ? wake_up_q+0x80/0x80
[29606.742225]  ? ktime_get_ts64+0x46/0xe0
[29606.746078]  __x64_sys_ppoll+0xbd/0x120
[29606.749926]  do_syscall_64+0x5a/0x110
[29606.753598]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[29606.758667] RIP: 0033:0x7f7b9e70e916
[29606.762294] Code: 7c 24 08 e8 5c 7e 01 00 41 b8 08 00 00 00 4c 8b 54 24 18 48 89 da 41 89 c1 48 8b 74 24 10 48 8b 7c 24 08 b8 0f 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 28 44 89 cf 89 44 24 08 e8 86 7e 01 00 8b 44
[29606.781067] RSP: 002b:00007ffe2b6a5200 EFLAGS: 00000293 ORIG_RAX: 000000000000010f
[29606.788709] RAX: ffffffffffffffda RBX: 00007ffe2b6a5220 RCX: 00007f7b9e70e916
[29606.795887] RDX: 00007ffe2b6a5220 RSI: 0000000000000050 RDI: 00007f7b4d1e1400
[29606.803017] RBP: 00007ffe2b6a5290 R08: 0000000000000008 R09: 0000000000000000
[29606.810251] R10: 0000000000000000 R11: 0000000000000293 R12: 00007f7b91b53580
[29606.817413] R13: 00007f7b91b53580 R14: 00007ffe2b6a528c R15: 00005611fea091f0
[29606.824609] Modules linked in: cfg80211 veth vfio_pci vfio_virqfd vfio_iommu_type1 vfio ebtable_filter ebtables ip_set ip6table_filter ip6_tables iptable_filter bpfilter 8021q garp mrp dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c sp5100_tco amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel leds_apu aesni_intel aes_x86_64 crypto_simd cryptd glue_helper snd_pcm snd_timer snd soundcore pcspkr fam15h_power k10temp ccp leds_gpio mac_hid nfnetlink_log nfnetlink vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 sdhci_pci cqhci ahci sdhci i2c_piix4 libahci igb i2c_algo_bit dca gpio_keys
[29606.890304] ---[ end trace 7cd79a73330e8d36 ]---
[29606.894940] RIP: 0010:eventfd_poll+0xa/0x70
[29606.899134] Code: 83 c0 00 00 00 48 8d 98 40 ff ff ff 48 3d 60 cd 50 a8 75 9a 5b 41 5c 41 5d 41 5e 41 5f 5d c3 90 0f 1f 44 00 00 55 48 89 e5 53 <48> 8b 9f c8 00 00 00 48 85 f6 74 1c 48 8b 06 48 8d 4b 08 48 85 c0
[29606.917976] RSP: 0018:ffffb2e4c09a7a88 EFLAGS: 00010282
[29606.923218] RAX: ffffffffa7104310 RBX: ffffa069ec8e3400 RCX: ffffa06a3abd8400
[29606.930396] RDX: ffffa06a3abd8438 RSI: ffffb2e4c09a7c00 RDI: 0000000000000019
[29606.937550] RBP: ffffb2e4c09a7a90 R08: ffffa06a3abd8401 R09: 0000000000000045
[29606.944690] R10: ffffa06a3abd8400 R11: 000000000000000c R12: 0000000000000019
[29606.951825] R13: 0000000000000000 R14: 0000000000000000 R15: ffffa069ec8e3424
[29606.958997] FS:  00007f7b91e2edc0(0000) GS:ffffa06aaaa80000(0000) knlGS:0000000000000000
[29606.967085] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[29606.972831] CR2: 00000000000000e1 CR3: 00000000c3354000 CR4: 00000000000406e0
[31979.496121] BUG: unable to handle kernel paging request at ffffb2e4c09a7cd8
[31979.503106] #PF error: [normal kernel read fault]
[31979.507810] PGD 12a550067 P4D 12a550067 PUD 12a551067 PMD 1299a4067 PTE 0
[31979.514668] Oops: 0000 [#2] SMP NOPTI
[31979.518337] CPU: 0 PID: 1251 Comm: kvm Tainted: G      D           5.0.21-1-pve #1
[31979.525906] Hardware name: PC Engines apu2/apu2, BIOS v4.10.0.1 09/10/2019
[31979.532818] RIP: 0010:__wake_up_common+0x41/0x140
[31979.537585] Code: ec 18 4d 85 c9 74 0a 41 f6 01 04 0f 85 a2 00 00 00 48 8b 43 08 48 83 c3 08 48 8d 78 e8 48 8d 47 18 48 39 c3 0f 84 cc 00 00 00 <48> 8b 47 18 4c 89 45 c8 4d 89 cd 45 31 e4 89 4d d0 89 75 d4 4c 8d
[31979.556377] RSP: 0018:ffffb2e4c1343dc8 EFLAGS: 00010097
[31979.561662] RAX: ffffb2e4c09a7cd8 RBX: ffffa06a9f61b590 RCX: 0000000000000000
[31979.568823] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffb2e4c09a7cc0
[31979.575973] RBP: ffffb2e4c1343e08 R08: 0000000000000001 R09: 0000000000000000
[31979.583178] R10: ffffa06a3a956200 R11: 0000000000000000 R12: ffffa06a9f61b590
[31979.590327] R13: ffffa06a9f61b580 R14: ffffa06a3aa9ae00 R15: 0000000000000008
[31979.597503] FS:  00007f7b8e1ff700(0000) GS:ffffa06aaaa00000(0000) knlGS:0000000000000000
[31979.605624] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[31979.611420] CR2: ffffb2e4c09a7cd8 CR3: 00000000c3354000 CR4: 00000000000406f0
[31979.618587] Call Trace:
[31979.621065]  __wake_up_locked_key+0x1b/0x20
[31979.625257]  eventfd_write+0xd3/0x270
[31979.628965]  ? wake_up_q+0x80/0x80
[31979.632428]  __vfs_write+0x1b/0x40
[31979.635850]  vfs_write+0xab/0x1b0
[31979.639173]  ksys_write+0x5c/0xd0
[31979.642500]  __x64_sys_write+0x1a/0x20
[31979.646273]  do_syscall_64+0x5a/0x110
[31979.649949]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[31979.655048] RIP: 0033:0x7f7b9e7f24a7
[31979.658648] Code: 44 00 00 41 54 49 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 fb fc ff ff 4c 89 e2 48 89 ee 89 df 41 89 c0 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 34 fd ff ff 48
[31979.677514] RSP: 002b:00007f7b8e1fa340 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[31979.685149] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f7b9e7f24a7
[31979.692295] RDX: 0000000000000008 RSI: 00005611feb265f8 RDI: 000000000000000d
[31979.699446] RBP: 00005611feb265f8 R08: 0000000000000000 R09: 0000000000000014
[31979.706642] R10: 000000000000001d R11: 0000000000000293 R12: 0000000000000008
[31979.713781] R13: 0000000000000000 R14: 0000000000000001 R15: 00005611fe5c22d0
[31979.720933] Modules linked in: cfg80211 veth vfio_pci vfio_virqfd vfio_iommu_type1 vfio ebtable_filter ebtables ip_set ip6table_filter ip6_tables iptable_filter bpfilter 8021q garp mrp dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c sp5100_tco amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel leds_apu aesni_intel aes_x86_64 crypto_simd cryptd glue_helper snd_pcm snd_timer snd soundcore pcspkr fam15h_power k10temp ccp leds_gpio mac_hid nfnetlink_log nfnetlink vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 sdhci_pci cqhci ahci sdhci i2c_piix4 libahci igb i2c_algo_bit dca gpio_keys
[31979.786576] CR2: ffffb2e4c09a7cd8
[31979.789903] ---[ end trace 7cd79a73330e8d37 ]---
[31979.794565] RIP: 0010:eventfd_poll+0xa/0x70
[31979.798803] Code: 83 c0 00 00 00 48 8d 98 40 ff ff ff 48 3d 60 cd 50 a8 75 9a 5b 41 5c 41 5d 41 5e 41 5f 5d c3 90 0f 1f 44 00 00 55 48 89 e5 53 <48> 8b 9f c8 00 00 00 48 85 f6 74 1c 48 8b 06 48 8d 4b 08 48 85 c0
[31979.817616] RSP: 0018:ffffb2e4c09a7a88 EFLAGS: 00010282
[31979.822895] RAX: ffffffffa7104310 RBX: ffffa069ec8e3400 RCX: ffffa06a3abd8400
[31979.830062] RDX: ffffa06a3abd8438 RSI: ffffb2e4c09a7c00 RDI: 0000000000000019
[31979.837191] RBP: ffffb2e4c09a7a90 R08: ffffa06a3abd8401 R09: 0000000000000045
[31979.844325] R10: ffffa06a3abd8400 R11: 000000000000000c R12: 0000000000000019
[31979.851587] R13: 0000000000000000 R14: 0000000000000000 R15: ffffa069ec8e3424
[31979.858732] FS:  00007f7b8e1ff700(0000) GS:ffffa06aaaa00000(0000) knlGS:0000000000000000
[31979.866841] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[31979.872662] CR2: ffffb2e4c09a7cd8 CR3: 00000000c3354000 CR4: 00000000000406f0

Any suggestions on what might be causing this?
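For anyone trying to read the first trace: a minimal sketch, assuming only the register values and the faulting instruction bytes printed above. The bytes `<48> 8b 9f c8 00 00 00` decode to `mov 0xc8(%rdi),%rbx`, so the faulting address should be RDI plus 0xc8. The helper name `registers` is made up for this illustration.

```python
import re

# Two lines copied from the first oops above.
TRACE = """\
RDX: ffffa06a3abd8438 RSI: ffffb2e4c09a7c00 RDI: 0000000000000019
CR2: 00000000000000e1 CR3: 00000000c3354000 CR4: 00000000000406e0
"""

def registers(text):
    """Return {name: value} for every 'NAME: 16-hex-digit' pair in an oops dump."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}

regs = registers(TRACE)

# Displacement taken from the faulting instruction: mov 0xc8(%rdi),%rbx
DISPLACEMENT = 0xC8

fault_addr = regs["RDI"] + DISPLACEMENT
print(hex(fault_addr), fault_addr == regs["CR2"])  # 0xe1 True
```

RDI holds the first function argument on x86-64, so eventfd_poll apparently received 0x19 where a valid kernel pointer was expected, and the read at offset 0xc8 from it produced the "address e1" in the first line of the trace. That shows what went wrong at the moment of the crash, not where the bad value came from.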
 
pveversion -v:
Code:
proxmox-ve: 6.0-2 (running kernel: 5.0.21-2-pve)
pve-manager: 6.0-7 (running version: 6.0-7/28984024)
pve-kernel-5.0: 6.0-8
pve-kernel-helper: 6.0-8
pve-kernel-5.0.21-2-pve: 5.0.21-3
pve-kernel-5.0.21-1-pve: 5.0.21-2
pve-kernel-5.0.18-1-pve: 5.0.18-3
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.2-pve2
criu: 3.11-3
glusterfs-client: 5.5-3
ksmtuned: 4.20150325+b1
libjs-extjs: 6.0.1-10
libknet1: 1.11-pve1
libpve-access-control: 6.0-2
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-4
libpve-guest-common-perl: 3.0-1
libpve-http-server-perl: 3.0-2
libpve-storage-perl: 6.0-8
libqb0: 1.0.5-1
lvm2: 2.03.02-pve3
lxc-pve: 3.1.0-65
lxcfs: 3.0.3-pve60
novnc-pve: 1.0.0-60
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.0-7
pve-cluster: 6.0-7
pve-container: 3.0-7
pve-docs: 6.0-4
pve-edk2-firmware: 2.20190614-1
pve-firewall: 4.0-7
pve-firmware: 3.0-2
pve-ha-manager: 3.0-2
pve-i18n: 2.0-3
pve-qemu-kvm: 4.0.0-5
pve-xtermjs: 3.13.2-1
qemu-server: 6.0-7
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
 
I got another one today. This time it said "kernel tried to execute NX-protected page - exploit attempt? (uid: 0)" before the crash. Any idea what that means? I found this thread, but the link there doesn't work for me.

The complete syslog of the event:
Code:
Sep 23 13:45:00 burrito systemd[1]: Starting Proxmox VE replication runner...
Sep 23 13:45:03 burrito systemd[1]: pvesr.service: Succeeded.
Sep 23 13:45:03 burrito systemd[1]: Started Proxmox VE replication runner.
Sep 23 13:45:33 burrito kernel: kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
Sep 23 13:45:33 burrito kernel: BUG: unable to handle kernel paging request at ffff9e9f3a3c0cc0
Sep 23 13:45:33 burrito kernel: #PF error: [PROT] [INSTR]
Sep 23 13:45:33 burrito kernel: PGD 10fc01067 P4D 10fc01067 PUD c0740063 PMD bf528063 PTE 80000000ba3c0163
Sep 23 13:45:33 burrito kernel: Oops: 0011 [#1] SMP NOPTI
Sep 23 13:45:33 burrito kernel: CPU: 1 PID: 41 Comm: ksmd Not tainted 5.0.21-2-pve #1
Sep 23 13:45:33 burrito kernel: Hardware name: PC Engines apu2/apu2, BIOS v4.10.0.1 09/10/2019
Sep 23 13:45:33 burrito kernel: RIP: 0010:0xffff9e9f3a3c0cc0
Sep 23 13:45:33 burrito kernel: Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Sep 23 13:45:33 burrito kernel: RSP: 0018:ffffbb150082fe68 EFLAGS: 00010202
Sep 23 13:45:33 burrito kernel: RAX: 000000000000000a RBX: ffff9e9eee34b300 RCX: 0000000000000023
Sep 23 13:45:33 burrito kernel: RDX: 0000000000001000 RSI: ffff9e9ef149b000 RDI: ffff9e9ef153b000
Sep 23 13:45:33 burrito kernel: RBP: 0000000000000000 R08: 000000000000000a R09: 00000000000000c9
Sep 23 13:45:33 burrito kernel: R10: 0000000000000080 R11: 0000000000000080 R12: ffff9e9f3a3c0d30
Sep 23 13:45:33 burrito kernel: R13: fffff78b01c54ec0 R14: ffff9e9f3073a7a8 R15: fffff78b01c526c0
Sep 23 13:45:33 burrito kernel: FS:  0000000000000000(0000) GS:ffff9e9faaa80000(0000) knlGS:0000000000000000
Sep 23 13:45:33 burrito kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 23 13:45:33 burrito kernel: CR2: ffff9e9f3a3c0cc0 CR3: 00000000ba3ac000 CR4: 00000000000406e0
Sep 23 13:45:33 burrito kernel: Call Trace:
Sep 23 13:45:33 burrito kernel:  ? wait_woken+0x80/0x80
Sep 23 13:45:33 burrito kernel:  ? kthread+0x120/0x140
Sep 23 13:45:33 burrito kernel:  ? try_to_merge_with_ksm_page+0x90/0x90
Sep 23 13:45:33 burrito kernel:  ? __kthread_parkme+0x70/0x70
Sep 23 13:45:33 burrito kernel:  ? ret_from_fork+0x22/0x40
Sep 23 13:45:33 burrito kernel: Modules linked in: cfg80211 veth vfio_pci vfio_virqfd vfio_iommu_type1 vfio ebta
Sep 23 13:45:33 burrito kernel: CR2: ffff9e9f3a3c0cc0
Sep 23 13:45:33 burrito kernel: ---[ end trace f102b1cc43ac241f ]---
Sep 23 13:45:33 burrito kernel: RIP: 0010:0xffff9e9f3a3c0cc0
Sep 23 13:45:33 burrito kernel: Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Sep 23 13:45:33 burrito kernel: RSP: 0018:ffffbb150082fe68 EFLAGS: 00010202
Sep 23 13:45:33 burrito kernel: RAX: 000000000000000a RBX: ffff9e9eee34b300 RCX: 0000000000000023
Sep 23 13:45:33 burrito kernel: RDX: 0000000000001000 RSI: ffff9e9ef149b000 RDI: ffff9e9ef153b000
Sep 23 13:45:33 burrito kernel: RBP: 0000000000000000 R08: 000000000000000a R09: 00000000000000c9
Sep 23 13:45:33 burrito kernel: R10: 0000000000000080 R11: 0000000000000080 R12: ffff9e9f3a3c0d30
Sep 23 13:45:33 burrito kernel: R13: fffff78b01c54ec0 R14: ffff9e9f3073a7a8 R15: fffff78b01c526c0
Sep 23 13:45:33 burrito kernel: FS:  0000000000000000(0000) GS:ffff9e9faaa80000(0000) knlGS:0000000000000000
Sep 23 13:45:33 burrito kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 23 13:45:33 burrito kernel: CR2: ffff9e9f3a3c0cc0 CR3: 00000000ba3ac000 CR4: 00000000000406e0
Sep 23 13:46:00 burrito systemd[1]: Starting Proxmox VE replication runner...
Sep 23 13:46:03 burrito systemd[1]: pvesr.service: Succeeded.
Sep 23 13:46:03 burrito systemd[1]: Started Proxmox VE replication runner.
Sep 23 13:47:00 burrito systemd[1]: Starting Proxmox VE replication runner...
Sep 23 13:47:03 burrito systemd[1]: pvesr.service: Succeeded.

I don't think that it's related to the replication runner - I only enabled HA services to make use of the hardware watchdog after the crashes started to happen.
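On the question of what the NX message means: a minimal sketch, assuming the standard x86 page-fault error code layout, that decodes the `Oops: 0011` value and checks the no-execute bit of the PTE printed in the log above. The names `PF_BITS` and `decode_pf` are made up for this illustration.

```python
# x86 page-fault error code bits, lowest bit first.
PF_BITS = [
    (0, "PROT"),   # fault on a present page (protection violation)
    (1, "WRITE"),  # access was a write
    (2, "USER"),   # access came from user mode
    (3, "RSVD"),   # reserved bit set in a page-table entry
    (4, "INSTR"),  # access was an instruction fetch
    (5, "PK"),     # protection-key violation
]

def decode_pf(error_code):
    """Return the names of the flags set in a page-fault error code."""
    return [name for bit, name in PF_BITS if (error_code >> bit) & 1]

NX_BIT = 1 << 63  # bit 63 of a PTE marks the page as no-execute

print(decode_pf(0x0011))   # ['PROT', 'INSTR']

pte = 0x80000000BA3C0163   # "PTE 80000000ba3c0163" from the log
print(bool(pte & NX_BIT))  # True
```

So the CPU tried to fetch an instruction from a page that is mapped but marked no-execute. RIP and CR2 are both ffff9e9f3a3c0cc0, which means the kernel jumped to that address, and the all-zero "Code:" line suggests it points at data rather than code. The "exploit attempt?" wording is the kernel's generic message for this kind of fault; by itself it does not say what caused the bad jump.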
 
On a hunch: maybe try running memtest86 for a few passes. That could rule out a hardware problem.

I hope this helps!
 
I ran memtest86+ twice without errors.
I searched a bit and found the following lkml thread: https://lkml.org/lkml/2019/6/5/274
Since I can't interpret kernel traces, I don't know if this is related.

I did a quick test, however. I set up two VMs:
  • vanilla Debian Buster (Kernel: #1 SMP Debian 4.19.67-2 (2019-08-28) )
  • proxmox (Kernel: #1 SMP PVE 5.0.21-3 (Thu, 05 Sep 2019 13:56:01 +0200) )
I ran the ltp ftrace-stress-test mentioned there about 10 times in both VMs. The Debian VM never produced traces, but in the Proxmox VM I got traces like this one during a few runs:
Code:
[   44.073887] Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked and stat_runtime require the kernel parameter schedstats=enable or kernel.sched_schedstats=1
[   62.248518] ------------[ cut here ]------------
[   62.248533] WARNING: CPU: 2 PID: 4044 at kernel/trace/trace_hwlat.c:355 start_kthread.isra.1.cold.6+0xc/0x37
[   62.248533] Modules linked in: ebtable_filter ebtables ip_set ip6table_filter ip6_tables iptable_filter bpfilter softdog nfnetlink_log nfnetlink zfs(PO) zunicode(PO) zlua(PO) crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper input_leds snd_pcm snd_timer snd soundcore pcspkr serio_raw joydev virtio_gpu ttm drm_kms_helper drm fb_sys_fops syscopyarea sysfillrect sysimgblt qemu_fw_cfg mac_hid zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc virtio_rng ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c hid_generic usbhid hid i2c_i801 psmouse ahci libahci lpc_ich virtio_blk virtio_net net_failover failover
[   62.248561] CPU: 2 PID: 4044 Comm: sh Tainted: P           O      5.0.21-2-pve #1
[   62.248562] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-20181126_142135-anatol 04/01/2014
[   62.248563] RIP: 0010:start_kthread.isra.1.cold.6+0xc/0x37
[   62.248564] Code: 48 89 05 15 51 ac 01 5d c3 e8 6e 2b fe ff 48 2b 05 07 51 ac 01 5d 48 89 05 f7 50 ac 01 c3 48 c7 c7 90 bf ae 94 e8 55 46 f6 ff <0f> 0b 31 c0 e9 e1 fa ff ff 48 c7 c7 e0 b1 84 94 e8 bb d8 8a 00 48
[   62.248565] RSP: 0018:ffffb44280c3be30 EFLAGS: 00010246
[   62.248566] RAX: 0000000000000024 RBX: 0000000000000002 RCX: 0000000000000000
[   62.248567] RDX: 0000000000000000 RSI: ffff8daffbb16448 RDI: ffff8daffbb16448
[   62.248567] RBP: ffffb44280c3be40 R08: 0000000000000001 R09: 00000000000002c0
[   62.248568] R10: 0000000000000004 R11: 0000000000000000 R12: ffffffff94ed2980
[   62.248568] R13: ffffb44280c3bee8 R14: ffff8daffb405301 R15: 0000000000000000
[   62.248570] FS:  00007f0264e6d580(0000) GS:ffff8daffbb00000(0000) knlGS:0000000000000000
[   62.248571] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   62.248572] CR2: 000055dc62dc9328 CR3: 0000000177b88000 CR4: 00000000003406e0
[   62.248574] Call Trace:
[   62.248577]  hwlat_tracer_start+0x9/0x20
[   62.248579]  rb_simple_write+0xcc/0x150
[   62.248581]  __vfs_write+0x1b/0x40
[   62.248583]  vfs_write+0xab/0x1b0
[   62.248584]  ksys_write+0x5c/0xd0
[   62.248585]  __x64_sys_write+0x1a/0x20
[   62.248587]  do_syscall_64+0x5a/0x110
[   62.248590]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   62.248591] RIP: 0033:0x7f0264d95504
[   62.248592] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 48 8d 05 f9 61 0d 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 49 89 d4 55 48 89 f5 53
[   62.248593] RSP: 002b:00007ffdc6db6858 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   62.248594] RAX: ffffffffffffffda RBX: 000055dc62dc7320 RCX: 00007f0264d95504
[   62.248594] RDX: 0000000000000002 RSI: 000055dc62dc7320 RDI: 0000000000000001
[   62.248594] RBP: 0000000000000002 R08: 000055dc62dc6bad R09: 000055dc62dc7181
[   62.248595] R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000001
[   62.248595] R13: 0000000000000002 R14: 7fffffffffffffff R15: 0000000000000002
[   62.248597] ---[ end trace d59f3b6a08a17744 ]---
[   62.248633] ------------[ cut here ]------------
[   62.248640] WARNING: CPU: 2 PID: 4044 at kernel/trace/trace_hwlat.c:355 start_kthread.isra.1.cold.6+0xc/0x37
[   62.248640] Modules linked in: ebtable_filter ebtables ip_set ip6table_filter ip6_tables iptable_filter bpfilter softdog nfnetlink_log nfnetlink zfs(PO) zunicode(PO) zlua(PO) crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper input_leds snd_pcm snd_timer snd soundcore pcspkr serio_raw joydev virtio_gpu ttm drm_kms_helper drm fb_sys_fops syscopyarea sysfillrect sysimgblt qemu_fw_cfg mac_hid zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc virtio_rng ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c hid_generic usbhid hid i2c_i801 psmouse ahci libahci lpc_ich virtio_blk virtio_net net_failover failover
[   62.248656] CPU: 2 PID: 4044 Comm: sh Tainted: P        W  O      5.0.21-2-pve #1
[   62.248657] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-20181126_142135-anatol 04/01/2014
[   62.248658] RIP: 0010:start_kthread.isra.1.cold.6+0xc/0x37
[   62.248659] Code: 48 89 05 15 51 ac 01 5d c3 e8 6e 2b fe ff 48 2b 05 07 51 ac 01 5d 48 89 05 f7 50 ac 01 c3 48 c7 c7 90 bf ae 94 e8 55 46 f6 ff <0f> 0b 31 c0 e9 e1 fa ff ff 48 c7 c7 e0 b1 84 94 e8 bb d8 8a 00 48
[   62.248659] RSP: 0018:ffffb44280c3be30 EFLAGS: 00010246
[   62.248660] RAX: 0000000000000024 RBX: 0000000000000002 RCX: 0000000000000000
[   62.248660] RDX: 0000000000000000 RSI: ffff8daffbb16448 RDI: ffff8daffbb16448
[   62.248661] RBP: ffffb44280c3be40 R08: 0000000000000001 R09: 00000000000002e2
[   62.248661] R10: 0000000000000004 R11: 0000000000000000 R12: ffffffff94ed2980
[   62.248662] R13: ffffb44280c3bee8 R14: ffff8daffb405301 R15: 0000000000000000
[   62.248664] FS:  00007f0264e6d580(0000) GS:ffff8daffbb00000(0000) knlGS:0000000000000000
[   62.248664] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   62.248664] CR2: 000055dc62dca388 CR3: 0000000177b88000 CR4: 00000000003406e0
[   62.248666] Call Trace:
[   62.248668]  hwlat_tracer_start+0x9/0x20
[   62.248669]  rb_simple_write+0xcc/0x150
[   62.248670]  __vfs_write+0x1b/0x40
[   62.248672]  vfs_write+0xab/0x1b0
[   62.248673]  ksys_write+0x5c/0xd0
[   62.248674]  __x64_sys_write+0x1a/0x20
[   62.248675]  do_syscall_64+0x5a/0x110
[   62.248677]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   62.248677] RIP: 0033:0x7f0264d95504
[   62.248678] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 48 8d 05 f9 61 0d 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 49 89 d4 55 48 89 f5 53
[   62.248680] RSP: 002b:00007ffdc6db6858 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   62.248683] RAX: ffffffffffffffda RBX: 000055dc62dc7320 RCX: 00007f0264d95504
[   62.248684] RDX: 0000000000000002 RSI: 000055dc62dc7320 RDI: 0000000000000001
[   62.248687] RBP: 0000000000000002 R08: 000055dc62dc6bad R09: 000055dc62dc7181
[   62.248688] R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000001
[   62.248690] R13: 0000000000000002 R14: 7fffffffffffffff R15: 0000000000000002
[   62.248692] ---[ end trace d59f3b6a08a17745 ]---

https://lkml.org/lkml/2019/6/9/781 mentions 3 patches. How can I find out if they are in the current Proxmox kernel?

But as I said before, since I can't interpret the traces, this might be a problem completely unrelated to the crashes I experience on my APU machine.
 
