Nested PVE (on PVE host) Kernel panic Host injected async #PF in kernel mode

Darkangeel_hd

New Member
May 6, 2025
Hello,
I've been trying to run a nested Proxmox instance on my already running Proxmox installation.

The installation went as normal and everything seemed right,
but the nested Proxmox VM always eventually kernel panics.

These are some of the panics it gave me over the course of trying to figure out what's going on:
Code:
[10037.609096] Kernel panic - not syncing: Host injected async #PF in kernel mode
[10037.611446] CPU: 1 PID: 27733 Comm: pvescheduler Tainted: P           O       6.8.12-10-pve #1
[10037.614206] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 4.2025.02-3 04/03/2025
[10037.616497] Call Trace:
[10037.617204]  <TASK>
[10037.617913]  dump_stack_lvl+0x27/0xa0
[10037.620661]  dump_stack+0x10/0x20
[10037.621688]  panic+0x34f/0x380
[10037.622645]  ? early_xen_iret_patch+0xc/0xc
[10037.624707]  __kvm_handle_async_pf+0xb7/0xe0
[10037.626035]  exc_page_fault+0xb6/0x1b0
[10037.626948]  asm_exc_page_fault+0x27/0x30
[10037.627946] RIP: 0010:__put_user_4+0xd/0x20
[10037.628984] Code: 66 89 01 31 c9 0f 01 ca c3 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 89 cb 48 c1 fb 3f 48 09 d9 0f 01 cb <89> 01 31 c9 0f 01 ca c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90
[10037.633449] RSP: 0018:ffffaf12011bbf10 EFLAGS: 00050206
[10037.634679] RAX: 0000000000006c55 RBX: 0000000000000000 RCX: 00007c978a26de50
[10037.636410] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[10037.638170] RBP: ffffaf12011bbf20 R08: 0000000000000000 R09: 0000000000000000
[10037.639934] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[10037.641941] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[10037.643584]  ? schedule_tail+0x43/0x70
[10037.644462]  ret_from_fork+0x1c/0x70
[10037.645291]  ret_from_fork_asm+0x1b/0x30
[10037.646142] RIP: 0033:0x7c978a37f353
[10037.646985] Code: Unable to access opcode bytes at 0x7c978a37f329.
[10037.648293] RSP: 002b:00007ffd5fb09238 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
[10037.649874] RAX: 0000000000000000 RBX: 0000000000000004 RCX: 00007c978a37f353
[10037.651937] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[10037.653448] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[10037.654968] R10: 00007c978a26de50 R11: 0000000000000246 R12: 0000000000000001
[10037.657719] R13: 00007ffd5fb09350 R14: 00007ffd5fb093d0 R15: 00007c978a5a8020
[10037.659204]  </TASK>
[10037.660041] Kernel Offset: 0x7600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[10039.664329] ---[ end Kernel panic - not syncing: Host injected async #PF in kernel mode ]---

Code:
[42478.311545] Kernel panic - not syncing: Host injected async #PF in kernel mode
[42478.313054] CPU: 3 PID: 113983 Comm: pve-firewall Tainted: P           O       6.8.12-10-pve #1
[42478.314640] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 4.2025.02-3 04/03/2025
[42478.316279] Call Trace:
[42478.316810]  <TASK>
[42478.317251]  dump_stack_lvl+0x27/0xa0
[42478.318315]  dump_stack+0x10/0x20
[42478.318954]  panic+0x34f/0x380
[42478.319542]  ? early_xen_iret_patch+0xc/0xc
[42478.320505]  __kvm_handle_async_pf+0xb7/0xe0
[42478.321638]  exc_page_fault+0xb6/0x1b0
[42478.322377]  asm_exc_page_fault+0x27/0x30
[42478.323162] RIP: 0010:__put_user_4+0xd/0x20
[42478.324269] Code: 66 89 01 31 c9 0f 01 ca c3 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 89 cb 48 c1 fb 3f 48 09 d9 0f 01 cb <89> 01 31 c9 0f 01 ca c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90
[42478.327849] RSP: 0018:ffffb829c0c2ff10 EFLAGS: 00050206
[42478.328870] RAX: 000000000001bd3f RBX: 0000000000000000 RCX: 00007e26e1bc8e50
[42478.330482] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[42478.331836] RBP: ffffb829c0c2ff20 R08: 0000000000000000 R09: 0000000000000000
[42478.333177] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[42478.334546] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[42478.335858]  ? schedule_tail+0x43/0x70
[42478.336623]  ret_from_fork+0x1c/0x70
[42478.337349]  ret_from_fork_asm+0x1b/0x30
[42478.338064] RIP: 0033:0x7e26e1cda353
[42478.338735] Code: Unable to access opcode bytes at 0x7e26e1cda329.
[42478.340023] RSP: 002b:00007fff8a182858 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
[42478.341714] RAX: 0000000000000000 RBX: 0000000000000004 RCX: 00007e26e1cda353
[42478.343350] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[42478.344901] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[42478.346571] R10: 00007e26e1bc8e50 R11: 0000000000000246 R12: 0000000000000001
[42478.348144] R13: 00007fff8a182970 R14: 00007fff8a1829f0 R15: 00007e26e1f03020
[42478.349772]  </TASK>
[42478.350521] Kernel Offset: 0x1c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[42480.084099] ---[ end Kernel panic - not syncing: Host injected async #PF in kernel mode ]---

Code:
[54671.401738] Kernel panic - not syncing: Host injected async #PF in kernel mode
[54671.404285] CPU: 3 PID: 1114 Comm: vhost-1069 Tainted: P           O       6.8.12-10-pve #1
[54671.407190] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 4.2025.02-3 04/03/2025
[54671.410874] Call Trace:
[54671.411902]  <TASK>
[54671.412639]  dump_stack_lvl+0x27/0xa0
[54671.414102]  dump_stack+0x10/0x20
[54671.414985]  panic+0x34f/0x380
[54671.415783]  ? early_xen_iret_patch+0xc/0xc
[54671.417813]  __kvm_handle_async_pf+0xb7/0xe0
[54671.419810]  exc_page_fault+0xb6/0x1b0
[54671.422993]  asm_exc_page_fault+0x27/0x30
[54671.424462] RIP: 0010:rep_movs_alternative+0x4a/0x70
[54671.426326] Code: cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 8b 06 48 89 07 48 83 c6 08 48 83 c7 08 83 e9 08 74 db 83 f9 08 73 e8 eb c5 <f3> a4 c3 cc cc cc cc 48 89 c8 48 c1 e9 03 83 e0 07 f3 48 a5 89 c1
[54671.432456] RSP: 0018:ffffadba40727ae8 EFLAGS: 00050206
[54671.433644] RAX: 0000000000001000 RBX: 0000000000000336 RCX: 0000000000000336
[54671.440594] RDX: 0000000000000336 RSI: ffff9f24ddd3bec0 RDI: 00007c9481e00000
[54671.444913] RBP: ffffadba40727b68 R08: 0000000000000000 R09: ffff9f27a6a80248
[54671.446977] R10: 0000000000000000 R11: ffff9f24d2ef5800 R12: ffffadba40727de0
[54671.448848] R13: ffff9f24ddd3bc4e R14: 0000000000000272 R15: 000000000000332a
[54671.451464]  ? _copy_to_iter+0x14d/0x590
[54671.452946]  ? kvm_irq_delivery_to_apic_fast+0x1ae/0x200 [kvm]
[54671.455336]  ? __check_object_size+0x9d/0x300
[54671.457411]  simple_copy_to_iter+0x38/0x60
[54671.459515]  ? __pfx_simple_copy_to_iter+0x10/0x10
[54671.460884]  __skb_datagram_iter+0x1a1/0x2d0
[54671.462031]  ? __pfx_simple_copy_to_iter+0x10/0x10
[54671.463764]  skb_copy_datagram_iter+0x37/0xb0
[54671.465422]  tun_do_read+0x437/0x800
[54671.468124]  tun_recvmsg+0x87/0x1a0
[54671.469982]  handle_rx+0x590/0xbd0 [vhost_net]
[54671.471119]  handle_rx_net+0x15/0x20 [vhost_net]
[54671.473416]  vhost_run_work_list+0x46/0x80 [vhost]
[54671.475006]  vhost_task_fn+0x58/0xf0
[54671.476159]  ? __mmdrop+0x125/0x1b0
[54671.477239]  ? _raw_spin_unlock_irq+0xe/0x50
[54671.478822]  ? __pfx_vhost_task_fn+0x10/0x10
[54671.480103]  ret_from_fork+0x44/0x70
[54671.481402]  ? __pfx_vhost_task_fn+0x10/0x10
[54671.482925]  ret_from_fork_asm+0x1b/0x30
[54671.484104] RIP: 0033:0x0
[54671.484988] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[54671.487125] RSP: 002b:0000000000000000 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[54671.490781] RAX: 0000000000000000 RBX: 000060c17fea4710 RCX: 00007c949b93dd1b
[54671.492972] RDX: 0000000000000000 RSI: 000000000000af01 RDI: 0000000000000023
[54671.496050] RBP: 00007ffece086a50 R08: 00007ffece0869f0 R09: 00007c949ba12d00
[54671.498888] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000022
[54671.504675] R13: 00007ffece086a48 R14: 000060c17fea4710 R15: 0000000000000000
[54671.508163]  </TASK>
[54671.509535] Kernel Offset: 0x33e00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[54674.124447] ---[ end Kernel panic - not syncing: Host injected async #PF in kernel mode ]---

Code:
[45625.467750] hrtimer: interrupt took 11017253 ns
[46321.231577] Kernel panic - not syncing: Host injected async #PF in kernel mode
[46321.235270] CPU: 5 PID: 123999 Comm: ksmtuned Tainted: P           O       6.8.12-10-pve #1
[46321.239835] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 4.2025.02-3 04/03/2025
[46321.242429] Call Trace:
[46321.243229]  <TASK>
[46321.243843]  dump_stack_lvl+0x27/0xa0
[46321.245380]  dump_stack+0x10/0x20
[46321.246326]  panic+0x34f/0x380
[46321.247241]  ? early_xen_iret_patch+0xc/0xc
[46321.252921]  __kvm_handle_async_pf+0xb7/0xe0
[46321.254489]  exc_page_fault+0xb6/0x1b0
[46321.255267]  asm_exc_page_fault+0x27/0x30
[46321.256612] RIP: 0010:__put_user_4+0xd/0x20
[46321.258235] Code: 66 89 01 31 c9 0f 01 ca c3 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 89 cb 48 c1 fb 3f 48 09 d9 0f 01 cb <89> 01 31 c9 0f 01 ca c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90
[46321.263930] RSP: 0018:ffffa556c38cff10 EFLAGS: 00050202
[46321.265453] RAX: 000000000001e45f RBX: 0000000000000000 RCX: 00007b4808d9ba10
[46321.267763] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[46321.269712] RBP: ffffa556c38cff20 R08: 0000000000000000 R09: 0000000000000000
[46321.271763] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[46321.273478] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[46321.275421]  ? schedule_tail+0x43/0x70
[46321.276636]  ret_from_fork+0x1c/0x70
[46321.277667]  ret_from_fork_asm+0x1b/0x30
[46321.278672] RIP: 0033:0x7b4808e72353
[46321.279541] Code: Unable to access opcode bytes at 0x7b4808e72329.
[46321.281197] RSP: 002b:00007fff01e6d3e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
[46321.285353] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007b4808e72353
[46321.287250] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[46321.288911] RBP: 0000000000000000 R08: 0000000000000000 R09: 00005b9070ba2890
[46321.290574] R10: 00007b4808d9ba10 R11: 0000000000000246 R12: 0000000000000001
[46321.292712] R13: 00007fff01e6d620 R14: 00005b9045378b08 R15: 0000000000000000
[46321.297935]  </TASK>
[46321.299348] Kernel Offset: 0x36a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[46322.810947] ---[ end Kernel panic - not syncing: Host injected async #PF in kernel mode ]---

The guest dmesg doesn't show much before the panic; the last one only had a single line before panicking:
[45625.467750] hrtimer: interrupt took 11017253 ns

Unfortunately those panics never happened while I was watching, and by the time I noticed the guest panic, the host's dmesg output had already been overwritten by other entries.
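
To avoid losing the host-side messages next time, I plan to enable a persistent journal on the host so the kernel log around the panic is preserved (a minimal sketch, assuming the default systemd-journald setup; the timestamps are just placeholders):
Bash:
# on the physical PVE host: keep the journal across reboots and log rotation
mkdir -p /var/log/journal
systemctl restart systemd-journald

# later, pull only kernel messages from around the time the guest panicked
journalctl -k --since "2025-05-06 10:00" --until "2025-05-06 11:00"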


Host machine info:
Bash:
~ #  pveversion --verbose
proxmox-ve: 8.4.0 (running kernel: 6.8.12-10-pve)
pve-manager: 8.4.1 (running version: 8.4.1/2a5fa54a8503f96d)
proxmox-kernel-helper: 8.1.1
proxmox-kernel-6.8.12-10-pve-signed: 6.8.12-10
proxmox-kernel-6.8: 6.8.12-10
proxmox-kernel-6.8.12-9-pve-signed: 6.8.12-9
proxmox-kernel-6.8.12-8-pve-signed: 6.8.12-8
proxmox-kernel-6.5.13-6-pve-signed: 6.5.13-6
proxmox-kernel-6.5: 6.5.13-6
ceph-fuse: 16.2.15+ds-0+deb12u1
corosync: 3.1.9-pve1
criu: 3.17.1-2+deb12u1
glusterfs-client: 10.3-5
ifupdown: residual config
ifupdown2: 3.2.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libknet1: 1.30-pve2
libproxmox-acme-perl: 1.6.0
libproxmox-backup-qemu0: 1.5.1
libproxmox-rs-perl: 0.3.5
libpve-access-control: 8.2.2
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.1.0
libpve-cluster-perl: 8.1.0
libpve-common-perl: 8.3.1
libpve-guest-common-perl: 5.2.2
libpve-http-server-perl: 5.2.2
libpve-network-perl: 0.11.2
libpve-rs-perl: 0.9.4
libpve-storage-perl: 8.3.6
libqb0: 1.0.5-1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.6.0-2
proxmox-backup-client: 3.4.1-1
proxmox-backup-file-restore: 3.4.1-1
proxmox-firewall: 0.7.1
proxmox-kernel-helper: 8.1.1
proxmox-mail-forward: 0.3.2
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.7
proxmox-widget-toolkit: 4.3.10
pve-cluster: 8.1.0
pve-container: 5.2.6
pve-docs: 8.4.0
pve-edk2-firmware: 4.2025.02-3
pve-esxi-import-tools: 0.7.4
pve-firewall: 5.1.1
pve-firmware: 3.15-3
pve-ha-manager: 4.0.7
pve-i18n: 3.4.2
pve-qemu-kvm: 9.2.0-5
pve-xtermjs: 5.5.0-2
qemu-server: 8.3.12
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.7-pve2

Guest machine info:
Bash:
~#  pveversion --verbose
proxmox-ve: 8.4.0 (running kernel: 6.8.12-10-pve)
pve-manager: 8.4.1 (running version: 8.4.1/2a5fa54a8503f96d)
proxmox-kernel-helper: 8.1.1
proxmox-kernel-6.8.12-10-pve-signed: 6.8.12-10
proxmox-kernel-6.8: 6.8.12-10
proxmox-kernel-6.8.12-9-pve-signed: 6.8.12-9
ceph-fuse: 19.2.1-pve3
corosync: 3.1.9-pve1
criu: 3.17.1-2+deb12u1
frr-pythontools: 10.2.2-1+pve1
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libknet1: 1.30-pve2
libproxmox-acme-perl: 1.6.0
libproxmox-backup-qemu0: 1.5.1
libproxmox-rs-perl: 0.3.5
libpve-access-control: 8.2.2
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.1.0
libpve-cluster-perl: 8.1.0
libpve-common-perl: 8.3.1
libpve-guest-common-perl: 5.2.2
libpve-http-server-perl: 5.2.2
libpve-network-perl: 0.11.2
libpve-rs-perl: 0.9.4
libpve-storage-perl: 8.3.6
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.6.0-2
proxmox-backup-client: 3.4.1-1
proxmox-backup-file-restore: 3.4.1-1
proxmox-firewall: 0.7.1
proxmox-kernel-helper: 8.1.1
proxmox-mail-forward: 0.3.2
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.7
proxmox-widget-toolkit: 4.3.10
pve-cluster: 8.1.0
pve-container: 5.2.6
pve-docs: 8.4.0
pve-edk2-firmware: 4.2025.02-3
pve-esxi-import-tools: 0.7.4
pve-firewall: 5.1.1
pve-firmware: 3.15-3
pve-ha-manager: 4.0.7
pve-i18n: 3.4.2
pve-qemu-kvm: 9.2.0-5
pve-xtermjs: 5.5.0-2
qemu-server: 8.3.12
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.7-pve2
Bash:
~ #  cat /etc/pve/qemu-server/108.conf 
agent: 1,fstrim_cloned_disks=1
balloon: 4096
bios: ovmf
boot: order=scsi0;ide2
cores: 6
cpu: host
efidisk0: local-lvm-fast:vm-108-disk-0,efitype=4m,size=4M
hotplug: disk,network,usb
ide2: local:iso/proxmox-ve_8.4-1.iso,media=cdrom,size=1535054K
machine: pc-q35-9.2+pve1
memory: 16384
meta: creation-qemu=9.2.0,ctime=1745336813
name: nested-PVE
net0: virtio=BC:24:11:66:CC:47,bridge=vmbr20
net6: virtio=BC:24:11:15:34:F3,bridge=vmbr000,link_down=1
numa: 0
onboot: 1
ostype: l26
rng0: source=/dev/urandom
scsi0: local-lvm:vm-108-disk-1,discard=on,iothread=1,size=32G
scsi10: local-lvm:vm-108-disk-2,cache=writeback,discard=on,iothread=1,size=320G
scsihw: virtio-scsi-single
serial0: socket
smbios1: uuid=676944cc-698f-48dc-9029-efe304e7b983
sockets: 1
tablet: 0
vga: serial0
vmgenid: f6ea56ee-bcfe-496c-bd45-5ef8e042fc86
 
Here's more info that didn't fit inside the first message:

Bash:
~ #  cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 94
model name      : Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
stepping        : 3
microcode       : 0xd6
cpu MHz         : 3300.033
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp vnmi md_clear flush_l1d
vmx flags       : vnmi preemption_timer invvpid ept_x_only ept_ad ept_1gb flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest ple shadow_vmcs pml
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa itlb_multihit srbds mmio_stale_data retbleed gds
bogomips        : 6799.81
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:
(trimmed; all 8 entries share the same important values)
Bash:
~ #  dmidecode -t memory
# dmidecode 3.4
Getting SMBIOS data from sysfs.
SMBIOS 3.0.0 present.

Handle 0x0037, DMI type 16, 23 bytes
Physical Memory Array
        Location: System Board Or Motherboard
        Use: System Memory
        Error Correction Type: None
        Maximum Capacity: 64 GB
        Error Information Handle: Not Provided
        Number Of Devices: 4

Handle 0x0038, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x0037
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 16 GB
        Form Factor: DIMM
        Set: None
        Locator: DIMM CHA3
        Bank Locator: BANK 0
        Type: DDR4
        Type Detail: Synchronous
        Speed: 2133 MT/s
        Manufacturer: Samsung
        Serial Number: 92174CBC
        Asset Tag: 1545
        Part Number: M378A2K43BB1-CPB   
        Rank: 2
        Configured Memory Speed: 2133 MT/s
        Minimum Voltage: Unknown
        Maximum Voltage: Unknown
        Configured Voltage: 1.2 V

Handle 0x0039, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x0037
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 16 GB
        Form Factor: DIMM
        Set: None
        Locator: DIMM CHA1
        Bank Locator: BANK 1
        Type: DDR4
        Type Detail: Synchronous
        Speed: 2133 MT/s
        Manufacturer: Samsung
        Serial Number: 92174BCE
        Asset Tag: 1545
        Part Number: M378A2K43BB1-CPB   
        Rank: 2
        Configured Memory Speed: 2133 MT/s
        Minimum Voltage: Unknown
        Maximum Voltage: Unknown
        Configured Voltage: 1.2 V

Handle 0x003A, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x0037
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 16 GB
        Form Factor: DIMM
        Set: None
        Locator: DIMM CHB4
        Bank Locator: BANK 2
        Type: DDR4
        Type Detail: Synchronous
        Speed: 2133 MT/s
        Manufacturer: Samsung
        Serial Number: 991C48A6
        Asset Tag: 1548
        Part Number: M378A2K43BB1-CPB   
        Rank: 2
        Configured Memory Speed: 2133 MT/s
        Minimum Voltage: Unknown
        Maximum Voltage: Unknown
        Configured Voltage: 1.2 V

Handle 0x003B, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x0037
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 16 GB
        Form Factor: DIMM
        Set: None
        Locator: DIMM CHB2
        Bank Locator: BANK 3
        Type: DDR4
        Type Detail: Synchronous
        Speed: 2133 MT/s
        Manufacturer: Samsung
        Serial Number: 991C4744
        Asset Tag: 1548
        Part Number: M378A2K43BB1-CPB   
        Rank: 2
        Configured Memory Speed: 2133 MT/s
        Minimum Voltage: Unknown
        Maximum Voltage: Unknown
        Configured Voltage: 1.2 V


If there's any additional information I can provide, or if I made a mistake while creating the thread, please by all means tell me.
 
Hello Darkangeel_hd! Did you make any major changes to the Proxmox VE host? I'm asking because the call stack of the kernel panic mentions early_xen_iret_patch, which makes it sound like you are using Xen virtualization.

Some further ideas:
  1. Just to double-check, please make sure to follow our guide on Nested Virtualization.
  2. Does it make any difference if you try to use the opt-in kernel 6.14 in either the host or nested Proxmox VE?
 
Hello l.leahu-vladucu, thank you so much for replying.
I have not made any modifications to the PVE host; all deb repos are just Debian's and the Proxmox no-subscription one, updated to the latest version.
The machine is a normal Proxmox QEMU VM. I have no prior experience with Xen, nor have I ever used it.

I actually followed the Nested Virtualization wiki to set it up; the only doubt I have is about this part of the wiki:
By default, it does not expose hardware-assisted virtualization extensions to its VMs. Do not expect optimal performance for virtual machines on the guest hypervisor, unless you configure the VM's CPU as "host" and have nested hardware-assisted virtualization extensions enabled on the physical PVE host.
As you can see in the VM config, the CPU model is set to host, so my guess is that this should do most of the job. Regarding the "hardware-assisted virtualization extensions", I tried enabling the cpu flag hv-evmcs, but unfortunately my processor does not support APICv; I don't know if that can be a problem.

Nevertheless, the virtual L2 guests run perfectly fine on the nested PVE, until the nested PVE panics, that is.

Here are the kvm and kvm_intel parameters right now:

Bash:
~ #  grep --color '' /sys/module/kvm_intel/parameters/*
/sys/module/kvm_intel/parameters/allow_smaller_maxphyaddr:N
/sys/module/kvm_intel/parameters/dump_invalid_vmcs:N
/sys/module/kvm_intel/parameters/emulate_invalid_guest_state:Y
/sys/module/kvm_intel/parameters/enable_apicv:N
/sys/module/kvm_intel/parameters/enable_ipiv:N
/sys/module/kvm_intel/parameters/enable_shadow_vmcs:Y
/sys/module/kvm_intel/parameters/enlightened_vmcs:N
/sys/module/kvm_intel/parameters/ept:Y
/sys/module/kvm_intel/parameters/eptad:Y
/sys/module/kvm_intel/parameters/error_on_inconsistent_vmcs_config:Y
/sys/module/kvm_intel/parameters/fasteoi:Y
/sys/module/kvm_intel/parameters/flexpriority:Y
/sys/module/kvm_intel/parameters/nested:Y
/sys/module/kvm_intel/parameters/nested_early_check:N
/sys/module/kvm_intel/parameters/ple_gap:128
/sys/module/kvm_intel/parameters/ple_window:4096
/sys/module/kvm_intel/parameters/ple_window_grow:2
/sys/module/kvm_intel/parameters/ple_window_max:4294967295
/sys/module/kvm_intel/parameters/ple_window_shrink:0
/sys/module/kvm_intel/parameters/pml:Y
/sys/module/kvm_intel/parameters/preemption_timer:Y
/sys/module/kvm_intel/parameters/sgx:N
/sys/module/kvm_intel/parameters/unrestricted_guest:Y
/sys/module/kvm_intel/parameters/vmentry_l1d_flush:cond
/sys/module/kvm_intel/parameters/vnmi:Y
/sys/module/kvm_intel/parameters/vpid:Y
Bash:
~ #  grep --color '' /sys/module/kvm/parameters/*
/sys/module/kvm/parameters/eager_page_split:Y
/sys/module/kvm/parameters/enable_pmu:Y
/sys/module/kvm/parameters/enable_vmware_backdoor:N
/sys/module/kvm/parameters/flush_on_reuse:N
/sys/module/kvm/parameters/force_emulation_prefix:0
/sys/module/kvm/parameters/halt_poll_ns:200000
/sys/module/kvm/parameters/halt_poll_ns_grow:0
/sys/module/kvm/parameters/halt_poll_ns_grow_start:10000
/sys/module/kvm/parameters/halt_poll_ns_shrink:0
/sys/module/kvm/parameters/ignore_msrs:Y
/sys/module/kvm/parameters/kvmclock_periodic_sync:Y
/sys/module/kvm/parameters/lapic_timer_advance_ns:-1
/sys/module/kvm/parameters/min_timer_period_us:200
/sys/module/kvm/parameters/mitigate_smt_rsb:N
/sys/module/kvm/parameters/mmio_caching:Y
/sys/module/kvm/parameters/nx_huge_pages:Y
/sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms:0
/sys/module/kvm/parameters/nx_huge_pages_recovery_ratio:60
/sys/module/kvm/parameters/pi_inject_timer:0
/sys/module/kvm/parameters/report_ignored_msrs:Y
/sys/module/kvm/parameters/tdp_mmu:Y
/sys/module/kvm/parameters/tsc_tolerance_ppm:250
/sys/module/kvm/parameters/vector_hashing:Y
Bash:
~ #  grep --color '' /sys/module/vhost/parameters/*
/sys/module/vhost/parameters/max_iotlb_entries:2048
/sys/module/vhost/parameters/max_mem_regions:512

Regarding trying 6.14, should I try it first on the guest or on the host?

My understanding is that the guest PVE is the one that fails to gracefully handle the #PF, though I'm unsure why that happens.

Again, thanks for reaching out to help me
 
I actually followed the Nested Virtualization wiki to set it up; the only doubt I have is about this part of the wiki:
By default, it does not expose hardware-assisted virtualization extensions to its VMs. Do not expect optimal performance for virtual machines on the guest hypervisor, unless you configure the VM's CPU as "host" and have nested hardware-assisted virtualization extensions enabled on the physical PVE host.
As you can see in the VM config, the CPU model is set to host, so my guess is that this should do most of the job. Regarding the "hardware-assisted virtualization extensions", I tried enabling the cpu flag hv-evmcs, but unfortunately my processor does not support APICv; I don't know if that can be a problem.

The part you quoted above means that in order for hardware-assisted virtualization to work, you need the following:
  1. Your CPU must support it. Your i7 6700 does.
  2. It has to be enabled in the BIOS of your motherboard.
  3. In Proxmox VE, click on the VM, go to Options and make sure that KVM hardware virtualization is set to On.
  4. Set the CPU type to host so that all CPU extensions are passed to the nested Proxmox VE.
You might want to check the BIOS settings, but otherwise, as far as I can see, everything should be configured correctly.
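
If you want to double-check points 1-4 from the command line, something along these lines should do (a minimal sketch; VM ID 108 is taken from the config you posted):
Bash:
# on the physical host: nested virtualization must be enabled in the kvm_intel module
cat /sys/module/kvm_intel/parameters/nested     # should print Y (or 1)

# on the physical host: the VM should use CPU type host and have KVM hardware virtualization on
qm config 108 | grep -E '^(cpu|kvm):'           # expect "cpu: host"; no "kvm:" line means the default (enabled)

# inside the nested Proxmox VE: VT-x must be exposed to the guest
grep -c vmx /proc/cpuinfo                       # a non-zero count means the vmx flag is present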

Regarding trying 6.14, should I try it first on the guest or on the host?
You can begin with the guest (nested) Proxmox VE, then on the host, then on both. You can always go back to the older kernel by selecting an older one during boot, or by pinning the kernel version.
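
For reference, installing and booting the opt-in kernel looks roughly like this (a sketch; the exact version string to pin may differ on your system):
Bash:
apt update
apt install proxmox-kernel-6.14
reboot                                      # the newest installed kernel is booted by default

# to go back later, either pick the old kernel in the boot menu or pin it explicitly:
proxmox-boot-tool kernel pin 6.8.12-10-pve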
 
The part you quoted above means that in order for hardware-assisted virtualization to work, you need the following:
  1. Your CPU must support it. Your i7 6700 does.
  2. It has to be enabled in the BIOS of your motherboard.
  3. In Proxmox VE, click on the VM, go to Options and make sure that KVM hardware virtualization is set to On.
  4. Set the CPU type to host so that all CPU extensions are passed to the nested Proxmox VE.
You might want to check the BIOS settings, but otherwise, as far as I can see, everything should be configured correctly.
Yeah, I'm using KVM on all VMs, and of course VT-x and VT-d are enabled in the BIOS.
Okay, so the only thing that really matters there is that the nested guest has the vmx flag, which it does. No need for other flags then?
Bash:
~#  cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 94
model name      : Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
stepping        : 3
microcode       : 0xd6
cpu MHz         : 3407.964
cache size      : 16384 KB
physical id     : 0
siblings        : 6
core id         : 0
cpu cores       : 6
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault pti ssbd ibrs ibpb stibp tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves arat vnmi umip md_clear flush_l1d arch_capabilities
vmx flags       : vnmi preemption_timer invvpid ept_x_only ept_ad ept_1gb flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest shadow_vmcs pml
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa srbds mmio_stale_data retbleed gds bhi ibpb_no_ret
bogomips        : 6815.92
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

Kernel 6.14 is now installed on the nested guest; I will see how it goes over the next few days (as the panics do not happen at specific times).

Also, I forgot to mention that qemu-guest-agent was installed on the nested PVE,
but I don't think it is relevant, as the panics happened both before and after installing it.
 
Yeah, I'm using KVM on all VMs, and of course VT-x and VT-d are enabled in the BIOS.
Okay, so the only thing that really matters there is that the nested guest has the vmx flag, which it does. No need for other flags then?
No, there's no need to set any flags manually. Using CPU type host means that the guest will simply use exactly the same CPU flags, which is also confirmed by the output of cat /proc/cpuinfo.
Kernel 6.14 is now installed on the nested guest; I will see how it goes over the next few days (as the panics do not happen at specific times).
Awesome! Please let us know if it helped.
 
Welp, it happened again.

C-like:
[ 1400.084789] tap100i0: left allmulticast mode
[ 1400.084857] vmbr20: port 2(tap100i0) entered disabled state
[ 1403.486556] tap100i0: entered promiscuous mode
[ 1403.522819] vmbr20: port 2(tap100i0) entered blocking state
[ 1403.522826] vmbr20: port 2(tap100i0) entered disabled state
[ 1403.522853] tap100i0: entered allmulticast mode
[ 1403.522968] vmbr20: port 2(tap100i0) entered blocking state
[ 1403.522973] vmbr20: port 2(tap100i0) entered forwarding state





[75941.867931] INFO: task pvescheduler:1235 blocked for more than 122 seconds.
[75941.867989]       Tainted: P           O       6.14.0-2-pve #1
[75941.867993] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[75941.867994] task:pvescheduler    state:D stack:0     pid:1235  tgid:1235  ppid:1      task_flags:0x400040 flags:0x00000002
[75941.868001] Call Trace:
[75941.868009]  <TASK>
[75941.868028]  __schedule+0x495/0x13f0
[75941.868041]  ? __handle_mm_fault+0xbaf/0x10b0
[75941.868047]  schedule+0x29/0x130
[75941.868049]  kvm_async_pf_task_wait_schedule+0x170/0x1b0
[75941.868057]  __kvm_handle_async_pf+0x5c/0xe0
[75941.868060]  exc_page_fault+0xb8/0x1e0
[75941.868063]  asm_exc_page_fault+0x27/0x30
[75941.868068] RIP: 0033:0x5daffd8ce50f
[75941.868079] RSP: 002b:00007ffe90ecc078 EFLAGS: 00010206
[75941.868082] RAX: 00005db034aaf2a0 RBX: ffffffffffffff78 RCX: 00007286a4126503
[75941.868083] RDX: 00005db034b4fea0 RSI: 0000000000000000 RDI: 0000000000000011
[75941.868085] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
[75941.868086] R10: 00007ffe90ecc730 R11: 0000000000000202 R12: 00005db03b7dd668
[75941.868088] R13: 00005db034ab4c88 R14: 00005daffdb57030 R15: 00007286a435a020
[75941.868100]  </TASK>
[76064.749250] INFO: task pvescheduler:1235 blocked for more than 245 seconds.
[76064.749257]       Tainted: P           O       6.14.0-2-pve #1
[76064.749259] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[76064.749260] task:pvescheduler    state:D stack:0     pid:1235  tgid:1235  ppid:1      task_flags:0x400040 flags:0x00000002
[76064.749265] Call Trace:
[76064.749267]  <TASK>
[76064.749271]  __schedule+0x495/0x13f0
[76064.749276]  ? __handle_mm_fault+0xbaf/0x10b0
[76064.749282]  schedule+0x29/0x130
[76064.749284]  kvm_async_pf_task_wait_schedule+0x170/0x1b0
[76064.749291]  __kvm_handle_async_pf+0x5c/0xe0
[76064.749294]  exc_page_fault+0xb8/0x1e0
[76064.749327]  asm_exc_page_fault+0x27/0x30
[76064.749331] RIP: 0033:0x5daffd8ce50f
[76064.749333] RSP: 002b:00007ffe90ecc078 EFLAGS: 00010206
[76064.749345] RAX: 00005db034aaf2a0 RBX: ffffffffffffff78 RCX: 00007286a4126503
[76064.749347] RDX: 00005db034b4fea0 RSI: 0000000000000000 RDI: 0000000000000011
[76064.749348] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
[76064.749350] R10: 00007ffe90ecc730 R11: 0000000000000202 R12: 00005db03b7dd668
[76064.749352] R13: 00005db034ab4c88 R14: 00005daffdb57030 R15: 00007286a435a020
[76064.749358]  </TASK>






[76187.630616] INFO: task pvescheduler:1235 blocked for more than 368 seconds.
[76187.630659]       Tainted: P           O       6.14.0-2-pve #1
[76187.630662] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[76187.630664] task:pvescheduler    state:D stack:0     pid:1235  tgid:1235  ppid:1      task_flags:0x400040 flags:0x00000002
[76187.630672] Call Trace:
[76187.630675]  <TASK>
[76187.630696]  __schedule+0x495/0x13f0
[76187.630713]  ? __handle_mm_fault+0xbaf/0x10b0
[76187.630720]  schedule+0x29/0x130
[76187.630724]  kvm_async_pf_task_wait_schedule+0x170/0x1b0
[76187.630731]  __kvm_handle_async_pf+0x5c/0xe0
[76187.630735]  exc_page_fault+0xb8/0x1e0
[76187.630739]  asm_exc_page_fault+0x27/0x30
[76187.630744] RIP: 0033:0x5daffd8ce50f
[76187.630752] RSP: 002b:00007ffe90ecc078 EFLAGS: 00010206
[76187.630756] RAX: 00005db034aaf2a0 RBX: ffffffffffffff78 RCX: 00007286a4126503
[76187.630759] RDX: 00005db034b4fea0 RSI: 0000000000000000 RDI: 0000000000000011
[76187.630761] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
[76187.630764] R10: 00007ffe90ecc730 R11: 0000000000000202 R12: 00005db03b7dd668
[76187.630766] R13: 00005db034ab4c88 R14: 00005daffdb57030 R15: 00007286a435a020
[76187.630772]  </TASK>



[76278.397452] Kernel panic - not syncing: Host injected async #PF in kernel mode
[76278.399614] CPU: 2 UID: 0 PID: 203037 Comm: sh Tainted: P           O       6.14.0-2-pve #1
[76278.401524] Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE
[76278.402809] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 4.2025.02-3 04/03/2025
[76278.404734] Call Trace:
[76278.405484]  <TASK>
[76278.406288]  dump_stack_lvl+0x27/0xa0
[76278.407385]  dump_stack+0x10/0x20
[76278.408398]  panic+0x358/0x3b0
[76278.409242]  ? early_xen_iret_patch+0xc/0xc
[76278.410334]  __kvm_handle_async_pf+0xb7/0xe0
[76278.411500]  exc_page_fault+0xb8/0x1e0
[76278.412506]  asm_exc_page_fault+0x27/0x30
[76278.413543] RIP: 0010:__put_user_4+0xd/0x20
[76278.414567] Code: 66 89 01 31 c9 0f 01 ca c3 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 89 cb 48 c1 fb 3f 48 09 d9 0f 01 cb <89> 01 31 c9 0f 01 ca c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90
[76278.419382] RSP: 0018:ffffb0f18ab23f00 EFLAGS: 00050202
[76278.421183] RAX: 000000000003191d RBX: 0000000000000000 RCX: 000071afeae3ca10
[76278.424369] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[76278.426075] RBP: ffffb0f18ab23f10 R08: 0000000000000000 R09: 0000000000000000
[76278.428253] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[76278.431138] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[76278.433244]  ? schedule_tail+0x42/0x70
[76278.435124]  ret_from_fork+0x1c/0x70
[76278.436842]  ret_from_fork_asm+0x1a/0x30
[76278.437920] RIP: 0033:0x71afeaf13353
[76278.438814] Code: Unable to access opcode bytes at 0x71afeaf13329.
[76278.440882] RSP: 002b:00007fff6aae68e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
[76278.445358] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000071afeaf13353
[76278.447734] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[76278.450619] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
[76278.452594] R10: 000071afeae3ca10 R11: 0000000000000246 R12: 0000000000000001
[76278.454667] R13: 00005e675f7ff8e0 R14: 00005e675f7ff8e0 R15: 00007fff6aae6a30
[76278.457161]  </TASK>
[76278.458853] Kernel Offset: 0x28a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[76280.504506] ---[ end Kernel panic - not syncing: Host injected async #PF in kernel mode ]---

Under memory pressure the guest receives the async #PF and puts the affected processes on hold,
but even after the memory pressure is removed, the processes stay in D state (some in Z), and eventually the guest panics.

It did happen before with 1.8x the guest's memory free on the host; this time I forced the memory pressure to speed things up a bit, but the problem is the same.
I'll try just leaving it to its own devices and see what happens, but the guest PVE doesn't have much load, since I cannot put production work on it until I know it's stable.
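
For the record, one way to force that memory pressure on the host is something like this (a sketch; it assumes stress-ng is installed, and any other memory hog would work just as well):
Bash:
apt install stress-ng
# allocate and keep touching ~80% of the host's RAM for 10 minutes to push guest pages into swap
stress-ng --vm 4 --vm-bytes 80% --vm-keep --timeout 600s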


I should mention that every other guest running on that same Proxmox host seems unaffected by this problem (about 16 VMs and 12 CTs).

Should I try virtualizing a PVE 7 instance to see if this problem applies there too?
 
Thanks for trying it out! If you have time, feel free to experiment around. Depending on how much time you have, you can try PVE 7 (both with newer and older kernels), as well as PVE 8 with older kernels (6.2 and 6.5).
 
Can you turn off ballooning? It does funky stuff when you have memory pressure. You said it happens when you're running out of memory and you have your guest system set to 4/16G. The balloon works not by actually freeing the memory, but by filling the memory from the guest's perspective and then handing those pages back to the host as "free to use". I'm not sure how that interacts with nested virtualization, since in that case the processor physically deals with an extra layer of memory indirection.
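
Disabling the balloon device for the nested VM should be enough to test this (a minimal sketch, using VM ID 108 from the config posted earlier; setting balloon to 0 disables the balloon driver):
Bash:
# on the physical host, for the nested PVE VM
qm set 108 --balloon 0       # 0 disables the balloon device; memory stays fixed at the configured 16 GiB
# power the VM off and on again so the hardware change takes effect
qm shutdown 108 && qm start 108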
 
Can you turn off ballooning? It does funky stuff when you have memory pressure. You said it happens when you're running out of memory and you have your guest system set to 4/16G. The balloon works not by actually freeing the memory, but by filling the memory from the guest's perspective and then handing those pages back to the host as "free to use". I'm not sure how that interacts with nested virtualization, since in that case the processor physically deals with an extra layer of memory indirection.
Hi @guruevi, thanks for stopping by.
The host has 64 GB of RAM and usually only about half of that is used, even with the guest PVE running; right now I have about 16 GB free with two nested PVEs (7 and 8) running, each with an Ubuntu VM inside.
The problem happens even when not under memory pressure; I just discovered that I can "force" it to happen earlier under memory pressure.

I'm aware of how the ballooning device works, and every other Linux machine on that host uses it and seems unaffected by this problem, so I don't think it is related.

From my investigation of the problem, the async #PF happens when the guest tries to access virtual memory that is, at that moment, swapped out on the host.
Instead of just stalling the vCPU until that memory region is available again, the host sends an async #PF to the guest, which in turn is supposed to freeze the thread that was accessing the memory and give CPU time to other processes. When the host brings that page back into RAM, it sends another event to the guest so it can thaw the process, which can then access the page. This whole complicated dance exists so that the guest can take advantage of otherwise wasted CPU time.

Here's an old kvm forum slideshow explaining part of it: https://kvm-forum.qemu.org/2020/KVMForum2020_APF.pdf
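
One thing I could try to take the async-#PF path out of the picture is disabling the guest kernel's paravirtualized async page fault handling with the no-kvmapf boot parameter (a sketch, assuming the nested PVE boots via GRUB; this only sidesteps the mechanism, it doesn't explain the panic):
Bash:
# inside the nested PVE guest: add no-kvmapf to the kernel command line
sed -i 's/^GRUB_CMDLINE_LINUX_DEFAULT="/&no-kvmapf /' /etc/default/grub
update-grub
reboot
# verify after reboot
grep -o no-kvmapf /proc/cmdline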

So I guess some part of the guest's memory gets swapped out at some point, and something goes wrong there.

The weird thing is that this only happens to that PVE guest; every other VM, most of which also have the CPU set to host, seems unaffected by this.

The main problem I have when testing stuff is that I don't know how to properly replicate the error under normal conditions, as I don't know why or when it will happen.



On another note, @l.leahu-vladucu
The nested PVE 8 guest panicked tonight, while the PVE 7 one didn't, both running pretty similar workloads:
C-like:
[11016.894170] tap100i0: entered promiscuous mode
[11016.925515] vmbr20: port 2(tap100i0) entered blocking state
[11016.925521] vmbr20: port 2(tap100i0) entered disabled state
[11016.925535] tap100i0: entered allmulticast mode
[11016.925638] vmbr20: port 2(tap100i0) entered blocking state
[11016.925641] vmbr20: port 2(tap100i0) entered forwarding state

[40356.608563] Kernel panic - not syncing: Host injected async #PF in kernel mode
[40356.610084] CPU: 5 UID: 0 PID: 114905 Comm: nft Tainted: P           O       6.14.0-2-pve #1
[40356.611690] Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE
[40356.612722] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 4.2025.02-3 04/03/2025
[40356.614306] Call Trace:
[40356.614781]  <TASK>
[40356.615204]  dump_stack_lvl+0x27/0xa0
[40356.615990]  dump_stack+0x10/0x20
[40356.616661]  panic+0x358/0x3b0
[40356.617305]  ? early_xen_iret_patch+0xc/0xc
[40356.618099]  __kvm_handle_async_pf+0xb7/0xe0
[40356.618969]  exc_page_fault+0xb8/0x1e0
[40356.619749]  asm_exc_page_fault+0x27/0x30
[40356.620543] RIP: 0010:rep_stos_alternative+0x40/0x80
[40356.621577] Code: c9 75 f6 c3 cc cc cc cc 48 89 07 48 83 c7 08 83 e9 08 74 ef 83 f9 08 73 ef eb de 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 <48> 89 07 48 89 47 08 48 89 47 10 48 89 47 18 48 89 47 20 48 89 47
[40356.625729] RSP: 0018:ffffa11385373a18 EFLAGS: 00050202
[40356.626897] RAX: 0000000000000000 RBX: 00006052539d6010 RCX: 0000000000000ff0
[40356.628566] RDX: 00006052539d7400 RSI: 0000000000000000 RDI: 00006052539d6010
[40356.630152] RBP: ffffa11385373a58 R08: 0000000000000000 R09: 0000000000000000
[40356.631862] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8a898189f518
[40356.633590] R13: 0000000000000003 R14: 00006052539d5000 R15: 0000000000000000
[40356.635300]  ? elf_load+0x26c/0x290
[40356.636431]  load_elf_binary+0x667/0x17b0
[40356.637464]  ? load_misc_binary+0x224/0x380 [binfmt_misc]
[40356.638769]  bprm_execve+0x2b6/0x540
[40356.639672]  do_execveat_common.isra.0+0x194/0x1f0
[40356.640959]  __x64_sys_execve+0x37/0x60
[40356.641878]  x64_sys_call+0x1e03/0x2540
[40356.642895]  do_syscall_64+0x7e/0x170
[40356.643877]  ? putname+0x60/0x80
[40356.644671]  ? do_execveat_common.isra.0+0x1a7/0x1f0
[40356.645948]  ? arch_exit_to_user_mode_prepare.constprop.0+0x22/0xd0
[40356.647464]  ? syscall_exit_to_user_mode+0x38/0x1d0
[40356.648656]  ? do_syscall_64+0x8a/0x170
[40356.649620]  ? syscall_exit_to_user_mode+0x38/0x1d0
[40356.650831]  ? _raw_spin_unlock_irq+0xe/0x50
[40356.651933]  ? sigprocmask+0xa3/0xd0
[40356.652814]  ? arch_exit_to_user_mode_prepare.constprop.0+0x22/0xd0
[40356.654556]  ? syscall_exit_to_user_mode+0x38/0x1d0
[40356.655912]  ? do_syscall_64+0x8a/0x170
[40356.657085]  ? _copy_to_user+0x41/0x60
[40356.658242]  ? __x64_sys_rt_sigaction+0xb8/0x120
[40356.659674]  ? arch_exit_to_user_mode_prepare.constprop.0+0x22/0xd0
[40356.661242]  ? syscall_exit_to_user_mode+0x38/0x1d0
[40356.662360]  ? do_syscall_64+0x8a/0x170
[40356.663410]  ? do_syscall_64+0x8a/0x170
[40356.664363]  ? arch_exit_to_user_mode_prepare.constprop.0+0x22/0xd0
[40356.665937]  ? syscall_exit_to_user_mode+0x38/0x1d0
[40356.667037]  ? do_syscall_64+0x8a/0x170
[40356.667929]  ? clear_bhb_loop+0x15/0x70
[40356.668823]  ? clear_bhb_loop+0x15/0x70
[40356.669740]  ? clear_bhb_loop+0x15/0x70
[40356.670751]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[40356.671981] RIP: 0033:0x77e7285b4ad7
[40356.672826] Code: Unable to access opcode bytes at 0x77e7285b4aad.
[40356.674211] RSP: 002b:000077e7284dcd98 EFLAGS: 00000202 ORIG_RAX: 000000000000003b
[40356.676493] RAX: ffffffffffffffda RBX: 000061a0227384a0 RCX: 000077e7285b4ad7
[40356.678339] RDX: 00007ffc930c6948 RSI: 000061a0227389c0 RDI: 000077e7284dcda0
[40356.680009] RBP: 000077e7284dce60 R08: 00007ffc930c6f3e R09: 0000000000000000
[40356.681893] R10: 0000000000000008 R11: 0000000000000202 R12: 000061a0227389c0
[40356.683485] R13: 00007ffc930c6948 R14: 00007ffc930c6f35 R15: 0000000000000000
[40356.685073]  </TASK>
[40356.685854] Kernel Offset: 0x27600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[40358.023346] ---[ end Kernel panic - not syncing: Host injected async #PF in kernel mode ]---
The PVE 7 guest currently has about 10h of uptime with no dmesg entries since I spun up the L2 VM inside it.

Both PVE guests have the same VM config, and the same L2 Ubuntu VM config running inside, too.



Is it possible to use an older kernel in PVE 8, just for testing purposes? It is of course not supported,
but that way I guess we could rule out a regression in the new kernel.

I could also try running another machine with the PVE 8 kernel and see if that one panics too.


Again, thank you both so much for lending me a hand at trying to figure this one out.
 
Is it possible to use an older kernel in PVE 8, just for testing purposes? It is of course not supported.
Sure! In PVE 8, simply use apt install proxmox-kernel-6.2 and/or apt install proxmox-kernel-6.5 (and, as always, make sure to boot with the correct version).
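
For example (a sketch; the exact kernel version to pin depends on what apt pulls in, so check with proxmox-boot-tool kernel list first):
Bash:
apt update
apt install proxmox-kernel-6.5              # or proxmox-kernel-6.2
proxmox-boot-tool kernel list               # shows the exact version strings available
proxmox-boot-tool kernel pin 6.5.13-6-pve   # pin so the older kernel keeps being booted
reboot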

The PVE 7 guest currently has about 10h of uptime with no dmesg entries since I spun up the L2 VM inside it.
Good to know, thanks for confirming. Which kernel version are you using with PVE 7?
 
The thing regarding balloon memory is that in Hyper-V and VMware it is not compatible with nested virtualization. Given the error message, your guest is basically sending another signal while it should, as you identified, be frozen. My presumption is that this is a hardware event, because VFIO or VT-x/VT-d doesn't care about your VirtIO balloon software.