Unexplained virtual machine crashes after updating to Proxmox 9.
When a crash occurs, the virtual machine enters the internal-error state.
The issue occurs only on Proxmox 9, not on Proxmox 8.
It does not happen every time, but it does occur in environments where GPU passthrough is enabled.
If anyone knows the cause or how to troubleshoot this error, I would be grateful for any pointers.
version
Code:
pveversion
pve-manager/9.0.6/49c767b70aeb6648 (running kernel: 6.14.8-2-pve)
error
Code:
Aug 25 14:13:29 pve1 QEMU[449875]: error: kvm run failed Bad address
Aug 25 14:13:29 pve1 QEMU[449875]: RAX=ffff968d5f2a7e08 RBX=000000000000019c RCX=ffff968d5f2a7e08 RDX=ffffed78588951f8
Aug 25 14:13:29 pve1 QEMU[449875]: RSI=0000000000000000 RDI=fffff80060b32bc2 RBP=ffff968d5f2fa048 RSP=ffff8104bd756698
Aug 25 14:13:29 pve1 QEMU[449875]: R8 =000000000000019c R9 =0000000000000005 R10=ffff968d4f3cb040 R11=ffff8405b7b3d19c
Aug 25 14:13:29 pve1 QEMU[449875]: R12=0000000000000005 R13=0000000000000083 R14=ffff968d5f2a7e08 R15=ffff8405b7b3d000
Aug 25 14:13:29 pve1 QEMU[449875]: RIP=fffff80060b0ff52 RFL=00050283 [--S---C] CPL=0 II=0 A20=1 SMM=0 HLT=0
Aug 25 14:13:29 pve1 QEMU[449875]: ES =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS [-WA]
Aug 25 14:13:29 pve1 QEMU[449875]: CS =0010 0000000000000000 00000000 00209b00 DPL=0 CS64 [-RA]
Aug 25 14:13:29 pve1 QEMU[449875]: SS =0018 0000000000000000 00000000 00409300 DPL=0 DS [-WA]
Aug 25 14:13:29 pve1 QEMU[449875]: DS =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS [-WA]
Aug 25 14:13:29 pve1 QEMU[449875]: FS =0053 0000000000000000 00013c00 0040f300 DPL=3 DS [-WA]
Aug 25 14:13:29 pve1 QEMU[449875]: GS =002b ffffe0813dfc0000 ffffffff 00c0f300 DPL=3 DS [-WA]
Aug 25 14:13:29 pve1 QEMU[449875]: LDT=0000 0000000000000000 ffffffff 00c00000
Aug 25 14:13:29 pve1 QEMU[449875]: TR =0040 ffffe0813dfd0000 00000067 00008b00 DPL=0 TSS64-busy
Aug 25 14:13:29 pve1 QEMU[449875]: GDT= ffffe0813dfd1fb0 00000057
Aug 25 14:13:29 pve1 QEMU[449875]: IDT= ffffe0813dfcf000 00000fff
Aug 25 14:13:29 pve1 QEMU[449875]: CR0=80050033 CR2=0000002f651fee88 CR3=00000000001ae000 CR4=00350ef8
Aug 25 14:13:29 pve1 QEMU[449875]: DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
Aug 25 14:13:29 pve1 QEMU[449875]: DR6=00000000ffff07f0 DR7=0000000000000400
Aug 25 14:13:29 pve1 QEMU[449875]: EFER=0000000000000d01
Aug 25 14:13:29 pve1 QEMU[449875]: Code=00 00 4e 8d 1c 02 48 2b d1 73 09 4c 3b d9 0f 87 6e 01 00 00 <0f> 10 04 11 48 83 c1 10 f6 c1 0f 74 12 48 83 e1 f0 0f 10 0c 11 0f 11 00 0f 28 c1 48 83 c1
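Since the crash is intermittent, it may help to count how often the failure shows up in the journal and which QEMU process logged it. The following is only a minimal sketch, assuming the journal lines keep the same format as the excerpt above; the function name `find_kvm_failures` is made up for illustration and is not part of any Proxmox tooling.

```python
import re

# Matches journal lines shaped like the one in the excerpt above, e.g.:
#   Aug 25 14:13:29 pve1 QEMU[449875]: error: kvm run failed Bad address
# Assumption: the syslog-style prefix (month, day, time, host) is unchanged.
KVM_FAIL = re.compile(
    r"^(?P<when>\w{3}\s+\d+\s[\d:]{8})\s+(?P<host>\S+)\s+"
    r"QEMU\[(?P<pid>\d+)\]:\s+error: kvm run failed (?P<reason>.+)$"
)


def find_kvm_failures(lines):
    """Return (timestamp, host, pid, reason) for each 'kvm run failed' line."""
    hits = []
    for line in lines:
        match = KVM_FAIL.match(line.strip())
        if match:
            hits.append(
                (match["when"], match["host"], int(match["pid"]), match["reason"])
            )
    return hits
```

The function takes any iterable of text lines, for example an open file holding a saved copy of the journal, and skips the register dump lines, so each crash is reported once.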
<vmid>.conf
Code:
agent: 1
args: -cpu host,hv_passthrough,-hypervisor,level=35,+vmx,guest>
balloon: 0
bios: ovmf
boot: order=ide0;ide1;virtio0
cores: 20
cpu: host,flags=+pdpe1gb
efidisk0: local-zfs:vm-923-disk-0,efitype=4m,pre-enrolled-keys>
hookscript: local:snippets/rx9070_reset.sh
hostpci0: 0000:04:00,pcie=1,rombar=0,x-vga=1
hostpci1: 0000:83:00,pcie=1
hostpci2: 0000:01:00,pcie=1
ide0: none,media=cdrom
ide1: none,media=cdrom
machine: pc-q35-9.2+pve1
memory: 49152
meta: creation-qemu=8.1.5,ctime=1718161181
name: etc1
net0: virtio=BC:24:11:9E:2C:37,bridge=vmbr0,firewall=1,mtu=1,q>
net1: virtio=BC:24:11:CF:87:C7,bridge=vmbr1,firewall=1,mtu=1,q>
numa: 0
onboot: 1
ostype: win11
rng0: source=/dev/urandom
scsihw: virtio-scsi-single
smbios1: ---
sockets: 1
tablet: 1
tags: default
tpmstate0: local-zfs:vm-923-disk-1,size=4M,version=v2.0
vga: none
virtio0: local-zfs:vm-923-disk-2,iothread=1,size=80G
vmgenid: ---
lspci
Code:
root@pve1:~# lspci -nns 04:00
04:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 48 [RX 9070/9070 XT] [1002:7550] (rev c0)
04:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 48 HDMI/DP Audio Controller [1002:ab40]
root@pve1:~# lspci -nns 01:00
01:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD_BLACK SN7100 NVMe SSD (DRAM-less) [15b7:5045] (rev 01)
root@pve1:~# lspci -nns 83:00
83:00.0 USB controller [0c03]: Renesas Electronics Corp. uPD720201 USB 3.0 Host Controller [1912:0014] (rev 03)