Kernel panic when IOMMU is enabled

FeKn

Hi,
I have a VM host with an ASRock Rack B550D4-4L mainboard with AMD Ryzen 7 PRO 5750G (with Radeon Graphics) CPU running a current PVE 8.1 with Kernel 6.5.

Code:
# pveversion
pve-manager/8.1.4/ec5affc9e41f1d79 (running kernel: 6.5.13-1-pve)

The system is very stable overall, but it only boots when the kernel parameter amd_iommu=off is set. Without this parameter (the AMD IOMMU driver is enabled by default when the firmware exposes an IOMMU), a kernel panic occurs:

Code:
Kernel panic - not syncing: timer doesn't work through Interrupt-remapped IO-APIC
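
For reference, a minimal sketch of how the amd_iommu=off workaround can be checked for and added to a command-line string. The helper names has_param and add_param are made up for illustration. It is an assumption that this host keeps its persistent parameters in /etc/kernel/cmdline (the usual place on a systemd-boot based PVE install with ZFS root), so treat that path as something to verify rather than a given.

```shell
# has_param CMDLINE PARAM: succeed if PARAM is a whole word in CMDLINE.
has_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# add_param CMDLINE PARAM: print CMDLINE with PARAM appended exactly once.
add_param() {
    if [ -z "$1" ]; then
        printf '%s\n' "$2"
    elif has_param "$1" "$2"; then
        printf '%s\n' "$1"
    else
        printf '%s %s\n' "$1" "$2"
    fi
}
```

To see whether the current boot used the workaround: has_param "$(cat /proc/cmdline)" amd_iommu=off && echo "IOMMU disabled on this boot". If the parameters do live in /etc/kernel/cmdline, the edited line normally has to be followed by proxmox-boot-tool refresh before it takes effect.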


Full log:

Code:
EFI stub: Loaded initrd from command line option
[    0.000000] Linux version 6.5.13-1-pve (build@proxmox) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC PMX 6.5.13-1 (2024-02-05T13:50Z) ()
[    0.000000] Command line: initrd=\EFI\proxmox\6.5.13-1-pve\initrd.img-6.5.13-1-pve root=ZFS=rpool/ROOT/pve-1 boot=zfs console=tty0 console=ttyS2,115200n8 loglevel=7
[    0.000000] KERNEL supported cpus:
[    0.000000]   Intel GenuineIntel
[    0.000000]   AMD AuthenticAMD
[    0.000000]   Hygon HygonGenuine
[    0.000000]   Centaur CentaurHauls
[    0.000000]   zhaoxin   Shanghai 
[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009ffff] usable
[...]
[    0.000000] BIOS-e820: [mem 0x000000103e300000-0x000000103fffffff] reserved
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] extended physical RAM map:
[    0.000000] reserve setup_data: [mem 0x0000000000000000-0x000000000009ffff] usable
[...]
[    0.000000] reserve setup_data: [mem 0x000000103e300000-0x000000103fffffff] reserved
[    0.000000] efi: EFI v2.7 by American Megatrends
[    0.000000] efi: ACPI=0x9bf20000 ACPI 2.0=0x9bf20014 SMBIOS=0x9cd55000 SMBIOS 3.0=0x9cd54000 MEMATTR=0x950d6018 ESRT=0x969fad18 INITRD=0x93e65398 RNG=0x9a049018
[    0.000000] random: crng init done
[    0.000000] efi: Remove mem348: MMIO range=[0xf0000000-0xf7ffffff] (128MB) from e820 map
[    0.000000] efi: Remove mem349: MMIO range=[0xfd200000-0xfd2fffff] (1MB) from e820 map
[    0.000000] efi: Remove mem350: MMIO range=[0xfd600000-0xfd6fffff] (1MB) from e820 map
[    0.000000] efi: Not removing mem351: MMIO range=[0xfea00000-0xfea0ffff] (64KB) from e820 map
[    0.000000] efi: Remove mem352: MMIO range=[0xfeb80000-0xfec01fff] (0MB) from e820 map
[    0.000000] efi: Not removing mem353: MMIO range=[0xfec10000-0xfec10fff] (4KB) from e820 map
[...]
[    0.000000] efi: Not removing mem359: MMIO range=[0xfedd4000-0xfedd5fff] (8KB) from e820 map
[    0.000000] efi: Remove mem360: MMIO range=[0xff000000-0xffffffff] (16MB) from e820 map
[    0.000000] secureboot: Secure boot disabled
[    0.000000] SMBIOS 3.3.0 present.
[    0.000000] DMI: To Be Filled By O.E.M. To Be Filled By O.E.M./B550D4-4L, BIOS P1.10 06/28/2021
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 3800.187 MHz processor
[    0.000237] last_pfn = 0x103e300 max_arch_pfn = 0x400000000
[    0.000243] MTRR map: 5 entries (3 fixed + 2 variable; max 20), built from 9 variable MTRRs
[    0.000244] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- WT 
[    0.000878] last_pfn = 0x9e000 max_arch_pfn = 0x400000000
[    0.003543] esrt: Reserving ESRT space from 0x00000000969fad18 to 0x00000000969fad50.
[    0.003582] Using GB pages for direct mapping
[    0.003782] secureboot: Secure boot disabled
[    0.003783] RAMDISK: [mem 0x7c696000-0x7fffffff]
[    0.003787] ACPI: Early table checksum verification disabled
[    0.003790] ACPI: RSDP 0x000000009BF20014 000024 (v02 ALASKA)
[    0.003793] ACPI: XSDT 0x000000009BF1F728 0000EC (v01 ALASKA A M I    01072009 AMI  01000013)
[    0.003798] ACPI: FACP 0x000000009A06E000 000114 (v06 ALASKA A M I    01072009 AMI  00010013)
[    0.003802] ACPI: DSDT 0x000000009A04A000 00643D (v02 ALASKA A M I    01072009 INTL 20120913)
[    0.003804] ACPI: FACS 0x000000009AF1A000 000040
[    0.003805] ACPI: IVRS 0x000000009A07D000 0000D0 (v02 AMD    AmdTable 00000001 AMD  00000001)
[    0.003807] ACPI: SPMI 0x000000009A07C000 000041 (v05 ALASKA A M I    00000000 AMI. 00000000)
[    0.003809] ACPI: SSDT 0x000000009A074000 007229 (v02 AMD    Artic    00000002 MSFT 04000000)
[    0.003810] ACPI: SSDT 0x000000009A070000 003AAF (v01 AMD    AMD AOD  00000001 INTL 20120913)
[    0.003812] ACPI: SSDT 0x000000009A06F000 000221 (v02 ALASKA CPUSSDT  01072009 AMI  01072009)
[    0.003814] ACPI: FIDT 0x000000009A067000 00009C (v01 ALASKA A M I    01072009 AMI  00010013)
[    0.003815] ACPI: MCFG 0x000000009A066000 00003C (v01 ALASKA A M I    01072009 MSFT 00010013)
[    0.003817] ACPI: AAFT 0x000000009A065000 000068 (v01 ALASKA OEMAAFT  01072009 MSFT 00000097)
[    0.003818] ACPI: HPET 0x000000009A064000 000038 (v01 ALASKA A M I    01072009 AMI  00000005)
[    0.003820] ACPI: SPCR 0x000000009A063000 000050 (v02 A M I  APTIO V  01072009 AMI. 00050011)
[    0.003821] ACPI: SSDT 0x000000009A05D000 005354 (v02 AMD    AmdTable 00000001 AMD  00000001)
[    0.003823] ACPI: CRAT 0x000000009A05C000 000EE8 (v01 AMD    AmdTable 00000001 AMD  00000001)
[    0.003825] ACPI: CDIT 0x000000009A05B000 000029 (v01 AMD    AmdTable 00000001 AMD  00000001)
[    0.003826] ACPI: SSDT 0x000000009A05A000 00015B (v01 AMD    ArticRC  00000001 INTL 20120913)
[    0.003828] ACPI: SSDT 0x000000009A059000 000D53 (v01 AMD    ArticIG2 00000001 INTL 20120913)
[    0.003829] ACPI: SSDT 0x000000009A057000 0010AC (v01 AMD    ArticTPX 00000001 INTL 20120913)
[    0.003831] ACPI: SSDT 0x000000009A053000 0038A2 (v01 AMD    ArticN   00000001 INTL 20120913)
[    0.003832] ACPI: WSMT 0x000000009A052000 000028 (v01 ALASKA A M I    01072009 AMI  00010013)
[    0.003834] ACPI: APIC 0x000000009A051000 00015E (v03 ALASKA A M I    01072009 AMI  00010013)
[    0.003835] ACPI: SSDT 0x000000009A06D000 00051B (v01 AMD    ArticPRN 00000001 INTL 20120913)
[    0.003837] ACPI: SSDT 0x000000009A06C000 00007D (v01 AMD    ArticDIS 00000001 INTL 20120913)
[    0.003839] ACPI: SSDT 0x000000009A06A000 0010AF (v01 AMD    ArticC   00000001 INTL 20120913)
[    0.003840] ACPI: SSDT 0x000000009A069000 0000BF (v01 AMD    AmdTable 00001000 INTL 20120913)
[    0.003842] ACPI: FPDT 0x000000009A068000 000044 (v01 ALASKA A M I    01072009 AMI  01000013)
[    0.003843] ACPI: Reserving FACP table memory at [mem 0x9a06e000-0x9a06e113]
[    0.003844] ACPI: Reserving DSDT table memory at [mem 0x9a04a000-0x9a05043c]
[    0.003844] ACPI: Reserving FACS table memory at [mem 0x9af1a000-0x9af1a03f]
[    0.003845] ACPI: Reserving IVRS table memory at [mem 0x9a07d000-0x9a07d0cf]
[    0.003845] ACPI: Reserving SPMI table memory at [mem 0x9a07c000-0x9a07c040]
[    0.003845] ACPI: Reserving SSDT table memory at [mem 0x9a074000-0x9a07b228]
[    0.003846] ACPI: Reserving SSDT table memory at [mem 0x9a070000-0x9a073aae]
[    0.003846] ACPI: Reserving SSDT table memory at [mem 0x9a06f000-0x9a06f220]
[    0.003846] ACPI: Reserving FIDT table memory at [mem 0x9a067000-0x9a06709b]
[    0.003847] ACPI: Reserving MCFG table memory at [mem 0x9a066000-0x9a06603b]
[    0.003847] ACPI: Reserving AAFT table memory at [mem 0x9a065000-0x9a065067]
[    0.003848] ACPI: Reserving HPET table memory at [mem 0x9a064000-0x9a064037]
[    0.003848] ACPI: Reserving SPCR table memory at [mem 0x9a063000-0x9a06304f]
[    0.003848] ACPI: Reserving SSDT table memory at [mem 0x9a05d000-0x9a062353]
[    0.003849] ACPI: Reserving CRAT table memory at [mem 0x9a05c000-0x9a05cee7]
[    0.003849] ACPI: Reserving CDIT table memory at [mem 0x9a05b000-0x9a05b028]
[    0.003849] ACPI: Reserving SSDT table memory at [mem 0x9a05a000-0x9a05a15a]
[    0.003850] ACPI: Reserving SSDT table memory at [mem 0x9a059000-0x9a059d52]
[    0.003850] ACPI: Reserving SSDT table memory at [mem 0x9a057000-0x9a0580ab]
[    0.003851] ACPI: Reserving SSDT table memory at [mem 0x9a053000-0x9a0568a1]
[    0.003851] ACPI: Reserving WSMT table memory at [mem 0x9a052000-0x9a052027]
[    0.003851] ACPI: Reserving APIC table memory at [mem 0x9a051000-0x9a05115d]
[    0.003852] ACPI: Reserving SSDT table memory at [mem 0x9a06d000-0x9a06d51a]
[    0.003852] ACPI: Reserving SSDT table memory at [mem 0x9a06c000-0x9a06c07c]
[    0.003853] ACPI: Reserving SSDT table memory at [mem 0x9a06a000-0x9a06b0ae]
[    0.003853] ACPI: Reserving SSDT table memory at [mem 0x9a069000-0x9a0690be]
[    0.003853] ACPI: Reserving FPDT table memory at [mem 0x9a068000-0x9a068043]
[    0.003913] No NUMA configuration found
[    0.003914] Faking a node at [mem 0x0000000000000000-0x000000103e2fffff]
[    0.003919] NODE_DATA(0) allocated [mem 0x103e2d5000-0x103e2fffff]
[    0.004032] Zone ranges:
[    0.004033]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
[    0.004034]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
[    0.004035]   Normal   [mem 0x0000000100000000-0x000000103e2fffff]
[    0.004035]   Device   empty
[    0.004036] Movable zone start for each node
[    0.004037] Early memory node ranges
[    0.004037]   node   0: [mem 0x0000000000001000-0x000000000009ffff]
[    0.004038]   node   0: [mem 0x0000000000100000-0x0000000009bfefff]
[    0.004039]   node   0: [mem 0x000000000a000000-0x000000000a1fffff]
[    0.004039]   node   0: [mem 0x000000000a20f000-0x000000000affffff]
[    0.004040]   node   0: [mem 0x000000000b020000-0x0000000098681fff]
[    0.004040]   node   0: [mem 0x000000009cfff000-0x000000009dffffff]
[    0.004041]   node   0: [mem 0x0000000100000000-0x000000103e2fffff]
[    0.004046] Initmem setup node 0 [mem 0x0000000000001000-0x000000103e2fffff]
[    0.004050] On node 0, zone DMA: 1 pages in unavailable ranges
[    0.004061] On node 0, zone DMA: 96 pages in unavailable ranges
[    0.004162] On node 0, zone DMA32: 1025 pages in unavailable ranges
[    0.004173] On node 0, zone DMA32: 15 pages in unavailable ranges
[    0.006529] On node 0, zone DMA32: 32 pages in unavailable ranges
[    0.006654] On node 0, zone DMA32: 18813 pages in unavailable ranges
[    0.082768] On node 0, zone Normal: 8192 pages in unavailable ranges
[    0.082811] On node 0, zone Normal: 7424 pages in unavailable ranges
[    0.083578] ACPI: PM-Timer IO Port: 0x808
[    0.083586] ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
[    0.083599] IOAPIC[0]: apic_id 17, version 33, address 0xfec00000, GSI 0-23
[    0.083604] IOAPIC[1]: apic_id 18, version 33, address 0xfec01000, GSI 24-55
[    0.083605] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.083607] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
[    0.083609] ACPI: Using ACPI (MADT) for SMP configuration information
[    0.083611] ACPI: HPET id: 0x10228201 base: 0xfed00000
[    0.083615] ACPI: SPCR: console: uart,io,0x3f8,115200
[    0.083616] smpboot: Allowing 32 CPUs, 16 hotplug CPUs
[    0.083632] PM: hibernation: Registered nosave memory: [mem 0x00000000-0x00000fff]
[...]
[    0.083648] PM: hibernation: Registered nosave memory: [mem 0xfedd6000-0xffffffff]
[    0.083649] [mem 0xc0000000-0xfe9fffff] available for PCI devices
[    0.083652] Booting paravirtualized kernel on bare hardware
[    0.083654] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[    0.083660] setup_percpu: NR_CPUS:8192 nr_cpumask_bits:32 nr_cpu_ids:32 nr_node_ids:1
[    0.084510] percpu: Embedded 63 pages/cpu s221184 r8192 d28672 u262144
[    0.084541] Kernel command line: initrd=\EFI\proxmox\6.5.13-1-pve\initrd.img-6.5.13-1-pve root=ZFS=rpool/ROOT/pve-1 boot=zfs console=tty0 console=ttyS2,115200n8 loglevel=7
[    0.084592] Unknown kernel command line parameters "boot=zfs", will be passed to user space.
[    0.089464] Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes, linear)
[    0.091831] Inode-cache hash table entries: 4194304 (order: 13, 33554432 bytes, linear)
[    0.091985] Fallback order for Node 0: 0
[    0.091991] Built 1 zonelists, mobility grouping on.  Total pages: 16350845
[    0.091992] Policy zone: Normal
[    0.091999] mem auto-init: stack:all(zero), heap alloc:on, heap free:off
[    0.092042] software IO TLB: area num 32.
[    0.186313] Memory: 65018472K/66442184K available (20480K kernel code, 3583K rwdata, 12760K rodata, 4640K init, 18228K bss, 1423452K reserved, 0K cma-reserved)
[    0.186472] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=32, Nodes=1
[    0.186498] ftrace: allocating 52837 entries in 207 pages
[    0.196632] ftrace: allocated 207 pages with 6 groups
[    0.197337] Dynamic Preempt: voluntary
[    0.197382] rcu: Preemptible hierarchical RCU implementation.
[    0.197383] rcu:     RCU restricting CPUs from NR_CPUS=8192 to nr_cpu_ids=32.
[    0.197383]  Trampoline variant of Tasks RCU enabled.
[    0.197384]  Rude variant of Tasks RCU enabled.
[    0.197384]  Tracing variant of Tasks RCU enabled.
[    0.197384] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[    0.197385] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=32
[    0.198963] NR_IRQS: 524544, nr_irqs: 1224, preallocated irqs: 16
[    0.199146] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[    0.199213] Console: colour dummy device 80x25
[    0.199215] printk: console [tty0] enabled
[    0.199484] printk: console [ttyS2] enabled
[    2.021523] ACPI: Core revision 20230331
[    2.025545] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 133484873504 ns
[    2.034684] APIC: Switch to symmetric I/O mode setup
[    2.040325] AMD-Vi: Using global IVHD EFR:0x206d73ef22254ade, EFR2:0x0
[    2.202354] AMD-Vi: Completion-Wait loop timed out
[    2.207152] Switched APIC routing to physical flat.
[    2.335756] AMD-Vi: Completion-Wait loop timed out
[    2.340724] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[    2.501914] AMD-Vi: Completion-Wait loop timed out
[    2.661835] AMD-Vi: Completion-Wait loop timed out
[    2.821857] AMD-Vi: Completion-Wait loop timed out
[    2.868749] Kernel panic - not syncing: timer doesn't work through Interrupt-remapped IO-APIC
[    2.877266] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.5.13-1-pve #1
[    2.883704] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B550D4-4L, BIOS P1.10 06/28/2021
[    2.893262] Call Trace:
[    2.895705]  <TASK>
[    2.897802]  dump_stack_lvl+0x48/0x70
[    2.901470]  dump_stack+0x10/0x20
[    2.904785]  panic+0x2e8/0x360
[    2.907837]  ? mp_irqdomain_activate+0x35/0x50
[    2.912282]  panic_if_irq_remap+0x21/0x30
[    2.916297]  setup_IO_APIC+0x8c9/0x9d0
[    2.920047]  ? _raw_spin_unlock_irqrestore+0x21/0x60
[    2.925013]  ? clear_IO_APIC_pin+0x174/0x290
[    2.929277]  apic_intr_mode_init+0x7b/0x140
[    2.933454]  x86_late_time_init+0x24/0x40
[    2.937466]  start_kernel+0x680/0xb00
[    2.941135]  x86_64_start_reservations+0x18/0x30
[    2.945744]  x86_64_start_kernel+0xbf/0x110
[    2.949921]  secondary_startup_64_no_verify+0x17e/0x18b
[    2.955149]  </TASK>
[    2.957335] ---[ end Kernel panic - not syncing: timer doesn't work through Interrupt-remapped IO-APIC ]---

The output is not visible on the screen or in the IPMI console, but it can be captured by redirecting it to the serial port (hence console=tty0 console=ttyS2,115200n8).
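
Once the serial output has been saved to a file, the interesting lines are easy to pull out. This is only a sketch: the function names are made up, and boot.log in the usage note is a hypothetical file name for wherever the serial capture ended up.

```shell
# iommu_lines FILE: print IOMMU- and panic-related lines from a captured boot log.
iommu_lines() {
    grep -E 'AMD-Vi|IOMMU|Kernel panic' "$1"
}

# count_cwait_timeouts FILE: count AMD-Vi completion-wait timeouts in the log.
count_cwait_timeouts() {
    grep -c 'AMD-Vi: Completion-Wait loop timed out' "$1"
}
```

Usage would be iommu_lines boot.log or count_cwait_timeouts boot.log. Run against the excerpt above, the count comes out as 5, all of them logged before the panic.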

Since I want to pass a Coral Edge TPU through to a VM, I need IOMMU. IOMMU is enabled in the BIOS, and the BIOS is the latest version.
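
For context on what passthrough needs once the host boots with the IOMMU active: each PCI device should then show up in a group under /sys/kernel/iommu_groups. Below is a small sketch for listing those groups. The function name is made up, and the optional argument is only there so the function can be pointed at a test directory. With amd_iommu=off it is expected to print nothing.

```shell
# list_iommu_groups [ROOT]: print "group <n>: <pci-address>" for every device.
# ROOT defaults to the standard sysfs location; overriding it is for testing only.
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        # Skip the unexpanded glob when no groups exist (IOMMU disabled).
        [ -e "$dev" ] || continue
        group=${dev%/devices/*}   # strip "/devices/<address>"
        group=${group##*/}        # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}
```

Calling list_iommu_groups without arguments reads the live sysfs tree. The PCI address of the Coral TPU can then be matched against the output of lspci to see which group it landed in.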

Does anyone have an idea how to fix this problem?

Thank you

Best regards,
FeKn
 
Did you activate every virtualization setting in your BIOS? CSM, IOMMU, VT-d and VT-x (or the AMD equivalents)?
 
