Proxmox 8.0.4 won't boot

Daze

New Member
Nov 4, 2023
Hello all,
Tried to get Radeon VII passthrough working in Proxmox; after a reboot the system is not booting up anymore.

Stuck at a black screen with the text:
amd_gpio AMDI0030:00: Invalid config param 0014

Any ideas?
 
Hi Daze,

Welcome to the forums!

Any ideas?
Yes, a vague idea :p

Which motherboard do you use? Not all motherboards offer much choice in "passthrough groups". I have never used it myself, so the specific terms slip my mind.

The point is: you can pass through a resource group all-or-nothing. Besides that, I recall it only works when the PCIe slot is connected to the CPU directly. So if you have 6 PCIe slots, 2 of them connected to the CPU and 4 to the south bridge (.. I think), you could only pass through the two that are connected to the CPU.

Compounding matters: depending on motherboard (and maybe CPU) features, these two PCIe slots may be part of a single resource group. If so, you can pass through both slots or none. With more feature-rich motherboards (and/or CPUs) you'd be able to choose to pass through just one of the two, or pass each to a different VM.

Practically: perhaps your Radeon VII is in a slot that shares a resource group with a PCIe resource that is needed by your host, for example the storage controller.
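
To see how that pans out on a given host, a small shell loop like this (a generic sketch, nothing board-specific; it just walks sysfs) lists each IOMMU group and the devices in it:

# list every IOMMU group and the PCI devices it contains (run as root on the host)
for g in /sys/kernel/iommu_groups/*; do
  echo "IOMMU group ${g##*/}:"
  for d in "$g"/devices/*; do
    lspci -nns "${d##*/}"   # prints the device name and [vendor:device] IDs
  done
done

If the GPU shows up in the same group as, say, the storage controller, passing it through would take that controller away from the host too.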

I only answered because you asked "Any ideas?", since my story is too vague to be of much help :p
 
Thank you wbk!

I tried a rescue boot from USB, also no luck :)
So I installed Proxmox from scratch (again), because this system isn't mission critical yet and won't be until I get it working. I'll have to do backups this time so I don't have to configure and install everything again.

My setup is:
Gigabyte Aorus Master X570
Ryzen 3900X
Radeon VII in the top PCIe slot
LSI HBA SAS/SATA card in the second PCIe slot (2x4gb disks attached)
Intel X710-DA2 Ethernet card in the bottom slot

Win 11, Monterey & Sonoma, and TrueNAS Scale Cobia VMs worked fine, and
I got my disks working in TrueNAS with HBA card passthrough after blacklisting it successfully from Proxmox.
A long S.M.A.R.T. test also worked for these HBA disks.

Encouraged by this I started to configure Radeon VII passthrough, and the final result was a black screen and no boot :)

Not 100% sure, but I think it's the bottom PCIe slot that shares a group with the storage controller.
 
Not 100% sure, but I think it's the bottom PCIe slot that shares a group with the storage controller.
The two upper PCIe slots are connected to the CPU and may be used with passthrough. The two bottom slots are connected to the chipset.

What is pvesh get /nodes/{nodename}/hardware/pci --pci-class-blacklist "" reporting?
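A small usage sketch (assuming the node name matches the host's hostname, which it does on a default single-node install):

# {nodename} is the node as shown in the Proxmox tree; on a single node this usually works
pvesh get /nodes/$(hostname)/hardware/pci --pci-class-blacklist ""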
 
Encouraged by this I started to configure Radeon VII passthrough, and the final result was a black screen and no boot :)
If you passed through the device, you must also add the option to the denylist.conf... if you don't, the kernel blocks the passthrough... see here: https://forum.proxmox.com/threads/pve-blacklist-conf-not-working.134895/#post-596612
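
As a rough sketch of what the modprobe side could look like (the file name and layout are my assumption; the Radeon VII IDs 1002:66af and 1002:ab20 are the ones that show up in Daze's vfio output further down, and the disable_denylist option itself is spelled out later in this thread):

# /etc/modprobe.d/vfio.conf -- sketch, adjust the IDs and file name to your setup
options vfio-pci ids=1002:66af,1002:ab20 disable_denylist=1
# make sure vfio-pci claims the card before the host's amdgpu driver does
softdep amdgpu pre: vfio-pci

followed by update-initramfs -u -k all and a reboot so the change lands in the initramfs.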
 
OK. Here are the screenshots.

[Attachments: 1699146421048.png, Näyttökuva 2023-11-05 030326.png]
For reference, the term I could not recall earlier is the one I now see on screen: IOMMU group.

My setup is:
Gigabyte Aorus Master X570
There is a nice website that lets you look up your board's IOMMU groups when you don't have access to your own Proxmox, or want to see someone else's ;-)
 
If you passed through the device, you must also add the option to the denylist.conf... if you don't, the kernel blocks the passthrough... see here: https://forum.proxmox.com/threads/pve-blacklist-conf-not-working.134895/#post-596612

For reference, the term I could not recall earlier is the one I now see on screen: IOMMU group.


There is a nice website that lets you look up your board's IOMMU groups when you don't have access to your own Proxmox, or want to see someone else's ;-)
Thank you, but I think something got messed up when I blacklisted the Radeon VII. When I verified everything, the results weren't identical to the instructions in the forums, so that could possibly be it.
 
Thank you, but I think something got messed up when I blacklisted the Radeon VII. When I verified everything, the results weren't identical to the instructions in the forums, so that could possibly be it.
I'm using this guide to proceed: https://forum.proxmox.com/threads/p...x-ve-8-installation-and-configuration.130218/


IOMMU activation for systemd-boot

After I reboot and verify with dmesg | grep -e IOMMU that IOMMU is enabled, I should get this line as in the guide:
[ 0.000000] Warning: PCIe ACS overrides enabled; This may allow non-IOMMU protected peer-to-peer DMA

but instead I get this:

root@homelab:~# dmesg | grep -e IOMMU
[ 2.431305] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[ 2.433909] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 2.434283] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[ 13.338650] AMD-Vi: AMD IOMMUv2 loaded and initialized

root@homelab:~# dmesg | grep 'remapping'
[ 0.539151] x2apic: IRQ remapping doesn't support X2APIC mode
[ 2.433918] AMD-Vi: Interrupt remapping enabled

I have everything enabled in the BIOS for successful passthrough, latest Aorus Master X570 BIOS F37g. I have never used Proxmox on an older BIOS version, so I can't compare whether the IOMMU mapping has changed from before.
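
One generic sanity check (not from the guide) is to confirm that the options from the guide actually ended up on the kernel command line after editing the systemd-boot config:

# what the running kernel was actually booted with
cat /proc/cmdline
# for systemd-boot on Proxmox the parameters come from this file; check that the
# guide's additions (e.g. iommu=pt, pcie_acs_override=...) are really in there
cat /etc/kernel/cmdline
# after editing /etc/kernel/cmdline, re-sync the boot entries and reboot
proxmox-boot-tool refresh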

EDIT:
I carried on, and after a reboot checked the "reset bug" with dmesg | grep vendor_reset:

root@homelab:~# dmesg | grep vendor_reset
[ 13.372913] vendor_reset: module verification failed: signature and/or required key missing - tainting kernel
[ 13.450249] vendor_reset_hook: installed

The guide doesn't say what the output should look like, and there's not a word about "vendor_reset: module verification failed: signature and/or required key missing - tainting kernel".
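
For what it's worth, the "module verification failed ... tainting kernel" message just means vendor_reset is an out-of-tree, unsigned module; the "vendor_reset_hook: installed" line is the part that matters. On recent kernels the vendor-reset project also suggests selecting its reset method for the card explicitly; a sketch, assuming the Radeon VII sits at 0000:0f:00.0 as in the vfio output elsewhere in this thread:

# load the module and make the kernel use its device-specific reset for the GPU
modprobe vendor_reset
echo device_specific > /sys/bus/pci/devices/0000:0f:00.0/reset_method
cat /sys/bus/pci/devices/0000:0f:00.0/reset_method   # should now report device_specific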

EDIT:
I carried on, and when I verify everything I get this:

root@homelab:~# dmesg | grep -E "DMAR|IOMMU"
[ 2.430801] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[ 2.433430] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 2.433814] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[ 13.477645] AMD-Vi: AMD IOMMUv2 loaded and initialized

root@homelab:~# dmesg | grep 'remapping'
[ 0.538992] x2apic: IRQ remapping doesn't support X2APIC mode
[ 2.433438] AMD-Vi: Interrupt remapping enabled

root@homelab:~# dmesg | grep -i vfio
[ 13.099882] VFIO - User Level meta-driver version: 0.3
[ 13.106757] vfio-pci 0000:0f:00.0: vgaarb: deactivate vga console
[ 13.106762] vfio-pci 0000:0f:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
[ 13.106930] vfio_pci: add [1002:66af[ffffffff:ffffffff]] class 0x000000/00000000
[ 13.203691] vfio_pci: add [1002:ab20[ffffffff:ffffffff]] class 0x000000/00000000

Verifying that the correct driver is loaded:
root@homelab:~# lspci -nnk | grep 'AMD'
It lists all the IOMMU groups, not only the GPU controller and its audio device, and there isn't any kernel driver info.
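
A more targeted check (just a generic lspci invocation, using the 0f:00 address from the vfio lines above) shows exactly which kernel driver is bound to the GPU and its audio function:

# limit the query to the GPU and its audio function; "Kernel driver in use:" should say vfio-pci
lspci -nnk -s 0f:00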

root@homelab:~# systemctl status vreset.service
Unit vreset.service could not be found.

I don't know how to continue from this point; can anyone please help me with this? Thank you.
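
On the vreset.service point: that unit is presumably something the guide has you create yourself; it does not ship with Proxmox. A minimal sketch of such a unit (the name and the reset_method approach are assumptions, modeled on the vendor-reset notes above):

# /etc/systemd/system/vreset.service -- hypothetical sketch
[Unit]
Description=Select the vendor-reset method for the Radeon VII
After=multi-user.target

[Service]
Type=oneshot
ExecStart=/usr/bin/bash -c 'echo device_specific > /sys/bus/pci/devices/0000:0f:00.0/reset_method'

[Install]
WantedBy=multi-user.target

then: systemctl daemon-reload && systemctl enable --now vreset.service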

EDIT:
Again a black screen and everything from the beginning. I'm going to post this to the tutorial thread.
 
I don't know how to continue from this point; can anyone please help me with this? Thank you.
I told you in post #5... IOMMU will not show enabled until you create a vfio-pci.conf and add the option below...
/etc/modprobe.d/vfio-pci.conf:
options vfio-pci disable_denylist=1

Once enabled, you set/attach it to the VM: qm set VMID -hostpciX 0f:00,pcie=on
where VMID equals for example 100, 101, 102, etc., and where X in -hostpciX equals the pcie slot number
then you run: update-initramfs -u -k all

Your pci device is now passed through...
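
Put together as one concrete example (VM ID 100 and index 0 are placeholders; 0f:00 is the Radeon VII address from earlier in the thread):

# 1. the vfio-pci option (this creates/overwrites /etc/modprobe.d/vfio-pci.conf)
echo "options vfio-pci disable_denylist=1" > /etc/modprobe.d/vfio-pci.conf
# 2. attach the card to the VM (VM ID 100, first passthrough entry)
qm set 100 -hostpci0 0f:00,pcie=on
# 3. rebuild the initramfs so the modprobe change takes effect, then reboot
update-initramfs -u -k all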
 
I told you in post #5... IOMMU will not show enabled until you create a vfio-pci.conf and add the option below...
/etc/modprobe.d/vfio-pci.conf:
options vfio-pci disable_denylist=1

Once enabled, you set/attach it to the VM: qm set VMID -hostpciX 0f:00,pcie=on
where VMID equals for example 100, 101, 102, etc., and -hostpciX equals the pcie slot,
then you run: update-initramfs -u -k all

Your pci device is now passed through...
I don't quite follow now; all my IOMMU groups show. My problem is that vendor_reset doesn't activate.
 
I don't quite follow now; all my IOMMU groups show.
So, running this cmd "dmesg | grep -e DMAR -e IOMMU" shows it as enabled? Do you only have one PCI graphics card?
Such as below:
root@nolliprivatecloud:~# dmesg | grep -e DMAR -e IOMMU
[ 0.008996] ACPI: DMAR 0x000000004AC0F000 0000F0 (v01 DELL\x CBX3 00000001 INTL 20091013)
[ 0.009041] ACPI: Reserving DMAR table memory at [mem 0x4ac0f000-0x4ac0f0ef]
[ 0.170241] DMAR: IOMMU enabled
[ 0.451328] DMAR: Host address width 46
[ 0.451330] DMAR: DRHD base: 0x000000b5ffc000 flags: 0x0
[ 0.451337] DMAR: dmar0: reg_base_addr b5ffc000 ver 1:0 cap d2078c106f0466 ecap f020df
[ 0.451341] DMAR: DRHD base: 0x000000d8ffc000 flags: 0x0
[ 0.451346] DMAR: dmar1: reg_base_addr d8ffc000 ver 1:0 cap d2078c106f0466 ecap f020df
[ 0.451349] DMAR: DRHD base: 0x000000fbffc000 flags: 0x0
[ 0.451354] DMAR: dmar2: reg_base_addr fbffc000 ver 1:0 cap d2078c106f0466 ecap f020df
[ 0.451357] DMAR: DRHD base: 0x00000092ffc000 flags: 0x1
[ 0.451364] DMAR: dmar3: reg_base_addr 92ffc000 ver 1:0 cap d2078c106f0466 ecap f020df
[ 0.451367] DMAR: RMRR base: 0x0000004c2e0000 end: 0x0000004c529fff
[ 0.451370] DMAR: ATSR flags: 0x0
[ 0.451373] DMAR-IR: IOAPIC id 12 under DRHD base 0xfbffc000 IOMMU 2
[ 0.451376] DMAR-IR: IOAPIC id 11 under DRHD base 0xd8ffc000 IOMMU 1
[ 0.451379] DMAR-IR: IOAPIC id 10 under DRHD base 0xb5ffc000 IOMMU 0
[ 0.451381] DMAR-IR: IOAPIC id 8 under DRHD base 0x92ffc000 IOMMU 3
[ 0.451384] DMAR-IR: IOAPIC id 9 under DRHD base 0x92ffc000 IOMMU 3
[ 0.451386] DMAR-IR: HPET id 0 under DRHD base 0x92ffc000
[ 0.451388] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 0.452321] DMAR-IR: Enabled IRQ remapping in x2apic mode
[ 0.823411] DMAR: No SATC found
[ 0.823415] DMAR: dmar2: Using Queued invalidation
[ 0.823417] DMAR: dmar1: Using Queued invalidation
[ 0.823428] DMAR: dmar3: Using Queued invalidation
[ 0.826614] DMAR: Intel(R) Virtualization Technology for Directed I/O
root@nolliprivatecloud:~#
 
Yes, Radeon VII

I haven't changed anything; this is again a fresh install.

root@homelab:~# dmesg | grep -e DMAR -e IOMMU
[ 2.432268] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[ 2.441706] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 2.442113] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[ 13.383318] AMD-Vi: AMD IOMMUv2 loaded and initialized
 
root@homelab:~# dmesg | grep -e DMAR -e IOMMU
[ 2.432268] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[ 2.441706] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 2.442113] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[ 13.383318] AMD-Vi: AMD IOMMUv2 loaded and initialized
It doesn't show as in mine: [ 0.170241] DMAR: IOMMU enabled...
That's what I want to see.
 
Yes... isn't it the same with AMD when IOMMU is enabled?
From this: https://pve.proxmox.com/wiki/PCI_Passthrough
There should be a line that looks like "DMAR: IOMMU enabled". If there is no output, something is wrong.


Newer AMD has IOMMU on by default. If it didn't work, I wouldn't have the HBA card passthrough working for TrueNAS.
 
Well, I don't know then...to me, if IOMMU is on by default, it should show in Proxmox "DMAR: IOMMU enabled".

Proxmox needs a new wiki for pci passthrough...
 
Well, I don't know then...to me, if IOMMU is on by default, it should show in Proxmox "DMAR: IOMMU enabled".

Proxmox needs a new wiki for pci passthrough...
Yes, that's how it shows on Intel.

Indeed, for both vendors :)
 
The two upper PCIe slots are connected to the CPU and may be used with passthrough. The two bottom slots are connected to the chipset.

What is pvesh get /nodes/{nodename}/hardware/pci --pci-class-blacklist "" reporting?

[Attachments: 1699146421048.png, Näyttökuva 2023-11-05 030326.png]
