Error 43 AMD Radeon 6800XT

archemas

New Member
Nov 14, 2024
Hi all. I know these come up quite a bit, and trust me, I have looked around and tried all kinds of combinations, but I just can't seem to get this to work.

My hardware is:
Ryzen 7 7800x3d
Gigabyte B650M AORUS ELITE AX
XFX Speedster AMD Radeon 6800XT

As for the steps I took it was:

nano /etc/default/grub
Code:
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt initcall_blacklist=sysfb_init/g pcie_acs_override=downstream>
GRUB_CMDLINE_LINUX=""

I have tried removing iommu=pt because I heard it wasn't necessary, but I got the same result. I've also tried removing pcie_acs_override=downstream, and even putting it in GRUB_CMDLINE_LINUX="" instead, but I end up with the same result.

nano /etc/modules/vfio.conf
Code:
vfio
vfio_iommu_type1
vfio_pci

nano /etc/kernel/cmdline
Code:
root=ZFS=rpool/ROOT/pve-1 boot=zfs amd_iommu=on

nano /etc/modprobe.d/blacklist.conf
Code:
blacklist radeon
blacklist amdgpu

nano /etc/modprobe.d/iommu_unsafe_interrupts.conf
Code:
options vfio_iommu_type1 allow_unsafe_interrupts=1

After all that, I updated using
Code:
update-grub
update-grub2
update-initramfs -u -k all
and rebooted

From there I ran
lspci -nn
Code:
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] [1002:73bf] (rev c3)
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]

I added those to a .conf file

nano /etc/modprobe.d/vfio.conf
Code:
options vfio-pci ids=1002:73bf,1002:ab28 disable_vga=1
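As a cross-check on the IDs, here is a minimal sketch that builds the ids= list from lspci -nn text instead of copying it by hand. The function name vfio_ids_line is hypothetical, and it only prints the line; it does not write any file. Extra options such as disable_vga=1 would still be appended by hand.

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -nn` output on stdin and print a
# vfio-pci options line with every [vendor:device] ID it finds.
# Class codes such as [0300] have no colon, so they are not matched.
vfio_ids_line() {
    ids=$(grep -o '\[[0-9a-f]\{4\}:[0-9a-f]\{4\}\]' | tr -d '[]' | paste -sd, -)
    [ -n "$ids" ] && printf 'options vfio-pci ids=%s\n' "$ids"
}

# Usage on the host (slot taken from the lspci output in this post):
#   lspci -nn -s 03:00 | vfio_ids_line
```

Passing a single slot with -s keeps unrelated devices out of the list.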

and confirmed they picked up the right modules
lspci -n -s 03:00 -v
Code:
03:00.0 0300: 1002:73bf (rev c3) (prog-if 00 [VGA controller])
        Subsystem: 1eae:6705
        Flags: fast devsel, IRQ 34, IOMMU group 15
        Memory at f800000000 (64-bit, prefetchable) [size=16G]
        Memory at fc00000000 (64-bit, prefetchable) [size=256M]
        I/O ports at f000 [size=256]
        Memory at f6a00000 (32-bit, non-prefetchable) [size=1M]
        Expansion ROM at f6b00000 [disabled] [size=128K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
        Capabilities: [64] Express Legacy Endpoint, MSI 00
        Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [150] Advanced Error Reporting
        Capabilities: [200] Physical Resizable BAR
        Capabilities: [240] Power Budgeting <?>
        Capabilities: [270] Secondary PCI Express
        Capabilities: [2a0] Access Control Services
        Capabilities: [2d0] Process Address Space ID (PASID)
        Capabilities: [320] Latency Tolerance Reporting
        Capabilities: [410] Physical Layer 16.0 GT/s <?>
        Capabilities: [440] Lane Margining at the Receiver <?>
        Kernel driver in use: vfio-pci
        Kernel modules: amdgpu

03:00.1 0403: 1002:ab28
        Subsystem: 1002:ab28
        Flags: fast devsel, IRQ 107, IOMMU group 16
        Memory at f6b20000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
        Capabilities: [64] Express Legacy Endpoint, MSI 00
        Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [150] Advanced Error Reporting
        Capabilities: [2a0] Access Control Services
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
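
For a quicker read of that long output, here is a small sketch that reduces lspci text to one line per device with the driver it is bound to. The function name bound_drivers is hypothetical; it only parses the text it is given and assumes the usual "Kernel driver in use:" line format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -v` or `lspci -k` text on stdin and
# print "<slot> <driver>" per device, or "<slot> none" if no driver is bound.
bound_drivers() {
    awk '
        # A device header starts at column 0 with a hex digit.
        /^[0-9a-f]/ { if (slot != "") print slot, drv; slot = $1; drv = "none" }
        # The driver name is the last field of this indented line.
        /Kernel driver in use:/ { drv = $NF }
        END { if (slot != "") print slot, drv }
    '
}

# Usage on the host (slot taken from this post):
#   lspci -n -s 03:00 -v | bound_drivers
```

Both functions of the card should report vfio-pci, which matches the output above.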

After all that, for good measure, I ran the updates one more time:
Code:
update-grub
update-grub2
update-initramfs -u -k all
and rebooted

So now when I check using
dmesg | grep -e DMAR -e IOMMU I get
Code:
[    0.416875] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.518768] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).

Then I checked Isolation using
pvesh get /nodes/{nodename}/hardware/pci --pci-class-blacklist ""

and at some point I did get a list of everything in there, but at the time of writing this, nothing actually comes back.
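
When pvesh returns nothing, the groups can also be read straight from sysfs. The sketch below lists every device per IOMMU group; the function name list_iommu_groups and its optional directory argument are hypothetical (the argument only exists so the function can be pointed at a test directory). An empty /sys/kernel/iommu_groups would mean the IOMMU is not active at all.

```shell
#!/bin/sh
# Print "group <n>: <pci address>" for every device under an IOMMU groups
# directory. Defaults to the real sysfs path when no argument is given.
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue          # glob matched nothing
        group=${dev#"$root"/}              # strip the leading root path
        group=${group%%/*}                 # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done | sort -t' ' -k2,2n               # order by group number
}

# Usage on the host:
#   list_iommu_groups
```

The GPU and its audio function should each appear in a group that contains nothing else that stays on the host.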

I've tried all sorts of combinations. I'm not sure whether my BIOS has an option to turn on ACS; I think that might be the issue, but I'm not certain. I think the docs also say to enable SR-IOV, which my BIOS has and which I have turned on. I also turned on IOMMU in my BIOS, and there was an initial display output option that I switched from my GPU to IGD Video. I don't know if this matters, but there was also Re-Size BAR Support, which I also enabled.

Any help would be appreciated. Thank you!
 
nano /etc/default/grub
Code:
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt initcall_blacklist=sysfb_init/g pcie_acs_override=downstream>
GRUB_CMDLINE_LINUX=""
amd_iommu=on is pointless because the AMD IOMMU is on by default.
initcall_blacklist=sysfb_init/g has a /g at the end that should not be there. And I think that you don't need it, because I expect the 6800XT to reset properly (FLR).
The rest of the kernel parameters appear to be cut off, or did the line really end with > instead of "? Your Proxmox installation might not use GRUB. What is the output of cat /proc/cmdline ?
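
To see whether the configured parameters actually reached the running kernel, something like the sketch below can compare the two. The function name missing_params is hypothetical; it just prints every parameter from the first string that is absent from the second. On Proxmox installs that boot through proxmox-boot-tool with systemd-boot, parameters come from /etc/kernel/cmdline and are applied with proxmox-boot-tool refresh rather than update-grub; whether that applies here is an assumption until /proc/cmdline is checked.

```shell
#!/bin/sh
# Hypothetical helper: print each parameter from the first string that is
# absent from the second (whole-word comparison, order ignored).
missing_params() {
    wanted=$1
    running=$2
    for p in $wanted; do
        case " $running " in
            *" $p "*) ;;                 # present in the running command line
            *) printf '%s\n' "$p" ;;     # configured but not active
        esac
    done
}

# Usage on the host (paths from the original post):
#   missing_params "$(cat /etc/kernel/cmdline)" "$(cat /proc/cmdline)"
```

No output means every configured parameter is active; anything printed was set in the file but never made it to the boot entry.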

Can you get the 6800XT working (easily) with the Ubuntu 24.04 LTS installer? No need to install Ubuntu, just see if it is working in the live environment.
Maybe it's an AMD Windows driver issue and setting the machine version back to 7.2 is enough?