AMD 7900 XT or XTX PCI passthrough

azmosult

New Member
Jan 25, 2023
5
0
1
Hi

I currently have a well-working Proxmox setup (7.3, I think -> can confirm this later) with 2 main VMs.

Setup:
Debian 11 Proxmox Server
Windows 10 pro 64 bit -> Mostly for Gaming
Ubuntu 22.04 64 bit -> For working stuff
Some other VMs

Hardware:
ASUS TUF Gaming X570-Plus mainboard, socket AM4
AMD Ryzen CPU
Nvidia 980ti
Nvidia 1030 (I think -> can confirm this later)
Some ECC RAM
Bunch of disks

I have PCI passthrough running for both cards. The 980ti goes to the Windows VM and the 1030 to the Linux VM.
This setup has been running for about a year and works great.


I have now got a new AMD 7900 XTX graphics card and wanted to swap out the 980ti.
As far as I can see, the passthrough to the Windows VM works, but I only get a black screen.
I can connect to Windows via rdesktop and see the graphics card in the Device Manager, but with error code 43.
I tried to install the AMD drivers for the card, but nothing changed.

I have now tried many things to get this setup to work, but had no success:
1.) Made sure all the kernel PCI passthrough options are set:
Code:
video=efifb:off, amd_iommu=on, iommu=pt, ...
2.) Set the VM machine type to q35-7.1
3.) Dumped the vbios from the card under Linux (echo 1 > rom, ...) and under Windows (GPU-Z); the Linux dump was about 128KB and the GPU-Z dump about 2MB
4.) Used the vbios rom file for the VM (romfile=vbios.rom)
5.) Tested different combinations of the passthrough options from the GUI
6.) Read that someone got it working in Unraid: https://forums.unraid.net/topic/132895-radeon-rx-7900-xt-passthrough/ -> It's also KVM, so it should at least be possible
...
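
Regarding step 3, the two dumps differ a lot in size, so a quick sanity check of a dump before using it as romfile may help. The following is only a sketch: check_rom is a hypothetical helper name (not a Proxmox tool), and it only reports the file size and tests for the 0x55 0xAA bytes that a PCI option ROM image starts with. Passing this check does not prove the dump is complete or usable.

```shell
#!/bin/sh
# check_rom <file>: print size and first two bytes of a vbios dump.
# Hypothetical helper for illustration; returns 0 only if the file
# starts with the PCI option ROM signature 0x55 0xAA.
check_rom() {
    rom_file=$1
    [ -r "$rom_file" ] || { echo "cannot read $rom_file" >&2; return 2; }
    # Size in bytes (tr strips padding that some wc versions add).
    size=$(wc -c < "$rom_file" | tr -d ' ')
    # First two bytes as lowercase hex without separators.
    sig=$(od -An -tx1 -N2 "$rom_file" | tr -d ' \n')
    echo "size=$size signature=$sig"
    [ "$sig" = "55aa" ]
}
```

It could be called as check_rom vbios.rom for each of the two dumps, to compare them side by side.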

None of that worked, and I'm nearly out of ideas now.
The last ideas I have:
I just saw that 'Resizable BAR'/'Smart Access Memory' can be a problem with AMD cards -> have to test that
I can update to a newer BIOS version on the host
I can try to remove the 1030 card and try only the 7900xtx (the passthrough of the 1030 to the Linux VM still works flawlessly)


Does anyone have a working passthrough on Proxmox with one of the new AMD cards (7900 XT or XTX)?
Does anyone have ideas about what I could have missed?
This setup has so many benefits for me that I really do not want to go back to bare metal.
 

azmosult

Ok, I have Proxmox 7.3-4
And the second GPU is an Nvidia GeForce GT 1030

I have now updated to the latest BIOS
In the BIOS, SVM and IOMMU are enabled
I made sure that 'Resizable BAR'/'Smart Access Memory' is disabled in the BIOS
After the BIOS update I tested whether I can pass through the 1030 -> yes, works flawlessly

I also tested removing the 1030 and using only the 7900xtx in the system
All my tests failed again.

But I saw one change when I start the VM multiple times without the rom file.
Sometimes I get the Proxmox boot logo on the screen and the animated Windows boot circle dots -> so I see something from Windows
When Windows is done booting, the upper half of the screen turns black and the lower half shows the remains of the Proxmox boot logo


Now I'm out of ideas.
I don't even know where I could start debugging (the logs on the host look ok)
Maybe it's my hardware setup, but how can I test this?
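
One common starting point for this kind of debugging is to look at how the host groups devices under the IOMMU, because devices that share a group with the GPU generally cannot stay in use by the host while the GPU is passed through. The sketch below walks the sysfs tree and prints one line per device. list_iommu_groups is a hypothetical helper name, and the default path assumes the usual /sys/kernel/iommu_groups layout. It only lists groups and does not diagnose the black screen itself.

```shell
#!/bin/sh
# list_iommu_groups [root]: print "group <n>: <pci address>" per device.
# Hypothetical helper; root defaults to the standard sysfs location.
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        # Skip the unexpanded pattern when no groups exist.
        [ -e "$dev" ] || continue
        # Strip the root prefix, then keep only the group number.
        group=${dev#"$root"/}
        group=${group%%/*}
        echo "group $group: ${dev##*/}"
    done
}
```

Running list_iommu_groups and looking for the lines with the GPU address (0c:00 in the lspci output of this thread) shows what else sits in the same group.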
 

psyyo

New Member
Aug 10, 2022
11
0
1
if the 7900 is anything like the 6900, my settings might help with what you're trying to do.

ok
disabling Resizable BAR - good first step.


in /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt amd_iommu=on nofb nomodeset initcall_blacklist=sysfb_init"

in /etc/modprobe.d/
make sure to NOT have blacklist radeon set in any .conf files

in /etc/pve/qemu-server/
your VMs .conf file - the actual GPU entry hostpci0: 0000:0c:00,pcie=1,x-vga=1
obviously address will differ, just pointing out flags i use

The point of these settings is that AMD hardware virtualizes much more cleanly than Nvidia, and all those weird tricks and workarounds actually hinder the integration.
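
After changing the kernel command line (on a GRUB-booted host this usually means editing /etc/default/grub, running update-grub and rebooting), it can be worth confirming that the running kernel really picked up the parameters. The following is a small sketch for that. missing_params is a hypothetical helper name, and it only does an exact, whitespace-separated match against the string it is given.

```shell
#!/bin/sh
# missing_params "<cmdline>" param...: print every param that does not
# appear as a whole word in the given kernel command line.
# Hypothetical helper for illustration only.
missing_params() {
    cmdline=$1
    shift
    for p in "$@"; do
        # Pad with spaces so only complete parameters match.
        case " $cmdline " in
            *" $p "*) ;;
            *) echo "$p" ;;
        esac
    done
}
```

For example, missing_params "$(cat /proc/cmdline)" iommu=pt nomodeset initcall_blacklist=sysfb_init prints nothing when all three parameters are active.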
 

leesteken

Famous Member
May 31, 2020
2,650
577
118
Please allow me to give some observations about this working passthrough setup.
in /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt amd_iommu=on nofb nomodeset initcall_blacklist=sysfb_init"
amd_iommu=on does nothing because it is on by default.
iommu=pt probably does nothing, please let me know if it makes a difference for anything for you (or anybody else, I'm really curious but heard nothing so far).
initcall_blacklist=sysfb_init is a work-around if you have only one GPU, but I would not expect that you need that since the 6000-series are supposed to reset properly and you ought to be able to let Proxmox load amdgpu as normal. Do you really need it?
nofb nomodeset will prevent amdgpu from using the GPU during boot (like the one above but also other (non-boot) GPUs) and, like above, I don't expect you to need it with a 6900. Do you really need it?
in /etc/modprobe.d/
make sure to NOT have blacklist radeon set in any .conf files
radeon is for much older AMD GPUs and does not touch newer GPUs, amdgpu is used for 6000- and 7000-series. Does this make a difference for your setup?
in /etc/pve/qemu-server/
your VMs .conf file - the actual GPU entry hostpci0: 0000:0c:00,pcie=1,x-vga=1
Primary GPU (x-vga=1) is a fix for Nvidia GPUs. For AMD GPUs, it should be enough to set Display to None.

I'm sorry that my reply does not help passthrough of the 7900 but I'm also interested whether it will work, and I would prefer to keep the configuration as clean as possible with only the things we actually need.
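
To see whether a given driver module is blacklisted anywhere on the host, the .conf files can be scanned instead of read by eye. This is a minimal sketch: blacklisted_in is a hypothetical helper name, the default directory /etc/modprobe.d is the one discussed above, and only plain "blacklist <module>" lines are matched (commented-out lines and other directives are ignored).

```shell
#!/bin/sh
# blacklisted_in <module> [dir]: print the .conf files in dir that
# contain an active "blacklist <module>" line. Hypothetical helper;
# exits non-zero when no file matches.
blacklisted_in() {
    module=$1
    dir=${2:-/etc/modprobe.d}
    grep -lE "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*(#.*)?$" \
        "$dir"/*.conf 2>/dev/null
}
```

Calling blacklisted_in amdgpu and blacklisted_in radeon covers both driver names mentioned in this thread.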
 

azmosult

Thx for the input.

I have now tested with all the options from psyyo:
Code:
# the list of kernel options
:~# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-5.15.83-1-pve root=/dev/mapper/lv1--ssd-root ro quiet iommu=pt amd_iommu=on nofb nomodeset initcall_blacklist=sysfb_init

#########################################

# there is nothing in the blacklist that should make problems
:~# cat /etc/modprobe.d/*
blacklist nouveau
options kvm ignore_msrs=1
# mdadm module configuration file
# set start_ro=1 to make newly assembled arrays read-only initially,
# to prevent metadata writes.  This is needed in order to allow
# resume-from-disk to work - new boot should not perform writes
# because it will be done behind the back of the system being
# resumed.  See http://bugs.debian.org/415441 for details.

options md_mod start_ro=1
# This file contains a list of modules which are not supported by Proxmox VE

# nidiafb see bugreport https://bugzilla.proxmox.com/show_bug.cgi?id=701
blacklist nvidiafb

#########################################

# make really sure all modules are loaded
:~# cat /etc/modules
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.


# Generated by sensors-detect on Mon Sep 13 19:15:50 2021
# Chip drivers
nct6775

# hardware passthroug
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

#########################################

# make sure I have the right id of my card
:~# lspci | grep -i vga
05:00.0 VGA compatible controller: NVIDIA Corporation GP108 [GeForce GT 1030] (rev a1)
0c:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 744c (rev c8)

#########################################

# the pve config of my windows machine
:~# cat /etc/pve/qemu-server/13211901.conf
agent: 1
balloon: 0
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 12
cpu: host
efidisk0: encrypted-nvme-pool:vm-13211901-disk-1,size=1M
hostpci0: 0000:0c:00,pcie=1,x-vga=1
ide2: none,media=cdrom
machine: pc-q35-7.1
memory: 20480
name: game-win01
net0: e1000=7E:F3:10:92:DD:A3,bridge=vmbr0
numa: 0
ostype: win10
scsi0: encrypted-nvme-pool:vm-13211901-disk-0,discard=on,size=500G
scsihw: virtio-scsi-pci
smbios1: uuid=b32f2bc9-a21f-4288-9ea8-38cc24770be4
sockets: 1
usb0: host=7-1.2,usb3=1
usb1: host=7-1.3,usb3=1
usb2: host=7-1.1,usb3=1
usb3: host=7-1.4,usb3=1
vga: none
vmgenid: a147c51a-355c-480b-ad54-0242de56a8b6

#########################################

# I also tested:
hostpci0: 0000:0c:00,pcie=1
hostpci0: 0000:0c:00,pcie=1,x-vga=1,romfile=extra/asus_7900xtx_navi_31_2023-01-24.rom
hostpci0: 0000:0c:00,pcie=1,romfile=extra/asus_7900xtx_navi_31_2023-01-24.rom

Nothing has worked so far.
The only thing that changes: if I use the romfile, I have no reset bug. Without the romfile I can only start the Windows guest once, and then I have to reboot the whole host.
Code:
kvm: ../hw/pci/pci.c:1562: pci_irq_handler: Assertion `0 <= irq_num && irq_num < PCI_NUM_PINS' failed.

I also see more and more people who have the same problem as me and who had a 6000-series AMD card running before:
https://www.reddit.com/r/VFIO/comments/zn0zdm/7900xtx_passthrough_pci_reset_seems_not_working/
https://www.reddit.com/r/VFIO/comments/zx4v0h/radeon_rx_7900_xt_gpu_passthrough/

But there are also people who managed to get it to work with Unraid and Arch Linux.
I will now add an extra disk and install Windows on bare metal to see if everything works (currently I do not know whether my card has the hotspot problem, in which case I need to send it back anyway)
Later I may give Arch Linux a try, and maybe I will also try a 6.1 kernel, which seems to support Resizable BAR in KVM
https://www.reddit.com/r/VFIO/comments/ye0cpj/psa_linux_v61_resizable_bar_support/
 

azmosult

Ok, I have now tested my 7900xtx.
The card works fine under Windows (bare metal) and has no hotspot problem (15 min stress test and everything was ok)

I also tried the 6.1 kernel in Proxmox, which changed nothing about the black screen but made my mouse wheel super sensitive (10 times the speed). And once my Linux VM did not start. Currently the kernel seems to be a little bit buggy. But the kernel is opt-in and I was happy I could test it.

I installed Arch Linux on an extra disk and set up everything until libvirtd was running (which was a hassle -.- )
I created a test linux vm and passed the 1030 to it -> works fine
Then I switched to the 7900xtx and did some tests with the same vm (romfile, xml changes, ...) -> always a black screen :(

So it could be my hardware setup (because other people got it running on Arch Linux), or I did not have exactly the same host/VM config they had.

Things I can do now:
Get a 4070ti, a 4080 or a 6950 xt (which would suck)
Use the 7900xtx on bare metal (would also suck)
Hope that someone figures out how to get this to work in the near future


Maybe I will try it again with Arch in the next few days, but for now I'm out of ideas and motivation.
 
