AMD 7900 XT or XTX pci passthrough

azmosult

Jan 25, 2023
Hi

I currently have a well-working Proxmox setup (7.3, I think -> can confirm this later) with 2 main VMs.

Setup:
Debian 11 Proxmox Server
Windows 10 pro 64 bit -> Mostly for Gaming
Ubuntu 22.04 64 bit -> For working stuff
Some other VMs

Hardware:
ASUS TUF Gaming X570-Plus Mainboard Socket AM4
AMD Ryzen CPU
Nvidia 980ti
Nvidia 1030 (I think -> can confirm this later)
Some ECC RAM
Bunch of disks

I have PCI passthrough running for both cards. The 980ti goes to the Windows VM and the 1030 to the Linux VM.
This setup has been running for about a year and just works great.


I have now got a new AMD 7900 XTX graphics card and wanted to swap out the 980ti.
As far as I can see, the passthrough to the Windows VM works, but I only get a black screen.
I can connect to Windows via rdesktop and see the graphics card in the Device Manager, but with error code 43.
I tried to install the AMD drivers for the card, but nothing changed.

I have tried many things to get this setup to work, but had no success:
1.) Made sure all the kernel PCI passthrough options are set:
Code:
video=efifb:off, amd_iommu=on, iommu=pt, ...
2.) Set the VM machine type to q35-7.1
3.) Dumped the vbios from the card under Linux (echo 1 > rom, ...) and under Windows (GPU-Z) (the Linux dump was about 128 KB and the GPU-Z dump about 2 MB)
4.) Used the vbios rom file for the VM (romfile=vbios.rom)
5.) Tested different combinations of the passthrough options from the GUI
6.) Read that someone got it working in Unraid https://forums.unraid.net/topic/132895-radeon-rx-7900-xt-passthrough/ -> it is also KVM, so it should at least be possible
...
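
For anyone who wants to repeat step 3.) under Linux, here is a minimal sketch of the sysfs ROM dump plus a basic sanity check. The function names dump_vbios and rom_has_signature are made up for this example, and the device path in the usage note is only a placeholder for your own card's address. The check only confirms that the file starts like a PCI option ROM (bytes 0x55 0xAA); it says nothing about whether the dump is complete.

```shell
#!/bin/sh
# dump_vbios <pci-device-dir> <output-file>
# Dumps the option ROM of a PCI device through its sysfs "rom" attribute.
dump_vbios() {
    rom="$1/rom"
    [ -e "$rom" ] || { echo "no rom attribute under $1" >&2; return 1; }
    echo 1 > "$rom" || return 1   # enable read access to the ROM
    cat "$rom" > "$2"
    status=$?
    echo 0 > "$rom"               # disable read access again
    return "$status"
}

# rom_has_signature <file>
# Succeeds if the file starts with the PCI option ROM signature 0x55 0xAA.
rom_has_signature() {
    sig=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    [ "$sig" = "55aa" ]
}
```

Usage as root would look something like: dump_vbios /sys/bus/pci/devices/0000:0c:00.0 /tmp/vbios.rom && rom_has_signature /tmp/vbios.rom. The dump usually only works while no driver is actively using the card.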

None of that worked and I'm nearly out of ideas now.
The last ideas I have:
I just saw that 'Resizable BAR'/'Smart Access Memory' can be a problem with AMD cards -> have to test that
I can update the host to a newer BIOS version
I can remove the 1030 and try with only the 7900xtx (the passthrough of the 1030 to the Linux VM still works flawlessly)


Does anyone have a working passthrough on Proxmox with one of the new AMD cards (7900 XT or XTX)?
Does anyone have ideas about what I could have missed?
This setup has so many benefits for me that I really do not want to go back to bare metal.
 
OK, I have Proxmox 7.3-4
And the second GPU is an Nvidia GeForce GT 1030

I have now updated to the latest BIOS
In the BIOS, SVM and IOMMU are enabled
I made sure 'Resizable BAR'/'Smart Access Memory' is disabled in the BIOS
After the BIOS update I tested whether I can still pass through the 1030 -> yes, works flawlessly

I also tested removing the 1030 and using only the 7900xtx in the system
All my tests failed again.

But I saw one change when I start the VM multiple times without the rom file.
Sometimes I get the Proxmox boot logo on the screen and then the animated Windows boot circle dots -> so I see something from Windows
When Windows is done booting, the upper half of the screen turns black and the lower half shows the remains of the Proxmox boot logo


Now I'm out of ideas.
I don't even know where I could start debugging (the logs on the host look OK)
Maybe it's my hardware setup, but how can I test this?
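
One possible starting point for debugging is to check which kernel driver is bound to the card on the host. This is only a sketch: driver_in_use is a made-up helper name, and it just parses the "Kernel driver in use:" line that lspci -nnk prints for a device.

```shell
#!/bin/sh
# driver_in_use
# Reads `lspci -nnk` output for one device on stdin and prints the bound
# kernel driver, or "none" when no driver is bound.
driver_in_use() {
    drv=$(sed -n 's/^[[:space:]]*Kernel driver in use: //p')
    echo "${drv:-none}"
}
```

Usage: lspci -nnk -s <address of your card> | driver_in_use. While the VM is running, the card would normally be expected to show vfio-pci here. Filtering the host log with dmesg | grep -i -e vfio -e AMD-Vi can also show whether the IOMMU came up and whether vfio complained about the device.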
 
if the 7900 is anything like the 6900, my settings might help with what you're trying to do.

ok
disabling Resizable BAR - good first step.


in /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt amd_iommu=on nofb nomodeset initcall_blacklist=sysfb_init"

in /etc/modprobe.d/
make sure to NOT have blacklist radeon set in any .conf files

in /etc/pve/qemu-server/
your VMs .conf file - the actual GPU entry hostpci0: 0000:0c:00,pcie=1,x-vga=1
obviously address will differ, just pointing out flags i use

Point of these settings is that AMD hardware virtualizes much cleaner than Nvidia, and all those weird tricks and workarounds actually hinder the integration.
 
Please allow me to give some observations about this working passthrough setup.
in /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt amd_iommu=on nofb nomodeset initcall_blacklist=sysfb_init"
amd_iommu=on does nothing because it is on by default.
iommu=pt probably does nothing, please let me know if it makes a difference for anything for you (or anybody else, I'm really curious but heard nothing so far).
initcall_blacklist=sysfb_init is a work-around if you have only one GPU, but I would not expect that you need that since the 6000-series are supposed to reset properly and you ought to be able to let Proxmox load amdgpu as normal. Do you really need it?
nofb nomodeset will prevent amdgpu from using the GPU during boot (like the one above but also other (non-boot) GPUs) and, like above, I don't expect you to need it with a 6900. Do you really need it?
in /etc/modprobe.d/
make sure to NOT have blacklist radeon set in any .conf files
radeon is for much older AMD GPUs and does not touch newer GPUs, amdgpu is used for 6000- and 7000-series. Does this make a difference for your setup?
in /etc/pve/qemu-server/
your VMs .conf file - the actual GPU entry hostpci0: 0000:0c:00,pcie=1,x-vga=1
Primary GPU (x-vga=1) is a fix for NVidia GPUs. For AMD GPUs, it should be enough to set Display to None.

I'm sorry that my reply does not help passthrough of the 7900 but I'm also interested whether it will work, and I would prefer to keep the configuration as clean as possible with only the things we actually need.
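
To see which of the options discussed above actually reached the running kernel (after update-grub and a reboot), the kernel command line can be checked directly. A small sketch; has_kparam and report_kparams are names invented for this example, and they only compare whole words, so iommu=pt does not match amd_iommu=on.

```shell
#!/bin/sh
# has_kparam <cmdline-string> <parameter>
# Succeeds if <parameter> appears as a whole word in the kernel command line.
has_kparam() {
    for word in $1; do
        [ "$word" = "$2" ] && return 0
    done
    return 1
}

# report_kparams <cmdline-string> <parameter>...
# Prints one "set:" or "missing:" line per requested parameter.
report_kparams() {
    cmdline=$1
    shift
    for p in "$@"; do
        if has_kparam "$cmdline" "$p"; then
            echo "set:     $p"
        else
            echo "missing: $p"
        fi
    done
}
```

Usage: report_kparams "$(cat /proc/cmdline)" iommu=pt nomodeset initcall_blacklist=sysfb_init. Removing one option at a time and re-checking is a simple way to find out which ones a given setup really needs.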
 
Thx for the input.

I have now tested with all the options from psyyo:
Code:
# the list of kernel options
:~# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-5.15.83-1-pve root=/dev/mapper/lv1--ssd-root ro quiet iommu=pt amd_iommu=on nofb nomodeset initcall_blacklist=sysfb_init

#########################################

# there is nothing in the blacklist that should make problems
:~# cat /etc/modprobe.d/*
blacklist nouveau
options kvm ignore_msrs=1
# mdadm module configuration file
# set start_ro=1 to make newly assembled arrays read-only initially,
# to prevent metadata writes.  This is needed in order to allow
# resume-from-disk to work - new boot should not perform writes
# because it will be done behind the back of the system being
# resumed.  See http://bugs.debian.org/415441 for details.

options md_mod start_ro=1
# This file contains a list of modules which are not supported by Proxmox VE

# nidiafb see bugreport https://bugzilla.proxmox.com/show_bug.cgi?id=701
blacklist nvidiafb

#########################################

# make really sure all modules are loaded
:~# cat /etc/modules
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.


# Generated by sensors-detect on Mon Sep 13 19:15:50 2021
# Chip drivers
nct6775

# hardware passthroug
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

#########################################

# make sure I have the right id of my card
:~# lspci | grep -i vga
05:00.0 VGA compatible controller: NVIDIA Corporation GP108 [GeForce GT 1030] (rev a1)
0c:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 744c (rev c8)

#########################################

# the pve config of my windows machine
:~# cat /etc/pve/qemu-server/13211901.conf
agent: 1
balloon: 0
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 12
cpu: host
efidisk0: encrypted-nvme-pool:vm-13211901-disk-1,size=1M
hostpci0: 0000:0c:00,pcie=1,x-vga=1
ide2: none,media=cdrom
machine: pc-q35-7.1
memory: 20480
name: game-win01
net0: e1000=7E:F3:10:92:DD:A3,bridge=vmbr0
numa: 0
ostype: win10
scsi0: encrypted-nvme-pool:vm-13211901-disk-0,discard=on,size=500G
scsihw: virtio-scsi-pci
smbios1: uuid=b32f2bc9-a21f-4288-9ea8-38cc24770be4
sockets: 1
usb0: host=7-1.2,usb3=1
usb1: host=7-1.3,usb3=1
usb2: host=7-1.1,usb3=1
usb3: host=7-1.4,usb3=1
vga: none
vmgenid: a147c51a-355c-480b-ad54-0242de56a8b6

#########################################

# I also tested:
hostpci0: 0000:0c:00,pcie=1
hostpci0: 0000:0c:00,pcie=1,x-vga=1,romfile=extra/asus_7900xtx_navi_31_2023-01-24.rom
hostpci0: 0000:0c:00,pcie=1,romfile=extra/asus_7900xtx_navi_31_2023-01-24.rom

Nothing has worked so far.
The only thing that changes: if I use the romfile, I have no reset bug. Without the romfile I can start the Windows guest only once and then have to reboot the whole host.
Code:
kvm: ../hw/pci/pci.c:1562: pci_irq_handler: Assertion `0 <= irq_num && irq_num < PCI_NUM_PINS' failed.
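
For the reset problem, one place to look is which reset mechanisms the host kernel reports for the card. This is a sketch under the assumption that the kernel is 5.15 or newer and exposes the reset_method attribute in sysfs; reset_methods is a made-up helper name and the path in the usage note is a placeholder.

```shell
#!/bin/sh
# reset_methods <pci-device-dir>
# Prints the reset methods the kernel lists for the device, or
# "none reported" if the attribute is missing or empty.
reset_methods() {
    methods=$(cat "$1/reset_method" 2>/dev/null)
    if [ -n "$methods" ]; then
        echo "$methods"
    else
        echo "none reported"
    fi
}
```

Usage: reset_methods /sys/bus/pci/devices/0000:0c:00.0. This only shows what the kernel thinks it can use; it does not prove that the card really comes back in a clean state after a reset.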

I also see more and more people who have the same problems as me and had a 6000 series AMD card running before:
https://www.reddit.com/r/VFIO/comments/zn0zdm/7900xtx_passthrough_pci_reset_seems_not_working/
https://www.reddit.com/r/VFIO/comments/zx4v0h/radeon_rx_7900_xt_gpu_passthrough/

But there are also people who managed to get it working with Unraid and Arch Linux.
I will now install an extra disk and install Windows on bare metal to see if everything works (currently I do not know whether my card has the hotspot problem and I need to send it back anyway)
Later I may give Arch Linux a try; maybe I will also try to get a 6.1 kernel, which seems to support Resizable BAR in KVM
https://www.reddit.com/r/VFIO/comments/ye0cpj/psa_linux_v61_resizable_bar_support/
 
OK, I have now tested my 7900xtx.
The card works fine under Windows (bare metal) and has no hotspot problem (15 min stress test and everything was OK)

I also tried the 6.1 kernel in Proxmox, which changed nothing about the black screen but made my mouse wheel super sensitive (10 times the speed). And once my Linux VM did not start. Currently the kernel seems to be a little bit buggy. But the kernel is opt-in and I was happy I could test it.

I installed Arch Linux on an extra disk and set everything up until libvirtd was running (which was a hassle -.- )
I created a test Linux VM and passed the 1030 to it -> works fine
Then I switched to the 7900xtx and did some tests with the same VM (romfile, XML changes, ...) -> always a black screen :(

So it could be my hardware setup (because other people got it running on Arch Linux), or I did not have exactly the same host/VM config they had.

Things I can do now:
Get a 4070ti, a 4080 or a 6950 xt (which would suck)
Use the 7900xtx on bare metal (would also suck)
Hope that someone figures out how to get this working in the near future


Maybe I will try it again with Arch in the next few days, but for now I'm out of ideas and motivation.
 
Were you able to get this to work with the AMD GPU? If so, can you share what worked?

Thanks
 
No, I never got it running :(
I found another user for whom it seems to work.
https://bbs.minisforum.com/threads/proxmox-gpu-passthrough-rx-7900-xt.2572/
I asked for details and tried the same configuration -> it did not work for me either.

So I assume that it can work if you have the right hardware combination. Or it is something in the config that I'm overlooking.
My solution is now bare metal Windows, and I will build an extra PC for Proxmox in the future (this was so frustrating that from now on I will only use setups where I know they still work after a hardware change).
 
Sorry, I cannot test it anymore. I already have my 2-PC setup with Windows on bare metal and Proxmox on a separate machine.
On Proxmox I have an older AMD GPU (6000 series) passed through to my Linux VM. It simply worked out of the box.

This setup also has another benefit. The power supply in my Windows PC broke, so currently I use my Proxmox machine as a fallback, which works fine.
 
hello fellas!
i followed many tutorials and also "100% working" solutions and went through a lot... none helped me. i still had a black screen on my monitors....

what helped me was this one thing (well, two)... enabling CSM in the BIOS did the trick!

i'm still booting in UEFI mode with CSM turned on... but with CSM disabled i simply get a black screen. when enabling CSM you also have to disable Resizable BAR (it is often deactivated automatically once CSM is enabled)...

my setup is:
gigabyte aorus x570 pro rev1 (https://www.gigabyte.com/Motherboard/X570-AORUS-PRO-rev-10#kf)
amd ryzen 7 5800x3d
32G of 2x16G @ cl14 3600 (https://www.gskill.com/product/165/326/1604306130/F4-3600C14D-32GTZN)
sapphire pulse rx 7900 xtx (https://www.sapphiretech.com/en/consumer/pulse-radeon-rx-7900-xtx-24g-gddr6)
many disks (1x 1T pcie4 wd sn850x, 1x 500G pcie3 samsung evo plus 970, 1x 2T pcie3 samsung evo plus 970, 2x old sata ssds)
rtl8125 pcie x1 card (2500mbit network at home...)

the attachments are performance tests made on bare metal and in the VM (i allocated all the CPU cores to the VM for the test). the higher score is the bare metal test... i should also mention that my bare metal setup uses the 1T wd sn850x (pcie4 disk), which has a raw score of almost 50k in the test, while the VM is installed on my older system disk, the 500G samsung evo plus 970 (pcie3 disk), which has a raw score of about 28k... but the proxmox VM is set up with write-back enabled, and i must say that the RAM cache helps the disk a lot here: the VM disk gets basically the same score as the sn850x, but that result comes from the RAM and the write-back.

and i'm not finished yet... i just managed to get it working stably. i'm still not able to install the official AMD drivers... i'm using the drivers windows installs by default and i guess that's not the very best...

i must also add that i was forced to swap my dual intel i225-lm pcie3 x4 card for the single realtek rtl8125 pcie3 x1 network card, because with the intel card every passthrough attempt crashed the host horribly (completely dead... had to do a hard reset with the hw button on the pc case). this is (in my opinion) because of pcie lanes. the cpu only has 20+4 pcie lanes... and my setup with 3x nvme, gpu and an x4 networking card was simply too much... even a reboot took 2-3 minutes hanging on the bios post screen before the boot process continued... and it doesn't even work in bare metal windows with this many pci-e devices... on bare metal i simply do not use the 500G nvme; i moved it to my Intel NUC when i replaced it with the sn850x, but for the sake of testing proxmox and passthrough i put it back into the PC so i don't need to touch my regular disks... after changing the x4 network card for an x1 one, it works both on bare metal and in VMs..
 

Attachments:
  • 361673611_845835800220987_7085492855734203472_n.png (88.6 KB)
  • 362040106_1266592543978023_7459207962681982349_n.png (448.5 KB)
Anyone here found a solution to the atomics problem yet? I posted a thread but have no responses yet...... very frustrating!
 
I have an XFX RX 7800 XT and have the same issue. I tried this on Unraid and have the same issues. Only way I can get it to load is with the bios w/ CSM boot enabled but you can never shutdown or reboot the VM or the whole system crashes and you lose everything.
 
on my system i had to:
  1. make sure to have CSM (compatibility support module) enabled in bios. you will need to install proxmox and boot it in UEFI mode, but the CSM has to be enabled anyways - this did the final trick with the black screen issue you talk about
  2. make sure you properly set up proxmox to do passthrough on your system... (read this: PCI-e passthrough and this: PCI passthrough)
    1. make sure to add vfio, vfio_iommu_type1, vfio_pci, vfio_virqfd modules (one module per new line) to the /etc/modules file and run command: update-initramfs -u -k all afterwards
    2. disable some drivers from loading when proxmox is booting. i have it all added to /etc/default/grub (there is very probably more stuff than needed on my system, but it should work for amd, intel and nvidia cards...). by putting all of this on this one line, you will always be able to edit it on the GRUB loading screen by pressing E, so if something goes wrong you can simply reboot and edit the line before booting the OS to troubleshoot... setting iommu=off on the GRUB loading screen will effectively disable the passthrough functionality for your next boot...
      1. GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=on iommu=pt pcie_acs_override=downstream,multifunction initcall_blacklist=sysfb_init nofb nomodeset video=simplefb:off video=vesafb:off video=efifb:off video=vesa:off disable_vga=1 vfio_iommu_type1.allow_unsafe_interrupts=1 kvm.ignore_msrs=1 modprobe.blacklist=radeon,nouveau,nvidia,nvidiafb,nvidia-gpu,snd_hda_intel,snd_hda_codec_hdmi,i915 vfio-pci.ids=144d:a808,2646:5013,1002:744c,1002:ab30,1022:1487"
        1. this is for AMD CPU. for Intel CPU change the amd_iommu=on to intel_iommu=on.
      2. to find out your pci.ids= to include here, use lspci -nn command. i do a passthrough of my: nvme2, nvme3, 7900xtx gpu, 7900xtx hdmi sound card, and motherboard soundcard
      3. after editing the /etc/default/grub make sure to run command: update-grub
  3. reboot system
  4. when assigning the PCI devices, make sure to use all 3 select buttons for your gpu: Primary GPU, All Functions and PCI-Express. this will add pcie=1,x-vga=1 to the VM config
the CSM is only needed for Windows VMs. for me, linux works without black screen with native UEFI boot mode without CSM enabled.
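
For the vfio-pci.ids= part of the steps above, the IDs can be pulled out of the lspci -nn output instead of being copied by hand. A minimal sketch; pci_ids_for is a made-up helper name, and it simply takes the last [vendor:device] pair on every line that starts with the given slot address.

```shell
#!/bin/sh
# pci_ids_for <slot-prefix>
# Reads `lspci -nn` output on stdin and prints the vendor:device IDs of all
# functions whose address starts with <slot-prefix>, comma separated.
pci_ids_for() {
    grep "^$1" |
        sed -n 's/.*\[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)\].*/\1/p' |
        paste -s -d, -
}
```

Usage: lspci -nn | pci_ids_for 0c:00 prints the IDs of the GPU and its HDMI audio function when 0c:00 is the slot of your card. The result can be appended to the vfio-pci.ids= list; the slot address is only an example and will differ per system.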

I am also able to freely reboot my VMs, with just one issue: the Windows VM loses the motherboard audio card when rebooted. i need to either reboot the whole host OR boot some linux VM that uses the audio card too in between...

I am successfully using my PC with Win10, Win11, and some Arch Linuxes set up with desktop environments... the only limitation is that you need to run only one of the VMs at a time, if they use the same PCI-E devices. but thats understandable...
 