AMD 7900 XT or XTX PCI passthrough

azmosult
Jan 25, 2023
Hi

I currently have a well-working Proxmox setup (7.3 I think -> can confirm this later) with 2 main VMs.

Setup:
Debian 11 Proxmox server
Windows 10 Pro 64-bit -> mostly for gaming
Ubuntu 22.04 64-bit -> for work
Some other VMs

Hardware:
ASUS TUF Gaming X570-Plus mainboard, Socket AM4
AMD Ryzen CPU
Nvidia 980 Ti
Nvidia 1030 (I think -> can confirm this later)
Some ECC RAM
Bunch of disks

I have PCI passthrough running for both cards: the 980 Ti goes to the Windows VM and the 1030 to the Linux VM.
This setup has been running for about a year now and just works great.


I now got a new AMD 7900 XTX graphics card and wanted to swap out the 980 Ti.
As far as I can see, the passthrough to the Windows VM works, but I only get a black screen.
I can connect to Windows via rdesktop and see the graphics card in the Device Manager, but with error Code 43.
I tried to install the AMD drivers for the card, but nothing changed.

I have now tried many things to get this setup to work, but had no success:
1.) Made sure all the kernel PCI passthrough options are set:
Code:
video=efifb:off amd_iommu=on iommu=pt ...
2.) Set the VM machine type to q35-7.1
3.) Dumped the vBIOS from the card under Linux (echo 1 > rom, ...; see the sketch after this list) and under Windows (GPU-Z) (the Linux dump was about 128KB, the GPU-Z dump about 2MB)
4.) Used the vBIOS ROM file for the VM (romfile=vbios.rom)
5.) Tested different combinations of the passthrough options from the GUI
6.) Read that someone got it working in Unraid https://forums.unraid.net/topic/132895-radeon-rx-7900-xt-passthrough/ -> it's also KVM, so it should at least be possible
...
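A minimal sketch of the sysfs dump mentioned in point 3 (assuming the card sits at 0000:0c:00.0, as in the lspci output further down; run as root on the host while nothing is actively using the card):
Code:
# enable ROM reads, copy the vBIOS out, then disable reads again
cd /sys/bus/pci/devices/0000:0c:00.0
echo 1 > rom
cat rom > /root/vbios.rom
echo 0 > rom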

None of that worked and I'm nearly out of ideas now.
The last ideas I have:
I just saw that 'Resizable BAR'/'Smart Access Memory' can be a problem with AMD cards -> have to test that
I can update to a newer BIOS version on the host
I can try to remove the 1030 card and test only the 7900 XTX (the passthrough of the 1030 to the Linux VM still works flawlessly)
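One more standard check that belongs on that list: confirming the 7900 XTX is isolated in its own IOMMU group (a generic sketch, not specific to this board):
Code:
# print every PCI device together with its IOMMU group
for g in /sys/kernel/iommu_groups/*; do
  for d in "$g"/devices/*; do
    echo "IOMMU group ${g##*/}: $(lspci -nns "${d##*/}")"
  done
done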


Does anyone have a working passthrough on Proxmox with one of the new AMD cards (7900 XT or XTX)?
Does anyone have an idea what I could have missed?
This setup has so many benefits for me that I really do not want to step back to bare metal.
 
OK, I have Proxmox 7.3-4,
and the second GPU is an Nvidia GeForce GT 1030.

I have now updated to the latest BIOS.
In the BIOS, SVM and IOMMU are enabled.
I made sure 'Resizable BAR'/'Smart Access Memory' is disabled in the BIOS.
After the BIOS update I tested whether I can still pass through the 1030 -> yes, it works flawlessly.
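(A quick way to double-check the host side after such BIOS changes - the standard verification from the PVE docs:)
Code:
dmesg | grep -e DMAR -e IOMMU
# should report the IOMMU (AMD-Vi) being enabled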

I also tested removing the 1030 from the system and using only the 7900 XTX.
All my tests failed again.

But I noticed one change when I start the VM multiple times without the ROM file:
sometimes I get the Proxmox boot logo on the screen and the animated Windows boot circle dots -> so I do see something from Windows.
When Windows is done booting, the upper half of the screen turns black and the lower half shows the remains of the Proxmox boot logo.


Now I'm out of ideas.
I don't even know where I could start the debugging process (the logs on the host look OK).
Maybe it's my hardware setup, but how can I test this?
 
If the 7900 is anything like the 6900, my settings might help with what you're trying to do.

OK, disabling Resizable BAR - good first step.


in /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt amd_iommu=on nofb nomodeset initcall_blacklist=sysfb_init"

in /etc/modprobe.d/
make sure NOT to have blacklist radeon set in any .conf files

in /etc/pve/qemu-server/
your VM's .conf file - the actual GPU entry: hostpci0: 0000:0c:00,pcie=1,x-vga=1
obviously the address will differ, I'm just pointing out the flags I use

The point of these settings is that AMD hardware virtualizes much more cleanly than Nvidia, and all those weird tricks and workarounds actually hinder the integration.
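(For anyone following along: after editing /etc/default/grub the change still has to be applied and the host rebooted - standard GRUB workflow:)
Code:
update-grub
reboot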
 
Please allow me to give some observations about this working passthrough setup.
in /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt amd_iommu=on nofb nomodeset initcall_blacklist=sysfb_init"
amd_iommu=on does nothing because it is on by default.
iommu=pt probably does nothing, please let me know if it makes a difference for anything for you (or anybody else, I'm really curious but heard nothing so far).
initcall_blacklist=sysfb_init is a work-around if you have only one GPU, but I would not expect that you need that since the 6000-series are supposed to reset properly and you ought to be able to let Proxmox load amdgpu as normal. Do you really need it?
nofb nomodeset will prevent amdgpu from using the GPU during boot (like the option above, but also for other (non-boot) GPUs) and, like above, I don't expect you to need it with a 6900. Do you really need it?
in /etc/modprobe.d/
make sure to NOT have blacklist radeon set in any .conf files
radeon is for much older AMD GPUs and does not touch newer ones; amdgpu is used for the 6000- and 7000-series. Does this make a difference for your setup?
in /etc/pve/qemu-server/
your VMs .conf file - the actual GPU entry hostpci0: 0000:0c:00,pcie=1,x-vga=1
Primary GPU (x-vga=1) is a fix for NVidia GPUs. For AMD GPUs, it should be enough to set Display to None.

I'm sorry that my reply does not help with passthrough of the 7900, but I'm also interested in whether it will work, and I would prefer to keep the configuration as clean as possible, with only the things we actually need.
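(For reference, a sketch of the equivalent CLI, assuming VMID 100 - qm set is the standard PVE tool for this:)
Code:
qm set 100 --vga none
qm set 100 --hostpci0 0000:0c:00,pcie=1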
 
Thanks for the input.

I have now tested with all the options from psyyo:
Code:
# the list of kernel options
:~# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-5.15.83-1-pve root=/dev/mapper/lv1--ssd-root ro quiet iommu=pt amd_iommu=on nofb nomodeset initcall_blacklist=sysfb_init

#########################################

# there is nothing in the blacklists that should cause problems
:~# cat /etc/modprobe.d/*
blacklist nouveau
options kvm ignore_msrs=1
# mdadm module configuration file
# set start_ro=1 to make newly assembled arrays read-only initially,
# to prevent metadata writes.  This is needed in order to allow
# resume-from-disk to work - new boot should not perform writes
# because it will be done behind the back of the system being
# resumed.  See http://bugs.debian.org/415441 for details.

options md_mod start_ro=1
# This file contains a list of modules which are not supported by Proxmox VE

# nvidiafb: see bug report https://bugzilla.proxmox.com/show_bug.cgi?id=701
blacklist nvidiafb

#########################################

# make really sure all modules are loaded
:~# cat /etc/modules
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.


# Generated by sensors-detect on Mon Sep 13 19:15:50 2021
# Chip drivers
nct6775

# hardware passthrough
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

#########################################

# make sure I have the right id of my card
:~# lspci | grep -i vga
05:00.0 VGA compatible controller: NVIDIA Corporation GP108 [GeForce GT 1030] (rev a1)
0c:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 744c (rev c8)

#########################################

# the pve config of my windows machine
:~# cat /etc/pve/qemu-server/13211901.conf
agent: 1
balloon: 0
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 12
cpu: host
efidisk0: encrypted-nvme-pool:vm-13211901-disk-1,size=1M
hostpci0: 0000:0c:00,pcie=1,x-vga=1
ide2: none,media=cdrom
machine: pc-q35-7.1
memory: 20480
name: game-win01
net0: e1000=7E:F3:10:92:DD:A3,bridge=vmbr0
numa: 0
ostype: win10
scsi0: encrypted-nvme-pool:vm-13211901-disk-0,discard=on,size=500G
scsihw: virtio-scsi-pci
smbios1: uuid=b32f2bc9-a21f-4288-9ea8-38cc24770be4
sockets: 1
usb0: host=7-1.2,usb3=1
usb1: host=7-1.3,usb3=1
usb2: host=7-1.1,usb3=1
usb3: host=7-1.4,usb3=1
vga: none
vmgenid: a147c51a-355c-480b-ad54-0242de56a8b6

#########################################

# I also tested:
hostpci0: 0000:0c:00,pcie=1
hostpci0: 0000:0c:00,pcie=1,x-vga=1,romfile=extra/asus_7900xtx_navi_31_2023-01-24.rom
hostpci0: 0000:0c:00,pcie=1,romfile=extra/asus_7900xtx_navi_31_2023-01-24.rom

Nothing has worked so far.
The only thing that changes: if I use the romfile, I don't hit the reset bug. Without the romfile I can only start the Windows guest once and then have to reboot the whole host.
Code:
kvm: ../hw/pci/pci.c:1562: pci_irq_handler: Assertion `0 <= irq_num && irq_num < PCI_NUM_PINS' failed.
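One way to narrow down the reset bug is to check whether the GPU even advertises a Function Level Reset (a generic lspci check, using this card's address):
Code:
lspci -vv -s 0000:0c:00.0 | grep -i reset
# "FLReset+" in the DevCap line means the function supports FLR;
# "FLReset-" means vfio has no clean per-function reset to fall back on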

I also see more and more people who have the same problems as me and who had a 6000-series AMD card running before:
https://www.reddit.com/r/VFIO/comments/zn0zdm/7900xtx_passthrough_pci_reset_seems_not_working/
https://www.reddit.com/r/VFIO/comments/zx4v0h/radeon_rx_7900_xt_gpu_passthrough/

But there are also people who managed to get it working with Unraid and Arch Linux.
I'm now installing Windows bare metal on an extra disk to see if everything works (currently I don't know whether my card has the hotspot problem and I'd need to send it back anyway).
Later I may give Arch Linux a try; maybe I'll also try a 6.1 kernel, which seems to support Resizable BAR in KVM:
https://www.reddit.com/r/VFIO/comments/ye0cpj/psa_linux_v61_resizable_bar_support/
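(For anyone who wants to try the same: on PVE 7.x the 6.1 kernel is an opt-in package, assuming the standard Proxmox repositories:)
Code:
apt update
apt install pve-kernel-6.1
reboot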
 
OK, I have now tested my 7900 XTX.
The card works fine under Windows (bare metal) and has no hotspot problem (15 min stress test and everything was OK).

I also tried the 6.1 kernel in Proxmox, which changed nothing on the black-screen front but made my mouse wheel super sensitive (10 times the speed), and once my Linux VM did not start. Currently that kernel seems to be a little buggy. But it's opt-in, and I was happy I could test it.

I installed Arch Linux on an extra disk and set everything up until libvirtd was running (which was a hassle -.- ).
I created a test Linux VM and passed the 1030 to it -> works fine.
Then I switched to the 7900 XTX and did some tests with the same VM (romfile, XML changes, ...) -> always a black screen :(

So it could be my hardware setup (since other people got it running on Arch Linux), or I didn't have exactly the same host/VM config they had.

Things I can do now:
Get a 4070 Ti, a 4080, or a 6950 XT (which would suck)
Use the 7900 XTX bare metal (would also suck)
Hope that someone figures out how to get this working in the near future


Maybe I will try it again with Arch in the next few days, but for now I'm out of ideas and motivation.
 
Were you able to get this to work with the AMD GPU? If so, can you share what worked?

Thanks
 
No, I never got it running :(
I found another user for whom it seems to work:
https://bbs.minisforum.com/threads/proxmox-gpu-passthrough-rx-7900-xt.2572/
I asked for details and tried the same configuration -> it also did not work for me.

So I assume it can work if you have the right hardware combination, or there is something in the config that I'm overlooking.
My solution is now bare-metal Windows, and I will build an extra PC for Proxmox in the future. (This was so frustrating that from now on I will only use setups where I know passthrough still works after a hardware change.)
 
Sorry, I cannot test it anymore. I already have my 2-PC setup with Windows bare metal and Proxmox on a separate machine.
On Proxmox I have an older AMD GPU (6000 series) passed to my Linux VM. It simply worked out of the box.

This setup also has another benefit: the power supply in my Windows PC broke, so currently I'm using my Proxmox machine as a fallback, which works fine.
 
Hello fellas!
I followed many tutorials and supposedly 100% working solutions and went through a lot... none of it helped me. I still had a black screen on the monitors...

What helped me was this one thing (well, two)... enabling CSM in the BIOS did the trick!

I'm still booting UEFI with CSM turned on... but with CSM disabled I simply get a black screen. With CSM enabled you also have to disable Resizable BAR (often it deactivates automatically after enabling CSM)...

My setup is:
Gigabyte X570 Aorus Pro rev1.0 (https://www.gigabyte.com/Motherboard/X570-AORUS-PRO-rev-10#kf)
AMD Ryzen 7 5800X3D
32G as 2x16G @ CL14 3600 (https://www.gskill.com/product/165/326/1604306130/F4-3600C14D-32GTZN)
Sapphire Pulse RX 7900 XTX (https://www.sapphiretech.com/en/consumer/pulse-radeon-rx-7900-xtx-24g-gddr6)
Many disks (1x 1T PCIe4 WD SN850X, 1x 500G PCIe3 Samsung 970 Evo Plus, 1x 2T PCIe3 Samsung 970 Evo Plus, 2x old SATA SSDs)
RTL8125 PCIe x1 card (2500 Mbit network at home...)

The attachments are performance tests made on bare metal and in the VM (I allocated all the CPU cores to the VM for the test); the higher score is the bare-metal test. I should also mention that my bare-metal setup uses the 1T WD SN850X (PCIe4 disk), which gets a raw score of almost 50k in the test, while the VM is installed on my older system disk, the 500G Samsung 970 Evo Plus (PCIe3 disk), which gets a raw score of about 28k. But the Proxmox VM is set up with write-back enabled, and I must say the RAM cache helps the disk a lot and affects the result: the VM disk gets basically the same score as the SN850X, but that is inflated by the RAM and the write-back.

And I'm not finished yet... I just managed to get it working stably. I'm still not able to install the official AMD drivers... I'm using the drivers Windows installs by default, and I guess that's not the very best...

I must also add that I was forced to exchange my dual Intel i225-LM PCIe3 x4 card for a single Realtek RTL8125 PCIe3 x1 network card, because with it every passthrough attempt killed the host horribly (completely dead... had to do a hard reset with the case power button). This is (in my opinion) because of PCIe lanes: the CPU only has 20+4 lanes, and my setup with 3x NVMe, the GPU and an x4 networking card was simply too much. Even a reboot took 2-3 minutes hanging on the BIOS POST screen before the boot process continued, and it doesn't even work in bare-metal Windows with this many PCIe devices. On bare metal I simply do not use the 500G NVMe (I moved it to my Intel NUC when I swapped in the SN850X), but for the sake of testing Proxmox and passthrough I put it back in the PC so I don't need to touch my regular disks. After changing the x4 networking card for an x1 one, it works both on bare metal and in VMs.
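(If you want to see what link width each device actually negotiated, the LnkSta line in the lspci output shows it - a generic check, run as root:)
Code:
# print each device header followed by its negotiated PCIe link state
lspci -vv | grep -E '^[0-9a-f]|LnkSta:'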
 

Attachments:
  • 361673611_845835800220987_7085492855734203472_n.png
  • 362040106_1266592543978023_7459207962681982349_n.png
Has anyone here found a solution to the atomics problem yet? I posted a thread but have had no responses yet... very frustrating!
 
I have an XFX RX 7800 XT and have the same issue; I tried this on Unraid and hit the same problems. The only way I can get it to load is with CSM boot enabled in the BIOS, but then you can never shut down or reboot the VM, or the whole system crashes and you lose everything.
 
On my system I had to:
  1. Make sure CSM (Compatibility Support Module) is enabled in the BIOS. You will need to install Proxmox and boot it in UEFI mode, but the CSM has to be enabled anyway - this did the final trick with the black-screen issue you talk about.
  2. Make sure you properly set up Proxmox to do passthrough on your system (read this: PCI-e passthrough and this: PCI passthrough).
    1. Make sure to add the vfio, vfio_iommu_type1, vfio_pci, vfio_virqfd modules (one module per line) to the /etc/modules file and run the command update-initramfs -u -k all afterwards.
    2. Disable some drivers from loading while Proxmox is booting. I have it all added to /etc/default/grub (there is very probably more stuff than needed on my system, but it should work for AMD, Intel and Nvidia cards). With everything on this one line, you can always edit it on the GRUB loading screen by pressing E, so in case of issues you can reboot and adjust it before booting the OS; setting iommu=off on the GRUB loading screen will effectively disable the passthrough functionality for that boot.
      1. GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=on iommu=pt pcie_acs_override=downstream,multifunction initcall_blacklist=sysfb_init nofb nomodeset video=simplefb:off video=vesafb:off video=efifb:off video=vesa:off disable_vga=1 vfio_iommu_type1.allow_unsafe_interrupts=1 kvm.ignore_msrs=1 modprobe.blacklist=radeon,nouveau,nvidia,nvidiafb,nvidia-gpu,snd_hda_intel,snd_hda_codec_hdmi,i915 vfio-pci.ids=144d:a808,2646:5013,1002:744c,1002:ab30,1022:1487"
        1. This is for an AMD CPU; for an Intel CPU change amd_iommu=on to intel_iommu=on.
      2. To find the vfio-pci.ids= values to include here, use the lspci -nn command (see the sketch after this list). I pass through my nvme2, nvme3, the 7900 XTX GPU, the 7900 XTX HDMI sound device, and the motherboard sound card.
      3. After editing /etc/default/grub, make sure to run the command update-grub.
  3. Reboot the system.
  4. When assigning the PCI devices, make sure to use all 3 checkboxes for your GPU: Primary GPU, All Functions and PCI-Express. This will add pcie=1,x-vga=1 to the VM config.
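A short sketch of that ID lookup (the example IDs are the ones from the GRUB line above; the exact output differs per system):
Code:
# list devices with their [vendor:device] IDs; pick out GPU, HDMI audio, NVMe, ...
lspci -nn | grep -Ei 'vga|audio|nvme'
# e.g. the 7900 XTX shows up as [1002:744c] and its HDMI audio as [1002:ab30];
# those values go into vfio-pci.ids= (comma-separated)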
The CSM is only needed for Windows VMs; for me, Linux works without a black screen in native UEFI boot mode, without CSM enabled.

I am also able to freely reboot my VMs, with just one issue: the Windows VM, when rebooted, loses the motherboard audio card. I then need to either reboot the whole host OR boot some Linux VM that also uses the audio card in between...

I am successfully using my PC with Win10, Win11, and some Arch Linuxes set up with desktop environments... the only limitation is that you can run only one of the VMs at a time if they use the same PCIe devices. But that's understandable...
 
STATUS: Proxmox 8.3.0: As of today the AMD 7900 XTX does not work. When I pass it to a VM, it looks fine and lspci shows the card, but while installing the AMD ROCm drivers the whole VM crashes, and a restart of the VM is then not possible until the whole Proxmox server is restarted. After installing the drivers (use case: LLM inference), the VM reproducibly crashes on traffic to the GPU.

Has anyone got a hassle-free working setup?
 
I fed this thread, this reddit thread, and this level1tech thread to o4-mini-high and asked it for insights.

I can't tell you exactly which of the following changes is the magic bullet, but I got an AMD Radeon 7900 XTX working via passthrough on PVE 8.3.

Code:
proxmox-boot-tool kernel pin 6.14.0-2-pve
proxmox-boot-tool refresh
reboot
# After reboot:
uname -r  # should show: 6.14.0-2-pve
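(If you're unsure which kernel versions are installed and pinnable, proxmox-boot-tool can list them - standard PVE tooling:)
Code:
proxmox-boot-tool kernel list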


Edited /etc/default/grub and set:
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt pci_vendor_reset=1 pcie_acs_override=downstream,multifunction nofb nomodeset initcall_blacklist=sysfb_init video=vesafb:off video=efifb:off video=vesa:off video=simplefb:off"

Followed by
Code:
update-grub


/etc/modprobe.d/blacklist-graphics.conf:
Code:
blacklist amdgpu
blacklist radeon
blacklist nouveau
blacklist nvidia
blacklist i40evf

VFIO & KVM module options


Created these files

/etc/modprobe.d/iommu_unsafe_interrupts.conf:

Code:
options vfio_iommu_type1 allow_unsafe_interrupts=1

/etc/modprobe.d/kvm.conf:
Code:
options kvm ignore_msrs=1

Rebuilt initramfs & reboot
Code:
update-initramfs -u -k all
reboot

My vm config looks like:
Code:
❯ qm config 4000
bios: ovmf
boot: order=ide2
cores: 8
cpu: host
efidisk0: local-lvm:vm-4000-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:03:00.0,pcie=1,x-vga=1,rombar=0
ide2: none,media=cdrom
machine: q35
memory: 24576
meta: creation-qemu=9.0.0,ctime=1744238629
name: ubuntu-gpu
net0: virtio=BC:24:11:5C:72:7E,bridge=vmbr0
onboot: 1
scsi0: local-lvm:vm-4000-disk-1,size=500G
scsihw: virtio-scsi-pci
smbios1: uuid=15831d3f-afcb-4d96-90e0-f1ad0dbc7907
vga: none
vmgenid: e84cbb6a-52d6-40c6-a9e8-ef9f6a7b5e3f

IMPORTANT: notice the absence of a hostpci1 entry for passing through the HDMI audio alongside the GPU. Whenever I tried to add it, I would end up in an IRQ storm. For my at-home ollama inference use case, that's more than fine. YMMV


On the VM, an Ubuntu 24.04 in my case, I sometimes have to reload the driver:
Code:
sudo modprobe -r amdgpu
sudo modprobe amdgpu

I knew things were working when I saw:
Code:
❯ ls -l /dev/dri
total 0
drwxr-xr-x  2 root root         80 Apr 18 21:19 by-path
crw-rw----+ 1 root video  226,   0 Apr 18 21:19 card0
crw-rw----+ 1 root render 226, 128 Apr 18 21:19 renderD128

Hope this helps, good luck!

My specific GPU is: SAPPHIRE NITRO+ AMD Radeon™ RX 7900 XTX Vapor-X 24GB GDDR6 DUAL HDMI / DUAL DP / 11322-01-40G, which incidentally is the least expensive card I could find these days at the 24GB size

And https://ollama.com/blog/amd-preview works great so far (I run it inside docker [had to use the render gid instead of the group name on this line], inside the VM).
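(For reference, a sketch of the ROCm container invocation from that ollama guide, with the numeric render gid swapped in as described - flags assumed from the public docs, not from this post:)
Code:
# resolve the render group to its numeric gid, then start the ROCm image
docker run -d --device /dev/kfd --device /dev/dri \
  --group-add $(getent group render | cut -d: -f3) \
  -v ollama:/root/.ollama -p 11434:11434 \
  --name ollama ollama/ollama:rocm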

But I need to be careful: when I put too large a model on it, it's the PVE node, not just the VM, that blows up and reboots. :p
 
But I need to be careful: when I put too large a model on it, it's the PVE node, not just the VM, that blows up and reboots. :p
I was about to pull the trigger on a 7900 XTX until I saw this line. It will blow up the whole PVE? WTF?
Btw, I saw someone say that 7900 XTX passthrough to Windows 11 has issues. Have you observed anything similar?
 
Since I posted, nothing has crashed, @frankmanzhu - that's almost 2 weeks of 100% uptime.

I also didn't stress the setup with oversized models, so YMMV :)

The new qwen3 model that just came out is pretty great; I loaded the 20GiB qwen3:32b-q4_K_M version and things are sailing smoothly.

You do need the latest ollama though, since the architecture differs from the previous models; just a heads up.

My reasoning is that things are only going to improve on the AMD side, so I'm looking forward to even more self-hostable models. So far, having great fun experimenting.

PS: I'm passing through to Ubuntu 24.04; I haven't tried Windows as I'm mostly interested in inference workloads over my LAN/tailnet.
 
Hot dayum... :cool:
Had to register on the forum to comment on pcuci's post.

I've been tinkering with my main computer wanting to try passthrough, specifically for GPUs.
I got passthrough to work after a couple of hours of tinkering with Proxmox VE 8.4.
I've tried it with Windows 11 and Fedora KDE 42.
I had issues when switching passthrough between the VMs, resulting in no signal to the monitor.
The only way to fix the issue was to reboot the host.
I did see several posts saying vendor reset was an issue.
I decided to check out Ubuntu today, but for the life of me I couldn't get the passthrough to work.
That was until I stumbled upon this post and tried out the settings.
Switching between the VMs no longer requires host reboots, and I finally got the Ubuntu passthrough to work.

THIS IS AWESOME!

Thanks for posting this...

Specs on system:
AMD 3950x Ryzen CPU
X570 Aorus Master
Radeon PowerColor 7900 XTX GPU