Encountering issues with passing through a 7900XTX in PVE 8.0.

promise

New Member
Aug 27, 2023
After attempting to pass through the 7900XTX to Windows in PVE 8.0 and encountering screen distortion, I spent a considerable amount of time troubleshooting with no success. Subsequently, I used rom-parser to validate the VBIOS extracted using GPU-Z to determine if it was of "type 3." However, the output indicated it was "type 0." Following this, I attempted to export directly from PVE, as well as using a VBIOS downloaded from techpowerup, but the issue persisted. Currently, I'm attempting single GPU passthrough, which necessitates a functional VBIOS. Could you please provide guidance on potential solutions?
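As a quick sanity check before running rom-parser, a dumped VBIOS file can be tested for the PCI option ROM signature: the first two bytes should be 0x55 0xAA. The sketch below only checks that header; it does not distinguish "type 3" from "type 0". The function name rom_has_signature and the example path in the note are assumptions for illustration, not something taken from this thread.

```shell
#!/bin/sh
# Return 0 if FILE starts with the PCI option ROM signature bytes 0x55 0xAA.
# rom_has_signature is a hypothetical helper name used for this sketch.
rom_has_signature() {
    # od prints the first two bytes as hex; tr strips spaces and newlines.
    sig=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    [ "$sig" = "55aa" ]
}
```

For example, rom_has_signature /usr/share/kvm/7900XTX.rom && echo "signature OK", assuming the ROM file was copied to /usr/share/kvm/ so that romfile=7900XTX.rom can find it.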

1693151802303.png
 
Did you try this work-around? Did you disable Resizable BAR (SAM/CAM) in the motherboard BIOS?
I don't think ROM patching will fix the screen distortion, or that it is necessary with your GPU, but I could be wrong. What kind of screen distortion are we talking about?
 
After adding the following kernel parameters, the issue persists:
Code:
initcall_blacklist=sysfb_init video=efifb:off video=simplefb:off
On the first start after booting PVE, the virtual machine displays a distorted picture. After stopping the virtual machine and starting it again, the monitor shows no output.
Currently, the GRUB boot parameters are set as follows:
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt amd_iommu=on nofb nomodeset initcall_blacklist=sysfb_init video=efifb:off video=simplefb:off"
 
amd_iommu=on does nothing, as it is on by default. Also, video=efifb:off and video=simplefb:off don't do anything on Proxmox, and nofb nomodeset probably don't do anything either when initcall_blacklist=sysfb_init is used.
What is the output of cat /proc/cmdline? Just to double-check that your parameters are active, as it looks like initcall_blacklist=sysfb_init is not working.
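That check can be scripted instead of reading the line by eye. This is a minimal sketch; the helper name has_kernel_param is made up for illustration. It matches a parameter only as a whole word, so iommu=pt will not match inside a longer parameter such as amd_iommu=pt.

```shell
#!/bin/sh
# Return 0 if PARAM appears as a whole, space-separated word in CMDLINE.
# has_kernel_param is a hypothetical helper name used for this sketch.
has_kernel_param() {
    cmdline=$1
    param=$2
    # Pad both sides with spaces so only complete words can match.
    case " $cmdline " in
        *" $param "*) return 0 ;;
    esac
    return 1
}
```

On the host it could be used as has_kernel_param "$(cat /proc/cmdline)" initcall_blacklist=sysfb_init && echo "parameter is active".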
 
Code:
root@pve:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-6.2.16-3-pve root=/dev/mapper/pve-root ro quiet iommu=pt nofb nomodeset initcall_blacklist=sysfb_init video=efifb:off video=simplefb:off
Take a look at the output and compare it with the GRUB configuration parameters.
VM configuration:
Code:
bios: ovmf
boot: order=hostpci1
cores: 16
cpu: host
efidisk0: local-lvm:vm-100-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:2f:00,pcie=1,x-vga=1,romfile=7900XTX.rom
hostpci1: 0000:22:00,pcie=1
ide2: none,media=cdrom
machine: pc-q35-8.0
memory: 16384
meta: creation-qemu=8.0.2,ctime=1693237235
name: Win11
net0: virtio=CA:23:5D:FA:4C:7C,bridge=vmbr0,firewall=1
numa: 0
ostype: win11
scsihw: virtio-scsi-single
smbios1: uuid=8d73030d-9061-4330-9954-1afeec9a92a9
sockets: 1
tpmstate0: local-lvm:vm-100-disk-1,size=4M,version=v2.0
vmgenid: 932c3ffc-b585-4f12-be79-d576910f353e
 
The kernel command line appears to match your GRUB configuration. Sometimes people edit GRUB many times before figuring out that systemd-boot is used instead.
Please don't use Primary GPU (x-vga=1): it is intended as a work-around for NVIDIA GPUs, although I don't expect removing it to fix the corruption. I also don't expect you to need a ROM file, and I would set Display to None (vga: none) to make sure the VM uses the GPU, but those changes probably won't fix the corruption either.
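A minimal sketch of that change, assuming the hostpci lines look like the ones in the VM configuration posted earlier. The hypothetical helper strip_hostpci_opts drops the x-vga and romfile options from hostpci lines read on standard input and leaves every other line unchanged. It only prints the result and does not edit the VM configuration.

```shell
#!/bin/sh
# Print the configuration from stdin with x-vga=... and romfile=... removed
# from hostpciN lines. strip_hostpci_opts is a hypothetical helper name.
strip_hostpci_opts() {
    # Only lines starting with "hostpci<number>:" are rewritten.
    sed -E '/^hostpci[0-9]+:/ s/,(x-vga|romfile)=[^,]*//g'
}
```

To preview the result for VM 100, something like strip_hostpci_opts < /etc/pve/qemu-server/100.conf should work, assuming the configuration lives at that path.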
Maybe try running echo 0 | tee /sys/class/vtconsole/vtcon*/bind before starting the VM to unbind all active consoles? That might fix the corruption.
Do you let the amdgpu driver load for the GPU on Proxmox, or did you blacklist it and/or early-bind the GPU to vfio-pci in /etc/modprobe.d/?
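One way to answer the driver question is to look at the "Kernel driver in use" lines that lspci -nnk prints for each function of the GPU. The sketch below condenses that output to one "slot driver" pair per line. The helper name drivers_in_use is an assumption for illustration, and it relies on the usual lspci layout of an unindented header line per device followed by indented detail lines.

```shell
#!/bin/sh
# Read `lspci -nnk` output on stdin and print "<slot> <driver>" for every
# function that has a kernel driver bound.
# drivers_in_use is a hypothetical helper name used for this sketch.
drivers_in_use() {
    awk '
        /^[0-9a-f]/ { slot = $1 }                      # device header line
        /Kernel driver in use:/ { print slot, $NF }    # indented detail line
    '
}
```

On the host this would be run as lspci -nnk -s 2f:00 | drivers_in_use. Functions without a bound driver produce no line.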
 
Code:
root@pve:~# cat /etc/modprobe.d/*
blacklist nouveau
blacklist radeon
blacklist amdgpu
options kvm ignore_msrs=1
# This file contains a list of modules which are not supported by Proxmox VE

# nvidiafb see bugreport https://bugzilla.proxmox.com/show_bug.cgi?id=701
blacklist nvidiafb
options vfio-pci ids=1002:744c,1002:ab30  disable_vga=1
The above is the configuration in /etc/modprobe.d/*.


Entering echo 0 | tee /sys/class/vtconsole/vtcon*/bind before starting the VM results in an error.

Code:
swtpm_setup: Not overwriting existing state file.
kvm: ../hw/pci/pci.c:1613: pci_irq_handler: Assertion `0 <= irq_num && irq_num < PCI_NUM_PINS' failed.
stopping swtpm instance (pid 6067) due to QEMU startup error
TASK ERROR: start failed: QEMU exited with code 1
dmesg
[ 484.031681] vfio-pci 0000:2f:00.0: Unable to change power state from D3cold to D0, device inaccessible
 
You blacklist amdgpu, but not the other drivers used by the functions of the device, like snd_hda_intel and maybe a USB driver?
I really did not expect that error. Is that after a fresh reboot? Did you change other things?

Can you try the following: don't blacklist amdgpu and don't bind 1002:744c and 1002:ab30 to vfio-pci; remove nofb nomodeset initcall_blacklist=sysfb_init video=efifb:off video=simplefb:off from /etc/default/grub; run update-initramfs -u and update-grub; reboot; run echo 0 | tee /sys/class/vtconsole/vtcon*/bind; and then start the VM? Please also don't use x-vga and romfile in the VM configuration for this test.

If that fails, then early-bind all functions of the GPU (lspci -nnks 2f:00) to vfio-pci, but also add softdep amdgpu pre: vfio-pci and a softdep for every other driver used by the functions of the GPU, to make sure vfio-pci is loaded first and the actual drivers don't touch it. Also add initcall_blacklist=sysfb_init back to /etc/default/grub, run update-initramfs -u and update-grub, reboot, and start the VM.
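The early-bind step can be sketched as a small generator for the modprobe.d fragment. The helper name vfio_modprobe_conf and the target file name in the note below are assumptions, and the driver list has to come from the actual lspci -nnks 2f:00 output; amdgpu and snd_hda_intel are just the two drivers named in this thread.

```shell
#!/bin/sh
# Print a modprobe.d fragment that binds the given PCI IDs to vfio-pci and
# makes each listed driver load only after vfio-pci.
# Usage: vfio_modprobe_conf IDS DRIVER...
# vfio_modprobe_conf is a hypothetical helper name used for this sketch.
vfio_modprobe_conf() {
    ids=$1
    shift
    printf 'options vfio-pci ids=%s\n' "$ids"
    for drv in "$@"; do
        # softdep ... pre: asks modprobe to load vfio-pci before this driver.
        printf 'softdep %s pre: vfio-pci\n' "$drv"
    done
}
```

For example, vfio_modprobe_conf 1002:744c,1002:ab30 amdgpu snd_hda_intel > /etc/modprobe.d/vfio.conf, followed by update-initramfs -u and a reboot. The blacklist amdgpu line would presumably be removed at the same time, since the softdep approach replaces it.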
 
I have just tried setting the display to VMware compatible and then started the virtual machine. Viewed through the console, when the VBIOS of the 7900XTX is not loaded, Device Manager reports Code 43. When the VBIOS exported with GPU-Z is loaded, the driver works normally, but a connected monitor does not show a proper picture.
Code:
bios: ovmf
boot: 
cores: 16
cpu: host
efidisk0: local-lvm:vm-100-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:22:00,pcie=1
hostpci1: 0000:2f:00,pcie=1,x-vga=1,romfile=7900XTX.rom
machine: pc-q35-8.0
memory: 16384
meta: creation-qemu=8.0.2,ctime=1693237235
name: Win11
net0: virtio=CA:23:5D:FA:4C:7C,bridge=vmbr0,firewall=1
numa: 0
ostype: win11
scsihw: virtio-scsi-single
smbios1: uuid=8d73030d-9061-4330-9954-1afeec9a92a9
sockets: 1
tpmstate0: local-lvm:vm-100-disk-1,size=4M,version=v2.0
vga: vmware
vmgenid: 932c3ffc-b585-4f12-be79-d576910f353e
1693245465105.png
Loading the VBIOS