Hello Proxmox Forum,
I have been battling this issue for quite some time and have finally reached the point where I need to ask for help. I have written up my current setup and settings below and would appreciate anyone taking the time to help out. Thank you in advance!
Objective: Pass the attached GPU through to a Windows 11 virtual machine.
References:
https://pve.proxmox.com/wiki/PCI(e)_Passthrough
https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF
https://3os.org/infrastructure/proxmox/gpu-passthrough/gpu-passthrough-to-vm/
https://www.reddit.com/r/homelab/comments/b5xpua/the_ultimate_beginners_guide_to_gpu_passthrough/
https://www.reddit.com/r/Proxmox/comments/1118opd/psa_gpu_passthrough_on_single_gpu_systems/
Hardware:
CPU - AMD Ryzen 9 7900X
Motherboard - Asus ROG B650E
GPU - AMD Sapphire RX 6950 XT
RAM - Corsair 4x 32 GB DDR5
Versions:
BIOS is updated to version 1654
Proxmox is upgraded to the latest version 8 release
The Windows 11 VM works with no issues; I just need to get GPU passthrough working
BIOS Settings: IOMMU is Enabled, ROM-Bar is Disabled.
VM Settings:
Currently the entire host crashes when I start the VM with 'All Functions' enabled.
Contents of /etc/default/grub:
Code:
GRUB_DEFAULT=0
GRUB_TIMEOUT=0
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on pcie_acs_override=downstream,multifunction video=efifb:off video=vesa:off vfio-pci.ids=1002:164e,1002:1640,1022:1649,1022:15b6,1022:15b7 vfio_iommu_type1.allow_unsafe_interrupts=1 kvm.ignore_msrs=1 mos=1 modprobe.blacklist=radeon,nouveau,nvidia,nvidiafb,nvidia-gpu"
# Also tried the line below and left it commented as I am using the line above
# GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on initcall_blacklist=sysfb_init amd_iommu=on initcall_blacklist=sysfb_init amd_iommu=on iommu=pt pcie_acs_override=downstream,multifunction nofb nomodeset video=vesafb:off,efifb:off"
GRUB_CMDLINE_LINUX=""
The kernel modules are all in /etc/modules:
Code:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
Code:
[ 0.000000] Warning: PCIe ACS overrides enabled; This may allow non-IOMMU protected peer-to-peer DMA
[ 0.094343] AMD-Vi: Unknown option - 'on'
[ 0.223018] AMD-Vi: Using global IVHD EFR:0x246577efa2254afa, EFR2:0x0
[ 0.459382] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[ 0.461462] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 0.461463] AMD-Vi: Extended features (0x246577efa2254afa, 0x0): PPR NX GT [5] IA GA PC GA_vAPIC
[ 0.461467] AMD-Vi: Interrupt remapping enabled
[ 0.544277] AMD-Vi: Virtual APIC enabled
[ 0.544675] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank)
find /sys/kernel/iommu_groups/ -type l
Code:
/sys/kernel/iommu_groups/17/devices/0000:02:0c.0
/sys/kernel/iommu_groups/7/devices/0000:00:08.1
/sys/kernel/iommu_groups/25/devices/0000:0b:00.1
/sys/kernel/iommu_groups/15/devices/0000:02:0a.0
/sys/kernel/iommu_groups/5/devices/0000:00:04.0
/sys/kernel/iommu_groups/23/devices/0000:0a:00.0
/sys/kernel/iommu_groups/13/devices/0000:02:08.0
/sys/kernel/iommu_groups/3/devices/0000:00:02.2
/sys/kernel/iommu_groups/21/devices/0000:08:00.0
/sys/kernel/iommu_groups/11/devices/0000:01:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:02.0
/sys/kernel/iommu_groups/28/devices/0000:0b:00.4
/sys/kernel/iommu_groups/18/devices/0000:02:0d.0
/sys/kernel/iommu_groups/8/devices/0000:00:08.3
/sys/kernel/iommu_groups/26/devices/0000:0b:00.2
/sys/kernel/iommu_groups/16/devices/0000:02:0b.0
/sys/kernel/iommu_groups/6/devices/0000:00:08.0
/sys/kernel/iommu_groups/24/devices/0000:0b:00.0
/sys/kernel/iommu_groups/14/devices/0000:02:09.0
/sys/kernel/iommu_groups/4/devices/0000:00:03.0
/sys/kernel/iommu_groups/22/devices/0000:09:00.0
/sys/kernel/iommu_groups/12/devices/0000:02:00.0
/sys/kernel/iommu_groups/2/devices/0000:00:02.1
/sys/kernel/iommu_groups/20/devices/0000:07:00.0
/sys/kernel/iommu_groups/10/devices/0000:00:18.3
/sys/kernel/iommu_groups/10/devices/0000:00:18.1
/sys/kernel/iommu_groups/10/devices/0000:00:18.6
/sys/kernel/iommu_groups/10/devices/0000:00:18.4
/sys/kernel/iommu_groups/10/devices/0000:00:18.2
/sys/kernel/iommu_groups/10/devices/0000:00:18.0
/sys/kernel/iommu_groups/10/devices/0000:00:18.7
/sys/kernel/iommu_groups/10/devices/0000:00:18.5
/sys/kernel/iommu_groups/29/devices/0000:0c:00.0
/sys/kernel/iommu_groups/0/devices/0000:00:01.0
/sys/kernel/iommu_groups/19/devices/0000:06:00.0
/sys/kernel/iommu_groups/9/devices/0000:00:14.3
/sys/kernel/iommu_groups/9/devices/0000:00:14.0
/sys/kernel/iommu_groups/27/devices/0000:0b:00.3
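The raw `find` output above is in directory order, which makes it hard to see which devices share a group. As a convenience, here is a minimal sketch that sorts those paths by group number and prints one short line per device. The function names `sort_iommu_groups` and `format_iommu_groups` are made up for illustration, and the sketch assumes the paths have exactly the layout shown above.

```shell
#!/bin/sh
# Sort IOMMU group symlink paths numerically by group, then by PCI address.
# Expects paths such as /sys/kernel/iommu_groups/24/devices/0000:0b:00.0 on stdin.
# With "/" as the separator, field 5 is the group number and field 7 the address.
sort_iommu_groups() {
    LC_ALL=C sort -t/ -k5,5n -k7,7
}

# Print "group <n>: <pci address>" for each path read from stdin.
format_iommu_groups() {
    sort_iommu_groups | awk -F/ '{ printf "group %s: %s\n", $5, $7 }'
}
```

On the host this would be run as `find /sys/kernel/iommu_groups/ -type l | format_iommu_groups`.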
lspci -nn
Code:
00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14d8]
00:00.2 IOMMU [0806]: Advanced Micro Devices, Inc. [AMD] Device [1022:14d9]
00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14da]
00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14da]
00:02.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14db]
00:02.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14db]
00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14da]
00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14da]
00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14da]
00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14dd]
00:08.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14dd]
00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 71)
00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e0]
00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e1]
00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e2]
00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e3]
00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e4]
00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e5]
00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e6]
00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e7]
01:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f4] (rev 01)
02:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
02:08.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
02:09.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
02:0a.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
02:0b.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
02:0c.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
02:0d.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
06:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller I225-V [8086:15f3] (rev 03)
07:00.0 Network controller [0280]: MEDIATEK Corp. MT7921K (RZ608) Wi-Fi 6E 80MHz [14c3:0608]
08:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f7] (rev 01)
09:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f6] (rev 01)
0a:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
0b:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Raphael [1002:164e] (rev c2)
0b:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt Radeon High Definition Audio Controller [1002:1640]
0b:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] VanGogh PSP/CCP [1022:1649]
0b:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:15b6]
0b:00.4 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:15b7]
0c:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:15b8]
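To make it easier to cross-check the `vfio-pci.ids` list from the GRUB line against this listing, here is a small sketch that filters `lspci -nn` output down to the devices whose `[vendor:device]` ID is in a comma-separated list. The helper name `match_vfio_ids` is hypothetical, not part of any Proxmox tooling.

```shell
#!/bin/sh
# Print the lspci -nn lines (read from stdin) whose [vendor:device] ID
# appears in the comma-separated list passed as $1.
match_vfio_ids() {
    # Turn "a:b,c:d" into one fixed-string pattern per line: "[a:b]" and "[c:d]".
    patterns=$(printf '%s\n' "$1" | tr ',' '\n' | sed 's/.*/[&]/')
    # -F treats each pattern as a literal string, so the brackets are not a regex class.
    grep -F -e "$patterns"
}
```

Usage would be `lspci -nn | match_vfio_ids "1002:164e,1002:1640,1022:1649,1022:15b6,1022:15b7"`. Against the listing above, those five IDs correspond to functions 0b:00.0 through 0b:00.4.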
Details on 0b:00.0:
Code:
0b:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Raphael (rev c2) (prog-if 00 [VGA controller])
	Subsystem: ASUSTeK Computer Inc. Raphael
	Flags: bus master, fast devsel, latency 0, IRQ 62, IOMMU group 24
	Memory at fce0000000 (64-bit, prefetchable) [size=256M]
	Memory at fcf0000000 (64-bit, prefetchable) [size=2M]
	I/O ports at f000 [size=256]
	Memory at fc900000 (32-bit, non-prefetchable) [size=512K]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
	Capabilities: [64] Express Legacy Endpoint, MSI 00
	Capabilities: [a0] MSI: Enable- Count=1/4 Maskable- 64bit+
	Capabilities: [c0] MSI-X: Enable- Count=4 Masked-
	Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [270] Secondary PCI Express
	Capabilities: [2a0] Access Control Services
	Capabilities: [2b0] Address Translation Service (ATS)
	Capabilities: [2c0] Page Request Interface (PRI)
	Capabilities: [2d0] Process Address Space ID (PASID)
	Capabilities: [410] Physical Layer 16.0 GT/s <?>
	Capabilities: [450] Lane Margining at the Receiver <?>
	Kernel driver in use: vfio-pci
	Kernel modules: amdgpu
nano /etc/modprobe.d/pve-blacklist.conf
Code:
blacklist nvidiafb
blacklist nvidia
blacklist radeon
blacklist nouveau