Hello, I've gone through so many posts and I feel like I'm losing it. Sorry for being 'another GPU passthrough' thread. I've spent about 20 hours over two days trying to figure this out, and I have made it work before on a much older system with a much older GPU...
It seems like the BIOS and the Proxmox host both see the GPU, but for whatever reason I can't get the Windows VM to boot when I enable the GPU. Without the PCI device (the GPU) added in the Hardware tab, the VM boots fine. With the mapping, it seems to load about 50% of the RAM and then freezes. I can't connect via RDP or the console, but removing the mapping immediately fixes the issue.
What's even odder: I configured this on a very old desktop (an i7-7700K that already has a 1080 Ti in it) running an older version of Proxmox (8.1.4), and the GPU works flawlessly there. It's the same VM: it was backed up, transferred, and restored, and I recreated the hardware linkages. I even went as far as copying the configs that I had set up on my old system (changing the hardware IDs and such).
Anyway, the system is a Minisforum MS-01 (Intel i9-13900) with the "newer" 1.26 BIOS, running Proxmox 8.3.0. The GPU is an Nvidia RTX 4000 SFF Ada, and the guest VM is Windows 11.
Here are a bunch of configs I've seen being asked for. My hope is that someone has used the MS-01 and set up a dGPU (not the iGPU) for full passthrough to a VM. I have been using this VM on the old hardware as an AI system, but I really want to migrate it to the new hardware. Any help is appreciated. I've used a bunch of guides, but I went back and rolled back a lot of the configs to try to match these two:
https://pve.proxmox.com/wiki/PCI_Passthrough
https://www.reddit.com/r/homelab/comments/b5xpua/the_ultimate_beginners_guide_to_gpu_passthrough/
Configs:
~# pveversion
pve-manager/8.3.0/c1689ccb1065a83b (running kernel: 6.8.12-4-pve)
~# vi /etc/default/grub
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
# info -f grub -n 'Simple configuration'
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
GRUB_CMDLINE_LINUX=""
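The file above only shows what is configured, not what the running kernel actually booted with. A minimal sketch for checking that follows. `cmdline_has` is a hypothetical helper name, and the sketch assumes the host boots through GRUB (hosts that boot through systemd-boot take their parameters from `/etc/kernel/cmdline` instead).

```shell
#!/bin/sh
# Hypothetical helper: succeed if the exact parameter $2 appears as a
# whole word in the kernel command line $1.
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Compare against the command line of the running kernel, if readable.
if [ -r /proc/cmdline ]; then
    running=$(cat /proc/cmdline)
    if cmdline_has "$running" "intel_iommu=on"; then
        echo "intel_iommu=on is active on the running kernel"
    else
        echo "intel_iommu=on is NOT on the running kernel command line"
    fi
fi
```

If the parameter is missing from the running kernel even though it is in `/etc/default/grub`, running `update-grub` and rebooting is the usual next step.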
~# cat /etc/modules
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
# Parameters can be specified after the module name.
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
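A side note on the module list above: to my understanding, from kernel 6.2 onward `vfio_virqfd` is folded into the `vfio` module, so on the 6.8 kernel shown earlier that last entry should be unnecessary (and harmless). A rough sketch of that check follows. `has_separate_virqfd` is a hypothetical helper, and it assumes release strings shaped like `6.8.12-4-pve`.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel release is older than 6.2,
# i.e. vfio_virqfd is still expected to be a separate module.
has_separate_virqfd() {
    release=$1
    major=${release%%.*}        # text before the first dot
    rest=${release#*.}          # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of the second component
    [ "$major" -lt 6 ] || { [ "$major" -eq 6 ] && [ "$minor" -lt 2 ]; }
}

if has_separate_virqfd "$(uname -r)"; then
    echo "vfio_virqfd should be a separate module on this kernel"
else
    echo "vfio_virqfd should be part of vfio on this kernel"
fi

# Show which vfio modules are actually loaded, if lsmod is available.
if command -v lsmod >/dev/null 2>&1; then
    lsmod | grep '^vfio' || echo "no vfio modules loaded"
fi
```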
~# lspci -v
01:00.0 VGA compatible controller: NVIDIA Corporation AD104GL [RTX 4000 SFF Ada Generation] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: NVIDIA Corporation AD104GL [RTX 4000 SFF Ada Generation]
	Flags: fast devsel, IRQ 16, IOMMU group 16
	Memory at 6d000000 (32-bit, non-prefetchable) [size=16M]
	Memory at 6000000000 (64-bit, prefetchable) [size=32G]
	Memory at 6800000000 (64-bit, prefetchable) [size=32M]
	I/O ports at 3000 [size=128]
	Expansion ROM at 6e000000 [disabled] [size=512K]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Legacy Endpoint, MSI 00
	Capabilities: [b4] Vendor Specific Information: Len=14 <?>
	Capabilities: [100] Virtual Channel
	Capabilities: [250] Latency Tolerance Reporting
	Capabilities: [258] L1 PM Substates
	Capabilities: [128] Power Budgeting <?>
	Capabilities: [420] Advanced Error Reporting
	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900] Secondary PCI Express
	Capabilities: [bb0] Physical Resizable BAR
	Capabilities: [c1c] Physical Layer 16.0 GT/s <?>
	Capabilities: [d00] Lane Margining at the Receiver <?>
	Capabilities: [e00] Data Link Feature <?>
	Kernel driver in use: vfio-pci
	Kernel modules: nvidiafb, nouveau

01:00.1 Audio device: NVIDIA Corporation AD104 High Definition Audio Controller (rev a1)
	Subsystem: NVIDIA Corporation AD104 High Definition Audio Controller
	Flags: fast devsel, IRQ 17, IOMMU group 17
	Memory at 6e080000 (32-bit, non-prefetchable) [size=16K]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [160] Data Link Feature <?>
	Kernel driver in use: vfio-pci
	Kernel modules: snd_hda_intel
~# find /sys/kernel/iommu_groups/ -type l | grep -e 16 -e 17
/sys/kernel/iommu_groups/17/devices/0000:01:00.1
/sys/kernel/iommu_groups/16/devices/0000:01:00.0
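The `grep -e 16 -e 17` filter above matches any path that happens to contain "16" or "17" (a group 116, or a device address with `:16.` in it), so it can show unrelated devices or hide the fact that a group has more members. A small sketch that lists exactly the members of the named groups follows. `group_of` and `list_groups` are hypothetical helper names, and the group numbers 16 and 17 are taken from the `lspci` output above.

```shell
#!/bin/sh
# Hypothetical helper: print "<group> <pci-address>" for one sysfs path such as
# /sys/kernel/iommu_groups/17/devices/0000:01:00.1
group_of() {
    path=$1
    addr=${path##*/}              # last path component: the PCI address
    rest=${path%/devices/*}       # strip the trailing "/devices/<addr>"
    printf '%s %s\n' "${rest##*/}" "$addr"
}

# Hypothetical helper: list every device in each IOMMU group given as an argument.
list_groups() {
    for want in "$@"; do
        for dev in /sys/kernel/iommu_groups/"$want"/devices/*; do
            [ -e "$dev" ] || continue   # group does not exist on this host
            group_of "$dev"
        done
    done
}

list_groups 16 17
```

If either group lists anything besides the GPU function it belongs to, those extra devices would have to be passed through together with it.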
~# cat /etc/modprobe.d/blacklist.conf
blacklist nvidia
blacklist nvidiafb
blacklist nouveau
~# dmesg | grep -e DMAR -e IOMMU
[ 0.019560] ACPI: DMAR 0x0000000042D78000 000088 (v02 INTEL EDK2 00000002 01000013)
[ 0.019586] ACPI: Reserving DMAR table memory at [mem 0x42d78000-0x42d78087]
[ 0.113819] DMAR: IOMMU enabled
[ 0.243187] DMAR: Host address width 39
[ 0.243188] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[ 0.243194] DMAR: dmar0: reg_base_addr fed90000 ver 4:0 cap 1c0000c40660462 ecap 29a00f0505e
[ 0.243196] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[ 0.243199] DMAR: dmar1: reg_base_addr fed91000 ver 5:0 cap d2008c40660462 ecap f050da
[ 0.243200] DMAR: RMRR base: 0x0000004c000000 end: 0x000000503fffff
[ 0.243202] DMAR-IR: IOAPIC id 2 under DRHD base 0xfed91000 IOMMU 1
[ 0.243203] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[ 0.243203] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 0.244749] DMAR-IR: Enabled IRQ remapping in x2apic mode
[ 0.666279] pci 0000:00:02.0: DMAR: Skip IOMMU disabling for graphics
[ 0.747178] DMAR: No ATSR found
[ 0.747179] DMAR: No SATC found
[ 0.747181] DMAR: IOMMU feature fl1gp_support inconsistent
[ 0.747181] DMAR: IOMMU feature pgsel_inv inconsistent
[ 0.747182] DMAR: IOMMU feature nwfs inconsistent
[ 0.747183] DMAR: IOMMU feature dit inconsistent
[ 0.747183] DMAR: IOMMU feature sc_support inconsistent
[ 0.747184] DMAR: IOMMU feature dev_iotlb_support inconsistent
[ 0.747185] DMAR: dmar0: Using Queued invalidation
[ 0.747187] DMAR: dmar1: Using Queued invalidation
[ 0.750790] DMAR: Intel(R) Virtualization Technology for Directed I/O
~# qm config 199
balloon: 0
bios: ovmf
boot: order=scsi0;ide0;ide2;net0
cores: 1
cpu: x86-64-v2-AES
efidisk0: raid1_vmstore:199/vm-199-disk-0.qcow2,efitype=4m,pre-enrolled-keys=1,size=528K
hostpci0: mapping=rtx4000,pcie=1,x-vga=1
ide0: raid1_isos:iso/virtio-win-0.1.240.iso,media=cdrom,size=612812K
ide2: raid1_isos:iso/Win11_23H2_English_x64v2.iso,media=cdrom,size=6653034K
machine: pc-q35-9.0
memory: 4096
meta: creation-qemu=9.0.2,ctime=1736308597
name: GPUTest
net0: virtio=BC:24:11:11:65:F8,bridge=vmbr0,firewall=1
numa: 0
ostype: win11
scsi0: raid1_vmstore:199/vm-199-disk-1.qcow2,iothread=1,size=32G
scsihw: virtio-scsi-single
smbios1: uuid=9be2f2e0-eda9-4a2b-b227-79877c8f382c
sockets: 2
tpmstate0: raid1_vmstore:199/vm-199-disk-2.raw,size=4M,version=v2.0
vga: virtio
vmgenid: 2acdd542-d395-45d3-b936-c3d76ac112a9
~# cat /etc/modprobe.d/vfio.conf (the two IDs belong to the GPU)
options vfio-pci ids=10de:27b0,10de:22bc disable_vga=1
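Since the `lspci -v` output above does not print numeric IDs, the IDs in `vfio.conf` can be cross-checked against the installed hardware with `lspci -nn -d`. The following is a minimal sketch of that check. `vfio_ids` is a hypothetical helper name, and the sketch assumes the IDs are given as a single comma-separated `ids=` option, as in the line above.

```shell
#!/bin/sh
# Hypothetical helper: print each vendor:device ID found in an
# "options vfio-pci ids=..." line, one per output line.
vfio_ids() {
    printf '%s\n' "$1" | sed -n 's/.*ids=\([^ ]*\).*/\1/p' | tr ',' '\n'
}

conf=/etc/modprobe.d/vfio.conf
if [ -r "$conf" ] && command -v lspci >/dev/null 2>&1; then
    while IFS= read -r line; do
        case $line in \#*) continue ;; esac    # skip commented-out lines
        for id in $(vfio_ids "$line"); do
            # lspci -d filters by vendor:device; empty output means no match
            if [ -n "$(lspci -nn -d "$id")" ]; then
                echo "$id: present"
            else
                echo "$id: no matching device found"
            fi
        done
    done < "$conf"
fi
```

Changes under `/etc/modprobe.d/` normally only take effect after `update-initramfs -u -k all` and a reboot.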