Hello everyone, I didn't mean to come here and bother you all, but after going through this whole forum (with many successful solutions), dozens of different guides and videos, and building, configuring, destroying, rebuilding and reconfiguring only to destroy again, I'm now officially desperate. I bought an HPE DL380 Gen9 server for my homelab and chose Proxmox VE so I could use a ZFS RAID for VM storage, so I'm quite new at this.
One of these VMs has to be a remote gaming machine, so I removed the flawlessly working Quadro K620, got a Zotac GTX 1080 Ti (blower) from a friend and started configuring.
No matter which guide I follow (old or new), I'm now at my 100th attempt, and what I get is always the infamous Code 43 in the Win10 VM.
Proxmox boots via GRUB from LVM on a hardware RAID-10 SSD array; the ZFS RAID is used for VM disks only. Some changes to the kernel parameters led me to the current situation, where the server console freezes at "Loading initial ramdisk...". I read about this on this forum and I'm confident it won't matter in the end, since the server and Proxmox will use the integrated GPU while the 1080 Ti is isolated and passed exclusively to the VM. The server does boot, and the other VMs work fine!
My current configuration is admittedly a mess, stitched together from material between one and six years old; things used to be simpler when I had the K620 and the Proxmox login screen was still displayed on the server console.
Code:
cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-5.15.108-1-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt initcall_blacklist=sysfb_init nomodeset video=vesafb:off video=efifb:off video=simplefb:off
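These parameters live in GRUB_CMDLINE_LINUX_DEFAULT; for completeness, this is roughly the relevant line of my /etc/default/grub (reconstructed from the cmdline above):
Code:
# /etc/default/grub (relevant line only, reconstructed from the cmdline above)
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt initcall_blacklist=sysfb_init nomodeset video=vesafb:off video=efifb:off video=simplefb:off"
# applied with:
#   update-grub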
IOMMU groups (some guides say the GPU must sit alone in its own group, others say the groups must be split... what's the answer?)
Code:
IOMMU Group 94:
84:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] [10de:1b06] (rev a1)
84:00.1 Audio device [0403]: NVIDIA Corporation GP102 HDMI Audio Controller [10de:10ef] (rev a1)
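For what it's worth, I got that listing with the usual sysfs walk; posting it in case I'm checking the wrong thing:
Code:
# list every PCI device together with its IOMMU group
for d in /sys/kernel/iommu_groups/*/devices/*; do
    n=${d#*/iommu_groups/}; n=${n%%/*}
    printf 'IOMMU Group %s: %s\n' "$n" "$(lspci -nns ${d##*/})"
done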
Code:
lspci -nnk
84:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] [10de:1b06] (rev a1)
Subsystem: ZOTAC International (MCO) Ltd. GP102 [GeForce GTX 1080 Ti] [19da:1470]
Kernel driver in use: vfio-pci
Kernel modules: nvidiafb, nouveau
84:00.1 Audio device [0403]: NVIDIA Corporation GP102 HDMI Audio Controller [10de:10ef] (rev a1)
Subsystem: ZOTAC International (MCO) Ltd. GP102 HDMI Audio Controller [19da:1470]
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
Code:
cat /etc/modules
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
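After every change to /etc/modules or anything in /etc/modprobe.d, I rebuilt the initramfs and rebooted, as the guides say (mentioning it in case I got even this part wrong):
Code:
update-initramfs -u -k all
reboot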
Code:
cat /etc/modprobe.d/blacklist.conf
blacklist radeon
blacklist nouveau
blacklist nvidia
Code:
cat /etc/modprobe.d/pve-blacklist.conf
# This file contains a list of modules which are not supported by Proxmox VE
# nidiafb see bugreport https://bugzilla.proxmox.com/show_bug.cgi?id=701
blacklist nvidiafb
Is there any difference between blacklisting a driver in blacklist.conf and doing it in pve-blacklist.conf?
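At least the blacklisting itself seems to work: after a reboot, none of those modules shows up on the host (which matches the "Kernel driver in use: vfio-pci" lines in the lspci output above).
Code:
# check that no blacklisted GPU driver is loaded on the host
lsmod | grep -e nouveau -e nvidia -e radeon
# (no output here)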
Code:
cat /etc/modprobe.d/kvm.conf
options kvm ignore_msrs=1 report_ignored_msrs=0
Code:
cat /etc/modprobe.d/iommu_unsafe_interrupts.conf
options vfio_iommu_type1 allow_unsafe_interrupts=1
Code:
cat /etc/modprobe.d/vfio.conf
options vfio-pci ids=10de:1b06,10de:10ef disable_vga=1
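I also experimented with the softdep variant that some newer guides recommend, to make sure vfio-pci grabs the card before any native driver (not sure it adds anything on top of the blacklists):
Code:
# /etc/modprobe.d/vfio.conf, variant with softdeps from newer guides
options vfio-pci ids=10de:1b06,10de:10ef disable_vga=1
softdep nouveau pre: vfio-pci
softdep nvidiafb pre: vfio-pci
softdep snd_hda_intel pre: vfio-pci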
Code:
dmesg | grep -i vfio
[ 8.232097] VFIO - User Level meta-driver version: 0.3
[ 8.241496] vfio-pci 0000:84:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[ 8.261387] vfio_pci: add [10de:1b06[ffffffff:ffffffff]] class 0x000000/00000000
[ 8.281316] vfio_pci: add [10de:10ef[ffffffff:ffffffff]] class 0x000000/00000000
[ 200.066232] vfio-pci 0000:84:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 200.086845] vfio-pci 0000:84:00.1: enabling device (0140 -> 0142)
[ 1456.386052] vfio-pci 0000:84:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 1557.679559] vfio-pci 0000:84:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 1709.051338] vfio-pci 0000:84:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 1929.574380] vfio-pci 0000:84:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 3494.627535] vfio-pci 0000:84:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 3826.294611] vfio-pci 0000:84:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 4024.540187] vfio-pci 0000:84:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 4380.460742] vfio-pci 0000:84:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
Code:
dmesg | grep -e DMAR -e IOMMU
[ 0.023851] ACPI: DMAR 0x000000007B7E7000 000300 (v01 HP ProLiant 00000001 HP 00000001)
[ 0.023941] ACPI: Reserving DMAR table memory at [mem 0x7b7e7000-0x7b7e72ff]
[ 1.277537] DMAR: IOMMU enabled
[ 2.877110] DMAR: Host address width 46
[ 2.877112] DMAR: DRHD base: 0x000000fbffc000 flags: 0x0
[ 2.877123] DMAR: dmar0: reg_base_addr fbffc000 ver 1:0 cap 8d2078c106f0466 ecap f020de
[ 2.877128] DMAR: DRHD base: 0x000000c7ffc000 flags: 0x1
[ 2.877135] DMAR: dmar1: reg_base_addr c7ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020de
[ 2.877138] DMAR: RMRR base: 0x00000079174000 end: 0x00000079176fff
[ 2.877142] DMAR: RMRR base: 0x000000791f4000 end: 0x000000791f7fff
[ 2.877148] DMAR: RMRR base: 0x000000791de000 end: 0x000000791f3fff
[ 2.877150] DMAR: RMRR base: 0x000000791cb000 end: 0x000000791dbfff
[ 2.877153] DMAR: RMRR base: 0x000000791dc000 end: 0x000000791ddfff
[ 2.877156] DMAR: ATSR flags: 0x0
[ 2.877159] DMAR: ATSR flags: 0x0
[ 2.877164] DMAR-IR: IOAPIC id 10 under DRHD base 0xfbffc000 IOMMU 0
[ 2.877168] DMAR-IR: IOAPIC id 8 under DRHD base 0xc7ffc000 IOMMU 1
[ 2.877171] DMAR-IR: IOAPIC id 9 under DRHD base 0xc7ffc000 IOMMU 1
[ 2.877174] DMAR-IR: HPET id 0 under DRHD base 0xc7ffc000
[ 2.877177] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 2.878722] DMAR-IR: Enabled IRQ remapping in x2apic mode
[ 3.444494] DMAR: No SATC found
[ 3.444497] DMAR: dmar0: Using Queued invalidation
[ 3.444508] DMAR: dmar1: Using Queued invalidation
[ 3.458158] DMAR: Intel(R) Virtualization Technology for Directed I/O
The VM is a fresh Windows 10 Pro install with RDP enabled and the NVIDIA drivers downloaded, supposedly ready for GPU passthrough and the later "switch to primary GPU" described in too many guides. I made a snapshot of this VM so I could try every possible combination of adding the GPU and restarting the process, but the best I can get is passing it through as a secondary GPU and installing the drivers. After that: reboot, and Code 43 forever.
Following the guides, I have already tried modifying every parameter concerning:
cpu (host, hidden, flags=+pcid..., with and without)
pci rom (rom-bar, no rom-bar, downloaded ROM, downloaded patched ROM, dumped ROM, dumped patched ROM)
machine type (anything from 6.0 to 7.2; the snapshot is on 6.2)
args (with and without the supposedly outdated KVM-related parameters... sorry, I can't find them again right now; see the sketch after the VM config below)
Code:
agent: 1
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 10
cpu: host
efidisk0: hwR10_SSD_400GB:vm-150-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
ide2: local:iso/virtio-win-0.1.229.iso,media=cdrom,size=522284K
machine: pc-q35-6.2
memory: 65536
meta: creation-qemu=7.2.0,ctime=1691933057
name: njafs
net0: e1000=6A:23:95:48:AD:96,bridge=vmbr0
numa: 0
ostype: win10
scsi0: R1_NVME_2TB:vm-150-disk-0,discard=on,iothread=1,size=250G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=83c27078-aae3-4291-8dc4-5a18fcb158d2
snaptime: 1692121223
sockets: 1
vmgenid: 1353b6b0-864a-40ea-a9fe-4e048ba8152b
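For completeness: when testing, I add the GPU roughly like this (give or take the pcie/x-vga flags), and the old KVM-related parameters I mentioned above are, as far as I can reconstruct them from old guides, the hidden/vendor-id tricks, so something like:
Code:
# GPU passthrough line added to /etc/pve/qemu-server/150.conf while testing
hostpci0: 0000:84:00,pcie=1,x-vga=1
# older-guide CPU variant that hides the hypervisor from the NVIDIA driver
cpu: host,hidden=1,flags=+pcid
# and/or the raw QEMU args some old guides add instead
args: -cpu 'host,+kvm_pv_unhalt,+kvm_pv_eoi,hv_vendor_id=NV43FIX,kvm=off'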
So, basically, I'm asking for hints, help, or suggestions on how to start fresh again and where to start from... anything that could bring me closer to a stable, working environment. Let me know if you need any other data; I can't wait to cooperate and to help other people with the same hardware and issues.
I'm sorry again for creating yet another thread about this well-known, often-solved issue, but I'm really starting to get mad at it.
My last resort would be the vGPU approach, but I'd like to stick to "simple" PCI passthrough, so anything would be really appreciated!
And my biggest question of all: isn't passthrough of consumer GPUs officially supported by NVIDIA since 2021 (the R465 drivers)? Why am I still seeing Code 43, then?