Hi everyone,
I am trying to pass through a Zotac GTX 970 to a Linux VM on Proxmox, but I keep hitting a wall: either the host crashes and reboots as soon as the VM starts, or the VM fails to initialize the card.
Hardware:
- CPU: Intel Xeon E3-1200 v3 (Haswell)
- Mobo: Z97 Chipset (ASUS)
- GPU: Zotac GTX 970 (Group 0002) - Verified working on Host (output to monitor works if drivers are loaded).
- Kernel: Linux 6.17.4-2-pve (Proxmox VE 8)
What I Have Tried (and failed):
- IOMMU Groups: Checked and confirmed isolated. GPU and Audio are in Group 2. Network is in Group 6.
- BIOS/UEFI: Host BIOS has VT-d enabled. Tried VM in both SeaBIOS (Legacy) and OVMF (UEFI) modes.
- Vendor-Reset: Installed vendor-reset module, but it taints the kernel and doesn't seem to solve the reset bug for this specific card.
- VBIOS Patching: Dumped own ROM, patched headers, downloaded reference ROM, stripped headers manually. Confirmed 55 aa magic bytes.
- Config Flags: Tried combinations of x-vga=1, pcie_acs_override=downstream, disable_vga=1 as a vfio-pci module option, and pcie_aspm=off.
- Drivers: Blacklisted nouveau and nvidia on host; vfio-pci is binding correctly before the crash.
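For anyone wanting to reproduce the group isolation and ROM signature checks from the list above, here is a minimal sketch. The function names `list_iommu_groups` and `rom_has_signature` are made up for illustration; `/sys/kernel/iommu_groups` is the standard sysfs location, and the optional argument only exists so the function can be pointed at a different directory.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any tool.

# Print "<group> <pci-address>" for every device in every IOMMU group.
# $1 (optional): directory to scan instead of /sys/kernel/iommu_groups.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue          # glob matched nothing
        group="${dev%/devices/*}"          # strip "/devices/<addr>"
        printf '%s %s\n' "${group##*/}" "${dev##*/}"
    done
}

# Succeed if the file starts with the PCI option ROM signature 55 aa.
# This checks the first two bytes only; it does not validate the rest
# of the image.
rom_has_signature() {
    sig="$(od -An -tx1 -N2 "$1" | tr -d ' \n')"
    [ "$sig" = "55aa" ]
}
```

On the host, `list_iommu_groups | grep ' 0000:01:00'` should print only lines for the GPU and its audio function if the group is isolated, and `rom_has_signature gtx970.rom && echo ok` checks a dumped ROM (the file name here is a placeholder).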
My Configuration:
```
~# cat /etc/default/grub | grep GRUB_CMDLINE_LINUX_DEFAULT
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_aspm=off"
~# lspci -nnk -s 01:00
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204 [GeForce GTX 970] [10de:13c2] (rev a1)
Subsystem: ZOTAC International (MCO) Ltd. Device [19da:2370]
Kernel driver in use: vfio-pci
Kernel modules: nvidiafb, nouveau
01:00.1 Audio device [0403]: NVIDIA Corporation GM204 High Definition Audio Controller [10de:0fbb] (rev a1)
Subsystem: ZOTAC International (MCO) Ltd. Device [19da:2370]
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
```
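The `lspci -nnk` output is a snapshot from one moment. The same binding information can be read from sysfs, for example right before starting the VM. A minimal sketch, assuming the usual `/sys/bus/pci/devices/<address>/driver` symlink layout; `bound_driver` is a hypothetical helper name and the second argument is only there to allow a different root directory.

```shell
#!/bin/sh
# Print the name of the driver bound to a PCI device, or "none".
# $1: full PCI address, e.g. 0000:01:00.0
# $2 (optional): device root to use instead of /sys/bus/pci/devices
bound_driver() {
    link="${2:-/sys/bus/pci/devices}/$1/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/vfio-pci
        basename "$(readlink "$link")"
    else
        echo none
    fi
}
```

Usage would be along the lines of `bound_driver 0000:01:00.0` and `bound_driver 0000:01:00.1`, one call per function of the card.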
`journalctl -b -1 -e`:
```
Jan 20 01:20:28 paris kernel: vfio-pci 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none
Jan 20 01:20:28 paris kernel: vfio-pci 0000:01:00.0: probe with driver vfio-pci failed with error -22
Jan 20 01:20:28 paris kernel: vfio_pci: add [10de:13c2[ffffffff:ffffffff]] class 0x000000/00000000
```
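Error -22 in the probe line is -EINVAL ("invalid argument"). To pull only the passthrough-related kernel messages out of a log, a small filter like the following can help. This is a sketch: the function name `vfio_lines` and the keyword list are arbitrary choices, not anything standard.

```shell
#!/bin/sh
# Keep only log lines that mention VFIO, the VGA arbiter, or the IOMMU.
# Reads log text on stdin, so it works on saved logs as well as live output.
vfio_lines() {
    grep -iE 'vfio|vgaarb|dmar|iommu'
}

# Usage on the host (kernel messages from the previous boot):
#   journalctl -k -b -1 --no-pager | vfio_lines
```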
Can anyone tell me whether this specific Haswell/GTX 970 combo needs a particular kernel patch, or whether I'm missing an interrupt setting?
Thanks!