[TUTORIAL] GPU Passthrough with NVIDIA in Linux VM - Improve stability

tikilou

Apr 1, 2025
[GUIDE] Dual-GPU Passthrough, Multi-VM Linux Gaming on Proxmox with NFS Shared Storage (Steam, Proton, Heroic, Saves, etc.)


This is a full, real-world working configuration for running two Linux gaming VMs simultaneously on Proxmox, each with its own passthrough NVIDIA GPU, sharing games, saves, and Wine/Proton prefixes via high-performance NFS mounts.
My setup is designed for maximum flexibility, stability, and performance for gaming, streaming, video editing, and large library management.
I'm using Sunshine as the game-streaming server (on any VM) and Moonlight as the client on PCs, laptops, a Steam Deck, tablets, and phones, over the local network or my own VPN.

This also fixes the instability of the NVIDIA 570 drivers on Linux (which support Wayland). Without hiding the VM from the NVIDIA proprietary driver, games crash constantly and the whole setup is very unstable. You will also need a ROM dump of your GPU to improve stability (see the sketch in section 2).




1. Hardware & Architecture Overview


  • Motherboard: Asrock B650I
  • CPU: Ryzen 9 7900 (with iGPU used only for containers, encoding/decoding)
  • GPU 1: NVIDIA GTX 1660 Ti (direct PCIe passthrough to first VM)
  • GPU 2: NVIDIA GTX 1650, connected via an OCuLink-to-NVMe adapter (passed through to a second VM)
  • Both VMs can run at the same time, each with its own exclusively passed-through NVIDIA GPU.
  • Storage:
    • Main games & data: NVMe/SSD (shared via NFS for speed)
    • Saves & prefixes: All Wine/Proton prefixes, Steam/Heroic games, and savegames are hosted on a two-disk RAID of 4 TB HDDs (already configured as mp0 under Proxmox – setup not detailed here).
  • All VMs and containers access the same shared folders via NFS, including:
    • Steam/Proton/Wine prefixes
    • Downloaded/installed games (Steam, Heroic, etc.)
    • Game save files (works for cross-VM and backup)
  • Networking: All on vmbr0 bridge for maximum bandwidth



2. Proxmox VM Config – Options Explained


Below is an actual config excerpt for one gaming VM. The second VM is configured similarly (replace PCI address and ROM as needed).


Code:
agent: 1
args: -cpu host,kvm=off,+kvm_pv_unhalt,+kvm_pv_eoi,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff
balloon: 0
bios: ovmf
boot: order=scsi0
cores: 8
cpu: host
efidisk0: hdd-4to:501/vm-501-disk-0.qcow2,efitype=4m,pre-enrolled-keys=1,size=528K
hookscript: local:snippets/vm-governor-hook.sh
hostpci1: 0000:01:00,pcie=1,x-vga=1,romfile=evga-gtx1660ti.rom
hotplug: disk,network,usb,cpu
machine: pc-q35-9.2+pve1
memory: 20000
meta: creation-qemu=9.2.0,ctime=1743019237
name: CachyOS-2
net0: virtio=BC:24:11:1C:2F:52,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: local-lvm:vm-501-disk-0,iothread=1,size=64G
scsi1: hdd-4to:501/vm-501-disk-2.qcow2,iothread=1,size=160G
scsihw: virtio-scsi-single
smbios1: uuid=youridsmbios
sockets: 1
vga: none
vmgenid: yourid


Key Options:


  • hostpci1: Passes through your NVIDIA GPU (change the PCI address and ROM for the second VM/GPU; see the ROM dump sketch after this list)
  • args: See below for a detailed breakdown
  • hookscript: Handles the CPU governor and PCI rescan/removal (see the script below)
  • scsi0, scsi1: Storage devices – SSD/NVMe recommended for game performance
  • net0: VirtIO for fast networking (NFS performance)
  • All VMs use the same approach; only the PCI address and ROM differ for each GPU
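The ROM file referenced by hostpci1 must live in /usr/share/kvm/. A minimal sketch for dumping it from the host via sysfs, assuming the card sits at 0000:01:00.0 and no driver is bound to it yet (the filename is just this guide's example):

Code:
# On the Proxmox host, before any VM has touched the card:
cd /sys/bus/pci/devices/0000:01:00.0
echo 1 > rom                                  # unlock the ROM for reading
cat rom > /usr/share/kvm/evga-gtx1660ti.rom   # save the vBIOS dump
echo 0 > rom                                  # lock it again

If the read fails, dump the vBIOS from a bare-metal boot instead (e.g. with GPU-Z on Windows).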



Breakdown: args Option


Code:
args: -cpu host,kvm=off,+kvm_pv_unhalt,+kvm_pv_eoi,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff


  • -cpu host: Presents all host CPU features to the VM
  • kvm=off: Hides the KVM hypervisor signature so the NVIDIA driver doesn't detect the VM (avoids Code 43)
  • +kvm_pv_unhalt, +kvm_pv_eoi: Enables KVM paravirt features (safe, improves performance)
  • hv_time, hv_relaxed, hv_vapic, hv_spinlocks=0x1fff: Hyper-V enlightenments for accurate timing and performance
    Together, these flags maximize compatibility and stability for GPU passthrough and gaming; you can verify the result as shown below.
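To confirm Proxmox merges these args into the generated QEMU command line, you can inspect it without starting the VM (501 is this VM's ID in my config):

Code:
qm showcmd 501 --pretty | grep -- '-cpu'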



3. CPU Governor and PCI Management Hook Script


Automatically switches the CPU governor to "performance" when any VM is running and back to "powersave" when none is, and safely rescans/removes the GPU's PCIe functions on the motherboard slot (install commands follow the script).


Code:
#!/bin/bash
# Proxmox hookscript – called as: <script> <vmid> <phase>

vmid="$1"
operation="$2"

# All PCI functions of the passthrough GPU (VGA, audio, USB-C, UCSI)
GPU_PCI_IDS=("01:00.0" "01:00.1" "01:00.2" "01:00.3")

log() {
    logger -t vm-governor "$1"
    echo "$(date +'%Y-%m-%d %H:%M:%S') $1" >> /var/log/vm-governor.log
}

case "$operation" in
    pre-start)
        # Re-discover the GPU functions in case they were removed after the last stop
        echo 1 > /sys/bus/pci/rescan
        log "Rescan PCI triggered before starting VM $vmid"
        ;;
    post-start)
        mode="performance"
        ;;
    post-stop)
        # STATUS is the 3rd column of `qm list`; awk exits 1 as soon as a running VM is found
        if qm list | awk '$3 == "running" {exit 1}'; then
            # No VM is running any more: drop to powersave and detach the GPU functions
            mode="powersave"
            for id in "${GPU_PCI_IDS[@]}"; do
                if [ -e "/sys/bus/pci/devices/0000:$id" ]; then
                    echo 1 > "/sys/bus/pci/devices/0000:$id/remove"
                    log "GPU function $id removed (no VM active)"
                fi
            done
        else
            # Another VM is still running: keep the CPU in performance mode
            mode="performance"
        fi
        ;;
    *)
        exit 0
        ;;
esac

if [ -n "$mode" ]; then
    for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
        echo "$mode" > "$i"
    done
    log "VM $vmid: CPU mode -> $mode (operation: $operation)"
fi
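To install it, drop the script into a snippets-enabled storage and attach it to the VM (paths assume the default local storage with the snippets content type enabled):

Code:
mkdir -p /var/lib/vz/snippets
cp vm-governor-hook.sh /var/lib/vz/snippets/
chmod +x /var/lib/vz/snippets/vm-governor-hook.sh
qm set 501 --hookscript local:snippets/vm-governor-hook.sh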




4. LXC Container for NFS Shared Storage


All games, prefixes, and saves are shared between both VMs via NFS, exported by a dedicated container.
Saves/prefixes are physically stored on a two-disk RAID of 4 TB HDDs (mounted as mp0 in the container; RAID setup not detailed here).


Code:
arch: amd64
cores: 3
hostname: ubuntu-smb
memory: 512
mp0: /media/hdd-4To,mp=/media/hdd-4To  # This is your RAID drive, contains all saves/prefixes
mp1: /media/hdd-3To,mp=/media/hdd-3To
mp2: /media/hdd-18To,mp=/media/hdd-18To
mp3: /media/ssd-2To/Jeux,mp=/media/ssd-2To/Jeux
net0: name=eth0,bridge=vmbr0,firewall=1,hwaddr=BB:23:10:25:42:1F,ip=dhcp,ip6=dhcp,type=veth
onboot: 1
ostype: ubuntu
protection: 0
rootfs: local-lvm:vm-100-disk-0,size=8G
swap: 0
lxc.apparmor.profile: unconfined
lxc.cgroup.devices.allow: a
lxc.cgroup2.devices.allow: c 10:200 rwm
lxc.mount.entry: /dev/net/tun dev/net/tun none bind,create=file
lxc.mount.entry: /etc/lxc/resolv.conf etc/resolv.conf none bind,ro,create=file


Example /etc/exports with RW access for these VMs only (with fixed IPs!) and RO access for other machines on the local network:


Code:
/media/hdd-4To/.sessions-vm-jeux/Jeux 192.168.0.0/24(rw,async,no_subtree_check,insecure,all_squash,anonuid=1000,anongid=1000) 192.168.1.126(rw,async,no_subtree_check,insecure,all_squash,anonuid=1000,anongid=1000) 192.168.1.26(rw,async,no_subtree_check,insecure,all_squash,anonuid=1000,anongid=1000)
/media/hdd-18To/Jeux 192.168.0.0/24(ro,sync,no_subtree_check,insecure,all_squash,anonuid=1000,anongid=1000) 192.168.1.126(rw,async,no_subtree_check,insecure,all_squash,anonuid=1000,anongid=1000) 192.168.1.26(rw,async,no_subtree_check,insecure,all_squash,anonuid=1000,anongid=1000)
/media/hdd-3To/Jeux-PC 192.168.0.0/24(ro,sync,no_subtree_check,insecure,all_squash,anonuid=1000,anongid=1000) 192.168.1.126(rw,async,no_subtree_check,insecure,all_squash,anonuid=1000,anongid=1000) 192.168.1.26(rw,async,no_subtree_check,insecure,all_squash,anonuid=1000,anongid=1000)
/media/ssd-2To/ssd 192.168.0.0/24(ro,sync,no_subtree_check,insecure,all_squash,anonuid=1000,anongid=1000) 192.168.1.126(rw,async,no_subtree_check,insecure,all_squash,anonuid=1000,anongid=1000) 192.168.1.26(rw,async,no_subtree_check,insecure,all_squash,anonuid=1000,anongid=1000)
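Inside the container (assuming Ubuntu/Debian), install the NFS server once and re-export after every change to /etc/exports:

Code:
apt update && apt install -y nfs-kernel-server
exportfs -ra    # apply /etc/exports
exportfs -v     # verify what is exported, and with which options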


  • Both VMs mount the same NFS shares for prefixes/games/saves (perfect for Steam, Heroic, Wine/Proton, and cross-VM save games).
  • Saves/prefixes are always on RAID for safety and performance.

To automount the shared NFS folders, in each VM's /etc/fstab:

Code:
192.168.1.35:/media/hdd-4To/.sessions-vm-jeux/Jeux    /media/hdd-4To/.sessions-vm-jeux/Jeux    nfs4    rw,nofail,x-systemd.automount,_netdev  0  0
192.168.1.35:/media/ssd-2To/Jeux    /media/ssd-2To/Jeux   nfs4 rw,nofail,x-systemd.automount,_netdev,actimeo=1,noatime,async,hard,intr 0 0
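After editing fstab, reload systemd so it generates the automount units, then mount everything (a sketch; the shares also mount automatically on first access thanks to x-systemd.automount):

Code:
sudo systemctl daemon-reload
sudo mount -a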



5. Kernel cmdline Example (for passthrough optimization)

My actual kernel cmdline (the GRUB_CMDLINE_LINUX_DEFAULT value in /etc/default/grub on the Proxmox host):

Code:
amdgpu.dpm=1 amdgpu.dpm_forced=1 amdgpu.dc=1 amdgpu.runpm=1 amd_iommu=on iommu=pt amd_pstate=passive pcie_aspm=powersave pcie_acs_override=downstream,multifunction initcall_blacklist=sysfb_init kvm.ignore_msrs=1 video=efifb:off loglevel=3 mce=off amd_idle.max_cstate=6 pci=realloc

Parameter-by-parameter explanation:

amdgpu.dpm=1
Enables Dynamic Power Management (DPM) for AMD GPUs.
→ Lets the driver adjust GPU clock/power states dynamically.

amdgpu.dpm_forced=1
Forces DPM to be enabled, even if not detected as supported by the hardware/BIOS.

amdgpu.dc=1
Enables Display Core, the modern display engine for AMD GPUs (better support for multi-head, DP, HDMI 2.0+).

amdgpu.runpm=1
Enables runtime power management for AMD GPUs.
→ Allows the kernel to power down the GPU when not used (useful in passthrough/multi-GPU).

amd_iommu=on
Activates the AMD IOMMU (input-output memory management unit) on the host.
→ Required for PCI passthrough/isolation.

iommu=pt
Sets the IOMMU to “pass-through” mode by default, reducing overhead for devices not explicitly passed through to VMs.

amd_pstate=passive
Uses the new AMD CPU frequency scaling driver in “passive” mode.
→ The kernel suggests frequencies, firmware decides (usually safer and more stable than “active” on some platforms).

pcie_aspm=powersave
Enables PCIe Active State Power Management in powersave mode.
→ Reduces idle power consumption of PCIe devices (can be disabled if causing instability, but works well with modern hardware).

pcie_acs_override=downstream,multifunction
Overrides PCIe Access Control Services for "downstream" and "multifunction" devices.
→ Forces IOMMU group separation on consumer platforms, allowing independent passthrough of GPUs and NVMe devices (without it, groups are often merged and passthrough doesn't work). Be aware that it overrides isolation the hardware doesn't actually guarantee, so only devices you trust should share a slot.

initcall_blacklist=sysfb_init
Prevents the kernel from initializing the simple framebuffer driver.
→ Avoids conflicts with OVMF/UEFI GOP and with GPU passthrough where guest needs full control of the GPU.

kvm.ignore_msrs=1
Instructs KVM to ignore “unknown model specific registers” errors from the guest.
→ Prevents fatal VM errors with some Windows/Linux guests and modern CPUs/GPUs.

video=efifb:off
Disables the EFI framebuffer after boot.
→ Prevents host from holding the framebuffer on GPUs passed through to guests (avoids blank screen or driver hang).

loglevel=3
Sets default kernel log verbosity to “errors only.”
→ Reduces spammy kernel messages in dmesg/journalctl.

mce=off
Disables Machine Check Exception handling.
→ Used for debugging, or to avoid random host crashes on some buggy platforms; can be omitted for most users unless you have unexplained hardware faults.

amd_idle.max_cstate=6
Sets the maximum allowed CPU C-state for AMD CPUs.
→ Limits how deep the CPU can go in sleep states (may improve stability and avoid hangs in heavy virtualization/passthrough scenarios).

pci=realloc
Tells the kernel to re-allocate PCI resources at boot.
→ Useful for large numbers of PCI devices, or non-standard slot configs, avoids "not enough space for BAR" errors.
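After editing GRUB_CMDLINE_LINUX_DEFAULT, apply the change and verify that each GPU ends up isolated in its own IOMMU group (a quick sketch):

Code:
update-grub && reboot
# After reboot, list every device with its IOMMU group:
for d in /sys/kernel/iommu_groups/*/devices/*; do
    n=${d#*/iommu_groups/}; n=${n%%/*}
    printf 'IOMMU group %s: ' "$n"
    lspci -nns "${d##*/}"
done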


DON'T FORGET TO BLACKLIST NOUVEAU AND NVIDIA ON THE PROXMOX HOST!

Code:
/etc/modprobe.d/blacklist.conf
blacklist nouveau
blacklist nvidia
blacklist nvidia_drm
blacklist nvidia_modeset
blacklist nvidia_uvm
blacklist nvidiafb
blacklist i2c_nvidia_gpu

And allow unsafe interrupts for the IOMMU in /etc/modprobe.d/iommu_unsafe_interrupts.conf:
Code:
options vfio_iommu_type1 allow_unsafe_interrupts=1

You can also specify which PCI IDs are bound to VFIO in /etc/modprobe.d/vfio.conf, if you're using vGPU or want to keep a GPU for the host too. An example:
Code:
options vfio-pci ids=10de:2182,10de:1aeb,10de:1aec,10de:1aed,10de:1f82,10de:10fa
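Changes under /etc/modprobe.d only take effect after the initramfs is rebuilt. Afterwards, each passthrough GPU function should be claimed by vfio-pci:

Code:
update-initramfs -u -k all && reboot
# After reboot (01:00 is this guide's GPU address):
lspci -nnk -s 01:00
# every function should report: Kernel driver in use: vfio-pci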


For the VMs:

Add "clearcpuid=514" to the guest kernel command line (it's necessary for running some AAA games).
The clearcpuid=514 parameter tells the Linux kernel to mask (hide) a specific CPU feature (here, kernel feature bit 514) from the system and applications, even if the hardware supports it.


  • Bit 514: In the kernel's feature table this is UMIP (User-Mode Instruction Prevention), word 16, bit 2 (16×32+2 = 514).
  • With UMIP active, user-space code that executes instructions such as SGDT, SIDT, or SMSW faults; some AAA games, DRM, and anti-cheat layers under Wine/Proton do exactly that and crash.
  • Masking the feature lets those titles run; hiding the hypervisor from the NVIDIA driver (avoiding "Code 43") is already handled by kvm=off in the VM args above. You can verify the flag is gone as shown below.
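A quick check from inside the guest, assuming clearcpuid=514 was added to its kernel command line:

Code:
# "umip" must be gone from the CPU flags; this prints 0 when it is masked:
grep -c umip /proc/cpuinfo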


Allow more memory mappings for games:
Set vm.max_map_count=1048576 in /etc/sysctl.d/99-custom.conf (run "sysctl --system" to load the value, or reboot; a sketch follows the list below).
The vm.max_map_count kernel parameter sets the maximum number of memory map areas (virtual memory regions) a single process can have.


  • Default value: On most Linux systems, it is set to 65530.
  • Value 1048576: By raising it to 1048576 (one million), you allow processes (especially games, databases, or applications like Elasticsearch, Star Citizen, Proton, Wine, etc.) to allocate far more memory regions.
  • Some modern games, game engines, or emulators require a huge number of memory mappings. If the limit is too low, you might encounter errors such as:
    • “mmap failed: [ENOMEM]”
    • “Could not allocate virtual memory”
  • Increasing this value is often required for demanding games under Wine/Proton and for some large applications (search engines, servers, etc.) to run reliably.
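A minimal sketch of the whole change on a VM:

Code:
echo 'vm.max_map_count=1048576' | sudo tee /etc/sysctl.d/99-custom.conf
sudo sysctl --system          # reload all sysctl.d drop-ins
sysctl vm.max_map_count       # verify the new value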




6. Wake-on-LAN for VMs – Service, Script & Config

This system lets you start your VM(s) by sending a magic packet (e.g. Moonlight, Sunshine, or any WOL tool). The script listens for WOL packets, checks MAC addresses, and starts the corresponding VM if it’s stopped.


Systemd service (/etc/systemd/system/proxmox-wol-listener.service):



Code:
[Unit]
Description=Proxmox VM Wake-on-LAN Listener
After=network-online.target
Wants=network-online.target


[Service]
ExecStart=/usr/local/bin/proxmox-wol-listener.sh
User=root
Restart=always
RestartSec=5
SyslogIdentifier=proxmox-wol


[Install]
WantedBy=multi-user.target
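Enable and start the listener, then follow its logs (journalctl matches the SyslogIdentifier above):

Code:
chmod +x /usr/local/bin/proxmox-wol-listener.sh
systemctl daemon-reload
systemctl enable --now proxmox-wol-listener.service
journalctl -t proxmox-wol -f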


Listener Script (/usr/local/bin/proxmox-wol-listener.sh):


Code:
#!/bin/bash
# Listens for Wake-on-LAN magic packets and starts the matching Proxmox VM.

CONFIG_FILE="/etc/proxmox-wol-vm.conf"
LOG_TAG="proxmox-wol"

[[ ! -f "$CONFIG_FILE" ]] && echo "[$(date)] [!] Config file not found." && exit 1

# Dump WOL traffic (UDP port 9) as a hex dump and scan each packet for the
# magic pattern: ff repeated 6 times, then the target MAC repeated 16 times.
exec tcpdump -lnXX -i any 'udp port 9' 2>/dev/null | awk -v config_file="$CONFIG_FILE" -v log_tag="$LOG_TAG" '
BEGIN {
    # Build one search pattern per "MAC VMID" line of the config file
    rule_count = 0
    while ((getline < config_file) > 0) {
        if ($0 ~ /^[[:space:]]*#/ || $0 ~ /^[[:space:]]*$/) continue
        if (split($0, parts, /[[:space:]]+/) == 2) {
            mac_orig = parts[1]
            mac_clean = tolower(mac_orig)
            gsub(/:/, "", mac_clean)        # plain gsub: portable to mawk (no gensub)
            vmid = parts[2]
            if (mac_clean ~ /^[0-9a-f]{12}$/ && vmid ~ /^[0-9]+$/) {
                patterns[mac_clean] = "ffffffffffff"
                for (i = 1; i <= 16; i++) {
                    patterns[mac_clean] = patterns[mac_clean] mac_clean
                }
                vmids[mac_clean] = vmid
                rule_count++
            }
        }
    }
    close(config_file)
    buffer = ""
}
# Hex-dump lines: strip the "0x....:" offset and the trailing ASCII column,
# then append the raw hex digits to the current packet buffer
/^[[:space:]]0x[0-9a-f]+:/ {
    line = $0
    sub(/^[^:]*:[[:space:]]*/, "", line)
    sub(/[[:space:]]+[^ ]*$/, "", line)
    gsub(/[[:space:]]/, "", line)
    buffer = buffer line
}
# A non-indented line starts a new packet: check the completed buffer first
/^[^[:space:]]/ {
    if (buffer != "") {
        buffer_lower = tolower(buffer)
        for (mac_key in patterns) {
            if (buffer_lower ~ patterns[mac_key]) {
                target_vmid = vmids[mac_key]
                # Only start the VM if it is currently stopped
                if (system("qm status " target_vmid " | grep -q \"status:.*stopped\"") == 0) {
                    system("qm start " target_vmid)
                }
            }
        }
    }
    buffer = ""
}
'


Config file (/etc/proxmox-wol-vm.conf) example:


Code:
#MAC address of VM network interface and corresponding VMID
bc:24:11:1c:2f:52 501
02:de:ad:be:ef:02 502


  • Each line: the MAC address of the VM's network interface (lower-case, colon-separated, as set in the VM config and shown by ip link or ifconfig inside the VM) and the corresponding Proxmox VM ID.
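To wake a VM from any machine on the LAN, send a magic packet to UDP port 9 (the port tcpdump watches), e.g. with the wakeonlan package and assuming a 192.168.1.0/24 network:

Code:
wakeonlan -i 192.168.1.255 bc:24:11:1c:2f:52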


PS: An LLM helped me format this guide.
 
Note: I don't know how to enable the ReBAR hack for GTX 16xx cards in a VM under Proxmox.
Enabling it should be possible since these are Turing cards and it works on bare-metal Windows and Linux, and it can improve in-game performance, but I don't know how to do it here. Does someone know how?

Here is the hack that bypasses NVIDIA's software limitation for GTX 16xx and RTX 2xxx GPUs:
https://github.com/xCuri0/ReBarUEFI/