Hi All,
And yes, yet another GPU passthrough thread. Why? There are too many already, with too many sources of information, some of them seriously outdated, which leads to issues... ask me how I know!
So, to give a little background: I've been running Proxmox (happily) for years now. I started somewhere around late v5 / early v6 and have always used it to host all of my (mostly Linux) VMs and containers. I recently upped my game by switching to new (better) hardware, in my case an HP Z840 workstation.
So far, so good! All VMs and CTs are migrated and working, and some have already had their configs expanded to take advantage of the new real estate.
But now my issue. The point of this new Proxmox server is to start using GPU passthrough: I would like to move a couple of my VMs (currently hosted on my iMac in VMware Fusion) to Proxmox. These are mostly GPU-hungry VMs and run slightly too laggy to really be usable.
I bought this Z840 with an Nvidia NVS 310 and a GTX 285 card, both of which I wish to pass through to VMs.
I started with the usual config-file modifications:
- editing /etc/default/grub:
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
- adding the VFIO modules to /etc/modules:
Code:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
- blacklisting the host GPU drivers in /etc/modprobe.d/:
Code:
blacklist radeon
blacklist nouveau
blacklist nvidia
- running update-grub and update-initramfs -u (quick sanity checks below)
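For reference, here is how I verify those steps took effect after the reboot. This is just my own sanity check (standard lsmod/lspci usage; 04:00 is the NVS 310's PCI address on my system):
Code:
# the four VFIO modules from /etc/modules should be listed
lsmod | grep vfio

# 'Kernel driver in use' should now be vfio-pci (or absent),
# not nouveau/nvidia, thanks to the blacklist
lspci -nnk -s 04:00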
After rebooting, I ran the command:
Code:
dmesg | grep -e DMAR -e IOMMU
and the output reflects that IOMMU is working:
Code:
[ 0.407883] DMAR: IOMMU enabled
So far so good.
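For completeness, this is also how I check the IOMMU grouping on this box (a standard sysfs loop, nothing Proxmox-specific):
Code:
# print every PCI device per IOMMU group
for g in /sys/kernel/iommu_groups/*; do
    echo "IOMMU group ${g##*/}:"
    for d in "$g"/devices/*; do
        echo -e "\t$(lspci -nns "${d##*/}")"
    done
done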
Then I added the NVS 310 to a Windows 10 VM with the below config:
Code:
bios: ovmf
boot: order=sata0;ide2;net0
cores: 1
efidisk0: local-lvm:vm-105-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:04:00,pcie=1,x-vga=1
ide2: local:iso/Windows10-x64-new.iso,media=cdrom,size=4141440K
machine: pc-q35-7.0
memory: 2048
meta: creation-qemu=7.0.0,ctime=1667148331
name: Windows-10-105
net0: e1000=7A:A1:D1:C3:E0:D7,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
sata0: local-lvm:vm-105-disk-1,size=16G
scsihw: virtio-scsi-pci
smbios1: uuid=5b3df202-bb2b-4450-9834-60d297ba7ee6
sockets: 1
vga: none
vmgenid: 137a9eaa-953d-412b-99c9-6ae7aa20df3e
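One thing I'm unsure about is the hostpci0 syntax: as far as I understand it, 0000:04:00 without a function suffix passes through all functions of the card at once, while pinning only the VGA function would look like the line below (shown for comparison, not what I ran):
Code:
hostpci0: 0000:04:00.0,pcie=1,x-vga=1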
When I started the VM, all hell broke loose.
I wasn't well prepared and couldn't capture the errors, which led to attempt #2: with an open SSH session I could follow the journal, which blew up full of errors (125 MB of SSH capture) until I could hard-kill the VM.
The capture had thousands of lines with:
Code:
Nov 01 16:18:43 proxmox-z840 QEMU[4840]: kvm: vfio_region_write(0000:04:00.0:region1+0x13f8, 0x0,8) failed: Device or resource busy
Nov 01 16:18:43 proxmox-z840 kernel: vfio-pci 0000:04:00.0: BAR 1: can't reserve [mem 0xd0000000-0xd7ffffff 64bit pref]
Proxmox itself needed to be hard-rebooted with a power cycle.
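My working assumption for those two messages is that something on the host (e.g. a boot framebuffer like BOOTFB/efifb) still owns that memory window, which would explain the "can't reserve" on BAR 1. If that's right, the claim should show up in /proc/iomem:
Code:
# look for the BAR 1 range from the error and any framebuffer claims
grep -i -e 'd0000000' -e 'bootfb' -e 'efifb' /proc/iomem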
From this point on, my quest to get this working resulted in several hours of Googling, YouTube videos on the topic (where it always works), and plenty of threads on this forum. But no cigar!
Just to show some of my work, here are all the GRUB lines I tested:
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
# Disable the 'efifb' graphics driver using efifb:off. This will prevent the driver from stealing the GPU. An unfortunate side-effect of this is that you will not be able to see what your computer is doing while it is booting up.
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction nofb nomodeset video=vesafb:off video=efifb:off"
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt vfio_iommu_type1.allow_unsafe_interrupts=1 pcie_acs_override=downstream video=efifb:eek:ff video=vesafb:eek:ff"
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt vfio_iommu_type1.allow_unsafe_interrupts=1 video=efifb:eek:ff video=vesafb:eek:ff"
Here I discovered that many of these options aren't well documented: some people say certain options aren't needed, while others swear by them. At the moment it leaves me very confused. If I run the command below, for instance, it shows that interrupt remapping is already working on my system, which suggests some of the GRUB_CMDLINE_LINUX_DEFAULT options above (such as allow_unsafe_interrupts) are overkill/useless.
Code:
dmesg | grep 'remapping'
Code:
[ 0.912231] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 0.913102] DMAR-IR: Enabled IRQ remapping in x2apic mode
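And to be sure which of the tested GRUB lines was actually live during a given boot, I compare against the running kernel's command line:
Code:
cat /proc/cmdline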
I tried a lot of things, not all of them shown above; I can add more detail if anything is unclear. But for now: can anyone tell me why GPU passthrough is so troublesome for me?
Best regards,
LVX