Dear all,
First of all, since this is my first post on this awesome forum, let me thank the whole community for pushing so hard on a product like Proxmox: in my opinion the real state of the art among hypervisors, and the only solution that stays true to Linux while providing plenty of options and customizability.
I am writing to ask for your help in setting up my system.
I am currently having some problems setting up GPU passthrough.
The GPUs I am trying to pass are two AMD Vega Frontier Edition cards, which I have successfully passed through under other hypervisors (despite them suffering from the reset bug).
Ideally I'd like to pass both GPUs to either a Windows 10 VM or a CentOS 7 one.
I can accept passing each of them to a different machine, though.
I had this setup lying around for quite some time, but I am Italian and I am working from home right now due to the Covid-19 emergency.
For this reason, I have some spare time (my usual 3-hour commute each day...) to try to finalize this setup, and I'd like to get operational ASAP to take part in the Folding@Home Covid-19 research.
At the moment I have (I think) successfully isolated the GPUs' IOMMU groups and managed to get them bound to the vfio-pci driver.
My settings follow.
lspci output for the GPUs and their integrated audio controllers:
Bash:
root@pve:~# lspci -nnk -d 1002:6863
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XTX [Radeon Vega Frontier Edition] [1002:6863]
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XTX [Radeon Vega Frontier Edition] [1002:6b76]
        Kernel driver in use: vfio-pci
        Kernel modules: amdgpu
06:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XTX [Radeon Vega Frontier Edition] [1002:6863]
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XTX [Radeon Vega Frontier Edition] [1002:6b76]
        Kernel driver in use: vfio-pci
        Kernel modules: amdgpu
root@pve:~# lspci -nnk -d 1002:aaf8
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 HDMI Audio [Radeon Vega 56/64] [1002:aaf8]
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 HDMI Audio [Radeon Vega 56/64] [1002:aaf8]
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
06:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 HDMI Audio [Radeon Vega 56/64] [1002:aaf8]
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 HDMI Audio [Radeon Vega 56/64] [1002:aaf8]
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
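For reference, a quick way to double-check that each GPU and its audio function really sit in their own IOMMU groups is to walk /sys/kernel/iommu_groups; this is a generic sketch, not anything Proxmox-specific:
Bash:
# Generic sketch: print every PCI device together with its IOMMU group,
# so the two Vegas and their HDMI audio functions can be checked for isolation.
for d in /sys/kernel/iommu_groups/*/devices/*; do
    g=$(basename "$(dirname "$(dirname "$d")")")
    echo "IOMMU group $g: $(lspci -nns "$(basename "$d")")"
done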
vfio.conf:
Bash:
root@pve:~# cat /etc/modprobe.d/vfio.conf
softdep radeon pre: vfio-pci
softdep amdgpu pre: vfio-pci
softdep nouveau pre: vfio-pci
softdep drm pre: vfio-pci
options vfio-pci ids=1002:6863,1002:aaf8 disable_vga=1
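As a sanity check (a sketch assuming initramfs-tools, which Proxmox uses by default on GRUB installs), one can verify that the vfio modules and this config actually end up inside the initramfs once it has been regenerated:
Bash:
# Should list the vfio kernel modules and etc/modprobe.d/vfio.conf;
# if it prints nothing, the initramfs still needs regenerating (see below).
lsinitramfs /boot/initrd.img-$(uname -r) | grep -i vfio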
grub config:
Bash:
root@pve:~# cat /etc/default/grub
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
# info -f grub -n 'Simple configuration'
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="Proxmox Virtual Environment"
GRUB_CMDLINE_LINUX=""
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
# Disable os-prober, it might add menu entries for each guest
GRUB_DISABLE_OS_PROBER=true
# Disable generation of recovery mode menu entries
GRUB_DISABLE_RECOVERY="true"
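For anyone following along: after rebooting, the standard check that intel_iommu=on actually took effect is to grep the kernel log:
Bash:
# DMAR / IOMMU lines such as "DMAR: IOMMU enabled" confirm it is active.
dmesg | grep -e DMAR -e IOMMU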
blacklist.conf (same for pve-blacklist.conf):
Bash:
root@pve:~# cat /etc/modprobe.d/blacklist.conf
# This file contains a list of modules which are not supported by Proxmox VE
# nidiafb see bugreport https://bugzilla.proxmox.com/show_bug.cgi?id=701
blacklist nvidiafb
blacklist amdgpu
blacklist radeon
blacklist nouveau
After applying these settings I both regenerated the initramfs and updated the GRUB configuration.
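For reference, the standard Debian/Proxmox commands for that step are:
Bash:
# Regenerate the initramfs for all installed kernels so the vfio-pci
# options and the blacklists take effect at early boot.
update-initramfs -u -k all
# Rebuild /boot/grub/grub.cfg with the new kernel command line.
update-grub
After a reboot, `lsmod | grep -E 'amdgpu|radeon|nouveau'` should print nothing if the blacklist took effect.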
I managed to get the Windows machine to boot with both a SPICE VGA adapter and the Vega FE attached, install the AMD drivers, and boot back using only the Vega GPU, but it hangs while loading the Windows UI. The QEMU settings I use are:
Code:
bios: ovmf
bootdisk: sata0
cores: 16
efidisk0: local-lvm:vm-101-disk-0,size=128K
hostpci0: 03:00.0,pcie=1,x-vga=1
hostpci1: 03:00.1,pcie=1
ide2: freenas-isos:iso/Win-10.iso,media=cdrom
machine: q35
memory: 27000
name: WinFolding
net0: e1000=5A:D0:1A:31:6B:8A,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
sata0: freenas-isos:101/vm-101-disk-0.qcow2,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=8d50dab8-7f06-40aa-8d80-3f94791bd495
sockets: 1
vga: none
vmgenid: e7b11fea-e225-467a-97c3-55d20651c843
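In case someone wants to reproduce the hang, a minimal sketch (assuming VMID 101 as in the config above) is to start the VM from the CLI and watch the kernel log for vfio messages while the guest boots:
Bash:
# Start the VM from the shell so start-up errors surface immediately.
qm start 101
# Follow kernel messages, filtering for vfio-related lines
# (e.g. BAR access issues or failed resets) while Windows boots.
dmesg -w | grep -i vfio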
The system is a dual Xeon E5 machine based on an Asus Z10PE-D16 WS. Each GPU sits in a PCIe slot wired directly to one of the two CPUs, but the same issue happens with both GPUs, which makes me doubt that this is related to PCIe lane splitting.
Is there anybody capable of helping me troubleshoot this weird behavior?
I am a quite skilled sysadmin, so have no fear of being "too technical" in your answers.
Thank you in advance,
Slid