Hello,
I have been struggling for a couple of days to get a new 7900 XTX to work in an EndeavourOS VM. I've read through many posts on this forum as well as on Reddit.
I also have a 4070 Ti Super in my system that has been working for months. I got tired of the Nvidia issues in EndeavourOS, so I added the 7900 XTX for that VM.
At first I was getting tons of correctable PCI errors in the Proxmox syslog and my IPMI logs. The errors only occurred when booting the VM. When EndeavourOS did boot, it would not load amdgpu at all and was only accessible over SSH. Other times the VM would not boot and would eventually time out with exit code 1. After adding the romfile to my VM config and rebooting, the PCI errors seem to have gone away, but now I have two other errors:
pve kernel: vfio-pci 0000:84:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x006a address=0xf7900747000 flags=0x0020]
pve kernel: vfio-pci 0000:84:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x006a address=0xf7900767000 flags=0x0020]
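In case it helps anyone compare notes, here is a minimal Python sketch for pulling the device, domain, and address out of these fault lines and counting them per device. It assumes the exact log format shown in the two lines above; `parse_faults` and `faults_per_device` are hypothetical helper names, not part of any Proxmox tooling.

```python
import re
from collections import Counter

# Matches AMD-Vi IO_PAGE_FAULT lines in the format shown above.
# Assumption: other kernel versions may format these lines differently.
FAULT_RE = re.compile(
    r"(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): AMD-Vi: Event logged "
    r"\[IO_PAGE_FAULT domain=(?P<domain>0x[0-9a-f]+) "
    r"address=(?P<address>0x[0-9a-f]+) flags=(?P<flags>0x[0-9a-f]+)\]"
)


def parse_faults(lines):
    """Return one dict per IO_PAGE_FAULT line; non-matching lines are skipped."""
    faults = []
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            faults.append({
                "bdf": match.group("bdf"),
                "domain": int(match.group("domain"), 16),
                "address": int(match.group("address"), 16),
                "flags": int(match.group("flags"), 16),
            })
    return faults


def faults_per_device(faults):
    """Count faults per PCI address (bus:device.function)."""
    return Counter(fault["bdf"] for fault in faults)


# The two lines from this post, used as sample input.
SAMPLE = [
    "pve kernel: vfio-pci 0000:84:00.0: AMD-Vi: Event logged "
    "[IO_PAGE_FAULT domain=0x006a address=0xf7900747000 flags=0x0020]",
    "pve kernel: vfio-pci 0000:84:00.0: AMD-Vi: Event logged "
    "[IO_PAGE_FAULT domain=0x006a address=0xf7900767000 flags=0x0020]",
]

faults = parse_faults(SAMPLE)
```

Feeding it the full syslog instead of `SAMPLE` would show whether the faults are limited to 0000:84:00.0 and whether the addresses repeat.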
EndeavourOS now boots and the 7900 XTX is recognized, with amdgpu being the driver used. I can remote in via Sunshine/Moonlight and it seems happy. I just want to resolve the errors and make sure I'm good to go.
I have CSM, Above 4G decoding, and Re-Size BAR Support enabled in my BIOS.
My Grub Params:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt initcall_blacklist=acpi_cpufreq_init amd_pstate.shared_mem=1 amd_pstate=active"
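To double-check that these parameters are the ones the running kernel actually received, a small Python sketch like the following could split a command line into key/value pairs. `parse_cmdline` is a hypothetical helper name, and the simple whitespace split assumes no quoted values containing spaces.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict.

    Flags without a value (such as 'quiet') map to True.
    Assumption: no parameter value contains spaces.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else True
    return params


# The parameter string from the GRUB line above.
GRUB_PARAMS = (
    "quiet amd_iommu=on iommu=pt initcall_blacklist=acpi_cpufreq_init "
    "amd_pstate.shared_mem=1 amd_pstate=active"
)

params = parse_cmdline(GRUB_PARAMS)
```

On the host, the same function could be applied to the contents of `/proc/cmdline` and the result compared with `params`.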
VM Config:
affinity: 0-11
agent: 1
balloon: 0
bios: ovmf
boot: order=scsi0
cores: 12
cpu: host
efidisk0: tank_nvme:vm-104-disk-0,efitype=4m,size=1M
hostpci0: 0000:42:11.2,pcie=1
hostpci1: 0000:84:00,pcie=1,romfile=7900xtx.rom
machine: q35
memory: 16384
meta: creation-qemu=8.1.2,ctime=1707322107
name: EndeavourOS-AMD
numa: 0
ostype: l26
scsi0: tank_nvme:vm-104-disk-1,discard=on,iothread=1,size=32G,ssd=1
scsi1: tank_nvme:vm-104-disk-2,discard=on,iothread=1,size=64G,ssd=1
scsi2: storage_nvme:vm-104-disk-0,backup=0,discard=on,iothread=1,size=512G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=2b620f67-9099-476f-b8a4-6b845bc8524d
sockets: 1
vga: none
vmgenid: 9fb59840-4171-437f-b034-d459120f4f11
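As a sanity check that the device named in the IO_PAGE_FAULT lines is one of the devices passed to this VM, here is a rough Python sketch that reads the `hostpci` entries out of a config like the one above. `parse_hostpci` and `covers` are hypothetical helper names, and the parsing only handles the simple `key: address,opt=value,...` form seen here.

```python
def parse_hostpci(config_text):
    """Collect hostpciN entries from a VM config as {name: {address, options}}."""
    devices = {}
    for line in config_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or not key.startswith("hostpci"):
            continue
        fields = value.strip().split(",")
        entry = {"address": fields[0], "options": {}}
        for field in fields[1:]:
            name, _, val = field.partition("=")
            entry["options"][name] = val
        devices[key] = entry
    return devices


def covers(address, bdf):
    """True if a hostpci address includes the given bus:device.function.

    An address without a function suffix (e.g. 0000:84:00) is treated
    as covering every function of that device.
    """
    return bdf == address or bdf.startswith(address + ".")


# The two hostpci lines from the config above.
CONFIG = """\
hostpci0: 0000:42:11.2,pcie=1
hostpci1: 0000:84:00,pcie=1,romfile=7900xtx.rom
"""

devices = parse_hostpci(CONFIG)
faulting = "0000:84:00.0"  # device from the IO_PAGE_FAULT lines
owners = [name for name, dev in devices.items() if covers(dev["address"], faulting)]
```

Here `owners` points at `hostpci1`, the entry that carries the romfile option, so the faults come from the 7900 XTX itself rather than the other passed-through device.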
OS and Hardware:
Proxmox 8.1.4
Motherboard - ASRock Rack ROMED8-2T
CPU - EPYC 7443P
RAM - 256GB 3200 MHz ECC
GPU1 - Asus TUF RTX 4070 Ti Super, PCIe7
GPU2 - Sapphire Pulse 7900 XTX, PCIe1
OS Drive - Samsung 980 Pro 500GB
VM OS Drives - ZFS Mirror 2x960GB Samsung PM9A3, Oculink 1/2
VM Storage - ZFS Striped Mirror 4x1TB WD SN850, bifurcation card PCIe5
TrueNAS Drives - 4x12TB Seagate x16, SATA 4-7
PCIe NIC - 82599ES 10GbE - passed through to TrueNAS, PCIe4