Proxmox halts when starting VM with AMD APU passthrough

encryptedbr

New Member
Mar 23, 2023
Hello there,

I've been using Proxmox 7.4-3 with an Ubuntu LXC for a while. It's been perfect.
Today I decided to install a Windows VM and pass through the AMD APU (my CPU is a Ryzen 4650GE PRO).
I was able to install Windows using the following guide (https://4sysops.com/archives/create-a-windows-vm-in-proxmox-ve/). It runs fine.
Then I tried to pass through the APU using this guide (https://forum.proxmox.com/threads/gpu-passthrough-ryzen-4600g-apu.120151/#post-524787). I also installed the vendor-reset module as described here (https://www.nicksherlock.com/2020/11/working-around-the-amd-gpu-reset-bug-on-proxmox/).

lspci command shows the following:
#lspci -s 06:00
06:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Renoir (rev dc)
06:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device 1637
06:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor
06:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Renoir USB 3.1
06:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Renoir USB 3.1
06:00.6 Audio device: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) HD Audio Controller

When I click to start the VM, it returns an error and the whole host halts/crashes.

dmesg shows the following right after clicking to start the VM.
<6>[ 428.932442] usb usb6: USB disconnect, device number 1
<4>[ 428.968056] ata10.00: disabled
<5>[ 428.999646] sd 9:0:0:0: [sdc] Synchronizing SCSI cache
<5>[ 429.001686] sd 9:0:0:0: [sdc] Stopping disk
<6>[ 429.002204] sd 9:0:0:0: [sdc] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
<3>[ 429.021775] FAT-fs (sdc2): unable to read boot sector to mark fs as dirty
<6>[ 429.143677] xhci_hcd 0000:06:00.3: remove, state 4
<6>[ 429.145406] xhci_hcd 0000:06:00.3: USB bus 4 deregistered
<6>[ 429.145916] xhci_hcd 0000:06:00.3: remove, state 4
<6>[ 429.147456] xhci_hcd 0000:06:00.3: USB bus 3 deregistered
<3>[ 429.353430] Buffer I/O error on dev dm-1, logical block 8945664, lost sync page write
<3>[ 429.355415] Buffer I/O error on dev dm-1, logical block 0, lost sync page write
<2>[ 429.355547] EXT4-fs error (device dm-1): ext4_journal_check_start:83: comm rs:main Q:Reg: Detected aborted journal
<2>[ 429.355566] EXT4-fs error (device dm-1): ext4_journal_check_start:83: comm systemd-journal: Detected aborted journal
<3>[ 429.355902] EXT4-fs (dm-1): I/O error while writing superblock
<3>[ 429.355911] EXT4-fs (dm-1): previous I/O error to superblock detected
<3>[ 429.355918] Buffer I/O error on dev dm-1, logical block 0, lost sync page write
<2>[ 429.355921] EXT4-fs (dm-1): Remounting filesystem read-only
<3>[ 437.363493] Buffer I/O error on dev dm-6, logical block 9258, lost sync page write
<2>[ 443.812106] EXT4-fs error (device dm-6): ext4_journal_check_start:83: comm containerd-shim: Detected aborted journal
<3>[ 443.813333] Buffer I/O error on dev dm-6, logical block 0, lost sync page write
<3>[ 443.813770] EXT4-fs (dm-6): I/O error while writing superblock
<2>[ 443.814213] EXT4-fs (dm-6): Remounting filesystem read-only
<3>[ 443.814761] device-mapper: thin: process_cell: dm_thin_find_block() failed: error = -5
<3>[ 443.815207] device-mapper: thin: process_cell: dm_thin_find_block() failed: error = -5
<3>[ 492.321767] Buffer I/O error on dev dm-1, logical block 0, lost sync page write
<6>[ 498.044125] kvm: exiting hardware virtualization
<5>[ 498.045663] sd 4:0:0:0: [sda] Synchronizing SCSI cache

Am I making a mistake somewhere?
Any help is appreciated.
 
Looks like the GPU is part of a bigger IOMMU group (which is very common), and you cannot share devices from the same IOMMU group between the VM and the Proxmox host (or other VMs). When the VM starts, the Proxmox host therefore also loses drives, network and probably some other devices in that group, and crashes.
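
To see which devices share an IOMMU group with the APU, the groups can be read from sysfs. Below is a minimal Python sketch, assuming the kernel exposes them under /sys/kernel/iommu_groups (the usual location when the IOMMU is enabled). The function names are only for illustration, and the example address 0000:06:00.0 is taken from the lspci output above.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups  # IOMMU disabled, or sysfs layout not available
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return groups


def shared_with(address, groups):
    """Return the other devices that sit in the same group as `address`."""
    for members in groups.values():
        if address in members:
            return [m for m in members if m != address]
    return []


if __name__ == "__main__":
    groups = iommu_groups()
    for number in sorted(groups):
        print(number, " ".join(groups[number]))
    # Example address from this thread; adjust for your own hardware.
    print("shares a group with:", shared_with("0000:06:00.0", groups))
```

If the last line lists disk or network controllers, passing the GPU through as-is would take those away from the host as well.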
Use pcie_acs_override (which breaks VM security isolation and allows the VM to read all of the Proxmox host memory) to break up the IOMMU groups. Then pass through the VGA (.0) and Audio (.1) functions of the APU (but not the other functions of 06:00) to the VM. Also use this work-around if the APU is used as the boot GPU.
And even then it might not really work, because the AMD Windows GPU drivers might not work if you pass through integrated graphics. Also make sure to use machine type version 6.2, as the drivers appear to have trouble with machine version 7.1.
And search this forum for other people with the exact same problem and the same hardware, and see if they got it to work and what kind of (unsafe) work-arounds they needed.
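
A misspelled kernel parameter has no effect, so it can help to confirm that the option really reached the running kernel. Below is a rough Python sketch, assuming the parameter is set on the kernel command line and can be read from /proc/cmdline. The helper name kernel_params is hypothetical, and the simple whitespace split ignores quoted values.

```python
from pathlib import Path


def kernel_params(cmdline):
    """Split a kernel command line into {name: value}; flags without '=' map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


if __name__ == "__main__":
    path = Path("/proc/cmdline")  # command line of the running kernel (Linux only)
    if path.exists():
        params = kernel_params(path.read_text())
        if "pcie_acs_override" in params:
            print("pcie_acs_override =", params["pcie_acs_override"])
        else:
            print("pcie_acs_override is not set; check the bootloader config for typos")
```

Run it after a reboot: the bootloader config only takes effect on the next boot.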
 
@leesteken thanks for your support!
I managed to find a typo in my grub config, and now the IOMMU devices are ungrouped.
I was able to boot the VM once, but it was very sluggish. It took minutes to load Windows.
Then it stopped booting with the error:

swtpm_setup: Not overwriting existing state file.
kvm: ../hw/pci/pci.c:1562: pci_irq_handler: Assertion `0 <= irq_num && irq_num < PCI_NUM_PINS' failed.
stopping swtpm instance (pid 25279) due to QEMU startup error
TASK ERROR: start failed: QEMU exited with code 1

I'm digging across the forum to see if I can find a solution, but nothing yet. Any suggestions?
 
kvm: ../hw/pci/pci.c:1562: pci_irq_handler: Assertion `0 <= irq_num && irq_num < PCI_NUM_PINS' failed.

I'm digging across the forum to see if I can find a solution, but nothing yet. Any suggestions?
I don't think anyone has fixed that particular error, which can happen with any number of devices. I think your best bet is to find a post (here or elsewhere on the internet) by someone who succeeded in passing through the same or a very similar AMD integrated GPU, using Proxmox, KVM or libvirt. Then you know it is possible to get it working and which device-specific work-arounds are needed.
 
You're right. After a day of reading and searching, I could not find anyone who managed to solve this. I'll keep searching, and if I find a solution I will come back here to post it.
Thanks again @leesteken
 