GPU passthrough to Ubuntu guest: rebooting the guest freezes the guest and glitches the host

Sandbo

Member
Jul 4, 2019
Sorry for the long title, but the behaviour is a bit confusing, so I spelled it out in full.

My spec is:
Threadripper 1950X
ASRock X399 Taichi
AMD Radeon VII

I did a clean install of Proxmox 6.2 and then a clean install of Ubuntu Server 20.04.1 as a guest. I was trying to pass the GPU through from Proxmox to the Ubuntu guest. I followed the wiki and got the remapping to generate output as described.

After assigning the GPU from the host to the guest, I was able to boot the guest, and from within the guest I could see the information for the AMD GPU I had passed through. I thought I had succeeded at that point. I then went ahead and installed AMD ROCm to use the GPU for compute, which required rebooting the guest a couple of times.

I found that when I reboot the guest, it shuts down but never starts again. If I stop it manually from Proxmox, it does stop after some time, but starting it again fails. Back in the Proxmox shell, I ran lspci -vvv and found that Proxmox could no longer read the GPU:
Code:
44:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 20 [Radeon VII] (rev ff) (prog-if ff)
        !!! Unknown header type 7f
        Kernel driver in use: vfio-pci
        Kernel modules: amdgpu

44:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Vega 20 HDMI Audio [Radeon VII] (rev ff) (prog-if ff)
        !!! Unknown header type 7f
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
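
For what it's worth, `rev ff` together with `Unknown header type 7f` is what lspci prints when every configuration-space read comes back as 0xff, which is what a device that has stopped responding on the bus returns. Below is a minimal Python sketch of that check. It assumes the usual Linux sysfs layout for the config file, and the helper names `looks_dead` and `read_config` are made up for illustration:

```python
from pathlib import Path


def looks_dead(config: bytes) -> bool:
    """Return True if the 64-byte standard config header is all 0xff.

    An unresponsive device answers every read with all ones. lspci masks
    the header-type byte (offset 0x0e) with 0x7f, hence "header type 7f".
    """
    header = config[:64]
    return len(header) == 64 and all(byte == 0xFF for byte in header)


def read_config(address: str) -> bytes:
    # Assumption: standard sysfs path; address looks like "0000:44:00.0".
    return Path(f"/sys/bus/pci/devices/{address}/config").read_bytes()


# Example use on the host (not run here):
#   looks_dead(read_config("0000:44:00.0"))
```

If this returns True for the GPU while the VM is stopped, the card itself is not answering, so the problem is below the level of the guest configuration.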

Before booting the guest, Proxmox could see the GPU. Even while the guest was still running, Proxmox displayed the GPU information correctly:
Code:
44:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 20 [Radeon VII] (rev c1) (prog-if 00 [VGA controller])
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Vega 20 [Radeon VII]
        Flags: bus master, fast devsel, latency 0, IRQ 255
        Memory at 7fce0000000 (64-bit, prefetchable) [size=256M]
        Memory at 7fcf0000000 (64-bit, prefetchable) [size=2M]
        I/O ports at 4000 [disabled] [size=256]
        Memory at 82300000 (32-bit, non-prefetchable) [size=512K]
        Expansion ROM at 82380000 [disabled] [size=128K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
        Capabilities: [64] Express Legacy Endpoint, MSI 00
        Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [150] Advanced Error Reporting
        Capabilities: [200] #15
        Capabilities: [270] #19
        Capabilities: [2a0] Access Control Services
        Capabilities: [2b0] Address Translation Service (ATS)
        Capabilities: [2c0] Page Request Interface (PRI)
        Capabilities: [2d0] Process Address Space ID (PASID)
        Capabilities: [320] Latency Tolerance Reporting
        Kernel driver in use: vfio-pci
        Kernel modules: amdgpu
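
To tell the two states apart without reading the whole listing, the lspci output can be scanned for the `Unknown header type 7f` marker. This is only a rough sketch: the function name `find_unresponsive` is hypothetical, and the sample text is copied from the frozen listing in this post.

```python
import re

# A device line starts at column 0 with bus:device.function,
# optionally prefixed by a four-digit PCI domain.
DEVICE_LINE = re.compile(
    r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) ", re.IGNORECASE
)


def find_unresponsive(lspci_text: str) -> list[str]:
    """Return addresses whose lspci block reports 'Unknown header type 7f'."""
    dead = []
    current = None
    for line in lspci_text.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            current = match.group(1)
        elif current and "Unknown header type 7f" in line and current not in dead:
            dead.append(current)
    return dead


SAMPLE = """\
44:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 20 [Radeon VII] (rev ff) (prog-if ff)
        !!! Unknown header type 7f
        Kernel driver in use: vfio-pci

44:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Vega 20 HDMI Audio [Radeon VII] (rev ff) (prog-if ff)
        !!! Unknown header type 7f
        Kernel driver in use: vfio-pci
"""

print(find_unresponsive(SAMPLE))  # ['44:00.0', '44:00.1']
```

On the host, the same function could be fed the text of `lspci -vvv` instead of the embedded sample.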

Any ideas will be appreciated.
 

Sandbo

Thanks a lot, you have saved me hours of trying to understand the issue. However, it seems there is still no actual fix for AMD GPUs, as if they were not meant to be passed through to VMs.

Interestingly, I did GPU passthrough years ago (~2016), before I started using Proxmox. I was using just virt-manager and QEMU, if I recall correctly. At that time I had an Intel 6600 and an AMD HD 7950, and I don't remember having this issue.
 
