[SOLVED] AMD GPU passthrough & reset hookscript that works through guest shutdowns

jce

Member
Jan 31, 2021
I used to have an NVIDIA GPU which I passed through to my guest VMs, and this worked well. Recently I switched to an AMD RX 9070 and encountered the notorious reset bug for the first time. The reset hookscripts posted here and on other forums were helpful, but only partially. If a guest is shut down from within the VM itself, the "post-stop" phase in the script doesn't trigger, so the GPU remains bound to vfio-pci.

I searched the forum and found others mentioning this without any solution posted, so I tweaked the script. During the "pre-start" phase, my version checks whether the GPU is already bound to vfio-pci; if it is, the script unbinds it from vfio-pci and binds it to amdgpu. The script then continues as normal from there, unbinding from amdgpu so the GPU can be bound to vfio-pci for the guest. I couldn't track down the threads where others raised this issue, so I'm posting a new thread with my revision.

I appreciate that the line grepping for 'vfio-pci' could be made more robust against variations in the output of `lspci`, so please reply with suggestions.

Bash:
#!/usr/bin/bash

phase="$2"

echo "Phase is $phase"

if [ "$phase" == "pre-start" ]; then

    # Quote the substitution so the test doesn't break if lspci prints nothing
    if [ "$(lspci -nnk | grep -A 2 03:00.0 | tail -1 | sed 's/.*: //')" == "vfio-pci" ]; then
        # Unbind gpu from vfio-pci
        echo "Bound to vfio-pci, unbinding"
        echo "0000:03:00.0" > /sys/bus/pci/drivers/vfio-pci/unbind 2>/dev/null
        # Binding gpu back to amdgpu
        sleep 2
        echo "0000:03:00.0" > /sys/bus/pci/drivers/amdgpu/bind 2>/dev/null
        sleep 2
    fi

    # Unbind gpu from amdgpu
    echo "0000:03:00.0" > /sys/bus/pci/drivers/amdgpu/unbind 2>/dev/null

    sleep 2

    # Resize the GPU's BAR2 memory region (useful for PCI passthrough).
    # The value is log2 of the size in MB, so 8 requests 2^8 MB = 256 MB.

    echo 8 > /sys/bus/pci/devices/0000:03:00.0/resource2_resize

    sleep 2

elif [ "$phase" == "post-stop" ]; then

    # Unbind gpu from vfio-pci

    sleep 5

    echo "0000:03:00.0" > /sys/bus/pci/drivers/vfio-pci/unbind 2>/dev/null

    sleep 2

    # Bind amdgpu

    echo "0000:03:00.0" > /sys/bus/pci/drivers/amdgpu/bind 2>/dev/null

    sleep 2

fi
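
On the fragility of the `lspci` grep: one possible alternative, offered as a minimal sketch rather than a tested replacement, is to read the `driver` symlink that sysfs exposes for each PCI device. The function name `current_driver` and its optional second argument are illustrative and not part of the script above. The second argument is an alternate sysfs root, included only so the function can be tried outside a real host.

```shell
#!/usr/bin/bash

# Print the name of the kernel driver currently bound to a PCI device,
# or print nothing if no driver is bound.
#   $1: full PCI address, e.g. 0000:03:00.0
#   $2: sysfs root (hypothetical parameter for testing, defaults to /sys)
current_driver() {
    local dev="$1"
    local sysfs_root="${2:-/sys}"
    local link="$sysfs_root/bus/pci/devices/$dev/driver"

    # The "driver" entry is a symlink to the bound driver's directory;
    # it is absent when the device has no driver.
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    fi
}

# Possible use in the pre-start phase, replacing the lspci pipeline:
#   if [ "$(current_driver 0000:03:00.0)" == "vfio-pci" ]; then
#       ...unbind from vfio-pci and bind to amdgpu as before...
#   fi
```

Since this reads the kernel's own record of the binding, it should not depend on how `lspci` lays out its output or on another device's address happening to match the grep pattern.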