GPU passthrough tutorial/reference

Discussion in 'Proxmox VE: Installation and configuration' started by sshaikh, Apr 23, 2017.

  1. hewu

    hewu New Member
    Proxmox Subscriber

    Joined:
    May 12, 2018
    Messages:
    8
    Likes Received:
    1
    Hello, everybody,
    I've worked my way through the tutorial and almost everything works fine. Thanks a lot for that!
    However, I have the following problem:
    When I restart the VM with Passthrough, my whole system freezes and I have to restart everything.
    I hope you can help me with the diagnosis and give me tips on how to fix it.

    Thanks a lot!
    Greetings hewu

    Translated with www.DeepL.com/Translator
     
  2. dcsapak

    dcsapak Proxmox Staff Member
    Staff Member

    Joined:
    Feb 1, 2016
    Messages:
    3,482
    Likes Received:
    317
    what kind of card is it ? some amd vega cards cannot be reset properly
    any output in dmesg/syslog ?
     
  3. hewu

    hewu New Member
    Proxmox Subscriber

    Joined:
    May 12, 2018
    Messages:
    8
    Likes Received:
    1
    Thank you for your reply

    It is an AMD Radeon Vega Frontier Edition. Is there a Workaround? What is the reason for this behavior?

    Greetings
    Hewu
     

    BobhWasatch likes this.
  4. dcsapak

    dcsapak Proxmox Staff Member
    Staff Member

    Joined:
    Feb 1, 2016
    Messages:
    3,482
    Likes Received:
    317
    the only workaround i know of is to unload the card's driver in the guest before rebooting/shutting down
    (e.g. rmmod for a linux guest)

    it seems there is a kernel fix for this in 4.19, but i do not believe we will backport this
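
A minimal sketch of the unload-before-shutdown workaround for a Linux guest. The helper name `unload_gpu_driver` and the `DRY_RUN` switch are illustrative, and `amdgpu` is an assumption about which module the guest uses for the Vega card. The unload will typically fail while a display server or other process still holds the device, so it may need to be run from a text console after stopping the desktop session.

```shell
# Unload a GPU kernel module inside a Linux guest before shutdown or reboot.
# With DRY_RUN=1 the command is only printed, not executed.
unload_gpu_driver() {
    module=$1
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "modprobe -r $module"
    else
        modprobe -r "$module"
    fi
}

# Usage in the guest, as root, right before powering off
# ("amdgpu" is an assumed module name):
#   unload_gpu_driver amdgpu && systemctl poweroff
```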
     
  5. Jeffrey Roberts

    Jeffrey Roberts New Member

    Joined:
    Jan 11, 2019
    Messages:
    7
    Likes Received:
    0
    I followed the instructions, but I do not see

    Kernel driver in use: vfio-pci

    when I execute

    lspci -v

    Any ideas on what I might be doing wrong?

    ...

    Also, I see

    Kernel modules: nvidiafb, nouveau

    I added nouveau to the blacklist, should nouveau still be listed in the kernel modules?

    Thank you
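
For reference when reading `lspci` output: the "Kernel modules:" line lists modules that could drive the device, so a blacklisted module such as nouveau can still appear there. The line that shows what is actually bound is "Kernel driver in use:". The sketch below summarises that line per device; `driver_in_use` is a hypothetical helper name, and the slot 04:00 in the usage comment is only an example.

```shell
# Print the "Kernel driver in use" value for each device found in `lspci -k`
# output read from stdin. Devices with no bound driver show "(none)".
driver_in_use() {
    awk '
        /^[0-9a-f]/ {   # device lines start at column 0, detail lines are indented
            if (dev != "") print dev ": " (drv == "" ? "(none)" : drv)
            dev = $1; drv = ""
        }
        /Kernel driver in use:/ { drv = $NF }
        END { if (dev != "") print dev ": " (drv == "" ? "(none)" : drv) }
    '
}

# Usage on the Proxmox host:
#   lspci -nnk -s 04:00 | driver_in_use
```

If the GPU functions show a driver other than vfio-pci here, the vfio-pci `ids=` option or the initramfs update is the first thing to recheck.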
     
  6. davu

    davu New Member

    Joined:
    Feb 6, 2019
    Messages:
    1
    Likes Received:
    0
    Hello, I am having trouble with my GPU passthrough. I am using an RX570. I have two VMs to which the GPU is passed through (only one is on at a time).

    This method has worked flawlessly with my Windows 10 VM. And it somewhat works with my Ubuntu VM.

    The GPU will passthrough to the Ubuntu VM once, and only once. Whenever I shut it down, I cannot start it back up again until I restart the whole proxmox node. The error I get when starting the Ubuntu VM after shutting it down is:

    Code:
    TASK ERROR: start failed: command ' (a bunch of parameters) ' failed: got timeout
    It seems as if the GPU is getting locked after I shut down the VM (I shut it down from inside the VM, by clicking the power button and shutting down).
     
  7. koburr

    koburr New Member

    Joined:
    Feb 16, 2019
    Messages:
    1
    Likes Received:
    0
    Ah yes, here is my config for GPU passthrough on the PowerEdge R710 with Xeon E55xx CPUs, using two GTX 750 Tis (one on this machine and one on another):

    Code:
    agent: 1
    balloon: 0
    bios: ovmf
    bootdisk: virtio0
    cores: 3
    cpu: host,hidden=1
    hostpci0: 04:00,x-vga=1,pcie=1
    hotplug: 0
    ide2: none,media=cdrom
    machine: q35
    memory: 7168
    name: WIN10X64PVEXXXX
    net0: virtio=XX:XX:XX:XX:XX:XX,bridge=vmbr0
    numa: 1
    ostype: win10
    scsihw: virtio-scsi-pci
    smbios1: uuid=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
    sockets: 2
    virtio0: local-lvm:vm-102-disk-0,cache=writeback,size=320G
    vmgenid: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
    
    With pcie=1 but without numa: 1, booting the VM would crash the whole server; without numa it would only run as regular PCI.

    Instructions:

    Hardware:
    1. Solder in power plugs onto the power supply connector on the motherboard.
    2. Using a razor knife and a hot air soldering station, heat up the PCI slot at the back (using 400 degree heat on medium air, or whatever works) and cut out the back of the PCI slot to fit cards.

    Proxmox:
    Edit grub command line with unsafe interrupts:
    Code:
    # nano /etc/default/grub
    
    change:
    Code:
    GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1 video=efifb:off"
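
A change to /etc/default/grub only takes effect after `update-grub` and a reboot. The sketch below is one way to confirm afterwards that a flag reached the running kernel; `cmdline_has` is a hypothetical helper name, not part of Proxmox.

```shell
# Return success if the kernel command line (first argument) contains the
# flag given as the second argument as a whole word.
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage on the host, after `update-grub` and a reboot:
#   cmdline_has "$(cat /proc/cmdline)" intel_iommu=on && echo "IOMMU flag active"
```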
    
    add to /etc/modules:
    Code:
    # echo "vfio" >> /etc/modules
    # echo "vfio_iommu_type1" >> /etc/modules
    # echo "vfio_pci" >> /etc/modules
    # echo "vfio_virqfd" >> /etc/modules
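
Because `>` truncates a file and `>>` appends to it, an append that skips lines already present is safe to run more than once. A minimal sketch follows; `add_vfio_modules` is a hypothetical helper name, and the file is passed as a parameter so it can be tried on a scratch copy before touching /etc/modules.

```shell
# Append each VFIO module name to the given file unless it is already listed
# on a line of its own.
add_vfio_modules() {
    modules_file=$1
    for mod in vfio vfio_iommu_type1 vfio_pci vfio_virqfd; do
        grep -qx "$mod" "$modules_file" 2>/dev/null || echo "$mod" >> "$modules_file"
    done
}

# Usage on the host, as root:
#   add_vfio_modules /etc/modules
```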
    
    Allow unsafe interrupts in vfio:
    Code:
    echo "options vfio_iommu_type1 allow_unsafe_interrupts=1" > /etc/modprobe.d/iommu_unsafe_interrupts.conf
    
    Add dev id (either 04:00 or 06:00 on the R710):
    Code:
    lspci -n -s 04:00
    04:00.0 0300: 10de:1380 (rev a2)
    04:00.1 0403: 10de:0fbc (rev a1)
    
    echo "options vfio-pci ids=10de:1380,10de:0fbc disable_vga=1" > /etc/modprobe.d/vfio.conf
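
The `ids=` value is the third column of the `lspci -n` output joined with commas. A small sketch that derives it instead of copying by hand; `vfio_ids` is a hypothetical helper name.

```shell
# Build the "ids=" value for vfio-pci from `lspci -n -s <slot>` output on stdin.
# Each input line looks like: "04:00.0 0300: 10de:1380 (rev a2)".
vfio_ids() {
    awk '{ print $3 }' | paste -sd, -
}

# Usage on the host (prints the line only, it does not write the file):
#   echo "options vfio-pci ids=$(lspci -n -s 04:00 | vfio_ids) disable_vga=1"
```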
    
    blacklist drivers:
    Code:
    echo "blacklist radeon" >> /etc/modprobe.d/blacklist.conf
    echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
    echo "blacklist nvidia" >> /etc/modprobe.d/blacklist.conf
    
    Make a new VM for windows 10 as usual: virtio lvm, virtio networking, cpu: host, hidden=1, machine: q35, bios: ovmf

    add virtio drivers cdrom, install...

    Code:
    agent: 1
    balloon: 0
    bios: ovmf
    bootdisk: virtio0
    cores: 3
    cpu: host,hidden=1
    hotplug: 0
    ide2: none,media=cdrom
    machine: q35
    memory: 7168
    name: WIN10X64PVEXXXX
    net0: virtio=XX:XX:XX:XX:XX:XX,bridge=vmbr0
    ostype: win10
    scsihw: virtio-scsi-pci
    smbios1: uuid=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
    sockets: 2
    virtio0: local-lvm:vm-102-disk-0,cache=writeback,size=320G
    vmgenid: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
    
    After install enable RDP and add to the config:
    Code:
    hostpci0: 04:00,x-vga=1,pcie=1
    numa: 1
    
    install nvidia drivers as usual.

    Tips on gaming:

    install Steam and set up Steam In-Home Streaming.
    Streaming will only work on the first boot of the VM, before logging in via RDP.
    Add notepad or some application as a non steam game (right click in notepad and search with bing to leave notepad and minimize)


    Other notes on CPUs: since the R710 has quad-core sockets, the VM will crash, and may crash the server, if you use more than four cores without sockets: 2.

    Still a very inexpensive buy for doing Autodesk work remotely and gaming while still having a little CPU power left over for running databases and developing websites and applications at home.

    I think the nvidia drivers I'm using are 289.81. I wouldn't recommend installing any newer ones; those seem to be the ones people are recommending on steam for the 750 Ti. Although I heard you can get an RX 460 for around $50 these days.
     
    #47 koburr, Feb 16, 2019
    Last edited: Feb 16, 2019
  8. Saxosus

    Saxosus New Member

    Joined:
    Apr 8, 2019
    Messages:
    2
    Likes Received:
    0
    Question about the vid passthrough instructions...
    I've only been using Proxmox for about a month now and I'm still learning the multitude of details, but do the step 3 instructions update the correct blacklist?

    In step 3, it says add lines to:
    /etc/modprobe.d/blacklist.conf
    In my system I see an already existing file named:
    /etc/modprobe.d/pve-blacklist.conf

    After I ran "update-initramfs -u" and checked with "lspci -v | grep -A 8 -i NVIDIA", I still see the drivers loaded.
     
    #48 Saxosus, May 11, 2019
    Last edited: May 11, 2019
  9. Saxosus

    Saxosus New Member

    Joined:
    Apr 8, 2019
    Messages:
    2
    Likes Received:
    0
    I'm also running an r710, but I don't like the idea of working so hard to modify the server to get an x16 card crammed into the slot, in addition to having even more wires to deal with. As it's only a PCIe v2 bus anyway, I went with the standard Nvidia GeForce GT710 x8 card. Your GTX 750 will downgrade itself in the x8 slot to x8 PCIe 2 speeds, so you'll lose significant throughput there, but your card will still work better (MUCH better I think :) ) than mine.

    However, there's one more thing you can do to speed it up. While I was researching how much effort I wanted to put into a vid card, I found an interesting thing. To mount a GTX750 in riser 2, you have the choice of using one of two x8 slots. First, and this is ridiculous but available, there is a very rare and stupid expensive (over $200) x16 riser card made for the machine. There are also the miner rig mods out there: you can get hold of one of the small bitmining slot adapters, which will convert two x8 slots to an x16 slot AND include power into the bus, for only about $15. You'll still need an additional power supply because the riser is only rated at 25W per port and not exceeding 30W for the whole riser. Combining the two x8 slots into a single x16 slot will allow your vid card to run at its full speed of x16.

    Can you elaborate a little more on this? I have the hex procs in my machine, but are you saying you have to use at least one core from each proc? Maybe something to do with balancing the data? I'll be interested in learning more.

    Have you experimented with and found any better drivers? I'd love to save the trouble of fighting driver hunting!

    Thanks!
     
  10. Sub-7

    Sub-7 New Member

    Joined:
    Mar 11, 2018
    Messages:
    7
    Likes Received:
    0
  11. Neox

    Neox New Member

    Joined:
    Dec 12, 2018
    Messages:
    5
    Likes Received:
    1
    the real name of the file doesn't matter as long as you keep the .conf at the end
    the name is only for you to remember what it does

    as pve-blacklist.conf might be provided by a proxmox package, I would use a
    gpu-passthrough-blacklist.conf so that proxmox can update its own file and mine stays separate

    and after the update-initramfs -u -k all
    you MUST reboot your server to activate the new setting
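
The separate-file approach can be sketched as follows. `write_gpu_blacklist` is a hypothetical helper name, the module list is taken from the blacklist step earlier in the thread, and the directory is a parameter so the function can be tried outside /etc first.

```shell
# Write a dedicated blacklist file into a modprobe.d-style directory, leaving
# any package-provided file such as pve-blacklist.conf untouched.
write_gpu_blacklist() {
    conf_dir=$1
    printf 'blacklist %s\n' radeon nouveau nvidia \
        > "$conf_dir/gpu-passthrough-blacklist.conf"
}

# Usage on the host, as root:
#   write_gpu_blacklist /etc/modprobe.d
#   update-initramfs -u -k all
#   reboot
```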
     
    Saxosus likes this.