GPU passthrough tutorial/reference

Discussion in 'Proxmox VE: Installation and configuration' started by sshaikh, Apr 23, 2017.

  1. dcsapak

    dcsapak Proxmox Staff Member
    Staff Member

    Joined:
    Feb 1, 2016
    Messages:
    3,482
    Likes Received:
    317
If you use q35, you can only use IDE slots 0 and 2.
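For example, a minimal sketch of a q35 VM config (the ISO names are placeholders) that keeps both CD-ROM drives on the valid slots:
Code:
machine: q35
ide0: local:iso/virtio-win.iso,media=cdrom
ide2: local:iso/windows.iso,media=cdrom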
     
  2. Miguel Moreira

    Miguel Moreira New Member
    Proxmox Subscriber

    Joined:
    Nov 20, 2017
    Messages:
    23
    Likes Received:
    0
I changed the HDD to IDE, and now I don't get the same error, but the whole node freezes and the only way to recover is to restart it manually.

This is the ping status of the VM:

    Pinging 10.20.10.44 with 32 bytes of data:
    Request timed out.
    Request timed out.
    Request timed out.
    Request timed out.
    Request timed out.
    Reply from 192.168.20.253: Destination host unreachable.
    Reply from 192.168.20.253: Destination host unreachable.
    Reply from 192.168.20.253: Destination host unreachable.
    Request timed out.
    Reply from 10.20.10.44: bytes=32 time=1ms TTL=63
    Reply from 10.20.10.44: bytes=32 time=2ms TTL=63
    Reply from 10.20.10.44: bytes=32 time=1ms TTL=63
    Reply from 10.20.10.44: bytes=32 time=1ms TTL=63
    Request timed out.
    Request timed out.
    Request timed out.
    Request timed out.

Thanks for your help.
     
    #22 Miguel Moreira, Dec 7, 2017
    Last edited: Dec 7, 2017
  3. Miguel Moreira

    Miguel Moreira New Member
    Proxmox Subscriber

    Joined:
    Nov 20, 2017
    Messages:
    23
    Likes Received:
    0
Something important to note: this VM is for work, using ethOS. Maybe mining in a VM is not possible.
     
  4. 3DDario

    3DDario New Member

    Joined:
    Oct 14, 2017
    Messages:
    1
    Likes Received:
    0
    Same problem here (RX 580 - guest OS is Windows 10). It boots up and after two minutes everything (guest and host) freezes. No problem without GPU Passthrough.
     
  5. VTOLfreak

    VTOLfreak New Member
    Proxmox Subscriber

    Joined:
    Nov 14, 2017
    Messages:
    3
    Likes Received:
    0
I had the same problem with my RX 480, which is pretty much the same card. The scary thing is that this can bring the whole node down. I wonder if this is because I don't have proper ACS on my system (Intel C236) and the card is trying to talk to something it should not, even though it is in its own IOMMU group and should not be able to affect the host OS in any way.
     
  6. Dorin

    Dorin Member

    Joined:
    Sep 11, 2017
    Messages:
    33
    Likes Received:
    2
I tried again, but this time the vfio-pci driver only shows up as in use if the following commands are executed after each restart:
    Code:
    # modprobe vfio
    # modprobe vfio_pci
    # echo 10de 1c82 | tee /sys/bus/pci/drivers/vfio-pci/new_id
    10de 1c82
    # echo 10de 0fb9 | tee /sys/bus/pci/drivers/vfio-pci/new_id
    10de 0fb9
    # dmesg | grep -i vfio
    [ 2810.602064] VFIO - User Level meta-driver version: 0.3
    [ 2817.223859] vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
    [ 2817.241650] vfio_pci: add [10de:1c82[ffff:ffff]] class 0x000000/00000000
    [ 2817.241654] vfio_pci: add [10de:0fb9[ffff:ffff]] class 0x000000/00000000
    # lspci -nnk -d 10de:1c82
    01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107 [GeForce GTX 1050 Ti] [10de:1c82] (rev a1)
            Subsystem: Micro-Star International Co., Ltd. [MSI] GP107 [GeForce GTX 1050 Ti] [1462:8c96]
            Kernel driver in use: vfio-pci
            Kernel modules: nvidiafb, nouveau
    # lspci -nnk -d 10de:0fb9
    01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:0fb9] (rev a1)
            Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:8c96]
            Kernel driver in use: snd_hda_intel
            Kernel modules: snd_hda_intel
    

Otherwise, this is what I get:
    Code:
    # dmesg | grep -i vfio
(no output)
# Note: "Kernel driver in use:" is either not displayed or is not vfio-pci:
    # lspci -vnn
    ...
    01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107 [GeForce GTX 1050 Ti] [10de:1c82] (rev a1) (prog-if 00 [VGA controller])
            Subsystem: Micro-Star International Co., Ltd. [MSI] GP107 [GeForce GTX 1050 Ti] [1462:8c96]
            Flags: bus master, fast devsel, latency 0, IRQ 11
            Memory at f6000000 (32-bit, non-prefetchable) [size=16M]
            Memory at e0000000 (64-bit, prefetchable) [size=256M]
            Memory at f0000000 (64-bit, prefetchable) [size=32M]
            I/O ports at e000 [size=128]
            Expansion ROM at 000c0000 [disabled] [size=128K]
            Capabilities: [60] Power Management version 3
            Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
            Capabilities: [78] Express Legacy Endpoint, MSI 00
            Capabilities: [100] Virtual Channel
            Capabilities: [250] Latency Tolerance Reporting
            Capabilities: [128] Power Budgeting <?>
            Capabilities: [420] Advanced Error Reporting
            Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
            Capabilities: [900] #19
            Kernel modules: nvidiafb, nouveau
    
    01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:0fb9] (rev a1)
            Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:8c96]
            Flags: bus master, fast devsel, latency 0, IRQ 17
            Memory at f7080000 (32-bit, non-prefetchable) [size=16K]
            Capabilities: [60] Power Management version 3
            Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
            Capabilities: [78] Express Endpoint, MSI 00
            Capabilities: [100] Advanced Error Reporting
            Kernel driver in use: snd_hda_intel
            Kernel modules: snd_hda_intel
    ...
    

This is what I have:
    Code:
    # uname -r
    4.13.13-2-pve
    
    # find /sys/kernel/iommu_groups/ -type l
    /sys/kernel/iommu_groups/7/devices/0000:00:1c.0
    /sys/kernel/iommu_groups/5/devices/0000:00:1a.0
    /sys/kernel/iommu_groups/3/devices/0000:00:16.3
    /sys/kernel/iommu_groups/3/devices/0000:00:16.0
    /sys/kernel/iommu_groups/11/devices/0000:01:00.1
    /sys/kernel/iommu_groups/11/devices/0000:01:00.0
    /sys/kernel/iommu_groups/1/devices/0000:00:01.0
    /sys/kernel/iommu_groups/8/devices/0000:00:1c.1
    /sys/kernel/iommu_groups/6/devices/0000:00:1b.0
    /sys/kernel/iommu_groups/4/devices/0000:00:19.0
    /sys/kernel/iommu_groups/12/devices/0000:03:00.0
    /sys/kernel/iommu_groups/2/devices/0000:00:14.0
    /sys/kernel/iommu_groups/10/devices/0000:00:1f.3
    /sys/kernel/iommu_groups/10/devices/0000:00:1f.2
    /sys/kernel/iommu_groups/10/devices/0000:00:1f.0
    /sys/kernel/iommu_groups/0/devices/0000:00:00.0
    /sys/kernel/iommu_groups/9/devices/0000:00:1d.0
    
    The "/etc/default/grub" file has:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on video=efifb:off pcie_acs_override=downstream"
    
    # update-grub
    Generating grub configuration file ...
    Found linux image: /boot/vmlinuz-4.13.13-2-pve
    Found initrd image: /boot/initrd.img-4.13.13-2-pve
    Found memtest86+ image: /ROOT/pve-1@/boot/memtest86+.bin
    Found memtest86+ multiboot image: /ROOT/pve-1@/boot/memtest86+_multiboot.bin
    done
    
    # update-initramfs -u
    update-initramfs: Generating /boot/initrd.img-4.13.13-2-pve
    
    The "/etc/modprobe.d/vfio.conf" file:
    options vfio-pci ids=10de:1c82,10de:0fb9
    
    The "/etc/modprobe.d/blacklist.conf" file:
    blacklist radeon
    blacklist nouveau
    blacklist nvidia
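
For reference, the usual way to avoid re-running those modprobe/new_id commands after every restart (the standard approach from the Proxmox wiki, sketched here with the same device IDs) is to load the VFIO modules at boot via /etc/modules and rebuild the initramfs:
Code:
# /etc/modules -- modules to load at boot
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
With the IDs already in /etc/modprobe.d/vfio.conf as above, running "update-initramfs -u -k all" and rebooting should make the binding stick.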
    
     
  7. Dorin

    Dorin Member

    Joined:
    Sep 11, 2017
    Messages:
    33
    Likes Received:
    2
    Finally, after a long time.. The GTX 1050 Ti seems to work with my VM. :)

But after a system shutdown, at the next boot Proxmox didn't finish booting and seemed to have entered an infinite loop.
I waited some time, but the system hung at "Starting The Proxmox VE cluster filesystem...", then on a new line: "Starting rebootscript.service...".
I was able to see the output on the monitor connected to the GPU (the VM's monitor).
After one or two restarts (I don't remember exactly), the system booted completely.
Edit:
1.
If I boot the host with the monitor connected to the GPU, the system is in fact completely booted, even if the screen looks like in the picture and a small cursor is blinking in the bottom left corner.
The VM works when started, and the monitor goes black when I shut down the VM.
This behavior made me think that the host hadn't booted completely.
2.
If I boot the host with the monitor disconnected from the GPU and reconnect it later, the screen stays black until the VM is started.


    Performance test (PerformanceTest 9.0 Evaluation Version) and system info:
    https://www.passmark.com/baselines/V9/display.php?id=94665526842

    VM - OS: Windows 10 Pro 64 bit; Task Manager shows: "Virtual machine: Yes", "L1 cache: N/A"
    VM - RAM: 16 GB
    VM - HDD: 500 GB
    VM - Display Resolution: 1024 x 768
    Host system - OS: Proxmox VE 5.1
    Host system - name: Dell PowerEdge T20
    Host system - CPU: XEON E3-1225 v3
    Host system - RAM: 4 x 8 GB (ECC RAM)
    Host system - HDD: 3 x 1 TB (RAID-Z1)
Guest display: VGA monitor @ 1024 x 768 + HDMI-to-VGA adapter + audio output. The sound works well; so far the quality was poor only during the GPU driver installation.


- Therefore, to solve my issue I made a script that runs the vfio-related commands on each restart:
    Code:
    # modprobe vfio
    # modprobe vfio_pci
    # echo 10de 1c82 | tee /sys/bus/pci/drivers/vfio-pci/new_id
    # echo 10de 0fb9 | tee /sys/bus/pci/drivers/vfio-pci/new_id
    
    - The "/etc/default/grub" file has:
    Code:
    GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on video=efifb:off pcie_acs_override=downstream"
    
- The VM's config file:
    Code:
    balloon: 0
    bios: ovmf
    boot: dc
    bootdisk: scsi0
    cores: 4
    cpu: host
    efidisk0: local-zfs:vm-100-disk-2,size=128K
    machine: q35
    memory: 16384
    net0: e1000=##:##:##:##:##:##,bridge=vmbr0
    numa: 0
    ostype: win10
    sata2: local:iso/Windows10.iso,media=cdrom,size=3553536K
    sata3: local:iso/virtio-win-0.1.126.iso,media=cdrom,size=152204K
    scsi0: local-zfs:vm-100-disk-1,size=500G
    scsihw: virtio-scsi-pci
    smbios1: uuid=########-####-####-####-############
    sockets: 1
    hostpci0: 01:00,pcie=1,x-vga=on,romfile=vbios.bin
    
After the Windows installation, I added "hostpci0: 01:00,pcie=1,x-vga=on,romfile=vbios.bin" to the VM's config file.
In my case I found that I have to use a vBIOS.
The "vbios.bin" is a modified vBIOS of my GPU. It was dumped with GPU-Z and modified with a hex editor, as seen in a tutorial on YouTube.


    Thanks.
     

    #27 Dorin, Dec 28, 2017
    Last edited: Dec 28, 2017
    johjoh and Aist like this.
  8. TheFunk

    TheFunk Member

    Joined:
    Oct 25, 2016
    Messages:
    35
    Likes Received:
    3
    Hey y'all, been a while. Moved to ESXi for a minute. I'm back now, I'm sorry I ever left.

    I've been using a few different cards in different passthrough configurations recently.

For those of you on Supermicro X10DAi series boards trying to use passthrough with a relatively powerful card, I highly recommend updating to the latest BIOS revision, which includes the option to enable "Above 4G Decoding" for specific PCIe slots only. Super useful. If you have an NVIDIA GPU with more than 4 GB of VRAM, you'll want to enable that option on the slot where the big GPU is installed (and no other slots).

    Additionally, if you're passing through the card as an EFI device, make sure you set it to use the EFI OPROM in the BIOS.

Lastly, for anyone having issues using the QEMU guest agent or the built-in Windows power buttons (shutdown, restart, etc.) after passing through an NVIDIA GPU: switch the card to message signaled interrupts. https://forums.guru3d.com/threads/windows-line-based-vs-message-signaled-based-interrupts.378044/

    My VM would not successfully complete a shutdown command until this was done.

FYI, this will also resolve the issue of the HID device in Device Manager with the little yellow triangle.
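
For reference, the change described in that thread comes down to a single registry value under the GPU's device node; a sketch in .reg form (the PCI instance path below is hypothetical, look up your own card's "Device instance path" in Device Manager):
Code:
Windows Registry Editor Version 5.00

; hypothetical instance path -- substitute your own card's
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\PCI\VEN_10DE&DEV_1C82&SUBSYS_8C961462&REV_A1\4&12345678&0&0008\Device Parameters\Interrupt Management\MessageSignaledInterruptProperties]
"MSISupported"=dword:00000001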
     
  9. johjoh

    johjoh Member

    Joined:
    Feb 4, 2011
    Messages:
    43
    Likes Received:
    0
I just want to thank you.

The NVIDIA GeForce GTX 1060 (Zotac brand) works like a charm with:
    bios: ovmf
    machine: q35
    hostpci0: 82:00,x-vga=on,pcie=1
     
  10. sshaikh

    sshaikh Member

    Joined:
    Apr 23, 2017
    Messages:
    40
    Likes Received:
    9
    Glad it was that simple for at least one of us!
     
  11. Sub-7

    Sub-7 New Member

    Joined:
    Mar 11, 2018
    Messages:
    7
    Likes Received:
    0
Hi, I have a few questions.

My CPU (i5-4670K) has no VT-d, only VT-x.
Is GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
the only way, or is that equivalent to "lxc.mount.entry = /dev/nvidia0 dev/nvidia0 none bind,optional,create=file"?

Does the CPU need VT-d?

Does GPU passthrough work in a container as well?
     
  12. blindrain

    blindrain New Member

    Joined:
    Apr 22, 2018
    Messages:
    3
    Likes Received:
    0
I have successfully set up GPU passthrough, but audio is messed up.
So I tried to pass through my onboard audio too, which works when I pass through only the audio and disable the GPU.
The EFI boot screen freezes when I try to pass through both.

hostpci0: 01:00.0,pcie=1
hostpci1: 00:1f.3,pcie=1

Why would they work independently of each other but not together?
Here's my full config for the VM:
    Code:
    balloon: 3072
    bios: ovmf
    bootdisk: scsi0
    cores: 4
    cpu: host
    efidisk0: local-lvm:vm-130-disk-2,size=128K
    hostpci0: 01:00.0,pcie=1
    hostpci1: 00:1f.3,pcie=1
    ide0: local:iso/virtio-win-0.1.141.iso,media=cdrom,size=309208K
    ide2: local:iso/Win10_1803_English_x64.iso,media=cdrom
    machine: q35
    memory: 4096
    name: Cindy
    net0: virtio=4A:D7:94:CE:91:B0,bridge=vmbr0,tag=11
    numa: 1
    ostype: win10
    scsi0: local-lvm:vm-130-disk-1,size=160G
    scsihw: virtio-scsi-pci
    smbios1: uuid=51f589b7-aa7b-4271-82f9-e9fecf1b5b47
    sockets: 1
    usb0: host=046d:c52b
    usb1: host=413c:2011
    usb2: host=0b0e:034a
    usb3: host=1-3,usb3=1
    usb4: host=1-6.2.3
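
One thing worth checking before combining the two: onboard audio at 00:1f.3 usually sits in the same IOMMU group as the other 00:1f.x chipset functions (as in the group listing earlier in this thread), and passing a device to a VM while the host still drives other devices in its group can freeze the boot. A quick check, as a sketch:
Code:
# list the IOMMU groups containing the GPU and the onboard audio
find /sys/kernel/iommu_groups/ -type l | grep -E '01:00|00:1f'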
     
  13. maxprox

    maxprox Member
    Proxmox Subscriber

    Joined:
    Aug 23, 2011
    Messages:
    286
    Likes Received:
    11
Yes (the CPU needs VT-d) and no (it doesn't work in a container); have a look here: https://pve.proxmox.com/wiki/Pci_passthrough
     
  14. maxprox

    maxprox Member
    Proxmox Subscriber

    Joined:
    Aug 23, 2011
    Messages:
    286
    Likes Received:
    11
Hi,
I can't find that syntax anywhere.
I would try the correct one first; have a look at the first post...
     
  15. Neox

    Neox New Member

    Joined:
    Dec 12, 2018
    Messages:
    5
    Likes Received:
    1
Worked fine here with a Linux guest (my Windows ISO won't boot in EFI mode), exactly as you have done, but with this line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"

i.e. without the video=efifb:off part.

Added to the VM config file:
Step 1:

bios: ovmf #(from the GUI, select BIOS = OVMF)
machine: q35 #(from the CLI)
Then install your OS.

Step 2, from the CLI:
hostpci0: 01:00,x-vga=on,pcie=1

...because the pve 5.3 GUI adds rombar=0/1,romfile=undefined, which prevents the VM from booting with this error:

kvm: -device vfio-pci,host=01:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,multifunction=on,romfile=/usr/share/kvm/undefined: failed to find romfile "/usr/share/kvm/undefined"
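
A sketch of one way around the GUI-added romfile entry: set the hostpci line from the CLI with qm (the standard Proxmox management tool), which writes exactly what you pass:
Code:
qm set <vmid> -hostpci0 01:00,x-vga=on,pcie=1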

Now I'm trying to trick Proxmox into thinking my GTX is not the primary card, so that noVNC from Proxmox can connect to the primary card and I can keep my secondary display on the GTX 1080, a TV screen for gaming.

Hardware:
Core i7-4970
NVIDIA GTX 1080
Proxmox v5.3
     
    #35 Neox, Dec 13, 2018
    Last edited: Dec 13, 2018
  16. dcsapak

    dcsapak Proxmox Staff Member
    Staff Member

    Joined:
    Feb 1, 2016
    Messages:
    3,482
    Likes Received:
    317
Yes, this is a bug and it's already fixed in git (so it should be fixed with the next pve-manager package update).

You could add a second virtual GPU via the 'args' line.

Maybe I'll extend the config so that we can pass through a GPU and still have the 'hv_vendor_id' set for NVIDIA; then it would only be a matter of omitting the 'x-vga' part in the config.
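
For context, the 'hv_vendor_id' workaround mentioned here is the usual fix for NVIDIA's Code 43 error in guests; when set by hand it goes into a custom 'args' line in the VM config, as in this commonly posted sketch (the vendor string itself is arbitrary):
Code:
args: -cpu host,+kvm_pv_unhalt,+kvm_pv_eoi,hv_vendor_id=NV43FIX,kvm=off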
     
  17. Neox

    Neox New Member

    Joined:
    Dec 12, 2018
    Messages:
    5
    Likes Received:
    1
Thanks for this info.

As for rombar=0/1:
0 means you won't see the boot screen until the OS is ready, thus hiding the Proxmox boot logo;
1 means you want to see the Proxmox boot logo, to know your VM is firing up.

Yep, I may give it a trial this way first.
     
  18. dcsapak

    dcsapak Proxmox Staff Member
    Staff Member

    Joined:
    Feb 1, 2016
    Messages:
    3,482
    Likes Received:
    317
No, this just controls whether the ROM BAR from the device gets mapped into the guest's memory.
     
  19. Neox

    Neox New Member

    Joined:
    Dec 12, 2018
    Messages:
    5
    Likes Received:
    1
I don't know if it is expected, or related to my hardware not having a 'ROM BAR' to get mapped, but this is NOT what I see on my system.

On MY system:
rombar=0 hides my Proxmox boot logo and the system only shows up at the login screen;
rombar=1 shows me the Proxmox boot/BIOS logo and progress bar, then GRUB, then the OS boot sequence, then the login screen.
     
  20. dcsapak

    dcsapak Proxmox Staff Member
    Staff Member

    Joined:
    Feb 1, 2016
    Messages:
    3,482
    Likes Received:
    317
That may just be a side effect.
The 'ROM BAR' is the ROM Base Address Register (BAR).

The default is on (1), but some cards do not work properly when it is mapped.

Generally, I would leave it on until there is a problem with it.
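
That register is visible in lspci output as the "Expansion ROM" line (it appears in the device dump earlier in this thread); for example:
Code:
# show where the card's option ROM is mapped (or that it is disabled)
lspci -v -s 01:00.0 | grep -i 'expansion rom'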
     