GPU passthrough for both VM & LXC

junukwon

New Member
Nov 9, 2023
Just realized while reading the article again: it was A or B, not A and B. It seems that if I pass the device ID, I don't (necessarily) need to blacklist the driver.


TL;DR I have two GPUs. One for VM, one for LXC. Is it possible?

So basically, I've been using GPU1 with a Windows VM for a while. Recently I acquired a second GPU (GPU2), and I want to use it in an LXC.

The problem is, since I followed this wiki article, I've already blacklisted the Nvidia GPU drivers (both GPUs are Nvidia). AFAIK, to pass a GPU through to an LXC, the host has to install & configure the drivers. Hence, I need a way for the host to have the Nvidia drivers loaded yet grab only GPU2 and leave GPU1 alone, so that my Windows VM can use it. One hard requirement here: the Windows VM should be able to turn on & off multiple times without a host reboot.

I did some searching but failed to find info on my situation. Some articles suggest sharing is not possible, but those were about sharing the same GPU between VMs & LXCs, which is not my case.

Anyways, thanks in advance!
 
Did you get this working or find any additional information? I have been searching for a way to use my P2200 for the host/LXC and pass my 2080ti through to a VM, but I can't find clear guidance on how to force the vfio-pci driver onto only one GPU. All I find are old posts and unrelated ones. I guess I could get an AMD card and do it that way, but I don't care to spend money when I have these.
 

Yes, I got it to work. Basically, you don't want any of the Nvidia blacklisting stuff, which means `/etc/modprobe.d/blacklist` should not include Nvidia, as your P2200 will need that Nvidia module to be loaded.

Now, since you want to prevent the host from loading the Nvidia driver for the 2080ti, note its IDs with `lspci -nnk` and add them to `/etc/modprobe.d/vfio.conf`:
Code:
options vfio-pci ids=10de:2186,10de:20bc disable_vga=1

There are two IDs: one for the VGA controller and one for the audio device.
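
For reference, here is what a typical `lspci -nnk` excerpt looks like (sample output, not from my machine; the `[vendor:device]` pairs in square brackets are what go into `vfio.conf`, and yours will differ):
Code:
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU102 [GeForce RTX 2080 Ti] [10de:1e04]
        Kernel driver in use: nvidia
01:00.1 Audio device [0403]: NVIDIA Corporation TU102 High Definition Audio Controller [10de:10f7]
        Kernel driver in use: snd_hda_intel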

With this all set, run `update-initramfs -u`, then reboot. If you run `lspci -nnk` again, you'll see `Kernel driver in use: nvidia` for your P2200 and `Kernel driver in use: vfio-pci` for your 2080ti. Now you can proceed to install the Nvidia drivers for your P2200 on the host machine. For the LXC, you can refer to any of the articles describing how to use a GPU in an LXC; you'll have to modify the LXC config file and also install some packages like nvidia-container-toolkit to actually use the GPU in your LXC.
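
I won't repeat a full LXC guide here, but as a rough sketch of what those config additions usually look like (example values of mine, not from any particular guide; the cgroup major numbers and device nodes depend on your driver version, so check `ls -l /dev/nvidia*` on the host first):
Code:
# /etc/pve/lxc/<ctid>.conf -- expose the host's Nvidia device nodes to the container
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 509:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file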

The 2080ti, on the other hand, you can simply pass through to your VM as a raw PCIe device.
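
On Proxmox that's a single command, e.g. (VM ID and PCI address are made up here, substitute your own from `lspci`):
Code:
# pass all functions (GPU + audio) of the card at 01:00 to VM 100;
# pcie=1 requires the q35 machine type
qm set 100 -hostpci0 01:00,pcie=1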

In conclusion, the key is not to blacklist the driver, but to prevent only the specific device (in this case, the 2080ti) from being bound to the host.

If you have any questions feel free to ask!
 

Thank you! I did get this all to work shortly after asking my question; I ended up using a script and udev rules. Your way is cleaner, though, and I'll probably try it out. I knew there was a way, but I just couldn't find the proper syntax or examples. Thank you again for the response; I feel like this information should be easier to find!

For future readers' reference, I used the following method:

Add a udev rule (credit: https://github.com/andre-richter/vfio-pci-bind):

nano /etc/udev/rules.d/25-vfio-pci-bind.rules

Add the following (customized for my 2080ti)

Code:
# udev rules file that binds selected PCI devices to vfio-pci instead of
# whatever driver udev and modprobe would ordinarily bind them to.
#
# This rules file should be located in /etc/udev/rules.d/
# vfio-pci-bind.sh must be located in /lib/udev/ and must be executable.
#
ACTION!="add", GOTO="vfio_pci_bind_rules_end"
SUBSYSTEM!="pci", GOTO="vfio_pci_bind_rules_end"


# Identify PCI devices to be bound to vfio-pci using udev matching rules and
# tag each device with "vfio-pci-bind".
#
# Example: Match any PCI device with <Vendor:Device> 1912:0014
#   ATTR{vendor}=="0x1912", ATTR{device}=="0x0014", TAG="vfio-pci-bind"
#
# Example: Match the PCI device with <Domain:Bus:Device.Function> 0000:0b:00.0
#  KERNEL=="0000:0b:00.0", TAG="vfio-pci-bind"
#

#Nvidia 2080ti
ATTR{vendor}=="0x10de", ATTR{device}=="0x1e04", TAG="vfio-pci-bind"
ATTR{vendor}=="0x10de", ATTR{device}=="0x10f7", TAG="vfio-pci-bind"
ATTR{vendor}=="0x10de", ATTR{device}=="0x1ad6", TAG="vfio-pci-bind"
ATTR{vendor}=="0x10de", ATTR{device}=="0x1ad7", TAG="vfio-pci-bind"

# Any device tagged by a rule above is bound to vfio-pci.
#
TAG=="vfio-pci-bind", RUN+="vfio-pci-bind.sh $kernel"
LABEL="vfio_pci_bind_rules_end"

Copy the following script into /lib/udev/vfio-pci-bind.sh:

nano /lib/udev/vfio-pci-bind.sh

Code:
#!/usr/bin/env bash
# -*- coding: utf-8 -*-
#
# =============================================================================
#
# The MIT License (MIT)
#
# Copyright (c) 2015-2021 Andre Richter
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#
# =============================================================================
#
# Author(s):
#   Andre Richter, <andre.o.richter @t gmail_com>
#
# =============================================================================
#
# This script takes one or two parameters in any order:
#   <Vendor:Device> i.e. vvvv:dddd
#   <Domain:Bus:Device.Function> i.e. dddd:bb:dd.f
# and then:
#
#  (1) If both <Vendor:Device> and <Domain:Bus:Device.Function> were provided,
#      validate that the requested <Vendor:Device> exists at <Domain:Bus:Device.Function>
#
#      If only <Vendor:Device> was provided, determine the current
#      <Domain:Bus:Device.Function> for that device.
#
#      If only <Domain:Bus:Device.Function> was provided, use it.
#
#  (2) Unbinds all devices that are in the same iommu group as the supplied
#      device from their current driver (except PCIe bridges).
#
#  (3) Binds to vfio-pci:
#    (3.1) The supplied device.
#    (3.2) All devices that are in the same iommu group.
#
#  (4) Transfers ownership of the respective iommu group inside /dev/vfio
#      to $SUDO_USER
#
# Script must be executed via sudo

BDF_REGEX="^[[:xdigit:]]{2}:[[:xdigit:]]{2}.[[:xdigit:]]$"
DBDF_REGEX="^[[:xdigit:]]{4}:[[:xdigit:]]{2}:[[:xdigit:]]{2}.[[:xdigit:]]$"
VD_REGEX="^[[:xdigit:]]{4}:[[:xdigit:]]{4}$"

if [[ $EUID -ne 0 ]]; then
    echo "Error: This script must be run as root" 1>&2
    exit 1
fi

if [[ -z "$@" ]]; then
    echo "Error: Please provide Domain:Bus:Device.Function (dddd:bb:dd.f) and/or Vendor:Device (vvvv:dddd)" 1>&2
    exit 1
fi

unset VD BDF
for arg in "$@"
do
    if [[ $arg =~ $VD_REGEX ]]; then
        VD=$arg
    elif [[ $arg =~ $DBDF_REGEX ]]; then
        BDF=$arg
    elif [[ $arg =~ $BDF_REGEX ]]; then
        BDF="0000:${arg}"
        echo "Warning: You did not supply a PCI domain, assuming ${BDF}" 1>&2
    else
        echo "Error: Please provide Vendor:Device (vvvv:dddd) and/or Domain:Bus:Device.Function (dddd:bb:dd.f)" 1>&2
        exit 1
    fi
done

# BDF not provided, find BDF for Vendor:Device
if [[ -z $BDF ]]; then
    COUNT=$(lspci -n -d ${VD} 2>/dev/null | wc -l)
    if [[ $COUNT -eq 0 ]]; then
        echo "Error: Vendor:Device ${VD} not found" 1>&2
        exit 1
    elif [[ $COUNT -gt 1 ]]; then
        echo "Error: Multiple results for Vendor:Device ${VD}, please provide Domain:Bus:Device.Function (dddd:bb:dd.f) as well" 1>&2
        exit 1
    fi
    BDF=$(lspci -n -d ${VD} 2>/dev/null | cut -d " " -f1)
    if [[ $BDF =~ $BDF_REGEX ]]; then
        BDF="0000:${BDF}"
    elif [[ ! $BDF =~ $DBDF_REGEX ]]; then
        echo "Error: Unable to find Domain:Bus:Device.Function for Vendor:Device ${VD}" 1>&2
        exit 1
    fi
fi

TARGET_DEV_SYSFS_PATH="/sys/bus/pci/devices/$BDF"

if [[ ! -d $TARGET_DEV_SYSFS_PATH ]]; then
    echo "Error: Device ${BDF} does not exist, unable to bind device" 1>&2
    exit 1
fi

if [[ ! -d "$TARGET_DEV_SYSFS_PATH/iommu/" ]]; then
    echo "Error: No signs of an IOMMU. Check your hardware and/or linux cmdline parameters. Use intel_iommu=on or iommu=pt iommu=1" 1>&2
    exit 1
fi

# validate that the correct Vendor:Device was found for this BDF
if [[ ! -z $VD ]]; then
    if [[ $(lspci -n -s ${BDF} -d ${VD} 2>/dev/null | wc -l) -eq 0 ]]; then
        echo "Error: Vendor:Device ${VD} not found at ${BDF}, unable to bind device" 1>&2
        exit 1
    else
        echo "Vendor:Device ${VD} found at ${BDF}"
    fi
else
    echo "Warning: You did not specify a Vendor:Device (vvvv:dddd), unable to validate ${BDF}" 1>&2
fi

unset dev_sysfs_paths
for dsp in $TARGET_DEV_SYSFS_PATH/iommu_group/devices/*
do
    dbdf=${dsp##*/}
    if [[ $(( 0x$(setpci -s $dbdf 0e.b) & 0x7f )) -eq 0 ]]; then
        dev_sysfs_paths+=( $dsp )
    fi
done

printf "\nIOMMU group members (sans bridges):\n"
for dsp in ${dev_sysfs_paths[@]}; do echo $dsp; done

modprobe -i vfio-pci
if [[ $? -ne 0 ]]; then
    echo "Error: Error probing vfio-pci" 1>&2
    exit 1
fi

printf "\nBinding...\n"
for dsp in ${dev_sysfs_paths[@]}
do
    dpath="$dsp/driver"
    dbdf=${dsp##*/}

    echo "vfio-pci" > "$dsp/driver_override"

    if [[ -d $dpath ]]; then
        curr_driver=$(readlink $dpath)
        curr_driver=${curr_driver##*/}

        if [[ "$curr_driver" == "vfio-pci" ]]; then
            echo "$dbdf already bound to vfio-pci"
            continue
        else
            echo $dbdf > "$dpath/unbind"
            echo "Unbound $dbdf from $curr_driver"
        fi
    fi

    echo $dbdf > /sys/bus/pci/drivers_probe
done

printf "\n"

# Adjust group ownership
iommu_group=$(readlink $TARGET_DEV_SYSFS_PATH/iommu_group)
iommu_group=${iommu_group##*/}
chown $SUDO_UID:$SUDO_GID "/dev/vfio/$iommu_group"
if [[ $? -ne 0 ]]; then
    echo "Error: unable to adjust group ownership of /dev/vfio/${iommu_group}" 1>&2
    exit 1
fi

printf "success...\n\n"
echo "Device ${VD} at ${BDF} bound to vfio-pci"
echo 'Devices listed in /sys/bus/pci/drivers/vfio-pci:'
ls -l /sys/bus/pci/drivers/vfio-pci | egrep [[:xdigit:]]{4}:
printf "\nls -l /dev/vfio/\n"
ls -l /dev/vfio/

And make it executable

chmod +x /lib/udev/vfio-pci-bind.sh
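
One caveat: the rule only fires on device add events, so the simplest way to apply it is a reboot. If you'd rather avoid that, reloading udev and re-triggering add events for the PCI subsystem should also work (plain udevadm usage, not part of the repo above):
Code:
udevadm control --reload-rules
udevadm trigger --action=add --subsystem-match=pci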

The script can also be run manually at any time, so you can take a card that is currently bound to the nvidia driver and rebind it to vfio-pci on the fly. I'm sure there are conditions that would prevent this, but doing it manually worked well for me too.
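
For a manual run, per the header comments in the script, you pass the Vendor:Device and/or Domain:Bus:Device.Function, for example (addresses here are just illustrative):
Code:
# rebind the 2080ti (and everything else in its IOMMU group) to vfio-pci
sudo /lib/udev/vfio-pci-bind.sh 10de:1e04 0000:0b:00.0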
 