Difficulty binding vfio-pci to HBA

Kevin86

New Member
Jun 12, 2019
Hi all!

Just updated to 6.0. It works well, except for one small issue: I am unable to bind vfio-pci to my LSI HBA at boot with a conf file in /etc/modprobe.d/. The same approach does work fine for my GPU, though.
Code:
options vfio-pci ids=1000:0086
I am able to unbind mpt3sas manually, after which vfio-pci binds without an issue.
Code:
echo "0000:01:00.0" > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
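Before and after a manual unbind like the one above, it can help to confirm which driver actually holds the device. A minimal sketch, assuming the standard sysfs layout where each PCI device has a 'driver' symlink; the function name current_driver and the optional sysfs_root argument are made up for illustration (the latter only so it can be tried against a test directory):

```shell
# Print the kernel driver currently bound to a PCI device, or "none".
# $1: PCI address, e.g. 0000:01:00.0
# $2: sysfs root (hypothetical parameter, defaults to /sys)
current_driver() {
    dev="$1"
    sysfs_root="${2:-/sys}"
    link="$sysfs_root/bus/pci/devices/$dev/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/mpt3sas
        basename "$(readlink "$link")"
    else
        echo none
    fi
}
```

Calling 'current_driver 0000:01:00.0' on the host should print mpt3sas while the HBA driver owns the device and vfio-pci once the rebind has worked. 'lspci -k' reports the same information as "Kernel driver in use".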
I think the problem lies in the initramfs, which somehow triggers the load/bind of the kernel driver mpt3sas for my HBA before vfio-pci can claim it. dmesg shows mpt3sas is loaded as soon as init is run.
Code:
[    1.102377] Run /init as init process
[    1.132019] usb 3-3: new low-speed USB device number 2 using xhci_hcd
[    1.163191] dca service started, version 1.12.1
[    1.163328] ahci 0000:03:00.1: version 3.0
[    1.165961] ahci 0000:03:00.1: SSS flag set, parallel bus scan disabled
[    1.168694] ahci 0000:03:00.1: AHCI 0001.0301 32 slots 8 ports 6 Gbps 0x33 impl SATA mode
[    1.171490] ahci 0000:03:00.1: flags: 64bit ncq sntf stag pm led clo only pmp pio slum part sxs deso sadm sds apst
[    1.174526] mpt3sas version 27.101.00.00 loaded
Code:
[    5.703417] VFIO - User Level meta-driver version: 0.3
[    5.713032] vfio_pci: add [1000:0086[ffffffff:ffffffff]] class 0x000000/00000000
[    5.713036] vfio_pci: add [1002:687f[ffffffff:ffffffff]] class 0x000000/00000000
[    5.713039] vfio_pci: add [1002:aaf8[ffffffff:ffffffff]] class 0x000000/00000000
Blacklisting mpt3sas with a conf file in /etc/modprobe.d/ does not help. I did run update-initramfs -u after blacklisting. Tried both:
Code:
blacklist mpt3sas
and
Code:
install mpt3sas /bin/true
Could someone point me in the right direction on how to prevent mpt3sas from loading? Or should I use a script to unbind the driver? Thanks for your help!
 
You can try blacklisting mpt3sas from the kernel command line by changing the following line in /etc/default/grub:

Code:
GRUB_CMDLINE_LINUX_DEFAULT="<...other parameters...> modprobe.blacklist=mpt3sas"

Then run 'update-grub'.

Alternatively, you can try including vfio-pci in your initramfs, by including it in '/etc/initramfs-tools/modules', then regenerating your initramfs with 'update-initramfs -k all -u'.
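After rebooting, it is worth checking that the blacklist parameter actually reached the running kernel. A rough sketch, assuming the parameter is passed as modprobe.blacklist=<mod>[,<mod>...] and that the command line can be read from /proc/cmdline; the function name blacklisted_on_cmdline and its optional file argument are hypothetical:

```shell
# Print every module named in modprobe.blacklist= on a kernel command line,
# one per line. Prints nothing if the parameter is absent.
# $1: file holding the command line (hypothetical parameter, defaults to /proc/cmdline)
blacklisted_on_cmdline() {
    cmdline_file="${1:-/proc/cmdline}"
    # Split into one parameter per line, keep only the blacklist value,
    # then split the comma-separated module list.
    tr ' ' '\n' < "$cmdline_file" |
        sed -n 's/^modprobe\.blacklist=//p' |
        tr ',' '\n'
}
```

If 'blacklisted_on_cmdline' prints nothing on the host, the edited boot configuration was not the one used for this boot.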
 
Just tried your suggestions. Unfortunately they did not solve the issue.

I could not find other ways to solve this in, for example, the wiki or the Debian manuals. I am too unfamiliar with Linux to fully understand the mechanisms that trigger kernel module loading. Since neither the boot parameter, nor adding vfio-pci to the initramfs, nor blacklisting through '/etc/modprobe.d/' works, I guess the best alternative is unbinding via a script, or a recompiled kernel without the conflicting module? Could use some hints :-)
 
It's not clear to me why the module blacklisting doesn't work. Especially the 'install mpt3sas /bin/true' should have worked, even according to 'man modprobe.d'.

A few shots in the dark:
  • Try running 'depmod -a' as root before rebuilding the initramfs (with blacklist and/or install in your modprobe.d/*.conf)
  • Check 'lsmod' and see if other modules depend on mpt3sas, blacklist them too
  • 'modinfo mpt3sas' mentions mpt2sas as an alias, try blacklisting that one as well
Otherwise, using a script to unbind seems like the only way (unless you actually want to get into recompiling your kernel - if you do, you can find the sources for our pve-kernel here).
 
Tried again to prevent the module from being loaded, but did not succeed.
First gathered info:
Code:
root@pve:~# lsmod | grep mpt3sas
mpt3sas               241664  0
raid_class             16384  1 mpt3sas
scsi_transport_sas     40960  2 ses,mpt3sas
Created corresponding <modulename>.conf files for mpt3sas, mpt2sas, ses, raid_class and scsi_transport_sas in '/etc/modprobe.d/' with the following line in it:
Code:
install <modulename> /bin/true
Ran 'depmod -a' followed by 'update-initramfs -k all -u' as root. Somehow mpt3sas still gets loaded. Verified with 'lsinitramfs' that the conf files are included in the newly created initramfs. Same result if I change those files in '/etc/modprobe.d' to:
Code:
blacklist <modulename>
Really wonder what is causing this behaviour. I don't know how to diagnose this further.
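For reference, the per-module conf files described above can be generated in one go. This is only a sketch of that step; the function name write_install_overrides is made up, and the target directory is passed in so it can be tried somewhere harmless first:

```shell
# Write one <module>.conf per module, each containing an 'install' override
# that makes modprobe run /bin/true instead of loading the module.
# $1: target directory (on a real host this would be /etc/modprobe.d)
# remaining arguments: module names
write_install_overrides() {
    conf_dir="$1"
    shift
    for mod in "$@"; do
        printf 'install %s /bin/true\n' "$mod" > "$conf_dir/$mod.conf"
    done
}
```

On the host this would be called as 'write_install_overrides /etc/modprobe.d mpt3sas mpt2sas ses raid_class scsi_transport_sas', followed by 'depmod -a' and 'update-initramfs -k all -u' as root.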

I got it running with the following hook script; the HBA copes well with being unbound and rebound:
Code:
#!/usr/bin/perl
use strict;
use warnings;
print "GUEST HOOK: " . join(' ', @ARGV). "\n";
# First argument is the vmid
my $vmid = shift;
# Second argument is the phase
my $phase = shift;
if ($phase eq 'pre-start') {
    # First phase 'pre-start' will be executed before the guest
    # is started. Exiting with a code != 0 will abort the start
    print "$vmid is starting, doing preparations.\n";
    system("echo \"1000 0086\" > /sys/bus/pci/drivers/vfio-pci/new_id &&
            echo \"0000:01:00.0\" > /sys/bus/pci/devices/0000:01:00.0/driver/unbind &&
            echo \"0000:01:00.0\" > /sys/bus/pci/drivers/vfio-pci/bind &&
            echo \"1000 0086\" > /sys/bus/pci/drivers/vfio-pci/remove_id");
    # print "preparations failed, aborting."
    # exit(1);
} elsif ($phase eq 'post-stop') {
    # Last phase 'post-stop' will be executed after the guest stopped.
    # This should even be executed in case the guest crashes or stopped
    # unexpectedly.
    print "$vmid stopped. Doing cleanup.\n";
    system("echo 1 > /sys/bus/pci/devices/0000:01:00.0/remove &&
            echo 1 > /sys/bus/pci/rescan");
} else {
    die "got unknown phase '$phase'\n";
}
exit(0);
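The 'pre-start' part of the hook boils down to four sysfs writes, which can also be run by hand from a shell. A sketch of the same sequence as a shell function; the name rebind_to_vfio and the optional sysfs_root argument are assumptions for illustration, and the function expects the device to be bound to some driver already, just like the script:

```shell
# Move a PCI device from its current driver to vfio-pci.
# $1: PCI address, e.g. 0000:01:00.0
# $2: vendor ID, e.g. 1000
# $3: device ID, e.g. 0086
# $4: sysfs root (hypothetical parameter, defaults to /sys)
rebind_to_vfio() {
    dev="$1"
    vendor="$2"
    device="$3"
    sysfs_root="${4:-/sys}"
    vfio="$sysfs_root/bus/pci/drivers/vfio-pci"
    # Same order as the hook script: register the ID, release the device
    # from its current driver, bind it to vfio-pci, drop the ID again.
    # The chain stops at the first write that fails.
    echo "$vendor $device" > "$vfio/new_id" &&
        echo "$dev" > "$sysfs_root/bus/pci/devices/$dev/driver/unbind" &&
        echo "$dev" > "$vfio/bind" &&
        echo "$vendor $device" > "$vfio/remove_id"
}
```

As root, 'rebind_to_vfio 0000:01:00.0 1000 0086' performs the same steps as the hook for this HBA. A non-zero return value means one of the writes failed.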

Thank you for the help Stefan, really appreciated!
 
Got it solved. I am using UEFI and 'pve-efiboot-tool refresh' needs to run after 'update-initramfs'.

Blacklisting works fine now!
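
For anyone landing here with the same setup, the order that matters can be sketched as below. Treat it as an assumption-laden example: the function name refresh_boot_images is made up, and the presence of /sys/firmware/efi is only a rough stand-in for "this host boots the way mine does". UEFI boot alone does not guarantee that pve-efiboot-tool manages the boot partition, so check that first. The second argument is a hypothetical dry-run hook: pass 'echo' to print the commands instead of running them.

```shell
# Rebuild the initramfs and, on UEFI hosts, refresh the boot images afterwards.
# $1: directory whose existence indicates UEFI boot (defaults to /sys/firmware/efi)
# $2: optional command prefix, e.g. 'echo' for a dry run
refresh_boot_images() {
    efi_dir="${1:-/sys/firmware/efi}"
    run="${2:-}"
    # $run is left unquoted on purpose so that an empty value disappears.
    $run update-initramfs -k all -u || return 1
    if [ -d "$efi_dir" ]; then
        $run pve-efiboot-tool refresh || return 1
    fi
}
```

Running 'refresh_boot_images /sys/firmware/efi echo' first shows what would be executed without touching anything.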