Why blacklist doesn't work

rjcab

Active Member
Mar 1, 2021
Hello.

I did a fresh install of PVE 9.0.1 and tried to make a passthrough of the GTX1650:

Code:
root@pve:~#  lspci -nnk -s 01:00.0
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU117 [GeForce GTX 1650] [10de:1f82] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device [1458:4026]
        Kernel modules: nvidiafb, nouveau

root@pve:~#  lspci -nnk -s 01:00.1
01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10fa] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device [1458:4026]
        Kernel modules: snd_hda_intel

root@pve:~# cat /etc/modprobe.d/vfio.conf
options vfio-pci ids=10de:1b80,10de:10f0 disable_vga=1

root@pve:~# ls /etc/modprobe.d/
blacklist-nvidia.conf           pve-blacklist.conf              zfs.conf
intel-microcode-blacklist.conf  vfio.conf
root@pve:~# cat /etc/modprobe.d/blacklist-nvidia.conf
blacklist nouveau
blacklist nvidia
blacklist nvidiafb
blacklist rivafb
blacklist snd_hda_intel
blacklist iwlwifi
blacklist input

root@pve:~#
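
One thing that stands out: the ids in vfio.conf above are 10de:1b80,10de:10f0, while lspci reports the GTX 1650 as [10de:1f82] and its audio function as [10de:10fa]. As a sketch, the ids= line can be derived directly from lspci output; the sample text below just mirrors the devices shown above:

```shell
# On the real host the sample would come from: lspci -n -s 01:00
sample='01:00.0 0300: 10de:1f82 (rev a1)
01:00.1 0403: 10de:10fa (rev a1)'

# The third whitespace-separated field is the vendor:device ID; join with commas.
ids=$(printf '%s\n' "$sample" | awk '{print $3}' | paste -sd, -)
echo "options vfio-pci ids=$ids disable_vga=1"
# -> options vfio-pci ids=10de:1f82,10de:10fa disable_vga=1
```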

But in the dmesg output at boot I don't see any "vfio":

Code:
[   16.308443] vmbr0: port 1(enp4s0) entered forwarding state
[   17.148471] vmbr0: port 1(enp4s0) entered disabled state
[   19.796336] r8169 0000:04:00.0 enp4s0: Link is Up - 1Gbps/Full - flow control rx/tx
[   19.796357] vmbr0: port 1(enp4s0) entered blocking state
[   19.796360] vmbr0: port 1(enp4s0) entered forwarding state
root@pve:~# dmesg | grep nvidia
root@pve:~# dmesg | grep nouveau
root@pve:~# dmesg | grep vfio
root@pve:~#
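
One thing worth checking, as a sketch (assuming the standard Debian initramfs-tools layout): whether the modprobe.d files above were actually included in the initramfs that booted:

```shell
# List the modprobe.d files embedded in the running kernel's initramfs:
lsinitramfs /boot/initrd.img-"$(uname -r)" | grep -E 'modprobe\.d/(vfio|blacklist)'
```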

If you have any ideas, feel free :)
 
Thanks @Impact

Yes I did:

Code:
root@pve:~# update-initramfs -u -k all
update-initramfs: Generating /boot/initrd.img-6.14.8-2-pve
Running hook script 'zz-proxmox-boot'..
Re-executing '/etc/kernel/postinst.d/zz-proxmox-boot' in new private mount namespace..
No /etc/kernel/proxmox-boot-uuids found, skipping ESP sync.
root@pve:~#

Code:
root@pve:~# dmesg -T | grep -Ei "nvidia|nouveau|vfio"
root@pve:~#

Code:
root@pve:~# journalctl -kg "nvidia|nouveau|vfio"
-- No entries --
root@pve:~#

I haven't tried softdep yet:
# echo "softdep <some-nvidia> pre: vfio-pci" >> /etc/modprobe.d/<some-nvidia>.conf

But as I can't see any nvidia, nouveau, etc. in the dmesg output, I think it is a bit useless? I am not an expert :)
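
For reference, a sketch of what those softdep lines could look like here, using the module names from the lspci output above (the file name vfio-softdep.conf is my own choice, not an established one):

```shell
# Make vfio-pci load before each module that could claim the card:
cat > /etc/modprobe.d/vfio-softdep.conf <<'EOF'
softdep nvidiafb pre: vfio-pci
softdep nouveau pre: vfio-pci
softdep snd_hda_intel pre: vfio-pci
EOF
update-initramfs -u -k all
```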
 
Good question, I had made a mistake. Now:

Code:
root@pve:~# cat /etc/modules
# /etc/modules is obsolete and has been replaced by /etc/modules-load.d/.
# Please see modules-load.d(5) and modprobe.d(5) for details.
#
# Updating this file still works, but it is undocumented and unsupported.
vfio
vfio_iommu_type1
vfio_pci
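
As the header comment says, /etc/modules still works but is unsupported; a sketch of the equivalent under /etc/modules-load.d/ (the file name vfio.conf is arbitrary):

```shell
# Supported location for boot-time module loading:
cat > /etc/modules-load.d/vfio.conf <<'EOF'
vfio
vfio_iommu_type1
vfio_pci
EOF
```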

But we still get the messages we don't like:

Code:
root@pve:~# journalctl -kg "nvidia|nouveau|vfio"
Sep 25 21:04:57 pve kernel: VFIO - User Level meta-driver version: 0.3
Sep 25 21:04:57 pve kernel: vfio_pci: add [10de:1b80[ffffffff:ffffffff]] class 0x000000/00000000
Sep 25 21:04:57 pve kernel: vfio_pci: add [10de:10f0[ffffffff:ffffffff]] class 0x000000/00000000
Sep 25 21:06:41 pve kernel: vfio-pci 0000:01:00.0: vgaarb: deactivate vga console
Sep 25 21:06:41 pve kernel: vfio-pci 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:own>
Sep 25 21:06:43 pve kernel: vfio-pci 0000:01:00.0: not ready 1023ms after resume; waiting
Sep 25 21:06:44 pve kernel: vfio-pci 0000:01:00.0: not ready 2047ms after resume; waiting
Sep 25 21:06:46 pve kernel: vfio-pci 0000:01:00.0: not ready 4095ms after resume; waiting
Sep 25 21:06:51 pve kernel: vfio-pci 0000:01:00.0: not ready 8191ms after resume; waiting
Sep 25 21:06:59 pve kernel: vfio-pci 0000:01:00.0: not ready 16383ms after resume; waiting
root@pve:~#

and the blacklisted kernel driver is still loaded:

Code:
01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10fa] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device [1458:4026]
        Kernel modules: snd_hda_intel
root@pve:~#  lspci -nnk -s 01:00.0
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU117 [GeForce GTX 1650] [10de:1f82] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device [1458:4026]
        Kernel modules: nvidiafb, nouveau
root@pve:~#
 
and the blacklisted kernel driver is still loaded:

Code:
01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10fa] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device [1458:4026]
        Kernel modules: snd_hda_intel
root@pve:~#  lspci -nnk -s 01:00.0
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU117 [GeForce GTX 1650] [10de:1f82] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device [1458:4026]
        Kernel modules: nvidiafb, nouveau
root@pve:~#
There is no line with "Kernel driver in use:", and therefore no driver is loaded. The output only mentions which module(s) could be loaded, but none is actually in use.
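
A quick check, as a sketch; on a host where vfio-pci has successfully claimed the device, the grep would print a line like "Kernel driver in use: vfio-pci":

```shell
# Show the driver actually bound to the GPU function, if any:
lspci -nnk -s 01:00.0 | grep "Kernel driver in use" || echo "no driver bound"
```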
 
Thanks for your quick reply.
When I started the VM and then typed lspci -nnk -s 01:00, Proxmox got stuck and was not accessible anymore, even by ICMP requests.

 
I've been told that since PVE 8.0, if you boot with EFI, you don't need to add anything to the kernel command line (/proc/cmdline)? :-/

Code:
root@pve:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-6.14.8-2-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on
root@pve:~#
 
I've been told that since PVE 8.0, if you boot with EFI, you don't need to add anything to the kernel command line (/proc/cmdline)? :-/
I don't understand, sorry.
Code:
root@pve:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-6.14.8-2-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on
root@pve:~#
intel_iommu=on is no longer necessary since kernel 6.8 and you are using 6.14.

As I said, I have no idea why passthrough is broken in this way on your computer. Ask the motherboard manufacturer and/or see if there is a BIOS version that works better. I cannot help with this, sorry. I only noticed that there was no driver loaded when you thought there was.
 
intel_iommu=on is no longer necessary since kernel 6.8 and you are using 6.14.

Thanks, I can remove it.

So even with kernel 6.14, do I still have to modify the kernel command line, even though I already edited GRUB as below:

Code:
root@pve:/mnt/diskint# cat /etc/default/grub
# If you change this file or any /etc/default/grub.d/*.cfg file,
# run 'update-grub' afterwards to update /boot/grub/grub.cfg.
# For full documentation of the options in these files, see:
#   info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`( . /etc/os-release && echo ${NAME} )`
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
#GRUB_CMDLINE_LINUX_DEFAULT="quiet"
GRUB_CMDLINE_LINUX=""

The strange thing is that if I don't add intel_iommu=on in GRUB, I don't get this line:

Code:
[    0.107077] DMAR: IOMMU enabled

from this command:
Code:
root@pve:/mnt/diskint# dmesg | grep -e DMAR -e IOMMU
[    0.024418] ACPI: DMAR 0x0000000099772000 000050 (v02 INTEL  EDK2     00000002      01000013)
[    0.024461] ACPI: Reserving DMAR table memory at [mem 0x99772000-0x9977204f]
[    0.107077] DMAR: IOMMU enabled
[    0.279814] DMAR: Host address width 39
[    0.279816] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[    0.279828] DMAR: dmar0: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da
[    0.279831] DMAR-IR: IOAPIC id 2 under DRHD base  0xfed91000 IOMMU 0
[    0.279833] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[    0.279834] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    0.281286] DMAR-IR: Enabled IRQ remapping in x2apic mode
[    0.706409] DMAR: No RMRR found
[    0.706410] DMAR: No ATSR found
[    0.706411] DMAR: No SATC found
[    0.706412] DMAR: dmar0: Using Queued invalidation
[    0.707910] DMAR: Intel(R) Virtualization Technology for Directed I/O
root@pve:/mnt/diskint#
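
Independently of the DMAR lines, the directories under /sys/kernel/iommu_groups show whether the IOMMU is actually active (they only appear when it is) and how the GPU and its audio function are grouped; a small sketch:

```shell
# Print each PCI device together with its IOMMU group number:
for d in /sys/kernel/iommu_groups/*/devices/*; do
    g=$(basename "$(dirname "$(dirname "$d")")")
    printf 'group %s: %s\n' "$g" "$(basename "$d")"
done
```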