Intel Arc Pro B50 - xe driver not binding on PVE host

eagleman04

New Member
Apr 16, 2024
Hoping someone in the community can help with this. I'm unable to get the xe driver to bind to my Intel B50 on Proxmox 9.1.4 running the 6.17.2-2pve kernel (my system won't boot on the 6.17.2-4 kernel, so I've pinned to this one for now.)

For starters, here is my hardware:

Motherboard: Asrock Rack GENOAD8X-2T/BCM
CPU: AMD Epyc 9355P

I have enabled SR-IOV and Resizable BAR support in the BIOS. All other virtualization functions are enabled by default on the motherboard. I have a few other devices passed through to other VMs without issue. I am even able to pass the B50 through to a Windows VM and an Ubuntu VM, and it works: for example, I can transcode in Emby using the card in an Ubuntu VM. The B50 has been updated to the Q3 2025 firmware (Q4 causes the number of VFs to drop to 2).

The problem is on the Proxmox host itself: the xe driver doesn't seem to be binding. I have set what I believe to be the correct kernel parameters in /etc/kernel/cmdline: "intel_iommu=on amd_iommu=on iommu=pt mem_encrypt=on kvm_amd.sev=1"

However, when I run lspci -n -s e3:00 -v this is what I get:

Code:
e3:00.0 0300: 8086:e212 (prog-if 00 [VGA controller])
        Subsystem: 8086:1114
        Flags: fast devsel, NUMA node 0, IOMMU group 13
        Memory at 8de0c000000 (64-bit, prefetchable) [size=16M]
        Memory at 8d400000000 (64-bit, prefetchable) [size=16G]
        Expansion ROM at f0000000 [disabled] [size=2M]
        Capabilities: [40] Vendor Specific Information: Len=0c <?>
        Capabilities: [70] Express Endpoint, IntMsgNum 0
        Capabilities: [ac] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [d0] Power Management version 3
        Capabilities: [100] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [110] Null
        Capabilities: [200] Address Translation Service (ATS)
        Capabilities: [420] Physical Resizable BAR
        Capabilities: [220] Virtual Resizable BAR
        Capabilities: [320] Single Root I/O Virtualization (SR-IOV)
        Capabilities: [400] Latency Tolerance Reporting
        Kernel modules: xe

There is no "Kernel driver in use" line on the PVE host. I would like to use the card's SR-IOV function to share it with a few VMs, but without the driver binding I'm stuck at this step. I've spent about a week on this with no luck and am really hoping someone can point me in the right direction. Thanks in advance.
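
A quicker way to check the same thing without reading the full lspci output is to look for the `driver` symlink in the device's sysfs directory; as far as I know, lspci reads its in-use driver from that same link. A minimal sketch (the function name `bound_driver` is made up for illustration):

```shell
# Print the name of the driver bound to a PCI device, or "none".
# $1 is the device's sysfs directory, e.g. /sys/bus/pci/devices/0000:e3:00.0
bound_driver() {
    local dev_dir="$1"
    if [ -L "$dev_dir/driver" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/xe
        basename "$(readlink "$dev_dir/driver")"
    else
        echo "none"
    fi
}
```

On the host this would be called as `bound_driver /sys/bus/pci/devices/0000:e3:00.0`; a result of `none` matches the missing line in the lspci output.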


 


Did you enable the VFs? If not, one way is to add an entry to sysfs.conf.

Example:
Code:
devices/pci0000:64/0000:64:00.0/0000:65:00.0/0000:66:01.0/0000:67:00.0/sriov_numvfs = 0
devices/pci0000:64/0000:64:00.0/0000:65:00.0/0000:66:01.0/0000:67:00.0/sriov_numvfs = 4

As usual, your PCI path will differ. As listed, I have mine split into 4 VFs.
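
If it helps, the path for sysfs.conf can be derived from the PCI address instead of being typed by hand. This is only a sketch: `sysfs_conf_lines` is a made-up helper, the third argument exists purely so it can be tried against a fake tree, and it relies on `readlink -f` from GNU coreutils. It prints the same "0, then N" pair of lines as the example above, with paths relative to /sys.

```shell
# Print sysfs.conf lines that reset and then set the VF count for a device.
# $1 = PCI address (e.g. 0000:e3:00.0), $2 = VF count,
# $3 = sysfs root (assumed default /sys; overridable for testing).
sysfs_conf_lines() {
    local bdf="$1" count="$2" root="${3:-/sys}" real rel
    root=$(readlink -f "$root") || return 1
    if [ ! -e "$root/bus/pci/devices/$bdf" ]; then
        echo "no such PCI device: $bdf" >&2
        return 1
    fi
    # Resolve the bus symlink to the full devices/pci... path
    real=$(readlink -f "$root/bus/pci/devices/$bdf") || return 1
    # sysfs.conf entries are written relative to the sysfs root
    rel=${real#"$root"/}
    printf '%s/sriov_numvfs = 0\n' "$rel"
    printf '%s/sriov_numvfs = %s\n' "$rel" "$count"
}
```

For example, `sysfs_conf_lines 0000:e3:00.0 4 >> /etc/sysfs.conf` would append both lines. The setting can only take effect once a driver is actually bound to the card.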
 
Thank you for the response. I tried that but I get a message saying "No such file or directory", even though when I cd into the directory, I do see sriov_numvfs exists. I assumed it wasn't able to "see" it due to the lack of the driver, but I'm also in an area beyond my expertise.
 
You will need to install sysfsutils:
Code:
apt install sysfsutils
 
Thank you again. I gave that a shot but it told me it's already installed.

Code:
sysfsutils is already the newest version (2.1.1-7).

When I open sriov_numvfs in an editor after that, I get:

Code:
 [ Error writing lock file ./.sriov_numvfs.swp: Permission denied ]

I tried to save and write to the file anyway, but I still get:

Code:
[ Error writing sriov_numvfs: No such file or directory ]
 
sriov_numvfs is not a directory; it is a file that you write a number to. For example, if you want 4 VFs:
Code:
echo 4 > sriov_numvfs
 
Unfortunately, I still get the same result:

Code:
root@prox:/sys/bus/pci/devices/0000:e3:00.0# echo 4 > sriov_numvfs
-bash: echo: write error: No such file or directory
 
Thank you. When I run that I get:
Code:
root@prox:~# find /sys/devices -path "*e3:00.0*sriov_numvfs"
/sys/devices/pci0000:e0/0000:e0:01.1/0000:e1:00.0/0000:e2:01.0/0000:e3:00.0/sriov_numvfs
 
It didn't grab all of it. Sorry, this is the output:

Code:
-bash: 4/sys/devices/pci0000:e0/0000:e0:01.1/0000:e1:00.0/0000:e2:01.0/0000:e3:00.0/sriov_numvfs: No such file or directory
 
So your command will be:
Bash:
echo 2 > "/sys/devices/pci0000:e0/0000:e0:01.1/0000:e1:00.0/0000:e2:01.0/0000:e3:00.0/sriov_numvfs"

Edit: changed to "2"; I forgot you have the latest firmware.
 
Still no luck:


Code:
root@prox:~# echo 4 > /sys/devices/pci0000:e0/0000:e0:01.1/0000:e1:00.0/0000:e2:01.0/0000:e3:00.0/sriov_numvfs
-bash: echo: write error: No such file or directory
root@prox:~#
 
Thanks.

Code:
root@prox:~# echo 2 > "/sys/devices/pci0000:e0/0000:e0:01.1/0000:e1:00.0/0000:e2:01.0/0000:e3:00.0/sriov_numvfs"
-bash: echo: write error: No such file or directory

I also tried changing into that directory and writing directly to the sriov_numvfs file. Unfortunately, I still get the same error.

Code:
root@prox:~# cd /sys/devices/pci0000:e0/0000:e0:01.1/0000:e1:00.0/0000:e2:01.0/0000:e3:00.0
root@prox:/sys/devices/pci0000:e0/0000:e0:01.1/0000:e1:00.0/0000:e2:01.0/0000:e3:00.0# echo 2 > sriov_numvfs
-bash: echo: write error: No such file or directory

Just to confirm, when I run ls in that directory, I do see that sriov_numvfs is there.
 
Sure, this is the output

Code:
root@prox:/sys/devices/pci0000:e0/0000:e0:01.1/0000:e1:00.0/0000:e2:01.0/0000:e3:00.0# ls
aer                   config                    device           iommu_group    max_link_speed  power        reset_method  resource2_resize         sriov_numvfs     sriov_vf_total_msix  vendor
ari_enabled           consistent_dma_mask_bits  dma_mask_bits    irq            max_link_width  power_state  resource      resource2_wc             sriov_offset     subsystem
boot_vga              current_link_speed        driver_override  link           modalias        remove       resource0     revision                 sriov_stride     subsystem_device
broken_parity_status  current_link_width        enable           local_cpulist  msi_bus         rescan       resource0_wc  rom                      sriov_totalvfs   subsystem_vendor
class                 d3cold_allowed            iommu            local_cpus     numa_node       reset        resource2     sriov_drivers_autoprobe  sriov_vf_device  uevent
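
The listing above shows `driver_override` but no `driver` entry, which fits with nothing being bound to the card. As far as I understand the kernel's PCI code, a write to `sriov_numvfs` is rejected with ENOENT when no PF driver is bound, and bash prints that as "No such file or directory" even though the file exists. A rough sketch of a guarded write that reports the likely reason instead (`set_numvfs` and its return codes are made up for this example):

```shell
# Guarded write to sriov_numvfs.
# $1 = device sysfs directory, $2 = desired VF count.
set_numvfs() {
    local dev_dir="$1" want="$2" total
    if [ ! -e "$dev_dir/sriov_numvfs" ]; then
        echo "no sriov_numvfs attribute: device does not expose SR-IOV" >&2
        return 1
    fi
    if [ ! -e "$dev_dir/driver" ]; then
        echo "no driver bound to $dev_dir; fix the driver probe first" >&2
        return 2
    fi
    total=$(cat "$dev_dir/sriov_totalvfs")
    if [ "$want" -gt "$total" ]; then
        echo "requested $want VFs but the device supports at most $total" >&2
        return 3
    fi
    # Go back to 0 first, then set the new count (same order as the
    # sysfs.conf example earlier in the thread).
    echo 0 > "$dev_dir/sriov_numvfs" && echo "$want" > "$dev_dir/sriov_numvfs"
}
```

With the card in its current state, `set_numvfs /sys/bus/pci/devices/0000:e3:00.0 4` would be expected to stop at the second check and return 2.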
 
Code:
cat sriov_totalvfs
 
I reverted the B50's firmware back to Q3 2025, so I'm showing 12 VFs

Code:
root@prox:/sys/devices/pci0000:e0/0000:e0:01.1/0000:e1:00.0/0000:e2:01.0/0000:e3:00.0# cat sriov_totalvfs
12
 
I'm kind of running out of options here. Try these:
Bash:
apt install firmware-intel-graphics
cat sriov_numvfs
Also try the latest kernel, 6.17.4-2-pve.
 
I updated to the latest Proxmox kernel and installed firmware-intel-graphics. It no longer tells me the file or directory isn't found; now it just says permission denied. I tried chmod on the file, but even with 777 I get permission denied:

Code:
root@prox:/sys/devices/pci0000:e0/0000:e0:01.1/0000:e1:00.0/0000:e2:01.0/0000:e3:00.0# echo > 4 sriov_numvfs
-bash: 4: Permission denied
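
Worth noting about the command above: in `echo > 4 sriov_numvfs` the shell reads `> 4` as a redirection to a file named `4` and passes `sriov_numvfs` as the text to print. Regular files can't be created inside sysfs, which would match the `-bash: 4: Permission denied` message, so this error probably says nothing about `sriov_numvfs` itself. A small demonstration in a scratch directory (the directory and variable names are only for illustration):

```shell
# Show how the shell parses the two word orders, using a scratch directory.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
: > sriov_numvfs                  # empty stand-in for the sysfs attribute

echo > 4 sriov_numvfs             # creates a file named "4" holding the text "sriov_numvfs"
echo 4 > sriov_numvfs             # writes "4" into sriov_numvfs, as intended

wrong_target=$(cat 4)             # content of the accidental file
right_target=$(cat sriov_numvfs)  # content of the intended file
echo "file 4 contains: $wrong_target"
echo "sriov_numvfs contains: $right_target"
```

The intended form on the host is `echo 4 > sriov_numvfs`, with the number first and the file name after the `>`.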

Looking at dmesg for the device, I do see a few errors. I'm just not sure how to interpret them.

Code:
root@prox:/sys/devices/pci0000:e0/0000:e0:01.1/0000:e1:00.0/0000:e2:01.0/0000:e3:00.0# dmesg | grep e3:00
[    0.903855] pci 0000:e3:00.0: [8086:e212] type 00 class 0x030000 PCIe Endpoint
[    0.903886] pci 0000:e3:00.0: BAR 0 [mem 0x8de0c000000-0x8de0cffffff 64bit pref]
[    0.903889] pci 0000:e3:00.0: BAR 2 [mem 0x8d400000000-0x8d7ffffffff 64bit pref]
[    0.903893] pci 0000:e3:00.0: ROM [mem 0xf0000000-0xf01fffff pref]
[    0.903998] pci 0000:e3:00.0: PME# supported from D0 D3hot
[    0.904035] pci 0000:e3:00.0: VF BAR 0 [mem 0x8de00000000-0x8de00ffffff 64bit pref]
[    0.904037] pci 0000:e3:00.0: VF BAR 0 [mem 0x8de00000000-0x8de0bffffff 64bit pref]: contains BAR 0 for 12 VFs
[    0.904040] pci 0000:e3:00.0: VF BAR 2 [mem 0x8d800000000-0x8d87fffffff 64bit pref]
[    0.904042] pci 0000:e3:00.0: VF BAR 2 [mem 0x8d800000000-0x8ddffffffff 64bit pref]: contains BAR 2 for 12 VFs
[    1.094278] pci 0000:e3:00.0: vgaarb: setting as boot VGA device
[    1.094281] pci 0000:e3:00.0: vgaarb: bridge control possible
[    1.094282] pci 0000:e3:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    1.118863] pci 0000:e3:00.0: Adding to iommu group 13
[   14.143216] xe 0000:e3:00.0: [drm] Running in SR-IOV PF mode
[   14.143674] xe 0000:e3:00.0: [drm] Found battlemage (device ID e212) discrete display version 14.01 stepping B0
[   14.145104] xe 0000:e3:00.0: [drm] VISIBLE VRAM: 0x000008d400000000, 0x0000000400000000
[   14.145133] xe 0000:e3:00.0: [drm] VRAM[0, 0]: Actual physical size 0x0000000400000000, usable size exclude stolen 0x00000003fb000000, CPU accessible size 0x00000003fb000000
[   14.145148] xe 0000:e3:00.0: [drm] VRAM[0, 0]: DPA range: [0x0000000000000000-400000000], io range: [0x000008d400000000-8d7fb000000]
[   14.145165] xe 0000:e3:00.0: [drm] Total VRAM: 0x000008d400000000, 0x0000000400000000
[   14.145174] xe 0000:e3:00.0: [drm] Available VRAM: 0x000008d400000000, 0x00000003fb000000
[   14.204588] xe 0000:e3:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
[   14.220809] xe 0000:e3:00.0: [drm] GT0: Using GuC firmware from xe/bmg_guc_70.bin version 70.40.2
[   14.220935] xe 0000:e3:00.0: [drm] GuC firmware (70.45.2) is recommended, but only (70.40.2) was found in xe/bmg_guc_70.bin
[   14.221047] xe 0000:e3:00.0: [drm] Consider updating your linux-firmware pkg or downloading from https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
[   14.230169] xe 0000:e3:00.0: swiotlb buffer is full (sz: 1048576 bytes), total 32768 (slots), used 214 (slots)
[   14.230289] xe 0000:e3:00.0: [drm] *ERROR* GT0: GuC init failed with -EIO
[   14.230401] xe 0000:e3:00.0: [drm] *ERROR* GT0: Failed to initialize uC (-EIO)
[   14.230547] xe 0000:e3:00.0: probe with driver xe failed with error -5
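
Reading the log above in order: xe starts probing in SR-IOV PF mode, loads GuC firmware 70.40.2 (70.45.2 is recommended), reports "swiotlb buffer is full", then GuC init fails with -EIO and the probe fails with error -5. A failed probe would explain why no driver ends up bound to the card; whether the swiotlb message is the actual cause is not established here. To pull just those lines out of a long dmesg, something like this might help (`xe_failures` is a made-up helper, and the patterns are simply the strings seen in this log):

```shell
# Keep only the xe lines that report a problem for one PCI device.
# Reads kernel log text on stdin; $1 = PCI address, e.g. 0000:e3:00.0
xe_failures() {
    # First keep lines logged by xe for this device, then keep the failures.
    grep -F "xe $1:" | grep -E '\*ERROR\*|swiotlb buffer is full|failed with error'
}
```

Usage would be `dmesg | xe_failures 0000:e3:00.0`.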