[TUTORIAL] Enabling SR-IOV for Intel NIC (X550-T2) on Proxmox 6

Sandbo
Jul 4, 2019

As I struggled through setting up SR-IOV with an Intel NIC (and succeeded, yay!), I decided to do a little write-up for myself and to share. I am not an expert in Linux/KVM/networking, so my implementation might not be the best; I would be glad if you can point out anything to improve in the steps.
In this tutorial I try to outline the essential steps to get SR-IOV up and running by enabling virtual functions (VFs) on your NIC in a PVE system. You can set this up without even connecting to the internet. Obviously, you will need a compatible system to begin with.
First of all, here is my system for reference:
  • Athlon 200GE
  • Asrock X470 Fatality ITX
  • Intel X550-T2
I run a standalone PVE host housing a router VM (ClearOS) and a few NAS/web servers. SR-IOV is not required, but I figured it doesn't hurt to learn how it can be used, so I bothered.
In the following, I assume you have a fresh installation of PVE. You can skip some steps if you have already performed them.

Part 1 - Enable IOMMU

First you want to make sure IOMMU and SR-IOV are enabled in the BIOS. Also, you want to set ACS to Enabled in the BIOS. If you do not have these settings in the BIOS, it is highly likely your system does not support SR-IOV to begin with and little can be done. For reference, some of Asrock's consumer boards have these settings, e.g. the Asrock X470 Fatality ITX.
Here, I will follow PVE's Wiki: https://pve.proxmox.com/wiki/PCI(e)_Passthrough
For editing configuration files, you can use either vi or nano, e.g. nano /etc/default/grub. In nano, Ctrl+O then Enter saves; Ctrl+X exits.

Enabling IOMMU
  • edit /etc/default/grub; on the line GRUB_CMDLINE_LINUX_DEFAULT, add two more flags after quiet (a filled-in example is shown after this list):
  • For Intel CPUs
intel_iommu=on iommu=pt
  • For AMD CPUs
amd_iommu=on iommu=pt

The second flag allows higher performance when using SR-IOV VFs.
  • update grub

update-grub
  • next, add the vfio modules to /etc/modules

vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
  • followed by updating initramfs

update-initramfs -u -k all
  • reboot the PVE host
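
For reference, after this edit the line in my /etc/default/grub reads as follows (AMD flags; use the Intel ones instead if applicable):

GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"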
Verifying IOMMU

After rebooting, you can check whether IOMMU is functioning by reading the kernel messages:

dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
It should display that IOMMU, Directed I/O or Interrupt Remapping is enabled; depending on hardware and kernel, the exact message can vary. On my AMD system, the latter is shown.
At this point you should be able to assign PCI(e) devices to guests. To pass devices as PCIe, you will need to use the q35 machine type instead of the default i440fx; in my case SeaBIOS did not work, so you may also need OVMF. If you haven't already, this is a good point to go ahead and try the passthrough to make sure it works.
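
If you want to test plain passthrough from the command line, something along these lines should work (a sketch; the VM ID 100 and the PCI address are placeholders for your own):

qm set 100 --machine q35 --hostpci0 0000:01:00.0,pcie=1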

Part 2 - Enable SR-IOV

Enabling SR-IOV is surprisingly straightforward (when it works). I am following this guide:
https://software.intel.com/en-us/articles/using-sr-iov-to-share-an-ethernet-port-among-multiple-vms
Checking the name of your NICs
First of all, you need to find out the names of the NICs you want to pass. This can be done in the PVE GUI: click on the node --> System --> Network. In my case the X550-T2's two ports are respectively named enp1s0f0 and enp1s0f1. Yours may be different.
Alternatively, you can see them in the terminal by executing

ip link

With the name of the NICs known, you can now test if SR-IOV can be switched on.
  • execute the below. Replace N with the number of VFs you want, and <name of your NIC> with the NIC name you found in the previous step.
echo N > /sys/class/net/<name of your NIC>/device/sriov_numvfs
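
For example, on my system, creating 4 VFs on the first port would look like this (4 is an arbitrary count; the cat just reads the current value back):

echo 4 > /sys/class/net/enp1s0f0/device/sriov_numvfs
cat /sys/class/net/enp1s0f0/device/sriov_numvfs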

If it works, the echo command returns silently. Otherwise, here are a few possible errors.
Debugging
  1. Device busy.
    Probably a VF has already been assigned for some reason. You can try setting N to 0 first, then to the number you want.
  2. -bash: echo: write error: Cannot allocate memory
    This can be more troublesome and is related to BIOS settings. Check the kernel messages for any debugging tips:
dmesg | grep sriov

For point 2, you may try to solve it by adding flags to the GRUB command line (https://access.redhat.com/solutions/37376); after iommu=pt, add

pci_pt_e820_access=on pci=assign-busses
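
The whole line in /etc/default/grub would then read something like this (AMD example; remember to run update-grub again afterwards):

GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt pci_pt_e820_access=on pci=assign-busses"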

Reboot and try the echo command again. I have not tested the functionality of the VFs with these flags, so it might not work.
Verifying the assignment of virtual functions
Now assuming you have successfully enabled the VFs, we can check if they exist by looking them up.
  • execute
lspci -vv | grep Eth

You should see a lot more Ethernet controllers, with a bunch of them labelled as Virtual Function. Congratulations, it worked!

Part 3 - Setting SR-IOV up for use in PVE

Making the VF persistent

The above assignment was a test to check whether your system can assign VFs. To make the assignment persistent (i.e. survive reboots), you need to make the system create them automatically. We can do it the Debian way using systemd. Alternatively, you could achieve the same with rc.local.
We will first need to set up a service.
  • create a service at this location: /etc/systemd/system/sriov-NIC.service (you can pick a different name)
  • paste the below content into the above service. Again, change N to the number of VFs, and replace <name of your NICx> with yours.
[Unit]
Description=Script to enable SR-IOV on boot

[Service]
Type=oneshot
ExecStart=/usr/bin/bash -c '/usr/bin/echo N > /sys/class/net/<name of your NIC1>/device/sriov_numvfs'
ExecStart=/usr/bin/bash -c '/usr/bin/echo N > /sys/class/net/<name of your NIC2>/device/sriov_numvfs'

[Install]
WantedBy=multi-user.target
  • enable the service
systemctl enable sriov-NIC

It is good to test the script once. First, repeat the echo command in Part 2 with N set to 0. Check by executing "lspci -vv | grep Eth" that the VFs are gone. Then try to start the service and read its status:

systemctl start sriov-NIC
systemctl status sriov-NIC

You should see, for each echo command, a status of 0/SUCCESS. From now on your system will have VFs assigned on boot. To disable the assignment on boot, execute "systemctl disable sriov-NIC".
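
Putting the test together, the sequence looks roughly like this (a sketch; adjust the NIC name, VF count and service name to yours):

echo 0 > /sys/class/net/enp1s0f0/device/sriov_numvfs   # remove the test VFs
lspci -vv | grep Eth                                   # confirm the VFs are gone
systemctl daemon-reload                                # pick up the new unit file
systemctl start sriov-NIC
systemctl status sriov-NIC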

Setting PF to UP on boot

To use the VFs, you will need the PF (physical function) to be brought up first. To have the PFs brought up automatically on boot, we can set it up in the GUI: go to your node, System --> Network, double-click your NIC and check the box Autostart. Alternatively, this can be done by adding a line "auto <name of your nic>" in /etc/network/interfaces.
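
For reference, the relevant stanza in /etc/network/interfaces would then look roughly like this (using my port name; the iface line is usually already present):

auto enp1s0f0
iface enp1s0f0 inet manual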
Block VFs from being loaded by PVE
As we plan to assign the VFs to guests, we can prevent PVE from loading them to avoid any conflicts. First, we need to know which VF driver is being loaded.
  • execute
lspci -nnk | grep -A4 Eth

Look for the line "Kernel driver in use:" and see what is being loaded. With my X550-T2, it is ixgbevf. We can then blacklist this module (a concrete example follows this list).
  • edit /etc/modprobe.d/pve-blacklist.conf and add the following at the bottom

# <your VF module>
blacklist <your VF module>
  • then execute
update-initramfs -u -k all
  • Reboot the PVE host
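
For the X550-T2's ixgbevf driver, the two added lines would be:

# ixgbevf
blacklist ixgbevf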
With this you should have prepared your system for passing VFs to the guests. Happy virtualizing!

Hints on assigning VFs

Here I am sharing some experience in using VFs.
Knowing which port is assigned to which VF
It is crucial to know which VF corresponds to which port when a multi-port NIC is used. This can be roughly deduced by looking at the last digit of the VF's PCI address.
  • execute
lspci | grep Eth

From the list of Ethernet controllers, you can see multiple PCI addresses for the virtual functions. With a two-port NIC, the assignment is such that the 1st port always gets an even last digit, i.e. 01:10.0, 01:10.2, 01:10.4 and so on. The 2nd port always gets an odd last digit, i.e. 01:10.1, 01:10.3, 01:10.5 and so on.
I have not tested this with my four-port NIC, but I guess it will be modular, such that ports 1-4 get assigned to 01:10.0-01:10.3, and then the pattern repeats (someone please correct me if I am wrong).
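
Instead of relying on the numbering pattern, you can also read the mapping directly from sysfs; each PF exposes virtfnN links pointing at its own VFs (shown here with my first port):

ls -l /sys/class/net/enp1s0f0/device/virtfn*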
Checking assignment of the individual VF
Further, you can check whether a VF is really being used by the guest. Taking an Ubuntu guest as an example: once you have brought the VF adapter up in the guest, you should be able to see a MAC address assigned to it on the PVE host (use ip link); otherwise the MACs may all be zeros to begin with. It is always good to verify that the VF is functional.

Fixing the MAC address of VFs

Today I ran into an issue where my router VM failed to initialize after rebooting itself. This also happened after I rebooted the PVE host. After some checks, I realized it was caused by the fact that the VFs' MACs are randomly assigned to the guest VMs. For a router this proves problematic, as the ISP (modem) now sees a different MAC every time it boots and needs to assign a new IP, which can run into trouble if done too often.

This can be easily solved by fixing the MAC for a certain VF. Again, we can do this with the same service we created earlier.
  • edit /etc/systemd/system/sriov-NIC.service with the following:
[Unit]
Description=Script to enable SR-IOV on boot

[Service]
Type=oneshot
# Starting SR-IOV
ExecStart=/usr/bin/bash -c '/usr/bin/echo N > /sys/class/net/<name of your NIC1>/device/sriov_numvfs'
ExecStart=/usr/bin/bash -c '/usr/bin/echo N > /sys/class/net/<name of your NIC2>/device/sriov_numvfs'
# Setting static MAC for VFs
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set <name of your NIC1> vf M mac <mac addr of vf M of NIC1>'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set <name of your NIC2> vf M mac <mac addr of vf M of NIC2>'

[Install]
WantedBy=multi-user.target

As usual, replace N and <name of your NICx> with your configuration. On the two added lines, also replace M with the index of the VF of NICx you want to pin, and <mac addr of vf M of NICx> with your desired MAC address.
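
For example, pinning VF 0 of my first port to a fixed address could look like this (the locally administered MAC here is just a placeholder):

ExecStart=/usr/bin/bash -c '/usr/bin/ip link set enp1s0f0 vf 0 mac 02:00:00:00:00:01'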

After setting the above, you can follow the same method as before to test the service.

Finally, for reference, I have tested a VF with my virtual router and it easily gives the full 1 Gbps link speed. I have yet to test 10 Gbps, but OpenVPN (routing all traffic, tested inside the LAN) gave me 600 Mbps of sustained throughput (80% CPU utilization on the router). It looks like the VFs are working well with minimal performance hit.

The most up-to-date tutorial can be found here:
https://www.reddit.com/r/Proxmox/comments/cm81tc/tutorial_enabling_sriov_for_intel_nic_x550t2_on/
 
Rise old thread, RISE!

Apologies for the necro, but this post comes up highly in google searches for SR-IOV and PVE.

-bash: echo: write error: Cannot allocate memory
This can be more troublesome and related to BIOS setting. You want to check driver message and see if there is any debugging tips:

In case anyone else runs into the cause I had:
A PowerEdge R420 running two CPUs, with the NIC in slot 1.
While everything in dmesg showed VT-d, IOMMU and SR-IOV as good, when I tried to actually create the VFs, it would fail with that error.
It turns out the R420 I had originally shipped with 1 CPU, which means that the slot 1 riser was only a x4 connection electrically. This shows up in dmesg with the Intel driver (ixgbe) I was using stating it was limited in connection speed, but nothing about VFs. YMMV depending on your driver.

Replacing that riser with a full x8 connection allowed me to create VFs without a hitch.
 
Thanks so much for this thread! Saved me a bunch of time, even if I ended up going down a few rabbit holes reading up on SR-IOV... ;-)

Can confirm that @Sandbo 's instructions worked for me to get SR-IOV up and running (with pinned MAC addresses, great catch on that!) on my Intel X299 + X710-DA2 setup w/ Proxmox 6.2 ( proxmox-ve: 6.2-1, running kernel: 5.4.44-2-pve )

Also, FWIW, I'm running the X710-DA2 at PCIe 3.0 but in a x4 (electrical + logical) slot, with no issues.
 
# Setting static MAC for VFs
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set <name of your NIC1> vf M mac <mac addr of vf M of NIC1>'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set <name of your NIC2> vf M mac <mac addr of vf M of NIC2>'
Hello, I am also using the X550-T2 NIC. I can see the VF NICs after rebooting PVE by running the relevant script according to your tutorial, but the command to manually assign the MAC address does not seem to have any effect; the MAC addresses of the VF NICs remain random under PVE. I am using PVE 7 and would appreciate your guidance.
 
So glad this necro was given a second chance! After wasting time running the hamster wheel of the dead sysfsutils (took me too long to think of using systemd, I admit), I found this and was on my way.

This should replace the existing section about SR-IOV.

Works for Chelsio as long as you edit "/sys/class/net/<name of your NIC1>/device/sriov_numvfs" to "/sys/class/net/ethX/device/driver/<bus_id>/sriov_numvfs". Also don't forget to swap out the appropriate drivers to load on start, i.e. cxgb4, and blacklist cxgb4vf.
 
Necro Part 2 - I have played around a bit with this tutorial and many parts worked great. Simply copying the service config gave a few errors; once I learned to format it correctly, it worked!


Code:
[Unit]
Description=Script to enable SR-IOV on boot

[Service]
Type=oneshot
ExecStart=/usr/bin/bash -c '/usr/bin/echo 2 > /sys/class/net/enp1s0f0/device/sriov_numvfs'
ExecStart=/usr/bin/bash -c '/usr/bin/echo 2 > /sys/class/net/enp1s0f1/device/sriov_numvfs'

[Install]
WantedBy=multi-user.target


---

Unfortunately Part 3 didn't work for me. ixgbevf was found (maybe I made a mistake before proceeding with the tutorial), but assigning a VF to a VM failed with the error "Can't reset PCI device". Apparently it has something to do with the VFs sharing the same IOMMU group on the X550?

https://forum.proxmox.com/threads/pci-passthrough-sr-iov-cant-reset-pci-device.65413/

---

I will try to follow the tutorial in one go next time I have time. Either way, thanks @Sandbo for the work!
 
Thank you @Sandbo, a great tutorial that helped me a lot with my X540 NIC.

One thing I would add is that you don't have to create a service if you don't need to assign MAC addresses; you can just add this to a conf file in /etc/modprobe.d/:

options ixgbe max_vfs=8

It's a bit cleaner IMO
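
For completeness, a sketch of that approach (the file name is arbitrary; rebuilding the initramfs makes sure the option is also seen if ixgbe is loaded from it):

echo "options ixgbe max_vfs=8" > /etc/modprobe.d/ixgbe.conf
update-initramfs -u -k all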
 
Hello, and thank you @Sandbo for giving me a starting point. I have expanded the service a bit to use with my Solarflare SFN7022 NICs.
Blacklisting the sfc.ko driver was not an option for me, since the host uses the physical NIC ports as well, plus that would prevent me from creating more VFs on the fly, or partitioning for more PFs.

With Solarflare, detaching the VFs from the host is done with echo 0000:01:00.2 > /sys/bus/pci/devices/0000\:01\:00.2/driver/unbind

The extra backslashes in the script are there to prevent system logs from being spammed with "Ignoring unknown escape sequences".
The example creates 2 VFs per port on a dual-port NIC. Then it sets static MAC addresses for the 4 VFs and detaches them from the host.

I have also added Stop and Reload functions. Stopping the service removes the VFs. You can edit the script if you want the Stop function to recreate the VFs but leave them attached to the host.

Code:
[Unit]
Description=Enable SR-IOV and detach guest VFs from host
Requires=network-online.target
After=network-online.target
[Service]
Type=oneshot
RemainAfterExit=yes
# Create NIC VFs
ExecStart=/usr/bin/bash -c 'echo 2 > /sys/class/net/ens2f0np0/device/sriov_numvfs'
ExecStart=/usr/bin/bash -c 'echo 2 > /sys/class/net/ens2f1np1/device/sriov_numvfs'
# Set static MACs for VFs
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 0 mac XX:XX:XX:XX:XX:XX'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 1 mac XX:XX:XX:XX:XX:XX'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 0 mac XX:XX:XX:XX:XX:XX'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 1 mac XX:XX:XX:XX:XX:XX'
# Detach VFs from host
ExecStart=/usr/bin/bash -c 'echo 0000:01:00.2 > /sys/bus/pci/devices/0000\\:01\\:00.2/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:01:00.3 > /sys/bus/pci/devices/0000\\:01\\:00.3/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:01:00.6 > /sys/bus/pci/devices/0000\\:01\\:00.6/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:01:00.7 > /sys/bus/pci/devices/0000\\:01\\:00.7/driver/unbind'
# List new VFs
ExecStart=/usr/bin/lspci -D -d1924:
# Destroy VFs
ExecStop=/usr/bin/bash -c 'echo 0 > /sys/class/net/ens2f0np0/device/sriov_numvfs'
ExecStop=/usr/bin/bash -c 'echo 0 > /sys/class/net/ens2f1np1/device/sriov_numvfs'
# Reload NIC VFs
ExecReload=/usr/bin/bash -c 'echo 0 > /sys/class/net/ens2f0np0/device/sriov_numvfs'
ExecReload=/usr/bin/bash -c 'echo 0 > /sys/class/net/ens2f1np1/device/sriov_numvfs'
ExecReload=/usr/bin/bash -c 'echo 2 > /sys/class/net/ens2f0np0/device/sriov_numvfs'
ExecReload=/usr/bin/bash -c 'echo 2 > /sys/class/net/ens2f1np1/device/sriov_numvfs'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 0 mac XX:XX:XX:XX:XX:XX'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 1 mac XX:XX:XX:XX:XX:XX'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 0 mac XX:XX:XX:XX:XX:XX'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 1 mac XX:XX:XX:XX:XX:XX'
ExecReload=/usr/bin/bash -c 'echo 0000:01:00.2 > /sys/bus/pci/devices/0000\\:01\\:00.2/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:01:00.3 > /sys/bus/pci/devices/0000\\:01\\:00.3/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:01:00.6 > /sys/bus/pci/devices/0000\\:01\\:00.6/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:01:00.7 > /sys/bus/pci/devices/0000\\:01\\:00.7/driver/unbind'
ExecReload=/usr/bin/lspci -D -d1924:

[Install]
WantedBy=multi-user.target
 
Sorry to resurrect an old thread, but I'm hoping you can help. I just installed OPNsense and finally got it configured correctly by doing PCI passthrough. Stupid me, though, didn't realize that this would prevent any other VM from using the link bonds previously created with those NICs, so right now I'm using a USB-to-2.5G adapter.

Ideally, since it took so much to configure OPNsense correctly, I could keep the MAC addresses that OPNsense currently has for the two Intel X550-T2s.

I've already done the stuff to get IOMMU working and verified. So SR-IOV would be my next step. I want to make sure I have this right. I also don't understand what "echo 2 > ...sriov_numvfs" does.

The following are the names of the NICs and the MAC addresses that appear in OPNsense:

enp1s0f0: b4:96:91:3a:ec:ac
enp1s0f1: b4:96:91:3a:ec:ae
enp2s0f0: 50:7c:6f:33:a7:1a
enp2s0f1: 50:7c:6f:33:a7:18



So, using your example, I would have:
Code:
[Unit]
Description=Enable SR-IOV and detach guest VFs from host
Requires=network-online.target
After=network-online.target
[Service]
Type=oneshot
RemainAfterExit=yes
# Create NIC VFs
ExecStart=/usr/bin/bash -c 'echo 2 > /sys/class/net/enp1s0f0/device/sriov_numvfs'
ExecStart=/usr/bin/bash -c 'echo 2 > /sys/class/net/enp1s0f1/device/sriov_numvfs'
ExecStart=/usr/bin/bash -c 'echo 2 > /sys/class/net/enp2s0f0/device/sriov_numvfs'
ExecStart=/usr/bin/bash -c 'echo 2 > /sys/class/net/enp2s0f1/device/sriov_numvfs'
# Set static MACs for VFs
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set enp1s0f0 vf 0 mac b4:96:91:3a:ec:ac'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set enp1s0f1 vf 1 mac b4:96:91:3a:ec:ae'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set enp2s0f0 vf 0 mac 50:7c:6f:33:a7:1a'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set enp2s0f1 vf 1 mac 50:7c:6f:33:a7:18'
# Detach VFs from host
ExecStart=/usr/bin/bash -c 'echo 0000:01:00.0 > /sys/bus/pci/devices/0000\\:01\\:00.0/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:01:00.1 > /sys/bus/pci/devices/0000\\:01\\:00.1/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:02:00.0 > /sys/bus/pci/devices/0000\\:02\\:00.0/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:02:00.1 > /sys/bus/pci/devices/0000\\:02\\:00.1/driver/unbind'
# List new VFs
ExecStart=/usr/bin/lspci -D -d1924:
# Destroy VFs
ExecStop=/usr/bin/bash -c 'echo 0 > /sys/class/net/enp1s0f0/device/sriov_numvfs'
ExecStop=/usr/bin/bash -c 'echo 0 > /sys/class/net/enp1s0f1/device/sriov_numvfs'
ExecStop=/usr/bin/bash -c 'echo 0 > /sys/class/net/enp2s0f0/device/sriov_numvfs'
ExecStop=/usr/bin/bash -c 'echo 0 > /sys/class/net/enp2s0f1/device/sriov_numvfs'
# Reload NIC VFs
ExecReload=/usr/bin/bash -c 'echo 2 > /sys/class/net/enp1s0f0/device/sriov_numvfs'
ExecReload=/usr/bin/bash -c 'echo 2 > /sys/class/net/enp1s0f1/device/sriov_numvfs'
ExecReload=/usr/bin/bash -c 'echo 2 > /sys/class/net/enp2s0f0/device/sriov_numvfs'
ExecReload=/usr/bin/bash -c 'echo 2 > /sys/class/net/enp2s0f1/device/sriov_numvfs'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set enp1s0f0 vf 0 mac b4:96:91:3a:ec:ac'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set enp1s0f1 vf 1 mac b4:96:91:3a:ec:ae'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set enp2s0f0 vf 0 mac 50:7c:6f:33:a7:1a'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set enp2s0f1 vf 1 mac 50:7c:6f:33:a7:18'
ExecReload=/usr/bin/bash -c 'echo 0000:01:00.0 > /sys/bus/pci/devices/0000\\:01\\:00.0/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:01:00.1 > /sys/bus/pci/devices/0000\\:01\\:00.1/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:02:00.0 > /sys/bus/pci/devices/0000\\:02\\:00.0/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:02:00.1 > /sys/bus/pci/devices/0000\\:02\\:00.1/driver/unbind'
ExecReload=/usr/bin/lspci -D -d1924:

[Install]
WantedBy=multi-user.target

Would that pass those same MAC addresses into the OPNsense VM, or would it most likely break it? Any input on how I can migrate OPNsense from passthrough to SR-IOV would be greatly appreciated.

Thank you!
AJ
 
Hi. "echo 2 > ...sriov_numvfs" creates 2 VF's (virtual functions) for the interface you do it to. Those interfaces is what you then would pass through to OPNsense. Each time you partition a physical interface like that, it will be assigned a random MAC address. So you have to have the script then change the MAC to a static address.

I have since improved on the script so that it runs before Proxmox configures networking, sets firewall rules, etc. at boot.

For now, I'll post my current running script. This script creates 8 VFs on each of the two physical interfaces. It then detaches 7 of those 8 VFs for use with PCIe passthrough in VMs, and leaves one VF per port attached for use by the Proxmox host. Notice the Requires= and After= definitions have changed to make the script execute earlier in the boot process.

Just ask in this thread, and I'll try to help you along.

I'm also hosting a thread regarding this at https://forums.servethehome.com/ind...-and-detaching-solarflare-sfn7x22f-vfs.39701/

Code:
[Unit]
Description=Enable SR-IOV and detach guest VFs from host
Requires=network.target
After=network.target
Before=pve-firewall.service
[Service]
Type=oneshot
RemainAfterExit=yes
# Create NIC VFs
ExecStart=/usr/bin/bash -c 'echo 8 > /sys/class/net/ens2f0np0/device/sriov_numvfs'
ExecStart=/usr/bin/bash -c 'echo 8 > /sys/class/net/ens2f1np1/device/sriov_numvfs'
# Set static MACs for VFs
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 0 mac 76:9e:17:83:39:e5'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 1 mac 46:2c:6d:24:6b:1b'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 2 mac 3e:47:48:12:ed:94'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 3 mac be:e3:6a:f3:8f:ac'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 4 mac 62:8f:3d:bb:02:08'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 5 mac ae:91:57:b9:14:7f'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 6 mac 5a:c2:08:a9:68:a7'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 7 mac b2:f0:18:af:cb:c5'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 0 mac 16:47:7c:a8:95:98'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 1 mac a6:c7:c5:7f:9c:22'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 2 mac b6:0f:45:34:5e:19'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 3 mac 2a:f7:37:84:31:30'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 4 mac 8a:fa:f8:c5:0b:93'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 5 mac b2:f5:d5:2f:79:06'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 6 mac c2:92:f5:fa:32:20'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 7 mac 2e:fb:29:1e:48:31'
# Detach VFs from host
ExecStart=/usr/bin/bash -c 'echo 0000:01:00.3 > /sys/bus/pci/devices/0000\\:01\\:00.3/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:01:00.4 > /sys/bus/pci/devices/0000\\:01\\:00.4/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:01:00.5 > /sys/bus/pci/devices/0000\\:01\\:00.5/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:01:00.6 > /sys/bus/pci/devices/0000\\:01\\:00.6/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:01:00.7 > /sys/bus/pci/devices/0000\\:01\\:00.7/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:01:01.0 > /sys/bus/pci/devices/0000\\:01\\:01.0/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:01:01.1 > /sys/bus/pci/devices/0000\\:01\\:01.1/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:01:01.3 > /sys/bus/pci/devices/0000\\:01\\:01.3/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:01:01.4 > /sys/bus/pci/devices/0000\\:01\\:01.4/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:01:01.5 > /sys/bus/pci/devices/0000\\:01\\:01.5/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:01:01.6 > /sys/bus/pci/devices/0000\\:01\\:01.6/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:01:01.7 > /sys/bus/pci/devices/0000\\:01\\:01.7/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:01:02.0 > /sys/bus/pci/devices/0000\\:01\\:02.0/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:01:02.1 > /sys/bus/pci/devices/0000\\:01\\:02.1/driver/unbind'
# List new VFs
ExecStart=/usr/bin/lspci -D -d1924:
# Destroy VFs
ExecStop=/usr/bin/bash -c 'echo 0 > /sys/class/net/ens2f0np0/device/sriov_numvfs'
ExecStop=/usr/bin/bash -c 'echo 0 > /sys/class/net/ens2f1np1/device/sriov_numvfs'
# Reload NIC VFs
ExecReload=/usr/bin/bash -c 'echo 0 > /sys/class/net/ens2f0np0/device/sriov_numvfs'
ExecReload=/usr/bin/bash -c 'echo 0 > /sys/class/net/ens2f1np1/device/sriov_numvfs'
ExecReload=/usr/bin/bash -c 'echo 8 > /sys/class/net/ens2f0np0/device/sriov_numvfs'
ExecReload=/usr/bin/bash -c 'echo 8 > /sys/class/net/ens2f1np1/device/sriov_numvfs'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 0 mac 76:9e:17:83:39:e5'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 1 mac 46:2c:6d:24:6b:1b'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 2 mac 3e:47:48:12:ed:94'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 3 mac be:e3:6a:f3:8f:ac'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 4 mac 62:8f:3d:bb:02:08'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 5 mac ae:91:57:b9:14:7f'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 6 mac 5a:c2:08:a9:68:a7'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f0np0 vf 7 mac b2:f0:18:af:cb:c5'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 0 mac 16:47:7c:a8:95:98'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 1 mac a6:c7:c5:7f:9c:22'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 2 mac b6:0f:45:34:5e:19'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 3 mac 2a:f7:37:84:31:30'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 4 mac 8a:fa:f8:c5:0b:93'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 5 mac b2:f5:d5:2f:79:06'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 6 mac c2:92:f5:fa:32:20'
ExecReload=/usr/bin/bash -c '/usr/bin/ip link set ens2f1np1 vf 7 mac 2e:fb:29:1e:48:31'
ExecReload=/usr/bin/bash -c 'echo 0000:01:00.3 > /sys/bus/pci/devices/0000\\:01\\:00.3/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:01:00.4 > /sys/bus/pci/devices/0000\\:01\\:00.4/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:01:00.5 > /sys/bus/pci/devices/0000\\:01\\:00.5/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:01:00.6 > /sys/bus/pci/devices/0000\\:01\\:00.6/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:01:00.7 > /sys/bus/pci/devices/0000\\:01\\:00.7/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:01:01.0 > /sys/bus/pci/devices/0000\\:01\\:01.0/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:01:01.1 > /sys/bus/pci/devices/0000\\:01\\:01.1/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:01:01.3 > /sys/bus/pci/devices/0000\\:01\\:01.3/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:01:01.4 > /sys/bus/pci/devices/0000\\:01\\:01.4/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:01:01.5 > /sys/bus/pci/devices/0000\\:01\\:01.5/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:01:01.6 > /sys/bus/pci/devices/0000\\:01\\:01.6/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:01:01.7 > /sys/bus/pci/devices/0000\\:01\\:01.7/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:01:02.0 > /sys/bus/pci/devices/0000\\:01\\:02.0/driver/unbind'
ExecReload=/usr/bin/bash -c 'echo 0000:01:02.1 > /sys/bus/pci/devices/0000\\:01\\:02.1/driver/unbind'
ExecReload=/usr/bin/lspci -D -d1924:
[Install]
WantedBy=multi-user.target
 
Ideally, since it took so much to configure OPNsense correctly, I could keep the MAC addresses that OPNsense currently has for the two Intel X550-T2s.
If your previous OPNsense used the physical MAC addresses of your interfaces, and you now want to use those MAC addresses on VFs instead, then what you would have to do is first save the original addresses and then change the MAC addresses of the physical interfaces. After that you could use the script to partition your PFs (physical functions) and assign those MACs to the VFs. That way your OPNsense VM, using the VFs via passthrough, would have the same original MACs.
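
A rough sketch of that idea for the first port from your list (untested; the 02:... address is just a placeholder for the new PF MAC, and some drivers may want the link down before changing it):

Code:
ip link set dev enp1s0f0 address 02:00:00:00:00:10
ip link set enp1s0f0 vf 0 mac b4:96:91:3a:ec:ac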
 
edit: using Requires=network.target might be a bit wrong here; I'm still working on perfecting the script. The example above does start the service at the correct time, but reading https://systemd.io/NETWORK_ONLINE/ says it really should be Before=network-pre.target and Wants=network-pre.target.

Anyway, the script I posted does work, and it allows leaving some VFs for the Proxmox host to use while detaching the others. I wanted to have 1 VF per port for PVE to use for things like an iSCSI target, which I don't want to run behind a bridge.
 

So, I turned off OPNsense and removed the PCI passthrough so that Proxmox could see the network adapters and show the right kernel driver in use. I have the same X550-T2, and when I run
Code:
lspci -nnk | grep -A4 Eth
it shows the kernel driver in use is ixgbe, not ixgbevf, even though that driver is installed and modinfo ixgbevf reports information.

Either way, when I execute
Code:
echo 2 > /sys/class/net/enp1s0f0/device/sriov_numvfs
on either that adapter, or enp1s0f1, enp2s0f0, enp2s0f1, I get:

Code:
-bash: /sys/class/net/enp2s0f1/device/sriov_numvfs: Permission denied

Even though I'm logged into Proxmox as root. That file actually doesn't exist in that directory.

I think the problem right now is that it's not utilizing the right driver.

Here's the output that shows IOMMU enabled; it also shows what may be some errors (I don't know), but maybe that will help:

Code:
dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
[    0.004587] ACPI: DMAR 0x000000006175B000 000088 (v02 INTEL  Dell Inc 00000002      01000013)
[    0.004621] ACPI: Reserving DMAR table memory at [mem 0x6175b000-0x6175b087]
[    0.135411] DMAR: IOMMU enabled
[    0.313410] DMAR: Host address width 39
[    0.313410] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[    0.313414] DMAR: dmar0: reg_base_addr fed90000 ver 4:0 cap 1c0000c40660462 ecap 29a00f0505e
[    0.313415] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[    0.313418] DMAR: dmar1: reg_base_addr fed91000 ver 5:0 cap d2008c40660462 ecap f050da
[    0.313419] DMAR: RMRR base: 0x0000006c000000 end: 0x000000703fffff
[    0.313421] DMAR-IR: IOAPIC id 2 under DRHD base  0xfed91000 IOMMU 1
[    0.313421] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[    0.313422] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    0.314931] DMAR-IR: Enabled IRQ remapping in x2apic mode
[    0.797459] pci 0000:00:02.0: DMAR: Skip IOMMU disabling for graphics
[    1.146088] DMAR: No ATSR found
[    1.146089] DMAR: No SATC found
[    1.146089] DMAR: IOMMU feature fl1gp_support inconsistent
[    1.146090] DMAR: IOMMU feature pgsel_inv inconsistent
[    1.146090] DMAR: IOMMU feature nwfs inconsistent
[    1.146091] DMAR: IOMMU feature dit inconsistent
[    1.146091] DMAR: IOMMU feature sc_support inconsistent
[    1.146091] DMAR: IOMMU feature dev_iotlb_support inconsistent
[    1.146092] DMAR: dmar0: Using Queued invalidation
[    1.146093] DMAR: dmar1: Using Queued invalidation
[    1.146892] DMAR: Intel(R) Virtualization Technology for Directed I/O
Here's another output for the list of ethernet adapters:

Code:
lspci -nnk | grep -A4 Eth
0000:00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (17) I219-LM [8086:1a1c] (rev 11)
        Subsystem: Dell Ethernet Connection (17) I219-LM [1028:0ac1]
        Kernel driver in use: e1000e
        Kernel modules: e1000e
0000:01:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10G X550T [8086:1563] (rev 01)
        Subsystem: Intel Corporation Ethernet 10G 2P X550-t Adapter [8086:001d]
        Kernel driver in use: ixgbe
        Kernel modules: ixgbe
0000:01:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10G X550T [8086:1563] (rev 01)
        Subsystem: Intel Corporation Ethernet 10G 2P X550-t Adapter [8086:001d]
        Kernel driver in use: ixgbe
        Kernel modules: ixgbe
0000:02:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10G X550T [8086:1563] (rev 01)
        Subsystem: Intel Corporation Ethernet 10G 2P X550-t Adapter [8086:001d]
        Kernel driver in use: ixgbe
        Kernel modules: ixgbe
0000:02:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10G X550T [8086:1563] (rev 01)
        Subsystem: Intel Corporation Ethernet 10G 2P X550-t Adapter [8086:001d]
        Kernel driver in use: ixgbe
        Kernel modules: ixgbe
10000:e0:06.0 PCI bridge [0604]: Intel Corporation Device [8086:464d] (rev 02)
        Kernel driver in use: pcieport

And ip link:

Code:
ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp0s31f6: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000
    link/ether 08:92:04:ea:52:c6 brd ff:ff:ff:ff:ff:ff
3: enp1s0f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
    link/ether b4:96:91:3a:ec:ac brd ff:ff:ff:ff:ff:ff
4: enp1s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether b4:96:91:3a:ec:ae brd ff:ff:ff:ff:ff:ff
5: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 50:7c:6f:33:a7:18 brd ff:ff:ff:ff:ff:ff
6: enp2s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 50:7c:6f:33:a7:1a brd ff:ff:ff:ff:ff:ff
7: enx3c8cf8f9a7aa: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmbr0 state UP mode DEFAULT group default qlen 1000
    link/ether 3c:8c:f8:f9:a7:aa brd ff:ff:ff:ff:ff:ff
8: enxfc34971ebfec: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master vmbr5 state DOWN mode DEFAULT group default qlen 1000
    link/ether fc:34:97:1e:bf:ec brd ff:ff:ff:ff:ff:ff
9: vmbr5: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether fc:34:97:1e:bf:ec brd ff:ff:ff:ff:ff:ff
10: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 3c:8c:f8:f9:a7:aa brd ff:ff:ff:ff:ff:ff

So, I'm kind of at a loss as to what I need to do. If it's using the wrong drivers, how do I tell it to use ixgbevf?
 
My script of course is for Solarflare cards, and with those I first have to use the Solarflare sfboot utility to set the card into SR-IOV / full-virtualisation mode, and also define the max number of VFs per port.

So I'm not familiar with the X550-T2, but the first thing you should check is whether there is some kind of Intel binary used to set the card into the correct configuration. By default SR-IOV might be disabled. I don't have time now, but later I can do some reading regarding that as well.

Also make sure you have intel_iommu=on iommu=pt enabled in the kernel command line (check with cat /proc/cmdline).

edit: for reference, here's my dmesg output:

Code:
# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
[    0.010791] ACPI: DMAR 0x000000007B86D658 0000F0 (v01 ALASKA A M I    00000001 INTL 20091013)
[    0.010844] ACPI: Reserving DMAR table memory at [mem 0x7b86d658-0x7b86d747]
[    0.336962] DMAR: IOMMU enabled
[    0.798299] DMAR: Host address width 46
[    0.798302] DMAR: DRHD base: 0x000000fbffc000 flags: 0x0
[    0.798310] DMAR: dmar0: reg_base_addr fbffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[    0.798316] DMAR: DRHD base: 0x000000c7ffc000 flags: 0x1
[    0.798322] DMAR: dmar1: reg_base_addr c7ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[    0.798326] DMAR: RMRR base: 0x0000007db88000 end: 0x0000007db97fff
[    0.798330] DMAR: ATSR flags: 0x0
[    0.798333] DMAR: RHSA base: 0x000000c7ffc000 proximity domain: 0x0
[    0.798336] DMAR: RHSA base: 0x000000fbffc000 proximity domain: 0x1
[    0.798341] DMAR-IR: IOAPIC id 3 under DRHD base  0xfbffc000 IOMMU 0
[    0.798344] DMAR-IR: IOAPIC id 1 under DRHD base  0xc7ffc000 IOMMU 1
[    0.798347] DMAR-IR: IOAPIC id 2 under DRHD base  0xc7ffc000 IOMMU 1
[    0.798350] DMAR-IR: HPET id 0 under DRHD base 0xc7ffc000
[    0.798353] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    0.799120] DMAR-IR: Enabled IRQ remapping in x2apic mode
[    3.558945] DMAR: No SATC found
[    3.558949] DMAR: dmar1: Using Queued invalidation
[    3.575762] DMAR: Intel(R) Virtualization Technology for Directed I/O
 
Sorry to necro this one, but I'm stuck getting traffic through the VF interfaces. I have managed to get the VF interfaces to show up in PVE, and I've passed a VF through to a TrueNAS box and assigned IP addresses to the interface in TrueNAS, but I'm not getting any traffic through.

Do I need to assign a VF to the PVE host as well, or can I run a Linux bridge on the parent interface?
 
Sorry, only noticed your reply now. In the VM, make sure that "All Functions" is unchecked, PCI-Express is checked, and the machine type is q35 (just from memory).

On my Solarflare cards, the NIC processor acts as a hardware L2 switch, and I can communicate between the VFs fine. Remember that any parent or VF interface you assign to the PVE host will be behind the PVE firewall. I guess the best way to check is to create two VMs, each with PCIe passthrough to a VF, and test the connection between those VFs directly, i.e. set the IP of one to 10.0.0.0 and the other to 10.0.0.1, and ping, etc.
 
So, apologies for asking what is likely a pretty silly question, but what would I use SR-IOV for?

I have NICs that support it, and I use them on Proxmox boxes using the default vmbr0 bridge to the hardware network interface.

Would I be better served by enabling SR-IOV and forwarding these virtual network devices to each guest? What is the benefit? Latency/performance? CPU load?

Appreciate any input. I'm trying to learn here.
 
