Hello everyone,
I've successfully configured Open vSwitch + DPDK on PVE 7.0.
In my setup, a VM with 2 cores at 1.8 GHz can send 64-byte packets to the wire through an Open vSwitch bridge at about 5 Mpps over 10 Gb Ethernet.
I'm writing this to share my steps. Hopefully this post can help with PVE development so that some day Open vSwitch + DPDK will be officially supported by Proxmox.
My environment:
- Intel(R) Xeon(R) CPU E5-2630L v4 @ 1.80GHz
- 2 x 32GB DDR4 ECC registered RAM
- Proxmox VE 7.0-11
- Network adapter: Intel Ethernet Converged X520-DA2 10Gigabit Ethernet Card
- Guest: 2 CPU cores, 2 GB RAM, running Debian bullseye
## Enable official Debian bullseye repos on your PVE host
Add the following lines to `/etc/apt/sources.list`:
```
deb http://deb.debian.org/debian bullseye main
deb-src http://deb.debian.org/debian bullseye main
deb http://deb.debian.org/debian-security/ bullseye-security main
deb-src http://deb.debian.org/debian-security/ bullseye-security main
deb http://deb.debian.org/debian bullseye-updates main
deb-src http://deb.debian.org/debian bullseye-updates main
```
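After saving the file, refresh the package index so the new repos take effect:
```sh
apt update
```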
## Enable IOMMU and Hugepages
I use the `vfio-pci` driver for DPDK, which relies on the IOMMU, so IOMMU and SR-IOV support must be enabled on the host. Other drivers like `igb_uio` or `uio_pci_generic` don't need the IOMMU. For more information about Linux drivers for DPDK, see https://doc.dpdk.org/guides/linux_gsg/linux_drivers.html.
DPDK also requires hugepages. I reserved 4 × 1 GB hugepages on the host.
- Make sure IOMMU and SR-IOV are enabled in BIOS.
- Edit `/etc/default/grub` and append `intel_iommu=on iommu=pt default_hugepagesz=1G hugepagesz=1G hugepages=4` to the `GRUB_CMDLINE_LINUX_DEFAULT` line.
- Apply the change by running `grub-mkconfig -o /boot/grub/grub.cfg`.
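For reference, the resulting line in `/etc/default/grub` should look roughly like this (any options already present on your system, e.g. `quiet`, stay in place):
```
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt default_hugepagesz=1G hugepagesz=1G hugepages=4"
```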
## Auto-load `vfio-pci` on boot
Run `echo vfio-pci >> /etc/modules-load.d/modules.conf`.
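The file should now contain a `vfio-pci` line, which you can confirm before rebooting:
```sh
# expect a line reading "vfio-pci"
cat /etc/modules-load.d/modules.conf
```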
## Reboot and verification
- After rebooting, check whether the IOMMU is active by reading the kernel messages:
```sh
dmesg | grep -e DMAR -e IOMMU
```
- Check if hugepages are reserved:
```sh
apt install libhugetlbfs-bin
hugeadm --explain
```
- Check if `vfio-pci` is loaded:
```sh
lsmod | grep vfio_pci
```
- Mount hugepages:
```sh
mkdir -p /run/hugepages/kvm/1048576kB
mount -t hugetlbfs -o pagesize=1G none /run/hugepages/kvm/1048576kB
```
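To confirm the mount is in place, one quick check is to list hugetlbfs mounts:
```sh
# should list the /run/hugepages/kvm/1048576kB mount
findmnt -t hugetlbfs
```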
## Install Open vSwitch and DPDK
```sh
apt install dpdk openvswitch-switch-dpdk
update-alternatives --set ovs-vswitchd /usr/lib/openvswitch-switch-dpdk/ovs-vswitchd-dpdk
```
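To double-check that the DPDK-enabled binary is the one actually in use, you can inspect the selected alternative and the reported version (DPDK-enabled builds also print a DPDK version string):
```sh
# should point at /usr/lib/openvswitch-switch-dpdk/ovs-vswitchd-dpdk
update-alternatives --display ovs-vswitchd
ovs-vswitchd --version
```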
## Configure Open vSwitch
For more information about those options, see https://docs.openvswitch.org/en/latest/intro/install/dpdk/#setup-ovs.
```sh
# Enable DPDK support
ovs-vsctl set Open_vSwitch . "other_config:dpdk-init=true"
# run the DPDK lcore thread on one core only (CPU 0, bitmask 0x1)
ovs-vsctl set Open_vSwitch . "other_config:dpdk-lcore-mask=0x1"
# pre-allocate 2048 MB of hugepage memory for DPDK
ovs-vsctl set Open_vSwitch . "other_config:dpdk-socket-mem=2048"
# enable vhost-user-client IOMMU support
ovs-vsctl set Open_vSwitch . "other_config:vhost-iommu-support=true"
# restart OVS
systemctl restart ovs-vswitchd.service
```
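Once the daemon is back up, OVS itself can report whether DPDK initialized successfully (these columns are present in the OVS release shipped with bullseye):
```sh
# expect "true" and the DPDK version string
ovs-vsctl get Open_vSwitch . dpdk_initialized
ovs-vsctl get Open_vSwitch . dpdk_version
```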
## Bind network adapter to DPDK
- List all available network adapters on the host:
```sh
dpdk-devbind.py -s
```
- Bind network adapter `0000:02:00.1` to DPDK:
```sh
dpdk-devbind.py --bind=vfio-pci 0000:02:00.1
```
- Check network adapter status again:
```sh
dpdk-devbind.py -s
```
## Create an OVS bridge and port
- Create an OVS bridge called `br0`:
```sh
ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
```
- Add network adapter `0000:02:00.1` to bridge `br0` as port `dpdk-p0`:
```sh
ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:02:00.1
```
- Check OVS bridge and port status:
```sh
ovs-vsctl show
```
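For more detail than `ovs-vsctl show` provides, the interface record of the DPDK port exposes link state, MTU and any error string, which helps when the bind or `dpdk-devargs` value is wrong:
```sh
ovs-vsctl list interface dpdk-p0
```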
## Add an OVS port to a KVM guest
- Create a vhost-user-client port called `vhost-user-1` on the OVS bridge `br0`, backed by the socket `/var/run/vhostuserclient/vhost-user-client-1`:
```sh
# create a directory for vhost-user sockets
mkdir -p /var/run/vhostuserclient
# add a vhost-user-client port
ovs-vsctl add-port br0 vhost-user-1 -- set Interface vhost-user-1 type=dpdkvhostuserclient "options:vhost-server-path=/var/run/vhostuserclient/vhost-user-client-1"
```
- Ensure the VM instance is powered off.
- Change the machine type to `q35`.
- Enable NUMA for the guest CPU.
- Edit `/etc/pve/qemu-server/<ID>.conf` and add the following lines. They attach the port to the guest with MAC address `00:00:00:00:00:01`, enable hugepages, and (optionally) enable vIOMMU in the guest:
```
args: -machine q35+pve0,kernel_irqchip=split -device intel-iommu,intremap=on,caching-mode=on -chardev socket,id=char1,path=/var/run/vhostuserclient/vhost-user-client-1,server=on -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce=on -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1
hugepages: 1024
```
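Inside the guest, the vhost-user port appears as a regular virtio NIC. For a quick test I configure it by hand; the interface name `ens4` and the address below are just examples and will differ on your system:
```sh
# run inside the guest; replace ens4 and the address with your own values
ip link set ens4 up
ip addr add 192.168.100.2/24 dev ens4
```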
## Other notes
The OVS bridge and port configs don't survive a reboot because the systemd service `/lib/systemd/system/pvenetcommit.service` removes the OVS DB file `/etc/openvswitch/conf.db` on boot.
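As a stopgap, one option is to re-run the bind and bridge setup after each boot, e.g. from a small script hooked into the host's boot sequence. This is an untested sketch that simply repeats the commands from above with `--may-exist` added:
```sh
#!/bin/sh
# rebind the NIC and recreate the DPDK bridge/ports after OVS has started
dpdk-devbind.py --bind=vfio-pci 0000:02:00.1
ovs-vsctl --may-exist add-br br0 -- set bridge br0 datapath_type=netdev
ovs-vsctl --may-exist add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:02:00.1
ovs-vsctl --may-exist add-port br0 vhost-user-1 -- set Interface vhost-user-1 type=dpdkvhostuserclient "options:vhost-server-path=/var/run/vhostuserclient/vhost-user-client-1"
```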