Hello,
I'm curious if any of you have experience running Juniper vSRX in a Proxmox environment with SRX clustering enabled while also passing a physical NIC through to the vSRX.
In a typical setup where all NICs are vNICs, vSRX clustering works. Inside vSRX, two of the vNICs become fxp0 and em0 (the control link), and the next two vNICs I reserved to become the fabric links, with MTU 9000. Without the physical NIC (Mellanox ConnectX-4/5) passed through, the vSRX VMs, which run on different PVE hosts, form the cluster properly, with both the control link and the fabric links working. If I run tcpdump on the PVE tap devices that back the fabric links, I can see traffic.
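For reference, the fabric-link vNICs in the VM config look roughly like this (the MACs and bridge names below are placeholders, not my exact values):

net2: virtio=BC:24:11:00:00:02,bridge=vmbr1,mtu=9000
net3: virtio=BC:24:11:00:00:03,bridge=vmbr2,mtu=9000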
The moment I attach the PCI device with the Mellanox ConnectX-4/5, the fabric links stop working. With tcpdump on the PVE host I cannot see anything being sent by the VM on the tap device. The two vNICs for fxp0 and em0 still work, but the ones that become the fabric links do not.
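For clarity, this is how I am checking on the PVE host, with VMID 100 and net index 2 only as examples (the tap devices follow the tap<vmid>i<index> naming):

tcpdump -eni tap100i2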
This is on AMD EPYC, where a normal (non-clustered) vSRX with the Mellanox NIC passed through works very well. So this is probably an issue with the vSRX drivers not activating the interfaces properly, or maybe there are some tricks for doing the PCI passthrough that would help vSRX deal with this better.
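On the vSRX side, the standard Junos commands I use to look at the cluster and fabric state are roughly:

show chassis cluster status
show chassis cluster interfaces
show chassis cluster statistics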
I followed https://pve.proxmox.com/wiki/PCI(e)_Passthrough for AMD.
I don't think that iommu=pt was properly added to the kernel boot parameters, but I did the rest, including adding the modules to /etc/modules and rebuilding the initramfs. The PCI passthrough feature works overall.
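Roughly what I have on the hosts (the PCI address and hostpci slot below are examples, not my exact ones; the iommu=pt part is exactly what I am not sure got applied):

# kernel cmdline (GRUB_CMDLINE_LINUX_DEFAULT or /etc/kernel/cmdline); AMD IOMMU is on by default
... iommu=pt

# /etc/modules (vfio_virqfd is not needed on newer kernels)
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

# then rebuild the initramfs and refresh the bootloader
update-initramfs -u -k all
# plus update-grub or proxmox-boot-tool refresh, depending on the bootloader

# VM config entry for the Mellanox (pcie=1 needs the q35 machine type)
hostpci0: 0000:41:00.0,pcie=1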
Another detail: initially both PVE hosts were on version 8, but I downgraded one of them to 7.4. Same behavior.