We have been working for a long time on the deployment of a cluster based on PVE and the new SDN feature. We have had the chance to work with a BGP expert and also with some of the people involved in the development of the project.
This is a bug report following the move of our cluster to Proxmox v.7.
We had a working cluster with BGP level 2 and 3 routes being correctly published (FRR v.7.4).
The move to FRR 7.5.1, which ships with the new PVE 7, broke compatibility with FRR 7.4.
In our setup, the routers run FRR 7.4 on Debian 9, and they were publishing routes and working perfectly.
The move to version 7 broke things, and routes could no longer be published.
After pulling our hair out trying to figure out what was wrong with our settings, we decided to downgrade to version 7.4 to see whether it would help and, bingo, things instantly went back to normal.
Another problem we found during our tests: while VMs correctly publish their routes in the FRR instance running in the cluster, this is not the case for LXC containers.
From the EVPN point of view, the container must appear in the ARP table for a type 2 route to be created. Either the entry exists because the container has initiated traffic, or the Proxmox SDN module must create it.
In the first case, this is the expected behavior. You have to force the container to initiate traffic.
In the second case, there may be bugs when the container has dynamically created its ARP entry by sending traffic and Proxmox could not set up its own entry.
A few years ago, this was a problem, but with a recent kernel and a recent FRR, I don't think the problem should exist.
I think it's more likely that Proxmox simply doesn't create the entry (either because it does not do so, or because it hit a problem, in which case there might be something in the logs).
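For anyone who wants to check this on a node, here is a minimal Python sketch that parses the output of `ip neigh show` and reports whether a given container IP has a resolved entry in the neighbor (ARP) table. The bridge name `vnet1`, the sample addresses, and the helper names are assumptions made up for illustration; they are not taken from our setup or from the Proxmox SDN code.

```python
import subprocess

# NUD states as printed by iproute2's `ip neigh show`.
KNOWN_STATES = {"PERMANENT", "NOARP", "REACHABLE", "STALE", "NONE",
                "INCOMPLETE", "DELAY", "PROBE", "FAILED"}
# States in which the kernel holds a usable MAC address for the IP.
RESOLVED_STATES = {"PERMANENT", "REACHABLE", "STALE", "DELAY", "PROBE"}


def parse_neigh(output):
    """Parse `ip neigh show` text into {ip: {"dev", "mac", "state"}}."""
    table = {}
    for line in output.splitlines():
        fields = line.split()
        if not fields:
            continue
        entry = {"dev": None, "mac": None, "state": None}
        for i, tok in enumerate(fields):
            if tok == "dev" and i + 1 < len(fields):
                entry["dev"] = fields[i + 1]
            elif tok == "lladdr" and i + 1 < len(fields):
                entry["mac"] = fields[i + 1]
            elif tok in KNOWN_STATES:
                entry["state"] = tok
        table[fields[0]] = entry
    return table


def has_resolved_entry(table, ip):
    """True if `ip` has a MAC address and a resolved state in the table."""
    entry = table.get(ip)
    return bool(entry and entry["mac"] and entry["state"] in RESOLVED_STATES)


def read_neigh(dev):
    """Run `ip -4 neigh show dev <dev>` on the host and return its output."""
    result = subprocess.run(["ip", "-4", "neigh", "show", "dev", dev],
                            capture_output=True, text=True, check=True)
    return result.stdout


# Hypothetical sample: one resolved entry, one that never resolved.
SAMPLE = (
    "10.0.1.10 dev vnet1 lladdr aa:bb:cc:dd:ee:01 REACHABLE\n"
    "10.0.1.11 dev vnet1 INCOMPLETE\n"
)

if __name__ == "__main__":
    table = parse_neigh(SAMPLE)  # or parse_neigh(read_neigh("vnet1")) on a node
    for ip in ("10.0.1.10", "10.0.1.11"):
        print(ip, "resolved" if has_resolved_entry(table, ip) else "missing")
```

If the container's IP is missing or unresolved here, forcing the container to send some traffic (a ping to its gateway, for example) and running the check again should show whether the entry only appears once the container has initiated traffic.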
Both issues have been reported here: https://bugzilla.proxmox.com/show_bug.cgi?id=3570 and https://bugzilla.proxmox.com/show_bug.cgi?id=3571
P.S. You should consider adding an "SDN" tag to the bug tracker. I have filed these under "kernel", but that is not right.