Hi!
A cluster of 3 PVE nodes is deployed; the hardware is identical, every node uses a Mellanox network card, and DRBD is used as storage. After the update, all VMs running FRR stopped receiving routes; there were no problems on PVE 8.4. A number of old VMs (CentOS 6) run the old Quagga routing daemon, and routes are still sent and received correctly there.
Update details:
We updated to PVE 9, but because we use Mellanox network cards (there is no module for kernel 6.14 in the official repository; that kernel is not yet supported by Mellanox) and DRBD (also no module for kernel 6.14), we could not switch to kernel 6.14. So on PVE 9 we currently run kernel 6.8.12-13-pve. We do not use SDN; all VMs talk to the physical servers via VLANs over the bridge.
Network settings on one of the nodes (standard settings):
1) Used for vmbr0 — actual cluster traffic and VM connectivity to the outside world:
Code:
iface ens2f0np0 inet manual
mtu 1504
pre-up ethtool -K $IFACE rx-vlan-filter off
2) Bridge for the VMs:
Code:
auto vmbr0
iface vmbr0 inet static
address
gateway
bridge-ports ens2f0np0
bridge-stp off
bridge-fd 0
bridge-vlan-aware yes
bridge-vids 2-4094
bridge-mcsnoop 0
mtu 1504
post-up devlink dev eswitch set pci/0000:81:00.0 mode switchdev
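Since OSPF hellos are sent to multicast 224.0.0.5, the bridge state on the host is worth checking after the upgrade — a sketch of the checks (interface names taken from the configs above; the tap device name `tap1580i1` is my guess based on the usual `tap<vmid>i<netN>` naming):

```shell
# Confirm multicast snooping really ended up disabled (bridge-mcsnoop 0 -> expect 0)
cat /sys/class/net/vmbr0/bridge/multicast_snooping

# VLAN membership of the bridge ports (the VM's OSPF interface is on tag 4000)
bridge vlan show

# Confirm rx-vlan-filter is still off on the physical port after the upgrade
ethtool -k ens2f0np0 | grep rx-vlan-filter

# Watch for OSPF hellos on the VM's tap device (name is an assumption)
tcpdump -eni tap1580i1 ip proto ospf
```

If the hellos are visible on the tap but not inside the VM (or vice versa), that narrows the problem down to the bridge/VLAN path rather than FRR itself.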
3) Used for LINSTOR + DRBD cluster traffic, not used by PVE itself:
Code:
auto ens2f1np1
iface ens2f1np1 inet static
mtu 9000
address
# set dscp 26 (106) for rdma cm, setup qos
post-up \
echo 106 > /sys/class/infiniband/mlx5_1/tc/1/traffic_class; \
cma_roce_tos -d mlx5_1 -t 106; \
mlnx_qos -i $IFACE --dscp2prio=flush; \
mlnx_qos -i $IFACE \
--pfc=0,0,0,1,0,0,0,0 \
--tsa=vendor,ets,vendor,ets,vendor,vendor,ets,vendor \
--tcbw=0,10,0,85,0,0,5,0 \
--dcbx=os \
--trust=dscp \
--cable_len=5 \
--prio2buffer=0,0,0,1,0,0,2,0 \
--buffer_size=48128,165632,48128,0,0,0,0,0; \
tc qdisc add dev $IFACE root handle 1: mq
Example config of one of the VMs whose interface runs FRR:
Code:
# cat /etc/pve/qemu-server/1580.conf
agent: 1,fstrim_cloned_disks=1,type=virtio
boot: order=scsi0
cores: 4
cpu: Broadwell-noTSX-IBRS,flags=+pcid
memory: 1536
name:
net0: virtio=52:54:00:BC:9C:C9,bridge=vmbr0,tag=1
net1: virtio=52:54:00:D4:9A:B9,bridge=vmbr0,tag=4000
net2: virtio=52:54:00:80:FD:D1,bridge=vmbr0,tag=4068
net3: virtio=52:54:00:7B:B3:26,bridge=vmbr0,tag=667
numa: 0
onboot: 1
ostype: l26
scsi0: drbdthinpool:pm-fae9157b_1580,discard=on,iothread=1,size=4G,ssd=1
scsihw: virtio-scsi-single
serial0: socket
smbios1: uuid=65e265e6-2730-452e-8a10-7bca73780419
sockets: 1
startup: order=1000
tags: A;HA
vga: serial0
vmgenid: 1550d786-25a1-42c2-99b9-6d58d69fcc0b
Inside the VM, FRR runs on net1.
FRRouting (version 9.1.3)
/etc/frr/daemons:
Code:
bgpd=no
ospfd=yes
ospf6d=no
ripd=no
ripngd=no
isisd=no
pimd=no
pim6d=no
ldpd=no
nhrpd=no
eigrpd=no
babeld=no
sharpd=no
pbrd=no
bfdd=no
fabricd=no
vrrpd=no
pathd=no
/etc/frr/frr.conf
Code:
log stdout
ip forwarding
no ipv6 forwarding
!
log file /var/log/frr/frr.log
!
interface eth1
ip ospf authentication message-digest
ip ospf message-digest-key 1 md5
ip ospf hello-interval 5
ip ospf dead-interval 10
ip ospf priority 200
ip ospf retransmit-interval 3
!
router ospf
ospf router-id 10.0.0.220
redistribute kernel
redistribute connected
redistribute static
network 10.0.0.0/14 area 0.0.0.0
area 0.0.0.0 authentication message-digest
distribute-list ACL-TO-OSPF out kernel
distribute-list ACL-TO-OSPF out connected
distribute-list ACL-TO-OSPF out static
!
access-list ACL-TO-OSPF deny any
!
line vty
access-class vty
!
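For completeness, this is how the OSPF state can be inspected from inside the VM with standard FRR show commands (nothing here is specific to my setup; `eth1` is the interface from the frr.conf above):

```shell
vtysh -c "show ip ospf neighbor"          # neighbors should be in Full state
vtysh -c "show ip ospf interface eth1"    # area, timers, auth type, network type
vtysh -c "show ip route ospf"             # routes actually installed by ospfd
tcpdump -ni eth1 ip proto ospf            # are hellos even arriving in the VM?
```

If `show ip ospf neighbor` is empty while tcpdump sees incoming hellos, the problem is likely on the FRR side (auth, MTU mismatch in the DBD exchange); if tcpdump sees nothing, it is the host bridge/VLAN path.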
Please help — I don't understand where to dig or what the problem could be. It's especially strange that everything works correctly on the old VMs with the Quagga daemon.