[SOLVED] Problem with FRR in VM after upgrading PVE 8 to 9

maxxxx

New Member
Aug 7, 2025
Hi!

We have a cluster of 3 PVE nodes with identical hardware. Every node uses a Mellanox network card, and DRBD is used for storage. After the upgrade, all VMs with FRR installed stopped receiving routes; there were no problems on PVE 8.4. A number of old VMs (CentOS 6) run the old Quagga routing daemon, and those still send and receive routes correctly.

Upgrade details:
We upgraded to PVE 9, but because we use Mellanox network cards (the official repository has no module for kernel 6.14; this kernel is not yet supported by them) and DRBD (also no module for kernel 6.14), we could not switch to kernel 6.14. On PVE 9 we therefore currently run kernel 6.8.12-13-pve. We do not use SDN; all VMs talk to the "hardware" servers via VLANs over the bridge.

Network settings on one of the nodes (standard settings)

1) Physical port used for vmbr0; it carries the cluster traffic and the VMs' traffic to the outside world:
Code:
iface ens2f0np0 inet manual
mtu 1504
pre-up ethtool -K $IFACE rx-vlan-filter off

2) Bridge for VM:
Code:
auto vmbr0
iface vmbr0 inet static
address
gateway
bridge-ports ens2f0np0
bridge-stp off
bridge-fd 0
bridge-vlan-aware yes
bridge-vids 2-4094
bridge-mcsnoop 0
mtu 1504
post-up devlink dev eswitch set pci/0000:81:00.0 mode switchdev

3) Used for LINSTOR + DRBD cluster traffic, not used by PVE itself:
Code:
auto ens2f1np1
iface ens2f1np1 inet static
mtu 9000
address
# set dscp 26 (106) for rdma cm, setup qos
post-up \
echo 106 > /sys/class/infiniband/mlx5_1/tc/1/traffic_class; \
cma_roce_tos -d mlx5_1 -t 106; \
mlnx_qos -i $IFACE --dscp2prio=flush; \
mlnx_qos -i $IFACE \
--pfc=0,0,0,1,0,0,0,0 \
--tsa=vendor,ets,vendor,ets,vendor,vendor,ets,vendor \
--tcbw=0,10,0,85,0,0,5,0 \
--dcbx=os \
--trust=dscp \
--cable_len=5 \
--prio2buffer=0,0,0,1,0,0,2,0 \
--buffer_size=48128,165632,48128,0,0,0,0,0; \
tc qdisc add dev $IFACE root handle 1: mq

Example config of one of the VMs that runs FRR:
Code:
# cat /etc/pve/qemu-server/1580.conf
agent: 1,fstrim_cloned_disks=1,type=virtio
boot: order=scsi0
cores: 4
cpu: Broadwell-noTSX-IBRS,flags=+pcid
memory: 1536
name:
net0: virtio=52:54:00:BC:9C:C9,bridge=vmbr0,tag=1
net1: virtio=52:54:00:D4:9A:B9,bridge=vmbr0,tag=4000
net2: virtio=52:54:00:80:FD:D1,bridge=vmbr0,tag=4068
net3: virtio=52:54:00:7B:B3:26,bridge=vmbr0,tag=667
numa: 0
onboot: 1
ostype: l26
scsi0: drbdthinpool:pm-fae9157b_1580,discard=on,iothread=1,size=4G,ssd=1
scsihw: virtio-scsi-single
serial0: socket
smbios1: uuid=65e265e6-2730-452e-8a10-7bca73780419
sockets: 1
startup: order=1000
tags: A;HA
vga: serial0
vmgenid: 1550d786-25a1-42c2-99b9-6d58d69fcc0b

Inside the VM, FRR runs on net1:
FRRouting (version 9.1.3)

/etc/frr/daemons
Code:
bgpd=no
ospfd=yes
ospf6d=no
ripd=no
ripngd=no
isisd=no
pimd=no
pim6d=no
ldpd=no
nhrpd=no
eigrpd=no
babeld=no
sharpd=no
pbrd=no
bfdd=no
fabricd=no
vrrpd=no
pathd=no

/etc/frr/frr.conf
Code:
log stdout
ip forwarding
no ipv6 forwarding
!
log file /var/log/frr/frr.log
!
interface eth1
 ip ospf authentication message-digest
 ip ospf message-digest-key 1 md5
 ip ospf hello-interval 5
 ip ospf dead-interval 10
 ip ospf priority 200
 ip ospf retransmit-interval 3
!
router ospf
 ospf router-id 10.0.0.220
 redistribute kernel
 redistribute connected
 redistribute static
 network 10.0.0.0/14 area 0.0.0.0
 area 0.0.0.0 authentication message-digest
 distribute-list ACL-TO-OSPF out kernel
 distribute-list ACL-TO-OSPF out connected
 distribute-list ACL-TO-OSPF out static
!
access-list ACL-TO-OSPF deny any
!
line vty
 access-class vty
!

Please help, I don't know where to dig or what the problem could be. It is especially strange that everything works correctly on the old VMs with the Quagga daemon.
 
If frr runs inside of the vm it shouldn't be affected by the PVE9 upgrade. Could you please paste the output of vtysh -c "show ip ospf neighbor", vtysh -c "show ip ospf interface" and vtysh -c "show ip ospf route"?

Maybe also throw:
Code:
debug ospf event
debug ospf packet
debug ospf zebra
debug ospf bfd
debug ospf ism
debug ospf nsm
into the frr.conf file, restart frr and paste the journal output here.
 
show ip ospf neighbor
Code:
Neighbor ID     Pri State           Up Time         Dead Time Address         Interface                        RXmtL RqstL DBsmL
10.255.252.254  200 2-Way/DROther   28m53s             7.177s 10.0.0.100      eth1:10.0.0.220                      0     0     0
10.255.254.251  200 2-Way/DROther   28m41s             9.098s 10.0.0.101      eth1:10.0.0.220                      0     0     0
10.255.254.252  200 2-Way/DROther   28m40s             5.170s 10.0.0.102      eth1:10.0.0.220                      0     0     0
10.255.254.253  200 Exchange/Backup 28m41s             6.366s 10.0.0.103      eth1:10.0.0.220                     11     0     0
10.255.254.254  200 ExStart/DR      8.622s             5.035s 10.0.0.104      eth1:10.0.0.220                      0     0     0
10.0.0.10       200 2-Way/DROther   28m52s             7.561s 10.0.0.251      eth1:10.0.0.220                      0     0     0
10.255.254.199  200 2-Way/DROther   28m40s             5.070s 10.0.0.254      eth1:10.0.0.220                      0     0     0
10.255.252.1    200 2-Way/DROther   28m53s             6.831s 10.0.1.1        eth1:10.0.0.220                      0     0     0
10.255.252.2    200 2-Way/DROther   28m53s             7.187s 10.0.1.2        eth1:10.0.0.220                      0     0     0
10.255.252.3    200 2-Way/DROther   1m03s              6.827s 10.0.1.3        eth1:10.0.0.220                      0     0     0
10.255.252.4    200 2-Way/DROther   28m41s             8.446s 10.0.1.4        eth1:10.0.0.220                      0     0     0
10.255.252.5    200 2-Way/DROther   16m48s             6.757s 10.0.1.5        eth1:10.0.0.220                      0     0     0
10.255.252.6    200 2-Way/DROther   28m40s             9.409s 10.0.1.6        eth1:10.0.0.220                      0     0     0
10.255.252.7    200 2-Way/DROther   28m53s             6.802s 10.0.1.7        eth1:10.0.0.220                      0     0     0
10.255.252.8    200 2-Way/DROther   28m40s             9.373s 10.0.1.8        eth1:10.0.0.220                      0     0     0
10.255.252.9    200 2-Way/DROther   14m04s             5.390s 10.0.1.9        eth1:10.0.0.220                      0     0     0
10.255.252.10   200 2-Way/DROther   28m41s             8.443s 10.0.1.10       eth1:10.0.0.220                      0     0     0
10.255.252.11   200 2-Way/DROther   28m40s             9.413s 10.0.1.11       eth1:10.0.0.220                      0     0     0
10.255.252.12   200 2-Way/DROther   28m54s             5.379s 10.0.1.12       eth1:10.0.0.220                      0     0     0
10.255.252.13   200 2-Way/DROther   13m01s             8.424s 10.0.1.13       eth1:10.0.0.220                      0     0     0
10.255.252.14   200 2-Way/DROther   28m53s             6.779s 10.0.1.14       eth1:10.0.0.220                      0     0     0
10.255.252.15   200 2-Way/DROther   28m53s             6.970s 10.0.1.15       eth1:10.0.0.220                      0     0     0
10.255.252.16   200 2-Way/DROther   28m53s             6.752s 10.0.1.16       eth1:10.0.0.220                      0     0     0
10.255.252.17   200 2-Way/DROther   28m53s             6.792s 10.0.1.17       eth1:10.0.0.220                      0     0     0
10.255.252.19   200 2-Way/DROther   28m40s             5.035s 10.0.1.19       eth1:10.0.0.220                      0     0     0
10.255.252.52   200 2-Way/DROther   28m41s             8.475s 10.0.1.52       eth1:10.0.0.220                      0     0     0
10.255.252.79   200 2-Way/DROther   28m40s             9.376s 10.0.1.79       eth1:10.0.0.220                      0     0     0
10.255.252.234  200 2-Way/DROther   28m41s             8.440s 10.0.1.234      eth1:10.0.0.220                      0     0     0
10.255.254.1    200 2-Way/DROther   28m40s             9.392s 10.0.2.1        eth1:10.0.0.220                      0     0     0
10.255.254.2    200 2-Way/DROther   28m41s             8.432s 10.0.2.2        eth1:10.0.0.220                      0     0     0
10.255.254.3    200 2-Way/DROther   28m42s             8.067s 10.0.2.3        eth1:10.0.0.220                      0     0     0
10.255.254.4    200 2-Way/DROther   28m53s             7.174s 10.0.2.4        eth1:10.0.0.220                      0     0     0
10.255.254.5    200 2-Way/DROther   28m41s             9.017s 10.0.2.5        eth1:10.0.0.220                      0     0     0
10.255.254.6    200 2-Way/DROther   28m40s             9.390s 10.0.2.6        eth1:10.0.0.220                      0     0     0
10.255.254.7    200 2-Way/DROther   28m53s             7.190s 10.0.2.7        eth1:10.0.0.220                      0     0     0
10.255.254.8    200 2-Way/DROther   28m40s             9.386s 10.0.2.8        eth1:10.0.0.220                      0     0     0
10.255.254.9    200 2-Way/DROther   28m41s             9.031s 10.0.2.9        eth1:10.0.0.220                      0     0     0
10.255.254.10   200 2-Way/DROther   28m54s             5.333s 10.0.2.10       eth1:10.0.0.220                      0     0     0
10.255.254.11   200 2-Way/DROther   28m53s             6.787s 10.0.2.11       eth1:10.0.0.220                      0     0     0
10.255.254.12   200 2-Way/DROther   28m40s             9.400s 10.0.2.12       eth1:10.0.0.220                      0     0     0
10.255.254.13   200 2-Way/DROther   28m54s             5.916s 10.0.2.13       eth1:10.0.0.220                      0     0     0
10.255.254.14   200 2-Way/DROther   28m53s             6.772s 10.0.2.14       eth1:10.0.0.220                      0     0     0
10.255.254.15   200 2-Way/DROther   28m42s             8.064s 10.0.2.15       eth1:10.0.0.220                      0     0     0
10.255.254.16   200 2-Way/DROther   28m42s             8.061s 10.0.2.16       eth1:10.0.0.220                      0     0     0
10.255.254.17   200 2-Way/DROther   28m54s             5.346s 10.0.2.17       eth1:10.0.0.220                      0     0     0
10.255.254.18   200 2-Way/DROther   28m53s             6.790s 10.0.2.18       eth1:10.0.0.220                      0     0     0
10.255.254.19   200 2-Way/DROther   28m53s             6.754s 10.0.2.19       eth1:10.0.0.220                      0     0     0
10.255.254.20   200 2-Way/DROther   28m41s             8.426s 10.0.2.20       eth1:10.0.0.220                      0     0     0
10.255.254.21   200 2-Way/DROther   28m41s             8.585s 10.0.2.21       eth1:10.0.0.220                      0     0     0

show ip ospf interface
Code:
eth1 is up
  ifindex 3, MTU 1504 bytes, BW 4294967295 Mbit <UP,BROADCAST,RUNNING,MULTICAST>
  Internet Address 10.0.0.220/22, Broadcast 10.0.3.255, Area 0.0.0.0
  MTU mismatch detection: enabled
  Router ID 10.0.0.220, Network Type BROADCAST, Cost: 1
  Transmit Delay is 1 sec, State DROther, Priority 200
  Designated Router (ID) 10.255.254.254 Interface Address 10.0.0.104/22
  Backup Designated Router (ID) 10.255.254.253, Interface Address 10.0.0.103
  Multicast group memberships: OSPFAllRouters
  Timer intervals configured, Hello 5s, Dead 10s, Wait 10s, Retransmit 3
    Hello due in 0.217s
  Neighbor Count is 49, Adjacent neighbor count is 0
  Graceful Restart hello delay: 10s
  Cryptographic authentication enabled
  Algorithm:MD5
  Cryptographic authentication enabled
  Algorithm:MD5

show ip ospf route
Code:
============ OSPF network routing table ============
N    10.0.0.0/22           [1] area: 0.0.0.0
                           directly attached to eth1

============ OSPF router routing table =============

============ OSPF external routing table ===========

The log file is attached.
I agree that this is strange, but all my VMs with FRR stopped receiving routes.
 


Can you reach (ping) "10.255.254.254" (the Designated Router)? I'd recommend waiting a bit longer than 2mins, it hasn't even reached an adjacency with the Designated Router (DR) -- check if you can get the DR adjacency to a "Full/DR" state.
 
Can I reach (ping) "10.255.254.254": yes.
Waiting a bit longer than 2 minutes: I waited 2 hours.
Getting the DR adjacency to a "Full/DR" state: it didn't work.

I'll note again that the configuration of the VMs didn't change. FRR stopped working on all VMs simultaneously after the upgrade, while the old Quagga still works.
 
Hi, I tested with several OSes (AlmaLinux 9, AlmaLinux 8, CentOS 7 and CentOS 6); it works only on CentOS 6.

Test VM:
almalinux 8
Code:
agent: 1,fstrim_cloned_disks=1
boot: order=scsi0
cores: 2
cpu: Broadwell-noTSX-IBRS,flags=+pcid
memory: 2048
name: tvk-server-test
net0: virtio=52:54:00:71:CE:68,bridge=vmbr0,tag=1
net1: virtio=52:54:00:87:AF:72,bridge=vmbr0,tag=4000
net2: virtio=52:54:00:4A:78:87,bridge=vmbr0,tag=4068
numa: 0
ostype: l26
scsi0: drbdthinpool:pm-9c9c5025_1581,discard=on,iothread=1,size=4197464K,ssd=1
scsihw: virtio-scsi-single
serial0: socket
smbios1: uuid=1e84f7f7-83b6-45b4-a7e0-eaa26e431fcb
sockets: 1
tags: A
vga: serial0
vmgenid: 9cfa6d64-a124-4178-9ad7-ce138b2fbbf9

config frr:
Code:
log stdout
ip forwarding
no ipv6 forwarding
!
log file /var/log/frr/frr.log
!
interface eth1
 ip ospf authentication message-digest
 ip ospf message-digest-key 1 md5
 ip ospf hello-interval 5
 ip ospf dead-interval 10
 ip ospf priority 1
 ip ospf retransmit-interval 3
!
router ospf
 ospf router-id 10.0.0.220
 log-adjacency-changes detail
 redistribute kernel
 redistribute connected
 redistribute static
 network 10.0.0.0/14 area 0.0.0.0
 area 0.0.0.0 authentication message-digest
 distribute-list ACL-TO-OSPF out kernel
 distribute-list ACL-TO-OSPF out connected
 distribute-list ACL-TO-OSPF out static
!
access-list ACL-TO-OSPF deny any
!
line vty
 access-class vty
!

Code:
Hello, this is FRRouting (version 7.5.1).


Code:
almalinux8# sh ip ospf neighbor

Neighbor ID     Pri State           Dead Time Address         Interface                        RXmtL RqstL DBsmL
10.255.252.254  200 2-Way/DROther      7.598s 10.0.0.100      eth1:10.0.0.220                      0     0     0
10.255.254.251  200 2-Way/DROther      9.630s 10.0.0.101      eth1:10.0.0.220                      0     0     0
10.255.254.252  200 2-Way/DROther      6.346s 10.0.0.102      eth1:10.0.0.220                      0     0     0
10.255.254.253  200 Exchange/Backup    8.473s 10.0.0.103      eth1:10.0.0.220                      2     0     0
10.255.254.254  200 ExStart/DR         7.634s 10.0.0.104      eth1:10.0.0.220                      0     0     0
10.0.0.10       200 2-Way/DROther      8.598s 10.0.0.251      eth1:10.0.0.220                      0     0     0
10.255.254.199  200 2-Way/DROther      8.742s 10.0.0.254      eth1:10.0.0.220                      0     0     0
10.255.252.1    200 2-Way/DROther      7.771s 10.0.1.1        eth1:10.0.0.220                      0     0     0
10.255.252.2    200 2-Way/DROther      8.636s 10.0.1.2        eth1:10.0.0.220                      0     0     0
10.255.252.3    200 2-Way/DROther      8.662s 10.0.1.3        eth1:10.0.0.220                      0     0     0
10.255.252.4    200 2-Way/DROther      8.681s 10.0.1.4        eth1:10.0.0.220                      0     0     0
10.255.252.5    200 2-Way/DROther      7.549s 10.0.1.5        eth1:10.0.0.220                      0     0     0
10.255.252.6    200 2-Way/DROther      5.614s 10.0.1.6        eth1:10.0.0.220                      0     0     0
10.255.252.7    200 2-Way/DROther      7.541s 10.0.1.7        eth1:10.0.0.220                      0     0     0
10.255.252.8    200 2-Way/DROther      9.229s 10.0.1.8        eth1:10.0.0.220                      0     0     0
10.255.252.9    200 2-Way/DROther      5.635s 10.0.1.9        eth1:10.0.0.220                      0     0     0
10.255.252.10   200 2-Way/DROther      9.234s 10.0.1.10       eth1:10.0.0.220                      0     0     0
10.255.252.11   200 2-Way/DROther      9.278s 10.0.1.11       eth1:10.0.0.220                      0     0     0
10.255.252.12   200 2-Way/DROther      5.599s 10.0.1.12       eth1:10.0.0.220                      0     0     0
10.255.252.13   200 2-Way/DROther      8.612s 10.0.1.13       eth1:10.0.0.220                      0     0     0
10.255.252.14   200 2-Way/DROther      7.552s 10.0.1.14       eth1:10.0.0.220                      0     0     0
10.255.252.15   200 2-Way/DROther      7.564s 10.0.1.15       eth1:10.0.0.220                      0     0     0
10.255.252.16   200 2-Way/DROther      7.614s 10.0.1.16       eth1:10.0.0.220                      0     0     0
10.255.252.17   200 2-Way/DROther      7.546s 10.0.1.17       eth1:10.0.0.220                      0     0     0
10.255.252.19   200 2-Way/DROther      7.705s 10.0.1.19       eth1:10.0.0.220                      0     0     0
10.255.252.52   200 2-Way/DROther      8.613s 10.0.1.52       eth1:10.0.0.220                      0     0     0
10.255.252.79   200 2-Way/DROther      9.572s 10.0.1.79       eth1:10.0.0.220                      0     0     0
10.255.252.234  200 2-Way/DROther      8.660s 10.0.1.234      eth1:10.0.0.220                      0     0     0
10.255.254.1    200 2-Way/DROther      9.238s 10.0.2.1        eth1:10.0.0.220                      0     0     0
10.255.254.2    200 2-Way/DROther      8.653s 10.0.2.2        eth1:10.0.0.220                      0     0     0
10.255.254.3    200 2-Way/DROther      8.616s 10.0.2.3        eth1:10.0.0.220                      0     0     0
10.255.254.4    200 2-Way/DROther      7.574s 10.0.2.4        eth1:10.0.0.220                      0     0     0
10.255.254.5    200 2-Way/DROther      9.231s 10.0.2.5        eth1:10.0.0.220                      0     0     0
10.255.254.6    200 2-Way/DROther      9.579s 10.0.2.6        eth1:10.0.0.220                      0     0     0
10.255.254.7    200 2-Way/DROther      7.587s 10.0.2.7        eth1:10.0.0.220                      0     0     0
10.255.254.8    200 2-Way/DROther      9.576s 10.0.2.8        eth1:10.0.0.220                      0     0     0
10.255.254.9    200 2-Way/DROther      6.695s 10.0.2.9        eth1:10.0.0.220                      0     0     0
10.255.254.10   200 2-Way/DROther      5.601s 10.0.2.10       eth1:10.0.0.220                      0     0     0
10.255.254.11   200 2-Way/DROther      7.539s 10.0.2.11       eth1:10.0.0.220                      0     0     0
10.255.254.12   200 2-Way/DROther      5.622s 10.0.2.12       eth1:10.0.0.220                      0     0     0
10.255.254.13   200 2-Way/DROther      7.534s 10.0.2.13       eth1:10.0.0.220                      0     0     0
10.255.254.14   200 2-Way/DROther      7.553s 10.0.2.14       eth1:10.0.0.220                      0     0     0
10.255.254.15   200 2-Way/DROther      8.657s 10.0.2.15       eth1:10.0.0.220                      0     0     0
10.255.254.16   200 2-Way/DROther      8.639s 10.0.2.16       eth1:10.0.0.220                      0     0     0
10.255.254.17   200 2-Way/DROther      5.604s 10.0.2.17       eth1:10.0.0.220                      0     0     0
10.255.254.18   200 2-Way/DROther      7.556s 10.0.2.18       eth1:10.0.0.220                      0     0     0
10.255.254.19   200 2-Way/DROther      7.536s 10.0.2.19       eth1:10.0.0.220                      0     0     0
10.255.254.20   200 2-Way/DROther      8.619s 10.0.2.20       eth1:10.0.0.220                      0     0     0
10.255.254.21   200 2-Way/DROther      9.805s 10.0.2.21       eth1:10.0.0.220                      0     0     0

Code:
almalinux8# sh ip ospf route
============ OSPF network routing table ============
N    10.0.0.0/22           [1] area: 0.0.0.0
                           directly attached to eth1

============ OSPF router routing table =============

============ OSPF external routing table ===========

Test VM:
centos 7
Code:
agent: 1,fstrim_cloned_disks=1
boot: c
bootdisk: scsi0
cores: 2
cpu: Broadwell-noTSX-IBRS,flags=+pcid
memory: 1024
name: tvk-server-test2
net0: virtio=52:54:00:25:78:51,bridge=vmbr0,tag=1
net1: virtio=52:54:00:B6:E2:5F,bridge=vmbr0,tag=4000
numa: 0
ostype: l26
scsi0: drbdthinpool:pm-686ac2a3_1582,discard=on,iothread=1,size=4197464K,ssd=1
scsihw: virtio-scsi-single
serial0: socket
smbios1: uuid=15328000-9ef8-4686-9f85-60653ec2b598
sockets: 1
tags: A
vga: serial0
vmgenid: 66adde42-25a7-4511-a298-43f67d1fe994

config quagga:
Code:
log stdout
log file /var/log/quagga/ospfd.log
!
interface eth1
 ip ospf authentication message-digest
 ip ospf message-digest-key 1 md5
 ip ospf hello-interval 5
 ip ospf dead-interval 10
 ip ospf priority 200
 ip ospf retransmit-interval 3
!
router ospf
 ospf router-id 10.0.0.220
 log-adjacency-changes detail
 redistribute kernel
 redistribute connected
 redistribute static
 network 10.0.0.0/14 area 0.0.0.0
 area 0.0.0.0 authentication message-digest
 distribute-list ACL-TO-OSPF out kernel
 distribute-list ACL-TO-OSPF out connected
 distribute-list ACL-TO-OSPF out static
!
access-list ACL-TO-OSPF deny any
!
line vty
 access-class vty
!

Code:
Hello, this is Quagga (version 0.99.22.4).

Code:
centos7-base.ghl.lan# sh ip ospf  neighbor 

    Neighbor ID Pri State           Dead Time Address         Interface            RXmtL RqstL DBsmL
10.255.252.254  200 2-Way/DROther      8.260s 10.0.0.100      eth1:10.0.0.220          0     0     0
10.255.254.251  200 2-Way/DROther      5.344s 10.0.0.101      eth1:10.0.0.220          0     0     0
10.255.254.252  200 2-Way/DROther      7.026s 10.0.0.102      eth1:10.0.0.220          0     0     0
10.255.254.253  200 Exchange/Backup    9.545s 10.0.0.103      eth1:10.0.0.220          1     0     0
10.255.254.254  200 Exchange/DR        9.286s 10.0.0.104      eth1:10.0.0.220          1     0     0
10.0.0.10       200 2-Way/DROther      9.422s 10.0.0.251      eth1:10.0.0.220          0     0     0
10.255.254.199  200 2-Way/DROther      9.554s 10.0.0.254      eth1:10.0.0.220          0     0     0
10.255.252.1    200 2-Way/DROther      8.451s 10.0.1.1        eth1:10.0.0.220          0     0     0
10.255.252.2    200 2-Way/DROther      9.305s 10.0.1.2        eth1:10.0.0.220          0     0     0
10.255.252.3    200 2-Way/DROther      9.357s 10.0.1.3        eth1:10.0.0.220          0     0     0
10.255.252.4    200 2-Way/DROther      9.310s 10.0.1.4        eth1:10.0.0.220          0     0     0
10.255.252.5    200 2-Way/DROther      8.254s 10.0.1.5        eth1:10.0.0.220          0     0     0
10.255.252.6    200 2-Way/DROther      6.303s 10.0.1.6        eth1:10.0.0.220          0     0     0
10.255.252.7    200 2-Way/DROther      8.244s 10.0.1.7        eth1:10.0.0.220          0     0     0
10.255.252.8    200 2-Way/DROther      9.892s 10.0.1.8        eth1:10.0.0.220          0     0     0
10.255.252.9    200 2-Way/DROther      6.344s 10.0.1.9        eth1:10.0.0.220          0     0     0
10.255.252.10   200 2-Way/DROther      9.909s 10.0.1.10       eth1:10.0.0.220          0     0     0
10.255.252.11   200 2-Way/DROther      9.997s 10.0.1.11       eth1:10.0.0.220          0     0     0
10.255.252.12   200 2-Way/DROther      6.292s 10.0.1.12       eth1:10.0.0.220          0     0     0
10.255.252.13   200 2-Way/DROther      9.294s 10.0.1.13       eth1:10.0.0.220          0     0     0
10.255.252.14   200 2-Way/DROther      8.228s 10.0.1.14       eth1:10.0.0.220          0     0     0
10.255.252.15   200 2-Way/DROther      8.244s 10.0.1.15       eth1:10.0.0.220          0     0     0
10.255.252.16   200 2-Way/DROther      8.316s 10.0.1.16       eth1:10.0.0.220          0     0     0
10.255.252.17   200 2-Way/DROther      8.241s 10.0.1.17       eth1:10.0.0.220          0     0     0
10.255.252.19   200 2-Way/DROther      8.492s 10.0.1.19       eth1:10.0.0.220          0     0     0
10.255.252.52   200 2-Way/DROther      9.290s 10.0.1.52       eth1:10.0.0.220          0     0     0
10.255.252.79   200 2-Way/DROther      5.276s 10.0.1.79       eth1:10.0.0.220          0     0     0
10.255.252.234  200 2-Way/DROther      9.365s 10.0.1.234      eth1:10.0.0.220          0     0     0
10.255.254.1    200 2-Way/DROther      9.911s 10.0.2.1        eth1:10.0.0.220          0     0     0
10.255.254.2    200 2-Way/DROther      9.332s 10.0.2.2        eth1:10.0.0.220          0     0     0
10.255.254.3    200 2-Way/DROther      9.286s 10.0.2.3        eth1:10.0.0.220          0     0     0
10.255.254.4    200 2-Way/DROther      8.266s 10.0.2.4        eth1:10.0.0.220          0     0     0
10.255.254.5    200 2-Way/DROther      9.901s 10.0.2.5        eth1:10.0.0.220          0     0     0
10.255.254.6    200 2-Way/DROther      5.306s 10.0.2.6        eth1:10.0.0.220          0     0     0
10.255.254.7    200 2-Way/DROther      8.239s 10.0.2.7        eth1:10.0.0.220          0     0     0
10.255.254.8    200 2-Way/DROther      5.282s 10.0.2.8        eth1:10.0.0.220          0     0     0
10.255.254.9    200 2-Way/DROther      7.386s 10.0.2.9        eth1:10.0.0.220          0     0     0
10.255.254.10   200 2-Way/DROther      6.295s 10.0.2.10       eth1:10.0.0.220          0     0     0
10.255.254.11   200 2-Way/DROther      8.231s 10.0.2.11       eth1:10.0.0.220          0     0     0
10.255.254.12   200 2-Way/DROther      6.298s 10.0.2.12       eth1:10.0.0.220          0     0     0
10.255.254.13   200 2-Way/DROther      8.228s 10.0.2.13       eth1:10.0.0.220          0     0     0
10.255.254.14   200 2-Way/DROther      8.232s 10.0.2.14       eth1:10.0.0.220          0     0     0
10.255.254.15   200 2-Way/DROther      9.348s 10.0.2.15       eth1:10.0.0.220          0     0     0
10.255.254.16   200 2-Way/DROther      9.289s 10.0.2.16       eth1:10.0.0.220          0     0     0
10.255.254.17   200 2-Way/DROther      6.298s 10.0.2.17       eth1:10.0.0.220          0     0     0
10.255.254.18   200 2-Way/DROther      8.225s 10.0.2.18       eth1:10.0.0.220          0     0     0
10.255.254.19   200 2-Way/DROther      8.222s 10.0.2.19       eth1:10.0.0.220          0     0     0
10.255.254.20   200 2-Way/DROther      9.319s 10.0.2.20       eth1:10.0.0.220          0     0     0
10.255.254.21   200 2-Way/DROther      5.503s 10.0.2.21       eth1:10.0.0.220          0     0     0

Code:
centos7-base.ghl.lan# sh ip ospf route 
============ OSPF network routing table ============
N    10.0.0.0/22           [10] area: 0.0.0.0
                           directly attached to eth1

============ OSPF router routing table =============

============ OSPF external routing table ===========


Test VM:
centos 6
Code:
agent: 1,fstrim_cloned_disks=1
boot: c
bootdisk: scsi0
cores: 2
cpu: Broadwell-noTSX-IBRS,flags=+pcid
memory: 1024
name: tvk-server-test3
net0: virtio=52:54:00:5D:51:89,bridge=vmbr0,tag=1
net1: virtio=52:54:00:17:FA:C4,bridge=vmbr0,tag=4000
numa: 0
ostype: l26
scsi0: drbdthinpool:pm-802d81e4_1583,discard=on,iothread=1,size=4197464K,ssd=1
scsihw: virtio-scsi-single
serial0: socket
smbios1: uuid=71f450b9-6c55-4392-ba96-66a91dd3fe16
sockets: 1
tags: A
vga: serial0
vmgenid: 78f832ff-b4a6-40b9-848c-f32eb7478f9d

config quagga:
Code:
log stdout
log file /var/log/quagga/ospfd.log
!
interface eth1
 ip ospf authentication message-digest
 ip ospf message-digest-key 1 md5
 ip ospf hello-interval 5
 ip ospf dead-interval 10
 ip ospf priority 200
 ip ospf retransmit-interval 3
!
router ospf
 ospf router-id 10.0.0.220
 log-adjacency-changes detail
 redistribute kernel
 redistribute connected
 redistribute static
 network 10.0.0.0/14 area 0.0.0.0
 area 0.0.0.0 authentication message-digest
 distribute-list ACL-TO-OSPF out kernel
 distribute-list ACL-TO-OSPF out connected
 distribute-list ACL-TO-OSPF out static
!
access-list ACL-TO-OSPF deny any
!
line vty
 access-class vty
!

Code:
Hello, this is Quagga (version 0.99.15).


The output of "sh ip ospf neigh" and "sh ip ospf route" is in the attachment. You can see that OSPF on CentOS 6 works fine and I receive routes.



The logs are attached.
As you can see, OSPF works only on CentOS 6. On VMs with newer OSes (CentOS 7 and later) there seems to be a problem at the network level (I may be wrong); you can see "SeqNumberMismatch" in the log.

I have seen this problem only since upgrading PVE from 8 to 9; no configuration or network infrastructure was changed. On the CentOS 6 VMs (I have 5 of them running Quagga on these nodes), OSPF with the old Quagga works fine.
 


It appears there is some packet loss, which is causing the sequence number mismatch error. This issue might be due to an MTU mismatch. Could you verify that the MTU settings are consistent across all peers, both inside and outside the vm?

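You can also use the command ping -s <size> -M do <ip> to check the MTU, gradually increasing the size between all nodes. Note that -s sets the ICMP payload size, which is 28 bytes smaller than the resulting IPv4 packet (20 bytes IPv4 header plus 8 bytes ICMP header). Ensure you ping each other's nodes from both sides, particularly between the Designated Router (DR) and the failing peer.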
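Since the -s option of ping takes the ICMP payload size rather than the MTU itself, the values to probe with can be derived with a small helper. This is only a sketch: the function name icmp_payload and the PEER variable are made up for illustration, the default peer address is the DR's interface address from the output earlier in the thread, and it assumes plain IPv4 without IP options. The commands are printed, not executed.

```shell
#!/bin/sh
# Largest ICMP echo payload that fits into one unfragmented IPv4 packet.
# 20 bytes = IPv4 header without options, 8 bytes = ICMP echo header.
icmp_payload() {
    echo $(( $1 - 20 - 8 ))
}

# Placeholder peer; override with PEER=<ip> when running the script.
PEER=${PEER:-10.0.0.104}

# Print the probe commands for the two MTU values seen in this thread.
for mtu in 1500 1504; do
    echo "ping -c 3 -M do -s $(icmp_payload "$mtu") $PEER"
done
```

If the probe for 1500 succeeds but the one for 1504 fails in either direction, the path does not carry 1504-byte packets end to end.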
 
Thanks, I'll try. I also thought about the MTU, but it's strange that there were no MTU problems before switching to PVE 9.
 
Yes, indeed: on CentOS 6 the MTU was 1500, while on the newer systems (CentOS 7 and later) it was 1504. If I manually change this value to 1500, everything works. Please tell me: I have not set the MTU anywhere in the VM, so where does the VM get its MTU value from?
 
On PVE 8.4 I had MTU 1504 and it worked fine; after the upgrade to PVE 9 the same MTU 1504 causes problems. Any ideas why this happened? I looked at another PVE installation that is still on 8.4 and was not upgraded; it also has MTU 1504 and no problems.

I found a difference: on the server with PVE 8.4 the physical interface has MTU 1504, but the VMs get MTU 1500 and everything works. On PVE 9 the behavior is different: the physical interface also has MTU 1504, but the VMs get MTU 1504, not 1500.
 
We added this patch: https://lore.proxmox.com/pve-devel/20250717175012.606372-1-s.hanreich@proxmox.com/, which is most likely the issue.

So this means that the mtu of 1504 on the host gets automatically applied inside the vm, so you have both inside and outside an mtu of 1504.
Usually you would lower the mtu on the bridge, so that you have 1496 on the bridge and inside the vm. We then add the vlan tag on the physical interface on the host, where the default mtu is 1500.
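The arithmetic behind that rule of thumb can be sketched as follows. The helper name bridge_mtu is hypothetical, and the sketch simply follows the reasoning in this post (one 4-byte 802.1Q tag added on the physical port); whether your NIC and driver actually need that headroom is an assumption you should verify on your own hardware.

```shell
#!/bin/sh
# Size of one 802.1Q VLAN tag in bytes.
VLAN_TAG_BYTES=4

# MTU to configure on the bridge (and therefore in the VMs) so that a
# packet plus one VLAN tag still fits within the physical port's MTU.
bridge_mtu() {
    echo $(( $1 - VLAN_TAG_BYTES ))
}

for phys in 1500 1504; do
    echo "physical mtu $phys -> bridge/VM mtu $(bridge_mtu "$phys")"
done
```

With the default 1500 on the physical port this gives 1496 for the bridge, and with 1504 on the physical port it gives 1500, which matches the values discussed in this thread.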
 
As I see it, there are two options, plus a remark:
1) Force 1500 on the virtual interfaces (I have a lot of them) and leave MTU 1504 on the node. This seems like a so-so option to me, but it will do as a temporary one.
2) Set 1500 on the node, so that it is automatically passed on to the virtual interfaces. This solves the problem without having to force 1500 in the virtual network card settings of every VM.
3) About the patch that led to the problem: its behavior seems strange to me. Before, it was logical that the host MTU minus 4 was passed to the VM; now there is a conflict, or am I misunderstanding something? I also think I will not be the only one to run into this behavior after the upgrade. In my case FRR stopped working after the upgrade, and there may have been losses at the network level elsewhere.
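Option 1 from the list above (forcing the MTU per virtual NIC) could look roughly like this. It is a sketch only: the helper qm_force_mtu_cmd is made up for illustration, and it assumes that your PVE version accepts the mtu= option on VirtIO NICs (check "man qm" before relying on it). The example values are taken from net1 of VM 1580 shown earlier. The command is printed rather than executed, because setting netX replaces the whole NIC definition and the VM most likely has to be restarted for the change to apply.

```shell
#!/bin/sh
# Build (but do not run) a "qm set" command that pins a VirtIO NIC to a
# fixed MTU. Arguments: VM ID, net index, MAC, bridge, VLAN tag, MTU.
qm_force_mtu_cmd() {
    echo "qm set $1 --net$2 virtio=$3,bridge=$4,tag=$5,mtu=$6"
}

# net1 of VM 1580 from the config earlier in the thread
qm_force_mtu_cmd 1580 1 52:54:00:D4:9A:B9 vmbr0 4000 1500
```

Review the printed command and run it by hand once it matches the existing NIC line of the VM apart from the added mtu value.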
 
Maybe I misunderstand something: should I change the settings on the node's network interfaces, i.e. keep 1504 on the physical interface and set 4 less, i.e. 1500, on the virtual bridge for the VMs? Would everything then work logically?
 
You should choose option 2:
"set 1500 on the node and it will automatically be forwarded to virtual interfaces, which will solve the problem, then there will be no need to force 1500 in the virtual network card settings on all VMs"

"MTU -4 was forwarded to the VM from the host"

This was not the case before this patch.

"I misunderstand something, maybe I should change the settings on the node network interfaces, i.e. on the physical 1504, and on the virtual bridge for the VM set -4, i.e. 1500, then everything will work logically?"

This will not work, because the bridge (vmbr0) will just inherit the mtu from the physical interface (ens2f0np0).

So the best way is to just set the mtu to 1500 on the physical interface (ens2f0np0) and also set the mtu to 1500 on the bridge (vmbr0). All the vms will thus inherit the bridge mtu setting as well.
 
1) Hm, it works like this:

auto ens2f0np0
iface ens2f0np0 inet manual
mtu 1504
pre-up ethtool -K $IFACE rx-vlan-filter off

auto vmbr0
iface vmbr0 inet static
address
gateway
bridge-ports ens2f0np0
bridge-stp off
bridge-fd 0
bridge-vlan-aware yes
bridge-vids 2-4094
bridge-mcsnoop 0
mtu 1500
post-up devlink dev eswitch set pci/0000:81:00.0 mode switchdev

2) After a reboot, the node shows this:

ens2f0np0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1504
ether b8:59:9f:35:89:6a txqueuelen 1000 (Ethernet)
RX packets 27650 bytes 11324771 (10.8 MiB)
RX errors 0 dropped 3 overruns 0 frame 0
TX packets 14155 bytes 4311324 (4.1 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

vmbr0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet netmask broadcast
ether b8:59:9f:35:89:6a txqueuelen 1000 (Ethernet)
RX packets 24613 bytes 10599514 (10.1 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 14185 bytes 4315229 (4.1 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
 
Hi, this is what helped me after upgrading from PVE 8 to 9:

auto ens2f0np0
iface ens2f0np0 inet manual
mtu 1504
pre-up ethtool -K $IFACE rx-vlan-filter off

auto vmbr0
iface vmbr0 inet static
address
gateway
bridge-ports ens2f0np0
bridge-stp off
bridge-fd 0
bridge-vlan-aware yes
bridge-vids 2-4094
bridge-mcsnoop 0
mtu 1500
post-up devlink dev eswitch set pci/0000:81:00.0 mode switchdev

After this change the VMs get MTU 1500 instead of 1504 and everything works. However, all VMs have to be restarted, and live migration fails with an error, because (I think) the MTU differs between nodes (1500 on the already changed ones, 1504 on the old ones).

The problem was caused by the combination of my network configuration and this patch: https://lore.proxmox.com/pve-devel/20250717175012.606372-1-s.hanreich@proxmox.com/
 