[SOLVED] Proxmox Cluster down / keep Changing MTU

note the following (on all four nodes):

Code:
7: vmbr99: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6c:b3:11:65:16:78 brd ff:ff:ff:ff:ff:ff
8: vmbr99.11@vmbr99: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6c:b3:11:65:16:78 brd ff:ff:ff:ff:ff:ff

MTU 1500 -> ping fails (except when the target is the node itself, because then it goes via lo which has an MTU of 64k)
 
Please tell me what I should change it to.
I tried many MTU values, starting from 8972. I had a script that shows the last working ping, with the MTU going from 1 to 9999.
Everything from the physical NIC to the VLAN is set to 9000. As I read, all values need to be set to 9000 (or at least to the same value), but nothing works. The strange behavior is: I ping with "any_MTU" and it works, I apply that MTU, run the ping again, and it no longer works. The question is why.
 
And what could be the explanation for why it was working with MTU 9000 (on all 4 nodes)? Two weeks ago I updated the Proxmox packages, and after that 8972 was working (and 9000 no longer). On nodes 1 and 2 I ran an autoremove, but on 3 and 4 no changes were made.
 
I already told you - the bridges have an MTU of 1500.. once that is fixed, you can proceed to the next part. but I would suggest familiarizing yourself with the Linux networking stack if you want to implement such a rather complex setup..
 
OK, all bridges are set to 1500; I also tried 1472. The MTU is inherited from the physical NIC (I read). What next?
Is there paid support to fix the issue?
I am sick of it.
Again: the issue is that AFTER APPLYING the network change in the GUI, the MTU VALUE which worked 10 seconds ago (ping) suddenly doesn't work anymore.
It doesn't matter which value; even with no MTU set (not recommended, but it should still work) it's the same.
The ping answers, I enter the number in the GUI, press APPLY, then run the ping again: dead.
 
The Apply button calls "ifreload -a" to reload the configuration.

Maybe you can try running "ifreload -a -d" to get a debug log. Even without making any change, it should show whether changes are applied to the different interfaces.

(Just to be sure: if you do a clean boot with the current config, it works, and then after a reload it no longer works?)
 
Code:
info: {}
debug: enp41s0: pre-up : running module address
info: executing /sbin/sysctl net.mpls.conf.enp41s0.input=0
debug: enp41s0: up : running module dhcp
debug: enp41s0: up : running module address
debug: enp41s0: up : running module addressvirtual
debug: enp41s0: up : running module usercmds
debug: enp41s0: up : running script /etc/network/if-up.d/chrony
info: executing /etc/network/if-up.d/chrony
debug: enp41s0: up : running script /etc/network/if-up.d/postfix
info: executing /etc/network/if-up.d/postfix
debug: enp41s0: post-up : running module usercmds
debug: enp41s0: statemanager sync state pre-up
info: vmbr0: running ops ...
debug: vmbr0: pre-up : running module openvswitch
debug: vmbr0: pre-up : running module openvswitch_port
debug: vmbr0: pre-up : running module xfrm
debug: vmbr0: pre-up : running module link
debug: vmbr0: pre-up : running module bond
debug: vmbr0: pre-up : running module vlan
debug: vmbr0: pre-up : running module vxlan
debug: vmbr0: pre-up : running module usercmds
debug: vmbr0: pre-up : running module bridge
info: vmbr0: bridge already exists
info: vmbr0: applying bridge settings
info: vmbr0: reset bridge-hashel to default: 4
info: reading '/sys/class/net/vmbr0/bridge/stp_state'
info: vmbr0: netlink: ip link set dev vmbr0 type bridge (with attributes)
debug: attributes: {26: 4}
debug: vmbr0: evaluating port expr '['enp41s0']'
info: vmbr0: port enp41s0: already processed
info: vmbr0: applying bridge configuration specific to ports
info: vmbr0: processing bridge config for port enp41s0
debug: vmbr0: evaluating port expr '['enp41s0']'
info: bridge mac is already inherited from enp41s0
debug: vmbr0: _get_bridge_mac returned (enp41s0, a8:a1:59:c1:1c:46)
debug: vmbr0: cached hwaddress value: a8:a1:59:c1:1c:46
debug: vmbr0: pre-up : running module bridgevlan
debug: vmbr0: pre-up : running module tunnel
debug: vmbr0: pre-up : running module vrf
debug: vmbr0: pre-up : running module address
info: executing /sbin/sysctl net.mpls.conf.vmbr0.input=0
debug: vmbr0: up : running module dhcp
debug: vmbr0: up : running module address
info: executing /bin/ip route add default via public gw proto kernel dev vmbr0 onlink
debug: vmbr0: up : running module addressvirtual
debug: vmbr0: up : running module usercmds
debug: vmbr0: up : running script /etc/network/if-up.d/chrony
info: executing /etc/network/if-up.d/chrony
debug: vmbr0: up : running script /etc/network/if-up.d/postfix
info: executing /etc/network/if-up.d/postfix
debug: vmbr0: post-up : running module usercmds
debug: vmbr0: statemanager sync state pre-up
debug: vmbr99.11: found dependents ['vmbr99']
debug: vmbr99: found dependents ['enp33s0']
info: enp33s0: running ops ...
debug: enp33s0: pre-up : running module openvswitch
debug: enp33s0: pre-up : running module openvswitch_port
debug: enp33s0: pre-up : running module xfrm
debug: enp33s0: pre-up : running module link
debug: enp33s0: pre-up : running module bond
debug: enp33s0: pre-up : running module vlan
debug: enp33s0: pre-up : running module vxlan
debug: enp33s0: pre-up : running module usercmds
debug: enp33s0: pre-up : running module bridge
info: vmbr99: applying bridge port configuration: ['enp33s0']
debug: enp33s0: pre-up : running module bridgevlan
debug: enp33s0: pre-up : running module tunnel
debug: enp33s0: pre-up : running module vrf
debug: enp33s0: pre-up : running module address
info: executing /sbin/sysctl net.mpls.conf.enp33s0.input=0
debug: enp33s0: up : running module dhcp
debug: enp33s0: up : running module address
debug: enp33s0: up : running module addressvirtual
debug: enp33s0: up : running module usercmds
debug: enp33s0: up : running script /etc/network/if-up.d/chrony
info: executing /etc/network/if-up.d/chrony
debug: enp33s0: up : running script /etc/network/if-up.d/postfix
info: executing /etc/network/if-up.d/postfix
debug: enp33s0: post-up : running module usercmds
debug: enp33s0: statemanager sync state pre-up
info: vmbr99: running ops ...
debug: vmbr99: pre-up : running module openvswitch
debug: vmbr99: pre-up : running module openvswitch_port
debug: vmbr99: pre-up : running module xfrm
debug: vmbr99: pre-up : running module link
debug: vmbr99: pre-up : running module bond
debug: vmbr99: pre-up : running module vlan
debug: vmbr99: pre-up : running module vxlan
debug: vmbr99: pre-up : running module usercmds
debug: vmbr99: pre-up : running module bridge
info: vmbr99: bridge already exists
info: vmbr99: applying bridge settings
info: vmbr99: reset bridge-hashel to default: 4
info: reading '/sys/class/net/vmbr99/bridge/stp_state'
info: vmbr99: netlink: ip link set dev vmbr99 type bridge (with attributes)
debug: attributes: {26: 4}
debug: vmbr99: evaluating port expr '['enp33s0']'
info: vmbr99: port enp33s0: already processed
info: vmbr99: applying bridge configuration specific to ports
info: vmbr99: processing bridge config for port enp33s0
debug: vmbr99: evaluating port expr '['enp33s0']'
info: bridge mac is already inherited from enp33s0
debug: vmbr99: _get_bridge_mac returned (enp33s0, 6c:b3:11:65:16:44)
debug: vmbr99: cached hwaddress value: 6c:b3:11:65:16:44
debug: vmbr99: pre-up : running module bridgevlan
debug: vmbr99: pre-up : running module tunnel
debug: vmbr99: pre-up : running module vrf
debug: vmbr99: pre-up : running module address
info: executing /sbin/sysctl net.mpls.conf.vmbr99.input=0
info: vmbr99: bridge inherits mtu from its ports. There is no need to assign mtu on a bridge
debug: vmbr99: up : running module dhcp
debug: vmbr99: up : running module address
debug: vmbr99: up : running module addressvirtual
debug: vmbr99: up : running module usercmds
debug: vmbr99: up : running script /etc/network/if-up.d/chrony
info: executing /etc/network/if-up.d/chrony
debug: vmbr99: up : running script /etc/network/if-up.d/postfix
info: executing /etc/network/if-up.d/postfix
debug: vmbr99: post-up : running module usercmds
debug: vmbr99: statemanager sync state pre-up
info: vmbr99.11: running ops ...
debug: vmbr99.11: pre-up : running module openvswitch
debug: vmbr99.11: pre-up : running module openvswitch_port
debug: vmbr99.11: pre-up : running module xfrm
debug: vmbr99.11: pre-up : running module link
debug: vmbr99.11: pre-up : running module bond
debug: vmbr99.11: pre-up : running module vlan
info: vmbr99: netlink: bridge vlan add vid 11 dev vmbr99
debug: vmbr99.11: pre-up : running module vxlan
debug: vmbr99.11: pre-up : running module usercmds
debug: vmbr99.11: pre-up : running module bridge
debug: vmbr99.11: pre-up : running module bridgevlan
debug: vmbr99.11: pre-up : running module tunnel
debug: vmbr99.11: pre-up : running module vrf
debug: vmbr99.11: pre-up : running module address
info: executing /sbin/sysctl net.mpls.conf.vmbr99/11.input=0
info: writing '0' to file /proc/sys/net/ipv4/conf/vmbr99.11/arp_accept
debug: vmbr99.11: up : running module dhcp
debug: vmbr99.11: up : running module address
debug: vmbr99.11: up : running module addressvirtual
debug: vmbr99.11: up : running module usercmds
debug: vmbr99.11: up : running script /etc/network/if-up.d/chrony
info: executing /etc/network/if-up.d/chrony
debug: vmbr99.11: up : running script /etc/network/if-up.d/postfix
info: executing /etc/network/if-up.d/postfix
debug: vmbr99.11: post-up : running module usercmds
debug: vmbr99.11: statemanager sync state pre-up
debug: Net001: found dependents ['vxlan_Net001']
info: vxlan_Net001: running ops ...
debug: vxlan_Net001: pre-up : running module openvswitch
debug: vxlan_Net001: pre-up : running module openvswitch_port
debug: vxlan_Net001: pre-up : running module xfrm
debug: vxlan_Net001: pre-up : running module link
debug: vxlan_Net001: pre-up : running module bond
debug: vxlan_Net001: pre-up : running module vlan
debug: vxlan_Net001: pre-up : running module vxlan
info: vxlan_Net001: vxlan already exists - no change detected
debug: vxlan_Net001: pre-up : running module usercmds
debug: vxlan_Net001: pre-up : running module bridge
info: Net001: applying bridge port configuration: ['vxlan_Net001']
debug: vxlan_Net001: pre-up : running module bridgevlan
debug: vxlan_Net001: pre-up : running module tunnel
debug: vxlan_Net001: pre-up : running module vrf
debug: vxlan_Net001: pre-up : running module address
info: executing /sbin/sysctl net.mpls.conf.vxlan_Net001.input=0
debug: vxlan_Net001: up : running module dhcp
debug: vxlan_Net001: up : running module address
debug: vxlan_Net001: up : running module addressvirtual
debug: vxlan_Net001: up : running module usercmds
debug: vxlan_Net001: up : running script /etc/network/if-up.d/chrony
info: executing /etc/network/if-up.d/chrony
debug: vxlan_Net001: up : running script /etc/network/if-up.d/postfix
info: executing /etc/network/if-up.d/postfix
debug: vxlan_Net001: post-up : running module usercmds
debug: vxlan_Net001: statemanager sync state pre-up
info: Net001: running ops ...
debug: Net001: pre-up : running module openvswitch
debug: Net001: pre-up : running module openvswitch_port
debug: Net001: pre-up : running module xfrm
debug: Net001: pre-up : running module link
debug: Net001: pre-up : running module bond
debug: Net001: pre-up : running module vlan
debug: Net001: pre-up : running module vxlan
debug: Net001: pre-up : running module usercmds
debug: Net001: pre-up : running module bridge
info: Net001: bridge already exists
info: Net001: applying bridge settings
info: Net001: reset bridge-hashel to default: 4
info: reading '/sys/class/net/Net001/bridge/stp_state'
info: Net001: netlink: ip link set dev Net001 type bridge (with attributes)
debug: attributes: {26: 4}
debug: Net001: evaluating port expr '['vxlan_Net001']'
info: Net001: port vxlan_Net001: already processed
info: Net001: applying bridge configuration specific to ports
info: Net001: processing bridge config for port vxlan_Net001
debug: Net001: evaluating port expr '['vxlan_Net001']'
debug: Net001: _get_bridge_mac returned (None, None)
debug: Net001: pre-up : running module bridgevlan
debug: Net001: pre-up : running module tunnel
debug: Net001: pre-up : running module vrf
debug: Net001: pre-up : running module address
info: executing /sbin/sysctl net.mpls.conf.Net001.input=0
info: Net001: bridge inherits mtu from its ports. There is no need to assign mtu on a bridge
debug: Net001: up : running module dhcp
debug: Net001: up : running module address
debug: Net001: up : running module addressvirtual
debug: Net001: up : running module usercmds
debug: Net001: up : running script /etc/network/if-up.d/chrony
info: executing /etc/network/if-up.d/chrony
debug: Net001: up : running script /etc/network/if-up.d/postfix
info: executing /etc/network/if-up.d/postfix
debug: Net001: post-up : running module usercmds
debug: Net001: statemanager sync state pre-up
debug: Pub001: found dependents ['vxlan_Pub001']
info: vxlan_Pub001: running ops ...
debug: vxlan_Pub001: pre-up : running module openvswitch
debug: vxlan_Pub001: pre-up : running module openvswitch_port
debug: vxlan_Pub001: pre-up : running module xfrm
debug: vxlan_Pub001: pre-up : running module link
debug: vxlan_Pub001: pre-up : running module bond
debug: vxlan_Pub001: pre-up : running module vlan
debug: vxlan_Pub001: pre-up : running module vxlan
info: vxlan_Pub001: vxlan already exists - no change detected
debug: vxlan_Pub001: pre-up : running module usercmds
debug: vxlan_Pub001: pre-up : running module bridge
info: Pub001: applying bridge port configuration: ['vxlan_Pub001']
debug: vxlan_Pub001: pre-up : running module bridgevlan
debug: vxlan_Pub001: pre-up : running module tunnel
debug: vxlan_Pub001: pre-up : running module vrf
debug: vxlan_Pub001: pre-up : running module address
info: executing /sbin/sysctl net.mpls.conf.vxlan_Pub001.input=0
debug: vxlan_Pub001: up : running module dhcp
debug: vxlan_Pub001: up : running module address
debug: vxlan_Pub001: up : running module addressvirtual
debug: vxlan_Pub001: up : running module usercmds
debug: vxlan_Pub001: up : running script /etc/network/if-up.d/chrony
info: executing /etc/network/if-up.d/chrony
debug: vxlan_Pub001: up : running script /etc/network/if-up.d/postfix
info: executing /etc/network/if-up.d/postfix
debug: vxlan_Pub001: post-up : running module usercmds
debug: vxlan_Pub001: statemanager sync state pre-up
info: Pub001: running ops ...
debug: Pub001: pre-up : running module openvswitch
debug: Pub001: pre-up : running module openvswitch_port
debug: Pub001: pre-up : running module xfrm
debug: Pub001: pre-up : running module link
debug: Pub001: pre-up : running module bond
debug: Pub001: pre-up : running module vlan
debug: Pub001: pre-up : running module vxlan
debug: Pub001: pre-up : running module usercmds
debug: Pub001: pre-up : running module bridge


info: executing /etc/network/if-up.d/postfix
debug: Win106: post-up : running module usercmds
debug: Win106: statemanager sync state pre-up
debug: saving state ..
info: exit status 0
root@hvirt02:~#
 
I removed the unneeded lines, otherwise it would be too big.

After I applied with ifreload, the ping was not working anymore, with the consequence that I don't have a connection between the nodes.

It is the same ping which worked before. Now I tried to find the new MTU and it is now 1444.

I think we can agree on this: if no value is entered, it NEEDS to work! It just uses the standard value; the MTU is only for optimization, i.e. to increase the packet size.

BTW: I opened a ticket. It's now the third day my cluster is down. This cannot happen again. It's the second time in 2 weeks (as I said multiple times, it was working for 1 year with MTU 9000, then suddenly not anymore; the new MTU was then 8972, and now it seems I can't find any working MTU).

Is there a way to reinstall the server and keep the Ceph configuration (I cannot lose the VMs)?
 

I really don't see any MTU-related change in the ifreload log; that's really strange.

Just to be sure: you booted with this config and it was working fine? Then you ran ifreload -a with the same config (no change), and it stopped working?

Can you share your /etc/network/interfaces and /etc/network/interfaces.d/sdn? I'll try to reproduce it on my side.
 
Hello sprit,
thanks for your effort.
Until roughly 2 weeks ago it had been working for over 1 year with MTU 9000. Suddenly, after a Proxmox update (I rebooted all nodes), the value 9000 wasn't working anymore, and the new value was 8972. With the new value it worked for about 1 1/2 weeks. Then there was another update (tzdata); I ran it on 2 servers and also ran autoremove on those 2 servers (servers 1 and 2), rebooted them, and since then (I have already rebooted all of them) nothing works anymore. As I understand it, servers 2 and 3 have the issue, servers 1 and 4 work.
Now I can change whatever I want (see the description at the top) and reboot the servers after the change; it doesn't matter (I also rebooted the switch). I have not changed anything in the network settings for a long time, at least as far as I can remember right now.

interfaces
Code:
auto lo
iface lo inet loopback

auto enp41s0
iface enp41s0 inet manual
#1GB UPLINK

auto enp1s0f0
iface enp1s0f0 inet static
        address 10.10.15.10/24
        mtu 9000
#10GB SDN

auto enp1s0f1
iface enp1s0f1 inet static
        address 10.10.12.10/24
        mtu 9000
#10GB Corosync

auto enp33s0
iface enp33s0 inet manual
#10GB Ceph

auto vmbr99
iface vmbr99 inet static
        address 10.10.10.10/24
        bridge-ports enp33s0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
        mtu 1472
#10GB Ceph Public

auto vmbr0
iface vmbr0 inet static
        address publicip
        gateway public gw
        bridge-ports enp41s0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
#1GB UPLINK Public

auto vmbr99.11
iface vmbr99.11 inet static
        address 10.10.11.10/24
#10GB Ceph Cluster

source /etc/network/interfaces.d/*


sdn

Code:
#version:170

auto Net001
iface Net001
        bridge_ports vxlan_Net001
        bridge_stp off
        bridge_fd 0
        mtu 1450
        alias 192.168.175 Infrastructure

auto Pub001
iface Pub001
        bridge_ports vxlan_Pub001
        bridge_stp off
        bridge_fd 0
        mtu 1450
        alias 10.10.100 Webserver Network

auto Pub002
iface Pub002
        bridge_ports vxlan_Pub002
        bridge_stp off
        bridge_fd 0
        mtu 1450
        alias 10.10.101 eMailGW Network

auto Pub003
iface Pub003
        bridge_ports vxlan_Pub003
        bridge_stp off
        bridge_fd 0
        mtu 1450
        alias 10.10.102 Rest Network

auto Sec001
iface Sec001
        bridge_ports vxlan_Sec001
        bridge_stp off
        bridge_fd 0
        mtu 1450
        alias Secure Network SQL

auto Sec002
iface Sec002
        bridge_ports vxlan_Sec002
        bridge_stp off
        bridge_fd 0
        mtu 1450
        alias Secure Network Rest

auto Win100
iface Win100
        bridge_ports vxlan_Win100
        bridge_stp off
        bridge_fd 0
        mtu 1450
        alias 172.16.100 Windows Infrastructure

auto Win101
iface Win101
        bridge_ports vxlan_Win101
        bridge_stp off
        bridge_fd 0
        mtu 1450
        alias 172.16.101 Network

auto Win102
iface Win102
        bridge_ports vxlan_Win102
        bridge_stp off
        bridge_fd 0
        mtu 1450
        alias 172.16.102 Network

auto Win103
iface Win103
        bridge_ports vxlan_Win103
        bridge_stp off
        bridge_fd 0
        mtu 1450
        alias 172.16.103 Customer01 Network

auto Win104
iface Win104
        bridge_ports vxlan_Win104
        bridge_stp off
        bridge_fd 0
        mtu 1450
        alias 172.16.104 Customer02 Network

auto Win105
iface Win105
        bridge_ports vxlan_Win105
        bridge_stp off
        bridge_fd 0
        mtu 1450
        alias 172.16.105 Customer03 Network

auto Win106
iface Win106
        bridge_ports vxlan_Win106
        bridge_stp off
        bridge_fd 0
        mtu 1450
        alias 172.16.106 Demo Network

auto vxlan_Net001
iface vxlan_Net001
        vxlan-id 10175
        vxlan_remoteip 10.10.15.11
        vxlan_remoteip 10.10.15.12
        vxlan_remoteip 10.10.15.13
        mtu 1450

auto vxlan_Pub001
iface vxlan_Pub001
        vxlan-id 101100
        vxlan_remoteip 10.10.15.11
        vxlan_remoteip 10.10.15.12
        vxlan_remoteip 10.10.15.13
        mtu 1450

auto vxlan_Pub002
iface vxlan_Pub002
        vxlan-id 101101
        vxlan_remoteip 10.10.15.11
        vxlan_remoteip 10.10.15.12
        vxlan_remoteip 10.10.15.13
        mtu 1450

auto vxlan_Pub003
iface vxlan_Pub003
        vxlan-id 101102
        vxlan_remoteip 10.10.15.11
        vxlan_remoteip 10.10.15.12
        vxlan_remoteip 10.10.15.13
        mtu 1450

auto vxlan_Sec001
iface vxlan_Sec001
        vxlan-id 101111
        vxlan_remoteip 10.10.15.11
        vxlan_remoteip 10.10.15.12
        vxlan_remoteip 10.10.15.13
        mtu 1450

auto vxlan_Sec002
iface vxlan_Sec002
        vxlan-id 101112
        vxlan_remoteip 10.10.15.11
        vxlan_remoteip 10.10.15.12
        vxlan_remoteip 10.10.15.13
        mtu 1450

auto vxlan_Win106
iface vxlan_Win106
        vxlan-id 10106
        vxlan_remoteip 10.10.15.11
        vxlan_remoteip 10.10.15.12
        vxlan_remoteip 10.10.15.13
        mtu 1450
 
you are still lacking basic understanding of what that ping command does, and misconfigure your network as a result. please re-read my answers.

it is totally expected if you do

ping -s X -M do
set MTU to X
reload
ping -s X -M do

that the last ping will fail. the size passed to ping and the configured MTU should not be the same value.
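
The relationship between the -s value and the MTU in the steps above can be sketched in shell. ping -s sets the ICMP payload size, and for IPv4 a 20-byte IP header (without options) and an 8-byte ICMP header are added on top, so the largest payload that fits unfragmented is the MTU minus 28. The function name icmp_payload_for_mtu is made up for illustration, and the peer address in the usage comment is a hypothetical example.

```shell
#!/bin/sh
# Largest ICMP echo payload that fits into one unfragmented IPv4 packet.
# Overhead: 20 bytes IPv4 header (no options) + 8 bytes ICMP header = 28 bytes.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Usage (not executed here; 10.10.10.11 is a hypothetical peer address):
#   ping -c 3 -M do -s "$(icmp_payload_for_mtu 9000)" 10.10.10.11
```

For an MTU of 9000 this gives 8972 and for 1500 it gives 1472, which are the "working" ping sizes reported earlier in the thread. Those numbers are payload sizes, not MTU values to configure.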
 
note that

Code:
auto enp33s0
iface enp33s0 inet manual
#10GB Ceph

auto vmbr99
iface vmbr99 inet static
        address 10.10.10.10/24
        bridge-ports enp33s0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
        mtu 1472
#10GB Ceph Public

and log:

info: vmbr99: bridge inherits mtu from its ports. There is no need to assign mtu on a bridge

So I'm not sure that 1472 will apply here (maybe it takes enp33s0's default MTU of 1500).

#ip addr should show the current MTU
(it could be interesting to have the output of #ip addr before and after the reload to compare)

Personally, I also set the (same) MTU on both the physical NIC and the bridge, to be sure.
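
The before/after comparison suggested above could be scripted roughly like this. It is only a sketch: the helper name mtu_table and the temp file paths are made up, and it assumes the one-line-per-interface format printed by ip -o link show.

```shell
#!/bin/sh
# Reduce `ip -o link show` output (read from stdin) to "<interface> <mtu>" lines.
mtu_table() {
    awk '{
        name = $2
        sub(/:$/, "", name)              # strip the trailing colon after the name
        for (i = 3; i < NF; i++)
            if ($i == "mtu") { print name, $(i + 1); break }
    }'
}

# Usage (run as root on the node; not executed here):
#   ip -o link show | mtu_table > /tmp/mtu.before
#   ifreload -a
#   ip -o link show | mtu_table > /tmp/mtu.after
#   diff /tmp/mtu.before /tmp/mtu.after && echo "no MTU changed"
```

If diff prints nothing, the reload did not change the MTU of any interface, and the cause has to be looked for elsewhere.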
 
Can we agree that if all MTU values are removed, it needs to work?
Is that argument correct?
 
if all MTUs are removed on the interfaces themselves (defaulting to 1500), then a ping on the physical interface(s) with -s $((1500-28)) -M do should work, yes. for SDN you need to handle the additional overhead though by configuring the MTU in the SDN config accordingly.
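
As a rough sketch of the SDN overhead mentioned above: VXLAN over IPv4 wraps each frame in 50 extra bytes (14 inner Ethernet + 8 VXLAN + 8 UDP + 20 outer IPv4, assuming no VLAN tag on the inner frame), so the MTU inside the VXLAN has to be at least 50 bytes lower than the MTU of the underlay interface. The function name vxlan_mtu_for_underlay is hypothetical.

```shell
#!/bin/sh
# Largest MTU usable inside a VXLAN tunnel for a given underlay MTU (IPv4 underlay).
# Overhead: 14 (inner Ethernet) + 8 (VXLAN) + 8 (UDP) + 20 (outer IPv4) = 50 bytes.
vxlan_mtu_for_underlay() {
    echo $(( $1 - 50 ))
}

# Usage:
#   vxlan_mtu_for_underlay 1500
#   vxlan_mtu_for_underlay 9000
```

With a 1500-byte underlay this gives 1450, the value used in the posted sdn file; with a 9000-byte underlay it gives 8950.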
 
The SDN is only for the server-to-server communication and is on 10.10.15.x.
Ceph is on 10.10.10.x and 10.10.11.x.
Corosync is on 10.10.12.x.
There are still timeouts in the logs; neither Ceph nor Corosync comes to life.
On servers 1, 3 and 4 I see
pvestatd[34591]: got timeout
For server02 the logs are attached.
This is how it looks.
 


I don't think it's related, but about your logs:

Code:
Mar 28 10:17:26 hvirt02 bgpd[1108]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for L2VPN EVPN from 10.10.15.12 in vrf default
Mar 28 10:17:26 hvirt02 bgpd[1108]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for L2VPN EVPN from 10.10.15.10 in vrf default
Mar 28 10:17:26 hvirt02 bgpd[1108]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for L2VPN EVPN from 10.10.15.13 in vrf default
Do you use an EVPN SDN zone? (I don't see any related config in your /etc/network/interfaces.d/sdn; they look like simple VXLAN tunnels.)
If not, maybe there is an old config in /etc/frr/frr.conf? (Note that FRR with EVPN can make some changes to the bridge, and it doesn't work well with a VLAN-aware bridge.)
 
I was playing around with EVPN for a couple of weeks. I removed EVPN a while ago (and everything was working). Today I saw that I had left the controller in place, and I removed it.
Yes, in production I have VXLAN.
I will remove all FRR folders (they are still on every server).
 
I opened such a ticket.

Ticket #209196

Because I first need to buy support, I asked whether I need to buy a full subscription or whether they have offers based on engineer hours.
 
