After upgrade to 5.4 redundant corosync ring does not work as expected

Node#4:
Code:
root@pve-node4:~# ip -details addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 promiscuity 0 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond0 state UP group default qlen 1000
    link/ether ac:1f:6b:45:ba:5e brd ff:ff:ff:ff:ff:ff promiscuity 1
    bond_slave state ACTIVE mii_status UP link_failure_count 0 perm_hwaddr ac:1f:6b:45:ba:5e queue_id 0 ad_aggregator_id 1 ad_actor_oper_port_state 61 ad_partner_oper_port_state 61 numtxqueues 16 numrxqueues 16 gso_max_size 65536 gso_max_segs 65535
3: enp6s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond1 state UP group default qlen 1000
    link/ether 0c:c4:7a:1d:90:74 brd ff:ff:ff:ff:ff:ff promiscuity 0
    bond_slave state ACTIVE mii_status UP link_failure_count 0 perm_hwaddr 0c:c4:7a:1d:90:74 queue_id 0 ad_aggregator_id 1 ad_actor_oper_port_state 61 ad_partner_oper_port_state 61 numtxqueues 64 numrxqueues 64 gso_max_size 65536 gso_max_segs 65535
4: eno2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond0 state UP group default qlen 1000
    link/ether ac:1f:6b:45:ba:5e brd ff:ff:ff:ff:ff:ff promiscuity 1
    bond_slave state ACTIVE mii_status UP link_failure_count 0 perm_hwaddr ac:1f:6b:45:ba:5f queue_id 0 ad_aggregator_id 1 ad_actor_oper_port_state 61 ad_partner_oper_port_state 61 numtxqueues 16 numrxqueues 16 gso_max_size 65536 gso_max_segs 65535
5: enp6s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond1 state UP group default qlen 1000
    link/ether 0c:c4:7a:1d:90:74 brd ff:ff:ff:ff:ff:ff promiscuity 0
    bond_slave state ACTIVE mii_status UP link_failure_count 0 perm_hwaddr 0c:c4:7a:1d:90:75 queue_id 0 ad_aggregator_id 1 ad_actor_oper_port_state 61 ad_partner_oper_port_state 61 numtxqueues 64 numrxqueues 64 gso_max_size 65536 gso_max_segs 65535
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue master vmbr0 state UP group default qlen 1000
    link/ether ac:1f:6b:45:ba:5e brd ff:ff:ff:ff:ff:ff promiscuity 1
    bond mode 802.3ad miimon 100 updelay 0 downdelay 0 use_carrier 1 arp_interval 0 arp_validate none arp_all_targets any primary_reselect always fail_over_mac none xmit_hash_policy layer2+3 resend_igmp 1 num_grat_arp 1 all_slaves_active 0 min_links 0 lp_interval 1 packets_per_slave 1 lacp_rate slow ad_select stable ad_aggregator 1 ad_num_ports 2 ad_actor_key 9 ad_partner_key 1001 ad_partner_mac 02:04:96:8b:a0:dd ad_actor_sys_prio 65535 ad_user_port_key 0 ad_actor_system 00:00:00:00:00:00:00:00 tlb_dynamic_lb 1
    bridge_slave state forwarding priority 32 cost 4 hairpin off guard off root_block off fastleave off learning on flood on port_id 0x8001 port_no 0x1 designated_port 32769 designated_cost 0 designated_bridge 8000.ac:1f:6b:45:ba:5e designated_root 8000.ac:1f:6b:45:ba:5e hold_timer    0.00 message_age_timer    0.00 forward_delay_timer    0.00 topology_change_ack 0 config_pending 0 proxy_arp off proxy_arp_wifi off mcast_router 1 mcast_fast_leave off mcast_flood on neigh_suppress off group_fwd_mask 0x0 group_fwd_mask_str 0x0 vlan_tunnel off numtxqueues 16 numrxqueues 16 gso_max_size 65536 gso_max_segs 65535
7: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether ac:1f:6b:45:ba:5e brd ff:ff:ff:ff:ff:ff promiscuity 0
    bridge forward_delay 0 hello_time 200 max_age 2000 ageing_time 30000 stp_state 0 priority 32768 vlan_filtering 1 vlan_protocol 802.1Q bridge_id 8000.ac:1f:6b:45:ba:5e designated_root 8000.ac:1f:6b:45:ba:5e root_port 0 root_path_cost 0 topology_change 0 topology_change_detected 0 hello_timer    0.00 tcn_timer    0.00 topology_change_timer    0.00 gc_timer   28.62 vlan_default_pvid 1 vlan_stats_enabled 0 group_fwd_mask 0 group_address 01:80:c2:00:00:00 mcast_snooping 1 mcast_router 1 mcast_query_use_ifaddr 0 mcast_querier 0 mcast_hash_elasticity 4 mcast_hash_max 512 mcast_last_member_count 2 mcast_startup_query_count 2 mcast_last_member_interval 100 mcast_membership_interval 26000 mcast_querier_interval 25500 mcast_query_interval 12500 mcast_query_response_interval 1000 mcast_startup_query_interval 3124 mcast_stats_enabled 0 mcast_igmp_version 2 mcast_mld_version 1 nf_call_iptables 0 nf_call_ip6tables 0 nf_call_arptables 0 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    inet 10.71.200.104/24 brd 10.71.200.255 scope global vmbr0
       valid_lft forever preferred_lft forever
    inet6 fe80::ae1f:6bff:fe45:ba5e/64 scope link
       valid_lft forever preferred_lft forever

Node#5:
Code:
root@pve:~# ip -details addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 promiscuity 0 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 0c:c4:7a:6a:f0:e0 brd ff:ff:ff:ff:ff:ff promiscuity 1
    bond_slave state ACTIVE mii_status UP link_failure_count 0 perm_hwaddr 0c:c4:7a:6a:f0:e0 queue_id 0 ad_aggregator_id 2 ad_actor_oper_port_state 61 ad_partner_oper_port_state 61 numtxqueues 16 numrxqueues 16 gso_max_size 65536 gso_max_segs 65535
3: ens1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
    link/ether 90:e2:ba:2c:c5:64 brd ff:ff:ff:ff:ff:ff promiscuity 0 numtxqueues 64 numrxqueues 64 gso_max_size 65536 gso_max_segs 65535
    inet 10.10.10.100/24 brd 10.10.10.255 scope global ens1
       valid_lft forever preferred_lft forever
    inet6 fe80::92e2:baff:fe2c:c564/64 scope link
       valid_lft forever preferred_lft forever
4: enp3s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 0c:c4:7a:6a:f0:e0 brd ff:ff:ff:ff:ff:ff promiscuity 1
    bond_slave state ACTIVE mii_status UP link_failure_count 0 perm_hwaddr 0c:c4:7a:6a:f0:e1 queue_id 0 ad_aggregator_id 2 ad_actor_oper_port_state 61 ad_partner_oper_port_state 61 numtxqueues 16 numrxqueues 16 gso_max_size 65536 gso_max_segs 65535
5: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue master vmbr0 state UP group default qlen 1000
    link/ether 0c:c4:7a:6a:f0:e0 brd ff:ff:ff:ff:ff:ff promiscuity 0
    bond mode 802.3ad miimon 100 updelay 0 downdelay 0 use_carrier 1 arp_interval 0 arp_validate none arp_all_targets any primary_reselect always fail_over_mac none xmit_hash_policy layer2+3 resend_igmp 1 num_grat_arp 1 all_slaves_active 0 min_links 0 lp_interval 1 packets_per_slave 1 lacp_rate slow ad_select stable ad_aggregator 2 ad_num_ports 2 ad_actor_key 9 ad_partner_key 1007 ad_partner_mac 02:04:96:8b:a0:dd ad_actor_sys_prio 65535 ad_user_port_key 0 ad_actor_system 00:00:00:00:00:00:00:00 tlb_dynamic_lb 1
    bridge_slave state forwarding priority 32 cost 4 hairpin off guard off root_block off fastleave off learning on flood on port_id 0x8001 port_no 0x1 designated_port 32769 designated_cost 0 designated_bridge 8000.c:c4:7a:6a:f0:e0 designated_root 8000.c:c4:7a:6a:f0:e0 hold_timer    0.00 message_age_timer    0.00 forward_delay_timer    0.00 topology_change_ack 0 config_pending 0 proxy_arp off proxy_arp_wifi off mcast_router 1 mcast_fast_leave off mcast_flood on neigh_suppress off group_fwd_mask 0x0 group_fwd_mask_str 0x0 vlan_tunnel off numtxqueues 16 numrxqueues 16 gso_max_size 65536 gso_max_segs 65535
6: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether 0c:c4:7a:6a:f0:e0 brd ff:ff:ff:ff:ff:ff promiscuity 0
    bridge forward_delay 0 hello_time 200 max_age 2000 ageing_time 30000 stp_state 0 priority 32768 vlan_filtering 1 vlan_protocol 802.1Q bridge_id 8000.c:c4:7a:6a:f0:e0 designated_root 8000.c:c4:7a:6a:f0:e0 root_port 0 root_path_cost 0 topology_change 0 topology_change_detected 0 hello_timer    0.00 tcn_timer    0.00 topology_change_timer    0.00 gc_timer   33.87 vlan_default_pvid 1 vlan_stats_enabled 0 group_fwd_mask 0 group_address 01:80:c2:00:00:00 mcast_snooping 1 mcast_router 1 mcast_query_use_ifaddr 0 mcast_querier 0 mcast_hash_elasticity 4 mcast_hash_max 512 mcast_last_member_count 2 mcast_startup_query_count 2 mcast_last_member_interval 100 mcast_membership_interval 26000 mcast_querier_interval 25500 mcast_query_interval 12500 mcast_query_response_interval 1000 mcast_startup_query_interval 3124 mcast_stats_enabled 0 mcast_igmp_version 2 mcast_mld_version 1 nf_call_iptables 0 nf_call_ip6tables 0 nf_call_arptables 0 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    inet 10.71.200.100/24 brd 10.71.200.255 scope global vmbr0
       valid_lft forever preferred_lft forever
    inet6 fe80::ec4:7aff:fe6a:f0e0/64 scope link
       valid_lft forever preferred_lft forever

* Also, are the LACP/MTU settings active and working on the switch side?

Everything is fine on the switch side. Example output from one interface is attached:

(screenshot attached: upload_2019-4-24_12-44-13.png)
 
Also, I am still not sure whether all nodes have the faulty ring or whether it happens only on pve-node2 (most output with this error is from that node - but that could be just a coincidence)

Checked. The ring#1 state is flapping on every node in the cluster (which was almost obvious, given that the corosync state is cluster-wide)
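
For anyone following along, the ring state can be watched on each node with something like this (corosync 2.x as shipped with PVE 5.x - just a sketch):
Code:
corosync-cfgtool -s        # status of ring 0 and ring 1 on the local node
pvecm status               # overall quorum/membership view
journalctl -u corosync -f  # follow the corosync ring messages live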
 
Thanks - nothing out of the ordinary to my eyes :/

* The output for bond1 is missing for all nodes (but that's not the interface you're having problems with - right?)
* you could check the individual ethernet interfaces' statistics (e.g. `ethtool -S enp3s0f1` and `ip -statistics -details addr`) for dropped packets or other problems
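
A quick way to scan all interfaces for drop/error counters in one go (plain sysfs counters - just a sketch):
Code:
for f in /sys/class/net/*/statistics/{rx_dropped,tx_dropped,rx_errors,tx_errors}; do
    printf '%-55s %s\n' "$f" "$(cat "$f")"
done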
 
Thanks - nothing out of the ordinary to my eyes :/

* The output for bond1 is missing for all nodes (but that's not the interface you're having problems with - right?)

Right. That's the ring#0 10GbE network. It works perfectly well, so I trimmed the logs a bit

* you could check the individual ethernet interfaces' statistics (e.g. `ethtool -S enp3s0f1` and `ip -statistics -details addr`) for dropped packets or other problems

Here we go...

For node1:

root@pve-node1:~# ethtool -S enp3s0f1
Code:
NIC statistics:
     rx_packets: 21236223
     tx_packets: 60628227
     rx_bytes: 3148535126
     tx_bytes: 83208115554
     rx_broadcast: 25734
     tx_broadcast: 287
     rx_multicast: 263377
     tx_multicast: 2301
     multicast: 263377
     collisions: 0
     rx_crc_errors: 0
     rx_no_buffer_count: 0
     rx_missed_errors: 0
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_window_errors: 0
     tx_abort_late_coll: 0
     tx_deferred_ok: 0
     tx_single_coll_ok: 0
     tx_multi_coll_ok: 0
     tx_timeout_count: 0
     rx_long_length_errors: 0
     rx_short_length_errors: 0
     rx_align_errors: 0
     tx_tcp_seg_good: 5866929
     tx_tcp_seg_failed: 0
     rx_flow_control_xon: 0
     rx_flow_control_xoff: 0
     tx_flow_control_xon: 0
     tx_flow_control_xoff: 0
     rx_long_byte_count: 3148535126
     tx_dma_out_of_sync: 0
     lro_aggregated: 0
     lro_flushed: 0
     tx_smbus: 0
     rx_smbus: 0
     dropped_smbus: 0
     os2bmc_rx_by_bmc: 0
     os2bmc_tx_by_bmc: 0
     os2bmc_tx_by_host: 0
     os2bmc_rx_by_host: 0
     tx_hwtstamp_timeouts: 0
     rx_hwtstamp_cleared: 0
     rx_errors: 0
     tx_errors: 0
     tx_dropped: 0
     rx_length_errors: 0
     rx_over_errors: 0
     rx_frame_errors: 0
     rx_fifo_errors: 0
     tx_fifo_errors: 0
     tx_heartbeat_errors: 0
     tx_queue_0_packets: 60628227
     tx_queue_0_bytes: 82731400279
     tx_queue_0_restart: 223
     rx_queue_0_packets: 21236223
     rx_queue_0_bytes: 3040765917
     rx_queue_0_drops: 0
     rx_queue_0_csum_err: 2
     rx_queue_0_alloc_failed: 0

root@pve-node1:~# ethtool -S eno1
Code:
NIC statistics:
     rx_packets: 22343284
     tx_packets: 63400447
     rx_bytes: 3395833161
     tx_bytes: 90998896463
     rx_broadcast: 1289
     tx_broadcast: 155
     rx_multicast: 82519
     tx_multicast: 72791
     multicast: 82519
     collisions: 0
     rx_crc_errors: 0
     rx_no_buffer_count: 0
     rx_missed_errors: 0
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_window_errors: 0
     tx_abort_late_coll: 0
     tx_deferred_ok: 0
     tx_single_coll_ok: 0
     tx_multi_coll_ok: 0
     tx_timeout_count: 0
     rx_long_length_errors: 0
     rx_short_length_errors: 0
     rx_align_errors: 0
     tx_tcp_seg_good: 4795423
     tx_tcp_seg_failed: 0
     rx_flow_control_xon: 0
     rx_flow_control_xoff: 0
     tx_flow_control_xon: 0
     tx_flow_control_xoff: 0
     rx_long_byte_count: 3395833161
     tx_dma_out_of_sync: 0
     lro_aggregated: 0
     lro_flushed: 0
     tx_smbus: 0
     rx_smbus: 0
     dropped_smbus: 0
     os2bmc_rx_by_bmc: 0
     os2bmc_tx_by_bmc: 0
     os2bmc_tx_by_host: 0
     os2bmc_rx_by_host: 0
     tx_hwtstamp_timeouts: 0
     rx_hwtstamp_cleared: 0
     rx_errors: 0
     tx_errors: 0
     tx_dropped: 0
     rx_length_errors: 0
     rx_over_errors: 0
     rx_frame_errors: 0
     rx_fifo_errors: 0
     tx_fifo_errors: 0
     tx_heartbeat_errors: 0
     tx_queue_0_packets: 63400447
     tx_queue_0_bytes: 90491461250
     tx_queue_0_restart: 66
     rx_queue_0_packets: 22343284
     rx_queue_0_bytes: 3283557892
     rx_queue_0_drops: 0
     rx_queue_0_csum_err: 1
     rx_queue_0_alloc_failed: 0

root@pve-node1:~# ip -statistics -details addr
Code:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 promiscuity 0 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
    RX: bytes  packets  errors  dropped overrun mcast
    352305921847 17131186 0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    352305921847 17131186 0       0       0       0
2: eno1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 0c:c4:7a:2b:a7:30 brd ff:ff:ff:ff:ff:ff promiscuity 1
    bond_slave state ACTIVE mii_status UP link_failure_count 0 perm_hwaddr 0c:c4:7a:2b:a7:30 queue_id 0 ad_aggregator_id 2 ad_actor_oper_port_state 61 ad_partner_oper_port_state 61 numtxqueues 16 numrxqueues 16 gso_max_size 65536 gso_max_segs 65535
    RX: bytes  packets  errors  dropped overrun mcast
    3284579874 22348026 0       2351    0       82569
    TX: bytes  packets  errors  dropped carrier collsns
    90500847833 63411235 0       0       0       0
3: ens1f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond1 state UP group default qlen 1000
    link/ether 0c:c4:7a:1d:8c:c6 brd ff:ff:ff:ff:ff:ff promiscuity 0
    bond_slave state ACTIVE mii_status UP link_failure_count 0 perm_hwaddr 0c:c4:7a:1d:8c:c6 queue_id 0 ad_aggregator_id 2 ad_actor_oper_port_state 61 ad_partner_oper_port_state 61 numtxqueues 64 numrxqueues 64 gso_max_size 65536 gso_max_segs 65535
    RX: bytes  packets  errors  dropped overrun mcast
    1481777991422 255617407 0       0       0       364278
    TX: bytes  packets  errors  dropped carrier collsns
    1406871402027 251960861 0       0       0       0
4: enp3s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 0c:c4:7a:2b:a7:30 brd ff:ff:ff:ff:ff:ff promiscuity 1
    bond_slave state ACTIVE mii_status UP link_failure_count 0 perm_hwaddr 0c:c4:7a:2b:a7:31 queue_id 0 ad_aggregator_id 2 ad_actor_oper_port_state 61 ad_partner_oper_port_state 61 numtxqueues 16 numrxqueues 16 gso_max_size 65536 gso_max_segs 65535
    RX: bytes  packets  errors  dropped overrun mcast
    3044793952 21250278 0       2351    0       263708
    TX: bytes  packets  errors  dropped carrier collsns
    82740996474 60643296 0       0       0       0
5: ens1f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond1 state UP group default qlen 1000
    link/ether 0c:c4:7a:1d:8c:c6 brd ff:ff:ff:ff:ff:ff promiscuity 0
    bond_slave state ACTIVE mii_status UP link_failure_count 0 perm_hwaddr 0c:c4:7a:1d:8c:c7 queue_id 0 ad_aggregator_id 2 ad_actor_oper_port_state 61 ad_partner_oper_port_state 61 numtxqueues 64 numrxqueues 64 gso_max_size 65536 gso_max_segs 65535
    RX: bytes  packets  errors  dropped overrun mcast
    3072086434106 455110355 0       0       0       366269
    TX: bytes  packets  errors  dropped carrier collsns
    1489953004961 306557080 0       0       0       0
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue master vmbr0 state UP group default qlen 1000
    link/ether 0c:c4:7a:2b:a7:30 brd ff:ff:ff:ff:ff:ff promiscuity 1
    bond mode 802.3ad miimon 100 updelay 0 downdelay 0 use_carrier 1 arp_interval 0 arp_validate none arp_all_targets any primary_reselect always fail_over_mac none xmit_hash_policy layer2+3 resend_igmp 1 num_grat_arp 1 all_slaves_active 0 min_links 0 lp_interval 1 packets_per_slave 1 lacp_rate slow ad_select stable ad_aggregator 2 ad_num_ports 2 ad_actor_key 9 ad_partner_key 1003 ad_partner_mac 02:04:96:8b:a0:dd ad_actor_sys_prio 65535 ad_user_port_key 0 ad_actor_system 00:00:00:00:00:00:00:00 tlb_dynamic_lb 1
    bridge_slave state forwarding priority 32 cost 4 hairpin off guard off root_block off fastleave off learning on flood on port_id 0x8001 port_no 0x1 designated_port 32769 designated_cost 0 designated_bridge 8000.c:c4:7a:2b:a7:30 designated_root 8000.c:c4:7a:2b:a7:30 hold_timer    0.00 message_age_timer    0.00 forward_delay_timer    0.00 topology_change_ack 0 config_pending 0 proxy_arp off proxy_arp_wifi off mcast_router 1 mcast_fast_leave off mcast_flood on neigh_suppress off group_fwd_mask 0x0 group_fwd_mask_str 0x0 vlan_tunnel off numtxqueues 16 numrxqueues 16 gso_max_size 65536 gso_max_segs 65535
    RX: bytes  packets  errors  dropped overrun mcast
    6329373886 43598305 0       4702    0       346277
    TX: bytes  packets  errors  dropped carrier collsns
    173241844307 124054531 0       243     0       0
7: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether 0c:c4:7a:2b:a7:30 brd ff:ff:ff:ff:ff:ff promiscuity 0
    bridge forward_delay 0 hello_time 200 max_age 2000 ageing_time 30000 stp_state 0 priority 32768 vlan_filtering 1 vlan_protocol 802.1Q bridge_id 8000.c:c4:7a:2b:a7:30 designated_root 8000.c:c4:7a:2b:a7:30 root_port 0 root_path_cost 0 topology_change 0 topology_change_detected 0 hello_timer    0.00 tcn_timer    0.00 topology_change_timer    0.00 gc_timer  122.57 vlan_default_pvid 1 vlan_stats_enabled 0 group_fwd_mask 0 group_address 01:80:c2:00:00:00 mcast_snooping 1 mcast_router 1 mcast_query_use_ifaddr 0 mcast_querier 0 mcast_hash_elasticity 4 mcast_hash_max 512 mcast_last_member_count 2 mcast_startup_query_count 2 mcast_last_member_interval 100 mcast_membership_interval 26000 mcast_querier_interval 25500 mcast_query_interval 12500 mcast_query_response_interval 1000 mcast_startup_query_interval 3124 mcast_stats_enabled 0 mcast_igmp_version 2 mcast_mld_version 1 nf_call_iptables 0 nf_call_ip6tables 0 nf_call_arptables 0 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    inet 10.71.200.101/24 brd 10.71.200.255 scope global vmbr0
       valid_lft forever preferred_lft forever
    inet6 fe80::ec4:7aff:fe2b:a730/64 scope link
       valid_lft forever preferred_lft forever
    RX: bytes  packets  errors  dropped overrun mcast
    756623324  3312669  0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    873868241  2938790  0       0       0       0

Dropped 4702 for bond0 ... interesting...
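
To see whether that counter keeps climbing while the ring flaps, something like this can be left running on a node (rough sketch, reading the kernel's sysfs counter):
Code:
while true; do
    echo "$(date '+%F %T')  bond0 rx_dropped=$(cat /sys/class/net/bond0/statistics/rx_dropped)"
    sleep 10
done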
 
Almost the same results (dropped ~4500 packets) on all other nodes

New driver bug (new kernel)?
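
One way to compare the driver between kernels would be something like this (sketch):
Code:
ethtool -i eno1                                        # driver, version and firmware of the NIC
modinfo igb | grep -iE '^(filename|version|vermagic)'  # module actually loaded by this kernel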
 
On nodes #1-3
Code:
01:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
01:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
03:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
03:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)

Node#4:
Code:
06:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
06:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
07:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
08:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)

Node#5:
Code:
01:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
03:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
03:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)


dmesg from node1
Code:
root@pve-node1:~# dmesg | grep Eth
[    1.804131] Intel(R) Gigabit Ethernet Linux Driver - version 5.3.5.18
[    1.924797] igb 0000:03:00.0: Intel(R) Gigabit Ethernet Linux Driver
[    2.088695] igb 0000:03:00.1: Intel(R) Gigabit Ethernet Linux Driver
[   10.679486] Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

root@pve-node1:~# dmesg | grep igb
[    1.802665] igb: loading out-of-tree module taints kernel.
[    1.924796] igb 0000:03:00.0: added PHC on eth0
[    1.924797] igb 0000:03:00.0: Intel(R) Gigabit Ethernet Linux Driver
[    1.924798] igb 0000:03:00.0: eth0: (PCIe:5.0GT/s:Width x4)
[    1.924800] igb 0000:03:00.0 eth0: MAC: 0c:c4:7a:2b:a7:30
[    1.924920] igb 0000:03:00.0: eth0: PBA No: 070B00-000
[    1.925043] igb 0000:03:00.0: hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
[    1.936130] igb 0000:03:00.0: LRO is disabled
[    1.936132] igb 0000:03:00.0: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
[    2.088694] igb 0000:03:00.1: added PHC on eth2
[    2.088695] igb 0000:03:00.1: Intel(R) Gigabit Ethernet Linux Driver
[    2.088696] igb 0000:03:00.1: eth2: (PCIe:5.0GT/s:Width x4)
[    2.088699] igb 0000:03:00.1 eth2: MAC: 0c:c4:7a:2b:a7:31
[    2.088819] igb 0000:03:00.1: eth2: PBA No: 070B00-000
[    2.100127] igb 0000:03:00.1: LRO is disabled
[    2.100128] igb 0000:03:00.1: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
[    2.101153] igb 0000:03:00.1 enp3s0f1: renamed from eth2
[    2.128235] igb 0000:03:00.0 eno1: renamed from eth0
[    7.132906] igb 0000:03:00.0: DCA enabled
[    7.132927] igb 0000:03:00.1: DCA enabled
[   10.889888] igb 0000:03:00.0: changing MTU from 1500 to 9000
[   10.984075] igb 0000:03:00.1: changing MTU from 1500 to 9000
[   14.076734] igb 0000:03:00.0 eno1: igb: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[   14.476669] igb 0000:03:00.1 enp3s0f1: igb: enp3s0f1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[ 2728.260794] igb 0000:03:00.0: changing MTU from 9000 to 1500
[ 2728.539692] igb 0000:03:00.1: changing MTU from 9000 to 1500
[ 2728.731448] igb 0000:03:00.0 eno1: speed changed to 0 for port eno1
[ 2728.731551] igb 0000:03:00.1 enp3s0f1: speed changed to 0 for port enp3s0f1
[ 2731.972034] igb 0000:03:00.1 enp3s0f1: igb: enp3s0f1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[ 2732.451992] igb 0000:03:00.0 eno1: igb: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[ 2805.638565] igb 0000:03:00.0: changing MTU from 1500 to 9000
[ 2805.850047] igb 0000:03:00.1: changing MTU from 1500 to 9000
[ 2806.000741] igb 0000:03:00.0 eno1: speed changed to 0 for port eno1
[ 2806.000852] igb 0000:03:00.1 enp3s0f1: speed changed to 0 for port enp3s0f1
[ 2808.979322] igb 0000:03:00.0 eno1: igb: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[ 2810.079237] igb 0000:03:00.1 enp3s0f1: igb: enp3s0f1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

root@pve-node1:~# uname -a
Linux pve-node1 4.15.18-12-pve #1 SMP PVE 4.15.18-35 (Wed, 13 Mar 2019 08:24:42 +0100) x86_64 GNU/Linux

Output from the working cluster (5.3) - where both rings are working correctly
Code:
root@pve-node1:~# dmesg | grep Eth
[    1.838267] Intel(R) Gigabit Ethernet Linux Driver - version 5.3.5.18
[    1.948530] igb 0000:01:00.0: Intel(R) Gigabit Ethernet Linux Driver
[    2.076463] igb 0000:01:00.1: Intel(R) Gigabit Ethernet Linux Driver
[   16.303317] Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
root@pve-node1:~# dmesg | grep igb
[    1.837431] igb: loading out-of-tree module taints kernel.
[    1.948528] igb 0000:01:00.0: added PHC on eth0
[    1.948530] igb 0000:01:00.0: Intel(R) Gigabit Ethernet Linux Driver
[    1.948531] igb 0000:01:00.0: eth0: (PCIe:5.0GT/s:Width x4)
[    1.948533] igb 0000:01:00.0 eth0: MAC: 00:25:90:88:a7:00
[    1.948604] igb 0000:01:00.0: eth0: PBA No: 104900-000
[    1.948678] igb 0000:01:00.0: hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
[    1.960071] igb 0000:01:00.0: LRO is disabled
[    1.960073] igb 0000:01:00.0: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
[    2.076462] igb 0000:01:00.1: added PHC on eth1
[    2.076463] igb 0000:01:00.1: Intel(R) Gigabit Ethernet Linux Driver
[    2.076465] igb 0000:01:00.1: eth1: (PCIe:5.0GT/s:Width x4)
[    2.076466] igb 0000:01:00.1 eth1: MAC: 00:25:90:88:a7:01
[    2.076538] igb 0000:01:00.1: eth1: PBA No: 104900-000
[    2.088041] igb 0000:01:00.1: LRO is disabled
[    2.088050] igb 0000:01:00.1: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
[    2.089047] igb 0000:01:00.0 eno1: renamed from eth0
[    2.108326] igb 0000:01:00.1 eno2: renamed from eth1
[   14.557509] igb 0000:01:00.0: DCA enabled
[   14.557734] igb 0000:01:00.1: DCA enabled
[   16.522308] igb 0000:01:00.0: changing MTU from 1500 to 9000
[   16.614201] igb 0000:01:00.1: changing MTU from 1500 to 9000
[   19.924360] igb 0000:01:00.1 eno2: igb: eno2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[   20.140421] igb 0000:01:00.0 eno1: igb: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

root@pve-node1:~# uname -a
Linux pve-node1 4.15.18-11-pve #1 SMP PVE 4.15.18-33 (Tue, 05 Feb 2019 07:36:16 +0100) x86_64 GNU/Linux
 
Dropped 4702 for bond0 ... interesting...
Well, that's only about 0.01% - but still, it could be the issue - do the switch statistics show the same/similar numbers?

New driver bug (new kernel)?
Hmm - you experience the issue on the 1G i350/i210 NICs? For those the pve-kernel uses the out-of-tree driver, which has not changed between 5.3 and 5.4
But if possible you could reboot to 4.15.18-11-pve on the cluster and see if the error goes away (IIRC there should not be a problem with booting an older kernel) - that would narrow the issue down!
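
In case it helps, booting the previous kernel once could look roughly like this (the exact GRUB entry title differs per system, so treat the last line as a placeholder; grub-reboot needs GRUB_DEFAULT=saved - otherwise just pick the older kernel from the "Advanced options" submenu at boot):
Code:
dpkg -l 'pve-kernel-4.15*' | grep ^ii                                  # is the older kernel still installed?
grep -E "(menuentry|submenu) " /boot/grub/grub.cfg | cut -d"'" -f2     # list the GRUB entry titles
grub-reboot 'Advanced options for ...>..., with Linux 4.15.18-11-pve'  # one-time default for the next boot
reboot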
 
[ 2805.638565] igb 0000:03:00.0: changing MTU from 1500 to 9000
[ 2805.850047] igb 0000:03:00.1: changing MTU from 1500 to 9000
[ 2806.000741] igb 0000:03:00.0 eno1: speed changed to 0 for port eno1
[ 2806.000852] igb 0000:03:00.1 enp3s0f1: speed changed to 0 for port enp3s0f1
[ 2808.979322] igb 0000:03:00.0 eno1: igb: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[ 2810.079237] igb 0000:03:00.1 enp3s0f1: igb: enp3s0f1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

Are these messages from changes you made live on the cluster? (else it would indicate a flapping link)

A newer kernel just hit pve-no-subscription repository - you could also try that one
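
If the no-subscription repository is not configured yet, the update would go roughly like this (the repository line is my assumption for PVE 5.x on Debian Stretch, and the kernel package name below is only a placeholder - adjust to what apt actually lists):
Code:
echo "deb http://download.proxmox.com/debian/pve stretch pve-no-subscription" \
    > /etc/apt/sources.list.d/pve-no-subscription.list
apt update
apt search pve-kernel-4.15 2>/dev/null | grep '^pve-kernel'   # pick the newest version listed
# apt install pve-kernel-4.15.18-xx-pve                       # placeholder name - then reboot into it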
 
Well, that's only about 0.01% - but still, it could be the issue - do the switch statistics show the same/similar numbers?


Hmm - you experience the issue on the 1G i350/i210 NICs? For those the pve-kernel uses the out-of-tree driver, which has not changed between 5.3 and 5.4
But if possible you could reboot to 4.15.18-11-pve on the cluster and see if the error goes away (IIRC there should not be a problem with booting an older kernel) - that would narrow the issue down!

On Node#5 I've tried the 4.15.18-11-pve kernel. No difference - I still see dropped packets

Code:
root@pve:~# uname -a
Linux pve 4.15.18-11-pve #1 SMP PVE 4.15.18-34 (Mon, 25 Feb 2019 14:51:06 +0100) x86_64 GNU/Linux


5: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue master vmbr0 state UP group default qlen 1000
    link/ether 0c:c4:7a:6a:f0:e0 brd ff:ff:ff:ff:ff:ff promiscuity 0
    bond mode 802.3ad miimon 100 updelay 0 downdelay 0 use_carrier 1 arp_interval 0 arp_validate none arp_all_targets any primary_reselect always fail_over_mac none xmit_hash_policy layer2+3 resend_igmp 1 num_grat_arp 1 all_slaves_active 0 min_links 0 lp_interval 1 packets_per_slave 1 lacp_rate slow ad_select stable ad_aggregator 2 ad_num_ports 2 ad_actor_key 9 ad_partner_key 1007 ad_partner_mac 02:04:96:8b:a0:dd ad_actor_sys_prio 65535 ad_user_port_key 0 ad_actor_system 00:00:00:00:00:00:00:00 tlb_dynamic_lb 1
    bridge_slave state forwarding priority 32 cost 4 hairpin off guard off root_block off fastleave off learning on flood on port_id 0x8001 port_no 0x1 designated_port 32769 designated_cost 0 designated_bridge 8000.c:c4:7a:6a:f0:e0 designated_root 8000.c:c4:7a:6a:f0:e0 hold_timer    0.00 message_age_timer    0.00 forward_delay_timer    0.00 topology_change_ack 0 config_pending 0 proxy_arp off proxy_arp_wifi off mcast_router 1 mcast_fast_leave off mcast_flood on neigh_suppress off group_fwd_mask 0x0 group_fwd_mask_str 0x0 vlan_tunnel off numtxqueues 16 numrxqueues 16 gso_max_size 65536 gso_max_segs 65535
    RX: bytes  packets  errors  dropped overrun mcast
    2490986    11875    0       24      0       1154
    TX: bytes  packets  errors  dropped carrier collsns
    1986971    11411    0       0       0       0

root@pve:~# dmesg | grep igb
[    1.781527] igb: loading out-of-tree module taints kernel.
[    1.912626] igb 0000:03:00.0: added PHC on eth0
[    1.912627] igb 0000:03:00.0: Intel(R) Gigabit Ethernet Linux Driver
[    1.912629] igb 0000:03:00.0: eth0: (PCIe:5.0GT/s:Width x4)
[    1.912631] igb 0000:03:00.0 eth0: MAC: 0c:c4:7a:6a:f0:e0
[    1.912751] igb 0000:03:00.0: eth0: PBA No: 070B00-000
[    1.912874] igb 0000:03:00.0: hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
[    1.924068] igb 0000:03:00.0: LRO is disabled
[    1.924070] igb 0000:03:00.0: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
[    2.056641] igb 0000:03:00.1: added PHC on eth1
[    2.056642] igb 0000:03:00.1: Intel(R) Gigabit Ethernet Linux Driver
[    2.056644] igb 0000:03:00.1: eth1: (PCIe:5.0GT/s:Width x4)
[    2.056646] igb 0000:03:00.1 eth1: MAC: 0c:c4:7a:6a:f0:e1
[    2.056766] igb 0000:03:00.1: eth1: PBA No: 070B00-000
[    2.068088] igb 0000:03:00.1: LRO is disabled
[    2.068090] igb 0000:03:00.1: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
[    2.069382] igb 0000:03:00.1 enp3s0f1: renamed from eth1
[    2.096296] igb 0000:03:00.0 eno1: renamed from eth0
[    3.522139] igb 0000:03:00.0: DCA enabled
[    3.522157] igb 0000:03:00.1: DCA enabled
[    6.503274] igb 0000:03:00.0: changing MTU from 1500 to 9000
[    6.595196] igb 0000:03:00.1: changing MTU from 1500 to 9000
[    9.940504] igb 0000:03:00.1 enp3s0f1: igb: enp3s0f1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[   10.244511] igb 0000:03:00.0 eno1: igb: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
 
Are these messages from changes you made live on the cluster? (else it would indicate a flapping link)

A newer kernel just hit pve-no-subscription repository - you could also try that one

Yep, I was playing with changing the MTU online (ignore these messages). The default MTU (on boot) has been set to 9000 since 5.1
I will try setting the MTU to 1500 across the whole cluster, but not before the weekend (at the earliest)
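
For a quick live test before touching /etc/network/interfaces, the MTU could be lowered on the fly - interface names taken from the outputs above, so this is only a sketch:
Code:
ip link set dev vmbr0 mtu 1500
ip link set dev bond0 mtu 1500   # the bonding driver propagates the new MTU to its slaves
# to make it permanent, set "mtu 1500" on the interfaces in /etc/network/interfaces and reboot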
 
I will try setting the MTU to 1500 across the whole cluster, but not before the weekend (at the earliest)
Maybe worth a try - however, if I read your `ethtool` output correctly, I don't see any jumbo-frame issues (assuming 'rx_long_length_errors' indicates frames dropped due to MTU settings and 'rx_long_byte_count' indicates jumbo frames)
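
A simple end-to-end check for jumbo frames between two nodes (8972 = 9000 minus 20 bytes IP and 8 bytes ICMP header; the address is just one of the node IPs from the outputs above):
Code:
ping -M do -s 8972 -c 3 10.71.200.104   # -M do sets the don't-fragment bit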

* Do the switch interface statistics also look ok?

* It may sound odd, but please make sure your firmware/BIOS is updated to the latest level

You could also try to remove one node from the cluster and see if that narrows it down somehow (and do this node after node) - if this is feasible in your environment of course!
 
* Do the switch interface statistics also look ok?

No drops from the switch's point of view.
I will investigate this:

   1:5    2:5     LACP       1     L3          1:5        -      R      120
                                   L3          2:5        Y      A      113

Code:
Slot-2 Stack.1 # show sharing
Load Sharing Monitor
Config    Current Agg     Min     Ld Share    Ld Share  Agg   Link   Link Up
Master    Master  Control Active  Algorithm   Group     Mbr   State  Transitions
================================================================================
   1:1    1:1     LACP       1     L3          1:1        Y      A      357
                                   L3          2:1        Y      A      359
   1:3    1:3     LACP       1     L3          1:3        Y      A      147
                                   L3          2:3        Y      A      133
   1:4    1:4     LACP       1     L3          1:4        Y      A       83
                                   L3          2:4        Y      A       77
   1:5    2:5     LACP       1     L3          1:5        -      R      120
                                   L3          2:5        Y      A      113
   1:7    1:7     LACP       1     L3          1:7        Y      A       74
                                   L3          2:7        Y      A       36
================================================================================
Link State: A-Active, D-Disabled, R-Ready, NP-Port not present, L-Loopback
Minimum Active: (<) Group is down. # active links less than configured minimum
Load Sharing Algorithm: (L2) Layer 2 address based, (L3) Layer 3 address based
                        (L3_L4) Layer 3 address and Layer 4 port based
                        (custom) User-selected address-based configuration
Custom Algorithm Configuration: ipv4 L3-and-L4, xor
Number of load sharing trunks: 7
Slot-2 Stack.2 # show lacp counters

LACP PDUs dropped on non-LACP ports : 26488
LACP Bulk checkpointed msgs sent    : 1
LACP Bulk checkpointed msgs recv    : 1
LACP PDUs checkpointed sent         : 5671254
LACP PDUs checkpointed recv         : 6077858

Lag        Member     Rx       Rx Drop  Rx Drop  Rx Drop  Tx       Tx
Group      Port       Ok       PDU Err  Not Up   Same MAC Sent Ok  Xmit Err
--------------------------------------------------------------------------------
1:1        1:1        3146     0        0        0        3278     0
           2:1        3146     0        0        0        3279     0

1:3        1:3        2464     0        0        0        2570     0
           2:3        2465     0        0        0        2569     0

1:4        1:4        2461     0        0        0        2567     0
           2:4        2463     0        0        0        2568     0

1:5        1:5        2        0        0        0        6        0
           2:5        2541     0        0        0        2649     0

1:7        1:7        34       0        0        0        39       0
           2:7        36       0        0        0        40       0
 
I don't know this switch's output format (googling indicates these might be Extreme switches? just curious) - but are these statistics and counters for all packets, or only for LACP traffic/PDUs?
 
I don't know this switch's output format (googling indicates these might be Extreme switches? just curious) - but are these statistics and counters for all packets, or only for LACP traffic/PDUs?
Yep, eXtreme ones. All the counters provided are for the LAGs only
 
* The 2 LAGs with Algorithm L3_L4 are not part of the corosync ring - right? (that might be an issue, but I'd assume it's unlikely)
* Else it seems that the network is running stable and not dropping packets
I'd suggest ruling out a single node being the culprit (as said above - vacate a node, shut it down (or stop corosync) and see if the ring stays stable) - if that's an option in your production environment
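
Such a test could look roughly like this on the node in question (a sketch - and only after migrating its VMs; with active HA services, stopping corosync can trigger fencing, so stop those first):
Code:
systemctl stop pve-ha-lrm pve-ha-crm    # only needed if HA is configured
systemctl stop corosync
# then, on the remaining nodes, watch whether ring#1 stays stable:
journalctl -u corosync -f
corosync-cfgtool -s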

If this does not help - please try the newer kernel, and if the problem still persists, check whether the ring stays stable with the older kernel that works on your other cluster - this helps with hunting down any driver/kernel issues

Thanks!
 
* The 2 LAGs with Algorithm L3_L4 are not part of the corosync ring - right? (that might be an issue, but I'd assume it's unlikely)

Correct. My 5 nodes cluster is associated with ports 1,3,4,5,7 (L3)

* Else it seems that the network is running stable and not dropping packets
I'd suggest ruling out a single node being the culprit (as said above - vacate a node, shut it down (or stop corosync) and see if the ring stays stable) - if that's an option in your production environment

Quite hard to do, but I will try my best

If this does not help - please try the newer kernel, and if the problem still persists, check whether the ring stays stable with the older kernel that works on your other cluster - this helps with hunting down any driver/kernel issues

Thanks!

Using the latest kernel (from PVE test) makes no difference :(
 
do you have any update on the issues?
are you still experiencing the problems?
 
do you have any update on the issues?
are you still experiencing the problems?
Yes, nothing has changed. No ideas so far :(

What has been checked so far: 3 older kernels (one of them is used in a similar environment; the only difference in that setup is IPoIB instead of 10GbE on ring#0 - anyway, ring#0 works in both setups). All the nodes were rebooted (VMs were migrated) with the same kernel - the ring#1 fault is still present :(

I'm going to add one more node to be able to vacate the existing nodes one by one, as you suggested
 