Hi,
A coworker created a network loop that lasted 60 seconds, and it turned out that a PM cluster which uses the WAN side for clustering rebooted itself, all nodes included.
It made me wonder: can a network loop cause PM to reboot itself?
Will it reboot even if no HA resources are defined on the cluster?
Under what circumstances would PM reboot itself?
Surely loss of quorum should not trigger a reboot when there are no HA resources defined, right?
Here are some relevant logs I got from the admins of this cluster.
Code:
Time42 serverXYZ kernel: [89005319.786954] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time42 serverXYZ kernel: [89005319.837135] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time42 serverXYZ kernel: [89005319.887327] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time42 serverXYZ kernel: [89005319.937518] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time42 serverXYZ kernel: [89005319.987698] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time42 serverXYZ kernel: [89005320.037872] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time42 serverXYZ kernel: [89005320.088052] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time43 serverXYZ kernel: [89005320.138132] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time43 serverXYZ kernel: [89005320.188331] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time43 serverXYZ kernel: [89005320.238454] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time43 serverXYZ corosync[32844]: warning [MAIN ] Totem is unable to form a cluster because of an operating system or network fault (reason: totem is continuously in gather state). The most common cause of this message is that the local firewall is configured improperly.
Time43 serverXYZ corosync[32844]: [MAIN ] Totem is unable to form a cluster because of an operating system or network fault (reason: totem is continuously in gather state). The most common cause of this message is that the local firewall is configured improperly.
Time44 serverXYZ corosync[32844]: warning [MAIN ] Totem is unable to form a cluster because of an operating system or network fault (reason: totem is continuously in gather state). The most common cause of this message is that the local firewall is configured improperly.
Time44 serverXYZ corosync[32844]: [MAIN ] Totem is unable to form a cluster because of an operating system or network fault (reason: totem is continuously in gather state). The most common cause of this message is that the local firewall is configured improperly.
Time46 serverXYZ corosync[32844]: warning [MAIN ] Totem is unable to form a cluster because of an operating system or network fault (reason: totem is continuously in gather state). The most common cause of this message is that the local firewall is configured improperly.
Time46 serverXYZ corosync[32844]: [MAIN ] Totem is unable to form a cluster because of an operating system or network fault (reason: totem is continuously in gather state). The most common cause of this message is that the local firewall is configured improperly.
Time47 serverXYZ kernel: [89005324.838662] net_ratelimit: 91 callbacks suppressed
Time47 serverXYZ kernel: [89005324.838699] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time47 serverXYZ kernel: [89005324.888853] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time47 serverXYZ kernel: [89005324.939026] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time47 serverXYZ kernel: [89005324.981935] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time47 serverXYZ kernel: [89005325.032129] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time47 serverXYZ corosync[32844]: warning [MAIN ] Totem is unable to form a cluster because of an operating system or network fault (reason: totem is continuously in gather state). The most common cause of this message is that the local firewall is configured improperly.
Time47 serverXYZ corosync[32844]: [MAIN ] Totem is unable to form a cluster because of an operating system or network fault (reason: totem is continuously in gather state). The most common cause of this message is that the local firewall is configured improperly.
Time47 serverXYZ kernel: [89005325.082298] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time48 serverXYZ kernel: [89005325.132463] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time48 serverXYZ kernel: [89005325.182608] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time48 serverXYZ kernel: [89005325.232796] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time48 serverXYZ kernel: [89005325.282962] vmbr0: received packet on eno1 with own address as source address (addr:MA:C:AD:DD:ES, vlan:0)
Time32 serverXYZ systemd-modules-load[634]: Inserted module 'iscsi_tcp'
Time32 serverXYZ kernel: [ 0.000000] Linux version 4.15.18-30-pve (root@nora) (gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)) #1 SMP PVE 4.15.18-58 (Fri, 12 Jun 2020 13:53:01 +0200) ()
Time32 serverXYZ systemd-modules-load[634]: Inserted module 'ib_iser'
Time32 serverXYZ kernel: [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.15.18-30-pve root=/dev/mapper/pve-root ro quiet