Hi all!
I ran into the following situation:
When a link is unstable (for example, a 10 Gbps network card attached to a switch with RJ45 connectors that have no gold plating, so the contacts on the card and on the connector can oxidize), the switch may bring the link up and down repeatedly. After that, corosync goes "crazy": it starts loading one CPU core at 30-100%, and after a few hours it crashes. The crashes hit nodes arbitrarily, for example 5 of the 8 servers in the cluster.
I work around it with a simple script:
Code:
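#!/usr/bin/env bash
# Force-kill the hung corosync process.
killall -9 corosync
sleep 2
# Stop the HA local resource manager, then the HA cluster resource manager.
systemctl stop pve-ha-lrm.service
sleep 2
systemctl stop pve-ha-crm.service
sleep 2
# Restart the Proxmox VE API daemon.
systemctl restart pvedaemon.service
sleep 2
# Start the HA local resource manager again.
systemctl start pve-ha-lrm.service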
But the Corosync code itself probably needs a more thorough review.
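Since the symptom described above is one core loaded at 30-100%, a small check could detect it before the crash. The following is only a rough sketch under my own assumptions, not a tested fix: the function names, the `THRESHOLD` and `INTERVAL` values, and the `corosync-watchdog` log tag are all made up for illustration. It samples the corosync CPU time from `/proc/<pid>/stat` twice and compares the difference against the threshold.

```shell
#!/usr/bin/env bash
# Hypothetical watchdog sketch. THRESHOLD and INTERVAL are assumed values,
# chosen only to match the 30-100% load described in the post.
THRESHOLD=${THRESHOLD:-30}   # percent of one core
INTERVAL=${INTERVAL:-5}      # seconds between the two samples

# Print utime + stime (in clock ticks) for the given PID.
cpu_ticks() {
    local stat
    stat=$(cat "/proc/$1/stat") || return 1
    # Drop "pid (comm) " so the remaining fields start at the state field.
    stat=${stat##*) }
    set -- $stat
    # utime and stime are fields 14 and 15 of the full line,
    # which are positions 12 and 13 after the strip above.
    echo $(( ${12} + ${13} ))
}

# Args: ticks_before ticks_after interval_seconds ticks_per_second.
# Prints the load as an integer percentage of one core.
cpu_percent() {
    echo $(( ($2 - $1) * 100 / ($3 * $4) ))
}

# Returns 1 if corosync used at least THRESHOLD percent of a core
# during the sampling interval, 0 otherwise (or if it is not running).
check_corosync() {
    local pid hz before after
    pid=$(pidof -s corosync) || return 0
    hz=$(getconf CLK_TCK)
    before=$(cpu_ticks "$pid") || return 0
    sleep "$INTERVAL"
    after=$(cpu_ticks "$pid") || return 0
    if [ "$(cpu_percent "$before" "$after" "$INTERVAL" "$hz")" -ge "$THRESHOLD" ]; then
        logger -t corosync-watchdog "corosync CPU at or above ${THRESHOLD}%"
        return 1
    fi
    return 0
}
```

It could be run from cron or a systemd timer as something like `check_corosync || /path/to/recovery-script.sh`, where the path is a placeholder for the script above. A single sample can give a false alarm during normal membership changes, so requiring several consecutive hits before acting is probably safer.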
Code:
# pveversion --verbose
proxmox-ve: 5.2-2 (running kernel: 4.15.18-4-pve)
pve-manager: 5.2-8 (running version: 5.2-8/fdf39912)
pve-kernel-4.15: 5.2-7
pve-kernel-4.15.18-4-pve: 4.15.18-23
pve-kernel-4.15.18-2-pve: 4.15.18-21
pve-kernel-4.15.18-1-pve: 4.15.18-19
pve-kernel-4.15.17-3-pve: 4.15.17-14
ceph: 12.2.8-pve1
corosync: 2.4.2-pve5
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: not correctly installed
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-38
libpve-guest-common-perl: 2.0-17
libpve-http-server-perl: 2.0-10
libpve-storage-perl: 5.0-27
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 3.0.2+pve1-2
lxcfs: 3.0.0-1
novnc-pve: 1.0.0-2
proxmox-widget-toolkit: 1.0-19
pve-cluster: 5.0-30
pve-container: 2.0-26
pve-docs: 5.2-8
pve-firewall: 3.0-14
pve-firmware: 2.0-5
pve-ha-manager: 2.0-5
pve-i18n: 1.0-6
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.11.2-1
pve-xtermjs: 1.0-5
qemu-server: 5.0-33
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.9-pve1~bpo9