Corosync stuck in 'activating (start)' and never reaches running

Raja Saha

Nov 29, 2018
Hi,

I'm evaluating virtualization platforms for an upcoming project, and Proxmox looks interesting.

I installed it on two nodes and it worked fine for a few days. However, after a switch failover one day, one of the nodes started giving trouble. All services are fine except corosync: on node 1 it shows as 'Activating' and fails after a while. The other (master) node works fine. While corosync is failing I'm not able to migrate VMs, etc. Could some kind soul please advise? My version info is as follows:

----Pve version from Node#1----
proxmox-ve: 5.2-2 (running kernel: 4.15.18-8-pve)
pve-manager: 5.2-10 (running version: 5.2-10/6f892b40)
pve-kernel-4.15: 5.2-11
pve-kernel-4.15.18-8-pve: 4.15.18-28
pve-kernel-4.4.134-1-pve: 4.4.134-112
pve-kernel-4.4.19-1-pve: 4.4.19-66
ceph: 12.2.9-1~bpo90+1
corosync: 2.4.2-pve5
criu: 2.11.1-1~bpo90
glusterfs-client: 3.12.15-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-41
libpve-guest-common-perl: 2.0-18
libpve-http-server-perl: 2.0-11
libpve-storage-perl: 5.0-30
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 3.0.2+pve1-3
lxcfs: 3.0.2-2
novnc-pve: 1.0.0-2
openvswitch-switch: 2.7.0-3
proxmox-widget-toolkit: 1.0-20
pve-cluster: 5.0-30
pve-container: 2.0-29
pve-docs: 5.2-9
pve-firewall: 3.0-14
pve-firmware: 2.0-6
pve-ha-manager: 2.0-5
pve-i18n: 1.0-6
pve-libspice-server1: 0.14.1-1
pve-qemu-kvm: 2.12.1-1
pve-xtermjs: 1.0-5
qemu-server: 5.0-38
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.11-pve2~bpo1

--------- Corosync.conf from Node #1----
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: ils-phy-pri
    nodeid: 1
    quorum_votes: 1
    ring0_addr: ils-phy-pri
  }
  node {
    name: ils-phy-sec
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 172.24.0.2
  }
}

quorum {
  expected_votes: 2
  last_man_standing: 1
  last_man_standing_window: 1000
  provider: corosync_votequorum
  two_node: 1
}

totem {
  cluster_name: ILS-PHY-CLUSTER
  config_version: 10
  interface {
    bindnetaddr: 172.24.0.2
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}

----Corosync.conf from Master Node----
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: ils-phy-pri
    nodeid: 1
    quorum_votes: 1
    ring0_addr: ils-phy-pri
  }
  node {
    name: ils-phy-sec
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 172.24.0.2
  }
}

quorum {
  expected_votes: 2
  last_man_standing: 1
  last_man_standing_window: 1000
  provider: corosync_votequorum
  two_node: 1
}

totem {
  cluster_name: ILS-PHY-CLUSTER
  config_version: 10
  interface {
    bindnetaddr: 172.24.0.2
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}
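
Since the corosync.conf from both nodes is posted here, one quick way to confirm the two copies really match is to diff them programmatically rather than by eye. Below is a minimal Python sketch, assuming both files have been copied to one machine and read into strings. The function name diff_configs and the labels node1/master are made up for illustration and are not part of any Proxmox tooling.

```python
import difflib


def diff_configs(a_text, b_text, a_name="node1", b_name="master"):
    """Return unified-diff lines between two corosync.conf texts.

    Indentation and blank lines are ignored, so only differences in the
    actual settings show up.
    """
    def normalize(text):
        # Strip surrounding whitespace and drop empty lines.
        return [line.strip() for line in text.splitlines() if line.strip()]

    return list(
        difflib.unified_diff(
            normalize(a_text),
            normalize(b_text),
            fromfile=a_name,
            tofile=b_name,
            lineterm="",
        )
    )
```

An empty list means the two files carry the same settings. For the two configs posted above that should be the result, which would make config drift between the nodes an unlikely cause.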

 
What does systemctl status corosync say?
Is there anything in the journal?
 
@Dominik: It has said 'Activating' for days. When I reboot the node, the following services do not start even though they are enabled: pvedaemon, pveproxy, and corosync. pvedaemon shows as loaded but dead, and so does pveproxy. I started pvedaemon manually, but corosync still stays at 'Activating'.

Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
Active: activating (start) since Thu 2018-11-29 23:23:34 +08; 5min ago
Docs: man:corosync
man:corosync.conf
man:corosync_overview
Cntrl PID: 66703 (corosync)
Tasks: 2 (limit: 11059)
Memory: 38.6M
CPU: 3.865s
CGroup: /system.slice/corosync.service
└─66703 /usr/sbin/corosync -f

Nov 29 23:23:34 ils-phy-pri corosync[66703]: [CPG ] downlist left_list: 0 received
Nov 29 23:23:34 ils-phy-pri corosync[66703]: [CPG ] downlist left_list: 0 received
Nov 29 23:23:34 ils-phy-pri corosync[66703]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Nov 29 23:23:34 ils-phy-pri corosync[66703]: notice [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Nov 29 23:23:34 ils-phy-pri corosync[66703]: notice [QUORUM] This node is within the primary component and will provide service.
Nov 29 23:23:34 ils-phy-pri corosync[66703]: notice [QUORUM] Members[2]: 1 2
Nov 29 23:23:34 ils-phy-pri corosync[66703]: notice [MAIN ] Completed service synchronization, ready to provide service.
Nov 29 23:23:34 ils-phy-pri corosync[66703]: [QUORUM] This node is within the primary component and will provide service.
Nov 29 23:23:34 ils-phy-pri corosync[66703]: [QUORUM] Members[2]: 1 2
Nov 29 23:23:34 ils-phy-pri corosync[66703]: [MAIN ] Completed service synchronization, ready to provide service.
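
The output above shows a mismatch worth noting: systemd still reports activating (start), yet corosync itself logs that both members joined and that it is ready to provide service. That pattern hints that the daemon may be fine and the problem sits on the service-manager side. Below is a minimal Python sketch that flags this combination, assuming the text of systemctl status corosync has been captured into a string. The helper name stuck_after_ready is hypothetical.

```python
import re

# Log line corosync prints once synchronization has finished
# (taken verbatim from the journal output in this thread).
READY_MARKER = "Completed service synchronization, ready to provide service."


def stuck_after_ready(status_text):
    """Return True if systemd reports 'activating' although corosync
    has already logged that it is ready."""
    match = re.search(r"^\s*Active:\s*(\S+)", status_text, re.MULTILINE)
    state = match.group(1) if match else None
    return state == "activating" and READY_MARKER in status_text
```

A True result does not prove what is wrong. It only suggests looking at the unit definition and systemd rather than at the cluster network or quorum settings.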
 
Oh, my bad. The issue is resolved: I discovered that corosync's systemd unit file, /lib/systemd/system/corosync.service, was corrupted.
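
For anyone landing here with the same symptom, one way to spot a damaged packaged file is to compare it against the checksums that dpkg records at install time. Below is a minimal Python sketch, assuming a Debian-based install where the package ships a list such as /var/lib/dpkg/info/corosync.md5sums with lines of the form "md5  relative/path". That location and the presence of the unit file in the list are assumptions, and the function name check_against_md5sums is made up for illustration.

```python
import hashlib


def check_against_md5sums(md5sums_text, rel_path, file_bytes):
    """Compare file contents with the MD5 recorded in a dpkg md5sums list.

    Returns True if the checksum matches, False if it differs,
    and None if rel_path is not listed at all.
    """
    for line in md5sums_text.splitlines():
        parts = line.split(None, 1)  # "<md5>  <relative path>"
        if len(parts) == 2 and parts[1].strip() == rel_path:
            return hashlib.md5(file_bytes).hexdigest() == parts[0]
    return None
```

To use it, read the md5sums list as text and the unit file as bytes, then pass the path without the leading slash (lib/systemd/system/corosync.service). A False result means the file differs from what the package shipped. That can also happen after a deliberate local edit, so it is a hint rather than proof of corruption.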
 
