Oct 24 12:19:26 host17 corosync[1679]: [TOTEM ] A processor failed, forming new configuration.
Oct 24 12:19:28 host17 snmpd[16604]: Connection from UDP: [212.48.109.169]:62830->[149.202.197.68]:161
Oct 24 12:19:29 host17 corosync[1679]: [TOTEM ] A new membership (172.16.0.17:432) was formed. Members left: 3 2
Oct 24 12:19:29 host17 corosync[1679]: [TOTEM ] Failed to receive the leave message. failed: 3 2
Oct 24 12:19:29 host17 pmxcfs[1656]: [dcdb] notice: members: 1/1656
Oct 24 12:19:29 host17 pmxcfs[1656]: [status] notice: members: 1/1656
Oct 24 12:19:29 host17 corosync[1679]: [QUORUM] This node is within the non-primary component and will NOT provide any services.
Oct 24 12:19:29 host17 corosync[1679]: [QUORUM] Members[1]: 1
Oct 24 12:19:29 host17 corosync[1679]: [MAIN ] Completed service synchronization, ready to provide service.
Oct 24 12:19:29 host17 pmxcfs[1656]: [status] notice: node lost quorum
Oct 24 12:19:29 host17 pmxcfs[1656]: [dcdb] crit: received write while not quorate - trigger resync
Oct 24 12:19:29 host17 pmxcfs[1656]: [dcdb] crit: leaving CPG group
Oct 24 12:19:29 host17 pve-ha-lrm[1715]: unable to write lrm status file - unable to open file '/etc/pve/nodes/host17/lrm_status.tmp.1715' - Permission denied
Oct 24 12:19:30 host17 pmxcfs[1656]: [dcdb] notice: start cluster connection
Oct 24 12:19:33 host17 snmpd[16604]: Connection from UDP: [212.48.109.169]:62836->[149.202.197.68]:161
Oct 24 12:19:35 host17 rrdcached[1526]: queue_thread_main: rrd_update_r (/var/lib/rrdcached/db/pve2-storage/host17/Backup17) failed with status -1. (/var/lib/rrdcached/db/pve2-storage/host17/Backup17: illegal attempt to update using time 1477304081 when last update time is 1477304342 (minimum one second step))
Oct 24 12:19:38 host17 snmpd[16604]: Connection from UDP: [212.48.109.169]:62842->[149.202.197.68]:161
Oct 24 12:19:40 host17 corosync[1679]: [TOTEM ] A new membership (172.16.0.17:468) was formed. Members
Oct 24 12:19:40 host17 corosync[1679]: [QUORUM] Members[1]: 1
Oct 24 12:19:40 host17 corosync[1679]: [MAIN ] Completed service synchronization, ready to provide service.
Oct 24 12:19:40 host17 pmxcfs[1656]: [dcdb] notice: members: 1/1656
Oct 24 12:19:40 host17 pmxcfs[1656]: [dcdb] notice: all data is up to date
Oct 24 12:19:43 host17 snmpd[16604]: Connection from UDP: [212.48.109.169]:62849->[149.202.197.68]:161
Oct 24 12:19:46 host17 corosync[1679]: [MAIN ] Corosync main process was not scheduled for 3024.7622 ms (threshold is 1320.0000 ms). Consider token timeout increase.
Oct 24 12:19:51 host17 corosync[1679]: [TOTEM ] A new membership (172.16.0.17:492) was formed. Members
Oct 24 12:19:51 host17 corosync[1679]: [QUORUM] Members[1]: 1
Oct 24 12:19:51 host17 corosync[1679]: [MAIN ] Completed service synchronization, ready to provide service.
Oct 24 12:20:00 host17 corosync[1679]: [TOTEM ] A new membership (172.16.0.12:516) was formed. Members joined: 3 2
Oct 24 12:20:02 host17 corosync[1679]: [TOTEM ] A processor failed, forming new configuration.
Oct 24 12:20:03 host17 pmxcfs[1656]: [status] notice: cpg_send_message retry 10
Oct 24 12:20:04 host17 pmxcfs[1656]: [status] notice: cpg_send_message retry 20
Oct 24 12:20:05 host17 pmxcfs[1656]: [status] notice: cpg_send_message retry 30
Oct 24 12:20:06 host17 pmxcfs[1656]: [status] notice: cpg_send_message retry 40
Oct 24 12:20:07 host17 pmxcfs[1656]: [status] notice: cpg_send_message retry 50
Oct 24 12:20:07 host17 corosync[1679]: [TOTEM ] A new membership (172.16.0.12:536) was formed. Members
Oct 24 12:20:07 host17 pmxcfs[1656]: [dcdb] notice: members: 1/1656, 3/2634
Oct 24 12:20:07 host17 pmxcfs[1656]: [dcdb] notice: starting data syncronisation
Oct 24 12:20:07 host17 corosync[1679]: [QUORUM] This node is within the primary component and will provide service.
Oct 24 12:20:07 host17 corosync[1679]: [QUORUM] Members[3]: 3 2 1
Oct 24 12:20:07 host17 corosync[1679]: [MAIN ] Completed service synchronization, ready to provide service.
Oct 24 12:20:07 host17 pmxcfs[1656]: [status] notice: cpg_send_message retried 56 times
Oct 24 12:20:07 host17 pmxcfs[1656]: [dcdb] notice: cpg_send_message retried 1 times
Oct 24 12:20:07 host17 pmxcfs[1656]: [status] notice: node has quorum
Oct 24 12:20:07 host17 pmxcfs[1656]: [dcdb] notice: members: 1/1656, 2/1793, 3/2634
Oct 24 12:20:07 host17 pmxcfs[1656]: [status] notice: members: 1/1656, 3/2634
Oct 24 12:20:07 host17 pmxcfs[1656]: [status] notice: starting data syncronisation
Oct 24 12:20:07 host17 pmxcfs[1656]: [status] notice: members: 1/1656, 2/1793, 3/2634
Oct 24 12:20:08 host17 pmxcfs[1656]: [dcdb] notice: received sync request (epoch 1/1656/00000020)
Oct 24 12:20:08 host17 pmxcfs[1656]: [dcdb] notice: received sync request (epoch 1/1656/00000021)
Oct 24 12:20:08 host17 pmxcfs[1656]: [status] notice: received sync request (epoch 1/1656/0000001C)
Oct 24 12:20:09 host17 pmxcfs[1656]: [status] notice: received sync request (epoch 1/1656/0000001D)
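The scheduler warning at 12:19:46 ("Consider token timeout increase") points at a concrete knob. A minimal sketch of the relevant totem section — on Proxmox VE 4.x this file is /etc/pve/corosync.conf, and config_version must be incremented so pmxcfs propagates the change to all nodes. The token value of 10000 ms, the config_version number, and the cluster/network names below are assumptions for illustration, not values taken from the logs:

```
totem {
  version: 2
  # Increment on every edit so the change propagates cluster-wide
  config_version: 8
  cluster_name: mycluster          # hypothetical name
  # Token timeout in ms; the 1320 ms threshold in the log is derived
  # from the current (default) token value. Raise it if the scheduler
  # stalls seen above recur.
  token: 10000
  interface {
    ringnumber: 0
    bindnetaddr: 172.16.0.0        # assumed from the 172.16.0.x ring IDs
  }
}
```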
# pvecm status
Quorum information
------------------
Date:             Mon Oct 24 18:34:07 2016
Quorum provider:  corosync_votequorum
Nodes:            3
Node ID:          0x00000002
Ring ID:          3/536
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000003          1 IPHOST12
0x00000002          1 IPHOST16 (local)
0x00000001          1 IPHOST17
Oct 24 18:21:04 host17 corosync[1679]: [CPG ] Unknown node -> we will not deliver message
[... the same "[CPG ] Unknown node -> we will not deliver message" line repeats 19 more times through 18:21:05 ...]
systemctl restart pvedaemon
systemctl restart pveproxy
systemctl restart pvestatd
but the commands hang.
---
---
- hosts: pve
  become: yes          # was 'sudo: yes'; 'become' is the supported syntax
  tasks:
    - name: Stop pve-cluster
      service: name=pve-cluster state=stopped
    - name: Kill corosync
      shell: pkill -9 corosync
    - name: Restart PVE services
      service: name={{ item }} state=restarted
      with_items:
        - pve-cluster
        - pvedaemon
        - pvestatd
        - pveproxy
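After the playbook runs, quorum can be checked mechanically instead of by eye before touching further services. A small sketch that greps `pvecm status` output for the Quorate flag — the helper name and wiring here are my own for illustration, not part of any Proxmox tooling; the sample line is taken from the captured output above:

```shell
#!/bin/sh
# is_quorate: succeeds if the `pvecm status` text on stdin reports "Quorate: Yes"
is_quorate() {
    grep -q '^Quorate:[[:space:]]*Yes'
}

# Captured sample line; on a live node pipe the real command instead:
#   pvecm status | is_quorate && echo ok
sample='Quorate:          Yes'

if printf '%s\n' "$sample" | is_quorate; then
    echo "node is quorate"
else
    echo "node is NOT quorate - hold off on restarting pve-cluster"
fi
```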
Package versions
proxmox-ve: 4.3-66 (running kernel: 4.4.19-1-pve)
pve-manager: 4.3-10 (running version: 4.3-10/7230e60f)
pve-kernel-4.4.13-1-pve: 4.4.13-56
pve-kernel-4.4.19-1-pve: 4.4.19-66
lvm2: 2.02.116-pve3
corosync-pve: 2.4.0-1
libqb0: 1.0-1
pve-cluster: 4.0-47
qemu-server: 4.0-94
pve-firmware: 1.1-10
libpve-common-perl: 4.0-80
libpve-access-control: 4.0-19
libpve-storage-perl: 4.0-68
pve-libspice-server1: 0.12.8-1
vncterm: 1.2-1
pve-docs: 4.3-14
pve-qemu-kvm: 2.7.0-6
pve-container: 1.0-81
pve-firewall: 2.0-31
pve-ha-manager: 1.0-35
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u2
lxc-pve: 2.0.5-1
lxcfs: 2.0.4-pve2
criu: 1.6.0-1
novnc-pve: 0.5-8
smartmontools: 6.5+svn4324-1~pve80
Is there a method to restart corosync, or to force a resync when the cluster thinks the node is offline? At the moment I have to restart the whole server.
# killall -9 corosync
# systemctl restart pve-cluster

Those commands did not bring the server back online in the cluster, but they did resync all data between the nodes.