Hello,
I created a 4-node cluster that worked perfectly until I enabled the firewall on the cluster and the VM.
Now, every minute, the nodes turn red and the syslog reports this:
Code:
Aug 29 18:36:23 proxmox106 corosync[30192]: [TOTEM ] FAILED TO RECEIVE
Aug 29 18:36:26 proxmox106 corosync[30192]: [TOTEM ] A new membership (5.150.141.116:2340) was formed. Members left: 4 3 2
Aug 29 18:36:26 proxmox106 corosync[30192]: [TOTEM ] Failed to receive the leave message. failed: 4 3 2
Aug 29 18:36:26 proxmox106 pmxcfs[20854]: [dcdb] notice: members: 1/20854
Aug 29 18:36:26 proxmox106 pmxcfs[20854]: [status] notice: members: 1/20854
Aug 29 18:36:26 proxmox106 corosync[30192]: [QUORUM] This node is within the non-primary component and will NOT provide any services.
Aug 29 18:36:26 proxmox106 corosync[30192]: [QUORUM] Members[1]: 1
Aug 29 18:36:26 proxmox106 corosync[30192]: [MAIN ] Completed service synchronization, ready to provide service.
Aug 29 18:36:26 proxmox106 pmxcfs[20854]: [status] notice: node lost quorum
Aug 29 18:36:26 proxmox106 pmxcfs[20854]: [dcdb] notice: cpg_send_message retried 1 times
Aug 29 18:36:26 proxmox106 pmxcfs[20854]: [dcdb] crit: received write while not quorate - trigger resync
Aug 29 18:36:26 proxmox106 pmxcfs[20854]: [dcdb] crit: leaving CPG group
Aug 29 18:36:26 proxmox106 pve-ha-lrm[4358]: unable to write lrm status file - closing file '/etc/pve/nodes/proxmox106/lrm_status.tmp.4358' failed - Operation not permitted
Aug 29 18:36:26 proxmox106 pmxcfs[20854]: [dcdb] notice: start cluster connection
Aug 29 18:36:26 proxmox106 pmxcfs[20854]: [dcdb] notice: members: 1/20854
Aug 29 18:36:26 proxmox106 pmxcfs[20854]: [dcdb] notice: all data is up to date
The nodes come back green if I restart corosync on all nodes (# service corosync restart).
I have since removed the firewall configuration by deleting the folder "/etc/pve/firewall", but the problem persists.
I also tried stopping the pve-firewall service, but the problem persists.
What can I do?
Thank you,
Lorenzo
Code:
# pveversion -v
proxmox-ve: 4.2-60 (running kernel: 4.4.15-1-pve)
pve-manager: 4.2-17 (running version: 4.2-17/e1400248)
pve-kernel-4.4.15-1-pve: 4.4.15-60
lvm2: 2.02.116-pve2
corosync-pve: 2.4.0-1
libqb0: 1.0-1
pve-cluster: 4.0-43
qemu-server: 4.0-85
pve-firmware: 1.1-8
libpve-common-perl: 4.0-72
libpve-access-control: 4.0-19
libpve-storage-perl: 4.0-56
pve-libspice-server1: 0.12.8-1
vncterm: 1.2-1
pve-qemu-kvm: 2.6-1
pve-container: 1.0-72
pve-firewall: 2.0-29
pve-ha-manager: 1.0-33
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u2
lxc-pve: 2.0.3-4
lxcfs: 2.0.2-pve1
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5.7-pve10~bpo80