Need help - tricky, as a node in my cluster keeps switching between a red X and a green tick

Sep 27, 2019
My PROX4 node keeps going from a green tick to a red X in the GUI.

1571402047883.png

Ceph has a drive on PROX4. Ceph always stays all green.
If I turn off the PROX4 node, leave it off for 30 minutes and then turn it back on, Ceph goes all red, but after about 10 minutes it goes all green again EVEN THOUGH PROX4 MIGHT STILL HAVE A RED X.

When PROX4 is green I can migrate a VM (on a Ceph disk) to it from PROX2 or PROX3 with no problem,
but after 10-15 minutes I notice that the VMs have been migrated back to PROX2 or PROX3.
When I look at the syslog for PROX4 I see this weird pattern:

1571402294163.png

I'm not sure what it means.

I have powered the node off and on a dozen times, but a simple reboot doesn't fix it.

Ceph works fine and all VMs are running fine.

Thanks
Concerned ;(
 
Hi Wolfgang,

Thanks for the links. I will give it a go tonight.
Corosync looks to be running now, but since all the nodes have dual NICs I will set up a separate network for it.

systemctl status corosync
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2019-10-22 11:33:36 AEDT; 32min ago
     Docs: man:corosync
           man:corosync.conf
           man:corosync_overview
 Main PID: 1161 (corosync)
    Tasks: 9 (limit: 4915)
   Memory: 180.7M
   CGroup: /system.slice/corosync.service
           └─1161 /usr/sbin/corosync -f

Oct 22 11:44:44 prox4 corosync[1161]: [MAIN ] Completed service synchronization, ready to provide service.
Oct 22 11:44:48 prox4 corosync[1161]: [TOTEM ] A new membership (1:142068) was formed. Members joined: 4
Oct 22 11:44:48 prox4 corosync[1161]: [KNET ] rx: host: 4 link: 0 is up
Oct 22 11:44:48 prox4 corosync[1161]: [KNET ] host: host: 4 (passive) best link: 0 (pri: 1)
Oct 22 11:44:48 prox4 corosync[1161]: [CPG ] downlist left_list: 0 received
Oct 22 11:44:48 prox4 corosync[1161]: [CPG ] downlist left_list: 0 received
Oct 22 11:44:48 prox4 corosync[1161]: [CPG ] downlist left_list: 0 received
Oct 22 11:44:48 prox4 corosync[1161]: [CPG ] downlist left_list: 0 received
Oct 22 11:44:48 prox4 corosync[1161]: [QUORUM] Members[4]: 1 2 3 4
Oct 22 11:44:48 prox4 corosync[1161]: [MAIN ] Completed service synchronization, ready to provide service.
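
For anyone wanting to check whether a node is flapping, here is a rough sketch (not an official Proxmox or corosync tool) that scans journal lines shaped like the ones above and reports when a new membership was formed and which members had quorum. The function name `summarize`, the `year` parameter and the regular expressions are my own assumptions based only on the log format shown in this post; the syslog timestamps carry no year, so one has to be supplied.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the log excerpt in this post:
# "Oct 22 11:44:48 prox4 corosync[1161]: [TOTEM ] A new membership ..."
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"corosync\[\d+\]:\s+\[(?P<tag>[A-Z]+)\s*\]\s+(?P<msg>.*)$"
)
# Matches the member list in "[QUORUM] Members[4]: 1 2 3 4"
MEMBERS_RE = re.compile(r"Members\[\d+\]:\s*([\d ]*)")


def summarize(lines, year=2019):
    """Return (membership_changes, quorum_snapshots).

    membership_changes: timestamps of "[TOTEM] A new membership" lines.
    quorum_snapshots:   (timestamp, [node ids]) for each "[QUORUM] Members" line.
    Lines that do not match the assumed format are skipped.
    """
    changes, snapshots = [], []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
        if m.group("tag") == "TOTEM" and "new membership" in m.group("msg"):
            changes.append(ts)
        elif m.group("tag") == "QUORUM":
            members = MEMBERS_RE.search(m.group("msg"))
            if members:
                ids = [int(x) for x in members.group(1).split()]
                snapshots.append((ts, ids))
    return changes, snapshots


# Sample lines copied from the log excerpt in this post.
SAMPLE = """\
Oct 22 11:44:44 prox4 corosync[1161]: [MAIN ] Completed service synchronization, ready to provide service.
Oct 22 11:44:48 prox4 corosync[1161]: [TOTEM ] A new membership (1:142068) was formed. Members joined: 4
Oct 22 11:44:48 prox4 corosync[1161]: [KNET ] rx: host: 4 link: 0 is up
Oct 22 11:44:48 prox4 corosync[1161]: [CPG ] downlist left_list: 0 received
Oct 22 11:44:48 prox4 corosync[1161]: [QUORUM] Members[4]: 1 2 3 4
Oct 22 11:44:48 prox4 corosync[1161]: [MAIN ] Completed service synchronization, ready to provide service.
"""

if __name__ == "__main__":
    changes, snapshots = summarize(SAMPLE.splitlines())
    print("membership changes:", len(changes))
    for ts, ids in snapshots:
        print(ts.isoformat(), "quorum members:", ids)
```

Fed a longer stretch of the log, many membership changes within a few minutes would point to an unstable cluster link rather than a node that is simply down.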


Many thanks,
Damon
 
