I am running a PVE cluster over WAN (nodes in different datacenters across the globe). It has always worked flawlessly and suited my needs best (no shared storage, live migration (LM) or HA of course, but still central management, easy offline migrations, etc.). Some time ago I upgraded to PVE 6.0 and was able to run corosync directly over the WAN unicast interfaces, with no need to build a VPN, which some of my nodes do not otherwise need. That simplified my setup, and I was glad.
But now I sporadically run into corosync "sync" problems. Whenever there are connection problems between nodes, even short ones (which is understandable and unavoidable on WAN links), the cluster seems to break. When I notice this, I simply run:
Code:
killall -9 corosync
systemctl restart pve-cluster
on the disconnected nodes, and that brings them back (as far as I can see, it simply replays all the messages). Still, this seems strange to me. I understand that there was a connectivity issue (as I said, unavoidable on WAN), but why doesn't the cluster re-sync automatically? It worked without problems before, when I used the old multicast corosync over VPN...
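For reference, the manual workaround could be wrapped in a small script, roughly as sketched below. This is only a sketch under assumptions: the function names `is_quorate` and `resync_if_needed` are made up for illustration, it assumes a disconnected node shows up as non-quorate in `pvecm status` (a line starting with `Quorate:`), and it defaults to a dry run so nothing is killed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hypothetical helper: reads `pvecm status` output on stdin and
# returns 0 only if it contains a "Quorate: Yes" line.
is_quorate() {
    grep -Eq '^Quorate:[[:space:]]+Yes[[:space:]]*$'
}

# Hypothetical wrapper around the manual workaround from this post.
# Assumption: a node that lost its peers reports itself as not quorate.
# With DRY_RUN unset or 1 it only prints what it would do.
resync_if_needed() {
    if pvecm status 2>/dev/null | is_quorate; then
        echo "quorate, nothing to do"
        return 0
    fi
    echo "not quorate, restarting corosync and pve-cluster"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        # Same two commands I currently run by hand.
        killall -9 corosync
        systemctl restart pve-cluster
    fi
}
```

Calling `resync_if_needed` on a node prints what would happen; `DRY_RUN=0 resync_if_needed` would actually run the two commands. It only automates the workaround and does not explain why corosync fails to recover on its own.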
P.S. I have mentioned this problem here, but that topic from 2015 seems to be dead.