Cluster GUI becomes unreachable

newprox

Active Member
Jun 11, 2016
Hi

I am running a small test cluster of 4 servers with only 1 VM per machine (all machines on 4.2). Everything ran well until I stupidly added a node to the cluster with the same hostname as an existing one. I have since reinstalled that machine, and at first all seemed well once I had shut it down, but I am now faced with the GUI of all nodes being unreachable. When I restart a node, its GUI is reachable again for a while, but after a few hours it becomes unreachable and I have to restart the node again to make it work. I assume the duplicate hostname caused this?

Is there a way of getting myself out of this pickle without having to rebuild every machine?

Any input much appreciated.
 
Some more information: a backup that writes to an NFS storage is scheduled on some of the machines. Those backups appear to be hanging as processes that never end. Running a backup from the CLI freezes as soon as I specify the backup location. Without a location it appears to start but then just sits there doing nothing. The bind I am in is that I can force a reboot of a node to get it back into the cluster (at least temporarily), but I cannot take a backup first in case killing the node causes trouble.

It won't be the end of the world if it goes wrong, as this is mainly for testing purposes, but I would prefer to be safe so that no work is lost.
 
Oh boy, so no commands on any of the nodes have any effect. I can get the GUI working, as mentioned, on the nodes that were rebooted, but other than that nothing works. The VMs are humming along nicely and are currently unaffected, but no matter what command I try, it just doesn't do anything.
 
