HA does not work anymore

mgiammarco

Well-Known Member
Feb 18, 2010
Hello,
I have several three-server Proxmox clusters with Ceph, running versions 4.4 and 5.2, all with commercial licenses.

If one server in a cluster goes down (broken), the cluster keeps quorum, but nothing seems to work properly:
- console/VNC no longer works (timeout);
- the VMs are restarted on the remaining servers, but they do not really start even though the green triangle is shown (I cannot ping or access them, and I cannot check what is happening via VNC because of the problem above).

This seems like a serious problem to me. Why does it happen?
Thanks,
Mario
 
Hi,

A mixed setup is not supported and not tested.
Please upgrade all nodes to the current PVE 5.2.
 
Sorry, to clarify: I have one cluster with all servers on 4.4 and another cluster with all servers on 5.2. These clusters are completely separate.
Unfortunately, I see the HA problem described above on both clusters.
 
If you have a "problem", you have to give more details about it; otherwise no one can help.

As version 4 has already reached end of support, you have to upgrade to the latest version 5 anyway.

If you still see the issue on the latest 5.2, please provide error logs and more details about your HA cluster.
 
Please note that:
- the clusters are in production, so I cannot shut down a server whenever I want in order to reproduce the problem;
- I mentioned that I have the same problem on 4.4 and 5.2 to give additional information and to show that the problem does not depend on the Proxmox version, not because I want to keep using an end-of-life version;
- I explained my problem in my first post: with one server down, even though the cluster has three servers, pvecm status reports quorum, and ceph status reports enough copies of the data, I cannot reach the VMs via VNC, and even though the GUI says they are running, it is obvious they are not.
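
As an aside on the quorum point in the last item: the arithmetic behind "three servers, one down, still quorate" can be sketched as below. This is only an illustration, assuming one vote per node and a simple strict-majority rule; the function names are hypothetical and are not part of any Proxmox tool.

```python
def votes_needed(total_votes: int) -> int:
    """Smallest strict majority of the total votes (assumed rule)."""
    return total_votes // 2 + 1


def has_quorum(total_votes: int, votes_online: int) -> bool:
    """True if the nodes still online hold a strict majority."""
    return votes_online >= votes_needed(total_votes)


if __name__ == "__main__":
    # Three nodes with one vote each (assumption), one node down.
    print(votes_needed(3))
    print(has_quorum(3, 2))
    # Two nodes down: the single survivor no longer has a majority.
    print(has_quorum(3, 1))
```

Under these assumptions a three-node cluster tolerates exactly one failed node, which matches what pvecm status reports here. It says nothing about whether the storage or the VMs themselves are healthy.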

Please tell me which logs you want to see, because there are so many.
As for the hardware, all my clusters consist of three HP or Supermicro servers with redundant gigabit links and so on, as the documentation recommends.
If you need any specific information, please ask.
Thanks,
Mario
 
You need to go through the logs and send only the ones with the errors. You should NOT send all logs, as this would break the forum.

You wrote in your first post that you have a valid subscription on your clusters, so you can also get in touch with our enterprise support team via https://my.proxmox.com. If you have the "Standard" or "Premium" level, our experts can log in directly via SSH and find your issues without you having to send logs.
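
For the "go through the logs and send only the errors" step, a minimal sketch of a filter is shown below. It is not an official Proxmox tool: the keyword list and the function name `filter_error_lines` are assumptions chosen for illustration and should be adjusted to whatever the logs actually contain.

```python
import re
import sys
from typing import List

# Hypothetical keyword list (an assumption); extend it as needed.
ERROR_PATTERN = re.compile(
    r"\b(error|fail(?:ed|ure)?|timeout|timed out)\b",
    re.IGNORECASE,
)


def filter_error_lines(log_text: str) -> List[str]:
    """Return only the lines that contain one of the error keywords."""
    return [
        line
        for line in log_text.splitlines()
        if ERROR_PATTERN.search(line)
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 filter_errors.py saved_log.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for hit in filter_error_lines(handle.read()):
            print(hit)
```

One way to use it, assuming the HA services log to the systemd journal, is to save the output of journalctl for the relevant time window to a file on one of the surviving nodes and pass that file to the script. The much shorter result is what would be worth posting, together with the output of pvecm status and ceph status from the same moment.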
 
