[SOLVED] Error 500 cluster not ready. Please help

thecharliewahl

New Member
Apr 28, 2023
Hello, I had a cluster, and one of the servers disappeared. Since then I can't start any of my VMs. In the cluster tab, the cluster information is grayed out, but it still shows that there are two nodes. When I check the status of Corosync, it says "Could not open /etc/corosync/authkey: No such file or directory". I NEED to get the data off one of the VMs, which I use as a NAS.
 

Attachments

  • Screenshot 2023-04-27 220845.png
  • Screenshot 2023-04-27 220820.png

Hi,

How did the server disappear? Did you remove it from the cluster?

Can you post the syslog from the time the issue started? You can filter the journal to a specific date/time range with journalctl, e.g.:
Bash:
journalctl --since "2023-04-28 00:00" --until "2023-04-28 08:45" > /tmp/Syslog.log
You may have to change the date/time in the above command.

Can you also post the output of the commands below:

Bash:
ls /etc/pve
ls /etc/pve/nodes
# if `ls` shows files in the two directories above, continue with:
cat /etc/pve/corosync.conf
cat /etc/pve/.members
pveversion -v

qm start 100
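
To make posting easier, the read-only commands above could be collected into a single file. This is only a sketch: the `run` helper and the report path `/tmp/pve-report.txt` are made-up names for this example, not part of Proxmox VE, and `qm start 100` is left out because it changes state rather than just reading it.

```shell
#!/bin/sh
# Hypothetical report location (an assumption); override with REPORT=/some/path.
report=${REPORT:-/tmp/pve-report.txt}
: > "$report"   # create or truncate the report file

# Hypothetical helper: write the command line as a header, then append the
# command's stdout and stderr. A failing command is noted, not fatal.
run() {
  printf '\n### %s\n' "$*" >> "$report"
  "$@" >> "$report" 2>&1 || printf '(exit code %s)\n' "$?" >> "$report"
}

run ls /etc/pve
run ls /etc/pve/nodes
run cat /etc/pve/corosync.conf
run cat /etc/pve/.members
run pveversion -v

echo "wrote $report"
```

Because each command's errors are captured as well, a missing file such as /etc/pve/corosync.conf would show up in the report instead of stopping the script.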
 
Hello,

I followed some instructions online on how to remove a node from a cluster. (I have two nodes.)

As for journalctl, it outputs "/tmp/syslog.log: No such file or directory". When I run it without the date and time, it says "File is neither a device node, nor regular file, nor executable: /tmp/syslog.log".


1682715913889.png
1682715939318.png
1682715975817.png
1682716051038.png
1682716003232.png
1682716098594.png
 
Hello,

Thank you for the outputs!

The syslog is still important for narrowing down the issue. However, the output of cat /etc/pve/.members shows that `saturn` is the only node in the cluster, and qm start 100 reports that your cluster has no quorum. Have you tried restarting the corosync and pve-cluster services?
Bash:
systemctl restart corosync.service
systemctl restart pve-cluster.service

You can use the `pvecm expected 1` command [0] to tell the remaining node to expect only one vote, so that it becomes quorate and you can start the VM.

[0] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_quorum
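
Put together, the workaround might look like the sketch below. It assumes corosync is running on the remaining node, that VM 100 is the one to start, and that `pvecm status` prints a `Quorate:` line ending in `Yes` or `No`. `is_quorate` is a helper name invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: read `pvecm status` output on stdin and succeed only
# if it contains a line like "Quorate:          Yes" (assumed output format).
is_quorate() {
  grep -Eq '^Quorate:[[:space:]]+Yes'
}

if command -v pvecm >/dev/null 2>&1; then
  if pvecm status | is_quorate; then
    echo "cluster is already quorate"
  else
    # Lower the expected vote count so the single remaining node has quorum.
    pvecm expected 1
  fi
  # Start the VM (ID 100, as used earlier in this thread).
  qm start 100
else
  echo "pvecm not found; run this on the Proxmox VE node"
fi
```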
 
I have tried restarting the services. I have now run into another problem. When I try to log into the shell, it says "TASK ERROR: command '/usr/bin/termproxy 5901 --path /nodes/saturn --perm Sys.Console -- /bin/login -f root' failed: exit code 1".
 
