[SOLVED] Error 500 cluster not ready. Please help

thecharliewahl

New Member
Apr 28, 2023
Hello, I had a cluster and one of the servers disappeared. Since then I can't start any of my VMs. In the cluster tab the cluster information is grayed out, but it still shows two nodes. When I look at the status of Corosync it says "Could not open /etc/corosync/authkey: No such file or directory". I NEED to get the data off one of the VMs, which I use as a NAS.
 

Attachments

  • Screenshot 2023-04-27 220845.png (89.6 KB)
  • Screenshot 2023-04-27 220820.png (248.5 KB)
Hi,

How did the server disappear? Did you remove it from the cluster?

Can you post the syslog from the time the issue happened? You can filter the syslog to a specific date/time range using journalctl, e.g.:
Bash:
journalctl --since "2023-04-28 00:00" --until "2023-04-28 08:45" > /tmp/Syslog.log
You may have to change the date/time in the above command.
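
If typing the timestamps by hand is awkward, the window can be computed instead. The following is a minimal sketch, assuming GNU `date` (the default on Debian-based systems such as Proxmox VE). The function names `hours_ago` and `dump_journal` and the default output path are illustrative only and are not part of any Proxmox tooling.

```shell
# Print a timestamp N hours in the past, in a format journalctl accepts.
# Assumption: GNU date, which understands "-d 'N hours ago'".
hours_ago() {
    date -d "$1 hours ago" '+%Y-%m-%d %H:%M'
}

# Write the journal for the last N hours to a file.
# $1 = number of hours, $2 = output file (hypothetical default below).
dump_journal() {
    hours=$1
    out=${2:-/tmp/syslog.log}
    journalctl --since "$(hours_ago "$hours")" \
               --until "$(date '+%Y-%m-%d %H:%M')" > "$out"
}
```

For example, `dump_journal 8` run as root on the node would write roughly the last eight hours of the journal to /tmp/syslog.log. Note that the file name has to follow the `>` redirection; it is not an argument to journalctl itself.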

Can you also post the output of the commands below:

Bash:
ls /etc/pve
ls /etc/pve/nodes
# if `ls` shows that there are files in the two directories above, continue with:
cat /etc/pve/corosync.conf
cat /etc/pve/.members
pveversion -v

qm start 100
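
The "only continue if `ls` shows files" check above can also be scripted. This is a rough sketch, not an official procedure: the helper names `dir_has_entries` and `collect_cluster_info` are made up for illustration, and the directory is passed as an argument so the logic can be tried against a scratch directory first.

```shell
# Succeed if the directory exists and contains at least one entry
# (including dotfiles such as .members).
dir_has_entries() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Print corosync.conf and .members only when the cluster filesystem
# looks populated. $1 = directory to inspect (defaults to /etc/pve).
collect_cluster_info() {
    pve_dir=${1:-/etc/pve}
    if dir_has_entries "$pve_dir" && dir_has_entries "$pve_dir/nodes"; then
        cat "$pve_dir/corosync.conf"
        cat "$pve_dir/.members"
    else
        echo "$pve_dir is empty or not mounted; skipping cat" >&2
        return 1
    fi
}
```

Running `collect_cluster_info` with no argument inspects /etc/pve. `pveversion -v` and `qm start 100` are left out on purpose, since they do not depend on the directory contents.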
 
Hello,

I followed some instructions online on how to remove a node from a cluster. (I have two nodes.)

As for journalctl, it outputs "/tmp/syslog.log: No such file or directory". When I run it without the date and time, it says "File is neither a device node, nor regular file, nor executable: /tmp/syslog.log".


1682715913889.png
1682715939318.png
1682715975817.png
1682716051038.png
1682716003232.png
1682716098594.png
 
Hello,

Thank you for the outputs!

The syslog is still important for narrowing down the issue. However, the output of cat /etc/pve/.members shows that `saturn` is the only node in the cluster, and qm start 100 reports that there is no quorum in your cluster. Have you tried restarting the corosync and pve-cluster services?
Bash:
systemctl restart corosync.service
systemctl restart pve-cluster.service

You can use the `pvecm expected 1` command [0] if you want to start the VM.

[0] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_quorum
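
As a hedged sketch of how the quorum check and the workaround fit together: the snippet below assumes that `pvecm status` prints a line of the form `Quorate:          Yes` when the node has quorum. The function names `is_quorate` and `start_vm_with_single_vote` are illustrative and not part of Proxmox.

```shell
# Read `pvecm status` output on stdin and succeed if it reports quorum.
# Assumption: the status output contains a line starting with "Quorate:".
is_quorate() {
    grep -Eq '^Quorate:[[:space:]]+Yes'
}

# Lower the expected vote count only when the node is not quorate,
# then try to start the VM. $1 = VM ID, e.g. 100.
start_vm_with_single_vote() {
    vmid=$1
    if pvecm status | is_quorate; then
        echo "cluster is already quorate"
    else
        pvecm expected 1
    fi
    qm start "$vmid"
}
```

On the remaining node this would be called as `start_vm_with_single_vote 100`. It needs root and a running corosync, so it will not help while corosync itself fails to start.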
 
I have tried restarting the services. I have now run into another problem. When I try to log into the shell it says, "TASK ERROR: command '/usr/bin/termproxy 5901 --path /nodes/saturn --perm Sys.Console -- /bin/login -f root' failed: exit code 1"