All nodes in the cluster show grey question marks except the node where the cluster was set up. I am able to access the shell of all the nodes, but their status is 'Unknown'. I referred to existing threads and tried out the following:
- Restarted the nodes: This didn't fix the issue
- Tried to restart the Proxmox services: the shell just hung and was not able to complete the following commands
  - systemctl restart pvedaemon
  - systemctl restart pveproxy
  - systemctl restart pvestatd
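To keep a blocked restart from hanging the shell, the restart attempts above could be wrapped in a time limit. This is only a minimal sketch: `try_with_timeout` is a hypothetical helper name, the 60-second limit is an arbitrary assumption, and it relies on the `timeout` utility from coreutils. It may not help if a process is stuck in uninterruptible I/O.

```shell
#!/bin/sh
# Run a command with a time limit so a blocked call cannot hang the shell.
# Usage: try_with_timeout SECONDS COMMAND [ARGS...]
# (hypothetical helper, not part of Proxmox)
try_with_timeout() {
    limit=$1
    shift
    if timeout "$limit" "$@"; then
        echo "ok: $*"
    else
        status=$?
        # timeout exits with 124 when the time limit was hit.
        if [ "$status" -eq 124 ]; then
            echo "timed out after ${limit}s: $*"
        else
            echo "failed (exit $status): $*"
        fi
        return "$status"
    fi
}

# Intended use on a node (commented out so the sketch is safe to paste):
# for svc in pvedaemon pveproxy pvestatd; do
#     try_with_timeout 60 systemctl restart "$svc"
# done
```

A timed-out restart at least returns control to the shell, so the service logs can be inspected afterwards.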
Corosync is active and running
Node list: Of the 13 nodes in this cluster, only 7 are actively used by our team. The other 6 are not being used and are powered off. Of the 7 nodes in use, only one is online and the other 6 have unknown status. The cluster uses CIFS storage (shown as 'online'). This CIFS storage is also not accessible on the other 6 nodes through the web interface.
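For context, here is a minimal sketch of the majority arithmetic for these node counts. It assumes one vote per node and a plain majority rule (more than half of the expected votes); the actual vote configuration of this cluster was not checked, and `quorum_needed` is just an illustrative helper name.

```shell
#!/bin/sh
# Majority quorum: more than half of the expected votes.
# Assumption: one vote per node; not read from this cluster's configuration.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

total_nodes=13   # nodes defined in the cluster (from the post)
powered_on=7     # nodes that are powered on (from the post)

needed=$(quorum_needed "$total_nodes")
echo "votes needed for quorum: $needed"
echo "spare votes: $(( powered_on - needed ))"
```

Under that assumption, 7 votes are needed, so all 7 powered-on nodes would have to be joined for the cluster to be quorate, with no margin to spare.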
pveversion -v
pvecm status
pvecm updatecerts: The shell hung after printing these 2 lines
pve-cluster was active and running before I restarted pve-cluster.service
systemctl status pve-cluster.service
But after I ran the command
service pve-cluster restart
it failed with the following message:
After this, when I checked the status of the cluster again, I got this:
In this state, no commands would execute. After a restart, the cluster status was back to active and running. However, the status of the other 6 nodes is still unknown, and I'm back to the same problem.
Kindly help me fix this problem.