I updated a cluster to the latest release (official; we have a license for PMX) and things had been running smoothly until one of the 3 hosts became unresponsive. SSH, the web interface and even the console are unusable: SSH and the console prompt for the login and then never come back. I have checked that the VMs are still running. For some of the VMs that I shut down, I manually "moved" them to another host by moving their config files into the directory where the recipient host keeps its own config files, and that seemed to work. I am concerned about the following:
1. First of all, has anyone seen this scenario? When does it typically happen? It has never happened to us in 2-3 years of using Proxmox, so it would seem to be a bug in this release. What information do you need me to collect for the investigation once the host is back up and responsive?
2. What will happen when I reset the host (which seems to be the only way out of this condition, after shutting down all the VMs controlled by it)? I have no idea whether its config directory will be replicated from the other nodes or whether it will still "think" it owns those VMs.
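For clarity, the manual "move" I describe above amounts to roughly the following sketch. It assumes the default Proxmox VE layout, where each node keeps its VM configs under /etc/pve/nodes/<node>/qemu-server/<vmid>.conf. The function name, its parameters, and any node names or VM IDs passed to it are placeholders for illustration, not the exact commands I ran.

```python
import os
from pathlib import Path

# Assumed default location of the per-node config tree on a Proxmox VE cluster.
PVE_NODES = Path("/etc/pve/nodes")


def move_vm_config(vmid, src_node, dst_node, nodes_root=PVE_NODES):
    """Move <vmid>.conf from src_node's qemu-server directory to dst_node's.

    Only the config file is moved; this does nothing to the VM's disks
    and does not check whether the VM is still running on src_node.
    """
    src = Path(nodes_root) / src_node / "qemu-server" / f"{vmid}.conf"
    dst_dir = Path(nodes_root) / dst_node / "qemu-server"

    if not src.is_file():
        raise FileNotFoundError(src)
    if not dst_dir.is_dir():
        raise NotADirectoryError(dst_dir)

    dst = dst_dir / src.name
    if dst.exists():
        # Refuse to overwrite a config that already exists on the target node.
        raise FileExistsError(dst)

    os.rename(src, dst)
    return dst
```

With hypothetical names, a call would look like `move_vm_config(101, "node-a", "node-b")`, run on one of the two hosts that still respond.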
Thoughts? Thanks