Good Day All,
I have a question regarding restoring a failed node in a three-node cluster. This is an old implementation, currently running PVE 6.4-14 with a similarly old version of Ceph, which is due to be upgraded and moved in April of this year. While planning the move, one node failed unexpectedly, and HA had not been set up to relocate two of the VMs on failure. Since these are production VMs, we had to choose between restoring from backup and manually moving the VMs. We chose the latter, following section "6.5.2. Recovering/Moving Guests from Failed Nodes" in the Proxmox documentation: https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_recovering_moving_guests_from_failed_nodes
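For context, the step from that section boils down to moving the orphaned config file on a surviving (quorate) node; the sketch below uses placeholder VMIDs 100 and 101 and generic node names in place of our actual ones:

# Run on a node that is still part of the quorate cluster.
# Moving the config file is what reassigns the guest to the new node.
mv /etc/pve/nodes/node1/qemu-server/100.conf /etc/pve/nodes/node2/qemu-server/
mv /etc/pve/nodes/node1/qemu-server/101.conf /etc/pve/nodes/node3/qemu-server/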
The VMs were restored by moving the VM config files out of "/etc/pve/nodes/node1/qemu-server/" into the corresponding folders for nodes 2 and 3, which, as the documentation states, understandably violates Proxmox VE's locking principles. Node 1 was pulled and checked on the bench at our workshop; it booted properly and seems to be working fine, and we will continue to monitor it for the next 48 hours. My question is this: if we were to edit the same qemu-server folder on the offline node, deleting the config files so they do not conflict with the two remaining active cluster nodes where the manually migrated VMs now reside, and then add node 1 back in, would this pose a problem for the cluster? No attempt has been made yet to remove the node from the environment via pvecm. In theory this should work, since everything is stored in Ceph. Kindly let me know if my thinking is wrong. Thank you.
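To be concrete, what I have in mind on the bench node (still disconnected from the cluster network) is roughly the following, with the same placeholder VMIDs. My assumption is that /etc/pve is mounted read-only on a node without quorum, so we would first have to force local quorum with pvecm expected 1 before the stale files can be removed:

# On the isolated node1 only, never on an active cluster member:
pvecm expected 1    # assumption: needed to make the local /etc/pve writable
rm /etc/pve/nodes/node1/qemu-server/100.conf
rm /etc/pve/nodes/node1/qemu-server/101.conf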