[SOLVED] 2 node cluster failed - VM migration and recovery

RAAD

Dec 2, 2016
After reading several useful posts, here is a small contribution that summarizes the solution as a step-by-step process:

Scenario:
- 2-node cluster (HA not available).
- 1 node goes down.
- The VM from the failed node must be moved to the node that is still working.
Steps:
1. Set the expected votes to 1 so the surviving node has quorum on its own:

# pvecm expected 1
2. Switch /etc/pve to local mode
2.1. Edit /etc/default/pve-cluster and set:
DAEMON_OPTS="-l"
2.2. Restart the cluster filesystem (on systemd-based releases, "systemctl restart pve-cluster" does the same):
# /etc/init.d/pve-cluster restart
3. Reassign the VM from the unavailable node to the available one by moving its config file (here VM 101, from node02 to node01):

# mv /etc/pve/nodes/node02/qemu-server/101.conf /etc/pve/nodes/node01/qemu-server/
4. Switch /etc/pve back to normal (non-local) mode
4.1. Edit /etc/default/pve-cluster and set:
DAEMON_OPTS=""
4.2. Restart the cluster filesystem:
# /etc/init.d/pve-cluster restart
5. Go to the GUI and start the migrated VM
6. Disconnect the failed node from the network
7. When it comes back up, remove the migrated VM's config from it (run all 7.* steps on the failed node):

7.1. Edit /etc/default/pve-cluster and set:
DAEMON_OPTS="-l"
7.2. Restart the cluster filesystem:
# /etc/init.d/pve-cluster restart
7.3. Remove the stale VM config:
# rm /etc/pve/nodes/node02/qemu-server/101.conf
7.4. Switch /etc/pve back to normal (non-local) mode
7.4.1. Edit /etc/default/pve-cluster and set:
DAEMON_OPTS=""
7.4.2. Restart the cluster filesystem:
# /etc/init.d/pve-cluster restart
8. Reconnect the failed node to the network; it will rejoin the cluster after you reboot it.
When you run
# pvecm status
the Votequorum information will list both nodes again.
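
For anyone who wants to script the file edits in steps 2 to 4, here is a minimal sketch. It is not an official Proxmox tool. The function names (set_daemon_opts, move_vm_conf) and the two variables PVE_DEFAULTS and PVE_NODES_DIR are my own assumptions, added so the functions can be tried against a scratch directory before touching the real /etc/pve. It assumes GNU sed (for "sed -i"), which is what a Debian-based node ships. It only edits the files; restarting pve-cluster is still done by hand as in the steps above.

```shell
#!/bin/sh
# Hypothetical helper variables: override them to test against a scratch copy.
PVE_DEFAULTS="${PVE_DEFAULTS:-/etc/default/pve-cluster}"
PVE_NODES_DIR="${PVE_NODES_DIR:-/etc/pve/nodes}"

# set_daemon_opts VALUE
# Writes DAEMON_OPTS="VALUE" into the defaults file: replaces an existing
# DAEMON_OPTS line, or appends one if the file has none (steps 2.1 and 4.1).
set_daemon_opts() {
    opts="$1"
    if [ -f "$PVE_DEFAULTS" ] && grep -q '^DAEMON_OPTS=' "$PVE_DEFAULTS"; then
        sed -i "s|^DAEMON_OPTS=.*|DAEMON_OPTS=\"$opts\"|" "$PVE_DEFAULTS"
    else
        printf 'DAEMON_OPTS="%s"\n' "$opts" >> "$PVE_DEFAULTS"
    fi
}

# move_vm_conf VMID FROM_NODE TO_NODE
# Moves a VM config between node directories (step 3). It refuses to run if
# the source is missing or the target already has a config with that VMID.
move_vm_conf() {
    vmid="$1"; from="$2"; to="$3"
    src="$PVE_NODES_DIR/$from/qemu-server/$vmid.conf"
    dst_dir="$PVE_NODES_DIR/$to/qemu-server"
    [ -f "$src" ] || { echo "missing: $src" >&2; return 1; }
    [ -d "$dst_dir" ] || { echo "missing: $dst_dir" >&2; return 1; }
    [ ! -e "$dst_dir/$vmid.conf" ] || { echo "exists: $dst_dir/$vmid.conf" >&2; return 1; }
    mv "$src" "$dst_dir/"
}
```

Usage on the surviving node would follow the same order as the manual steps: set_daemon_opts "-l", restart pve-cluster, move_vm_conf 101 node02 node01, set_daemon_opts "", restart pve-cluster again.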

References: