[SOLVED] 2 node cluster failed - VM migration and recovery

RAAD

Dec 2, 2016
After reading several useful posts, here is a small contribution that summarizes the solution as a step-by-step process:

Scenario:
- 2-node cluster (no HA available).
- 1 node goes down.
- The VM from the failed node must be moved to the node that is still working.
Steps:
1. Set the expected quorum votes to 1 on the surviving node:

# pvecm expected 1
2. Switch /etc/pve to local mode
2.1. Edit /etc/default/pve-cluster and set
DAEMON_OPTS="-l"
2.2. Restart the cluster filesystem:
# /etc/init.d/pve-cluster restart
3. Assign VM from unavailable node to the available one:

# mv /etc/pve/nodes/node02/qemu-server/101.conf /etc/pve/nodes/node01/qemu-server/
4. Return /etc/pve to normal (non-local) mode
4.1. Edit /etc/default/pve-cluster and set
DAEMON_OPTS=""
4.2. Restart the cluster filesystem:
# /etc/init.d/pve-cluster restart
5. Go to the GUI and start the migrated VM
6. Disconnect the failed node from the network
7. When it comes back up, remove the migrated VM from it (run all 7.* steps on the failed node):

7.1. Edit /etc/default/pve-cluster, set
DAEMON_OPTS="-l"
7.2. Restart the FS, run:
# /etc/init.d/pve-cluster restart
7.3. Remove VM:
# rm /etc/pve/nodes/node02/qemu-server/101.conf
7.4. Return /etc/pve to normal (non-local) mode
7.4.1. Edit /etc/default/pve-cluster, set
DAEMON_OPTS=""
7.4.2. Restart the FS, run:
# /etc/init.d/pve-cluster restart
8. Reconnect the failed node to the network; it will rejoin the cluster after you reboot it.
When running
# pvecm status
it will show the Votequorum information, including the second node.
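
The file edits in steps 2 to 4 could be wrapped in two small shell helpers so the same commands are not retyped on each node. This is only a rough sketch, not a tested recovery tool: the function names set_daemon_opts and move_vm_conf and the variables PVE_DEFAULTS and PVE_NODES are made up for this example, and the default paths are simply the ones used in the steps above. Restarting pve-cluster is still done by hand as in the steps.

```shell
#!/bin/sh
# Sketch only. PVE_DEFAULTS and PVE_NODES are hypothetical variables introduced
# here so the paths can be overridden; the defaults are the paths from the post.
PVE_DEFAULTS="${PVE_DEFAULTS:-/etc/default/pve-cluster}"
PVE_NODES="${PVE_NODES:-/etc/pve/nodes}"

# set_daemon_opts VALUE
# Drop any existing DAEMON_OPTS line and append DAEMON_OPTS="VALUE".
set_daemon_opts() {
    opts="$1"
    tmp="${PVE_DEFAULTS}.new"
    {
        # Copy every line except an existing DAEMON_OPTS assignment.
        grep -v '^DAEMON_OPTS=' "$PVE_DEFAULTS" 2>/dev/null
        printf 'DAEMON_OPTS="%s"\n' "$opts"
    } > "$tmp" && mv "$tmp" "$PVE_DEFAULTS"
}

# move_vm_conf VMID FROM_NODE TO_NODE
# Move a VM config between node directories, refusing to overwrite a target.
move_vm_conf() {
    vmid="$1"; from="$2"; to="$3"
    src="$PVE_NODES/$from/qemu-server/$vmid.conf"
    dst="$PVE_NODES/$to/qemu-server/$vmid.conf"
    if [ ! -f "$src" ]; then
        echo "no such config: $src" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "refusing to overwrite: $dst" >&2
        return 1
    fi
    mv "$src" "$dst"
}
```

With these loaded, steps 2.1, 3 and 4.1 would look like `set_daemon_opts "-l"`, `move_vm_conf 101 node02 node01` and `set_daemon_opts ""`, with the pve-cluster restarts from 2.2 and 4.2 in between. The overwrite check is there because a config with the same VMID on the target node would otherwise be silently replaced.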

