lonegroover
Guest
Hi.
I've recently inherited a Proxmox VE cluster from a sysadmin colleague who is now no longer with our company.
I've been very impressed with it so far. The web admin interface is very slick and the features are quite impressive, especially the live migration facility.
The cluster consists of two servers, each with access to a mirrored NAS array where the individual VMs are represented as logical volumes under /dev/drbdvg{0,1}, e.g. /dev/drbdvg1/vm-106-disk-1.
However, although it has run painlessly and smoothly for months now, one of the servers (called pves1) in the cluster failed yesterday.
I had expected the surviving server (pves2) to take over the VMs automatically on detecting that pves1 was down, but this didn't happen. All of the (KVM) VMs hosted on pves1 simply crashed with it.
So:
1. Am I right in my assumption that pves2 should have taken over the VMs painlessly on detecting that pves1 was down, or is this not supported?
2. Assuming I am right, how can we fix this?
3. If my assumption above is wrong, is there any way to migrate and restart the VMs from the dead half of the cluster while it's still down? (The data centre is unfortunately quite remote.)
Grateful for any help / advice.