Auto live-migrate VM when node fails - is this possible?

Dawid
May 8, 2014
Hello everyone,

What I would like to do:
I have 3 nodes in an HA cluster, configured with fencing and a failover domain, and storage on an external NFS.
When a node fails, the VM is relocated to the next node and started up.
Everything works fine, but every time the VM (re)starts this way, the Linux installed in it runs fsck, and not always successfully :(.

Q1. So I would like to have live migration from the failed node to another ready node in the cluster, but I don't know what I should configure. Is this possible in Proxmox VE 3.2?

px1---(1 Gb)---switch---(1 Gb)---NAS---(1 Gb)---- <- other data coming in from outside
               | | |
px2---(1 Gb)---| | |----------------------------- <- access from outside
                 |
px3---(1 Gb)-----|

Q2. How can I aggregate the nodes' CPU power and memory so that it is all available to one VM?
Let's assume that:
px1: 4 cores, 8 GB RAM
px2: 4 cores, 8 GB RAM
px3: 4 cores, 8 GB RAM
I can start a VM with 4 cores and 8 GB without problems,
but when I try to start a VM with 4 cores and 16 GB it will not start. I don't know what I should configure. Is such aggregation possible in Proxmox VE 3.2? I thought it would be possible once I created the cluster.
 
File system for VMs: HA -> add NFS -> NFS located on an external NAS (the NAS is in the same subnet as the nodes).
The disk image is qcow2; the VM runs with three partitions: ext2 (system), swap, ext4 (the rest).
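(For reference, the resulting NFS entry in /etc/pve/storage.cfg looks roughly like this; the storage name, server address and export path below are only examples:)

nfs: nas-vms
        export /export/vms
        path /mnt/pve/nas-vms
        server 172.16.0.10
        content images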
 
My config looks like this:
<cluster config_version="8" name="cluster">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="172.16.0.66" login="XXX" name="ipmi-2" passwd="YYY" power_wait="5"/>
    <fencedevice agent="fence_ipmilan" ipaddr="172.16.0.67" login="XXX" name="ipmi-3" passwd="YYY" power_wait="5"/>
    <fencedevice agent="fence_ipmilan" ipaddr="172.16.0.68" login="XXX" name="ipmi-4" passwd="YYY" power_wait="5"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="proxmox-2" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device action="reboot" name="ipmi-2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="proxmox-3" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device action="reboot" name="ipmi-3"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="proxmox-4" nodeid="3" votes="1">
      <fence>
        <method name="1">
          <device action="reboot" name="ipmi-4"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <rm>
    <failoverdomains>
      <failoverdomain name="nodefailover" nofailback="0" ordered="0" restricted="1">
        <failoverdomainnode name="proxmox-2"/>
        <failoverdomainnode name="proxmox-3"/>
        <failoverdomainnode name="proxmox-4"/>
      </failoverdomain>
    </failoverdomains>
    <pvevm autostart="1" vmid="101" domain="nodefailover" recovery="relocate"/>
  </rm>
</cluster>
192.168.1.0/24 = Pool for test VMs
172.16.0.0/26 = Pool for Proxmox nodes & NAS
172.16.0.64/26 = Pool for fence devices

I tried adding:
<service autostart="1" exclusive="0" name="TestIPvm101" recovery="relocate">
  <ip address="192.168.1.234"/>
</service>
But it had no effect.
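(To check how the cluster and the HA resource look from the inside, on the PVE 3.x cman/rgmanager stack you can run, on any node:)

pvecm status    # quorum and membership as Proxmox sees it
clustat         # rgmanager view: nodes plus the state and owner of service pvevm:101
fence_tool ls   # fence domain membership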
 
Does fsck take long when run on the ext2 partition? If so, try to use ext4 instead (fsck should be much faster with ext4).
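(A sketch of that conversion inside the guest, assuming the system partition is /dev/vda1 and is unmounted, e.g. booted from a rescue ISO:)

tune2fs -j /dev/vda1                               # add a journal (ext2 -> ext3); an unclean shutdown is then recovered by journal replay instead of a full fsck
tune2fs -O extents,uninit_bg,dir_index /dev/vda1   # enable the ext4 features
fsck.ext4 -fD /dev/vda1                            # a forced full check is required once after the conversion
# finally change that partition's fs type from ext2 to ext4 in the guest's /etc/fstab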
 
Thanks Dietmar for your reply. I know ext4 is better than ext2, but that is not the point of this problem;
I need live migration for the VM when a node goes down, or maybe something like "VM redundancy" or "node redundancy".
Maybe I didn't explain well before what I want.

So I'll try again:
I start VM 101 on proxmox-2 (I wish to run, for example, real-time software there; I need 99.999999% availability for this VM).
So when node proxmox-2 goes down, e.g. because of power loss or something else,
then proxmox-3 should do a fast live migration of VM 101 (without restarting it), or maybe switch over to it.

The point is: software installed on the VM should keep working without interruption.
I would like these nodes to work together like disks in RAID 1, or like redundant power supplies in a server: when one goes down, the computer does not restart, it continues to work as if nothing had happened.
Maybe the solution is something like one VM running at the same time on all nodes, so when one node goes down, the VM keeps working on the remaining nodes.

Q.: Is this possible in PROXMOX? What or how should I configure my nodes to have node redundancy or VM redundancy? I do not expect a ready-made solution; I just need someone to say "it is possible" and show me the direction, or say "it is impossible, don't waste your time :D".
 
This thread, I think, speaks to an issue I'm having while testing out Proxmox. I have a cluster with fencing, a gluster storage domain, and HA configured. I can live-migrate a VM from one node to another just fine, but when I simulate a failure, the VM stays "attached" to that failed node and does not automatically move over to a healthy node.

Is this a feature Proxmox does not yet support? Perhaps I am reading this incorrectly; if I am, I would love it if someone here could point me in the right direction to get failover and HA working properly.

Thanks in advance.
 
but when I simulate a failure, the VM stays "attached" to that failed node and does not automatically move over to a healthy node.

That is not normal; the VM should be migrated and booted on the other node of the domain.
Do you still have quorum after the simulated crash?
Did you attach the VM to the created HA domain like Dawid did?:
<pvevm autostart="1" vmid="101" domain="nodefailover" recovery="relocate"/>

By default Proxmox does not attach the VM to the failover domain; you have to edit the file manually.
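(On PVE 3.x the documented way is to edit a copy rather than the live file, roughly:)

cp /etc/pve/cluster.conf /etc/pve/cluster.conf.new
nano /etc/pve/cluster.conf.new   # raise config_version and add domain="..." to the pvevm line
# then Datacenter -> HA in the GUI shows the pending change and lets you activate it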

Regards

EDIT: By the way, "active redundancy" would be a GREEEAT feature. Is it on the long-term roadmap?
 
...
I start VM 101 on proxmox-2 (I wish to run, for example, real-time software there; I need 99.999999% availability for this VM).
...
Hi,
this sounds like you need to run a bunch of these VMs - all active, with synced databases (or whatever your service needs), on separate clusters with reverse proxies in front of them.
If one VM is down, the reverse proxy(s) send all traffic to another host.

Udo
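(Just to illustrate Udo's idea: a minimal haproxy config in front of two such VMs might look like this; the addresses and the /health check URL are made up:)

frontend app_in
    mode http
    bind 192.168.1.234:80
    default_backend app_vms

backend app_vms
    mode http
    option httpchk GET /health
    server vm101 192.168.1.101:80 check
    server vm102 192.168.1.102:80 check backup   # only receives traffic if vm101's check fails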
 
cjw, don't forget to configure failover domains first; read this https://fedorahosted.org/cluster/wiki/FailoverDomains
and then look at this part of my config:
<rm>
  <failoverdomains>
    <failoverdomain name="nodefailover" nofailback="0" ordered="0" restricted="1">
      <failoverdomainnode name="proxmox-2"/>
      <failoverdomainnode name="proxmox-3"/>
      <failoverdomainnode name="proxmox-4"/>
    </failoverdomain>
  </failoverdomains>
  <pvevm autostart="1" vmid="101" domain="nodefailover" recovery="relocate"/>
</rm>
Important for fencing to work properly: the fence IPs should be in a different subnet than the nodes.
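(Each fence device can also be tested by hand before relying on it, e.g. for ipmi-3:)

fence_ipmilan -a 172.16.0.67 -l XXX -p YYY -o status   # query the power state over IPMI
fence_node proxmox-3                                   # let the cluster fence the node using cluster.conf (this really reboots it!)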
 
Hi,
this sounds like you need to run a bunch of these VMs - all active, with synced databases (or whatever your service needs), on separate clusters with reverse proxies in front of them.
If one VM is down, the reverse proxy(s) send all traffic to another host.
I thought about it, but in this configuration the proxy would be a single point of failure.
 
