Master failure

docent · Aug 4, 2009

Hi, everybody!

Let's say I have a cluster of three computers.
What happens if the master fails?

tom · Aug 11, 2009

docent said:
Hi, everybody!

Let's say I have a cluster of three computers.
What happens if the master fails?

All VMs running on the master are down.

You have two possibilities:

Repair the master and bring it up again (e.g. replace the faulty piece of hardware).
The master is gone forever: promote one of the remaining nodes to be the new master and restore the VMs from backup.

RCK · Oct 23, 2009

Hello Tom,

Now that Proxmox 1.4 is released, I have set up a master-slave cluster system with DRBD for storage - http://pve.proxmox.com/wiki/DRBD
The live migration is fantastic! Dumping the RAM and starting the VM on the slave node works perfectly.

Now the question:
On my master I have one VM (shared DRBD / LVM).
On my slave I have no VM.
My master burns and its hardware is dead (simulated by unplugging the power).
My slave is alone. It can be promoted to master with "pveca -m" and it has the VM disk data on DRBD, but it doesn't seem to have the VM conf file under /etc/qemu-server.

Is there any way to start my VM again on my slave?

dietmar · Oct 23, 2009

RCK said:
Is there any way to start my VM again on my slave?

We are currently working on HA - this will be the next feature we implement.

RCK · Oct 23, 2009

Yes, this is very good news.

Last question: if I back up the qemu conf file from the master, could I simply restore it on the slave and restart the VM?

tom · Oct 23, 2009

RCK said:
Yes, this is very good news.

Last question: if I back up the qemu conf file from the master, could I simply restore it on the slave and restart the VM?

Yes, all backups can be restored on any host running Proxmox VE (with vzrestore or qmrestore).

bogomolov · Oct 23, 2009

Hi guys

Just to be sure: _until now_ the cluster layer only provides a simple way to configure/manage VMs/containers across all nodes from the master, using only one login/screen. Is that correct?

RCK · Oct 23, 2009

tom said:
Yes, all backups can be restored on any host running Proxmox VE (with vzrestore or qmrestore).

I was only speaking about the configuration file - for example /etc/qemu-server/101.conf

If I manage to save this conf file and copy it to the slave, will the slave be able to start the VM (by reading the data on the DRBD disk)?

RCK · Oct 23, 2009

bogomolov said:
Hi guys

Just to be sure: _until now_ the cluster layer only provides a simple way to configure/manage VMs/containers across all nodes from the master, using only one login/screen. Is that correct?

It can also migrate VMs between any nodes, including live migration (by copying RAM) if you are using a shared disk (NFS, etc.) or DRBD live replication.

dietmar · Oct 23, 2009

RCK said:
Last question: if I back up the qemu conf file from the master, could I simply restore it on the slave and restart the VM?

Yes, that works if the VM uses shared storage, but you need to make sure that the original VM is no longer running (fencing).
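
A minimal Python sketch of that restore step, assuming the conf file was saved somewhere reachable from the slave beforehand. The function name `restore_vm_config` and the `original_host_fenced` flag are hypothetical and only illustrate the rule "no restore without fencing"; the script does not fence anything itself, the operator has to confirm that the failed node is really off.

```python
import shutil
from pathlib import Path


def restore_vm_config(backup_conf, conf_dir, original_host_fenced):
    """Copy a saved VM config into conf_dir, but only after fencing.

    backup_conf: path to the saved copy, e.g. a copy of 101.conf
    conf_dir: the node's config directory (/etc/qemu-server in this thread)
    original_host_fenced: True only if the operator has confirmed that the
        failed node is powered off and can no longer touch the shared disk
        (hypothetical flag, nothing here performs the fencing)
    """
    if not original_host_fenced:
        raise RuntimeError("refusing to restore: original host is not fenced")

    backup_conf = Path(backup_conf)
    target = Path(conf_dir) / backup_conf.name
    if target.exists():
        # Never overwrite a config that is already present on this node.
        raise FileExistsError(f"{target} already exists on this node")

    shutil.copy2(backup_conf, target)
    return target
```

The refusal when the flag is false is deliberate: if the same VM ends up running on both nodes against the same DRBD volume, the disk data can be corrupted.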

RCK · Oct 23, 2009

dietmar said:
Yes, that works if the VM uses shared storage, but you need to make sure that the original VM is no longer running (fencing).

Perfect, that's what I thought

docent · Oct 24, 2009

I think the best way to do this is to keep the config files of all VMs on every node of the cluster and to mark in each config file (or by some other method) the host on which the VM is running. If the node that was running the VM goes down, any other randomly selected node can run that VM.
I have implemented this method on my PVE 1.3 and it works fine.
Still, I think it would be better if virtual machines were not tied to a particular cluster node but were virtual in the cluster's cloud.
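
To make the idea concrete, here is a rough Python sketch of the selection logic only. It is not the actual script from the PVE 1.3 setup described above; the `owners` map stands in for the "host mark" in each config file, and the node names are made up.

```python
import random


def reassign_vms(owners, alive_nodes, rng=random):
    """Return a new owner map where VMs of dead nodes move to live nodes.

    owners: dict mapping VM id -> node name marked as running that VM
    alive_nodes: collection of node names that are still reachable
    rng: object with a choice() method (the random module by default)
    """
    alive = set(alive_nodes)
    candidates = sorted(alive)
    if not candidates:
        raise RuntimeError("no surviving node can take over")

    new_owners = {}
    for vmid, node in owners.items():
        if node in alive:
            # The owner is still up, so leave the VM where it is.
            new_owners[vmid] = node
        else:
            # The owner is down: pick a random surviving node.
            new_owners[vmid] = rng.choice(candidates)
    return new_owners
```

For example, `reassign_vms({101: "node1", 102: "node2"}, {"node2", "node3"})` keeps VM 102 on node2 and moves VM 101 to node2 or node3. A node that only looks dead from the network side would still need to be fenced before its VMs are started elsewhere.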

docent · Oct 27, 2009

And the master must have a virtual IP address too.

mrshark · Nov 23, 2009

I'm testing a 2-node cluster with DRBD for shared storage and have some questions:

When a VM is started on a node, its conf file is created in /etc/qemu-server/.

If the VM is migrated to the other node, the conf file is created on the other node and DISAPPEARS from the first node!

So, let's say a node running some VMs fails and is taken away for repair. Even if I have the VM disks on DRBD, HOW could I start the VM on the other node if I have NO conf file?

I think that, besides HA in future versions, in a cluster the conf file MUST reside on EACH node, or else there's no way to restart the VM!

Or am I missing something? Please let me know...

tom · Nov 23, 2009

mrshark said:
I'm testing a 2-node cluster with DRBD for shared storage and have some questions:

When a VM is started on a node, its conf file is created in /etc/qemu-server/.

If the VM is migrated to the other node, the conf file is created on the other node and DISAPPEARS from the first node!

So, let's say a node running some VMs fails and is taken away for repair. Even if I have the VM disks on DRBD, HOW could I start the VM on the other node if I have NO conf file?

I think that, besides HA in future versions, in a cluster the conf file MUST reside on EACH node, or else there's no way to restart the VM!

Or am I missing something? Please let me know...

You should have a backup of your VMs.

But anyway, we are working on HA for 2.0. If you want it now, you need to do some manual work (e.g. copying the config files somewhere if the vzdump backups are not enough for you).
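
As one possible form of that manual work, here is a small Python sketch that copies every conf file to another directory. The function name `backup_vm_configs` and the destination path in the usage note are assumptions for illustration; only /etc/qemu-server comes from this thread.

```python
import shutil
from pathlib import Path


def backup_vm_configs(conf_dir, backup_dir):
    """Copy every *.conf file from conf_dir into backup_dir.

    Returns the sorted list of file names that were copied.
    """
    conf_dir = Path(conf_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for conf in sorted(conf_dir.glob("*.conf")):
        # copy2 also preserves the modification time of the config file.
        shutil.copy2(conf, backup_dir / conf.name)
        copied.append(conf.name)
    return copied
```

It could be called as `backup_vm_configs("/etc/qemu-server", "/mnt/backup/qemu-server")`, where the second path is a hypothetical location that the other node can still read after a failure. Since the conf file moves with the VM on migration, the copy would have to be refreshed regularly, for example from cron.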
