Testing HA with 5.3

Discussion in 'Proxmox VE: Installation and configuration' started by txsastre, Dec 8, 2018.

  1. txsastre

    txsastre New Member

    Joined:
    Jan 6, 2015
    Messages:
    17
    Likes Received:
    0
    I have 3 hosts. If one is stopped, within 2-3 minutes I can see the VMs that host was holding start on another host. But when there is only 1 host remaining, it does not bring up the VMs that were running.

    I realised that running "pvecm expected 1" makes the VMs start.
    The cluster information says that every node has "votes=1".
    Is this "votes 1" related to "pvecm expected 1"?

    Can the cluster be configured to keep working when there is only 1 host left?
     
  2. spirit

    spirit Well-Known Member

    Joined:
    Apr 2, 2010
    Messages:
    3,302
    Likes Received:
    131
    HA can only work if you have quorum. It's not possible to get it to work if you lose more than half of the total number of nodes.

    If you have only 1 node up (and you are sure that the others are down), you can run "pvecm expected 1" to say "I need only 1 vote to have quorum". That makes /etc/pve writable again, so you can restart VMs on it.

    (But don't forget to set pvecm expected back to the previous value when the other nodes are back.)
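
    Roughly, the recovery sequence on the surviving node looks like this (a minimal sketch, assuming a 3-node cluster where each node has 1 vote):

        # check cluster and quorum state on the surviving node
        pvecm status

        # lower the expected votes so that 1 vote is enough for quorum;
        # this makes /etc/pve writable again
        pvecm expected 1

        # ...start the VMs you need, e.g. qm start <vmid>...

        # when the other 2 nodes are back, restore the original value
        pvecm expected 3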
     
    txsastre likes this.
  3. txsastre

    txsastre New Member

    Joined:
    Jan 6, 2015
    Messages:
    17
    Likes Received:
    0
    Hi, thanks for the reply. Sorry for the delay, pretty busy here :(

    Question: if I have 5 hosts and 3 go down, HA will not work with the 2 remaining hosts?

    Yes, I tried "pvecm expected 1" and it works, but I have to read more to understand how Proxmox works in these situations. I'm used to XenServer, which can work with only 1 host; I also tested oVirt and it also works without touching anything.

    Thanks
     
  4. gosha

    gosha Member

    Joined:
    Oct 20, 2014
    Messages:
    274
    Likes Received:
    15
    Hi!
    HA needs quorum, and quorum means more than half of the nodes.
    For 5 nodes, more than half (quorum) is more than 5/2 = 2.5, i.e. 3 nodes.
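
    In general, with 1 vote per node, quorum is floor(n/2) + 1 votes, so the same arithmetic spelled out for a few cluster sizes:

        Nodes (n)   Quorum   Node failures tolerated
        2           2        0
        3           2        1
        4           3        1
        5           3        2
        7           4        3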

    Best regards,
    Gosha
     
    #4 gosha, Dec 14, 2018
    Last edited: Dec 14, 2018
    txsastre likes this.
  5. txsastre

    txsastre New Member

    Joined:
    Jan 6, 2015
    Messages:
    17
    Likes Received:
    0
    OK, thank you.

    For example, if those 3 nodes are down, can I use "pvecm expected 1" on the 2 remaining nodes to start all the VMs and CTs?

    I'm studying Proxmox as much as I can because I want to migrate from XenServer to another hypervisor, and I'm trying to find the option that best fits us.

    Our scenario is:
    a primary CPD with 3 nodes and 40-50 VMs stored on an iSCSI SAN,
    and a secondary (backup) CPD with the same configuration.

    Thanks again!
     
  6. Andrew Hart

    Andrew Hart Member

    Joined:
    Dec 1, 2017
    Messages:
    67
    Likes Received:
    9
    Hi,
    What would happen to cause 3 out of 5 nodes to die? I think if you had a five-node cluster, the most you would expect is 1 failure at a time. There is also the point that if you did get it working the way you want, all 40-50 VMs could end up running on one node.
    Your idea of having two physically separated 3-node clusters is a good one, BUT if one is just a secondary backup and a single node can run all 40-50 VMs... why not have 5 nodes in the primary and 1 in the backup?

    Even better would be a nine-node cluster spread over three datacentres, but who has three?
    Andrew
     
  7. gosha

    gosha Member

    Joined:
    Oct 20, 2014
    Messages:
    274
    Likes Received:
    15
    I agree with this approach, and I use a similar configuration at work.
    I have 4 nodes (a 5th node is coming soon) in the main cluster (Ceph storage),
    and a single node running Proxmox VE as a backup server (for VMs and CTs from the main cluster). Its storage is 4 HDDs in ZFS RAIDZ-1.

    Gosha
     
  8. txsastre

    txsastre New Member

    Joined:
    Jan 6, 2015
    Messages:
    17
    Likes Received:
    0
    Hi there.

    Well, we want to have 2 physically separated CPDs for security reasons.
    For example, unfortunately this week we had a disaster situation where everything was lost. We still don't know why, but we lost a host (OS failure) and all the VMs, yes, all of them.

    So in this situation, if I had had 5 nodes instead of 3, the problem would have been the same.

    That's why I prefer to have 2 CPDs with 3 nodes and a SAN each, rather than one huge CPD (3 hosts, because that is the minimum needed for HA).

    By the way, I could barely start all my VMs with only one backup node, and if I did that, I would be running everything on only one host, and I don't want that scenario either.

    But thanks for your point of view; there are different ways to attack a problem.
     
    #8 txsastre, Dec 15, 2018
    Last edited: Dec 15, 2018
  9. gosha

    gosha Member

    Joined:
    Oct 20, 2014
    Messages:
    274
    Likes Received:
    15
    Hi!
    Do you understand that a VM on a cluster node is just a configuration file?
    This configuration file allows the node to allocate the necessary resources for the VM.
    And in your case, the disks for all VMs are stored on the SAN storage.

    If you are using HA and have lost 2 of 3 nodes, the cluster loses quorum and will not work.
    But the question arises: what can you do in this case?
    :eek:
    You can find the configuration files for all VMs from the lost nodes on the remaining node,
    in the /etc/pve/nodes/<nodename>/qemu-server/ directory.
    You can use them to start the VMs from the lost nodes on the remaining node (by moving the
    config files to the corresponding directory of the remaining node), since the disks for all the lost VMs are available on the SAN.
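
    For example, a minimal sketch of starting VM 100 from a lost node "pve2" on the surviving node "pve1" (the node names and VMID here are placeholders, not from this thread):

        # /etc/pve must be writable, so restore quorum first if needed
        pvecm expected 1

        # move the config from the lost node's directory to the survivor's
        mv /etc/pve/nodes/pve2/qemu-server/100.conf \
           /etc/pve/nodes/pve1/qemu-server/

        # start the VM; its disks are already reachable on the shared SAN
        qm start 100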

    If the lost nodes are not destroyed and can be returned to the cluster, then you simply return them
    and migrate the needed VMs back onto the returned nodes.

    If the lost nodes cannot simply be returned (for example, due to loss of the OS),
    then you first need to remove the lost nodes from the cluster (required! Read the documentation on deleting a node)
    and then re-create them. A Proxmox VE installation on a replacement node takes 15-20 minutes (Debian install method).

    Best regards,
    Gosha
     
    #9 gosha, Dec 15, 2018
    Last edited: Dec 15, 2018
    txsastre likes this.
  10. txsastre

    txsastre New Member

    Joined:
    Jan 6, 2015
    Messages:
    17
    Likes Received:
    0
    Hi gosha.

    Thanks for your contribution.
    The hypervisor we're using right now is XenServer.

    I really appreciate the description of how HA works. Indeed, last weekend I spent a lot of time learning this, and it is exactly as you have described. Thanks!

    One thing I really liked about Proxmox is how simply it seems to do things, and that everything looks very transparent: it's easy to know how it works, and it's also easy to act when a disaster happens. XenServer is not so easy to understand under the hood, and it's a terrible headache when a host is down and you cannot reach the VMs.

    Once we have the new servers, I will try Proxmox, and if it's good enough for us, we will migrate from XenServer to Proxmox.

    I'm really excited about Proxmox 5.3.
     
  11. gosha

    gosha Member

    Joined:
    Oct 20, 2014
    Messages:
    274
    Likes Received:
    15
    I think you'll like Proxmox VE. ;)
     
    txsastre likes this.