Former Cluster, just disappears

Discussion in 'Proxmox VE: Installation and configuration' started by BloodyIron, Apr 10, 2019.

  1. BloodyIron

    BloodyIron Member

    Joined:
    Jan 14, 2013
    Messages:
    193
    Likes Received:
    4
    I'm not sure when this happened, but recently when I check the "Cluster" section of my Datacentre, it says there's no cluster. Except, it has been a cluster for over 5 years now, currently with 5x nodes.

    The whole cluster is 5.3-11, and I have been maintaining and upgrading it since about 2.3, following the proper documentation over the years for upgrading and such.

    Now, I have no idea what happened. The "join info" button is greyed out, and I can only "Create Cluster" or "Join Cluster".

    But I can still do Clustery things, like Live Migrate, manage the whole cluster from any node, etc.

    What on EARTH is going on?!?! Halp!
     
  2. BloodyIron

    Yeah, the Datacenter section even lists "Cluster Nodes"... wtf
     
  3. dcsapak

    dcsapak Proxmox Staff Member
    Staff Member

    Joined:
    Feb 1, 2016
    Messages:
    3,679
    Likes Received:
    338
    can your nodes resolve the node names of the others?
    what is the output of
    Code:
    pvesh get /cluster/config/join
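
The name-resolution question can also be checked directly from a shell on each node. The following is a minimal sketch, assuming `getent` is available (it is on stock Debian-based systems); the helper name `check_resolution` is made up for illustration, and the node names are the ones used later in this thread.

```shell
#!/bin/sh
# Report whether each given host name resolves through the system resolver
# (/etc/hosts, DNS, ...). Prints one "OK <name>" or "FAIL <name>" line per name.
# check_resolution is a hypothetical helper, not a Proxmox command.
check_resolution() {
    for name in "$@"; do
        if getent hosts "$name" >/dev/null 2>&1; then
            echo "OK   $name"
        else
            echo "FAIL $name"
        fi
    done
}

# Example call (node names taken from this thread):
# check_resolution small1 small2 dormant1 dormant2 bigboy1
```

Any name reported as FAIL on a node is one that node cannot resolve, which is the situation the question above is probing for.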
    
     
  4. BloodyIron

    I think you're onto something about the DNS aspect, I did make a DNS change recently. I'll have to check that out and get back to you!
     
  5. BloodyIron

    Okay it looks like I messed up a good while ago, and I'm not sure the best course to correct this.

    I have 5x nodes.

    Small1
    Small2
    Dormant1
    Dormant2
    BigBoy1

    Early history, the cluster was

    Small1
    Small2

    Then I got new servers, and added them to the cluster

    Small1
    Small2
    Dormant1
    Dormant2

    However, Dormant1 and Dormant2 were very loud and power inefficient. I primarily have them because they were $0, and have a lot of RAM (DDR2 though), so I would only turn them on when I need to lab large stuff, hence the name "Dormant". As such, I gave Small1 and Small2, 2 votes each, and Dormant1 and Dormant2, 1 vote each. I would also turn on Dormant1 and Dormant2 if I needed to upgrade/update/reboot Small1 OR Small2 for whatever reason.
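
The vote weighting described above can be checked with a little arithmetic. This is a rough sketch, assuming quorum is a strict majority of all configured votes (integer division of the total by two, plus one); the variable and function names are illustrative only.

```shell
#!/bin/sh
# Votes per node as described above (BigBoy1 had not joined yet).
small1=2; small2=2; dormant1=1; dormant2=1

# Strict majority (an assumption about the quorum rule):
# integer division by two, then add one.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

total=$(( small1 + small2 + dormant1 + dormant2 ))   # 6 votes configured
needed=$(quorum_needed "$total")                     # 4 votes needed
online=$(( small1 + small2 ))                        # dormant nodes powered off

if [ "$online" -ge "$needed" ]; then
    echo "quorate: $online of $total votes online, $needed needed"
else
    echo "not quorate: $online of $total votes online, $needed needed"
fi
```

With both dormant nodes off this prints `quorate: 4 of 6 votes online, 4 needed`. Under the same assumption, a single Small node on its own has only 2 votes, which is why both dormant nodes have to be powered on before rebooting the other Small node (2 + 1 + 1 = 4).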

    Then I got a new server, and I think this is where I done goofed. I added BigBoy1.

    Small1
    Small2
    Dormant1
    Dormant2
    BigBoy1

    However, I just turned Dormant1 and Dormant2 on, and not only do they not see BigBoy1, but they are also on 5.2.x, whereas my other nodes are on 5.3.x.

    So I think I goofed in that I forgot to turn Dormant1 and Dormant2 on when BigBoy1 joined the cluster. And now Dormant1 and Dormant2, when I turn them on, only see themselves as "online". And when Dormant1 and Dormant2 are on, the rest of the cluster does not see them as online.
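
One way to confirm that the dormant nodes are carrying an older cluster configuration is to compare the `config_version` value in each node's corosync.conf. This is only a sketch: it assumes the file contains a `config_version: N` line (normally in the `totem` section), and `config_version` here is a made-up helper name.

```shell
#!/bin/sh
# Print the config_version value found in a corosync.conf-style file.
# Assumes a line of the form "  config_version: 7".
config_version() {
    awk -F': *' '/^[ \t]*config_version:/ { print $2; exit }' "$1"
}

# Example (run on each node and compare the numbers):
# config_version /etc/corosync/corosync.conf
```

A lower number on dormant1 and dormant2 than on small1 would be consistent with those nodes having missed the change that added BigBoy1.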

    When I run the command you advised me to run, I get this output:

    "hostname lookup 'dormant1' failed - failed to get address info for: dormant1: Name or service not known"

    So, at this point, it looks like my cluster is in a bad state, and I'm not sure what the appropriate steps are to address this. Please help!
     
  6. BloodyIron

    And now I can't edit my above post to fix typos because "it looks spammy"... wtf?
     
  7. dcsapak

    first i would add your nodes to /etc/hosts so that they can resolve each other

    then i would stop pve-cluster and corosync on those nodes, copy the corosync.conf from /etc/corosync/ on a working node to those nodes, and restart corosync and pve-cluster. then they should see each other again
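
The procedure above could look roughly like the following when run on each out-of-date node. It is a dry-run sketch that only prints the commands so they can be reviewed first. It assumes root SSH access between nodes and that small1 is the node with the good configuration (as clarified later in this thread); `recovery_steps` is a hypothetical helper name.

```shell
#!/bin/sh
# Print (not run) the recovery commands for one stale node.
# $1 = a node with a known-good corosync.conf (small1 in this thread).
recovery_steps() {
    src="$1"
    echo "systemctl stop pve-cluster corosync"
    echo "scp root@${src}:/etc/corosync/corosync.conf /etc/corosync/corosync.conf"
    echo "systemctl start corosync"
    echo "systemctl start pve-cluster"
}

# Review the output, then run the commands by hand on dormant1 and dormant2:
# recovery_steps small1
```

Printing the commands instead of executing them keeps the sketch safe to try, since stopping cluster services on the wrong node would make things worse.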
     
  8. BloodyIron

    Okay, I just want to clarify so I follow your directions precisely:

    1. Add dormant1 and dormant2 to the hosts of small1, small2 and bigboy1
    2. On dormant1 and dormant2, stop pve-cluster
    3. FROM small1, copy /etc/corosync/corosync.conf TO dormant1 and dormant2 in the same location
    4. On dormant1 and dormant2, restart corosync and pve-cluster

    Is the hosts thing a temporary thing? Because right now small1 does not have any entry in /etc/hosts for anything but itself.

    Also, thanks for your help! :D Please let me know if I missed any details here.


     
  9. BloodyIron

    Any chance I can get your clarification on the above, please? :) I'm holding off on executing until I hear from you.

     
  10. dcsapak

    looks good

    generally the nodes should be able to resolve all other node names, this can be via /etc/hosts or any other means (e.g. your local dns server)
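
If /etc/hosts is the chosen route, entries can be added idempotently with something like the sketch below. The addresses are placeholders from the documentation range 192.0.2.0/24, not the real ones from this cluster, and `add_host_entry` is a hypothetical helper.

```shell
#!/bin/sh
# Append "ip name" to a hosts file unless the name is already listed.
# Note: this simple check also treats a commented-out entry as present.
add_host_entry() {
    file="$1"; ip="$2"; name="$3"
    if grep -Eq "[[:space:]]${name}([[:space:]]|\$)" "$file" 2>/dev/null; then
        echo "skip $name (already present)"
    else
        printf '%s %s\n' "$ip" "$name" >> "$file"
        echo "added $name"
    fi
}

# Example with placeholder addresses (replace with the real ones):
# add_host_entry /etc/hosts 192.0.2.14 dormant1
# add_host_entry /etc/hosts 192.0.2.15 dormant2
```

Running it twice for the same name leaves the file unchanged the second time, so it is safe to repeat on every node.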
     
  11. BloodyIron

    WOOT! IT WORKED!

    The cluster section under Data Centre now has "Join Cluster" enabled and "Create Cluster" greyed out! It says it's a cluster, and all that!
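
The same result can be confirmed from the command line. The sketch below assumes `pvecm status` prints a line of the form `Quorate:          Yes` when the cluster has quorum; the exact output layout is an assumption, and `is_quorate` is a made-up helper that only reads text from stdin.

```shell
#!/bin/sh
# Succeeds if the text on stdin contains a "Quorate: Yes" line.
# The line format is an assumption about `pvecm status` output.
is_quorate() {
    grep -Eq '^Quorate:[[:space:]]+Yes'
}

# Example:
# pvecm status | is_quorate && echo "cluster is quorate"
```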

    Thanks a tonne!

    I'm commenting out the manual resolution in the hosts file for the time being (now that they're rejoined), simply to keep things consistent across the nodes. And if that somehow breaks things, I may set it back up. Yay! :DDD

     