pvecm addnode via ssh

Discussion in 'Proxmox VE: Installation and configuration' started by tirili, Oct 1, 2018.

  1. tirili

    tirili New Member

    Is there an option like

    pvecm add NODEIP -use_ssh

    to do the same for

    pvecm addnode?

    Is there any way to change the node's name afterwards?

    Given the following pvecm status output:

    # pvecm status
    Quorum information
    ------------------
    Date: Mon Oct 1 16:10:23 2018
    Quorum provider: corosync_votequorum
    Nodes: 1
    Node ID: 0x00000001
    Ring ID: 1/604
    Quorate: Yes

    Votequorum information
    ----------------------
    Expected votes: 2
    Highest expected: 2
    Total votes: 2
    Quorum: 2
    Flags: Quorate Qdevice

    Membership information
    ----------------------
    Nodeid Votes Qdevice Name
    0x00000001 1 A,V,NMW 1.2.3.1 (local)
    0x00000000 1 Qdevice


    How can I change the name shown in the membership list (1.2.3.1) to the correct node name?
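    For context: the Name column in that membership list appears to be derived from the node's ring0 address (which is why an IP shows up there), while the node name itself is configured in the nodelist of /etc/pve/corosync.conf. A minimal sketch of that section, assuming a cluster created with an IP as ring0_addr (node1 is a placeholder):

    Code:
    nodelist {
      node {
        name: node1
        nodeid: 1
        quorum_votes: 1
        ring0_addr: 1.2.3.1
      }
    }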

    Thanks for your help in advance
     
  2. wolfgang

    wolfgang Proxmox Staff Member
    Hi,

    pvecm addnode is for internal use and you should not use it directly.
     
  3. tirili

    tirili New Member

    Hello Wolfgang, ok, thanks for the info.
    Having issues with multicast, I configured corosync to use udpu; now the pvecm add on the 2nd node ends in a disaster, as it keeps waiting for quorum and never gets any.
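    For reference, unicast is selected with a transport line in the totem section of corosync.conf; a minimal sketch, where everything apart from the transport setting mirrors the configs posted later in this thread:

    Code:
    totem {
      cluster_name: pmxc
      config_version: 2
      transport: udpu
      interface {
        bindnetaddr: 10.40.20.0
        ringnumber: 0
      }
      ip_version: ipv4
      secauth: on
      version: 2
    }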

    What is the recommended way to add a new node and have it join / use the quorum afterwards? (I tried with -votes 0 as well.)
    I configured corosync-qdevice on the 1st node, and it connects perfectly to the corosync-qnetd.
    The second node does not connect via corosync-qdevice, as its corosync.conf does not exist.

    Any further help is appreciated.
    Best regards
    Thomas
     
  4. tirili

    tirili New Member

    We need some clarification.

    omping for the nodes 10.40.20.84 and 10.40.20.20 is working:

    10.40.20.84 : unicast, xmt/rcv/%loss = 5/5/0%, min/avg/max/std-dev = 2.790/2.868/2.925/0.060
    10.40.20.84 : multicast, xmt/rcv/%loss = 5/5/0%, min/avg/max/std-dev = 2.797/2.882/2.940/0.063
    10.40.20.20 : unicast, xmt/rcv/%loss = 5/5/0%, min/avg/max/std-dev = 2.782/2.853/2.918/0.048
    10.40.20.20 : multicast, xmt/rcv/%loss = 5/5/0%, min/avg/max/std-dev = 2.830/2.892/2.933/0.038

    Then I created the cluster

    pvecm create CLUSTER -bindnet0_addr 10.40.20.20 --ring0_addr 10.40.20.20

    Corosync Cluster Engine Authentication key generator.
    Gathering 1024 bits for key from /dev/urandom.
    Writing corosync key to /etc/corosync/authkey.
    Writing corosync config to /etc/pve/corosync.conf
    Restart corosync and cluster filesystem​

    pvecm status
    Quorum information
    ------------------
    Date: Thu Oct 4 13:47:04 2018
    Quorum provider: corosync_votequorum
    Nodes: 1
    Node ID: 0x00000001
    Ring ID: 1/4
    Quorate: Yes

    Votequorum information
    ----------------------
    Expected votes: 1
    Highest expected: 1
    Total votes: 1
    Quorum: 1
    Flags: Quorate

    Membership information
    ----------------------
    Nodeid Votes Name
    0x00000001 1 10.40.20.20 (local)​


    Now I tried to add the other node and did on the node 10.40.20.84:

    pvecm add 10.40.20.20 -votes 0 -ring0_addr 10.40.20.84 -use_ssh

    The authenticity of host '10.40.20.20 (10.40.20.20)' can't be established.
    ECDSA key fingerprint is SHA256:XXXXXXXXXXXX/yGIzWjSteDbjCuTkUl4tBFeDI.
    Are you sure you want to continue connecting (yes/no)? yes
    copy corosync auth key
    stopping pve-cluster service
    backup old database to '/var/lib/pve-cluster/backup/config-1538653699.sql.gz'
    waiting for quorum...OK
    (re)generate node files
    generate new node certificate
    merge authorized SSH keys and known hosts
    generated new node certificate, restart pveproxy and pvedaemon services
    successfully added node '10.40.20.20' to cluster.​



    What I am really wondering is:

    I want to add the node 10.40.20.84 to cluster, but not vice versa!
    Documentation tells

    pvecm add <hostname.of.existing.cluster.member> -ring0_addr <hostname.of.this.node.which.should.be.added>

    I expected another output like "successfully added node '10.40.20.84' to cluster."

    Any help/clarification is highly appreciated!
    Best regards
    Thomas
     
  5. t.lamprecht

    t.lamprecht Proxmox Staff Member
    Why votes 0?

    And why 'use_ssh'? Are the package versions on both nodes not the same?

    Yes, this is strange. Are your /etc/hosts and /etc/hostname correct on both nodes?
     
  6. tirili

    tirili New Member

    /etc/hosts and /etc/hostname are not identical.

    I have in /etc/hosts

    10.40.20.20 v20-int
    10.40.20.84 v84-int
    # and external, official IP
    5.9.20.20 v20
    188.9.20.84 v84

    The hostname matches the name of the official (external) IP.

    Do you have any idea how to solve this?
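    One possible layout, assuming corosync should run over the internal 10.40.20.0/24 network, is to let each node's hostname resolve to its cluster address, roughly like the /etc/hosts posted later in this thread (the domain and the -ext names are placeholders):

    Code:
    10.40.20.20   vm20.example.com vm20 vm20-int
    10.40.20.84   vm84.example.com vm84 vm84-int
    # external, official IPs under separate names
    5.9.20.20     vm20-ext.example.com vm20-ext
    188.9.20.84   vm84-ext.example.com vm84-ext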
     
  7. tirili

    tirili New Member

    What does the parameter "votes" do?
    And I use -use_ssh because I always get an error 500 when trying it via the API.
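    Roughly, each node's quorum_votes value is what it contributes to the vote count, and votequorum grants quorum when more than half of the expected votes are present (quorum = expected_votes / 2 + 1, integer division). A small worked sketch:

    Code:
    2 nodes x 1 vote    ->  expected_votes = 2, quorum = 2  (a single node alone is blocked)
    3 nodes x 1 vote    ->  expected_votes = 3, quorum = 2  (any two nodes suffice)
    pvecm add -votes 0  ->  the new node gets quorum_votes: 0 and never counts toward quorum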
     
  8. tirili

    tirili New Member

    pvecm nodes only shows the local node,

    root@vm20 ~ # pvecm nodes

    Membership information
    ----------------------
    Nodeid Votes Name
    1 1 10.40.20.20 (local)

    root@vm84 ~ # pvecm nodes

    Membership information
    ----------------------
    Nodeid Votes Name
    2 1 10.40.20.84 (local)​

    But corosync.conf on both nodes looks ok

    logging {
      debug: off
      to_syslog: yes
    }

    nodelist {
      node {
        name: vm20
        nodeid: 1
        quorum_votes: 1
        ring0_addr: 10.40.20.20
      }
      node {
        name: vm84
        nodeid: 2
        quorum_votes: 1
        ring0_addr: 10.40.20.84
      }
    }

    quorum {
      provider: corosync_votequorum
    }

    totem {
      cluster_name: cpmx
      config_version: 2
      interface {
        bindnetaddr: 10.40.20.0
        ringnumber: 0
      }
      ip_version: ipv4
      secauth: on
      version: 2
    }

    So what is wrong, and how can I get the cluster up and running?

    I expect pvecm nodes to show both nodes on each system.

    Is there a manual way to solve this?
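    A few generic checks on both nodes can help narrow this down (plain corosync tooling, nothing Proxmox-specific assumed):

    Code:
    systemctl status corosync      # is the daemon running cleanly?
    journalctl -u corosync -b      # look for TOTEM / bind / authkey errors
    corosync-cfgtool -s            # ring status as corosync itself sees it
    corosync-quorumtool -s         # membership and quorum straight from corosync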
     
  9. wolfgang

    wolfgang Proxmox Staff Member
    Please run a longer omping test; 5 pings are not representative.
    If your IGMP snooping is configured incorrectly, the first packets are not blocked yet, so a short test can look fine.

    Code:
    omping -c 10000 -i 0.001 -F -q node1 node2 node3
     
  10. tirili

    tirili New Member

    We have now

    10.40.20.92 : unicast, xmt/rcv/%loss = 9309/9309/0%, min/avg/max/std-dev = 0.214/0.254/0.417/0.023
    10.40.20.92 : multicast, xmt/rcv/%loss = 9309/9309/0%, min/avg/max/std-dev = 0.232/0.290/0.465/0.023
    10.40.20.20 : unicast, xmt/rcv/%loss = 10000/9999/0%, min/avg/max/std-dev = 2.726/2.773/2.923/0.034
    10.40.20.20 : multicast, xmt/rcv/%loss = 10000/9999/0%, min/avg/max/std-dev = 2.751/2.815/2.947/0.029


    Best regards
    Thomas
     
  11. tirili

    tirili New Member

    Still the question: is there any way to add cluster nodes manually? It is really hard to watch pvecm add compromise the master node while the new node waits for quorum and never gets any... Is there any way to add the new node and have it take part in arbitration?
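    One emergency lever that is sometimes used while a node is blocked waiting for quorum is to temporarily lower the expected votes on the side that still holds the cluster config; a sketch, to be used with care, and not something this thread confirms for this exact situation:

    Code:
    pvecm expected 1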
     
  12. wolfgang

    wolfgang Proxmox Staff Member
    The config you sent is correct, so the question is why the second node failed.
    Normally you get an error when you check

    systemctl status corosync

    Also, check on the second node whether /etc/pve/corosync.conf exists and is the same as the one on the first node.
     
  13. tirili

    tirili New Member

    So, from the beginning:
    omping -c 10000 -i 0.001 -F -q 10.40.20.84 10.40.20.92 10.40.20.20
    works:
    10.40.20.84 : unicast, xmt/rcv/%loss = 10000/9998/0%, min/avg/max/std-dev = 2.724/2.770/2.920/0.032
    10.40.20.84 : multicast, xmt/rcv/%loss = 10000/9998/0%, min/avg/max/std-dev = 2.743/2.811/2.930/0.029
    10.40.20.92 : unicast, xmt/rcv/%loss = 9154/9152/0%, min/avg/max/std-dev = 2.725/2.773/2.933/0.029
    10.40.20.92 : multicast, xmt/rcv/%loss = 9154/9152/0%, min/avg/max/std-dev = 2.755/2.811/2.942/0.025
    10.40.20.20 : unicast, xmt/rcv/%loss = 10000/9998/0%, min/avg/max/std-dev = 2.726/2.776/2.905/0.029
    10.40.20.20 : multicast, xmt/rcv/%loss = 10000/9998/0%, min/avg/max/std-dev = 2.747/2.813/2.964/0.026


    all /etc/hosts
    10.40.20.84 vm84-int.proxmox.com vm84-int vm84 vm84.proxmox.com
    10.40.20.20 vm20-int.proxmox.com vm20-int vm20 vm20.proxmox.com
    10.40.20.92 vm92-int.proxmox.com vm92-int vm92 vm92.proxmox.com


    on vm20:

    root@vm20 ~ # pvecm status
    Corosync config '/etc/pve/corosync.conf' does not exist - is this node part of a cluster?
    Cannot initialize CMAP service
    For bindnet0_addr we take the complete cluster network, so we specify 10.40.20.0 (netmask /24)
    root@vm20 ~ # pvecm create pmxc -bindnet0_addr 10.40.20.0 -ring0_addr 10.40.20.20
    Corosync Cluster Engine Authentication key generator.
    Gathering 1024 bits for key from /dev/urandom.
    Writing corosync key to /etc/corosync/authkey.
    Writing corosync config to /etc/pve/corosync.conf
    Restart corosync and cluster filesystem
    root@vm20 ~ # pvecm status
    Quorum information
    ------------------
    Date: Wed Oct 10 21:39:10 2018
    Quorum provider: corosync_votequorum
    Nodes: 1
    Node ID: 0x00000001
    Ring ID: 1/4
    Quorate: Yes
    Votequorum information
    ----------------------
    Expected votes: 1
    Highest expected: 1
    Total votes: 1
    Quorum: 1
    Flags: Quorate
    Membership information
    ----------------------
    Nodeid Votes Name
    0x00000001 1 10.40.20.20 (local)
    root@vm20 ~ #


    Now edit the config for the qdevice and increment config_version (from 1 to 2):

    vi /etc/corosync/corosync.conf

    logging {
      debug: off
      to_syslog: yes
    }
    nodelist {
      node {
        name: vm20
        nodeid: 1
        quorum_votes: 1
        ring0_addr: 10.40.20.20
      }
    }
    quorum {
      provider: corosync_votequorum
      device {
        model: net
        votes: 1
        net {
          tls: off
          host: 9.17.8.23
          port: 5403
          algorithm: ffsplit
        }
      }
    }
    totem {
      cluster_name: pmxc
      config_version: 2
      interface {
        bindnetaddr: 10.40.20.0
        ringnumber: 0
      }
      ip_version: ipv4
      secauth: on
      version: 2
    }
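    As an aside, the Proxmox documentation describes making such edits on the cluster-wide copy /etc/pve/corosync.conf, which then propagates to /etc/corosync/corosync.conf on the members, roughly like this:

    Code:
    cp /etc/pve/corosync.conf /etc/pve/corosync.conf.new
    # edit corosync.conf.new: add the device section, bump config_version
    mv /etc/pve/corosync.conf.new /etc/pve/corosync.conf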

    root@vm20 ~ # pvecm status
    Quorum information
    ------------------
    Date: Wed Oct 10 21:42:46 2018
    Quorum provider: corosync_votequorum
    Nodes: 1
    Node ID: 0x00000001
    Ring ID: 1/4
    Quorate: Yes
    Votequorum information
    ----------------------
    Expected votes: 1
    Highest expected: 1
    Total votes: 1
    Quorum: 1
    Flags: Quorate
    Membership information
    ----------------------
    Nodeid Votes Name
    0x00000001 1 10.40.20.20 (local)


    Now start corosync-qdevice

    root@vm20 ~ # systemctl start corosync-qdevice
    Job for corosync-qdevice.service failed because the control process exited with error code.
    See "systemctl status corosync-qdevice.service" and "journalctl -xe" for details.


    Failed:

    root@vm20 ~ # systemctl status corosync-qdevice.service
    ● corosync-qdevice.service - Corosync Qdevice daemon
    Loaded: loaded (/lib/systemd/system/corosync-qdevice.service; disabled; vendor preset: enabled)
    Active: failed (Result: exit-code) since Wed 2018-10-10 21:43:01 CEST; 1min 33s ago
    Docs: man:corosync-qdevice
    Process: 24504 ExecStart=/usr/sbin/corosync-qdevice -f $COROSYNC_QDEVICE_OPTIONS (code=exited, status=1/FAILURE)
    Main PID: 24504 (code=exited, status=1/FAILURE)
    CPU: 3ms
    Oct 10 21:43:01 vm20 systemd[1]: Starting Corosync Qdevice daemon...
    Oct 10 21:43:01 vm20 corosync-qdevice[24504]: Can't read quorum.device.model cmap key.
    Oct 10 21:43:01 vm20 systemd[1]: corosync-qdevice.service: Main process exited, code=exited, status=1/FAILURE
    Oct 10 21:43:01 vm20 systemd[1]: Failed to start Corosync Qdevice daemon.
    Oct 10 21:43:01 vm20 systemd[1]: corosync-qdevice.service: Unit entered failed state.
    Oct 10 21:43:01 vm20 systemd[1]: corosync-qdevice.service: Failed with result 'exit-code'.
    root@vm20 ~ #
    root@vm20 ~ # systemctl status corosync
    ● corosync.service - Corosync Cluster Engine
    Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
    Active: active (running) since Wed 2018-10-10 21:45:01 CEST; 3s ago
    Docs: man:corosync
    man:corosync.conf
    man:corosync_overview
    Main PID: 24674 (corosync)
    Tasks: 2 (limit: 4915)
    Memory: 37.5M
    CPU: 80ms
    CGroup: /system.slice/corosync.service
    └─24674 /usr/sbin/corosync -f
    Oct 10 21:45:01 vm20 corosync[24674]: [SERV ] Service engine loaded: corosync watchdog service [7]
    Oct 10 21:45:01 vm20 corosync[24674]: [QUORUM] Using quorum provider corosync_votequorum
    Oct 10 21:45:01 vm20 corosync[24674]: [SERV ] Service engine loaded: corosync vote quorum service v1.0 [5]
    Oct 10 21:45:01 vm20 corosync[24674]: [QB ] server name: votequorum
    Oct 10 21:45:01 vm20 corosync[24674]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1 [3]
    Oct 10 21:45:01 vm20 corosync[24674]: [QB ] server name: quorum
    Oct 10 21:45:01 vm20 corosync[24674]: [TOTEM ] A new membership (10.40.20.20:8) was formed. Members joined: 1
    Oct 10 21:45:01 vm20 corosync[24674]: [CPG ] downlist left_list: 0 received
    Oct 10 21:45:01 vm20 corosync[24674]: [QUORUM] Members[1]: 1
    Oct 10 21:45:01 vm20 corosync[24674]: [MAIN ] Completed service synchronization, ready to provide service.
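    The "Can't read quorum.device.model cmap key" error above indicates that the running corosync has not loaded the new device section yet; a quick way to verify this with standard corosync tooling before restarting:

    Code:
    corosync-cmapctl | grep quorum.device
    # empty output means the running config has no quorum device section yet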


    So now restart corosync:

    root@vm20 ~ # systemctl restart corosync
    root@vm20 ~ # systemctl status corosync
    ● corosync.service - Corosync Cluster Engine
    Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
    Active: active (running) since Wed 2018-10-10 21:45:01 CEST; 3s ago
    Docs: man:corosync
    man:corosync.conf
    man:corosync_overview
    Main PID: 24674 (corosync)
    Tasks: 2 (limit: 4915)
    Memory: 37.5M
    CPU: 80ms
    CGroup: /system.slice/corosync.service
    └─24674 /usr/sbin/corosync -f
    Oct 10 21:45:01 vm20 corosync[24674]: [SERV ] Service engine loaded: corosync watchdog service [7]
    Oct 10 21:45:01 vm20 corosync[24674]: [QUORUM] Using quorum provider corosync_votequorum
    Oct 10 21:45:01 vm20 corosync[24674]: [SERV ] Service engine loaded: corosync vote quorum service v1.0 [5]
    Oct 10 21:45:01 vm20 corosync[24674]: [QB ] server name: votequorum
    Oct 10 21:45:01 vm20 corosync[24674]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1 [3]
    Oct 10 21:45:01 vm20 corosync[24674]: [QB ] server name: quorum
    Oct 10 21:45:01 vm20 corosync[24674]: [TOTEM ] A new membership (10.40.20.20:8) was formed. Members joined: 1
    Oct 10 21:45:01 vm20 corosync[24674]: [CPG ] downlist left_list: 0 received
    Oct 10 21:45:01 vm20 corosync[24674]: [QUORUM] Members[1]: 1
    Oct 10 21:45:01 vm20 corosync[24674]: [MAIN ] Completed service synchronization, ready to provide service.


    and start corosync-qdevice

    root@vm20 ~ # systemctl start corosync-qdevice
    root@vm20 ~ # systemctl status corosync-qdevice
    ● corosync-qdevice.service - Corosync Qdevice daemon
    Loaded: loaded (/lib/systemd/system/corosync-qdevice.service; disabled; vendor preset: enabled)
    Active: active (running) since Wed 2018-10-10 21:45:21 CEST; 13s ago
    Docs: man:corosync-qdevice
    Main PID: 24716 (corosync-qdevic)
    Tasks: 1 (limit: 4915)
    Memory: 1.1M
    CPU: 4ms
    CGroup: /system.slice/corosync-qdevice.service
    └─24716 /usr/sbin/corosync-qdevice -f
    Oct 10 21:45:21 vm20 systemd[1]: Starting Corosync Qdevice daemon...
    Oct 10 21:45:21 vm20 systemd[1]: Started Corosync Qdevice daemon.
    root@vm20 ~ # corosync-quorumtool -s
    Quorum information
    ------------------
    Date: Wed Oct 10 21:46:42 2018
    Quorum provider: corosync_votequorum
    Nodes: 1
    Node ID: 1
    Ring ID: 1/8
    Quorate: Yes
    Votequorum information
    ----------------------
    Expected votes: 2
    Highest expected: 2
    Total votes: 2
    Quorum: 2
    Flags: Quorate Qdevice
    Membership information
    ----------------------
    Nodeid Votes Qdevice Name
    1 1 A,V,NMW 10.40.20.20 (local)
    0 1 Qdevice


    pvecm status looks similar:

    root@vm20 ~ # pvecm status
    Quorum information
    ------------------
    Date: Wed Oct 10 21:46:57 2018
    Quorum provider: corosync_votequorum
    Nodes: 1
    Node ID: 0x00000001
    Ring ID: 1/8
    Quorate: Yes
    Votequorum information
    ----------------------
    Expected votes: 2
    Highest expected: 2
    Total votes: 2
    Quorum: 2
    Flags: Quorate Qdevice
    Membership information
    ----------------------
    Nodeid Votes Qdevice Name
    0x00000001 1 A,V,NMW 10.40.20.20 (local)
    0x00000000 1 Qdevice
     
  14. tirili

    tirili New Member

    Now go to the next node, as vm92 should be added to the cluster

    root@vm92 ~ # pvecm status
    Corosync config '/etc/pve/corosync.conf' does not exist - is this node part of a cluster?
    Cannot initialize CMAP service
    root@vm92 ~ # time pvecm add vm92 -ring0_addr 10.40.20.92
    Please enter superuser (root) password for 'vm92':
    Password for root@vm92: *********


    But here is the point!
    pvecm help add says "add the current node to the cluster".
    So why am I asked for the root password of the node that wants to be added?
    Or should I run this command from the first node?
    But that does not work either:

    root@vm20 ~ # pvecm add vm92 -ring0_addr 10.40.20.20
    detected the following error(s):
    * authentication key '/etc/corosync/authkey' already exists
    * cluster config '/etc/pve/corosync.conf' already exists
    * corosync is already running, is this node already in a cluster?!
    Check if node may join a cluster failed!


    So any help is appreciated!
     
  15. wolfgang

    wolfgang Proxmox Staff Member
    You have to run the pvecm add command on the new node and not on a cluster member.
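    As a rough sketch of the intended direction, using the hostnames from this thread (the same options appear in the output further down; -use_ssh is optional):

    Code:
    # on the NEW node vm92, referencing the existing cluster member vm20:
    pvecm add vm20 -ring0_addr 10.40.20.92
    # -ring0_addr is the cluster address of the node being added, i.e. vm92 itself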
     
  16. tirili

    tirili New Member

    Hello Wolfgang,

    I do not run it on a cluster member.

    But so far I had not referenced vm20 (which created the cluster above). This time I ran:

    pvecm add vm20 -ring0_addr 10.40.20.20 -use_ssh

    Now I end up with

    The authenticity of host 'vm20 (10.40.20.20)' can't be established.
    ECDSA key fingerprint is SHA256:vzaqUBkjCXXXXXXfla+Uoj0T2ZxspL4wSmU.
    Are you sure you want to continue connecting (yes/no)? yes
    copy corosync auth key
    stopping pve-cluster service
    backup old database to '/var/lib/pve-cluster/backup/config-1539257263.sql.gz'
    waiting for quorum...


    Why is it waiting for quorum for so long?
    And I cannot start corosync-qdevice now, because the corosync.conf that has just been created does not contain the first cluster member's qdevice configuration.

    root@VM92 ~ # cat /etc/corosync/corosync.conf
    logging {
      debug: off
      to_syslog: yes
    }
    nodelist {
      node {
        name: x0720
        nodeid: 1
        quorum_votes: 1
        ring0_addr: 10.40.20.20
      }
      node {
        name: x1892
        nodeid: 2
        quorum_votes: 1
        ring0_addr: 10.40.20.92
      }
    }
    quorum {
      provider: corosync_votequorum
    }
    totem {
      cluster_name: pmxc
      config_version: 2
      interface {
        bindnetaddr: 10.40.20.0
        ringnumber: 0
      }
      ip_version: ipv4
      secauth: on
      version: 2
    }


    The corosync config on the node vm20 (the master) contains the qnetd configuration posted earlier (see above).

    The node vm92, which is currently waiting for quorum while being added, shows:

    root@vm92 ~ # pvecm status
    Quorum information
    ------------------
    Date: Thu Oct 11 13:32:09 2018
    Quorum provider: corosync_votequorum
    Nodes: 1
    Node ID: 0x00000002
    Ring ID: 2/264
    Quorate: No
    Votequorum information
    ----------------------
    Expected votes: 2
    Highest expected: 2
    Total votes: 1
    Quorum: 2 Activity blocked
    Flags:
    Membership information
    ----------------------
    Nodeid Votes Name
    0x00000002 1 10.40.20.92 (local)


    Do you have any ideas how to fix this?
     
  17. tirili

    tirili New Member

    Oh, now I see the problem.
    While the pvecm add command runs, vm20's corosync.conf is replaced!
    There is no information about the qnetd device or my former config any more.
    Is this a bug, or what did we do wrong?

    Best regards
    Thomas
     
  18. wolfgang

    wolfgang Proxmox Staff Member
    I don't know what you are doing, but your output says that you created the cluster and then added the node on the same host, which is not correct.

    This is not a bug: qdevice support is not implemented, and the qdevice must be added at the end, not at the start of the cluster creation.
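    A condensed sketch of that order, using the hosts from this thread (the qdevice part is a manual corosync.conf edit, as done earlier in this thread, not a pvecm feature here):

    Code:
    # 1. on vm20: create the cluster
    pvecm create pmxc -bindnet0_addr 10.40.20.0 -ring0_addr 10.40.20.20
    # 2. on vm92: join it (ring0_addr is vm92's own cluster address)
    pvecm add vm20 -ring0_addr 10.40.20.92
    # 3. only afterwards: add the quorum device section to the cluster's corosync.conf,
    #    bump config_version, restart corosync and start corosync-qdevice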
     
  19. tirili

    tirili New Member

    I tried the following approach, which still resulted in the weird message that vm20, and not the new node vm92, was added to the cluster.
    So while vm92 was sitting at "waiting for quorum", I added the qnetd config stanza on vm20 (the master):

    root@vm20 ~ # cat /etc/corosync/corosync.conf
    logging {
      debug: off
      to_syslog: yes
    }
    nodelist {
      node {
        name: vm20
        nodeid: 1
        quorum_votes: 1
        ring0_addr: 10.40.20.20
      }
      node {
        name: vm92
        nodeid: 2
        quorum_votes: 1
        ring0_addr: 10.40.20.92
      }
    }
    quorum {
      provider: corosync_votequorum
      device {
        model: net
        votes: 1
        net {
          tls: off
          host: 9.17.8.23
          port: 5403
          algorithm: ffsplit
        }
      }
    }
    totem {
      cluster_name: pmxc
      config_version: 3
      interface {
        bindnetaddr: 10.40.20.0
        ringnumber: 0
      }
      ip_version: ipv4
      secauth: on
      version: 2
    }

    I incremented config_version by 1 and copied this config over to the new cluster member vm92, which was still waiting for quorum.

    Then I restarted corosync on both nodes using
    systemctl restart corosync

    Then I started the corosync-qdevice service on vm92, which was still waiting for quorum, and got progress.

    root@vm92 ~ # pvecm add vm20 -ring0_addr 10.40.20.92 -use_ssh
    The authenticity of host 'vm20 (10.40.20.20)' can't be established.
    ECDSA key fingerprint is SHA256:vzaqUBkjCrj/jY5m48Nl0aCfla+Uoj0T2ZxspL4wSmU.
    Are you sure you want to continue connecting (yes/no)? yes
    copy corosync auth key
    stopping pve-cluster service
    backup old database to '/var/lib/pve-cluster/backup/config-1539257263.sql.gz'
    waiting for quorum...OK
    (re)generate node files
    generate new node certificate
    merge authorized SSH keys and known hosts
    generated new node certificate, restart pveproxy and pvedaemon services
    successfully added node 'vm20' to cluster.


    But the last message is the disaster: "successfully added node 'vm20' to cluster.", whereas I expected vm92 to be added to the cluster.
    And pvecm status on the new node looks like this:

    root@vm92 ~ # pvecm status
    Quorum information
    ------------------
    Date: Thu Oct 11 13:54:04 2018
    Quorum provider: corosync_votequorum
    Nodes: 1
    Node ID: 0x00000002
    Ring ID: 2/1500
    Quorate: Yes
    Votequorum information
    ----------------------
    Expected votes: 2
    Highest expected: 2
    Total votes: 2
    Quorum: 2
    Flags: Quorate Qdevice
    Membership information
    ----------------------
    Nodeid Votes Qdevice Name
    0x00000002 1 A,V,NMW 10.40.20.92 (local)
    0x00000000 1 Qdevice


    and on the master node vm20, pvecm status does not show any information about the complete cluster:

    root@vm20 ~ # pvecm status
    Quorum information
    ------------------
    Date: Thu Oct 11 13:55:02 2018
    Quorum provider: corosync_votequorum
    Nodes: 1
    Node ID: 0x00000001
    Ring ID: 1/1556
    Quorate: No
    Votequorum information
    ----------------------
    Expected votes: 2
    Highest expected: 2
    Total votes: 1
    Quorum: 2 Activity blocked
    Flags:
    Membership information
    ----------------------
    Nodeid Votes Name
    0x00000001 1 10.40.20.20 (local)
     
  20. tirili

    tirili New Member

    And as you can see, the pvecm create was done on vm20, and the pvecm add was initiated on vm92.
     