I have two newly installed Proxmox hosts and I'm trying to create a cluster, with proxmox01 as my primary node and proxmox02 as my second node.
I was able to create the cluster on proxmox01.
root@proxmox01:~# pvecm create CADRE-PVE
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/urandom.
Writing corosync key to /etc/corosync/authkey.
root@proxmox01:~# pvecm status
Quorum information
------------------
Date: Wed Sep 14 05:16:48 2016
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000001
Ring ID: 1/4
Quorate: Yes
Votequorum information
----------------------
Expected votes: 1
Highest expected: 1
Total votes: 1
Quorum: 1
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 192.168.0.1 (local)
Now, I tried to add the second node to the cluster:
root@proxmox02:~# pvecm add 192.168.0.1
copy corosync auth key
stopping pve-cluster service
backup old database
waiting for quorum...
It stayed stuck like that for over 20 minutes. I don't know if it was the right move, but I canceled the operation because this kept piling up in the log:
Sep 14 06:12:26 proxmox02 corosync[11541]: [MAIN ] Completed service synchronization, ready to provide service.
Sep 14 06:12:28 proxmox02 corosync[11541]: [TOTEM ] A new membership (192.168.0.2:2528) was formed. Members
Sep 14 06:12:28 proxmox02 corosync[11541]: [QUORUM] Members[1]: 2
Sep 14 06:12:28 proxmox02 corosync[11541]: [MAIN ] Completed service synchronization, ready to provide service.
Sep 14 06:12:29 proxmox02 corosync[11541]: [TOTEM ] A new membership (192.168.0.2:2532) was formed. Members
Sep 14 06:12:29 proxmox02 corosync[11541]: [QUORUM] Members[1]: 2
I restarted pve-cluster, pvestatd and pveproxy, and those log messages stopped accumulating.
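For reference, these are roughly the commands I used to restart the services (assuming the standard systemd unit names on Proxmox VE 4.x):
root@proxmox02:~# systemctl restart pve-cluster
root@proxmox02:~# systemctl restart pvestatd
root@proxmox02:~# systemctl restart pveproxy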
So I attempted to re-add proxmox02:
root@proxmox02:~# pvecm add 192.168.0.1
authentication key already exists
I then tried appending the "--force" option to make proxmox02 join the cluster.
root@proxmox02:~# pvecm add 192.168.0.1 --force
node proxmox02 already defined
copy corosync auth key
stopping pve-cluster service
backup old database
generating node certificates
merge known_hosts file
restart services
successfully added node 'proxmox02' to cluster.
It looks fine, but proxmox02 still failed to join the cluster:
root@proxmox01:~# pvecm status
Quorum information
------------------
Date: Wed Sep 14 06:35:03 2016
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000001
Ring ID: 1/3336
Quorate: Yes
Votequorum information
----------------------
Expected votes: 1
Highest expected: 1
Total votes: 1
Quorum: 1
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 192.168.0.1 (local)
It even shows as offline (red mark) in proxmox01's web UI.
Is there any way I can rejoin proxmox02 and clean up proxmox01? I can't simply remove the proxmox02 node, since it isn't even a member of the cluster.
Some additional information:
Previously I was able to set up the cluster with no issues when the MTU on my bridge interface (weave) was set to 1410; now that I've increased it to 8900 (jumbo frames), I'm having this issue.
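For reference, this is roughly how the MTU can be checked and changed on the fly (a sketch; the persistent setting lives in the weave/bridge configuration, and the interface name here just follows my setup):
root@proxmox02:~# ip link show dev weave | grep mtu    # confirm the current MTU (should show 8900)
root@proxmox02:~# ip link set dev weave mtu 8900       # apply the jumbo-frame MTU on the fly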
Also, I'm running the 4.2.8-1-pve kernel instead of the latest one (4.4); the reason is related to the weave bridge interface.
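For completeness, this is how I'm confirming the running kernel on both nodes:
root@proxmox01:~# uname -r    # reports 4.2.8-1-pve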
Any help is appreciated. TIA