I first tried to set up a cluster and failed because my hosting provider does not support multicast. I then tried to follow the instructions for configuring a unicast cluster instead, and I am having a nightmare.
Here is my /etc/hosts (public IP address & domain removed)
Code:
root@pmn1:~# more /etc/hosts
127.0.0.1 localhost
*.*.*.* pmn1.mydomain.com pmn1
# Proxmox Cluster
10.91.150.134 pmn1.tp
10.91.156.172 pmn2.tp
10.91.156.173 pmn3.tp
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
Here is my ifconfig (public IP address & MAC addresses removed)
Code:
eth0 Link encap:Ethernet HWaddr xx:xx:xx:xx:xx:xx
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:24092 errors:0 dropped:0 overruns:0 frame:0
TX packets:4330 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:8551475 (8.1 MiB) TX bytes:2326213 (2.2 MiB)
eth1 Link encap:Ethernet HWaddr xx:xx:xx:xx:xx:xx
inet addr:10.91.150.134 Bcast:10.91.150.255 Mask:255.255.255.128
inet6 addr: fe80::ec4:7aff:fe57:5c25/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1
RX packets:2 errors:0 dropped:0 overruns:0 frame:0
TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:684 (684.0 B) TX bytes:1192 (1.1 KiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:224 errors:0 dropped:0 overruns:0 frame:0
TX packets:224 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:20554 (20.0 KiB) TX bytes:20554 (20.0 KiB)
vmbr0 Link encap:Ethernet HWaddr xx:xx:xx:xx:xx:xx
inet addr:xxx.xxx.xxx.xxx Bcast:xxx.xxx.xxx.xxx Mask:255.255.255.0
inet6 addr: fe80::ec4:7aff:fe57:5c24/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:24090 errors:0 dropped:0 overruns:0 frame:0
TX packets:4326 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:8214095 (7.8 MiB) TX bytes:2325949 (2.2 MiB)
Here is my corosync.conf:
Code:
totem {
version: 2
secauth: on
cluster_name: tpcl
config_version: 2
ip_version: ipv4
transport: udpu
interface {
ringnumber: 0
}
}
nodelist {
node {
ring0_addr: pmn1.tp
name: pmn1
nodeid: 1
quorum_votes: 1
}
node {
ring0_addr: pmn2.tp
name: pmn2
nodeid: 2
quorum_votes: 1
}
node {
ring0_addr: pmn3.tp
name: pmn3
nodeid: 3
quorum_votes: 1
}
}
quorum {
provider: corosync_votequorum
two_node: 1
}
logging {
to_syslog: yes
debug: off
}
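As a sanity check on the addressing above, here is a minimal Python sketch that tests whether each ring0 address falls inside the eth1 subnet. The addresses and netmask are copied from the /etc/hosts and ifconfig output in this post; the helper name same_subnet is only illustrative. With udpu the nodes only need to reach each other by unicast, so an address outside the subnet is not an error in itself, but it does mean the traffic depends on routing between the private networks.

```python
import ipaddress

# eth1 address and netmask as shown in the ifconfig output above.
eth1 = ipaddress.ip_interface("10.91.150.134/255.255.255.128")

# ring0_addr hostnames and the addresses /etc/hosts maps them to.
ring0 = {
    "pmn1.tp": "10.91.150.134",
    "pmn2.tp": "10.91.156.172",
    "pmn3.tp": "10.91.156.173",
}


def same_subnet(addr, iface):
    """Return True if addr lies inside the network that iface is attached to."""
    return ipaddress.ip_address(addr) in iface.network


# Map each ring0 hostname to whether it is directly reachable on eth1's subnet.
report = {name: same_subnet(addr, eth1) for name, addr in ring0.items()}

for name, ok in report.items():
    status = "on eth1 subnet" if ok else "outside eth1 subnet (needs a route)"
    print(f"{name}: {status}")
```

For any node reported as outside the subnet, it is worth confirming that it can actually be reached over eth1 (for example with ping) before looking further at corosync itself.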
Steps I took:
1. pvecm create tpcl
2. edit /etc/pve/corosync.conf (to match the config above)
3. pvecm status
The cluster was up but was using the wrong IP. It was using my public IP, which is to be expected because the original corosync.conf had bindnetaddr: public_ip.
4. rebooted the server
5. pvecm status
Code:
root@pmn1:~# pvecm status
Cannot initialize CMAP service
root@pmn1:~#
Syslog output:
Code:
Sep 23 08:12:35 pmn1 systemd[1]: Starting The Proxmox VE cluster filesystem...
Sep 23 08:12:35 pmn1 pmxcfs[1267]: [dcdb] crit: local corosync.conf is newer
Sep 23 08:12:35 pmn1 pmxcfs[1267]: [dcdb] crit: local corosync.conf is newer
Sep 23 08:12:35 pmn1 pmxcfs[1273]: [quorum] crit: quorum_initialize failed: 2
Sep 23 08:12:35 pmn1 pmxcfs[1273]: [quorum] crit: can't initialize service
Sep 23 08:12:35 pmn1 pmxcfs[1273]: [confdb] crit: cmap_initialize failed: 2
Sep 23 08:12:35 pmn1 pmxcfs[1273]: [confdb] crit: can't initialize service
Sep 23 08:12:35 pmn1 pmxcfs[1273]: [dcdb] crit: cpg_initialize failed: 2
Sep 23 08:12:35 pmn1 pmxcfs[1273]: [dcdb] crit: can't initialize service
Sep 23 08:12:35 pmn1 pmxcfs[1273]: [status] crit: cpg_initialize failed: 2
Sep 23 08:12:35 pmn1 pmxcfs[1273]: [status] crit: can't initialise service
There was a lot more in syslog, but it kept failing until attempt 9 and then stopped trying.
I really have no idea how to fix this and would really appreciate some help.
Thanks.