High Availability / Cluster Issue

Antonino89

Member
Jul 13, 2017
Hi guys,

For a few days I have been trying to configure High Availability between two physical Proxmox servers.
I am having trouble getting the cluster up on the second server.

Both servers are attached to the same Cisco SF-300 switch; one server has IP address 192.168.2.50 (Server1) and the other 192.168.2.55 (Server2), both on a /24 network. The servers can ping each other.

Configuration I have done:

Server1# pvecm create CLUSTERTEST

Server1 # pvecm add 192.168.2.50
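
(For context, the sequence I was trying to follow, as I understand the Proxmox documentation, is: create the cluster once on the first node, then run the add command on each node that should join, pointing at the first node's IP, roughly:

Server1# pvecm create CLUSTERTEST
Server2# pvecm add 192.168.2.50

The hostnames and IPs are just the ones from my setup above.)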

"
root@pve:/# pvecm status
Quorum information
------------------
Date: Thu Jul 13 09:59:53 2017
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000001
Ring ID: 1/88
Quorate: Yes

Votequorum information
----------------------
Expected votes: 1
Highest expected: 1
Total votes: 1
Quorum: 1
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 192.168.2.50 (local)"


On Server2:
Server2# pvecm add 192.168.2.50
"node pve already defined
copy corosync auth key
stopping pve-cluster service
backup old database
Job for corosync.service failed. See 'systemctl status corosync.service' and 'journalctl -xn' for details
."


root@pve:/etc/corosync# systemctl status corosync.service

● corosync.service - Corosync Cluster Engine
Loaded: loaded (/lib/systemd/system/corosync.service; enabled)
Active: failed (Result: exit-code) since Thu 2017-07-13 10:06:28 CEST; 1min 23s ago
Process: 1286 ExecStop=/usr/share/corosync/corosync stop (code=exited, status=0/SUCCESS)
Process: 6770 ExecStart=/usr/share/corosync/corosync start (code=exited, status=1/FAILURE)
Main PID: 1193 (code=killed, signal=ABRT)

Jul 13 10:05:27 pve corosync[6778]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1 [3]
Jul 13 10:05:27 pve corosync[6778]: [QB ] server name: quorum
Jul 13 10:05:27 pve corosync[6778]: [TOTEM ] A new membership (192.168.2.55:88) was formed. Members joined: 1
Jul 13 10:05:27 pve corosync[6778]: [QUORUM] Members[1]: 1
Jul 13 10:05:27 pve corosync[6778]: [MAIN ] Completed service synchronization, ready to provide service.
Jul 13 10:05:27 pve corosync[6778]: [TOTEM ] A new membership (192.168.2.50:92) was formed. Members joined: 1
Jul 13 10:06:28 pve corosync[6770]: Starting Corosync Cluster Engine (corosync): [FAILED]
Jul 13 10:06:28 pve systemd[1]: corosync.service: control process exited, code=exited status=1
Jul 13 10:06:28 pve systemd[1]: Failed to start Corosync Cluster Engine.
Jul 13 10:06:28 pve systemd[1]: Unit corosync.service entered failed state.

root@pve:/etc/corosync# pvecm status
Cannot initialize CMAP service
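
(In case it helps anyone reproducing this: the systemctl status output above is truncated, so I assume the full corosync startup error can be read from the journal, for example:

root@pve:~# journalctl -u corosync.service -b --no-pager

or by starting corosync in the foreground with "corosync -f" so it prints the error directly to the terminal.)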

root@pve:/etc/corosync# more corosync.conf (On the second server)
totem {
  version: 2
  secauth: on
  cluster_name: CLUSTERTEST
  config_version: 1
  ip_version: ipv4
  interface {
    ringnumber: 0
    bindnetaddr: 192.168.2.50
  }
}

nodelist {
  node {
    ring0_addr: pve
    name: pve
    nodeid: 1
    quorum_votes: 1
  }
}

quorum {
  provider: corosync_votequorum
}

logging {
  to_syslog: yes
  debug: off
}


Normal ping is working.

root@pve:/etc/corosync# omping -c 10000 -i 0.001 -F -q 192.168.2.50 192.168.2.55
192.168.2.50 : waiting for response msg
192.168.2.50 : waiting for response msg
192.168.2.50 : waiting for response msg
192.168.2.50 : waiting for response msg
192.168.2.50 : waiting for response msg
192.168.2.50 : waiting for response msg
192.168.2.50 : waiting for response msg
192.168.2.50 : waiting for response msg
192.168.2.50 : waiting for response msg
^C
192.168.2.50 : response message never received
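
(As far as I understand omping, it has to be started on both hosts at roughly the same time, with the same host list; otherwise it just keeps waiting for responses. So the same command would be run in parallel on 192.168.2.50 and on 192.168.2.55:

omping -c 10000 -i 0.001 -F -q 192.168.2.50 192.168.2.55

and the unicast/multicast loss statistics printed at the end can then be compared.)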

Can someone please help me figure out how to solve this problem and get High Availability set up?

Thanks all :)
 
First: HA with two nodes will not work because of self-fencing (if a node loses quorum, it restarts itself), so you should use at least 3 nodes.

Second, regarding "node pve already defined":
it seems both hosts have the same hostname? This will not work.
Please give each server its own (cluster-)unique hostname, as sketched below.
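
A minimal sketch of what giving a node its own hostname usually involves (the name "Server2" is just an example; on a node that already carries cluster configuration, additional cleanup may be needed):

root@pve:~# hostnamectl set-hostname Server2
root@pve:~# nano /etc/hosts    (change the 192.168.2.55 entry from "pve" to "Server2")

then reboot the node (or restart the affected services) before retrying pvecm add.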
 
Thaaaanks.

I changed the servers' hostnames and the /etc/hosts files:

root@Server2:/etc# pvecm add 192.168.2.50
copy corosync auth key
stopping pve-cluster service
backup old database
waiting for quorum...OK
generating node certificates
merge known_hosts file
restart services
successfully added node 'Server2' to cluster.

Now it should be okay? How can I test whether High Availability works? :)
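
(A minimal way to exercise HA, assuming a VM with ID 100 already exists and keeping in mind the note above about needing at least 3 nodes, would be to put the VM under HA management and watch its state:

ha-manager add vm:100
ha-manager status

If that looks sane, taking down the node currently running the VM should eventually make the HA manager recover it on another node.)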
 
