Hello,
I just started with Proxmox and decided to build a small homelab. I am trying to set up a 2-node cluster with a host-to-host InfiniBand link used for corosync, and I followed the guide. I think I have set up InfiniBand properly, since omping works and the nodes 'see' each other according to pvecm.
I have tried restarting several times (both the nodes and the services), to no avail.
What I am curious about is why the version numbers in .members and corosync.conf are all different. Does that have anything to do with the cluster not being in sync?
screenshots:
hxxps://imgur.com/a/Krj6z
debug cluster service:
hxxps://hastebin.com/zonidizaqi.swift
Code:
/etc/pve/.members on 10.0.100.25
{
"nodename": "PVE-NAS-IBM-64G",
"version": 7,
"cluster": { "name": "MAIN", "version": 4, "nodes": 2, "quorate": 1 },
"nodelist": {
"IB-LAB-DELL": { "id": 2, "online": 1, "ip": "10.0.100.26"},
"IB-NAS-IBM": { "id": 1, "online": 1}
}
}
/etc/pve/.members on 10.0.100.26
{
"nodename": "PVE-LAB-DELL-72G",
"version": 8,
"cluster": { "name": "MAIN", "version": 4, "nodes": 2, "quorate": 1 },
"nodelist": {
"IB-LAB-DELL": { "id": 2, "online": 1},
"IB-NAS-IBM": { "id": 1, "online": 1, "ip": "10.0.100.25"}
}
}
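To make the version question concrete, here is a minimal Python sketch that compares the two .members files pasted above. The JSON is copied from this post; the helper names (`summarize`, `compare`) are made up for illustration. My assumption, which I have not confirmed, is that the top-level "version" is a per-node counter kept by pmxcfs, while "cluster.version" mirrors config_version from corosync.conf.

```python
import json

# Contents of /etc/pve/.members as pasted above (one string per node).
MEMBERS_25 = """
{
"nodename": "PVE-NAS-IBM-64G",
"version": 7,
"cluster": { "name": "MAIN", "version": 4, "nodes": 2, "quorate": 1 },
"nodelist": {
"IB-LAB-DELL": { "id": 2, "online": 1, "ip": "10.0.100.26"},
"IB-NAS-IBM": { "id": 1, "online": 1}
}
}
"""

MEMBERS_26 = """
{
"nodename": "PVE-LAB-DELL-72G",
"version": 8,
"cluster": { "name": "MAIN", "version": 4, "nodes": 2, "quorate": 1 },
"nodelist": {
"IB-LAB-DELL": { "id": 2, "online": 1},
"IB-NAS-IBM": { "id": 1, "online": 1, "ip": "10.0.100.25"}
}
}
"""


def summarize(raw):
    """Pull out the fields relevant to the version question (hypothetical helper)."""
    data = json.loads(raw)
    return {
        "nodename": data["nodename"],
        "file_version": data["version"],                # top-level "version"
        "cluster_version": data["cluster"]["version"],  # "cluster.version"
        "quorate": bool(data["cluster"]["quorate"]),
    }


def compare(a, b):
    """Report which of the two version fields agree between the nodes."""
    return {
        "file_version_match": a["file_version"] == b["file_version"],
        "cluster_version_match": a["cluster_version"] == b["cluster_version"],
    }


node_a = summarize(MEMBERS_25)
node_b = summarize(MEMBERS_26)
result = compare(node_a, node_b)

for node in (node_a, node_b):
    print(node)
print(result)
```

With the data above, only the top-level "version" differs (7 vs 8); "cluster.version" is 4 on both nodes and both report quorate. If my assumption about the two fields is right, the differing top-level number alone would not indicate an out-of-sync cluster.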
root@PVE-NAS-IBM-64G:~# pvecm status
Quorum information
------------------
Date: Sun Jan 28 15:33:49 2018
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 0x00000001
Ring ID: 1/23924
Quorate: Yes
Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 2
Quorum: 2
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.200.0.1 (local)
0x00000002 1 10.200.0.2
root@PVE-NAS-IBM-64G:~# pvecm nodes
Membership information
----------------------
Nodeid Votes Name
1 1 10.200.0.1 (local)
2 1 10.200.0.2
/etc/hosts on 10.0.100.25
127.0.0.1 localhost.localdomain localhost
10.0.100.25 PVE-NAS-IBM-64G.local.domain PVE-NAS-IBM-64G
10.0.100.26 PVE-LAB-DELL-72G.local.domain PVE-LAB-DELL-72G
10.200.0.1 IB-NAS-IBM.local.domain IB-NAS-IBM
10.200.0.2 IB-LAB-DELL.local.domain IB-LAB-DELL
/etc/network/interfaces on 10.0.100.25
auto lo
iface lo inet loopback
iface enp11s0f0 inet manual
iface enp11s0f1 inet manual
iface enp0s29f0u2 inet manual
auto ib0
iface ib0 inet static
address 10.200.0.1
netmask 255.255.255.252
pre-up modprobe ib_ipoib
pre-up echo connected > /sys/class/net/ib0/mode
mtu 65520
auto bond0
iface bond0 inet manual
slaves enp11s0f0 enp11s0f1
bond_miimon 100
bond_mode 802.3ad
bond_xmit_hash_policy layer2+3
auto vmbr0
iface vmbr0 inet static
address 10.0.100.25
netmask 255.255.255.0
gateway 10.0.100.1
bridge_ports bond0
bridge_stp off
bridge_fd 0
/etc/network/interfaces on 10.0.100.26
auto lo
iface lo inet loopback
iface enp6s0f0 inet manual
iface enp6s0f1 inet manual
auto ib0
iface ib0 inet static
address 10.200.0.2
netmask 255.255.255.252
pre-up modprobe ib_ipoib
pre-up echo connected > /sys/class/net/ib0/mode
mtu 65520
auto bond0
iface bond0 inet manual
slaves enp6s0f0 enp6s0f1
bond_miimon 100
bond_mode 802.3ad
bond_xmit_hash_policy layer2+3
auto vmbr0
iface vmbr0 inet static
address 10.0.100.26
netmask 255.255.255.0
gateway 10.0.100.1
bridge_ports bond0
bridge_stp off
bridge_fd 0
root@PVE-NAS-IBM-64G:~# systemctl status pve-cluster
● pve-cluster.service - The Proxmox VE cluster filesystem
Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2018-01-28 02:41:25 EST; 13h ago
Process: 2505 ExecStartPost=/usr/bin/pvecm updatecerts --silent (code=exited, status=0/SUCCESS)
Process: 2489 ExecStart=/usr/bin/pmxcfs (code=exited, status=0/SUCCESS)
Main PID: 2500 (pmxcfs)
Tasks: 7 (limit: 4915)
Memory: 48.6M
CPU: 42.859s
CGroup: /system.slice/pve-cluster.service
└─2500 /usr/bin/pmxcfs
Jan 28 12:41:24 PVE-NAS-IBM-64G pmxcfs[2500]: [dcdb] notice: data verification successful
Jan 28 13:41:24 PVE-NAS-IBM-64G pmxcfs[2500]: [dcdb] notice: data verification successful
Jan 28 14:41:24 PVE-NAS-IBM-64G pmxcfs[2500]: [dcdb] notice: data verification successful
Jan 28 15:19:38 PVE-NAS-IBM-64G pmxcfs[2500]: [status] notice: received log
Jan 28 15:19:43 PVE-NAS-IBM-64G pmxcfs[2500]: [status] notice: received log
Jan 28 15:19:48 PVE-NAS-IBM-64G pmxcfs[2500]: [status] notice: received log
Jan 28 15:19:58 PVE-NAS-IBM-64G pmxcfs[2500]: [status] notice: received log
Jan 28 15:20:08 PVE-NAS-IBM-64G pmxcfs[2500]: [status] notice: received log
Jan 28 15:36:08 PVE-NAS-IBM-64G pmxcfs[2500]: [status] notice: received log
Jan 28 15:41:24 PVE-NAS-IBM-64G pmxcfs[2500]: [dcdb] notice: data verification successful
root@PVE-NAS-IBM-64G:~# systemctl status corosync.service
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2018-01-28 02:41:26 EST; 13h ago
Docs: man:corosync
man:corosync.conf
man:corosync_overview
Main PID: 2530 (corosync)
Tasks: 2 (limit: 4915)
Memory: 44.4M
CPU: 8min 44.823s
CGroup: /system.slice/corosync.service
└─2530 /usr/sbin/corosync -f
Jan 28 02:42:15 PVE-NAS-IBM-64G corosync[2530]: [QUORUM] Members[1]: 1
Jan 28 02:42:15 PVE-NAS-IBM-64G corosync[2530]: [MAIN ] Completed service synchronization, ready to provide service.
Jan 28 02:44:24 PVE-NAS-IBM-64G corosync[2530]: notice [TOTEM ] A new membership (10.200.0.1:23924) was formed. Members joined: 2
Jan 28 02:44:24 PVE-NAS-IBM-64G corosync[2530]: [TOTEM ] A new membership (10.200.0.1:23924) was formed. Members joined: 2
Jan 28 02:44:24 PVE-NAS-IBM-64G corosync[2530]: notice [QUORUM] This node is within the primary component and will provide service.
Jan 28 02:44:24 PVE-NAS-IBM-64G corosync[2530]: notice [QUORUM] Members[2]: 1 2
Jan 28 02:44:24 PVE-NAS-IBM-64G corosync[2530]: notice [MAIN ] Completed service synchronization, ready to provide service.
Jan 28 02:44:24 PVE-NAS-IBM-64G corosync[2530]: [QUORUM] This node is within the primary component and will provide service.
Jan 28 02:44:24 PVE-NAS-IBM-64G corosync[2530]: [QUORUM] Members[2]: 1 2
Jan 28 02:44:24 PVE-NAS-IBM-64G corosync[2530]: [MAIN ] Completed service synchronization, ready to provide service.
/etc/pve/corosync.conf
logging {
debug: off
to_syslog: yes
}
nodelist {
node {
name: IB-LAB-DELL
nodeid: 2
quorum_votes: 1
ring0_addr: 10.200.0.2
}
node {
name: IB-NAS-IBM
nodeid: 1
quorum_votes: 1
ring0_addr: 10.200.0.1
}
}
quorum {
provider: corosync_votequorum
}
totem {
cluster_name: MAIN
config_version: 4
interface {
bindnetaddr: 10.200.0.1
ringnumber: 0
}
ip_version: ipv4
secauth: on
version: 2
netmtu: 2043
}
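As a sanity check on the addressing, here is a small Python sketch that works out the ib0 subnet from the /etc/network/interfaces entries above and tests the corosync.conf addresses against it. All values are copied from this post; the variable and function names are hypothetical. It only verifies subnet membership and says nothing about whether corosync itself accepts a host address in bindnetaddr.

```python
import ipaddress

# ib0 address and netmask per node, from /etc/network/interfaces above.
IB0 = {
    "10.0.100.25": ("10.200.0.1", "255.255.255.252"),
    "10.0.100.26": ("10.200.0.2", "255.255.255.252"),
}

# Values from /etc/pve/corosync.conf above.
RING0_ADDRS = ["10.200.0.2", "10.200.0.1"]
BINDNETADDR = "10.200.0.1"


def ib_network(address, netmask):
    """Return the IPv4 network that an address/netmask pair belongs to."""
    return ipaddress.ip_interface(f"{address}/{netmask}").network


networks = {host: ib_network(addr, mask) for host, (addr, mask) in IB0.items()}
same_subnet = len(set(networks.values())) == 1
net = next(iter(networks.values()))

print("ib0 network per node:", {h: str(n) for h, n in networks.items()})
print("both nodes on the same subnet:", same_subnet)
print("usable hosts:", [str(h) for h in net.hosts()])

# Every ring0_addr should fall inside the ib0 subnet.
ring_ok = all(ipaddress.ip_address(a) in net for a in RING0_ADDRS)
print("all ring0_addr inside subnet:", ring_ok)

# bindnetaddr here is a host address; show the matching network address too.
bind_in_net = ipaddress.ip_address(BINDNETADDR) in net
print("bindnetaddr inside subnet:", bind_in_net)
print("network address of that subnet:", net.network_address)
```

For the values in this post, both ib0 interfaces land in 10.200.0.0/30, whose only usable hosts are 10.200.0.1 and 10.200.0.2, and both ring0_addr entries plus bindnetaddr fall inside it.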