Cluster not being detected in webUI, Quorum OK

2nodepxx · Member · Jan 28, 2018
Hello,

Just started with Proxmox and decided to build a little homelab. I am attempting to set up a two-node cluster with a host-to-host InfiniBand link carrying corosync, following the guide. I think I have set up the InfiniBand side properly, as omping succeeds and the nodes 'see' each other according to pvecm.
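
For reference, the connectivity test I ran was along these lines (flags taken from the Proxmox multicast-test wiki page; the addresses are my IPoIB ones):

Code:
# short test, run simultaneously on both nodes (~10 minutes)
omping -c 600 -i 1 -q 10.200.0.1 10.200.0.2

# high-frequency test to catch IGMP-snooping drops
omping -c 10000 -i 0.001 -F -q 10.200.0.1 10.200.0.2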

I have tried restarting several times (both nodes and the services), to no avail.

What I am also curious about is why the top-level "version" fields in the two .members files differ (7 vs. 8), even though the cluster "version" (4) matches corosync.conf's config_version on both nodes. Does that have anything to do with the cluster being out of sync?
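
This is how I compared the versions on each node (the cluster-wide corosync.conf lives under /etc/pve and gets synced to the local copy under /etc/corosync):

Code:
# run on each node
grep config_version /etc/pve/corosync.conf /etc/corosync/corosync.conf
cat /etc/pve/.members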

Screenshots:
hxxps://imgur.com/a/Krj6z

Debug output of the cluster service:
hxxps://hastebin.com/zonidizaqi.swift


Code:
/etc/pve/.members on 10.0.100.25
{
"nodename": "PVE-NAS-IBM-64G",
"version": 7,
"cluster": { "name": "MAIN", "version": 4, "nodes": 2, "quorate": 1 },
"nodelist": {
  "IB-LAB-DELL": { "id": 2, "online": 1, "ip": "10.0.100.26"},
  "IB-NAS-IBM": { "id": 1, "online": 1}
  }
}


/etc/pve/.members on 10.0.100.26
{
"nodename": "PVE-LAB-DELL-72G",
"version": 8,
"cluster": { "name": "MAIN", "version": 4, "nodes": 2, "quorate": 1 },
"nodelist": {
  "IB-LAB-DELL": { "id": 2, "online": 1},
  "IB-NAS-IBM": { "id": 1, "online": 1, "ip": "10.0.100.25"}
  }
}
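
To compare the two files quickly I pretty-printed them with sorted keys (just a convenience, assuming jq is installed):

Code:
jq -S . /etc/pve/.members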




root@PVE-NAS-IBM-64G:~# pvecm status
Quorum information
------------------
Date:             Sun Jan 28 15:33:49 2018
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000001
Ring ID:          1/23924
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      2
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.200.0.1 (local)
0x00000002          1 10.200.0.2


root@PVE-NAS-IBM-64G:~# pvecm nodes

Membership information
----------------------
    Nodeid      Votes Name
         1          1 10.200.0.1 (local)
         2          1 10.200.0.2



/etc/hosts on 10.0.100.25
127.0.0.1 localhost.localdomain localhost
10.0.100.25 PVE-NAS-IBM-64G.local.domain PVE-NAS-IBM-64G
10.0.100.26 PVE-LAB-DELL-72G.local.domain PVE-LAB-DELL-72G
10.200.0.1 IB-NAS-IBM.local.domain IB-NAS-IBM
10.200.0.2 IB-LAB-DELL.local.domain IB-LAB-DELL
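
As a sanity check that the names above resolve as intended, this can be run on each node:

Code:
hostname
getent hosts PVE-NAS-IBM-64G IB-NAS-IBM
getent hosts PVE-LAB-DELL-72G IB-LAB-DELL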



/etc/network/interfaces on 10.0.100.25
auto lo
iface lo inet loopback

iface enp11s0f0 inet manual

iface enp11s0f1 inet manual

iface enp0s29f0u2 inet manual

auto ib0
iface ib0 inet static
        address  10.200.0.1
        netmask  255.255.255.252
        pre-up modprobe ib_ipoib
        pre-up echo connected > /sys/class/net/ib0/mode
        mtu 65520

auto bond0
iface bond0 inet manual
        slaves enp11s0f0 enp11s0f1
        bond_miimon 100
        bond_mode 802.3ad
        bond_xmit_hash_policy layer2+3

auto vmbr0
iface vmbr0 inet static
        address  10.0.100.25
        netmask  255.255.255.0
        gateway  10.0.100.1
        bridge_ports bond0
        bridge_stp off
        bridge_fd 0

/etc/network/interfaces on 10.0.100.26
auto lo
iface lo inet loopback

iface enp6s0f0 inet manual

iface enp6s0f1 inet manual

auto ib0
iface ib0 inet static
        address  10.200.0.2
        netmask  255.255.255.252
        pre-up modprobe ib_ipoib
        pre-up echo connected > /sys/class/net/ib0/mode
        mtu 65520

auto bond0
iface bond0 inet manual
        slaves enp6s0f0 enp6s0f1
        bond_miimon 100
        bond_mode 802.3ad
        bond_xmit_hash_policy layer2+3

auto vmbr0
iface vmbr0 inet static
        address  10.0.100.26
        netmask  255.255.255.0
        gateway  10.0.100.1
        bridge_ports bond0
        bridge_stp off
        bridge_fd 0
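
And a quick check that the IPoIB links are really up in connected mode with the large MTU (a sketch; run on each node, swapping in the peer's address):

Code:
ip -br addr show ib0
cat /sys/class/net/ib0/mode
# MTU probe: 65520 minus 28 bytes of IP/ICMP headers = 65492 payload
ping -M do -s 65492 -c 3 10.200.0.2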



root@PVE-NAS-IBM-64G:~# systemctl status pve-cluster
● pve-cluster.service - The Proxmox VE cluster filesystem
   Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2018-01-28 02:41:25 EST; 13h ago
  Process: 2505 ExecStartPost=/usr/bin/pvecm updatecerts --silent (code=exited, status=0/SUCCESS)
  Process: 2489 ExecStart=/usr/bin/pmxcfs (code=exited, status=0/SUCCESS)
 Main PID: 2500 (pmxcfs)
    Tasks: 7 (limit: 4915)
   Memory: 48.6M
      CPU: 42.859s
   CGroup: /system.slice/pve-cluster.service
           └─2500 /usr/bin/pmxcfs

Jan 28 12:41:24 PVE-NAS-IBM-64G pmxcfs[2500]: [dcdb] notice: data verification successful
Jan 28 13:41:24 PVE-NAS-IBM-64G pmxcfs[2500]: [dcdb] notice: data verification successful
Jan 28 14:41:24 PVE-NAS-IBM-64G pmxcfs[2500]: [dcdb] notice: data verification successful
Jan 28 15:19:38 PVE-NAS-IBM-64G pmxcfs[2500]: [status] notice: received log
Jan 28 15:19:43 PVE-NAS-IBM-64G pmxcfs[2500]: [status] notice: received log
Jan 28 15:19:48 PVE-NAS-IBM-64G pmxcfs[2500]: [status] notice: received log
Jan 28 15:19:58 PVE-NAS-IBM-64G pmxcfs[2500]: [status] notice: received log
Jan 28 15:20:08 PVE-NAS-IBM-64G pmxcfs[2500]: [status] notice: received log
Jan 28 15:36:08 PVE-NAS-IBM-64G pmxcfs[2500]: [status] notice: received log
Jan 28 15:41:24 PVE-NAS-IBM-64G pmxcfs[2500]: [dcdb] notice: data verification successful


root@PVE-NAS-IBM-64G:~# systemctl status corosync.service
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2018-01-28 02:41:26 EST; 13h ago
     Docs: man:corosync
           man:corosync.conf
           man:corosync_overview
 Main PID: 2530 (corosync)
    Tasks: 2 (limit: 4915)
   Memory: 44.4M
      CPU: 8min 44.823s
   CGroup: /system.slice/corosync.service
           └─2530 /usr/sbin/corosync -f

Jan 28 02:42:15 PVE-NAS-IBM-64G corosync[2530]:  [QUORUM] Members[1]: 1
Jan 28 02:42:15 PVE-NAS-IBM-64G corosync[2530]:  [MAIN  ] Completed service synchronization, ready to provide service.
Jan 28 02:44:24 PVE-NAS-IBM-64G corosync[2530]: notice  [TOTEM ] A new membership (10.200.0.1:23924) was formed. Members joined: 2
Jan 28 02:44:24 PVE-NAS-IBM-64G corosync[2530]:  [TOTEM ] A new membership (10.200.0.1:23924) was formed. Members joined: 2
Jan 28 02:44:24 PVE-NAS-IBM-64G corosync[2530]: notice  [QUORUM] This node is within the primary component and will provide service.
Jan 28 02:44:24 PVE-NAS-IBM-64G corosync[2530]: notice  [QUORUM] Members[2]: 1 2
Jan 28 02:44:24 PVE-NAS-IBM-64G corosync[2530]: notice  [MAIN  ] Completed service synchronization, ready to provide service.
Jan 28 02:44:24 PVE-NAS-IBM-64G corosync[2530]:  [QUORUM] This node is within the primary component and will provide service.
Jan 28 02:44:24 PVE-NAS-IBM-64G corosync[2530]:  [QUORUM] Members[2]: 1 2
Jan 28 02:44:24 PVE-NAS-IBM-64G corosync[2530]:  [MAIN  ] Completed service synchronization, ready to provide service.
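
If fuller logs would help, I would pull them like this:

Code:
journalctl -u corosync -u pve-cluster --since "2018-01-28" --no-pager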

/etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: IB-LAB-DELL
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.200.0.2
  }
  node {
    name: IB-NAS-IBM
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.200.0.1
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: MAIN
  config_version: 4
  interface {
    bindnetaddr: 10.200.0.1
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
  netmtu: 2043
}
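
I can also post the ring/quorum state as corosync itself reports it; as far as I understand, that is queried with:

Code:
corosync-cfgtool -s      # ring status per interface
corosync-quorumtool -s   # quorum view, should match pvecm status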
 
