[SOLVED] Proxmox 5.4 to 6.x - Corosync 3 - No quorum anymore

Mandarine

Nov 17, 2020
Hello Proxmox community! :)

I inherited a two-node Proxmox setup at my new workplace, and since it is running the EOL 5.4.15 release, I thought I would look into upgrading it to the supported 6.x.

So I started the procedure at https://pve.proxmox.com/wiki/Upgrade_from_5.x_to_6.0. Corosync 3 installed properly on both of my nodes, but I seem to have lost quorum now.
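For reference, the Corosync step I ran on each node was roughly the one from the wiki (repo line as I remember it from the guide, so double-check against the current page):

Code:
echo "deb http://download.proxmox.com/debian/corosync-3/ stretch main" > /etc/apt/sources.list.d/corosync3.list
apt update
apt dist-upgrade --download-only
apt dist-upgrade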

I am fairly new to Proxmox; my previous experience was with VMware 5.x.

Would someone have any pointers to help me out?

Here is the relevant config/info:

PVE2 quorum status

Code:
root@pve2:~# pvecm status
Quorum information
------------------
Date:             Tue Nov 17 15:33:14 2020
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000001
Ring ID:          1.14
Quorate:          No

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      1
Quorum:           2 Activity blocked
Flags:          

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 192.168.23.2 (local)

PVE3 quorum status

Code:
root@pve3:~# pvecm status
Quorum information
------------------
Date:             Tue Nov 17 15:33:01 2020
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000002
Ring ID:          2.14
Quorate:          No

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      1
Quorum:           2 Activity blocked
Flags:          

Membership information
----------------------
    Nodeid      Votes Name
0x00000002          1 192.168.23.3 (local)

/etc/pve/corosync.conf (same on both nodes)

Code:
logging {

  debug: off

  to_syslog: yes

}


nodelist {

  node {

    name: pve2

    nodeid: 1

    quorum_votes: 1

    ring0_addr: pve2

  }

  node {

    name: pve3

    nodeid: 2

    quorum_votes: 1

    ring0_addr: pve3

  }

}


quorum {

  provider: corosync_votequorum

}


totem {

  cluster_name: ha-earlytracks

  config_version: 4

  interface {

    bindnetaddr: 192.168.23.2

    ringnumber: 0

  }

  ip_version: ipv4

  secauth: on

  version: 2

}
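Note that ring0_addr is set to the hostnames rather than IPs, so if it matters, this is how each node can check what those names resolve to locally (getent uses the same resolver libc does, i.e. /etc/hosts first):

Code:
getent hosts pve2 pve3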

The side effect of this is that the "unified" management UI for the two nodes no longer shows the PVE3 node (red cross and "unknown" status for the VMs).

Appreciate any help!
 
Please post the pveversion -v output of both nodes. Did you run the pve5to6 script before and after every step?
 
Hello Mira, and thank you for the reply!

Yes, I ran pve5to6 before starting, and the only error on both nodes was that Corosync was at v2, which was expected (hence the v3 upgrade).

To clarify: I have not completed the Proxmox v6 upgrade (yet). I only updated to Corosync v3 and stopped when I saw that quorum was no longer working.

PVE2 pveversion -v

Code:
root@pve2:~# pveversion -v
proxmox-ve: 5.4-2 (running kernel: 4.13.13-6-pve)
pve-manager: 5.4-15 (running version: 5.4-15/d0ec33c6)
pve-kernel-4.15: 5.4-19
pve-kernel-4.13: 5.2-2
pve-kernel-4.15.18-30-pve: 4.15.18-58
pve-kernel-4.15.18-18-pve: 4.15.18-44
pve-kernel-4.15.18-10-pve: 4.15.18-32
pve-kernel-4.15.18-9-pve: 4.15.18-30
pve-kernel-4.15.18-2-pve: 4.15.18-21
pve-kernel-4.13.16-4-pve: 4.13.16-51
pve-kernel-4.13.16-2-pve: 4.13.16-48
pve-kernel-4.13.13-6-pve: 4.13.13-42
pve-kernel-4.10.17-1-pve: 4.10.17-18
corosync: 3.0.4-pve1~bpo9
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-12
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-56
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-14
libpve-storage-perl: 5.0-44
libqb0: 1.0.5-1~bpo9+2
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-7
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
openvswitch-switch: 2.7.0-3
proxmox-widget-toolkit: 1.0-28
pve-cluster: 5.0-38
pve-container: 2.0-42
pve-docs: 5.4-2
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-22
pve-firmware: 2.0-7
pve-ha-manager: 2.0-9
pve-i18n: 1.1-4
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 3.0.1-4
pve-xtermjs: 3.12.0-1
qemu-server: 5.0-56
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo2

PVE3 pveversion -v

Code:
proxmox-ve: 5.4-2 (running kernel: 4.13.13-6-pve)
pve-manager: 5.4-15 (running version: 5.4-15/d0ec33c6)
pve-kernel-4.15: 5.4-19
pve-kernel-4.13: 5.2-2
pve-kernel-4.15.18-30-pve: 4.15.18-58
pve-kernel-4.15.18-18-pve: 4.15.18-44
pve-kernel-4.15.18-9-pve: 4.15.18-30
pve-kernel-4.15.18-2-pve: 4.15.18-21
pve-kernel-4.13.16-4-pve: 4.13.16-51
pve-kernel-4.13.16-2-pve: 4.13.16-48
pve-kernel-4.13.13-6-pve: 4.13.13-42
pve-kernel-4.10.17-1-pve: 4.10.17-18
corosync: 3.0.4-pve1~bpo9
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-12
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-56
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-14
libpve-storage-perl: 5.0-44
libqb0: 1.0.5-1~bpo9+2
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-7
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
openvswitch-switch: 2.7.0-3
proxmox-widget-toolkit: 1.0-28
pve-cluster: 5.0-38
pve-container: 2.0-42
pve-docs: 5.4-2
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-22
pve-firmware: 2.0-7
pve-ha-manager: 2.0-9
pve-i18n: 1.1-4
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 3.0.1-4
pve-xtermjs: 3.12.0-1
qemu-server: 5.0-56
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo2
 
You're running a very old kernel; try rebooting. You should also look at the warnings the pve5to6 script prints. Is there anything else, other than the old kernel, that it warns about?
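To spot the mismatch, you can compare the running kernel against the newest installed one, for example:

Code:
uname -r
ls -1 /boot/vmlinuz-* | sort -V | tail -n1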
 
Thank you! I rebooted one of the hosts (PVE3) and re-ran the pve5to6 script. Here's the output; I redacted the public IP that was shown in it (REDACTED_PUB_IP instead).

Code:
root@pve3:~# pve5to6
= CHECKING VERSION INFORMATION FOR PVE PACKAGES =

Checking for package updates..
PASS: all packages uptodate

Checking proxmox-ve package version..
PASS: proxmox-ve package has version >= 5.4-2

Checking running kernel version..
PASS: expected running kernel '4.15.18-30-pve'.

Checking for installed stock Debian Kernel..
PASS: Stock Debian kernel package not installed.

= CHECKING CLUSTER HEALTH/SETTINGS =

PASS: systemd unit 'pve-cluster.service' is in state 'active'
PASS: systemd unit 'corosync.service' is in state 'active'
PASS: Cluster Filesystem is quorate.

Analzying quorum settings and state..
FAIL: 1 nodes are offline!
INFO: configured votes - nodes: 2
INFO: configured votes - qdevice: 0
INFO: current expected votes: 1
INFO: current total votes: 1
WARN: expected votes set to non-standard value '1'.
WARN: cluster consists of less than three nodes!

Checking nodelist entries..
WARN: pve3: ring0_addr 'pve3' resolves to '192.168.23.3'.
 Consider replacing it with the currently resolved IP address.
WARN: pve2: ring0_addr 'pve2' resolves to 'REDACTED_PUB_IP'.
 Consider replacing it with the currently resolved IP address.

Checking totem settings..
PASS: Corosync transport set to implicit default.
PASS: Corosync encryption and authentication enabled.

INFO: run 'pvecm status' to get detailed cluster status..

= CHECKING INSTALLED COROSYNC VERSION =

PASS: corosync 3.x installed.

= CHECKING HYPER-CONVERGED CEPH STATUS =

SKIP: no hyper-converged ceph setup detected!

= CHECKING CONFIGURED STORAGES =

PASS: storage 'local' enabled and active.
SKIP: storage 'ns3045425.REDACTED_PUB_IP.eu' disabled.
SKIP: storage 'u208666-sub3' disabled.
PASS: storage 'ns3065847.REDACTED_PUB_IP.eu' enabled and active.

= MISCELLANEOUS CHECKS =

INFO: Checking common daemon services..
PASS: systemd unit 'pveproxy.service' is in state 'active'
PASS: systemd unit 'pvedaemon.service' is in state 'active'
PASS: systemd unit 'pvestatd.service' is in state 'active'
INFO: Checking for running guests..
WARN: 5 running guest(s) detected - consider migrating or stopping them.
INFO: Checking if the local node's hostname 'pve3' is resolvable..
INFO: Checking if resolved IP is configured on local node..
PASS: Resolved node IP '192.168.23.3' configured and active on single interface.
INFO: Check node certificate's RSA key size
PASS: Certificate 'pve-root-ca.pem' passed Debian Busters security level for TLS connections (4096 >= 2048)
PASS: Certificate 'pve-ssl.pem' passed Debian Busters security level for TLS connections (2048 >= 2048)
PASS: Certificate 'pveproxy-ssl.pem' passed Debian Busters security level for TLS connections (4096 >= 2048)
INFO: Checking KVM nesting support, which breaks live migration for VMs using it..
PASS: KVM nested parameter not set.
INFO: Checking VMs with OVMF enabled and bad efidisk sizes...
PASS: No VMs with OVMF and problematic efidisk found.

= SUMMARY =

TOTAL:    30
PASSED:   21
SKIPPED:  3
WARNINGS: 5
FAILURES: 1

ATTENTION: Please check the output for detailed information!
Try to solve the problems one at a time and then run this checklist tool again.

Maybe it is significant that "pve2", which is one of the two nodes (and the previous quorum master), resolves to a public IP?
 
Yes, that is most likely the problem. Either fix it in /etc/hosts on pve2 so the name resolves to the cluster network IP, or change the corosync config (/etc/pve/corosync.conf) to contain the IP instead of the hostname.
Changing the corosync config file is not that easy at the moment since your nodes are not quorate, so the /etc/hosts route will be easier.
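Going by your pvecm output above, the /etc/hosts entries on both nodes should look something like this (cluster IPs taken from your quorum output; adjust if your hosts file also carries FQDNs):

Code:
192.168.23.2 pve2
192.168.23.3 pve3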
 
Yes! That was it: upon closer inspection, one of the private IPs in /etc/hosts was wrong.

I now have quorum back after running:

Code:
systemctl restart corosync && systemctl start pve-ha-lrm && systemctl start pve-ha-crm

Thank you very much for the help, I'll continue with the upgrade path :)
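For anyone else hitting this: after correcting /etc/hosts you can double-check that both nodes see each other again with, e.g. (for corosync 3, corosync-cfgtool -s shows the link status):

Code:
pvecm status
corosync-cfgtool -s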
 
