Hi,
We're having some problems replacing a PVE node in a cluster. These are the steps we've taken so far:
- Turn off pve03 (out of 3).
- From pve01, remove pve03 from the cluster with: pvecm delnode pve03
- Unrack the hardware.
- Mount a new server to become the new pve03.
- Perform a clean install of PVE on the new server with the same IP and hostname as the old node, pve03.
- Upgrade new pve03 node.
- From pve03, add it to cluster with pvecm add pve01 (or with IP address of pve01).
- Done; pve03 is now in the cluster.
Code:
root@pve03:~# pvecm s
Quorum information
------------------
Date:             Wed Mar 27 12:46:05 2019
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000001
Ring ID:          1/843000
Quorate:          No

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      1
Quorum:           2 Activity blocked
Flags:

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 192.168.20.13 (local)
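For anyone who wants to check the same thing on the other nodes, here is a rough sketch of how the vote counts in that output could be summarized. The function name quorum_summary is made up for illustration; it only reads the "Total votes" and "Quorum" lines from text in the format shown above and does not talk to corosync itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of PVE): read `pvecm status` text on stdin
# and report whether the votes seen reach the quorum threshold.
quorum_summary() {
    awk -F': *' '
        /^Total votes:/ { total = $2 + 0 }
        /^Quorum:/      { needed = $2 + 0 }   # "2 Activity blocked" becomes 2
        END {
            if (needed == 0)
                print "no quorum information found"
            else if (total >= needed)
                printf "quorate: %d of %d votes\n", total, needed
            else
                printf "not quorate: %d of %d votes needed\n", total, needed
        }'
}
```

It would be used as `pvecm status | quorum_summary` on each node, so the three results can be compared side by side.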
Reading here in the forums, it is commonly mentioned that this is likely a unicast vs. multicast issue. Given that the cluster has been working flawlessly for years now, before the hardware replacement, I just don't see how this would be an issue with multicast, especially since we're reusing the exact same switch and switch ports.
The only explanation I can think of at this point is that reusing the same IP and hostname is somehow causing problems somewhere in the cluster configuration. As far as I can tell, /etc/pve/corosync.conf looks good. I've compared it to that of another cluster we have in a different data center and can't find any meaningful differences.
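To make that comparison less error-prone than reading the file by eye, something like the following could list the node entries. This is only a sketch: list_corosync_nodes is a made-up name, and it assumes the usual nodelist layout with one `node { ... }` block per node containing name, nodeid and ring0_addr keys.

```shell
#!/bin/sh
# Hypothetical helper: print "name nodeid ring0_addr" for every node block
# in a corosync.conf (defaults to /etc/pve/corosync.conf).
# Assumes each node is written as a "node {" line followed by "key: value" lines.
list_corosync_nodes() {
    awk '
        $1 == "node" && $2 == "{"      { in_node = 1; name = id = addr = ""; next }
        in_node && $1 == "name:"       { name = $2 }
        in_node && $1 == "nodeid:"     { id = $2 }
        in_node && $1 == "ring0_addr:" { addr = $2 }
        in_node && $1 == "}"           { print name, id, addr; in_node = 0 }
    ' "${1:-/etc/pve/corosync.conf}"
}
```

Running `list_corosync_nodes` on each node and diffing the results should show whether all three agree on the name, node ID and address recorded for pve03.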
Does anyone have any ideas? Thanks in advance!