I am upgrading from Proxmox 3 to 5.1 by reinstalling the two servers in my cluster.
I moved all VMs from node2 to node1, installed node2 from scratch, and then manually moved all VMs from node1 back to node2. Then I reinstalled node1. Now I have Proxmox 5.1 on both nodes.
I initiated the cluster on node2, where I already had VMs running, and now I just want to add node1 to the cluster. When I do this I get the following:
copy corosync auth key
stopping pve-cluster service
backup old database
Job for corosync.service failed because the control process exited with error code.
See "systemctl status corosync.service" and "journalctl -xe" for details.
waiting for quorum...^C
systemctl status corosync.service
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Sat 2018-02-24 11:30:13 CET; 17min ago
Docs: man:corosync
man:corosync.conf
man:corosync_overview
Process: 2725 ExecStart=/usr/sbin/corosync -f $COROSYNC_OPTIONS (code=exited, status=20)
Main PID: 2725 (code=exited, status=20)
CPU: 52ms
Feb 24 11:30:13 vms1 corosync[2725]: info [WD ] no resources configured.
Feb 24 11:30:13 vms1 corosync[2725]: notice [SERV ] Service engine loaded: corosync watchdog service [7]
Feb 24 11:30:13 vms1 corosync[2725]: notice [QUORUM] Using quorum provider corosync_votequorum
Feb 24 11:30:13 vms1 corosync[2725]: crit [QUORUM] Quorum provider: corosync_votequorum failed to initialize.
Feb 24 11:30:13 vms1 corosync[2725]: error [SERV ] Service engine 'corosync_quorum' failed to load for reason 'configuration error: nodelist or quorum.expected_votes must be configured!'
Feb 24 11:30:13 vms1 corosync[2725]: error [MAIN ] Corosync Cluster Engine exiting with status 20 at service.c:356.
Feb 24 11:30:13 vms1 systemd[1]: corosync.service: Main process exited, code=exited, status=20/n/a
Feb 24 11:30:13 vms1 systemd[1]: Failed to start Corosync Cluster Engine.
Feb 24 11:30:13 vms1 systemd[1]: corosync.service: Unit entered failed state.
Feb 24 11:30:13 vms1 systemd[1]: corosync.service: Failed with result 'exit-code'.
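The error above suggests that the corosync.conf that ended up on vms1 has neither a nodelist section nor quorum.expected_votes. Below is a minimal sketch for checking that. It assumes the config lives at /etc/corosync/corosync.conf, which is the usual location on Proxmox VE 5.x, and `check_corosync_conf` is just a hypothetical helper name, not part of any Proxmox tool:

```shell
# Hypothetical helper: report whether a corosync config file defines a
# nodelist section or an expected_votes setting, which is what the
# corosync error message asks for.
check_corosync_conf() {
    conf="$1"
    if [ ! -r "$conf" ]; then
        echo "missing: $conf"
        return 2
    fi
    # A nodelist section starts with a line like "nodelist {"
    if grep -Eq '^[[:space:]]*nodelist[[:space:]]*\{' "$conf"; then
        echo "nodelist found in $conf"
        return 0
    fi
    # expected_votes is set as "expected_votes: N" inside the quorum section
    if grep -Eq '^[[:space:]]*expected_votes[[:space:]]*:' "$conf"; then
        echo "expected_votes found in $conf"
        return 0
    fi
    echo "neither nodelist nor expected_votes in $conf"
    return 1
}

# Assumed default path; pass another file as the first argument to override.
check_corosync_conf "${1:-/etc/corosync/corosync.conf}" || true
```

If this reports that neither is present, comparing the file with the one on the node where the cluster was created would be the next thing to look at.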
journalctl -xe
.....
Feb 24 11:48:30 vms1 pmxcfs[2704]: [status] crit: cpg_initialize failed: 2
Feb 24 11:48:36 vms1 pmxcfs[2704]: [quorum] crit: quorum_initialize failed: 2
Feb 24 11:48:36 vms1 pmxcfs[2704]: [confdb] crit: cmap_initialize failed: 2
Feb 24 11:48:36 vms1 pmxcfs[2704]: [dcdb] crit: cpg_initialize failed: 2
Feb 24 11:48:36 vms1 pmxcfs[2704]: [status] crit: cpg_initialize failed: 2
Feb 24 11:48:42 vms1 pmxcfs[2704]: [quorum] crit: quorum_initialize failed: 2
Feb 24 11:48:42 vms1 pmxcfs[2704]: [confdb] crit: cmap_initialize failed: 2
Feb 24 11:48:42 vms1 pmxcfs[2704]: [dcdb] crit: cpg_initialize failed: 2
Feb 24 11:48:42 vms1 pmxcfs[2704]: [status] crit: cpg_initialize failed: 2
Feb 24 11:48:48 vms1 pmxcfs[2704]: [quorum] crit: quorum_initialize failed: 2
Feb 24 11:48:48 vms1 pmxcfs[2704]: [confdb] crit: cmap_initialize failed: 2
Feb 24 11:48:48 vms1 pmxcfs[2704]: [dcdb] crit: cpg_initialize failed: 2
Feb 24 11:48:48 vms1 pmxcfs[2704]: [status] crit: cpg_initialize failed: 2
Feb 24 11:48:54 vms1 pmxcfs[2704]: [quorum] crit: quorum_initialize failed: 2
Feb 24 11:48:54 vms1 pmxcfs[2704]: [confdb] crit: cmap_initialize failed: 2
Feb 24 11:48:54 vms1 pmxcfs[2704]: [dcdb] crit: cpg_initialize failed: 2
Feb 24 11:48:54 vms1 pmxcfs[2704]: [status] crit: cpg_initialize failed: 2
On the other node where I started the cluster I get this in syslog:
Feb 24 11:41:25 vms2 pveproxy[22913]: Could not verify remote node certificate 'C1:A6:5B:42:41:8F:55:56:53:F9:65:73:8E:52:96:30:F7:AF:B2:62:9C:97:53:4C:88:2E:43:17:82
7:3E:60' with list of pinned certificates, refreshing cache
Feb 24 11:41:30 vms2 pveproxy[22913]: unable to read '/etc/pve/nodes/vms1/pve-ssl.pem' - No such file or directory
Feb 24 11:41:36 vms2 pveproxy[22913]: unable to read '/etc/pve/nodes/vms1/pve-ssl.pem' - No such file or directory
Feb 24 11:41:41 vms2 pveproxy[22913]: unable to read '/etc/pve/nodes/vms1/pve-ssl.pem' - No such file or directory
What can I do to fix this and get node1 to join the cluster?