Hello, I have a Proxmox cluster with all nodes updated to 8.3.3.
Today, from the dashboard, I noticed that one node had a red cross while the other 4 were green.
So I started investigating.
On the green nodes I get the following:
pvecm status
root@pve03:/etc# pvecm status
Cluster information
-------------------
Name: US01
Config Version: 7
Transport: knet
Secure auth: on
Quorum information
------------------
Date: Thu Apr 17 17:24:38 2025
Quorum provider: corosync_votequorum
Nodes: 4
Node ID: 0x00000004
Ring ID: 1.161f5
Quorate: Yes
Votequorum information
----------------------
Expected votes: 5
Highest expected: 5
Total votes: 4
Quorum: 3
Flags: 2Node Quorate WaitForAll
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 182.xx.y.227
0x00000002 1 182.xx.y.228
0x00000003 1 182.xx.y.229
0x00000004 1 182.xx.y.230 (local)
root@pve03:/etc#
systemctl status corosync.service
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/lib/systemd/system/corosync.service; enabled; preset: enabled)
Active: active (running) since Thu 2025-02-27 16:00:17 CET; 1 month 18 days ago
Docs: man:corosync
man:corosync.conf
man:corosync_overview
Main PID: 52581 (corosync)
Tasks: 9 (limit: 154510)
Memory: 146.4M
CPU: 1d 21h 2min 26.287s
CGroup: /system.slice/corosync.service
└─52581 /usr/sbin/corosync -f
Apr 17 07:23:25 pve03 corosync[52581]: [KNET ] link: host: 5 link: 0 is down
Apr 17 07:23:25 pve03 corosync[52581]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Apr 17 07:23:25 pve03 corosync[52581]: [KNET ] host: host: 5 has no active links <===== how can I activate it?
Apr 17 07:23:27 pve03 corosync[52581]: [TOTEM ] Token has not been received in 3712 ms
Apr 17 07:23:34 pve03 corosync[52581]: [QUORUM] Sync members[4]: 1 2 3 4
Apr 17 07:23:34 pve03 corosync[52581]: [QUORUM] Sync left[1]: 5
Apr 17 07:23:34 pve03 corosync[52581]: [TOTEM ] A new membership (1.161f5) was formed. Members left: 5
Apr 17 07:23:34 pve03 corosync[52581]: [TOTEM ] Failed to receive the leave message. failed: 5
Apr 17 07:23:34 pve03 corosync[52581]: [QUORUM] Members[4]: 1 2 3 4
Apr 17 07:23:34 pve03 corosync[52581]: [MAIN ] Completed service synchronization, ready to provide service.
root@pve03:~#
root@pve04:/etc/pve# pvecm status
Can't use an undefined value as a HASH reference at /usr/share/perl5/PVE/CLI/pvecm.pm line 496, <DATA> line 960.
root@pve04:/etc/pve# systemctl status corosync.service
× corosync.service - Corosync Cluster Engine
Loaded: loaded (/lib/systemd/system/corosync.service; enabled; preset: enabled)
Active: failed (Result: exit-code) since Thu 2025-04-17 16:17:50 CEST; 40min ago
Docs: man:corosync
man:corosync.conf
man:corosync_overview
Process: 1316 ExecStart=/usr/sbin/corosync -f $COROSYNC_OPTIONS (code=exited, status=8)
Main PID: 1316 (code=exited, status=8)
CPU: 10ms
Apr 17 16:17:50 pve04 systemd[1]: Starting corosync.service - Corosync Cluster Engine...
Apr 17 16:17:50 pve04 corosync[1316]: parser error: /etc/corosync/corosync.conf:54: Missing closing brace
Apr 17 16:17:50 pve04 systemd[1]: corosync.service: Main process exited, code=exited, status=8/n/a
Apr 17 16:17:50 pve04 systemd[1]: corosync.service: Failed with result 'exit-code'.
Apr 17 16:17:50 pve04 systemd[1]: Failed to start corosync.service - Corosync Cluster Engine.
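The parser error above points at a missing closing brace. A minimal sketch for locating it, assuming the file uses the usual `section {` ... `}` layout and that lines starting with `#` are comments. The function name `find_unbalanced_braces` and the sample text are hypothetical, made up for illustration, and are not the real configuration:

```python
def find_unbalanced_braces(text):
    """Return (line_number, message) pairs for brace mismatches in text."""
    problems = []
    open_stack = []  # line numbers of '{' that have not been closed yet
    for lineno, line in enumerate(text.splitlines(), start=1):
        if line.strip().startswith("#"):
            continue  # assumption: whole-line comments only
        for ch in line:
            if ch == "{":
                open_stack.append(lineno)
            elif ch == "}":
                if open_stack:
                    open_stack.pop()
                else:
                    problems.append((lineno, "closing brace without opening brace"))
    # Anything still on the stack was opened but never closed.
    for lineno in open_stack:
        problems.append((lineno, "opening brace never closed"))
    return problems


# Hypothetical sample, not the real corosync.conf from this cluster.
sample = """totem {
  cluster_name: US01
  interface {
    linknumber: 0
}
"""

for lineno, message in find_unbalanced_braces(sample):
    print(f"line {lineno}: {message}")
```

To check the real file, the text could be read from the path named in the log (`/etc/corosync/corosync.conf`) and passed to the function. Because braces are matched innermost first, the line reported is the outermost unclosed section, which is not necessarily the section where the brace was actually lost.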
On the node with the red cross, I noticed that the corosync configuration file is missing a closing brace, but I am not able to update it.
The file to update is
/etc/pve/corosync.conf
How can I update it?
Thanks
/Franco