Hello,
I have 4 nodes, .-pm01 .. .-pm04, all running Virtual Environment 8.2.2.
Two known differences between the systems:
1. .pm01 .. .pm03 have valid subscriptions, while .pm04 does not have a subscription yet (it will get one in the future).
2. The kernel versions differ. On .pm01 .. .pm03, uname -a: Linux .pm03 6.5.13-1-pve #1 SMP PREEMPT_DYNAMIC PMX 6.5.13-1 (2024-02-05T13:50Z) x86_64 GNU/Linux
On .pm04, uname -a: Linux .pm04 6.8.4-2-pve #1 SMP PREEMPT_DYNAMIC PMX 6.8.4-2 (2024-04-10T17:36Z) x86_64 GNU/Linux
--
findings:
on .pm01 .. .pm03:
pvecm status
Cluster information
-------------------
Name: testcluster
Config Version: 13
Transport: knet
Secure auth: on
Quorum information
------------------
Date: Fri May 3 16:58:15 2024
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 0x00000002
Ring ID: 1.3a0
Quorate: Yes
Votequorum information
----------------------
Expected votes: 4
Highest expected: 4
Total votes: 3
Quorum: 3
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 *.34
0x00000002 1 *.33 (local)
0x00000003 1 *.35
-----
on ms-pm04:
pvecm status
Cluster information
-------------------
Name: testcluster
Config Version: 13
Transport: knet
Secure auth: on
Quorum information
------------------
Date: Fri May 3 16:59:36 2024
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000004
Ring ID: 4.3c4
Quorate: No
Votequorum information
----------------------
Expected votes: 4
Highest expected: 4
Total votes: 1
Quorum: 3 Activity blocked
Flags:
Membership information
----------------------
Nodeid Votes Name
0x00000004 1 *.36 (local)
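For reference, the "Quorum: 3" shown in both outputs follows from the expected votes: as far as I understand, votequorum needs a strict majority, i.e. floor(expected votes / 2) + 1. A minimal sketch of that arithmetic (the function names are my own and not part of any corosync or Proxmox tooling):

```python
def quorum_threshold(expected_votes: int) -> int:
    """Strict majority of the expected votes (assumed default votequorum rule)."""
    return expected_votes // 2 + 1


def is_quorate(total_votes: int, expected_votes: int) -> bool:
    """A partition is quorate when its votes reach the majority threshold."""
    return total_votes >= quorum_threshold(expected_votes)


if __name__ == "__main__":
    expected = 4  # "Expected votes" from the pvecm status output
    print(quorum_threshold(expected))  # threshold for a 4-vote cluster
    print(is_quorate(3, expected))     # partition .pm01 .. .pm03
    print(is_quorate(1, expected))     # partition .pm04 alone
```

With 4 expected votes the threshold is 3, so the three-node partition stays quorate while .pm04 on its own (1 vote) is blocked, which matches what pvecm status reports.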
systemctl status corosync says:
May 03 17:11:46 .pm04 corosync[1835]: [KNET ] host: host: 3 has no active links
May 03 17:11:46 .pm04 corosync[1835]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
May 03 17:11:46 .pm04 corosync[1835]: [KNET ] host: host: 3 has no active links
May 03 17:11:46 .pm04 corosync[1835]: [KNET ] link: Resetting MTU for link 0 because host 4 joined
May 03 17:11:46 .pm04 corosync[1835]: [QUORUM] Sync members[1]: 4
May 03 17:11:46 .pm04 corosync[1835]: [QUORUM] Sync joined[1]: 4
May 03 17:11:46 .pm04 corosync[1835]: [TOTEM ] A new membership (4.3c9) was formed. Members joined: 4
May 03 17:11:46 .pm04 corosync[1835]: [QUORUM] Members[1]: 4
May 03 17:11:46 .pm04 corosync[1835]: [MAIN ] Completed service synchronization, ready to provide service.
May 03 17:11:46 .pm04 systemd[1]: Started corosync.service - Corosync Cluster Engine.
Initially it worked: I was able to migrate a VM from .pm02 to .pm04 and start it on .pm04.
/etc/corosync/corosync.conf and /etc/pve/corosync.conf have identical content and appear to be the same on all 4 nodes.
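To make "identical" checkable rather than an impression, the files can be compared by checksum. This is only a sketch: the helper names are hypothetical, and it compares files that are readable on one machine, so across nodes the digest would have to be printed on each node and compared by hand.

```python
import hashlib
from pathlib import Path


def sha256_of(path: str) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def all_identical(paths: list[str]) -> bool:
    """True if every listed file has exactly the same content."""
    digests = {sha256_of(p) for p in paths}
    return len(digests) <= 1


# Example call on a single node (paths as named in this post):
# all_identical(["/etc/corosync/corosync.conf", "/etc/pve/corosync.conf"])
```

A byte-level digest also catches differences such as trailing whitespace or a changed config_version that are easy to miss when comparing by eye.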
/var/log/syslog on .pm04 states:
2024-05-03T16:40:14.438441+02:00 .pm04 pmxcfs[1619]: [quorum] crit: quorum_initialize failed: 2
2024-05-03T16:40:14.438484+02:00 .pm04 pmxcfs[1619]: [quorum] crit: can't initialize service
2024-05-03T16:40:14.438501+02:00 .pm04 pmxcfs[1619]: [confdb] crit: cmap_initialize failed: 2
2024-05-03T16:40:14.438514+02:00 .pm04 pmxcfs[1619]: [confdb] crit: can't initialize service
2024-05-03T16:40:14.438528+02:00 .pm04 pmxcfs[1619]: [dcdb] crit: cpg_initialize failed: 2
2024-05-03T16:40:14.438549+02:00 .pm04 pmxcfs[1619]: [dcdb] crit: can't initialize service
2024-05-03T16:40:14.438564+02:00 .pm04 pmxcfs[1619]: [status] crit: cpg_initialize failed: 2
2024-05-03T16:40:14.438578+02:00 .pm04 pmxcfs[1619]: [status] crit: can't initialize service
My impression is that .pm04 and the other three nodes can no longer communicate with each other.
Maybe someone can give me a hint on how to resolve this issue.
Thanks in advance, gustav