Issue after upgrading Proxmox cluster: disconnected nodes and pve-cluster service restart

Vladimir.root

New Member
Sep 1, 2023
Hello, esteemed community members!

I have encountered an issue after upgrading our Proxmox cluster from version 7.4 to version 8.1. Half of the 11 cluster nodes stopped functioning properly, which has caused significant problems for our infrastructure.

Specifically, these nodes kept disconnecting and reconnecting until we restarted the pve-cluster service. Only after restarting the service did all nodes become accessible again.

I would like to know if anyone else has experienced a similar issue after upgrading their Proxmox cluster or if there are any recommendations for this situation. Perhaps someone can share their experience or suggest a possible solution.

I would be grateful for any assistance or advice!

Thank you in advance!
 
Hi,

I had a similar problem today.
Our 16-node cluster was upgraded to version 7.4.17 without problems.

The next step was upgrading one node to 8.1.4. During this upgrade, 6 nodes were restarted.

Our cluster has been running in production for almost 2 years and has been upgraded many times without problems. We have 2 separate network rings for corosync.

I am afraid to proceed with the upgrade of other nodes.

The logs on the restarted nodes contain the following messages, after which the node restarted:
Code:
Feb 14 12:03:29 pve1-prg1a corosync[2412]:   [TOTEM ] Token has not been received in 21221 ms
Feb 14 12:03:41 pve1-prg1a corosync[2412]:   [TOTEM ] Token has not been received in 33326 ms
Feb 14 12:03:44 pve1-prg1a corosync[2412]:   [QUORUM] Sync members[9]: 2 3 4 6 8 10 11 12 15
Feb 14 12:03:44 pve1-prg1a corosync[2412]:   [QUORUM] Sync left[7]: 1 5 7 9 13 14 16
Feb 14 12:03:44 pve1-prg1a corosync[2412]:   [TOTEM ] A new membership (2.2b7) was formed. Members left: 1 5 7 9 13 14 16
Feb 14 12:03:44 pve1-prg1a corosync[2412]:   [TOTEM ] Failed to receive the leave message. failed: 1 5 7 9 13 14 16
Feb 14 12:03:44 pve1-prg1a pmxcfs[2260]: [dcdb] notice: members: 2/2270, 3/2260, 4/2273, 6/2259, 8/2338, 10/2340, 11/2312, 12/2330, 15/2340
Feb 14 12:03:44 pve1-prg1a pmxcfs[2260]: [dcdb] notice: starting data syncronisation
Feb 14 12:03:44 pve1-prg1a pmxcfs[2260]: [status] notice: members: 2/2270, 3/2260, 4/2273, 6/2259, 8/2338, 10/2340, 11/2312, 12/2330, 15/2340
Feb 14 12:03:44 pve1-prg1a pmxcfs[2260]: [status] notice: starting data syncronisation
Feb 14 12:03:44 pve1-prg1a pvedaemon[3213573]: <root@pam> successful auth for user 'pve-exporter@pve'
Feb 14 12:03:45 pve1-prg1a pmxcfs[2260]: [status] notice: cpg_send_message retry 10
Feb 14 12:03:46 pve1-prg1a pmxcfs[2260]: [status] notice: cpg_send_message retry 20
Feb 14 12:03:47 pve1-prg1a pmxcfs[2260]: [status] notice: cpg_send_message retry 30
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
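To make the excerpt easier to read, here is a minimal Python sketch that pulls the token gaps and the membership change out of the corosync lines above and compares the surviving partition against a simple majority. It assumes every node carries exactly one vote, which the thread does not confirm (no corosync.conf was posted). The names `summarize` and `majority` are made up for this illustration, and it says nothing about why the nodes restarted.

```python
import re

# Sample lines copied verbatim from the log excerpt in this thread.
LOG = """\
Feb 14 12:03:29 pve1-prg1a corosync[2412]:   [TOTEM ] Token has not been received in 21221 ms
Feb 14 12:03:41 pve1-prg1a corosync[2412]:   [TOTEM ] Token has not been received in 33326 ms
Feb 14 12:03:44 pve1-prg1a corosync[2412]:   [QUORUM] Sync members[9]: 2 3 4 6 8 10 11 12 15
Feb 14 12:03:44 pve1-prg1a corosync[2412]:   [QUORUM] Sync left[7]: 1 5 7 9 13 14 16
"""

TOKEN_RE = re.compile(r"Token has not been received in (\d+) ms")
SYNC_RE = re.compile(r"Sync (members|left)\[(\d+)\]: ([\d ]+)")


def summarize(log_text):
    """Return (token gaps in ms, remaining node ids, departed node ids)."""
    token_gaps_ms = []
    groups = {"members": [], "left": []}
    for line in log_text.splitlines():
        match = TOKEN_RE.search(line)
        if match:
            token_gaps_ms.append(int(match.group(1)))
            continue
        match = SYNC_RE.search(line)
        if match:
            groups[match.group(1)] = [int(n) for n in match.group(3).split()]
    return token_gaps_ms, groups["members"], groups["left"]


def majority(total_votes):
    """Simple majority threshold, ASSUMING one vote per node."""
    return total_votes // 2 + 1


token_gaps_ms, members, left = summarize(LOG)
total = len(members) + len(left)

print("longest token gap (ms):", max(token_gaps_ms))
print("nodes remaining:", len(members), "of", total)
print("majority threshold:", majority(total))
print("remaining partition has a majority:", len(members) >= majority(total))
```

Under the one-vote-per-node assumption, the 9 remaining nodes sit exactly at the majority threshold for a 16-node cluster, so losing one more node from that partition would have dropped it below the threshold.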
 
