Upgrade Proxmox 3.4 to premium repositories: quorum problem

aluco

Nov 30, 2015
I have a 2-node Proxmox 3.4 cluster with a separate quorum device.

I upgraded the Proxmox nodes using the premium repositories, and the upgrade succeeded.

Nevertheless, after the reboots, one node loses quorum and the other, for mysterious reasons, interferes with another cluster we have, which runs Proxmox 2.3:

proxmox 2.3 cluster: /var/log/syslog
Nov 25 10:59:45 a84 rgmanager[641850]: [pvevm] VM 41245 is running
Nov 25 10:59:45 a84 rgmanager[641864]: [pvevm] VM 243 is running
Nov 25 10:59:46 a84 corosync[3031]: [TOTEM ] Retransmit List: 298b1 298b2 298b3 298b4 298b5 298b6 298b7 298b8 298b9 298ba 298bb 298bc 298bd 298be 298bf 298c0 298c1

If I power down the 3.4 nodes, the messages disappear from the neighboring 2.3 cluster, and it works normally with no issues at all.
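For anyone comparing how often these messages appear before and after a change, here is a minimal Python sketch that counts the 'Retransmit List' lines in a syslog excerpt. It assumes the line format shown above. The names `retransmit_ids` and `summarize` are made up for this illustration and are not part of corosync or Proxmox.

```python
import re

# Matches corosync TOTEM retransmit lines and captures the hex sequence IDs.
# The pattern is an assumption based on the log excerpt in this thread.
RETRANSMIT_RE = re.compile(r"\[TOTEM\s*\]\s+Retransmit List:\s*([0-9a-fA-F\s]+)")


def retransmit_ids(line):
    """Return the sequence IDs from one syslog line as integers, or [] if none."""
    match = RETRANSMIT_RE.search(line)
    if not match:
        return []
    return [int(token, 16) for token in match.group(1).split()]


def summarize(lines):
    """Return (number of retransmit lines, number of distinct sequence IDs)."""
    hits = 0
    seen = set()
    for line in lines:
        ids = retransmit_ids(line)
        if ids:
            hits += 1
            seen.update(ids)
    return hits, len(seen)


# Shortened sample taken from the excerpt above.
sample = [
    "Nov 25 10:59:45 a84 rgmanager[641864]: [pvevm] VM 243 is running",
    "Nov 25 10:59:46 a84 corosync[3031]: [TOTEM ] Retransmit List: 298b1 298b2 298b3 298b4",
]
print(summarize(sample))
```

To use it on a real log, pass an open file instead of `sample`, for example `summarize(open("/var/log/syslog"))`.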

At first I thought the problem was a multicast issue. I reconfigured the cluster for unicast, modified some firewall rules, and checked the switches. After several tests, downtime, and so on, I discovered that if I boot the Proxmox 3.4 premium nodes with the old kernel, the quorum problem no longer exists and all nodes and clusters work fine.

stock kernel: 2.6.32-39-pve
premium kernel: 2.6.32-43-pve

short story:

- One Proxmox 2.3 cluster working normally.
- Another Proxmox 3.4 cluster.
|-> After upgrading the 3.4 nodes, the Proxmox 2.3 cluster loses rgmanager connectivity and its syslog is flooded with 'Retransmit List' messages and the like. The 3.4 cluster never reaches quorate status (it stays inquorate).

After powering down the 3.4 nodes, or booting them with the stock kernel, the problem is gone from the 2.3 cluster.

Please help me with this issue.

Regards,

Alfredo Luco.
 
Hi,
If I understand you correctly, you have a mixed cluster with 2.3 and 3.4?
If yes, this is not recommended; the nodes should always have the same version.
 
If the two clusters are really different clusters, it could be:

- Do you have the same cluster name on both clusters?

- Do you have an IGMP querier on your network (switches, router)?
If not, the Proxmox nodes can become the IGMP querier, and on node reboot you can have multicast problems.
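As a rough way to see what a node itself is doing, the Linux bridge exposes its IGMP snooping and querier flags in sysfs. The Python sketch below reads them. It assumes the cluster traffic goes through a Linux bridge named `vmbr0` (the usual Proxmox default, but not confirmed in this thread). The function names are made up for this illustration. It says nothing about whether a switch or router already acts as querier, which still has to be checked on the network gear.

```python
from pathlib import Path


def bridge_multicast_state(bridge="vmbr0", sysfs_root="/sys/class/net"):
    """Read the IGMP snooping and querier flags of a Linux bridge from sysfs.

    Returns a dict mapping flag name to True/False, or None when the file
    cannot be read (for example, the bridge does not exist on this host).
    The bridge name "vmbr0" is an assumption.
    """
    base = Path(sysfs_root) / bridge / "bridge"
    state = {}
    for flag in ("multicast_snooping", "multicast_querier"):
        try:
            state[flag] = (base / flag).read_text().strip() == "1"
        except OSError:
            state[flag] = None
    return state


def needs_querier(state):
    """True when the bridge snoops IGMP but does not act as querier itself.

    That combination is only a problem if no switch or router on the
    segment is the querier, which this script cannot see.
    """
    return (
        state.get("multicast_snooping") is True
        and state.get("multicast_querier") is False
    )


state = bridge_multicast_state()
print(state, "needs querier elsewhere:", needs_querier(state))
```

If `needs_querier` is true and the switches have no querier configured, that fits the situation described above, where multicast breaks around node reboots.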
 
Hi Wolfgang

No, I have a 2.3 cluster and a separate 3.4 cluster; they are different clusters.

Regards,

Alfredo.
 
Hi Spirit

- No, the cluster name is not the same.

- I think the IGMP querier is the real problem. I'll check the switch and post the results.

Thank you for your help.

Alfredo.
 
Hi wolfgang. I need some help with Proxmox 4. I have the same issue with my 4 nodes. They all run the same version of PVE, but I don't have a commercial subscription.
[TOTEM ] Retransmit List: 298b1 298b2 298b3 298b4 298b5 298b6 298b7 298b8 298b9 298ba 298bb 298bc 298bd 298be 298bf 298c0 298c1
I see it in my logs before my servers go down. They just reboot. I started to dig, and I know that the watchdog is the reason. I did not have any problems with version 3, but now it has become a real problem. Our servers reboot once every 5 days or more often; it can happen twice a day. We tried moving our servers to a separate switch in case of network problems, but it did not help. We use HA for one purpose only: to have a common management interface for all servers in the cluster. We do not use any other options. Is there any chance to turn off the watchdog/softdog? Or maybe another option could help us, because the reboots drive us crazy.
Thank you in advance.

 
Hi,
This is not related to this thread; please make a new one.
 
