Cluster HA not working correctly. Files in /etc/pve/qemu-server/ are not being replicated between nodes. Errors on corosync start

joserosa

May 19, 2025
Hello everyone,

I have a cluster of two nodes and a qdevice for the quorum. I am using Proxmox VE 9.1, which I recently installed with the official ISO.


I have added my VMs to HA from the Datacenter, and when checking the VMID.conf files stored in /etc/pve/qemu-server, I see that they are not being replicated between the two nodes.

Now I have all my VMs on node 1, and on node 2 this folder is empty.
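
For reference, a quick way to check where the VM configs actually live on each node (standard /etc/pve paths, nothing custom) is something like:

Code:
# /etc/pve/qemu-server is a per-node view of the cluster filesystem (pmxcfs)
ls -l /etc/pve/qemu-server
# per-node VM configs as stored in pmxcfs
ls -l /etc/pve/nodes/*/qemu-server/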


I have been looking at several forum threads but cannot figure it out.


The HA master node is node 1. I am doing the tests on node 2.

When I stop the services with:

Code:
systemctl stop pve-cluster
systemctl stop corosync

and then start only corosync, it starts, but I get these warnings:



Code:
root@host02:~# systemctl start corosync
sleep 5
pvecm status

ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused
root@host02:~#



I suspect this is the source of my problem, but I don't know how to proceed.
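
For reference, a minimal sequence to bring both services back up and look at what they log (standard pve-cluster/corosync commands, nothing specific to this setup) would be something like:

Code:
systemctl start pve-cluster corosync
journalctl -u pve-cluster -u corosync --since "10 min ago"
pvecm status
corosync-cfgtool -s    # link status as corosync sees it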

Can anyone help me?
 
This is my network configuration; it is the same on each node, only with different IPs. I joined the nodes using the 10.1.30.X/24 network, and the management network I access them from is 10.1.10.X/24.

Checking the settings, I see that there are strange things that I haven't done before... The system has done them for me, such as “iface bond0.1.”

My network configuration is:


Code:
HOST02:~# cat /etc/network/interfaces
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!

auto lo
iface lo inet loopback
#Unused interfaces

iface nic3 inet manual
#enp1s0f1 - NIC 3

iface nic2 inet manual
#enp1s0f0 - NIC 2

auto nic0
iface nic0 inet manual
        mtu 9000
#enp9s0 - MLOM 0

auto nic1
iface nic1 inet manual
        mtu 9000
#enp10s0 - MLOM 1

auto nic4
iface nic4 inet manual
        mtu 9000
#enp14s0f0 - NIC 4

auto nic5
iface nic5 inet manual
        mtu 9000
#enp14s0f1  - NIC 5

auto nic1.50
iface nic1.50 inet static
        address 10.1.50.21/24
        mtu 9000
#iSCSI-01

auto nic5.26
iface nic5.26 inet static
        address 10.1.26.21/24
        mtu 9000
#iSCSI-02

auto bond0
iface bond0 inet manual
        bond-slaves nic0 nic4
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3
        mtu 9000
#BOND0

auto bond0.10
iface bond0.10 inet manual
#BOND - MGMT

auto bond0.20
iface bond0.20 inet manual
#BOND - NFS-STORAGE-20

auto bond0.30
iface bond0.30 inet static
        address 10.1.30.21/24
        mtu 9000
#live migration - heartbeat

auto bond0.11
iface bond0.11 inet manual
#BOND - SERVER-11

....

....


auto SERVER_10
iface SERVER_10 inet static
        address 10.1.10.21/24
        gateway 10.1.10.3
        bridge-ports bond0.10
        bridge-stp off
        bridge-fd 0
        mtu 1500
#VM network - MGMT 10

auto NFS_STORAGE_20
iface NFS_STORAGE_20 inet static
        address 10.1.20.21/24
        bridge-ports bond0.20
        bridge-stp off
        bridge-fd 0
        mtu 1500
#VM network - NFS-STORAGE-20

....

....


source /etc/network/interfaces.d/*
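
Since the cluster traffic runs over bond0.30 with MTU 9000, a few quick sanity checks of the bond, the VLAN sub-interface and the jumbo-frame path between the nodes (standard Linux tools; 10.1.30.22 is only a placeholder for the other node's address) would be something like:

Code:
cat /proc/net/bonding/bond0        # LACP aggregator/partner state per slave
ip -d link show bond0.30           # VLAN 30 sub-interface and its MTU
ping -M do -s 8972 10.1.30.22      # 9000-byte path check (8972 + 28 bytes of headers)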
 
I see you have a bond on interfaces nic0 and nic4 with xmit-hash-policy layer2+3; this uses a combination of MAC and IP addresses to balance traffic across the bond.
Maybe your switch does not support this feature?
Can you ping between the cluster interfaces (10.1.30.X/24)?
 
Hello, the switches support it. These hosts are connected via Cisco Nexus switches, configured with LACP, vPC, etc. The ports are in trunk mode with the VLANs allowed, etc. At the network level, everything seems to be OK. Pings between devices on the 10.1.10.X network work.



Pings between devices on the 10.1.30.X network work.


The qdevice also has two interfaces, one on the 10.1.10.X network and another on the 10.1.30.X network.

When I configured the quorum between the hosts and the qdevice, I did so using the 10.1.30.X network.
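
To see whether the qdevice is actually contributing a vote over that network, the usual status checks (standard corosync/PVE tooling) would be something like:

Code:
pvecm status                       # should list Qdevice and the expected/total votes
corosync-quorumtool -s             # quorum state from corosync's point of view
systemctl status corosync-qdevice  # qdevice daemon on each PVE node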
 
If you are using vPC, I recommend using Open vSwitch and setting up an LACP bond with the following parameters:
Code:
bond_mode=balance-tcp lacp=active

Here is a piece of a sample config:

Code:
apt install openvswitch-switch

Code:
auto lo
iface lo inet loopback

auto eno1
iface eno1 inet manual

iface idrac inet manual

auto eno2
iface eno2 inet manual

iface eno3 inet manual

iface eno4 inet manual

auto gestion
iface gestion inet static
        address 10.100.5.242/24
        gateway 10.100.5.1
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        ovs_options tag=105

auto cluster
iface cluster inet static
        address 10.100.10.242/28
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        ovs_options tag=110

auto ceph
iface ceph inet static
        address 10.100.12.242/28
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        ovs_options tag=112

auto cephpublic
iface cephpublic inet static
        address 10.100.13.242/28
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        ovs_options tag=113

auto migration
iface migration inet static
        address 10.100.14.242/28
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        ovs_options tag=114

auto backups
iface backups inet static
        address 10.100.15.242/24
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        ovs_options tag=115

auto storage
iface storage inet static
        address 10.100.22.242/24
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        ovs_options tag=122

auto bond0
iface bond0 inet manual
        ovs_bonds eno1 eno2
        ovs_type OVSBond
        ovs_bridge vmbr0
        ovs_options bond_mode=balance-tcp lacp=active

auto vmbr0
iface vmbr0 inet manual
        ovs_type OVSBridge
        ovs_ports bond0 gestion ceph cephpublic migration storage backups cluster
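
After applying the configuration (ifreload -a or a reboot), the negotiated bond/LACP state can be checked with the usual Open vSwitch tools, for example:

Code:
ovs-vsctl show                 # bridge/port layout as OVS sees it
ovs-appctl bond/show bond0     # active members and balance-tcp hashing
ovs-appctl lacp/show bond0     # LACP negotiation status with the switch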