Cluster requires IPv4 gateway?

owlnical

Jun 1, 2017
Hi,

Due to a power failure in our building, I had to cold boot our two Proxmox nodes. I was unable to start any VMs until both nodes could reach their default gateway. The problem (found thanks to this post) was that the machine holding that IP is a KVM guest running pfSense in our cluster.

Code:
pvecm status
Cannot initialize CMAP service

journalctl -xn
Jun 01 10:58:16 shpx01 pmxcfs[5802]: [quorum] crit: quorum_initialize failed: 2
Jun 01 10:58:16 shpx01 pmxcfs[5802]: [quorum] crit: can't initialize service
Jun 01 10:58:16 shpx01 pmxcfs[5802]: [confdb] crit: cmap_initialize failed: 2
Jun 01 10:58:16 shpx01 pmxcfs[5802]: [confdb] crit: can't initialize service
Jun 01 10:58:16 shpx01 pmxcfs[5802]: [dcdb] crit: cpg_initialize failed: 2
Jun 01 10:58:16 shpx01 pmxcfs[5802]: [dcdb] crit: can't initialize service
Jun 01 10:58:16 shpx01 pmxcfs[5802]: [status] crit: cpg_initialize failed: 2
Jun 01 10:58:16 shpx01 pmxcfs[5802]: [status] crit: can't initialize service

So I assigned the gateway IP to my laptop, rebooted the server, and then everything was fine.

Of course, I want to avoid this catch-22 in the future. My googling has failed me, though. Can I somehow force PVE to ignore that the gateway cannot be reached? Or is this bad practice? Any advice or input on this would be greatly appreciated.

Node info:

Code:
root@shpx01:~# pveversion -v
proxmox-ve: 4.4-87 (running kernel: 4.4.59-1-pve)
pve-manager: 4.4-13 (running version: 4.4-13/7ea56165)
pve-kernel-4.4.59-1-pve: 4.4.59-87
pve-kernel-4.4.24-1-pve: 4.4.24-72
pve-kernel-4.4.19-1-pve: 4.4.19-66
lvm2: 2.02.116-pve3
corosync-pve: 2.4.2-2~pve4+1
libqb0: 1.0.1-1
pve-cluster: 4.0-49
qemu-server: 4.0-110
pve-firmware: 1.1-11
libpve-common-perl: 4.0-94
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-76
pve-libspice-server1: 0.12.8-2
vncterm: 1.3-2
pve-docs: 4.4-4
pve-qemu-kvm: 2.7.1-4
pve-container: 1.0-99
pve-firewall: 2.0-33
pve-ha-manager: 1.0-40
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-4
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-9
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.9-pve15~bpo80
openvswitch-switch: 2.6.0-2
 
If you have hostnames rather than IPs in your corosync.conf, does hostname resolution require access to a third machine outside your LAN?
In that case I would recommend adding each node's hostname and IP to /etc/hosts.
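A rough sketch of how one could check this without relying on DNS: the helper below looks a name up in a hosts-format file only. The function name `lookup_in_hosts` is made up for illustration, and the second node name (`shpx02`) and the 192.0.2.x addresses in the usage comments are assumptions, not values from this cluster.

```shell
# Sketch only: look up NAME in a hosts-format FILE, ignoring DNS entirely.
# Prints the first matching address; returns non-zero if there is no entry.
# lookup_in_hosts is a hypothetical helper, not part of Proxmox or corosync.
lookup_in_hosts() {
    name=$1
    file=$2
    awk -v n="$name" '
        { sub(/#.*/, "") }   # drop comments before matching
        {
            # field 1 is the address, the remaining fields are names/aliases
            for (i = 2; i <= NF; i++)
                if ($i == n) { print $1; found = 1; exit }
        }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Assumed entries in /etc/hosts on both nodes (addresses are placeholders):
#   192.0.2.11  shpx01.example.lan  shpx01
#   192.0.2.12  shpx02.example.lan  shpx02
#
# Example check on a node:
#   lookup_in_hosts shpx01 /etc/hosts || echo "shpx01 missing from /etc/hosts"
```

On a live system, `getent hosts shpx01` is a quick complementary check, but it goes through the normal resolver order, so it can still succeed via DNS when the hosts entry is missing.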
 
Hi manu, thanks for looking into this. The nameserver is hosted on a VM as well, so it was unavailable at the time. I'll add node info to /etc/hosts on both nodes and try the cold boot procedure again this weekend. But if this was the problem, why did my fix make the cluster work? My laptop did not forward any traffic; I simply assigned the gateway IP to eth0, nothing more.
 
