Cluster Fails pmxcfs error

hk135

Nov 3, 2014
Hi There

Just wondering if anyone has run into this error or knows what it means:

pmxcfs: crit: cpg_send_message failed: 12

All nodes except the local node are shown in red. The VMs are still running, but I am unable to migrate them or administer the cluster effectively.

proxmox-ve-2.6.32: 3.3-147 (running kernel: 2.6.32-37-pve)
pve-manager: 3.4-1 (running version: 3.4-1/3f2d890e)
pve-kernel-2.6.32-37-pve: 2.6.32-147
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-2
pve-cluster: 3.0-16
qemu-server: 3.3-20
pve-firmware: 1.1-3
libpve-common-perl: 3.0-24
libpve-access-control: 3.0-16
libpve-storage-perl: 3.0-31
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.1-12
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1

Node  Sts  Inc    Joined               Name
   1  M    36504  2015-12-10 16:56:02  proxmox1-dh4
   2  M    36508  2015-12-10 16:56:02  pve6-dh4
   3  M    36504  2015-12-10 16:56:02  proxmox-cibse-dh4
   4  M    36504  2015-12-10 16:56:02  proxmox2-dh4
   5  M    36516  2015-12-10 16:56:12  proxmox-db1-dh4
   6  M    36516  2015-12-10 16:56:12  proxmox-db2-dh4
   7  M    36516  2015-12-10 16:56:12  proxmox3-dh4
   8  M    36504  2015-12-10 16:56:02  proxmox4-dh4
   9  M    36504  2015-12-10 16:56:02  proxmox5-dh4
  10  M    36504  2015-12-10 16:56:02  pve-aci-dh4
  11  M    36504  2015-12-10 16:56:02  pve-ielts-dh4
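
For a cluster this size it can be easier to check the membership listing mechanically than by eye. The following is a minimal sketch, assuming the listing has the column layout shown above and that `M` in the `Sts` column marks a joined member. The function name `check_members` is hypothetical, and the sample rows are copied from the listing.

```shell
#!/bin/sh
# Hypothetical helper: read a node listing on stdin and report every
# node whose status column (field 2) is not "M".
check_members() {
    awk 'NR > 1 && NF >= 2 && $2 != "M" { bad = 1; print "not a member: " $NF }
         END { exit (bad + 0) }'
}

# Sample input: three rows taken from the listing above.
if check_members <<'EOF'
Node Sts Inc Joined Name
1 M 36504 2015-12-10 16:56:02 proxmox1-dh4
2 M 36508 2015-12-10 16:56:02 pve6-dh4
5 M 36516 2015-12-10 16:56:12 proxmox-db1-dh4
EOF
then
    echo "all listed nodes are members"
fi
```

On a live node the same function could read the output of whatever command produced the listing instead of the here-document.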

Thanks in advance for any help, I am banging my head against a wall with this!!
 
Try restarting pve-manager:

> service pve-manager restart

Also update your system; you are running old packages.
 
I have already run apt-get update, apt-get upgrade and apt-get dist-upgrade until there are no more upgrades.

Just wondering if there is more info on the error message? What does it mean, where is it all going so terribly terribly wrong?
 
Hi There

I don't think this is a corosync communications error, as I have quorum and omping is fine; restarting cman has no effect. I also have corosync running on other servers on the same network (but with different multicast addresses), and they run fine.

Through trial and error I have determined a fix for this:
1) Stop pve-cluster on all hypervisors.
2) Wait a few minutes.
3) Start pve-cluster on each hypervisor in turn, waiting for the init script to finish before moving on to the next one.

From what I can tell pmxcfs (Proxmox Cluster Filesystem) gets out of sync and needs to resync between all the nodes.
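
The three steps above could be scripted roughly as follows. This is only a sketch under assumptions that are not from the thread: root SSH access to every node, a shortened `NODES` list (replace it with all cluster members), a three-minute `WAIT`, and the hypothetical names `rolling_restart` and `RUN`. By default it only prints the commands; nothing is executed unless `RUN` is set to an empty string.

```shell
#!/bin/sh
# Assumed node list (subset of the listing above); override via the environment.
NODES="${NODES:-proxmox1-dh4 pve6-dh4 proxmox2-dh4}"
# Dry run by default: RUN=echo prints each command. Set RUN= to execute.
RUN="${RUN-echo}"
# "A few minutes" is taken here as 180 seconds (an assumption).
WAIT="${WAIT:-180}"

rolling_restart() {
    # 1) Stop pve-cluster on all hypervisors.
    for node in $NODES; do
        $RUN ssh "root@$node" service pve-cluster stop
    done
    # 2) Wait a few minutes.
    $RUN sleep "$WAIT"
    # 3) Start pve-cluster on one node at a time; ssh returns only after
    #    the remote init script has finished. Abort on the first failure.
    for node in $NODES; do
        $RUN ssh "root@$node" service pve-cluster start || return 1
    done
}

rolling_restart
```

One caveat: key-based root login between nodes may depend on files kept under /etc/pve, which is not available while pve-cluster is stopped, so the start commands may have to be run with password login or from each node's console instead.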
 
