HA cluster and issue

clubzero

New Member
May 9, 2016
Hello,

I installed a 3-node Proxmox 4.2 cluster to set up HA. But in the HA tab we do not see a master server; moreover, when I add a VM to HA, its status remains "queued". Finally, it is also impossible to start the watchdog service.

Thank you in advance
 
Please explain your problem in full detail, describe your cluster configuration.
 
Hello,

Here is the result of pveversion -v
Code:
root@proxmox1:~# pveversion -v
proxmox-ve: 4.2-49 (running kernel: 4.4.6-1-pve)
pve-manager: 4.2-4 (running version: 4.2-4/2660193c)
pve-kernel-4.4.6-1-pve: 4.4.6-48
pve-kernel-4.4.8-1-pve: 4.4.8-49
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 1.0-1
pve-cluster: 4.0-39
qemu-server: 4.0-74
pve-firmware: 1.1-8
libpve-common-perl: 4.0-60
libpve-access-control: 4.0-16
libpve-storage-perl: 4.0-50
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.5-15
pve-container: 1.0-63
pve-firewall: 2.0-26
pve-ha-manager: 1.0-31
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-7
lxcfs: 2.0.0-pve2
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve9~jessie
Here is the result of pvecm status
Code:
root@proxmox1:~# pvecm status
Quorum information
------------------
Date:  Tue May 10 10:25:34 2016
Quorum provider:  corosync_votequorum
Nodes:  3
Node ID:  0x00000002
Ring ID:  12
Quorate:  Yes

Votequorum information
----------------------
Expected votes:  3
Highest expected: 3
Total votes:  3
Quorum:  2  
Flags:  Quorate

Membership information
----------------------
  Nodeid  Votes Name
0x00000001  1 192.168.1.101 (local)
0x00000003  1 192.168.1.102
0x00000002  1 192.168.1.103
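
The "Quorum: 2" value above is consistent with a simple majority of the 3 expected votes, so quorum itself does not look like the problem here. A minimal sketch of that arithmetic, assuming default votequorum behaviour with no special options such as two_node; the function name majority_quorum is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: simple-majority quorum for a number of expected votes.
# Assumes default votequorum settings (no two_node or similar options).
majority_quorum() {
    expected_votes=$1
    # Integer division, then add one: strictly more than half of the votes.
    echo $(( expected_votes / 2 + 1 ))
}

# Three expected votes, as in the pvecm status output.
majority_quorum 3
```

With 3 expected votes this gives 2, so the cluster stays quorate as long as any two nodes can see each other.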

I saw that the pve-ha-crm service was failing on all nodes. In the log below, the node seems to have slave status, and no node appears as master.
Code:
root@proxmox1:~# systemctl status pve-ha-crm.service
● pve-ha-crm.service - PVE Cluster Ressource Manager Daemon
  Loaded: loaded (/lib/systemd/system/pve-ha-crm.service; enabled)
  Active: failed (Result: exit-code) since Tue 2016-05-10 10:20:05 CEST; 6min ago
  Process: 9709 ExecStop=/usr/sbin/pve-ha-crm stop (code=exited, status=0/SUCCESS)
 Main PID: 8776 (code=exited, status=255)

May 10 10:13:54 proxmox1 pve-ha-crm[8776]: ipcc_send_rec failed: Transport endpoint is not connected
May 10 10:13:54 proxmox1 pve-ha-crm[8776]: ipcc_send_rec failed: Connection refused
May 10 10:13:54 proxmox1 pve-ha-crm[8776]: ipcc_send_rec failed: Connection refused
May 10 10:18:09 proxmox1 pve-ha-crm[8776]: status change wait_for_quorum => slave
May 10 10:20:04 proxmox1 pve-ha-crm[8776]: successfully acquired lock 'ha_manager_lock'
May 10 10:20:04 proxmox1 pve-ha-crm[8776]: ERROR: unable to open watchdog socket - No such file or directory
May 10 10:20:04 proxmox1 pve-ha-crm[8776]: server received shutdown request
May 10 10:20:04 proxmox1 pve-ha-crm[8776]: server stopped
May 10 10:20:04 proxmox1 systemd[1]: pve-ha-crm.service: main process exited, code=exited, status=255/n/a
May 10 10:20:05 proxmox1 systemd[1]: Unit pve-ha-crm.service entered failed state.
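
The "unable to open watchdog socket" line suggests the socket that pve-ha-crm expects is absent. A rough, read-only check is sketched below. It assumes the socket lives at /run/watchdog-mux.sock and is created by watchdog-mux.service, which should be verified on the node; the helper name check_watchdog_socket is made up for illustration:

```shell
#!/bin/sh
# Assumption: watchdog-mux exposes its socket at this path on Proxmox VE 4.x.
WATCHDOG_SOCK="${WATCHDOG_SOCK:-/run/watchdog-mux.sock}"

# Hypothetical helper: report whether a UNIX socket exists at the given path.
check_watchdog_socket() {
    if [ -S "$1" ]; then
        echo "socket present: $1"
        return 0
    fi
    echo "socket missing: $1"
    return 1
}

if ! check_watchdog_socket "$WATCHDOG_SOCK"; then
    # Assumption: the socket is created by watchdog-mux.service, so inspect it next.
    # This only prints information and changes nothing on the node.
    if command -v systemctl >/dev/null 2>&1; then
        systemctl --no-pager status watchdog-mux.service || true
    fi
fi
```

If the socket is missing and the watchdog-mux unit is not active, that would match the error in the log. The script only reports state and does not try to start anything.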

Furthermore, the VM "Test" enabled in HA always has the status "queued", and there is no trace of the master.
 
