[SOLVED] LXC container won't start with HA enabled

pizza

I have two Proxmox 4.1 nodes in a cluster with NFS storage. KVM guests with HA enabled (software watchdog) work great, but an LXC container with HA enabled won't start. The task reports OK, but the container stays stopped.
When I disable HA, the LXC container starts.

Feb 12 09:11:08 pve1 pct[18020]: <root@pam> starting task UPID:pve1:0000466A:003644C9:56BD939C:hastart:103:root@pam:
Feb 12 09:11:09 pve1 pct[18020]: <root@pam> end task UPID:pve1:0000466A:003644C9:56BD939C:hastart:103:root@pam: OK
 
On one node the CRM was not running. I started the service, but after a while it stopped again.

root@pve1:/var/log# ha-manager status
quorum OK
master pve (active, Fri Feb 12 10:16:50 2016)
lrm pve (active, Fri Feb 12 10:16:45 2016)
lrm pve1 (old timestamp - dead?, Fri Feb 12 10:14:35 2016)
service ct:103 (pve1, freeze)
service ct:104 (pve, started)
service ct:105 (pve, error)
service vm:100 (pve1, freeze)
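
To make the problem entries in the status output above easier to spot, here is a rough Python sketch that scans the text for LRMs that are not "active" and services that are not "started". It only assumes the line layout shown in this post; the function name `find_problems` and the rule "anything other than active/started is a problem" are my own simplifications, not something ha-manager provides.

```python
import re

# Sample copied from the `ha-manager status` output in this post.
STATUS = """\
quorum OK
master pve (active, Fri Feb 12 10:16:50 2016)
lrm pve (active, Fri Feb 12 10:16:45 2016)
lrm pve1 (old timestamp - dead?, Fri Feb 12 10:14:35 2016)
service ct:103 (pve1, freeze)
service ct:104 (pve, started)
service ct:105 (pve, error)
service vm:100 (pve1, freeze)
"""

# Matches "lrm <node> (<state>, <timestamp>)"; the state itself may contain spaces.
LRM_RE = re.compile(r"^lrm (\S+) \((.+), ([^,]+)\)$")
# Matches "service <sid> (<node>, <state>)".
SERVICE_RE = re.compile(r"^service (\S+) \((\S+), (\S+)\)$")


def find_problems(status_text):
    """Return (inactive_lrms, unhealthy_services) found in the status text.

    Hypothetical helper: treats every LRM state other than "active" and
    every service state other than "started" as worth a closer look.
    """
    inactive_lrms = []
    unhealthy_services = []
    for line in status_text.splitlines():
        line = line.strip()
        m = LRM_RE.match(line)
        if m:
            node, state, _timestamp = m.groups()
            if state != "active":
                inactive_lrms.append((node, state))
            continue
        m = SERVICE_RE.match(line)
        if m:
            sid, node, state = m.groups()
            if state != "started":
                unhealthy_services.append((sid, node, state))
    return inactive_lrms, unhealthy_services


if __name__ == "__main__":
    lrms, services = find_problems(STATUS)
    for node, state in lrms:
        print(f"LRM on {node}: {state}")
    for sid, node, state in services:
        print(f"{sid} on {node}: {state}")
```

With the sample above, this flags the LRM on pve1 and the services ct:103, ct:105 and vm:100, which matches what I see: everything on pve1 is frozen because its LRM keeps dying.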

syslog:
Feb 12 10:19:21 pve1 pveproxy[1266]: worker 26356 started
Feb 12 10:19:28 pve1 pve-ha-lrm[26314]: successfully acquired lock 'ha_agent_pve1_lock'
Feb 12 10:19:28 pve1 pve-ha-lrm[26314]: ERROR: unable to open watchdog socket - No such file or directory
Feb 12 10:19:28 pve1 pve-ha-lrm[26314]: restart LRM, freeze all services
Feb 12 10:19:28 pve1 pve-ha-lrm[26314]: server stopped
Feb 12 10:19:28 pve1 systemd[1]: pve-ha-lrm.service: main process exited, code=exited, status=255/n/a
Feb 12 10:19:29 pve1 systemd[1]: Unit pve-ha-lrm.service entered failed state.
Feb 12 10:19:36 pve1 pveproxy[23742]: worker exit
Feb 12 10:19:36 pve1 pveproxy[1266]: worker 23742 finished
Feb 12 10:19:36 pve1 pveproxy[1266]: starting 1 worker(s)
Feb 12 10:19:36 pve1 pveproxy[1266]: worker 26481 started