master : old timestamp - dead ?

Hi all, I have a 3-node cluster (Proxmox VE 4.1) with an active subscription. On this cluster I had HA configured for one VM. I deleted this HA configuration because of a network maintenance, but now, when I go to the HA panel, I see this message on the "master" line:

[screenshot: proxmox.png — the HA status page showing "old timestamp - dead?" on the master line]


Any ideas about this issue?

kind regards,

--
Christophe Casalegno
http://www.christophe-casalegno.com
 
There must be something if the pve-ha-crm service is running...
What is the output of the following (on netcloud2)?

# systemctl status pve-ha-crm.service
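
If you want to check all three nodes at once, something like this works too (a sketch; netcloud1 and netcloud3 are placeholder names for the other two nodes):

# for n in netcloud1 netcloud2 netcloud3; do ssh $n systemctl is-active pve-ha-crm pve-ha-lrm; done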
 
root@netcloud2:~# systemctl status pve-ha-crm.service
● pve-ha-crm.service - PVE Cluster Ressource Manager Daemon
   Loaded: loaded (/lib/systemd/system/pve-ha-crm.service; enabled)
   Active: active (running) since Sun 2016-03-13 11:15:56 CET; 6h ago
  Process: 1991 ExecStart=/usr/sbin/pve-ha-crm start (code=exited, status=0/SUCCESS)
 Main PID: 1993 (pve-ha-crm)
   CGroup: /system.slice/pve-ha-crm.service
           └─1993 pve-ha-crm

Mar 13 11:15:56 netcloud2 pve-ha-crm[1993]: starting server
Mar 13 11:15:56 netcloud2 pve-ha-crm[1993]: status change startup => wait_for_quorum
Mar 13 11:15:56 netcloud2 systemd[1]: Started PVE Cluster Ressource Manager Daemon.
 
Three days without sleeping... I haven't verified what this status looks like when HA isn't enabled; I only have HA clusters on this version.

I am confused now - do you have a problem or not? If you just want to check normal cluster status, use:

# pvecm status
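
A few related read-only checks that often help alongside it (all standard tools on a PVE node):

# pvecm nodes
# corosync-quorumtool -s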
 
Dear Dietmar, it is not a problem: I just wanted to know whether this is "normal", and you answered yes, so no problem. I know about pvecm status; I manage a lot of Proxmox clusters, but on v4 I only have HA clusters in production with HA VMs (KVM), so I had never seen this false "problem" before. I'll add this to my notes for the future. Thanks for everything :)

kind regards
 
We recently had this same "problem"; glad to see it was only an issue in the status display. We couldn't figure it out and even went as far as starting to look for alternatives.
 
Hello, I searched the forum for a solution and didn't find one, but while working on this same bug on my own single cluster I found the fix myself.

I removed the file /etc/pve/ha/manager_status and the system is OK again.
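
For anyone repeating this later, a slightly more cautious version of the same reset (a sketch, assuming the standard PVE layout; stop the HA services on every node first and keep a backup copy):

# systemctl stop pve-ha-lrm pve-ha-crm            # on every node
# cp /etc/pve/ha/manager_status /root/manager_status.bak
# rm /etc/pve/ha/manager_status
# systemctl start pve-ha-crm pve-ha-lrm           # on every node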
 
Hi,

I am also having this issue, though mine might be different. Let me explain.

I needed to change the IPs in corosync.conf to get them off the same network as everything else (it was causing other issues). After updating the corosync file, I can't get quorum working again.
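
For reference, the documented workflow for an address change looks roughly like this (a sketch; it assumes the cluster still has quorum while you edit, and config_version in the totem section must be incremented so the change propagates):

# cp /etc/pve/corosync.conf /root/corosync.conf.new
# nano /root/corosync.conf.new        # update the ring0_addr entries and bump config_version
# cp /root/corosync.conf.new /etc/pve/corosync.conf
# systemctl restart corosync          # on each node, if it does not pick up the change on its own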

Running systemctl status pve-ha-crm.service on all servers brings back the same output:

systemctl status pve-ha-crm.service
● pve-ha-crm.service - PVE Cluster Ressource Manager Daemon
   Loaded: loaded (/lib/systemd/system/pve-ha-crm.service; enabled; vendor prese
   Active: active (running) since Sat 2018-10-27 02:33:52 EDT; 8h ago
 Main PID: 3009 (pve-ha-crm)
    Tasks: 1 (limit: 4915)
   Memory: 79.8M
      CPU: 3.176s
   CGroup: /system.slice/pve-ha-crm.service
           └─3009 pve-ha-crm

Oct 27 02:33:51 pve1 systemd[1]: Starting PVE Cluster Ressource Manager Daemon..
Oct 27 02:33:52 pve1 pve-ha-crm[3009]: starting server
Oct 27 02:33:52 pve1 pve-ha-crm[3009]: status change startup => wait_for_quorum
Oct 27 02:33:52 pve1 systemd[1]: Started PVE Cluster Ressource Manager Daemon.

Could someone please point me in the right direction to get this back up? Yes, all the (new) IPs are pingable from every server.
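
One caveat: ping only proves unicast works. Corosync 2.x (the default on PVE 4/5) uses multicast by default, and the new network has to carry that too. The documented test is omping, started at roughly the same time on every node (pve2 and pve3 are placeholder names for the other nodes):

# omping -c 10000 -i 0.001 -F -q pve1 pve2 pve3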
 
I think this is a clue but I don't know what it means.

root@pve1:~# tail /var/log/syslog
Oct 27 11:05:05 pve1 pvesr[130012]: trying to acquire cfs lock 'file-replication_cfg' ...
Oct 27 11:05:06 pve1 pvesr[130012]: trying to acquire cfs lock 'file-replication_cfg' ...
Oct 27 11:05:07 pve1 pvesr[130012]: trying to acquire cfs lock 'file-replication_cfg' ...
Oct 27 11:05:08 pve1 pvesr[130012]: trying to acquire cfs lock 'file-replication_cfg' ...
Oct 27 11:05:09 pve1 pvesr[130012]: trying to acquire cfs lock 'file-replication_cfg' ...
Oct 27 11:05:10 pve1 pvesr[130012]: error with cfs lock 'file-replication_cfg': no quorum!
Oct 27 11:05:10 pve1 systemd[1]: pvesr.service: Main process exited, code=exited, status=13/n/a
Oct 27 11:05:10 pve1 systemd[1]: Failed to start Proxmox VE replication runner.
Oct 27 11:05:10 pve1 systemd[1]: pvesr.service: Unit entered failed state.
Oct 27 11:05:10 pve1 systemd[1]: pvesr.service: Failed with result 'exit-code'.
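
Those pvesr failures are a symptom rather than the cause: anything that needs a cluster-wide lock fails while quorum is lost. To narrow it down, these are the usual read-only checks, plus one emergency override (only use the last one temporarily, on a single node):

# pvecm status                 # expected vs. actual votes
# corosync-cfgtool -s          # ring status as corosync sees it
# journalctl -u corosync -b    # totem / membership errors since boot
# pvecm expected 1             # emergency only: lets one node regain quorum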
 
