Hi,
we have been facing issues with one node for a few days now.
It shows a red X beside the node name. When I restart corosync, it works again for about a minute, then it falls back into the red X state.
When that happens, pvesr.service fails:
systemctl status pvesr
● pvesr.service - Proxmox VE replication runner
Loaded: loaded (/lib/systemd/system/pvesr.service; static; vendor preset: enabled)
Active: failed (Result: exit-code) since Tue 2018-07-17 11:44:09 CEST; 6s ago
Process: 7757 ExecStart=/usr/bin/pvesr run --mail 1 (code=exited, status=13)
Main PID: 7757 (code=exited, status=13)
CPU: 516ms
Jul 17 11:44:04 captive005-74001 pvesr[7757]: trying to aquire cfs lock 'file-replication_cfg' ...
Jul 17 11:44:05 captive005-74001 pvesr[7757]: trying to aquire cfs lock 'file-replication_cfg' ...
Jul 17 11:44:06 captive005-74001 pvesr[7757]: trying to aquire cfs lock 'file-replication_cfg' ...
Jul 17 11:44:07 captive005-74001 pvesr[7757]: trying to aquire cfs lock 'file-replication_cfg' ...
Jul 17 11:44:08 captive005-74001 pvesr[7757]: trying to aquire cfs lock 'file-replication_cfg' ...
Jul 17 11:44:09 captive005-74001 pvesr[7757]: error with cfs lock 'file-replication_cfg': no quorum!
Jul 17 11:44:09 captive005-74001 systemd[1]: pvesr.service: Main process exited, code=exited, status=13/n/a
Jul 17 11:44:09 captive005-74001 systemd[1]: Failed to start Proxmox VE replication runner.
Jul 17 11:44:09 captive005-74001 systemd[1]: pvesr.service: Unit entered failed state.
Jul 17 11:44:09 captive005-74001 systemd[1]: pvesr.service: Failed with result 'exit-code'.
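If it helps, I can gather more state with the standard PVE/corosync tools while the node is in the red X state, e.g.:

# show quorum and membership as Proxmox sees it
pvecm status

# show quorum state directly from corosync
corosync-quorumtool -s

# follow the corosync log while the node drops out
journalctl -u corosync -f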
Any ideas what could be causing this?
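If the cluster network is a suspect, I can also run the multicast test from the Proxmox docs simultaneously on all nodes (node1/node2/node3 below are placeholders for our actual hostnames):

# short multicast stress test (~10000 packets at 1 ms interval)
omping -c 10000 -i 0.001 -F -q node1 node2 node3

# longer test (~10 minutes) to check that multicast keeps working past the querier timeout
omping -c 600 -i 1 -q node1 node2 node3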