Hi
Today I got this error on all 7 nodes of the cluster.
Each node is up, but it sees only itself as up and all the other nodes as down.
Error:
Nov 03 17:46:43 kvm38 pvesr[5245]: trying to acquire cfs lock 'file-replication_cfg' ...
Nov 03 17:46:44 kvm38 pvesr[5245]: trying to acquire cfs lock 'file-replication_cfg' ...
Nov 03 17:46:45 kvm38 pvesr[5245]: trying to acquire cfs lock 'file-replication_cfg' ...
Nov 03 17:46:46 kvm38 pvesr[5245]: trying to acquire cfs lock 'file-replication_cfg' ...
Nov 03 17:46:47 kvm38 pvesr[5245]: trying to acquire cfs lock 'file-replication_cfg' ...
Nov 03 17:46:48 kvm38 pvesr[5245]: trying to acquire cfs lock 'file-replication_cfg' ...
Nov 03 17:46:49 kvm38 pvesr[5245]: error during cfs-locked 'file-replication_cfg' operation: no quorum!
Nov 03 17:46:49 kvm38 systemd[1]: pvesr.service: Main process exited, code=exited, status=13/n/a
Nov 03 17:46:49 kvm38 systemd[1]: pvesr.service: Failed with result 'exit-code'.
Nov 03 17:46:49 kvm38 systemd[1]: Failed to start Proxmox VE replication runner.
There are also messages from corosync, like the ones below:
Nov 3 17:58:20 kvm38 corosync[4717]: [KNET ] rx: host: 4 link: 0 is up
Nov 3 17:58:21 kvm38 corosync[4717]: [KNET ] rx: host: 2 link: 0 is up
Nov 3 17:58:24 kvm38 corosync[4717]: [KNET ] rx: host: 7 link: 0 is up
Nov 3 17:58:24 kvm38 corosync[4717]: [KNET ] link: host: 2 link: 0 is down
Nov 3 17:58:25 kvm38 corosync[4717]: [KNET ] link: host: 4 link: 0 is down
Nov 3 17:58:26 kvm38 corosync[4717]: [KNET ] link: host: 8 link: 0 is down
Nov 3 17:58:27 kvm38 corosync[4717]: [KNET ] link: host: 3 link: 0 is down
Nov 3 17:58:28 kvm38 corosync[4717]: [KNET ] link: host: 7 link: 0 is down
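To get an idea of how often each link flaps, I used a quick sketch like the one below to count the up/down transitions per host in the corosync journal (assuming the `journalctl -u corosync` line format shown above; the sample lines here are just a few from my log):

```python
import re
from collections import Counter

# Sample corosync/KNET lines (as produced by `journalctl -u corosync`);
# in practice, read them from the real journal output instead.
log = """\
Nov  3 17:58:20 kvm38 corosync[4717]: [KNET  ] rx: host: 4 link: 0 is up
Nov  3 17:58:21 kvm38 corosync[4717]: [KNET  ] rx: host: 2 link: 0 is up
Nov  3 17:58:24 kvm38 corosync[4717]: [KNET  ] link: host: 2 link: 0 is down
Nov  3 17:58:25 kvm38 corosync[4717]: [KNET  ] link: host: 4 link: 0 is down
"""

# Match "host: <id> link: <id> is up/down" regardless of the rx/link prefix.
pattern = re.compile(r"host: (\d+) link: (\d+) is (up|down)")

counts = Counter()
for line in log.splitlines():
    m = pattern.search(line)
    if m:
        host, link, state = m.groups()
        counts[(int(host), int(link), state)] += 1

# Print transitions per host/link, e.g. "host 2 link 0: down x1".
for (host, link, state), n in sorted(counts.items()):
    print(f"host {host} link {link}: {state} x{n}")
```

In my case every host shows repeated up/down pairs within seconds of each other, which is what makes me think the links themselves are not physically flapping.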
Why would these links go up and down on the hosts?
Physically they are up on both the hosts and the switches; there is no flapping.
All VMs are running and there is no network issue for them. People are working on 100+ VMs, but cluster management is down.
Management and the VMs are connected to the same switches.
Please advise.