unable to acquire pmxcfs lock - trying again

Aug 6, 2019
[main] notice: unable to acquire pmxcfs lock - trying again
[main] crit: unable to acquire pmxcfs lock: Resource temporarily unavailable
[main] notice: exit proxmox configuration filesystem (-1)

What am I supposed to do with that?

I only have GUI access to pve01; pve02 and pve03 are only accessible via SSH.

Attached: report files for nodes pve01, pve02 and pve03.
 

Attachments

  • pve01.txt (88.8 KB)
  • pve02.txt (86.9 KB)
  • pve03.txt (88.3 KB)

The corosync cluster is down. Check your cluster network(s) and restart the cluster service:

Code:
systemctl restart pve-cluster.service
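
The logs quoted in this thread show two different symptoms, and they seem to point at different causes: "cpg_initialize failed: 2" suggests pmxcfs is running but cannot reach the local corosync daemon, while "unable to acquire pmxcfs lock" suggests something else already holds the pmxcfs lock. A rough sketch for telling them apart from journal text; the function name classify_pmxcfs_log and the labels it prints are made up for illustration, not part of any Proxmox tool:

```shell
#!/bin/sh
# classify_pmxcfs_log: hypothetical helper. Reads journal text on stdin and
# prints a one-word guess at which failure mode it shows.
classify_pmxcfs_log() {
    log=$(cat)
    case "$log" in
        *"unable to acquire pmxcfs lock"*)
            # pmxcfs could not take its lock; often another pmxcfs
            # instance is still running on the node.
            echo "lock-held" ;;
        *"cpg_initialize failed"*|*"quorum_initialize failed"*)
            # pmxcfs started but cannot talk to the local corosync daemon.
            echo "corosync-unreachable" ;;
        *)
            echo "unknown" ;;
    esac
}
```

On a node you would feed it something like the output of journalctl -b -u pve-cluster --no-pager. If it reports corosync-unreachable, looking at systemctl status corosync first is probably more useful than restarting pve-cluster again.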
 
NODE 1

root@pve01:~# systemctl restart pve-cluster.service

root@pve01:~# systemctl status pve-cluster.service

  • pve-cluster.service - The Proxmox VE cluster filesystem
Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
Active: active (running) since Mon 2019-08-12 09:44:34 EDT; 13s ago
Process: 2145714 ExecStart=/usr/bin/pmxcfs (code=exited, status=0/SUCCESS)
Process: 2145727 ExecStartPost=/usr/bin/pvecm updatecerts --silent (code=exited, status=0/SUCCESS)
Main PID: 2145719 (pmxcfs)
Tasks: 6 (limit: 7372)
Memory: 19.2M
CGroup: /system.slice/pve-cluster.service
└─2145719 /usr/bin/pmxcfs

Aug 12 09:44:33 pve01 pmxcfs[2145719]: [status] crit: can't initialize service
Aug 12 09:44:34 pve01 systemd[1]: Started The Proxmox VE cluster filesystem.
Aug 12 09:44:39 pve01 pmxcfs[2145719]: [quorum] crit: quorum_initialize failed: 2
Aug 12 09:44:39 pve01 pmxcfs[2145719]: [confdb] crit: cmap_initialize failed: 2
Aug 12 09:44:39 pve01 pmxcfs[2145719]: [dcdb] crit: cpg_initialize failed: 2
Aug 12 09:44:39 pve01 pmxcfs[2145719]: [status] crit: cpg_initialize failed: 2
Aug 12 09:44:45 pve01 pmxcfs[2145719]: [quorum] crit: quorum_initialize failed: 2
Aug 12 09:44:45 pve01 pmxcfs[2145719]: [confdb] crit: cmap_initialize failed: 2
Aug 12 09:44:45 pve01 pmxcfs[2145719]: [dcdb] crit: cpg_initialize failed: 2
Aug 12 09:44:45 pve01 pmxcfs[2145719]: [status] crit: cpg_initialize failed: 2

-----------------------------------

NODE 2

root@pve02:~# systemctl restart pve-cluster.service
Job for pve-cluster.service failed because the control process exited with error code.
See "systemctl status pve-cluster.service" and "journalctl -xe" for details.

root@pve02:~# systemctl status pve-cluster.service

● pve-cluster.service - The Proxmox VE cluster filesystem
Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2019-08-12 09:52:53 EDT; 33s ago
Process: 217828 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)

Aug 12 09:52:53 pve02 systemd[1]: pve-cluster.service: Service RestartSec=100ms expired, scheduling restart.
Aug 12 09:52:53 pve02 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
Aug 12 09:52:53 pve02 systemd[1]: Stopped The Proxmox VE cluster filesystem.
Aug 12 09:52:53 pve02 systemd[1]: pve-cluster.service: Start request repeated too quickly.
Aug 12 09:52:53 pve02 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Aug 12 09:52:53 pve02 systemd[1]: Failed to start The Proxmox VE cluster filesystem.

root@pve02:~# journalctl -xe

-- Subject: A start job for unit pvesr.service has begun execution
-- Defined-By: systemd
--
-- A start job for unit pvesr.service has begun execution.
--
-- The job identifier is 613367.
Aug 12 09:56:00 pve02 pvesr[221621]: ipcc_send_rec[1] failed: Connection refused
Aug 12 09:56:00 pve02 pvesr[221621]: ipcc_send_rec[2] failed: Connection refused
Aug 12 09:56:00 pve02 pvesr[221621]: ipcc_send_rec[3] failed: Connection refused
Aug 12 09:56:00 pve02 pvesr[221621]: Unable to load access control list: Connection refused
Aug 12 09:56:00 pve02 systemd[1]: pvesr.service: Main process exited, code=exited, status=111/n/a
-- Subject: Unit process exited
-- Defined-By: systemd
--
-- An ExecStart= process belonging to unit pvesr.service has exited.
--
-- The process' exit code is 'exited' and its exit status is 111.
Aug 12 09:56:00 pve02 systemd[1]: pvesr.service: Failed with result 'exit-code'.
-- Subject: Unit failed
-- Defined-By: systemd
--
-- The unit pvesr.service has entered the 'failed' state with result 'exit-code'.
Aug 12 09:56:00 pve02 systemd[1]: Failed to start Proxmox VE replication runner.
-- Subject: A start job for unit pvesr.service has failed
-- Defined-By: systemd
--
-- A start job for unit pvesr.service has finished with a failure.
--
-- The job identifier is 613367 and the job result is failed.
Aug 12 09:56:01 pve02 cron[2651]: (*system*vzdump) CAN'T OPEN SYMLINK (/etc/cron.d/vzdump)
Aug 12 09:56:04 pve02 pveproxy[221618]: worker exit
Aug 12 09:56:04 pve02 pveproxy[221619]: worker exit
Aug 12 09:56:04 pve02 pveproxy[221620]: worker exit
Aug 12 09:56:04 pve02 pveproxy[1008347]: worker 221619 finished
Aug 12 09:56:04 pve02 pveproxy[1008347]: worker 221620 finished
Aug 12 09:56:04 pve02 pveproxy[1008347]: worker 221618 finished
Aug 12 09:56:04 pve02 pveproxy[1008347]: starting 3 worker(s)
Aug 12 09:56:04 pve02 pveproxy[1008347]: worker 221717 started
Aug 12 09:56:04 pve02 pveproxy[1008347]: worker 221718 started
Aug 12 09:56:04 pve02 pveproxy[1008347]: worker 221719 started
Aug 12 09:56:04 pve02 pveproxy[221717]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1688.
Aug 12 09:56:04 pve02 pveproxy[221718]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1688.
Aug 12 09:56:04 pve02 pveproxy[221719]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1688.
Aug 12 09:56:09 pve02 pveproxy[221717]: worker exit
Aug 12 09:56:09 pve02 pveproxy[221718]: worker exit
Aug 12 09:56:09 pve02 pveproxy[221719]: worker exit
Aug 12 09:56:09 pve02 pveproxy[1008347]: worker 221719 finished
Aug 12 09:56:09 pve02 pveproxy[1008347]: worker 221718 finished
Aug 12 09:56:09 pve02 pveproxy[1008347]: worker 221717 finished
Aug 12 09:56:09 pve02 pveproxy[1008347]: starting 3 worker(s)
Aug 12 09:56:09 pve02 pveproxy[1008347]: worker 221820 started
Aug 12 09:56:09 pve02 pveproxy[1008347]: worker 221821 started
Aug 12 09:56:09 pve02 pveproxy[1008347]: worker 221822 started
Aug 12 09:56:09 pve02 pveproxy[221820]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1688.
Aug 12 09:56:09 pve02 pveproxy[221821]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1688.
Aug 12 09:56:09 pve02 pveproxy[221822]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1688.
-----------------------------------

NODE 3

root@pve03:~# systemctl restart pve-cluster.service

root@pve03:~# systemctl status pve-cluster.service


pve-cluster.service - The Proxmox VE cluster filesystem
Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
Active: active (running) since Mon 2019-08-12 10:00:49 EDT; 39s ago
Process: 1211022 ExecStart=/usr/bin/pmxcfs (code=exited, status=0/SUCCESS)
Process: 1211124 ExecStartPost=/usr/bin/pvecm updatecerts --silent (code=exited, status=0/SUCCESS)
Main PID: 1211028 (pmxcfs)
Tasks: 6 (limit: 7372)
Memory: 15.9M
CGroup: /system.slice/pve-cluster.service
└─1211028 /usr/bin/pmxcfs

Aug 12 10:01:12 pve03 pmxcfs[1211028]: [dcdb] crit: cpg_initialize failed: 2
Aug 12 10:01:12 pve03 pmxcfs[1211028]: [status] crit: cpg_initialize failed: 2
Aug 12 10:01:18 pve03 pmxcfs[1211028]: [quorum] crit: quorum_initialize failed: 2
Aug 12 10:01:18 pve03 pmxcfs[1211028]: [confdb] crit: cmap_initialize failed: 2
Aug 12 10:01:18 pve03 pmxcfs[1211028]: [dcdb] crit: cpg_initialize failed: 2
Aug 12 10:01:18 pve03 pmxcfs[1211028]: [status] crit: cpg_initialize failed: 2
Aug 12 10:01:24 pve03 pmxcfs[1211028]: [quorum] crit: quorum_initialize failed: 2
Aug 12 10:01:24 pve03 pmxcfs[1211028]: [confdb] crit: cmap_initialize failed: 2
Aug 12 10:01:24 pve03 pmxcfs[1211028]: [dcdb] crit: cpg_initialize failed: 2
Aug 12 10:01:24 pve03 pmxcfs[1211028]: [status] crit: cpg_initialize failed: 2
 
The status output does not show enough of the log; please use journalctl to give us more log output.
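
When pasting journal output, the explanatory lines that journalctl -x adds (prefixed with "--" on older systemd, "░░" on newer) tend to bury the actual log entries. A small filter sketch for trimming them; strip_catalog is a made-up helper name, not an existing tool:

```shell
#!/bin/sh
# strip_catalog: hypothetical helper. Reads journal text on stdin and drops
# the catalog/explanation lines added by `journalctl -x`, keeping only the
# timestamped log entries.
strip_catalog() {
    grep -v -e '^--' -e '^░░'
}
```

For example, journalctl -xe --no-pager | strip_catalog. Leaving out -x and selecting the units directly, as in journalctl -b -u pve-cluster -u corosync --no-pager, should give a similarly compact result.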
 
-- Subject: Unit process exited
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- An ExecStart= process belonging to unit pvesr.service has exited.
--
-- The process' exit code is 'exited' and its exit status is 111.
Nov 13 16:42:00 srv0 systemd[1]: pvesr.service: Failed with result 'exit-code'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- The unit pvesr.service has entered the 'failed' state with result 'exit-code'.
Nov 13 16:42:00 srv0 systemd[1]: Failed to start Proxmox VE replication runner.
-- Subject: A start job for unit pvesr.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A start job for unit pvesr.service has finished with a failure.
--
-- The job identifier is 10448689 and the job result is failed.
Nov 13 16:42:01 srv0 cron[2742]: (*system*vzdump) CAN'T OPEN SYMLINK (/etc/cron.d/vzdump)
Nov 13 16:42:02 srv0 pve-ha-lrm[2787]: updating service status from manager failed: Connection refused
Nov 13 16:42:02 srv0 pveproxy[42268]: worker exit
Nov 13 16:42:02 srv0 pveproxy[2781]: worker 42268 finished
Nov 13 16:42:02 srv0 pveproxy[2781]: starting 1 worker(s)
Nov 13 16:42:02 srv0 pveproxy[2781]: worker 42383 started
Nov 13 16:42:02 srv0 pveproxy[42383]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1688.
Nov 13 16:42:03 srv0 pveproxy[42270]: worker exit
Nov 13 16:42:03 srv0 pveproxy[2781]: worker 42270 finished
Nov 13 16:42:03 srv0 pveproxy[2781]: starting 1 worker(s)
Nov 13 16:42:03 srv0 pveproxy[2781]: worker 42384 started
Nov 13 16:42:03 srv0 pveproxy[42384]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1688.
Nov 13 16:42:06 srv0 pmxcfs[42265]: [main] crit: unable to acquire pmxcfs lock: Resource temporarily unavailable
Nov 13 16:42:06 srv0 pmxcfs[42265]: [main] crit: unable to acquire pmxcfs lock: Resource temporarily unavailable
Nov 13 16:42:06 srv0 pmxcfs[42265]: [main] notice: exit proxmox configuration filesystem (-1)
Nov 13 16:42:06 srv0 pmxcfs[42265]: [main] notice: exit proxmox configuration filesystem (-1)
Nov 13 16:42:06 srv0 systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
-- Subject: Unit process exited
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- An ExecStart= process belonging to unit pve-cluster.service has exited.

-- The process' exit code is 'exited' and its exit status is 255.
Nov 13 16:42:06 srv0 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- The unit pve-cluster.service has entered the 'failed' state with result 'exit-code'.
Nov 13 16:42:06 srv0 systemd[1]: Failed to start The Proxmox VE cluster filesystem.
-- Subject: A start job for unit pve-cluster.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A start job for unit pve-cluster.service has finished with a failure.
--
-- The job identifier is 10448632 and the job result is failed.
Nov 13 16:42:06 srv0 systemd[1]: Condition check resulted in Corosync Cluster Engine being skipped.
-- Subject: A start job for unit corosync.service has finished successfully
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A start job for unit corosync.service has finished successfully.
--
-- The job identifier is 10448635.
Nov 13 16:42:06 srv0 systemd[1]: pve-cluster.service: Service RestartSec=100ms expired, scheduling restart.
Nov 13 16:42:06 srv0 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 3.
-- Subject: Automatic restarting of a unit has been scheduled
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Automatic restarting of the unit pve-cluster.service has been scheduled, as the result for
-- the configured Restart= setting for the unit.
Nov 13 16:42:06 srv0 systemd[1]: Stopped The Proxmox VE cluster filesystem.
-- Subject: A stop job for unit pve-cluster.service has finished
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A stop job for unit pve-cluster.service has finished.
--
-- The job identifier is 10448799 and the job result is done.
Nov 13 16:42:06 srv0 systemd[1]: Starting The Proxmox VE cluster filesystem...
-- Subject: A start job for unit pve-cluster.service has begun execution
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A start job for unit pve-cluster.service has begun execution.
--
-- The job identifier is 10448799.
Nov 13 16:42:06 srv0 pmxcfs[42479]: [main] notice: unable to acquire pmxcfs lock - trying again
Nov 13 16:42:06 srv0 pmxcfs[42479]: [main] notice: unable to acquire pmxcfs lock - trying again
Nov 13 16:42:06 srv0 pve-firewall[2753]: status update error: Connection refused
Nov 13 16:42:07 srv0 pve-ha-lrm[2787]: updating service status from manager failed: Connection refused

Nov 13 16:42:07 srv0 pveproxy[42383]: worker exit
Nov 13 16:42:07 srv0 pveproxy[2781]: worker 42383 finished
Nov 13 16:42:07 srv0 pveproxy[2781]: starting 1 worker(s)
Nov 13 16:42:07 srv0 pveproxy[2781]: worker 42482 started
Nov 13 16:42:07 srv0 pveproxy[42482]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1688.
 
systemctl stop pve-cluster
systemctl stop corosync
pmxcfs -l
rm /etc/pve/corosync.conf
rm -r /etc/corosync/*
killall pmxcfs
systemctl start pve-cluster


This is what I did beforehand.
 
root@srv0:~# systemctl status pve-cluster.service
● pve-cluster.service - The Proxmox VE cluster filesystem
Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
Active: activating (start) since Fri 2020-11-13 16:47:24 WIB; 9s ago
Cntrl PID: 49011 (pmxcfs)
Tasks: 1 (limit: 7372)
Memory: 1.2M
CGroup: /system.slice/pve-cluster.service
└─49011 /usr/bin/pmxcfs

Nov 13 16:47:24 srv0 systemd[1]: Starting The Proxmox VE cluster filesystem...
Nov 13 16:47:24 srv0 pmxcfs[49011]: [main] notice: unable to acquire pmxcfs lock - trying again
Nov 13 16:47:24 srv0 pmxcfs[49011]: [main] notice: unable to acquire pmxcfs lock - trying again
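
"Resource temporarily unavailable" is the error a non-blocking lock attempt returns when another process already holds the lock, so one thing worth ruling out is a second pmxcfs still running outside systemd's control (for example one started by hand with pmxcfs -l). A sketch that lists such processes from ps output; stray_pmxcfs is a made-up name, and that pmxcfs takes its lock this way is an assumption on my part:

```shell
#!/bin/sh
# stray_pmxcfs: hypothetical helper. Reads "PID COMMAND" pairs on stdin (the
# format printed by `ps -eo pid=,comm=`) and prints the PID of every pmxcfs
# process except the one passed as $1 (the PID systemd shows as "Cntrl PID").
stray_pmxcfs() {
    awk -v keep="$1" '$2 == "pmxcfs" && $1 != keep { print $1 }'
}
```

With the status output above that would be ps -eo pid=,comm= | stray_pmxcfs 49011. Any PID it prints is a pmxcfs instance that systemd did not start for the current attempt.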
 
Hello,
I have almost the same problem.
I had one Proxmox machine (proxmox1). Then I installed a second one (proxmox2) and created a cluster on the first machine.
When I try to add the second machine to this cluster, it does not work, neither via the GUI nor via the command line.

In the GUI it gives me "Connection error 401: permission denied - invalid PVE ticket" and kills the GUI; the web server is no longer accessible.
proxmox1:
pvecm status
Cluster information
-------------------
Name: dellcluster
Config Version: 6
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Fri Jul 8 18:05:58 2022
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000001
Ring ID: 1.a
Quorate: No

Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 1
Quorum: 2 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 192.168.1.40 (local)

====================================================================================================
proxmox2:

pvecm status
Cluster information
-------------------
Name: dellcluster
Config Version: 6
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Fri Jul 8 18:15:16 2022
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000002
Ring ID: 2.a
Quorate: No

Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 1
Quorum: 2 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000002 1 192.168.1.30 (local)
-----------------------------------------------------------------------------------------------------------
systemctl status pve-cluster.service
● pve-cluster.service - The Proxmox VE cluster filesystem
Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
Active: activating (start) since Fri 2022-07-08 17:56:28 CEST; 2s ago
Cntrl PID: 68372 (pmxcfs)
Tasks: 1 (limit: 18984)
Memory: 1.5M
CPU: 7ms
CGroup: /system.slice/pve-cluster.service
└─68372 /usr/bin/pmxcfs

Jul 08 17:56:28 proxmox2 systemd[1]: Starting The Proxmox VE cluster filesystem...
Jul 08 17:56:28 proxmox2 pmxcfs[68372]: [main] notice: unable to acquire pmxcfs lock - trying again
Jul 08 17:56:28 proxmox2 pmxcfs[68372]: [main] notice: unable to acquire pmxcfs lock - trying again
----------------------------------------------------------------------------------------------------------
journalctl -xe
Jul 08 17:56:38 proxmox2 pmxcfs[68394]: [main] notice: unable to acquire pmxcfs lock - trying again
Jul 08 17:56:38 proxmox2 pmxcfs[68394]: [main] notice: unable to acquire pmxcfs lock - trying again
Jul 08 17:56:48 proxmox2 pmxcfs[68394]: [main] crit: unable to acquire pmxcfs lock: Resource temporarily unavailable
Jul 08 17:56:48 proxmox2 pmxcfs[68394]: [main] crit: unable to acquire pmxcfs lock: Resource temporarily unavailable
Jul 08 17:56:48 proxmox2 pmxcfs[68394]: [main] notice: exit proxmox configuration filesystem (-1)
Jul 08 17:56:48 proxmox2 pmxcfs[68394]: [main] notice: exit proxmox configuration filesystem (-1)
Jul 08 17:56:48 proxmox2 systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ An ExecStart= process belonging to unit pve-cluster.service has exited.
░░
░░ The process' exit code is 'exited' and its exit status is 255.
Jul 08 17:56:48 proxmox2 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ The unit pve-cluster.service has entered the 'failed' state with result 'exit-code'.
Jul 08 17:56:48 proxmox2 systemd[1]: Failed to start The Proxmox VE cluster filesystem.
░░ Subject: A start job for unit pve-cluster.service has failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A start job for unit pve-cluster.service has finished with a failure.
░░
░░ The job identifier is 2125 and the job result is failed.
Jul 08 17:56:48 proxmox2 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 8.
░░ Subject: Automatic restarting of a unit has been scheduled
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ Automatic restarting of the unit pve-cluster.service has been scheduled, as the result for
░░ the configured Restart= setting for the unit.
Jul 08 17:56:48 proxmox2 systemd[1]: Stopped The Proxmox VE cluster filesystem.
░░ Subject: A stop job for unit pve-cluster.service has finished
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A stop job for unit pve-cluster.service has finished.
░░
░░ The job identifier is 2215 and the job result is done.
Jul 08 17:56:48 proxmox2 systemd[1]: Starting The Proxmox VE cluster filesystem...
░░ Subject: A start job for unit pve-cluster.service has begun execution
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A start job for unit pve-cluster.service has begun execution.
░░
░░ The job identifier is 2215.
Jul 08 17:56:48 proxmox2 pmxcfs[68414]: [main] notice: unable to acquire pmxcfs lock - trying again
Jul 08 17:56:48 proxmox2 pmxcfs[68414]: [main] notice: unable to acquire pmxcfs lock - trying again
------------------------------------------------------------------------------------------------------------------------
Stopping the service, deleting the lockfile and starting the service again, i.e.
systemctl stop pve-cluster
(delete /var/lib/pve-cluster/.pmxcfs.lockfile)
systemctl start pve-cluster
does not change anything. The lockfile reappears, and the node is still not working.
All containers are stopped and cannot be started: "cluster not ready - no quorum? (500)"
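
The "no quorum" part follows from the pvecm status output above: each node sees only itself, so it has 1 of 2 expected votes, and a majority of 2 is 2. A minimal sketch of that arithmetic, assuming the usual majority rule (half the expected votes, rounded down, plus one); the function names are made up for illustration:

```shell
#!/bin/sh
# quorum_needed: votes required for a majority of the expected votes ($1).
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# is_quorate: succeeds (exit status 0) if the total votes ($2) reach the
# majority of the expected votes ($1), and fails otherwise.
is_quorate() {
    [ "$2" -ge "$(quorum_needed "$1")" ]
}
```

With the values from the output, is_quorate 2 1 fails, which matches "Quorate: No" and "Activity blocked". In a two-node cluster both nodes have to see each other before either one becomes quorate.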
------------------------
This problem is very common: clusters go mad and fail miserably with these same errors. I tried the suggestions from other related posts, and nothing helps.
Why is this cluster feature so weak and unreliable? Nobody would want this solution for PRODUCTION if it fails so easily and cannot be repaired.
In my case the cluster cannot even be built in the first place! Many people have asked for this whole cluster portion of the code to be rewritten, and with good reason.
Does anyone have a working solution, please?
 
Hello,
That was when I started building the cluster. I had various errors. Then I upgraded Proxmox to the latest version and those errors stopped showing up. Around July 2022 I succeeded in building the cluster. The problem was presumably my LAN: I had three switches between the nodes, and cluster communication (corosync, I think) failed across them. As soon as I moved the second node to the same switch as the first node, the cluster started working.
The switch configuration was basic, with no apparent filtering and multicast enabled, as far as I know. I never found the root cause of my network-related problem.
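
Corosync is sensitive to latency and packet loss between nodes, so in a setup like this one cheap comparison is the round-trip time between the nodes across the switches versus on the same switch. A small sketch that pulls the average out of ping's summary line, assuming the summary format printed by the common Linux ping; avg_rtt is a made-up name:

```shell
#!/bin/sh
# avg_rtt: hypothetical helper. Reads ping output on stdin and prints the
# average round-trip time (in ms) from the final summary line, e.g.
# "rtt min/avg/max/mdev = 0.321/0.402/0.512/0.071 ms".
avg_rtt() {
    awk -F'/' '/^(rtt|round-trip) / { print $5 }'
}
```

For example, ping -c 5 192.168.1.30 | avg_rtt run from proxmox1, once in each cabling layout. This only measures plain ICMP round trips, so it cannot prove the corosync links themselves are healthy, but a large difference between the two layouts would at least point at the switches.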
 