Hi,
I have a cluster with 3 nodes: PVE-01, PVE-02 and PVE-03.
Yesterday PVE-02 was shut down for maintenance, and when it was brought back into operation it was no longer recognized by the cluster.
Here is the information I have gathered so far:
- root@pve-02:~# pveversion
pve-manager/6.0-8/b6b80da7 (running kernel: 5.0.21-2-pve)
- /etc/pve is in read-only mode
- I can ping all the nodes by IP and by name
- I can ssh from any node into any other node
- root@pve-02:~# pvecm status
Quorum information
------------------
Date: Thu Oct 17 09:32:58 2019
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000002
Ring ID: 2/688
Quorate: No
Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 1
Quorum: 2 Activity blocked
Flags:
Membership information
----------------------
Nodeid Votes Name
0x00000002 1 xx.xx.xx.242 (local)
- The lock file /var/lib/pve-cluster/.pmxcfs.lockfile is present
- The pve-cluster service reports errors:
root@pve-02:~# service pve-cluster status
● pve-cluster.service - The Proxmox VE cluster filesystem
Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor pres
Active: active (running) since Thu 2019-10-17 09:30:57 -03; 6min ago
Process: 4264 ExecStart=/usr/bin/pmxcfs (code=exited, status=0/SUCCESS)
Process: 4389 ExecStartPost=/usr/bin/pvecm updatecerts --silent (code=exited,
Main PID: 4267 (pmxcfs)
Tasks: 6 (limit: 4915)
Memory: 35.5M
CGroup: /system.slice/pve-cluster.service
└─4267 /usr/bin/pmxcfs
Oct 17 09:30:55 pve-02 pmxcfs[4267]: [dcdb] crit: cpg_initialize failed: 2
Oct 17 09:30:55 pve-02 pmxcfs[4267]: [dcdb] crit: can't initialize service
Oct 17 09:30:55 pve-02 pmxcfs[4267]: [status] crit: cpg_initialize failed: 2
Oct 17 09:30:55 pve-02 pmxcfs[4267]: [status] crit: can't initialize service
Oct 17 09:30:57 pve-02 systemd[1]: Started The Proxmox VE cluster filesystem.
Oct 17 09:31:01 pve-02 pmxcfs[4267]: [status] notice: update cluster info (clust
Oct 17 09:31:01 pve-02 pmxcfs[4267]: [dcdb] notice: members: 2/4267
Oct 17 09:31:01 pve-02 pmxcfs[4267]: [dcdb] notice: all data is up to date
Oct 17 09:31:01 pve-02 pmxcfs[4267]: [status] notice: members: 2/4267
Oct 17 09:31:01 pve-02 pmxcfs[4267]: [status] notice: all data is up to date
- The file /etc/pve/corosync.conf is identical on all 3 nodes
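For what it's worth, the "Quorum: 2 Activity blocked" line in the pvecm status output matches the usual majority rule as I understand it: quorum is floor(expected votes / 2) + 1. Here is a minimal sketch of that arithmetic. The function names quorum_needed and is_quorate are my own and are not part of pvecm or corosync:

```shell
#!/bin/sh
# Hypothetical helpers (not part of pvecm or corosync): majority-rule arithmetic only.

# Votes needed for quorum given the expected vote count: floor(n / 2) + 1.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# Succeeds (exit 0) when the total votes reach the quorum for the expected votes.
is_quorate() {
    total=$1
    expected=$2
    [ "$total" -ge "$(quorum_needed "$expected")" ]
}

# Values taken from the pvecm status output above.
expected_votes=3
total_votes=1

echo "quorum needed: $(quorum_needed "$expected_votes")"
if is_quorate "$total_votes" "$expected_votes"; then
    echo "quorate"
else
    echo "not quorate"
fi
```

With 3 expected votes, PVE-02 has to see at least one other node to reach 2 votes. On its own it stays at 1, which is consistent with /etc/pve being read-only.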
What else can be done?
Thank you,
Ricardo Jorge