Hi,
I currently have three nodes:
- PVE01: 192.168.53.222 (v7.2)
- PVE02: 192.168.53.219 (v7.2)
- PVE03: 192.168.1.254 (v6.4, because existing third-party software on that machine requires Debian Buster. Not important, since the node only exists for quorum.)
The problem started when I tried to add PVE02 to the cluster via the web UI: the operation got stuck while calling the cluster API (no error shown). I stopped the operation after waiting 10 minutes, and since then I can't log in to PVE02 via the web UI (login failed) and I get lost-communication errors on PVE01 and PVE03 in the web UI.
I can still connect to all three via SSH, though.
I have VMs running on PVE01 that appear to still be running correctly, but I can't check because any command I run on PVE01 via SSH hangs without producing output.
On PVE02 I ran "pvecm status":
Code:
Cluster information
-------------------
Name:             clustername
Config Version:   3
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Mon Oct 24 09:15:06 2022
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000003
Ring ID:          3.15dc
Quorate:          No

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      1
Quorum:           2 Activity blocked
Flags:

Membership information
----------------------
    Nodeid      Votes Name
0x00000003          1 192.168.53.219 (local)
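As a rough sanity check of the numbers in that output: votequorum normally requires a strict majority of the expected votes, so 3 expected votes give a quorum of 2, and a node that only sees its own single vote is blocked. A minimal sketch of that arithmetic (the function names are made up for illustration, and it assumes the default majority rule with no special votequorum options set):

```python
def quorum_threshold(expected_votes: int) -> int:
    """Votes needed for a strict majority of expected_votes."""
    return expected_votes // 2 + 1


def is_quorate(total_votes: int, expected_votes: int) -> bool:
    """True if the votes currently visible reach the majority threshold."""
    return total_votes >= quorum_threshold(expected_votes)


# Values taken from the pvecm status output on PVE02.
expected, total = 3, 1
print(quorum_threshold(expected))   # 2
print(is_quorate(total, expected))  # False
```

With the same rule, PVE02 would become quorate as soon as it could see one more node.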
And the content of /etc/pve/corosync.conf on PVE02:
Code:
logging {
  debug: off
  to_syslog: yes
}
nodelist {
  node {
    name: pve01
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.53.222
  }
  node {
    name: pve02
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 192.168.53.219
  }
  node {
    name: sbc
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.1.254
  }
}
quorum {
  provider: corosync_votequorum
}
totem {
  cluster_name: clustername
  config_version: 3
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}
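One thing that stands out in this nodelist is that the three ring0_addr values are not all in the same address range. A small sketch to make that visible by grouping the nodes per subnet. The /24 prefix length is an assumption (the actual netmasks are not stated anywhere in this post), and the helper names are made up for illustration:

```python
import ipaddress
import re
from collections import defaultdict

# nodelist section copied from the corosync.conf shown in this post
NODELIST = """
nodelist {
  node {
    name: pve01
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.53.222
  }
  node {
    name: pve02
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 192.168.53.219
  }
  node {
    name: sbc
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.1.254
  }
}
"""


def parse_nodes(conf_text):
    """Return one dict of key/value pairs per 'node { ... }' block."""
    nodes = []
    for block in re.findall(r"\bnode\s*\{(.*?)\}", conf_text, flags=re.S):
        fields = dict(re.findall(r"^\s*(\w+):\s*(\S+)\s*$", block, flags=re.M))
        nodes.append(fields)
    return nodes


def group_by_subnet(nodes, prefix_len=24):
    """Group node names by the subnet of ring0_addr (prefix_len is assumed)."""
    groups = defaultdict(list)
    for node in nodes:
        net = ipaddress.ip_network(
            f"{node['ring0_addr']}/{prefix_len}", strict=False
        )
        groups[str(net)].append(node["name"])
    return dict(groups)


print(group_by_subnet(parse_nodes(NODELIST)))
```

Under the /24 assumption, pve01 and pve02 end up in one group and sbc in another. Whether that matters depends on the routing between the two networks, which is not shown here.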
Here are the system logs on PVE02:
Code:
Oct 24 09:17:01 pve02 cron[2196]: (*system*vzdump) CAN'T OPEN SYMLINK (/etc/cron.d/vzdump)
Oct 24 09:17:01 pve02 CRON[42481]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Oct 24 09:17:01 pve02 CRON[42482]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Oct 24 09:17:01 pve02 CRON[42481]: pam_unix(cron:session): session closed for user root
Oct 24 09:17:01 pve02 pvescheduler[42485]: jobs: cfs-lock 'file-jobs_cfg' error: no quorum!
Oct 24 09:17:01 pve02 pvescheduler[42484]: replication: cfs-lock 'file-replication_cfg' error: no quorum!
Oct 24 09:17:03 pve02 corosync[12423]: [QUORUM] Sync members[1]: 3
Oct 24 09:17:03 pve02 corosync[12423]: [TOTEM ] A new membership (3.1684) was formed. Members
Oct 24 09:17:03 pve02 corosync[12423]: [QUORUM] Members[1]: 3
Oct 24 09:17:03 pve02 corosync[12423]: [MAIN ] Completed service synchronization, ready to provide service.
And the output of "pvesh get /cluster/config/join --output-format json-pretty" on PVE02:
Code:
'/etc/pve/nodes/pve01/pve-ssl.pem' does not exist!
I realized that /etc/pve/nodes doesn't exist on PVE02 and that /etc/pve is write-protected.
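The write protection is probably related to the missing quorum shown above, since the cluster filesystem mounted at /etc/pve normally becomes read-only on a node without quorum. For anyone who wants to reproduce the check, here is a minimal sketch of how the two observations could be collected in one go (the helper name is made up for illustration, and it only reports what the OS says about the directory, nothing Proxmox-specific):

```python
import os


def describe_pve_dir(root="/etc/pve"):
    """Report whether root exists, has a nodes/ subdirectory, and is writable."""
    return {
        "exists": os.path.isdir(root),
        "nodes_dir_exists": os.path.isdir(os.path.join(root, "nodes")),
        # os.access reports False on a read-only mount, even when run as root
        "writable": os.access(root, os.W_OK),
    }


print(describe_pve_dir())
```

On PVE02 in its current state this should report that the directory exists, but with nodes_dir_exists and writable both False.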
What can I do to fix this?