Proxmox 9.2.3 - 2-Node Cluster Join: pve2 stays grey, pve-ssl.pem missing despite quorum

Taupsi

Jun 2, 2026
Hi everyone,

I’m trying to set up a 2-node Proxmox cluster and keep hitting the same problem every time.

Hardware:

• Node 1 (pve): Lenovo ThinkCentre M93p, i5-7500T, IP: 10.0.0.1
• Node 2 (pve2): Dell OptiPlex 3070 Mini, i5-9500T, IP: 10.0.0.2
• Both: Proxmox VE 9.2.3, Kernel 7.0.6-2-pve, fresh installations
• Both connected to the same switch, same subnet /24

What I did:

1. Fresh install on both nodes
2. Updated both to 9.2.3
3. pvecm create homelab on pve
4. pvecm add 10.0.0.1 on pve2

What happens:

• Join completes successfully ("waiting for quorum...OK")
• pvecm status on both nodes correctly shows 2 nodes, Quorate: Yes
• pve2 stays grey in the WebUI permanently
• WebUI shell of pve2 shows either "Host key verification failed" or "/etc/pve/local/pve-ssl.key: failed to load local private key"
• /etc/pve/nodes/pve2/priv/ is empty - no SSL certificates were generated
• pvecm updatecerts --force hangs on pve2 with "waiting for pmxcfs mount to appear and get quorate"

What I already tried:

• Multiple fresh reinstalls of both nodes
• Manually copying SSH keys between nodes
• pvecm updatecerts --force on both pve and pve2
• Cleaning up config.db
• Restarting corosync and pve-cluster services
• Used community post-install script (disables Corosync) - same result

Relevant logs on pve2 after join:
pveproxy: /etc/pve/local/pve-ssl.key: failed to load local private key
pvecm updatecerts: waiting for pmxcfs mount to appear and get quorate (hangs indefinitely)

Question: Why are the SSL certificates for pve2 not being generated automatically even though quorum is established? What needs to be done to get pve2 to show green in the WebUI?

Thanks in advance!
 
please post the full journal of both nodes covering a clean attempt to join the cluster, as well as all the task logs/CLI output. thanks!
 
yeah, this could very well be a switch or MTU or NIC issue..
 
The NIC on pve2 (Dell OptiPlex 3070 Mini) uses the r8169 driver. pve (Lenovo ThinkCentre M93p) uses e1000e. Both nodes are connected to the same simple home switch.

I noticed in the logs that pve sends all 44 inode updates successfully, but pve2 is stuck at “waiting for updates from leader”. The KNET link goes down repeatedly after the join.

I tried disabling GRO/GSO on pve2 with:
ethtool -K nic0 gro off gso off

Could the r8169 driver be causing UDP packet loss for Corosync? Is there a permanent fix?
 
you can try disabling offloading features on both ends if they are enabled, and try to do UDP stress testing with bigger packets in both directions and see if the link holds up..
 
Hi Fabian,


I disabled GRO/GSO/TSO on both nodes:


ethtool -K nic0 gro off gso off tso off


I also ran a UDP stress test with iperf3 between the nodes:


iperf3 -c 192.168.188.10 -u -b 100M -l 1400


Result: 0% packet loss over 84793 datagrams. The UDP link looks perfect.


But the cluster sync still fails with the same result:


[dcdb] notice: waiting for updates from leader
[KNET] link: host: 1 link: 0 is down


So disabling offloading features did not help. The iperf3 UDP test passes fine but Corosync knet still drops the connection after a few minutes.


pve NIC: e1000e (Intel, Lenovo ThinkCentre M93p)
pve2 NIC: r8169 (Realtek, Dell OptiPlex 3070 Mini)


Any other suggestions? Would replacing the NIC on pve2 with a USB Ethernet adapter (ASIX AX88179) be the right solution?


Thanks!
 
try bigger packets or let it auto-discover.. you can also check with `tcpdump` on the corosync ports whether what the first node sends ends up being received on the second node..
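
For anyone following along, the checks suggested here could look roughly like the sketch below. It assumes the interface name nic0 and the node addresses from earlier in the thread, a standard 1500-byte MTU, and that corosync runs on Proxmox's default UDP port range 5405-5412 (check /etc/pve/corosync.conf if the ports were changed). The helper names are made up for illustration and are not part of any Proxmox tooling.

```shell
# Largest ICMP payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes (IPv4 header) minus 8 bytes (ICMP header).
icmp_payload() { echo $(( $1 - 28 )); }

# Ping the peer with Don't Fragment set, so an MTU mismatch on the path
# shows up as an error instead of being hidden by fragmentation.
# usage: mtu_probe <peer-ip> [mtu]
mtu_probe() {
    ping -M do -c 5 -s "$(icmp_payload "${2:-1500}")" "$1"
}

# UDP stress test in both directions; -R makes the server send to the client.
# Requires "iperf3 -s" running on the peer.
# usage: udp_stress <peer-ip> [datagram-bytes]
udp_stress() {
    iperf3 -c "$1" -u -b 100M -l "${2:-1400}"
    iperf3 -c "$1" -u -b 100M -l "${2:-1400}" -R
}

# Show corosync traffic arriving on or leaving an interface (needs root).
# The port range is an assumption based on Proxmox defaults.
# usage: watch_corosync <iface>
watch_corosync() {
    tcpdump -ni "$1" udp portrange 5405-5412
}
```

Running watch_corosync nic0 on both nodes at the same time and comparing what pve sends with what pve2 receives is one way to see whether packets vanish on the wire; mtu_probe 10.0.0.1 from pve2 covers the MTU question.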
 
Problem solved! The issue was a faulty network cable on pve2. The NIC was downshifting from 1Gbps to 100Mbps and losing the connection repeatedly. After replacing the cable everything worked immediately - the cluster joined successfully and all certificates were generated correctly. Thank you for your help!
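
A quick way to spot this kind of downshift is to compare the negotiated link speed with what the hardware should reach and to look at how often the carrier has changed. The following is only a minimal sketch: it reads the standard Linux sysfs attributes speed and carrier_changes, the function name check_link is hypothetical, and the SYSFS_NET override exists only so the function can be tried without real hardware.

```shell
# Base directory for network interface attributes; overridable for testing.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Report negotiated speed and carrier flap count for one interface.
# usage: check_link <iface> <expected-speed-in-Mb/s>
check_link() {
    iface="$1"
    expected="$2"
    # Reading "speed" fails while the link is down, hence the fallback.
    speed=$(cat "$SYSFS_NET/$iface/speed" 2>/dev/null || echo unknown)
    flaps=$(cat "$SYSFS_NET/$iface/carrier_changes" 2>/dev/null || echo unknown)
    if [ "$speed" = "$expected" ]; then
        echo "$iface: OK speed=${speed}Mb/s carrier_changes=$flaps"
    else
        echo "$iface: DEGRADED speed=${speed}Mb/s (expected ${expected}) carrier_changes=$flaps"
        return 1
    fi
}
```

On pve2 this would be called as check_link nic0 1000. A speed of 100 on a gigabit NIC, or a carrier_changes value that keeps climbing, points at the cable, the switch port, or the NIC rather than at corosync itself; ethtool nic0 shows the same negotiated speed along with the advertised link modes.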
 