Hi all. I have two standalone servers running Proxmox 7.3 and I want to join them into a two-node cluster. My setup:
- server with hostname proxmox-1 at IP 192.168.88.70
- server with hostname proxmox-2 at IP 192.168.88.80
They communicate with each other through a router and can reach each other freely.
The problem starts when I create a cluster on proxmox-1 and add proxmox-2 to it via the UI. After some time, the UI on proxmox-1 stops responding (only SSH works) and the load average on proxmox-2 shoots up. On proxmox-1 I see the following in the logs, repeating over and over:
(The file pveproxy complains about in the log below, /etc/pve/nodes/proxmox-2/pve-ssl.pem, is indeed absent on proxmox-1 but present on proxmox-2.)
Bash:
proxmox@proxmox-1 ~ ❯ sudo journalctl -u corosync -u pveproxy -u pve-cluster -f
-- Journal begins at Fri 2022-12-02 12:47:10 MSK. --
Dec 07 18:03:46 proxmox-1 corosync[3395280]: [TOTEM ] Retransmit List: 14 15 1d 1e 25 2f 30 31 39 3f 47 4f 50 56 5f 72
Dec 07 18:03:46 proxmox-1 pveproxy[3396716]: '/etc/pve/nodes/proxmox-2/pve-ssl.pem' does not exist!
Dec 07 18:03:46 proxmox-1 corosync[3395280]: [TOTEM ] Retransmit List: 14 15 1d 1e 25 2f 30 31 39 3f 47 4f 50 56 5f 72
Dec 07 18:03:47 proxmox-1 corosync[3395280]: [TOTEM ] Retransmit List: 14 15 1d 1e 25 2f 30 31 39 3f 47 4f 50 56 5f 72
Dec 07 18:03:47 proxmox-1 corosync[3395280]: [TOTEM ] Retransmit List: 14 15 1d 1e 25 2f 30 31 39 3f 47 4f 50 56 5f 72
Dec 07 18:03:47 proxmox-1 corosync[3395280]: [TOTEM ] Retransmit List: 14 15 1d 1e 25 2f 30 31 39 3f 47 4f 50 56 5f 72
Dec 07 18:03:48 proxmox-1 corosync[3395280]: [TOTEM ] Retransmit List: 14 15 1d 1e 25 2f 30 31 39 3f 47 4f 50 56 5f 72 73
Dec 07 18:03:48 proxmox-1 corosync[3395280]: [TOTEM ] Retransmit List: 14 15 1d 1e 25 2f 30 31 39 3f 47 4f 50 56 5f 72
Dec 07 18:03:48 proxmox-1 corosync[3395280]: [TOTEM ] Retransmit List: 14 15 1d 1e 25 2f 30 31 39 3f 47 4f 50 56 5f 72
Dec 07 18:03:49 proxmox-1 corosync[3395280]: [TOTEM ] Retransmit List: 14 15 1d 1e 25 2f 30 31 39 3f 47 4f 50 56 5f 72 75
Dec 07 18:03:50 proxmox-1 corosync[3395280]: [TOTEM ] Retransmit List: 14 15 1d 1e 25 2f 30 31 39 3f 47 4f 50 56 5f 72 75
Dec 07 18:03:50 proxmox-1 pveproxy[3407038]: '/etc/pve/nodes/proxmox-2/pve-ssl.pem' does not exist!
Dec 07 18:03:50 proxmox-1 pveproxy[3407038]: '/etc/pve/nodes/proxmox-2/pve-ssl.pem' does not exist!
Dec 07 18:03:50 proxmox-1 pveproxy[3397286]: proxy detected vanished client connection
Dec 07 18:03:51 proxmox-1 corosync[3395280]: [TOTEM ] Retransmit List: 14 15 1d 1e 25 2f 30 31 39 3f 47 4f 50 56 5f 72
Dec 07 18:03:51 proxmox-1 corosync[3395280]: [TOTEM ] Retransmit List: 14 15 1d 1e 25 2f 30 31 39 3f 47 4f 50 56 5f 72
Dec 07 18:03:52 proxmox-1 corosync[3395280]: [TOTEM ] Retransmit List: 14 15 1d 1e 25 2f 30 31 39 3f 47 4f 50 56 5f 72
This continues until I remove the cluster on both nodes using this guide: https://pve.proxmox.com/wiki/Cluster_Manager#_remove_a_cluster_node. After that, everything returns to normal.
How can I debug this further?
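In case it helps with the diagnosis, here is a rough sketch of how the journal output above could be condensed into a few numbers. It assumes the exact line format shown in the log; `summarize_journal` is a hypothetical helper name of my own, not part of Proxmox or corosync. As far as I understand, sequence numbers that appear in every "Retransmit List" line are messages that are never getting through, so a large "always_retransmitted" count would suggest the traffic between the nodes is stuck rather than merely slow.

```shell
#!/usr/bin/env bash
# Summarize corosync/pveproxy journal lines read from stdin.
# Assumption: lines look like the journalctl output quoted above.
# summarize_journal is a hypothetical helper, not a Proxmox or corosync tool.
summarize_journal() {
    awk '
        /\[TOTEM \] Retransmit List:/ {
            retransmits++
            # Keep only the hex sequence numbers after "Retransmit List: ".
            sub(/.*Retransmit List: /, "")
            n = split($0, ids, " ")
            for (i = 1; i <= n; i++) seen[ids[i]]++
            next
        }
        # pveproxy complaining about a missing certificate file.
        /does not exist!/ { missing++ }
        END {
            printf "retransmit_lines=%d\n", retransmits
            printf "missing_file_lines=%d\n", missing
            # Count sequence numbers that appear in every retransmit line.
            stuck = 0
            for (id in seen) if (seen[id] == retransmits) stuck++
            printf "always_retransmitted=%d\n", stuck
        }
    '
}
```

It could be fed from the same units as above, for example: `sudo journalctl -u corosync -u pveproxy -u pve-cluster --since "10 minutes ago" --no-pager | summarize_journal`.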