Hi,
I have a Proxmox cluster that currently has 5 nodes.
I have been running this cluster since PVE 6 and have already joined and removed nodes several times. Everything worked fine until now.
I'm moving to a new datacenter and need to move the VMs from the machines in the old DC to the new one. I'd like to move them via live migration, which is why I bought new hardware, installed it in the new DC, and want to join it to the cluster before moving the VMs.
But now I'm struggling to add the new nodes to the cluster. I have tried multiple times with each node, and every time the join stalls at a point where I cannot find any error.
Here is the log from one of the joins:
Code:
Feb 22 11:34:01 root25 corosync[7875]: [QUORUM] Members[6]: 1 3 4 5 6 7
Feb 22 11:34:01 root25 corosync[7875]: [MAIN ] Completed service synchronization, ready to provide service.
Feb 22 11:34:01 root25 pmxcfs[7880]: [status] notice: cpg_send_message retried 1 times
Feb 22 11:34:01 root25 pmxcfs[7880]: [status] notice: dfsm_deliver_queue: queue length 3
Feb 22 11:34:01 root25 pmxcfs[7880]: [status] notice: members: 1/1269, 3/1201, 4/2706682, 5/1481356, 6/2876296, 7/7880
Feb 22 11:34:01 root25 pmxcfs[7880]: [status] notice: starting data syncronisation
Feb 22 11:34:01 root25 pmxcfs[7880]: [dcdb] notice: cpg_send_message retried 1 times
Feb 22 11:34:01 root25 pmxcfs[7880]: [dcdb] notice: received sync request (epoch 1/1269/00000018)
Feb 22 11:34:01 root25 pmxcfs[7880]: [dcdb] notice: received sync request (epoch 1/1269/00000019)
Feb 22 11:34:01 root25 pmxcfs[7880]: [status] notice: received sync request (epoch 1/1269/00000018)
Feb 22 11:34:01 root25 pmxcfs[7880]: [status] notice: received sync request (epoch 1/1269/00000019)
Feb 22 11:34:01 root25 corosync[7875]: [TOTEM ] Retransmit List: 3c 3f
Feb 22 11:34:05 root25 corosync[7875]: [KNET ] link: host: 1 link: 0 is down
Feb 22 11:34:05 root25 corosync[7875]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Feb 22 11:34:05 root25 corosync[7875]: [KNET ] host: host: 1 has no active links
Feb 22 11:34:06 root25 pve-ha-lrm[1676]: loop take too long (150 seconds)
Feb 22 11:34:06 root25 pve-ha-lrm[1676]: unable to write lrm status file - unable to open file '/etc/pve/nodes/root25/lrm_status.tmp.1676' - No such file or directory
Feb 22 11:34:08 root25 corosync[7875]: [KNET ] link: Resetting MTU for link 0 because host 1 joined
Feb 22 11:34:08 root25 corosync[7875]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Feb 22 11:34:08 root25 corosync[7875]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 22 11:34:11 root25 pve-ha-lrm[1676]: unable to write lrm status file - unable to open file '/etc/pve/nodes/root25/lrm_status.tmp.1676' - No such file or directory
Feb 22 11:34:15 root25 corosync[7875]: [TOTEM ] Retransmit List: 98
Feb 22 11:34:16 root25 pve-ha-lrm[1676]: unable to write lrm status file - unable to open file '/etc/pve/nodes/root25/lrm_status.tmp.1676' - No such file or directory
Feb 22 11:34:21 root25 pve-ha-lrm[1676]: unable to write lrm status file - unable to open file '/etc/pve/nodes/root25/lrm_status.tmp.1676' - No such file or directory
Feb 22 11:34:26 root25 pve-ha-lrm[1676]: unable to write lrm status file - unable to open file '/etc/pve/nodes/root25/lrm_status.tmp.1676' - No such file or directory
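In case it helps anyone reading along, here is a rough Python sketch for tallying the events in a log like the one above. The `PATTERNS` table and the `summarize` helper are hypothetical names of my own, and the regexes are written only against the lines quoted here, so treat them as assumptions and not as anything Proxmox or corosync ships.

```python
import re
from collections import Counter

# Hypothetical patterns, derived only from the log lines quoted in this post.
PATTERNS = {
    "knet_link_down": re.compile(r"\[KNET\s*\] link: host: \d+ link: \d+ is down"),
    "totem_retransmit": re.compile(r"\[TOTEM\s*\] Retransmit List:"),
    "lrm_write_failed": re.compile(r"unable to write lrm status file"),
    "pmtud_change": re.compile(r"pmtud: Global data MTU changed to: (\d+)"),
}


def summarize(lines):
    """Count how often each pattern occurs and collect the reported MTU values."""
    counts = Counter()
    mtus = []
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[name] += 1
                if name == "pmtud_change":
                    mtus.append(int(match.group(1)))
    return counts, mtus
```

Calling `summarize(open("join.log"))` on a saved copy of the journal (the file name is just an example) returns the counts plus the list of MTU values, which makes it easier to see whether link drops and retransmits cluster around the join.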
When I look into /etc/pve, it is almost empty:
Code:
root@root25:~# ls -l /etc/pve/
total 1
-rw-r----- 1 root www-data 923 Feb 22 11:31 corosync.conf
lrwxr-xr-x 1 root www-data 0 Jan 1 1970 local -> nodes/root25
lrwxr-xr-x 1 root www-data 0 Jan 1 1970 lxc -> nodes/root25/lxc
lrwxr-xr-x 1 root www-data 0 Jan 1 1970 openvz -> nodes/root25/openvz
drwx------ 2 root www-data 0 Feb 22 11:31 priv
lrwxr-xr-x 1 root www-data 0 Jan 1 1970 qemu-server -> nodes/root25/qemu-server
What else can I do?
Multicast is enabled between the DCs, which are connected over a 10G dark-fiber link. I'm confident this is not a network problem, because I have another cluster with the same setup and joining worked just fine there.
Any help is appreciated.
Thank you