Hi,
I had an 8-node Proxmox v3.4 cluster (named LPNHE-CLUSTER).
To migrate to v4, I did the following:
- migrate all VMs off nodes 1 and 2
- shut down nodes 1 and 2 and reinstall them with v4, using new names and IPs
- create a new cluster (with a name different from the old v3 cluster, i.e. the new name is LPNHE)
- move the VMs from the old cluster to the new one
and so on for the other nodes (roughly the commands sketched below).
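For reference, on each reinstalled node the cluster part was just the standard pvecm commands, and the VM moves were plain backup/restore. The names, VM ID, storage and dump path below are placeholders, this is only a sketch of what I did:
# on the first reinstalled node, create the new v4 cluster
>pvecm create LPNHE
# on every other reinstalled node, join it to the new cluster
>pvecm add ip-of-newnode1
# per VM: dump on the old cluster, restore on the new one (via shared backup storage)
>vzdump 100 --storage backup-storage
>qmrestore /path/to/vzdump-qemu-100.vma.lzo 100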
So far so good... but when I reached the last node of the old cluster, I had a surprise: as soon as I stopped it, the new cluster went down (more precisely, the nodes started to leave the new cluster and I lost quorum). All the nodes show in red in the web interface, except the one I am connected to.
If I power the last server of the old cluster back on, the new cluster recovers (corosync says the nodes joined again, quorum is regained and everything is OK).
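In case it helps, this is how I check the state on the node I can still reach when it happens (nothing custom, just the standard tools):
# quorum / membership as seen by Proxmox and corosync
>pvecm status
>corosync-quorumtool -s
# recent corosync / pmxcfs messages
>journalctl -u corosync -u pve-cluster -n 50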
I cannot figure out what is going on!
Any clue?
Thanks
F.
PS: Here are my configs.
### New cluster ###
>more /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: newnode1
    nodeid: 3
    quorum_votes: 1
    ring0_addr: newnode1
  }
  node {
    name: newnode2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: newnode2
  }
  ...
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: LPNHE
  config_version: 6
  ip_version: ipv4
  secauth: on
  version: 2
  interface {
    bindnetaddr: ip-from-newnode1
    ringnumber: 0
  }
}
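(The ring0_addr entries are just the hostnames; on my setup they resolve through /etc/hosts on every new node, roughly like this, with the addresses masked:)
>more /etc/hosts
xxx.xxx.xxx.1  newnode1
xxx.xxx.xxx.2  newnode2
...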
### Old cluster ###
>more /etc/pve/cluster.conf
<?xml version="1.0"?>
<cluster config_version="80" name="LPNHE-CLUSTER">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
  <quorumd votes="1" allow_kill="0" interval="1" label="proxmox_quorum_disk" tko="10"/>
  <totem token="154000"/>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="xxx.xxx.xxx.xxx" lanplus="1" login="XXXX" name="fencenode1" passwd="XXXX" power_wait="5"/>
    <fencedevice agent="fence_ipmilan" ipaddr="xxx.xxx.xxx.xxx" lanplus="1" login="XXXX" name="fencenode2" passwd="XXXX" power_wait="5"/>
    ...
  </fencedevices>
  <clusternodes>
    <clusternode name="oldnode1" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="fencenode1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="oldnode2" nodeid="3" votes="1">
      <fence>
        <method name="1">
          <device name="fencenode2"/>
        </method>
      </fence>
    </clusternode>
    ...
  </clusternodes>
  <rm/>
</cluster>