Hello,
I am trying to set up Two-Node HA ( https://pve.proxmox.com/wiki/Two-Node_High_Availability_Cluster ).
I have two identical machines (Dell R720 with iDRAC7), and I have set up a PVE cluster with them plus a quorum disk, served as an iSCSI target from a third machine.
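For completeness, the quorum disk itself is just that iSCSI LUN labelled with mkqdisk, roughly like this (/dev/sdc is a placeholder for the actual LUN; the label matches the quorumd entry in the cluster.conf further down):
Code:
mkqdisk -c /dev/sdc -l cluster_qdisk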
I have two VMs (100 and 101), both CentOS 6 (101 is actually a clone of 100), with which I ran these tests.
Everything seems to work fine: I can do live migration with no packet loss. However, if I manually fence a node or crash it on purpose, the VM running on the "broken" node does get moved to the operational node, but I get this in the logs:
Code:
Dec 22 02:01:11 rgmanager State change: proxmox2 DOWN
Dec 22 02:01:34 rgmanager Marking service:gfs2-2 as stopped: Restricted domain unavailable
Dec 22 02:01:34 rgmanager Starting stopped service pvevm:101
Dec 22 02:01:34 rgmanager [pvevm] VM 100 is running
Dec 22 02:01:35 rgmanager [pvevm] Move config for VM 101 to local node
Dec 22 02:01:36 rgmanager Service pvevm:101 started
==
Dec 22 02:01:11 fenced fencing node proxmox2
Dec 22 02:01:33 fenced fence proxmox2 success
==
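(By "manually fence" I mean forcing a fence from the healthy node. The commands below are only a sketch of that kind of test, reusing the credentials from the fencedevice entries in the cluster.conf further down; the exact invocation may differ slightly:)
Code:
fence_node proxmox2
# or hitting the iDRAC directly with the fence agent:
fence_ipmilan -a 192.168.162.91 -l fence -p 123456 -o reboot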
The setup consists of a DRBD session between the two nodes, on top of which I run GFS2 directly (no LVM involved).
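For reference, the DRBD resource is a plain dual-primary setup so that GFS2 can be mounted on both nodes at once. The snippet below is only a sketch of that part of the stack: the resource name, backing disk and replication addresses are placeholders, not a copy of my actual file.
Code:
resource r0 {
    protocol C;
    startup {
        become-primary-on both;
    }
    net {
        allow-two-primaries;
    }
    on proxmox1 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.1:7788;
        meta-disk internal;
    }
    on proxmox2 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.2:7788;
        meta-disk internal;
    }
}
The filesystem on top was created along these lines (two journals for the two nodes, lock_dlm, cluster name matching cluster.conf; the fs name "gfs2" is again a placeholder):
Code:
mkfs.gfs2 -p lock_dlm -t Cluster:gfs2 -j 2 /dev/drbd0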
I had a hard time getting this resource mounted at startup, and my cluster.conf now looks like this:
Code:
<?xml version="1.0"?>
<cluster config_version="39" name="Cluster">
  <cman expected_votes="3" keyfile="/var/lib/pve-cluster/corosync.authkey" two_node="1"/>
  <quorumd allow_kill="0" interval="1" label="cluster_qdisk" tko="10" votes="1"/>
  <totem token="1000"/>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="192.168.162.90" login="fence" name="proxmox1-drac" passwd="123456" secure="1"/>
    <fencedevice agent="fence_ipmilan" ipaddr="192.168.162.91" login="fence" name="proxmox2-drac" passwd="123456" secure="1"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="proxmox1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="proxmox1-drac"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="proxmox2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="proxmox2-drac"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <rm>
    <failoverdomains>
      <failoverdomain name="node1" nofailback="0" ordered="0" restricted="1">
        <failoverdomainnode name="proxmox1"/>
      </failoverdomain>
      <failoverdomain name="node2" nofailback="0" ordered="0" restricted="1">
        <failoverdomainnode name="proxmox2"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <clusterfs name="gfs2" mountpoint="/gfs2" device="/dev/drbd0" fstype="gfs2" force_unmount="1" options="noatime,nodiratime,noquota"/>
    </resources>
    <service autostart="1" name="gfs2-1" domain="node1" exclusive="0">
      <clusterfs ref="gfs2"/>
    </service>
    <service autostart="1" name="gfs2-2" domain="node2" exclusive="0">
      <clusterfs ref="gfs2"/>
    </service>
    <pvevm autostart="1" vmid="100"/>
    <pvevm autostart="1" vmid="101"/>
  </rm>
</cluster>
(All the mess with the failover domains and the two services was the only solution I found to get the cluster to mount drbd0.)
So the question is: why does the VM get restarted? As far as I can tell from rgmanager.log, it says it was stopped.
Thank you,
Teodor