Losing NFS share connection

jerim

Guest
I know this isn't necessarily the right group for this issue, but I wanted to cover all my bases and make sure this issue didn't have something to do with Proxmox.

I have set up a DRBD cluster that replicates a DRBD volume across two servers. Each server has its own IP address (10.89.99.31 and 10.89.99.32), and there is a floating (colocated) IP address (10.89.99.30) used for mounting the volume from other computers. I am using cman and pacemaker.

I can mount the volume with no issues using 10.89.99.30. But when I unplug the network cable from my second storage server, the client loses connectivity to the NFS share, even though the IP address appears to have failed over. I can still ping 10.89.99.30, but I can't find the share using

Code:
showmount -e 10.89.99.30

I am just wondering if there is anything in the way Proxmox handles NFS shares. I don't understand NFS thoroughly, but I know that NFS file handles can become stale, and I don't know whether Proxmox can work with a floating IP address. I understand there would probably be a delay of a few minutes while the IP switches over to the other server, but I was expecting the share to come back up on its own after it migrated.
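For anyone hitting the same symptom, a quick way to tell whether only the address failed over (and not the NFS daemons) is to probe the RPC services behind the floating IP. A hedged sketch, using the addresses from this post; run it from a client after pulling the cable:

```shell
# Probe the floating IP after failover (addresses from the post above).
ping -c 1 10.89.99.30       # the address itself answers once it has moved
showmount -e 10.89.99.30    # asks mountd via rpcbind; hangs or fails if the
                            # NFS daemons did not come up on the new node
rpcinfo -p 10.89.99.30      # lists registered RPC programs; 'nfs' and
                            # 'mountd' should appear on the active node
```

If ping works but rpcinfo shows no nfs/mountd registrations, the IP moved without the NFS services, which would explain the showmount failure.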

I wanted to attach my cluster.conf file, but ran into some problems so I am pasting it here. I am just looking for a push in the right direction.

Code:
<?xml version="1.0"?>
<cluster name="Cluster" config_version="6">
    <cman keyfile="/var/lib/pve-cluster/corosync.authkey">
    </cman>
    <fencedevices>
        <fencedevice agent="fence_ipmilan" name="NODE1" lanplus="1" ipaddr="10.89.99.51" login="root" passwd="password" power_wait="5"/>
        <fencedevice agent="fence_ipmilan" name="NODE2" lanplus="1" ipaddr="10.89.99.50" login="root" passwd="password" power_wait="5"/>
        <fencedevice agent="fence_ipmilan" name="NODE3" lanplus="1" ipaddr="10.89.99.49" login="root" passwd="password" power_wait="5"/>
        <fencedevice agent="fence_ipmilan" name="NODE4" lanplus="1" ipaddr="10.89.99.48" login="root" passwd="password" power_wait="5"/>
    </fencedevices>


    <clusternodes>
    <clusternode name="NODE1" votes="1" nodeid="1">
        <fence>
            <method name="1">
                 <device name="NODE1"/>
            </method>
        </fence>
    </clusternode>
    <clusternode name="NODE2" votes="1" nodeid="2">
        <fence>
            <method name="1">
                 <device name="NODE2"/>
            </method>
        </fence>
    </clusternode>
    <clusternode name="NODE3" votes="1" nodeid="3">
        <fence>
            <method name="1">
                 <device name="NODE3"/>
            </method>
        </fence>
    </clusternode>
    <clusternode name="NODE4" votes="1" nodeid="4">
        <fence>
            <method name="1">
                 <device name="NODE4"/>
            </method>
        </fence>
    </clusternode>


</clusternodes>
<rm>
    <service autostart="1" exclusive="0" name="ha_test_ip" recovery="relocate">
        <ip address="10.89.99.30"/>
    </service>
    <pvevm autostart="1" vmid="100"/>
    <pvevm autostart="1" vmid="101"/>
    <pvevm autostart="1" vmid="102"/>
    <pvevm autostart="1" vmid="103"/>
    <pvevm autostart="1" vmid="104"/>
    <pvevm autostart="1" vmid="105"/>
    <failoverdomains>
      <failoverdomain name="Failover" nofailback="1" ordered="1" restricted="1">
        <failoverdomainnode name="NODE1" priority="1"/>
        <failoverdomainnode name="NODE2" priority="10"/>
        <failoverdomainnode name="NODE3" priority="20"/>
        <failoverdomainnode name="NODE4" priority="30"/>
      </failoverdomain>
    </failoverdomains>
</rm>
</cluster>
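One detail worth checking in the config above: the Failover domain is defined but the ha_test_ip service never references it, so rgmanager is free to relocate the IP to any node rather than following the NODE1..NODE4 priority order. If the ordered list is intended, rgmanager binds a service to a domain with the domain attribute (a sketch, to be verified against the rgmanager docs):

Code:
```
<service autostart="1" exclusive="0" name="ha_test_ip"
         domain="Failover" recovery="relocate">
    <ip address="10.89.99.30"/>
</service>
```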
 
I've set up HA NFS servers using DRBD in the past, but not in Proxmox.

I remember that /var/lib/nfs needed to be symlinked to a directory on the filesystem on top of DRBD, so connection state, file locks, etc. would also be replicated.
Without that, things will not work as expected on failover.
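The relocation described above can be sketched like this. The mount point and service name are assumptions (/mnt/drbd0 stands in for the filesystem on top of DRBD, and the init script name varies by distro); run it on the node currently holding the DRBD primary role:

```shell
# Move NFS state onto the DRBD-backed filesystem so locks and client
# state follow a failover. Paths/service name are assumptions -- adjust.
DRBD_MNT=/mnt/drbd0

service nfs-kernel-server stop          # Debian name; 'service nfs stop' on RHEL
mkdir -p "$DRBD_MNT/varlibnfs"
cp -a /var/lib/nfs/. "$DRBD_MNT/varlibnfs/"
mv /var/lib/nfs /var/lib/nfs.orig       # keep the original as a backup
ln -s "$DRBD_MNT/varlibnfs" /var/lib/nfs
service nfs-kernel-server start
```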

This old article seems to be similar to how I had my systems setup:
http://www.linux4beginners.info/node/redundant-DRBD-NFS-Heartbeat-free

It was written for Heartbeat, but it seems to explain things well enough that you can adapt it to cman/pacemaker.
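For the pacemaker side, the usual shape is a DRBD master/slave resource with the filesystem, NFS server, and floating IP grouped and colocated on the DRBD master. A hedged sketch in crm shell syntax (resource names, device, and mount point are assumptions, not from the post):

Code:
```
primitive p_drbd ocf:linbit:drbd \
    params drbd_resource="r0" \
    op monitor interval="30s"
ms ms_drbd p_drbd \
    meta master-max="1" clone-max="2" notify="true"
primitive p_fs ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/mnt/drbd0" fstype="ext4"
primitive p_nfs lsb:nfs-kernel-server
primitive p_ip ocf:heartbeat:IPaddr2 \
    params ip="10.89.99.30" cidr_netmask="24"
group g_nfs p_fs p_nfs p_ip
colocation c_nfs_on_drbd inf: g_nfs ms_drbd:Master
order o_drbd_first inf: ms_drbd:promote g_nfs:start
```

The ordering matters: if the IP comes up before mountd, clients can reach the address but get no export list, which is exactly the showmount symptom described in the question.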
 
Re: Losing NFS share connection [SOLVED]

I went back through the DRBD setup using the "Clusters from Scratch" document and found a few things I missed the first time around. I got those corrected and it works now.
 
