Q: ProxVE 'cluster' and NFS Storage Target - not mounting after reboot

fortechitsolutions

Hi,

I'm not sure if this will sound familiar or not to anyone. I've got a ProxVE cluster setup with 3 hardware nodes (1 master / 2 slave). All 3 are identical machines (Dell pe2950, hardware raid, HA trunked dual gig ether interfaces - consistent config on all 3 machines).

I have a "NFS storage pool" defined which mounts an NFS share from a local NFS server. This mounts fine on all 3 hardware nodes.

The mount is present on the hardware nodes so that some OpenVZ-based VMs I've got, which need to be able to mount a share off this NFS server, work properly; my review of the docs suggested that keeping the NFS share mounted on the ProxVE hardware node is the easiest way to ensure the relevant modules are loaded and working smoothly, which in turn allows the NFS mounts to work inside the OpenVZ VMs.
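
For reference, a minimal sketch of what such an NFS storage definition typically looks like in /etc/pve/storage.cfg (the storage ID, server address, export path and content types here are placeholders, and the exact options can differ between PVE versions):

Code:
# /etc/pve/storage.cfg -- hypothetical NFS storage entry
nfs: shared-nfs
        path /mnt/pve/shared-nfs
        server 192.168.1.50
        export /srv/nfs/proxmox
        content images,vztmpl,backup
        options vers=3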

The problem:

I had to reboot the "Master" hardware node in the ProxVE cluster last night. It came back online OK; however, I didn't notice until this morning that the NFS mount on the ProxVE hardware node simply didn't mount on reboot.

Via the web interface, I was able to choose "Enable" - and pouf, it was online.

Subsequently, I was able to stop and start all the VMs (one per hardware node) which wanted to make NFS mounts. And after doing so -- they had their NFS mounts fine as well.

However, ideally this should happen on its own and not require any sysadmin intervention.

I'm just curious whether there are known issues or circumstances where NFS 'storage' will refuse to mount when the ProxVE hardware node(s) are rebooted, and whether there are any known workarounds.


Thanks,


Tim Chipman

Fortech IT Solutions
http://FortechITSolutions.ca
 
any hint in the syslog or dmesg?
 
Hi,

I don't think there is really anything of much use. These are the last ~50 lines of dmesg, showing:

- bond interfaces coming up
- various OpenVZ VMs coming online
- OpenVZ VMs being stopped
- the NFS module being loaded (this is after I manually 'enabled' the storage via the web interface)
- etc.

--paste--

Code:
... capture from dmesg...
... reflects content current since the last boot of hardware node...

bonding: bond0: enslaving eth0 as a backup interface with a down link.
  alloc irq_desc for 34 on node -1
  alloc kstat_irqs on node -1
bnx2 0000:07:00.0: irq 34 for MSI/MSI-X
bnx2: eth1: using MSI
bonding: bond0: enslaving eth1 as a backup interface with a down link.
Bridge firewalling registered
device bond0 entered promiscuous mode
Loading iSCSI transport class v2.0-870.
iscsi: registered transport (tcp)
iscsi: registered transport (iser)
bnx2: eth0 NIC Copper Link is Up, 1000 Mbps full duplex
bonding: bond0: link status definitely up for interface eth0.
bonding: bond0: making interface eth0 the new active one.
device eth0 entered promiscuous mode
bonding: bond0: first active interface up!
ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
vmbr0: port 1(bond0) entering forwarding state
bnx2: eth1 NIC Copper Link is Up, 1000 Mbps full duplex
bonding: bond0: link status definitely up for interface eth1.
ip_tables: (C) 2000-2006 Netfilter Core Team
warning: `vzctl' uses 32-bit capabilities (legacy support in use)
CT: 101: started
vmbr0: no IPv6 routers present
bond0: no IPv6 routers present
CT: 102: started
CT: 104: started
CT: 105: started
tun: Universal TUN/TAP device driver, 1.6
tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
device tap100i0d0 entered promiscuous mode
vmbr0: port 2(tap100i0d0) entering forwarding state
kvm: 5183: cpu0 unhandled wrmsr: 0x198 data 0
kvm: 5183: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x130079
kvm: 5183: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0xffdc7b14
kvm: 5183: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x530079
tap100i0d0: no IPv6 routers present
tmpfs: No value for mount option 'relatime'
CT: 101: stopped
CT: 101: started
tmpfs: No value for mount option 'relatime'
CT: 105: stopped
CT: 105: started
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
Slow work thread pool: Starting up
Slow work thread pool: Ready
FS-Cache: Loaded
FS-Cache: Netfs 'nfs' registered for caching
svc: failed to register lockdv1 RPC service (errno 97).
tmpfs: No value for mount option 'relatime'
CT: 101: stopped
CT: 101: started
svc: failed to register lockdv1 RPC service (errno 97).
 
Seems that nfs and openvz have the same startup priority in /etc/rc2.d:

S20nfs-common
S20vz

Does it help if you change the priority of openvz to 21? (For a quick test, rename S20vz to S21vz.)
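
Something along these lines should do for that test (a sketch; the rename matches the suggestion above, and update-rc.d is the usual Debian sysvinit way to make a priority change persistent):

Code:
# one-off test: make the openvz init script start after nfs-common
mv /etc/rc2.d/S20vz /etc/rc2.d/S21vz

# or, to make the change persistent, re-register the script at priority 21
update-rc.d -f vz remove
update-rc.d vz defaults 21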
 
Hi,
I found this thread while searching for the very same problem - the NFS mounts aren't there after a reboot until I look at them via the GUI (or until vzdump is invoked, as our NFS mount is dedicated to vzdump images).

I see that in Proxmox VE 1.8 there is already an S21vz in /etc/rc2.d.

But my problem seems to be something like:

Waiting for vmbr0 to get ready (MAXWAIT is 2 seconds).
tg3: eth0: Link is up at 100 Mbps, full duplex.
tg3: eth0: Flow control is off for TX and off for RX.
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
vmbr0: port 1(eth0) entering forwarding state
if-up.d/mountnfs[vmbr0]: waiting for interface vmbr1 before doing NFS mounts (warning).
device eth1

-------------------------

In other words, the vzdump NFS storage is connected via vmbr1, which doesn't come up until several seconds into the boot because of the spanning-tree timeout and similar delays.
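
One possible workaround, sketched below, is to disable spanning tree and the forwarding delay on that bridge so it starts forwarding immediately after boot. This assumes vmbr1 is a plain Linux bridge defined in /etc/network/interfaces and that nothing on that segment actually needs STP; the address and port name are placeholders.

Code:
# /etc/network/interfaces -- hypothetical vmbr1 stanza
auto vmbr1
iface vmbr1 inet static
        address 10.0.1.5
        netmask 255.255.255.0
        bridge_ports eth1
        bridge_stp off
        bridge_fd 0
        bridge_maxwait 0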

And - I know I asked this before but can't seem to find the answer again - I'd like to make sure, before running some rsync scripts, that the specific storage is mounted cleanly. Is there a way to run a script within the PVE framework and check its return value, to confirm that everything is good and all needed storage elements are online and mounted?

thank you again
hk
 
And - I know I asked this before but can't seem to find the answer again - I'd like to make sure, before running some rsync scripts, that the specific storage is mounted cleanly. Is there a way to run a script within the PVE framework and check its return value, to confirm that everything is good and all needed storage elements are online and mounted?

try

# pvesm list
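
A sketch of how that could be used to gate an rsync job (the storage ID "vzdump-nfs" and the rsync target are placeholders, and the paths are illustrative; the idea is that pvesm list activates the storage and should exit non-zero if it cannot):

Code:
#!/bin/sh
# only run the rsync job if the vzdump NFS storage is actually online
STORAGE=vzdump-nfs

if pvesm list "$STORAGE" >/dev/null 2>&1; then
        # storage activated and listable -- safe to sync the dump directory
        rsync -a "/mnt/pve/$STORAGE/dump/" backuphost:/srv/vzdump-mirror/
else
        echo "storage $STORAGE not available, skipping rsync" >&2
        exit 1
fi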
 
