I have an issue where my NFS storage pool stops working when the primary DNS server goes offline. I would expect the next nameserver entry in /etc/resolv.conf to be used, but this apparently isn't happening.
Am I missing something fundamental here? If not, there appears to be a DNS resolution bug somewhere in the storage stack.
Setup on nodes
Code:
PrimaryDNS: 192.168.1.20
SecondaryDNS: 192.168.1.21
NFS Host: 192.168.1.10
Proxmox nodes: 192.168.1.1[2-4]
/etc/resolv.conf
Code:
root@vmhost04:~# cat /etc/resolv.conf
nameserver 192.168.1.20
nameserver 192.168.1.21
nameserver 8.8.8.8
search <my-domain>
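To check which of these nameserver entries actually answers, each one can be probed directly in file order. This is only a rough sketch: the helper names `list_nameservers` and `probe_nameservers` are made up for this post, and it assumes `dig` is installed on the node.

```shell
#!/bin/sh
# Print the nameserver addresses from a resolv.conf-style file, in file order.
# Defaults to /etc/resolv.conf when no file is given.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Query each listed nameserver directly for the given host name and report
# whether it answered. +time=2 and +tries=1 keep each probe short.
probe_nameservers() {
    host=$1
    conf=${2:-/etc/resolv.conf}
    for ns in $(list_nameservers "$conf"); do
        if dig +search +short +time=2 +tries=1 "@$ns" "$host" >/dev/null 2>&1; then
            echo "$ns ok"
        else
            echo "$ns FAILED"
        fi
    done
}
```

Running `probe_nameservers vmhost01` on a node would print one line per nameserver entry, which makes it easy to see at a glance which servers are reachable.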
/etc/pve/storage.cfg
Code:
root@vmhost04:~# cat /etc/pve/storage.cfg
...
nfs: pvestorage
        export /srv/data/pvestorage
        path /mnt/pve/pvestorage
        server vmhost01
        content iso,backup,images,vztmpl,rootdir
        maxfiles 1
        options vers=3
pvesm status
Code:
root@vmhost04:~# pvesm status
Name           Type       Status        Total        Used    Available        %
local          dir        active     20511312     1836172     17610180    8.95%
local-lvm      lvmthin    active    448278528    39314026    408964501    8.77%
pvestorage     nfs        active    154687488    24112128    124037120   15.59%
mount
Code:
root@vmhost04:~# mount
...
fusectl on /sys/fs/fuse/connections type fusectl (rw,relatime)
/dev/fuse on /etc/pve type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,size=3281408k,mode=700)
vmhost01:/srv/data/pvestorage on /mnt/pve/pvestorage type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.1.10,mountvers=3,mountport=36981,mountproto=udp,local_lock=none,addr=192.168.1.10)
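Note that the mount options above already carry the resolved address (addr=192.168.1.10), so it may be worth confirming which address each NFS mount is pinned to. A small sketch for pulling that out of `mount` output follows; `nfs_mount_addrs` is a hypothetical helper name, and it assumes mount points without spaces.

```shell
#!/bin/sh
# Read mount(8)-style lines on stdin and print "<mount point> <addr>" for
# every NFS mount, using the addr= option (not mountaddr=).
nfs_mount_addrs() {
    awk '$5 == "nfs" || $5 == "nfs4" {
        # $6 is the parenthesised option list; split it on "(", ")" and ",".
        n = split($6, opts, /[(),]/)
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^addr=/) print $3, substr(opts[i], 6)
    }'
}
```

Usage would be `mount | nfs_mount_addrs`. If the address is present here while the storage still shows as inactive, that would suggest the name lookup happens in the status check rather than in the mount itself, though that is an assumption I have not verified.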
Bring Down Primary DNS (192.168.1.20)
When the primary DNS is brought down, pvestorage goes offline and does not come back until the primary DNS is brought back online.
Code:
root@vmhost04:~# pvesm status
storage 'pvestorage' is not online
Name           Type       Status        Total        Used    Available        %
local          dir        active     20511312     1836304     17610048    8.95%
local-lvm      lvmthin    active    448278528    39314026    408964501    8.77%
pvestorage     nfs        inactive          0           0            0    0.00%
The primary DNS is no longer responding. The secondary DNS does respond to dig, both when I specify it explicitly as the target server and when I let dig pick the server from /etc/resolv.conf.
Code:
root@vmhost04:~# dig +search +noall +answer vmhost01
vmhost01.<domain>. 3600 IN A 192.168.1.10
root@vmhost04:~# dig +search +noall +answer @192.168.1.20 vmhost01
;; connection timed out; no servers could be reached
root@vmhost04:~# dig +search +noall +answer @192.168.1.21 vmhost01
vmhost01.<domain>. 3600 IN A 192.168.1.10