A missing NFS server can destroy HA

mgiammarco

Hello,
I have a three-node Proxmox/Ceph cluster with an external NFS NAS.
I write backups to that NAS. Unfortunately, the NAS went offline because of a hardware problem.
Linux dropped the NFS mount (why??) and the backups started being written to the bare mount point under /mnt. That filled the root partition, which blocked the Ceph mon and caused several problems for the Ceph cluster.
I built a Proxmox cluster to have HA, and it is incredible that a single NAS failure can take down the system...
What can I do?
Thanks,
Mario
 
Hi,
please share the output of pveversion -v and cat /etc/pve/storage.cfg.
 
I can send you the output of those commands, but this happens in many installations with different versions of Proxmox. If you want to reproduce it, configure an external NFS unit with a backup job, then power down the external unit and wait.
 
I cannot reproduce this. For me, the backup job fails with
Code:
TASK ERROR: could not activate storage 'mynfs': storage 'mynfs' is not online

Did you by chance configure the storage with type dir instead of type nfs? In that case you should add the option is_mountpoint 1 to the storage's configuration, so Proxmox VE can detect when nothing is mounted at the expected location.
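For illustration, a dir-type entry in /etc/pve/storage.cfg with that option set might look roughly like the sketch below. Only the storage name 'mynfs' and the is_mountpoint 1 line come from this thread; the path /mnt/mynfs and the content type are assumptions, so adapt them to your own setup.

```
dir: mynfs
	path /mnt/mynfs
	content backup
	is_mountpoint 1
```

With this set, the storage should be reported as offline when nothing is mounted at the path, so the backup job fails instead of writing to the root filesystem.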
 
