Hi people....
First post, happy to join the Proxmox community!
So... here we go:
- I have a Proxmox 3.x cluster at OVH (I mention the provider since I see it sponsors Proxmox somehow...). There are 4 hypervisor nodes running over vRack (the internal LAN network): the cluster is OK, has quorum, etc...
- I have two storage servers, let's call them "nas", connected to the same internal LAN (vRack) and providing storage through NFS.
Performance is horrible, but against the odds the setup roughly works... What I'm tired of is dealing with some NFS problems:
- Sometimes I "lose" a storage server (due to hardware / vRack failures... sadly this happens almost monthly) and the hypervisors and their VMs react in a strange manner.
Since the mounts are not handled via fstab, the only way I know to restore them is to restart NFS on the nas, followed by the pve-cluster service on the hypervisor (the exact commands I run are sketched after this list).
- This leads to strange behaviour, as I then see duplicated mount entries in /etc/mtab.
- Disks stored on this NFS share are available and "known" at the VM level, although the VMs complain that those disks are "device or resource busy".
- mdadm marks those disks as Failed/Faulty (although, the disks being virtual, a real hardware failure is impossible) and I cannot operate on them (re-add them to the array, etc...) because the system complains they're busy! (See the mdadm attempt sketched below.)
- To work around this I remove / re-add the disk to the VM (hotswap is enabled... it rocks!), but then the VM identifies the disk as a completely new drive, and the RAID has to be resynced from zero!
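For reference, this is roughly the recovery procedure I run today (my boxes are Debian-based; service names may differ on other distros):

    # On the storage server ("nas"): restart the NFS server
    service nfs-kernel-server restart

    # On each hypervisor: restart the Proxmox cluster service,
    # which re-activates the NFS storages
    service pve-cluster restart

    # Afterwards, checking the mounts is where I see the duplicates
    grep nfs /etc/mtab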
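And this is the kind of mdadm session I end up in inside a VM; /dev/md0 and /dev/vdb1 are placeholders for my actual array and member disk, and the error line is paraphrased from memory:

    # The array shows the virtual disk as faulty
    cat /proc/mdstat
    mdadm --detail /dev/md0

    # Trying to put the member back fails because the device is "busy"
    mdadm --manage /dev/md0 --remove /dev/vdb1
    mdadm --manage /dev/md0 --re-add /dev/vdb1
    # mdadm: Cannot open /dev/vdb1: Device or resource busy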
... I feel lost about the root causes of all this. Maybe you could give me some clues to understand what is going on. Several doubts arise...
- How can I deal with the duplicate NFS mounts?
- How can I "remount" the NFS shares elegantly (i.e. without restarting every related service!) so that my shares are mounted properly again?
- Why are the disks blocked inside the VM? How could I handle this without removing / re-adding the disk to the VM?
- Why is the exact same disk, in the same VM, considered a new block device? This forces me to rebuild my RAID every time... (the check I use to compare disk identities is sketched after this list)
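In case it helps with that last question, this is roughly how I compare the disk's identity inside the VM before and after the hot-remove / re-add (device names are just examples):

    # Persistent identifiers udev assigns to the attached disks
    ls -l /dev/disk/by-id/

    # Serial number as seen by the guest (/dev/sdb is an example)
    udevadm info --query=property --name=/dev/sdb | grep ID_SERIAL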
...I know, I know... too many questions...
Hope someone here can help me!
Anyhow, thank you in advance and best regards.