This already happened in the past (1.5 years ago), before Proxmox VE started supporting mounting CIFS shares via the GUI. Back then I wrote a script that checked every minute whether it could read the contents of a file on the Samba share and, if not, triggered a forced unmount and remount of the failed share.
After 3 retries, with a 30-second wait between them, it would trigger a critical alert that it was unable to remount the CIFS share.
Now the issue is that this causes the server to crash, since it writes millions of entries like the ones below into the syslog until the disk is full. On a standard PVE setup there is no separate log partition, so this will eventually crash any host once the disk is full.
A hard reboot is the only way to bring the host back, as even the console stops responding.
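For reference, the watchdog I described above (read test, forced remount, 3 retries with 30 seconds in between, then an alert) can be sketched roughly like this. This is not my original script, just a minimal Python sketch; the function names, the use of `umount -f -l`, and the assumption that the share has an /etc/fstab entry are all illustrative.

```python
import subprocess
import time
from pathlib import Path


def share_is_readable(probe_file):
    """Return True if a known file on the share can be read."""
    try:
        Path(probe_file).read_bytes()
        return True
    except OSError:
        return False


def force_remount(mount_point):
    """Force-unmount the share, then mount it again.

    Assumes the mount point has an /etc/fstab entry (hypothetical setup).
    """
    # -f forces the unmount, -l detaches lazily if the share is still busy.
    subprocess.run(["umount", "-f", "-l", mount_point], check=False)
    subprocess.run(["mount", mount_point], check=False)


def watchdog(is_readable, remount, alert,
             retries=3, wait_seconds=30, sleep=time.sleep):
    """Run one check cycle; return True if the share is (or becomes) readable."""
    if is_readable():
        return True
    for _ in range(retries):
        remount()
        sleep(wait_seconds)
        if is_readable():
            return True
    # Only one alert per cycle, so the log is not flooded.
    alert("unable to remount CIFS share after %d retries" % retries)
    return False
```

Run once per minute from cron or a systemd timer, it would be wired up with something like `watchdog(lambda: share_is_readable("/mnt/share/.probe"), lambda: force_remount("/mnt/share"), print)`, where both paths are placeholders. Note that a read on a hung CIFS mount can itself block, so a real script would probably need a timeout around the read test.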
@PVE devs:
Can you please build in a check for the CIFS drive that verifies the drive is available before the system triggers a constant remount nightmare?
It should also re-check the hostname resolution, in case the IP behind the CIFS share has changed. This rarely happens in general but is quite common if you rent storage via CIFS shares from providers.
Could you please implement such a feature to improve the stability of the system?
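To illustrate the kind of check I mean (this is only a sketch of the idea, not how PVE does or should implement it): resolve the hostname fresh on every attempt, then test whether the SMB port answers, and only try to mount if both succeed. The function names are made up, and port 445 is assumed as the standard SMB port.

```python
import socket


def resolve_host(hostname, port=445):
    """Resolve the hostname freshly; return a sorted list of addresses."""
    try:
        infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def smb_port_open(address, port=445, timeout=5.0):
    """Return True if a TCP connection to the SMB port succeeds."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


def reachable_address(hostname, resolve=resolve_host, probe=smb_port_open):
    """Return the first resolved address that accepts connections, else None."""
    for address in resolve(hostname):
        if probe(address):
            return address
    return None
```

A mount attempt would then only be made when `reachable_address("storage.example.com")` returns an address (the hostname is a placeholder). This would not catch every case: the `SessSetup = -13` in the log below is a permission-denied error (errno 13), which a plain port check cannot detect, so some back-off between remount attempts would still be needed.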
Currently running PVE 5.3 Community on this box.
Code:
Dec 16 06:25:03 pve1 kernel: [1524357.930340] cifs_vfs_err: 2 callbacks suppressed
Dec 16 06:25:03 pve1 kernel: [1524357.930341] CIFS VFS: Free previous auth_key.response = 0000000035620a02
Dec 16 06:25:03 pve1 kernel: [1524357.932165] CIFS VFS: Send error in SessSetup = -13