ESTALE: Stale file handle

xk3tchuPx

Member
May 16, 2020
I've been having this issue for a couple of weeks.
Backups fail with this error.

Usually I just need to reboot the PBS server to fix the issue, but it's very annoying.
Is there something I could do to solve this?

I'm backing up to a local PBS server running in a VM, which is then remote-synced to another PBS instance running in AWS.

R,
xk3tchuPx
 
Do you see those errors in the task log when backing up from PVE to the local PBS VM?
Please provide the full log and the output of `proxmox-backup-manager versions --verbose`
 
Code:
proxmox-backup                2.3-1        running kernel: 5.15.83-1-pve
proxmox-backup-server         2.3.1-1      running version: 2.3.1       
pve-kernel-5.15               7.3-1                                     
pve-kernel-helper             7.3-1                                     
pve-kernel-5.15.83-1-pve      5.15.83-1                                 
pve-kernel-5.15.74-1-pve      5.15.74-1                                 
pve-kernel-5.15.35-1-pve      5.15.35-3                                 
ifupdown2                     3.1.0-1+pmx3                             
libjs-extjs                   7.0.0-1                                   
proxmox-backup-docs           2.3.1-1                                   
proxmox-backup-client         2.3.1-1                                   
proxmox-mini-journalreader    1.2-1                                     
proxmox-offline-mirror-helper 0.5.0-1                                   
proxmox-widget-toolkit        3.5.3                                     
pve-xtermjs                   4.16.0-1                                 
smartmontools                 7.2-pve3                                 
zfsutils-linux                2.1.7-pve1

Code:
2022-12-31T15:08:20-05:00: starting new backup on datastore 'NFS-CATALINA': "ct/109/2022-12-31T20:08:20Z"
2022-12-31T15:08:20-05:00: protocol upgrade done
2022-12-31T15:08:20-05:00: GET /previous_backup_time
2022-12-31T15:08:20-05:00: POST /blob
2022-12-31T15:08:20-05:00: add blob "/mnt/datastore/ct/109/2022-12-31T20:08:20Z/pct.conf.blob" (217 bytes, comp: 217)
2022-12-31T15:08:20-05:00: POST /dynamic_index
2022-12-31T15:08:20-05:00: POST /dynamic_index
2022-12-31T15:08:20-05:00: POST /dynamic_index: 400 Bad Request: unable to get shared lock - ESTALE: Stale file handle
2022-12-31T15:08:20-05:00: POST /dynamic_index: 400 Bad Request: unable to get shared lock - ESTALE: Stale file handle
2022-12-31T15:08:20-05:00: backup ended and finish failed: backup ended but finished flag is not set.
2022-12-31T15:08:20-05:00: removing unfinished backup
2022-12-31T15:08:20-05:00: TASK ERROR: backup ended but finished flag is not set.
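
When the same failure shows up across many task logs, it can help to count which requests hit ESTALE. Below is a minimal sketch in Python. The line layout is inferred only from the excerpt above (it is not a documented PBS log format), and `stale_handle_failures` is a made-up helper name.

```python
import re
from collections import Counter

# Matches lines such as:
#   <timestamp>: POST /dynamic_index: 400 Bad Request: <message>
# Assumption: this layout is inferred from the log excerpt, not from PBS docs.
FAILED_REQUEST = re.compile(
    r"^(?P<time>\S+): (?P<method>[A-Z]+) (?P<path>/\S*): "
    r"(?P<status>\d{3}) [^:]+: (?P<message>.*)$"
)


def stale_handle_failures(log_text):
    """Count failed requests whose message mentions ESTALE, keyed by 'METHOD /path'."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAILED_REQUEST.match(line.strip())
        if match and "ESTALE" in match.group("message"):
            counts[match.group("method") + " " + match.group("path")] += 1
    return counts


if __name__ == "__main__":
    # Three lines copied from the task log in this thread.
    sample = (
        "2022-12-31T15:08:20-05:00: POST /dynamic_index\n"
        "2022-12-31T15:08:20-05:00: POST /dynamic_index: 400 Bad Request: "
        "unable to get shared lock - ESTALE: Stale file handle\n"
        "2022-12-31T15:08:20-05:00: TASK ERROR: backup ended but finished flag is not set.\n"
    )
    print(dict(stale_handle_failures(sample)))
```

Lines without an HTTP status (the plain `POST /dynamic_index` entries and the `TASK ERROR` line) are ignored, so only the requests that actually failed are counted.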


My backup storage isn't fast storage; it's a ZFS RAID10 made of 4x 4 TB HDDs.
I've just added a mirrored SLOG SSD in front to see if that helps with the stale file handle errors.

Will report back.


R,
xk3tchuPx
 
Based on the name `NFS-CATALINA`, is the datastore an NFS mount instead of a local ZFS pool?
 
That can happen when clients don't disconnect cleanly.
Try removing the exports on the NFS server and adding them again.

It seems to be a common issue with NFS; at least, searching for `ESTALE` and `NFS` turns up a lot of reports.
 
I've seen a lot of posts about that issue when NFS is in use, but nothing really helpful on how to fix it.
 
Re-export the mounts; this should fix it, at least for some time.
Also check the logs of your NFS server to see if there are errors on that side.
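
To confirm from the client side whether a path currently returns ESTALE (before and after re-exporting, for example), a small probe like the following may help. This is only a sketch: `probe_path` is a made-up helper, `/mnt/datastore` is taken from the task log earlier in the thread and may differ on your system, and on a hard-mounted share with an unresponsive server the `stat` call can block instead of returning an error.

```python
import errno
import os


def probe_path(path, stat_fn=os.stat):
    """Classify a path as 'ok', 'stale', 'missing', or 'error:<NAME>'.

    stat_fn is injectable so the stale case can be simulated without NFS.
    """
    try:
        stat_fn(path)
    except OSError as exc:
        if exc.errno == errno.ESTALE:
            return "stale"
        if exc.errno == errno.ENOENT:
            return "missing"
        return "error:" + errno.errorcode.get(exc.errno, str(exc.errno))
    return "ok"


if __name__ == "__main__":
    # Path as seen in the task log; adjust to the real datastore mount point.
    print(probe_path("/mnt/datastore"))
```

Running this from cron or a pre-backup hook would at least show whether the handle goes stale between backups or only during them.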
 
I'm having the same problem. Not with PBS, but with the PVE host itself, which has mounted an NFS share from a QNAP NAS.

One day my NAS seemed to hang/freeze and I needed to reboot it. After that I always got this stale file handle error when I tried to access files there.
In the PVE GUI you cannot remount NFS shares, so I deleted the storage and recreated it, but that throws an obscure error message similar to this one:
https://forum.proxmox.com/threads/read-only-cifs-smb-and-nfs.101332/

So I tried remounting it from the command line, and also unmounting it first and then mounting it again cleanly. But it doesn't work; it always fails.
I tried it from other machines and they have no problem mounting it, so it's definitely a PVE problem.
When I remove it in the GUI and add it again, it always creates the directory, e.g. /mnt/pve/myNasShare, but it's never accessible. The directory cannot be deleted afterwards either; it says the device is busy.

I wanted to avoid rebooting the whole PVE server. Is there any other way to do it?
 
UPDATE:
We just rebooted the server. The problem still persists.
The error message in the GUI when trying to add the NFS share again is:
Code:
"create storage failed: mkdir /mnt/pve/vlnas1-nfsvol1/images: Read-only file system at /usr/share/perl5/PVE/Storage/Plugin.pm line 1374. (500)"

Is there any solution?
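
One thing worth checking from the client side is what the kernel's mount table says about that directory. A "device is busy" error when deleting the directory is consistent with it still being a mount point, and the "Read-only file system" error raises the question of whether the share is mounted with the `ro` option. A minimal Python sketch, assuming the standard `/proc/mounts` layout on Linux; `find_mount` and `describe_mount` are made-up helper names, and mount points containing spaces (escaped as `\040` in `/proc/mounts`) are not handled.

```python
import os


def find_mount(mounts_text, mount_point):
    """Return (fstype, options) for mount_point, or None if it is not mounted.

    mounts_text uses the /proc/mounts layout:
    device mount_point fstype options dump pass
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # Later entries shadow earlier ones, so keep the last match.
            found = (fields[2], fields[3].split(","))
    return found


def describe_mount(mounts_text, mount_point):
    """Summarize whether mount_point is mounted and, if so, read-only or not."""
    entry = find_mount(mounts_text, mount_point)
    if entry is None:
        return "not mounted"
    fstype, options = entry
    mode = "read-only" if "ro" in options else "read-write"
    return fstype + " mounted " + mode


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as handle:
        # Mount point taken from the error message in this post.
        print(describe_mount(handle.read(), "/mnt/pve/vlnas1-nfsvol1"))
```

If the client reports the share as read-write but `mkdir` still fails with "Read-only file system", the restriction may be coming from the export settings on the NAS rather than from PVE.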
 
I have the same problem after updating Proxmox VE from 7.4.1 to 7.4.3. Until yesterday the QNAP and NFS storage worked like a charm...
 
The solution from Altrove also works for me. The error is reproducible. Maybe it won't take much effort for the Proxmox team to fix it in a future release?
 