SMB/CIFS - backup job failed - unable to open ... - Stale file handle

getQueryString

New Member
Sep 26, 2024
I've already read through several threads, but unfortunately I couldn't make sense of them. That's why I'd like to work through this properly here.

Problem:
INFO: starting new backup job: vzdump --notes-template '{{guestname}}' --prune-backups 'keep-monthly=3' --all 1 --fleecing 0 --storage MyCloudEX2Ultra --node pve-srv-01 --mode snapshot --compress zstd
INFO: Starting Backup of VM 100 (qemu)
INFO: Backup started at 2024-09-26 12:40:30
INFO: status = stopped
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: Debian-LXQt
INFO: include disk 'virtio0' 'local-lvm:vm-100-disk-1' 32G
INFO: include disk 'efidisk0' 'local-lvm:vm-100-disk-0' 4M
ERROR: Backup of VM 100 failed - unable to open '/mnt/pve/MyCloudEX2Ultra/dump/vzdump-qemu-100-2024_09_26-12_40_30.tmp/qemu-server.conf' - Stale file handle
INFO: Failed at 2024-09-26 12:40:30
INFO: Backup job finished with errors
TASK ERROR: job errors

What I did:
I added my NAS via SMB/CIFS via Data Center > Storage:

ID: MyCloudEX2Ultra | Server: 192.168.2.134 | Username: srv-01 | Share: srv-01 | Nodes: All | Enable: Yes | Content: VZDump backup file

I have added a job via Datacenter > Backup:

Node: All | Storage: MyCloudEX2Ultra | Schedule: 9 pm everyday | Selection Mode: All | Compression: ZSTD | Mode: Snapshot | Enable: Yes | Retain for 3 months

The storage MyCloudEX2Ultra (pve-srv-01) also appears on the left under the pve-srv-01 node. I am logged in to the web interface as root the whole time; I am aware of the security risks, but this server is only a test setup.

In the pve-srv-01 node in the shell:
mount | grep MyCloudEX2Ultra:
//192.168.2.134/srv-01 on /mnt/pve/MyCloudEX2Ultra type cifs (rw,relatime,vers=3.1.1,cache=strict,username=srv-01,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.2.134,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,retrans=1,echo_interval=60,actimeo=1,closetimeo=1)

df -h:
Filesystem Size Used Avail Use% Mounted on
udev 12G 0 12G 0% /dev
tmpfs 2.4G 1.4M 2.4G 1% /run
/dev/mapper/pve-root 94G 4.1G 86G 5% /
tmpfs 12G 46M 12G 1% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
efivarfs 128K 121K 2.7K 98% /sys/firmware/efi/efivars
/dev/nvme0n1p2 1022M 12M 1011M 2% /boot/efi
/dev/fuse 128M 16K 128M 1% /etc/pve
//192.168.2.134/srv-01 64G 26G 39G 40% /mnt/pve/MyCloudEX2Ultra
tmpfs 2.4G 0 2.4G 0% /run/user/0

ls -ld /mnt/pve/MyCloudEX2Ultra/:
drwxr-xr-x 2 root root 0 Sep 26 12:40 /mnt/pve/MyCloudEX2Ultra/

If it is relevant:
  • I have a Fritz!Box 7530
  • The user srv-01 is actually present on the MyCloudEX2Ultra. The NAS is actively used.
  • I can reach the NAS with the ping command


I will be happy to answer any questions. Thank you all in advance.
 
Can you touch some file on this share with root?

Code:
touch /mnt/pve/MyCloudEX2Ultra/mytestfile

What happens when you unmount the SMB share from the command line?

Code:
umount /mnt/pve/MyCloudEX2Ultra

After that, it will be automatically mounted again by the system. Are you able to put a backup on it now?
 
Thank you for your answer.

This happens with touch:
Code:
root@pve-srv-01:~# touch /mnt/pve/MyCloudEX2Ultra/dump/vzdump-qemu-100-2024_09_27-08_39_46.log
root@pve-srv-01:~#

And when I unmount the drive, the backup mounts it again, but unfortunately the error remains:
Code:
INFO: starting new backup job: vzdump --fleecing 0 --compress zstd --node pve-srv-01 --prune-backups 'keep-monthly=3' --mode snapshot --notes-template '{{guestname}}' --storage MyCloudEX2Ultra --all 1
INFO: Starting Backup of VM 100 (qemu)
INFO: Backup started at 2024-09-27 08:40:17
INFO: status = running
INFO: VM Name: Debian-LXQt
INFO: include disk 'virtio0' 'local-lvm:vm-100-disk-1' 32G
INFO: include disk 'efidisk0' 'local-lvm:vm-100-disk-0' 4M
INFO: backup mode: snapshot
INFO: ionice priority: 7
ERROR: Backup of VM 100 failed - unable to open '/mnt/pve/MyCloudEX2Ultra/dump/vzdump-qemu-100-2024_09_27-08_40_17.tmp/qemu-server.conf' - Stale file handle
INFO: Failed at 2024-09-27 08:40:18
INFO: Backup job finished with errors
TASK ERROR: job errors

Of course, I have already restarted the entire PC more than once. One thing that shouldn't really matter: I am accessing the Debian-LXQt VM via AnyDesk. I have also tried another access method with the VM switched off. Same error.

EDIT: I have also noticed that a log file such as vzdump-qemu-100-2024_09_27-08_39_46.log does get stored on the NAS under /dump.
 

Attachments

  • Screenshot 2024-09-27 084601.png
Do I understand you correctly: new files of any kind can be stored on the Samba share, but the backup process cannot store the backup? But it writes a log file. Maybe some cache problem?

What do the permissions look like directly on the Samba server when you use “ls -l” to view the directory? (If there is CLI access...)
Do you have the same behavior with other clients accessing this samba share?

Of course, I have already restarted the entire PC more than once.
You mean Proxmox VE, right? If so, have you ever restarted the Samba service where the share is located on your NAS?


Please also post your active Samba server config from your NAS.
 

Yes, exactly. It writes a log file, and I can also manage other files on the NAS manually. The cache-problem theory sounds plausible; I'd have to look into it. What could I adjust for that?


Bash:
root@pve-srv-01:~# ls -l /mnt/pve/MyCloudEX2Ultra/
total 0
drwxr-xr-x 2 root root 0 Sep 27 15:31  dump
drwxr-xr-x 2 root root 0 Sep 18 18:49 'Manuelles Backup'
drwxr-xr-x 2 Wurzelwurzel 0 19. März 2024 SRV-01

A few seconds have passed here ...

Okay, wait, that's strange. I asked ChatGPT and then noticed that I couldn't get any status here:
Bash:
sudo systemctl status smbd
sudo systemctl status nmbd

I had to install samba like this once:
Bash:
sudo apt install samba

Now I also get a status from smbd and nmbd. The backup is currently running (at 70 % right now). The problem seems to be solved, but I'll report back anyway.
 
Okay, on the 3rd VM the process aborted again. There are more VMs now than yesterday, but that doesn't really matter.
What is also noticeable now: smbd and nmbd are running, but a new backup attempt fails again anyway:

Code:
INFO: starting new backup job: vzdump --compress zstd --node pve-srv-01 --fleecing 0 --prune-backups 'keep-monthly=3' --mode snapshot --notes-template '{{guestname}}' --storage MyCloudEX2Ultra --all 1
INFO: Starting Backup of VM 100 (qemu)
INFO: Backup started at 2024-09-27 16:45:52
INFO: status = running
INFO: VM Name: Debian-LXQt
INFO: include disk 'virtio0' 'local-lvm:vm-100-disk-1' 32G
INFO: include disk 'efidisk0' 'local-lvm:vm-100-disk-0' 4M
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating vzdump archive '/mnt/pve/MyCloudEX2Ultra/dump/vzdump-qemu-100-2024_09_27-16_45_52.vma.zst'
ERROR: unable to open file '/mnt/pve/MyCloudEX2Ultra/dump/vzdump-qemu-100-2024_09_27-16_45_52.vma.dat' - Stale file handle
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 100 failed - unable to open file '/mnt/pve/MyCloudEX2Ultra/dump/vzdump-qemu-100-2024_09_27-16_45_52.vma.dat' - Stale file handle
INFO: Failed at 2024-09-27 16:45:53
INFO: Starting Backup of VM 101 (qemu)
INFO: Backup started at 2024-09-27 16:45:53
INFO: status = running
INFO: VM Name: Docker-01
INFO: include disk 'virtio0' 'local-lvm:vm-101-disk-0' 32G
INFO: backup mode: snapshot
INFO: ionice priority: 7
ERROR: Backup of VM 101 failed - unable to open '/mnt/pve/MyCloudEX2Ultra/dump/vzdump-qemu-101-2024_09_27-16_45_53.tmp/qemu-server.conf' - Stale file handle
INFO: Failed at 2024-09-27 16:45:53
INFO: Starting Backup of VM 102 (qemu)
INFO: Backup started at 2024-09-27 16:45:53
INFO: status = running
INFO: VM Name: mc-legacy-jsrv
INFO: include disk 'virtio0' 'local-lvm:vm-102-disk-0' 16G
INFO: backup mode: snapshot
INFO: ionice priority: 7
ERROR: Backup of VM 102 failed - unable to open '/mnt/pve/MyCloudEX2Ultra/dump/vzdump-qemu-102-2024_09_27-16_45_53.tmp/qemu-server.conf' - Stale file handle
INFO: Failed at 2024-09-27 16:45:53
INFO: Backup job finished with errors
TASK ERROR: job errors

The fact is: if everything goes right the first time, you've done something wrong. So it's better to take the long route :D
 

A few additional insights on my part:

Updates are current. I set the cache on all VMs to "No cache" and also tried "Write through", although 2 of the 3 VMs are currently switched off. I have also tried backing up each of the 3 VMs individually several times. Again, it only works sporadically, and which VM backup fails is unpredictable. The error output is always the same.
There are no transfer errors between client and NAS via other systems in the network.
 
I had to install samba like this once:

Does the NAS really (also) have apt as a package manager? I have to ask again: did you install samba on your MyCloudEX2Ultra, or on your Proxmox host? You don't need it on Proxmox unless you want to use Proxmox itself as a Samba server. The client part is already provided by cifs-utils.

//192.168.2.134/srv-01 64G 26G 39G 40% /mnt/pve/MyCloudEX2Ultra

This is your Samba share on the NAS mounted in Proxmox VE; the config of this share would be interesting. (Config/screenshot...)

Please also show the content of this file from your Proxmox VE.
Code:
cat /etc/pve/storage.cfg
 
I installed samba on the Proxmox VE server. Interestingly, backups (even if only partial) only ran once I had installed samba there manually. Should it just stay that way now?

My NAS does not have apt as package manager:
Bash:
root@MyCloudEX2Ultra ~ # apt
-sh: apt: not found

Content of storage.cfg:
Bash:
root@pve-srv-01:~# cat /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content iso,backup,vztmpl

lvmthin: local-lvm
        thinpool data
        vgname pve
        content images,rootdir

cifs: MyCloudEX2Ultra
        path /mnt/pve/MyCloudEX2Ultra
        server 192.168.2.134
        share srv-01
        content backup
        prune-backups keep-all=1
        username srv-01
        options vers=3.1.1,cache=none
 

Attachments

  • Screenshot 2024-09-30 135759.png
I have installed samba on the Proxmox VE server. It is interesting that backups, even if only partial, only ran as soon as I installed samba on the Proxmox VE server manually. Should it just stay that way now?
No, the Samba server has nothing to do with it; only the client part matters. If you have not configured any shares in a Samba server on the PVE, you can uninstall it with apt remove samba

  • Samba server = the server component. It runs only on the devices that share folders, for example your NAS with the share srv-01. Samba can also be installed on Debian-based systems with apt install samba
  • The client part is only needed on clients, like your Proxmox VE, that connect to your NAS. It can be installed with apt install cifs-utils. This package is pre-installed on Proxmox VE and does not need to be installed manually; otherwise you would not have been able to add your share under Datacenter -> Storage

What I would like to see is the config of the share on your NAS. There will probably be a web interface; please take a screenshot of the share srv-01 with its options/settings/rights/ACLs etc.
And please check on your NAS whether it can also share a directory via NFS.

The journal log during the backup would also be interesting. Maybe we can see something helpful there.

Please adjust date and time.
Code:
journalctl --since "2023-12-06 20:00" --until "2023-12-07 03:00" > syslog.txt

Then attach this file in your post. Thank you.
 
Thank you for your efforts so far.
Okay, samba is gone from the PVE. cifs-utils is installed, correct.
I have now connected the NAS via NFSv4.2, and so far no more errors have occurred (NFS access to <NAS-IP>\srv-01\ is restricted to 192.168.2.138, the PVE server; see the image "NFS settings in NAS").

Nevertheless, I have attached the syslog.txt (covering 2024-09-26 10:00 to 2024-10-01 23:00). I have also attached images of the settings on my NAS. The labels are all in German, but the technical terms are what matter; simply skip anything unimportant.

I think it's a pity that it hasn't worked properly via SMB/CIFS yet, since NFS without Kerberos (or similar) is unencrypted. Please correct me if I'm saying something wrong :)
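For reference, an NFSv4.2 storage definition in /etc/pve/storage.cfg would look roughly like this. The storage ID, mount path, and export path below are assumptions (the actual values are only visible in the screenshots), modeled on the CIFS entry posted earlier in this thread:

```
nfs: MyCloudEX2Ultra-nfs
        path /mnt/pve/MyCloudEX2Ultra-nfs
        server 192.168.2.134
        export /nfs/srv-01
        content backup
        options vers=4.2
```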
 

Attachments

  • srv-01_rights.png
  • NFS settings in proxmox.png
  • NFS settings in NAS.png
  • NAS settings_other.png
  • NAS settings_network service.png
  • NAS settings_network profile.png
I have now connected the NAS via NFSv4.2, and so far no more errors are occurring (access to <NAS-IP>\srv-01\ via NFS only set for 192.168.2.138 (PVE server), see image NFS settings in NAS).
Perfect, good work!

Nevertheless, I have attached the syslog.txt once (2024-09-26 10:00 to 2024-10-01 23:00). I have also attached images of the settings on my NAS. The texts are all in German, but the technical terms are important. Simply skip anything unimportant.

Thanks for the pictures. I don't see anything bad here. I could only imagine that "Oplocks" is doing something odd here. I don't see the syslog in the attachments; would you like to try attaching it again?
 
:D

I then also noticed "Oplocks". I switched it off once, but unfortunately that didn't help either.
I have attached the syslog.txt again. I had to narrow it down to --since "2024-09-29 20:00" --until "2024-09-30 21:30" because the file was otherwise too big.
 

Attachments

  • syslog.txt
I then also noticed "Oplocks". I switched it off once, but unfortunately that didn't help either.
Thank you. It was worth a try.

I found the following messages in the logs:
Code:
Sep 30 03:00:25 pve-srv-01 pvestatd[923]: storage 'MyCloudEX2Ultra' is not online
Sep 30 08:02:21 pve-srv-01 kernel: CIFS: VFS: \\192.168.2.134 has not responded in 180 seconds. Reconnecting...

The message that the NAS is not online occurs very often. If your NAS goes into standby, Samba does not like that very much.
With NFS it doesn't seem to matter. I also have some NFS shares mounted with 4.2 on my PVE nodes. The backup server hosting these NFS shares was switched off for a few days; after starting it again, the NFS shares on the PVE nodes continued to work normally without any action. With Samba I got the same "Stale file handle" error message. In my tests it helped to unmount the share on the PVE nodes; after it was automatically mounted again, Samba access worked normally.

Your NAS is probably doing something different here than a default Samba implementation on Debian. Unfortunately, this is also very difficult to test without having the same NAS here.

If you like, you can create a bug report at https://bugzilla.proxmox.com

The important thing is that it now works with NFS.
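The unmount-and-remount recovery described above can be sketched as a small shell helper. The function name is made up, and the mount point is the one from this thread; adjust it to your storage ID:

```shell
# Hypothetical helper: check a PVE storage mount point and recover it
# if the CIFS handle has gone stale.
check_and_remount() {
    mnt="$1"
    if ls "$mnt" >/dev/null 2>&1; then
        echo "mount at $mnt is healthy"
    else
        # On a stale CIFS handle, stat()/ls fails with ESTALE.
        # A lazy unmount detaches the old mount; pvestatd then
        # remounts the PVE storage automatically within seconds.
        umount -l "$mnt"
        echo "lazy-unmounted $mnt, waiting for automatic remount"
    fi
}

# Usage (as root): check_and_remount /mnt/pve/MyCloudEX2Ultra
```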
 
The backup error already occurred before anything had been restarted, and the NAS always mounts cleanly again after a restart of all systems.
The "not online" message in the log arose because the NAS is in sleep mode at night between roughly 2 am and 8 am; the server PC (Proxmox VE) is also intentionally switched off overnight from time to time.

Hmm
 
I had the same problem backing up to a Samba share via CIFS.
Adding the parameter "options noserverino" in /etc/pve/storage.cfg, followed by a "umount /mnt/pve/xxx", solved the problem.
The CLI command "pvesm set xxxx --options noserverino" alone did not solve it.
Best regards
Albert
 
Unfortunately, this only worked once.
 
Hello, FYI I have the very same problem: "ERROR: Backup of VM xxx failed - unable to open '/mnt/pve/<share of MyCloudEx2>/dump/vzdump-qemu-<something>.tmp/qemu-server.conf' - Stale file handle". Manually writing files from within the shell works fine.
I also use the MyCloud EX2 Ultra. My solution for now is "options noserverino" plus a restart of the host. I hope this will fix things in the long term.
Best regards
Daniel
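For anyone landing here via search: based on the storage entry posted earlier in this thread, the noserverino workaround would end up in /etc/pve/storage.cfg roughly like this. The exact option combination is illustrative, not confirmed by the posters:

```
cifs: MyCloudEX2Ultra
        path /mnt/pve/MyCloudEX2Ultra
        server 192.168.2.134
        share srv-01
        content backup
        prune-backups keep-all=1
        username srv-01
        options vers=3.1.1,noserverino
```

After editing, an umount /mnt/pve/MyCloudEX2Ultra lets the share come back with the new options once it is remounted automatically, as described above.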