[SOLVED] PVE6 various issues since PVE5 upgrade

ckt

New Member
Sep 23, 2019
Hi there,

A few months ago, we upgraded a few servers from PVE 5.4 to 6.0. While the upgrade itself was quite easy, we have encountered some serious issues since then.

Backup mounts seem to lose connectivity and crash the entire backup system
We're using CIFS-backed storage for backups. While this worked pretty well in the past (pre 6.0), it became a nightmare with PVE 6.0. CIFS mounts seem to become unresponsive after some time (even with the implicit soft mount option, which should make I/O return an error instead of blocking when the connection to the CIFS server is lost), causing vzdump to freeze. The only way to get backups running again is to reboot the host. A lazy unmount doesn't help: zombie task processes keep running and seem to hold the backup locks (qm unlock <vmid> doesn't restore backup functionality at all). Another thread appears to describe a similar issue (https://forum.proxmox.com/threads/proxmox-6-cifs-stuck.57162/).
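
One way to notice such a hang before vzdump runs into it is to probe the mount point with a time limit. The following is only a minimal sketch, assuming Python 3 is available on the host; mount_responds is a hypothetical helper and not part of PVE. It runs the stat in a child process because a stat on a hung mount can block inside the kernel.

```python
import subprocess
import sys
import time


def mount_responds(path, timeout_s=5.0):
    """Return True if a stat of `path` finishes successfully within timeout_s."""
    # Stat in a child process so that this process never blocks on the mount.
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        rc = child.poll()
        if rc is not None:
            # Exit code 0 means the stat succeeded; anything else is an error.
            return rc == 0
        time.sleep(0.05)
    # Best effort only: a child stuck in uninterruptible sleep may ignore the
    # signal, so we deliberately do not wait for it afterwards.
    child.kill()
    return False
```

For example, mount_responds("/mnt/pve/xxx") could be called from a pre-backup check, with the path replaced by the real storage mount point. A False result only says the probe did not succeed in time; it does not say why.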

Another issue (maybe related to the CIFS mount one): after some time, the web UI only shows a list of VMs with question mark icons and no names.
I can fix this by restarting the pvestatd service, but I shouldn't have to restart that service every day to keep things working.

Are these known bugs? What can I do to solve them?

PVE version: PVE 6.0-7

(Excerpt from dmesg)
Code:
[2016780.401181] CIFS: Attempting to mount //xxx.your-storagebox.de/backup
[2016780.401316] No dialect specified on mount. Default has changed to a more secure dialect, SMB2.1 or later (e.g. SMB3), from CIFS (SMB1). To use the less secure SMB1 dialect to access old servers which do not support SMB3 (or SMB2.1) specify vers=1.0 on mount.
[2016780.414361] FS-Cache: Duplicate cookie detected
[2016780.414448] FS-Cache: O-cookie c=000000005514e2f8 [p=00000000f8901738 fl=222 nc=1 na=1]
[2016780.414569] FS-Cache: O-cookie d=00000000d5afabc2 n=00000000c83e5b8e
[2016780.414655] FS-Cache: O-key=[20] '0a0001bd2a0104f80b1070000000000000000098'
[2016780.414758] FS-Cache: N-cookie c=00000000599ee848 [p=00000000f8901738 fl=2 nc=0 na=1]
[2016780.414876] FS-Cache: N-cookie d=00000000d5afabc2 n=00000000f3f2513a
[2016780.414966] FS-Cache: N-key=[20] '0a0001bd2a0104f80b1070000000000000000098'
[2018229.097634] CIFS VFS: Error -32 sending data on socket to server
[2018229.097731] CIFS VFS: Error -32 sending data on socket to server
[2018229.098199] CIFS VFS: Error -32 sending data on socket to server
[2027656.435098] CIFS: Attempting to mount //xxx.your-storagebox.de/backup
[2027656.435245] No dialect specified on mount. Default has changed to a more secure dialect, SMB2.1 or later (e.g. SMB3), from CIFS (SMB1). To use the less secure SMB1 dialect to access old servers which do not support SMB3 (or SMB2.1) specify vers=1.0 on mount.
[2027656.452064] FS-Cache: Duplicate cookie detected
[2027656.452150] FS-Cache: O-cookie c=000000005514e2f8 [p=00000000f8901738 fl=222 nc=1 na=1]
[2027656.452267] FS-Cache: O-cookie d=00000000d5afabc2 n=00000000c83e5b8e
[2027656.452365] FS-Cache: O-key=[20] '0a0001bd2a0104f80b1070000000000000000098'
[2027656.452459] FS-Cache: N-cookie c=00000000d62955e4 [p=00000000f8901738 fl=2 nc=0 na=1]
[2027656.452576] FS-Cache: N-cookie d=00000000d5afabc2 n=000000004e79b069
[2027656.452663] FS-Cache: N-key=[20] '0a0001bd2a0104f80b1070000000000000000098'
[2027656.643074] CIFS VFS: cifs_mount failed w/return code = -13
[2034909.638253] CIFS VFS: Cancelling wait for mid 798469 cmd: 5
[2034909.638341] CIFS VFS: Cancelling wait for mid 798470 cmd: 16
[2034909.638426] CIFS VFS: Cancelling wait for mid 798471 cmd: 6
[2034910.539319] CIFS VFS: Close unmatched open
[2034939.116714] CIFS VFS: Cancelling wait for mid 805505 cmd: 5
[2034939.116812] CIFS VFS: Cancelling wait for mid 805506 cmd: 16
[2034939.116906] CIFS VFS: Cancelling wait for mid 805507 cmd: 6
[2034940.147990] CIFS VFS: Close unmatched open
…
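
The numbers in the "Error -32" and "return code = -13" lines are negated Linux errno values: 32 is EPIPE (broken pipe, which fits a dropped connection) and 13 is EACCES (permission denied on that mount attempt). A small sketch for decoding them from a saved excerpt, assuming Python 3 on Linux; decode_cifs_errors is a hypothetical helper name, and the regular expression only covers the two message shapes seen in this excerpt.

```python
import errno
import os
import re

# Matches lines such as "CIFS VFS: Error -32 sending data on socket to server"
# and "CIFS VFS: cifs_mount failed w/return code = -13".
_CODE_RE = re.compile(r"CIFS VFS: .*?(?:Error|return code =) -(\d+)")


def decode_cifs_errors(log_text):
    """Return a sorted list of (code, errno_name, message) found in log_text."""
    codes = {int(m.group(1)) for m in _CODE_RE.finditer(log_text)}
    return [
        (-c, errno.errorcode.get(c, "UNKNOWN"), os.strerror(c))
        for c in sorted(codes)
    ]
```

Feeding it the output of dmesg (saved to a file or captured as a string) gives one entry per distinct error code, which makes it easier to tell connection drops apart from authentication failures.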
 
I have the same problems with a proxmox6-cluster (the proxmox5-cluster is not affected).

"CIFS mounts seem to become unresponsive after some time"

I see the problem when xxx.your-storagebox.de is not available (maintenance or something similar). As long as no backup task is running or stalled, I can unmount the CIFS mount with umount /mnt/pve/xxx/dump, and the storage is re-mounted after some time.
 
It does indeed seem to be related to the SMB version, which is 3.1.x by default for >= 5.0 kernels. I added the vers=3.0 mount option yesterday and it looks like the issues have gone away. The pvestatd issues are most likely related to the CIFS mount, because pvestatd queries the storage entries and may fail when they become unresponsive.

I'll closely monitor the behaviour now and report back to (hopefully) confirm that it has been fixed.
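
To confirm which dialect a mount is actually using after such a change, the mount options can be read back from /proc/mounts. This is a rough sketch, assuming Python 3 and assuming the kernel's CIFS client lists a vers= entry among the options (worth verifying on the host in question); cifs_dialects is a hypothetical helper name.

```python
def cifs_dialects(mounts_text):
    """Map each CIFS mount point to its vers= option (None if absent)."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] != "cifs":
            continue
        vers = None
        for opt in fields[3].split(","):
            if opt.startswith("vers="):
                vers = opt[len("vers="):]
        result[fields[1]] = vers
    return result
```

On a live host it could be called as cifs_dialects(open("/proc/mounts").read()). A value of None means no vers= entry was found for that mount, not that the mount is broken.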
 
Hello, I have the same issue:
[53689.442799] CIFS VFS: Cancelling wait for mid 67599 cmd: 5
[53689.442865] CIFS VFS: Cancelling wait for mid 67600 cmd: 16
[53692.285196] CIFS VFS: Close unmatched open
[53698.981206] CIFS VFS: Cancelling wait for mid 68432 cmd: 5
[53698.981254] CIFS VFS: Cancelling wait for mid 68433 cmd: 16
[53703.265468] CIFS VFS: Close unmatched open


I tried modifying /etc/pve/storage.cfg, changing the SMB version to 3.0 and then to 2.0, but the system still hangs during backups.
This is my configuration:
cifs: sambaProxmoxBackup
        path /mnt/pve/sambaProxmoxBackup
        server pdcsamba
        share proxmoxBackup
        content backup
        maxfiles 3
        username adminProxmoxBackup
        smbversion 2.0


Any suggestions?
 
Did it help? Have you found a solution?

Thx
 
