[SOLVED] Frequent hangs when using CIFS on host

reah, New Member, Dec 31, 2023
Hello,

for use in an unprivileged LXC container, I need to mount a CIFS share on my host and forward it into the container. But I already run into a problem right after mounting it on the host.

Code:
mount -t cifs //192.168.0.1/cctv /mnt/smb/cctv -o username=username,password=password
Mounting works, but bash then hangs while navigating the filesystem. It doesn't hang right away, but chances are good it does after half a dozen `ls` invocations. Also,
Code:
umount /mnt/smb/cctv
often (but not always) hangs.
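One thing that at least bounds the hangs while debugging (it does not fix the underlying network problem): CIFS mounts are "hard" by default, so processes block until the server answers. A soft mount returns an I/O error instead of hanging forever, and a shorter echo interval makes the client notice a dead server sooner. A sketch of an /etc/fstab entry, assuming SMB 3.0 and a credentials file at /root/.smbcred (both assumptions, adjust to your setup):

```shell
# /etc/fstab sketch: 'soft' returns EIO instead of blocking indefinitely;
# echo_interval=30 (seconds) shortens the dead-server detection time.
# /root/.smbcred is an assumed credentials file with username=/password= lines.
//192.168.0.1/cctv  /mnt/smb/cctv  cifs  credentials=/root/.smbcred,vers=3.0,soft,echo_interval=30,_netdev  0  0
```

Note the trade-off: with `soft`, applications get I/O errors during an outage instead of waiting it out, so use it for diagnosis rather than as a permanent setting for anything that must not lose writes.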

Some time later, bash reacts again as if nothing had happened. Looking into the syslog, I then see
Code:
hades kernel: [508979.758049] CIFS: VFS: \\192.168.0.1 has not responded in 180 seconds. Reconnecting...
But I also had
Code:
hades kernel: [510309.890844] CIFS: VFS: reconnect tcon failed rc = -11

The interesting part is that I have been doing the same mount with the same parameters in two VMs for months without any sign of a problem.
Pinging the NAS in parallel to the hangs also shows no packet loss.

So what can be the reason here?

Regards,
Reah
 
I have a similar problem. Fresh PVE 8.1 install, fresh TrueNAS CORE install. I'm fairly certain it's not a hardware problem on my NAS: I've run SMART tests on all of my disks, which came back fine, and checked the RAM, no issues. I can upload large files/ISOs to the SMB/CIFS share fine. However, when I try to create a new LXC container, I get all sorts of errors, and when I run `pct fsck <CT_ID>` it hangs and the I/O delay spikes, eventually causing a watchdog timeout.

What's even stranger: I can install OPNsense in a VM fine (UFS filesystem), but when I try to install Debian 12 or Ubuntu 23.04 (ext4 filesystem), the installer fails immediately with a buffer I/O error. When I share my NAS as an NFS share (same ZFS pool, different dataset) I have none of these issues, which makes me think this is an issue with the CIFS client in the Linux 6.5.11-7-pve kernel.
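To narrow it down to the CIFS client, it can help to check what dialect and mount options were actually negotiated, and what the kernel has logged. A small sketch (the commands are guarded so they also succeed on a box without any CIFS mount):

```shell
# Show active CIFS mounts with their negotiated options (vers=, cache=, hard/soft).
grep cifs /proc/mounts || echo "no cifs mounts"
# Recent CIFS messages from the kernel ring buffer (may need root if dmesg is restricted).
dmesg 2>/dev/null | grep -i cifs || echo "no cifs kernel messages"
```

If the VMs that work fine negotiated a different SMB version than the host, forcing that version with `vers=` on the host mount would be an obvious experiment.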
I'll attach some relevant logs.

PROXMOX LOGS:
Code:
Jan 01 16:57:22 pve1 kernel: operation not supported error, dev loop0, sector 12584 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:22 pve1 kernel: operation not supported error, dev loop0, sector 16680 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:22 pve1 kernel: operation not supported error, dev loop0, sector 20776 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:22 pve1 kernel: operation not supported error, dev loop0, sector 24872 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:22 pve1 kernel: operation not supported error, dev loop0, sector 28968 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:22 pve1 kernel: operation not supported error, dev loop0, sector 33064 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:22 pve1 kernel: operation not supported error, dev loop0, sector 37160 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:22 pve1 kernel: operation not supported error, dev loop0, sector 41256 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:22 pve1 kernel: operation not supported error, dev loop0, sector 45352 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:34 pve1 pvedaemon[30078]: volume 'apt-nas:101/vm-101-disk-0.raw' does not exist
Jan 01 16:57:34 pve1 kernel: CIFS: VFS: cifs_invalidate_mapping: invalidate inode 00000000c5ad4404 failed with rc -5
Jan 01 16:57:34 pve1 pvedaemon[1272]: <root@pam> end task UPID:pve1:0000757E:000F9AA3:6593435E:vzstart:101:root@pam: volume 'apt-nas:101/vm-101-disk-0.raw' does not >
Jan 01 16:58:47 pve1 kernel: I/O error, dev loop0, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 2
Jan 01 16:58:47 pve1 kernel: Buffer I/O error on dev loop0, logical block 0, lost sync page write
Jan 01 16:58:47 pve1 kernel: EXT4-fs (loop0): I/O error while writing superblock
Jan 01 16:58:47 pve1 kernel: EXT4-fs (loop0): mount failed
Jan 01 16:58:52 pve1 kernel: CIFS: VFS: No writable handle in writepages rc=-9
Jan 01 17:02:22 pve1 kernel: INFO: task fsck:30383 blocked for more than 120 seconds.
Jan 01 17:02:22 pve1 kernel:       Tainted: P           O       6.5.11-7-pve #1
Jan 01 17:02:22 pve1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 01 17:04:23 pve1 kernel: INFO: task fsck:30383 blocked for more than 241 seconds.
[... the same three-line hung-task message repeats roughly every two minutes, ending with "blocked for more than 1208 seconds" at 17:20:29 ...]
Jan 01 17:23:34 pve1 pvedaemon[34262]: command '/usr/bin/termproxy 5900 --path /nodes/pve1 --perm Sys.Console -- /bin/login -f root' failed: exit code 1
Jan 01 17:23:34 pve1 pvedaemon[1273]: <root@pam> end task UPID:pve1:000085D6:0011F839:6593496C:vncshell::root@pam: command '/usr/bin/termproxy 5900 --path /nodes/pve>
Jan 01 17:31:40 pve1 pvedaemon[1273]: <root@pam> end task UPID:pve1:000085A7:0011F2BA:6593495E:vzstart:101:root@pam: unable to read tail (got 0 bytes)

DEBIAN 12 VM INSTALLER LOGS:

Code:
Jan  2 06:09:26 partman: mke2fs 1.47.0 (5-Feb-2023)
Jan  2 06:09:26 partman: mkfs.ext4:
Jan  2 06:09:26 partman: Input/output error while writing out and closing file system
Jan  2 06:09:26 kernel: [ 3210.547038] sd 2:0:0:0: [sda] tag#65 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
Jan  2 06:09:26 kernel: [ 3210.547043] sd 2:0:0:0: [sda] tag#65 Sense Key : Aborted Command [current]
Jan  2 06:09:26 kernel: [ 3210.547045] sd 2:0:0:0: [sda] tag#65 Add. Sense: I/O process terminated
Jan  2 06:09:26 kernel: [ 3210.547048] sd 2:0:0:0: [sda] tag#65 CDB: Write(10) 2a 00 01 d3 50 00 00 00 08 00
Jan  2 06:09:26 kernel: [ 3210.547049] I/O error, dev sda, sector 30625792 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 2
Jan  2 06:09:26 kernel: [ 3210.547054] Buffer I/O error on dev dm-0, logical block 3702784, lost async page write
Jan  2 06:13:53 net/hw-detect.hotplug: Detected hotpluggable network interface ens18
Jan  2 06:13:53 net/hw-detect.hotplug: Detected hotpluggable network interface lo
Jan  2 06:13:53 partman: mke2fs 1.47.0 (5-Feb-2023)
Jan  2 06:13:53 kernel: [ 3478.185563] sd 2:0:0:0: [sda] tag#80 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
Jan  2 06:13:53 kernel: [ 3478.185572] sd 2:0:0:0: [sda] tag#80 Sense Key : Aborted Command [current]
Jan  2 06:13:53 kernel: [ 3478.185574] sd 2:0:0:0: [sda] tag#80 Add. Sense: I/O process terminated
Jan  2 06:13:53 kernel: [ 3478.185577] sd 2:0:0:0: [sda] tag#80 CDB: Write(10) 2a 00 01 d3 50 00 00 00 08 00
Jan  2 06:13:53 kernel: [ 3478.185579] I/O error, dev sda, sector 30625792 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 2
Jan  2 06:13:53 kernel: [ 3478.185583] Buffer I/O error on dev dm-0, logical block 3702784, lost async page write
Jan  2 06:13:53 partman: mkfs.ext4:
Jan  2 06:13:53 partman: Input/output error while writing out and closing file system
Jan  2 06:14:01 net/hw-detect.hotplug: Detected hotpluggable network interface ens18
Jan  2 06:14:01 net/hw-detect.hotplug: Detected hotpluggable network interface lo
Jan  2 06:14:01 partman: mke2fs 1.47.0 (5-Feb-2023)
Jan  2 06:14:01 partman: mkfs.ext4:
Jan  2 06:14:01 partman: Input/output error while writing out and closing file system
Jan  2 06:14:01 kernel: [ 3485.767081] sd 2:0:0:0: [sda] tag#94 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
Jan  2 06:14:01 kernel: [ 3485.767087] sd 2:0:0:0: [sda] tag#94 Sense Key : Aborted Command [current]
Jan  2 06:14:01 kernel: [ 3485.767089] sd 2:0:0:0: [sda] tag#94 Add. Sense: I/O process terminated
Jan  2 06:14:01 kernel: [ 3485.767092] sd 2:0:0:0: [sda] tag#94 CDB: Write(10) 2a 00 01 d3 50 00 00 00 08 00
Jan  2 06:14:01 kernel: [ 3485.767093] I/O error, dev sda, sector 30625792 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 2
Jan  2 06:14:01 kernel: [ 3485.767097] Buffer I/O error on dev dm-0, logical block 3702784, lost async page write
 

So this is somewhat solved for me. The traffic from the hypervisor to the NAS was routed through the management VLAN via the router into the home network, but the return path went directly over the management VLAN. Although this works per se, timeouts seem to occur; I just wasn't able to determine why.

Anyway, I added a home-network VLAN on the hypervisor so it can communicate with the NAS directly on the same network, which is exactly what the VMs do, and that works fine.
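For anyone wanting to replicate this, a minimal sketch of such a VLAN interface in /etc/network/interfaces on a Proxmox host. The bridge name vmbr0, VLAN ID 20, and the address are assumptions; adjust them to your network:

```shell
# /etc/network/interfaces fragment (Proxmox ifupdown2 style).
# vmbr0.20 creates a VLAN-20 sub-interface on the existing bridge vmbr0,
# giving the host an address directly on the NAS's network.
auto vmbr0.20
iface vmbr0.20 inet static
        address 192.168.0.250/24
```

With this in place, the CIFS mount can target the NAS over the same L2 segment instead of taking an asymmetric path through the router.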
 
