[SOLVED] Issues with TrueNAS Core CIFS share

coremed

Jan 3, 2024
Fresh PVE 8.1 install, fresh TrueNAS CORE install. I'm fairly certain it's not a hardware problem on my NAS: I've run SMART tests on all of my disks, which came back fine, and checked the RAM, no issues. I can upload large files/ISOs to the SMB/CIFS share fine; however, when I try to create a new LXC container, I get all sorts of errors. When I run `pct fsck <CT_ID>`, it hangs and the IO delay spikes, eventually causing a watchdog timeout. What's even stranger is that I can install OPNsense in a VM fine (UFS filesystem), but when I try to install Debian 12 or Ubuntu 23.04 (ext4 filesystem), the installer fails immediately due to a Buffer I/O error. When I share my NAS as an NFS share (same ZFS pool, different dataset), I have none of these issues, which makes me think this is an issue with the CIFS client in the Linux 6.5.11-7-pve kernel.
I'll attach some relevant logs.

PROXMOX LOGS:
Code:
0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:22 pve1 kernel: operation not supported error, dev loop0, sector 12584 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:22 pve1 kernel: operation not supported error, dev loop0, sector 16680 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:22 pve1 kernel: operation not supported error, dev loop0, sector 20776 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:22 pve1 kernel: operation not supported error, dev loop0, sector 24872 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:22 pve1 kernel: operation not supported error, dev loop0, sector 28968 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:22 pve1 kernel: operation not supported error, dev loop0, sector 33064 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:22 pve1 kernel: operation not supported error, dev loop0, sector 37160 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:22 pve1 kernel: operation not supported error, dev loop0, sector 41256 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:22 pve1 kernel: operation not supported error, dev loop0, sector 45352 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 2
Jan 01 16:57:34 pve1 pvedaemon[30078]: volume 'apt-nas:101/vm-101-disk-0.raw' does not exist
Jan 01 16:57:34 pve1 kernel: CIFS: VFS: cifs_invalidate_mapping: invalidate inode 00000000c5ad4404 failed with rc -5
Jan 01 16:57:34 pve1 pvedaemon[1272]: <root@pam> end task UPID:pve1:0000757E:000F9AA3:6593435E:vzstart:101:root@pam: volume 'apt-nas:101/vm-101-disk-0.raw' does not >
Jan 01 16:58:47 pve1 kernel: I/O error, dev loop0, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 2
Jan 01 16:58:47 pve1 kernel: Buffer I/O error on dev loop0, logical block 0, lost sync page write

Jan 01 16:58:47 pve1 kernel: I/O error, dev loop0, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 2
Jan 01 16:58:47 pve1 kernel: Buffer I/O error on dev loop0, logical block 0, lost sync page write
Jan 01 16:58:47 pve1 kernel: EXT4-fs (loop0): I/O error while writing superblock
Jan 01 16:58:47 pve1 kernel: EXT4-fs (loop0): mount failed
Jan 01 16:58:52 pve1 kernel: CIFS: VFS: No writable handle in writepages rc=-9
Jan 01 17:02:22 pve1 kernel: INFO: task fsck:30383 blocked for more than 120 seconds.
Jan 01 17:02:22 pve1 kernel:       Tainted: P           O       6.5.11-7-pve #1
Jan 01 17:02:22 pve1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 01 17:04:23 pve1 kernel: INFO: task fsck:30383 blocked for more than 241 seconds.
Jan 01 17:04:23 pve1 kernel:       Tainted: P           O       6.5.11-7-pve #1
Jan 01 17:04:23 pve1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 01 17:06:23 pve1 kernel: INFO: task fsck:30383 blocked for more than 362 seconds.
Jan 01 17:06:23 pve1 kernel:       Tainted: P           O       6.5.11-7-pve #1
Jan 01 17:06:23 pve1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 01 17:08:24 pve1 kernel: INFO: task fsck:30383 blocked for more than 483 seconds.
Jan 01 17:08:24 pve1 kernel:       Tainted: P           O       6.5.11-7-pve #1
Jan 01 17:08:24 pve1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 01 17:10:25 pve1 kernel: INFO: task fsck:30383 blocked for more than 604 seconds.
Jan 01 17:10:25 pve1 kernel:       Tainted: P           O       6.5.11-7-pve #1
Jan 01 17:10:25 pve1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 01 17:12:26 pve1 kernel: INFO: task fsck:30383 blocked for more than 724 seconds.
Jan 01 17:12:26 pve1 kernel:       Tainted: P           O       6.5.11-7-pve #1
Jan 01 17:12:26 pve1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 01 17:14:27 pve1 kernel: INFO: task fsck:30383 blocked for more than 845 seconds.
Jan 01 17:14:27 pve1 kernel:       Tainted: P           O       6.5.11-7-pve #1
Jan 01 17:14:27 pve1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 01 17:16:28 pve1 kernel: INFO: task fsck:30383 blocked for more than 966 seconds.
Jan 01 17:16:28 pve1 kernel:       Tainted: P           O       6.5.11-7-pve #1
Jan 01 17:16:28 pve1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 01 17:18:28 pve1 kernel: INFO: task fsck:30383 blocked for more than 1087 seconds.
Jan 01 17:18:28 pve1 kernel:       Tainted: P           O       6.5.11-7-pve #1
Jan 01 17:18:28 pve1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 01 17:20:29 pve1 kernel: INFO: task fsck:30383 blocked for more than 1208 seconds.
Jan 01 17:20:29 pve1 kernel:       Tainted: P           O       6.5.11-7-pve #1
Jan 01 17:20:29 pve1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 01 17:23:34 pve1 pvedaemon[34262]: command '/usr/bin/termproxy 5900 --path /nodes/pve1 --perm Sys.Console -- /bin/login -f root' failed: exit code 1
Jan 01 17:23:34 pve1 pvedaemon[1273]: <root@pam> end task UPID:pve1:000085D6:0011F839:6593496C:vncshell::root@pam: command '/usr/bin/termproxy 5900 --path /nodes/pve>
Jan 01 17:31:40 pve1 pvedaemon[1273]: <root@pam> end task UPID:pve1:000085A7:0011F2BA:6593495E:vzstart:101:root@pam: unable to read tail (got 0 bytes)
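
For anyone triaging a similar log, here is a rough sketch of a helper that tallies the error classes seen above (unsupported WRITE_ZEROES operations, buffer I/O errors, CIFS VFS messages, hung-task warnings). The function name `summarize_storage_errors` and the category labels are made up for illustration; the patterns are taken from the log lines in this post and may need adjusting for other kernels.

```shell
#!/bin/sh
# Hypothetical helper: tally storage-related kernel messages.
# Reads log text on stdin and prints one "<label> <count>" line per category.
summarize_storage_errors() {
    awk '
        /operation not supported error/ { unsupported++ }  # e.g. WRITE_ZEROES on loop0
        /Buffer I\/O error/             { buffer_io++ }    # lost page writes
        /CIFS: VFS:/                    { cifs++ }         # messages from the CIFS client
        /blocked for more than/         { hung++ }         # hung-task warnings
        END {
            printf "unsupported_op %d\n", unsupported
            printf "buffer_io %d\n", buffer_io
            printf "cifs_vfs %d\n", cifs
            printf "hung_task %d\n", hung
        }
    '
}
```

It could be fed from the kernel journal of the current boot, for example `journalctl -k -b | summarize_storage_errors`, or from a saved excerpt.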

DEBIAN 12 VM INSTALLER LOGS:

Code:
Jan  2 06:09:26 partman: mke2fs 1.47.0 (5-Feb-2023)
Jan  2 06:09:26 partman: mkfs.ext4:
Jan  2 06:09:26 partman: Input/output error while writing out and closing file system
Jan  2 06:09:26 kernel: [ 3210.547038] sd 2:0:0:0: [sda] tag#65 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
Jan  2 06:09:26 kernel: [ 3210.547043] sd 2:0:0:0: [sda] tag#65 Sense Key : Aborted Command [current]
Jan  2 06:09:26 kernel: [ 3210.547045] sd 2:0:0:0: [sda] tag#65 Add. Sense: I/O process terminated
Jan  2 06:09:26 kernel: [ 3210.547048] sd 2:0:0:0: [sda] tag#65 CDB: Write(10) 2a 00 01 d3 50 00 00 00 08 00
Jan  2 06:09:26 kernel: [ 3210.547049] I/O error, dev sda, sector 30625792 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 2
Jan  2 06:09:26 kernel: [ 3210.547054] Buffer I/O error on dev dm-0, logical block 3702784, lost async page write
Jan  2 06:13:53 net/hw-detect.hotplug: Detected hotpluggable network interface ens18
Jan  2 06:13:53 net/hw-detect.hotplug: Detected hotpluggable network interface lo
Jan  2 06:13:53 partman: mke2fs 1.47.0 (5-Feb-2023)
Jan  2 06:13:53 kernel: [ 3478.185563] sd 2:0:0:0: [sda] tag#80 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
Jan  2 06:13:53 kernel: [ 3478.185572] sd 2:0:0:0: [sda] tag#80 Sense Key : Aborted Command [current]
Jan  2 06:13:53 kernel: [ 3478.185574] sd 2:0:0:0: [sda] tag#80 Add. Sense: I/O process terminated
Jan  2 06:13:53 kernel: [ 3478.185577] sd 2:0:0:0: [sda] tag#80 CDB: Write(10) 2a 00 01 d3 50 00 00 00 08 00
Jan  2 06:13:53 kernel: [ 3478.185579] I/O error, dev sda, sector 30625792 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 2
Jan  2 06:13:53 kernel: [ 3478.185583] Buffer I/O error on dev dm-0, logical block 3702784, lost async page write
Jan  2 06:13:53 partman: mkfs.ext4:
Jan  2 06:13:53 partman: Input/output error while writing out and closing file system
Jan  2 06:14:01 net/hw-detect.hotplug: Detected hotpluggable network interface ens18
Jan  2 06:14:01 net/hw-detect.hotplug: Detected hotpluggable network interface lo
Jan  2 06:14:01 partman: mke2fs 1.47.0 (5-Feb-2023)
Jan  2 06:14:01 partman: mkfs.ext4:
Jan  2 06:14:01 partman: Input/output error while writing out and closing file system
Jan  2 06:14:01 kernel: [ 3485.767081] sd 2:0:0:0: [sda] tag#94 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
Jan  2 06:14:01 kernel: [ 3485.767087] sd 2:0:0:0: [sda] tag#94 Sense Key : Aborted Command [current]
Jan  2 06:14:01 kernel: [ 3485.767089] sd 2:0:0:0: [sda] tag#94 Add. Sense: I/O process terminated
Jan  2 06:14:01 kernel: [ 3485.767092] sd 2:0:0:0: [sda] tag#94 CDB: Write(10) 2a 00 01 d3 50 00 00 00 08 00
Jan  2 06:14:01 kernel: [ 3485.767093] I/O error, dev sda, sector 30625792 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 2
Jan  2 06:14:01 kernel: [ 3485.767097] Buffer I/O error on dev dm-0, logical block 3702784, lost async page write
 
To add more info:
Network speed is not an issue; iperf3 shows the full 10 Gbps from node to NAS.
MTU is 1500 on both the NAS and the PVE nodes.
Kernel is 6.5.11-7-pve (Proxmox VE 8.1.3).

Storage config:

Code:
root@pve1:/etc/pve# cat storage.cfg
dir: local
    path /var/lib/vz
    content iso,vztmpl,backup

lvmthin: local-lvm
    thinpool data
    vgname pve
    content rootdir,images

cifs: apt-nas
    path /mnt/pve/apt-nas
    server 10.10.10.100
    share apt-nas
    content iso,images,vztmpl,rootdir
    prune-backups keep-all=1
    username pve

mount:

Code:
root@pve1:/etc/pve# mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,relatime)
udev on /dev type devtmpfs (rw,nosuid,relatime,size=65758104k,nr_inodes=16439526,mode=755,inode64)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,noexec,relatime,size=13158356k,mode=755,inode64)
/dev/mapper/pve-root on / type ext4 (rw,relatime,errors=remount-ro)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,inode64)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k,inode64)
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=30,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=49169)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)
tracefs on /sys/kernel/tracing type tracefs (rw,nosuid,nodev,noexec,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)
configfs on /sys/kernel/config type configfs (rw,nosuid,nodev,noexec,relatime)
ramfs on /run/credentials/systemd-sysusers.service type ramfs (ro,nosuid,nodev,noexec,relatime,mode=700)
ramfs on /run/credentials/systemd-tmpfiles-setup-dev.service type ramfs (ro,nosuid,nodev,noexec,relatime,mode=700)
/dev/sda2 on /boot/efi type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
ramfs on /run/credentials/systemd-sysctl.service type ramfs (ro,nosuid,nodev,noexec,relatime,mode=700)
ramfs on /run/credentials/systemd-tmpfiles-setup.service type ramfs (ro,nosuid,nodev,noexec,relatime,mode=700)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,nosuid,nodev,noexec,relatime)
lxcfs on /var/lib/lxcfs type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
sunrpc on /run/rpc_pipefs type rpc_pipefs (rw,relatime)
/dev/fuse on /etc/pve type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)
//10.10.10.100/apt-nas on /mnt/pve/apt-nas type cifs (rw,relatime,vers=3.1.1,cache=strict,username=pve,uid=0,noforceuid,gid=0,noforcegid,addr=10.10.10.100,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,size=13158352k,nr_inodes=3289588,mode=700,inode64)
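
The CIFS line in the mount output above shows the share mounted with `cache=strict`. As a quick way to check which cache mode a CIFS mount is actually using, something like the following sketch could help. The function name `cifs_cache_modes` is hypothetical, and it assumes `mount`-style lines with no spaces in the mount point.

```shell
#!/bin/sh
# Hypothetical helper: print "<mount point> <cache mode>" for every CIFS
# mount found in `mount`-style output on stdin.
# Assumes the mount point contains no spaces.
cifs_cache_modes() {
    awk '$5 == "cifs" {
        mode = "unknown"                      # used when no cache= option is listed
        n = split($6, opts, /[(),]/)          # break "(rw,relatime,...)" into options
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^cache=/) mode = substr(opts[i], 7)
        print $3, mode
    }'
}
```

Usage: `mount | cifs_cache_modes`. Non-CIFS mounts are skipped, so on a host without CIFS shares it prints nothing.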

Proxmox packages:

Code:
proxmox-ve: 8.1.0 (running kernel: 6.5.11-7-pve)
pve-manager: 8.1.3 (running version: 8.1.3/b46aac3b42da5d15)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.5: 6.5.11-7
proxmox-kernel-6.5.11-7-pve-signed: 6.5.11-7
proxmox-kernel-6.5.11-4-pve-signed: 6.5.11-4
ceph-fuse: 17.2.7-pve1
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx7
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.0.7
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.1.0
libpve-guest-common-perl: 5.0.6
libpve-http-server-perl: 5.0.5
libpve-network-perl: 0.9.5
libpve-rs-perl: 0.8.7
libpve-storage-perl: 8.0.5
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve4
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.1.2-1
proxmox-backup-file-restore: 3.1.2-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.2
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.3
proxmox-widget-toolkit: 4.1.3
pve-cluster: 8.0.5
pve-container: 5.0.8
pve-docs: 8.1.3
pve-edk2-firmware: 4.2023.08-2
pve-firewall: 5.0.3
pve-firmware: 3.9-1
pve-ha-manager: 4.0.3
pve-i18n: 3.1.5
pve-qemu-kvm: 8.1.2-6
pve-xtermjs: 5.3.0-3
qemu-server: 8.0.10
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.2-pve1
 
Try adding "options cache=none" to the CIFS storage entry in your /etc/pve/storage.cfg.
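
As a rough sketch, the `apt-nas` entry from the first post might then look like the fragment below. This assumes the installed storage library accepts an `options` line for CIFS storages (extra mount options passed through to mount.cifs); the placement of the line within the entry is arbitrary. Mount options only apply at mount time, so the share would need to be unmounted and mounted again before the change takes effect.

```
cifs: apt-nas
    path /mnt/pve/apt-nas
    server 10.10.10.100
    share apt-nas
    content iso,images,vztmpl,rootdir
    options cache=none
    prune-backups keep-all=1
    username pve
```

After remounting, the `mount` output for `/mnt/pve/apt-nas` should list `cache=none` instead of `cache=strict` if the option was picked up.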
 
Just fixed the issue. If anyone else runs into a similar issue, install TrueNAS SCALE. TrueNAS CORE is using an outdated version of Samba, which I believe was the cause of the problem. Either way, installing SCALE resolved the issue.
 
Had the same problem. To find the cause, I stopped my VMs and CTs one by one and found that two of my CTs were the culprits. To fix it, I removed them and restored them from an older backup; no more problems after that!