LXC containers with CIFS Share as Disk Drive Not booting.

yeabor

New Member
Sep 19, 2022
Hi,

I am having issues booting an LXC container that uses a CIFS/SMB network share as disk storage. The only way I can get the container to boot is by rebooting my Proxmox node. The Proxmox VE version is 7.2-3.

Attached are the storage.cfg, the pct config and the pct start log.


Screenshot from 2022-09-19 21-07-14.png
 
On a hunch - please try running `pct fsck 240` and see if this improves the situation.
 
Hi,
The command checks and repairs the file system on the volume. Maybe the superblock got corrupted somehow?

I could reproduce the issue, and unfortunately running fsck doesn't seem to be a permanent solution; it only helps until the next time the file system is re-mounted. There are quite a few threads in the forum over the years about the same issue with a rootfs disk on CIFS.

Here is an example from my machine. I even got a transient volume 'mysmb:143/vm-143-disk-0.raw' does not exist error once.
Code:
root@pve701:~# pct mount 143
mounted CT 143 in '/var/lib/lxc/143/rootfs'
root@pve701:~# pct unmount 143
root@pve701:~# pct mount 143
mount: /var/lib/lxc/143/rootfs: can't read superblock on /dev/loop0.
mounting container failed
command 'mount /dev/loop0 /var/lib/lxc/143/rootfs//' failed: exit code 32
root@pve701:~# pct unlock 143
root@pve701:~# pct mount 143
mount: /var/lib/lxc/143/rootfs: can't read superblock on /dev/loop0.
mounting container failed
command 'mount /dev/loop0 /var/lib/lxc/143/rootfs//' failed: exit code 32
root@pve701:~# pct unlock 143
root@pve701:~# pct fsck 143
fsck from util-linux 2.36.1
MMP interval is 10 seconds and total wait time is 42 seconds. Please wait...
/mnt/pve/mysmb/images/143/vm-143-disk-0.raw: recovering journal
/mnt/pve/mysmb/images/143/vm-143-disk-0.raw: clean, 21884/524288 files, 186373/2097152 blocks
root@pve701:~# pct mount 143
mounted CT 143 in '/var/lib/lxc/143/rootfs'
root@pve701:~# pct unmount 143
root@pve701:~# pct mount 143
volume 'mysmb:143/vm-143-disk-0.raw' does not exist
root@pve701:~# pct unlock 143
root@pve701:~# pct mount 143
mount: /var/lib/lxc/143/rootfs: can't read superblock on /dev/loop0.
mounting container failed
command 'mount /dev/loop0 /var/lib/lxc/143/rootfs//' failed: exit code 32
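
Side note (not part of the original repro, just a suggestion): if anyone wants to dig into the same failure, the standard tools below show which backing file the loop device is mapped to and whether the kernel logged any CIFS or loop errors at the time the mount fails. The device name matches the example above; adjust it for your setup.
Code:
# which backing file is /dev/loop0 (or any other loop device) mapped to?
losetup -a
# any CIFS or loop related errors in the kernel log?
dmesg | grep -iE 'cifs|loop' | tail -n 20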
 
Hi,

Sorry to revive this old thread, but I was running into this issue in one of my trainings. The corrupt/missing inodes seem to vanish when options cache=none is set on the CIFS storage in /etc/pve/storage.cfg.
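
For reference, this is roughly what the relevant entry in /etc/pve/storage.cfg looks like with the option added. The storage name and mount path match the example earlier in this thread; server, share and the other fields are placeholders, not an actual config:
Code:
cifs: mysmb
        path /mnt/pve/mysmb
        server 192.168.1.10
        share myshare
        content images,rootdir
        options cache=none
Setting it via pvesm set mysmb --options cache=none should work as well, if your PVE version exposes the options property for CIFS storage. Either way, the new option only takes effect once the share has been re-mounted.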

From the mount.cifs man page (cache=strict is the default):
cache=none means that the client never utilizes the cache for normal reads and writes. It always accesses the server directly to satisfy a read or write request.

cache=strict means that the client will attempt to follow the CIFS/SMB2 protocol strictly. That is, the cache is only trusted when the client holds an oplock. When the client does not hold an oplock, then the client bypasses the cache and accesses the server directly to satisfy a read or write request. By doing this, the client avoids problems with byte range locks. Additionally, byte range locks are cached on the client when it holds an oplock and are "pushed" to the server when that oplock is recalled.
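
Not from the man page, but a quick way to confirm which cache mode a mounted CIFS share is actually using is to look at the active mount options, e.g.:
Code:
findmnt -t cifs -o TARGET,SOURCE,OPTIONS
# or
grep cifs /proc/mounts
The options column should list cache=none (or cache=strict) for the share backing the storage.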

The performance difference for the container, tested on an Alpine 3.19 container:
Code:
~ # fio --rw=write --name=test --bs=4K --size=2G --direct=0 --sync=0 --filename=/srv/test.dd --runtime=600 --time_based
### cache=none
Jobs: 1 (f=1): [W(1)][100.0%][w=18.0MiB/s][w=4600 IOPS][eta 00m:00s]
### cache=strict (default)
Jobs: 1 (f=1): [W(1)][100.0%][w=659MiB/s][w=169k IOPS][eta 00m:00s]

Code:
~ # fio --rw=read --name=test --bs=4K --size=2G --direct=0 --sync=0 --filename=/srv/test.dd --runtime=600 --time_based
### cache=none
Jobs: 1 (f=1): [R(1)][100.0%][r=21.0MiB/s][r=5376 IOPS][eta 00m:00s]
### cache=strict (default)
Jobs: 1 (f=1): [R(1)][100.0%][r=567MiB/s][r=145k IOPS][eta 00m:00s]
Just a simple test for sequential reads/writes, in particular testing the difference in cache performance. Since my test system consists entirely of VMs, those IOPS numbers are only exemplary.

I suppose there might be other settings that could mitigate this behavior as well; I'll keep digging a bit. Just wanted to post my findings in the meantime.
 