[SOLVED] lxc nested subvols sideloaded not visible at boot

Kodey

Member
Oct 26, 2021
I have a new CT, created with the GUI:
INI:
arch: amd64
cores: 1
hostname: testct
memory: 2048
nameserver: 192.168.1.1
net0: name=eth0,bridge=vmbr0,firewall=1,hwaddr=BC:24:11:C6:DA:75,link_down=1,type=veth
ostype: debian
rootfs: zfs16Tr10:subvol-112-disk-0,replicate=0,size=12G
searchdomain: jammed-with-love.com
swap: 2048
unprivileged: 1

I have created a nested subvol under the rootfs:
Code:
# zfs create \
            -o aclinherit=discard \
            -o canmount=noauto \
            -o devices=off \
            -o compression=on \
            -o dnodesize=auto \
            -o encryption=on \
            -o keyformat=passphrase \
            -o keylocation='file:///root/zfs10-pool-subvol-112-disk-1-storage.password' \
            -o mountpoint='/zfs10-pool/subvol-112-disk-0/srv/nested/storage' \
            'zfs10-pool/subvol-112-disk-0/subvol-112-disk-1'

It's mounted, and the host can read it:
Code:
# zfs mount -l zfs10-pool/subvol-112-disk-0/subvol-112-disk-1
# zfs list -Ho mounted "zfs10-pool/subvol-112-disk-0/subvol-112-disk-1"
yes
# ls /zfs10-pool/subvol-112-disk-0/srv/nested/storage
test1  test2  test3  test4

When I start the container, it's not visible:
Code:
# pct start 112
# pct enter 112
root@testct:/# df
Filesystem                   1K-blocks   Used Available Use% Mounted on
zfs10-pool/subvol-112-disk-0  12582912 352000  12230912   3% /
none                               492      4       488   1% /dev
udev                          53297284      0  53297284   0% /dev/tty
tmpfs                         65913848      0  65913848   0% /dev/shm
tmpfs                         26365540     44  26365496   1% /run
tmpfs                             5120      0      5120   0% /run/lock
tmpfs                             1024      0      1024   0% /run/credentials/systemd-sysctl.service
tmpfs                             1024      0      1024   0% /run/credentials/systemd-sysusers.service
tmpfs                             1024      0      1024   0% /run/credentials/systemd-tmpfiles-setup-dev.service
tmpfs                             1024      0      1024   0% /run/credentials/systemd-tmpfiles-setup.service
root@testct:/# ls -l /srv/nested/storage/
total 0

However, if I mount the fs when the container is running, it works fine.
Code:
# pct start 112
# zfs mount -l zfs10-pool/subvol-112-disk-0/subvol-112-disk-1
# zfs list -Ho mounted "zfs10-pool/subvol-112-disk-0/subvol-112-disk-1"
yes
# ls /zfs10-pool/subvol-112-disk-0/srv/nested/storage
test1  test2  test3  test4
# pct enter 112
root@testct:/# df
Filesystem                                       1K-blocks   Used   Available Use% Mounted on
zfs10-pool/subvol-112-disk-0                      12582912 352000    12230912   3% /
none                                                   492      4         488   1% /dev
udev                                              53297284      0    53297284   0% /dev/tty
tmpfs                                             65913848      0    65913848   0% /dev/shm
tmpfs                                             26365540     44    26365496   1% /run
tmpfs                                                 5120      0        5120   0% /run/lock
tmpfs                                                 1024      0        1024   0% /run/credentials/systemd-sysctl.service
tmpfs                                                 1024      0        1024   0% /run/credentials/systemd-sysusers.service
tmpfs                                                 1024      0        1024   0% /run/credentials/systemd-tmpfiles-setup-dev.service
tmpfs                                                 1024      0        1024   0% /run/credentials/systemd-tmpfiles-setup.service
zfs10-pool/subvol-112-disk-0/subvol-112-disk-1 24040025728    384 24040025344   1% /srv/nested/storage
root@testct:/# ls -l /srv/nested/storage/
total 36
drwxr-xr-x 2 root root 3 Apr  8 03:30 test1
drwxr-xr-x 2 root root 3 Apr  8 03:30 test2
drwxr-xr-x 2 root root 3 Apr  8 03:31 test3
drwxr-xr-x 2 root root 3 Apr  8 03:31 test4

I'm trying to make sure the subvol is mounted before the container starts, because the media server will empty its database if all the media is missing.
This is a problem if the container autostarts at boot.
I was trying to write a hook script that makes sure the subvol is mounted at boot, or aborts the start when the key is not present.
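Roughly what I had in mind (untested sketch - the snippet name and the pct registration command are just what I'd expect to use):
Code:
#!/bin/bash
# untested pre-start hookscript sketch; register with something like:
#   pct set 112 --hookscript local:snippets/ct112-storage-check.sh
# Proxmox calls hookscripts with <vmid> and <phase> (pre-start, post-start, ...);
# a non-zero exit in pre-start should abort the container start.
vmid="$1"
phase="$2"
dataset="zfs10-pool/subvol-112-disk-0/subvol-112-disk-1"
keyfile="/root/zfs10-pool-subvol-112-disk-1-storage.password"

if [ "$phase" = "pre-start" ]; then
    # abort the start if the key file (on the USB thumb drive) isn't readable
    if [ ! -r "$keyfile" ]; then
        echo "CT $vmid: key file $keyfile not present, aborting start" >&2
        exit 1
    fi
    # load the key and mount the dataset if it isn't mounted yet
    if [ "$(zfs list -Ho mounted "$dataset")" != "yes" ]; then
        zfs mount -l "$dataset" || exit 1
    fi
fi
exit 0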

I'm using:
Code:
# zfs --version
zfs-2.2.7-pve2
zfs-kmod-2.2.7-pve2
# pveversion
pve-manager/8.3.5/dac3aa88bac3f300 (running kernel: 6.8.12-9-pve)

Kernel:
Code:
proxmox-kernel-6.8.12-9-pve-signed   6.8.12-9   amd64   Proxmox Kernel Image (signed)
This behaviour is very repeatable: I can start and shut down the container over and over and always see this problem.

It looks like a bug to me. Am I mistaken, or should I file a bug report?
What workaround can I try?
 
I don't use ZFS, but I believe you should be using bind mounts in the LXC to that nested dataset.
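Untested, and since I don't use ZFS the host path below is just a guess based on your mountpoint property, but roughly:
Code:
# pct set 112 -mp0 /zfs10-pool/subvol-112-disk-0/srv/nested/storage,mp=/srv/nested/storage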
 
The problem with that is that, if the subvol isn't mounted, the container will boot with an empty directory and the media server will delete its database entries.
I need something to prevent the container from booting if that volume isn't mounted.

It appears that there is some container trigger/hook that detects a mount at runtime, but it's not run at startup. I haven't tested yet, but from memory, this is the same behaviour for other types of fs mounts as well.
 
As a workaround, you could write your own script that tests for that mount (or mounts it) & only then starts the CT.
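Untested, but something along these lines (dataset and CT ID taken from your post):
Code:
#!/bin/bash
# mount the encrypted dataset (loads the key) if needed, then start the CT only if it really is mounted
ds="zfs10-pool/subvol-112-disk-0/subvol-112-disk-1"
zfs list -Ho mounted "$ds" | grep -qx yes || zfs mount -l "$ds"
if zfs list -Ho mounted "$ds" | grep -qx yes; then
    pct start 112
else
    echo "$ds is not mounted - not starting CT 112" >&2
    exit 1
fi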
 
The container doesn't see the mount unless it is mounted after the container has started. That's the bug.

I could probably test for the mount in the media server's systemd service with a Requires= dependency.
That seems far from ideal though, and I'm not a systemd aficionado; I'd rather prevent the container from starting at all.
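Something like this drop-in inside the CT is what I had in mind (untested; the unit name is hypothetical, and a condition check is probably closer to what I need than Requires=):
INI:
# /etc/systemd/system/mediaserver.service.d/require-storage.conf
[Unit]
# don't start the media server unless /srv/nested/storage is actually a mount point
ConditionPathIsMountPoint=/srv/nested/storage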

I'm looking for something that will boot without intervention as long as the USB thumb drive with the key is present.
 
The container doesn't see the mount unless it is mounted after the container has started. That's the bug.
As I've said - use a bind mount to the CT directly from the subvol & it should see the mount immediately - as long as it is present on the host.
 
Thanks, that works now.
I had to change the mount point and rename the subvol.
This was a problem initially because I had to create the volume manually to get an encrypted fs, and I am not familiar with all the conventions of Proxmox container storage management. It differs from conventional ZFS subvolume nesting, and property inheritance is an issue, but that wasn't my issue here.

There still appears to be a bug in the management of side-loaded subvolumes of any kind, in that they are not detected at startup the same way they are detected at runtime. I don't know why.
I've now tested this with Samba, ZFS, and rclone, with the same result.
I surmise the issue exists because the boot storage hook script sequence doesn't run the same methods that make the container aware of newly attached storage at runtime.

Should this bug go unreported?
 
If I understood you correctly, you are now using a bind mount (BM) & have no issue.

I must tell you I am surprised that, without a BM, you in fact managed to see that subvol in the (unprivileged) CT at all. I wonder if you could read/write to it. Maybe you tested this? Anyway, if there is a bug - it is probably the fact that it is visible at all.

I must add: I don't have any personal experience with your exact setup.
 
Yes, and yes. I have no issue using a bind mount at boot in this case, and I can write to a subvolume mounted at runtime without a bind mount.
I didn't have this problem until I wanted to side-mount a volume at boot time.

I've been side-mounting volumes on unprivileged containers successfully for many years and it's invaluable.
I've seen many others mounting Samba shares this way on this forum and I don't see the problem with it.
Some filesystems/volumes don't need to be mounted at boot time, only ad hoc when required.
Restarting and reconfiguring containers to add a filesystem is a bit like rebooting a computer to load a new DVD.
Appliances may need to mount and unmount filesystems as part of their normal function. How could preventing that be beneficial?
Bind mounts seem to have odd namespace restrictions and external setup requirements for anything non-standard, which makes them less ideal.

Ultimately, whatever decision or design choice is made regarding side-loading filesystems should apply equally at boot and at runtime.
It would be disappointing to see it go away and no doubt would cause a lot of backward compatibility issues for existing systems.
 
I learn something new each day. Thanks.
I must be honest, I don't use ZFS - so maybe that is why I wasn't aware of the situation.

I guess the question is: has the above difference between runtime and boot of a CT changed only recently?
 
I can't remember definitively, but I think I ran into this problem when I was first setting up, and some of the things I've read on this forum suggest others have had network mount troubles for the same reason. It may always have been this way. I doubt it was a design decision.

This is also a problem for Samba and rclone network mounts, so I don't think it's fs-dependent.
It's fairly simple to reproduce; I outlined the steps in the OP.
Another possible use case for loading fs at runtime is live mounts of user storage/profile space on a multi-user system.
 
What I'm wondering is whether this is a Proxmox thing, caused by the way they use hooks, or a Linux Containers thing.