rbind-ing a zfs mountpoint for LXC containers not working as expected

EpicBeagle

New Member
Jan 15, 2023
6
1
3
Greetings folks,

I've been trying to get a Samba share set up through an unprivileged Alpine LXC container. I have a ZFS storage pool that I would like to share with this (and other) containers. Thing is, I want to do this with a recursive bind mount so that other containers can use the same data (say, mounting it read-only for a media server)*. Unfortunately I don't think creating a storage provider will work here, because I want to access the contents of the directory rather than mount a block of storage inside it, so from what I tested, adding a "zfs" storage provider from the GUI is out. It does look like I can add a mountpoint, though:
https://pve.proxmox.com/wiki/Unprivileged_LXC_containers
I also have many zfs datasets and would like to keep adding more, and unfortunately bind-mounting the parent doesn't bring along its children... go figure. Having to edit the storage container's configuration every time I add another bind is out of the question (especially if the cap is 32, like I've heard)... I'd likely pursue a different solution to my problem if that were my only answer.

This leaves me with two problems: the many datasets I want to share are children of a common dataset, and it doesn't look like rbind is implemented yet for mp. I also need to set permissions correctly so that the unprivileged container can access the contents. I haven't touched the latter yet, but I have attempted the former and am a bit baffled by what I'm seeing.

For the sake of the following examples and to be painfully clear, `mypool/share` is the dataset I would like to share that contains two child datasets, `mypool/share/foo` and `mypool/share/bar`. The directory I want these to appear on in the Alpine container is `/samba`, so the contents of `/samba/bar` should be equivalent to `mypool/share/bar`. Additionally, all datasets are mounted on root with a similar structure, so dataset `mypool/share/foo` is mounted at `/mypool/share/foo` on the host.

For instance, I set up the container with its root storage in a separate pool and do a simple bind mount with pct like so:
`pct set 100 -mp0 /mypool/share,mp=/samba`

This behaves as I expected: I can see `/samba` in the container, and even the directories `foo` and `bar`, but neither has any contents.
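Incidentally, the empty `foo` and `bar` directories with a plain bind are expected kernel behavior: a plain bind (`MS_BIND`) copies only the top-level mount, while a recursive bind (`MS_BIND|MS_REC`) also copies the submounts - in our case the child datasets. Here's a generic sketch of the difference (nothing Proxmox-specific; it assumes unprivileged user namespaces are enabled, and uses a tmpfs in place of a child dataset):

```shell
# Demonstrate bind vs. rbind in a throwaway user+mount namespace.
unshare -rm sh -euc '
  mkdir -p /tmp/src/child /tmp/plain /tmp/recursive
  mount -t tmpfs tmpfs /tmp/src/child   # a nested mount, like a child dataset
  touch /tmp/src/child/hello
  mount --bind  /tmp/src /tmp/plain     # "child" appears, but is empty
  mount --rbind /tmp/src /tmp/recursive # the nested mount comes along too
  ls /tmp/plain/child                   # prints nothing
  ls /tmp/recursive/child               # prints: hello
'
```

Which is exactly the `mp0` symptom: the child dataset's directory exists under the bind, but the mount covering it on the host was not copied.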

So, I add another line to the end of `/etc/pve/lxc/100.conf` (100 being the ID of the alpine container), and add
`lxc.mount.entry: /mypool/share /samba/ none rbind,create=dir,optional 0 0`
I've seen this on the forums before as a valid way to address this problem:
https://forum.proxmox.com/threads/b...iles-in-nested-zfs-datasets.47454/post-382266

The issue though is I'm having the exact same problem another user is having as of last September:
https://forum.proxmox.com/threads/permission-issues-on-rbind-mounts-to-lxc.114551/

The issue is that the rbind just outright doesn't work. Where my expectation is that the rbind would show me all the contents of `/samba`, so that I could look into `/samba/foo` and `/samba/bar` and see their files, I don't see anything! To make matters worse, I tried to start up a different privileged container following the same process, but instead of running the `pct` command I only added the `lxc.mount.entry` configuration - that container fails to start entirely! I don't think this should be possible, given that the `optional` flag specifically instructs the container not to fail if the mount doesn't work, and `create=dir` should create the directory structure anyway.
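If it helps anyone reproduce this, the way I've been checking what actually got mounted (the commands are standard, but the paths and container ID 100 are from my setup) is to compare the submounts on the host against the container's own mount table:

```shell
# On the host: list the dataset and its child mounts
findmnt --submounts /mypool/share

# Inside the container's namespace (Alpine may lack findmnt,
# but /proc/self/mountinfo is always there):
pct exec 100 -- grep samba /proc/self/mountinfo
```

If the rbind were working, each child dataset should get its own line in the container's mountinfo, not just the top-level `/samba` entry.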

To round this out, it even looks like there is a patch available to add `rbind` to the mp-syntax, here, which would be the ideal solution to this problem:
https://bugzilla.proxmox.com/show_bug.cgi?id=2762

But I've never seen the "patch available" status and it isn't explained in the field explanation blurb:
https://bugzilla.proxmox.com/page.cgi?id=fields.html#bug_status

Supposedly this patch has been available since mid-2020, but I have no idea how to access it, or whether I should expect to be able to use it in a non-enterprise setup. And adding the complexity of re-applying a patch on top of my Proxmox installation every time I update the OS? Yeah, no thanks - merging it into mainline would make far more sense.

tl;dr and summary:
I've given getting a recursive bind mount from a ZFS dataset on the host into a container a pretty good shot, but I'm getting a very opaque and confusing result and I need some help.

Thanks folks!
* I want to acknowledge the possibility of write collisions when multiple containers access the same directory, though I'd be curious to hear whether separate NFS and Samba containers would be smart enough to avoid colliding with each other. If not, I'd just run both services in the same container and call it a day, since I'm fairly sure modern filesystems handle that pretty well; but if it is possible, running them in separate containers would be helpful.
 
As I've thought about this some more I came up with another possibly important detail. In the example above I never mentioned that the rootfs pool is a different, encrypted zfs pool than the one I'm trying to mount for storage. I don't think this should make a difference, but it might give somebody a clue I hope :). For the sake of being thorough, here's my container config:
Code:
arch: amd64
cores: 1
features: samba-alpine
memory: 512
net0: name=eth0,bridge=vmbr0,firewall=1,gw=192.168.1.1,hwaddr=REDACTED,ip=192.168.1.101/24,type=veth
ostype: alpine
rootfs: encrypted_zfs:subvol-100-disk-0,mountoptions=noatime,size=1G
swap: 512
unprivileged: 1
lxc.mount.entry: /mypool/share/ /samba/ none rbind,create=dir,optional 0 0

Thanks!
 
In the course of debugging this I found an interesting behavior. In short, when checking where in the host's file system the rbind mount was getting slotted in, I found that the target, in this case `/mypool/share`, was always getting prepended with the directory that `rootfs` is set to in the container. At first I thought this was an error on my part in understanding the docs, since they seem to claim that only relative paths are taken as relative to the container's filesystem:
https://linuxcontainers.org/lxc/manpages/man5/lxc.container.conf.5.html

e.g.
`mypool/share` would be equivalent to `${LXC_ROOTFS_MOUNT}/mypool/share` on the host, and `/mypool/share` would be equivalent to `/mypool/share` on the host. Seems consistent, so that makes sense to me.

But this wasn't super clear, so I poked around on GitHub, and there's an open issue for this behavior; last it was mentioned, the distinction between absolute and relative paths was intentional:
https://github.com/lxc/lxc/issues/2276#issuecomment-381356975

So it's very possible this is a bug or regression in LXC containers themselves.
 
I've opened up an issue on LXC about this, if anybody has any insights into this please do let me know! I'll post any answer I get here as well.
https://github.com/lxc/lxc/issues/4258

In the meantime, if folks have suggested resources for mapping UIDs to unprivileged containers so that I may still read/write to them as normal, that would be appreciated. Thanks!

EDIT: Well, it looks like github deleted my account -_-... Will report back with a new issue when it's back up
 
Hi @EpicBeagle, I'm actually still running a 6.x version on the host, which is supporting my recursive bind, so I can't speak to the regression... in fact I now have to plan my upgrade process even more carefully, thanks for the heads-up! Your lxc config line is almost the same as my working config, which would be:

Code:
lxc.mount.entry: /mypool/share/ samba/ none rbind,create=dir,optional 0 0

Note that I do not have a leading slash on the internal path. Lord knows if/why that would make a difference...

Regarding passthrough of UID/GID, I have found the following resource very helpful: https://kcore.org/2022/02/05/lxc-subuid-subgid/
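The gist of that article, sketched here for a hypothetical container 100 and a hypothetical host group 110 that owns the share (adjust all IDs to your layout), is to permit the extra mapping in /etc/subgid and then punch a hole in the container's idmap:

Code:
# /etc/subgid on the host: allow root to also map host GID 110
root:110:1

# In /etc/pve/lxc/100.conf: keep the usual 100000+ shift, except map
# container GID 110 straight through to host GID 110
lxc.idmap: u 0 100000 65536
lxc.idmap: g 0 100000 110
lxc.idmap: g 110 110 1
lxc.idmap: g 111 100111 65425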
 
I just ran a test on my dev cluster (running the latest non-commercial build of Proxmox) and my LXC recursive bind works just fine.

I created a new set of nested zfs datasets like so:

Code:
root@S1-NAS01:~# zfs list
NAME                                              USED  AVAIL     REFER  MOUNTPOINT
local-hdd                                         117G  21.6T      104K  /local-hdd
local-hdd/encrypted                               117G  21.6T      280K  /local-hdd/encrypted
local-hdd/encrypted/data                         1.46M  21.6T      280K  /local-hdd/encrypted/data
local-hdd/encrypted/data/archive                  628K  21.6T      240K  /local-hdd/encrypted/data/archive
local-hdd/encrypted/data/archive/repositories     196K  21.6T      196K  /local-hdd/encrypted/data/archive/repositories
local-hdd/encrypted/data/archive/webcaptures      192K  21.6T      192K  /local-hdd/encrypted/data/archive/webcaptures
local-hdd/encrypted/data/music                    192K  21.6T      192K  /local-hdd/encrypted/data/music
local-hdd/encrypted/data/photos                   392K  21.6T      200K  /local-hdd/encrypted/data/photos
local-hdd/encrypted/data/photos/archive           192K  21.6T      192K  /local-hdd/encrypted/data/photos/archive
local-hdd/encrypted/vmbackup                     99.0G  21.6T     99.0G  /local-hdd/encrypted/vmbackup
local-hdd/encrypted/vmdata                       18.5G  21.6T      240K  /local-hdd/encrypted/vmdata

and passed it through to my unprivileged container:

Code:
arch: amd64
cores: 4
features: nesting=1
hostname: files
memory: 8192
nameserver: 1.1.1.1
net0: name=eth0,bridge=vmbr0,firewall=1,gw=172.20.5.1,hwaddr=3E:12:29:79:FA:C9,ip=172.20.5.10/24,tag=5,type=veth
ostype: debian
rootfs: local-vmdata:subvol-200-disk-0,size=40G
searchdomain: REDACTED
swap: 8192
unprivileged: 1
lxc.mount.entry: /local-hdd/encrypted/data/ data/ none rbind,create=dir,optional 0 0

and it works - I can read file contents from within the recursively mounted, nested datasets.

Furthermore, if I create new datasets, they automatically appear to the container:

(on the host): root@S1-NAS01:~# zfs create local-hdd/encrypted/data/archive/newdataset

And it shows up immediately, without needing to restart the container:

Code:
root@files:~# tree /data
/data
├── archive
│   ├── newdataset        <-----
│   ├── repositories
│   │   └── test_file.txt
│   └── webcaptures
├── music
└── photos
    └── archive

7 directories, 1 file





Hope this helps! Strange issue on your end.
 
question - how are you enabling the samba-alpine feature in an unprivileged container? I'm unfamiliar with that feature module, but the Proxmox docs make it look like a privileged container is required.
I am running my Samba server without this feature enabled, without any issues in an unprivileged container.


According to the Proxmox folks, this feature is to facilitate mounting SMB/CIFS from a remote server to within the container. Probably not what you're looking for (https://forum.proxmox.com/threads/container-options-features-cifs.96915/).
 
Hey Grepler, thanks for your response.
One difference I didn't think about 'til now is that you are running a Debian container - I've only tried this on Alpine, and it's possible a Debian container would give me different results. I've put this more or less on the back burner for now, though, so it'll be a while before I get to that experiment.

In any case, it's both promising and confusing that you've been able to get rbind working without any hassle on Proxmox 7. Just to do my due diligence, I have a question about that: if you go to the host, make a new dataset, mount it, and put a file inside that folder, does the file show up in the container over the rbind? While I was able to see the folder in my tests, I could not see the files inside it. I'm no expert here, but behaviorally it looks like a directory entry is being created (everything is a file, after all) that doesn't lead anywhere, because the child dataset isn't actually mounted.
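Concretely, the test I have in mind looks like this (using my paths and container ID; `baz` and `canary.txt` are just made-up names):

```shell
# On the host: create a fresh child dataset and drop a file in it
zfs create mypool/share/baz
touch /mypool/share/baz/canary.txt

# In the container: the directory may appear, but does the file?
pct exec 100 -- ls -la /samba/baz
```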

I'm also not sure I understand your question about running an unprivileged Samba share in Alpine - if I'm reading you right, you're unsure how I could have an unprivileged share working, since that appears to be unsupported? In theory I think it should work, and I saw some folks on Reddit suggesting they had gotten an unprivileged share working (whether NFS or Samba) - though that link is long gone at this point. That said, I had a lot of trouble with it, so I'm currently using a privileged container just to have something that works...

For example, the NFS share looked like it was missing some package or service, so I didn't get very far there (worth trying on Debian as well, I reckon). As for Samba, I couldn't figure out how to map permissions from guest to host correctly, and I'd need to do more research into AppArmor, since I'm unfamiliar with the settings needed to allow sharing from a container (I'm assuming that's a potential blocker because I had issues with shares on a Fedora-based server before, which required setting an SELinux policy on the share's folder to allow network sharing).
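For the Samba permission mapping specifically, the workaround I keep seeing suggested (untested by me on Alpine; the share path and account names here are made up) is to squash all SMB access to a single account that the container's idmap gives rights to:

Code:
# /etc/samba/smb.conf - hypothetical share section
[share]
   path = /samba
   read only = no
   # map every connecting SMB user to one local account
   force user = shareuser
   force group = sharegroup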

If you have any more insights, do let me know. I'd still love to sort this out so I'm using a more robust overall approach, even if it's on the back burner for the time being...
 
