Bind mount a directory containing subvolumes into a container?

incans
Apr 7, 2023
Hi. New (very) Proxmox user here.

I have a long-running Debian-based home server that needs to be updated. I am trying out a new setup for the machine based on Proxmox, with file services (Samba), Plex etc. running in containers. I have been using this video (https://youtu.be/Hu3t8pcq8O0) as a reference, plus the online Proxmox docs and various posts in this forum.

I set up a vanilla Proxmox install (ext4/LVM) on the first ~300GB of the boot SSD. The storage setup then aims to replicate my existing Debian + BTRFS config.

The main storage for the system is based on a pair of 5TB disks (sda,sdb) configured as a BTRFS RAID1 mirror. The BTRFS root volume is then split into multiple subvolumes as follows-

Code:
Mount Point       Subvolume
/home             home
/share/archive    archive
/share/backup     backup
/share/music      music
/share/photo      photo
/share/video      video
/share/web        web

I have the subvolumes and mount points set up in the Proxmox host. The relevant parts of a findmnt list on the host are below-

Code:
TARGET                       SOURCE             FSTYPE   OPTIONS
/                            /dev/mapper/pve-root
...
├─/data                      /dev/mapper/fast-fast_data
│                                               ext4     rw,noatime
├─/home                      /dev/sda[/home]    btrfs    rw,relatime,space_cache=v2,subvolid=2
├─/share/web                 /dev/sda[/web]     btrfs    rw,relatime,space_cache=v2,subvolid=2
├─/share/music               /dev/sda[/music]   btrfs    rw,relatime,space_cache=v2,subvolid=2
├─/share/photo               /dev/sda[/photo]   btrfs    rw,relatime,space_cache=v2,subvolid=2
├─/share/video               /dev/sda[/video]   btrfs    rw,relatime,space_cache=v2,subvolid=2
├─/share/archive             /dev/sda[/archive] btrfs    rw,relatime,space_cache=v2,subvolid=2
├─/share/backup              /dev/sda[/backup]  btrfs    rw,relatime,space_cache=v2,subvolid=2


However, I'm struggling to get the disks mounted at the same points in the guest (a Debian 11 image running Samba). As soon as I edited the /etc/pve/lxc/101.conf file I managed to kill the container (it failed to restart), and even though I think I have fixed all the syntax errors in the conf file, the container still will not start.

Here is my conf file source (note- it's annoying that the container start process seems to parse and rewrite this file, losing all the comments) -
Code:
arch: amd64
cores: 1
features: nesting=1
hostname: fileserver
memory: 512
net0: name=eth0,bridge=vmbr0,firewall=1,hwaddr=9A:EF:E2:62:21:75,ip=dhcp,ip6=dhcp,type=veth
ostype: debian
rootfs: local-lvm:vm-101-disk-0,size=8G
swap: 512
unprivileged: 1
#
# Map /home subvol and /share folder (parent of subvols) from host
#
mp0: /home,mp=/home
mp1: /share,mp=/share
#
# Map user ids
#
# root (0) and uids <= 1000 map to <uid> + 100,000
# gids <= 999 map as above (gid 1000 is "family" group)
# real user ids in range 1001..1010 map directly (passthrough)
# group ids in range 1000..1010 map directly
# uids and gids above 1010 again map to <id>+100000
#
lxc.idmap: u 0 100000 1000
lxc.idmap: g 0 100000 999
lxc.idmap: u 1001 1001 9
lxc.idmap: g 1000 1000 10
lxc.idmap: u 1011 101011 65535
lxc.idmap: g 1011 101011 65535



Now that the syntax errors in the container's conf file are fixed, the syslog output doesn't give me much information on why the container is failing to start-

Code:
Apr 07 23:56:25 queeg pvedaemon[1283]: <root@pam> starting task UPID:queeg:00018073:0041007D:64309F99:vncproxy:101:root@pam:
Apr 07 23:56:25 queeg pvedaemon[98419]: starting lxc termproxy UPID:queeg:00018073:0041007D:64309F99:vncproxy:101:root@pam:
Apr 07 23:56:25 queeg pvedaemon[1284]: <root@pam> successful auth for user 'root@pam'
Apr 07 23:56:25 queeg pvedaemon[1283]: <root@pam> end task UPID:queeg:00018073:0041007D:64309F99:vncproxy:101:root@pam: OK
Apr 07 23:56:26 queeg pvedaemon[1283]: <root@pam> starting task UPID:queeg:0001807B:004100FA:64309F9A:vzstart:101:root@pam:
Apr 07 23:56:26 queeg pvedaemon[98427]: starting CT 101: UPID:queeg:0001807B:004100FA:64309F9A:vzstart:101:root@pam:
Apr 07 23:56:26 queeg systemd[1]: Started PVE LXC Container: 101.
Apr 07 23:56:27 queeg kernel: EXT4-fs (dm-7): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Apr 07 23:56:27 queeg audit[98449]: AVC apparmor="STATUS" operation="profile_load" profile="/usr/bin/lxc-start" name="lxc-101_</var/lib/lxc>" pid=98449 comm="apparmor_parser"
Apr 07 23:56:27 queeg kernel: audit: type=1400 audit(1680908187.321:27): apparmor="STATUS" operation="profile_load" profile="/usr/bin/lxc-start" name="lxc-101_</var/lib/lxc>" pid=98449 comm="apparmor_parser"
Apr 07 23:56:27 queeg pvedaemon[98427]: startup for container '101' failed
Apr 07 23:56:27 queeg pvedaemon[1283]: <root@pam> end task UPID:queeg:0001807B:004100FA:64309F9A:vzstart:101:root@pam: startup for container '101' failed
Apr 07 23:56:27 queeg audit[98454]: AVC apparmor="STATUS" operation="profile_remove" profile="/usr/bin/lxc-start" name="lxc-101_</var/lib/lxc>" pid=98454 comm="apparmor_parser"
Apr 07 23:56:27 queeg kernel: audit: type=1400 audit(1680908187.541:28): apparmor="STATUS" operation="profile_remove" profile="/usr/bin/lxc-start" name="lxc-101_</var/lib/lxc>" pid=98454 comm="apparmor_parser"
Apr 07 23:56:27 queeg pvedaemon[1284]: unable to get PID for CT 101 (not running?)
Apr 07 23:56:28 queeg pvedaemon[1282]: <root@pam> starting task UPID:queeg:000180B7:004101B5:64309F9C:vncproxy:101:root@pam:
Apr 07 23:56:28 queeg pvedaemon[98487]: starting lxc termproxy UPID:queeg:000180B7:004101B5:64309F9C:vncproxy:101:root@pam:
Apr 07 23:56:28 queeg systemd[1]: pve-container@101.service: Main process exited, code=exited, status=1/FAILURE
Apr 07 23:56:28 queeg systemd[1]: pve-container@101.service: Failed with result 'exit-code'.

I'm left guessing as to what the problem is.

One possibility is that while (I assume) the system is happy to bind mount a directory that is itself a mount point in the host (/data), it might have a problem mounting a directory (/share) that contains multiple BTRFS subvolume mount points and making those mounted subvolumes available in the guest?

Another possibility is that I have botched the permissions and uid/gid mapping somehow, although would that not allow the container to start and then cause runtime errors or access failures? My subuid and subgid files are below-

Code:
more /etc/subuid
root:100000:1
root:1001:9

more /etc/subgid
root:100000:1
root:1000:10

Any suggestions on how to work out what's preventing the container from starting would be much appreciated.
 
>> Another possibility is that I have botched the permissions and uid/gid mapping somehow.

It was indeed the uid/gid mapping that was preventing the container from starting.

I had to run the lxc command manually in foreground mode (lxc-start -F -n 101; note the capital -F, since lowercase -f selects a config file) in order to get some information on what was going on. After looking through multiple forum posts where people were hitting similar issues, I finally got it going. The working config is below-

/etc/pve/lxc/101.conf
Code:
arch: amd64
cores: 1
features: nesting=1
hostname: fileserver
memory: 512
net0: name=eth0,bridge=vmbr0,firewall=1,hwaddr=9A:EF:E2:62:21:75,ip=dhcp,ip6=dhcp,type=veth
ostype: debian
rootfs: local-lvm:vm-101-disk-0,size=8G
swap: 512
unprivileged: 1

#
# Map /home subvol and /share folder (parent of subvols) from host
#
mp0: /home,mp=/home
mp1: /share,mp=/share

#
# Map user ids
#
# root (0) and uids <= 1000 map to <uid> + 100,000
#              gids <= 999 map as above (gid 1000 is "family" group)
# real user ids in range 1001..1010 map directly (passthrough)
# real group ids in range 1000..1010 map directly
# uids and gids 1011-199,999 map to <id>+100000
#
lxc.idmap: u 0     100000  1000
lxc.idmap: g 0     100000  999
lxc.idmap: u 1001  1001    9
lxc.idmap: g 1000  1000    10
lxc.idmap: u 1011  101011  98988
lxc.idmap: g 1011  101011  98988

/etc/subuid
root:100000:99999
root:1001:9

/etc/subgid
root:100000:99999
root:1000:10
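
As a cross-check on the comments in the config above, here is a small Python sketch that works out which container ids each lxc.idmap line covers. It assumes the usual reading of the four fields (type, first container id, first host id, count); the helper names container_ranges and gaps are made up for illustration and are not part of LXC or Proxmox.

```python
# The six lxc.idmap values from the working 101.conf, without the "lxc.idmap:" key.
IDMAP = [
    "u 0     100000  1000",
    "g 0     100000  999",
    "u 1001  1001    9",
    "g 1000  1000    10",
    "u 1011  101011  98988",
    "g 1011  101011  98988",
]


def container_ranges(idmap_lines, kind):
    """Return sorted (first_ct_id, last_ct_id, first_host_id) tuples for 'u' or 'g'."""
    ranges = []
    for line in idmap_lines:
        k, ct_start, host_start, count = line.split()
        if k == kind:
            first = int(ct_start)
            # A count of N covers ids first .. first + N - 1 (inclusive).
            ranges.append((first, first + int(count) - 1, int(host_start)))
    return sorted(ranges)


def gaps(ranges, upto):
    """List container ids in 0..upto that fall inside none of the ranges."""
    missing = []
    next_id = 0
    for first, last, _host in ranges:
        missing.extend(range(next_id, min(first, upto + 1)))
        next_id = max(next_id, last + 1)
    missing.extend(range(next_id, upto + 1))
    return missing


if __name__ == "__main__":
    for kind in ("u", "g"):
        for first, last, host in container_ranges(IDMAP, kind):
            print(f"{kind}: container {first}..{last} -> host {host}..{host + last - first}")
        print(f"{kind}: unmapped ids up to 2000: {gaps(container_ranges(IDMAP, kind), 2000)}")
```

By this arithmetic a count of 1000 starting at 0 covers container uids 0..999, and a count of 9 starting at 1001 covers 1001..1009, so container uids 1000 and 1010 (and gids 999 and 1010) land in no range. That may well be intentional; the sketch only makes the boundaries visible.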

The key, I think, was making sure that the user running the container (root) has permission to apply the full range of uids/gids implied by the <CTid>.conf file.
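
That conclusion can be checked mechanically. The Python sketch below compares the host side of each lxc.idmap line against the ranges granted to root in /etc/subuid or /etc/subgid. It rests on one assumption that should be verified for your system: each map has to fit inside a single subuid/subgid entry. The function names (parse_idmaps, parse_subid, uncovered) are hypothetical helpers, not an existing tool.

```python
from typing import NamedTuple


class IdMap(NamedTuple):
    kind: str        # "u" for uids, "g" for gids
    ct_start: int    # first id inside the container
    host_start: int  # first id on the host
    count: int       # number of consecutive ids


def parse_idmaps(conf_text):
    """Collect the lxc.idmap entries from the text of a <CTid>.conf file."""
    maps = []
    for line in conf_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "lxc.idmap":
            kind, ct_start, host_start, count = value.split()
            maps.append(IdMap(kind, int(ct_start), int(host_start), int(count)))
    return maps


def parse_subid(text, user="root"):
    """Return (start, count) pairs granted to `user` in subuid/subgid text."""
    ranges = []
    for line in text.splitlines():
        parts = line.strip().split(":")
        if len(parts) == 3 and parts[0] == user:
            ranges.append((int(parts[1]), int(parts[2])))
    return ranges


def uncovered(maps, kind, allowed):
    """Return the maps of the given kind whose host range fits in no allowed entry."""
    bad = []
    for m in maps:
        if m.kind != kind:
            continue
        fits = any(start <= m.host_start and m.host_start + m.count <= start + n
                   for start, n in allowed)
        if not fits:
            bad.append(m)
    return bad


# Values copied from this thread.
CONF = """\
lxc.idmap: u 0     100000  1000
lxc.idmap: g 0     100000  999
lxc.idmap: u 1001  1001    9
lxc.idmap: g 1000  1000    10
lxc.idmap: u 1011  101011  98988
lxc.idmap: g 1011  101011  98988
"""
ORIGINAL_SUBUID = "root:100000:1\nroot:1001:9\n"
FIXED_SUBUID = "root:100000:99999\nroot:1001:9\n"
FIXED_SUBGID = "root:100000:99999\nroot:1000:10\n"

if __name__ == "__main__":
    maps = parse_idmaps(CONF)
    print("original subuid, uncovered:", uncovered(maps, "u", parse_subid(ORIGINAL_SUBUID)))
    print("fixed subuid, uncovered:   ", uncovered(maps, "u", parse_subid(FIXED_SUBUID)))
```

With the original one-id grant (root:100000:1) the maps starting at container uid 0 and 1011 are flagged; with root:100000:99999 nothing is flagged, which matches the fix described above.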
 
I think I've confirmed my other guess, which is that (at least by default) a bind mount of a host directory onto a mount point in a container will not honour (recursively follow) other mounts (of BTRFS subvolumes or anything else) that sit under the top-level mounted directory.

This concept is understood, at least in the world of Docker containers, where it is called "bind propagation" (see the Docker documentation page "Configure bind propagation"). However, I haven't found this option mentioned in any Proxmox docs.

I assume the bind mount itself is being implemented by the kernel, so if Docker has this option, then LXC and Proxmox could (and maybe do?) have it as well.
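
At the kernel level, a plain bind mount (mount --bind) does not carry submounts across, while a recursive one (mount --rbind) does. To see which mounts under a directory would be left behind by a plain bind, the host's /proc/self/mountinfo can be inspected. The Python sketch below does that; the function name submounts and the sample data are illustrative only (the sample is modelled on the layout in this thread, not captured from the real machine), and it says nothing about which options Proxmox itself exposes.

```python
def submounts(mountinfo_text, parent):
    """Return (mount_point, fstype, propagation) for mounts strictly below `parent`.

    mountinfo fields before the " - " separator are: mount id, parent id,
    major:minor, root, mount point, mount options, then zero or more optional
    fields such as "shared:N" or "master:N". The filesystem type is the first
    field after the separator. Paths containing spaces are escaped as \\040 in
    mountinfo and are not unescaped here.
    """
    parent = parent.rstrip("/") or "/"
    prefix = "/" if parent == "/" else parent + "/"
    found = []
    for line in mountinfo_text.splitlines():
        left, sep, right = line.partition(" - ")
        fields = left.split()
        if not sep or len(fields) < 6:
            continue
        mount_point = fields[4]
        if mount_point != parent and mount_point.startswith(prefix):
            # No optional fields means the mount is private.
            propagation = " ".join(fields[6:]) or "private"
            found.append((mount_point, right.split()[0], propagation))
    return found


# Hypothetical sample, shaped like the host layout described in this thread.
SAMPLE = """\
25 1 253:1 / / rw,relatime shared:1 - ext4 /dev/mapper/pve-root rw
40 25 8:0 /home /home rw,relatime shared:20 - btrfs /dev/sda rw,space_cache=v2
41 25 8:0 /music /share/music rw,relatime shared:21 - btrfs /dev/sda rw,space_cache=v2
42 25 8:0 /video /share/video rw,relatime - btrfs /dev/sda rw,space_cache=v2
"""

if __name__ == "__main__":
    # On a real host, read the live table instead of SAMPLE.
    with open("/proc/self/mountinfo") as f:
        for entry in submounts(f.read(), "/share"):
            print(entry)
```

Every mount point this lists under /share is a separate mount, so a non-recursive bind of /share would show only the empty directories underneath it.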

Does anyone know whether this option can be enabled in Proxmox, or are bind mount options simply not available?
 
