5.3 and unprivileged containers: docker works, mount nfs does not

dipe

Active Member
Mar 21, 2013
I upgraded from 5.2 to 5.3 and am trying to re-deploy self-service unprivileged containers, where users create an LXC container and can then do whatever they need to do (mostly installing docker, mounting file systems, etc.).

First I am disabling all the magic hacks that were required to run docker and mounts in containers in the past:

root@proxa1:# cat /usr/share/lxc/config/common.conf.d/02-docker.conf
# disable AppArmor support for the container
#lxc.aa_profile = unconfined
#lxc.apparmor.profile = unconfined

# prevents capabilities from being dropped effectively keeping super-user privileges
#lxc.cap.drop =

# allows access to all devices
#lxc.cgroup.devices.allow = a

Then I remove all the "mount fstype=nfs*" from apparmor files here

root@proxa1:~/pve-config# ls -l /etc/apparmor.d/lxc/
total 8
-rw-r--r-- 1 root root 573 Dec 9 12:56 lxc-default-cgns
-rw-r--r-- 1 root root 572 Dec 9 12:56 lxc-default-with-nesting


Then I am creating an unprivileged container with all the goodies:

pct create 333 proxnfs:vztmpl/ubuntu-16.04-standard_16.04-1_amd64.tar.gz \
-cores 2 \
-description "created by pct on host $(hostname)" \
-features "mount=nfs,keyctl=1,nesting=1" \
-hostname "mytest" \
-memory 512 \
-net0 "name=eth0,bridge=vmbr0,ip=dhcp" \
-ostype "ubuntu" \
-password "abcxyz" \
-pool "teampool" \
-storage "proxZFS" \
-unprivileged 1

Then running docker-ce inside the lxc container works after a fix:

Prepend a hyphen to the value of ExecStartPre so that a modprobe error is ignored:

# grep ExecStartPre /etc/systemd/system/multi-user.target.wants/containerd.service
ExecStartPre=-/sbin/modprobe overlay

this is the bug:
https://github.com/containerd/containerd/issues/2772
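
Instead of editing the unit file in place, the same fix can probably be applied as a systemd drop-in, which should survive a package upgrade. A minimal sketch, assuming the unit is named containerd.service and carries the single ExecStartPre line shown above; the write_dropin helper and the file name override.conf are my own choices for illustration, not something shipped by containerd:

```shell
#!/bin/sh
# write_dropin (hypothetical helper): write a drop-in that makes the
# modprobe failure non-fatal. The target directory is a parameter so the
# result can be inspected in a scratch directory first.
write_dropin() {
    dir="$1"
    mkdir -p "$dir" || return 1
    cat > "$dir/override.conf" <<'EOF'
[Service]
# An empty assignment clears the ExecStartPre list of the packaged unit.
ExecStartPre=
# The leading "-" tells systemd to ignore a non-zero exit status.
ExecStartPre=-/sbin/modprobe overlay
EOF
}

# On the real system (as root inside the container) this would be:
#   write_dropin /etc/systemd/system/containerd.service.d
#   systemctl daemon-reload && systemctl restart containerd
```

The empty ExecStartPre= line matters: without it the drop-in would add a second modprobe call instead of replacing the failing one.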


root@tstunpr1:/# docker info
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Server Version: 18.09.0
Storage Driver: vfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: c4446665cb9c30056f4998ed953e6d4ff22c7c39
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: fec3683
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.15.18-9-pve
Operating System: Ubuntu 16.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 512MiB
Name: mytest
ID: ZVPU:FMS7:QJMO:O432:FJTY:STSH:IRBO:V2DX:YP44:SYFE:JOTL:3IXI
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

But mounting nfs does not work:

root@tstunpr1:/# mount -o vers=3 -vvvvv scdata:/scdata_01_S20/app/Ubuntu16.04/public /app
mount.nfs: timeout set for Mon Dec 10 17:11:20 2018
mount.nfs: trying text-based options 'vers=3,addr=10.10.130.42'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 10.10.130.42 prog 100003 vers 3 prot TCP port 2049
mount.nfs: prog 100005, trying vers=3, prot=17
mount.nfs: trying 10.10.130.42 prog 100005 vers 3 prot UDP port 635
mount.nfs: mount(2): Operation not permitted
mount.nfs: Operation not permitted

Mounting works fine if I use a privileged container or mount directly on the Proxmox host. I would go back to privileged containers with keyctl=0, but then docker does not work:
docker still requires the settings in /usr/share/lxc/config/common.conf.d/02-docker.conf

I don't see any info in the logs. Why does mounting not work?
 
BTW: these are the resulting lxc configs and profiles:


root@proxa1:# cat /etc/pve/lxc/333.conf
#created by pct on host proxa1
arch: amd64
cores: 2
features: keyctl=1,nesting=1,mount=nfs;nfs4
hostname: tstunpr1
memory: 512
net0: name=eth0,bridge=vmbr0,hwaddr=72:22:FD:E1:95:B8,ip=dhcp,type=veth
ostype: ubuntu
rootfs: proxZFS:subvol-333-disk-0,size=4G
swap: 0
unprivileged: 1


root@proxa1:# cat /var/lib/lxc/333/config
lxc.arch = amd64
lxc.include = /usr/share/lxc/config/ubuntu.common.conf
lxc.include = /usr/share/lxc/config/ubuntu.userns.conf
lxc.apparmor.profile = generated
lxc.apparmor.allow_nesting = 1
lxc.apparmor.raw = mount fstype=nfs,
lxc.apparmor.raw = mount fstype=nfs4,
lxc.monitor.unshare = 1
lxc.idmap = u 0 100000 65536
lxc.idmap = g 0 100000 65536
lxc.tty.max = 2
lxc.environment = TERM=linux
lxc.uts.name = tstunpr1
lxc.cgroup.memory.limit_in_bytes = 536870912
lxc.cgroup.memory.memsw.limit_in_bytes = 536870912
lxc.cgroup.cpu.shares = 1024
lxc.rootfs.path = /var/lib/lxc/333/rootfs
lxc.net.0.type = veth
lxc.net.0.veth.pair = veth333i0
lxc.net.0.hwaddr = 72:22:FD:E1:95:B8
lxc.net.0.name = eth0
lxc.cgroup.cpuset.cpus = 11,18
 
The problem with mounting is that the kernel simply won't allow it, regardless of any AppArmor rules: most file systems (including NFS) aren't marked as mountable from within user namespaces. (The mount option checkboxes being enabled in the UI for unprivileged containers was an oversight; they should be greyed out with the next updates.)
As I described here[1], docker in privileged containers still requires some extra work, but much less than before.

[1] https://forum.proxmox.com/threads/5-3-docker-on-lxc-on-zfs.49473/#post-231060
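
To check from inside a container whether the user-namespace restriction described above applies, one can look at /proc/self/uid_map. A minimal sketch; the in_userns helper is hypothetical and simply assumes that the initial namespace shows the identity mapping "0 0 4294967295", while an unprivileged container shows something like "0 100000 65536":

```shell
#!/bin/sh
# in_userns (hypothetical helper): succeeds if the given uid_map file does
# NOT contain the identity mapping of the initial user namespace.
in_userns() {
    map_file="${1:-/proc/self/uid_map}"
    # Re-print the three columns so the column padding does not matter.
    mapping=$(awk '{ print $1, $2, $3 }' "$map_file")
    [ "$mapping" != "0 0 4294967295" ]
}

if in_userns; then
    echo "user namespace detected: expect mount(2) for nfs to be refused"
else
    echo "initial user namespace: the userns restriction does not apply"
fi
```

This only tells you which case you are in; it does not change what the kernel permits.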
 
Thanks Wolfgang, good to know. I also noticed that if you update the features setting on multiple lxc machines, it always adds another "nfs", so after 10 boxes it says nesting=1,mount=nfs;nfs;nfs;nfs;nfs;nfs;nfs;nfs;nfs;nfs. I will ask for a solution for self service in a separate post.
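
As a stopgap, the repeated entries can be collapsed before the value is written back. A minimal sketch, assuming the features string has the comma-separated key=value form shown above, with ";" separating the mount types; dedupe_features is a hypothetical helper, not part of pct:

```shell
#!/bin/sh
# dedupe_features (hypothetical helper): drop repeated filesystem types
# from the mount= entry of a features string, keeping first occurrences.
dedupe_features() {
    printf '%s\n' "$1" | awk -F',' '{
        out = ""
        for (i = 1; i <= NF; i++) {
            field = $i
            if (field ~ /^mount=/) {
                n = split(substr(field, 7), types, ";")
                split("", seen)          # reset the lookup table
                field = "mount="
                sep = ""
                for (j = 1; j <= n; j++) {
                    if (!(types[j] in seen)) {
                        seen[types[j]] = 1
                        field = field sep types[j]
                        sep = ";"
                    }
                }
            }
            out = out (i > 1 ? "," : "") field
        }
        print out
    }'
}

dedupe_features "nesting=1,mount=nfs;nfs;nfs;nfs"
# prints: nesting=1,mount=nfs
```

The cleaned string could then be handed to pct set for the affected container, which I have not tested against the buggy version.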
 
That's a bug, for sure. It can be reproduced here. Did you file a bug report already?

We also need the "nfs4" type, as "nfs" is only NFS v3! So at the moment it is impossible to mount an NFS v4 share :(
IMHO the checkbox in the panel should add "nfs;nfs4" here.
 
