Docker in LXC problem after PVE kernel update.

For the record, the original issue has been fixed in lxc-pve-3.1.0-65 (lxc-pve-3.1.0-7 for PVE 5). The package is currently available in the pvetest repository, but should move to pve-no-subscription soon.

Testing is always welcome of course :)
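For anyone who wants to test before the package reaches pve-no-subscription, a rough sketch of pulling it from pvetest (assuming PVE 6 on Debian Buster; use stretch for PVE 5):
Code:
# add the pvetest repository (example for PVE 6 / Buster)
echo "deb http://download.proxmox.com/debian/pve buster pvetest" > /etc/apt/sources.list.d/pvetest.list
apt update
apt install lxc-pve
# verify the installed version
pveversion -v | grep lxc-pve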
 
Thank you for reporting back. As far as the documentation of vfs goes, the driver is inferior to everything else: it has lower performance and uses more space, but it is the only driver that works on any block-based backend. I don't see why this should be preferred over the real ZFS driver for Docker, which is CoW-based, very fast, and has quota and snapshot support (for cloning), but obviously does not work in LXC. The ZFS driver is even superior to any overlay filesystem due to its internal CoW design.
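For illustration (a sketch, assuming a host where /var/lib/docker already sits on a ZFS dataset and the daemon runs the zfs driver), the quota support mentioned above can be exercised per container:
Code:
# check which storage driver the daemon is using
docker info --format '{{.Driver}}'
# with the zfs driver, cap a container's writable layer via a ZFS quota
docker run -it --storage-opt size=10G ubuntu /bin/bash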
Is it though? Docker themselves strongly recommend the overlay2 driver as the superior choice, and there are multiple threads and GitHub issues about slow performance using the zfs driver on Linux.

I'd love for the zfs driver to be just as performant as the overlay2 driver, just because ZFS is so great, but the reality seems to paint a different picture.
 
Depends on what you really want, of course. ZFS has features that no other filesystem has; e.g. for databases you can remove the need for checksums inside the database and delegate parts of ACID to ZFS, increasing the overall performance of the database by a lot. Even if ZFS is slower in general I/O performance (which is normal compared to a non-ZFS system on the same hardware), the benefits of its features, including compression, will make it a lot faster. And we are not even talking about snapshots, cloning and quotas, which are not possible with overlay2.
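A minimal sketch of what delegating that to ZFS can look like for a PostgreSQL data directory (the pool/dataset name and values are just examples, not from this thread; tune for your own workload):
Code:
# dataset tuned for PostgreSQL data (8K matches the PostgreSQL page size; lz4 is cheap)
zfs create -o recordsize=8K -o compression=lz4 -o atime=off tank/pgdata
# in postgresql.conf: full_page_writes = off is often considered safe on ZFS,
# because CoW prevents torn pages - verify this for your own setup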

I also use ZFS on the (very infrequently used) image-building side of things, which is known to be slower, but I hardly notice it on SSD. Most of my images come directly from the CI pipeline, so I really don't care how fast or slowly they're built. The benefits of the compression are just too good. I have to say, though, that I have a more database-centric view on things, and ZFS does a great job there.

On the other side, I do understand that ZFS is not an install-and-forget or install-and-always-fast approach. As the Docker documentation states, ZFS is not for inexperienced (ZFS) users, and I can relate. You have to optimise it for every workload, especially if you run a mixed setup. Overlay2 is the "Apple approach" ... it just works, whereas ZFS is more like the "Linux, I do everything myself" approach.

Have you experienced bad performance by yourself?
 
I don't run Docker on the ZFS storage driver - just LVM-thin with overlay2 - so I couldn't tell you. However, I've seen multiple threads on GitHub and Reddit commenting about average to poor performance on the ZFS storage driver. YMMV of course.
 
lxc-pve (3.1.0-65) solved the problem. Thanks!
 
lxc-pve 3.2.1-1 is having the same issue again ...
pve [~] pveversion
pve-manager/6.1-3/37248ce6 (running kernel: 5.3.13-1-pve)

This is a privileged LXC container.


See the error log below:
Code:
gpu 12| 0:50 [~] docker run hello-world
docker: Error response from daemon: OCI runtime create failed: container_linux.go:346: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:109: jailing process inside rootfs caused \\\"pivot_root permission denied\\\"\"": unknown.
ERRO[0000] error waiting for container: context canceled

Docker version is 19.03 from the official repo:

Code:
gpu 12| 0:50 [~] docker version
Client: Docker Engine - Community
Version:           19.03.5
API version:       1.40
Go version:        go1.12.12
Git commit:        633a0ea838
Built:             Wed Nov 13 07:25:38 2019
OS/Arch:           linux/amd64
Experimental:      false
 
did you set container options?
 

Attachment: options.jpg
Can someone tell me what I'm doing wrong, please?

# podman run hello-world
Error: container_linux.go:344: starting container process caused "process_linux.go:424: container init caused \"rootfs_linux.go:58: mounting \\\"proc\\\" to rootfs \\\"/var/lib/containers/storage/vfs/dir/6ffcb1f5517f09e26562fe268e47aa651341c86fde7a9cd68fc18a1d42513b5f\\\" at \\\"/proc\\\" caused \\\"permission denied\\\"\"": OCI runtime permission denied error

Updated PVE 6.1, unprivileged CT (openSUSE 15.1), all options left at their defaults, running podman as root.

Enabling the nesting=1 feature gives a different error:
Error: container_linux.go:344: starting container process caused "process_linux.go:424: container init caused \"process_linux.go:353: setting rlimits for ready process caused \\\"error setting rlimit type 7: operation not permitted\\\"\"": OCI runtime permission denied error
 
Try to see if setting the keyctl feature (for unprivileged containers only) or nesting (for privileged containers) helps.

keyctl=<boolean> (default = 0)

For unprivileged containers only: Allow the use of the keyctl() system call. This is required to use docker inside a container. By default unprivileged containers will see this system call as non-existent. This is mostly a workaround for systemd-networkd, as it will treat it as a fatal error when some keyctl() operations are denied by the kernel due to lacking permissions. Essentially, you can choose between running systemd-networkd or docker.


nesting=<boolean> (default = 0)

Allow nesting. Best used with unprivileged containers with additional id mapping. Note that this will expose procfs and sysfs contents of the host to the guest.

https://pve.proxmox.com/wiki/Linux_Container#pct_configuration
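For reference, the same features can also be set from the PVE host shell instead of the GUI (the container ID 101 below is just an example):
Code:
# unprivileged container: keyctl + nesting
pct set 101 -features keyctl=1,nesting=1
# privileged container: nesting is usually enough
pct set 101 -features nesting=1
# restart the container so the change takes effect
pct stop 101 && pct start 101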
 
did you set container options?
Confirmed: setting these features does fix it.

Bash:
~# pveversion
pve-manager/6.1-8/806edfe1 (running kernel: 5.3.18-3-pve)


Bash:
$ docker --version
Docker version 19.03.8, build afacb8b7f0
 

Attachment: Screenshot_2020-04-19 pve - Proxmox Virtual Environment.png
There are so many threads regarding Docker on PVE from the last few years. I would be most grateful if someone could point to the best method for getting a basic Docker server running inside an LXC container on PVE 6 (or 5 if required), ideally unprivileged, though initially I'm just trying to get it working on a non-production server.
Thanks.
 
Yes, a VM for Docker is the safe and boring way, but the future does not belong to the VM.
The future could be LXC with ZFS, with pods and Kubernetes using Docker Hub images.
 
Thanks for the prompt reply. I assume then that LXC / LVM is not suitable. My problem with KVMs is bind mounts and not being able to share files with the host system. I already asked a few months back about VirtFS (9p) as an alternative, which I understand is still not supported by the kernel. The NFS or SSHFS alternatives seem overcomplicated and slow.
 
Bind mounts with LXC are very simple if you use host-adapted storage (ZFS, NFS, SMB) mounts.
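A sketch of what that looks like with pct (the container ID and paths are placeholders):
Code:
# bind mount a host directory (e.g. a ZFS dataset) into LXC container 101 at /mnt/share
pct set 101 -mp0 /tank/share,mp=/mnt/share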
 
Yes VM for Docker is the save and boring way, but the future belongs not to the VM.
>>
The Future could be LXC with ZFS with Pods and Kubernetes using docker hub images.

Unless there is some abstraction layer forwarding ZFS into LXC, there is no way to control ZFS from inside the LX(C) container, and doing layered aufs/overlayfs(2) inside LXC on top of ZFS is a total waste of resources. Best to just use a VM. It is easier to set up, and you go from zero to working in a few minutes thanks to the Docker storage drivers.
 
I don't understand. You say running containers in LXC is a waste of resources. Then your proposal is to run overlayfs on ext4 on a raw disk on ZFS, which wastes far more resources.
 
Where did I write that? I said that "doing layered aufs/overlayfs(2) inside LXC on top of ZFS is a total waste of resources"; therefore use a VM (besides all the other benefits) and then, inside that, ZFS and Docker, so that you can use the combination of the two. That was my goal: ZFS + Docker is a dream team.
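For anyone who wants to try that combination, a rough sketch of the setup inside the VM (the disk device, pool and dataset names are assumptions): put /var/lib/docker on a ZFS dataset and select the zfs storage driver.
Code:
# inside the VM: create a pool on a dedicated virtual disk (device name is an example)
zpool create -o ashift=12 tank /dev/vdb
# mount a dataset at /var/lib/docker (stop Docker first; the directory should be empty)
systemctl stop docker
zfs create -o mountpoint=/var/lib/docker tank/docker
# select the zfs storage driver in /etc/docker/daemon.json, then restart
cat > /etc/docker/daemon.json <<'EOF'
{
  "storage-driver": "zfs"
}
EOF
systemctl start docker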
 
