Migration from 6.4 to 7.0

baldy · Jul 10, 2021

Hi there,

I wanted to migrate some old containers from 6.4 to a fresh 7.0 Proxmox VE cluster.
For some reason I am not able to start some of the old containers because of cgroup v2.

I found some documentation on how to fix it, but it does not seem to work:

Code:

cat /etc/default/grub

# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
#   info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="Proxmox VE"
GRUB_CMDLINE_LINUX_DEFAULT="quiet"
GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=0"

# Disable os-prober, it might add menu entries for each guest
GRUB_DISABLE_OS_PROBER=true

Code:

~# cat /etc/kernel/cmdline
systemd.unified_cgroup_hierarchy=0

After updating GRUB and proxmox-boot-tool and restarting the host, I get the following error when I start the container:

Code:

cgfsng_setup_limits_legacy: 2764 Bad address - Failed to set "devices.deny" to "a"
cgroup_tree_create: 808 Failed to setup legacy device limits
cgfsng_payload_create: 1171 Numerical result out of range - Failed to create container cgroup
lxc_spawn: 1644 Failed creating cgroups
__lxc_start: 2073 Failed to spawn container "103"
TASK ERROR: startup for container '103' failed
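
For anyone reproducing this, one way to confirm that the kernel parameter actually took effect after the reboot is to look at the running kernel's command line and at the filesystem type mounted on /sys/fs/cgroup. This is only a minimal sketch, not a Proxmox tool: `cgroup_mode` is a made-up helper name, and it assumes that `stat -fc %T` reports `cgroup2fs` on a pure cgroup v2 host and `tmpfs` on a legacy or hybrid layout.

```shell
#!/bin/sh
# Map the filesystem type of /sys/fs/cgroup to a cgroup layout.
# cgroup_mode is a hypothetical helper name used only for this sketch.
cgroup_mode() {
    case "$1" in
        cgroup2fs) echo "unified" ;;           # pure cgroup v2
        tmpfs)     echo "legacy-or-hybrid" ;;  # v1 controllers mounted below it
        *)         echo "unknown" ;;
    esac
}

# Was the parameter passed to the running kernel at all?
if grep -qwF 'systemd.unified_cgroup_hierarchy=0' /proc/cmdline 2>/dev/null; then
    echo "kernel cmdline: legacy cgroups requested"
else
    echo "kernel cmdline: parameter not present"
fi

# What is actually mounted on /sys/fs/cgroup?
fstype=$(stat -fc %T /sys/fs/cgroup 2>/dev/null || echo none)
echo "cgroup layout: $(cgroup_mode "$fstype")"
```

If the parameter is missing from /proc/cmdline, the host most likely booted through a different bootloader config than the one that was edited.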

Any idea how to fix this issue?

Cheers
Daniel

t.lamprecht · Jul 11, 2021

baldy said:

Code:

cgfsng_setup_limits_legacy: 2764 Bad address - Failed to set "devices.deny" to "a"
cgroup_tree_create: 808 Failed to setup legacy device limits
cgfsng_payload_create: 1171 Numerical result out of range - Failed to create container cgroup
lxc_spawn: 1644 Failed creating cgroups
__lxc_start: 2073 Failed to spawn container "103"
TASK ERROR: startup for container '103' failed

Any idea how to fix this issue?

Can you please also share the container config here?

pct config VMID

baldy · Jul 11, 2021

Hi,

here it is:

Code:

arch: amd64
cores: 16
hostname: blablub
memory: 16384
net0: name=eth0,bridge=vmbr0,gw=10.0.3.1,hwaddr=82:D4:B9:29:AC:1A,ip=10.0.3.76/24,type=veth
ostype: ubuntu
rootfs: local-lvm:vm-103-disk-0,size=100G
swap: 16384

Cheers

fvanlint · Jul 11, 2021

Hello, I have the same problem with all LXC containers when using systemd.unified_cgroup_hierarchy=0.
We use docker-in-lxc and I thought it would be an easy way to avoid the cgroupv2 issue.

Example of such an LXC config:

Code:

pct config 150
arch: amd64
cores: 2
features: fuse=1,mknod=1,nesting=1
hostname: DDNS
memory: 256
net0: name=eth0,bridge=vmbr0,firewall=1,hwaddr=36:06:55:CA:8E:A0,ip=dhcp,ip6=dhcp,type=veth
onboot: 1
ostype: ubuntu
rootfs: tank:subvol-150-disk-0,size=30G
swap: 256
lxc.apparmor.profile: unconfined
lxc.cgroup.devices.allow: a
lxc.cap.drop:

Our error in Proxmox is exactly the same as @baldy's, plus a warning about AppArmor (which does not seem to be an actual issue).

Code:

explicitly configured lxc.apparmor.profile overrides the following settings: features:fuse, features:nesting
cgfsng_setup_limits_legacy: 2764 Bad address - Failed to set "devices.deny" to "a"
cgroup_tree_create: 808 Failed to setup legacy device limits
cgfsng_payload_create: 1171 Numerical result out of range - Failed to create container cgroup
lxc_spawn: 1644 Failed creating cgroups
__lxc_start: 2073 Failed to spawn container "150"
TASK ERROR: startup for container '150' failed

Interestingly, when removing the last 3 lines, the error remains the same:

Code:

arch: amd64
cores: 2
features: fuse=1,mknod=1,nesting=1
hostname: DDNS
memory: 256
net0: name=eth0,bridge=vmbr0,firewall=1,hwaddr=36:06:55:CA:8E:A0,ip=dhcp,ip6=dhcp,type=veth
onboot: 1
ostype: ubuntu
rootfs: tank:subvol-150-disk-0,size=30G
swap: 256

Code:

lxc-start -n 150 -F -lDEBUG -o lxc-150.log
lxc-start: 150: cgroups/cgfsng.c: cgfsng_setup_limits_legacy: 2764 Bad address - Failed to set "devices.deny" to "a"
lxc-start: 150: cgroups/cgfsng.c: cgroup_tree_create: 808 Failed to setup legacy device limits
lxc-start: 150: cgroups/cgfsng.c: cgfsng_payload_create: 1171 Numerical result out of range - Failed to create container cgroup
lxc-start: 150: start.c: lxc_spawn: 1644 Failed creating cgroups
lxc-start: 150: start.c: __lxc_start: 2073 Failed to spawn container "150"
lxc-start: 150: tools/lxc_start.c: main: 308 The container failed to start
lxc-start: 150: tools/lxc_start.c: main: 313 Additional information can be obtained by setting the --logfile and --logpriority options

EDIT: Well, reverting systemd.unified_cgroup_hierarchy=0 and moving to lxc.cgroup2.devices.allow: a showed me a different issue. We run the LXC containers on ZFS, and AUFS is no longer supported... so yeah. That warrants a topic of its own. I hope the above helps in some way.
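
The move from lxc.cgroup.devices.allow to lxc.cgroup2.devices.allow described in the edit can be previewed with a small filter before touching the real config. This is a sketch only: `to_cgroup2_keys` is a hypothetical helper, it prints the converted config to stdout instead of editing in place, and it deliberately touches only the `devices.*` keys, since other controllers (memory, CPU) use different key names under cgroup v2.

```shell
#!/bin/sh
# Print a container config with lxc.cgroup.devices.* keys renamed to
# lxc.cgroup2.devices.*; every other line is passed through unchanged.
# to_cgroup2_keys is a made-up name for this sketch.
to_cgroup2_keys() {
    sed -E 's/^lxc\.cgroup\.devices\./lxc.cgroup2.devices./' "$1"
}

# Demo on a throwaway copy of the raw lxc.* lines from the config above.
conf=$(mktemp)
cat > "$conf" <<'EOF'
arch: amd64
lxc.apparmor.profile: unconfined
lxc.cgroup.devices.allow: a
lxc.cap.drop:
EOF
to_cgroup2_keys "$conf"
rm -f "$conf"
```

On a real host the input would be something like /etc/pve/lxc/150.conf, and the cgroup2 keys only make sense when the host runs without systemd.unified_cgroup_hierarchy=0.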

jpros · Jul 12, 2021

Hey all!

I don't think I have any different config on my CT, either way, I'm posting it here:

Code:

arch: amd64
cores: 1
hostname: management
memory: 512
net0: name=eth0,bridge=vmbr0,firewall=1,gw=192.168.8.1,hwaddr=0A:85:99:63:EF:95,ip=192.168.8.21/21,type=veth
ostype: ubuntu
rootfs: local-lvm:vm-101-disk-1,size=8G
swap: 512

I use the same boot config to enable legacy cgroups and get the same error:

Code:

cgfsng_setup_limits_legacy: 2764 Bad address - Failed to set "devices.deny" to "a"
cgroup_tree_create: 808 Failed to setup legacy device limits
cgfsng_payload_create: 1171 Numerical result out of range - Failed to create container cgroup
lxc_spawn: 1644 Failed creating cgroups
__lxc_start: 2073 Failed to spawn container "101"
TASK ERROR: startup for container '101' failed

My installation of Proxmox 7 is brand new (so no migration/upgrade in this case) and the CT is also brand new (Ubuntu 16).

kyriazis · Jul 16, 2021

Having the same problem. Posted another thread before finding this one (https://forum.proxmox.com/threads/proxmox-7-lxc-cgroup-issue-with-docker.92722/).

Also using docker inside LXC, trying to avoid using cgroup2 (for the time being).

Funny how the error message is about devices.deny, while there is no such entry in the config file; instead there is a devices.allow.

rappazz · Jul 16, 2021

In addition to the grub command line "systemd.unified_cgroup_hierarchy=0" I also added the following two lines to my container config files (in /etc/pve/lxc/#id#.conf):

lxc.cgroup.devices.allow =
lxc.cgroup.devices.deny =

See https://github.com/lxc/lxc/issues/2268 for details.

After that all my containers started up again.
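
With many containers, the two lines could be added by a small script along these lines. It is a sketch under assumptions: `add_empty_device_limits` is a made-up helper name, the config is assumed to contain no snapshot sections (appended lines would otherwise end up in the last snapshot section rather than the current config), and keys that are already set are left untouched. Keep a backup of the file before trying it.

```shell
#!/bin/sh
# Append empty lxc.cgroup.devices.allow / .deny entries to a container
# config unless the key is already present.
# add_empty_device_limits is a hypothetical helper for this sketch.
add_empty_device_limits() {
    conf=$1
    for key in lxc.cgroup.devices.allow lxc.cgroup.devices.deny; do
        # Accept both "key: value" and "key = value" as "already set".
        if ! grep -Eq "^${key}[[:space:]]*[:=]" "$conf"; then
            printf '%s =\n' "$key" >> "$conf"
        fi
    done
}

# Demo on a throwaway file; on a host the argument would be
# something like /etc/pve/lxc/103.conf.
demo=$(mktemp)
printf 'arch: amd64\nhostname: blablub\n' > "$demo"
add_empty_device_limits "$demo"
cat "$demo"
rm -f "$demo"
```

Running the function twice on the same file does not add the lines a second time.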

kyriazis · Jul 17, 2021

I can confirm that this works. Strangely enough, this also works with docker inside the LXC container.

ednt · Jul 17, 2021

@chkern
Thank you for your fix!

It solved my big troubles with Zimbra.

After all, how can I update a container when it does not start anymore?

Two years ago I gave the order: only VMs, no containers, because of live migration when updates are necessary.
But we still have some old containers ...

t.lamprecht · Jul 20, 2021

chkern said:
In addition to the grub command line "systemd.unified_cgroup_hierarchy=0" I also added the following two lines to my container config files (in /etc/pve/lxc/#id#.conf):

lxc.cgroup.devices.allow =
lxc.cgroup.devices.deny =

See https://github.com/lxc/lxc/issues/2268 for details.

After that all my containers started up again.

FYI, that should not be required anymore with the new lxc-pve package update with version 4.0.9-4 available on no-subscription repo at time of writing.
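
To see whether a host already has that package version, something like the following could be used. It is a rough sketch, not an official check: `has_lxc_fix` is a made-up helper name, and the comparison relies on `dpkg --compare-versions`, so it only works on a Debian-based system such as a Proxmox VE host.

```shell
#!/bin/sh
# First lxc-pve version reported in this thread to contain the fix.
fixed_version="4.0.9-4"

# Succeeds if the given version is at least $fixed_version.
# has_lxc_fix is a hypothetical helper for this sketch.
has_lxc_fix() {
    dpkg --compare-versions "$1" ge "$fixed_version"
}

if command -v dpkg-query >/dev/null 2>&1; then
    installed=$(dpkg-query -W -f='${Version}' lxc-pve 2>/dev/null)
    if [ -z "$installed" ]; then
        echo "lxc-pve is not installed"
    elif has_lxc_fix "$installed"; then
        echo "lxc-pve $installed: the extra config lines should not be needed"
    else
        echo "lxc-pve $installed: older than $fixed_version, update first"
    fi
fi
```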

yswery · Jul 22, 2021

t.lamprecht said:
FYI, that should not be required anymore with the new lxc-pve package update with version 4.0.9-4 available on no-subscription repo at time of writing.

With the new fixes, do we still need to add?

```
GRUB_CMDLINE_LINUX_DEFAULT="systemd.unified_cgroup_hierarchy=0 quiet"
```

t.lamprecht · Jul 23, 2021

yswery said:
With the new fixes, do we still need to add?

```
GRUB_CMDLINE_LINUX_DEFAULT="systemd.unified_cgroup_hierarchy=0 quiet"
```

If you run distro releases that break when running in a unified cgroupv2 environment, like CentOS 7, then yes.

The bug it fixed was a bug that could only happen when enforcing the old, legacy mixed cgroup v1 + v2 environment that CentOS 7 needs to run.

There cannot be any patch from our side that makes CentOS 7 cope with the newer cgroup layout. Your options are: upgrade to CentOS 8 (AppStream or one of the new derivatives), force the old cgroup setting via that kernel command line, switch the workload to another distro with more frequent releases, or switch to VMs.

kyriazis · Jul 26, 2021

t.lamprecht said:
FYI, that should not be required anymore with the new lxc-pve package update with version 4.0.9-4 available on no-subscription repo at time of writing.

Confirmed.

/etc/pve/lxc/*.conf files don't need any editing when moving from Proxmox 6.

kyriazis · Jul 30, 2021

I spoke too soon. This works on Ubuntu 18.04 LXC containers, but does not work with Ubuntu 20.04 containers. If this was an issue with "old" Centos 7 style containers only, then it definitely would not appear in 20.04.

yswery · Jul 30, 2021

kyriazis said:
I spoke too soon. This works on Ubuntu 18.04 LXC containers, but does not work with Ubuntu 20.04 containers. If this was an issue with "old" Centos 7 style containers only, then it definitely would not appear in 20.04.

Just to confirm, what errors are you seeing on 20.04 containers?

t.lamprecht · Jul 30, 2021

kyriazis said:
I spoke too soon. This works on Ubuntu 18.04 LXC containers, but does not work with Ubuntu 20.04 containers. If this was an issue with "old" Centos 7 style containers only, then it definitely would not appear in 20.04.

If you force the system back to old legacy cgroups then new distros can break.
For Ubuntu 20.04 that sounds a bit weird to me, and rather like some other issue. It would be good to know what you actually changed and, as the other poster asked, what issues you're seeing.

kyriazis · Jul 30, 2021

t.lamprecht said:
If you force the system back to old legacy cgroups then new distros can break.
For Ubuntu 20.04 that sounds a bit weird to me, and rather like some other issue. It would be good to know what you actually changed and, as the other poster asked, what issues you're seeing.

Sorry for not being specific. Details below.

What doesn't work:

The error I am seeing is:

Code:

root@vis-ct-clx-08:~# docker run -it centos:7 bash
docker: Error response from daemon: cgroups: cgroup mountpoint does not exist: unknown.
ERRO[0000] error waiting for container: context canceled
root@vis-ct-clx-08:~#
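
As far as I can tell, this Docker message usually means the daemon looked for a cgroup v1 controller mount and did not find one. A quick way to see what is mounted inside the container is to filter /proc/mounts. This is a minimal sketch; `list_cgroup_mounts` is a hypothetical helper name, and it only reports mounts, it does not change anything.

```shell
#!/bin/sh
# Print "<fstype> <mountpoint>" for every cgroup v1 or v2 mount found in a
# file in /proc/mounts format (defaults to /proc/mounts itself).
# list_cgroup_mounts is a made-up name for this sketch.
list_cgroup_mounts() {
    awk '$3 == "cgroup" || $3 == "cgroup2" { print $3, $2 }' "${1:-/proc/mounts}"
}

if [ -r /proc/mounts ]; then
    list_cgroup_mounts
fi
```

If only a single cgroup2 line shows up, the container sees a pure v2 hierarchy; per-controller cgroup lines indicate a legacy or hybrid layout.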

GRUB_CMDLINE_LINUX_DEFAULT is not modified to include systemd.unified_cgroup_hierarchy=0.

My conf file looks like:

Code:

#mp0%3A /mnt/pve/scratch,mp=/scratch
arch: amd64
cores: 48
features: mount=nfs4,keyctl=1
hostname: vis-ct-clx-08
memory: 57344
net0: name=eth0,bridge=vmbr0,hwaddr=XX:XX:XX:XX:XX:XX,ip=dhcp,type=veth
onboot: 1
ostype: ubuntu
rootfs: local-lvm:vm-159-disk-0,size=192G
snaptime: 1573010169
swap: 16384
lxc.apparmor.profile: unconfined
lxc.cgroup.devices.allow: a
lxc.cap.drop:

Modifications and their effects:

Using:

Code:

lxc.cgroup.devices.allow:
lxc.cgroup.devices.deny:

(the workaround that worked for 18.04 containers prior to lxc-pve version 4.0.9-4) still does not work.

An 18.04 container with an identical conf file works.

Adding systemd.unified_cgroup_hierarchy=0 to GRUB_CMDLINE_LINUX_DEFAULT works.

Using cgroup2 instead of cgroup with both setups (i.e. devices.allow: a, and the empty devices.allow plus devices.deny lines) does not work.

Thank you!

George

lps90 · Aug 1, 2021

The GRUB alteration works for starting a CentOS 7 container, but users of CentOS Web Panel, for example, will not be able to turn the cfd firewall on, or it will block the login to CentOS Web Panel.
So basically this is not a complete solution.

proxminent · Oct 14, 2021

look at
https://forum.proxmox.com/threads/h...out-systemd-unified_cgroup_hierarchy-0.94253/

pizza · Feb 3, 2022

LXC containers with Ubuntu 14.0 will not start any services. Nesting=on solves the problem.
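
On the command line, nesting can be enabled with `pct set <vmid> --features nesting=1`. Setting `--features` replaces the whole feature string, so flags that are already set have to be repeated. The sketch below builds such a string; `with_nesting` is a made-up helper, and the container ID and feature list are placeholders, not values from a real setup.

```shell
#!/bin/sh
# Return the given comma-separated feature string with nesting=1 enabled,
# keeping all other flags as they are.
# with_nesting is a hypothetical helper for this sketch.
with_nesting() {
    case ",$1," in
        ,,)            printf 'nesting=1\n' ;;
        *,nesting=1,*) printf '%s\n' "$1" ;;
        *,nesting=0,*) printf '%s\n' "$1" | sed 's/nesting=0/nesting=1/' ;;
        *)             printf '%s,nesting=1\n' "$1" ;;
    esac
}

# Print (rather than run) the resulting command for placeholder values.
vmid=150
current="fuse=1,mknod=1"
echo "pct set $vmid --features $(with_nesting "$current")"
```

The current feature string of a container can be read from the `features:` line of `pct config <vmid>`.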
