[SOLVED] Some LXC CT not starting after 7.0 update

ca_maer · Jul 30, 2021

We upgraded two of our hypervisors from 6.4 to 7.0. Some containers started fine, while others refused to start even after a backup/restore.

Here's the error I'm getting:

Code:

lxc-start -n 133 -F -l DEBUG -o /tmp/lxc-133.log
Failed to mount cgroup at /sys/fs/cgroup/systemd: Operation not permitted
[!!!!!!] Failed to mount API filesystems, freezing.
Freezing execution.

I've attached the lxc-133.log file to this post

pct config 133

Code:

arch: amd64
cpulimit: 2
hostname: dev-webdav-1n1
memory: 1024
net0: bridge=vmbr0,hwaddr=EA:AD:7D:75:F3:D1,name=eth0,ip=192.168.1.155/24,gw=192.168.1.1
onboot: 1
ostype: ubuntu
rootfs: local-zfs:subvol-133-disk-1,size=350G
swap: 0

pveversion -v

Code:

pve-manager: 7.0-10 (running version: 7.0-10/d2f465d3)
pve-kernel-5.11: 7.0-5
pve-kernel-helper: 7.0-5
pve-kernel-5.4: 6.4-4
pve-kernel-5.11.22-2-pve: 5.11.22-4
pve-kernel-5.4.124-1-pve: 5.4.124-2
pve-kernel-5.4.114-1-pve: 5.4.114-1
pve-kernel-5.4.106-1-pve: 5.4.106-1
ceph-fuse: 14.2.21-1
corosync: 3.1.2-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx2
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.21-pve1
libproxmox-acme-perl: 1.2.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.0-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-5
libpve-guest-common-perl: 4.0-2
libpve-http-server-perl: 4.0-2
libpve-storage-perl: 7.0-9
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-4
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
proxmox-backup-client: 2.0.4-1
proxmox-backup-file-restore: 2.0.4-1
proxmox-mini-journalreader: 1.2-1
proxmox-widget-toolkit: 3.3-5
pve-cluster: 7.0-3
pve-container: 4.0-8
pve-docs: 7.0-5
pve-edk2-firmware: 3.20200531-1
pve-firewall: 4.2-2
pve-firmware: 3.2-4
pve-ha-manager: 3.3-1
pve-i18n: 2.4-1
pve-qemu-kvm: 6.0.0-2
pve-xtermjs: 4.12.0-1
qemu-server: 7.0-10
smartmontools: 7.2-pve2
spiceterm: 3.2-2
vncterm: 1.7-1
zfsutils-linux: 2.0.5-pve1

Any idea what might cause this?

ness1602 · Jul 30, 2021

Did you read the upgrade page?
https://pve.proxmox.com/wiki/Upgrade_from_6.x_to_7.0

Old Container and CGroupv2

Since Proxmox VE 7.0, the default is a pure cgroupv2 environment. Previously a "hybrid" setup was used, where resource control was mainly done in cgroupv1 with an additional cgroupv2 controller which could take over some subsystems via the cgroup_no_v1 kernel command line parameter. (See the kernel parameter documentation for details.)

cgroupv2 support by the container’s OS is needed to run in a pure cgroupv2 environment. Containers running systemd version 231 (released in 2016) or newer support cgroupv2, as do containers that do not use systemd as init system in the first place (e.g., Alpine Linux or Devuan).

CentOS 7 and Ubuntu 16.10 are two prominent Linux distribution releases whose systemd version is too old to run in a cgroupv2 environment. For details and possible fixes, see: https://pve.proxmox.com/pve-docs/chapter-pct.html#pct_cgroup_compat
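To make the version rule above concrete, here is a minimal sketch that classifies a container from the first line of its `systemctl --version` output. The function names are made up for illustration, and the threshold of 231 is taken from the upgrade notes quoted above (the start-time warning quoted later in this thread mentions v232, so adjust `MIN_SYSTEMD` if you prefer to be conservative).

```shell
#!/bin/sh
# Threshold from the quoted upgrade notes: systemd 231 or newer.
MIN_SYSTEMD=231

# Extract the integer version from the first line of `systemctl --version`
# output, e.g. "systemd 229" -> 229. Prints nothing if the line does not match.
systemd_version() {
    printf '%s\n' "$1" | awk 'NR == 1 && $1 == "systemd" { print int($2) }'
}

# Succeeds (exit status 0) when the given version meets the threshold.
supports_cgroupv2() {
    [ "$1" -ge "$MIN_SYSTEMD" ]
}

# Classify one container from its `systemctl --version` output.
classify() {
    ver=$(systemd_version "$1")
    if [ -z "$ver" ]; then
        echo "unknown"
    elif supports_cgroupv2 "$ver"; then
        echo "ok ($ver)"
    else
        echo "too old ($ver)"
    fi
}
```

On a host where the container still starts (for example before upgrading from 6.4), something like `classify "$(pct exec 133 -- systemctl --version)"` could be run per container. Ubuntu 16.04 ships systemd 229, which this sketch reports as too old.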

ca_maer · Jul 30, 2021

I did read the upgrade notes. I was under the impression that the CT was on 18.04 (like all our other CTs), but it turns out I forgot to upgrade it and it was still stuck on 16.04. I will redeploy it as a VM instead.
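For anyone else unsure which release a non-starting container is on, a rough sketch like the following reads the release from the container's root filesystem without booting it. The function name is hypothetical, and the mount path is the one used later in this thread (/var/lib/lxc/$CTID/rootfs); it assumes the container has a standard /etc/os-release file.

```shell
#!/bin/sh
# Print VERSION_ID from the os-release file under the given root directory,
# e.g. "16.04" for Ubuntu 16.04. Prints "unknown" if it cannot be determined.
container_release() {
    rootfs=$1
    os_release="$rootfs/etc/os-release"
    [ -r "$os_release" ] || { echo "unknown"; return 1; }
    # Parse the file instead of sourcing it, so nothing from the
    # container's filesystem is executed on the host.
    ver=$(sed -n 's/^VERSION_ID=//p' "$os_release" | tr -d '"' | head -n 1)
    printf '%s\n' "${ver:-unknown}"
}
```

Possible usage on the host: `pct mount 133`, then `container_release /var/lib/lxc/133/rootfs`, then `pct unmount 133`.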

Thanks for the help

dlasher · Aug 10, 2021

Ran into this exact issue this week while upgrading some older Ubuntu 14 LTS containers. I didn't realize rolling to 16 would kill them.

What I did to fix it:

Code:

lxc mount $CTID

chroot /var/lib/lxc/$CTID/rootfs

apt update

apt dist-upgrade

do-release-upgrade (none found, so I had to do it by hand)

sudo sed -i 's/xenial/bionic/g' /etc/apt/sources.list

apt update

apt dist-upgrade

(towards the end, it complained about rsyslog, so I just apt removed rsyslog and ran apt dist-upgrade again)

exit (from chroot)
umount /var/lib/lxc/$CTID/rootfs

Start the container as normal. Voilà!
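The manual codename switch in the steps above can be made a little safer by keeping a backup of the file first. This is only a sketch: `switch_codename` is a made-up helper, and it performs the same plain substitution as the sed command in the post, just writing the result from a saved copy.

```shell
#!/bin/sh
# Replace one Ubuntu codename with another in an APT sources file,
# keeping the original next to it as <file>.bak.
switch_codename() {
    file=$1
    old=$2
    new=$3
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    cp -p "$file" "$file.bak" || return 1
    # Read from the backup and overwrite the original, which avoids
    # relying on the non-portable in-place flag of sed.
    sed "s/$old/$new/g" "$file.bak" > "$file"
}
```

Inside the chroot this would be called as `switch_codename /etc/apt/sources.list xenial bionic`, followed by the `apt update` and `apt dist-upgrade` steps from the post.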

It did complain:

Code:

WARN: old systemd (< v232) detected, container won't run in a pure cgroupv2 environment! Please see documentation -> container -> cgroup version.
Task finished with 1 warning(s)!

But it started. I could then console in and run do-release-upgrade to get it to 20.04 LTS, but at least it would start!
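The warning refers to the host's cgroup layout. One common way to see which layout a host uses is to look at the filesystem type mounted at /sys/fs/cgroup; the sketch below wraps that check. The function names are illustrative only, and the mapping assumes the usual convention that `cgroup2fs` means a pure cgroupv2 host while `tmpfs` means a cgroupv1 or hybrid host.

```shell
#!/bin/sh
# Map the filesystem type reported for /sys/fs/cgroup to a cgroup layout.
cgroup_layout() {
    case $1 in
        cgroup2fs) echo "pure cgroupv2 (unified)" ;;
        tmpfs)     echo "cgroupv1 or hybrid" ;;
        *)         echo "unknown ($1)" ;;
    esac
}

# Report the layout of the machine this runs on (Linux only).
host_cgroup_layout() {
    cgroup_layout "$(stat -fc %T /sys/fs/cgroup 2>/dev/null)"
}
```

Running `host_cgroup_layout` on the hypervisor shows whether containers with an old systemd are facing a pure cgroupv2 environment.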

Elfy · Feb 7, 2022

Thanks @dlasher, you saved my bacon!

dlasher · Feb 7, 2022

Elfy said:
Thanks @dlasher, you saved my bacon!

Happy to help! Glad it worked for you too!

Elfy · Feb 7, 2022

I'll add a few nuances to my configuration from what @dlasher posted:

lxc mount $CTID did not work for me; I had to use pct mount $CTID and pct enter $CTID.
Unfortunately, Proxmox must consider the container running for pct enter to work. I didn't have much luck with chroot while the container was offline. This may not work for everyone, as Proxmox may completely block containers without cgroupv2 support from starting.
When I got into the container using pct enter, the network interface wasn't up, so I had to bring it up with ifup $INTERFACE (in my case, just ifup eth0).
After the above steps, I was finally able to upgrade the release with do-release-upgrade.

Here are the basic commands I used, in order:

Code:

pct mount $CTID

pct start $CTID

pct enter $CTID

ifup eth0

do-release-upgrade

exit

pct stop $CTID

pct unmount $CTID
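Since pct enter only works while Proxmox considers the container running, a small guard can check that first. This is a sketch under the assumption that `pct status <vmid>` prints a single line such as "status: running"; the helper names are invented for illustration.

```shell
#!/bin/sh
# Succeeds when a `pct status` output line reports a running container.
is_running() {
    [ "$1" = "status: running" ]
}

# Enter the container only if Proxmox reports it as running.
# Needs a Proxmox VE host, where the pct command is available.
enter_if_running() {
    ctid=$1
    if is_running "$(pct status "$ctid")"; then
        pct enter "$ctid"
    else
        echo "CT $ctid is not running; start it with: pct start $ctid" >&2
        return 1
    fi
}
```

Calling `enter_if_running 133` then either drops into the container or prints a hint instead of failing inside pct enter.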

dlasher · Jan 5, 2023

Elfy said:
I'll add a few nuances to my configuration from what @dlasher posted:

Had the opportunity to try it @Elfy's way today. Much cleaner, thanks for sharing!
