[SOLVED] Proxmox 5.3.1 - Failed to create directory "/sys/fs/cgroup/systemd/lxc/*"

michaelj

Hi Proxmox community,

I am facing an issue when I try to start my container (802). For information, this container was working fine before I upgraded it (Debian 7 > Debian 8 > Debian 9).

I upgraded the container by switching its apt sources to jessie and then to stretch.

My steps:

1) Switch the apt repo to jessie, then:
apt-get update && apt-get upgrade
apt-get dist-upgrade

2) Switch the apt repo to stretch, then:
apt-get update && apt-get upgrade
apt-get dist-upgrade
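
The repo switch itself is not shown in the steps above. A minimal sketch of one way to do it, assuming all entries live in a single apt sources file; the helper name `switch_release` is hypothetical and not part of apt or Proxmox:

```shell
# Hypothetical helper: rewrite every occurrence of one Debian release
# codename to another in an apt sources file, keeping a .bak copy.
# Usage (inside the container): switch_release /etc/apt/sources.list wheezy jessie
switch_release() {
    sources_file=$1
    old_release=$2
    new_release=$3
    cp "$sources_file" "$sources_file.bak" || return 1
    # Plain textual substitution; it also covers suffixes such as
    # "wheezy-updates" or "wheezy/updates".
    sed "s/$old_release/$new_release/g" "$sources_file.bak" > "$sources_file"
}
```

Each switch would then be followed by the apt-get commands listed in the steps.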

Then, when I stopped and started my container, I got the errors below (full logs in the attached file):

lxc-start 802 20190328125831.523 ERROR cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1301 - File exists - Failed to create directory "/sys/fs/cgroup/systemd//lxc/802"
lxc-start 802 20190328125831.523 ERROR cgfsng - cgroups/cgfsng.c:container_create_path_for_hierarchy:1353 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/802"
lxc-start 802 20190328125831.523 ERROR cgfsng - cgroups/cgfsng.c:cgfsng_payload_create:1526 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/802"
lxc-start 802 20190328125831.523 ERROR cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1301 - File exists - Failed to create directory "/sys/fs/cgroup/systemd//lxc/802-1"
lxc-start 802 20190328125831.523 ERROR cgfsng - cgroups/cgfsng.c:container_create_path_for_hierarchy:1353 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/802-1"
lxc-start 802 20190328125831.523 ERROR cgfsng - cgroups/cgfsng.c:cgfsng_payload_create:1526 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/802-1"
lxc-start 802 20190328125831.523 ERROR cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1301 - File exists - Failed to create directory "/sys/fs/cgroup/systemd//lxc/802-2"
lxc-start 802 20190328125831.523 ERROR cgfsng - cgroups/cgfsng.c:container_create_path_for_hierarchy:1353 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/802-2"
lxc-start 802 20190328125831.523 ERROR cgfsng - cgroups/cgfsng.c:cgfsng_payload_create:1526 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/802-2"
lxc-start 802 20190328125832.649 ERROR start - start.c:start:2078 - No such file or directory - Failed to exec "/sbin/init"
lxc-start 802 20190328125832.650 ERROR sync - sync.c:__sync_wait:62 - An error occurred in another process (expected sequence number 7)
lxc-start 802 20190328125832.656 ERROR start - start.c:__lxc_start:1989 - Failed to spawn container "802"
lxc-start 802 20190328125833.374 ERROR lxc_start - tools/lxc_start.c:main:330 - The container failed to start
lxc-start 802 20190328125833.374 ERROR lxc_start - tools/lxc_start.c:main:336 - Additional information can be obtained by setting the --logfile and --logpriority options

PVE INFO :

proxmox-ve: 5.3-1 (running kernel: 4.15.18-12-pve)
pve-manager: 5.3-12 (running version: 5.3-12/5fbbbaf6)
pve-kernel-4.15: 5.3-3
pve-kernel-4.15.18-12-pve: 4.15.18-35
pve-kernel-4.15.18-11-pve: 4.15.18-34
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-3
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-48
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-12
libpve-storage-perl: 5.0-39
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-3
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
proxmox-widget-toolkit: 1.0-24
pve-cluster: 5.0-34
pve-container: 2.0-35
pve-docs: 5.3-3
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-18
pve-firmware: 2.0-6
pve-ha-manager: 2.0-8
pve-i18n: 1.0-9
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 2.12.1-2
pve-xtermjs: 3.10.1-2
qemu-server: 5.0-47
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo2

/sys/fs/cgroup/systemd//lxc/ folder :

drwxr-xr-x 3 root root 0 Mar 28 14:04 802

/sys/fs/cgroup/systemd/lxc$ l 802
total 0
-rw-r--r-- 1 root root 0 Mar 28 12:54 cgroup.procs
drwxr-xr-x 2 root root 0 Mar 28 15:05 ns
-rw-r--r-- 1 root root 0 Mar 28 15:05 tasks
-rw-r--r-- 1 root root 0 Mar 28 15:05 notify_on_release
-rw-r--r-- 1 root root 0 Mar 28 15:05 cgroup.clone_children



drwxr-xr-x 3 root root 0 Mar 28 14:04 802-1
drwxr-xr-x 3 root root 0 Mar 28 14:04 802-2
-rw-r--r-- 1 root root 0 Mar 28 14:04 cgroup.clone_children
-rw-r--r-- 1 root root 0 Mar 28 14:04 cgroup.procs
-rw-r--r-- 1 root root 0 Mar 28 14:04 notify_on_release
-rw-r--r-- 1 root root 0 Mar 28 14:04 tasks

Could you please help me debug this?

Kind regards,

Michael.
 

Attachments

  • 802.log
    21.4 KB · Views: 5
Can you post your container configuration?
Code:
pct config CTID

If I had to take a guess, I'd say this is a systemd/sysvinit problem, since Debian switched to systemd (and no longer uses sysvinit by default).
Did you reboot after the upgrade from 7 to 8? From 8 to 9?
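
The log in the first post ends with `Failed to exec "/sbin/init"` (No such file or directory), so one thing worth checking is whether /sbin/init still exists in the container's root filesystem. A rough sketch, assuming the container is stopped and its rootfs has been mounted on the host with `pct mount 802`; the helper name `check_init` is hypothetical, and it only follows one level of symlink:

```shell
# Hypothetical helper: report whether sbin/init exists inside a mounted
# container rootfs. Absolute symlink targets are resolved relative to the
# rootfs, not to the host's own root directory.
# Usage: check_init <path printed by "pct mount 802">
check_init() {
    rootfs=$1
    init_path="$rootfs/sbin/init"
    if [ -L "$init_path" ]; then
        target=$(readlink "$init_path")
        case $target in
            /*) resolved="$rootfs$target" ;;
            *)  resolved="$rootfs/sbin/$target" ;;
        esac
        if [ -e "$resolved" ]; then
            echo "ok: /sbin/init -> $target"
        else
            echo "broken: /sbin/init -> $target"
        fi
    elif [ -e "$init_path" ]; then
        echo "ok: /sbin/init is a regular file"
    else
        echo "missing: /sbin/init"
    fi
}
```

A "missing" or "broken" result would be consistent with the init package having been lost or replaced during the dist-upgrade, but that is only a guess until the rootfs has been inspected.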
 
Hi Oguz,

Thanks for your reply.

Code:
pct config 802
arch: amd64
cpulimit: 4
cpuunits: 1024
hostname: mynamex
memory: 2048
mp0: /apps/scripts,mp=/apps/scripts,ro=1
mp1: /ftp,mp=/ftp
mp2: /prodftp,mp=/prodftp,ro=1
mp3: /share,mp=/share
nameserver: 172.20.1.20 172.20.1.14
net0: name=eth1,bridge=vmbr2,hwaddr=D6:A9:F7:20:D3:6A,ip=172.20.1.30/32,type=veth
net1: name=eth0,bridge=vmbr2,gw=x.x.x.x,hwaddr=7E:26:AB:EC:20:37,ip=x.x.x.x/27,type=veth
onboot: 1
ostype: debian
rootfs: zfs-storage:subvol-802-disk-0,size=10G
searchdomain: mynetworkx
swap: 1024

No, I only rebooted the server (not the CT) after the full upgrade from 7 to 9, not after the step from 7 to 8.

What is strange is that I have another CT that went through the same steps and it is working; the only difference is some of the packages inside.

802 = bind server > not working
other = web server + wordpress > working
 
Try enabling the "Nesting" feature and see if that solves the problem.

Code:
pct set CTID --features nesting=1
 
It's not working. I found this other log message on the host when I try to start the CT:

systemd-udevd[19131]: Could not generate persistent MAC address for vethH77CKA: No such file or directory

Perhaps a better hint...

I found this link: https://github.com/systemd/systemd/issues/3374 but it doesn't work for me.
 
Nothing comes to my mind.
other = web server + wordpress > working
Maybe post the config of the working container as well and we can compare.
 
The only difference is the arch: i386 instead of amd64.

Code:
arch: i386
cpulimit: 4
cpuunits: 1024
hostname: intwp.xxxx.com
memory: 1024
mp0: /apps/scripts,mp=/apps/scripts,ro=1
mp1: /share,mp=/share
nameserver: 172.20.1.20 172.20.1.14
net0: name=eth1,bridge=vmbr2,hwaddr=62:9B:D1:72:14:8F,ip=172.20.1.15/32,type=veth
net1: name=eth0,bridge=vmbr2,gw=x.x.x.x,hwaddr=CE:44:D5:9A:08:A2,ip=x.x.x.x/xx,type=veth
onboot: 1
ostype: debian
rootfs: zfs-storage:subvol-815-disk-0,size=20G
searchdomain: intvrack
swap: 512
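
To compare the two configurations line by line rather than by eye, the output of `pct config` for each container can be saved and diffed. A small sketch; the helper name `diff_ct_configs` is hypothetical, and the ID 815 for the working container is an assumption taken from its rootfs name:

```shell
# Hypothetical helper: print the lines that differ between two saved
# container configs, ignoring line order.
# Usage: pct config 802 > /tmp/802.conf
#        pct config 815 > /tmp/815.conf   # 815 is assumed, see above
#        diff_ct_configs /tmp/802.conf /tmp/815.conf
diff_ct_configs() {
    sorted_a=$(mktemp) || return 1
    sorted_b=$(mktemp) || return 1
    sort "$1" > "$sorted_a"
    sort "$2" > "$sorted_b"
    # diff exits with status 1 when the files differ; that is expected here.
    diff "$sorted_a" "$sorted_b"
    status=$?
    rm -f "$sorted_a" "$sorted_b"
    return $status
}
```

Lines that are expected to differ (hostname, MAC addresses, IPs, rootfs) will show up too, so the interesting entries are things like arch, mount points and features.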
 
I still think it's a systemd/init problem that was caused by the Debian upgrade. If you're not running any specific software in this container, I'd suggest creating a clean Debian 9 container instead.

If you need files from the non-working one, you can just mount it and copy the ones you need.
Code:
pct mount CTID
 
lxc-start 802 20190328125831.523 ERROR cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1301 - File exists - Failed to create directory "/sys/fs/cgroup/systemd//lxc/802"
lxc-start 802 20190328125831.523 ERROR cgfsng - cgroups/cgfsng.c:container_create_path_for_hierarchy:1353 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/802"
lxc-start 802 20190328125831.523 ERROR cgfsng - cgroups/cgfsng.c:cgfsng_payload_create:1526 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/802"
lxc-start 802 20190328125831.523 ERROR cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1301 - File exists - Failed to create directory "/sys/fs/cgroup/systemd//lxc/802-1"
lxc-start 802 20190328125831.523 ERROR cgfsng - cgroups/cgfsng.c:container_create_path_for_hierarchy:1353 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/802-1"
This looks like a previous start of the container failed and did not clean up properly. Either reboot the host or try to get rid of the old /sys/fs/cgroup/systemd//lxc/802* directories.

Left-over directories like these usually prevent a container from starting. Once they are gone, start the container again with debug output.

Hope this helps!
 
Hi Stoiko,

Thanks for your feedback. I've already tried rebooting, but it's not working.
 
Then please paste the debug-log of the first container-start which fails (the one not mentioning directories 802-1, 802-2 etc.)
This usually provides the most insight into what is actually wrong
Thanks!
 
You will find the logs (from journalctl) in the attached file; this is the boot sequence after rebooting my host.

Logs are from the first CT.
 

Attachments

  • boot_first_ct.log
    3.6 KB · Views: 4
Sorry, I did not phrase this clearly! Please provide the _debug_ log of starting the container the first time (i.e. set it to onboot 0, restart the server and then try to start it with `lxc-start -n 802 -F -l DEBUG -o /tmp/lxc-802.log`)

Thanks!
 
Am I forced to reboot?

I've tried to restore the dump of this CT on another server, and when I try your command, I get:

lxc-start -lDEBUG -o /tmp/8022.log -F -n 802
lxc-start: 802: tools/lxc_start.c: main: 290 No container config specified
 
Am I forced to reboot?
In theory it should be enough to remove the left-over directories '/sys/fs/cgroup/systemd//lxc/802*', but I haven't tested this extensively, so please keep this in mind.

lxc-start: 802: tools/lxc_start.c: main: 290 No container config specified
IIRC PVE creates the config file the first time you start the container (with pct start). Sadly, the debug logs won't be there then.

You could try to:
* start the container - see how/where it fails
* try to start it with the debug log
* if it complains about the existing cgroup directory - remove that
* start it again for the debug log

hope this helps!
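
The steps above can be sketched as follows. The `lxc-start` invocation is the one given earlier in this thread; `pct set 802 --onboot 0` is assumed to be the way to disable autostart from the command line, and the helper name `non_cgroup_errors` is hypothetical. The helper filters the debug log so that the cgroup "File exists" noise does not hide the error that actually matters:

```shell
# Commands to run by hand on the host (not executed by this snippet):
#   pct set 802 --onboot 0
#   lxc-start -n 802 -F -l DEBUG -o /tmp/lxc-802.log

# Hypothetical helper: show ERROR lines from an lxc-start log, skipping the
# cgfsng lines that stem from left-over cgroup directories.
# Usage: non_cgroup_errors /tmp/lxc-802.log
non_cgroup_errors() {
    grep ' ERROR ' "$1" | grep -v ' cgfsng - '
}
```

On the log from the first post this would leave the `Failed to exec "/sbin/init"` line and the errors that follow it.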
 
How can I do the step "* if it complains about the existing cgroup directory - remove that"?
 
lxc-start 802 20190328125831.523 ERROR cgfsng - cgroups/cgfsng.c:cgfsng_payload_create:1526 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/802"
lxc-start 802 20190328125831.523 ERROR cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1301 - File exists - Failed to create directory "/sys/fs/cgroup/systemd//lxc/802-1"
lxc-start 802 20190328125831.523 ERROR cgfsng - cgroups/cgfsng.c:container_create_path_for_hierarchy:1353 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/802-1"
These are the errors I was referring to.
* Usually the directory /sys/fs/cgroup/systemd//lxc/802 exists in that case. If it does, remove it with `rm -r /sys/fs/cgroup/systemd//lxc/802`
 
I can't delete it:

rm: cannot remove '/sys/fs/cgroup/systemd//lxc/802/cgroup.procs': Operation not permitted
rm: cannot remove '/sys/fs/cgroup/systemd//lxc/802/ns/cgroup.procs': Operation not permitted
rm: cannot remove '/sys/fs/cgroup/systemd//lxc/802/ns/tasks': Operation not permitted
rm: cannot remove '/sys/fs/cgroup/systemd//lxc/802/ns/notify_on_release': Operation not permitted
rm: cannot remove '/sys/fs/cgroup/systemd//lxc/802/ns/cgroup.clone_children': Operation not permitted
rm: cannot remove '/sys/fs/cgroup/systemd//lxc/802/tasks': Operation not permitted
rm: cannot remove '/sys/fs/cgroup/systemd//lxc/802/notify_on_release': Operation not permitted
rm: cannot remove '/sys/fs/cgroup/systemd//lxc/802/cgroup.clone_children': Operation not permitted
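
These "Operation not permitted" errors are expected with `rm -r`: the entries inside a cgroup directory are control files provided by the kernel and cannot be unlinked. A cgroup is removed by calling `rmdir` on the directory itself, which only succeeds once it has no attached processes and no child cgroups. A minimal sketch under those assumptions; both helper names are hypothetical:

```shell
# Hypothetical helper: list the PIDs still attached anywhere under a
# cgroup directory. Empty output means nothing is running in it.
cgroup_pids() {
    find "$1" -name cgroup.procs -exec cat {} +
}

# Hypothetical helper: remove a cgroup directory tree with rmdir, children
# first (-depth), since rm cannot unlink the kernel's control files.
remove_cgroup_tree() {
    find "$1" -depth -type d -exec rmdir {} \;
}
```

Usage would be `cgroup_pids /sys/fs/cgroup/systemd/lxc/802` first, and if that prints nothing, `remove_cgroup_tree /sys/fs/cgroup/systemd/lxc/802`, repeated for the 802-1 and 802-2 directories. If PIDs are listed, those processes have to exit before the directory can be removed.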
 
