Issue after Resize - LXC won't boot

Dennis Nind

Apr 23, 2018
Hi guys,

We have an issue that is baffling us. We have been trying to resolve it for a few days without success.

In short, creating CTs is fine and they run perfectly. However, when a resize is issued (via a WHMCS module / the API), the container shuts down and won't boot back up.

We have noticed this when a container is resized or when an additional IP address is added. I believe basically any change triggers the issue, yet making the same changes from within Proxmox works perfectly.

We have also had some customers trigger it by restoring backups, again via the API.

The problem is that the container won't boot, and this happens fairly often, so we are concerned about it recurring and think it's best to get help now rather than later.

When we try to start the CT we get the following...

root@hv1:~# lxc-start -F --logfile=/root/119.log --logpriority=DEBUG -n 119 -f /etc/pve/lxc/119.conf
lxc-start: 119: cgroups/cgfsng.c: create_path_for_hierarchy: 1752 Path "/sys/fs/cgroup/net_cls,net_prio//lxc/119" already existed.
lxc-start: 119: cgroups/cgfsng.c: cgfsng_create: 1862 Failed to create cgroup "/sys/fs/cgroup/net_cls,net_prio//lxc/119"
lxc-start: 119: cgroups/cgfsng.c: create_path_for_hierarchy: 1752 Path "/sys/fs/cgroup/hugetlb//lxc/119-1" already existed.
lxc-start: 119: cgroups/cgfsng.c: cgfsng_create: 1862 Failed to create cgroup "/sys/fs/cgroup/hugetlb//lxc/119-1"
lxc-start: 119: conf.c: lxc_setup_dev_console: 1662 No such file or directory - Failed to set mode "0111" to "/dev/pts/16"
lxc-start: 119: conf.c: lxc_setup: 3412 Failed to setup console
lxc-start: 119: start.c: do_start: 1198 Failed to setup container "119"
lxc-start: 119: sync.c: __sync_wait: 57 An error occurred in another process (expected sequence number 5)
lxc-start: 119: start.c: __lxc_start: 1883 Failed to spawn container "119"
The container failed to start.
Additional information can be obtained by setting the --logfile and --logpriority options.

Everything appears to be up to date...

root@hv1:~# apt-get upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

The cgroup errors suggest that the cgroups were not cleaned up on shutdown...

/sys/fs/cgroup/cpu,cpuacct//lxc/119 and /sys/fs/cgroup/memory//lxc/119 somehow were not cleared when the container stopped.
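For anyone wanting to check the same thing, here is a minimal sketch for listing leftover cgroup directories for a single container. It assumes the cgroup v1 layout seen in the log above (/sys/fs/cgroup/<controller>/lxc/<id>) and GNU find. The function name find_stale_cgroups is just something made up for illustration, not part of LXC or Proxmox.

```shell
#!/bin/sh
# Hypothetical helper: list leftover cgroup directories for one container.
# Assumes the cgroup v1 layout <root>/<controller>/lxc/<id> and GNU find.
# Matches <id> and <id>-<n> (e.g. 119 and 119-1), but not 1190.
find_stale_cgroups() {
    ctid="$1"
    root="${2:-/sys/fs/cgroup}"
    find "$root" -mindepth 3 -maxdepth 3 -type d \
        \( -path "*/lxc/${ctid}" -o -path "*/lxc/${ctid}-[0-9]*" \) 2>/dev/null
}

# Usage (read-only, nothing is deleted):
#   find_stale_cgroups 119
```

Anything it prints while the container is stopped would be a leftover of the kind described above.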

So we tried to clean it up...

We could see the stale cgroups:

lscgroup |grep 119 > cglist

for group in `cat cglist`; do cgdelete -r $group; done
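One caveat with the loop above: a plain grep for 119 also matches any other cgroup whose name merely contains those digits (1190, 2119, and so on). Below is a more cautious sketch, assuming lscgroup prints lines in the form controllers:/path and that container cgroups live under /lxc/<id> as in the log. The function names are hypothetical, and the second one only prints the cgdelete commands so they can be reviewed before running anything.

```shell
#!/bin/sh
# Hypothetical helper: keep only the lscgroup entries for one container ID.
# Expects lines such as "memory:/lxc/119" on stdin.
# Matches /lxc/<id> and /lxc/<id>-<n>, plus anything nested below them,
# but not /lxc/1190 or /lxc/2119.
filter_ct_cgroups() {
    ctid="$1"
    grep -E ":/lxc/${ctid}(-[0-9]+)?(/.*)?\$"
}

# Dry run: print the cgdelete commands instead of executing them.
print_cgdelete_commands() {
    while IFS= read -r group; do
        printf 'cgdelete -r %s\n' "$group"
    done
}

# Usage:
#   lscgroup | filter_ct_cgroups 119 | print_cgdelete_commands
```

Once the printed list looks right, the same pipeline can be piped into sh to actually run the deletions.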

Once that was done though, we had a different error altogether...

lxc-start 119 20180423045207.854 ERROR lxc_conf - conf.c:lxc_setup_dev_console:1662 - No such file or directory - Failed to set mode "0111" to "/dev/pts/11"
lxc-start 119 20180423045207.854 ERROR lxc_conf - conf.c:lxc_setup:3412 - Failed to setup console

root@hv1:~# mount|grep pts
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)

root@hv1:~# ls -ld /dev/pts/
drwxr-xr-x 2 root root 0 Apr 22 23:48 /dev/pts/

lxc-start is somehow unable to create or modify files in /dev/pts.

At this point it looked like an AppArmor issue to us, so we tried downgrading it and rebooted the box.

Nothing seemed to help, so we were stuck at this point...

root@hv1:~# lxc-start -F --logfile=/root/119.log --logpriority=DEBUG -n 119 -f /etc/pve/lxc/119.conf
lxc-start: 119: cgroups/cgfsng.c: create_path_for_hierarchy: 1752 Path "/sys/fs/cgroup/net_cls,net_prio//lxc/119" already existed.
lxc-start: 119: cgroups/cgfsng.c: cgfsng_create: 1862 Failed to create cgroup "/sys/fs/cgroup/net_cls,net_prio//lxc/119"
lxc-start: 119: cgroups/cgfsng.c: create_path_for_hierarchy: 1752 Path "/sys/fs/cgroup/hugetlb//lxc/119-1" already existed.
lxc-start: 119: cgroups/cgfsng.c: cgfsng_create: 1862 Failed to create cgroup "/sys/fs/cgroup/hugetlb//lxc/119-1"
lxc-start: 119: conf.c: lxc_setup_dev_console: 1662 No such file or directory - Failed to set mode "0111" to "/dev/pts/16"
lxc-start: 119: conf.c: lxc_setup: 3412 Failed to setup console
lxc-start: 119: start.c: do_start: 1198 Failed to setup container "119"
lxc-start: 119: sync.c: __sync_wait: 57 An error occurred in another process (expected sequence number 5)
lxc-start: 119: start.c: __lxc_start: 1883 Failed to spawn container "119"
The container failed to start.
Additional information can be obtained by setting the --logfile and --logpriority options.

We tried backing up the container and restoring it, but that didn't help either.

Once the hypervisor was rebooted, the cgroup errors returned.

Can you guys please help?

Many thanks,

Dennis
 
