Hi guys,
We have an issue that is baffling us. We have been trying to resolve it for a few days without success.
In short, creating CTs is fine and they run perfectly. However, when a resize is issued (via a WHMCS module / API), the server shuts down and won't boot back up.
We have noticed this when a CT is resized or when an additional IP address is added. I believe basically any change causes the issue, yet managing the resizes from within Proxmox works perfectly.
We have also had some customers trigger this by restoring backups, again via the API.
The problem is that the server won't boot, and this happens fairly often, so we are concerned about it recurring and think it's best to get some help now rather than later.
When we try to start the CT we get the following...
root@hv1:~# lxc-start -F --logfile=/root/119.log --logpriority=DEBUG -n 119 -f /etc/pve/lxc/119.conf
lxc-start: 119: cgroups/cgfsng.c: create_path_for_hierarchy: 1752 Path "/sys/fs/cgroup/net_cls,net_prio//lxc/119" already existed.
lxc-start: 119: cgroups/cgfsng.c: cgfsng_create: 1862 Failed to create cgroup "/sys/fs/cgroup/net_cls,net_prio//lxc/119"
lxc-start: 119: cgroups/cgfsng.c: create_path_for_hierarchy: 1752 Path "/sys/fs/cgroup/hugetlb//lxc/119-1" already existed.
lxc-start: 119: cgroups/cgfsng.c: cgfsng_create: 1862 Failed to create cgroup "/sys/fs/cgroup/hugetlb//lxc/119-1"
lxc-start: 119: conf.c: lxc_setup_dev_console: 1662 No such file or directory - Failed to set mode "0111" to "/dev/pts/16"
lxc-start: 119: conf.c: lxc_setup: 3412 Failed to setup console
lxc-start: 119: start.c: do_start: 1198 Failed to setup container "119"
lxc-start: 119: sync.c: __sync_wait: 57 An error occurred in another process (expected sequence number 5)
lxc-start: 119: start.c: __lxc_start: 1883 Failed to spawn container "119"
The container failed to start.
Additional information can be obtained by setting the --logfile and --logpriority options.
Everything appears to be up to date...
root@hv1:~# apt-get upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
It looks like the cgroups were not cleaned up on shutdown...
/sys/fs/cgroup/cpu,cpuacct//lxc/119 and /sys/fs/cgroup/memory//lxc/119 somehow weren't cleaned up when the container stopped.
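For anyone who wants to see exactly which cgroup paths lxc-start is complaining about, here is a minimal sketch that pulls them out of a log file. The function name stale_paths_from_log is made up for illustration, and it assumes the log lines use the same 'Path "..." already existed.' wording as the output above.

```shell
# Hypothetical helper: print each cgroup path that lxc-start reported as
# "already existed" in the given log file, one per line, without duplicates.
# $1 = path to a log file (for example the one written by --logfile).
stale_paths_from_log() {
    # Keep only the quoted path from lines like:
    #   ... Path "/sys/fs/cgroup/hugetlb//lxc/119-1" already existed.
    sed -n 's/.*Path "\([^"]*\)" already existed\..*/\1/p' "$1" | sort -u
}
```

Usage would be something like: stale_paths_from_log /root/119.log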
So we tried to clean them up...
We can see the stale cgroup entries:
lscgroup |grep 119 > cglist
for group in `cat cglist`; do cgdelete -r $group; done
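One caveat with the grep above is that "119" would also match other container IDs such as 1190 or 2119. A stricter way to list the leftovers could look like the sketch below. The function name list_stale_cgroups is hypothetical, and it assumes the cgroup v1 layout seen in the errors, i.e. <root>/<controller>/lxc/<ctid>, including the "<ctid>-<n>" variants such as 119-1.

```shell
# Hypothetical helper: list leftover cgroup directories for one container ID.
# $1 = cgroup root (normally /sys/fs/cgroup), $2 = container ID (e.g. 119).
list_stale_cgroups() {
    cg_root=$1
    ctid=$2
    # Only look at <root>/<controller>/lxc/<name>, and only accept names that
    # are exactly "<ctid>" or "<ctid>-<digit>...", so 119 does not catch 1190.
    find "$cg_root" -mindepth 3 -maxdepth 3 -type d -path "*/lxc/*" \
        \( -name "$ctid" -o -name "$ctid-[0-9]*" \) | sort
}
```

Usage would be something like: list_stale_cgroups /sys/fs/cgroup 119
This only lists the directories and does not delete anything. As far as we understand, an empty cgroup directory (no tasks, no child groups) can then be removed with rmdir.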
Once the cgdelete cleanup was done, though, we got a different error altogether...
lxc-start 119 20180423045207.854 ERROR lxc_conf - conf.c:lxc_setup_dev_console:1662 - No such file or directory - Failed to set mode "0111" to "/dev/pts/11"
lxc-start 119 20180423045207.854 ERROR lxc_conf - conf.c:lxc_setup:3412 - Failed to setup console
root@hv1:~# mount|grep pts
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
root@hv1:~# ls -ld /dev/pts/
drwxr-xr-x 2 root root 0 Apr 22 23:48 /dev/pts/
lxc-start is somehow not able to create/modify files in /dev/pts.
At this point it looked like an AppArmor issue to us, so we tried a downgrade and rebooted the box.
Nothing seemed to help, so we were stuck at this point...
root@hv1:~# lxc-start -F --logfile=/root/119.log --logpriority=DEBUG -n 119 -f /etc/pve/lxc/119.conf
lxc-start: 119: cgroups/cgfsng.c: create_path_for_hierarchy: 1752 Path "/sys/fs/cgroup/net_cls,net_prio//lxc/119" already existed.
lxc-start: 119: cgroups/cgfsng.c: cgfsng_create: 1862 Failed to create cgroup "/sys/fs/cgroup/net_cls,net_prio//lxc/119"
lxc-start: 119: cgroups/cgfsng.c: create_path_for_hierarchy: 1752 Path "/sys/fs/cgroup/hugetlb//lxc/119-1" already existed.
lxc-start: 119: cgroups/cgfsng.c: cgfsng_create: 1862 Failed to create cgroup "/sys/fs/cgroup/hugetlb//lxc/119-1"
lxc-start: 119: conf.c: lxc_setup_dev_console: 1662 No such file or directory - Failed to set mode "0111" to "/dev/pts/16"
lxc-start: 119: conf.c: lxc_setup: 3412 Failed to setup console
lxc-start: 119: start.c: do_start: 1198 Failed to setup container "119"
lxc-start: 119: sync.c: __sync_wait: 57 An error occurred in another process (expected sequence number 5)
lxc-start: 119: start.c: __lxc_start: 1883 Failed to spawn container "119"
The container failed to start.
Additional information can be obtained by setting the --logfile and --logpriority options.
We tried backing up the VM and restoring it, but again this didn't help.
Once the HV was rebooted, the cgroups issue returned.
Can you guys please help?
Many thanks,
Dennis