Hello,
One of our hosts is not cleaning up the cgroups of shut-down containers, which prevents them from starting again. Here is a snippet of the log file I obtained by starting the container with:
/usr/bin/lxc-start -F --logfile=/root/135.log --logpriority=DEBUG -n 135
Code:
lxc-start 135 20180116013019.550 INFO lxc_cgroup - cgroups/cgroup.c:cgroup_init:67 - cgroup driver cgroupfs-ng initing for 135
lxc-start 135 20180116013019.550 ERROR lxc_cgfsng - cgroups/cgfsng.c:create_path_for_hierarchy:1337 - Path "/sys/fs/cgroup/cpu//lxc/135" already existed.
lxc-start 135 20180116013019.550 ERROR lxc_cgfsng - cgroups/cgfsng.c:cgfsng_create:1433 - Failed to create "/sys/fs/cgroup/cpu//lxc/135"
When we try to start one of these containers, the web interface becomes unresponsive for that host. Two additional VMs on the same host keep running normally, totally unaffected.
I've found some discussions online about similar issues when a container restart doesn't give the system enough time to clean up, but in this case the containers had been off for over 10 minutes.
How can I force the cleanup of these cgroups, at least as a workaround? And where can we look for more clues about what may be causing the problem?
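For reference, here is the kind of manual cleanup we were considering, assuming the only leftovers are stale directories like the one named in the error above (the path and container name 135 come straight from the log; rmdir only succeeds on a cgroup with no remaining tasks or child cgroups, so we would check for stray tasks first):
Code:
# Check whether any tasks are still attached to the stale cgroup
cat /sys/fs/cgroup/cpu/lxc/135/tasks

# Remove the leftover directories across all controllers, deepest first
# (rmdir is the only way to delete a cgroup; it fails if the cgroup
# is non-empty, which would itself be a useful clue)
find /sys/fs/cgroup/*/lxc/135 -depth -type d -exec rmdir {} \;
Would that be safe to run on a production host, or is there a supported way to do the same thing?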