Hi there,
We're currently running a four-node cluster with about 250 LXC containers on each node (evenly distributed). Primary storage for almost all containers (all but 4) is on the integrated Ceph within Proxmox.
Kernel Version: Linux 5.3.13-1-pve #1 SMP PVE 5.3.13-1 (Thu, 05 Dec 2019 07:18:14 +0100)
PVE Manager Version: pve-manager/6.1-3/37248ce6
We've had 3 outages within the last week, all due to lxcfs breaking down:
Apr 27 03:10:16 lxc-prox1 kernel: [741590.180559] cgroup: fork rejected by pids controller in /system.slice/lxcfs.service
Apr 27 03:10:16 lxc-prox1 lxcfs[1771]: fuse: error creating thread: Resource temporarily unavailable
Apr 27 03:10:18 lxc-prox1 lxcfs[1771]: bindings.c: 2473: recv_creds: Timed out waiting for scm_cred: No such file or directory
Restarting lxcfs so that the running containers (now zombies without a working /proc) could be shut down properly, and then rebooting the cluster node, solved the problem, but it has its pain points...
Are we hitting any limit here?
Googling around brought "https://www.suse.com/support/kb/doc/?id=000019044" to my attention, which suggests setting a higher or unlimited TasksMax value for the service.
Code:
[Service]
TasksMax=MAX_TASKS|infinity
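
To see whether the pids limit from the kernel message is really what we are hitting, something like the following could be used to compare the current task count of the lxcfs cgroup with its maximum. This is only a rough sketch: the helper name pids_usage is made up for illustration, and the cgroup path is an assumption based on the /system.slice/lxcfs.service path in the log above and a cgroup v1 layout, so it may differ on other setups.

```shell
#!/bin/sh
# Print "current/max" task usage for a cgroup directory that contains
# pids.current and pids.max (the standard pids controller files).
# pids_usage is a hypothetical helper name used only for this sketch.
pids_usage() {
    dir=$1
    current=$(cat "$dir/pids.current") || return 1
    max=$(cat "$dir/pids.max") || return 1
    echo "$current/$max"
}

# Assumed cgroup v1 path for the lxcfs service; adjust if the layout differs.
LXCFS_CGROUP=/sys/fs/cgroup/pids/system.slice/lxcfs.service
if [ -d "$LXCFS_CGROUP" ]; then
    pids_usage "$LXCFS_CGROUP"
fi
```

If the two numbers are close to each other, the limit is probably the culprit. The same values should also be visible via `systemctl show lxcfs.service -p TasksCurrent -p TasksMax`, and the [Service] snippet above would normally go into a drop-in created with `systemctl edit lxcfs.service`.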