Unable to access LXC container(s), multiple errors (e.g. Failed to send ".lxc/cgroup.procs" fds 9 and 14)

Skyrider

Active Member
So I'm attempting to access a running container, and this is the second time within a week that I've received the following error:

Code:
skyrider@skyrider:/$ sudo pct enter 200
lxc-attach: 200: cgroups/cgfsng.c: cgroup_attach_create_leaf: 2169 Too many references: cannot splice - Failed to send ".lxc/cgroup.procs" fds 9 and 14
lxc-attach: 200: conf.c: userns_exec_minimal: 5156 Too many references: cannot splice - Running function in new user namespace failed
lxc-attach: 200: cgroups/cgfsng.c: cgroup_attach_move_into_leaf: 2185 No data available - Failed to receive target cgroup fd
lxc-attach: 200: conf.c: userns_exec_minimal: 5194 No data available - Running parent function failed
lxc-attach: 200: attach.c: do_attach: 1237 No data available - Failed to receive lsm label fd
lxc-attach: 200: attach.c: do_attach: 1375 Failed to attach to container

If I restart the container/node, everything works again as it should, but I have no idea why I'm randomly getting this. I can access some other LXC containers just fine; only a few containers are affected by it.

config of the container:

Code:
cores: 4
features: nesting=1
hostname: newkirstin
memory: 10240
net0: name=eth0,bridge=vmbr0,firewall=1,gw=10.248.110.1,hwaddr=7A:F3:CC:F0:01:B4,ip=10.248.110.200/24,type=veth
onboot: 1
ostype: ubuntu
rootfs: local-zfs:subvol-200-disk-0,size=50G
swap: 512
unprivileged: 1

and pveversion:

Code:
proxmox-ve: 7.2-1 (running kernel: 5.15.35-2-pve)
pve-manager: 7.2-4 (running version: 7.2-4/ca9d43cc)
pve-kernel-5.15: 7.2-4
pve-kernel-helper: 7.2-4
pve-kernel-5.15.35-2-pve: 5.15.35-5
pve-kernel-5.15.35-1-pve: 5.15.35-3
pve-kernel-5.15.30-2-pve: 5.15.30-3
ceph-fuse: 15.2.16-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-2
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-2
libpve-storage-perl: 7.2-4
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.12-1
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.3-1
proxmox-backup-file-restore: 2.2.3-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-1
pve-container: 4.2-1
pve-docs: 7.2-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.4-2
pve-ha-manager: 3.3-4
pve-i18n: 2.7-2
pve-qemu-kvm: 6.2.0-10
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.4-pve1

I'd prefer not to have to restart it every time I can't access a container with pct.

I wanted to add that I can access its terminal directly from the container's console, just not via pct enter from the node itself.

EDIT:

Apparently I have to reboot the entire node/server, as rebooting the container alone isn't working; as in, it doesn't actually reboot. Even a node reboot through the Proxmox GUI simply gets stuck, so I'm forced to reboot the entire server through the system's control interface.

What the heck is causing this issue?
 
hi,

* does it work if you do a container stop/start instead of a reboot? `pct stop 200 && pct start 200`

I can access some other lxc containers just fine and only a few containers are affected by it.
* how many containers are running on the node?

* is there anything common between the affected containers? (distribution, privilege, etc.)

* what is the output of `cat /proc/cmdline`?
 
Also, next time this happens, could you please look for the container's `lxc-start -F -n $VMID` process PID and look at the output of `ls -l /proc/$PID/fd`?
There was an lxc package version where `pct enter` would cause the start process to accumulate file descriptors; however, your version output shows 4.0.12, where this should no longer be the case.
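
For reference, a rough sketch of how those two steps could look on the node, assuming CT 200 and that the running monitor matches the `lxc-start -F -n $VMID` command line mentioned above (the `PID` variable is just for illustration):

Code:
# find the PID of the already-running monitor process for CT 200
PID=$(pgrep -f 'lxc-start -F -n 200$')

# detailed listing and a simple count of its open file descriptors
ls -l /proc/$PID/fd
ls /proc/$PID/fd | wc -l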
 
Not sure, since I rebooted the node yesterday. Once it happens again, I'll be sure to follow up with more details.

I have 20 LXC containers on the node, with the affected ones running Ubuntu 22.04 (while the others run a mix of 22.04 and Ubuntu 20.04). They are all set to unprivileged. Interestingly enough, all the affected containers are running Discord bots (Python based), from what I can gather.

As for the output of `/proc/cmdline`: `BOOT_IMAGE=/vmlinuz-5.15.35-2-pve root=ZFS=rpool/ROOT/pve-1 ro root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet`
 
Okay, so I'm having the issue again. At first I thought it was only my Discord bot containers, but some other containers, such as my reverse proxy container, are affected as well. It seems to "spread" to random containers, but not all of them.

Would you mind sharing more details regarding `lxc-start -F -n $VMID` and `ls -l /proc/$PID/fd`?
I assume the first command is where I read the PID that I need?

Code:
skyrider@skyrider:~$ sudo lxc-start -F -n 101
systemd 245.4-4ubuntu3.17 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)
Detected virtualization lxc.
Detected architecture x86-64.

Welcome to Ubuntu 20.04.4 LTS!

Set hostname to <reversedproxy>.
[  OK  ] Created slice system-container\x2dgetty.slice.
[  OK  ] Created slice system-modprobe.slice.
[  OK  ] Created slice system-postfix.slice.
[  OK  ] Created slice User and Session Slice.
[  OK  ] Started Forward Password Requests to Wall Directory Watch.
[  OK  ] Reached target User and Group Name Lookups.
[  OK  ] Reached target Remote Encrypted Volumes.
[  OK  ] Reached target Remote File Systems.
[  OK  ] Reached target Slices.
[  OK  ] Reached target Swap.
[  OK  ] Listening on Syslog Socket.
[  OK  ] Listening on initctl Compatibility Named Pipe.
[  OK  ] Listening on Journal Socket (/dev/log).
[  OK  ] Listening on Journal Socket.
[  OK  ] Listening on Network Service Netlink Socket.
         Starting Journal Service...
         Starting Set the console keyboard layout...
         Starting Remount Root and Kernel File Systems...
         Starting Apply Kernel Variables...
         Starting Uncomplicated firewall...
[  OK  ] Finished Uncomplicated firewall.
[  OK  ] Finished Remount Root and Kernel File Systems.
         Starting Create System Users...
[  OK  ] Finished Apply Kernel Variables.
[  OK  ] Finished Create System Users.
         Starting Create Static Device Nodes in /dev...
[  OK  ] Finished Set the console keyboard layout.
[  OK  ] Finished Create Static Device Nodes in /dev.
[  OK  ] Reached target Local File Systems (Pre).
[  OK  ] Reached target Local File Systems.
         Starting Load AppArmor profiles...
         Starting Set console font and keymap...
         Starting Tell Plymouth To Write Out Runtime Data...
[  OK  ] Started Dispatch Password Requests to Console Directory Watch.
[  OK  ] Reached target Local Encrypted Volumes.
         Starting Network Service...
[  OK  ] Finished Set console font and keymap.
[  OK  ] Started Journal Service.
         Starting Flush Journal to Persistent Storage...
[  OK  ] Finished Tell Plymouth To Write Out Runtime Data.
[  OK  ] Finished Flush Journal to Persistent Storage.
         Starting Create Volatile Files and Directories...
[  OK  ] Finished Load AppArmor profiles.
[  OK  ] Finished Create Volatile Files and Directories.
[  OK  ] Reached target System Time Set.
[  OK  ] Reached target System Time Synchronized.
         Starting Update UTMP about System Boot/Shutdown...
[  OK  ] Finished Update UTMP about System Boot/Shutdown.
[  OK  ] Started Network Service.
[  OK  ] Reached target System Initialization.
[  OK  ] Started Trigger to poll for Ubuntu Pro licenses (Only enabled on GCP LTS non-pro).
[  OK  ] Started Daily apt download activities.
[  OK  ] Started Daily apt upgrade and clean activities.
[  OK  ] Started Periodic ext4 Online Metadata Check for All Filesystems.
[  OK  ] Started Daily rotation of log files.
[  OK  ] Started Daily man-db regeneration.
[  OK  ] Started Message of the Day.
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Started Ubuntu Advantage Timer for running repeated jobs.
[  OK  ] Reached target Paths.
[  OK  ] Reached target Timers.
[  OK  ] Listening on D-Bus System Message Bus Socket.
[  OK  ] Listening on OpenBSD Secure Shell server socket.
[  OK  ] Listening on UUID daemon activation socket.
[  OK  ] Reached target Sockets.
[  OK  ] Reached target Basic System.
         Starting Accounts Service...
[  OK  ] Started Regular background program processing daemon.
[  OK  ] Started D-Bus System Message Bus.
[  OK  ] Started Save initial kernel messages after boot.
         Starting Remove Stale Online ext4 Metadata Check Snapshots...
         Starting Dispatcher daemon for systemd-networkd...
         Starting System Logging Service...
         Starting Login Service...
         Starting Wait for Network to be Configured...
         Starting Network Name Resolution...
[  OK  ] Started System Logging Service.
[  OK  ] Started Accounts Service.
[  OK  ] Finished Remove Stale Online ext4 Metadata Check Snapshots.
[  OK  ] Started Login Service.
[  OK  ] Started Network Name Resolution.
[  OK  ] Reached target Network.
[  OK  ] Reached target Host and Network Name Lookups.
         Starting Nginx Proxy Manager...
         Starting Permit User Sessions...
[  OK  ] Started Nginx Proxy Manager.
[  OK  ] Finished Permit User Sessions.
         Starting Hold until boot process finishes up...
         Starting Terminate Plymouth Boot Screen...
[  OK  ] Started Dispatcher daemon for systemd-networkd.
[  OK  ] Finished Hold until boot process finishes up.
[  OK  ] Started Console Getty.
[  OK  ] Started Container Getty on /dev/tty1.
[  OK  ] Started Container Getty on /dev/tty2.
[  OK  ] Reached target Login Prompts.
[  OK  ] Finished Terminate Plymouth Boot Screen.
[  OK  ] Finished Wait for Network to be Configured.
[  OK  ] Reached target Network is Online.
         Starting The OpenResty Application Platform...
         Starting Postfix Mail Transport Agent (instance -)...
[  OK  ] Started The OpenResty Application Platform.
[  OK  ] Started Postfix Mail Transport Agent (instance -).
         Starting Postfix Mail Transport Agent...
[  OK  ] Finished Postfix Mail Transport Agent.
[  OK  ] Reached target Multi-User System.
[  OK  ] Reached target Graphical Interface.
         Starting Update UTMP about System Runlevel Changes...
[  OK  ] Finished Update UTMP about System Runlevel Changes.

Ubuntu 20.04.4 LTS reversedproxy console

Where is the PID exactly? I assume it isn't 101 (the same as the container's ID).
 
The PID is the host system's process ID associated with your container. Correct, it is not the same as the container ID (which is just an identifier used by Proxmox).

You can find the PID by invoking `lxc-info -n <ctid> -p`, so in your case `lxc-info -n 101 -p`.
Please see wbumiller's answer below instead!
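
For what it's worth, a minimal sketch of how the two IDs relate, assuming CT 101 and assuming the Proxmox-managed `lxc-start` monitor is the direct parent of the container's init process (the variable names are just for illustration):

Code:
# lxc-info reports the container's init PID as seen from the host
INIT_PID=$(lxc-info -n 101 -p | awk '{print $2}')

# the lxc-start monitor asked about earlier is normally the parent of that init
MONITOR_PID=$(ps -o ppid= -p "$INIT_PID" | tr -d ' ')

# detailed listing and a simple count of its open file descriptors
ls -l /proc/"$MONITOR_PID"/fd
ls /proc/"$MONITOR_PID"/fd | wc -l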
 
After a few days I got access back to my system (I had been moving), and the issue appears to be "gone" at the moment. I'll post more information once I know more.
 
