Cannot open /proc/stat: Transport endpoint is not connected

SPQRInc

Member
Jul 27, 2015
57
1
6
Hello,

Since a short while ago, I can no longer use htop or top in my LXC containers:
Cannot open /proc/stat: Transport endpoint is not connected

The same problem occurs when reloading php5-fpm:

Dez 18 13:40:31 systemd[1]: Failed to create cgroup /lxc/103/system.slice/systemd-random-seed.service: Transport endpoint is not connected

Dez 18 13:40:31 systemd[1]: Failed to realize cgroups for queued unit systemd-random-seed.service: Transport endpoint is not connected

Dez 18 13:40:31 systemd[1]: Failed to create cgroup /lxc/103/system.slice/networking.service: Transport endpoint is not connected

Dez 18 13:40:31 systemd[1]: Failed to realize cgroups for queued unit networking.service: Transport endpoint is not connected

Dez 18 13:40:31 systemd[1]: Failed to create cgroup /lxc/103/system.slice/kbd.service: Transport endpoint is not connected

Dez 18 13:40:31 systemd[1]: Failed to realize cgroups for queued unit kbd.service: Transport endpoint is not connected

Dez 18 13:40:31 systemd[1]: Failed to create cgroup /lxc/103/system.slice/systemd-journald.service: Transport endpoint is not connected

Dez 18 13:40:31 systemd[1]: Failed to realize cgroups for queued unit systemd-journald.service: Transport endpoint is not connected

Dez 18 13:40:31 systemd[1]: php5-fpm.service: control process exited, code=exited status=219

Dez 18 13:40:31 systemd[1]: Reload failed for The PHP FastCGI Process Manager.
This is the syslog:

Dec 18 13:41:13 systemd[30954]: Received SIGRTMIN+24 from PID 30962 (kill).

Dec 18 13:42:13 systemd[31685]: Starting Paths.

Dec 18 13:42:13 systemd[31685]: Reached target Paths.

Dec 18 13:42:13 systemd[31685]: Starting Timers.

Dec 18 13:42:13 systemd[31685]: Reached target Timers.

Dec 18 13:42:13 systemd[31685]: Starting Sockets.

Dec 18 13:42:13 systemd[31685]: Reached target Sockets.

Dec 18 13:42:13 systemd[31685]: Starting Basic System.

Dec 18 13:42:13 systemd[31685]: Reached target Basic System.

Dec 18 13:42:13 systemd[31685]: Starting Default.

Dec 18 13:42:13 systemd[31685]: Reached target Default.

Dec 18 13:42:13 systemd[31685]: Startup finished in 5ms.

Dec 18 13:42:13 systemd[31685]: Stopping Default.

Dec 18 13:42:13 systemd[31685]: Stopped target Default.

Dec 18 13:42:13 systemd[31685]: Stopping Basic System.

Dec 18 13:42:13 systemd[31685]: Stopped target Basic System.

Dec 18 13:42:13 systemd[31685]: Stopping Paths.

Dec 18 13:42:13 systemd[31685]: Stopped target Paths.

Dec 18 13:42:13 systemd[31685]: Stopping Timers.

Dec 18 13:42:13 systemd[31685]: Stopped target Timers.

Dec 18 13:42:13 systemd[31685]: Stopping Sockets.

Dec 18 13:42:13 systemd[31685]: Stopped target Sockets.

Dec 18 13:42:13 systemd[31685]: Starting Shutdown.

Dec 18 13:42:13 systemd[31685]: Reached target Shutdown.

Dec 18 13:42:13 systemd[31685]: Starting Exit the Session...

Dec 18 13:42:13 systemd[31685]: Received SIGRTMIN+24 from PID 31693 (kill).

Dec 18 13:42:54 systemd-timesyncd[535]: interval/delta/delay/jitter/drift 2048s/-0.000s/0.000s/0.003s/+3ppm

Dec 18 13:43:13 systemd[32540]: Starting Paths.

Dec 18 13:43:13 systemd[32540]: Reached target Paths.

Dec 18 13:43:13 systemd[32540]: Starting Timers.

Dec 18 13:43:13 systemd[32540]: Reached target Timers.

Dec 18 13:43:13 systemd[32540]: Starting Sockets.

Dec 18 13:43:13 systemd[32540]: Reached target Sockets.

Dec 18 13:43:13 systemd[32540]: Starting Basic System.

Dec 18 13:43:13 systemd[32540]: Reached target Basic System.

Dec 18 13:43:13 systemd[32540]: Starting Default.

Dec 18 13:43:13 systemd[32540]: Reached target Default.

Dec 18 13:43:13 systemd[32540]: Startup finished in 5ms.

Dec 18 13:43:13 systemd[32540]: Stopping Default.

Dec 18 13:43:13 systemd[32540]: Stopped target Default.

Dec 18 13:43:13 systemd[32540]: Stopping Basic System.

Dec 18 13:43:13 systemd[32540]: Stopped target Basic System.

Dec 18 13:43:13 systemd[32540]: Stopping Paths.

Dec 18 13:43:13 systemd[32540]: Stopped target Paths.

Dec 18 13:43:13 systemd[32540]: Stopping Timers.

Dec 18 13:43:13 systemd[32540]: Stopped target Timers.

Dec 18 13:43:13 systemd[32540]: Stopping Sockets.

Dec 18 13:43:13 systemd[32540]: Stopped target Sockets.

Dec 18 13:43:13 systemd[32540]: Starting Shutdown.

Dec 18 13:43:13 systemd[32540]: Reached target Shutdown.

Dec 18 13:43:13 systemd[32540]: Starting Exit the Session...

Dec 18 13:43:13 systemd[32540]: Received SIGRTMIN+24 from PID 32548 (kill).

Dec 18 13:43:27 kernel: [49158.991100] audit: type=1400 audit(1450442607.793:1547): apparmor="DENIED" operation="file_perm" profile="lxc-container-default" name="private/trace" pid=26896 comm="qmgr" requested_mask="r" denied_mask="r" fsuid=105 ouid=0

Dec 18 13:43:27 kernel: [49158.991104] audit: type=1400 audit(1450442607.793:1548): apparmor="DENIED" operation="file_perm" profile="lxc-container-default" name="private/trace" pid=26896 comm="qmgr" requested_mask="r" denied_mask="r" fsuid=105 ouid=0

This is my PVE version:

proxmox-ve: 4.1-26 (running kernel: 4.2.6-1-pve)
pve-manager: 4.1-1 (running version: 4.1-1/2f9650d4)
pve-kernel-4.2.6-1-pve: 4.2.6-26
pve-kernel-2.6.32-43-pve: 2.6.32-166
pve-kernel-4.2.2-1-pve: 4.2.2-16
pve-kernel-2.6.32-26-pve: 2.6.32-114
pve-kernel-4.2.3-2-pve: 4.2.3-22
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 0.17.2-1
pve-cluster: 4.0-29
qemu-server: 4.0-41
pve-firmware: 1.1-7
libpve-common-perl: 4.0-41
libpve-access-control: 4.0-10
libpve-storage-perl: 4.0-38
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.4-17
pve-container: 1.0-32
pve-firewall: 2.0-14
pve-ha-manager: 1.0-14
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-5
lxcfs: 0.13-pve1
cgmanager: 0.39-pve1
criu: 1.6.0-1

What can I do here?
 
As a workaround, rebooting the LXC containers helped. I will monitor whether the error comes back.

Edit: I noticed that the error appears even after rebooting the host machine, so I have to shut down every machine and turn it on again.
 
Bump: Is there any solution for this error?

Since the last upgrade, the error appears 2-3 times a week. The problem is that I cannot even restart individual services:

Failed to kill control group: Transport endpoint is not connected
 
The transport error is very likely caused by lxcfs, a tool that allows containers running systemd to interact with the cgroup system and to show correct uptime and memory values inside the container.

https://linuxcontainers.org/lxcfs/introduction/
https://s3hh.wordpress.com/2015/02/23/introducing-lxcfs/

Is there anything special displayed on the host with:
Code:
journalctl -u lxcfs
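
If the service itself looks healthy, it might also be worth checking on the host whether the lxcfs FUSE mount still answers at all - a quick sketch, assuming the default /var/lib/lxcfs mountpoint from the shipped unit file:
Code:
# on the Proxmox host: both reads should return immediately;
# a hang or "Transport endpoint is not connected" points at a dead lxcfs
cat /var/lib/lxcfs/proc/uptime
head -n 3 /var/lib/lxcfs/proc/stat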

I also see some qmgr errors with AppArmor DENIED. A newer kernel has been available in the no-subscription repository for a few days now, which may fix that error.
 
Hi windinternet,

Thanks a lot for your reply. I installed the new kernel - now I'm waiting for a chance to reboot and will double-check whether the problem persists.

This is the output of journalctl -u lxcfs:

Dez 20 19:10:59 wirtssystem lxcfs[919]: send_creds: failed at sendmsg: No such process

Dez 20 19:11:01 wirtssystem lxcfs[919]: send_creds: Error getting reply from server over socketpair

Dez 20 19:11:01 wirtssystem lxcfs[919]: do_read_pids: failed to ask child to exit: No such process

Dez 20 19:11:01 wirtssystem lxcfs[919]: Failed to select for scm_cred: No such file or directory

Dez 20 19:11:04 wirtssystem lxcfs[919]: recursive_rmdir: failed to delete /run/lxcfs/controllers/name=systemd/lxc/102/system.slice/apache2.service: Device or resource busy

Dez 21 02:10:44 wirtssystem lxcfs[919]: recursive_rmdir: failed to delete /run/lxcfs/controllers/name=systemd/lxc/101/system.slice/apache2.service: Device or resource busy

Dez 21 02:14:20 wirtssystem lxcfs[919]: recursive_rmdir: failed to delete /run/lxcfs/controllers/name=systemd/lxc/103/system.slice/apache2.service: Device or resource busy

Dez 21 02:33:30 wirtssystem lxcfs[919]: recursive_rmdir: failed to delete /run/lxcfs/controllers/name=systemd/lxc/105/system.slice/apache2.service: Device or resource busy

Dez 21 02:35:40 wirtssystem lxcfs[919]: recursive_rmdir: failed to delete /run/lxcfs/controllers/name=systemd/lxc/104/system.slice/apache2.service: Device or resource busy

Dez 21 02:37:16 wirtssystem lxcfs[919]: recursive_rmdir: failed to delete /run/lxcfs/controllers/name=systemd/lxc/102/system.slice/apache2.service: Device or resource busy

Dez 21 03:04:42 wirtssystem lxcfs[919]: recursive_rmdir: failed to delete /run/lxcfs/controllers/name=systemd/lxc/102/system.slice/apache2.service: Device or resource busy

Dez 21 03:16:31 wirtssystem lxcfs[919]: recursive_rmdir: failed to delete /run/lxcfs/controllers/name=systemd/lxc/102/system.slice/apache2.service: Device or resource busy

Dez 21 03:16:35 wirtssystem lxcfs[919]: recursive_rmdir: failed to delete /run/lxcfs/controllers/name=systemd/lxc/102/system.slice/apache2.service: Device or resource busy

Dez 21 09:42:59 wirtssystem lxcfs[919]: send_creds: failed at sendmsg: No such process

Dez 21 09:43:01 wirtssystem lxcfs[919]: send_creds: Error getting reply from server over socketpair

Dez 21 09:43:01 wirtssystem lxcfs[919]: do_read_pids: failed to ask child to exit: No such process

Dez 21 09:43:01 wirtssystem lxcfs[919]: Failed to select for scm_cred: No such file or directory

Dez 21 09:51:09 wirtssystem lxcfs[919]: recursive_rmdir: failed to delete /run/lxcfs/controllers/name=systemd/lxc/101/system.slice/apache2.service: Device or resource busy

Dez 21 10:03:17 wirtssystem lxcfs[919]: send_creds: failed at sendmsg: No such process

Dez 21 10:03:19 wirtssystem lxcfs[919]: send_creds: Error getting reply from server over socketpair

Dez 21 10:03:19 wirtssystem lxcfs[919]: do_read_pids: failed to ask child to exit: No such process

Dez 21 10:03:19 wirtssystem lxcfs[919]: Failed to select for scm_cred: Success

Dez 21 10:03:19 wirtssystem lxcfs[919]: send_creds: failed at sendmsg: No such process

Dez 21 10:03:21 wirtssystem lxcfs[919]: send_creds: Error getting reply from server over socketpair

Dez 21 10:03:21 wirtssystem lxcfs[919]: Failed to select for scm_cred: Success

Dez 21 10:03:21 wirtssystem lxcfs[919]: do_read_pids: failed to ask child to exit: No such process

Dez 21 10:04:01 wirtssystem lxcfs[919]: send_creds: failed at sendmsg: No such process

Dez 21 10:04:03 wirtssystem lxcfs[919]: Failed to select for scm_cred: No such file or directory

Dez 21 10:04:03 wirtssystem lxcfs[919]: send_creds: Error getting reply from server over socketpair

Dez 21 10:33:55 wirtssystem lxcfs[919]: recursive_rmdir: failed to delete /run/lxcfs/controllers/name=systemd/lxc/102/system.slice/apache2.service: Device or resource busy

Dez 21 11:06:45 wirtssystem lxcfs[919]: recursive_rmdir: failed to delete /run/lxcfs/controllers/name=systemd/lxc/101/system.slice/apache2.service: Device or resource busy

Dez 21 11:20:01 wirtssystem lxcfs[919]: recursive_rmdir: failed to delete /run/lxcfs/controllers/name=systemd/lxc/104/system.slice/mysql.service: Device or resource busy

Dez 21 11:20:03 wirtssystem lxcfs[919]: send_creds: failed at sendmsg: No such process

Dez 21 11:20:05 wirtssystem lxcfs[919]: Failed to select for scm_cred: Success

Dez 21 11:20:05 wirtssystem lxcfs[919]: send_creds: Error getting reply from server over socketpair

Dez 21 11:20:05 wirtssystem lxcfs[919]: do_read_pids: failed to ask child to exit: No such process

Dez 21 12:20:07 wirtssystem lxcfs[919]: recursive_rmdir: failed to delete /run/lxcfs/controllers/name=systemd/lxc/105/system.slice/apache2.service: Device or resource busy

Dez 21 12:21:24 wirtssystem lxcfs[919]: recursive_rmdir: failed to delete /run/lxcfs/controllers/name=systemd/lxc/105/system.slice/apache2.service: Device or resource busy

Dez 21 12:31:49 wirtssystem lxcfs[919]: send_creds: failed at sendmsg: No such process

Dez 21 12:31:51 wirtssystem lxcfs[919]: send_creds: Error getting reply from server over socketpair

Dez 21 12:31:51 wirtssystem lxcfs[919]: do_read_pids: failed to ask child to exit: No such process

Dez 21 12:31:51 wirtssystem lxcfs[919]: Failed to select for scm_cred: No such file or directory

Dez 21 12:34:23 wirtssystem lxcfs[919]: recursive_rmdir: failed to delete /run/lxcfs/controllers/name=systemd/lxc/101/system.slice/apache2.service: Device or resource busy

Dez 21 12:35:22 wirtssystem lxcfs[919]: send_creds: failed at sendmsg: No such process

Dez 21 12:35:24 wirtssystem lxcfs[919]: send_creds: Error getting reply from server over socketpair

Dez 21 12:35:24 wirtssystem lxcfs[919]: Failed to select for scm_cred: Success

Dez 21 12:35:57 wirtssystem lxcfs[919]: recursive_rmdir: failed to delete /run/lxcfs/controllers/name=systemd/lxc/101/system.slice/apache2.service: Device or resource busy

Dez 21 12:46:59 wirtssystem lxcfs[919]: recursive_rmdir: failed to delete /run/lxcfs/controllers/name=systemd/lxc/101/system.slice/apache2.service: Device or resource busy

Dez 21 16:43:46 wirtssystem lxcfs[919]: send_creds: failed at sendmsg: No such process

Dez 21 16:43:48 wirtssystem lxcfs[919]: send_creds: Error getting reply from server over socketpair

Dez 21 16:43:48 wirtssystem lxcfs[919]: do_read_pids: failed to ask child to exit: No such process

Dez 21 16:43:48 wirtssystem lxcfs[919]: Failed to select for scm_cred: No such file or directory

Dez 21 16:43:48 wirtssystem lxcfs[919]: send_creds: failed at sendmsg: No such process

Dez 21 16:43:50 wirtssystem lxcfs[919]: send_creds: Error getting reply from server over socketpair

Dez 21 16:43:50 wirtssystem lxcfs[919]: do_read_pids: failed to ask child to exit: No such process

Dez 21 16:43:50 wirtssystem lxcfs[919]: Failed to select for scm_cred: No such file or directory

Dez 21 16:43:50 wirtssystem lxcfs[919]: send_creds: failed at sendmsg: No such process

Dez 21 16:43:52 wirtssystem lxcfs[919]: send_creds: Error getting reply from server over socketpair

Dez 21 16:43:52 wirtssystem lxcfs[919]: do_read_pids: failed to ask child to exit: No such process

Dez 21 16:43:52 wirtssystem lxcfs[919]: Failed to select for scm_cred: No such file or directory

Dez 21 17:44:18 wirtssystem lxcfs[919]: send_creds: failed at sendmsg: No such process

Dez 21 17:44:20 wirtssystem lxcfs[919]: send_creds: Error getting reply from server over socketpair

Dez 21 17:44:20 wirtssystem lxcfs[919]: do_read_pids: failed to ask child to exit: No such process

Dez 21 17:44:20 wirtssystem lxcfs[919]: Failed to select for scm_cred: No such file or directory

Dez 21 17:44:20 wirtssystem lxcfs[919]: send_creds: failed at sendmsg: No such process

Dez 21 17:44:22 wirtssystem lxcfs[919]: send_creds: Error getting reply from server over socketpair

Dez 21 17:44:22 wirtssystem lxcfs[919]: do_read_pids: failed to ask child to exit: No such process

Dez 21 17:44:22 wirtssystem lxcfs[919]: Failed to select for scm_cred: Success

Dez 21 17:44:22 wirtssystem lxcfs[919]: send_creds: failed at sendmsg: No such process

Dez 21 17:44:24 wirtssystem lxcfs[919]: send_creds: Error getting reply from server over socketpair

Dez 21 17:44:24 wirtssystem lxcfs[919]: do_read_pids: failed to ask child to exit: No such process

Dez 21 17:44:24 wirtssystem lxcfs[919]: Failed to select for scm_cred: Success

Dez 21 22:17:34 wirtssystem lxcfs[919]: send_creds: failed at sendmsg: No such process

Dez 21 22:17:36 wirtssystem lxcfs[919]: send_creds: Error getting reply from server over socketpair

Dez 21 22:17:36 wirtssystem lxcfs[919]: do_read_pids: failed to ask child to exit: No such process

Dez 21 22:17:36 wirtssystem lxcfs[919]: Failed to select for scm_cred: No such file or directory

Edit: Due to the forum's character limit, the full log is pasted here: http://d.pr/18cna
 
Can you tie the recursive_rmdir errors to the moments when that container was shut down?

It's probably a race condition on shutdown, where lxc-stop tries to remove those directories while the container is still holding on to them. It can sometimes happen on start too, but only if the container is improperly configured, and it might affect later starts of the container. If all containers are stopped, you should be able to refresh lxcfs by restarting the lxcfs service.
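
A rough sketch of that sequence (the container IDs are taken from your logs - adjust to your setup):
Code:
# stop all containers, then refresh lxcfs, then start them again
for id in 101 102 103 104 105; do pct stop $id; done
service lxcfs restart        # or: systemctl restart lxcfs
for id in 101 102 103 104 105; do pct start $id; done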
 
It's possible that the error appears on those events. I never saw it "live" - I always noticed it later, when I tried to restart services like dovecot or php5-fpm.

So you would suggest stopping all containers, restarting lxcfs (service lxcfs restart), and then starting the containers again?

What drives me crazy is that I have already restarted the whole system and started the containers, and they all work fine for hours - sometimes for days.
 
It may be that one container's shutdown ruins things for the other running containers, or it may be that these error messages are actually totally unrelated. In any case, not being able to restart services and the problems with top indicate that lxcfs is no longer responding. Consequently systemd cannot function, because it works with the cgroups that lxcfs emulates inside the container - and the same goes for the emulated files that top and uptime read.

If that is the case, trying a service lxcfs restart wouldn't hurt.
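
To confirm that from inside an affected container, something like this should already tell you - if lxcfs has gone away, these reads typically fail with exactly the "Transport endpoint is not connected" error you are seeing:
Code:
# run inside the container
cat /proc/uptime
head -n 3 /proc/stat
systemctl status cron    # any unit will do; systemd needs the emulated cgroup tree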
 
I looked up some logs, and actually the recursive_rmdir error seems 'normal'.

The send_creds and other errors are not, however.
 
Hello windinternet,

unfortunately I'm still not able to reproduce this error. I'm also unable to fix it without rebooting the whole machine.

I just don't have any idea how to fix this. :-(
 
Looking back at your package list, I notice that you have (or had) lxcfs 0.13-pve1. The current version is lxcfs 0.13-pve2, which contains some fixes. The newest kernel may also help with the Postfix (qmgr) errors that were visible in the logs.

It may be that you can't reproduce anymore because you did the updates.
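
If the host is not fully up to date yet, the usual update run should pull both in (a sketch for the no-subscription repository):
Code:
apt-get update
apt-get dist-upgrade
# afterwards, verify the installed versions
dpkg -l lxcfs pve-kernel-4.2.6-1-pve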
 
Sorry, I forgot to give you an update. Last night I upgraded to the following packages:

proxmox-ve: 4.1-28 (running kernel: 4.2.6-1-pve)
pve-manager: 4.1-2 (running version: 4.1-2/78c5f4a2)
pve-kernel-4.2.6-1-pve: 4.2.6-28
pve-kernel-2.6.32-43-pve: 2.6.32-166
pve-kernel-4.2.2-1-pve: 4.2.2-16
pve-kernel-2.6.32-26-pve: 2.6.32-114
pve-kernel-4.2.3-2-pve: 4.2.3-22
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 0.17.2-1
pve-cluster: 4.0-29
qemu-server: 4.0-42
pve-firmware: 1.1-7
libpve-common-perl: 4.0-42
libpve-access-control: 4.0-10
libpve-storage-perl: 4.0-38
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.4-18
pve-container: 1.0-35
pve-firewall: 2.0-14
pve-ha-manager: 1.0-16
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-5
lxcfs: 0.13-pve2
cgmanager: 0.39-pve1
criu: 1.6.0-1

The server has already been rebooted, but the error has occurred again.
 
Okay. Is it an error with one container while the others keep running, or do all your systemd LXC containers get stuck, unable to restart services or display top? Did you get the send_creds errors in the syslog again? Any other errors around the same time in the syslog?

In the stuck container:
What is the output of systemctl --version? What is the content of /etc/*-release or /etc/issue?
 
Hello and thanks a lot for your reply :)

Well, all containers are running, but they all have the same problem at the same time with top/htop and with reloading/starting services via systemd.

This is the output of systemctl --version on one of the containers (currently showing this error):
systemd 215

+PAM +AUDIT +SELINUX +IMA +SYSVINIT +LIBCRYPTSETUP +GCRYPT +ACL +XZ -SECCOMP -APPARMOR

Output of /etc/*-release:

cat /etc/*-release

PRETTY_NAME="Debian GNU/Linux 8 (jessie)"
NAME="Debian GNU/Linux"
VERSION_ID="8"
VERSION="8 (jessie)"
ID=debian
HOME_URL="http://www.debian.org/"
SUPPORT_URL="http://www.debian.org/support/"
BUG_REPORT_URL="https://bugs.debian.org/"
12.5.30 debian8.0.build1205150826.19


Output of /etc/issue:

Debian GNU/Linux 8 \n \l


The syslog shows these lines if I grep for send_creds:

Dec 23 09:40:43 wirtssystem lxcfs[917]: send_creds: failed at sendmsg: No such process
Dec 23 09:40:45 wirtssystem lxcfs[917]: send_creds: Error getting reply from server over socketpair
Dec 23 10:32:10 wirtssystem lxcfs[5841]: send_creds: failed at sendmsg: No such process
Dec 23 10:32:12 wirtssystem lxcfs[5841]: send_creds: Error getting reply from server over socketpair
Dec 23 14:26:26 wirtssystem lxcfs[5841]: send_creds: failed at sendmsg: No such process
Dec 23 14:26:28 wirtssystem lxcfs[5841]: send_creds: Error getting reply from server over socketpair
Dec 23 15:08:48 wirtssystem lxcfs[26620]: send_creds: failed at sendmsg: No such process
Dec 23 15:08:48 wirtssystem lxcfs[26620]: send_creds: failed at sendmsg: No such process
Dec 23 15:08:51 wirtssystem lxcfs[26620]: send_creds: failed at sendmsg: No such process
Dec 23 15:08:53 wirtssystem lxcfs[26620]: send_creds: Error getting reply from server over socketpair
Dec 23 15:09:34 wirtssystem lxcfs[26620]: send_creds: failed at sendmsg: No such process
Dec 23 15:09:36 wirtssystem lxcfs[26620]: send_creds: Error getting reply from server over socketpair
Dec 23 15:09:36 wirtssystem lxcfs[26620]: send_creds: failed at sendmsg: No such process
Dec 23 15:09:38 wirtssystem lxcfs[26620]: send_creds: Error getting reply from server over socketpair
Dec 23 15:09:39 wirtssystem lxcfs[26620]: send_creds: failed at sendmsg: No such process
Dec 23 15:09:39 wirtssystem lxcfs[26620]: send_creds: failed at sendmsg: No such process
Dec 23 15:09:41 wirtssystem lxcfs[26620]: send_creds: Error getting reply from server over socketpair
Dec 23 15:10:20 wirtssystem lxcfs[26620]: send_creds: failed at sendmsg: No such process
Dec 23 15:10:22 wirtssystem lxcfs[26620]: send_creds: Error getting reply from server over socketpair
Dec 23 15:15:20 wirtssystem lxcfs[26620]: send_creds: failed at sendmsg: No such process
Dec 23 15:15:56 wirtssystem lxcfs[26620]: send_creds: failed at sendmsg: No such process
Dec 23 15:15:58 wirtssystem lxcfs[26620]: send_creds: Error getting reply from server over socketpair
Dec 23 15:41:23 wirtssystem lxcfs[920]: send_creds: failed at sendmsg: No such process
Dec 23 15:41:25 wirtssystem lxcfs[920]: send_creds: Error getting reply from server over socketpair
Dec 23 15:41:37 wirtssystem lxcfs[920]: send_creds: failed at sendmsg: No such process
Dec 23 15:44:23 wirtssystem lxcfs[920]: send_creds: failed at sendmsg: No such process
Dec 23 15:44:25 wirtssystem lxcfs[920]: send_creds: Error getting reply from server over socketpair
Dec 23 16:01:06 wirtssystem lxcfs[12474]: send_creds: failed at sendmsg: No such process
Dec 23 16:01:08 wirtssystem lxcfs[12474]: send_creds: Error getting reply from server over socketpair

I have not restarted the containers for a while now - maybe that's why there are no new entries in the syslog.


I have already thought about faulty mount options. This is the fstab file in a container:

cat /etc/fstab

proc /proc proc defaults 0 0
none /dev/pts devpts rw,gid=5,mode=620 0 0
none /run/shm tmpfs defaults 0 0
 
No, I think what is hurting you is lxcfs not being able to keep up with quickly spawning daemons while the process scheduler is under stress. It uses a process in the container to read process IDs, and normal processes get scheduled out as well.

Maybe it helps to give the busiest container more CPU priority.
 
Hello windinternet,

At the moment the CPU limit is set to 12 and all 5 containers have 2048 CPU units.

I thought that should be enough - maybe I should increase it to 4000 units?
 
It's a relative setting. If you increase it for all containers, it has no effect. You have to interpret the value as the weight a container gets relative to the other containers: if they all have the same weight, they all get the same share of CPU time.
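
So if one container should get priority, raise the weight for that one only - for example (a sketch using the pct syntax of PVE 4.x; the container ID is only an example):
Code:
# give container 104 roughly twice the weight of the others (which stay at 2048)
pct set 104 -cpuunits 4096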
 
Okay, I did not express it correctly: I could increase the value for the most CPU-intensive container, right?

The CPU limit means the percentage of total CPU time, correct?
 
Maybe it would also make a difference to add:
Code:
Nice=-20

in the [Service] section of the /etc/systemd/system/multi-user.target.wants/lxcfs.service file and then restart the lxcfs service.
 
Is it important where it is added?

[Unit]
Description=FUSE filesystem for LXC
ConditionVirtualization=!container
Before=lxc.service

[Service]
ExecStart=/usr/bin/lxcfs -f -s -o allow_other /var/lib/lxcfs/
KillMode=none
Restart=on-failure
ExecStop=/bin/fusermount -u /var/lib/lxcfs

[Install]
WantedBy=multi-user.target
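
I would guess it belongs in the [Service] section, something like this (please correct me if that's wrong) - followed by a systemctl daemon-reload and a restart of lxcfs:
Code:
[Service]
Nice=-20
ExecStart=/usr/bin/lxcfs -f -s -o allow_other /var/lib/lxcfs/
KillMode=none
Restart=on-failure
ExecStop=/bin/fusermount -u /var/lib/lxcfs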
 
