LXC containers refuse to start

molnart

I have an issue with my LXC containers. I am not able to start any LXC containers nor create new ones.

The console output is as follows (this is for a new CT that I tried to create from scratch):
Code:
root@pve:~# pct start 112
run_buffer: 323 Script exited with status 1
lxc_setup: 3291 Failed to run mount hooks
do_start: 1224 Failed to setup container "112"
__sync_wait: 41 An error occurred in another process (expected sequence number 5)
__lxc_start: 1950 Failed to spawn container "112"
startup for container '112' failed

The log file for an existing container that was running fine a while ago, captured via lxc-start -n 110 -F -l DEBUG --logfile=boot110.log --logpriority=INFO, is attached.


Here is the container config file:
Code:
arch: amd64
cores: 1
hostname: Ubuntu-playground
memory: 512
net0: name=eth0,bridge=vmbr1,firewall=1,gw=192.168.50.1,hwaddr=EA:AD:68:20:AE:42,ip=192.168.50.13/24,type=veth
ostype: ubuntu
rootfs: local-lvm:vm-110-disk-0,size=16G
swap: 512
unprivileged: 1
 


Hi,
could you share the output of pveversion -v and systemctl status lxcfs.service?
 
Code:
root@pve:~# pveversion -v
proxmox-ve: 6.3-1 (running kernel: 5.4.78-2-pve)
pve-manager: 6.3-3 (running version: 6.3-3/eee5f901)
pve-kernel-5.4: 6.3-3
pve-kernel-helper: 6.3-3
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.78-1-pve: 5.4.78-1
pve-kernel-5.4.65-1-pve: 5.4.65-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.7
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-2
libpve-guest-common-perl: 3.1-4
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-4
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.6-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-2
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-8
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-3
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1

Code:
● lxcfs.service - FUSE filesystem for LXC
   Loaded: loaded (/lib/systemd/system/lxcfs.service; enabled; vendor preset: enabled)
   Active: active (running) since Sat 2021-01-30 23:24:23 CET; 4 days ago
     Docs: man:lxcfs(1)
 Main PID: 910 (lxcfs)
    Tasks: 5 (limit: 9830)
   Memory: 6.6M
   CGroup: /system.slice/lxcfs.service
           └─910 /usr/bin/lxcfs /var/lib/lxcfs

Jan 30 23:24:23 pve lxcfs[910]: - proc_diskstats
Jan 30 23:24:23 pve lxcfs[910]: - proc_loadavg
Jan 30 23:24:23 pve lxcfs[910]: - proc_meminfo
Jan 30 23:24:23 pve lxcfs[910]: - proc_stat
Jan 30 23:24:23 pve lxcfs[910]: - proc_swaps
Jan 30 23:24:23 pve lxcfs[910]: - proc_uptime
Jan 30 23:24:23 pve lxcfs[910]: - shared_pidns
Jan 30 23:24:23 pve lxcfs[910]: - cpuview_daemon
Jan 30 23:24:23 pve lxcfs[910]: - loadavg_daemon
Jan 30 23:24:23 pve lxcfs[910]: - pidfds
 
The lxc.mount.hook fails, but I cannot see why from the other information present.

Please do the following and hopefully we'll see why the script fails:

1. Make a copy of the script:
Code:
cp /usr/share/lxcfs/lxc.mount.hook /usr/share/lxcfs/lxc.mount.hook.bak

2. Edit the script /usr/share/lxcfs/lxc.mount.hook and replace the first line
Code:
#!/bin/sh -e
with
Code:
#!/bin/bash

set -x
exec &> /tmp/lxchook.log
This way, the commands in the script will be logged to the file /tmp/lxchook.log when it is executed.

3. Try starting your container

4. Replace the script with the original
Code:
cp /usr/share/lxcfs/lxc.mount.hook.bak /usr/share/lxcfs/lxc.mount.hook
rm /usr/share/lxcfs/lxc.mount.hook.bak

5. Post the log
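
If you would rather not edit the hook by hand, steps 1, 2 and 4 above could be scripted roughly as follows. This is only a sketch: the function names enable_hook_trace and restore_hook are made up for illustration, and it assumes the hook's first line is the shebang line that should be replaced.

```shell
# Sketch of steps 1, 2 and 4. Function names are hypothetical, not part of lxcfs.

# Back up the hook, then rewrite it with a tracing preamble in place of
# its original first line. $1 = hook path, $2 = log file path.
enable_hook_trace() {
    cp -p "$1" "$1.bak" || return 1
    {
        printf '%s\n' '#!/bin/bash' '' 'set -x' "exec &> $2"
        tail -n +2 "$1.bak"    # everything after the original first line
    } > "$1"
}

# Put the original hook back and remove the backup. $1 = hook path.
restore_hook() {
    cp -p "$1.bak" "$1" && rm "$1.bak"
}

# Intended use on the host (as root), with the paths from the steps above:
#   enable_hook_trace /usr/share/lxcfs/lxc.mount.hook /tmp/lxchook.log
#   pct start 112
#   restore_hook /usr/share/lxcfs/lxc.mount.hook
```

The backup is taken before anything is written, so if the copy fails the hook is left untouched.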
 
Restarting my host solved the problem for a while, but now it is back: for some reason lxcfs.service keeps stopping.
Restarting it makes the containers work again, but something still seems to be wrong.

Code:
# systemctl status lxcfs.service
● lxcfs.service - FUSE filesystem for LXC
   Loaded: loaded (/lib/systemd/system/lxcfs.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Sun 2021-02-14 17:41:31 CET; 4 days ago
     Docs: man:lxcfs(1)
 Main PID: 910 (code=exited, status=0/SUCCESS)

Feb 14 16:31:08 pve lxcfs[910]: utils.c: 291: send_creds: No such process - Failed getting reply from server over socket
Feb 14 16:34:36 pve lxcfs[910]: utils.c: 313: send_creds: No such process - Failed at sendmsg: 1
Feb 14 16:34:38 pve lxcfs[910]: utils.c: 254: recv_creds: Timed out waiting for scm_cred: Invalid argument

Feb 14 16:34:38 pve lxcfs[910]: utils.c: 291: send_creds: No such process - Failed getting reply from server over socket
Feb 14 16:38:06 pve lxcfs[910]: utils.c: 313: send_creds: No such process - Failed at sendmsg: 1
Feb 14 16:38:08 pve lxcfs[910]: utils.c: 254: recv_creds: Timed out waiting for scm_cred: Invalid argument
Feb 14 16:38:08 pve lxcfs[910]: utils.c: 291: send_creds: No such process - Failed getting reply from server over socket
Feb 14 17:41:31 pve lxcfs[910]: Running destructor lxcfs_exit
Feb 14 17:41:31 pve fusermount[27128]: /bin/fusermount: failed to unmount /var/lib/lxcfs: Invalid argument
Feb 14 17:41:31 pve systemd[1]: lxcfs.service: Succeeded.
 
Anything else going on at the time before the destructor runs? Maybe something in /var/log/syslog?
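
To narrow that down, something like the following could pull out only the syslog lines from shortly before the destructor ran. It is a rough sketch that assumes the traditional "Mon DD HH:MM:SS" timestamp format seen in the status output above; the function name syslog_window is made up for illustration.

```shell
# Print the lines of a syslog-style file that fall on a given day and
# inside a given time window (inclusive).
# $1 = month abbreviation, $2 = day, $3 = from HH:MM:SS, $4 = to HH:MM:SS, $5 = file
syslog_window() {
    awk -v mon="$1" -v day="$2" -v from="$3" -v to="$4" \
        '$1 == mon && $2 + 0 == day + 0 && $3 >= from && $3 <= to' "$5"
}

# Intended use on the host, for the minutes leading up to 17:41:31:
#   syslog_window Feb 14 17:30:00 17:42:00 /var/log/syslog
```

On a systemd host, journalctl --since "2021-02-14 17:30" --until "2021-02-14 17:42" should give a similar view from the journal.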