LXC container not starting

hashime

I have a container that I had to kill (kill -f) because it was using 100% of RAM and CPU and could not be rebooted or stopped via the GUI or the pct command. Now it won't start again.
Ideally, I don't want to reboot the server.
Please advise.

logfile:
Code:
lxc-start 133 20240812105111.402 INFO     confile - ../src/lxc/confile.c:set_config_idmaps:2273 - Read uid map: type u nsid 0 hostid 100000 range 65536
lxc-start 133 20240812105111.402 INFO     confile - ../src/lxc/confile.c:set_config_idmaps:2273 - Read uid map: type g nsid 0 hostid 100000 range 65536
lxc-start 133 20240812105111.402 TRACE    commands - ../src/lxc/commands.c:lxc_cmd_timeout:525 - Connection refused - Command "get_init_pid" failed to connect command socket
lxc-start 133 20240812105111.402 TRACE    commands - ../src/lxc/commands.c:lxc_cmd_timeout:525 - Connection refused - Command "get_state" failed to connect command socket
lxc-start 133 20240812105111.403 TRACE    commands - ../src/lxc/commands.c:lxc_server_init:2138 - Created abstract unix socket "/var/lib/lxc/133/command"
lxc-start 133 20240812105111.403 TRACE    start - ../src/lxc/start.c:lxc_init_handler:755 - Unix domain socket 4 for command server is ready
lxc-start 133 20240812105111.403 TRACE    start - ../src/lxc/start.c:lxc_start:2228 - Doing lxc_start
lxc-start 133 20240812105111.403 INFO     lsm - ../src/lxc/lsm/lsm.c:lsm_init_static:38 - Initialized LSM security driver AppArmor
lxc-start 133 20240812105111.403 TRACE    start - ../src/lxc/start.c:lxc_init:779 - Initialized LSM
lxc-start 133 20240812105111.403 TRACE    start - ../src/lxc/start.c:lxc_serve_state_clients:484 - Set container state to STARTING
lxc-start 133 20240812105111.403 TRACE    start - ../src/lxc/start.c:lxc_serve_state_clients:487 - No state clients registered
lxc-start 133 20240812105111.403 TRACE    start - ../src/lxc/start.c:lxc_init:785 - Set container state to "STARTING"
lxc-start 133 20240812105111.403 TRACE    start - ../src/lxc/start.c:lxc_init:841 - Set environment variables
lxc-start 133 20240812105111.403 INFO     utils - ../src/lxc/utils.c:run_script_argv:587 - Executing script "/usr/share/lxc/hooks/lxc-pve-prestart-hook" for container "133", config section "lxc"
lxc-start 133 20240812105112.409 DEBUG    utils - ../src/lxc/utils.c:run_buffer:560 - Script exec /usr/share/lxc/hooks/lxc-pve-prestart-hook 133 lxc pre-start produced output: failed to remove directory '/sys/fs/cgroup/lxc/133/ns/system.slice/cron.service': Device or resource busy

lxc-start 133 20240812105112.424 ERROR    utils - ../src/lxc/utils.c:run_buffer:571 - Script exited with status 16
lxc-start 133 20240812105112.424 ERROR    start - ../src/lxc/start.c:lxc_init:845 - Failed to run lxc.hook.pre-start for container "133"
lxc-start 133 20240812105112.424 ERROR    start - ../src/lxc/start.c:__lxc_start:2034 - Failed to initialize container "133"
lxc-start 133 20240812105112.425 TRACE    start - ../src/lxc/start.c:lxc_serve_state_clients:484 - Set container state to ABORTING
lxc-start 133 20240812105112.425 TRACE    start - ../src/lxc/start.c:lxc_serve_state_clients:487 - No state clients registered
lxc-start 133 20240812105112.425 TRACE    start - ../src/lxc/start.c:lxc_serve_state_clients:484 - Set container state to STOPPING
lxc-start 133 20240812105112.425 TRACE    start - ../src/lxc/start.c:lxc_serve_state_clients:487 - No state clients registered
lxc-start 133 20240812105112.425 TRACE    start - ../src/lxc/start.c:lxc_end:964 - Closed command socket
lxc-start 133 20240812105112.425 TRACE    start - ../src/lxc/start.c:lxc_end:975 - Set container state to "STOPPED"
lxc-start 133 20240812105112.425 INFO     utils - ../src/lxc/utils.c:run_script_argv:587 - Executing script "/usr/share/lxcfs/lxc.reboot.hook" for container "133", config section "lxc"
lxc-start 133 20240812105112.928 INFO     utils - ../src/lxc/utils.c:run_script_argv:587 - Executing script "/usr/share/lxc/hooks/lxc-pve-poststop-hook" for container "133", config section "lxc"
lxc-start 133 20240812105113.933 DEBUG    utils - ../src/lxc/utils.c:run_buffer:560 - Script exec /usr/share/lxc/hooks/lxc-pve-poststop-hook 133 lxc post-stop produced output: umount: /var/lib/lxc/133/rootfs: not mounted

lxc-start 133 20240812105113.933 DEBUG    utils - ../src/lxc/utils.c:run_buffer:560 - Script exec /usr/share/lxc/hooks/lxc-pve-poststop-hook 133 lxc post-stop produced output: command 'umount --recursive -- /var/lib/lxc/133/rootfs' failed: exit code 1

lxc-start 133 20240812105113.956 ERROR    utils - ../src/lxc/utils.c:run_buffer:571 - Script exited with status 1
lxc-start 133 20240812105113.957 ERROR    start - ../src/lxc/start.c:lxc_end:986 - Failed to run lxc.hook.post-stop for container "133"
 
Hi,
It seems the following cgroup directory is causing the issue:
Code:
failed to remove directory '/sys/fs/cgroup/lxc/133/ns/system.slice/cron.service': Device or resource busy
Please check whether a left-over process related to the container is still running, e.g. with ps faxl, or whether something is still using that directory:
Code:
fuser -vau /sys/fs/cgroup/lxc/133/ns/system.slice/cron.service
lsof /sys/fs/cgroup/lxc/133/ns/system.slice/cron.service

For completeness, please post the output of pveversion -v and the container configuration pct config 133.
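As a complement to fuser and lsof, which report open file handles, here is a minimal sketch that asks the kernel directly which PIDs it still accounts to the container's cgroup subtree. It relies on the cgroup v2 convention that every cgroup directory exposes its members in a file named cgroup.procs; the function name cgroup_pids is made up for illustration and is not part of any Proxmox tool.

```shell
#!/bin/sh
# Hypothetical helper (not a Proxmox command): print "PID cgroup-dir" for every
# process still listed in a cgroup.procs file below the given directory.
# Example call on the host: cgroup_pids /sys/fs/cgroup/lxc/133
cgroup_pids() {
    root=$1
    # Nothing to report if the cgroup directory is already gone.
    [ -d "$root" ] || return 0
    find "$root" -name cgroup.procs 2>/dev/null | while IFS= read -r f; do
        while IFS= read -r pid; do
            if [ -n "$pid" ]; then
                printf '%s %s\n' "$pid" "${f%/cgroup.procs}"
            fi
        done < "$f"
    done
}
```

If this prints nothing while the directory still cannot be removed, the leftover is probably not an ordinary live process, and the ps output becomes more interesting.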
 
Code:
# fuser -vau /sys/fs/cgroup/lxc/133/ns/system.slice/cron.service
USER PID ACCESS COMMAND
/sys/fs/cgroup/lxc/133/ns/system.slice/cron.service:

# lsof /sys/fs/cgroup/lxc/133/ns/system.slice/cron.service
#

No process is using it, as far as I can see.

Code:
# pveversion -v
proxmox-ve: 8.2.0 (running kernel: 6.5.11-6-pve)
pve-manager: 8.2.4 (running version: 8.2.4/faa83925c9641325)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.8-4
proxmox-kernel-6.8.8-4-pve-signed: 6.8.8-4
proxmox-kernel-6.8.8-2-pve-signed: 6.8.8-2
proxmox-kernel-6.5.13-6-pve-signed: 6.5.13-6
proxmox-kernel-6.5: 6.5.13-6
proxmox-kernel-6.5.11-6-pve-signed: 6.5.11-6
pve-kernel-5.4: 6.4-20
proxmox-kernel-6.2.16-20-pve: 6.2.16-20
proxmox-kernel-6.2: 6.2.16-20
pve-kernel-5.4.203-1-pve: 5.4.203-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
ceph-fuse: 17.2.7-pve3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown: residual config
ifupdown2: 3.2.0-1+pmx9
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.7
libpve-cluster-perl: 8.0.7
libpve-common-perl: 8.2.1
libpve-guest-common-perl: 5.1.4
libpve-http-server-perl: 5.1.0
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.9
libpve-storage-perl: 8.2.3
libqb0: 1.0.5-1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.2.7-1
proxmox-backup-file-restore: 3.2.7-1
proxmox-firewall: 0.5.0
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.6
proxmox-widget-toolkit: 4.2.3
pve-cluster: 8.0.7
pve-container: 5.1.12
pve-docs: 8.2.2
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.1
pve-firewall: 5.0.7
pve-firmware: 3.13-1
pve-ha-manager: 4.0.5
pve-i18n: 3.2.2
pve-qemu-kvm: 9.0.2-1
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.3
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.4-pve1

Code:
# pct config 133
arch: amd64
cores: 2
features: nesting=1
hostname: lxt-prod-cab-seafile01
memory: 4500
net0: name=eth0,bridge=vmbr3006,gw=10.75.0.1,hwaddr=CE:70:3A:1A:42:05,ip=10.75.75.133/16,type=veth
onboot: 1
ostype: debian
protection: 1
rootfs: remote-rbd:vm-133-disk-0,size=310G
startup: order=8000
swap: 0
unprivileged: 1
 
Regarding "no process as far as I can see":
Did you also check with ps?
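One way to narrow the ps output down to this container is sketched below. It assumes the procps ps on the host supports the cgroup output column and that the container's processes live under a path containing /lxc/133, as the log suggests; container_procs is a hypothetical helper name. It prints the PID and the STAT code, where D marks uninterruptible sleep and Z a zombie.

```shell
#!/bin/sh
# Hypothetical filter: reads "PID STAT CGROUP" lines on stdin and keeps those
# whose cgroup path is /lxc/<ctid> or lies below it.
# Example call on the host:
#   ps -eww -o pid=,stat=,cgroup= | container_procs 133
container_procs() {
    ctid=$1
    # Match "/lxc/133" only when followed by "/" or end of field,
    # so that container 1330 is not reported for container 133.
    awk -v pat="/lxc/$ctid(/|\$)" '$3 ~ pat { print $1, $2 }'
}
```

A process stuck in state D would explain why killing the container did not fully clean up, although that is only one possible cause.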
Code:
proxmox-ve: 8.2.0 (running kernel: 6.5.11-6-pve)
pve-manager: 8.2.4 (running version: 8.2.4/faa83925c9641325)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.8-4
proxmox-kernel-6.8.8-4-pve-signed: 6.8.8-4
proxmox-kernel-6.8.8-2-pve-signed: 6.8.8-2
proxmox-kernel-6.5.13-6-pve-signed: 6.5.13-6
proxmox-kernel-6.5: 6.5.13-6
proxmox-kernel-6.5.11-6-pve-signed: 6.5.11-6
The running kernel (6.5.11-6-pve) is older than the newest installed one (6.8.8-4-pve). Was a new kernel installed during this boot? In that case, a reboot is recommended.
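To compare the running kernel with the newest installed one without reading the whole pveversion output, a rough sketch follows. It assumes the Debian layout where each installed kernel has an image named /boot/vmlinuz-<version>, and it uses GNU sort -V for version ordering; both function names are invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the highest kernel version that has an image
# in the given boot directory (assumed naming: vmlinuz-<version>).
newest_installed_kernel() {
    bootdir=${1:-/boot}
    for img in "$bootdir"/vmlinuz-*; do
        # Skip the unexpanded glob when no image exists.
        [ -e "$img" ] || continue
        printf '%s\n' "${img##*/vmlinuz-}"
    done | sort -V | tail -n 1
}

# Hypothetical helper: compare the running version ($1) with the newest
# installed version ($2) and print a one-line summary.
# Example call on the host:
#   kernel_reboot_hint "$(uname -r)" "$(newest_installed_kernel /boot)"
kernel_reboot_hint() {
    if [ "$1" = "$2" ]; then
        echo "running the newest installed kernel: $1"
    else
        echo "running $1, newest installed is $2"
    fi
}
```

A mismatch only shows that a different kernel would be booted next time; it does not by itself prove that the kernel was installed during the current boot.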