An LXC container does not finish starting up after file system repairs.

croxis

After fixing some file system errors from a cranky shutdown, I got all my VMs and containers to work again. I have one container that refuses to start up, and I can't identify any errors from the command line. Is there a fix I don't know about, or would it be better to create a new container and transfer my needed files over?

Code:
root@pve:~# pveversion
pve-manager/7.3-6/723bb6ec (running kernel: 5.15.85-1-pve)

Code:
pct start 101 --debug
INFO     lsm - ../src/lxc/lsm/lsm.c:lsm_init_static:38 - Initialized LSM security driver AppArmor
INFO     conf - ../src/lxc/conf.c:run_script_argv:338 - Executing script "/usr/share/lxc/hooks/lxc-pve-prestart-hook" for container "101", config section "lxc"
INFO     cgfsng - ../src/lxc/cgroups/cgfsng.c:unpriv_systemd_create_scope:1227 - Running privileged, not using a systemd unit
DEBUG    seccomp - ../src/lxc/seccomp.c:parse_config_v2:656 - Host native arch is [3221225534]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "reject_force_umount  # comment this to allow umount -f;  not recommended"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:524 - Set seccomp rule to reject force umounts
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:524 - Set seccomp rule to reject force umounts
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:524 - Set seccomp rule to reject force umounts
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "[all]"
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "kexec_load errno 1"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[246:kexec_load] action[327681:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[246:kexec_load] action[327681:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[246:kexec_load] action[327681:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "open_by_handle_at errno 1"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[304:open_by_handle_at] action[327681:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[304:open_by_handle_at] action[327681:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[304:open_by_handle_at] action[327681:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "init_module errno 1"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[175:init_module] action[327681:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[175:init_module] action[327681:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[175:init_module] action[327681:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "finit_module errno 1"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[313:finit_module] action[327681:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[313:finit_module] action[327681:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[313:finit_module] action[327681:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:807 - Processing "delete_module errno 1"
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding native rule for syscall[176:delete_module] action[327681:errno] arch[0]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[176:delete_module] action[327681:errno] arch[1073741827]
INFO     seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:564 - Adding compat rule for syscall[176:delete_module] action[327681:errno] arch[1073741886]
INFO     seccomp - ../src/lxc/seccomp.c:parse_config_v2:1017 - Merging compat seccomp contexts into main context
INFO     start - ../src/lxc/start.c:lxc_init:881 - Container "101" is initialized
INFO     cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_monitor_create:1391 - The monitor process uses "lxc.monitor/101" as cgroup
DEBUG    storage - ../src/lxc/storage/storage.c:storage_query:231 - Detected rootfs type "dir"
INFO     cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_payload_create:1499 - The container process uses "lxc/101/ns" as inner and "lxc/101" as limit cgroup
INFO     start - ../src/lxc/start.c:lxc_spawn:1762 - Cloned CLONE_NEWNS
INFO     start - ../src/lxc/start.c:lxc_spawn:1762 - Cloned CLONE_NEWPID
INFO     start - ../src/lxc/start.c:lxc_spawn:1762 - Cloned CLONE_NEWUTS
INFO     start - ../src/lxc/start.c:lxc_spawn:1762 - Cloned CLONE_NEWIPC
INFO     start - ../src/lxc/start.c:lxc_spawn:1762 - Cloned CLONE_NEWNET
INFO     start - ../src/lxc/start.c:lxc_spawn:1762 - Cloned CLONE_NEWCGROUP
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved mnt namespace via fd 18 and stashed path as mnt:/proc/1117587/fd/18
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved pid namespace via fd 19 and stashed path as pid:/proc/1117587/fd/19
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved uts namespace via fd 20 and stashed path as uts:/proc/1117587/fd/20
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved ipc namespace via fd 21 and stashed path as ipc:/proc/1117587/fd/21
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved net namespace via fd 22 and stashed path as net:/proc/1117587/fd/22
DEBUG    start - ../src/lxc/start.c:lxc_try_preserve_namespace:139 - Preserved cgroup namespace via fd 23 and stashed path as cgroup:/proc/1117587/fd/23
WARN     cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_setup_limits_legacy:3155 - Invalid argument - Ignoring legacy cgroup limits on pure cgroup2 system
INFO     cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_setup_limits:3251 - Limits for the unified cgroup hierarchy have been setup
INFO     conf - ../src/lxc/conf.c:run_script_argv:338 - Executing script "/usr/share/lxc/lxcnetaddbr" for container "101", config section "net"
DEBUG    network - ../src/lxc/network.c:netdev_configure_server_veth:852 - Instantiated veth tunnel "veth101i0 <--> vethL3fhFE"
DEBUG    conf - ../src/lxc/conf.c:lxc_mount_rootfs:1437 - Mounted rootfs "/var/lib/lxc/101/rootfs" onto "/usr/lib/x86_64-linux-gnu/lxc/rootfs" with options "(null)"
INFO     conf - ../src/lxc/conf.c:setup_utsname:876 - Set hostname to "xxxx.xxxx.xxx"
DEBUG    network - ../src/lxc/network.c:setup_hw_addr:3821 - Mac address "F6:CA:68:C3:A9:D1" on "eth0" has been setup
DEBUG    network - ../src/lxc/network.c:lxc_network_setup_in_child_namespaces_common:3962 - Network device "eth0" has been setup
INFO     network - ../src/lxc/network.c:lxc_setup_network_in_child_namespaces:4019 - Finished setting up network devices with caller assigned names
INFO     conf - ../src/lxc/conf.c:mount_autodev:1220 - Preparing "/dev"
INFO     conf - ../src/lxc/conf.c:mount_autodev:1281 - Prepared "/dev"
DEBUG    conf - ../src/lxc/conf.c:lxc_mount_auto_mounts:736 - Invalid argument - Tried to ensure procfs is unmounted
DEBUG    conf - ../src/lxc/conf.c:lxc_mount_auto_mounts:759 - Invalid argument - Tried to ensure sysfs is unmounted
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2445 - Remounting "/sys/fs/fuse/connections" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/fuse/connections" to respect bind or remount options
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2464 - Flags for "/sys/fs/fuse/connections" were 4110, required extra flags are 14
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "/sys/fs/fuse/connections" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/fuse/connections" with filesystem type "none"
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "proc" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/.lxc/proc" with filesystem type "proc"
DEBUG    conf - ../src/lxc/conf.c:mount_entry:2508 - Mounted "sys" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/.lxc/sys" with filesystem type "sysfs"
DEBUG    cgfsng - ../src/lxc/cgroups/cgfsng.c:__cgroupfs_mount:1909 - Mounted cgroup filesystem cgroup2 onto 20((null))
INFO     conf - ../src/lxc/conf.c:run_script_argv:338 - Executing script "/usr/share/lxcfs/lxc.mount.hook" for container "101", config section "lxc"
INFO     conf - ../src/lxc/conf.c:run_script_argv:338 - Executing script "/usr/share/lxc/hooks/lxc-pve-autodev-hook" for container "101", config section "lxc"
INFO     conf - ../src/lxc/conf.c:run_script_argv:338 - Executing script "sh -c "modprobe tun; cd ${LXC_ROOTFS_MOUNT}/dev; mkdir net; mknod net/tun c 10 200; chmod 0666 net/tun"" for container "101", config section "lxc"
INFO     conf - ../src/lxc/conf.c:lxc_fill_autodev:1318 - Populating "/dev"
DEBUG    conf - ../src/lxc/conf.c:lxc_fill_autodev:1327 - Created device node "full"
DEBUG    conf - ../src/lxc/conf.c:lxc_fill_autodev:1327 - Created device node "null"
DEBUG    conf - ../src/lxc/conf.c:lxc_fill_autodev:1327 - Created device node "random"
DEBUG    conf - ../src/lxc/conf.c:lxc_fill_autodev:1327 - Created device node "tty"
DEBUG    conf - ../src/lxc/conf.c:lxc_fill_autodev:1327 - Created device node "urandom"
DEBUG    conf - ../src/lxc/conf.c:lxc_fill_autodev:1327 - Created device node "zero"
INFO     conf - ../src/lxc/conf.c:lxc_fill_autodev:1406 - Populated "/dev"
INFO     conf - ../src/lxc/conf.c:lxc_transient_proc:3804 - Caller's PID is 1; /proc/self points to 1
DEBUG    conf - ../src/lxc/conf.c:lxc_setup_devpts_child:1780 - Attached detached devpts mount 21 to 19/pts
DEBUG    conf - ../src/lxc/conf.c:lxc_setup_devpts_child:1866 - Created "/dev/ptmx" file as bind mount target
DEBUG    conf - ../src/lxc/conf.c:lxc_setup_devpts_child:1873 - Bind mounted "/dev/pts/ptmx" to "/dev/ptmx"
DEBUG    conf - ../src/lxc/conf.c:lxc_allocate_ttys:1105 - Created tty with ptx fd 23 and pty fd 24 and index 1
DEBUG    conf - ../src/lxc/conf.c:lxc_allocate_ttys:1105 - Created tty with ptx fd 25 and pty fd 26 and index 2
INFO     conf - ../src/lxc/conf.c:lxc_allocate_ttys:1110 - Finished creating 2 tty devices
DEBUG    conf - ../src/lxc/conf.c:lxc_setup_ttys:1029 - Bind mounted "pts/1" onto "/dev/lxc/tty1"
DEBUG    conf - ../src/lxc/conf.c:lxc_setup_ttys:1029 - Bind mounted "pts/2" onto "/dev/lxc/tty2"
INFO     conf - ../src/lxc/conf.c:lxc_setup_ttys:1073 - Finished setting up 2 /dev/tty<N> device(s)
INFO     conf - ../src/lxc/conf.c:setup_personality:1946 - Set personality to "0lx0"
DEBUG    conf - ../src/lxc/conf.c:capabilities_deny:3229 - Dropped mac_admin (33) capability
DEBUG    conf - ../src/lxc/conf.c:capabilities_deny:3229 - Dropped mac_override (32) capability
DEBUG    conf - ../src/lxc/conf.c:capabilities_deny:3229 - Dropped sys_time (25) capability
DEBUG    conf - ../src/lxc/conf.c:capabilities_deny:3229 - Dropped sys_module (16) capability
DEBUG    conf - ../src/lxc/conf.c:capabilities_deny:3229 - Dropped sys_rawio (17) capability
DEBUG    conf - ../src/lxc/conf.c:capabilities_deny:3232 - Capabilities have been setup
NOTICE   conf - ../src/lxc/conf.c:lxc_setup:4511 - The container "101" is set up
INFO     apparmor - ../src/lxc/lsm/apparmor.c:apparmor_process_label_set_at:1189 - Set AppArmor label to "lxc-101_</var/lib/lxc>//&:lxc-101_<-var-lib-lxc>:"
INFO     apparmor - ../src/lxc/lsm/apparmor.c:apparmor_process_label_set:1234 - Changed AppArmor profile to lxc-101_</var/lib/lxc>//&:lxc-101_<-var-lib-lxc>:
DEBUG    terminal - ../src/lxc/terminal.c:lxc_terminal_peer_default:696 - No such device - The process does not have a controlling terminal
NOTICE   utils - ../src/lxc/utils.c:lxc_drop_groups:1367 - Dropped supplimentary groups
NOTICE   start - ../src/lxc/start.c:start:2194 - Exec'ing "/sbin/init"
NOTICE   start - ../src/lxc/start.c:post_start:2205 - Started "/sbin/init" with pid "1117613"
NOTICE   start - ../src/lxc/start.c:signal_handler:446 - Received 17 from pid 1117609 instead of container init 1117613
 
Hi, I can't see anything obviously wrong with the debug output -- it looks like it's starting up fine. What symptoms do you see indicating that the container is not starting up properly? Does pct enter 101 work? Could you post the output of pveversion -v and pct config 101?
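With a log this long, filtering by level makes it easier to confirm that nothing is flagged as an error. A minimal sketch, assuming the debug output has been saved to a file and that every line starts with its level name as in the log above; the function name and the file name are made up for illustration:

```shell
# Keep only WARN/ERROR/NOTICE lines from saved `pct start --debug` output.
# Assumes each line starts with the level name, as in the log above.
filter_lxc_log() {
    grep -E '^(WARN|ERROR|NOTICE)[[:space:]]'
}

# Example usage (the file name is hypothetical):
#   pct start 101 --debug 2>&1 | tee pct-101-debug.log
#   filter_lxc_log < pct-101-debug.log
```

Applied to the log posted above, this leaves one WARN line about legacy cgroup limits and the NOTICE lines at the end, and no ERROR lines.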
 
It says it is not running, as reflected in the ui.

Code:
root@pve:~# pveversion -v
proxmox-ve: 7.3-1 (running kernel: 5.15.85-1-pve)
pve-manager: 7.3-6 (running version: 7.3-6/723bb6ec)
pve-kernel-helper: 7.3-6
pve-kernel-5.15: 7.3-2
pve-kernel-5.13: 7.1-9
pve-kernel-5.15.85-1-pve: 5.15.85-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph-fuse: 14.2.21-1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.3-2
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-2
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-6
libpve-storage-perl: 7.3-2
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.3.3-1
proxmox-backup-file-restore: 2.3.3-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.1-1
proxmox-widget-toolkit: 3.5.5
pve-cluster: 7.3-2
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20221111-1
pve-firewall: 4.2-7
pve-firmware: 3.6-3
pve-ha-manager: 3.5.1
pve-i18n: 2.8-3
pve-qemu-kvm: 7.2.0-5
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.9-pve1

Code:
root@pve:~# pct config 101
arch: amd64
cores: 2
features: nesting=1
hostname: xxx.xxx.xxx (there is a real domain here)
memory: 1536
mp1: /opt/homes/croxis,mp=/home/croxis
mp2: /opt/media,mp=/var/lib/transmission/Downloads
mp3: /opt/media,mp=/opt/media
net0: name=eth0,bridge=vmbr0,firewall=1,hwaddr=F6:CA:68:C3:A9:D1,ip=dhcp,ip6=dhcp,type=veth
onboot: 1
ostype: archlinux
rootfs: vms:101/vm-101-disk-0.raw,mountoptions=noatime,size=8G
startup: order=2
swap: 1024
lxc.cgroup2.devices.allow: c 10:200 rwm
lxc.hook.autodev: sh -c "modprobe tun; cd ${LXC_ROOTFS_MOUNT}/dev; mkdir net; mknod net/tun c 10 200; chmod 0666 net/tun"
 
Thanks. I don't spot anything out of the ordinary in the container config, except maybe the lxc.hook.autodev line -- you could try to use an lxc.mount.entry line to create a tun device instead, as described in the wiki [1].
You wrote that it says it is not running. Do I understand correctly that you're referring to pct enter here? In other words, pct start <vmid> && pct enter <vmid> prints container '<vmid>' not running?

Can you mount the container filesystem using pct mount <vmid>?
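One way to tell "never started" apart from "started and then stopped" is to check the status a few seconds after starting. A minimal sketch, assuming pct status prints a line beginning with "status: running" for a running container; the function names and the 5-second default delay are arbitrary choices for illustration:

```shell
# Succeeds if `pct status` reports the container as running.
ct_is_running() {
    pct status "$1" 2>/dev/null | grep -q '^status: running'
}

# Start the container, wait briefly, then check whether it stayed up.
# $1 = container ID, $2 = optional delay in seconds (default 5).
start_and_check() {
    pct start "$1" || return 1
    sleep "${2:-5}"
    if ct_is_running "$1"; then
        echo "CT $1 is running"
    else
        echo "CT $1 started but is no longer running"
        return 2
    fi
}
```

For the container in this thread the call would be start_and_check 101.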

[1] https://pve.proxmox.com/wiki/OpenVPN_in_LXC
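For reference, a sketch of what the lxc.mount.entry approach could look like in /etc/pve/lxc/101.conf, replacing the lxc.hook.autodev line. This is written from memory of the wiki page [1], so treat the exact options as an assumption and check the page before using it:

```
lxc.cgroup2.devices.allow: c 10:200 rwm
lxc.mount.entry: /dev/net dev/net none bind,create=dir
```

This bind-mounts the host's /dev/net into the container instead of creating the device node with mknod on every start.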
 
Yes to all the above

Code:
root@pve:/etc/pve/lxc# pct start 101
root@pve:/etc/pve/lxc# pct enter 101
container '101' not running!
root@pve:/etc/pve/lxc# pct mount 101
mounted CT 101 in '/var/lib/lxc/101/rootfs'
root@pve:/etc/pve/lxc# nano /var/lib/lxc/101/rootfs/
bin/        dev/        lib/        .MTREE      root/       sys/        
boot/       etc/        lib64/      opt/        run/        tmp/        
.BUILDINFO  fastboot    lost+found/ .PKGINFO    sbin/       usr/        
core        home/       mnt/        proc/       srv/        var/
 
Thanks! This starts to look like the container starts up fine, but stops/crashes shortly after. Could you check the host syslog and journalctl -b for suspicious messages around the time of container startup? Depending on the distribution running in the container, you might also find something under /var/log in the container filesystem (which you have mounted above).
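A minimal sketch of those checks, assuming the container is still mounted via pct mount as shown above. The function names and the grep pattern are guesses for illustration and may need adjusting to whatever your host actually logs:

```shell
# Path where `pct mount` exposes the root filesystem of container $1,
# matching the path printed by pct mount earlier in the thread.
ct_rootfs() {
    printf '/var/lib/lxc/%s/rootfs\n' "$1"
}

inspect_ct_logs() {
    ctid=$1
    rootfs=$(ct_rootfs "$ctid")

    # Host side: messages from the current boot that mention LXC or the container.
    journalctl -b --no-pager | grep -E "lxc|pve-container@${ctid}" | tail -n 50

    # Container side: requires the rootfs to be mounted already.
    if [ ! -d "${rootfs}/var/log" ]; then
        echo "no /var/log under ${rootfs}; is CT ${ctid} mounted (pct mount ${ctid})?" >&2
        return 1
    fi
    # Most recently modified log files first.
    ls -lt "${rootfs}/var/log" | head -n 20

    # If the container keeps a persistent systemd journal, read it from the host.
    if [ -d "${rootfs}/var/log/journal" ]; then
        journalctl -D "${rootfs}/var/log/journal" --no-pager -n 50
    fi
}
```

For this thread the call would be inspect_ct_logs 101, followed by pct unmount 101 once you are done. If the host journal shows nothing useful, starting the container in the foreground with lxc-start -n 101 -F -l DEBUG -o /tmp/lxc-101.log also shows the console output of the container's init.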
 
It didn't even cross my mind to check the conditions in the container! It was a mess. It took me a long time to convince it to reinstall the packages, but it did! Some databases got corrupted, and guess which container I didn't have any backups of x.x

Thank you for the help!
 
