Hi all,
I've hit a very annoying and odd issue that no amount of Googling (or AI) has helped with. Today I replaced one of my servers, moving from an old Dell T3610 to an HP Z4 G4, and for the most part all has been OK. I didn't install a fresh instance of Proxmox and migrate over; I simply took the hardware out of the old machine and popped it into the new one. Using SuperGrub2, I reset GRUB to work with the UEFI BIOS, and since then PVE has been booting without issues and most of my LXCs/VMs are working fine. However, I've got one very stubborn LXC container that will only boot into recovery mode. The boot logs are very light, but if I attach and resume the boot, the container starts with no issues and everything inside it works fine.
I also migrated a known-working LXC container over from another host, and it shows the same behaviour here, which leads me to think this is a host problem rather than something in the software inside the container.
Interestingly, other LXC containers on the "problematic" host are running without issue; in fact, one of them even has a GPU passed through to it.
PVE Version: 9.0.3
LXC Config:
YAML:
arch: amd64
cores: 4
cpuunits: 200
features: mount=nfs
hostname: emby
memory: 4096
mp0: data3:vm-132-disk-2,mp=/emby-tmp,size=256G
mp1: /mnt/media/,mp=/media/plex,shared=1
net0: name=eth0,bridge=vmbr0,gw=192.168.1.254,hwaddr=BC:24:11:F4:11:CA,ip=192.168.1.41/24,tag=1,type=veth
onboot: 1
ostype: ubuntu
rootfs: data3:vm-132-disk-1,size=50G
startup: order=2
swap: 0
tags: community-script;media
LXC debug logs:
Code:
INFO lsm - ../src/lxc/lsm/lsm.c:lsm_init_static:38 - Initialized LSM security driver AppArmor
INFO utils - ../src/lxc/utils.c:run_script_argv:587 - Executing script "/usr/share/lxc/hooks/lxc-pve-prestart-hook" for container "132", config section "lxc"
INFO cgfsng - ../src/lxc/cgroups/cgfsng.c:unpriv_systemd_create_scope:1508 - Running privileged, not using a systemd unit
DEBUG seccomp - ../src/lxc/seccomp.c:parse_config_v2:664 - Host native arch is [3221225534]
INFO seccomp - ../src/lxc/seccomp.c:parse_config_v2:815 - Processing "reject_force_umount # comment this to allow umount -f; not recommended"
INFO seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:532 - Set seccomp rule to reject force umounts
INFO seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:532 - Set seccomp rule to reject force umounts
INFO seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:532 - Set seccomp rule to reject force umounts
INFO seccomp - ../src/lxc/seccomp.c:parse_config_v2:815 - Processing "[all]"
INFO seccomp - ../src/lxc/seccomp.c:parse_config_v2:815 - Processing "kexec_load errno 1"
INFO seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:572 - Adding native rule for syscall[246:kexec_load] action[327681:errno] arch[0]
INFO seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:572 - Adding compat rule for syscall[246:kexec_load] action[327681:errno] arch[1073741827]
INFO seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:572 - Adding compat rule for syscall[246:kexec_load] action[327681:errno] arch[1073741886]
INFO seccomp - ../src/lxc/seccomp.c:parse_config_v2:815 - Processing "open_by_handle_at errno 1"
INFO seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:572 - Adding native rule for syscall[304:open_by_handle_at] action[327681:errno] arch[0]
INFO seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:572 - Adding compat rule for syscall[304:open_by_handle_at] action[327681:errno] arch[1073741827]
INFO seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:572 - Adding compat rule for syscall[304:open_by_handle_at] action[327681:errno] arch[1073741886]
INFO seccomp - ../src/lxc/seccomp.c:parse_config_v2:815 - Processing "init_module errno 1"
INFO seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:572 - Adding native rule for syscall[175:init_module] action[327681:errno] arch[0]
INFO seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:572 - Adding compat rule for syscall[175:init_module] action[327681:errno] arch[1073741827]
INFO seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:572 - Adding compat rule for syscall[175:init_module] action[327681:errno] arch[1073741886]
INFO seccomp - ../src/lxc/seccomp.c:parse_config_v2:815 - Processing "finit_module errno 1"
INFO seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:572 - Adding native rule for syscall[313:finit_module] action[327681:errno] arch[0]
INFO seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:572 - Adding compat rule for syscall[313:finit_module] action[327681:errno] arch[1073741827]
INFO seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:572 - Adding compat rule for syscall[313:finit_module] action[327681:errno] arch[1073741886]
INFO seccomp - ../src/lxc/seccomp.c:parse_config_v2:815 - Processing "delete_module errno 1"
INFO seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:572 - Adding native rule for syscall[176:delete_module] action[327681:errno] arch[0]
INFO seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:572 - Adding compat rule for syscall[176:delete_module] action[327681:errno] arch[1073741827]
INFO seccomp - ../src/lxc/seccomp.c:do_resolve_add_rule:572 - Adding compat rule for syscall[176:delete_module] action[327681:errno] arch[1073741886]
INFO seccomp - ../src/lxc/seccomp.c:parse_config_v2:1036 - Merging compat seccomp contexts into main context
INFO start - ../src/lxc/start.c:lxc_init:882 - Container "132" is initialized
INFO cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_monitor_create:1679 - The monitor process uses "lxc.monitor/132" as cgroup
DEBUG storage - ../src/lxc/storage/storage.c:storage_query:231 - Detected rootfs type "dir"
INFO cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_payload_create:1787 - The container process uses "lxc/132/ns" as inner and "lxc/132" as limit cgroup
INFO start - ../src/lxc/start.c:lxc_spawn:1774 - Cloned CLONE_NEWNS
INFO start - ../src/lxc/start.c:lxc_spawn:1774 - Cloned CLONE_NEWPID
INFO start - ../src/lxc/start.c:lxc_spawn:1774 - Cloned CLONE_NEWUTS
INFO start - ../src/lxc/start.c:lxc_spawn:1774 - Cloned CLONE_NEWIPC
INFO start - ../src/lxc/start.c:lxc_spawn:1774 - Cloned CLONE_NEWNET
INFO start - ../src/lxc/start.c:lxc_spawn:1774 - Cloned CLONE_NEWCGROUP
DEBUG start - ../src/lxc/start.c:lxc_try_preserve_namespace:140 - Preserved mnt namespace via fd 18 and stashed path as mnt:/proc/65534/fd/18
DEBUG start - ../src/lxc/start.c:lxc_try_preserve_namespace:140 - Preserved pid namespace via fd 19 and stashed path as pid:/proc/65534/fd/19
DEBUG start - ../src/lxc/start.c:lxc_try_preserve_namespace:140 - Preserved uts namespace via fd 20 and stashed path as uts:/proc/65534/fd/20
DEBUG start - ../src/lxc/start.c:lxc_try_preserve_namespace:140 - Preserved ipc namespace via fd 21 and stashed path as ipc:/proc/65534/fd/21
DEBUG start - ../src/lxc/start.c:lxc_try_preserve_namespace:140 - Preserved net namespace via fd 22 and stashed path as net:/proc/65534/fd/22
DEBUG start - ../src/lxc/start.c:lxc_try_preserve_namespace:140 - Preserved cgroup namespace via fd 23 and stashed path as cgroup:/proc/65534/fd/23
WARN cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_setup_limits_legacy:3442 - Invalid argument - Ignoring legacy cgroup limits on pure cgroup2 system
INFO cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_setup_limits:3538 - Limits for the unified cgroup hierarchy have been setup
INFO utils - ../src/lxc/utils.c:run_script_argv:587 - Executing script "/usr/share/lxc/lxcnetaddbr" for container "132", config section "net"
DEBUG network - ../src/lxc/network.c:netdev_configure_server_veth:879 - Instantiated veth tunnel "veth132i0 <--> veth6bppR6"
DEBUG conf - ../src/lxc/conf.c:lxc_mount_rootfs:1223 - Mounted rootfs "/var/lib/lxc/132/rootfs" onto "/usr/lib/x86_64-linux-gnu/lxc/rootfs" with options "(null)"
INFO conf - ../src/lxc/conf.c:setup_utsname:671 - Set hostname to "emby"
DEBUG network - ../src/lxc/network.c:setup_hw_addr:3866 - Mac address "BC:24:11:F4:11:CA" on "eth0" has been setup
DEBUG network - ../src/lxc/network.c:lxc_network_setup_in_child_namespaces_common:4007 - Network device "eth0" has been setup
INFO network - ../src/lxc/network.c:lxc_setup_network_in_child_namespaces:4064 - Finished setting up network devices with caller assigned names
INFO conf - ../src/lxc/conf.c:mount_autodev:1006 - Preparing "/dev"
INFO conf - ../src/lxc/conf.c:mount_autodev:1067 - Prepared "/dev"
DEBUG conf - ../src/lxc/conf.c:lxc_mount_auto_mounts:531 - Invalid argument - Tried to ensure procfs is unmounted
DEBUG conf - ../src/lxc/conf.c:lxc_mount_auto_mounts:554 - Invalid argument - Tried to ensure sysfs is unmounted
DEBUG conf - ../src/lxc/conf.c:mount_entry:2208 - Remounting "/sys/fs/fuse/connections" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/fuse/connections" to respect bind or remount options
DEBUG conf - ../src/lxc/conf.c:mount_entry:2227 - Flags for "/sys/fs/fuse/connections" were 4110, required extra flags are 14
DEBUG conf - ../src/lxc/conf.c:mount_entry:2271 - Mounted "/sys/fs/fuse/connections" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/fuse/connections" with filesystem type "none"
DEBUG conf - ../src/lxc/conf.c:mount_entry:2208 - Remounting "/sys/kernel/debug" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/kernel/debug" to respect bind or remount options
DEBUG conf - ../src/lxc/conf.c:mount_entry:2227 - Flags for "/sys/kernel/debug" were 4110, required extra flags are 14
DEBUG conf - ../src/lxc/conf.c:mount_entry:2271 - Mounted "/sys/kernel/debug" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/kernel/debug" with filesystem type "none"
DEBUG conf - ../src/lxc/conf.c:mount_entry:2208 - Remounting "/sys/kernel/security" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/kernel/security" to respect bind or remount options
DEBUG conf - ../src/lxc/conf.c:mount_entry:2227 - Flags for "/sys/kernel/security" were 4110, required extra flags are 14
DEBUG conf - ../src/lxc/conf.c:mount_entry:2271 - Mounted "/sys/kernel/security" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/kernel/security" with filesystem type "none"
DEBUG conf - ../src/lxc/conf.c:mount_entry:2208 - Remounting "/sys/fs/pstore" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/pstore" to respect bind or remount options
DEBUG conf - ../src/lxc/conf.c:mount_entry:2227 - Flags for "/sys/fs/pstore" were 4110, required extra flags are 14
DEBUG conf - ../src/lxc/conf.c:mount_entry:2271 - Mounted "/sys/fs/pstore" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/pstore" with filesystem type "none"
DEBUG conf - ../src/lxc/conf.c:mount_entry:2271 - Mounted "mqueue" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/mqueue" with filesystem type "mqueue"
DEBUG cgfsng - ../src/lxc/cgroups/cgfsng.c:__cgroupfs_mount:2197 - Mounted cgroup filesystem cgroup2 onto 20((null))
INFO utils - ../src/lxc/utils.c:run_script_argv:587 - Executing script "/usr/share/lxcfs/lxc.mount.hook" for container "132", config section "lxc"
INFO utils - ../src/lxc/utils.c:run_script_argv:587 - Executing script "/usr/share/lxc/hooks/lxc-pve-autodev-hook" for container "132", config section "lxc"
INFO conf - ../src/lxc/conf.c:lxc_fill_autodev:1104 - Populating "/dev"
DEBUG conf - ../src/lxc/conf.c:lxc_fill_autodev:1113 - Created device node "full"
DEBUG conf - ../src/lxc/conf.c:lxc_fill_autodev:1113 - Created device node "null"
DEBUG conf - ../src/lxc/conf.c:lxc_fill_autodev:1113 - Created device node "random"
DEBUG conf - ../src/lxc/conf.c:lxc_fill_autodev:1113 - Created device node "tty"
DEBUG conf - ../src/lxc/conf.c:lxc_fill_autodev:1113 - Created device node "urandom"
DEBUG conf - ../src/lxc/conf.c:lxc_fill_autodev:1113 - Created device node "zero"
INFO conf - ../src/lxc/conf.c:lxc_fill_autodev:1192 - Populated "/dev"
INFO conf - ../src/lxc/conf.c:lxc_transient_proc:3300 - Caller's PID is 1; /proc/self points to 1
DEBUG conf - ../src/lxc/conf.c:lxc_setup_devpts_child:1537 - Attached detached devpts mount 21 to 19/pts
DEBUG conf - ../src/lxc/conf.c:lxc_setup_devpts_child:1623 - Created "/dev/ptmx" file as bind mount target
DEBUG conf - ../src/lxc/conf.c:lxc_setup_devpts_child:1630 - Bind mounted "/dev/pts/ptmx" to "/dev/ptmx"
DEBUG conf - ../src/lxc/conf.c:lxc_allocate_ttys:900 - Created tty with ptx fd 23 and pty fd 24 and index 1
DEBUG conf - ../src/lxc/conf.c:lxc_allocate_ttys:900 - Created tty with ptx fd 25 and pty fd 26 and index 2
INFO conf - ../src/lxc/conf.c:lxc_allocate_ttys:905 - Finished creating 2 tty devices
DEBUG conf - ../src/lxc/conf.c:lxc_setup_ttys:824 - Bind mounted "pts/1" onto "/dev/lxc/tty1"
DEBUG conf - ../src/lxc/conf.c:lxc_setup_ttys:824 - Bind mounted "pts/2" onto "/dev/lxc/tty2"
INFO conf - ../src/lxc/conf.c:lxc_setup_ttys:868 - Finished setting up 2 /dev/tty<N> device(s)
INFO conf - ../src/lxc/conf.c:setup_personality:1703 - Set personality to "0lx0"
DEBUG conf - ../src/lxc/conf.c:capabilities_deny:2996 - Dropped mac_admin (33) capability
DEBUG conf - ../src/lxc/conf.c:capabilities_deny:2996 - Dropped mac_override (32) capability
DEBUG conf - ../src/lxc/conf.c:capabilities_deny:2996 - Dropped sys_time (25) capability
DEBUG conf - ../src/lxc/conf.c:capabilities_deny:2996 - Dropped sys_module (16) capability
DEBUG conf - ../src/lxc/conf.c:capabilities_deny:2996 - Dropped sys_rawio (17) capability
DEBUG conf - ../src/lxc/conf.c:capabilities_deny:2999 - Capabilities have been setup
NOTICE conf - ../src/lxc/conf.c:lxc_setup:4007 - The container "132" is set up
INFO apparmor - ../src/lxc/lsm/apparmor.c:apparmor_process_label_set_at:1179 - Set AppArmor label to "lxc-132_</var/lib/lxc>//&:lxc-132_<-var-lib-lxc>:"
INFO apparmor - ../src/lxc/lsm/apparmor.c:apparmor_process_label_set:1224 - Changed AppArmor profile to lxc-132_</var/lib/lxc>//&:lxc-132_<-var-lib-lxc>:
DEBUG terminal - ../src/lxc/terminal.c:lxc_terminal_peer_default:704 - No such device - The process does not have a controlling terminal
NOTICE utils - ../src/lxc/utils.c:lxc_drop_groups:1481 - Dropped supplimentary groups
NOTICE start - ../src/lxc/start.c:start:2206 - Exec'ing "/sbin/init"
NOTICE start - ../src/lxc/start.c:post_start:2217 - Started "/sbin/init" with pid "65577"
NOTICE start - ../src/lxc/start.c:signal_handler:447 - Received 17 from pid 65568 instead of container init 65577
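Since most of the log above is routine seccomp and mount output, here is a rough Python sketch for pulling out only the WARN/NOTICE/ERROR lines. It is not part of PVE or LXC; the function name `interesting` and the regular expression are my own, and the pattern simply assumes every line has the `LEVEL module - file:function:line - message` shape seen in the log above.

```python
import re

# Assumed line shape, inferred from the log above (not an official LXC format spec):
#   LEVEL module - ../src/lxc/file.c:function:line - message
LINE_RE = re.compile(
    r"^(?P<level>[A-Z]+)\s+(?P<module>\S+) - "
    r"(?P<src>\S+):(?P<func>\w+):(?P<line>\d+) - (?P<msg>.*)$"
)


def interesting(lines, levels=("WARN", "ERROR", "NOTICE")):
    """Return (level, module, function, message) for lines at the given levels."""
    hits = []
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        # Lines that do not fit the assumed shape are skipped silently.
        if match and match.group("level") in levels:
            hits.append(
                (
                    match.group("level"),
                    match.group("module"),
                    match.group("func"),
                    match.group("msg"),
                )
            )
    return hits
```

Passing it the lines of a saved debug log (for example `interesting(open(path))`, where `path` is wherever the log was written) should leave just the cgroup WARN and a handful of NOTICE lines for this container, including the final "Received 17 ..." message.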