As a preface: I was doing scheduled maintenance, migrating all VMs and the single container we have on one of our clusters. The container failed to stop properly and, in the interest of getting the maintenance done, the box was rebooted soon afterwards. Now, no matter what I do, I cannot get this container to run (it's the only container we have).
We were also in the middle of fixing our network. We had a mix of true public and private addresses, and were changing our monitors/managers to run specifically on the private network (by adding the private network as the first network in ceph.conf, then destroying and recreating the monitors and managers). This was followed by an upgrade to 8.41 on all hosts and then a reboot. Only this container isn't working.
On startup I simply get:
Code:
# pct start 105 --debug
run_buffer: 571 Script exited with status 110
lxc_init: 845 Failed to run lxc.hook.pre-start for container "105"
__lxc_start: 2034 Failed to initialize container "105"
0 hostid 100000 range 65536
INFO lsm - ../src/lxc/lsm/lsm.c:lsm_init_static:38 - Initialized LSM security driver AppArmor
INFO utils - ../src/lxc/utils.c:run_script_argv:587 - Executing script "/usr/share/lxc/hooks/lxc-pve-prestart-hook" for container "105", config section "lxc"
DEBUG utils - ../src/lxc/utils.c:run_buffer:560 - Script exec /usr/share/lxc/hooks/lxc-pve-prestart-hook 105 lxc pre-start produced output: In some cases useful info is found in syslog - try "dmesg | tail".
DEBUG utils - ../src/lxc/utils.c:run_buffer:560 - Script exec /usr/share/lxc/hooks/lxc-pve-prestart-hook 105 lxc pre-start produced output: rbd: sysfs write failed
DEBUG utils - ../src/lxc/utils.c:run_buffer:560 - Script exec /usr/share/lxc/hooks/lxc-pve-prestart-hook 105 lxc pre-start produced output: can't map rbd volume vm-105-disk-0: rbd: sysfs write failed
ERROR utils - ../src/lxc/utils.c:run_buffer:571 - Script exited with status 110
ERROR start - ../src/lxc/start.c:lxc_init:845 - Failed to run lxc.hook.pre-start for container "105"
ERROR start - ../src/lxc/start.c:__lxc_start:2034 - Failed to initialize container "105"
INFO utils - ../src/lxc/utils.c:run_script_argv:587 - Executing script "/usr/share/lxcfs/lxc.reboot.hook" for container "105", config section "lxc"
And rbd info:
Code:
# rbd info ewr-pool/vm-105-disk-0
rbd image 'vm-105-disk-0':
	size 20 GiB in 5120 objects
	order 22 (4 MiB objects)
	snapshot_count: 0
	id: 4cbfb3bc45c7ae
	block_name_prefix: rbd_data.4cbfb3bc45c7ae
	format: 2
	features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
	op_features:
	flags:
	create_timestamp: Wed Apr 24 01:48:04 2024
	access_timestamp: Wed Apr 24 01:48:04 2024
	modify_timestamp: Wed Apr 24 01:48:04 2024
And configuration of the CT:
Code:
# cat /etc/pve/lxc/105.conf
arch: amd64
cores: 2
features: nesting=1
hostname: shipyard-couch
memory: 1024
net0: name=eth0,bridge=vmbr0,firewall=1,gw=10.10.8.1,hwaddr=BC:24:11:22:37:80,ip=10.10.8.100/24,tag=108,type=veth
ostype: centos
rootfs: ewr-pool:vm-105-disk-0,size=20G,mountoptions=discard
swap: 512
unprivileged: 1
I'm also seeing this error in dmesg when I try to start it, but my Google-fu is failing me today:
Code:
[ 2990.348616] libceph: another match of type 1 in addrvec
[ 2990.348621] libceph: problem decoding monmap, -22
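One way to follow up on that dmesg message is to look at the monitor map itself. This is only a sketch, and it assumes that "type 1" refers to the legacy (v1) messenger address type and that the kernel client is objecting to a monitor that advertises more than one v1 address. The helper name `find_duplicate_v1` is made up for illustration. It reads `ceph mon dump` output on stdin and prints any monitor line that lists more than one `v1:` entry.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Ceph or Proxmox): read `ceph mon dump`
# output on stdin and print every line that lists more than one legacy
# (v1) address. Assumption: such a monitor entry is what the kernel
# client rejects with "another match of type 1 in addrvec".
find_duplicate_v1() {
    awk '{
        line = $0
        n = gsub(/v1:/, "", line)   # gsub returns how many v1: entries it found
        if (n > 1) print $0         # print the original, unmodified line
    }'
}

# Usage on a cluster node (assumes the ceph CLI and an admin keyring):
#   ceph mon dump 2>/dev/null | find_duplicate_v1
```

No output would mean no monitor line carries two v1 addresses, in which case the assumption above is probably wrong and the cause lies elsewhere.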
Any clues where I can look next?