Can't start CT after upgrade to Proxmox 5.x

TheMineGeek

New Member
Sep 28, 2016
Hello,

I've just upgraded to Proxmox 5.x and only two of my twenty CTs can start; the others fail. I also have a VM which works fine.

I've tried to run pct fsck 102 in case it was a filesystem error, but I got this error:
Code:
unable to run fsck for 'ssd:subvol-102-disk-1' (format == subvol)

Here is my CT 102 config :
Code:
arch: amd64
cores: 1
cpulimit: 8
cpuunits: 4096
hostname: Plex
memory: 4096
net0: bridge=vmbr0,gw=192.168.0.1,hwaddr=32:32:32:39:35:61,ip=192.168.0.102/24,name=eth0,type=veth
onboot: 1
ostype: debian
rootfs: ssd:subvol-102-disk-1,size=14G
swap: 512

And here is the log when I run lxc-start -lDEBUG -o /var/log/lxcErr.log -F -n 102:
Code:
lxc-start 102 20180112225943.991 INFO     lxc_start_ui - tools/lxc_start.c:main:280 - using rcfile /var/lib/lxc/102/config
      lxc-start 102 20180112225943.991 INFO     lxc_lsm - lsm/lsm.c:lsm_init:48 - LSM security driver AppArmor
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:435 - processing: .reject_force_umount  # comment this to allow umount -f;  not recommended.
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:610 - Adding native rule for reject_force_umount action 0(kill).
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:do_resolve_add_rule:276 - Setting Seccomp rule to reject force umounts.
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:614 - Adding compat rule for reject_force_umount action 0(kill).
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:do_resolve_add_rule:276 - Setting Seccomp rule to reject force umounts.
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:do_resolve_add_rule:276 - Setting Seccomp rule to reject force umounts.
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:435 - processing: .[all].
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:435 - processing: .kexec_load errno 1.
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:610 - Adding native rule for kexec_load action 327681(errno).
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:614 - Adding compat rule for kexec_load action 327681(errno).
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:435 - processing: .open_by_handle_at errno 1.
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:610 - Adding native rule for open_by_handle_at action 327681(errno).
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:614 - Adding compat rule for open_by_handle_at action 327681(errno).
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:435 - processing: .init_module errno 1.
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:610 - Adding native rule for init_module action 327681(errno).
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:614 - Adding compat rule for init_module action 327681(errno).
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:435 - processing: .finit_module errno 1.
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:610 - Adding native rule for finit_module action 327681(errno).
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:614 - Adding compat rule for finit_module action 327681(errno).
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:435 - processing: .delete_module errno 1.
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:610 - Adding native rule for delete_module action 327681(errno).
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:614 - Adding compat rule for delete_module action 327681(errno).
      lxc-start 102 20180112225943.991 INFO     lxc_seccomp - seccomp.c:parse_config_v2:624 - Merging in the compat Seccomp ctx into the main one.
      lxc-start 102 20180112225943.991 INFO     lxc_conf - conf.c:run_script_argv:457 - Executing script "/usr/share/lxc/hooks/lxc-pve-prestart-hook" for container "102", config section "lxc".
      lxc-start 102 20180112225944.436 DEBUG    lxc_conf - conf.c:run_buffer:429 - Script /usr/share/lxc/hooks/lxc-pve-prestart-hook 102 lxc pre-start with output: unable to detect OS distribution
.
      lxc-start 102 20180112225944.443 ERROR    lxc_conf - conf.c:run_buffer:438 - Script exited with status 2.
      lxc-start 102 20180112225944.443 ERROR    lxc_start - start.c:lxc_init:651 - Failed to run lxc.hook.pre-start for container "102".
      lxc-start 102 20180112225944.443 ERROR    lxc_start - start.c:__lxc_start:1444 - Failed to initialize container "102".
      lxc-start 102 20180112225944.443 ERROR    lxc_start_ui - tools/lxc_start.c:main:371 - The container failed to start.
      lxc-start 102 20180112225944.443 ERROR    lxc_start_ui - tools/lxc_start.c:main:375 - Additional information can be obtained by setting the --logfile and --logpriority options.

I hope I've provided all the necessary information.
TMG

PS: I can create and run new CTs without any problem.
 
It cannot detect the running OS:

Code:
unable to detect OS distribution

Please check what OS is running:

Code:
cat $(zfs list -o mountpoint -H | grep subvol-102-disk-1 )/etc/debian_version
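If that file is missing, the pre-start hook's OS detection will fail. As a quick sanity check (a sketch, assuming the subvol is currently mounted on the host), you can list the usual distribution marker files inside the rootfs:

```shell
# Resolve the dataset's mountpoint, then check for the marker files
# the OS detection relies on (os-release, debian_version, ...).
MP=$(zfs list -o mountpoint -H | grep subvol-102-disk-1)
ls -l "$MP/etc/os-release" "$MP/etc/debian_version"
```

If both are absent, the rootfs is either damaged or simply not mounted at that path.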
 
Code:
cat $(zfs list -o mountpoint -H | grep subvol-102-disk-1 )/etc/debian_version
Code:
9.9
I've tried the same command (fsck). After migrating some CTs and VMs to my new node, they won't start anymore.
Directly after the migration everything was fine, but after restarting the PVE host, this happens.

Every "old" CT fails to start with:
Code:
Job for pve-container@102.service failed because the control process exited with error code.
See "systemctl status pve-container@102.service" and "journalctl -xe" for details.
TASK ERROR: command 'systemctl start pve-container@102' failed: exit code 1
It seems the CTs are empty; only /dev exists.

EDIT 31.07.19:
Found something:
Code:
root@pve003:~# zfs mount
rpool/ROOT/pve-1                /
rpool                           /rpool
rpool/ROOT                      /rpool/ROOT
rpool/data                      /rpool/data
rpool/data/subvol-111-disk-0    /rpool/data/subvol-111-disk-0
hdd_pool1/KVM                   /hdd_pool1/KVM
hdd_pool1/LXC/subvol-108-disk-0  /hdd_pool1/LXC/subvol-108-disk-0
hdd_pool1/LXC/subvol-112-disk-0  /hdd_pool1/LXC/subvol-112-disk-0
root@pve003:~# zfs mount -a
cannot mount '/hdd_pool1': directory is not empty
cannot mount '/hdd_pool1/LXC': directory is not empty
cannot mount '/hdd_pool1/LXC/subvol-105-disk-0': directory is not empty
cannot mount '/hdd_pool1/LXC/subvol-106-disk-0': directory is not empty
cannot mount '/hdd_pool1/LXC/subvol-107-disk-0': directory is not empty
cannot mount '/hdd_pool1/LXC/subvol-111-disk-0': directory is not empty
cannot mount '/hdd_pool1/LXC/subvol-202-disk-0': directory is not empty


Code:
less /var/log/syslog:
...
Jul 31 08:48:59 pve003 kernel: [   34.279203] audit: type=1400 audit(1564555739.512:12): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="/usr/bin/lxc-start" name="/hdd_pool1/KVM/" pid=2876 comm="mount.zfs" fstype="zfs" srcname="hdd_pool1/KVM" flags="rw, noatime"
Jul 31 08:48:59 pve003 kernel: [   34.312785]  zd96: p1 p2
Jul 31 08:48:59 pve003 kernel: [   34.342693]  zd112: p1 p2
Jul 31 08:48:59 pve003 kernel: [   34.348805] audit: type=1400 audit(1564555739.588:13): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="/usr/bin/lxc-start" name="/hdd_pool1/LXC/subvol-108-disk-0/" pid=2908 comm="mount.zfs" fstype="zfs" srcname="hdd_pool1/LXC/subvol-108-disk-0" flags="rw, noatime"
Jul 31 08:48:59 pve003 kernel: [   34.403411] audit: type=1400 audit(1564555739.640:14): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="/usr/bin/lxc-start" name="/hdd_pool1/LXC/subvol-112-disk-0/" pid=2925 comm="mount.zfs" fstype="zfs" srcname="hdd_pool1/LXC/subvol-112-disk-0" flags="rw, noatime"
Jul 31 08:48:59 pve003 lxc-start[2473]: lxc-start: 105: lxccontainer.c: wait_on_daemonized_start: 856 No such file or directory - Failed to receive the container state
Jul 31 08:48:59 pve003 lxc-start[2473]: lxc-start: 105: tools/lxc_start.c: main: 330 The container failed to start
Jul 31 08:48:59 pve003 lxc-start[2473]: lxc-start: 105: tools/lxc_start.c: main: 333 To get more details, run the container in foreground mode
Jul 31 08:48:59 pve003 lxc-start[2473]: lxc-start: 105: tools/lxc_start.c: main: 336 Additional information can be obtained by setting the --logfile and --logpriority options
Jul 31 08:48:59 pve003 systemd[1]: pve-container@105.service: Control process exited, code=exited status=1
Jul 31 08:48:59 pve003 systemd[1]: Failed to start PVE LXC Container: 105.
Jul 31 08:48:59 pve003 systemd[1]: pve-container@105.service: Unit entered failed state.
Jul 31 08:48:59 pve003 systemd[1]: pve-container@105.service: Failed with result 'exit-code'.
Jul 31 08:48:59 pve003 pve-guests[2471]: command 'systemctl start pve-container@105' failed: exit code 1
Jul 31 08:49:00 pve003 pvesh[2423]: Starting CT 105 failed: command 'systemctl start pve-container@105' failed: exit code 1
...
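The "directory is not empty" errors mean something created plain files or directories at the mountpoints while the datasets were unmounted, so zfs mount -a refuses to mount over them. A common cleanup looks like the following (a sketch only; make sure each directory really contains nothing but stale leftovers before deleting anything):

```shell
# Stop the affected CTs first. Then, for each dataset that refuses to mount:
zfs unmount hdd_pool1/LXC/subvol-105-disk-0 2>/dev/null  # make sure it is unmounted

# Inspect the mountpoint on the host: it should hold only stale placeholder
# content, NOT the container's real rootfs.
ls -la /hdd_pool1/LXC/subvol-105-disk-0

# Remove the stale directory, then remount all datasets.
rm -rf /hdd_pool1/LXC/subvol-105-disk-0
zfs mount -a
```

Once zfs mount -a succeeds without errors, the CTs should find their rootfs again and start normally.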
 