Container won't start

Bob22

Hello,

once again I've run into the problem that my containers can only be started at boot time, not afterwards - this time it's container 108. The problem is getting really annoying, as I've been struggling with it for a good year now. The container's image sits on a thinpool, and I've noticed that only containers stored on that very thinpool are affected (106 and 108). Maybe I set something up incorrectly there?

Code:
root@server1:/# lxc-start -n 108 -F -l DEBUG -o /tmp/108.log
lxc-start: 108: conf.c: run_buffer: 335 Script exited with status 32
lxc-start: 108: start.c: lxc_init: 861 Failed to run lxc.hook.pre-start for container "108"
lxc-start: 108: start.c: __lxc_start: 1944 Failed to initialize container "108"
lxc-start: 108: tools/lxc_start.c: main: 330 The container failed to start
lxc-start: 108: tools/lxc_start.c: main: 336 Additional information can be obtained by setting the --logfile and --logpriority options

Code:
           man:lxc
           man:pct
  Process: 10977 ExecStart=/usr/bin/lxc-start -n 108 (code=exited, status=1/FAILURE)

Feb 11 19:53:12 server1 systemd[1]: Starting PVE LXC Container: 108...
Feb 11 19:53:13 server1 lxc-start[10977]: lxc-start: 108: lxccontainer.c: wait_on_daemonized_start: 856 No such file or directory - Failed to receive the container state
Feb 11 19:53:13 server1 lxc-start[10977]: lxc-start: 108: tools/lxc_start.c: main: 330 The container failed to start
Feb 11 19:53:13 server1 lxc-start[10977]: lxc-start: 108: tools/lxc_start.c: main: 333 To get more details, run the container in foreground mode
Feb 11 19:53:13 server1 lxc-start[10977]: lxc-start: 108: tools/lxc_start.c: main: 336 Additional information can be obtained by setting the --logfile and --logpriority options
Feb 11 19:53:13 server1 systemd[1]: pve-container@108.service: Control process exited, code=exited, status=1/FAILURE
Feb 11 19:53:13 server1 systemd[1]: pve-container@108.service: Failed with result 'exit-code'.
Feb 11 19:53:13 server1 systemd[1]: Failed to start PVE LXC Container: 108.

This is what my setup looks like in Proxmox:
(Attachments: 1.PNG, 2.PNG, 3.PNG)

Code:
root@server1:/# pvs
  PV         VG  Fmt  Attr PSize    PFree 
  /dev/sda3  pve lvm2 a--  <223.07g <16.00g

Code:
root@server1:/# vgs
  VG  #PV #LV #SN Attr   VSize    VFree 
  pve   1   5   0 wz--n- <223.07g <16.00g

Code:
root@server1:/# lvs
  LV            VG  Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data          pve twi-aotz-- <141.43g             24.25  2.24                           
  root          pve -wi-ao----   55.75g                                                   
  swap          pve -wi-ao----    7.00g                                                   
  vm-106-disk-0 pve Vwi-aotz--  100.00g data        32.72                                 
  vm-108-disk-1 pve Vwi-a-tz--    8.00g data        19.77

Code:
root@server1:/# cat /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content vztmpl,iso,snippets,images,rootdir
        maxfiles 0
        shared 0

dir: BigData
        path /mnt/data
        content images,rootdir,vztmpl,iso,backup
        maxfiles 3
        shared 0

lvmthin: Thinpool
        thinpool data
        vgname pve
        content images,rootdir

Code:
root@server1:/# cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
/dev/mapper/pve-root /               ext4    errors=remount-ro 0       1
# /boot/efi was on /dev/sda2 during installation
UUID=C9E1-EB32  /boot/efi       vfat    umask=0077      0       1
/dev/mapper/pve-swap none            swap    sw              0       0

LABEL=storageprox /mnt/data ext4 defaults 0 2

I would really appreciate any help!
 
Failed to run lxc.hook.pre-start for container "108"

Did you configure anything there?
The hooks always run and are used, for example, to configure the CT network.
The actual errors of the hook script only show up in the debug log, though - best obtained via the pct start 108 --debug command.
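For reference, both variants as a block (the log path is just an example):

Code:
# run the pre-start hook with debug output via Proxmox's own tooling
pct start 108 --debug
# or run the container in the foreground via LXC directly, logging to a file
lxc-start -n 108 -F -l DEBUG -o /tmp/108.log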
 
Oh sorry, I meant to post the full log at the end but apparently forgot to:

Code:
lxc-start 108 20210212124803.961 INFO     lsm - lsm/lsm.c:lsm_init:50 - LSM security driver AppArmor
lxc-start 108 20210212124803.964 INFO     seccomp - seccomp.c:parse_config_v2:759 - Processing "reject_force_umount  # comment this to allow umount -f;  not recommended"
lxc-start 108 20210212124803.964 INFO     seccomp - seccomp.c:do_resolve_add_rule:505 - Set seccomp rule to reject force umounts
lxc-start 108 20210212124803.964 INFO     seccomp - seccomp.c:parse_config_v2:937 - Added native rule for arch 0 for reject_force_umount action 0(kill)
lxc-start 108 20210212124803.965 INFO     seccomp - seccomp.c:do_resolve_add_rule:505 - Set seccomp rule to reject force umounts
lxc-start 108 20210212124803.965 INFO     seccomp - seccomp.c:parse_config_v2:946 - Added compat rule for arch 1073741827 for reject_force_umount action 0(kill)
lxc-start 108 20210212124803.965 INFO     seccomp - seccomp.c:do_resolve_add_rule:505 - Set seccomp rule to reject force umounts
lxc-start 108 20210212124803.965 INFO     seccomp - seccomp.c:parse_config_v2:956 - Added compat rule for arch 1073741886 for reject_force_umount action 0(kill)
lxc-start 108 20210212124803.965 INFO     seccomp - seccomp.c:do_resolve_add_rule:505 - Set seccomp rule to reject force umounts
lxc-start 108 20210212124803.965 INFO     seccomp - seccomp.c:parse_config_v2:966 - Added native rule for arch -1073741762 for reject_force_umount action 0(kill)
lxc-start 108 20210212124803.965 INFO     seccomp - seccomp.c:parse_config_v2:759 - Processing "[all]"
lxc-start 108 20210212124803.965 INFO     seccomp - seccomp.c:parse_config_v2:759 - Processing "kexec_load errno 1"
lxc-start 108 20210212124803.966 INFO     seccomp - seccomp.c:parse_config_v2:937 - Added native rule for arch 0 for kexec_load action 327681(errno)
lxc-start 108 20210212124803.966 INFO     seccomp - seccomp.c:parse_config_v2:946 - Added compat rule for arch 1073741827 for kexec_load action 327681(errno)
lxc-start 108 20210212124803.966 INFO     seccomp - seccomp.c:parse_config_v2:956 - Added compat rule for arch 1073741886 for kexec_load action 327681(errno)
lxc-start 108 20210212124803.966 INFO     seccomp - seccomp.c:parse_config_v2:966 - Added native rule for arch -1073741762 for kexec_load action 327681(errno)
lxc-start 108 20210212124803.966 INFO     seccomp - seccomp.c:parse_config_v2:759 - Processing "open_by_handle_at errno 1"
lxc-start 108 20210212124803.966 INFO     seccomp - seccomp.c:parse_config_v2:937 - Added native rule for arch 0 for open_by_handle_at action 327681(errno)
lxc-start 108 20210212124803.966 INFO     seccomp - seccomp.c:parse_config_v2:946 - Added compat rule for arch 1073741827 for open_by_handle_at action 327681(errno)
lxc-start 108 20210212124803.967 INFO     seccomp - seccomp.c:parse_config_v2:956 - Added compat rule for arch 1073741886 for open_by_handle_at action 327681(errno)
lxc-start 108 20210212124803.967 INFO     seccomp - seccomp.c:parse_config_v2:966 - Added native rule for arch -1073741762 for open_by_handle_at action 327681(errno)
lxc-start 108 20210212124803.967 INFO     seccomp - seccomp.c:parse_config_v2:759 - Processing "init_module errno 1"
lxc-start 108 20210212124803.967 INFO     seccomp - seccomp.c:parse_config_v2:937 - Added native rule for arch 0 for init_module action 327681(errno)
lxc-start 108 20210212124803.967 INFO     seccomp - seccomp.c:parse_config_v2:946 - Added compat rule for arch 1073741827 for init_module action 327681(errno)
lxc-start 108 20210212124803.967 INFO     seccomp - seccomp.c:parse_config_v2:956 - Added compat rule for arch 1073741886 for init_module action 327681(errno)
lxc-start 108 20210212124803.968 INFO     seccomp - seccomp.c:parse_config_v2:966 - Added native rule for arch -1073741762 for init_module action 327681(errno)
lxc-start 108 20210212124803.968 INFO     seccomp - seccomp.c:parse_config_v2:759 - Processing "finit_module errno 1"
lxc-start 108 20210212124803.968 INFO     seccomp - seccomp.c:parse_config_v2:937 - Added native rule for arch 0 for finit_module action 327681(errno)
lxc-start 108 20210212124803.968 INFO     seccomp - seccomp.c:parse_config_v2:946 - Added compat rule for arch 1073741827 for finit_module action 327681(errno)
lxc-start 108 20210212124803.968 INFO     seccomp - seccomp.c:parse_config_v2:956 - Added compat rule for arch 1073741886 for finit_module action 327681(errno)
lxc-start 108 20210212124803.968 INFO     seccomp - seccomp.c:parse_config_v2:966 - Added native rule for arch -1073741762 for finit_module action 327681(errno)
lxc-start 108 20210212124803.969 INFO     seccomp - seccomp.c:parse_config_v2:759 - Processing "delete_module errno 1"
lxc-start 108 20210212124803.969 INFO     seccomp - seccomp.c:parse_config_v2:937 - Added native rule for arch 0 for delete_module action 327681(errno)
lxc-start 108 20210212124803.969 INFO     seccomp - seccomp.c:parse_config_v2:946 - Added compat rule for arch 1073741827 for delete_module action 327681(errno)
lxc-start 108 20210212124803.969 INFO     seccomp - seccomp.c:parse_config_v2:956 - Added compat rule for arch 1073741886 for delete_module action 327681(errno)
lxc-start 108 20210212124803.969 INFO     seccomp - seccomp.c:parse_config_v2:966 - Added native rule for arch -1073741762 for delete_module action 327681(errno)
lxc-start 108 20210212124803.969 INFO     seccomp - seccomp.c:parse_config_v2:970 - Merging compat seccomp contexts into main context
lxc-start 108 20210212124803.970 INFO     conf - conf.c:run_script_argv:356 - Executing script "/usr/share/lxc/hooks/lxc-pve-prestart-hook" for container "108", config section "lxc"
lxc-start 108 20210212124804.985 DEBUG    conf - conf.c:run_buffer:326 - Script exec /usr/share/lxc/hooks/lxc-pve-prestart-hook 108 lxc pre-start with output: mount: /var/lib/lxc/108/rootfs: special device /dev/pve/vm-108-disk-1 does not exist.

lxc-start 108 20210212124804.989 DEBUG    conf - conf.c:run_buffer:326 - Script exec /usr/share/lxc/hooks/lxc-pve-prestart-hook 108 lxc pre-start with output: command 'mount /dev/pve/vm-108-disk-1 /var/lib/lxc/108/rootfs//' failed: exit code 32

lxc-start 108 20210212124804.121 ERROR    conf - conf.c:run_buffer:335 - Script exited with status 32
lxc-start 108 20210212124804.121 ERROR    start - start.c:lxc_init:861 - Failed to run lxc.hook.pre-start for container "108"
lxc-start 108 20210212124804.122 ERROR    start - start.c:__lxc_start:1944 - Failed to initialize container "108"
lxc-start 108 20210212124804.122 ERROR    lxc_start - tools/lxc_start.c:main:330 - The container failed to start
lxc-start 108 20210212124804.122 ERROR    lxc_start - tools/lxc_start.c:main:336 - Additional information can be obtained by setting the --logfile and --logpriority options

Maybe this leads to the solution; when I check now:
Code:
root@server1:/dev/pve# ls
root  swap  vm-106-disk-0
vm-108-disk-1 is indeed missing there. But why can the container start during system boot, then? It's also interesting that container 106 sometimes has exactly the same problem.
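Since lvs still lists the volume as active (the 'a' in the attribute string), a quick sketch to check whether only the /dev/pve symlink is missing while the device-mapper node is still there:

Code:
# an active LV should have a device-mapper node even if the
# /dev/pve/<lv> symlink has vanished
ls -l /dev/mapper/ | grep 108
dmsetup ls | grep 108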

And just to be sure - here I still get the same output:
Code:
root@server1:~# lvs
  LV            VG  Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data          pve twi-aotz-- <141.43g             24.60  2.26                          
  root          pve -wi-ao----   55.75g                                                  
  swap          pve -wi-ao----    7.00g                                                  
  vm-106-disk-0 pve Vwi-aotz--  100.00g data        32.72                                
  vm-108-disk-1 pve Vwi-a-tz--    8.00g data        25.98

EDIT: After a reboot, vm-108-disk-1 is back:
Code:
root@server1:/dev/pve# ls
root  swap  vm-106-disk-0  vm-108-disk-1

How can that be? This makes the problem even more annoying, because sometimes several reboots are needed to get the container running again.
 
Hello,
to recreate the device links, you can also use vgscan --mknode. The question, of course, is why they are sometimes not created or disappear on your system in the first place. Please check the system logs to see whether you can find anything about it.
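A sketch of how that could look, assuming the systemd journal is in use:

Code:
# recreate the missing device links
vgscan --mknode
# then search the journal for udev/LVM messages around the relevant time
journalctl -b -u systemd-udevd --no-pager | grep -i pve
journalctl -b --no-pager | grep -iE 'lvm|device-mapper'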
 
@Fabian_E thanks a lot for the tip! It seems to work so far, though not entirely smoothly:
Code:
root@server1:/dev/pve# ls
root  swap  vm-108-disk-0
root@server1:/dev/pve# vgscan --mknode
  Reading all physical volumes.  This may take a while...
  Found volume group "pve" using metadata type lvm2
  The link /dev/pve/vm-106-disk-0 should have been created by udev but it was not found. Falling back to direct link creation.
  Command failed with status code 5.
root@server1:/dev/pve# ls
root  swap  vm-106-disk-0  vm-108-disk-0
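Given the "should have been created by udev" warning above, another thing worth trying might be to have udev replay the block-device events instead - a sketch:

Code:
# replay "change" events for block devices so the LVM udev rules
# get a chance to recreate missing /dev/pve/* symlinks
udevadm trigger --subsystem-match=block --action=change
udevadm settle
ls /dev/pve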

Regarding the logs:
here's a list of the things that look "suspicious" to me:
Code:
Feb 13 04:00:24 server1 kernel: audit: type=1400 audit(1613185224.473:25): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-106_</var/lib/lxc>" name="/" pid=19424 comm="(un-parts)" flags="rw, rslave"
Feb 13 04:00:24 server1 kernel: audit: type=1400 audit(1613185224.489:26): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-106_</var/lib/lxc>" name="/" pid=19432 comm="(pachectl)" flags="rw, rslave"
Feb 13 04:00:24 server1 audit[19432]: AVC apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-106_</var/lib/lxc>" name="/" pid=19432 comm="(pachectl)" flags="rw, rslave"
Feb 13 04:00:24 server1 audit[19430]: AVC apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-106_</var/lib/lxc>" name="/" pid=19430 comm="(install)" flags="rw, rslave"
Feb 13 04:00:24 server1 kernel: audit: type=1400 audit(1613185224.501:27): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-106_</var/lib/lxc>" name="/" pid=19430 comm="(install)" flags="rw, rslave"
Feb 13 04:00:24 server1 audit[19444]: AVC apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-106_</var/lib/lxc>" name="/" pid=19444 comm="(s-server)" flags="rw, rslave"
Feb 13 04:00:24 server1 kernel: audit: type=1400 audit(1613185224.557:28): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-106_</var/lib/lxc>" name="/" pid=19444 comm="(s-server)" flags="rw, rslave"
Feb 13 04:00:24 server1 audit[19449]: AVC apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-106_</var/lib/lxc>" name="/" pid=19449 comm="(sh)" flags="rw, rslave"
Feb 13 04:00:24 server1 kernel: audit: type=1400 audit(1613185224.569:29): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-106_</var/lib/lxc>" name="/" pid=19449 comm="(sh)" flags="rw, rslave"
Feb 13 04:00:24 server1 audit[19478]: AVC apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-106_</var/lib/lxc>" name="/" pid=19478 comm="(sh)" flags="rw, rslave"
Feb 13 04:00:24 server1 kernel: audit: type=1400 audit(1613185224.625:30): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-106_</var/lib/lxc>" name="/" pid=19478 comm="(sh)" flags="rw, rslave"
Feb 13 04:00:24 server1 audit[19482]: AVC apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-106_</var/lib/lxc>" name="/" pid=19482 comm="(un-parts)" flags="rw, rslave"
Feb 13 04:00:24 server1 kernel: audit: type=1400 audit(1613185224.665:31): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-106_</var/lib/lxc>" name="/" pid=19482 comm="(un-parts)" flags="rw, rslave"
Feb 13 04:00:25 server1 audit[19757]: AVC apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-106_</var/lib/lxc>" name="/" pid=19757 comm="(mysqld)" flags="rw, rslave"
Feb 13 04:00:25 server1 kernel: audit: type=1400 audit(1613185225.374:32): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-106_</var/lib/lxc>" name="/" pid=19757 comm="(mysqld)" flags="rw, rslave"
Feb 13 04:00:26 server1 audit[19878]: AVC apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-106_</var/lib/lxc>" name="/" pid=19878 comm="(an-start)" flags="rw, rslave"
Feb 13 04:00:26 server1 kernel: audit: type=1400 audit(1613185226.086:33): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-106_</var/lib/lxc>" name="/" pid=19878 comm="(an-start)" flags="rw, rslave"
Feb 13 04:00:26 server1 audit[19889]: AVC apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-106_</var/lib/lxc>" name="/" pid=19889 comm="(sh)" flags="rw, rslave"
These and other AppArmor messages show up very, very often. These in particular seem interesting to me, though.

Code:
Feb 13 03:59:59 server1 kernel: sd 2:0:0:0: [sdc] Synchronizing SCSI cache
Feb 13 03:59:59 server1 kernel: sd 2:0:0:0: [sdc] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Feb 13 03:59:59 server1 kernel: sd 2:0:0:0: [sdc] Stopping disk
Feb 13 03:59:59 server1 kernel: sd 2:0:0:0: [sdc] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
This happens with all drives that I pass through to Openmediavault. Is that normal? These drives are otherwise completely uninvolved, i.e. they are not part of the VG, and no images or anything else are stored on them.

Code:
Feb 13 04:08:01 server1 systemd[1]: pvesr.service: Succeeded.
Feb 13 04:08:01 server1 systemd[1]: Started Proxmox VE replication runner.
Feb 13 04:09:00 server1 systemd[1]: Starting Proxmox VE replication runner...
This runs every minute.
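A sketch to check whether this is simply the regular replication timer (pvesr.timer should fire once per minute by default):

Code:
# show the timer driving pvesr.service and its schedule
systemctl status pvesr.timer
systemctl cat pvesr.timer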

Code:
smartd[700]: Device: /dev/sda [SAT], SMART Prefailure Attribute: 194 Temperature_Celsius changed from 76 to 72
When I saw that, I panicked at first, but apparently these are "normalized" values and therefore not that bad?
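If in doubt, the raw value can be compared against the normalized one - a sketch using smartmontools (device name taken from the log above):

Code:
# for attribute 194, RAW_VALUE is usually the actual temperature in °C;
# VALUE/WORST are vendor-normalized and scale differently
smartctl -A /dev/sda | grep -i temp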

Code:
Feb 13 14:45:01 server1 CRON[5513]: pam_unix(cron:session): session opened for user root by (uid=0)
Feb 13 14:45:01 server1 CRON[5514]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Feb 13 14:45:01 server1 CRON[5513]: pam_unix(cron:session): session closed for user root
This also shows up very often.

Code:
Feb 13 15:33:55 server1 anacron[1979]: Anacron 2.3 started on 2021-02-13
Feb 13 15:33:55 server1 anacron[1979]: Normal exit (0 jobs run)
Is it normal that both anacron and cron are installed?

Otherwise there's not much else noteworthy in there...
 
you could run udevadm monitor -pku -s block | ts | tee udev.log (requires "moreutils" to be installed) after bootup and check 'udev.log' when the issue re-appears. it should at least give you a timestamp for when the removal occurred, which you can then cross-reference with other logs..
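For reference, the suggested invocation as a block (assuming apt is used to install moreutils):

Code:
apt install moreutils   # provides 'ts' for timestamping each line
udevadm monitor -pku -s block | ts | tee udev.log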
 
I had pretty much given up on this problem lately; today, by chance, I found the following in the log after running the vgscan --mknode command:

Code:
Apr 10 02:25:33 server1 systemd-udevd[408]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Apr 10 02:25:33 server1 systemd-udevd[408]: Could not generate persistent MAC address for vethB2AKD1: No such file or directory

Can anything be made of that?
 
no, those have nothing to do with each other..
 
