LXC containers are shutting down without intervention

g0rck

Hello, I have Proxmox 6.2 on Debian 10 and I am testing LXC containers.

On the same machine I have several virtual machines running without any problem, but I also have about 5 test LXC containers, and on two occasions one of them has stopped without explanation.

I fix the problem by starting it again, but I wouldn't want to have these in production and find that they shut down on their own without explanation.

I can't find the containers' logs to review. If you tell me where to find them, I can review them and paste them here.

Do you know what could be happening?

Thanks
 
Hi, you can try to find logs here: /var/log/lxc/CTNUM.log

regards, dale.
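
If there is no file for your container there, keep in mind that a per-container log is usually only written when the container is started with logging enabled. A minimal sketch of how that could look, assuming 105 is your CT ID and the log path and level are just examples:

# stop the CT, then start it by hand in the foreground with debug logging
pct stop 105
lxc-start -n 105 -F -l DEBUG -o /var/log/lxc/105.log
# -F keeps it in the foreground, -l sets the log level, -o the log file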
 
Hi,

You should also check the host's syslog around the time such a CT shuts down; maybe you will see some hints or errors that are more specific there.
You can do this in the web interface, under Node → Syslog.
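
From the shell, something like the following should show the same messages; the grep patterns are only examples, and the exact wording of the kernel lines can differ between versions:

# kernel log, where the OOM killer reports its kills
dmesg -T | grep -i -E 'out of memory|oom|killed process'
# the same via journald or the classic syslog file
journalctl -k | grep -i oom
grep -i -E 'oom|killed process' /var/log/syslog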
 
Hi, you can try to find logs here: /var/log/lxc/CTNUM.log

regards, dale.

Hi, thank you for the response, but in the /var/log/lxc directory I only see the file lxc-monitord.log:

root@XXX:/var/log/lxc# ls -l
total 4
-rw-r----- 1 root root 1379 may 28 20:03 lxc-monitord.log
root@frproxmox01:/var/log/lxc# cat lxc-monitord.log
lxc-monitord 20200526082541.490 INFO monitor - monitor.c:lxc_monitor_sock_name:191 - Using monitor socket name "lxc/ad055575fe28ddd5//var/lib/lxc" (length of socket name 33 must be <= 105)
lxc-monitord 20200526082541.490 NOTICE lxc_monitord - cmd/lxc_monitord.c:main:451 - lxc-monitord with pid 15855 is now monitoring lxcpath /var/lib/lxc
lxc-monitord 20200526084327.406 INFO monitor - monitor.c:lxc_monitor_sock_name:191 - Using monitor socket name "lxc/ad055575fe28ddd5//var/lib/lxc" (length of socket name 33 must be <= 105)
lxc-monitord 20200526084327.410 NOTICE lxc_monitord - cmd/lxc_monitord.c:main:451 - lxc-monitord with pid 997 is now monitoring lxcpath /var/lib/lxc
lxc-monitord 20200526084607.153 INFO monitor - monitor.c:lxc_monitor_sock_name:191 - Using monitor socket name "lxc/ad055575fe28ddd5//var/lib/lxc" (length of socket name 33 must be <= 105)
lxc-monitord 20200526084607.639 NOTICE lxc_monitord - cmd/lxc_monitord.c:main:451 - lxc-monitord with pid 914 is now monitoring lxcpath /var/lib/lxc
lxc-monitord 20200528180341.201 INFO monitor - monitor.c:lxc_monitor_sock_name:191 - Using monitor socket name "lxc/ad055575fe28ddd5//var/lib/lxc" (length of socket name 33 must be <= 105)
lxc-monitord 20200528180341.209 NOTICE lxc_monitord - cmd/lxc_monitord.c:main:451 - lxc-monitord with pid 1231 is now monitoring lxcpath /var/lib/lxc
root@XXX:/var/log/lxc#


Hi,

You should also check the host's syslog around the time such a CT shuts down; maybe you will see some hints or errors that are more specific there.
You can do this in the web interface, under Node → Syslog.

Hi, thank you for the response. In the node's syslog I see this information; I attach a txt file with the log.

It seems that the OOM killer is what is killing the container process, but the server's memory and CPU are not at worrying consumption levels.

root@XXX:~# mpstat
Linux 5.4.41-1-pve (XXX) 17/06/20 _x86_64_ (8 CPU)

16:14:48 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
16:14:48 all 2,05 0,00 3,37 3,66 0,00 0,57 0,00 1,25 0,00 89,09
root@XXX:~# uptime
16:14:52 up 19 days, 20:11, 1 user, load average: 1,13, 1,23, 1,28
root@XXX:~# free -h
total used free shared buff/cache available
Mem: 31Gi 15Gi 342Mi 1,3Gi 15Gi 13Gi
Swap: 0B 0B 0B


The storage is local hard disks with mdadm RAID. The LXC storage is on LVM.

Do you know why this can happen?


Thanks
 

It seems that the OOM killer is what is killing the container process, but the server's memory and CPU are not at worrying consumption levels.

Yes, the OOM-Killer is killing important processes from your CT.

but the server's memory and CPU are not at worrying consumption levels.

Well, a short but very high peak can be enough.
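
Since the host as a whole looks fine, it is also worth checking the limits of the CT itself, because the OOM killer also fires when a single container hits its own memory limit. A rough sketch, with 105 as an example CT ID and assuming the cgroup v1 layout used on PVE 6 (the exact cgroup path may differ on your system):

# configured memory and swap of the container
pct config 105 | grep -E -i 'memory|swap'
# peak usage and how often the limit was hit (cgroup v1)
cat /sys/fs/cgroup/memory/lxc/105/memory.max_usage_in_bytes
cat /sys/fs/cgroup/memory/lxc/105/memory.failcnt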

Do you use ZFS?
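
(I ask because with ZFS the ARC cache can use a lot of memory that is not obvious in the free output. If ZFS were in use, something along these lines would show the ARC size and cap it; the 8 GiB value is only an example, and the change needs an initramfs rebuild and a reboot to fully apply.)

# current and maximum ARC size, only present when the ZFS module is loaded
grep -E '^(size|c_max)' /proc/spl/kstat/zfs/arcstats
# cap the ARC at 8 GiB
echo 'options zfs zfs_arc_max=8589934592' >> /etc/modprobe.d/zfs.conf
update-initramfs -u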
 
I use ext4.
Thanks, I enabled 2G of swap on the Proxmox node.
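
For reference, one way to add that swap (a plain swap file; the path and size are only examples, a dedicated LV works as well):

# create and enable a 2G swap file on the host
fallocate -l 2G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
# make it persistent across reboots
echo '/swapfile none swap sw 0 0' >> /etc/fstab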

OK, I'd continue to monitor this for now then. If it happens again, I'd check what happened on the system before that. Maybe backup jobs ran, or some cron job ran, or something like that which then made the system's memory usage peak.
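
To correlate it, something along these lines could help; the timestamps are placeholders for the window around the kill:

# find the exact time of the kill in the kernel log
journalctl -k | grep -i 'killed process'
# then look at everything the host logged around that time
journalctl --since '2020-06-17 03:00' --until '2020-06-17 04:00'
# cron and vzdump backup jobs also show up in syslog
grep -E 'CRON|vzdump' /var/log/syslog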
 
Hello,

My Proxmox server has performed an OOM kill on an LXC container again.

Reviewing the graphs in Zabbix, which monitors the server, I do not see excessive CPU load (load average around 1), and there is more than 15G of free RAM. Swap is almost 50% free.

I can't find an explanation for why it kills these LXC containers sporadically, and not always the same one.

Best regards.
 

