LXC containers shut down without intervention

g0rck
Jun 17, 2020
Hello, I have Proxmox 6.2 on Debian 10 and I am testing LXC containers.

On the same machine I run several virtual machines without any problems, but I also have about 5 test LXC containers, and on two occasions one of them has stopped without explanation.

I fix the problem by starting it again, but I would not want to put these into production and find that they shut down on their own without explanation.

I can't find the containers' logs. If you tell me where to find them, I can review them and paste them here.

Do you know what could be happening?

Thanks
 
Hi, you can try to find logs here: /var/log/lxc/CTNUM.log

regards, dale.
 
Hi,

You should also check the host's syslog around the time such a CT shuts down; you may see more specific hints or errors there.
You can do this in the web interface under Node → Syslog.
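
As a command-line counterpart to the Syslog view, the sketch below filters a log file for the kernel's OOM messages. The function name `find_oom_events` and the default path `/var/log/syslog` are assumptions for illustration; the patterns follow the wording used by recent Linux kernels and may need adjusting on other versions.

```shell
# find_oom_events (hypothetical helper): print kernel OOM-killer lines from a log.
# $1: log file to search; defaults to /var/log/syslog (assumed Debian default).
find_oom_events() {
    logfile="${1:-/var/log/syslog}"
    # -i: case-insensitive, -E: extended regex with alternation.
    grep -iE 'invoked oom-killer|oom-kill:|out of memory: killed process' "$logfile"
}
```

Calling `find_oom_events /var/log/syslog` on the node prints only the matching lines. On a systemd host, `journalctl -k | grep -i oom` should give a similar view of the kernel messages.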
 
Hi, thank you for the response, but in the /var/log/lxc directory I only see the file lxc-monitord.log:

root@XXX:/var/log/lxc# ls -l
total 4
-rw-r----- 1 root root 1379 may 28 20:03 lxc-monitord.log
root@XXX:/var/log/lxc# cat lxc-monitord.log
lxc-monitord 20200526082541.490 INFO monitor - monitor.c:lxc_monitor_sock_name:191 - Using monitor socket name "lxc/ad055575fe28ddd5//var/lib/lxc" (length of socket name 33 must be <= 105)
lxc-monitord 20200526082541.490 NOTICE lxc_monitord - cmd/lxc_monitord.c:main:451 - lxc-monitord with pid 15855 is now monitoring lxcpath /var/lib/lxc
lxc-monitord 20200526084327.406 INFO monitor - monitor.c:lxc_monitor_sock_name:191 - Using monitor socket name "lxc/ad055575fe28ddd5//var/lib/lxc" (length of socket name 33 must be <= 105)
lxc-monitord 20200526084327.410 NOTICE lxc_monitord - cmd/lxc_monitord.c:main:451 - lxc-monitord with pid 997 is now monitoring lxcpath /var/lib/lxc
lxc-monitord 20200526084607.153 INFO monitor - monitor.c:lxc_monitor_sock_name:191 - Using monitor socket name "lxc/ad055575fe28ddd5//var/lib/lxc" (length of socket name 33 must be <= 105)
lxc-monitord 20200526084607.639 NOTICE lxc_monitord - cmd/lxc_monitord.c:main:451 - lxc-monitord with pid 914 is now monitoring lxcpath /var/lib/lxc
lxc-monitord 20200528180341.201 INFO monitor - monitor.c:lxc_monitor_sock_name:191 - Using monitor socket name "lxc/ad055575fe28ddd5//var/lib/lxc" (length of socket name 33 must be <= 105)
lxc-monitord 20200528180341.209 NOTICE lxc_monitord - cmd/lxc_monitord.c:main:451 - lxc-monitord with pid 1231 is now monitoring lxcpath /var/lib/lxc
root@XXX:/var/log/lxc#


Hi, thank you for the response. This is what I see in the node's syslog; I have attached a txt file with the log.

It seems that the OOM killer is what kills the container process, but the memory and CPU of the server do not show worrying consumption levels.

root@XXX:~# mpstat
Linux 5.4.41-1-pve (XXX) 17/06/20 _x86_64_ (8 CPU)

16:14:48     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
16:14:48     all    2,05    0,00    3,37    3,66    0,00    0,57    0,00    1,25    0,00   89,09
root@XXX:~# uptime
16:14:52 up 19 days, 20:11, 1 user, load average: 1,13, 1,23, 1,28
root@XXX:~# free -h
              total        used        free      shared  buff/cache   available
Mem:           31Gi        15Gi       342Mi       1,3Gi        15Gi        13Gi
Swap:            0B          0B          0B


The storage is local hard disks in an mdadm RAID. The LXC containers use LVM storage.

Do you know why this can happen?


Thanks
 

Attachment: syslog.txt (7.7 KB)
g0rck wrote: "It seems that the OOM killer is what kills the container process, but the memory and CPU of the server do not show worrying consumption levels."

Yes, the OOM killer is killing important processes in your CT.

g0rck wrote: "but the memory and CPU of the server do not show worrying consumption levels."

Well, a very high peak can be enough, even if it only lasts a short time.

Do you use ZFS?
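
A short peak can be invisible in averaged graphs, but the kernel keeps its own counters. The sketch below reads them for one container, assuming the host uses cgroup v1 and that LXC places each container under `/sys/fs/cgroup/memory/lxc/<CTID>` (an assumption about this setup; the layout differs on cgroup v2). The helper name `ct_mem_peak` and its optional base-directory argument are hypothetical.

```shell
# ct_mem_peak (hypothetical helper): show a container's memory limit, peak usage,
# and how often the limit was hit, using cgroup v1 counters.
# $1: container ID; $2: cgroup base directory (assumed default below).
ct_mem_peak() {
    ctid="$1"
    base="${2:-/sys/fs/cgroup/memory/lxc}"
    dir="$base/$ctid"
    limit=$(cat "$dir/memory.limit_in_bytes")
    peak=$(cat "$dir/memory.max_usage_in_bytes")
    fails=$(cat "$dir/memory.failcnt")
    # Integer MiB is precise enough for a quick comparison.
    echo "CT $ctid: limit=$((limit / 1048576))MiB peak=$((peak / 1048576))MiB failcnt=$fails"
}
```

For example, `ct_mem_peak 101` reports on a container with ID 101 (an example ID). A peak at or near the limit together with a non-zero `failcnt` would suggest that the container ran into its own memory limit at some point.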
 
I use ext4.
Thanks, I have enabled 2G of swap on the Proxmox node.

OK, I'd continue to monitor this for now then. If it happens again, I'd check what happened on the system before that. Maybe a backup job ran, or some cron job ran, or something similar that made the system's memory usage peak.
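
To catch such a peak, one option is to sample memory more often than the monitoring system does. This sketch appends one timestamped `MemAvailable` reading per call; the function name `log_mem_sample` and the output path in the usage note are assumptions for illustration.

```shell
# log_mem_sample (hypothetical helper): append "date time MemAvailable_kB" to a file.
# $1: output file; $2: meminfo source (defaults to /proc/meminfo).
log_mem_sample() {
    out="$1"
    meminfo="${2:-/proc/meminfo}"
    # /proc/meminfo reports MemAvailable in kB in the second field.
    avail=$(awk '/^MemAvailable:/ {print $2}' "$meminfo")
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$avail" >> "$out"
}
```

Run in a loop, for example `while true; do log_mem_sample /var/log/mem-samples.log; sleep 10; done`, it leaves a record that can be compared against the time of the next kill.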
 
Hello,

My Proxmox server has OOM-killed an LXC container again.

Reviewing the Zabbix graphs for the server, I see no excessive CPU load (load average around 1), and more than 15G of RAM is free. Swap is almost 50% free.

I can't find an explanation for why it kills these containers sporadically, and not always the same one.

Best regards.
 

Attachment: logs.txt (37 KB)
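
One possible way to reconcile an OOM kill with plenty of free host memory (a possibility, not something confirmed in this thread): the kernel also invokes the OOM killer when a single memory cgroup, such as a container's, reaches its own limit. On recent kernels the `oom-kill:` log line carries a `constraint=` field, with `CONSTRAINT_MEMCG` for a cgroup limit and `CONSTRAINT_NONE` for a host-wide shortage. The sketch below extracts that field from a log file; the function name `classify_oom` is hypothetical, and the attached logs are not reproduced here, so their exact format is an assumption.

```shell
# classify_oom (hypothetical helper): for each "oom-kill:" line in a log,
# print the constraint type and the affected memory cgroup ("-" if none).
# $1: log file to read.
classify_oom() {
    grep -o 'oom-kill:constraint=[A-Z_]*.*' "$1" | while IFS= read -r line; do
        constraint=$(printf '%s\n' "$line" | sed 's/^oom-kill:constraint=\([A-Z_]*\).*/\1/')
        case "$line" in
            # Cgroup-limited kills name the cgroup in oom_memcg=...
            *oom_memcg=*) memcg=$(printf '%s\n' "$line" | sed 's/.*oom_memcg=\([^,]*\).*/\1/') ;;
            # Host-wide kills carry no oom_memcg field.
            *) memcg="-" ;;
        esac
        printf '%s %s\n' "$constraint" "$memcg"
    done
}
```

If `classify_oom /var/log/syslog` reports `CONSTRAINT_MEMCG` with a cgroup path that contains the container's ID, the kill was triggered by that container's memory limit rather than by the host running out of RAM, and the limit configured for the container would be the first thing to review.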
