All containers on node suddenly stop

mavillar

Feb 21, 2022
Hello,
I have been running a server for over a year now containing a number of containers that have been working properly with no problems.
All of them with very little workload.
The issue is that since a few days ago, suddenly all the containers on the node stop. The main node is still active.
The first time it happened I went back to activate each of the containers and all of them worked fine again.
The next day, all the containers shut down again.
I started them again, and today it happened once more.
I am at a complete loss as to what is causing this.
Does anyone know the cause or reason for this?
I don't know if there is any log that can tell me what is happening.

Thanks in advance.
 
hi,

I don't know if there is any log that can tell me what is happening.
You can check the log in /var/log/syslog on that node. Try grepping for the container ID: grep -C 3 999 /var/log/syslog (where 999 would be your container's ID).

the task logs in the GUI might also give you some hints (check for shutdown/stop jobs for that container).
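A bare number like 999 also matches longer digit runs (for example inside kernel timestamps), so a stricter filter can cut the noise. Here is a minimal sketch, assuming a POSIX shell and a grep that supports -a; the function name ct_log_lines is made up for illustration. The -a flag makes grep treat the log as text even if it contains binary bytes.

```shell
# ct_log_lines CTID [LOGFILE]
# Hypothetical helper: print log lines that mention CTID as a standalone
# number, i.e. not embedded in a longer run of digits.
ct_log_lines() {
    ctid="$1"
    logfile="${2:-/var/log/syslog}"
    # -a: read the file as text even if grep thinks it is binary
    # -E: extended regex; CTID must not touch another digit on either side
    grep -a -E "(^|[^0-9])${ctid}([^0-9]|\$)" "$logfile"
}
```

Usage would be something like ct_log_lines 999 /var/log/syslog. Adding -C 3 to the grep call brings back the surrounding context lines if you want them.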

The first time it happened I went back to activate each of the containers and all of them worked fine again.
The next day, all the containers shut down again.
does it happen at a specific hour? or just randomly?
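If it is hard to tell by eye, counting matching log lines per hour can show whether there is a pattern. This is a rough sketch, assuming the classic syslog timestamp layout (month, day, HH:MM:SS as the first three fields); events_per_hour is a hypothetical helper name, and the search string is whatever message you are interested in.

```shell
# events_per_hour PATTERN [LOGFILE]
# Hypothetical helper: count lines containing the fixed string PATTERN,
# grouped by the hour taken from the third whitespace-separated field.
events_per_hour() {
    pattern="$1"
    logfile="${2:-/var/log/syslog}"
    # -a: treat the log as text; -F: match PATTERN literally, not as a regex
    grep -a -F -- "$pattern" "$logfile" |
        awk '{ split($3, t, ":"); n[t[1]]++ }
             END { for (h in n) print h, n[h] }' |
        sort
}
```

For example, events_per_hour "segfault" /var/log/syslog prints one line per hour with the number of matches in that hour.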

which PVE version are you on? can you send us the output from pveversion -v?
 
I suspect that it may be due to the behavior of one container.
When I run the grep command, I get the following:

root@ns3012819:~# grep -C 3 102 /var/log/syslog
Feb 21 00:34:00 ns3012819 systemd[1]: Starting Proxmox VE replication runner...
Feb 21 00:34:00 ns3012819 systemd[1]: pvesr.service: Succeeded.
Feb 21 00:34:00 ns3012819 systemd[1]: Started Proxmox VE replication runner.
Feb 21 00:34:19 ns3012819 kernel: [297303.065102] sshd[11632]: segfault at 0 ip 00007f9e205396ca sp 00007ffc242f8398 error 4 in libc-2.28.so[7f9e20404000+148000]
Feb 21 00:34:19 ns3012819 kernel: [297303.083835] Code: 80 7e 10 00 0f 84 f9 fe ff ff e9 a1 a8 f4 ff 90 89 f8 31 d2 c5 c5 ef ff 09 f0 25 ff 0f 00 00 3d 80 0f 00 00 0f 8f 56 03 00 00 <c5> fe 6f 0f c5 f5 74 06 c5 fd da c1 c5 fd 74 c7 c5 fd d7 c8 85 c9
Feb 21 00:34:46 ns3012819 kernel: [297329.813799] audit: type=1400 audit(1645403686.578:113): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-102_</var/lib/lxc>" name="/tmp/" pid=12110 comm="mount" flags="rw, noexec, remount"
Feb 21 00:35:00 ns3012819 systemd[1]: Starting Proxmox VE replication runner...
Feb 21 00:35:00 ns3012819 systemd[1]: pvesr.service: Succeeded.
Feb 21 00:35:00 ns3012819 systemd[1]: Started Proxmox VE replication runner.
--
Feb 21 01:34:28 ns3012819 kernel: [300911.866407] Code: 80 7e 10 00 0f 84 f9 fe ff ff e9 a1 a8 f4 ff 90 89 f8 31 d2 c5 c5 ef ff 09 f0 25 ff 0f 00 00 3d 80 0f 00 00 0f 8f 56 03 00 00 <c5> fe 6f 0f c5 f5 74 06 c5 fd da c1 c5 fd 74 c7 c5 fd d7 c8 85 c9
Feb 21 01:34:42 ns3012819 kernel: [300925.616298] sshd[9439]: segfault at 0 ip 00007fa10de756ca sp 00007ffc40e049c8 error 4 in libc-2.28.so[7fa10dd40000+148000]
Feb 21 01:34:42 ns3012819 kernel: [300925.634868] Code: 80 7e 10 00 0f 84 f9 fe ff ff e9 a1 a8 f4 ff 90 89 f8 31 d2 c5 c5 ef ff 09 f0 25 ff 0f 00 00 3d 80 0f 00 00 0f 8f 56 03 00 00 <c5> fe 6f 0f c5 f5 74 06 c5 fd da c1 c5 fd 74 c7 c5 fd d7 c8 85 c9
Feb 21 01:34:46 ns3012819 kernel: [300929.823208] audit: type=1400 audit(1645407286.577:114): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-102_</var/lib/lxc>" name="/tmp/" pid=9575 comm="mount" flags="rw, noexec, remount"
Feb 21 01:35:00 ns3012819 systemd[1]: Starting Proxmox VE replication runner...
Feb 21 01:35:00 ns3012819 systemd[1]: pvesr.service: Succeeded.
Feb 21 01:35:00 ns3012819 systemd[1]: Started Proxmox VE replication runner.
--
Feb 21 01:36:00 ns3012819 systemd[1]: Starting Proxmox VE replication runner...
Feb 21 01:36:00 ns3012819 systemd[1]: pvesr.service: Succeeded.
Feb 21 01:36:00 ns3012819 systemd[1]: Started Proxmox VE replication runner.
Feb 21 01:36:22 ns3012819 kernel: [301025.606130] sshd[11974]: segfault at 0 ip 00007f9fe7b836ca sp 00007ffdfe4df1b8 error 4 in libc-2.28.so[7f9fe7a4e000+148000]
Feb 21 01:36:22 ns3012819 kernel: [301025.624887] Code: 80 7e 10 00 0f 84 f9 fe ff ff e9 a1 a8 f4 ff 90 89 f8 31 d2 c5 c5 ef ff 09 f0 25 ff 0f 00 00 3d 80 0f 00 00 0f 8f 56 03 00 00 <c5> fe 6f 0f c5 f5 74 06 c5 fd da c1 c5 fd 74 c7 c5 fd d7 c8 85 c9
Feb 21 01:37:00 ns3012819 systemd[1]: Starting Proxmox VE replication runner...
Feb 21 01:37:00 ns3012819 systemd[1]: pvesr.service: Succeeded.
Feb 21 01:37:00 ns3012819 systemd[1]: Started Proxmox VE replication runner.
--
Feb 21 01:56:00 ns3012819 systemd[1]: Starting Proxmox VE replication runner...
Feb 21 01:56:00 ns3012819 systemd[1]: pvesr.service: Succeeded.
Feb 21 01:56:00 ns3012819 systemd[1]: Started Proxmox VE replication runner.
Feb 21 01:56:27 ns3012819 kernel: [302231.145102] sshd[742]: segfault at 0 ip 00007f6ee42cd6ca sp 00007ffc15428de8 error 4 in libc-2.28.so[7f6ee4198000+148000]
Feb 21 01:56:27 ns3012819 kernel: [302231.163927] Code: 80 7e 10 00 0f 84 f9 fe ff ff e9 a1 a8 f4 ff 90 89 f8 31 d2 c5 c5 ef ff 09 f0 25 ff 0f 00 00 3d 80 0f 00 00 0f 8f 56 03 00 00 <c5> fe 6f 0f c5 f5 74 06 c5 fd da c1 c5 fd 74 c7 c5 fd d7 c8 85 c9
Feb 21 01:57:00 ns3012819 systemd[1]: Starting Proxmox VE replication runner...
Feb 21 01:57:00 ns3012819 systemd[1]: pvesr.service: Succeeded.
--
Feb 21 02:34:10 ns3012819 kernel: [304493.213058] Code: 80 7e 10 00 0f 84 f9 fe ff ff e9 a1 a8 f4 ff 90 89 f8 31 d2 c5 c5 ef ff 09 f0 25 ff 0f 00 00 3d 80 0f 00 00 0f 8f 56 03 00 00 <c5> fe 6f 0f c5 f5 74 06 c5 fd da c1 c5 fd 74 c7 c5 fd d7 c8 85 c9
Feb 21 02:34:19 ns3012819 kernel: [304502.817510] sshd[9387]: segfault at 0 ip 00007ff96bc0a6ca sp 00007ffe6f95c048 error 4 in libc-2.28.so[7ff96bad5000+148000]
Feb 21 02:34:19 ns3012819 kernel: [304502.836146] Code: 80 7e 10 00 0f 84 f9 fe ff ff e9 a1 a8 f4 ff 90 89 f8 31 d2 c5 c5 ef ff 09 f0 25 ff 0f 00 00 3d 80 0f 00 00 0f 8f 56 03 00 00 <c5> fe 6f 0f c5 f5 74 06 c5 fd da c1 c5 fd 74 c7 c5 fd d7 c8 85 c9
Feb 21 02:34:46 ns3012819 kernel: [304529.835784] audit: type=1400 audit(1645410886.584:115): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-102_</var/lib/lxc>" name="/tmp/" pid=9843 comm="mount" flags="rw, noexec, remount"
Feb 21 02:34:59 ns3012819 kernel: [304542.198852] sshd[10171]: segfault at 0 ip 00007f1616d3b6ca sp 00007fff75f9b918 error 4 in libc-2.28.so[7f1616c06000+148000]
Feb 21 02:34:59 ns3012819 kernel: [304542.217619] Code: 80 7e 10 00 0f 84 f9 fe ff ff e9 a1 a8 f4 ff 90 89 f8 31 d2 c5 c5 ef ff 09 f0 25 ff 0f 00 00 3d 80 0f 00 00 0f 8f 56 03 00 00 <c5> fe 6f 0f c5 f5 74 06 c5 fd da c1 c5 fd 74 c7 c5 fd d7 c8 85 c9
Feb 21 02:35:00 ns3012819 systemd[1]: Starting Proxmox VE replication runner...
--
Feb 21 03:34:00 ns3012819 systemd[1]: Starting Proxmox VE replication runner...
Feb 21 03:34:00 ns3012819 systemd[1]: pvesr.service: Succeeded.
Feb 21 03:34:00 ns3012819 systemd[1]: Started Proxmox VE replication runner.
Feb 21 03:34:46 ns3012819 kernel: [308129.848725] audit: type=1400 audit(1645414486.591:116): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-102_</var/lib/lxc>" name="/tmp/" pid=4229 comm="mount" flags="rw, noexec, remount"
Feb 21 03:35:00 ns3012819 systemd[1]: Starting Proxmox VE replication runner...
Feb 21 03:35:00 ns3012819 systemd[1]: pvesr.service: Succeeded.
Feb 21 03:35:00 ns3012819 systemd[1]: Started Proxmox VE replication runner.
--
Feb 21 04:34:00 ns3012819 systemd[1]: Starting Proxmox VE replication runner...
Feb 21 04:34:00 ns3012819 systemd[1]: pvesr.service: Succeeded.
Feb 21 04:34:00 ns3012819 systemd[1]: Started Proxmox VE replication runner.
Feb 21 04:34:46 ns3012819 kernel: [311729.859100] audit: type=1400 audit(1645418086.590:117): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-102_</var/lib/lxc>" name="/tmp/" pid=32691 comm="mount" flags="rw, noexec, remount"
Feb 21 04:35:00 ns3012819 systemd[1]: Starting Proxmox VE replication runner...
Feb 21 04:35:00 ns3012819 systemd[1]: pvesr.service: Succeeded.
Feb 21 04:35:00 ns3012819 systemd[1]: Started Proxmox VE replication runner.
--
Feb 21 05:34:00 ns3012819 systemd[1]: Starting Proxmox VE replication runner...
Feb 21 05:34:00 ns3012819 systemd[1]: pvesr.service: Succeeded.
Feb 21 05:34:00 ns3012819 systemd[1]: Started Proxmox VE replication runner.
Feb 21 05:34:46 ns3012819 kernel: [315329.871277] audit: type=1400 audit(1645421686.594:118): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-102_</var/lib/lxc>" name="/tmp/" pid=32447 comm="mount" flags="rw, noexec, remount"
Feb 21 05:35:00 ns3012819 systemd[1]: Starting Proxmox VE replication runner...
Feb 21 05:35:00 ns3012819 systemd[1]: pvesr.service: Succeeded.
Feb 21 05:35:00 ns3012819 systemd[1]: Started Proxmox VE replication runner.
Binary file /var/log/syslog matches
 
thanks, could you also post the pveversion -v output?

is it possible that you've made some kernel package upgrades without rebooting afterwards?
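One quick way to check is to compare the running kernel with the newest kernel image on disk. The sketch below assumes the Debian-style layout where images are stored as /boot/vmlinuz-<version> and a sort that supports -V (GNU coreutils does). Both function names are hypothetical. Note that the newest installed image is not necessarily the one the bootloader picks, so treat a mismatch only as a hint.

```shell
# newest_installed_kernel [DIR]
# Hypothetical helper: print the highest version among DIR/vmlinuz-* files.
newest_installed_kernel() {
    dir="${1:-/boot}"
    for f in "$dir"/vmlinuz-*; do
        # Skip the unexpanded glob when no image exists
        [ -e "$f" ] && printf '%s\n' "${f##*/vmlinuz-}"
    done | sort -V | tail -n 1
}

# kernel_reboot_pending [DIR]
# Succeeds (exit 0) if the running kernel differs from the newest image found.
kernel_reboot_pending() {
    newest=$(newest_installed_kernel "${1:-/boot}")
    [ -n "$newest" ] && [ "$(uname -r)" != "$newest" ]
}
```

You could then run: kernel_reboot_pending && echo "running kernel differs from newest installed image". The pveversion -v output also lists the running kernel, so the two can be cross-checked.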
 
