Hi!
A few weeks ago, pve-firewall on my PVE host invoked the oom-killer, which shut down one of my VMs, and I couldn't turn it on again without restarting the whole PVE host.
Sep 18 01:55:11 pve01 kernel: pve-firewall invoked oom-killer: gfp_mask=0x40cc0(GFP_KERNEL|__GFP_COMP), order=2, oom_score_adj=0
The OOM killer killed process 1779 (kvm); here is the log:
Code:
Sep 18 01:55:11 pve01 kernel: Node 0 active_anon:22493748kB inactive_anon:9459004kB active_file:6130884kB inactive_file:17267728kB unevictable:3072kB isolated(anon):0kB isolated(file):0kB mapped:42240kB dirty:4105512kB writeback:45440kB shmem:45008kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:552960kB writeback_tmp:0kB kernel_stack:8288kB pagetables:107948kB sec_pagetables:81468kB all_unreclaimable? no
Sep 18 01:55:11 pve01 kernel: Node 0 DMA free:11264kB boost:0kB min:12kB low:24kB high:36kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Sep 18 01:55:11 pve01 kernel: lowmem_reserve[]: 0 2332 64135 64135 64135
Sep 18 01:55:11 pve01 kernel: Node 0 DMA32 free:455808kB boost:10848kB min:13304kB low:15692kB high:18080kB reserved_highatomic:0KB active_anon:832432kB inactive_anon:186084kB active_file:6780kB inactive_file:414600kB unevictable:0kB writepending:105608kB present:2513916kB managed:2447820kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Sep 18 01:55:11 pve01 kernel: lowmem_reserve[]: 0 0 61802 61802 61802
Sep 18 01:55:11 pve01 kernel: Node 0 Normal free:1199736kB boost:0kB min:65108kB low:128392kB high:191676kB reserved_highatomic:14336KB active_anon:21661132kB inactive_anon:9272924kB active_file:6121924kB inactive_file:16855484kB unevictable:3072kB writepending:4044948kB present:64487424kB managed:63293856kB mlocked:0kB bounce:0kB free_pcp:1420kB local_pcp:0kB free_cma:0kB
Sep 18 01:55:11 pve01 kernel: lowmem_reserve[]: 0 0 0 0 0
Sep 18 01:55:11 pve01 kernel: Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 2*4096kB (M) = 11264kB
Sep 18 01:55:11 pve01 kernel: Node 0 DMA32: 36246*4kB (UE) 38849*8kB (UE) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 455776kB
Sep 18 01:55:11 pve01 kernel: Node 0 Normal: 48177*4kB (UME) 126251*8kB (UME) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1202716kB
Sep 18 01:55:11 pve01 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Sep 18 01:55:11 pve01 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Sep 18 01:55:11 pve01 kernel: 1085442 total pagecache pages
Sep 18 01:55:11 pve01 kernel: 0 pages in swap cache
Sep 18 01:55:11 pve01 kernel: Free swap = 0kB
Sep 18 01:55:11 pve01 kernel: Total swap = 0kB
Sep 18 01:55:11 pve01 kernel: 16754333 pages RAM
Sep 18 01:55:11 pve01 kernel: 0 pages HighMem/MovableOnly
Sep 18 01:55:11 pve01 kernel: 315074 pages reserved
Sep 18 01:55:11 pve01 kernel: 0 pages hwpoisoned
Sep 18 01:55:11 pve01 kernel: Tasks state (memory values in pages):
Sep 18 01:55:11 pve01 kernel: [ pid ] uid tgid total_vm rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
Sep 18 01:55:11 pve01 kernel: [ 599] 0 599 14414 480 224 256 0 143360 0 -250 systemd-journal
Sep 18 01:55:11 pve01 kernel: [ 615] 0 615 6966 704 512 192 0 73728 0 -1000 systemd-udevd
Sep 18 01:55:11 pve01 kernel: [ 1166] 103 1166 1970 352 96 256 0 57344 0 0 rpcbind
Sep 18 01:55:11 pve01 kernel: [ 1195] 101 1195 2314 256 96 160 0 57344 0 -900 dbus-daemon
Sep 18 01:55:11 pve01 kernel: [ 1199] 0 1199 69539 256 64 192 0 102400 0 0 pve-lxc-syscall
Sep 18 01:55:11 pve01 kernel: [ 1202] 0 1202 2994 576 416 160 0 69632 0 0 smartd
Sep 18 01:55:11 pve01 kernel: [ 1205] 0 1205 1766 211 83 128 0 53248 0 0 ksmtuned
Sep 18 01:55:11 pve01 kernel: [ 1207] 0 1207 4172 416 224 192 0 77824 0 0 systemd-logind
Sep 18 01:55:11 pve01 kernel: [ 1208] 0 1208 583 96 0 96 0 40960 0 -1000 watchdog-mux
Sep 18 01:55:11 pve01 kernel: [ 1216] 0 1216 60167 416 256 160 0 106496 0 0 zed
Sep 18 01:55:11 pve01 kernel: [ 1217] 0 1217 38189 256 64 192 0 73728 0 -1000 lxcfs
Sep 18 01:55:11 pve01 kernel: [ 1378] 0 1378 2207 224 64 160 0 57344 0 0 lxc-monitord
Sep 18 01:55:11 pve01 kernel: [ 1391] 0 1391 1468 192 32 160 0 49152 0 0 agetty
Sep 18 01:55:11 pve01 kernel: [ 1401] 0 1401 3855 576 320 256 0 69632 0 -1000 sshd
Sep 18 01:55:11 pve01 kernel: [ 1423] 100 1423 4715 266 138 128 0 69632 0 0 chronyd
Sep 18 01:55:11 pve01 kernel: [ 1433] 100 1433 2633 402 114 288 0 65536 0 0 chronyd
Sep 18 01:55:11 pve01 kernel: [ 1486] 0 1486 218708 463 276 187 0 241664 0 0 rrdcached
Sep 18 01:55:11 pve01 kernel: [ 1576] 0 1576 10665 293 133 160 0 73728 0 0 master
Sep 18 01:55:11 pve01 kernel: [ 1578] 104 1578 10774 352 160 192 0 73728 0 0 qmgr
Sep 18 01:55:11 pve01 kernel: [ 1584] 0 1584 1652 224 32 192 0 53248 0 0 cron
Sep 18 01:55:11 pve01 kernel: [ 1595] 0 1595 39788 24696 24065 352 279 307200 0 0 pve-firewall
Sep 18 01:55:11 pve01 kernel: [ 1607] 0 1607 38426 25622 24534 672 416 348160 0 0 pvestatd
Sep 18 01:55:11 pve01 kernel: [ 1621] 0 1621 58747 34250 33962 288 0 454656 0 0 pvedaemon
Sep 18 01:55:11 pve01 kernel: [ 1630] 33 1630 59090 34584 34296 288 0 466944 0 0 pveproxy
Sep 18 01:55:11 pve01 kernel: [ 1637] 33 1637 20194 12896 12576 320 0 200704 0 0 spiceproxy
Sep 18 01:55:11 pve01 kernel: [ 1657] 0 1657 3289 391 199 192 0 61440 0 0 swtpm
Sep 18 01:55:11 pve01 kernel: [ 1665] 0 1665 2696513 2148776 2148392 384 0 20398080 0 0 kvm
Sep 18 01:55:11 pve01 kernel: [ 1772] 0 1772 3289 423 231 192 0 61440 0 0 swtpm
Sep 18 01:55:11 pve01 kernel: [ 1779] 0 1779 10206524 8450693 8450213 480 0 80039936 0 0 kvm
Sep 18 01:55:11 pve01 kernel: [ 1914] 0 1914 54131 28524 28172 352 0 421888 0 0 pvescheduler
Sep 18 01:55:11 pve01 kernel: [ 617059] 0 617059 144978 14465 4825 256 9384 372736 0 0 pmxcfs
Sep 18 01:55:11 pve01 kernel: [ 617127] 0 617127 1298 224 96 128 0 49152 0 0 proxmox-firewal
Sep 18 01:55:11 pve01 kernel: [ 628823] 0 628823 1328 288 32 256 0 53248 0 0 qmeventd
Sep 18 01:55:11 pve01 kernel: [ 628969] 0 628969 55044 27976 27304 288 384 372736 0 0 pve-ha-lrm
Sep 18 01:55:11 pve01 kernel: [ 628982] 0 628982 55181 28112 27440 352 320 368640 0 0 pve-ha-crm
Sep 18 01:55:11 pve01 kernel: [1133067] 0 1133067 61030 34809 34393 320 96 454656 0 0 pvedaemon worke
Sep 18 01:55:11 pve01 kernel: [1152694] 0 1152694 78038 34943 34527 320 96 466944 0 0 pvedaemon worke
Sep 18 01:55:11 pve01 kernel: [1159536] 0 1159536 61003 34602 34282 288 32 450560 0 0 pvedaemon worke
Sep 18 01:55:11 pve01 kernel: [3380439] 33 3380439 20252 12868 12612 256 0 180224 0 0 spiceproxy work
Sep 18 01:55:11 pve01 kernel: [3380446] 33 3380446 59123 34601 34345 256 0 434176 0 0 pveproxy worker
Sep 18 01:55:11 pve01 kernel: [3380447] 33 3380447 59123 34601 34345 256 0 434176 0 0 pveproxy worker
Sep 18 01:55:11 pve01 kernel: [3380448] 33 3380448 59123 34601 34345 256 0 434176 0 0 pveproxy worker
Sep 18 01:55:11 pve01 kernel: [3247947] 0 3247947 19796 256 32 224 0 61440 0 0 pvefw-logger
Sep 18 01:55:11 pve01 kernel: [3259936] 104 3259936 10765 256 160 96 0 73728 0 0 pickup
Sep 18 01:55:11 pve01 kernel: [3264850] 0 3264850 58148 28869 28517 352 0 401408 0 0 task UPID:pve01
Sep 18 01:55:11 pve01 kernel: [3274823] 0 3274823 4390864 13485 13069 416 0 618496 0 0 kvm
Sep 18 01:55:11 pve01 kernel: [3274893] 0 3274893 69685 13120 13024 96 0 188416 0 0 zstd
Sep 18 01:55:11 pve01 kernel: [3283058] 0 3283058 1366 160 0 160 0 49152 0 0 sleep
Sep 18 01:55:11 pve01 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=pve-firewall.service,mems_allowed=0,global_oom,task_memcg=/qemu.slice/102.scope,task=kvm,pid=1779,uid=0
Sep 18 01:55:11 pve01 kernel: Out of memory: Killed process 1779 (kvm) total-vm:40826096kB, anon-rss:33800852kB, file-rss:1920kB, shmem-rss:0kB, UID:0 pgtables:78164kB oom_score_adj:0
Sep 18 01:55:11 pve01 systemd[1]: 102.scope: A process of this unit has been killed by the OOM killer.
Sep 18 01:55:11 pve01 systemd[1]: 102.scope: Failed with result 'oom-kill'.
Sep 18 01:55:11 pve01 systemd[1]: 102.scope: Consumed 1w 21h 25min 50.101s CPU time.
Sep 18 01:55:11 pve01 kernel: zd16: p1 p2
Sep 18 01:55:11 pve01 kernel: zd112: p1 p2 p3 p4
Sep 18 01:55:11 pve01 kernel: fwbr102i0: port 2(tap102i0) entered disabled state
Sep 18 01:55:11 pve01 kernel: tap102i0 (unregistering): left allmulticast mode
Sep 18 01:55:11 pve01 kernel: fwbr102i0: port 2(tap102i0) entered disabled state
Sep 18 01:55:12 pve01 qmeventd[3283191]: Starting cleanup for 102
Sep 18 01:55:12 pve01 kernel: fwbr102i0: port 1(fwln102i0) entered disabled state
Sep 18 01:55:12 pve01 kernel: vmbr0: port 3(fwpr102p0) entered disabled state
Sep 18 01:55:12 pve01 kernel: fwln102i0 (unregistering): left allmulticast mode
Sep 18 01:55:12 pve01 kernel: fwln102i0 (unregistering): left promiscuous mode
Sep 18 01:55:12 pve01 kernel: fwbr102i0: port 1(fwln102i0) entered disabled state
Sep 18 01:55:12 pve01 kernel: fwpr102p0 (unregistering): left allmulticast mode
Sep 18 01:55:12 pve01 kernel: fwpr102p0 (unregistering): left promiscuous mode
Sep 18 01:55:12 pve01 kernel: vmbr0: port 3(fwpr102p0) entered disabled state
Sep 18 01:55:12 pve01 qmeventd[3283191]: Finished cleanup for 102
Sep 18 01:55:14 pve01 kernel: oom_reaper: reaped process 1779 (kvm), now anon-rss:0kB, file-rss:364kB, shmem-rss:0kB
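For context, the log also shows the host was running without any swap (`Free swap = 0kB` / `Total swap = 0kB`). These standard commands confirm the memory and swap state on the host (just a generic Linux check, nothing PVE-specific):

```shell
# Show total/used/free RAM and swap in human-readable units
free -h

# List active swap devices; empty output means no swap is configured
swapon --show

# Kernel's tendency to swap (default is usually 60)
cat /proc/sys/vm/swappiness
```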
Is there anything I can do to prevent that from happening, or was it a one-time thing?