Hello,
I have PVE running on a machine with 8 GB of RAM, the ZFS ARC max size set to 3 GB, and a VM set to 5 GB (4 GB minimum, ballooning enabled), but I got an OOM kill last night (so the backup failed and the VM remained down until I checked it), and I wonder how I can prevent this from happening again. Here are the journalctl logs (how the ARC limit is set is shown below them):
Code:
Jan 16 01:07:16 pve qmeventd[2679]: Finished cleanup for XXXXX
Jan 16 01:07:16 pve kernel: vmbr0: port 2(fwprXXXXXp0) entered disabled state
Jan 16 01:07:16 pve kernel: fwprXXXXXp0 (unregistering): left promiscuous mode
Jan 16 01:07:16 pve kernel: fwprXXXXXp0 (unregistering): left allmulticast mode
Jan 16 01:07:16 pve kernel: fwbrXXXXXi0: port 1(fwlnXXXXXi0) entered disabled state
Jan 16 01:07:16 pve kernel: fwlnXXXXXi0 (unregistering): left promiscuous mode
Jan 16 01:07:16 pve kernel: fwlnXXXXXi0 (unregistering): left allmulticast mode
Jan 16 01:07:16 pve kernel: vmbr0: port 2(fwprXXXXXp0) entered disabled state
Jan 16 01:07:16 pve kernel: fwbrXXXXXi0: port 1(fwlnXXXXXi0) entered disabled state
Jan 16 01:07:16 pve qmeventd[2679]: Starting cleanup for XXXXX
Jan 16 01:07:15 pve pvescheduler[4194276]: INFO: Backup job finished with errors
Jan 16 01:07:15 pve pvescheduler[4194276]: ERROR: Backup of VM XXXXX failed - VM XXXXX not running
Jan 16 01:07:15 pve pvescheduler[4194276]: VM XXXXX qmp command failed - VM XXXXX not running
Jan 16 01:07:15 pve pvescheduler[4194276]: VM XXXXX qmp command failed - VM XXXXX not running
Jan 16 01:07:15 pve pvescheduler[4194276]: VM XXXXX qmp command failed - VM XXXXX not running
Jan 16 01:07:15 pve systemd[1]: qemu.slice: A process of this unit has been killed by the OOM killer.
Jan 16 01:07:15 pve kernel: fwbrXXXXXi0: port 2(tapXXXXXi0) entered disabled state
Jan 16 01:07:15 pve kernel: tapXXXXXi0 (unregistering): left allmulticast mode
Jan 16 01:07:15 pve kernel: fwbrXXXXXi0: port 2(tapXXXXXi0) entered disabled state
Jan 16 01:07:15 pve systemd[1]: removable-device-attach@1y9iV0-HAjp-fIc5-bZJf-FNEq-rEPM-uINSEf.service: Deactivated successfully.
Jan 16 01:07:15 pve systemd[1]: removable-device-attach@4cf433a2-ebb2-4195-8823-97ed713c0ddb.service: Deactivated successfully.
Jan 16 01:07:15 pve systemd[1]: Started removable-device-attach@1y9iV0-HAjp-fIc5-bZJf-FNEq-rEPM-uINSEf.service - Try to mount the removable device of a dat>
Jan 16 01:07:15 pve systemd[1]: Started removable-device-attach@4cf433a2-ebb2-4195-8823-97ed713c0ddb.service - Try to mount the removable device of a datas>
Jan 16 01:07:15 pve lvm[2665]: /dev/zd16p5 excluded: device is rejected by filter config.
Jan 16 01:07:15 pve kernel: zd16: p1 p2 < p5 >
Jan 16 01:07:15 pve proxmox-backup-proxy[11665]: removing failed backup
Jan 16 01:07:15 pve proxmox-backup-proxy[11665]: backup failed: connection error: connection reset
Jan 16 01:07:15 pve systemd[1]: XXXXX.scope: Consumed 18h 43min 54.880s CPU time.
Jan 16 01:07:15 pve systemd[1]: XXXXX.scope: Failed with result 'oom-kill'.
Jan 16 01:07:15 pve systemd[1]: XXXXX.scope: A process of this unit has been killed by the OOM killer.
Jan 16 01:07:15 pve kernel: Out of memory: Killed process 1263431 (kvm) total-vm:7772612kB, anon-rss:4777928kB, file-rss:2152kB, shmem-rss:0kB, UID:0 pgtab>
Jan 16 01:07:15 pve kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=proxmox-backup-proxy.service,mems_allowed=0,global_oom,task_memcg=/q>
Jan 16 01:07:15 pve kernel: [ 2631] 0 2631 1366 192 0 192 0 49152 0 0 sleep
Jan 16 01:07:15 pve kernel: [4194276] 0 4194276 60373 30020 29764 256 0 397312 0 0 task UPID:pve:0
Jan 16 01:07:15 pve kernel: [4190228] 104 4190228 10764 192 160 32 0 77824 0 0 pickup
Jan 16 01:07:15 pve kernel: [4177396] 33 4177396 60845 35759 35567 192 0 430080 0 0 pveproxy worker
Jan 16 01:07:15 pve kernel: [4177395] 33 4177395 60845 35727 35535 192 0 430080 0 0 pveproxy worker
Jan 16 01:07:15 pve kernel: [4177394] 33 4177394 60845 35727 35535 192 0 430080 0 0 pveproxy worker
Jan 16 01:07:15 pve kernel: [4177393] 33 4177393 22038 13012 12884 128 0 200704 0 0 spiceproxy work
Jan 16 01:07:15 pve kernel: [4177389] 0 4177389 19796 96 32 64 0 61440 0 0 pvefw-logger
Jan 16 01:07:15 pve kernel: [1280750] 0 1280750 62587 35734 35510 160 64 434176 0 0 pvedaemon worke
Jan 16 01:07:15 pve kernel: [1280657] 0 1280657 62585 35702 35478 192 32 434176 0 0 pvedaemon worke
Jan 16 01:07:15 pve kernel: [1275499] 0 1275499 62617 35702 35478 128 96 434176 0 0 pvedaemon worke
Jan 16 01:07:15 pve kernel: [1263431] 0 1263431 1943153 1195020 1194482 538 0 12320768 0 0 kvm
Jan 16 01:07:15 pve kernel: [ 11665] 34 11665 612108 25521 25167 354 0 1409024 0 0 proxmox-backup-
Jan 16 01:07:15 pve kernel: [ 11655] 0 11655 180891 3062 2767 295 0 253952 0 0 proxmox-backup-
Jan 16 01:07:15 pve kernel: [ 7862] 0 7862 190652 7003 6939 64 0 311296 0 0 fail2ban-server
Jan 16 01:07:15 pve kernel: [ 5872] 0 5872 3858 384 320 64 0 73728 0 -1000 sshd
Jan 16 01:07:15 pve kernel: [ 1069] 0 1069 56058 29286 29094 192 0 372736 0 0 pvescheduler
Jan 16 01:07:15 pve kernel: [ 1064] 0 1064 56996 28909 28205 224 480 376832 0 0 pve-ha-lrm
Jan 16 01:07:15 pve kernel: [ 1062] 33 1062 21807 12832 12704 128 0 217088 0 0 spiceproxy
Jan 16 01:07:15 pve kernel: [ 1051] 33 1051 60641 35524 35332 192 0 475136 0 0 pveproxy
Jan 16 01:07:15 pve kernel: [ 1050] 0 1050 57110 28946 28338 160 448 393216 0 0 pve-ha-crm
Jan 16 01:07:15 pve kernel: [ 1042] 0 1042 60279 35189 35029 160 0 413696 0 0 pvedaemon
Jan 16 01:07:15 pve kernel: [ 1017] 0 1017 50297 25739 25003 256 480 352256 0 0 pve-firewall
Jan 16 01:07:15 pve kernel: [ 1016] 0 1016 51715 27195 26331 416 448 372736 0 0 pvestatd
Jan 16 01:07:15 pve kernel: [ 1006] 0 1006 1597 224 96 128 0 57344 0 0 proxmox-firewal
Jan 16 01:07:15 pve kernel: [ 1005] 0 1005 1652 160 32 128 0 57344 0 0 cron
Jan 16 01:07:15 pve kernel: [ 1000] 104 1000 10858 224 160 64 0 81920 0 0 qmgr
Jan 16 01:07:15 pve kernel: [ 998] 0 998 10665 165 133 32 0 73728 0 0 master
Jan 16 01:07:15 pve kernel: [ 927] 0 927 203893 14152 2997 192 10963 442368 0 0 pmxcfs
Jan 16 01:07:15 pve kernel: [ 905] 0 905 200275 403 307 96 0 225280 0 0 rrdcached
Jan 16 01:07:15 pve kernel: [ 871] 100 871 2633 179 115 64 0 65536 0 0 chronyd
Jan 16 01:07:15 pve kernel: [ 866] 100 866 4715 202 138 64 0 69632 0 0 chronyd
Jan 16 01:07:15 pve kernel: [ 817] 0 817 1468 96 32 64 0 57344 0 0 agetty
Jan 16 01:07:15 pve kernel: [ 800] 0 800 2207 160 64 96 0 61440 0 0 lxc-monitord
Jan 16 01:07:15 pve kernel: [ 690] 0 690 38189 96 32 64 0 65536 0 -1000 lxcfs
Jan 16 01:07:15 pve kernel: [ 687] 0 687 60170 352 320 32 0 106496 0 0 zed
Jan 16 01:07:15 pve kernel: [ 683] 0 683 583 32 0 32 0 40960 0 -1000 watchdog-mux
Jan 16 01:07:15 pve kernel: [ 682] 0 682 12494 288 256 32 0 106496 0 0 systemd-logind
Jan 16 01:07:15 pve kernel: [ 681] 0 681 1327 128 32 96 0 49152 0 0 qmeventd
Jan 16 01:07:15 pve kernel: [ 675] 0 675 2958 448 384 64 0 69632 0 0 smartd
Jan 16 01:07:15 pve kernel: [ 671] 0 671 1766 179 51 128 0 53248 0 0 ksmtuned
Jan 16 01:07:15 pve kernel: [ 669] 0 669 69539 128 64 64 0 106496 0 0 pve-lxc-syscall
Jan 16 01:07:15 pve kernel: [ 664] 101 664 2319 192 160 32 0 65536 0 -900 dbus-daemon
Jan 16 01:07:15 pve kernel: [ 642] 103 642 1970 160 96 64 0 57344 0 0 rpcbind
Jan 16 01:07:15 pve kernel: [ 426] 0 426 7027 665 576 89 0 81920 0 -1000 systemd-udevd
Jan 16 01:07:15 pve kernel: [ 401] 0 401 16497 320 256 64 0 135168 0 -250 systemd-journal
Jan 16 01:07:15 pve kernel: [ pid ] uid tgid total_vm rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
Jan 16 01:07:15 pve kernel: Tasks state (memory values in pages):
Jan 16 01:07:15 pve kernel: 0 pages hwpoisoned
Jan 16 01:07:15 pve kernel: 63434 pages reserved
Jan 16 01:07:15 pve kernel: 0 pages HighMem/MovableOnly
Jan 16 01:07:15 pve kernel: 2073250 pages RAM
Jan 16 01:07:15 pve kernel: Total swap = 0kB
Jan 16 01:07:15 pve kernel: Free swap = 0kB
Jan 16 01:07:15 pve kernel: 0 pages in swap cache
Jan 16 01:07:15 pve kernel: 12566 total pagecache pages
Jan 16 01:07:15 pve kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jan 16 01:07:15 pve kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Jan 16 01:07:15 pve kernel: Node 0 Normal: 5926*4kB (UMH) 10264*8kB (UMEH) 1761*16kB (UMH) 177*32kB (UMH) 56*64kB (UMH) 27*128kB (UMH) 0*256kB 0*512kB 0*10>
Jan 16 01:07:15 pve kernel: Node 0 DMA32: 4022*4kB (UMH) 3950*8kB (UMH) 785*16kB (UMH) 623*32kB (UMH) 490*64kB (UM) 132*128kB (UM) 1*256kB (U) 0*512kB 0*10>
Jan 16 01:07:15 pve kernel: Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB (M) 3*4096kB (M) = 14336kB
Jan 16 01:07:15 pve kernel: lowmem_reserve[]: 0 0 0 0 0
Jan 16 01:07:15 pve kernel: Node 0 Normal free:149736kB boost:0kB min:38208kB low:47760kB high:57312kB reserved_highatomic:18432KB active_anon:1301964kB in>
Jan 16 01:07:15 pve kernel: lowmem_reserve[]: 0 0 4401 4401 4401
Jan 16 01:07:15 pve kernel: Node 0 DMA32 free:128256kB boost:0kB min:29240kB low:36548kB high:43856kB reserved_highatomic:6144KB active_anon:1159112kB inac>
Jan 16 01:07:15 pve kernel: lowmem_reserve[]: 0 3368 7770 7770 7770
Jan 16 01:07:15 pve kernel: Node 0 DMA free:14336kB boost:0kB min:128kB low:160kB high:192kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB acti>
Jan 16 01:07:15 pve kernel: Node 0 active_anon:2444864kB inactive_anon:2820300kB active_file:1492kB inactive_file:2728kB unevictable:124kB isolated(anon):0>
Jan 16 01:07:15 pve kernel: active_anon:725661 inactive_anon:590630 isolated_anon:0
active_file:486 inactive_file:762 isolated_file:0
unevictable:31 dirty:6 writeback:3
slab_reclaimable:2421 slab_unreclaimable:76508
mapped:12206 shmem:11572 pagetables:5673
sec_pagetables:2411 bounce:0
kernel_misc_reclaimable:0
free:62183 free_pcp:36 free_cma:0
Jan 16 01:07:15 pve kernel: Mem-Info:
Jan 16 01:07:15 pve kernel: </TASK>
Jan 16 01:07:15 pve kernel: R13: 000073e5a4036b38 R14: 0000000000004000 R15: 000073e5d3a700b8
Jan 16 01:07:15 pve kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Jan 16 01:07:15 pve kernel: RBP: 0000000000000000 R08: 000073e5d3a700b8 R09: 0000000000000000
Jan 16 01:07:15 pve kernel: RDX: 0000000000004000 RSI: 000073e5fc0721a0 RDI: 000073e5d3a71000
Jan 16 01:07:15 pve kernel: RAX: 000073e5d3a700b8 RBX: 0000000000390000 RCX: 00000000000030b8
Jan 16 01:07:15 pve kernel: RSP: 002b:000073e562df2a68 EFLAGS: 00010202
Jan 16 01:07:15 pve kernel: Code: 00 01 00 00 00 74 99 83 f9 c0 0f 87 7b fe ff ff c5 fe 6f 4e 20 48 29 fe 48 83 c7 3f 49 8d 0c 10 48 83 e7 c0 48 01 fe 48 2>
Jan 16 01:07:15 pve kernel: RIP: 0033:0x73e611a51c4a
Jan 16 01:07:15 pve kernel: asm_exc_page_fault+0x27/0x30
Jan 16 01:07:15 pve kernel: exc_page_fault+0x83/0x1b0
Jan 16 01:07:15 pve kernel: do_user_addr_fault+0x169/0x660
Jan 16 01:07:15 pve kernel: handle_mm_fault+0x18d/0x380
Jan 16 01:07:15 pve kernel: __handle_mm_fault+0xbf1/0xf20
Jan 16 01:07:15 pve kernel: ? __pte_offset_map+0x1c/0x1b0
Jan 16 01:07:15 pve kernel: do_anonymous_page+0x21e/0x740
Jan 16 01:07:15 pve kernel: vma_alloc_folio+0x64/0xe0
Jan 16 01:07:15 pve kernel: ? __mod_lruvec_state+0x36/0x50
Jan 16 01:07:15 pve kernel: alloc_pages_mpol+0x91/0x1f0
Jan 16 01:07:15 pve kernel: __alloc_pages+0x10ce/0x1320
Jan 16 01:07:15 pve kernel: out_of_memory+0x26e/0x560
Jan 16 01:07:15 pve kernel: oom_kill_process+0x110/0x240
Jan 16 01:07:15 pve kernel: dump_header+0x47/0x1f0
Jan 16 01:07:15 pve kernel: dump_stack+0x10/0x20
Jan 16 01:07:15 pve kernel: dump_stack_lvl+0x76/0xa0
Jan 16 01:07:15 pve kernel: <TASK>
Jan 16 01:07:15 pve kernel: Call Trace:
Jan 16 01:07:15 pve kernel: Hardware name: LENOVO 10M8S93W00/3102, BIOS M16KT53A 11/27/2018
Jan 16 01:07:15 pve kernel: CPU: 1 PID: 501 Comm: tokio-runtime-w Tainted: P O 6.8.12-5-pve #1
Jan 16 01:07:15 pve kernel: tokio-runtime-w invoked oom-killer: gfp_mask=0x140dca(GFP_HIGHUSER_MOVABLE|__GFP_COMP|__GFP_ZERO), order=0, oom_score_adj=0
Jan 16 01:04:41 pve proxmox-backup-proxy[11665]: rrd journal successfully committed (25 files in 0.008 seconds)
Jan 16 01:00:02 pve proxmox-backup-proxy[11665]: add blob "/mnt/datastore/PBS_local/vm/XXXXX/2025-01-16T00:00:01Z/qemu-server.conf.blob" (467 bytes, comp:>
Jan 16 01:00:02 pve proxmox-backup-proxy[11665]: created new fixed index 1 ("vm/XXXXX/2025-01-16T00:00:01Z/drive-scsi0.img.fidx")
Jan 16 01:00:01 pve proxmox-backup-proxy[11665]: download 'drive-scsi0.img.fidx' from previous backup.
Jan 16 01:00:01 pve proxmox-backup-proxy[11665]: register chunks in 'drive-scsi0.img.fidx' from previous backup.
Jan 16 01:00:01 pve proxmox-backup-proxy[11665]: download 'index.json.blob' from previous backup.
Jan 16 01:00:01 pve proxmox-backup-proxy[11665]: starting new backup on datastore 'PBS_local' from ::ffff:192.168.XXX.XXX: "vm/XXXXX/2025-01-16T00:00:01Z"
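For completeness, the ARC cap is applied the usual way, via a module option (a minimal sketch of my setup; the value is simply 3 GiB written out in bytes):

Code:
# /etc/modprobe.d/zfs.conf -- cap the ZFS ARC at 3 GiB (3 * 1024^3 = 3221225472 bytes)
options zfs zfs_arc_max=3221225472

followed by update-initramfs -u and a reboot, so the limit also applies when ZFS is loaded from the initramfs. The value currently in effect can be checked with cat /sys/module/zfs/parameters/zfs_arc_max.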