[SOLVED] Question about memory usage on PVE server

TomFIT

Member
Jul 20, 2022
Hi,

I'm having trouble with the RAM usage on my PVE server and looking for some advice.

My PVE server often shows RAM usage above 90% and kills VM 100 (WinServ2022) because of "Out Of Memory".
Part of the syslog can be found at the end of this post.

The server has 64 GB RAM.
The six VMs are as follows:
100 VM Win 2022 Server (8 Core, 16 GB RAM, normally 20% guest ram used)
101 VM Win10 (4 Core, 4 GB RAM, normally 50% guest ram used)
102 VM Win10 (4 Core, 4 GB RAM, normally 50% guest ram used)
110 VM Debian Linux (4 Core, 1 GB RAM, normally 30% guest ram used)
200 VM Debian Linux (4 Core, 2 GB RAM, normally 20% guest ram used)
201 VM Windows XP (1 Core, 0.25 GB RAM) (not running)

All VM disk images are thin provisioned with no cache (default) and with SSD emulation and discard enabled.

So the running VMs should add up to about 27 GB of assigned RAM.
All Windows VMs use the guest agent and the ballooning driver; the Debian VMs also have the guest agent installed.
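For reference, ballooning in PVE is configured per VM; a minimal sketch of how that looks for VM 100 (the 8 GiB minimum here is only an example value, not my actual config):

# let VM 100 float between 8 and 16 GiB; the balloon driver can reclaim memory down to the minimum
qm set 100 --memory 16384 --balloon 8192
# this ends up in /etc/pve/qemu-server/100.conf as:
#   memory: 16384
#   balloon: 8192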

I'm running Proxmox VE 7.2 on small workstation hardware; I also added the APC UPS daemon (apcupsd) on the PVE host.
The hardware is an Intel Core i9-10900 (10 cores / 20 threads) with 64 GB RAM.
Proxmox and the VM storage are on a ZFS mirror of 2x 1.92 TB enterprise SATA SSDs, so there is no swap space.

I know ZFS needs a lot of RAM, but I can't imagine it uses so much that it crashes the VMs.
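In case it matters, the current ARC size and its limits can be read from the kernel stats (this is generic OpenZFS, nothing Proxmox-specific; the grep pattern is just one way to read them):

# current ARC size and its configured maximum/minimum, in bytes
grep -E '^(size|c_max|c_min) ' /proc/spl/kstat/zfs/arcstats
# or use the summary tool shipped with zfsutils-linux
arc_summary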

I can't find the reason why the PVE host uses so much RAM and repeatedly kills VM 100.

Any help would be appreciated.

Thanks,
Tom

Here is part of the syslog:
Aug 05 16:57:12 pve1 kernel: kthreadd invoked oom-killer: gfp_mask=0x2dc2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_NOWARN|__GFP_ZERO), order=0, oom_score_adj=0
Aug 05 16:57:12 pve1 kernel: CPU: 4 PID: 2 Comm: kthreadd Tainted: P W O 5.15.39-2-pve #1
...
Aug 05 16:57:12 pve1 kernel: Mem-Info:
Aug 05 16:57:12 pve1 kernel: active_anon:4326676 inactive_anon:2025734 isolated_anon:0
active_file:1289 inactive_file:971 isolated_file:0
unevictable:3018 dirty:3 writeback:3
slab_reclaimable:16180 slab_unreclaimable:2836741
mapped:12229 shmem:8237 pagetables:16203 bounce:0
kernel_misc_reclaimable:0
free:154259 free_pcp:2167 free_cma:0
Aug 05 16:57:12 pve1 kernel: Node 0 active_anon:17306704kB inactive_anon:8102936kB active_file:5156kB inactive_file:3884kB unevictable:12072kB isolated(anon):0kB isolated(file):0kB mapped:48916kB dirty:12kB writeback:12kB shmem:32948kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 641024kB writeback_tmp:0kB kernel_stack:9112kB pagetables:64812kB all_unreclaimable? no
Aug 05 16:57:12 pve1 kernel: Node 0 DMA free:11264kB min:12kB low:24kB high:36kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Aug 05 16:57:12 pve1 kernel: lowmem_reserve[]: 0 2117 63871 63871 63871
Aug 05 16:57:12 pve1 kernel: Node 0 DMA32 free:255100kB min:12092kB low:14260kB high:16428kB reserved_highatomic:2048KB active_anon:550648kB inactive_anon:754812kB active_file:0kB inactive_file:860kB unevictable:0kB writepending:0kB present:2296508kB managed:2230868kB mlocked:0kB bounce:0kB free_pcp:1504kB local_pcp:0kB free_cma:0kB
Aug 05 16:57:12 pve1 kernel: lowmem_reserve[]: 0 0 61754 61754 61754
Aug 05 16:57:12 pve1 kernel: Node 0 Normal free:349928kB min:353028kB low:416264kB high:479500kB reserved_highatomic:0KB active_anon:16753776kB inactive_anon:7349800kB active_file:5648kB inactive_file:6424kB unevictable:12072kB writepending:24kB present:64430080kB managed:63243684kB mlocked:11948kB bounce:0kB free_pcp:7596kB local_pcp:0kB free_cma:0kB
Aug 05 16:57:12 pve1 kernel: lowmem_reserve[]: 0 0 0 0 0
Aug 05 16:57:12 pve1 kernel: Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 2*4096kB (M) = 11264kB
Aug 05 16:57:12 pve1 kernel: Node 0 DMA32: 2639*4kB (UM) 507*8kB (UM) 740*16kB (UME) 177*32kB (UME) 85*64kB (UME) 21*128kB (ME) 201*256kB (UME) 320*512kB (M) 0*1024kB 0*2048kB 0*4096kB = 255540kB
Aug 05 16:57:12 pve1 kernel: Node 0 Normal: 28962*4kB (UME) 978*8kB (UME) 14067*16kB (UME) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 348744kB
Aug 05 16:57:12 pve1 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Aug 05 16:57:12 pve1 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Aug 05 16:57:12 pve1 kernel: 12715 total pagecache pages
Aug 05 16:57:12 pve1 kernel: 0 pages in swap cache
Aug 05 16:57:12 pve1 kernel: Swap cache stats: add 0, delete 0, find 0/0
Aug 05 16:57:12 pve1 kernel: Free swap = 0kB
Aug 05 16:57:12 pve1 kernel: Total swap = 0kB
Aug 05 16:57:12 pve1 kernel: 16685645 pages RAM
Aug 05 16:57:12 pve1 kernel: 0 pages HighMem/MovableOnly
Aug 05 16:57:12 pve1 kernel: 313167 pages reserved
Aug 05 16:57:12 pve1 kernel: 0 pages hwpoisoned
Aug 05 16:57:12 pve1 kernel: Tasks state (memory values in pages):
Aug 05 16:57:12 pve1 kernel: [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Aug 05 16:57:12 pve1 kernel: [ 1792] 0 1792 22326 821 204800 0 -250 systemd-journal
Aug 05 16:57:12 pve1 kernel: [ 1874] 0 1874 5596 750 65536 0 -1000 systemd-udevd
Aug 05 16:57:12 pve1 kernel: [ 2413] 103 2413 1960 464 53248 0 0 rpcbind
Aug 05 16:57:12 pve1 kernel: [ 2438] 102 2438 2047 545 61440 0 -900 dbus-daemon
Aug 05 16:57:12 pve1 kernel: [ 2450] 0 2450 37728 309 57344 0 0 lxcfs
Aug 05 16:57:12 pve1 kernel: [ 2453] 0 2453 69575 373 81920 0 0 pve-lxc-syscall
Aug 05 16:57:12 pve1 kernel: [ 2461] 0 2461 55185 712 73728 0 0 rsyslogd
Aug 05 16:57:12 pve1 kernel: [ 2464] 0 2464 1742 364 49152 0 0 ksmtuned
Aug 05 16:57:12 pve1 kernel: [ 2467] 0 2467 2932 846 65536 0 0 smartd
Aug 05 16:57:12 pve1 kernel: [ 2469] 0 2469 1051 299 45056 0 0 qmeventd
Aug 05 16:57:12 pve1 kernel: [ 2474] 0 2474 109836 765 110592 0 0 systemd-logind
Aug 05 16:57:12 pve1 kernel: [ 2477] 0 2477 543 207 36864 0 -1000 watchdog-mux
Aug 05 16:57:12 pve1 kernel: [ 2484] 0 2484 59431 650 81920 0 0 zed
Aug 05 16:57:12 pve1 kernel: [ 2639] 0 2639 956 324 40960 0 0 lxc-monitord
Aug 05 16:57:12 pve1 kernel: [ 2660] 0 2660 22257 423 61440 0 0 apcupsd
Aug 05 16:57:12 pve1 kernel: [ 2668] 0 2668 2873 132 65536 0 0 iscsid
Aug 05 16:57:12 pve1 kernel: [ 2669] 0 2669 2999 2967 65536 0 -17 iscsid
Aug 05 16:57:12 pve1 kernel: [ 2677] 0 2677 3323 1011 69632 0 -1000 sshd
Aug 05 16:57:12 pve1 kernel: [ 2694] 101 2694 4743 569 57344 0 0 chronyd
Aug 05 16:57:12 pve1 kernel: [ 2697] 0 2697 1446 375 45056 0 0 agetty
Aug 05 16:57:12 pve1 kernel: [ 2699] 101 2699 2695 507 57344 0 0 chronyd
Aug 05 16:57:12 pve1 kernel: [ 2812] 0 2812 144849 622 155648 0 0 rrdcached
Aug 05 16:57:12 pve1 kernel: [ 2832] 0 2832 160641 10473 323584 0 0 pmxcfs
Aug 05 16:57:12 pve1 kernel: [ 2922] 0 2922 9997 599 73728 0 0 master
Aug 05 16:57:12 pve1 kernel: [ 2924] 106 2924 10073 625 69632 0 0 qmgr
Aug 05 16:57:12 pve1 kernel: [ 2929] 0 2929 1671 541 53248 0 0 cron
Aug 05 16:57:12 pve1 kernel: [ 2945] 0 2945 67726 21566 286720 0 0 pve-firewall
Aug 05 16:57:12 pve1 kernel: [ 2948] 0 2948 67430 21632 282624 0 0 pvestatd
Aug 05 16:57:12 pve1 kernel: [ 2950] 0 2950 576 141 45056 0 0 bpfilter_umh
Aug 05 16:57:12 pve1 kernel: [ 2974] 0 2974 86380 30636 405504 0 0 pvedaemon
Aug 05 16:57:12 pve1 kernel: [ 2975] 0 2975 88497 31383 421888 0 0 pvedaemon worke
Aug 05 16:57:12 pve1 kernel: [ 2976] 0 2976 86451 30763 405504 0 0 pvedaemon worke
Aug 05 16:57:12 pve1 kernel: [ 2977] 0 2977 86451 30771 405504 0 0 pvedaemon worke
Aug 05 16:57:12 pve1 kernel: [ 2987] 0 2987 82587 24519 339968 0 0 pve-ha-crm
Aug 05 16:57:12 pve1 kernel: [ 2993] 33 2993 86733 32005 417792 0 0 pveproxy
Aug 05 16:57:12 pve1 kernel: [ 2999] 33 2999 18546 13411 192512 0 0 spiceproxy
Aug 05 16:57:12 pve1 kernel: [ 3001] 0 3001 82509 24395 335872 0 0 pve-ha-lrm
Aug 05 16:57:12 pve1 kernel: [ 3275] 0 3275 4957438 4210635 34975744 0 0 kvm
Aug 05 16:57:12 pve1 kernel: [ 3390] 0 3390 755224 210355 2383872 0 0 kvm
Aug 05 16:57:12 pve1 kernel: [ 5025] 0 5025 491481 107166 1482752 0 0 kvm
Aug 05 16:57:12 pve1 kernel: [ 7494] 109 7494 1164 433 49152 0 0 rpc.statd
Aug 05 16:57:12 pve1 kernel: [ 10743] 0 10743 1444952 1059483 9605120 0 0 kvm
Aug 05 16:57:12 pve1 kernel: [ 25960] 0 25960 1455278 1059310 9605120 0 0 kvm
Aug 05 16:57:12 pve1 kernel: [ 34333] 0 34333 231238 71958 1146880 0 0 kvm
Aug 05 16:57:12 pve1 kernel: [ 48740] 0 48740 81228 24350 344064 0 0 pvescheduler
Aug 05 16:57:12 pve1 kernel: [ 215739] 33 215739 18606 12686 188416 0 0 spiceproxy work
Aug 05 16:57:12 pve1 kernel: [ 215746] 33 215746 86767 31468 397312 0 0 pveproxy worker
Aug 05 16:57:12 pve1 kernel: [ 215747] 33 215747 86767 31468 397312 0 0 pveproxy worker
Aug 05 16:57:12 pve1 kernel: [ 215748] 33 215748 86768 31468 397312 0 0 pveproxy worker
Aug 05 16:57:12 pve1 kernel: [2719564] 0 2719564 20035 364 53248 0 0 pvefw-logger
Aug 05 16:57:12 pve1 kernel: [ 397630] 106 397630 10064 649 69632 0 0 pickup
Aug 05 16:57:12 pve1 kernel: [1847748] 0 1847748 1326 124 49152 0 0 sleep
Aug 05 16:57:12 pve1 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/qemu.slice/100.scope,task=kvm,pid=3275,uid=0
Aug 05 16:57:12 pve1 kernel: Out of memory: Killed process 3275 (kvm) total-vm:19829752kB, anon-rss:16839640kB, file-rss:2896kB, shmem-rss:4kB, UID:0 pgtables:34156kB oom_score_adj:0
Aug 05 16:57:12 pve1 systemd[1]: 100.scope: A process of this unit has been killed by the OOM killer.
 
Hi Mira,

I wasn't aware of that. That explains my problem. I will set the limit (2 GB + 1 GB per TB of storage) manually.
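For reference, a minimal sketch of what I plan to put in place (the 4 GiB value simply follows that rule for my ~2 TB pool; adjust it to your own storage):

# /etc/modprobe.d/zfs.conf -- cap the ZFS ARC at 4 GiB
options zfs zfs_arc_max=4294967296

# apply immediately without a reboot
echo 4294967296 > /sys/module/zfs/parameters/zfs_arc_max
# with root on ZFS, refresh the initramfs so the limit survives a reboot
update-initramfs -u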
Maybe that would be a nice option to expose in the PVE web GUI, or maybe even a sensible default?
Thank you.

Bye,
Tom
 
Aug 05 16:57:12 pve1 kernel: Node 0 DMA32: 2639*4kB (UM) 507*8kB (UM) 740*16kB (UME) 177*32kB (UME) 85*64kB (UME) 21*128kB (ME) 201*256kB (UME) 320*512kB (M) 0*1024kB 0*2048kB 0*4096kB = 255540kB
The actual problem could be memory fragmentation; you do not have any bigger contiguous chunks of memory available.
You could also try to add a swap device to your system so that the OOM killer is not invoked. Do not create swap on ZFS; this has some drawbacks (including crashes).
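In case it helps, a minimal sketch for adding swap on a spare non-ZFS device (the device name /dev/sdc1 is only a placeholder; use whatever disk or partition you actually have free):

# format the spare partition as swap and enable it
mkswap /dev/sdc1
swapon /dev/sdc1
# make it persistent across reboots
echo '/dev/sdc1 none swap sw 0 0' >> /etc/fstab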
 