Code:
The server has 512GB Ram.
It has 5 running vms with following allocated memory:
1:2GB
2:4GB
3:32GB
4:64GB
5:16GB
Please explain how this server can run OOM when it clearly has sufficient memory to run the meager load it is running.
Even with 2 ZFS pools with default value it will never fill ram. The below graph looks the same for average/max for aprox. 3 months. No deviation as it only runs these 5 vms.
[3134156.023918] libceph: osd6 (1)172.16.0.3:6821 socket closed (con state OPEN)
[3157331.811755] dmesg invoked oom-killer: gfp_mask=0x40cc0(GFP_KERNEL|__GFP_COMP), order=3, oom_score_adj=0
[3157331.812316] CPU: 7 PID: 281164 Comm: dmesg Tainted: P O 5.11.22-5-pve #1
[3157331.812741] Hardware name: Supermicro Super Server/H12SSL-NT, BIOS 2.0 02/22/2021
[3157331.813150] Call Trace:
[3157331.813553] dump_stack+0x70/0x8b
[3157331.813954] dump_header+0x4f/0x1f6
[3157331.814346] oom_kill_process.cold+0xb/0x10
[3157331.814729] out_of_memory+0x1cf/0x520
[3157331.815117] __alloc_pages_slowpath.constprop.0+0xc6d/0xd60
[3157331.815518] __alloc_pages_nodemask+0x2e0/0x310
[3157331.815926] alloc_pages_current+0x87/0x110
[3157331.816336] kmalloc_order+0x2d/0xa0
[3157331.816748] kmalloc_order_trace+0x1d/0x80
[3157331.817161] devkmsg_open+0x4f/0x130
[3157331.817577] memory_open+0x48/0x60
[3157331.817996] chrdev_open+0xf7/0x220
[3157331.818415] ? cdev_device_add+0x90/0x90
[3157331.818838] do_dentry_open+0x156/0x370
[3157331.819262] vfs_open+0x2d/0x30
[3157331.819679] path_openat+0xa79/0x10d0
[3157331.820099] ? unlock_page_memcg+0x38/0x70
[3157331.820527] ? page_add_file_rmap+0xd3/0x140
[3157331.820961] do_filp_open+0x8c/0x130
[3157331.821387] ? __check_object_size+0x13f/0x150
[3157331.821826] do_sys_openat2+0x9b/0x150
[3157331.822256] __x64_sys_openat+0x56/0x90
[3157331.822679] do_syscall_64+0x38/0x90
[3157331.823097] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[3157331.823510] RIP: 0033:0x7f82bdea6be7
[3157331.823915] Code: 25 00 00 41 00 3d 00 00 41 00 74 47 64 8b 04 25 18 00 00 00 85 c0 75 6b 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 95 00 00 00 48 8b 4c 24 28 64 48 2b 0c 25
[3157331.824809] RSP: 002b:00007ffd2fe5dc40 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
[3157331.825262] RAX: ffffffffffffffda RBX: 000055e10e774660 RCX: 00007f82bdea6be7
[3157331.825725] RDX: 0000000000000800 RSI: 000055e10e7743d7 RDI: 00000000ffffff9c
[3157331.826194] RBP: 000055e10e7743d7 R08: 0000000000090800 R09: 000055e10e77578f
[3157331.826672] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000800
[3157331.827154] R13: 000055e10e77a020 R14: 0000000000000001 R15: 0000000000000000
[3157331.827657] Mem-Info:
[3157331.828159] active_anon:30278860 inactive_anon:600647 isolated_anon:0
active_file:1732260 inactive_file:30438149 isolated_file:0
unevictable:93652 dirty:1 writeback:123
slab_reclaimable:190270 slab_unreclaimable:1711323
mapped:31102 shmem:16677 pagetables:67058 bounce:0
free:831798 free_pcp:601 free_cma:0
[3157331.831279] Node 0 active_anon:121115440kB inactive_anon:2402588kB active_file:6929040kB inactive_file:121754108kB unevictable:374608kB isolated(anon):0kB isolated(file):0kB mapped:124408kB dirty:4kB writeback:492kB shmem:66708kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 91336704kB writeback_tmp:0kB kernel_stack:12416kB pagetables:268232kB all_unreclaimable? no
[3157331.833124] Node 0 DMA free:11796kB min:0kB low:12kB high:24kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15996kB managed:15892kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[3157331.834526] lowmem_reserve[]: 0 2554 515583 515583 515583
[3157331.835308] Node 0 DMA32 free:2052348kB min:452kB low:3064kB high:5676kB reserved_highatomic:0KB active_anon:604160kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:2742768kB managed:2672072kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[3157331.836964] lowmem_reserve[]: 0 0 513029 513029 513029
[3157331.837767] Node 0 Normal free:1260196kB min:91448kB low:616788kB high:1142128kB reserved_highatomic:2048KB active_anon:120511280kB inactive_anon:2402588kB active_file:6929040kB inactive_file:121756520kB unevictable:374608kB writepending:496kB present:533972992kB managed:525349464kB mlocked:374608kB bounce:0kB free_pcp:2608kB local_pcp:0kB free_cma:0kB
[3157331.839477] lowmem_reserve[]: 0 0 0 0 0
[3157331.840357] Node 0 DMA: 1*4kB (U) 0*8kB 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 2*4096kB (M) = 11796kB
[3157331.841268] Node 0 DMA32: 125*4kB (UM) 481*8kB (UM) 462*16kB (UME) 417*32kB (UME) 392*64kB (UME) 396*128kB (UME) 337*256kB (UME) 243*512kB (UME) 136*1024kB (UM) 62*2048kB (UME) 360*4096kB (UME) = 2052348kB
[3157331.843064] Node 0 Normal: 9636*4kB (UE) 74662*8kB (UE) 38817*16kB (UE) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1256912kB
[3157331.844015] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[3157331.844962] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[3157331.845911] 32193152 total pagecache pages
[3157331.846890] 0 pages in swap cache
[3157331.847832] Swap cache stats: add 0, delete 0, find 0/0
[3157331.848774] Free swap = 0kB
[3157331.849712] Total swap = 0kB
[3157331.850646] 134182939 pages RAM
[3157331.851592] 0 pages HighMem/MovableOnly
[3157331.852559] 2173582 pages reserved
[3157331.853524] 0 pages hwpoisoned
[3157331.854495] Tasks state (memory values in pages):
[3157331.855447] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[3157331.856425] [ 1200] 0 1200 88504 4764 724992 0 -250 systemd-journal
[3157331.857410] [ 1315] 0 1315 5732 1014 65536 0 -1000 systemd-udevd
[3157331.858420] [ 2530] 103 2530 1960 575 49152 0 0 rpcbind
[3157331.859445] [ 2532] 0 2532 4547 1801 73728 0 0 ceph-crash
[3157331.860425] [ 2561] 102 2561 2056 703 57344 0 -900 dbus-daemon
[3157331.861405] [ 2564] 0 2564 93128 340 86016 0 0 lxcfs
[3157331.862392] [ 2565] 0 2565 272369 512 188416 0 0 pve-lxc-syscall
[3157331.863403] [ 2571] 0 2571 55185 880 73728 0 0 rsyslogd
[3157331.864388] [ 2573] 0 2573 3013 1155 65536 0 0 smartd
[3157331.865370] [ 2574] 0 2574 1742 452 49152 0 0 ksmtuned
[3157331.866350] [ 2577] 0 2577 5538 1326 65536 0 0 systemd-logind
[3157331.867335] [ 2580] 0 2580 543 179 36864 0 -1000 watchdog-mux
[3157331.868295] [ 2585] 0 2585 40993 1059 77824 0 0 zed
[3157331.869238] [ 2588] 0 2588 1051 366 49152 0 0 qmeventd
[3157331.870192] [ 3268] 0 3268 1887 591 57344 0 0 lxc-monitord
[3157331.871124] [ 3285] 0 3285 2873 132 57344 0 0 iscsid
[3157331.872047] [ 3286] 0 3286 2999 2928 61440 0 -17 iscsid
[3157331.872986] [ 3288] 0 3288 3323 1227 65536 0 -1000 sshd
[3157331.873894] [ 3292] 101 3292 4743 790 61440 0 0 chronyd
[3157331.874813] [ 3294] 101 3294 2695 679 61440 0 0 chronyd
[3157331.875719] [ 3309] 0 3309 576 128 40960 0 0 bpfilter_umh
[3157331.876615] [ 3321] 0 3321 1446 408 49152 0 0 agetty
[3157331.877508] [ 3395] 0 3395 181748 1133 176128 0 0 rrdcached
[3157331.878396] [ 3439] 0 3439 208388 15767 462848 0 0 pmxcfs
[3157331.879293] [ 3640] 0 3640 9996 973 77824 0 0 master
[3157331.880189] [ 3680] 0 3680 195241 96904 839680 0 0 corosync
[3157331.881097] [ 3684] 0 3684 1671 621 53248 0 0 cron
[3157331.881960] [ 3822] 0 3822 66421 21625 282624 0 0 pve-firewall
[3157331.882805] [ 3824] 0 3824 67265 23457 299008 0 0 pvestatd
[3157331.883607] [ 3847] 0 3847 87541 30204 405504 0 0 pvedaemon
[3157331.884378] [ 3856] 0 3856 81060 24089 331776 0 0 pve-ha-crm
[3157331.885121] [ 4083] 33 4083 87862 35175 430080 0 0 pveproxy
[3157331.885861] [ 4090] 33 4090 17301 14318 176128 0 0 spiceproxy
[3157331.886588] [ 4092] 0 4092 80982 24125 344064 0 0 pve-ha-lrm
[3157331.887276] [ 967092] 106 967092 10076 1264 69632 0 0 qmgr
[3157331.887977] [1366438] 0 1366438 1797451 770871 10240000 0 0 kvm
[3157331.888619] [1369243] 0 1369243 1034979 334976 5808128 0 0 kvm
[3157331.889243] [1703229] 0 1703229 90788 31848 425984 0 0 task UPID:pve02
[3157331.889834] [1703230] 0 1703230 22519 1284 77824 0 0 termproxy
[3157331.890406] [1703234] 0 1703234 1774 950 57344 0 0 login
[3157331.890960] [1703349] 0 1703349 4105 2192 69632 0 0 systemd
[3157331.891480] [1703350] 0 1703350 42401 1187 102400 0 0 (sd-pam)
[3157331.891986] [1703391] 0 1703391 1993 908 57344 0 0 bash
[3157331.892489] [ 641829] 0 641829 9798630 8399755 69095424 0 0 kvm
[3157331.892978] [ 662789] 33 662789 93621 34356 446464 0 0 pveproxy worker
[3157331.893463] [1095456] 0 1095456 5583853 4218138 36122624 0 0 kvm
[3157331.893952] [ 740766] 0 740766 18414525 16974551 143527936 0 0 kvm
[3157331.894441] [2536871] 0 2536871 89664 31535 421888 0 0 pvedaemon worke
[3157331.894947] [2871790] 0 2871790 89771 31617 425984 0 0 pvedaemon worke
[3157331.895445] [2985227] 0 2985227 89723 31532 425984 0 0 pvedaemon worke
[3157331.895941] [1668449] 33 1668449 17331 12929 176128 0 0 spiceproxy work
[3157331.896441] [1668451] 0 1668451 20035 435 57344 0 0 pvefw-logger
[3157331.896941] [1668455] 33 1668455 87895 31323 409600 0 0 pveproxy worker
[3157331.897446] [1668456] 33 1668456 87895 31323 409600 0 0 pveproxy worker
[3157331.897944] [1668468] 33 1668468 87895 31323 409600 0 0 pveproxy worker
[3157331.898425] [3910659] 106 3910659 10064 1349 77824 0 0 pickup
[3157331.898912] [ 278026] 0 278026 1326 129 49152 0 0 sleep
[3157331.899391] [ 281154] 0 281154 3614 2056 65536 0 0 sshd
[3157331.899863] [ 281161] 0 281161 1993 1053 53248 0 0 bash
[3157331.900331] [ 281164] 0 281164 1385 201 45056 0 0 dmesg
[3157331.900826] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=user.slice,mems_allowed=0,global_oom,task_memcg=/qemu.slice/201.scope,task=kvm,pid=740766,uid=0
[3157331.901376] Out of memory: Killed process 740766 (kvm) total-vm:73658100kB, anon-rss:67886652kB, file-rss:11552kB, shmem-rss:0kB, UID:0 pgtables:140164kB oom_score_adj:0
[3157333.887835] oom_reaper: reaped process 740766 (kvm), now anon-rss:0kB, file-rss:68kB, shmem-rss:0kB
[3157333.983076] fwbr201i0: port 2(tap201i0) entered disabled state
[3157334.011347] fwbr201i0: port 2(tap201i0) entered disabled state
[3157334.538217] fwbr201i0: port 1(fwln201i0) entered disabled state
[3157334.541070] vmbr0: port 4(fwpr201p0) entered disabled state
[3157334.566469] device fwln201i0 left promiscuous mode
[3157334.567593] fwbr201i0: port 1(fwln201i0) entered disabled state
[3157334.622859] device fwpr201p0 left promiscuous mode
[3157334.624005] vmbr0: port 4(fwpr201p0) entered disabled state
[3157420.988253] device tap201i0 entered promiscuous mode
[3157421.042276] fwbr201i0: port 1(fwln201i0) entered blocking state
[3157421.043120] fwbr201i0: port 1(fwln201i0) entered disabled state
[3157421.044837] device fwln201i0 entered promiscuous mode
[3157421.045848] fwbr201i0: port 1(fwln201i0) entered blocking state
[3157421.046534] fwbr201i0: port 1(fwln201i0) entered forwarding state
[3157421.054058] vmbr0: port 4(fwpr201p0) entered blocking state
[3157421.055085] vmbr0: port 4(fwpr201p0) entered disabled state
[3157421.056020] device fwpr201p0 entered promiscuous mode
[3157421.056860] vmbr0: port 4(fwpr201p0) entered blocking state
[3157421.057508] vmbr0: port 4(fwpr201p0) entered forwarding state