Windows VM stopped - Out Of Memory

supermicro_server

Well-Known Member
Sep 13, 2017
Hi,

On my PVE 7.2-4 host I noticed very high RAM usage, and VMs have been stopped at random.
I had already limited the ZFS RAM usage, and for several months everything went fine.

I ran the following dmesg command to figure out what happened. The output is below:

Code:
root@pve:~# dmesg -T | grep Out
[Wed Aug  3 11:41:25 2022] Out of memory: Killed process 3792306 (kvm) total-vm:17564080kB, anon-rss:16806076kB, file-rss:1952kB, shmem-rss:4kB, UID:0 pgtables:33384kB oom_score_adj:0
[Wed Aug  3 11:43:19 2022] Out of memory: Killed process 511795 (kvm) total-vm:17565112kB, anon-rss:14217436kB, file-rss:1736kB, shmem-rss:4kB, UID:0 pgtables:28400kB oom_score_adj:0
[Wed Aug  3 11:45:31 2022] Out of memory: Killed process 641427 (kvm) total-vm:17568200kB, anon-rss:14811900kB, file-rss:1716kB, shmem-rss:4kB, UID:0 pgtables:29552kB oom_score_adj:0
[Wed Aug  3 11:48:00 2022] Out of memory: Killed process 770982 (kvm) total-vm:17564088kB, anon-rss:15359420kB, file-rss:2000kB, shmem-rss:4kB, UID:0 pgtables:30572kB oom_score_adj:0
[Wed Aug  3 11:50:25 2022] Out of memory: Killed process 920082 (kvm) total-vm:17565116kB, anon-rss:15344180kB, file-rss:2012kB, shmem-rss:4kB, UID:0 pgtables:30576kB oom_score_adj:0


[screenshot: 1659527549659.png]

htop shows me:

[screenshot: 1659527611135.png]

My customer is stuck because of this..

Thank you
 

you likely overcommitted your available memory, and somebody started using it -> you don't have enough.
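one way to sanity-check this (a rough sketch; `qm` and `free` are standard on a PVE host, the loop is only illustrative) is to sum the configured memory of all VMs and compare it with what the host actually has:

Code:
root@pve:~# free -h
root@pve:~# for id in $(qm list | awk 'NR>1 {print $1}'); do qm config $id | grep '^memory'; done

keep in mind each kvm process also needs some overhead on top of the configured guest RAM.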
 
you likely overcommitted your available memory, and somebody started using it -> you don't have enough.
Thank you fabian,

the very strange thing is that I have 5 VMs, and the total RAM usage should be about 8GB + 12GB + 12GB + 16GB + 8GB, plus ZFS at 16GB (max) = 72GB of RAM in use.
On the PVE web page I see about 95% RAM usage!
It's the first time I've seen something like this...

Thanks
 
the very strange thing is that I have 5 VMs, and the total RAM usage should be about 8GB + 12GB + 12GB + 16GB + 8GB, plus ZFS at 16GB (max) = 72GB of RAM in use.
Did you set both the min and the max for ZFS? The default min for your system is about 31GB, and if you only set the max to 16GB, ZFS will ignore it because it is less than the min.
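If not, both can be pinned explicitly. A sketch of the usual approach on PVE (the byte values here are just examples for an 8GiB min / 16GiB max):

Code:
# /etc/modprobe.d/zfs.conf
options zfs zfs_arc_min=8589934592
options zfs zfs_arc_max=17179869184

Then run update-initramfs -u and reboot, or write the same values to /sys/module/zfs/parameters/zfs_arc_min and /sys/module/zfs/parameters/zfs_arc_max to apply them at runtime.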
 
Did you set both the min and the max for ZFS? The default min for your system is about 31GB, and if you only set the max to 16GB, ZFS will ignore it because it is less than the min.
How did you get about 31GB for the ZFS min?
Below is my current arc_summary output:

Code:
ZFS Subsystem Report                            Wed Aug 03 16:40:02 2022
Linux 5.15.35-2-pve                                           2.1.4-pve1
Machine: pve (x86_64)                                         2.1.4-pve1

ARC status:                                                    THROTTLED
        Memory throttle count:                                      2650

ARC size (current):                                    52.0 %    8.3 GiB
        Target size (adaptive):                        52.3 %    8.4 GiB
        Min size (hard limit):                         24.3 %    3.9 GiB
        Max size (high water):                            4:1   16.0 GiB
        Most Frequently Used (MFU) cache size:          1.3 %  107.3 MiB
        Most Recently Used (MRU) cache size:           98.7 %    8.0 GiB
        Metadata cache size (hard limit):              75.0 %   12.0 GiB
        Metadata cache size (current):                  3.5 %  431.7 MiB
        Dnode cache size (hard limit):                 10.0 %    1.2 GiB
        Dnode cache size (current):                     1.3 %   15.7 MiB

ARC hash breakdown:
        Elements max:                                               3.5M
        Elements current:                              21.3 %     743.0k
        Collisions:                                               524.8M
        Chain max:                                                     6
        Chains:                                                    16.0k

ARC misc:
        Deleted:                                                    4.5G
        Mutex misses:                                               1.6M
        Eviction skips:                                             7.7M
        Eviction skips due to L2 writes:                               0
        L2 cached evictions:                                     0 Bytes
        L2 eligible evictions:                                  50.5 TiB
        L2 eligible MFU evictions:                     11.2 %    5.7 TiB
        L2 eligible MRU evictions:                     88.8 %   44.9 TiB
        L2 ineligible evictions:                                 1.7 TiB

ARC total accesses (hits + misses):                                11.6G
        Cache hit ratio:                               68.2 %       7.9G
        Cache miss ratio:                              31.8 %       3.7G
        Actual hit ratio (MFU + MRU hits):             68.1 %       7.9G
        Data demand efficiency:                        39.6 %       4.9G
        Data prefetch efficiency:                       2.0 %     692.3M

Cache hits by cache type:
        Most frequently used (MFU):                    75.2 %       5.9G
        Most recently used (MRU):                      24.6 %       1.9G
        Most frequently used (MFU) ghost:               0.2 %      12.3M
        Most recently used (MRU) ghost:                 0.2 %      15.4M

Cache hits by data type:
        Demand data:                                   24.8 %       2.0G
        Demand prefetch data:                           0.2 %      13.5M
        Demand metadata:                               75.0 %       5.9G
        Demand prefetch metadata:                     < 0.1 %       2.4M

Cache misses by data type:
        Demand data:                                   81.1 %       3.0G
        Demand prefetch data:                          18.5 %     678.8M
        Demand metadata:                                0.3 %      12.5M
        Demand prefetch metadata:                       0.1 %       2.5M

DMU prefetch efficiency:                                            1.7G
        Hit ratio:                                      3.1 %      54.7M
        Miss ratio:                                    96.9 %       1.7G

L2ARC not detected, skipping section
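For reference, the limits the module is actually running with can also be read straight from sysfs (standard OpenZFS paths; a value of 0 means the built-in default is used):

Code:
root@pve:~# cat /sys/module/zfs/parameters/zfs_arc_min
root@pve:~# cat /sys/module/zfs/parameters/zfs_arc_max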
 
the OOM killer prints a detailed summary of the memory situation at the time of each kill. the first kill is probably the most interesting, although the series of kills shortly after each other suggests that something caused a rapid increase in memory usage (or HA restarting the killed VM and running into the limit again?)..
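to pull the full report for the first kill, something like this should work (assuming the usual kernel wording, "invoked oom-killer", precedes the summary):

Code:
root@pve:~# dmesg -T | grep -A 40 'invoked oom-killer'
root@pve:~# journalctl -k --since "2022-08-03 11:40" --until "2022-08-03 11:45"

either variant should include the Mem-Info dump and the per-process table the kernel prints alongside the kill.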
 
