Proxmox High RAM Load after upgrade to Proxmox

Lukas Räpple

Good morning.
I have a really urgent problem with Proxmox pve-manager/6.1-5/9bf06119 (running kernel: 5.3.13-1-pve).
I upgraded from Proxmox 5.x to Proxmox 6.1-5 13 days ago.
Since then, RAM usage has been climbing on all Proxmox nodes. The increase is linear, so this is not a matter of temporary RAM peaks.
I think the machines will go down after tomorrow.
free -h shows 50Gi used out of 62Gi. When I try to track down the memory-consuming process with top, the process using the most memory only accounts for 0.2% of MEM. I think it has something to do with ZFS, but I'm not sure. I didn't find anything unusual in syslog, by the way.
arc_summary2 produces the following output:
------------------------------------------------------------------------
ZFS Subsystem Report Mon Jan 27 09:01:34 2020
ARC Summary: (HEALTHY)
Memory Throttle Count: 0

ARC Misc:
Deleted: 34.76M
Mutex Misses: 765
Evict Skips: 30.60M

ARC Size: 6.71% 2.10 GiB
Target Size: (Adaptive) 6.88% 2.16 GiB
Min Size (Hard Limit): 6.25% 1.96 GiB
Max Size (High Water): 16:1 31.35 GiB

ARC Size Breakdown:
Recently Used Cache Size: 7.15% 143.21 MiB
Frequently Used Cache Size: 92.85% 1.82 GiB
Metadata Size (Hard Limit): 75.00% 23.51 GiB
Metadata Size: 1.53% 368.40 MiB
Dnode Size (Hard Limit): 10.00% 2.35 GiB
Dnode Size: 2.79% 67.14 MiB

ARC Hash Breakdown:
Elements Max: 2.31M
Elements Current: 5.06% 116.84k
Collisions: 7.93M
Chain Max: 6
Chains: 820

ARC Total accesses: 5.09G
Cache Hit Ratio: 99.29% 5.06G
Cache Miss Ratio: 0.71% 35.97M
Actual Hit Ratio: 99.10% 5.05G

Data Demand Efficiency: 99.60% 4.66G
Data Prefetch Efficiency: 5.24% 14.67M

CACHE HITS BY CACHE LIST:
Anonymously Used: 0.17% 8.76M
Most Recently Used: 56.12% 2.84G
Most Frequently Used: 43.68% 2.21G
Most Recently Used Ghost: 0.02% 1.02M
Most Frequently Used Ghost: 0.00% 84.30k

CACHE HITS BY DATA TYPE:
Demand Data: 91.86% 4.65G
Prefetch Data: 0.02% 768.64k
Demand Metadata: 7.92% 400.58M
Prefetch Metadata: 0.21% 10.50M

CACHE MISSES BY DATA TYPE:
Demand Data: 52.48% 18.88M
Prefetch Data: 38.66% 13.91M
Demand Metadata: 6.12% 2.20M
Prefetch Metadata: 2.75% 988.45k


DMU Prefetch Efficiency: 2.28G
Hit Ratio: 0.86% 19.62M
Miss Ratio: 99.14% 2.26G
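
A quick way to test the ZFS suspicion is to compare the current ARC size with the memory reported as used. The following is a minimal sketch, assuming the kernel exposes ARC counters in /proc/spl/kstat/zfs/arcstats as rows of "name type data" with a "size" row in bytes; the function names are made up for illustration.

```python
GIB = 1024 ** 3


def parse_arcstats(text):
    """Return a dict of counter name -> integer value from arcstats-style text."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows look like "<name> <type> <value>"; header rows are skipped.
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def read_arcstats(path="/proc/spl/kstat/zfs/arcstats"):
    """Read and parse the live counters (only works on a host with ZFS loaded)."""
    with open(path) as handle:
        return parse_arcstats(handle.read())


def arc_share(arc_bytes, used_bytes):
    """Percentage of used memory that is accounted for by the ARC."""
    return 100.0 * arc_bytes / used_bytes


# Figures from this post: ARC size 2.10 GiB, 50 GiB reported as used by free -h.
print(f"ARC share of used RAM: {arc_share(2.10 * GIB, 50 * GIB):.1f}%")
```

With the numbers above, the ARC accounts for only a few percent of the used memory, so the growth has to come from somewhere else.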

I hope somebody can help me; it's really urgent. By the way, the platform is used to host LXC containers only.
Any suggestions on where to look?
 
OK, I solved the problem. I'm using a monitoring tool called check_mk. Apparently, after the upgrade to the new kernel, check_mk stopped killing its old child processes, so these processes piled up and ate all the RAM. It turned out to be a check_mk bug.
I solved it by killing these processes via:
systemctl stop system-check_mk.slice (the slice holding the check_mk units; the exact name may differ on your system)
I also changed a parameter in the service's systemd unit file.
If anybody has the same problem, feel free to message me. Problem solved.
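
For anyone chasing the same symptom: top sorts individual processes, so thousands of small leftover children never show up near the top of the list. Grouping by command name makes them visible. This is a rough sketch, assuming a procps-style ps that accepts "-e -o rss= -o comm="; the function names and the check_mk_agent process name in the sample are illustrative, not taken from the affected hosts.

```python
import subprocess
from collections import defaultdict


def summarize_ps(text):
    """Aggregate 'ps -e -o rss= -o comm=' output by command name.

    Returns (name, process_count, total_rss_kib) tuples, largest total first.
    """
    totals = defaultdict(lambda: [0, 0])
    for line in text.splitlines():
        parts = line.split(None, 1)
        # Expect "<rss in KiB> <command name>"; ignore anything else.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        name = parts[1].strip()
        totals[name][0] += 1
        totals[name][1] += int(parts[0])
    rows = [(name, count, rss) for name, (count, rss) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


def snapshot():
    """Run ps on the local host and summarize its output."""
    result = subprocess.run(
        ["ps", "-e", "-o", "rss=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return summarize_ps(result.stdout)


# Illustrative sample; on a real node call snapshot() instead.
sample = """\
  1200 check_mk_agent
  1300 check_mk_agent
 90000 pveproxy
   800 check_mk_agent
"""
for name, count, rss_kib in summarize_ps(sample):
    print(f"{name:<20} {count:>6} procs {rss_kib / 1024:>10.1f} MiB")
```

A command name with an unusually high process count is the thing to look for. Note that RSS counts shared pages once per process, so the totals overestimate real usage somewhat.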
 
