Slow mem leak VE 8.3.4

colotroy

New Member
I'm current with VE patches as of this week, at VE 8.3.4, but am still seeing a creeping memory leak. I only have 4 VMs running...
PBS 8G mem alloc
TrueNAS 32G
Jellyfin 12G
Pi-hole 4G

A few weeks back I noticed I was running out of memory with 128G installed on a Dell 730. I thought that was strange given the allocations above, but since I'm planning on restoring more VMs from an older 720 to the 730, I bought more memory anyway... I now have 256G installed, and now that I'm watching the memory more closely I see that it constantly creeps up over time.

I do see older threads about NFS leaking memory, but that was supposed to be fixed in prior releases. I'll add that I use TrueNAS and NFS mounts to PBS for backups and for Jellyfin to hold 4TB of movies and TV shows. Here are the NFS package levels...
nfs-common/stable,now 1:2.6.2-4+deb12u1 amd64 [installed]
nftables/stable,now 1.0.6-2+deb12u2 amd64 [installed]

Here's a quick script I ran in the node's CLI to monitor the memory... It shows free and available memory every hour, and you can see how the memory usage creeps up. When I had 128G installed it consumed all available memory, and the swap usage on the node dashboard showed 100% used. With only the 4 VMs I have running on the node, I don't understand what's consuming the memory.

root@pve-730-1:~# while true; do date; cat /proc/meminfo | grep -e "MemF" -e "MemA"; sleep 3600; done
Thu Mar 6 10:39:34 AM MST 2025
MemFree: 246762208 kB
MemAvailable: 245810836 kB
Thu Mar 6 11:39:34 AM MST 2025
MemFree: 240337128 kB
MemAvailable: 239426676 kB
Thu Mar 6 12:39:34 PM MST 2025
MemFree: 212397084 kB
MemAvailable: 218043784 kB
Thu Mar 6 01:39:34 PM MST 2025
MemFree: 205870140 kB
MemAvailable: 205324020 kB
Thu Mar 6 02:39:34 PM MST 2025
MemFree: 185939164 kB
MemAvailable: 185994612 kB
Thu Mar 6 03:39:34 PM MST 2025
MemFree: 185509920 kB
MemAvailable: 185573404 kB
Thu Mar 6 04:39:34 PM MST 2025
MemFree: 184175772 kB
MemAvailable: 184256556 kB
Thu Mar 6 05:39:34 PM MST 2025
MemFree: 183787924 kB
MemAvailable: 183876848 kB
Thu Mar 6 06:39:34 PM MST 2025
MemFree: 183402748 kB
MemAvailable: 183499452 kB
Thu Mar 6 07:39:34 PM MST 2025
MemFree: 183235132 kB
MemAvailable: 183339928 kB
Thu Mar 6 08:39:34 PM MST 2025
MemFree: 182770444 kB
MemAvailable: 182887932 kB
Thu Mar 6 09:39:34 PM MST 2025
MemFree: 182580160 kB
MemAvailable: 182726396 kB
Thu Mar 6 10:39:34 PM MST 2025
MemFree: 182310816 kB
MemAvailable: 182465164 kB
Thu Mar 6 11:39:34 PM MST 2025
MemFree: 182163160 kB
MemAvailable: 182326148 kB
Fri Mar 7 12:39:34 AM MST 2025
MemFree: 180761760 kB
MemAvailable: 180956572 kB
Fri Mar 7 01:39:34 AM MST 2025
MemFree: 183333672 kB
MemAvailable: 183564068 kB
Fri Mar 7 02:39:34 AM MST 2025
MemFree: 182752220 kB
MemAvailable: 183070264 kB
Fri Mar 7 03:39:34 AM MST 2025
MemFree: 182225700 kB
MemAvailable: 182566180 kB
Fri Mar 7 04:39:34 AM MST 2025
MemFree: 182056724 kB
MemAvailable: 182412128 kB
Fri Mar 7 05:39:34 AM MST 2025
MemFree: 181835572 kB
MemAvailable: 182198296 kB
Fri Mar 7 06:39:34 AM MST 2025
MemFree: 181620992 kB
MemAvailable: 181990812 kB

Here's a top sorted by memory usage "top -o %MEM"
top - 07:10:20 up 20:37, 2 users, load average: 0.24, 0.24, 0.33
Tasks: 1132 total, 1 running, 1131 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.5 us, 0.1 sy, 0.0 ni, 98.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 257824.5 total, 177267.1 free, 80191.7 used, 2182.6 buff/cache
MiB Swap: 8192.0 total, 8192.0 free, 0.0 used. 177632.7 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3757 root 20 0 17.0g 16.0g 12096 S 7.5 6.4 262:44.28 kvm
3627 root 20 0 9242888 4.7g 12096 S 0.7 1.9 164:03.49 kvm
3378 root 20 0 13.1g 2.2g 12544 S 1.0 0.9 18:10.85 kvm
3976 root 20 0 5123448 1.9g 9856 S 25.6 0.8 279:50.90 kvm
3354 www-data 20 0 254256 176960 34944 S 0.0 0.1 0:03.76 pveproxy
1175731 www-data 20 0 266936 158448 11648 S 0.0 0.1 0:06.49 pveproxy wor+
1194085 www-data 20 0 262840 153968 10752 S 0.3 0.1 0:01.94 pveproxy wor+
1195665 www-data 20 0 263156 152176 8960 S 0.0 0.1 0:02.61 pveproxy wor+
415330 root 20 0 261700 150236 8064 S 0.0 0.1 0:08.38 pvedaemon wo+
1176211 root 20 0 261696 149340 7616 S 0.0 0.1 0:01.97 pvedaemon wo+
39059 www-data 20 0 264460 148944 3584 S 0.0 0.1 0:10.79 pveproxy wor+
207664 root 20 0 261668 148892 7168 S 0.0 0.1 0:11.49 pvedaemon wo+
1202078 root 20 0 261504 145888 3584 S 0.0 0.1 0:00.03 task UPID:pv+
8344 root 20 0 261428 145788 3584 S 0.0 0.1 0:04.17 task UPID:pv+
3342 root 20 0 252872 143960 3136 S 0.0 0.1 0:01.52 pvedaemon
 
That's an idea, I am using ZFS... This was a new install a few months back with 8.3.x, I think x=2? The link says that starting with 8.1 the ARC should be capped at 10%? When the memory usage ran up to 120G, that would have been well over either 10% of memory or the 16G cap.

How do I display ZFS ARC use? I'm not seeing how to display that but I may also be blind...

ZFS uses 50 % of the host memory for the Adaptive Replacement Cache (ARC) by default. For new installations starting with Proxmox VE 8.1, the ARC usage limit will be set to 10 % of the installed physical memory, clamped to a maximum of 16 GiB.
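
While I keep looking, I'm going to check the ZFS module parameters directly; as far as I understand (my own assumption, not something from this thread), a value of 0 for zfs_arc_max means no explicit limit is set and the 50% default applies:

cat /sys/module/zfs/parameters/zfs_arc_max
cat /sys/module/zfs/parameters/zfs_arc_min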
 
I found it... The ZFS ARC stats can be found in /proc/spl/kstat/zfs/arcstats. This looks like it to me. It looks like it's consuming 52G. Hmm... That's well over the 16G cap or the 10% cap. I see there are ways to lock that down. I'm going to have to read more on that, but at this point, even though this was an 8.3.x new install, it seems to be running away with the available memory...

root@pve-730-1:~# while true; do cat /proc/spl/kstat/zfs/arcstats | grep -e "c " -e "c_min" -e "c_max"; sleep 360; done
c 4 52927675008
c_min 4 8448391808
c_max 4 135174268928
 
Maybe the 10% cap only applies if you created the ZFS pool with the installer? If you add a ZFS pool in the GUI afterwards, the 50% default could apply?
 
Thanks for the info! That's much easier... I do not have an /etc/modprobe.d/zfs.conf file.

root@pve-730-1:/etc/modprobe.d# ls -la
total 12
drwxr-xr-x 2 root root 4096 Mar 1 10:05 .
drwxr-xr-x 91 root root 4096 Mar 1 10:07 ..
-rw-r--r-- 1 root root 172 Nov 20 03:39 pve-blacklist.conf
root@pve-730-1:/etc/modprobe.d#

Here's the output of the arc commands...

root@pve-730-1:/etc# arcstat
time read ddread ddh% dmread dmh% pread ph% size c avail
07:57:28 0 0 0 0 0 0 0 49G 49G 164G
root@pve-730-1:/etc# arc_summary -s arc

------------------------------------------------------------------------
ZFS Subsystem Report Fri Mar 07 07:57:45 2025
Linux 6.8.12-8-pve 2.2.7-pve1
Machine: pve-730-1 (x86_64) 2.2.7-pve1

ARC status: HEALTHY
Memory throttle count: 0

ARC size (current): 39.1 % 49.2 GiB
Target size (adaptive): 39.2 % 49.3 GiB
Min size (hard limit): 6.2 % 7.9 GiB
Max size (high water): 16:1 125.9 GiB
Anonymous data size: < 0.1 % 124.0 KiB
Anonymous metadata size: 0.0 % 0 Bytes
MFU data target: 37.5 % 18.1 GiB
MFU data size: 2.2 % 1.1 GiB
MFU ghost data size: 0 Bytes
MFU metadata target: 12.5 % 6.0 GiB
MFU metadata size: 0.9 % 444.4 MiB
MFU ghost metadata size: 0 Bytes
MRU data target: 37.5 % 18.1 GiB
MRU data size: 96.1 % 46.4 GiB
MRU ghost data size: 0 Bytes
MRU metadata target: 12.5 % 6.0 GiB
MRU metadata size: 0.8 % 398.7 MiB
MRU ghost metadata size: 0 Bytes
Uncached data size: 0.0 % 0 Bytes
Uncached metadata size: 0.0 % 0 Bytes
Bonus size: < 0.1 % 1.0 MiB
Dnode cache target: 10.0 % 12.6 GiB
Dnode cache size: 0.1 % 7.3 MiB
Dbuf size: 0.1 % 31.0 MiB
Header size: 1.8 % 916.9 MiB
L2 header size: 0.0 % 0 Bytes
ABD chunk waste size: < 0.1 % 25.5 KiB

ARC hash breakdown:
Elements max: 4.0M
Elements current: 100.0 % 4.0M
Collisions: 412.0k
Chain max: 4
Chains: 219.9k

ARC misc:
Deleted: 75
Mutex misses: 0
Eviction skips: 9
Eviction skips due to L2 writes: 0
L2 cached evictions: 0 Bytes
L2 eligible evictions: 556.5 KiB
L2 eligible MFU evictions: 28.8 % 160.0 KiB
L2 eligible MRU evictions: 71.2 % 396.5 KiB
L2 ineligible evictions: 8.0 KiB

root@pve-730-1:/

Here's my ZFS list...
root@pve-730-1:/etc# zfs list
NAME USED AVAIL REFER MOUNTPOINT
ProxVMs 6.31T 848G 96K /ProxVMs
ProxVMs/vm-100-disk-0 508G 1.31T 10.4G -
ProxVMs/vm-101-disk-0 508G 1.32T 3.91G -
ProxVMs/vm-102-disk-0 406G 1.22T 9.97G -
ProxVMs/vm-103-disk-0 3M 848G 124K -
ProxVMs/vm-103-disk-1 1016G 1.69T 137G -
ProxVMs/vm-103-disk-2 1016G 1.64T 188G -
ProxVMs/vm-103-disk-3 1016G 1.82T 16.7M -
ProxVMs/vm-103-disk-4 6M 848G 72K -
ProxVMs/vm-104-disk-0 3M 848G 128K -
ProxVMs/vm-104-disk-1 1016G 1.80T 22.9G -
ProxVMs/vm-104-disk-2 6M 848G 72K -
ProxVMs/vm-201-disk-0 305G 1.10T 24.7G -
ProxVMs/vm-202-disk-0 305G 1.12T 2.11G -
ProxVMs/vm-202-disk-1 102G 923G 26.3G -
ProxVMs/vm-203-disk-0 3M 848G 56K -
ProxVMs/vm-203-disk-1 32.5G 866G 14.3G -
ProxVMs/vm-204-disk-0 32.5G 862G 18.3G -
ProxVMs/vm-303-disk-0 203G 1013G 37.9G -
TrueNAS2 28.8T 192G 192K /TrueNAS2
TrueNAS2/vm-101-disk-0 28.8T 10.1T 18.9T -
 
I'm creating an /etc/modprobe.d/zfs.conf file with the following contents to limit the ZFS ARC to 16G. That should be plenty!
options zfs zfs_arc_max=17179869184

I know this will take a reboot of the node.

Is there anything else I need to do here prior to the reboot? Do I have the zfs.conf file contents correct?

Thanks for the help, I wouldn't have found this for a lonnnngggg time without the help!
 
Creating the /etc/modprobe.d/zfs.conf file solved this. I think this should be created by default. ZFS running away with all of your available memory seems like a bug to me. I get that it's configurable once you find this, but why let that happen by default? 16G seems like plenty of space for the ZFS cache to me.
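
For anyone finding this later, here's the full sequence as I understand it; the live write to /sys and the initramfs rebuild are my own additions from reading the docs, not something confirmed in this thread, so treat this as a sketch:

# /etc/modprobe.d/zfs.conf -- cap the ARC at 16 GiB (16 * 1024^3 = 17179869184 bytes)
options zfs zfs_arc_max=17179869184

# optionally apply the new limit right away instead of waiting for the reboot
echo 17179869184 > /sys/module/zfs/parameters/zfs_arc_max

# only needed if the root filesystem itself is ZFS: rebuild the initramfs so the
# limit is also applied at early boot
update-initramfs -u -k all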

Anyway, thanks for the help with this!
 
I think this should be created by default. ZFS running away with all of your available memory seems like a bug to me.
This is actually done if you install with ZFS-on-root. Also, it's documented very extensively in our admin guide: Limit ZFS ARC memory usage :)

It's not a one-size-fits-all anyway, and most system administrators may want to tweak it, so that's also why it is documented so clearly.
 
@cheiss is the ARC cache global, or can it be set for each ZFS pool independently?

I assume on a fresh PVE 8.3 ZFS-on-root install it is set to 10%, but if you install on ext4 and add a ZFS pool post-install, will it be set to 10% too?
 
@cheiss is the ARC cache global, or can it be set for each ZFS pool independently?
Global.

but if you install on ext4 and add a ZFS pool post-install, will it be set to 10% too?
No, currently there is no automatism for this in place. But you can create a ticket over at https://bugzilla.proxmox.com/ if you definitely want to see this and keep track of its status.
It could make sense to always write /etc/modprobe.d/zfs.conf, since it doesn't do any harm, I think.
Potentially a UI option could also be provided to tweak it directly without having to manually edit that file. But the latter will need some more thought.

But again, the clamped 10% we set in the installer is a rough calculation anyway and will not fit every setup.
If you're using ZFS in production, there is always explicit reasoning to be had about it, be it ARC size, RAID type, SLOG, L2ARC, per-pool options, etc.
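
For example (purely illustrative, using one of the datasets from the listing above): while the ARC size itself is global, caching behaviour can still be tuned per dataset via properties like primarycache:

zfs set primarycache=metadata TrueNAS2/vm-101-disk-0
zfs get primarycache,secondarycache TrueNAS2/vm-101-disk-0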
 