How do I prevent OOMKill errors in Proxmox 9.0.11 caused by ZFS 2.3.4?

zaphod80013

New Member
Dec 11, 2023
I've just installed Proxmox 9 (updated to 9.0.11) and am seeing OOMKill errors while trying to bulk-import existing data onto the new machine. The root cause appears to be the ZFS ARC cache growing to take almost all the RAM, regardless of the zfs_arc_max value set by the Proxmox installer (options zfs zfs_arc_max=13481541632). The system has 128 GiB of RAM.

From what I've read, in ZFS 2.3.4 zfs_arc_max is ignored in favor of some heuristic in the code, so I'm not sure why the installer sets it in the first place.
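For reference, this is how I've been watching what the module is actually using versus what the ARC grows to (the runtime override at the end is just an example value of 8 GiB):
Code:
# limit the module is actually using (bytes)
cat /sys/module/zfs/parameters/zfs_arc_max
# current ARC size versus its target/min/max
awk '$1=="size" || $1=="c" || $1=="c_min" || $1=="c_max" {print $1, $3}' /proc/spl/kstat/zfs/arcstats
# the limit can also be changed at runtime without a reboot, e.g. to 8 GiB:
# echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max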

The system disk is ZFS RAID1 on 2 x 250 GB NVMe. The data drives are a 25 TiB pool (4 x 12.7 TiB RAIDZ2 HDD) and a separate 25 TiB pool (4 x 12.7 TiB RAIDZ2 HDD with a 2 x 2 TB NVMe RAID1 special vdev).
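For reference, each data pool is laid out along these lines (pool names and device paths here are placeholders, not my actual ones):
Code:
# 4-disk RAIDZ2 data pool
zpool create -o ashift=12 tank1 raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd
# second 4-disk RAIDZ2 pool with a mirrored NVMe special vdev
zpool create -o ashift=12 tank2 raidz2 /dev/sde /dev/sdf /dev/sdg /dev/sdh \
    special mirror /dev/nvme2n1 /dev/nvme3n1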
 
Change:
Code:
# /etc/modprobe.d/zfs.conf
# example
# Set Min ARC size
# 512 MB
options zfs zfs_arc_min=536870912

# Set Max ARC Size
# 2 GB
#options zfs zfs_arc_max=2147483648
# 3 GB
#options zfs zfs_arc_max=3221225472
# 4 GB
#options zfs zfs_arc_max=4294967296
# 8 GB
options zfs zfs_arc_max=8589934592

# update-initramfs -u -k all
The max size may follow the rule of thumb that N TB of raw ZFS data corresponds to roughly N GB of maximum ARC cache size.
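After editing the file, the new limits only take effect once the initramfs is rebuilt and the host rebooted; a quick way to confirm they are active afterwards (nothing here is version-specific):
Code:
update-initramfs -u -k all
# then reboot, and afterwards check the live ARC limits
awk '$1=="c_min" || $1=="c_max" {print $1, $3}' /proc/spl/kstat/zfs/arcstats
arc_summary -s arc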
 
Did you also set the min value (in case it is larger than the max value set) and run update-initramfs -u?
No, I left the default zfs.conf file created by the Proxmox installer. I'd previously tried an 8 GB min and 32 GB max; overall that seemed worse than the default file, but it's hard to tell as my web/SSH sessions were being OOMKilled.
 
Thanks for the feedback. From what I've read, ZFS 2.3.4 ignores the max/min values (is that correct?). I'd tried various values, up to 80 GB for the max and a fixed 8 GB min cache size; the results were either no better than, or worse than, just using the modprobe zfs.conf file created by the Proxmox installer.

I managed to complete the bulk data import using "rsync --partial", multiple reruns, and a final diff -rq across each dataset. I was importing about 18 TiB, mostly borg backup data and the contents of my old NAS, which probably include 2-3 terabytes of deadwood and duplication I need to deal with.
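Roughly what each import pass looked like, in case it's useful (the paths are illustrative and the rsync flags beyond --partial are just the ones I normally use):
Code:
# repeat until rsync reports nothing left to transfer
rsync -aH --partial --info=progress2 /mnt/oldnas/backups/ /tank1/backups/
# final verification pass per dataset
diff -rq /mnt/oldnas/backups/ /tank1/backups/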

If what I've read about how the ZFS ARC cache works in 2.3.4 is correct, then I think that is a poor design decision by the OpenZFS developers (there is an arrogance in taking control away from the user that I don't like), but with the import complete I think I'm past the worst of the issue. Again, thanks for the help.
 
No. Please share the OOM message(s) and the output of top -co %MEM, arc_summary -s arc and free -h.
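When it happens again, something along these lines should capture all of that in one go (the kernel-log grep pattern is only a suggestion):
Code:
dmesg -T | grep -i -B1 -A10 'out of memory'   # kernel OOM killer report
top -b -n 1 -co %MEM | head -n 25             # top memory consumers
arc_summary -s arc                            # ARC size vs. target
free -h                                       # overall memory picture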
 
I'm past the error on my primary node and too far into the build to backtrack. I have another node (identical hardware) that I need to build out; I'll try to reproduce the error there and get back to you. It may take a couple of weeks to set up, as I need to get the primary node finished first.