OOM killer during ZFS scrub

Waldo

Member
May 12, 2021
Hi,

Not long ago I migrated from ESXi to Proxmox. From the very beginning I had unexpected VM shutdowns due to OOM conditions. After some tuning of both the VMs and the host (I limited the ZFS ARC size to between 4 and 6 GB), the problems seemed to go away. The host has 32 GB of RAM in total.

But last Sunday, during the biweekly ZFS scrub, two VMs were suddenly killed due to OOM. Is it possible that the ARC consumes more RAM than configured during a ZFS scrub?
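
One way to check this is to compare the `size` and `c_max` fields in `/proc/spl/kstat/zfs/arcstats` while a scrub is running. The following is only a minimal sketch: it assumes the usual three-column `name type data` layout of that file, and the helper name `arc_stat` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one field from an arcstats-style file.
# Assumes lines of the form "<name> <type> <data>".
arc_stat() {
    awk -v key="$2" '$1 == key { print $3 }' "$1"
}

ARCSTATS=/proc/spl/kstat/zfs/arcstats
if [ -r "$ARCSTATS" ]; then
    size=$(arc_stat "$ARCSTATS" size)     # current ARC size in bytes
    c_max=$(arc_stat "$ARCSTATS" c_max)   # configured ARC ceiling in bytes
    echo "ARC size:  $(( ${size:-0} / 1024 / 1024 )) MiB"
    echo "ARC limit: $(( ${c_max:-0} / 1024 / 1024 )) MiB"
fi
```

Running this periodically during a scrub would show whether the ARC stays under its limit. It does not account for other kernel memory that is not counted as ARC.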

Package versions are:
Code:
proxmox-ve: 6.4-1 (running kernel: 5.4.106-1-pve)
pve-manager: 6.4-5 (running version: 6.4-5/6c7bf5de)
pve-kernel-5.4: 6.4-1
pve-kernel-5.4.106-1-pve: 5.4.106-1
qemu-server: 6.4-2
zfsutils-linux: 2.0.4-pve1

Thank you,
Krzysztof
 
We have the same issue on 3 Proxmox nodes.
All 3 nodes have high RAM consumption (95% or more).
On each of the 3 nodes, the VM with the most memory was suddenly killed by the OOM killer.

Whenever it happens, the time is 00:25.
Is there a ZFS scrub or something similar running at that time?
This happens nearly every month, and only on Proxmox nodes with high RAM usage.

Addendum:
When a scrub is running, the OOM killer on the Proxmox host kills the VM with the highest memory consumption.
We checked the cron file, and its schedule coincides with our VM failures (VMs killed due to OOM):

# TRIM the first Sunday of every month.
24 0 1-7 * * root if [ $(date +\%w) -eq 0 ] && [ -x /usr/lib/zfs-linux/trim ]; then /usr/lib/zfs-linux/trim; fi

# Scrub the second Sunday of every month.
24 0 8-14 * * root if [ $(date +\%w) -eq 0 ] && [ -x /usr/lib/zfs-linux/scrub ]; then /usr/lib/zfs-linux/scrub; fi

#################
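
To match the scrub schedule above against past OOM events, it can help to compute which calendar day the second Sunday falls on. The cron entry restricts the day of month to 8-14 and then tests for weekday 0. This is a small sketch assuming GNU `date` (for `-d`), and the function name `second_sunday` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the day of month of the second Sunday.
# Usage: second_sunday YEAR MONTH   (requires GNU date for -d)
second_sunday() {
    # Weekday of the 1st of the month: 0 = Sunday ... 6 = Saturday
    w=$(date -d "$1-$2-01" +%w)
    # Day of month of the first Sunday (1..7)
    first=$(( (7 - w) % 7 + 1 ))
    # The second Sunday is one week later, so it always lands in 8..14
    echo $(( first + 7 ))
}

second_sunday 2021 05
```

The scrub starts at 00:24 on that day, so OOM kills logged around 00:25 on the same date point at the scrub job.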

Is there a way to limit the memory usage of the scrub, or can we disable the scrub?

We have not overcommitted the RAM. The sum of all the VMs' RAM is a bit lower than the whole Proxmox node's memory, but the page cache and the ZFS cache seem to take a lot of RAM too.
What percentage of the Proxmox node's memory can we use for our VMs without running into trouble? 90%?

Many thanks.
 
Is there a way to limit the memory usage of the scrub, or can we disable the scrub?

We have not overcommitted the RAM. The sum of all the VMs' RAM is a bit lower than the whole Proxmox node's memory, but the page cache and the ZFS cache seem to take a lot of RAM too.
What percentage of the Proxmox node's memory can we use for our VMs without running into trouble? 90%?
ZFS takes up to 50% of the host's RAM for its ARC (with and without a scrub) unless you change it: https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysadmin_zfs_limit_memory_usage
EDIT: Proxmox itself needs at least 2 GB: https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_system_requirements
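
As a rough illustration of what those two numbers mean for VM sizing, here is a back-of-the-envelope sketch. The helper name `vm_budget_gib` is made up, and the calculation deliberately ignores page cache and per-VM QEMU overhead, so treat the result as an upper bound rather than a safe value.

```shell
#!/bin/sh
# Hypothetical helper: GiB left for VMs after reserving the ARC ceiling
# and the host's own needs. Arguments: total RAM, ARC max, host reserve (all GiB).
vm_budget_gib() {
    total=$1; arc_max=$2; host=$3
    echo $(( total - arc_max - host ))
}

# Figures from the first post: 32 GiB host, ARC capped at 6 GiB, 2 GiB for Proxmox.
vm_budget_gib 32 6 2     # prints 24

# Same host with the default ARC ceiling (half of RAM).
vm_budget_gib 32 16 2    # prints 14

# ARC ceiling in bytes, the unit expected by
# "options zfs zfs_arc_max=<bytes>" in /etc/modprobe.d/zfs.conf
echo $(( 6 * 1024 * 1024 * 1024 ))
```

On a node where the ARC limit was never set, the second case suggests that filling 90% of RAM with VMs leaves no room for the default ARC at all.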
 
Many thanks.
50% is a lot.
We have other Proxmox nodes with 80% memory consumption that run without problems.
 
