Heavy writes in VM crash host - ZFS - out of memory (plenty of memory)

JohnMC

Dell T340
32GB RAM
WD Blue 512GB SSDs
No RAID

With a clean install of Proxmox VE 6.1 or 5.4, and ZFS either as the root filesystem or on a separate disk, this server has been slow and unstable, rebooting or locking up with out-of-memory messages on the console.

The surefire way to cause a reboot is to run a write-intensive workload in a VM. For testing I've been running a Windows 10 VM with CrystalDiskMark's 1GB sequential test, and the system reboots 100% of the time during the write portion. The VM has only 4GB of RAM and nothing else is running on the host, yet right before the reboot I watch free memory drop from about 21GB to nothing. Even if I add some swap, the system fills it and eventually crashes anyway.

This only happens when the workload is on the ZFS storage. I've tried reducing c_max to 8GB from the default 16GB and nothing changed, so it looks like the ARC is not at fault here, but I'm not sure how to prove that.
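For reference, capping the ARC at runtime is roughly this (value in bytes, 8GiB here; this assumes the standard ZFS-on-Linux module parameter under /sys and needs root):

Code:
# lower the ARC cap without rebooting; the new value shows up as c_max in arcstats
echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max
grep -E '^(c|c_max) ' /proc/spl/kstat/zfs/arcstats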

I've used the same configuration, testing method and the same model of drives on an old 3rd-gen i5 workstation with 20GB of RAM and can't reproduce the issue. I also have a number of other servers and repurposed workstations running the same general configuration and workloads, and I don't see this anywhere else. I'm having trouble imagining how it could be hardware related, but I'm not sure what to test to prove it either way.

I've tried, unsuccessfully, to trigger the same issue by running something like this directly on the host:

Code:
dd if=/dev/urandom of=/rpool/data/output bs=4k count=1000k
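A test that is probably closer to what the VM generates than buffered dd would be parallel direct writes, e.g. with fio (the path, size and queue depth below are just placeholders, and --direct=1 may be rejected by older ZFS releases, so drop it if fio errors out):

Code:
# sequential 1M writes with O_DIRECT and a deep queue, roughly CrystalDiskMark's seq-write pattern
fio --name=seqwrite --filename=/rpool/data/fio-test --rw=write --bs=1M \
    --size=4G --ioengine=libaio --iodepth=32 --direct=1 --end_fsync=1
rm /rpool/data/fio-test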

Any ideas and troubleshooting steps would be much appreciated.

Last "watch -n1 cat /proc/spl/kstat/zfs/arcstats" before ssh dies:

Code:
12 1 0x01 98 26656 3915857582 595305586369
name                            type data
hits                            4    2740093
misses                          4    11493
demand_data_hits                4    2665846
demand_data_misses              4    46
demand_metadata_hits            4    72653
demand_metadata_misses          4    11383
prefetch_data_hits              4    1592
prefetch_data_misses            4    38
prefetch_metadata_hits          4    2
prefetch_metadata_misses        4    26
mru_hits                        4    61878
mru_ghost_hits                  4    0
mfu_hits                        4    2676621
mfu_ghost_hits                  4    0
deleted                         4    14377
mutex_miss                      4    0
access_skip                     4    0
evict_skip                      4    3
evict_not_enough                4    0
evict_l2_cached                 4    0
evict_l2_eligible               4    1175031808
evict_l2_ineligible             4    32768
evict_l2_skip                   4    0
hash_elements                   4    161092
hash_elements_max               4    175322
hash_collisions                 4    5849
hash_chains                     4    3053
hash_chain_max                  4    3
p                               4    127
c                               4    1047373440
c_min                           4    1047373440
c_max                           4    16757975040
size                            4    2708368432
compressed_size                 4    1358569472
uncompressed_size               4    1364000768
overhead_size                   4    1197100032
hdr_size                        4    111337800
data_size                       4    2525781504
metadata_size                   4    30010880
dbuf_size                       4    41038920
dnode_size                      4    162336
bonus_size                      4    28800
anon_size                       4    2271887872
anon_evictable_data             4    0
anon_evictable_metadata         4    0
mru_size                        4    281814528
mru_evictable_data              4    277028864
mru_evictable_metadata          4    0
mru_ghost_size                  4    1057112064
mru_ghost_evictable_data        4    1039704064
mru_ghost_evictable_metadata    4    17408000
mfu_size                        4    2106368
mfu_evictable_data              4    0
mfu_evictable_metadata          4    0
mfu_ghost_size                  4    0
mfu_ghost_evictable_data        4    0
mfu_ghost_evictable_metadata    4    0
l2_hits                         4    0
l2_misses                       4    0
l2_feeds                        4    0
l2_rw_clash                     4    0
l2_read_bytes                   4    0
l2_write_bytes                  4    0
l2_writes_sent                  4    0
l2_writes_done                  4    0
l2_writes_error                 4    0
l2_writes_lock_retry            4    0
l2_evict_lock_retry             4    0
l2_evict_reading                4    0
l2_evict_l1cached               4    0
l2_free_on_write                4    0
l2_abort_lowmem                 4    0
l2_cksum_bad                    4    0
l2_io_error                     4    0
l2_size                         4    0
l2_asize                        4    0
l2_hdr_size                     4    0
memory_throttle_count           4    0
memory_direct_count             4    39994
memory_indirect_count           4    7769
memory_all_bytes                4    33515950080
memory_free_bytes               4    2209509376
memory_available_bytes          3    -240123904
arc_no_grow                     4    1
arc_tempreserve                 4    0
arc_loaned_bytes                4    0
arc_prune                       4    0
arc_meta_used                   4    182578736
arc_meta_limit                  4    12568481280
arc_dnode_limit                 4    1256848128
arc_meta_max                    4    1627767088
arc_meta_min                    4    16777216
async_upgrade_sync              4    18
demand_hit_predictive_prefetch  4    12
demand_hit_prescient_prefetch   4    0
arc_need_free                   4    240123904
arc_sys_free                    4    523686720
arc_raw_size                    4    0
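(A shorter way to follow just the interesting counters instead of the whole dump is a plain grep, for example:)

Code:
# watch only the ARC size/target and memory-pressure counters
watch -n1 "grep -E '^(size|c|c_min|c_max|anon_size|memory_free_bytes|memory_available_bytes|arc_need_free) ' /proc/spl/kstat/zfs/arcstats"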
 
I think using ZFS with Proxmox is just begging for trouble. How will you manage all the different systems that want to control your RAM? Will the VM's dynamic ballooning of RAM overrule ZFS's dynamic ARC memory handling? What if you are using nested virtualization? Which level controls memory usage where?

ZFS is a great filesystem, but you should try to keep it separate from the hypervisor side so you don't run into complex memory issues that squeeze your RAM dry without you noticing where it's going. If you do use nested systems, at least make sure that only one level uses ZFS, and don't use ballooning at the same time.
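If you want to follow that advice on Proxmox, turning ballooning off for a guest is a single flag on the VM (VMID 100 below is just an example):

Code:
# disable the balloon device so the guest's RAM is never reclaimed dynamically
qm set 100 --balloon 0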
 
I've used the same configuration, testing method and the same model of drives on an old 3rd-gen i5 workstation with 20GB of RAM and can't reproduce the issue.

That is a very good indication of faulty hardware (RAM/PSU/motherboard).

I think using ZFS with Proxmox is just begging for trouble.

No. I have been using ZFS + Proxmox for at least 3 years, on ordinary PCs and on enterprise servers, without any problems. ZFS + Proxmox and ZFS + any other Linux are the same thing. I started with zfs-fuse (Debian 6, with OpenVZ containers; that system is still working now, 6-7 years later), then moved to the kernel ZFS module (starting with CentOS 5.x, then 6.x, compiling the module by hand for each new kernel for many years).

So from my experience, ZFS is good enough and is not the problem. With Proxmox it is really easy now. But yes, I do see problems with ZFS when a new major version is released (like now with 0.8).


Good luck / Bafta !
 
Hello! I just found this thread and I have exactly the same problem. Has anyone solved it yet? I am using Proxmox 7.1-10. Thanks in advance.
 
I have the same problem on multiple Dell T40s (32GB RAM, 2 x 480GB Samsung PM893 SSDs) with only one VM: a Windows Server 2022 Standard guest with 9GB RAM, no ballooning and no cache.

Running CrystalDiskMark with an 8GiB test and then a 1GiB test, Proxmox freezes when it reaches the write phase.

I added "options zfs zfs_arc_max=8589934592" to /etc/modprobe.d/zfs.conf, with the same results...
 
There seems to be a difference between "No cache" and "Default (No cache)".

With the VM disk set to "No cache" and CrystalDiskMark run in the steps from my previous post, Proxmox always freezes with the error "Kernel panic - not syncing: System is deadlocked on memory".

With the VM disk set to "Default (No cache)", Proxmox no longer crashes, but the VM itself will occasionally crash with an OOM error, roughly once in 5 tests.

EDIT:
I added 32GB more memory, for a total of 64GB, and the problem is the same.
 
