High IO Delay ZFS crashes VMs

jbaileypro

Hi all,

Proxmox VE community subscription user.

I have a newer server, a Dell R640 with 96 GB of RAM and a Xeon Gold 6134. Storage is a Samsung PM1735 1.6 TB NVMe SSD, 2x Intel 480 GB SSDs in a ZFS mirror for boot, and 2x 2 TB WD Gold HDDs in a ZFS mirror.

When a large disk operation such as a backup restore is running, the IO delay starts to build, then the VMs complain about memory issues and fail (bluescreen or lock up). These are mostly Windows VMs, as I disable memory ballooning on them. With all VMs booted, memory utilisation normally sits around 60%, so 30-40 GB of RAM is free.

My guess is that ZFS is using too much RAM during these operations, especially when writing to slower media such as the WD Gold HDDs. I've tried to limit its RAM usage, but this doesn't seem to have made any difference.
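For reference, a minimal sketch of how the ZFS memory cap is usually set on a Linux host, assuming the RAM limit mentioned above refers to the ARC size (the `zfs_arc_max` module parameter). The 8 GiB figure and the helper names `arc_max_bytes` and `arc_conf_line` are illustrative assumptions, not a recommendation for this server. The sketch only prints the config line; the commands that change the host are left as comments.

```shell
#!/bin/sh
# Convert a GiB figure into the byte value that zfs_arc_max expects.
arc_max_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Build the modprobe option line for a chosen cap (hypothetical helper).
arc_conf_line() {
    echo "options zfs zfs_arc_max=$(arc_max_bytes "$1")"
}

# Example: an 8 GiB cap (assumed value, adjust to the workload).
arc_conf_line 8
# prints: options zfs zfs_arc_max=8589934592

# To apply on the host as root (not executed by this sketch):
#   add the printed line to /etc/modprobe.d/zfs.conf        # persistent setting
#   update-initramfs -u                                     # needed when root is on ZFS
#   echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max   # change at runtime
#   grep -E '^(size|c_max) ' /proc/spl/kstat/zfs/arcstats   # current ARC size and cap
```

Comparing `size` with `c_max` in arcstats during a restore would show whether the cap is actually in effect and whether the ARC is what consumes the memory.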

Has anyone had similar issues who can help? The restore was from an NFS share over a 1G link, so it shouldn't saturate the host, which is connected at 10G.

Many thanks,
 
Hi,
if there is too much IO-related load during a restore, you might want to set a bandwidth limit for it. This can be done cluster-wide in the UI under Datacenter > Options > Bandwidth Limits > Backup Restore, or for a specific storage with the CLI command pvesm set <storage ID> -bwlimit restore=<value in KiB/s>. Note that restoring a backup from PBS currently doesn't honor the limit, because that is not implemented yet.
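As a rough sketch of the CLI variant: the limit is given in KiB/s, so a figure in MiB/s has to be multiplied by 1024 first. The storage ID `hdd-pool`, the 50 MiB/s value, and the helper names below are hypothetical; the script only prints the command so it can be checked before running it on the host.

```shell
#!/bin/sh
# Convert MiB/s into the KiB/s unit used by the bwlimit option.
mib_to_kib() {
    echo $(( $1 * 1024 ))
}

# Build (but do not run) the pvesm command for a storage ID and a MiB/s limit.
restore_limit_cmd() {
    echo "pvesm set $1 -bwlimit restore=$(mib_to_kib "$2")"
}

# Example: limit restores to the assumed storage "hdd-pool" to 50 MiB/s.
restore_limit_cmd hdd-pool 50
# prints: pvesm set hdd-pool -bwlimit restore=51200
```

Starting with a limit somewhat below what the slower HDD mirror can sustain and raising it while watching IO delay is one way to find a workable value.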
 
