ZFS ARC 64GB limit: realistic performance impact?

tomstephens89

Hi all, I have just deployed a new 3-node cluster consisting of 3 Dell boxes, each with two 14-core CPUs, 512GB RAM and 10 SSDs.

I know that out of the box the ZFS ARC will consume up to half of the host's memory, so 256GB in my case. I also know that this memory is released to other processes when needed.

However, it makes at-a-glance capacity and resource checking via the GUI a bit of a pain, since memory usage will be up to 250GB higher than actual VM utilisation.

If I set my ARC config to a minimum of 4GB with a max of 64GB, how much of a performance hit will I realistically take? I mean, 64GB is still a big cache, and I've got 10 SSDs in the pool.
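
For reference, the plan would be to cap the ARC via the ZFS module options, something along these lines (values in bytes, 64GB max / 4GB min, to be tuned later):

# /etc/modprobe.d/zfs.conf
options zfs zfs_arc_max=68719476736
options zfs zfs_arc_min=4294967296

# apply immediately without a reboot
echo 68719476736 > /sys/module/zfs/parameters/zfs_arc_max
echo 4294967296 > /sys/module/zfs/parameters/zfs_arc_min

# refresh the initramfs so the new options are picked up at boot
update-initramfs -u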

Thanks
Tom
 
Hi,
What we have seen in small and medium deployments with an all-flash setup is that even an 8GB max is enough.
Only with very I/O-intensive workloads should you consider setting it higher.

But with 3 nodes, each with SSDs, I would also recommend Ceph if you have at least a dedicated 10GbE or faster NIC.
 
Really, 8GB max!? I'm conscious here that best practice is to give ZFS as much RAM as you can…

I’ve used Ceph on other clusters. It works, but I don’t need the shared storage model here.
 
Hi,
Yes, you can start with 8GB and validate how it works. If you really experience high I/O wait times, you can extend it to 12 or 16GB.
It all depends on your workloads. As I said, an all-flash setup with enterprise SSDs can be fast enough even with a smaller ZFS cache.
But do not use RAID5/6-like setups; the distributed parity calculation really does not like small caches and will kill performance. Consider a RAID10-like ZFS setup instead.
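
If you want to check whether the smaller ARC is actually hurting, watch the cache hit rate under your real workload, for example with something like:

# print ARC size and hit/miss statistics every 5 seconds
arcstat 5

# or a one-off summary report
arc_summary | less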
 
I'm using RAIDZ1 with the default config, which I understand is basically the same as traditional RAID 5. Performance appears to be good, but then again I do have a massive 256GB max cache.

Which ZFS option in the installer corresponds to which equivalent RAID layout? When using traditional controllers I would always use RAID10 for anything production/compute, and only ever RAID 5/6 for backup servers.
 
Hi, the Proxmox GUI installer offers ZFS RAID10, and you need to select at least 4 disks to create it.
The web GUI also labels RAID10 as RAID10 when you create a ZFS pool.
I would not recommend RAID5-like ZFS (RAIDZ1) for production workloads. What currently makes it behave fast is the massive amount of cache.
So you have to decide what fits your needs best: RAID10 with a smaller cache and only 50% usable space, or parity on ZFS with a large cache.
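
For reference, a RAID10-like ZFS pool is simply a stripe of mirrors. Created by hand it would look roughly like this (pool name and disk paths are placeholders, use your own /dev/disk/by-id names):

# 10 SSDs as a stripe of 5 mirror vdevs, ~50% usable capacity
zpool create -o ashift=12 tank \
  mirror /dev/disk/by-id/ssd0 /dev/disk/by-id/ssd1 \
  mirror /dev/disk/by-id/ssd2 /dev/disk/by-id/ssd3 \
  mirror /dev/disk/by-id/ssd4 /dev/disk/by-id/ssd5 \
  mirror /dev/disk/by-id/ssd6 /dev/disk/by-id/ssd7 \
  mirror /dev/disk/by-id/ssd8 /dev/disk/by-id/ssd9

# confirm the vdev layout
zpool status tank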
 
Ah I remember now.

I think I'm going to bin these cluster nodes one at a time, rebuild with ZFS RAID10, set a 32GB cache and re-add them to the cluster.

I'd almost forgotten how much parity-based RAID sucks.
 
Please report back on how it works, so others can benefit from your experience.
Thank you.
 
I am back. Two nights ago we broke the cluster down one node at a time and rebuilt with ZFS RAID10. The performance improvement is massive. I did some quick fio testing and could achieve over 2200MB/s sequential. Random performance came in between 90k and 250k IOPS depending on the type of fio jobs.

I'll post some bench results soon, using the below as a guide:

https://cmdref.net/os/linux/command/fio.html
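
For anyone wanting to run similar tests in the meantime, the jobs were along these lines (illustrative parameters only, not the exact commands I used; run against a file on the ZFS pool):

# sequential throughput, 1M blocks, 4 parallel jobs
fio --name=seqread --rw=read --bs=1M --size=4G --numjobs=4 \
    --ioengine=libaio --iodepth=16 --runtime=60 --time_based --group_reporting

# 4k random read/write mix for IOPS
fio --name=randrw --rw=randrw --rwmixread=70 --bs=4k --size=4G --numjobs=4 \
    --ioengine=libaio --iodepth=32 --runtime=60 --time_based --group_reporting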
 