ZFS L2ARC sizing and memory requirements

gkovacs

Well-Known Member
Dec 22, 2008
509
48
48
Budapest, Hungary
We are planning our Proxmox VE 4 cluster and have decided on ZFS (provided that snapshot backups work for both KVM and LXC guests). We plan to use small nodes (4C/8T CPU, 32 GB RAM, 5-8 disk RAIDZ2 in the 3-6 TB usable range). We will use one SSD per node as ZIL and L2ARC (with two SSDs, the ZIL will be mirrored and the L2ARC striped), and need to decide how large the SSDs should be. We were planning to buy 512 GB SSDs (16 GB for ZIL, the rest for L2ARC), but from what people report on the web, with only 32 GB of memory we might be better off with much smaller L2ARC partitions.

According to this ZFS on Linux issue, one guy with 4GB of ARC space has problems managing a 125GB L2ARC.
https://github.com/zfsonlinux/zfs/issues/1420

There is of course a lot of information on the web; advice on the ARC:L2ARC ratio ranges from 1:5 to 1:40. I know there is no hard rule and that it depends on your block size and use case.

So I am looking for some practical advice from people running ZFS with L2ARC on SSD:

- are you running KVM only or containers as well
- are you using zvols or zpools, if zvol what filesystem
- how big are your arrays
- what is your block size
- how big is your ARC (is it fixed size or dynamic)
- how big is your L2ARC SSD cache
- what are your arcstats and l2 header stats
 

mir

Famous Member
Apr 14, 2012
3,559
122
83
Copenhagen, Denmark
Also:
Should an SSD be underprovisioned, given that ZFS lacks TRIM support? Some say underprovisioning was only needed for historical reasons and that modern SSDs already ship underprovisioned by the manufacturer.
 

vkhera

Member
Feb 24, 2015
192
13
18
Maryland, USA
Take the money you want to spend on an L2ARC and spend it on RAM instead. The ZIL will only do you good if your writes are synchronous. I don't know if the KVM disks are written in sync mode. I'd likely spend that money on RAM also.

My PVE boxes do not have either L2ARC or ZIL drives and they are fast enough for my needs, which is software development support. I don't have any production services on it. I sized the RAM to be big enough for my projected VM needs + the ARC. I expect 50% to be in ARC use.

I do have a pair of large database servers running FreeBSD using ZFS and they have an L2ARC which is rarely used. The machines have 256GB of RAM, and about 60% of that goes to the ARC. The essence of that is that my DBs run mostly out of RAM for their working set, which was modeled and predicted to need what we provisioned. I did not put a ZIL on those machines because of space considerations, and I don't think I miss it at all. The DB is still fast enough that people do not notice any delays in the UI. I think in retrospect, I'll probably use a ZIL for a DB server next time I build one.
 

vkhera

mir said: "Should an SSD be underprovisioned, given that ZFS lacks TRIM support? Some say underprovisioning was only needed for historical reasons and that modern SSDs already ship underprovisioned by the manufacturer."

My vendor under provisioned the L2ARC drives in my DB servers (100GB Intel DC S3700 Series HET-MLC). They were built about 18 months ago. This turned out to be unnecessary given how little the L2ARC gets used with the large RAM available for the ARC.
 

mir

vkhera said: "Take the money you want to spend on an L2ARC and spend it on RAM instead. The ZIL will only do you good if your writes are synchronous. I don't know if the KVM disks are written in sync mode. I'd likely spend that money on RAM also."
They are written in sync mode unless you have configured your pool to always use async (sync=disabled).


vkhera said: "I do have a pair of large database servers running FreeBSD using ZFS and they have an L2ARC which is rarely used. The machines have 256GB of RAM, and about 60% of that goes to the ARC. The essence of that is that my DBs run mostly out of RAM for their working set, which was modeled and predicted to need what we provisioned. I did not put a ZIL on those machines because of space considerations, and I don't think I miss it at all. The DB is still fast enough that people do not notice any delays in the UI. I think in retrospect, I'll probably use a ZIL for a DB server next time I build one."
If your use case is random-write intensive and sync on the pool is set to standard (the default) or always, you will see a noticeable performance gain from a zlog. Of course, this is only true for a ZFS HDD pool. If your pool is made of SSDs there is no point in using a zlog, or an L2ARC for that matter.

For a database pool on SAS or SATA HDDs I would definitely recommend a zlog. This test of an SSD as a Ceph journal is also valid for an SSD zlog -> http://www.sebastien-han.fr/blog/20...-if-your-ssd-is-suitable-as-a-journal-device/
 

spirit

Famous Member
Apr 2, 2010
5,871
704
133
www.odiso.com
Each block in the L2ARC needs around 400 bytes in the ARC.

So it depends on whether you use small blocks (4K) or big blocks (128K) for your ZFS array.

(In the past I didn't know that: I set up a 1TB L2ARC with 4K blocks and only 25GB of RAM, and then got a lot of problems, crashes and hangs.)
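
To put rough numbers on that, here is a minimal Python sketch of the arithmetic. It assumes the figure of about 400 bytes of ARC per L2ARC block quoted above (the real header size differs between ZFS versions) and a completely full L2ARC. The function name and the example sizes are only illustrative; 496 is simply a 512 GB SSD minus a 16 GB ZIL partition.

```python
# Rough estimate of the ARC memory consumed by L2ARC headers.
# HEADER_BYTES is the ~400 bytes per cached block quoted in this thread;
# it is an assumption, the real value varies between ZFS versions.
HEADER_BYTES = 400

KIB = 1024
GIB = 1024 ** 3


def l2arc_header_overhead(l2arc_bytes, block_bytes, header_bytes=HEADER_BYTES):
    """Return the bytes of ARC needed to index a completely full L2ARC."""
    blocks = l2arc_bytes // block_bytes  # number of blocks the L2ARC can hold
    return blocks * header_bytes


if __name__ == "__main__":
    for l2arc_gib in (100, 496, 1024):
        for block_kib in (4, 128):
            overhead = l2arc_header_overhead(l2arc_gib * GIB, block_kib * KIB)
            print(f"{l2arc_gib:5d} GiB L2ARC, {block_kib:3d}K blocks: "
                  f"{overhead / GIB:7.2f} GiB of ARC for headers")
```

Under these assumptions a full 1 TiB L2ARC with 4K blocks would need about 100 GiB of ARC for headers alone, which is far more than 25 GB of RAM can hold, while the same L2ARC with 128K blocks would need only about 3 GiB.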
 

gkovacs

vkhera said: "Take the money you want to spend on an L2ARC and spend it on RAM instead. The ZIL will only do you good if your writes are synchronous. I don't know if the KVM disks are written in sync mode. I'd likely spend that money on RAM also.

My PVE boxes do not have either L2ARC or ZIL drives and they are fast enough for my needs, which is software development support. I don't have any production services on it. I sized the RAM to be big enough for my projected VM needs + the ARC. I expect 50% to be in ARC use."

I'm not sure how to phrase my post any clearer than I already did: we are going to use 32GB RAM nodes (maximum allowed by the platform), so I can't "spend it on RAM instead". I also asked for people who are running SSD drives for L2ARC, which you clearly don't. Don't want to offend you as I'm sure you have the best intentions, but I can't really do anything with your advice.
 

mir

My pool:
vMotion recordsize 128K default

l2arc size: 92.2 GB
zlog: 7.7 GB
RAM: 16 GB

Current SSD: Corsair Force GS 128 GB. Only 100 GB partitioned.

I have never had any problems or hangs whatsoever.

In a few days the zlog will be replaced by an Intel DC S3510 80 GB and the full 100 GB will be used as l2arc.

The server was rebooted some days ago due to kernel upgrade.

Code:
ARC/L2ARC Readcache: arcstat.pl, get cache hits for next 5s, please wait..
see arcstat.pl

time      read  hits  miss  hit%  l2read  l2hits  l2miss  l2hit%  arcsz  l2size  l2asize
20:54:51     0     0     0     0       0       0       0       0    12G    3.4G        0
20:54:56   782   776     6    99       6       0       6       6    12G    3.4G        0

ZIL syncronous write cache: zilstat.ksh 2 3
(3 measurements with 2s interval, please wait 10s)

http://www.richardelling.com/Home/scripts-and-programs-1/zilstat

   N-Bytes  N-Bytes/s N-Max-Rate    B-Bytes  B-Bytes/s B-Max-Rate    ops  <=4kB 4-32kB >=32kB
    176336      88168     166880    1056768     528384     921600     10      2      0      8
     49720      24860      30808     610304     305152     339968      9      3      0      6
    280792     140396     224888    1323008     661504     798720     13      3      0     10
 

vkhera

gkovacs said: "I also asked for people who are running SSD drives for L2ARC, which you clearly don't. Don't want to offend you as I'm sure you have the best intentions, but I can't really do anything with your advice."

I do run SSD L2ARC, just not with proxmox, and I found that having adequate RAM is better than the L2ARC.
 