Slow read, low IO ZFS pool issue

Pito2317

Hi, I have a problem with low IO and really slow reads on my backup server. Operations such as cleaning old backup chunks take around 10 days (I have a 20TB HDD pool with SSDs for ZIL and L2ARC, and it's almost full):
Code:
2024-05-16T14:10:00+02:00: starting garbage collection on store HDD1
2024-05-16T14:10:00+02:00: task triggered by schedule 'daily'
2024-05-16T14:10:00+02:00: Start GC phase1 (mark used chunks)
2024-05-16T23:53:49+02:00: marked 1% (36 of 3553 index files)
2024-05-17T12:42:43+02:00: marked 2% (72 of 3553 index files)
2024-05-17T23:48:08+02:00: marked 3% (107 of 3553 index files)
2024-05-18T09:53:53+02:00: marked 4% (143 of 3553 index files)
2024-05-18T22:50:36+02:00: marked 5% (178 of 3553 index files)
2024-05-19T07:14:28+02:00: marked 6% (214 of 3553 index files)
2024-05-19T11:43:56+02:00: marked 7% (249 of 3553 index files)
2024-05-19T16:29:23+02:00: marked 8% (285 of 3553 index files)
2024-05-20T02:02:22+02:00: marked 9% (320 of 3553 index files)
2024-05-20T02:14:49+02:00: marked 10% (356 of 3553 index files)
2024-05-20T02:23:23+02:00: marked 11% (391 of 3553 index files)
2024-05-20T02:35:52+02:00: marked 12% (427 of 3553 index files)
2024-05-20T02:44:05+02:00: marked 13% (462 of 3553 index files)
2024-05-20T05:12:40+02:00: marked 14% (498 of 3553 index files)
2024-05-20T05:42:39+02:00: marked 15% (533 of 3553 index files)
2024-05-20T07:05:40+02:00: marked 16% (569 of 3553 index files)
2024-05-20T07:33:52+02:00: marked 17% (605 of 3553 index files)
2024-05-20T07:54:24+02:00: marked 18% (640 of 3553 index files)
2024-05-20T09:45:18+02:00: marked 19% (676 of 3553 index files)
2024-05-20T15:35:15+02:00: marked 20% (711 of 3553 index files)
2024-05-20T16:05:12+02:00: marked 21% (747 of 3553 index files)
2024-05-20T16:13:02+02:00: received abort request ...
2024-05-20T16:13:02+02:00: TASK ERROR: abort requested - aborting task

I have configured L2ARC and ZIL on two different SSDs, but even if I remove them from the pool (to see if anything changes), IOPS and read speed don't change.

Here's the benchmark:
[screenshot: proxmox-backup-client benchmark results]

And iotop while marking chunks:

[screenshot: iotop output]
And htop:
[screenshot: htop output]

My Hardware:
  • Intel(R) Xeon(R) CPU E5-1650 0 @ 3.20GHz
  • 16GB RAM
  • 8x HUA723030ALA641 - ZFS Pool
  • 1x Samsung MZVLW256 - 256GB
  • 1x Samsung PM851 - 128GB
What's wrong? Is there any way to speed this up?
 
Hi!
I missed your note that the pool is nearly full. In this case performance will obviously degrade a lot. This can also be dangerous: if the pool runs full, garbage collection cannot run anymore and you can't cleanly delete chunks! So set a quota on your pool (zfs set quota=<size> <zpool>/<dataset>) and perhaps delete some backups or add more disks.
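For example (pool and dataset names below are placeholders; pick a quota that keeps the pool comfortably below ~85% used):

```shell
# Check current pool usage first (keep CAP well below 85%)
zpool list

# Cap the datastore dataset so the pool can never run completely full
# ("tank/pbs-datastore" and "18T" are example values, adjust to your setup)
zfs set quota=18T tank/pbs-datastore

# Verify the quota took effect
zfs get quota tank/pbs-datastore
```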

Another thing you could try would be to add a special device[0].
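For reference, adding a special vdev looks roughly like this (device names are examples, and the special device must be mirrored, because losing it makes the whole pool unreadable):

```shell
# Add a mirrored special vdev for metadata (example device names!)
# WARNING: an unmirrored special device is a single point of failure
# for the entire pool.
zpool add tank special mirror /dev/nvme0n1 /dev/nvme1n1

# Confirm the new vdev shows up under "special"
zpool status tank
```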

Also note that the proxmox-backup-client benchmark doesn't do any disk operations; it only tests encryption/compression speeds.
By the way, what RAID level do you use? (Stripes, mirrors, and striped mirrors are quite fast, while raidz* is slower.)

[0]: https://openzfs.github.io/openzfs-docs/man/master/7/zpoolconcepts.7.html#special
 
Both ZIL and L2ARC are useless for a PBS workload. The first is used only for sync writes, which PBS doesn't do. The second would need a very big SSD (at least 4TB) to provide any benefit. I have done tests with a 1TB L2ARC for an 8TB datastore and the cache hit ratio was under 5%, while it wrote a lot to the SSD, wearing it out fast for no real benefit.
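You can verify this on your own pool by looking at the cache statistics (tool names depend on your distribution; `arc_summary` ships with zfsutils-linux on Debian/Ubuntu):

```shell
# Summarize ARC and L2ARC statistics, including hit ratios
arc_summary

# Or watch per-vdev I/O live, cache devices included, every 5 seconds
zpool iostat -v 5
```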

Keep used space under 85% at all costs. Add as much RAM as you can. As ggoller said, use a special device on a mirror of at least two 512GB enterprise SSDs. If you add the special device now, only newly written metadata will be allocated to the special device: you won't notice any measurable performance benefit.

In the long run, your best option would be to set up a new PBS with more RAM and a special device, and sync the data from the old one to the new.
 
So, I configured the ZFS quota and added a new special device (two NVMe drives in a mirror). But what size should I set for special device small blocks, and how can I calculate this?
 
4k should be enough for a PBS workload:

Code:
zfs set special_small_blocks=4k YOUR_ZFS_POOL

If the special device is big enough, you may use up to 16k, but the performance benefit for PBS is marginal.
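One way to estimate the required size (my suggestion, not from this thread) is zdb's block statistics, which include a block-size histogram for the pool:

```shell
# Print block statistics (-L skips leak checking, which is much faster).
# The histogram at the end shows how much space each block size occupies:
# sum the sizes at or below your special_small_blocks value, add the
# metadata total, and size the special device with plenty of headroom.
zdb -Lbbbs YOUR_ZFS_POOL
```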

Remember my previous statement:
If you add the special device now, only newly written metadata will be allocated to the special device: you won't notice any measurable performance benefit.

That means that the directory structure created when you deployed your datastore, along with all its current metadata, will stay on your HDDs, hence the limited performance benefit unless you rewrite everything. An option would be to create another dataset, enable special_small_blocks on it, create a new datastore there, and use a sync job to move your backups from the old datastore to the new one. That will require space for both in your pool.
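A rough sketch of that approach (pool and dataset names are placeholders):

```shell
# New dataset whose small blocks and metadata land on the special
# device from day one
zfs create -o special_small_blocks=4k tank/pbs-new

# Then add /tank/pbs-new as a new datastore in the PBS web UI (or via
# proxmox-backup-manager) and create a sync job pulling from the old one.
```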
 