Poor GC/Verify performance on PBS

triggad

Member
Jun 20, 2023
Hi,
we run a Dell PowerEdge T640 with one Xeon Silver 4114 (10 cores @ 2.2 GHz) and 64 GB RAM.

Storage is 2x 32 GB SATA SSDs for the system and 18x 4 TB SAS HDDs (7200 RPM) for data.

The controller is a PERC H730P, but all disks are configured as Non-RAID (passthrough).

The system pool is a mirror and the data pool is a RAIDZ3.

The backup itself runs "ok": not fast, but suitable for our needs. What takes forever is a GC or verify job.

At the moment we have about 28 TB of backup data on the store, and the last successful GC job took about 4 days...

Another GC job has been running for 2 days now and has so far marked only 51% (phase 1).

Code:
2025-09-25T14:22:00+02:00: starting garbage collection on store DS1
2025-09-25T14:22:00+02:00: task triggered by schedule 'daily'
2025-09-25T14:22:01+02:00: Access time update check successful, proceeding with GC.
2025-09-25T14:22:01+02:00: Using access time cutoff 1d 5m, minimum access time is 2025-09-24T12:17:00Z
2025-09-25T14:22:01+02:00: Start GC phase1 (mark used chunks)
2025-09-25T14:43:22+02:00: marked 1% (28 of 2770 index files)
2025-09-25T15:00:15+02:00: marked 2% (56 of 2770 index files)
2025-09-25T15:08:02+02:00: marked 3% (84 of 2770 index files)
.
.
.
2025-09-27T13:39:15+02:00: marked 51% (1466 of 2873 index files)

What can we do to improve the performance?

Could we make better hardware choices? More RAM or a second processor would be the easiest. Exchanging the HDDs for SSDs is too expensive at the moment, but we could add an NVMe SSD as a cache, for example.

Anything else we could do?

If you need some additional data, feel free to ask. I will try to deliver what is possible.

Greets
Daniel
That's expected: the more backups you add, the slower GC and verify get, due to how PBS handles deduplication (many small chunk files), which HDDs can't handle well.

Solutions would be to switch to all-flash storage for your backups, or, if that's too expensive, to add 2 (enterprise!) SSDs as a mirrored special device vdev to your ZFS HDD pool. The HDD pool should also be ZFS mirror vdevs (RAID10 equivalent), because any ZFS RAIDZ mode will be way slower.
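A rough sketch of adding such a mirrored special vdev, assuming your pool is named `datapool` (as in your post) and with hypothetical device IDs you'd replace with your actual SSDs:

```shell
# Hypothetical device paths - substitute the by-id names of your two enterprise SSDs.
# Adds a mirrored special vdev (metadata class) to the existing pool 'datapool'.
zpool add datapool special mirror \
  /dev/disk/by-id/nvme-SSD1 /dev/disk/by-id/nvme-SSD2

# Optional: also store small data blocks (not just metadata) on the SSDs.
# PBS chunks are mostly larger, so a modest threshold is a common starting point.
zfs set special_small_blocks=4K datapool
```

Double-check the `zpool add` command before running it; adding a vdev to a pool is effectively permanent on older ZFS versions.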

Unfortunately, adding special devices now won't help much on its own; I think you have to do this at creation of the pool. Metadata already present on the HDD pool won't be moved to the special devices; only newly written metadata will be stored on them.

If you decide to add special devices to your HDD pool, make sure they are good enterprise SSDs with PLP (power-loss protection) and at least 1 DWPD. If you lose your special devices, all data in the pool is gone...
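If you do add one, you can watch how much actually lands on the special vdev over time. A sketch, again assuming the pool name `datapool`:

```shell
# Per-vdev capacity and allocation; the special mirror shows up as its own
# line, so you can see how much metadata has been written to it so far.
zpool list -v datapool

# Detailed block statistics per class (can take a long time on large pools).
zdb -bbb datapool
```

Since only newly written metadata goes to the special vdev, expect the numbers to grow slowly as old chunks are pruned and new backups come in.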