Poor GC/Verify performance on PBS

triggad

Member
Jun 20, 2023
Hi,
we run a Dell PowerEdge T640 with 1 Xeon Silver 4114 (10 cores @ 2.2 GHz) and 64GB RAM.

Storage is 2x 32GB SATA SSD for the system and 18x 4TB SAS HDD (7200 RPM) for data.

The controller is a PERC H730P, but all disks are configured as non-RAID.

The system pool is a mirror and the data pool is a RAIDZ3.

The backup itself runs "ok", not fast but suitable for our needs. What takes forever is a GC or verify job.

At the moment we have about 28TB of backup data on the store, and the last successful GC job took about 4 days...

Another GC job has been running for 2 days and has so far marked only 51% (phase 1).

Code:
2025-09-25T14:22:00+02:00: starting garbage collection on store DS1
2025-09-25T14:22:00+02:00: task triggered by schedule 'daily'
2025-09-25T14:22:01+02:00: Access time update check successful, proceeding with GC.
2025-09-25T14:22:01+02:00: Using access time cutoff 1d 5m, minimum access time is 2025-09-24T12:17:00Z
2025-09-25T14:22:01+02:00: Start GC phase1 (mark used chunks)
2025-09-25T14:43:22+02:00: marked 1% (28 of 2770 index files)
2025-09-25T15:00:15+02:00: marked 2% (56 of 2770 index files)
2025-09-25T15:08:02+02:00: marked 3% (84 of 2770 index files)
.
.
.
2025-09-27T13:39:15+02:00: marked 51% (1466 of 2873 index files)

What can we do to improve the performance?

Could better hardware choices be made? More RAM or a second processor would be the easiest. Exchanging the HDDs for SSDs is too expensive at the moment, but we could add an NVMe SSD as a cache, for example.

Anything else we could do?

If you need any additional data, feel free to ask. I will try to provide whatever I can.

Greets
Daniel
 
That's expected. The more backups you add, the slower GC & verify will get, due to how PBS handles deduplication (many small chunk files), and HDDs can't handle that well.

Solutions would be: switch to all-flash storage for your backups, or if that's too expensive, you could add 2 (enterprise!) SSDs as a mirrored special device vdev to your ZFS HDD pool. The HDD pool should also consist of ZFS mirror vdevs (RAID10 equivalent), because any ZFS RAIDZ mode would be way slower.

Unfortunately, adding special devices now won't help; I think you have to do this at creation of the pool. Metadata already present on the HDD pool won't be moved to the special devices; only newly written metadata will be stored on them.

If you decide to add special devices to your HDD pool, make sure they are good enterprise SSDs with PLP and at least 1 DWPD. If you lose your special devices, all data is gone ...
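
For reference, adding such a special device mirror to an existing pool would look roughly like this (pool name and disk paths are only placeholders here, assuming the HDD pool is called datapool):

Code:
# add two enterprise SSDs as a mirrored special vdev to the existing HDD pool
# note: on a RAIDZ pool this cannot be removed again
zpool add datapool special mirror /dev/disk/by-id/ata-SSD_1 /dev/disk/by-id/ata-SSD_2
# verify the special vdev shows up in the pool layout
zpool status datapool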
 
Hi MarkusKo,
thanks for your fast reply! So all flash would be the best way.

As my Dell T640 has no NVMe backplane (and getting such things can range from impossible to very, very expensive...), I'm thinking of using PCIe-to-M.2 riser cards. For SSDs I'm planning on using Lexar NM790 4TB SSDs (4TB, as 8TB are nearly 3x the price). As for the riser, I'm not quite sure which one to take.

As read here:
https://www.dell.com/support/manual...e470f9-16b1-4a77-a17e-637bfc3ba641&lang=en-us
the T640 seems to support bifurcation, so something like this:
https://www.delock.de/produkt/90054/merkmale.html?g=PCIE_10
should work...

With one processor I can use 3 slots (1x x8 and 2x x16). So starting with 2 cards and 8 SSDs, I should get about 22TB usable running as RAIDZ2.
(The x8 slot is used for a SAS controller for a tape changer which I'm planning to use in the future.)
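
My rough math for the 22TB (just a sanity check, ignoring RAIDZ padding and other overhead):

Code:
# 8x 4TB NVMe in RAIDZ2 -> 2 drives worth of parity, vendor TB converted to TiB
echo "scale=1; (8-2)*4 * 10^12 / 2^40" | bc
# -> 21.8 (TiB), i.e. roughly the 22TB usable mentioned above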

If I need more storage, I would have to add a 2nd CPU first.

This storage I could use as fast storage for the daily jobs, and the existing hard drive storage I could sync to and use for long-term backups...

So here are some new questions:

- Are my expectations right concerning speed of backup/GC/verify on the NVMe SSDs?

- Are there better hardware options? The Lexar was the best TB/€ value from a known brand, and Delock was the cheapest option that isn't a no-name brand...

- Can I sync just older backups to another datastore, so that I have different prune settings for both datastores and also faster GC on the HDDs?

Greets
Daniel
 
For reference
https://pbs.proxmox.com/docs/installation.html#recommended-server-system-requirements

  • Backup storage:
    • Prefer fast storage that delivers high IOPS for random IO workloads; use only enterprise SSDs for best results.
    • If HDDs are used: Using a metadata cache is highly recommended, for example, add a ZFS special device mirror.

We have SATA (not NVMe) SSD storage, and ours is nowhere near as large as yours, but we have no issues.

The sync copies all chunks, but you can have a longer/different retention period on the second/off-site datastore.
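
If it helps, a rough sketch of how such a job could be set up on the CLI (job, remote and datastore names are placeholders; check proxmox-backup-manager sync-job create --help on your version, since local vs. remote sync options differ between PBS releases):

Code:
# pull the fast NVMe datastore into the HDD datastore once a day;
# the HDD datastore keeps its own (longer) retention settings
proxmox-backup-manager sync-job create nvme-to-hdd \
    --store DS1 \
    --remote local-pbs \
    --remote-store fastds \
    --schedule daily
# retention is configured per datastore, e.g. via a separate prune job
proxmox-backup-manager prune-job create hdd-longterm \
    --store DS1 --keep-monthly 12 --schedule daily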
 
An important consideration is the needed capacity (rule of thumb is 0.02-0.03 of the HDD storage capacity) and the number of SSDs, since the redundancy of the special device should match the redundancy of the pool:

The redundancy of the special device should match the one of the pool, since the special device is a point of failure for the entire pool.
Warning

Adding a special device to a pool cannot be undone!
https://pve.proxmox.com/wiki/ZFS_on_Linux#sysadmin_zfs_special_device

See also: https://forum.proxmox.com/threads/need-help-for-zfs-special-device.100260/
So you will need at least four SSDs of around 2TB capacity each. And they should be server SSDs with power-loss protection; anything else is just asking for trouble imho.
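
Quick back-of-the-envelope for your 18x 4TB pool, using just the rule of thumb from above:

Code:
# special device sizing: roughly 2-3% of the HDD pool's raw capacity (18x 4TB = 72TB)
echo "18*4*0.02" | bc   # -> 1.44 (TB)
echo "18*4*0.03" | bc   # -> 2.16 (TB)
# so around 1.5-2TB of usable special device capacity, hence the ~2TB SSDs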

The special device will also not help for existing data; you need to rewrite it with zfs send/receive to get the speedup.
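
A rough sketch of such a rewrite with placeholder dataset names (this assumes the datastore lives on its own dataset, there is enough free space for a second copy, and no backups are running while you move it):

Code:
# copy the datastore dataset within the same pool; the newly written copy
# gets its metadata placed on the special device
zfs snapshot datapool/pbs@move
zfs send -R datapool/pbs@move | zfs receive datapool/pbs-new
# after checking the copy, swap the datasets and later destroy the old one
zfs rename datapool/pbs datapool/pbs-old
zfs rename datapool/pbs-new datapool/pbs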
 
Before you spend much money, try using btrfs RAID 10 as your storage system and test the GC and verify jobs again.
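
If you want to try that, creating such a filesystem would look roughly like this (device names are placeholders, and this wipes the disks, so only test it on an empty set of drives):

Code:
# btrfs RAID10 across the 18 data disks, metadata also as RAID10
mkfs.btrfs -L pbs-data -d raid10 -m raid10 /dev/sd[b-s]
mkdir -p /mnt/pbs-data
mount /dev/sdb /mnt/pbs-data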