Environment
- PBS 4.1.1 on ZFS RAIDZ2, 14× 20TB HDD + NVMe mirror as Special Device
- `special_small_blocks = 0`
- Datastore: ~75TB used, 257 groups, ~11,400 snapshots, growing
- Verify config: 1 reader, 4 workers, 30-day interval, chunk iteration order: inode
---
Observation
After the full initial sync from the source server, the verify run took approximately 5 days. During that time the server was largely unresponsive to UI operations; snapshot browsing timed out consistently. We expect this to be the baseline duration for every subsequent 30-day run, growing as the datastore fills.
We measured resource utilization during an active verify run:
- IO-Delay: 1.5% — HDDs are nearly idle
- Load average: 1.76 / 1.88 / 1.84 on a 64-core system — effectively 2.75% CPU utilization
- Transfer rate: 50–200 MB/s, IOPS: ~100 — well below what 14 HDDs in RAIDZ2 could sustain
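For reproducibility, the load and IO-delay numbers above can be sampled straight from the kernel's `/proc` interface. A minimal stdlib-only sketch (the 5-second window is an arbitrary choice, and the busy/idle accounting is deliberately simplified, ignoring irq/steal time):

```python
import os
import time

def cpu_sample():
    # First line of /proc/stat: "cpu user nice system idle iowait ..."
    with open("/proc/stat") as f:
        fields = f.readline().split()
    user, nice, system, idle, iowait = (int(x) for x in fields[1:6])
    return user + nice + system, idle, iowait

def utilization(interval=5.0):
    # Approximate CPU-busy and IO-delay percentages over the interval.
    b1, i1, w1 = cpu_sample()
    time.sleep(interval)
    b2, i2, w2 = cpu_sample()
    total = (b2 - b1) + (i2 - i1) + (w2 - w1)
    return 100.0 * (b2 - b1) / total, 100.0 * (w2 - w1) / total

load1, load5, load15 = os.getloadavg()
cores = os.cpu_count()
print(f"load {load1:.2f}/{load5:.2f}/{load15:.2f} on {cores} cores "
      f"= {100.0 * load1 / cores:.2f}% of capacity")
cpu_pct, iodelay_pct = utilization()
print(f"CPU busy {cpu_pct:.1f}%, IO-delay {iodelay_pct:.1f}%")
```

Running this alongside an active verify task should reproduce the figures quoted above.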
Neither CPU nor HDDs appear to be the bottleneck. The verify process appears to be limited by a single sequential read path, which matches the default configuration of 1 verification reader.
---
What we have considered
Increasing readers might seem counterproductive on HDDs due to seek contention. However, given that IO-Delay is only 1.5% and IOPS are around 100, the drives appear to have significant headroom. With `chunk iteration order: inode` reducing seek overhead, it is unclear whether additional readers would cause contention or simply better utilize available throughput.
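One way to settle the reader question empirically, before touching the verify config, would be to read a random sample of chunk files at increasing concurrency levels and compare aggregate throughput. A hedged sketch, assuming the datastore is mounted with the usual `.chunks` directory layout (the path below is a placeholder, adjust to your mount point):

```python
import os
import random
from concurrent.futures import ThreadPoolExecutor
from time import perf_counter

def read_file(path):
    # Read one chunk file fully, as a verify reader would.
    with open(path, "rb") as f:
        return len(f.read())

def throughput(paths, threads):
    # Aggregate read throughput in MB/s with `threads` concurrent readers.
    start = perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        total = sum(pool.map(read_file, paths))
    return total / (perf_counter() - start) / 1e6

chunk_dir = "/mnt/datastore/.chunks"  # placeholder path -- adjust to yours
if os.path.isdir(chunk_dir):
    all_chunks = [os.path.join(d, f)
                  for d, _, fs in os.walk(chunk_dir) for f in fs]
    sample = random.sample(all_chunks, min(1000, len(all_chunks)))
    for n in (1, 2, 4, 8):
        print(f"{n} readers: {throughput(sample, n):.0f} MB/s")
```

Caveat: repeat runs over the same sample will hit the ZFS ARC, so use a fresh sample per concurrency level (or a sample much larger than RAM) for honest numbers.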
Increasing workers beyond 4 seems unlikely to help — SHA256 computation is fast and the workers appear to be starved by the single reader rather than being the bottleneck themselves.
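A quick way to substantiate the "SHA256 is fast" claim is a single-core hashing micro-benchmark on a PBS-sized 4 MiB chunk (a sketch; the zero-filled buffer and iteration count are arbitrary, and results are machine-dependent):

```python
import hashlib
import time

# 4 MiB of zeros stands in for one maximum-size PBS chunk.
chunk = b"\x00" * (4 * 1024 * 1024)
n = 64

start = time.perf_counter()
for _ in range(n):
    hashlib.sha256(chunk).digest()
elapsed = time.perf_counter() - start

mb = n * len(chunk) / 1e6
print(f"single-core SHA-256: {mb / elapsed:.0f} MB/s")
```

If one core sustains several hundred MB/s, four workers are nowhere near saturated at the observed 50–200 MB/s read rate, which supports the starved-reader hypothesis.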
Namespace-based staggering does not help either since deduplication is datastore-wide — chunks referenced across namespaces exist once physically and must still be read regardless of which namespace is being verified.
---
Questions
Given the measured utilization numbers — 1.5% IO-Delay, ~2.75% CPU, ~100 IOPS on 14 HDDs — how should verification readers and workers be tuned for a datastore of this size? Is 1 reader intentionally conservative, and what are the trade-offs of increasing it?
Is there a recommended approach for large HDD-based datastores that we are missing?
Thanks for your efforts.
Regards, Stefan