[TUTORIAL] Datastore Performance Tester for PBS

Der Harry
Sep 9, 2023
This is a performance tester for datastores of your PBS.

-> Intended to be run before you set up a production PBS. <-

Bash:
apt-get update
apt-get install git
git clone https://github.com/egandro/pbs-storage-perf-test.git
cd pbs-storage-perf-test
# replace datastore-dir with your own directory
./create_random_chunks.py /datastore-dir
rm -rf /datastore-dir/dummy-chunks

- Read how/what/and why we test in this way: https://github.com/egandro/pbs-storage-perf-test/blob/main/README.md
- Our results: https://github.com/egandro/pbs-storage-perf-test/blob/main/results.md
- Our conclusion: https://github.com/egandro/pbs-storage-perf-test/blob/main/conclusion.md
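
To get a feel for the kind of workload the tester generates, here is a minimal sketch (not the actual script; the directory name, file count, and 4 KiB file size are made up for illustration) that times the creation of many small random files:

Bash:
# Toy benchmark: time the creation of many small random files.
# This only illustrates the access pattern; real numbers come from the repo's script.
TESTDIR=$(mktemp -d)
COUNT=1000
start=$(date +%s%N)
for i in $(seq 1 "$COUNT"); do
    head -c 4096 /dev/urandom > "$TESTDIR/chunk-$i"
done
end=$(date +%s%N)
created=$(ls "$TESTDIR" | wc -l)
echo "created $created files in $(( (end - start) / 1000000 )) ms"
rm -rf "$TESTDIR"

On slow backends (nfs, smb) the per-file overhead dominates, which is exactly what this kind of test exposes.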
 
I'm glad to hear this one, since I've got exactly that setup:
"it is ok to have your PBS installed as VM and put the virtual datastore disk (the .qcow2 file in Proxmox) on nfs"

And this was obvious, but now it's been proven:
"avoid nfs and samba like the plague"

And yes, I built one of these out of a pile of junk. Much more challenging than it sounds! Good to hear that I could have done worse somehow.
"usb is not super bad (in contrast to nfs, smb)"


I don't really do Python, but I'm going to dig through the code. I want to repurpose it as a before-and-after test for disk tuning.
 
Concerning "give some reasons why zfs should be preferred over ext4, our numbers don't show any benefit in zfs so far":
ZFS is about data integrity and not performance. Ext4 is way simpler and therefore usually performs better. Same for mdadm as software raid.
ZFS for PBS is useful to:
- integrity checks of non-chunk files such as index/catalog files (the built-in verify jobs will only verify the chunks)
- the ability to fix corrupted chunks. PBS verify tasks can only detect, not correct, corrupted chunks. PBS will try to upload such a chunk again once it is marked as corrupted, but you are screwed if that data no longer exists on the PVE or on another PBS. This matters especially because a single corrupted chunk can make ALL of the backup snapshots of a VM non-restorable, from the backup done an hour ago down to the first backup snapshot you did 4 years ago, or even the backup snapshots of multiple VMs. That's the problem with deduplication: if nothing is ever stored twice, you are screwed if that single copy is lost or gets damaged. It's another reason why a single PBS isn't a good idea and why it's a good thing to have multiple synced PBSs at different locations.
- snapshots (search the forums for how often people ask if an accidentally deleted backup snapshot can be restored)
- way faster GC tasks, in case you use some SSDs as special devices or L2ARC to store all the metadata on fast SSDs and only the data on HDDs (many people don't want to spend on SSD-only storage)
- software raid
- some people also like to abuse the PBS's ZFS pool as a replication target for backup purposes outside of PVE ("zfs send | zfs recv")

So performance-wise, ZFS only helps to boost HDD performance by using SSDs, or by combining lots of smaller disks into a raid array for better bandwidth/IOPS instead of a few bigger disks.
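
Such a special-device setup is configured at pool creation time. A minimal sketch with hypothetical device names (not part of the tested setups; adjust to your hardware, and mirror the special vdev, since losing it means losing the whole pool):

Bash:
# HDD mirror for data plus a mirrored SSD "special" vdev for metadata.
# Device names are placeholders - adjust to your hardware.
zpool create tank mirror /dev/sda /dev/sdb special mirror /dev/nvme0n1 /dev/nvme1n1
# Optionally also store small records (here <=16K) on the SSDs:
zfs set special_small_blocks=16K tank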
 
Concerning "give some reasons why zfs should be preferred over ext4, our numbers don't show any benefit in zfs so far":
ZFS is about data integrity and not performance. Ext4 is way simpler and therefore usually performs better. Same for mdadm as software raid.
ZFS for PBS is useful to:
We ran a speed test. Do you have numbers showing that it performs better?

I can't. Probably on raid; I covered that, but I didn't test it.

Anything else about ZFS: yes, true, but this is a filesystem speed test, not a feature test.
 
Anything else about ZFS: yes, true, but this is a filesystem speed test, not a feature test.
I would count "HDD for data + SSD for metadata" vs "HDD for data as well as metadata" as an important part of a filesystem speed test, not just a feature test. It will result in magnitudes faster GC tasks and is a cheap option for people who aren't willing to pay for SSD-only storage, as the SSDs only need to be 1-2% of the capacity of the HDDs.
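
The reason metadata placement matters so much for GC is that a GC run has to stat and touch every single chunk file, so it is almost pure metadata I/O. A self-contained toy illustration of that access pattern (the file count and names are made up; a real PBS datastore keeps its chunks under the datastore's .chunks/ directory):

Bash:
# GC walks and stats every chunk file; directory-walk speed on the
# metadata is a rough proxy for that cost.
DS=$(mktemp -d)
mkdir -p "$DS/.chunks"
for i in $(seq 1 500); do : > "$DS/.chunks/chunk-$i"; done
n=$(find "$DS/.chunks" -type f | wc -l)
echo "stat'ed $n chunk files"
rm -rf "$DS"

On a real datastore with millions of chunks on HDDs, this walk is what makes GC take hours, and it is the part that fast SSD metadata accelerates.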
We ran a speed test. Do you have numbers showing that it performs better?
There are multiple threads in this forum where people benchmarked ZFS raid vs mdadm raid. From what I remember, mdadm always performed better. Same for ext4 on LVM vs a ZFS dataset.
 
I would count "HDD for data + SSD for metadata" vs "HDD for data as well as metadata" as an important part of a filesystem speed test, not just a feature test.

Feel free to send a PR with the data you collect; I am happy to put it in the result set. I consider that > not < important. If ext4 vs. zfs only differs by 1-5 seconds for a large run, e.g. n=100,000, it's not relevant. There is no significant gain in using that.

If you have the time and the resources to prove that LVM (which was never tested!) is 3 seconds faster than ZFS raid, feel free to send a PR.

In my tests, which you can actually read, > I < could not find a benefit in ZFS.

Feature-wise, there probably is one...

There are multiple threads in this forum where people benchmarked ZFS raid vs mdadm raid. From what I remember, mdadm always performed better. Same for ext4 on LVM vs a ZFS dataset.

I didn't run multiple hardware / ...

The point I wanted to prove: avoid nfs / smb ... if you are forced to use a remote fs (which is bad!), use sshfs. That's the fastest of the worst.
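
If someone does end up in that forced-remote situation, an sshfs mount is simple to try. A sketch with hypothetical host and path names (not from the tutorial; adjust to your environment):

Bash:
# Hypothetical host and paths - adjust to your environment.
apt-get install -y sshfs
mkdir -p /mnt/remote-datastore
sshfs backup@remote-host:/export/pbs-datastore /mnt/remote-datastore \
    -o reconnect,ServerAliveInterval=15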


Feel free to send PRs! I am happy to add them and make them available to the public.

I couldn't prove, for 500,000 files on the same (single drive) hardware, that ext4 was better or worse than ZFS.
 
