Benchmark ideas for a ghetto build

Ok, I will try ;)

Why does an external 2.5" HDD perform as well as an NVMe drive for reusing data?

Expectation:
Poor performance, because PBS has to read many small files/metadata/hashes to compare against the backup task. I thought that is why Proxmox officially recommends a special vdev, because metadata performance is so important.

Reality:
Turns out even a very bad HDD performs as well as an NVMe for that task.
 
Oh, and here is the hardware and the runs I am comparing:

From PVE to PBS, ext4 on an external USB 3.0 2.5" HDD
vs.
From PVE to PBS, ext4 on an INTEL SSDPEKKW128G7
 
Is nobody able to answer this?

Why does an external 2.5" HDD perform as well as an NVMe drive for reusing data?


Otherwise I can't really plan for what my bottleneck would be in a real setup.
 
I have two PBS systems.
One uses a 2,5" HDD.
The other a half decent NVME SSD.
When I run the backup tasks twice (aka no new data, aka 100% data reused), why is there no performance difference?
 
Compare a Verify Job, then you'll see the difference.
PBS can't write at max sequential disk speed because it doesn't write sequentially.
The point of PBS is forever-incremental, deduplicated backups, which let you skip full backups each time, a must for daily backups.
 
Compare a Verify Job, then you'll see the difference.
That is not that important to me, as long as the verify job can finish within 6 days ;)
No but seriously, I mostly do my backups irregularly. My VMs keep their non-OS and important data mostly on NFS shares.
So I don't back them up regularly, only during maintenance. Changes are small, and I only care about backup performance because I don't want to wait multiple minutes.

If there is no new data, the backup disks aren't used.
Interesting. I thought that PVE somehow needs to know what has changed and what it has to resend, and thus reads some information from PBS. Apparently this is not very read-intensive on PBS, but more so on PVE.
 
Interesting. I thought that PVE somehow needs to know what has changed and what it has to resend, and thus reads some information from PBS. Apparently this is not very read-intensive on PBS, but more so on PVE.
PVE sends everything to PBS. So there is a full read on PVE. PBS then deduplicates on disk what has already been stored.
 
PVE sends everything to PBS. So there is a full read on PVE. PBS then deduplicates on disk what has already been stored.
Ah, that makes sense. So basically both disks were not the bottleneck and were able to receive roughly 100 MB/s, and the actual bottleneck was the PVE TLS speed of 117.54 MB/s?
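To sanity-check that guess with rough arithmetic (my own numbers, not official figures): 1 GbE tops out at 125 MB/s, so a TLS benchmark result of 117.54 MB/s sits just below gigabit line rate, and either one caps a full send of the 32 GB example VM at roughly four and a half minutes:

```python
# Rough sanity check (illustrative arithmetic only).
gbe_line_rate = 1000 / 8   # 1 Gbit/s in MB/s -> 125.0
tls_speed = 117.54         # MB/s, from `proxmox-backup-client benchmark`

cap = min(gbe_line_rate, tls_speed)
vm_mb = 32 * 1000          # the 32 GB example VM, in MB

print(f"effective cap : {cap} MB/s")
print(f"full transfer : ~{vm_mb / cap:.0f} s (~{vm_mb / cap / 60:.1f} min)")
```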
 
That is very interesting.
I somehow, without any knowledge to back it up, don't trust live backups.
Yeah, I know that QEMU should handle that, but to me it feels like added complexity in something where I want to reduce the chances for errors as much as possible.

Snapshot mode is probably also not applicable to me, since I use ZFS and you are not talking about ZFS snapshots, but qcow2 snapshots.

An official ZFS snapshot sending mechanism from Proxmox would be pretty cool :)

But until then, I will use PBS with shutdown backups and try to figure out the bottleneck of my workload.
Currently, for my use case, I think it is the PVE TLS speed. Would anyone agree with that?
 
Of course "Snapshot" mode works with ZFS and LVM-thin.
Sorry, I wasn't precise with my words.

Sure, I can create a backup in snapshot mode, but at least with an NFS share as the backup destination it will create another full backup and not only send the changes. In the ZFS world (zfs send), for a snapshot you only send the delta.
 
PVE sends everything to PBS. So there is a full read on PVE. PBS then deduplicates on disk what has already been stored.
Slightly different: PVE reads everything (piece by piece) and calculates the checksum of each chunk. This checksum is sent to PBS. The actual data is only transmitted if that checksum is not already known.

(Bitmaps are already mentioned in #31)
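To make that concrete, here is a minimal sketch of the idea (illustrative only: `known_digests` and `send_chunk` are hypothetical stand-ins, not the actual PBS API, and the real client also compresses and encrypts each new chunk):

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB fixed-size chunks, the PBS default for VM images

def backup_image(path, known_digests, send_chunk):
    """Read an image chunk by chunk; upload only chunks the server doesn't know.

    known_digests: set of SHA-256 digests already on the server (assumed).
    send_chunk:    callable that ships one chunk to the server (assumed).
    """
    sent = skipped = 0
    with open(path, "rb") as img:
        while chunk := img.read(CHUNK_SIZE):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest in known_digests:   # server already has it: reuse
                skipped += 1
            else:                         # new data: actually transmit it
                send_chunk(digest, chunk)
                known_digests.add(digest)
                sent += 1
    print(f"uploaded {sent} chunks, reused {skipped}")
```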
 
Slightly different: PVE reads everything (piece by piece) and calculates the checksum of each chunk. This checksum is sent to PBS. The actual data is only transmitted if that checksum is not already known.

(Bitmaps are already mentioned in #31)
Ah, that makes sense. That probably also explains why the random metadata read speed on PBS is unimportant in my scenario and both disks offer similar results.

I know this could come off as tedious, but I am interested in how the backup process works; otherwise, I can't really plan what hardware I should use. The documentation on the steps involved is a little scarce, and the hardware requirements are pretty vague.

So for my workload, would you agree that these are the steps involved?

1. Reading data on PVE. Probably sequential? Not in the benchmark.
2. Calculating the checksum for a chunk. Probably the SHA256 value from the benchmark.
3. Since I back up VMs, PVE and PBS use fixed-size chunks, 4 MiB by default. Since I shut down VMs for backup, dirty bitmaps are not an option. So to back up the example 32 GB VM, there will be roughly 8,200 chunks to compare against PBS.
4. Comparing these values is a sequential (?) but rather easy small-file read operation on PBS. Does this also depend on network latency? Probably not, since my guess is that the bitmap is also stored on PBS, so there is no need to compare 8,200 SHA256 values over the internet. Not in the benchmark.
5. If the chunk is new, compress it on PVE. This is the compression speed from the benchmark.
6. Encrypt it and send it over TLS to PBS. This is the TLS speed from the benchmark. (A rough sketch of this pipeline follows below.)
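Here is a back-of-the-envelope sketch of those steps; the per-stage speeds are made-up stand-ins for real benchmark output, and only the 117.54 MB/s TLS figure comes from this thread:

```python
CHUNK = 4 * 1024**2        # 4 MiB fixed-size chunks (PBS default for VM images)
VM_BYTES = 32 * 1024**3    # the 32 GiB example VM
print(f"chunks to compare: {VM_BYTES // CHUNK}")  # -> 8192, i.e. "roughly 8200"

# Hypothetical per-stage throughputs in MB/s; only the TLS figure is from
# this thread, the rest are placeholders for real benchmark numbers.
stages = {
    "1. read source disk": 450.0,   # assumed sequential read speed on PVE
    "2. sha256 checksum":  500.0,   # benchmark: SHA256 speed
    "5. compress":         600.0,   # benchmark: compression speed
    "6. encrypt + TLS":    117.54,  # benchmark: TLS speed
}

vm_mb = VM_BYTES / 1_000_000
for name, speed in stages.items():
    print(f"{name:22s} ~{vm_mb / speed:5.0f} s at {speed} MB/s")

# Steps 1-2 run over the full data set every time; steps 5-6 only run for
# chunks PBS doesn't already know. With 100% reused data the slow TLS stage
# drops out entirely, which matches the HDD-vs-NVMe observation above.
```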
 
So for my workload, would you agree that these are the steps involved?
Looks good. Note that I am not a developer, just a user:
1. Reading data on PVE. Probably sequential?
Sequential with regard to the "top view", be it a .qcow2 file or an LVM/ZFS block device. Note that the actual data may (will!) be scattered across the physical device. That's why IOPS are often more relevant than bandwidth.
4. Comparing these values is a sequential (?) but rather easy small-file read operation on PBS. Does this also depend on network latency? Not in the benchmark.
Sure! As soon as you need to go over a wire for each and every chunk (possibly multiple times!), you introduce relevant latency. Local storage (NVMe) may give you hundreds of thousands of IOPS, while TCP/IP over 1/10/100 GBit/s will always yield only a fraction of that. That's one reason why caches are often involved, introducing their own problems...

Whether this is really dramatic/important or whether it can just be ignored depends on your actual use case and requirements. I cannot judge that.
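To put rough numbers on the latency point (hypothetical round-trip times, not measurements):

```python
# Hypothetical per-lookup times; real values depend on your setup.
lookup_time = {
    "local NVMe (~100k IOPS)": 10e-6,   # ~10 us per random read
    "LAN round trip (1 GbE)":  0.3e-3,  # ~0.3 ms
    "WAN round trip":          20e-3,   # ~20 ms
}

chunks = 8192  # the 32 GiB / 4 MiB example from above

for name, t in lookup_time.items():
    print(f"{name:26s} ~{1 / t:>9,.0f}/s "
          f"-> {chunks * t:6.1f} s for one lookup per chunk")
```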
 