First backup of a single VM to PBS runs VERY VERY slowly

lanepark5033 (New Member), Feb 20, 2025
After installing a qemu-img qcow2 copy of a bare-metal Manjaro system via 'qm importdisk' and then running a first backup, it took 15+ hours.
Immediately afterwards I ran a second backup (as expected, nothing was transferred) and it only took 15 minutes.
The volume size is 485GB
The fs is ext4.
The backup storage is a 2TB nvme and has very little stored on it.
All the VMs and LXCs were shut down, including the one being backed up.
I do understand that the entire VM disk gets transferred on the first run, i.e. incremental backup in PVE is meaningless for the first backup, and that the second backup only exercises the incremental process in PVE.
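
For context, the import itself was roughly along these lines (the paths, VM ID and target storage name here are just placeholders):

Code:
# convert the bare-metal image to qcow2, then attach it to the VM
qemu-img convert -p -f raw -O qcow2 /mnt/usb/manjaro-bare-metal.img /tmp/manjaro.qcow2
qm importdisk 100 /tmp/manjaro.qcow2 local-lvm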

I ran nload on both the PVE and the PBS while running the first backup and saw a rate of about 20 MB/s on both servers.

So why is there a 60:1 speed ratio on exactly the same network? (The network is 2.5Gb throughout: router, cables and NICs.)
 
What's the model of your 2 TB NVMe?
 
My mistake; after opening the PBS I remembered that the storage inside consists of one 500GB SSD and two rotating HDDs. The SSD is full, and the 2TB HDD (7200 RPM) is full. The second HDD (7200 RPM) is a 6TB disk about 8% full.
The NVMe drive I was thinking of is in the PVE host; it's a Crucial ct2000p3ssdb, Gen 3, which does not have its own DRAM cache and relies on a host memory buffer instead.
The host has 128GB of RAM, so plenty of memory, on a Ryzen 7 7700 AM5 motherboard.
 

Hmm, I see—assuming that your datastore is on that Crucial NVMe SSD, I think that the host memory buffer might be at fault here, then. The problem with an HMB is that it (eventually) will cause performance to degrade during random reads/writes.

I double-checked, just to be sure; to quote:
HMB doesn’t fully replicate the DRAM, but it is an innovative solution to enhance SSD performance. HMB stores the FTL but not the entire table. Additionally, it doesn’t generally perform other tasks that the DRAM would typically do. HMB is slower, uses a tiny chunk of your system RAM, and is mostly just for metadata—it can’t buffer writes well or match DRAM’s performance. Therefore, if your SSD has a larger capacity, you may experience performance issues with HMB, especially in random read/write scenarios.
(Source: https://storedbits.com/hmb-vs-pseudo-slc-vs-dram-ssd/)

This is one of the reasons why we're recommending enterprise SSDs—consumer-grade SSDs usually can't deal with such sustained randread / randwrite IOPS loads. The same goes for HDDs if you don't provide them with some kind of metadata cache.

Note that this doesn't mean that you have to toss out your SSDs and HDDs though; you can still use them for secondary datastores with long retention settings. You can then run a periodic pull sync job that syncs the data from your fast primary datastore to your slower secondary datastore. So, in such a scenario, your primary datastore would then just be a single smaller enterprise-grade SSD with lower retention settings. In the pull sync job, don't tick "Remove Vanished" in the configuration and let a prune job for the secondary datastore handle the retention.
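
Just as a rough sketch of how that could look on the CLI (the remote, datastore and schedule names below are made up; the GUI works just as well):

Code:
# pull from the fast primary datastore (added as a remote) into the slow secondary one;
# no --remove-vanished, so the sync itself never deletes anything on the secondary
proxmox-backup-manager sync-job create sync-to-slow \
    --store slow-hdds --remote primary --remote-store fast-ssd --schedule daily

# a separate prune job on the secondary datastore handles retention instead
proxmox-backup-manager prune-job create prune-slow \
    --store slow-hdds --schedule daily --keep-daily 14 --keep-weekly 8 --keep-monthly 12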

Or if you happen to have an unused machine lying around, you could use that as your slower, secondary PBS that you sync to (pull or push, in this case).

I hope that helps!
 
Thanks, Max, for taking the time to research the info regarding HMB. Though I knew about HMB and NVMe devices, I was unaware that HMB only stores some of the FTL and will constrain random reads and random writes. Actually, a backup of a newly installed large VM disk should not put a very large load on the source of the backup. I would expect the majority of the load in the total process to be on the PBS.
In my case the DASD installed in the PBS is a 500GB SSD and two HDDs in an LVM setup. When I do a pvdisplay, the SSD is sda3 (so part of the Proxmox BS OS) and the 2TB HDD shows as full, which means the complete output of the backup is on one drive of the LVM (the 6TB HDD). I'm wondering if the overhead of the LVM, with one of its members also holding the PBS OS, will cause a large load. I may need to reduce the LVM for the datastore to the two HDDs. Your thoughts?
 
Thanks, Max, for taking the time to research the info regarding HMB. Though I knew about HMB and NVMe devices, I was unaware that HMB only stores some of the FTL and will constrain random reads and random writes.
You're welcome! Yeah, these things can be a bit (actually, quite) annoying—especially in the case of SSDs. Sometimes these issues show up even when you think you've ruled every problem out through extensive testing and benchmarking. SSDs with a fast SLC cache in front of otherwise slow QLC cells are one of the things I personally really despise, because the device will tell the OS that the writes have been completed as soon as they land on the cache—even though the SSD will continue moving data from the cache to the actual storage in the background. So if your benchmark didn't run long enough, you might not actually see the bottleneck appear. And once it appears, write performance just slows to a crawl.
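
If you ever want to see that effect for yourself, a long-running random-write test usually exposes it; something like the following, pointed at a scratch file on the SSD in question (the path is a placeholder, and it writes real data, so don't aim it at anything important):

Code:
# sustained 4k random writes for 10 minutes, bypassing the page cache;
# watch whether the bandwidth collapses once the cache is exhausted
fio --name=sustained-randwrite --filename=/path/on/that/ssd/fio.tmp \
    --rw=randwrite --bs=4k --size=20G --runtime=600 --time_based \
    --ioengine=libaio --iodepth=32 --numjobs=4 --direct=1 --group_reporting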

Actually, a backup of a newly installed large VM disk should not put a very large load on the source of the backup. I would expect the majority of the load in the total process to be on the PBS.
Also, just to clarify here: Whether there's going to be any load on the source of the backup (so, your PVE host / cluster) depends on several factors. For example, if your VM's disk is backed by a storage that uses HDDs, you might get quite a few hiccups inside the VM (file operations taking longer, etc.), as there is even more concurrent IO that the HDDs have to deal with. As another example, if the target storage is slow, then backups will usually just take more time; I don't remember if there are any performance penalties on the source in that case (but I believe that there will be eventually, e.g. if the backup is made in "snapshot" mode; I actually never bothered to test this in detail myself).

The network speed overall is bottlenecked by source and target storage bandwidth and latency, as well as the overall available network bandwidth (for the backup job) and latency. For example, let's say you have a beefy 25Gbit/s network interface, and you have a lightning-fast storage on your PBS host (the target) that can manage sustained random write loads with a bandwidth of 5GB/s (so, 40Gbit/s) without any issues. Now, if your source's storage can at most only offer read speeds at 250MB/s (so, 2Gbit/s), then the source's storage will be the bottleneck. So, when it comes to things like these, the slowest component will usually be the bottleneck.

(I apologize if you're aware of all of this by the way, I thought I might clarify all of this for other readers as well, in case they're not as familiar with this.)
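
If you want to pin down which component is the limit in your setup, it helps to measure the pieces in isolation; for the network part, iperf3 is the usual tool (it needs to be installed on both hosts, and the address below is a placeholder):

Code:
# on the PBS host
iperf3 -s
# on the PVE host; a 2.5GbE link should land somewhere around 2.3 Gbit/s
iperf3 -c <pbs-address> -t 30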

In my case the DASD installed in the PBS is a 500GB SSD and two HDDs in an LVM setup. When I do a pvdisplay, the SSD is sda3 (so part of the Proxmox BS OS) and the 2TB HDD shows as full, which means the complete output of the backup is on one drive of the LVM (the 6TB HDD). I'm wondering if the overhead of the LVM, with one of its members also holding the PBS OS, will cause a large load. I may need to reduce the LVM for the datastore to the two HDDs. Your thoughts?

Ah, so if I understand correctly, you're mixing your HDDs and your SSD in your LVM setup? Yeah, that isn't too ideal. I'm not 100% sure how LVM handles this under the hood (and how it behaves in all the different ways you can configure that) but in most storage setups, you don't want to mix different "classes" of storage devices like that (with some tiny exceptions).

For example, if you have a mirror consisting of two devices, one of them being an HDD and the other an SSD, then most volume managers / filesystems will only report that a write operation has been completed if the data was written to both devices. So in that case, writes are bottlenecked by the HDD. Reads can still be fast though, but I'm not sure if LVM automatically "understands" that one device is faster than the other.
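
By the way, you can check how LVM has actually laid the data out across the devices with its reporting tools, e.g.:

Code:
# which physical volumes back each logical volume, segment by segment
lvs -a -o +devices,seg_size
# how much of each physical volume is actually in use
pvs -o +pv_used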

So what I would personally recommend here is that you put your OS on the SSD alone and leave it in its own LVM pool or ZFS configuration. Using ZFS has the additional benefit that you can convert it into a mirror by adding another 500GB SSD later on, if you so wish. Otherwise, it's fine to stick with LVM. Then, put your primary datastore on a fast, enterprise-grade SSD. Alternatively, you can test if your current 2TB SSD works here too; if you do hit a bottleneck, it's probably still not going to be as bad as your HDDs under load. So, you might be able to get away with it. You'll have to test and measure, though.
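
To illustrate the "convert it into a mirror later" part: with ZFS that is essentially one command, assuming the pool is the installer's default rpool and the device names below stand in for your real /dev/disk/by-id/... paths:

Code:
# attach a second SSD to the existing single-disk vdev, turning it into a mirror
# (for a boot pool you'd additionally have to replicate the partition layout and bootloader)
zpool attach rpool /dev/disk/by-id/ata-OLD-500G-SSD /dev/disk/by-id/ata-NEW-500G-SSD
zpool status rpool    # then watch the resilver finish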

Finally, configure the two HDDs in whatever way you'd like; those would then be your slower, secondary datastore that will be kept in sync with the primary one, but has higher retention settings (since you have more space available). I can again recommend ZFS here, even for something like RAID-0 or RAID-1, because it makes it easier to tinker with your setup later (e.g. if you need to expand the storage of your secondary datastore).

Overall, I'm personally a big fan of ZFS for such setups, because—as I already mentioned—you can expand your ZFS pool later on if you so wish. So, if you set up your secondary datastore using a zpool with a mirror vdev (so, RAID-1 basically), and you find that it still is slow, you can still add a fast special device later on that stores the metadata for your pool. At least in the case of PBS, a lot of random IO is performed just for accessing metadata. (So, looking up chunks on the datastore, checking access times, etc. I could elaborate on this more, but this post is already long enough. :P) Just ensure that your special device in ZFS is also mirrored, because if it dies, your pool's gone, too. Since that might become a bit expensive though (you'd need two new enterprise-grade SSDs for that), you at least can always do that later, or just not at all.
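
A rough sketch of how the two HDDs plus an optional special device could look (pool name, datastore name and device paths are all placeholders):

Code:
# mirror the two HDDs and put the secondary datastore on top
zpool create -o ashift=12 tank mirror /dev/disk/by-id/ata-HDD-1 /dev/disk/by-id/ata-HDD-2
zfs create tank/pbs-secondary
proxmox-backup-manager datastore create secondary /tank/pbs-secondary

# later, optionally: add a mirrored special device so the pool's metadata lands on fast SSDs
zpool add tank special mirror /dev/disk/by-id/nvme-SSD-1 /dev/disk/by-id/nvme-SSD-2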

I hope that all of this helps (sorry for the wall of text)—there should be more around here in our forum regarding such setups if you sift through it a little.
 
I'll try to be brief: I scratched all the storage in my PVE and reinstalled with a dedicated 500GB SATA SSD as an XFS disk, as you suggested (thanks). Then I defined the 2TB NVMe drive as local-xfs, used for VM and LXC storage. The second 500GB SATA SSD is empty and awaiting use as future needs arise. (XFS does not let me reduce the size of a volume, so I'm keeping the second SATA SSD as unused disk space.)
When I tried to restore VMs and LXCs I ran into issues, as I had deleted (from the PVE) the local-lvm definition, so the first attempt to restore anything failed, but surprisingly a second attempt restored all of the VMs. Lesson learned: don't delete things until you are sure they are not needed. I did discover how to restore the local-lvm definition; another lesson learned.
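For anyone else who ends up in the same spot: re-adding the default definition turned out to be roughly this (assuming the standard pve volume group with its data thin pool still exists):

Code:
pvesm add lvmthin local-lvm --vgname pve --thinpool data --content rootdir,images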
So thank you so very much for your guidance and research. Oh, and the restores ran MUCH faster and performance is MUCH better.
Regards