Hi, I am just posting to make sure my understanding of this topic is correct.
Core issue: backup speeds for LXC containers using Proxmox and the latest PBS.
Most of the LXC containers I run are relatively small (50 GB or less, often more like 20 GB).
I manage one Proxmox host with a single LXC container that is a lot bigger than this (500 GB),
and its backups to PBS are really slow (roughly 12 hours).
This is arguably mediocre-spec equipment, but KVM VMs on the same setup/location back up much more smoothly. So I think in part I'm just seeing the simple thing that:
KVM backups with PBS have "change block tracking" style (dirty bitmap) backups, so they are more efficient: only the deltas need to be captured and sent.
LXC backups don't have any such mechanism, so every run effectively has to look at the whole container: in suspend mode you get the multi-step rsync to a local temp dir (initial pass plus top-up pass), then that data gets chunked and compared against the chunks already in PBS, and only the new chunks are sent over. But you still have to read all the data and stage the backup locally before the dedup/diff can happen. So on a box with a mediocre local temp/cache disk (a SATA RAID mirror), you get mediocre PBS<>LXC backup performance for a container this big, even if the nightly deltas turn out to be trivial once the dust settles (from the logs I think I'm averaging about 650 MB of new data for each daily backup of this 500 GB container).
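For what it's worth, here is a minimal sketch of the knobs I understand vzdump exposes for this; the CT ID, storage name and temp path are placeholders, not my real config:

# assumed CT ID 105 and a PBS storage named "pbs-store"
# snapshot mode reads from a storage snapshot and skips the suspend-mode rsync staging
vzdump 105 --mode snapshot --storage pbs-store
# if the job falls back to suspend mode, at least point the temp dir at faster storage
vzdump 105 --mode suspend --storage pbs-store --tmpdir /mnt/fast-ssd/vzdump-tmp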
So. Kind of just sanity checking this.
1) Is there any near-term feature on the horizon, for example upstream, for something like "change block tracking" for LXC? Presumably that would make LXC backups in this scenario a lot faster if it were an option.
2) Or do I just give up on large LXC containers, and basically assume that once I'm above a ~100 GB disk footprint I go with a KVM-based VM, so I get change block tracking, backups are faster, and life is just better?
3) Any other suggestions? I did find a few things via Google; I think at least one person has made a wrapper script that does clever things with under-the-hood snapshots (i.e. an LVM snapshot on the underlying storage where the LXC container volume resides) to make the backup performance not suck (see the sketch after this list). But it feels like I'm opening a new can of worms if I start trying such methods. My preference is generally "keep it simple" to minimize the risk of human "oops, broke it" situations.
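For reference, my rough understanding of what that wrapper-script approach boils down to is below. This is purely a sketch: the VG/LV names, mount point, CT ID and PBS repository are made-up placeholders, and it assumes a classic (thick) LVM volume for the container disk.

#!/usr/bin/env bash
# sketch only: snapshot the container volume, back it up with proxmox-backup-client, clean up
set -euo pipefail
lvcreate -s -n ct105-snap -L 10G /dev/pve/vm-105-disk-0    # CoW snapshot of the CT volume
mkdir -p /mnt/ct105-snap
mount -o ro /dev/pve/ct105-snap /mnt/ct105-snap
# file-level (pxar) backup of the mounted snapshot straight to PBS
proxmox-backup-client backup root.pxar:/mnt/ct105-snap \
    --repository root@pam@pbs-host:datastore1
umount /mnt/ct105-snap
lvremove -f /dev/pve/ct105-snap

But as I said, I'd rather not carry something like this myself if plain vzdump can be made tolerable.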
For clarity, in case it is helpful, this is the specific setup in question:
3 physical hosts: a small Proxmox cluster with 2 Proxmox nodes, plus one PBS node.
The first, older Proxmox host is a Dell rackmount with a blend of PERC hardware RAID plus a bcache SSD to make the SATA data pool less slow (in theory).
The newer-ish Proxmox host is a Supermicro rackmount with a pure-SSD (2.5" drives) hardware RAID on an LSI controller, but a smaller storage pool for VMs.
The PBS host has a big honking SATA software RAID 5 storage config.
They are all connected (i.e. cluster comms and PBS backup traffic) with commodity-cheap 10 GbE gear ("best you can get on Amazon at a modest price point"),
so the 10 GbE performance is better than 1 GbE, but by no means best-of-class 10 GbE.
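If it matters, the rough sanity checks I'd run on that link and on the PBS ingest path look like this (hostnames and repository are placeholders):

iperf3 -s                                  # on the PBS host: raw TCP throughput server
iperf3 -c pbs-host -t 30                   # on a PVE node: measure the 10 GbE link itself
proxmox-backup-client benchmark --repository root@pam@pbs-host:datastore1   # TLS/compression/hash throughput as seen by the client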
KVM backup performance is fine on this hardware (i.e. most VM backups take minutes, not hours).
Proxmox performance is generally fine. Clearly the host with the big honking SATA RAID / bcache is not as fast as the pure SSD RAID, but it is sufficient for suitable workloads (i.e. a general-purpose file server with a modest pool of client computers talking to it at 100-1000 Mbit over a vanilla gigabit ethernet public interface).
LXC backups for the small containers are fine.
LXC backup for the big container is really quite painful, so this is the thing I am guessing I must change.
Thank you if you have read this far!
Tim