Is it possible to merge datastore content?

roadrunner_rad

Hi,

When I started with PBS, I created two datastores: one for PVE VM backups and one for PBS client (host) backups. Now, 1.5 years later, I see that the two datastores have grown very differently and that the content is separated by the name prefix host/ or vm/. I have lots of free space in the host/ datastore, but the vm/ datastore is nearly full.

Question: is it possible to merge the content of the two (or more) datastores into a new one, or one into the other, and if so, what is the best way to do this?

PBS 2.1.5, ZFS on iSCSI storage
 
You can set up a "remote" for localhost, then pull the contents from one store to the other. See

# proxmox-backup-manager pull
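
For anyone finding this later, a rough sketch of what that could look like. The remote name `local-pbs`, the user `sync@pbs`, and the datastore names `store-vm` and `store-host` are made up, and the password and fingerprint are placeholders. The option names for `remote create` are assumptions based on the PBS 2.x CLI, so check `proxmox-backup-manager remote create --help` first. By default the script only prints the commands; set `DRY_RUN=0` to actually run them.

```shell
#!/bin/sh
# Sketch: pull one local datastore into another through a "remote" that
# points back at the same PBS host. All names and secrets are placeholders.

# Echo a command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; fi
}

# pull_local REMOTE SRC_STORE DST_STORE
pull_local() {
    # Register localhost as a remote (assumed options: --host, --auth-id,
    # --password, --fingerprint).
    run proxmox-backup-manager remote create "$1" \
        --host localhost --auth-id 'sync@pbs' \
        --password 'PLACEHOLDER' --fingerprint 'PLACEHOLDER'
    # Pull everything from SRC_STORE on that remote into the local DST_STORE.
    run proxmox-backup-manager pull "$1" "$2" "$3"
}

pull_local local-pbs store-vm store-host
```

Printing first makes it easy to double-check which store is the source and which is the target before anything is transferred.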
 
Thank you so much, works like a charm!
Another question: I have two PBS servers that I use to back up my data. When the same VM or host exists on both with different snapshots, can I merge them in this manner too?
 
no, because for each group (like 'vm/123') the sync will only look at snapshots newer than the last local one.

the datastores and snapshots are pretty self-contained though, so - !! WARNING, COMPLETELY UNTESTED !! - merging two datastores manually on the disk level should probably work (if you include the .chunks dir), as long as you don't do any modifications (backup, tape-restore, sync, prune, GC) on the PBS level at the same time. the snapshot metadata files reference the chunks in .chunks, so copying the latter first should ensure no inconsistencies. I'd still recommend testing the whole procedure with some test instances first, and running a verify afterwards (and obviously if you have the space, copying both source datastores into a third, new datastore is a lot safer than attempting to merge in-place).
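
A minimal sketch of the "chunks first, then snapshot metadata" order described above, just as untested against a real PBS as the idea itself. It assumes both datastores are idle, that the backup groups live in top-level `vm/`, `ct/`, and `host/` directories, and that it runs as a user who can preserve file ownership. `copy_missing` and `merge_datastore` are hypothetical helper names, and the paths in the usage comment are placeholders. Existing files in the target are never overwritten.

```shell
#!/bin/sh
# Sketch: copy datastore SRC into DST, .chunks first, never overwriting.
# Assumes both stores are idle (no backup, sync, prune, GC, tape jobs).

# copy_missing SRC_DIR DST_DIR: copy every file that DST_DIR lacks.
copy_missing() {
    (cd "$1" && find . -type f) | while IFS= read -r f; do
        [ -e "$2/$f" ] && continue           # keep what the target has
        mkdir -p "$2/$(dirname "$f")"
        cp -p "$1/$f" "$2/$f"                # -p keeps mode and timestamps
    done
}

# merge_datastore SRC DST
merge_datastore() {
    # 1. Chunks first, so every index copied later finds its chunks.
    copy_missing "$1/.chunks" "$2/.chunks"
    # 2. Then the backup groups with their snapshot metadata.
    for type in vm ct host; do
        [ -d "$1/$type" ] || continue
        copy_missing "$1/$type" "$2/$type"
    done
}

# Usage (placeholder paths):
#   merge_datastore /path/to/source-store /path/to/target-store
```

The per-file loop is slow on a store with millions of chunks; it is written this way only to keep the skip-if-present behavior explicit. A verify on the target afterwards is still needed.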
 
Has this approach been tested and confirmed in the time since this post was made?
 
no, but the post still stands. in the meantime, maintenance mode arrived in PBS 2.x, so you can set both datastores to block any operations before starting the manual merge, which should rule out any interference from GC, pruning, and similar tasks.
 
not having a way to merge/sync old snapshots will ruin my weekend now. who the hell thought having the sync only pick up newer snapshots was a good idea? ngl i'm so mad at the person that came up with this glorious idea! still can't believe this is not implemented.
 
it follows the same principle as doing backups: it's dangerous to allow filling in slots between already existing backups by default, since that would allow stuffing the backup group with invalid snapshots in a way that the next prune would remove all valid ones. a new sync mode that allows filling in missing snapshots, and re-syncing broken ones, is planned, but will potentially require an additional privilege for that reason.
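
To make the rule concrete, here is a toy model of which snapshots a sync would consider for a single group, assuming the selection really is "newer than the last local snapshot" as stated earlier in the thread. This is not PBS code, only the comparison; `newer_than` is a hypothetical helper, and the timestamps are invented.

```shell
#!/bin/sh
# Toy model, not PBS code: from snapshot timestamps on stdin, print those
# newer than the last local snapshot given as $1.
# Same-format RFC 3339 UTC timestamps compare correctly as plain strings.
newer_than() {
    awk -v last="$1" '($0 "") > (last "")'
}

# Remote has January, February, and March; local already has February.
printf '%s\n' 2023-01-01T00:00:00Z 2023-02-01T00:00:00Z 2023-03-01T00:00:00Z |
    newer_than 2023-02-01T00:00:00Z
# prints 2023-03-01T00:00:00Z
```

The January snapshot is never considered, which is exactly the gap the original poster ran into.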
 
also, it's fairly easy to work around this if you have raw access to the datastore - create a new namespace, sync all snapshots into that (already existing ones are basically free, since only the snapshot metadata will be synced), then merge back the vm/XXX dirs into the original namespace (just ensure no destructive tasks are running in parallel for the last part).
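
A sketch of the last step (merging the vm/XXX dirs back), under a few loudly stated assumptions: the temporary namespace is stored on disk as `<datastore>/ns/<namespace>/`, snapshots sit at `vm/<id>/<timestamp>/` below that, and nothing destructive is running. `merge_back` is a hypothetical helper, and `tmp-sync` and the datastore path in the usage comment are placeholders. Snapshots that already exist in the root namespace are left alone, and groups that do not exist there yet are skipped rather than created.

```shell
#!/bin/sh
# Sketch: move snapshot dirs from a temporary namespace back into the root
# namespace. Assumes the layout <store>/ns/<namespace>/vm/<id>/<time>/
# and that no prune, GC, or sync is running.

# merge_back STORE NAMESPACE
merge_back() {
    store=$1
    ns_dir=$1/ns/$2
    for snap in "$ns_dir"/vm/*/*/; do
        [ -d "$snap" ] || continue           # glob matched nothing
        snap=${snap%/}
        rel=${snap#"$ns_dir"/}               # e.g. vm/100/2023-01-01T00:00:00Z
        group=${rel%/*}                      # e.g. vm/100
        if [ ! -d "$store/$group" ]; then
            echo "skip (group not in root namespace): $group"
        elif [ -e "$store/$rel" ]; then
            echo "skip (exists): $rel"
        else
            mv "$snap" "$store/$rel"
            echo "moved: $rel"
        fi
    done
}

# Usage (placeholder path and namespace):
#   merge_back /path/to/datastore tmp-sync
```

Because the chunks are shared within one datastore, only the small snapshot directories move; run a verify on the affected groups afterwards.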
 
thanks for the hint. i solved it by going the "not really suggested, but if you feel adventurous" route instead, which was: run rsync to a new datastore folder and sync all existing datastores to that, then let PBS sort the mess out by running pruning, garbage collection, multiple verifies, and restoring and testing some VMs from it to validate. which, to my surprise, worked shockingly well. i had absolutely zero issues with it, and all backups that i restored from this "mess" worked flawlessly. i never would have imagined it to be this easy after all. still, it's good to know the topic is on your radar and that implementing this is planned. thanks!
 
