[SOLVED] push sync job users - how many?

binary_jam

New Member
Jul 1, 2024
Hey everybody,

I can't quite wrap my head around how many users are necessary for sync jobs in push direction in this setup:
PBS1 has 5 datastores with snapshots from different sources in them.
PBS2 also has 5 datastores which contain mirrors of all snapshots from their equivalent datastores on PBS1.
Right now, this is set up with sync jobs in pull direction on PBS2. Those sync jobs use a single user on PBS1 and a single remote config on PBS2.

Now that sync jobs in push direction are supported, this setup is to be converted to pushing from PBS1 to PBS2.
Again, all snapshots of datastore X on PBS1 should be push-synced to datastore X on PBS2.
For testing purposes, one sync job has already been switched to push direction. It works fine.
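For reference, the test job is roughly equivalent to the following on PBS1 (all names and the password are placeholders, and since push direction is brand new the exact option names may differ slightly; proxmox-backup-manager sync-job create --help is authoritative):

# remote entry on PBS1 pointing at PBS2
proxmox-backup-manager remote create pbs2 --host pbs2.example.com --auth-id 'sync@pbs' --password 'xxxxx' --fingerprint '<PBS2 fingerprint>'

# push sync job: local datastore1 -> pbs2:datastore1 (add --remove-vanished true if the mirror should also drop pruned snapshots)
proxmox-backup-manager sync-job create s-push-datastore1 --sync-direction push --store datastore1 --remote pbs2 --remote-store datastore1 --schedule daily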

What I don't quite fully understand is this block in the documentation:
It is strongly advised to create a dedicated remote configuration for each individual sync job in
push direction, using a dedicated user on the remote. Otherwise, sync jobs pushing to the same
target might remove each others snapshots and/or groups, if the remove vanished flag is set or
skip snapshots if the backup time is not incremental. This is because the backup groups on the
target are owned by the user given in the remote configuration.

What exactly is "the same target"? Same remote PBS? Same datastore? Same namespace?
In this setup it would be five sync jobs - one for each datastore.
The docs speak of skipped snapshots during syncs and of unwanted deletions of previous snapshots on the target side.
Isn't this only the case if multiple sync jobs push to the same datastore (and even the same namespace)?

Is it safe in this setup (one sync job per datastore) to only have one remote config and one remote user?
The setup should be as complex as necessary and as simple as possible.

Thanks in advance for your input!
 
Hoo boy.

First off, why are you so eager to use push jobs? Gotta get into the new .0 feature?
Let it break a bit for other people first. So much easier that way.

I'm doing an awful lot of the same business as you, and I don't understand your question.
It's an interesting question, though ...

My overall take on this and the answer I've personally deployed is a series of Namespaces.
  • I used a Site root element (Helps distinguish where you are when you set up Sync Jobs.)
  • I combined Group into the root element. (My groups had different site-to-site sync requirements. I only have 2 Groups, but obviously this is extensible as needed.)
  • I used a functional sub-level of "back" and "sync".
  • I repeat this back and sync structure under every Group.

Site-Group
- back (This is backups that were generated at this site. We dump backup jobs here.)
- sync (This is backups that were generated at the remote site. We dump site-to-site here.)
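As a concrete sketch of how a sync job then only ever touches the sync branch (every name here is made up and the option spelling is from memory, so check the man page):

# pull job on Site B that lands Site A's own backups in Site B's sync branch
proxmox-backup-manager sync-job create from-site-a --store store-b --ns SiteB-Group1/sync --remote site-a --remote-store store-a --remote-ns SiteA-Group1/back --schedule daily

Backup jobs only ever write into the back branch, so the two never step on each other.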

It seems to me that, with this scheme, I will never have the issue they describe.

....
Oh ... and users ... again, not sure why we want to get all complicated.
I do have discrete backup and sync users.
And that's ok, because they write to discrete Namespaces with discrete Jobs.
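Permission-wise that ends up looking roughly like this (user, store and namespace names are placeholders, and the ACL path syntax is from memory, so verify against the access control docs):

# the backup user may only write into the back branch
proxmox-backup-manager acl update /datastore/store-b/SiteB-Group1/back DatastoreBackup --auth-id 'backup@pbs'

# the sync user may only write into the sync branch (use DatastorePowerUser instead if the job should also prune its own groups)
proxmox-backup-manager acl update /datastore/store-b/SiteB-Group1/sync DatastoreBackup --auth-id 'sync@pbs'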


Oh yeah ... Make your Namespaces before you create the Users and assign Permissions.
PBS is still fairly new. There have been some bugs in how permissions are applied to Namespaces.
If you run into mystery permission issues with Users and Namespaces, nuke it all and start over with the Namespaces first.
 
Hi,
the situation is as follows: You might have two sources, let's call them source A and source B, with different contents, both performing push syncs to the same target (same namespace) via the same user. Since both sync jobs use the same user, each can see and access the other's contents. If now e.g. source A's sync job has the remove vanished flag set, it will remove contents synced by source B to the shared target, as it does not have these contents in its own source datastore. This might be unexpected and unintended, hence the recommendation to use different users when backing up from different sources to the same target (namespace). If you strictly separate the sync jobs by namespace, that will work as well.
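Illustrated with made-up host and user names, the recommended variant with a dedicated remote configuration and user per source looks like this:

# on source A: its own remote entry and its own user on the target
proxmox-backup-manager remote create target-a --host target.example.com --auth-id 'push-a@pbs' --password '...'

# on source B: a second remote entry with a different user
proxmox-backup-manager remote create target-b --host target.example.com --auth-id 'push-b@pbs' --password '...'

The groups each job pushes are then owned by "its" user on the target, so a remove vanished run of one job cannot touch the other job's groups.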
 
Thanks for your input!

First off, why are you so eager to use push jobs? Gotta get into the new .0 feature?
Let it break a bit for other people first. So much easier that way.
Well, there also need to be people who find the breakage. ;)
Jokes aside: The idea here is to "centralize" the schedules (PBS1 is on-site), with the off-site PBS2 basically acting as "remote storage".

the situation is as follows: You might have two sources, let's call them source A and source B, with different contents, both performing push syncs to the same target (same namespace) via the same user. [...]
Thanks! That cleared it up for me.
Somehow I got the impression that sync jobs keep track of their past syncs in a separate per-user file, which would lead to unwanted effects when multiple sync jobs use the same remote user. But following your explanation, that's not the case.
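(In case someone else stumbles over this later: the ownership lives on the backup groups on the target itself. The owner of each group shows up in the datastore's Content view on PBS2, and if a group ever ends up with the wrong owner it can be reassigned with proxmox-backup-client, e.g.

proxmox-backup-client change-owner vm/100 'push-a@pbs' --repository 'root@pam@pbs2.example.com:datastore1'

where the group, user and repository are of course placeholders.)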