[SOLVED] Sync job fails on a second remote PBS

Hi,

I am setting up a second PBS host. I added sync jobs to sync datastores from the first PBS to the second new one.

After the sync was done I found out that all backup groups were synced but one VM was missing. Also the job log contained this error:

Code:
2023-08-07T08:37:05+02:00: sync group vm/100 failed - group lock failed: not a valid user id

What does this mean? What could be the issue?

I created a single job with only this group filter, here is the full log:

Code:
2023-08-07T09:00:00+02:00: Starting datastore sync job 'pbs1:datastore_nimble:store_01::s-e472ca79-d829'
2023-08-07T09:00:00+02:00: task triggered by schedule 'hourly'
2023-08-07T09:00:00+02:00: sync datastore 'store_01' from 'pbs1/datastore_nimble'
2023-08-07T09:00:00+02:00: ----
2023-08-07T09:00:00+02:00: Syncing datastore 'datastore_nimble', root namespace into datastore 'store_01', root namespace
2023-08-07T09:00:00+02:00: found 1 groups to sync (out of 50 total)
2023-08-07T09:00:00+02:00: sync group vm/100 failed - group lock failed: not a valid user id
2023-08-07T09:00:00+02:00: Finished syncing namespace , current progress: 0 groups, 0 snapshots
2023-08-07T09:00:00+02:00: TASK ERROR: sync failed with some errors.

Thanks
 
Hi,
please share the output of cat /etc/proxmox-backup/sync.cfg as well as proxmox-backup-manager versions --verbose from the host running the sync. It seems the owner in the sync job might not match the regex used during parsing.
 
Sure, here they are:

Code:
╭─root@pbs-host3 ~
╰─➤  cat /etc/proxmox-backup/sync.cfg
sync: s-02a2a5c3-e28e
        ns
        owner root@pam
        remote pbs1
        remote-ns
        remote-store datastore_syno
        remove-vanished false
        schedule hourly
        store store_01

sync: s-d88520d0-e66e
        ns
        owner root@pam
        remote pbs1
        remote-ns
        remote-store datastore_nimble
        remove-vanished false
        schedule hourly
        store store_01

sync: s-d60afce7-cfaf
        ns
        owner root@pam
        remote pbs1
        remote-ns
        remote-store datastore_qnap
        remove-vanished true
        schedule daily
        store store_02

sync: s-e472ca79-d829
        comment Only VM/100
        group-filter group:vm/100
        ns
        owner root@pam
        remote pbs1
        remote-ns
        remote-store datastore_nimble
        remove-vanished false
        schedule hourly
        store store_01
╭─root@pbs-host3 ~
╰─➤  proxmox-backup-manager versions --verbose
proxmox-backup                2.3-1        running kernel: 5.15.102-1-pve
proxmox-backup-server         2.4.1-1      running version: 2.4.1
pve-kernel-helper             7.3-8
pve-kernel-5.15               7.3-3
pve-kernel-5.15.102-1-pve     5.15.102-1
ifupdown2                     3.1.0-1+pmx3
libjs-extjs                   7.0.0-1
proxmox-backup-docs           2.4.1-1
proxmox-backup-client         2.4.1-1
proxmox-mail-forward          0.1.1-1
proxmox-mini-journalreader    1.2-1
proxmox-offline-mirror-helper unknown
proxmox-widget-toolkit        3.6.5
pve-xtermjs                   4.16.0-1
smartmontools                 7.3-1
zfsutils-linux                2.1.9-pve1
 
Well, the owner seems fine. Does the same issue occur also for the other sync job, without the group-filter, or is it just this single job which produces this error?
 
The job sync: s-d88520d0-e66e covers all VMs and CTs and also fails on this particular VM/100.
I created the job with only this VM to isolate the issue.

Actually, I have a third PBS which also syncs from the first one, and it does not have this problem.

Any idea?
 
Any difference in version for the 3 PBS instances?
 
Actually yes.

I have 3 PBS instances right now: PBS1, PBS2 and PBS3.
PBS1 is the main one, where PVE writes its backups; PBS2 is a remote one which syncs.
PBS3 is a new one that will replace PBS1 (I need to decommission PBS1 once PBS3 is fully synced).

We have 2 licenses, for PBS1 and PBS2. PBS3 has just been installed from the ISO and will get the license from PBS1 once that is decommissioned. As such, I can't update PBS3 right now, because the enterprise repo rejects it without a license:
Code:
Err:3 https://enterprise.proxmox.com/debian/pbs bullseye InRelease
  401  Unauthorized [IP: 212.224.123.70 443]

This is from PBS1:
Code:
╭─root@pbs-host1 ~
╰─➤  proxmox-backup-manager versions --verbose                                                                                                   130 ↵
proxmox-backup                2.4-1        running kernel: 5.15.108-1-pve
proxmox-backup-server         2.4.2-2      running version: 2.4.2
pve-kernel-5.15               7.4-4
pve-kernel-5.13               7.1-9
pve-kernel-5.15.108-1-pve     5.15.108-2
pve-kernel-5.15.107-2-pve     5.15.107-2
pve-kernel-5.13.19-6-pve      5.13.19-15
pve-kernel-5.13.19-1-pve      5.13.19-3
ifupdown2                     3.1.0-1+pmx4
libjs-extjs                   7.0.0-1
proxmox-backup-docs           2.4.2-1
proxmox-backup-client         2.4.2-1
proxmox-mail-forward          0.1.1-1
proxmox-mini-journalreader    1.2-1
proxmox-offline-mirror-helper 0.5.2
proxmox-widget-toolkit        3.7.3
pve-xtermjs                   4.16.0-2
smartmontools                 7.2-pve3
zfsutils-linux                2.1.11-pve1

And PBS2:
Code:
╭─root@pbs-host2 ~
╰─➤  proxmox-backup-manager versions --verbose                                                                                                     1 ↵
proxmox-backup                2.4-1        running kernel: 5.15.108-1-pve
proxmox-backup-server         2.4.2-2      running version: 2.4.2
pve-kernel-5.15               7.4-4
pve-kernel-5.15.108-1-pve     5.15.108-2
pve-kernel-5.15.107-2-pve     5.15.107-2
pve-kernel-5.15.102-1-pve     5.15.102-1
ifupdown2                     3.1.0-1+pmx4
libjs-extjs                   7.0.0-1
proxmox-backup-docs           2.4.2-1
proxmox-backup-client         2.4.2-1
proxmox-mail-forward          0.1.1-1
proxmox-mini-journalreader    1.2-1
proxmox-offline-mirror-helper unknown
proxmox-widget-toolkit        3.7.3
pve-xtermjs                   4.16.0-2
smartmontools                 7.3-1
zfsutils-linux                2.1.11-pve1

How can I update PBS3 given this license situation?
 
In that case you could simply upgrade the host using the no-subscription repo for now. Once the license is switched (which you can do up to 3 times in your Proxmox shop account), switch back to the subscription repo.
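
A minimal sketch of that repo switch, assuming PBS 2.x on Debian bullseye and the enterprise list file name that the ISO installer usually ships (pbs-enterprise.list; verify on your host). The switch_to_no_subscription function is an illustrative helper, not a PBS tool, and it takes the sources directory as an optional argument so it can be tried on a copy first.

```shell
#!/bin/sh
# Hypothetical helper: disable the enterprise repo and add the no-subscription one.
# Usage: switch_to_no_subscription [dir]   (dir defaults to /etc/apt/sources.list.d)
switch_to_no_subscription() {
    dir="${1:-/etc/apt/sources.list.d}"

    # Comment out the enterprise entry if the file exists (file name is an assumption).
    if [ -f "$dir/pbs-enterprise.list" ]; then
        sed -i 's/^deb /# deb /' "$dir/pbs-enterprise.list"
    fi

    # Add the no-subscription repository for bullseye-based PBS 2.x.
    echo "deb http://download.proxmox.com/debian/pbs bullseye pbs-no-subscription" \
        > "$dir/pbs-no-subscription.list"
}

# Afterwards, as root:
#   switch_to_no_subscription && apt update && apt full-upgrade
```

To switch back once the license has been moved, remove pbs-no-subscription.list and uncomment the enterprise entry again.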
 
Ok, thank you.

I upgraded PBS3, rebooted and retried the sync job.
Still the same error:

Code:
2023-08-07T14:32:02+02:00: sync group vm/100 failed - group lock failed: not a valid user id
 
Please also try to recreate the sync job after the upgrade. Does the problem persist even after that?
 
Do you maybe have a preexisting vm/100 group with a different owner? Looking through the code, I could not really find a reason why the sync should fail with this error, as the owner in your config looks valid.

Maybe try to clear this group and resync.
 
What does an "owner" actually mean for this group?
Because I'm syncing with the root@pam user, it should have full access.
I also don't understand why PBS2 syncs without any issue.
I also tried to sync from PBS2 (instead of PBS1) to PBS3, and the same error happens.
This is strange.

The last thing I tried was to update all PBS servers to the latest Debian Bookworm with PBS 3.0.0.
Same error...

I'm out of ideas now.
 
Ok, I tried to change the owner of the vm/100 backup group on PBS1, but it did not change anything.

I also tried something different:
I connected PBS3 to our PVE cluster. I then started a backup of VM 100 directly to PBS3, and it failed with the same error message:

Code:
INFO: starting new backup job: vzdump 100 --remove 0 --storage pbs-host3-store-01 --node pve-host3 --notes-template '{{guestname}}' --mode snapshot
INFO: Starting Backup of VM 100 (qemu)
INFO: Backup started at 2023-08-08 08:56:47
INFO: status = running
INFO: VM Name: svr-dc2
INFO: include disk 'scsi0' 'pool-SSD:vm-100-disk-0' 51508M
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/100/2023-08-08T06:56:47Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
ERROR: VM 100 qmp command 'backup' failed - backup connect failed: command error: not a valid user id
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 100 failed - VM 100 qmp command 'backup' failed - backup connect failed: command error: not a valid user id
INFO: Failed at 2023-08-08 08:57:05
INFO: Backup job finished with errors
TASK ERROR: job errors

Any idea?
 
Something is off with that group on node PBS3. Did you already try to completely remove and resync that group on node PBS3?
 
Please check the content and the file permissions of the file named owner located in <datastore>/vm/100. The file should contain a valid user in the format username@realm, so in your case root@pam.
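
A rough sketch of that check. The check_owner function is a hypothetical helper, and its pattern is only a loose approximation of username@realm (with an optional !tokenname suffix for API tokens), not the regex PBS itself uses. The datastore path in the example is an assumption; substitute your own.

```shell
#!/bin/sh
# Hypothetical helper: report whether an owner file holds a username@realm value.
check_owner() {
    owner_file="$1"

    # The file must exist and be readable.
    if [ ! -r "$owner_file" ]; then
        echo "unreadable: $owner_file"
        return 2
    fi

    # Only the first line is inspected.
    owner=$(head -n 1 "$owner_file")

    # Loose approximation of username@realm[!token]; an assumption, not the PBS regex.
    if printf '%s\n' "$owner" | grep -Eq '^[^@[:space:]]+@[A-Za-z0-9_.-]+(![A-Za-z0-9_.-]+)?$'; then
        echo "ok: $owner"
    else
        echo "invalid: '$owner'"
        return 1
    fi
}

# Example (path is an assumption; adjust to your datastore):
#   ls -l /mnt/datastore/store_01/vm/100/owner
#   check_owner /mnt/datastore/store_01/vm/100/owner
```

An empty or truncated owner file would show up as invalid here, and the ls -l output shows whether the file's user, group and mode match those of the neighbouring groups in the same datastore.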
 