9-day-old stuck sync job

RobFantini

Today, 8/11, I noticed a long-running sync job:
Code:
2020-08-02T04:11:41-04:00: no data changes
2020-08-02T04:11:41-04:00: re-sync snapshot "vm/3120/2020-07-28T04:02:20Z" done
2020-08-02T04:11:41-04:00: re-sync snapshot "vm/4120/2020-07-26T04:05:00Z"
2020-08-02T04:11:41-04:00: no data changes
2020-08-02T04:11:41-04:00: re-sync snapshot "vm/4120/2020-07-26T04:05:00Z" done
2020-08-02T04:11:41-04:00: sync snapshot "vm/4120/2020-08-02T04:05:13Z"
2020-08-02T04:11:41-04:00: sync archive qemu-server.conf.blob
2020-08-02T04:11:41-04:00: sync archive drive-scsi0.img.fidx
2020-08-02T04:12:33-04:00: sync archive drive-scsi1.img.fidx

At that time someone had unplugged the network's power; it was restored a few hours later.


The sync job runs daily and had completed without problems for many days before that.

Question: Is it possible to get a notification when a sync job tries to run but cannot because a previous run is still active?
 
Also, I am unable to stop the job from the PBS web interface: Task Viewer > Stop does not work. Clicking it only produced this in /var/log/syslog:
Code:
Aug 11 16:47:29 pve-maint proxmox-backup-proxy[1703]: CSRF prevention token: "5F3303D0:g9tKNgEp12pwmmteOhpDQLlW9/KHCZAHZXFdDEQgc3o"
Aug 11 16:47:29 pve-maint proxmox-backup-proxy[1703]: set abort flag for worker UPID:pve-maint:000006A7:02FE25E4:0000001D:5F267280:syncjob:get-fbc-backup:backup@pam:

So I tried to stop the process manually. This worked:
Code:
systemctl restart proxmox-backup-proxy.service
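
Until notifications exist, a stuck task like this one could be spotted from its UPID alone. The sketch below is not an official PBS tool. It assumes that the sixth colon-separated field of the UPID (5F267280 in the syslog line above) is the task's start time in hexadecimal epoch seconds; the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers.
# Assumption: field 6 of a PBS UPID is the start time as hex epoch seconds.

# Print the start time of a task (epoch seconds) given its UPID.
upid_start_epoch() {
    hex=$(printf '%s\n' "$1" | cut -d: -f6)
    printf '%d\n' "0x$hex"
}

# Print the age of a task in whole days.
# $1 = UPID, $2 = "now" in epoch seconds (defaults to the current time).
upid_age_days() {
    start=$(upid_start_epoch "$1")
    now=${2:-$(date +%s)}
    echo $(( (now - start) / 86400 ))
}
```

For the UPID above, upid_start_epoch prints 1596355200, which is 2020-08-02 04:00:00 -04:00 and matches the scheduled start of the sync job. A daily cron job could run upid_age_days over the UPIDs of still-running tasks and send mail when the result is 1 or more. How to obtain that list depends on the PBS version, so it is left out here.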
 
Question: Is it possible to get a notification when a sync job tries to run but cannot because a previous run is still active?

You should already get a failed task log entry. But adding notifications to scheduled sync/GC runs is probably a good idea ;)