Stop sync job after xxx hours

michaelsage

Hi,
I have a bit of a weird issue: sometimes my sync job just runs and runs and won't stop. This means the following day's sync task doesn't run (and it also appears to "hang" a garbage collection job). Is there any way to stop a job once it has been running for x hours? Normally this job completes within an hour. I suspect it has something to do with the fact that it's going to an S3-compatible endpoint, and it is probably not PBS's "fault".

The job won't terminate with a normal stop within the webgui, so I think a force stop after x hours would be the best approach. Is there a way to do this in the GUI or using a script?

Thanks

Michael
 
Hi,

michaelsage said:
I have a bit of a weird issue: sometimes my sync job just runs and runs and won't stop. This means the following day's sync task doesn't run (and it also appears to "hang" a garbage collection job). Is there any way to stop a job once it has been running for x hours? Normally this job completes within an hour. I suspect it has something to do with the fact that it's going to an S3-compatible endpoint, and it is probably not PBS's "fault".

Do you see any errors in the systemd journal during the timespan of the sync job? Please also post the output of proxmox-backup-manager version --verbose

michaelsage said:
The job won't terminate with a normal stop within the webgui, so I think a force stop after x hours would be the best approach. Is there a way to do this in the GUI or using a script?

This could be done using an API client or proxmox-backup-debug via the API endpoint https://pbs.proxmox.com/docs/api-viewer/index.html#/nodes/{node}/tasks/{upid}, but it would be better to identify what the issue is so that we can fix it.
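As a rough illustration of the scripted approach, here is a minimal Python sketch that asks PBS to stop sync tasks once they have run for longer than a limit. Several things in it are assumptions rather than anything confirmed in this thread: the 4-hour limit, the worker type names "syncjob" and "sync", the UPID field layout described in the docstring, and that `proxmox-backup-manager task list --output-format json` returns running tasks with a "upid" key. Please check these against your own installation before relying on it.

```python
import json
import subprocess
import time

MAX_HOURS = 4                        # hypothetical limit; pick your own "x hours"
WORKER_TYPES = {"syncjob", "sync"}   # assumed worker type names for sync tasks


def parse_upid(upid):
    """Return (start time in epoch seconds, worker type) for a PBS UPID.

    Assumed layout, with the numeric fields hex-encoded:
    UPID:node:pid:pstart:task_id:starttime:worker_type:worker_id:auth_id:
    """
    parts = upid.split(":")
    if len(parts) < 9 or parts[0] != "UPID":
        raise ValueError(f"unexpected UPID format: {upid!r}")
    return int(parts[5], 16), parts[6]


def is_overdue(upid, now, max_hours=MAX_HOURS):
    """True if the task is a sync task that started more than max_hours ago."""
    start, worker_type = parse_upid(upid)
    return worker_type in WORKER_TYPES and now - start > max_hours * 3600


def running_upids():
    """List UPIDs of running tasks (assumes JSON output with a 'upid' key)."""
    out = subprocess.run(
        ["proxmox-backup-manager", "task", "list", "--output-format", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [task["upid"] for task in json.loads(out)]


def main():
    """Request a stop for every overdue sync task; run this on the PBS host."""
    now = time.time()
    for upid in running_upids():
        if is_overdue(upid, now):
            print(f"stopping {upid}")
            # check=False: keep going even if one stop request fails
            subprocess.run(
                ["proxmox-backup-manager", "task", "stop", upid], check=False
            )
```

To use it, call `main()` at the end of the script and run it as root from a cron job or systemd timer. Note that this only sends the same kind of stop request as the GUI button, so a task that ignores a normal stop may ignore this one as well.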
 
Output from proxmox-backup-manager version --verbose
Code:
proxmox-backup                      4.0.0         running kernel: 6.14.11-4-pve
proxmox-backup-server               4.0.20-1      running version: 4.0.20
proxmox-kernel-helper               9.0.4
proxmox-kernel-6.14.11-4-pve-signed 6.14.11-4
proxmox-kernel-6.14                 6.14.11-4
proxmox-kernel-6.14.11-3-pve-signed 6.14.11-3
proxmox-kernel-6.8.12-13-pve-signed 6.8.12-13
proxmox-kernel-6.8                  6.8.12-13
ifupdown2                           3.3.0-1+pmx11
libjs-extjs                         7.0.0-5
proxmox-backup-docs                 4.0.20-1
proxmox-backup-client               4.0.20-1
proxmox-mail-forward                1.0.2
proxmox-mini-journalreader          1.6
proxmox-offline-mirror-helper       0.7.3
proxmox-widget-toolkit              5.1.1
pve-xtermjs                         5.5.0-3
smartmontools                       7.4-pve1
zfsutils-linux                      2.3.4-pve1

Just checked the journal. There is something weird, which seems to start at the same time the backup "hangs":

Code:
Nov 21 16:02:03 sinclair kernel: [UFW BLOCK] IN=vmbr0 OUT= PHYSIN=eno0 MAC=00:22:4d:ae:51:51:38:0e:xxxx SRC=45.33xxx DST=37.187.xxx LEN=44 TOS=0x00 PREC=0x00 TTL=241 ID=30156 PROTO=TCP SPT=59634 DPT=3312 WINDOW=1025 RES=0x00 SYN URGP=0
Nov 21 16:02:06 sinclair kernel: [UFW BLOCK] IN=vmbr0 OUT= PHYSIN=eno0 MAC=00:22:4d:ae:51:51:38:0e:xxxx SRC=207.90xxx DST=37.187.xxx LEN=44 TOS=0x08 PREC=0x20 TTL=111 ID=53767 PROTO=TCP SPT=26200 DPT=7050 WINDOW=16769 RES=0x00 SYN URGP=0
Nov 21 16:02:26 sinclair kernel: [UFW BLOCK] IN=vmbr0 OUT= PHYSIN=eno0 MAC=00:22:4d:ae:51:51:38:0e:xxxx SRC=20.171xxx DST=37.187.xxx LEN=52 TOS=0x00 PREC=0x00 TTL=42 ID=26744 PROTO=TCP SPT=59596 DPT=8888 WINDOW=65535 RES=0x00 SYN URGP=0
This repeats again and again in the logs until the box is rebooted. I haven't changed anything on the UFW firewall, and it only happens sometimes. Could it have to do with an established session expiring?
 
Make sure the UFW firewall is not interfering with the traffic from the clients to the PBS and from the PBS to the s3 backend.
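One way to check this from the journal is to separate blocked packets that look like new inbound connection attempts from blocked packets that look like replies from an HTTPS service. The sketch below is only a heuristic: it assumes the S3 endpoint is reached over HTTPS on port 443, and the function names are made up for illustration. A blocked packet with source port 443 that is not a bare SYN would be worth a closer look, while a lone SYN to some unrelated destination port is more likely ordinary internet scan noise.

```python
def parse_ufw_block(line):
    """Split a '[UFW BLOCK]' journal line into (fields, flags), else None."""
    marker = "[UFW BLOCK]"
    if marker not in line:
        return None
    fields, flags = {}, set()
    for token in line.split(marker, 1)[1].split():
        if "=" in token:
            key, value = token.split("=", 1)
            # UDP lines carry LEN twice; keep the first (IP length)
            fields.setdefault(key, value)
        else:
            flags.add(token)          # bare tokens such as SYN or ACK
    return fields, flags


def looks_like_reply_traffic(line, service_port="443"):
    """Heuristic: blocked packet coming FROM the service port, not a bare SYN."""
    parsed = parse_ufw_block(line)
    if parsed is None:
        return False
    fields, flags = parsed
    return fields.get("SPT") == service_port and flags != {"SYN"}
```

Feeding the journal lines through `looks_like_reply_traffic` and printing only the matches gives a quick indication of whether the firewall is dropping return traffic from the backend.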
 
Had another look through the logs and I think the UFW messages are probably a red herring. This is what the log looks like just before it stops caching and hangs:

Code:
Nov 18 00:02:45 sinclair proxmox-backup-proxy[888]: Caching of chunk 83efc0394c77f67c2144cadd243b64591bf5f3c9669a71e55017d571cf1461ed
Nov 18 00:02:45 sinclair proxmox-backup-proxy[888]: Upload new chunk 8f1ce77492108db7c48cd0c1efff313ccd6f3d46da5c7314f4536f2562f74da8
Nov 18 00:02:45 sinclair proxmox-backup-proxy[888]: Caching of chunk 8bf741733b8eab58ffda5e99978596bf48f2aacc89b17e13a59aa3f8befee945
Nov 18 00:02:46 sinclair proxmox-backup-proxy[888]: Caching of chunk 8f1ce77492108db7c48cd0c1efff313ccd6f3d46da5c7314f4536f2562f74da8
Nov 18 00:02:46 sinclair proxmox-backup-proxy[888]: Caching of chunk 51cee1a7f51821ec6ce724c1410a002020194844de312f2db12549ab9306c1a6
Nov 18 00:02:46 sinclair proxmox-backup-proxy[888]: Caching of chunk bcc75653b2565c46f47c4c166e7f1d9b32d5106c00c6cc710b42c407d49b7805
Nov 18 00:02:47 sinclair kernel: [UFW BLOCK] IN=vmbr0 OUT= PHYSIN=eno0 MAC=00:22:4d:ae:51:51:38:0e:4d:xx SRC=78.128.114.xx DST=37.187.xx.xx LEN=40 TOS=0x00 PREC=0x00 TTL=241 ID=34793 PROTO=TCP SPT=43933 DPT=3389 WINDOW=1024 RES=0x00 SYN URGP=0
Nov 18 00:03:09 sinclair kernel: [UFW BLOCK] IN=vmbr0 OUT= PHYSIN=eno0 MAC=00:22:4d:ae:51:51:38:0e:4d:xx SRC=167.94.138.xx DST=37.187.xx.xx LEN=60 TOS=0x00 PREC=0x00 TTL=46 ID=21001 PROTO=TCP SPT=62589 DPT=83 WINDOW=42340 RES=0x00 SYN URGP=0
Nov 18 00:03:39 sinclair kernel: [UFW BLOCK] IN=vmbr0 OUT= PHYSIN=eno0 MAC=00:22:4d:ae:51:51:38:0e:4d:xx SRC=147.185.133.xx DST=37.187.xx.xx LEN=44 TOS=0x00 PREC=0x00 TTL=243 ID=31358 PROTO=TCP SPT=50113 DPT=17550 WINDOW=1024 RES=0x00 SYN URGP=0
Nov 18 00:03:48 sinclair kernel: [UFW BLOCK] IN=vmbr0 OUT= PHYSIN=eno0 MAC=00:22:4d:ae:51:51:38:0e:4d:xx SRC=5.188.206.xx DST=37.187.xx.xx LEN=40 TOS=0x00 PREC=0x00 TTL=241 ID=17698 PROTO=TCP SPT=51437 DPT=3173 WINDOW=1024 RES=0x00 SYN URGP=0
Nov 18 00:04:12 sinclair kernel: [UFW BLOCK] IN=vmbr0 OUT= PHYSIN=eno0 MAC=00:22:4d:ae:51:51:38:0e:4d:xx SRC=198.235.24.xx DST=37.187.xx.xx LEN=44 TOS=0x00 PREC=0x00 TTL=244 ID=54321 PROTO=TCP SPT=50076 DPT=3690 WINDOW=65535 RES=0x00 SYN URGP=0
Nov 18 00:04:39 sinclair kernel: [UFW BLOCK] IN=vmbr0 OUT= PHYSIN=eno0 MAC=00:22:4d:ae:51:51:38:0e:4d:xx SRC=206.168.34.xx DST=37.187.xx.xx LEN=60 TOS=0x00 PREC=0x00 TTL=48 ID=16335 PROTO=TCP SPT=7919 DPT=30545 WINDOW=42340 RES=0x00 SYN URGP=0
Nov 18 00:04:54 sinclair kernel: [UFW BLOCK] IN=vmbr0 OUT= PHYSIN=eno0 MAC=00:22:4d:ae:51:51:38:0e:4d:xx SRC=185.12.59.xx DST=37.187.xx.xx LEN=52 TOS=0x00 PREC=0x00 TTL=52 ID=39818 PROTO=TCP SPT=41514 DPT=443 WINDOW=65535 RES=0x00 SYN URGP=0
Nov 18 00:04:59 sinclair kernel: [UFW BLOCK] IN=vmbr0 OUT= PHYSIN=eno0 MAC=00:22:4d:ae:51:51:38:0e:4d:xx SRC=205.210.31.xx DST=37.187.xx.xx LEN=44 TOS=0x00 PREC=0x00 TTL=244 ID=54321 PROTO=TCP SPT=52336 DPT=8883 WINDOW=65535 RES=0x00 SYN URGP=0
Nov 18 00:05:35 sinclair kernel: [UFW BLOCK] IN=vmbr0 OUT= PHYSIN=eno0 MAC=00:22:4d:ae:51:51:38:0e:4d:xx SRC=185.244.104.xx DST=37.187.2xx.xx LEN=52 TOS=0x00 PREC=0x00 TTL=48 ID=16385 PROTO=TCP SPT=45996 DPT=443 WINDOW=65535 RES=0x00 SYN URGP=0
Nov 18 00:06:00 sinclair kernel: [UFW BLOCK] IN=vmbr0 OUT= PHYSIN=eno0 MAC=00:22:4d:ae:51:51:38:0e:4d:xx SRC=176.65.148.xx DST=37.187.xx.xx LEN=52 TOS=0x00 PREC=0x00 TTL=51 ID=290 PROTO=TCP SPT=34705 DPT=37 WINDOW=65535 RES=0x00 SYN URGP=0
Nov 18 00:06:03 sinclair kernel: [UFW BLOCK] IN=vmbr0 OUT= PHYSIN=eno0 MAC=00:22:4d:ae:51:51:38:0e:4d:xx SRC=35.203.211.xx DST=37.187.xx.xx LEN=44 TOS=0x00 PREC=0x00 TTL=244 ID=45894 PROTO=TCP SPT=50514 DPT=50726 WINDOW=1024 RES=0x00 SYN URGP=0
Nov 18 00:06:32 sinclair kernel: [UFW BLOCK] IN=vmbr0 OUT= PHYSIN=eno0 MAC=00:22:4d:ae:51:51:38:0e:4d:xx SRC=35.203.210.205 DST=37.187.xx.xx LEN=44 TOS=0x00 PREC=0x60 TTL=245 ID=40317 PROTO=TCP SPT=56544 DPT=5070 WINDOW=1024 RES=0x00 SYN URGP=0
Nov 18 00:06:44 sinclair kernel: [UFW BLOCK] IN=vmbr0 OUT= PHYSIN=eno0 MAC=00:22:4d:ae:51:51:38:0e:4d:xx SRC=204.76.203.254 DST=37.187.xx.xx LEN=32 TOS=0x00 PREC=0x00 TTL=243 ID=54321 PROTO=UDP SPT=60277 DPT=3283 LEN=12

It looks like it just "stops" trying.

This is the section from after the reboot

Code:
Nov 21 16:04:53 sinclair proxmox-backup-proxy[8701]: rrd journal successfully committed (33 files in 0.241 seconds)
Nov 21 16:16:29 sinclair proxmox-backup-proxy[888]: received abort request ...
Nov 21 16:16:43 sinclair proxmox-backup-proxy[888]: received abort request ...
Nov 21 16:16:56 sinclair proxmox-backup-proxy[8701]: error during snapshot file listing: 'unable to load blob '"/pbs/cache/vm/101/2025-11-17T02:00:05Z/index.json.blob"' - No such file or directory (os error 2)'
Nov 21 16:20:20 sinclair systemd[1]: Stopping proxmox-backup-proxy.service - Proxmox Backup API Proxy Server...
Nov 21 16:20:20 sinclair systemd[1]: proxmox-backup-proxy.service: Killing process 8532 (tokio-runtime-w) with signal SIGKILL.
Nov 21 16:20:20 sinclair systemd[1]: proxmox-backup-proxy.service: Deactivated successfully.
Nov 21 16:20:20 sinclair systemd[1]: Stopped proxmox-backup-proxy.service - Proxmox Backup API Proxy Server.
Nov 21 16:20:20 sinclair systemd[1]: proxmox-backup-proxy.service: Consumed 1d 14h 46min 9.474s CPU time, 3.4G memory peak, 42.8M memory swap peak.
-- Boot 36b1f8a97eb44789a250daaa721e4ad3 --
Nov 21 16:21:43 sinclair systemd[1]: Starting proxmox-backup-proxy.service - Proxmox Backup API Proxy Server...
Nov 21 16:21:44 sinclair proxmox-backup-proxy[883]: catching shutdown signal
Nov 21 16:21:44 sinclair proxmox-backup-proxy[883]: catching reload signal
Nov 21 16:21:44 sinclair systemd[1]: Started proxmox-backup-proxy.service - Proxmox Backup API Proxy Server.
Nov 21 16:21:47 sinclair proxmox-backup-proxy[883]: applied rrd journal (3090 entries in 3.114 seconds)
Nov 21 16:21:47 sinclair proxmox-backup-proxy[883]: rrd journal successfully committed (33 files in 0.391 seconds)
Nov 21 16:22:00 sinclair proxmox-backup-proxy[883]: Using datastore cache with capacity 13081 for store MJSSyncTest

The only other oddity when filtering the journal for proxmox-backup-proxy is this line:
Nov 18 02:03:36 sinclair systemd[1]: proxmox-backup-proxy.service: Supervising process 8701 which is not our child. We'll most likely not notice when it exits.

It appears in this block:
Code:
Nov 18 02:01:07 sinclair proxmox-backup-proxy[888]: Upload backup log to datastore 'Backups', namespace 'MJS/Trinity' vm/101/2025-11-18T02:00:05Z/client.log.blob
Nov 18 02:03:35 sinclair systemd[1]: Reloading proxmox-backup-proxy.service - Proxmox Backup API Proxy Server...
Nov 18 02:03:36 sinclair proxmox-backup-proxy[888]: got reload request (SIGHUP)
Nov 18 02:03:36 sinclair proxmox-backup-proxy[888]: request_shutdown
Nov 18 02:03:36 sinclair systemd[1]: Reloaded proxmox-backup-proxy.service - Proxmox Backup API Proxy Server.
Nov 18 02:03:36 sinclair proxmox-backup-proxy[888]: request_shutdown
Nov 18 02:03:36 sinclair proxmox-backup-proxy[888]: daemon reload...
Nov 18 02:03:36 sinclair systemd[1]: proxmox-backup-proxy.service: Supervising process 8701 which is not our child. We'll most likely not notice when it exits.
Nov 18 02:03:36 sinclair proxmox-backup-proxy[888]: daemon shut down.
Nov 18 02:03:36 sinclair proxmox-backup-proxy[888]: server shutting down, waiting for active workers to complete
Nov 18 02:03:36 sinclair proxmox-backup-proxy[8701]: catching shutdown signal
Nov 18 02:03:36 sinclair proxmox-backup-proxy[8701]: catching reload signal
Nov 18 02:03:36 sinclair proxmox-backup-proxy[8701]: applied rrd journal (2884 entries in 0.576 seconds)
Nov 18 02:03:37 sinclair proxmox-backup-proxy[8701]: rrd journal successfully committed (33 files in 0.487 seconds)
Nov 18 02:04:00 sinclair proxmox-backup-proxy[8701]: Using datastore cache with capacity 13451 for store MJSSyncTest