TASK ERROR: connection error: not connected

Dunuin

Hi,

As suggested here I tried to use a sync task to copy/move all backup groups with all snapshots from the root namespace to some custom namespaces:
to actually "move" (copy) the backup groups and snapshots, you can use a sync job (with the remote pointing to the same PBS instance) or a one-off proxmox-backup-manager pull. when moving from datastore A root namespace to datastore A namespace foo, only the metadata (manifest and indices and so on) will be copied, the chunks will be re-used. when moving from datastore A to datastore B the chunks that are not already contained in B will of course be copied as well, no matter whether namespaces are involved at either side ;).
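For reference, the one-off pull variant of that suggestion would look roughly like this on the CLI. The remote name, datastore names and namespace below are placeholders, and the exact namespace option names should be double-checked against proxmox-backup-manager pull --help:

Code:
# pull everything from the root namespace of datastore PBS_DS2 on the loopback remote
# into namespace "foo" of the local datastore PBS_DS2
# (placeholder names; verify option names with: proxmox-backup-manager pull --help)
proxmox-backup-manager pull local-loop PBS_DS2 PBS_DS2 --ns foo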
For that I created a local remote like this:
[Screenshot: pbs_remote.png]
I previously tried it with the local IP instead of 127.0.0.1, and with the token that owns my backup groups instead of root@pam, but it's the same problem.
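For reference, such a loopback remote can also be created on the CLI, roughly like this. The password and fingerprint are placeholders, and whether the auth option is called --auth-id or --userid depends on the PBS version, so check proxmox-backup-manager remote create --help:

Code:
# create a "remote" that points back at this very PBS instance
# (placeholder values; verify option names with: proxmox-backup-manager remote create --help)
proxmox-backup-manager remote create local-loop \
    --host 127.0.0.1 \
    --auth-id root@pam \
    --password 'xxxxx' \
    --fingerprint 'aa:bb:cc:...'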

Then I created these sync tasks to run them once manually to migrate my backups to the new namespace or even new datastore+namespace:
[Screenshot: pbs_sync1.png]
[Screenshot: pbs_sync2.png]
"Local Owner" is my token that owns my backup groups in the root namespaces and I want to keep it that way for the custom namespaces too.

But when I run the sync tasks, they finish with "OK", yet the task log is spammed with red errors:
[Screenshot: pbs_errors.png]
I didn't look at every error, but it looks like they all contain something like this:
Code:
2022-05-19T11:18:52+02:00: starting new backup reader datastore 'PBS_DS2': "/mnt/pbs2"
2022-05-19T11:18:52+02:00: protocol upgrade done
2022-05-19T11:18:52+02:00: GET /download
2022-05-19T11:18:52+02:00: download "/mnt/pbs2/vm/124/2022-05-16T03:45:11Z/index.json.blob"
2022-05-19T11:18:52+02:00: GET /download
2022-05-19T11:18:52+02:00: download "/mnt/pbs2/vm/124/2022-05-16T03:45:11Z/qemu-server.conf.blob"
2022-05-19T11:18:52+02:00: GET /download
2022-05-19T11:18:52+02:00: download "/mnt/pbs2/vm/124/2022-05-16T03:45:11Z/fw.conf.blob"
2022-05-19T11:18:52+02:00: GET /download
2022-05-19T11:18:52+02:00: download "/mnt/pbs2/vm/124/2022-05-16T03:45:11Z/drive-scsi1.img.fidx"
2022-05-19T11:18:52+02:00: register chunks in 'drive-scsi1.img.fidx' as downloadable.
2022-05-19T11:18:52+02:00: GET /download
2022-05-19T11:18:52+02:00: download "/mnt/pbs2/vm/124/2022-05-16T03:45:11Z/drive-scsi0.img.fidx"
2022-05-19T11:18:52+02:00: register chunks in 'drive-scsi0.img.fidx' as downloadable.
2022-05-19T11:18:52+02:00: GET /download
2022-05-19T11:18:52+02:00: download "/mnt/pbs2/vm/124/2022-05-16T03:45:11Z/client.log.blob"
2022-05-19T11:18:52+02:00: TASK ERROR: connection error: not connected

To me it looks like the sync was basically successful, but I want to move those backup groups to the new namespace, not just copy them. So next I would need to delete all the backups in my root namespaces, and I don't want to end up with deleted backups that worked fine and a bunch of partial, non-working synced backups.

But according to what @fabian wrote here...
these are benign - the connection error is because the sync client drops the connection and the endpoint doesn't yet expect that (should be improved). downloading the log is just best effort - it doesn't have to be there at all (or yet), and unless your host backup scripts upload a log (after the backup) there won't be one
...I can ignore all those "TASK ERROR: connection error: not connected" errors? Does the endpoint not expect the sync client to drop the connection because it is a local one?

Is there a way to verify that all synced groups/snapshots are correct and complete before deleting the snapshots from the root namespace?
It looks like a verify task won't work for that: all snapshots were verified in the root namespace before doing the sync, and the synced snapshots are still shown as "all ok" even though I never ran a verify job for that namespace.
 
You can run verify in a NS and uncheck the "skip verified" checkbox ;)
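On the CLI, the equivalent of unchecking "skip verified" would be a verify job with ignore-verified disabled, roughly like this. The job ID, datastore and namespace are placeholders; confirm the option names with proxmox-backup-manager verify-job create --help:

Code:
# re-verify everything in namespace "foo", including snapshots already marked as verified
# (placeholder values; verify option names with: proxmox-backup-manager verify-job create --help)
proxmox-backup-manager verify-job create verify-foo \
    --store PBS_DS2 \
    --ns foo \
    --ignore-verified false

The job can then be started via its schedule or from the GUI's "Run now" button.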
 
Any news on this? We also still get these errors. I created a local loop 'remote sync', as mentioned somewhere, to provide a pre-pruned namespace for a slow downstream PBS instance (which accesses this pre-pruned namespace just fine).

Verify all works flawlessly on all bits and pieces; there seems to be no real error here?

Code:
2023-04-25T14:12:24+02:00: starting new backup reader datastore 'zwischenstation': "/mnt/datastore/zwischenstation"
2023-04-25T14:12:24+02:00: protocol upgrade done
2023-04-25T14:12:24+02:00: GET /download
2023-04-25T14:12:24+02:00: download "/mnt/datastore/zwischenstation/ns/cluster/ct/104/2023-04-20T22:03:24Z/index.json.blob"
2023-04-25T14:12:24+02:00: GET /download
2023-04-25T14:12:24+02:00: download "/mnt/datastore/zwischenstation/ns/cluster/ct/104/2023-04-20T22:03:24Z/pct.conf.blob"
2023-04-25T14:12:24+02:00: GET /download
2023-04-25T14:12:24+02:00: download "/mnt/datastore/zwischenstation/ns/cluster/ct/104/2023-04-20T22:03:24Z/root.pxar.didx"
2023-04-25T14:12:24+02:00: register chunks in 'root.pxar.didx' as downloadable.
2023-04-25T14:12:24+02:00: GET /download
2023-04-25T14:12:24+02:00: download "/mnt/datastore/zwischenstation/ns/cluster/ct/104/2023-04-20T22:03:24Z/catalog.pcat1.didx"
2023-04-25T14:12:24+02:00: register chunks in 'catalog.pcat1.didx' as downloadable.
2023-04-25T14:12:24+02:00: GET /download
2023-04-25T14:12:24+02:00: download "/mnt/datastore/zwischenstation/ns/cluster/ct/104/2023-04-20T22:03:24Z/client.log.blob"
2023-04-25T14:12:24+02:00: TASK ERROR: connection error: not connected

Is PBS smart enough to realize it can access the file locally, but some "final" block expects to close a socket connection?
 
there is no "real" error - it just logs that the connection was closed unexpectedly (from the server's point of view).

Is PBS smart enough to realize it can access the file locally, but some "final" block expects to close a socket connection?

No, PBS has no idea that this sync is from itself :) There is ongoing work to implement a real "local" sync (that uses file access directly, instead of going over the backup reader API like a remote sync).

For your original problem - there is now a "transfer-last" parameter for syncing in git that transfers only the last N new snapshots on each sync; it should be included in the next released version. Maybe it helps you avoid the local sync altogether?
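Once that version is available, this would presumably be a single extra option on an existing sync job, something like the sketch below. The job ID is a placeholder and the exact parameter name/availability depends on the released PBS version:

Code:
# only transfer the last 3 new snapshots per group on each sync run
# (placeholder job ID; check availability and spelling with: proxmox-backup-manager sync-job update --help)
proxmox-backup-manager sync-job update <job-id> --transfer-last 3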
 
spiraling a bit off-topic:

For your original problem - there is now a "transfer-last" parameter for syncing in git that transfers only the last N new snapshots on each sync; it should be included in the next released version. Maybe it helps you avoid the local sync altogether?

That sounds like a nice solution, but I'm not sure it fits our use case. I think our approach is not that uncommon: we have an SSD-equipped intermediate backup target that should keep, say, 4 backups per day, plus daily backups for a week, as space permits. The next stop is a spinning-rust PBS instance that should store at most one backup per day for two weeks back, and then some weekly, monthly and yearly backups. At the very end of the chain we will add a tape library for even fewer "last resort" / offsite backups, probably weekly.

We are trying to optimize for fast restore times in common cases, while maintaining disaster recovery capabilities in case of a full failure, where losing a couple of days' worth of work is not a major concern.

I understand that the current way to implement this is still the approach I have been following so far - is that correct?

thanks!

P.S.: PBS is awesome. I am a long-time user of Proxmox VE, and for a long time the backup process has been the only weak spot. You didn't just fix the weak spot, you made it another strength of the Proxmox solution, way exceeding my wildest expectations. Thanks!!
 
Ah yeah, for that use case having a namespace with specific pruning settings offers more flexibility/control. transfer-last is mainly relevant if your link to the offsite PBS is not that fast (so syncing all snapshots is not an option because it can't keep up), or if space on the target is very limited (e.g., you only want to keep X recent snapshots there for local restore, so you only send the latest N where N<X, then prune+GC).
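For the namespace-with-its-own-pruning approach, recent PBS versions let you scope a prune job to one namespace, roughly as sketched below. The datastore and namespace names are taken from the log above, the job ID, schedule and keep values are placeholders, and the option names should be confirmed with proxmox-backup-manager prune-job create --help (older versions configure pruning per datastore instead):

Code:
# prune policy that only applies to the pre-pruned namespace
# (placeholder values; verify option names with: proxmox-backup-manager prune-job create --help)
proxmox-backup-manager prune-job create prune-cluster \
    --store zwischenstation \
    --ns cluster \
    --schedule daily \
    --keep-daily 14 \
    --keep-weekly 4 \
    --keep-monthly 6 \
    --keep-yearly 1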
 
We see the same task errors on our newly installed PBS. We use SSD storage for the actual backup and sync some of the snapshots to RDX (where we still have to solve the eject/remount/media-changing process) through a localhost sync, as Proxmox Backup Server cannot sync from one repository to another without defining a remote PBS.
 
Like I said - the errors are an unfortunate side effect of how the connection closing is implemented, and hard to fix.
 
