error during replication

jensus11

Well-Known Member
Hello,

I've recently been getting error messages from two of my replication jobs.
How can I fix this?
What additional information do you need from me?
2024-01-01 17:13:45 111-0: start replication job
2024-01-01 17:13:45 111-0: guest => CT 111, running => 0
2024-01-01 17:13:45 111-0: volumes => HV-Speicher:subvol-111-disk-0
2024-01-01 17:13:46 111-0: create snapshot '__replicate_111-0_1704125625__' on HV-Speicher:subvol-111-disk-0
2024-01-01 17:13:46 111-0: using secure transmission, rate limit: none
2024-01-01 17:13:46 111-0: full sync 'HV-Speicher:subvol-111-disk-0' (__replicate_111-0_1704125625__)
2024-01-01 17:13:48 111-0: full send of HV-Speicher/subvol-111-disk-0@Sicherheitssnapshot estimated size is 25.3G
2024-01-01 17:13:48 111-0: send from @Sicherheitssnapshot to HV-Speicher/subvol-111-disk-0@vzdump estimated size is 5.26M
2024-01-01 17:13:48 111-0: send from @vzdump to HV-Speicher/subvol-111-disk-0@__replicate_111-0_1704125625__ estimated size is 7.85M
2024-01-01 17:13:48 111-0: total estimated size is 25.3G
2024-01-01 17:13:48 111-0: TIME SENT SNAPSHOT HV-Speicher/subvol-111-disk-0@Sicherheitssnapshot
2024-01-01 17:13:49 111-0: 17:13:49 51.4M HV-Speicher/subvol-111-disk-0@Sicherheitssnapshot
2024-01-01 17:13:50 111-0: client_loop: send disconnect: Broken pipe
2024-01-01 17:13:50 111-0: warning: cannot send 'HV-Speicher/subvol-111-disk-0@Sicherheitssnapshot': signal received
2024-01-01 17:13:50 111-0: TIME SENT SNAPSHOT HV-Speicher/subvol-111-disk-0@vzdump
2024-01-01 17:13:50 111-0: warning: cannot send 'HV-Speicher/subvol-111-disk-0@vzdump': Broken pipe
2024-01-01 17:13:50 111-0: TIME SENT SNAPSHOT HV-Speicher/subvol-111-disk-0@__replicate_111-0_1704125625__
2024-01-01 17:13:50 111-0: warning: cannot send 'HV-Speicher/subvol-111-disk-0@__replicate_111-0_1704125625__': Broken pipe
2024-01-01 17:13:50 111-0: cannot send 'HV-Speicher/subvol-111-disk-0': I/O error
2024-01-01 17:13:50 111-0: command 'zfs send -Rpv -- HV-Speicher/subvol-111-disk-0@__replicate_111-0_1704125625__' failed: exit code 1
2024-01-01 17:13:50 111-0: delete previous replication snapshot '__replicate_111-0_1704125625__' on HV-Speicher:subvol-111-disk-0
2024-01-01 17:13:50 111-0: end replication job with error: command 'set -o pipefail && pvesm export HV-Speicher:subvol-111-disk-0 zfs - -with-snapshots 1 -snapshot __replicate_111-0_1704125625__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=node2' root@192.168.1.111 -- pvesm import HV-Speicher:subvol-111-disk-0 zfs - -with-snapshots 1 -snapshot __replicate_111-0_1704125625__ -allow-rename 0' failed: exit code 255
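
The "client_loop: send disconnect: Broken pipe" and the final exit code 255 come from the ssh part of the pipeline, so it looks like the SSH connection to the target node drops during the transfer. A quick test of just the SSH leg on its own, reusing the exact options, host alias and IP from the log above (the trailing /bin/true is only a harmless no-op used as a connection check):

/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=node2' root@192.168.1.111 -- /bin/true
echo $?
# 0 means the connection itself works; 255 means ssh could not reach or authenticate against node2

If that already fails, the problem is likely the node-to-node SSH setup rather than ZFS itself.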

[Screenshot attachment: Bildschirmfoto 2024-01-01 um 17.24.21.png]

When I look at the storage, the volumes that aren't replicated show user root on them.
Could that be the reason? If so, how can I change that?
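
For reference, one way to check who owns the dataset mountpoint on the source node (a sketch, assuming the pool HV-Speicher keeps its default mountpoint under /HV-Speicher; adjust the path if zfs get reports something else):

# where the container subvolume is mounted
zfs get -H -o value mountpoint HV-Speicher/subvol-111-disk-0
# owner and permissions of that directory (unprivileged containers usually show a high mapped UID instead of root)
ls -ld /HV-Speicher/subvol-111-disk-0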

[Screenshot attachment: Bildschirmfoto 2024-01-02 um 11.44.21.png]
 
I had the same issue with two VMs.

I did not check the ZFS volume ownership.

My fix was to remove the replication job, wait for it to be completely removed from the view, and create a new replication job.
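
If you prefer the command line, the same steps can be done with pvesr (a sketch; the job ID 111-0 and target node node2 are taken from the log above, and the schedule value is only an example):

# remove the broken job and wait until it no longer shows up
pvesr delete 111-0
pvesr list
# recreate the local replication job to the same target node, here every 15 minutes
pvesr create-local-job 111-0 node2 --schedule '*/15'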

Because of this, we will have to check the replication job status regularly. I have also set up a more frequent backup job on PBS, depending on how critical the VMs' data is and how often it changes.
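
For the regular checks, the status of all local jobs is also available on the CLI, which makes it easy to script (a small sketch; the job definitions themselves live in /etc/pve/replication.cfg):

# last/next sync time and state of every replication job on this node
pvesr status
# the cluster-wide job definitions
cat /etc/pve/replication.cfg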