Replication problem

cr_eis

Member
Feb 3, 2021
Hello,
When the scheduled replication job runs to another node, I sometimes get this error:

Replication job 120-0 with target 'pve2' and schedule '*/2:00' failed! Last successful sync: 2022-05-18 02:00:01 Next sync try: 2022-05-18 04:05:00 Failure count: 1 Error: command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve2' root@192.168.1.12 -- pvesr prepare-local-job 120-0 --scan local-zfs local-zfs:vm-120-disk-0 local-zfs:vm-120-disk-1 local-zfs:vm-120-disk-2 --last_sync 1652832001' failed: exit code 255

However, the replication job succeeds about 9 times out of 10.
I can't figure out what the problem is.
Can anyone help?
Thanks
 
Hi,
please post the output of pveversion -v and try to obtain the log right after the job failed (there's a Log button in the UI). Since you said it works most of the time, it might be a sporadic issue with too much load/timeouts. This has been improved a bit in libpve-storage-perl >= 7.2-1, but maybe it's still not enough for you?
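One detail worth checking while waiting for the log: OpenSSH exits with 255 when the error is in the SSH connection itself (name resolution, timeout, authentication); otherwise it passes through the remote command's exit status. So the 255 above may point at a transient connection problem rather than at pvesr. A minimal sketch of telling the two apart (host.invalid is a deliberately unresolvable placeholder; on your cluster you would use the same options as the job, i.e. -o HostKeyAlias=pve2 root@192.168.1.12):

```shell
# Sketch: a failing connection makes ssh itself return 255,
# matching the exit code in the replication error above.
# host.invalid is a placeholder name that cannot resolve.
ssh -o BatchMode=yes -o ConnectTimeout=2 host.invalid true
echo "ssh exit code: $?"
```

If the manual run against the real target also intermittently returns 255, the issue is on the SSH/network side; if it consistently succeeds, the failure is more likely load/timeout-related on the pvesr side, as suggested above.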
 
Hello,
Here's the output of pveversion -v. Yesterday I did the upgrade, but the problem occurred again last night.

Code:
proxmox-ve: 7.2-1 (running kernel: 5.15.35-1-pve)
pve-manager: 7.2-3 (running version: 7.2-3/c743d6c1)
pve-kernel-5.15: 7.2-3
pve-kernel-helper: 7.2-3
pve-kernel-5.13: 7.1-9
pve-kernel-5.11: 7.0-10
pve-kernel-5.15.35-1-pve: 5.15.35-3
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-4-pve: 5.11.22-9
ceph-fuse: 15.2.14-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-8
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-6
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.2-2
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.12-1
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.1.8-1
proxmox-backup-file-restore: 2.1.8-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-10
pve-cluster: 7.2-1
pve-container: 4.2-1
pve-docs: 7.2-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.4-2
pve-ha-manager: 3.3-4
pve-i18n: 2.7-1
pve-qemu-kvm: 6.2.0-6
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-2
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.4-pve1

Regarding the error log: is it possible to keep all the logs instead of only the last one?
Thanks
 
Regarding the error log: is it possible to keep all the logs instead of only the last one?
Unfortunately only the last one is kept. You might also want to check /var/log/syslog on both nodes from around the time the issue occurred.