rsync failed: exit code 255

chebedewel

Hi,

I'm running a cluster of 6 hosts (VMs on ESXi) with a community subscription, one of which is used only for quarantine. All 6 run the same configuration (!): 3 handle two domains and 2 handle the rest (±300 domains) via MX records.
All of them are up to date: proxmox-mailgateway: 5.1-1 (API: 5.1-3/529c5439, running kernel: 4.15.18-7-pve)
All of them show similar error messages in daemon.log:
Code:
Dec 17 15:26:19 mail4 pmgmirror[1185]: starting cluster syncronization
Dec 17 15:26:20 mail4 pmgmirror[1185]: database sync 'mail21' failed - command 'rsync '--rsh=ssh -l root -o BatchMode=yes -o HostKeyAlias=mail21' -q --timeout 10 2001:0DB8::21:/var/spool/pmg /var/spool/pmg --files-from /tmp/quarantinefilelist.1185' failed: exit code 255
Dec 17 15:26:20 mail4 pmgmirror[1185]: database sync 'mail11' failed - command 'rsync '--rsh=ssh -l root -o BatchMode=yes -o HostKeyAlias=mail11' -q --timeout 10 2001:0DB8::11:/var/spool/pmg /var/spool/pmg --files-from /tmp/quarantinefilelist.1185' failed: exit code 255
Dec 17 15:26:20 mail4 pmgmirror[1185]: database sync 'mail24' failed - command 'rsync '--rsh=ssh -l root -o BatchMode=yes -o HostKeyAlias=mail24' -q --timeout 10 2001:0DB8::24:/var/spool/pmg /var/spool/pmg --files-from /tmp/quarantinefilelist.1185' failed: exit code 255
Dec 17 15:26:21 mail4 pmgmirror[1185]: database sync 'mail7' failed - command 'rsync '--rsh=ssh -l root -o BatchMode=yes -o HostKeyAlias=mail7' -q --timeout 10 2001:0DB8::7:/var/spool/pmg /var/spool/pmg --files-from /tmp/quarantinefilelist.1185' failed: exit code 255
Dec 17 15:26:21 mail4 pmgmirror[1185]: cluster syncronization finished  (4 errors, 1.80 seconds (files 0.25, database 1.55, config 0.00))
Oddly, there is no message for the mail12 node.
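To see at a glance which nodes fail and which address each one was contacted on, the failure lines can be summarised with a short script. This is only a sketch: the regular expression is written against the exact line shape in the log above, and the names `FAILED_SYNC` and `failed_syncs` are made up for this example, not part of pmgmirror.

```python
import re

# Matches the "database sync ... failed" lines as they appear in the log above.
# The pattern assumes the /var/spool/pmg source and destination shown there.
FAILED_SYNC = re.compile(
    r"database sync '(?P<node>[^']+)' failed - command 'rsync .*? "
    r"(?P<remote>\S+):/var/spool/pmg /var/spool/pmg .*exit code (?P<code>\d+)"
)

def failed_syncs(log_text):
    """Return (node, remote address, exit code) for every failed sync line."""
    results = []
    for line in log_text.splitlines():
        match = FAILED_SYNC.search(line)
        if match:
            results.append((match["node"], match["remote"], int(match["code"])))
    return results

# Two lines copied from the log above, used as sample input.
SAMPLE = (
    "Dec 17 15:26:20 mail4 pmgmirror[1185]: database sync 'mail21' failed - command 'rsync '--rsh=ssh -l root -o BatchMode=yes -o HostKeyAlias=mail21' -q --timeout 10 2001:0DB8::21:/var/spool/pmg /var/spool/pmg --files-from /tmp/quarantinefilelist.1185' failed: exit code 255\n"
    "Dec 17 15:26:20 mail4 pmgmirror[1185]: database sync 'mail11' failed - command 'rsync '--rsh=ssh -l root -o BatchMode=yes -o HostKeyAlias=mail11' -q --timeout 10 2001:0DB8::11:/var/spool/pmg /var/spool/pmg --files-from /tmp/quarantinefilelist.1185' failed: exit code 255\n"
)

if __name__ == "__main__":
    for node, remote, code in failed_syncs(SAMPLE):
        print(node, remote, code)
```

Run against the full daemon.log, this would list every failing node next to the address rsync was given, which makes it easy to compare against a node such as mail12 that never shows up.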
All the servers can reach one another over SSH.
The /tmp/quarantinefilelist.1185 file does not exist on any server (the final four digits differ on each server).
I looked into this because the quarantine is not "in sync": it only sees the two main domains. If I want to look at the quarantine of the 300 other domains, I have to log in to another server (?).
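Since ssh itself exits with status 255 when it hits an error of its own, one way to narrow this down is to repeat the SSH test with the same options that appear in the failing rsync command, against the same address. The sketch below only builds and runs such a command; the function names are invented for this example, and it assumes root key-based login between the nodes, as the cluster sync uses.

```python
import subprocess

def ssh_check_command(alias, address):
    """Build an ssh command using the options seen in the failing rsync call."""
    return [
        "ssh", "-l", "root",
        "-o", "BatchMode=yes",            # never prompt, fail instead
        "-o", f"HostKeyAlias={alias}",    # same host key lookup as in the log
        address,
        "true",                           # trivial remote command, only the status matters
    ]

def ssh_works(alias, address, timeout=15):
    """Return True if the non-interactive login succeeds (exit status 0)."""
    try:
        result = subprocess.run(
            ssh_check_command(alias, address),
            capture_output=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0

if __name__ == "__main__":
    # Node name and address taken from the log; adjust to the node being tested.
    print(ssh_works("mail21", "2001:0DB8::21"))
```

If this succeeds while the rsync call still fails, the problem is more likely in how the address is handed to rsync than in SSH connectivity.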

If someone could point me in the right direction, or even give me a solution to this problem, that would be awesome!
 
The /tmp/quarantinefilelist.1185 file does exist sometimes ...
Code:
-rw-r--r-- 1 root   root   8192 Dec 17 17:12 quarantinefilelist.1185
But the error is still there, so maybe the missing file is not the cause.
The mail12 node is registered in the cluster with its IPv4 address and does not show up in any error log.
Can I safely change the IP addresses in /etc/pmg/cluster.conf?
 
I replaced the IPv6 addresses with the IPv4 ones, and the error message is gone ...
Code:
Dec 17 18:00:30 mail4 pmgmirror[1185]: starting cluster syncronization
Dec 17 18:00:31 mail4 pmgmirror[1185]: cluster syncronization finished  (0 errors, 1.89 seconds (files 1.28, database 0.61, config 0.00))
And now the state of all the nodes went from syncing to active. And of course the quarantine host has access to all the quarantined emails!
So there is indeed an IPv6 bug in the sync that should be addressed.
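A possible explanation, which nothing in this thread confirms: rsync separates the host from the path at a colon, so a bare IPv6 literal such as 2001:0DB8::21:/var/spool/pmg is ambiguous, and the usual convention is to wrap the address in square brackets. The sketch below shows that formatting step in isolation. The helper name `rsync_remote` is hypothetical and is not taken from the pmgmirror code.

```python
import ipaddress

def rsync_remote(host, path):
    """Format host and path as an rsync remote-shell source or destination."""
    try:
        is_ipv6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        is_ipv6 = False  # not an IP literal, e.g. a hostname
    if is_ipv6:
        # Brackets keep the colons inside the address apart from the
        # colon that separates host and path.
        host = f"[{host}]"
    return f"{host}:{path}"

if __name__ == "__main__":
    print(rsync_remote("2001:0DB8::21", "/var/spool/pmg"))
    print(rsync_remote("mail12", "/var/spool/pmg"))
```

IPv4 addresses and hostnames contain no colons, so they pass through unchanged, which would be consistent with mail12 (registered by IPv4) never appearing in the error log.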
 
