Storage Replication - Starts fast, then slows to a crawl; second dataset slow from the start. Sending the snapshots another way goes at full speed.

effgee

Update: I believe this has to do with the WireGuard VPN, possibly an MTU issue, but the high speed dropping to a trickle is very odd. If anyone has any insight, I would appreciate it.
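
If it really is MTU related, I figure something like this should show it (a rough sketch; the tunnel address and interface name are just examples):

Code:
# Check whether full-size packets make it through the tunnel without
# fragmenting (-M do sets the don't-fragment bit; 1392 + 28 bytes of
# IP/ICMP headers = 1420, the usual WireGuard MTU). 10.0.0.2 is just an
# example tunnel address.
ping -M do -s 1392 -c 3 10.0.0.2

# If large packets get dropped, try lowering the MTU on the WireGuard
# interface (wg0 here) and re-test:
ip link set dev wg0 mtu 1380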

I'm fairly new to Proxmox's built-in replication, but I'm very familiar with zfs send/recv.

I'm testing out simple storage replication and running into an issue; I'm not sure what is going on.

The first dataset's transfer started out fast at 50-130 MB/s, which is normal for my network, then slowed to a crawl near the end.
CPU, memory, and network were all barely utilized; the transfer just dropped to a slow trickle.

The first dataset finished and the second dataset started, but the poor performance was there from the start, around 2 MB/s.

Eventually I cancelled the replication of the remaining datasets and used syncoid instead, which sent them at full speed.

I have included the log file, although I don't know how useful it will be.


Both nodes are completely up to date (free/no-subscription repository):

Code:
pveversion --verbose
proxmox-ve: 6.2-1 (running kernel: 5.4.44-2-pve)
pve-manager: 6.2-10 (running version: 6.2-10/a20769ed)
pve-kernel-5.4: 6.2-4
pve-kernel-helper: 6.2-4
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.44-2-pve: 5.4.44-2
pve-kernel-5.4.44-1-pve: 5.4.44-1
pve-kernel-4.15: 5.4-12
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-4.15.18-24-pve: 4.15.18-52
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 8.0-2
ifupdown: 0.8.35+pve1
ksmtuned: 4.20150325+b1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.4
libpve-access-control: 6.1-2
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.1-5
libpve-guest-common-perl: 3.1-2
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.2-5
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.2-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-9
pve-cluster: 6.1-8
pve-container: 3.1-12
pve-docs: 6.2-5
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-2
pve-firmware: 3.1-1
pve-ha-manager: 3.0-9
pve-i18n: 2.1-3
pve-qemu-kvm: 5.0.0-11
pve-xtermjs: 4.3.0-1
pve-zsync: 2.0-3
qemu-server: 6.2-11
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.4-pve1
 

Attachments

  • replicate-log-clipped.txt
    825.2 KB
I'm fairly new to Proxmox's built-in replication, but I'm very familiar with zfs send/recv.
The storage replication is zfs send/receive under the hood.
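Per replication run it boils down to roughly this (simplified sketch; the dataset and snapshot names are only placeholders, the real ones use the job ID and a timestamp):

Code:
# incremental send from the previous replication snapshot to the new one,
# piped over SSH into zfs recv on the target node (placeholder names)
zfs send -I rpool/data/vm-100-disk-0@prev-replication-snap \
    rpool/data/vm-100-disk-0@new-replication-snap \
    | ssh root@target-node zfs recv -F rpool/data/vm-100-disk-0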

Update: I believe this has to do with the WireGuard VPN, possibly an MTU issue, but the high speed dropping to a trickle is very odd. If anyone has any insight, I would appreciate it.
Did you test the replication without a VPN (to verify)?
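
You can also trigger a job by hand and watch its speed; something like the following (from memory, check man pvesr on your version for the exact options):

Code:
# show configured replication jobs with their last sync time and duration
pvesr status

# run a specific job (e.g. job 100-0) immediately instead of waiting
# for its schedule
pvesr schedule-now 100-0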
 
The storage replication is zfs send/receive under the hood.


Did you test the replication without a VPN (to verify)?
I never really figured out exactly what was causing it; it showed the same behavior over SSH as well. I ended up going with Sanoid/syncoid for replication, which has an adjustable mbuffer switch and is, no offense intended, generally a more clever and forgiving replication mechanism.

Adjusting the mbuffer (I gave it a 750 MB buffer) seemed to help, although there were still times when small amounts of data took far longer than they should. Is there a way to adjust buffers on the Proxmoxian replication?
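
For reference, the syncoid call that behaved well for me looked roughly like this (host and dataset names are placeholders; the --mbuffer-size option is from the syncoid docs, so double-check it against your version):

Code:
# push the dataset with a 750 MB mbuffer between the send and recv sides
syncoid --mbuffer-size=750M rpool/data/vm-100-disk-0 \
    root@target-node:rpool/data/vm-100-disk-0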
 
