Error when trying to replicate/failover

sirebral

Feb 12, 2022
Hello!

I am working on setting up HA/replication this evening. I have 2 boxes and an arbiter node in the cluster. Replication only occurs between the two main nodes, PVE and PVE2. There's no shared storage, so everything is snapshots and local ZFS DAS. The boxes are interconnected at 40 Gbit over a single link dedicated to failover. Nothing points toward a connectivity issue; I just want to make my setup clear.

Here's what happens...

- I turn on HA for a CT or VM.
- I enable replication and choose the HA group that only contains PVE and PVE2.
- I force a replication just to be sure it's working, it comes up as OK and shows a last replication time.
- I force a failover (I've tested both directions with the same results).
- The failover is successful.
- I try forcing a fail back to the original node.
- Fail!
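For anyone wanting to reproduce the steps above from the CLI instead of the web UI, the replication jobs can be inspected and kicked off with `pvesr`. This is just a sketch; the job ID `102-0` is taken from the log below and will differ on other setups.

```shell
# Show all replication jobs on this node and their last sync result
pvesr status

# Trigger job 102-0 immediately instead of waiting for its schedule
pvesr schedule-now 102-0
```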

Here's the message in the log:

2023-04-20 00:55:02 102-0: start replication job
2023-04-20 00:55:02 102-0: guest => CT 102, running => 1
2023-04-20 00:55:02 102-0: volumes => nvmect:subvol-102-disk-0
2023-04-20 00:55:04 102-0: freeze guest filesystem
2023-04-20 00:55:04 102-0: create snapshot '__replicate_102-0_1681977302__' on nvmect:subvol-102-disk-0
2023-04-20 00:55:04 102-0: thaw guest filesystem
2023-04-20 00:55:04 102-0: using insecure transmission, rate limit: none
2023-04-20 00:55:04 102-0: incremental sync 'nvmect:subvol-102-disk-0' (__replicate_102-0_1681977055__ => __replicate_102-0_1681977302__)
2023-04-20 00:55:06 102-0: send from @__replicate_102-0_1681977055__ to nvme/ct/subvol-102-disk-0@__replicate_102-0_1681977302__ estimated size is 55.4M
2023-04-20 00:55:06 102-0: total estimated size is 55.4M
2023-04-20 00:55:07 102-0: warning: cannot send 'nvme/ct/subvol-102-disk-0@__replicate_102-0_1681977302__': signal received
2023-04-20 00:55:07 102-0: internal error: cannot send 'nvme/ct/subvol-102-disk-0': Connection reset by peer
2023-04-20 00:55:07 102-0: command 'zfs send -Rpv -I __replicate_102-0_1681977055__ -- nvme/ct/subvol-102-disk-0@__replicate_102-0_1681977302__' failed: got signal 6
2023-04-20 00:55:07 102-0: [pve] cannot receive incremental stream: incremental send stream requires -L (--large-block), to match previous receive.
2023-04-20 00:55:07 102-0: [pve] command 'zfs recv -F -- nvme/ct/subvol-102-disk-0' failed: exit code 1
2023-04-20 00:55:07 102-0: delete previous replication snapshot '__replicate_102-0_1681977302__' on nvmect:subvol-102-disk-0
2023-04-20 00:55:07 102-0: end replication job with error: command 'set -o pipefail && pvesm export nvmect:subvol-102-disk-0 zfs - -with-snapshots 1 -snapshot __replicate_102-0_1681977302__ -base __replicate_102-0_1681977055__' failed: exit code 255

I've checked the two pools; their ZFS settings are identical. They're both 4Kn and set to a 4K sector size (NVMe drives).
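For comparing the settings that actually matter here: the `-L (--large-block)` hint in the log points at the dataset's recordsize (anything above 128K makes the send stream require large-block support), while the 4K sector setting is the pool's ashift. A quick way to compare both sides, assuming the dataset path from the log and that the pool is named `nvme`:

```shell
# Dataset property that triggers the -L requirement on incremental sends
zfs get recordsize nvme/ct/subvol-102-disk-0
ssh root@pve zfs get recordsize nvme/ct/subvol-102-disk-0

# ashift (4K sectors) is a pool-level property, checked per pool
zpool get ashift nvme
```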

I've done some Google searching and come up with almost nothing, which is strange, so I'm kind of stuck. Can anyone give some insight into what I could try? I'm looking forward to having a happy failover cluster.

Thanks all,

Keith
 
Thanks for that, so it's broken in Proxmox. Do the devs have a fix planned? It seems like a really large issue that HA/failover is completely broken for anyone running a recordsize larger than 128K, since matching recordsize to your workload is good practice. Looking forward to the fix! :)

All the best and glad you were able to point out the thread.

Keith
 
OK, so, yes… adding -L to make the send flags -RpvL does the trick. I don't generally like modifying code that isn't my own project, but it works for the time being.
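For reference, the workaround boils down to adding -L (--large-block) to the zfs send options so the incremental stream carries blocks larger than 128K, matching the previous receive. A hand-run equivalent of what the replication job does, using the snapshot names and dataset from the log above (the target hostname is an assumption):

```shell
# Incremental send with large-block support, piped to the peer node.
# Snapshot names are taken from the failing job's log; "pve" is the
# receiving node in this example.
zfs send -RpvL -I __replicate_102-0_1681977055__ -- \
    nvme/ct/subvol-102-disk-0@__replicate_102-0_1681977302__ \
  | ssh root@pve zfs recv -F -- nvme/ct/subvol-102-disk-0
```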

Thanks again!
 