PVESR - Volume already exists

CharlesErickT

Hello,

I've recently set up 2 nodes (5.1), one being a master that uses PVESR to replicate to the other (slave) node.

Everything worked for a week, but now one KVM guest out of the bunch won't replicate onto the slave node. I'm getting an error that the volume already exists, but to my understanding that should be fine, since it should just overwrite it.

Here are the logs of my job:
Code:
2017-11-03 16:04:01 128-0: start replication job
2017-11-03 16:04:01 128-0: guest => VM 128, running => 6216
2017-11-03 16:04:01 128-0: volumes => local-zfs:vm-128-disk-1
2017-11-03 16:04:02 128-0: freeze guest filesystem
2017-11-03 16:04:10 128-0: create snapshot '__replicate_128-0_1509739441__' on local-zfs:vm-128-disk-1
2017-11-03 16:04:10 128-0: thaw guest filesystem
2017-11-03 16:04:14 128-0: full sync 'local-zfs:vm-128-disk-1' (__replicate_128-0_1509739441__)
2017-11-03 16:04:15 128-0: full send of rpool/data/vm-128-disk-1@__replicate_128-0_1509739441__ estimated size is 79.8G
2017-11-03 16:04:15 128-0: total estimated size is 79.8G
2017-11-03 16:04:15 128-0: TIME SENT SNAPSHOT
2017-11-03 16:04:15 128-0: rpool/data/vm-128-disk-1 name rpool/data/vm-128-disk-1 -
2017-11-03 16:04:15 128-0: volume 'rpool/data/vm-128-disk-1' already exists
2017-11-03 16:04:15 128-0: warning: cannot send 'rpool/data/vm-128-disk-1@__replicate_128-0_1509739441__': signal received
2017-11-03 16:04:15 128-0: cannot send 'rpool/data/vm-128-disk-1': I/O error
2017-11-03 16:04:15 128-0: command 'zfs send -Rpv -- rpool/data/vm-128-disk-1@__replicate_128-0_1509739441__' failed: exit code 1
2017-11-03 16:04:15 128-0: delete previous replication snapshot '__replicate_128-0_1509739441__' on local-zfs:vm-128-disk-1
2017-11-03 16:04:16 128-0: end replication job with error: command 'set -o pipefail && pvesm export local-zfs:vm-128-disk-1 zfs - -with-snapshots 1 -snapshot __replicate_128-0_1509739441__ | /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=dev-proxmox-2' root@192.168.1.173 -- pvesm import local-zfs:vm-128-disk-1 zfs - -with-snapshots 1' failed: exit code 255


From what I can see, it seems like it's trying to send the whole disk again (which already exists on the target node) instead of just an incremental snapshot.
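For context, here is roughly the difference as I understand it (an illustration only, not taken from pvesr itself; the "old" snapshot name below is just a placeholder): an incremental send needs a matching __replicate__ snapshot on both nodes, while a full stream cannot be received into a volume that already exists on the target.

Code:
# Illustration only -- roughly what the pvesm export/import pipeline wraps.
# Incremental send: needs a common base snapshot on BOTH nodes
# ("__replicate_128-0_old__" is a placeholder for the previous state snapshot):
zfs send -i rpool/data/vm-128-disk-1@__replicate_128-0_old__ \
         rpool/data/vm-128-disk-1@__replicate_128-0_1509739441__ \
  | ssh root@192.168.1.173 zfs recv rpool/data/vm-128-disk-1

# Full send, as in my log -- the receive side rejects it because
# rpool/data/vm-128-disk-1 already exists on the target:
zfs send -Rpv rpool/data/vm-128-disk-1@__replicate_128-0_1509739441__ \
  | ssh root@192.168.1.173 zfs recv rpool/data/vm-128-disk-1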

And here is my pveversion -v:

Code:
proxmox-ve: 5.1-25 (running kernel: 4.13.4-1-pve)
pve-manager: 5.1-35 (running version: 5.1-35/722cc488)
pve-kernel-4.13.4-1-pve: 4.13.4-25
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-15
qemu-server: 5.0-17
pve-firmware: 2.0-3
libpve-common-perl: 5.0-20
libpve-guest-common-perl: 2.0-13
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-16
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.1-12
pve-qemu-kvm: 2.9.1-2
pve-container: 2.0-17
pve-firewall: 3.0-3
pve-ha-manager: 2.0-3
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-2
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.7.2-pve1~bpo90

Any idea what might cause this?

Also, is there a way to receive an email notification on a failed sync job?

Thanks
 
Hi,

I think you have lost your state; this happens if you do an HA migration (fail-over) or a manual migration on the pmxcfs.

The solution is to erase the job and make a new one.
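A rough sketch of the CLI steps (job ID, target node and dataset name taken from the log above; adjust the schedule to whatever you need):

Code:
# remove the broken replication job
pvesr delete 128-0

# if the stale replica is still present on the target node afterwards,
# remove it there so the new job can do a clean full sync
# (run on dev-proxmox-2):
zfs destroy -r rpool/data/vm-128-disk-1

# recreate the job (the schedule here is just an example: every 15 minutes)
pvesr create-local-job 128-0 dev-proxmox-2 --schedule "*/15"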

Also, is there a way to receive an email notification on a failed sync job?
Will come soon.
 

This is weird, as I haven't used those features. It just stopped working. In the meantime, I will delete the job and recreate it. I will also monitor it closely to make sure it doesn't break again.

Thanks for your help
 
I keep having the same issue. Replication is happening every 10 minutes, and then it suddenly fails, complaining "volume xxx already exists" and trying to do a full replication even though this should not be necessary. Ten minutes earlier it worked.

I have already opened a ticket, but I think this is a bug in PVE.

I wish there were a manual recovery procedure to avoid having to destroy the target file system/volume and re-transmit tons of data every time it reaches this state.
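For what it's worth, this is roughly how I inspect the jobs when it happens (the log path is an assumption on my part; the job ID is the one from the first post, substitute your own):

Code:
# overview of the configured jobs and their last sync / fail count
pvesr list
pvesr status

# per-job log, which unfortunately gets rewritten on every attempt
# (path assumed; on my nodes the job logs appear under /var/log/pve/replicate/)
cat /var/log/pve/replicate/128-0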
 
This issue appears when the destination storage is under some load.
As far as I can see, pvesr performs these steps:
1. Make a snapshot on the source storage
2. Send it to the destination
3. Remove the previous snapshot from the destination storage - here it hits a 10 second timeout! The zfs destroy operation on your storage sometimes takes longer than that.
4. pvesr gets the timeout error from step 3 and removes all snapshots it made in this iteration, on both source and destination.
5. But the zfs destroy operation from step 3 also completes successfully (it just takes more than 10 seconds).

Now the destination storage is left without any state snapshots, which prevents applying future incremental snapshots.
You need to remove the destination volume and do a full sync again.
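To illustrate (just a sketch, with the dataset, node and job ID from the first post): you can check whether a common __replicate__ state snapshot still exists on both sides; if it is gone on either side, only a full sync is possible, and that then trips over the already existing target volume.

Code:
# on the source node
zfs list -t snapshot | grep vm-128-disk-1

# on the destination node
ssh root@192.168.1.173 zfs list -t snapshot | grep vm-128-disk-1

# if no matching __replicate_128-0_*__ snapshot is left on both sides,
# the only way forward at the moment is to drop the replica on the
# destination so the next run can do a full sync again:
ssh root@192.168.1.173 zfs destroy -r rpool/data/vm-128-disk-1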
 
It would be good if the Proxmox team made an improvement so that this timeout could be configured per storage.
 
@PVE team:

High load on a destination breaking replication is not acceptable behavior. Having to re-send hundreds of gigabytes of data due to a missing recovery routine can definitely not be the answer and is not feasible in a production environment.

I suggest fixing the bug whereby high load on a destination machine can irrecoverably break replication. Please try to find a way to recover from a timeout instead of breaking replication. If fixing this takes time, I suggest that you urgently

1) come up with a manual recovery routine that avoids having to re-send everything, or even
2) create an auto-recovery mechanism

Thank you!
 
I'm having the same issue in the home lab when it gets busy. I can't find the bug report in Bugzilla; did anyone file it (so I don't create a duplicate)?
 
I'm on vacation now, so please create it. I will add to it if necessary.
 
Thanks for the report.
 
Only partially related, and maybe worth an enhancement request:

If a replication fails, e.g. at 1:55 am, the system tries to do a full sync again and again, which obviously fails, and each attempt overwrites the previous replication log file. That makes it impossible to provide log information about the root cause of the very first failure due to the timeout issue. A continuous log file per replication job, rotated by logrotate, would probably be more helpful...
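Just to make the idea concrete, something along these lines (the log path is an assumption on my part, and it only makes sense once pvesr appends to the file instead of rewriting it):

Code:
# /etc/logrotate.d/pvesr -- hypothetical; assumes per-job logs
# under /var/log/pve/replicate/ that pvesr appends to
/var/log/pve/replicate/* {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
}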
 
The bug tracker entry is tagged with "Feedback".
This means it is fixed and waiting for feedback.
All versions of pve-guest-common > 2.0-15 contain the fix.
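To check whether you already have the fix (the package shows up as libpve-guest-common-perl in the pveversion -v output posted above):

Code:
# should report a version newer than 2.0-15
pveversion -v | grep guest-common

# pull the fixed package from your configured repository
apt-get update && apt-get dist-upgrade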
 
Thanks Wolfgang,

Sorry I didn't provide feedback: I'd since moved my storage to Ceph, since that's what we use at my work and I had to get familiar with it. But if I get a chance to reconfigure some storage back to pvesr, I'll give it a go.
 
Hello to all, I'm still experiencing the same issue as everybody else who has reported this problem. I really count on pvesr replication; I think this feature is a real step forward towards "2-node HA". It means that both hosts and storage can be considered redundant (not 100%, but a very high level of redundancy); within my clients' scope I would consider it 99% redundant. ZFS replication is so fast and effective that I run it every 5 minutes. Offline migration (which I can afford) works, and the sync jobs get perfectly adjusted/updated. Just as a test, I created a test VM on the other node and ran it between two replications, using the replicated image from the original VM; it ran correctly, I made some changes on the disk and powered it off. After the next replication those changes were overwritten by the original VM's state. Everything works as expected, and even more.
But if the above-mentioned errors arise, and unfortunately they do, the redundancy that I count on drops. In my case I did not have to delete the sync jobs, but I did have to delete all VM replicas on the destination host (and there was no load at all on the destination host; it was just collecting replicas, and HA was not involved). After that a full sync starts, and during that time (plus the time we did not know that errors had occurred) we are not protected as we should and want to be.
Additionally, if we don't get any notification that the incremental sync has failed, the situation gets even worse (in terms of risk). It would not be so bad if the full sync started on its own, but it does not; it waits for us to do "something".
Honestly, I think that improving the 2-node HA scenario in this way could help us a lot. There are many SMBs/SMEs that are too small to have shared SAN storage or something like that, but are big enough to have more than one server. As soon as they have two servers, a stable pvesr could offer a great deal in terms of redundancy.
Hopefully we will see some progress regarding this subject too.

Thank you very much in advance for your understanding and support
BR

Tonci Stipicevic
 
"2Node-HA"
There is no such thing as a 2-node "HA" cluster.
It is impossible to build HA with only two nodes.
You can run two nodes without HA, but never with HA.

I do not exactly understand your problem, because you describe your scenario but not the actual problem.
You have to provide information that we can really debug.
Also, please open a new thread, because, as you said, your pool has no high load.
The problem in this thread was about high load on ZFS pools.
 
Hi Wolfgang, yes, don't worry, my "mistake" :) you are right of course... there is no 2-node HA cluster, but in this scenario there will be an RPi as a 3rd node (the quorum one) and just 2 "data" nodes.
When I read this thread there was no doubt about whether I fit in or not. I get exactly the same error as CharlesErickT, who opened this thread. Suddenly, for "maybe no reason", the host decided to start a full replication instead of an incremental one (and then it fails).
Like I said, there was no load on the destination, but the result was the same failure everybody else has.
Btw, reading this whole thread, may I ask what makes you think high load was the sole cause of this misbehaviour?
My scenario, without high load, produced the same error.
Second btw: in my opinion it could be useful if we kept a few more snapshots after each replication...

So my scenario is: an HA (or non-HA) 2-node cluster that uses pvesr between the two nodes.
My problem is: suddenly replication fails; the log says the host tries to start a full replication, but it fails.
I'm running the latest no-subscription patches.

If you still think I should open a new thread, please let me know.

BR and thank you very much in advance

Tonci
 
