pvesr vs vzdump

We're still testing ZFS replication using pvesr.
One issue we ran into: a vzdump snapshot was created during a backup while pvesr was doing its thing, and we got stuck with this vzdump snapshot. Trying to remove it only returned: snapshot 'vzdump' needed by replication job - run replication first...
We tried scheduling the replication again for "now", but this didn't help.

The only way we could get rid of this snapshot was to remove the replication for this VEID; only then could we remove the vzdump snapshot and set up replication again.

Did we hit a race condition here, or was it simply bad luck, or should we have done something differently to avoid this situation?
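In terms of commands, what we did corresponds roughly to the following (a sketch with hypothetical IDs: container 100, replication job 100-0 targeting node pve2; adjust to your setup):

Code:
# Remove the replication job so the snapshot is no longer referenced
pvesr delete 100-0

# Now the stuck snapshot can be removed (use qm delsnapshot for a VM instead of a container)
pct delsnapshot 100 vzdump

# Re-create the replication job afterwards (every 15 minutes in this example)
pvesr create-local-job 100-0 pve2 --schedule '*/15'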
 
If you encounter it again, please document the state of the snapshots on the source and all target nodes (via zfs list) and, if possible via logs, also the order of actions (i.e., replication to node A, backup start, replication to node B, backup end). It does sound unexpected.

Please also ensure you are on the latest version; there have been replication-related edge cases fixed in the past :)
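If it helps, the relevant state can be captured along these lines (a sketch, assuming a guest with ID 100 and two disks on rpool/data; adjust the dataset names to your layout, and run the zfs command on the source and on every target node):

Code:
# Snapshot state of the guest's volumes
zfs list -t snapshot -o name,guid,creation rpool/data/subvol-100-disk-0 rpool/data/subvol-100-disk-1

# Replication job overview (last sync, next run, failures)
pvesr status

# Scheduler log covering the replication and backup runs (pvescheduler drives replication on PVE 7.2+)
journalctl -u pvescheduler --since yesterday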
 
Systems are at the latest available version (imho) - will report back if we run into this again :)

Code:
pveversion -v
proxmox-ve: 7.3-1 (running kernel: 5.15.83-1-pve)
pve-manager: 7.3-4 (running version: 7.3-4/d69b70d4)
pve-kernel-5.15: 7.3-1
pve-kernel-helper: 7.3-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
pve-kernel-5.15.74-1-pve: 5.15.74-1
pve-kernel-5.15.30-2-pve: 5.15.30-3
ceph-fuse: 15.2.16-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.3
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.3-1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-1
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-5
libpve-storage-perl: 7.3-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.3.2-1
proxmox-backup-file-restore: 2.3.2-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.0-1
proxmox-widget-toolkit: 3.5.3
pve-cluster: 7.3-1
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-7
pve-firmware: 3.6-2
pve-ha-manager: 3.5.1
pve-i18n: 2.8-1
pve-qemu-kvm: 7.1.0-4
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-2
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+2
vncterm: 1.7-1
zfsutils-linux: 2.1.7-pve2
 
Hi,
Tried scheduling the replication again for "now", but this didn't help.
What exactly does "didn't help" mean? Did replication fail? Or did removing the vzdump snapshot still not succeed after the replication finished?
 
I am stuck in the same situation. Any solutions?
I am running today's latest versions from the enterprise repo.
Replication works fine.

Source (TM Pool is excluded from replication):
Code:
rpool/data/subvol-234005-disk-0@automittag231115120004               2.08M      -      842M  -
rpool/data/subvol-234005-disk-0@vzdump                               1.39M      -      842M  -
rpool/data/subvol-234005-disk-0@automittag231116120004                900K      -      842M  -
rpool/data/subvol-234005-disk-0@__replicate_234005-0_1700134500__     720K      -      842M  -

rpool/data/subvol-234005-disk-1@automittag231115120004                  0B      -      454G  -
rpool/data/subvol-234005-disk-1@vzdump                                  0B      -      454G  -
rpool/data/subvol-234005-disk-1@automittag231116120004                  0B      -      454G  -
rpool/data/subvol-234005-disk-1@__replicate_234005-0_1700134500__       0B      -      454G  -

TMPool/subvol-234005-disk-0@automittag231115120004                   69.9M      -      262G  -
TMPool/subvol-234005-disk-0@automittag231116120004                    168K      -      263G  -

Target:
Code:
rpool/data/subvol-234005-disk-0@automittag231115120004               2.07M      -      842M  -
rpool/data/subvol-234005-disk-0@vzdump                               1.38M      -      842M  -
rpool/data/subvol-234005-disk-0@automittag231116120004                892K      -      842M  -
rpool/data/subvol-234005-disk-0@__replicate_234005-0_1700134500__     224K      -      842M  -

rpool/data/subvol-234005-disk-1@automittag231115120004                  0B      -      454G  -
rpool/data/subvol-234005-disk-1@vzdump                                  0B      -      454G  -
rpool/data/subvol-234005-disk-1@automittag231116120004                  0B      -      454G  -
rpool/data/subvol-234005-disk-1@__replicate_234005-0_1700134500__       0B      -      454G  -

The vzdump snapshot is not deletable:
TASK ERROR: snapshot 'vzdump' needed by replication job '234005-0' - run replication first
 
Hi,
Could you share the output of
Code:
pct config 234005
zfs list -t snapshot -o name,guid,creation rpool/data/subvol-234005-disk-0 rpool/data/subvol-234005-disk-1
pveversion -v
(with the ZFS command on both nodes)?

Does manually running the replication make the vzdump snapshot deletable?
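For example (a sketch using the guest and job IDs from your output):

Code:
# Ask the scheduler to run the replication job as soon as possible
pvesr schedule-now 234005-0

# Check the job state afterwards
pvesr status

# Then retry removing the snapshot
pct delsnapshot 234005 vzdump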
 
Does manually running the replication make the vzdump snapshot deletable?
No, it is still not deletable, but the daily backup seems to be able to create new snapshots.

Backup Log:
Code:
INFO: found old vzdump snapshot (force removal)
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: suspend vm to make snapshot
INFO: create storage snapshot 'vzdump'
INFO: resume vm
....
INFO: cleanup temporary 'vzdump' snapshot
snapshot 'vzdump' was not (fully) removed - snapshot 'vzdump' needed by replication job '234005-0' - run replication first

pve2: main server
pve1: backup/replication target

Code:
root@pve2:~# pct config 234005
arch: amd64
cores: 2
features: nesting=1
hostname: fileserver
memory: 256
mp0: local-zfs:subvol-234005-disk-1,mp=/mnt/data,backup=1,size=800G
mp1: TimeMachine:subvol-234005-disk-0,mp=/mnt/TimeMachine,replicate=0,size=700G
net0: name=eth0,bridge=vmbr0,gw=xxx.xxx.xxx.x,hwaddr=xx:xx:xx:xx:xx:xx,ip=xxx.xxx.xxx.x/24,type=veth
ostype: debian
parent: automittag231122120004
rootfs: local-zfs:subvol-234005-disk-0,size=8G
swap: 0
tags: intern
unprivileged: 1
Code:
root@pve1:~# zfs list -t snapshot -o name,guid,creation rpool/data/subvol-234005-disk-0 rpool/data/subvol-234005-disk-1
NAME                                                                GUID  CREATION
rpool/data/subvol-234005-disk-0@automittag231117120004             9418401079994353929  Fri Nov 17 12:00 2023
rpool/data/subvol-234005-disk-0@automittag231118120004             6340438081927422241  Sat Nov 18 12:00 2023
rpool/data/subvol-234005-disk-0@vzdump                             7703056407260815176  Sat Nov 18 23:01 2023
rpool/data/subvol-234005-disk-0@__replicate_234005-0_1700352006__  3069050107047120567  Sun Nov 19  1:00 2023
rpool/data/subvol-234005-disk-1@automittag231117120004             11725357021604118369  Fri Nov 17 12:00 2023
rpool/data/subvol-234005-disk-1@automittag231118120004             10023616640817193619  Sat Nov 18 12:00 2023
rpool/data/subvol-234005-disk-1@vzdump                             5965778584871507683  Sat Nov 18 23:01 2023
rpool/data/subvol-234005-disk-1@__replicate_234005-0_1700352006__  623174949071739542  Sun Nov 19  1:00 2023
Code:
root@pve2:~# zfs list -t snapshot -o name,guid,creation rpool/data/subvol-234005-disk-0 rpool/data/subvol-234005-disk-1
NAME                                                                GUID  CREATION
rpool/data/subvol-234005-disk-0@__replicate_234005-0_1700352006__  3069050107047120567  Sun Nov 19  1:00 2023
rpool/data/subvol-234005-disk-0@automittag231121120004             10144814189766887309  Tue Nov 21 12:00 2023
rpool/data/subvol-234005-disk-0@vzdump                             886681946747281139  Tue Nov 21 23:01 2023
rpool/data/subvol-234005-disk-0@automittag231122120004             17589802794098123352  Wed Nov 22 12:00 2023
rpool/data/subvol-234005-disk-1@__replicate_234005-0_1700352006__  623174949071739542  Sun Nov 19  1:00 2023
rpool/data/subvol-234005-disk-1@automittag231121120004             14336628619649610285  Tue Nov 21 12:00 2023
rpool/data/subvol-234005-disk-1@vzdump                             11023239417511934089  Tue Nov 21 23:01 2023
rpool/data/subvol-234005-disk-1@automittag231122120004             12301492413885027006  Wed Nov 22 12:00 2023
Code:
root@pve2:~# pveversion -v
proxmox-ve: 8.0.2 (running kernel: 6.2.16-18-pve)
pve-manager: 8.0.4 (running version: 8.0.4/d258a813cfa6b390)
pve-kernel-6.2: 8.0.5
proxmox-kernel-helper: 8.0.3
proxmox-kernel-6.2.16-19-pve: 6.2.16-19
proxmox-kernel-6.2: 6.2.16-19
proxmox-kernel-6.2.16-18-pve: 6.2.16-18
proxmox-kernel-6.2.16-14-pve: 6.2.16-14
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph-fuse: 17.2.6-pve1+3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.5
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.8
libpve-guest-common-perl: 5.0.4
libpve-http-server-perl: 5.0.4
libpve-rs-perl: 0.8.5
libpve-storage-perl: 8.0.2
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.4-1
proxmox-backup-file-restore: 3.0.4-1
proxmox-kernel-helper: 8.0.3
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.0.9
pve-cluster: 8.0.3
pve-container: 5.0.4
pve-docs: 8.0.5
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.8-3
pve-ha-manager: 4.0.2
pve-i18n: 3.0.7
pve-qemu-kvm: 8.0.2-7
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.7
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.13-pve1
 
Sorry for the delay, I only got around to taking a closer look now. This is indeed a bug in the presence of a volume with replicate=0. However, backup should be able to force-remove the vzdump snapshot before it runs, so it shouldn't cause any real issues.

A fix has been sent to the mailing list: https://lists.proxmox.com/pipermail/pve-devel/2023-December/061048.html
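Until the fix is packaged, a quick way to spot guests on a node that have a volume excluded from replication (and could therefore leave such a snapshot behind) might be the following (a sketch; the paths cover containers and VMs on the local node):

Code:
# Guests on this node with at least one volume set to replicate=0
grep -l 'replicate=0' /etc/pve/local/lxc/*.conf /etc/pve/local/qemu-server/*.conf 2>/dev/null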
 