[SOLVED] Issue with replication

Lost_Ones

New Member
Oct 9, 2024
Hello,

I am hoping to get some direction on how to correct my issue.

I have a ZFS share from which I would like to replicate a VM to another device in the cluster. It should be noted that I needed to restore the Dst device and rebuild the ZFS share; this was working previously. I had also moved the storage on the Src from the zfs-share to local storage and rebuilt the ZFS share. Obviously, the issue remained.

When I set up and try to run my replication job, I get a message that my drive is full. I have seen several posts on this, and the issue seems to point to the refreservation being set and thus taking up the space. Here is the actual error I see on the Src host:

2024-10-16 16:49:02 120-0: start replication job
2024-10-16 16:49:02 120-0: guest => VM 120, running => 984986
2024-10-16 16:49:02 120-0: volumes => zfs-share:vm-120-disk-0
2024-10-16 16:49:02 120-0: create snapshot '__replicate_120-0_1729118942__' on zfs-share:vm-120-disk-0
2024-10-16 16:49:03 120-0: end replication job with error: zfs error: cannot create snapshot 'zfs-share/vm-120-disk-0@__replicate_120-0_1729118942__': out of space
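
For reference, the space-related properties on that volume can be pulled with something like this (the dataset name is taken from the log above; -p prints raw byte values, which are easier to compare):

Code:
# space-related properties of the volume named in the replication log
zfs get -p used,available,referenced,refreservation,usedbyrefreservation zfs-share/vm-120-disk-0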



My Src and Dst drives are the same size, and the VM size is 64 GB.
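
To double-check that, the pools on the two nodes can be compared with something like this (assuming the pool is called zfs-share on both hosts, as in the outputs in this thread):

Code:
# run on each node; SIZE and FREE should roughly match if the pools really are the same
zpool list zfs-share
# breakdown of where the space inside the pool is going
zfs list -o space zfs-share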

The Dst device is empty; here is the output of the command zfs get all on it:
NAME PROPERTY VALUE SOURCE
zfs-share type filesystem -
zfs-share creation Tue Oct 15 13:03 2024 -
zfs-share used 552K -
zfs-share available 108G -
zfs-share referenced 96K -
zfs-share compressratio 1.00x -
zfs-share mounted yes -
zfs-share quota none default
zfs-share reservation none default
zfs-share recordsize 128K default
zfs-share mountpoint /zfs-share default
zfs-share sharenfs off default
zfs-share checksum on default
zfs-share compression on local
zfs-share atime on default
zfs-share devices on default
zfs-share exec on default
zfs-share setuid on default
zfs-share readonly off default
zfs-share zoned off default
zfs-share snapdir hidden default
zfs-share aclmode discard default
zfs-share aclinherit restricted default
zfs-share createtxg 1 -
zfs-share canmount on default
zfs-share xattr on default
zfs-share copies 1 default
zfs-share version 5 -
zfs-share utf8only off -
zfs-share normalization none -
zfs-share casesensitivity sensitive -
zfs-share vscan off default
zfs-share nbmand off default
zfs-share sharesmb off default
zfs-share refquota none default
zfs-share refreservation none default
zfs-share guid 17752960111506726003 -
zfs-share primarycache all default
zfs-share secondarycache all default
zfs-share usedbysnapshots 0B -
zfs-share usedbydataset 96K -
zfs-share usedbychildren 456K -
zfs-share usedbyrefreservation 0B -
zfs-share logbias latency default

Any additional pointers on what to check to correct this would be much appreciated.

Regards
 
you need to check the actual volume that triggers the issue.

i.e., post the output of "zfs get all zfs-share/vm-120-disk-0"
 
Hello Fabian, and thank you for the response.

Are you referring to the Src host? I presume yes, as it doesn't exist on the destination yet.

Is the 'space' issue in relation to the Src disk?

root@proxmox2:~# zfs get all zfs-share/vm-120-disk-0
NAME PROPERTY VALUE SOURCE
zfs-share/vm-120-disk-0 type volume -
zfs-share/vm-120-disk-0 creation Tue Oct 15 13:50 2024 -
zfs-share/vm-120-disk-0 used 65.0G -
zfs-share/vm-120-disk-0 available 59.0G -
zfs-share/vm-120-disk-0 referenced 48.5G -
zfs-share/vm-120-disk-0 compressratio 1.24x -
zfs-share/vm-120-disk-0 reservation none default
zfs-share/vm-120-disk-0 volsize 64G local
zfs-share/vm-120-disk-0 volblocksize 16K default
zfs-share/vm-120-disk-0 checksum on default
zfs-share/vm-120-disk-0 compression on inherited from zfs-share
zfs-share/vm-120-disk-0 readonly off default
zfs-share/vm-120-disk-0 createtxg 150 -
zfs-share/vm-120-disk-0 copies 1 default
zfs-share/vm-120-disk-0 refreservation 65.0G local
zfs-share/vm-120-disk-0 guid 136628528238198903 -
zfs-share/vm-120-disk-0 primarycache all default
zfs-share/vm-120-disk-0 secondarycache all default
zfs-share/vm-120-disk-0 usedbysnapshots 0B -
zfs-share/vm-120-disk-0 usedbydataset 48.5G -
zfs-share/vm-120-disk-0 usedbychildren 0B -
zfs-share/vm-120-disk-0 usedbyrefreservation 16.5G -
zfs-share/vm-120-disk-0 logbias latency default
zfs-share/vm-120-disk-0 objsetid 145 -
zfs-share/vm-120-disk-0 dedup off default
zfs-share/vm-120-disk-0 mlslabel none default
zfs-share/vm-120-disk-0 sync standard default
zfs-share/vm-120-disk-0 refcompressratio 1.24x -
zfs-share/vm-120-disk-0 written 48.5G -
zfs-share/vm-120-disk-0 logicalused 60.3G -
zfs-share/vm-120-disk-0 logicalreferenced 60.3G -
zfs-share/vm-120-disk-0 volmode default default
zfs-share/vm-120-disk-0 snapshot_limit none default
zfs-share/vm-120-disk-0 snapshot_count none default
zfs-share/vm-120-disk-0 snapdev hidden default
zfs-share/vm-120-disk-0 context none default
zfs-share/vm-120-disk-0 fscontext none default
zfs-share/vm-120-disk-0 defcontext none default
zfs-share/vm-120-disk-0 rootcontext none default
zfs-share/vm-120-disk-0 redundant_metadata all default
zfs-share/vm-120-disk-0 encryption off default
zfs-share/vm-120-disk-0 keylocation none default
zfs-share/vm-120-disk-0 keyformat none default
zfs-share/vm-120-disk-0 pbkdf2iters 0 default
zfs-share/vm-120-disk-0 prefetch all default


Much thanks again, and best regards
 
yes, the issue is with the source dataset, as that is the one getting snapshotted for replication.

Code:
zfs-share/vm-120-disk-0 refreservation 65.0G local
zfs-share/vm-120-disk-0 usedbydataset 48.5G -

so your dataset currently uses 48.5G, and its refreservation is 65G. that means creating a snapshot will require at least 48.5G of additional space (since that is the data that will be referenced by your new snapshot, and you must still be able to fully overwrite the volume afterwards, which would take up another 65G). depending on your pool setup, there might be additional overhead (raidz can be particularly expensive). and of course, other volumes that are snapshotted as part of the same replication run might increase the space needed.
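
a minimal sketch of that comparison on the source host, using only standard ZFS properties (the result is a lower bound, since raidz padding and other pool overhead are not included):

Code:
# a new snapshot pins the data currently referenced by the zvol, while the
# refreservation must still guarantee a full overwrite afterwards
REF=$(zfs get -Hpo value referenced zfs-share/vm-120-disk-0)
AVAIL=$(zfs get -Hpo value available zfs-share/vm-120-disk-0)
echo "snapshot needs at least $REF bytes extra; $AVAIL bytes are available"

if the numbers leave too little headroom, similar threads usually end with either freeing pool space or dropping the guarantee via "zfs set refreservation=none zfs-share/vm-120-disk-0" (i.e. thin provisioning, which Proxmox also exposes as the storage's "Thin provision" option); whether that trade-off is acceptable is not something stated in this thread.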
 
Thank you for the insight. Cheers!
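
In case it helps anyone who finds this later: once there is enough headroom, the job can be re-triggered and watched from the CLI as well (the GUI's "Schedule now" button does the same; the job id is taken from the log at the top of the thread):

Code:
pvesr schedule-now 120-0
pvesr status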
 