ZFS Replication running away?

rholighaus

I have the following file system that gets replicated to two other PVE nodes:

Code:
root@carrier:~# zfs list -ospace -tall -r rpool/data/subvol-104-disk-2
NAME                                                         AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
rpool/data/subvol-104-disk-2                                  855G   330G      249G   81.1G             0B         0B
rpool/data/subvol-104-disk-2@2019-04-18-000000                   -  1.64G         -       -              -          -
rpool/data/subvol-104-disk-2@2019-04-26-080000                   -   457M         -       -              -          -

[snapshots removed to shorten list]

rpool/data/subvol-104-disk-2@2019-05-20-080000                   -  10.0M         -       -              -          -
rpool/data/subvol-104-disk-2@2019-05-20-090000                   -  15.3M         -       -              -          -
rpool/data/subvol-104-disk-2@2019-05-20-100000                   -  14.1M         -       -              -          -
rpool/data/subvol-104-disk-2@2019-05-20-110000                   -  12.8M         -       -              -          -
rpool/data/subvol-104-disk-2@__replicate_104-1_1558344610__      -  8.28M         -       -              -          -
rpool/data/subvol-104-disk-2@2019-05-20-120000                   -  7.79M         -       -              -          -
rpool/data/subvol-104-disk-2@__replicate_104-2_1558348775__      -  8.75M         -       -              -          -
rpool/data/subvol-104-disk-2@__replicate_104-1_1558357140__      -  9.42M         -       -              -          -

Replication to one node (replication job 104-2) went well; the other one (replication job 104-1) is causing problems, but only for the second subvol (subvol-104-disk-2; this container still uses the old PVE disk numbering that started at 1):

Code:
2019-05-20 14:59:16 104-1: incremental sync 'rpool:subvol-104-disk-2' (__replicate_104-1_1558344610__ => __replicate_104-1_1558357140__)
2019-05-20 14:59:16 104-1: rpool/data/subvol-104-disk-2@__replicate_104-1_1558344610__    name    rpool/data/subvol-104-disk-2@__replicate_104-1_1558344610__    -
2019-05-20 14:59:17 104-1: send from @__replicate_104-1_1558344610__ to rpool/data/subvol-104-disk-2@2019-05-20-120000 estimated size is 157M
2019-05-20 14:59:17 104-1: send from @2019-05-20-120000 to rpool/data/subvol-104-disk-2@__replicate_104-2_1558348775__ estimated size is 47.7M
2019-05-20 14:59:17 104-1: send from @__replicate_104-2_1558348775__ to rpool/data/subvol-104-disk-2@__replicate_104-1_1558357140__ estimated size is 1.52T
2019-05-20 14:59:17 104-1: total estimated size is 1.52T

1.52TB is very strange. The current transfer has already reached 144GB and is still running.
Any idea what is happening?
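
The log shows that almost the entire estimate comes from the last increment (@__replicate_104-2_1558348775__ => @__replicate_104-1_1558357140__), so as a diagnostic idea I can at least list what changed in that window on the sending node (the diff itself can take a while on a dataset this size):

Code:
zfs diff rpool/data/subvol-104-disk-2@__replicate_104-2_1558348775__ \
    rpool/data/subvol-104-disk-2@__replicate_104-1_1558357140__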

I had migrated the LXC from one PVE node to the current one (restart method).
Replication to the former node ran well (job 104-2), while replication to another node (job 104-1) seems to cause problems.
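
Since the container was recently migrated, one thing I want to rule out is that the incremental base snapshot used by job 104-1 is not actually the same snapshot object on both sides. As far as I know the snapshot GUIDs can be compared for that (carrier-3 being the receiving node shown further down):

Code:
# on the sending node
zfs list -t snapshot -o name,guid -r rpool/data/subvol-104-disk-2
# on the receiving node - the GUIDs of the common snapshots should match
ssh carrier-3 zfs list -t snapshot -o name,guid -r rpool/data/subvol-104-disk-2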

EDIT: Log from the still-running send job:

Code:
2019-05-20 15:35:12 104-1: 15:35:12    225G   rpool/data/subvol-104-disk-2@__replicate_104-1_1558357140__
2019-05-20 15:35:13 104-1: 15:35:13    225G   rpool/data/subvol-104-disk-2@__replicate_104-1_1558357140__
2019-05-20 15:35:14 104-1: 15:35:14    225G   rpool/data/subvol-104-disk-2@__replicate_104-1_1558357140__
2019-05-20 15:35:15 104-1: 15:35:15    226G   rpool/data/subvol-104-disk-2@__replicate_104-1_1558357140__

Here is the receiving side. Note that only 6.90G has been allocated so far, so a large part of the send stream must consist of data that never ends up allocated on disk:

Code:
NAME                                                         AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
rpool/data/subvol-104-disk-2                                  777G   281G      248G   25.8G             0B      6.90G
rpool/data/subvol-104-disk-2@2019-04-18-000000                   -  1.64G         -       -              -          -
rpool/data/subvol-104-disk-2@2019-04-26-080000                   -   457M         -       -              -          -

[snapshots omitted to shorten the list]

rpool/data/subvol-104-disk-2@2019-05-20-060000                   -  58.0M         -       -              -          -
rpool/data/subvol-104-disk-2@2019-05-20-070000                   -  10.0M         -       -              -          -
rpool/data/subvol-104-disk-2@2019-05-20-080000                   -  9.98M         -       -              -          -
rpool/data/subvol-104-disk-2@2019-05-20-090000                   -  15.2M         -       -              -          -
rpool/data/subvol-104-disk-2@2019-05-20-100000                   -  14.1M         -       -              -          -
rpool/data/subvol-104-disk-2@2019-05-20-110000                   -  12.8M         -       -              -          -
rpool/data/subvol-104-disk-2@__replicate_104-1_1558344610__      -  8.22M         -       -              -          -
rpool/data/subvol-104-disk-2@2019-05-20-120000                   -  7.75M         -       -              -          -
rpool/data/subvol-104-disk-2@__replicate_104-2_1558348775__      -     0B         -       -              -          -
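
To keep an eye on how much of the incoming stream actually ends up allocated, I am now re-checking the exact byte counts on the receiver every minute (interval picked arbitrarily):

Code:
# repeat every minute and print exact byte values, to compare against the send progress
watch -n 60 "zfs get -Hp used,usedbychildren rpool/data/subvol-104-disk-2"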
 
By now the transfer has reached 770GB:

Code:
2019-05-20 17:01:31 104-1: 17:01:31    771G   rpool/data/subvol-104-disk-2@__replicate_104-1_1558357140__
2019-05-20 17:01:32 104-1: 17:01:32    771G   rpool/data/subvol-104-disk-2@__replicate_104-1_1558357140__
2019-05-20 17:01:33 104-1: 17:01:33    771G   rpool/data/subvol-104-disk-2@__replicate_104-1_1558357140__
2019-05-20 17:01:34 104-1: 17:01:34    771G   rpool/data/subvol-104-disk-2@__replicate_104-1_1558357140__
2019-05-20 17:01:35 104-1: 17:01:35    771G   rpool/data/subvol-104-disk-2@__replicate_104-1_1558357140__

And it has allocated about 18GB more on the receiving side:

Code:
root@carrier-3:~# zfs list -ospace -tall -r rpool/data/subvol-104-disk-2 | head
NAME                                                         AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
rpool/data/subvol-104-disk-2                                  759G   298G      248G   25.8G             0B      24.1G
 
Looks like this is one explanation. I still have to dig deeper into why the compression ratio is so huge (the subvolume holds a Zimbra file system that uses OpenLDAP, which makes use of sparse files, so that could be one reason):

Code:
rpool/data/subvol-104-disk-2  refcompressratio            27.51x                                                          -
rpool/data/subvol-104-disk-2  written                     93.9G                                                           -
rpool/data/subvol-104-disk-2  logicalused                 11.9T                                                           -
rpool/data/subvol-104-disk-2  logicalreferenced           4.50T
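
If I understand this correctly, a plain zfs send puts the blocks into the stream in their logical, uncompressed form, so with a refcompressratio of ~27x the stream can be more than an order of magnitude larger than the space shown as used on disk. Assuming the installed ZFS supports compressed send streams, a dry run with -c should confirm this by giving a much smaller estimate:

Code:
# dry run of the same incremental send, but keeping blocks compressed on the wire (-c)
zfs send -nvc -I rpool/data/subvol-104-disk-2@__replicate_104-1_1558344610__ \
    rpool/data/subvol-104-disk-2@__replicate_104-1_1558357140__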
 
