Can't remove replicated VM disk

alc

Active Member
Feb 18, 2020
After launching a replication, the task failed and I received the following error:
Code:
end replication job with error: No common base snapshot on volume(s) local-zfs:vm-102-disk-0
Note that there are no snapshots associated with this VM.
I've also had this error with other VMs and CTs, none of which had snapshots.

Then when trying to remove the destination disk, I get "zfs error: cannot destroy 'rpool/vm-102-disk-0': dataset is busy".
Also note that the replication seems to be borked, as the destination drive shows a size of 0B.

I've waited a few hours and tried again with the same results.
I've also removed the replication tasks (and they did go away successfully, but I still couldn't delete the target disk).

What else can I try?
 

1. Check if the dataset is in use

  • Run the following command to check active usage (for a zvol, see also the block-device checks sketched at the end of this post):
    lsof | grep rpool/vm-102-disk-0

  • Ensure no mounts or locks are holding it:
    zfs get mounted rpool/vm-102-disk-0

  • If mounted, unmount it:
    zfs unmount rpool/vm-102-disk-0

2. Verify Replication State

  • Check for lingering replication configurations:
    cat /etc/pve/replication.cfg

  • If there are still replication entries for the affected VM/CT, remove them manually.

3. Force Delete the ZFS Dataset

  • If the dataset is still "busy":
    zfs destroy -f rpool/vm-102-disk-0

  • If it still refuses:
    zfs destroy -f -r rpool/vm-102-disk-0

4. Restart ZFS Services

  • Restart ZFS-related services:
    systemctl restart zfs.target

  • If the issue persists, reboot the node:
    reboot

5. Fix Future Replication Issues

Since replication requires snapshots:

  • Create a manual snapshot before starting a new replication (use the ZFS dataset path rather than the local-zfs storage ID; a verification sketch follows at the end of this post):
    zfs snapshot rpool/vm-102-disk-0@base
  • Then recreate the replication task in the Proxmox UI.
  • If issues persist, check system logs:
    journalctl -xe | grep zfs
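
Regarding step 5: a quick way to verify the snapshot actually exists before re-creating the replication job (a sketch only; Proxmox replication normally manages its own snapshots, typically named like __replicate_<job>_<timestamp>__, so a manual @base snapshot is mainly a sanity check):
Code:
zfs snapshot rpool/vm-102-disk-0@base          # manual snapshot from the bullet above
zfs list -t snapshot -r rpool/vm-102-disk-0    # verify it is listed before re-creating the job

And regarding steps 1 and 3: since vm-102-disk-0 is a zvol (a block device, not a mounted filesystem), lsof on the dataset name will usually miss whatever keeps it busy. A rough sketch of zvol-level checks, using the rpool/vm-102-disk-0 path from the error message; adjust to your layout:
Code:
# The zvol is exposed as a block device under /dev/zvol/<pool>/<dataset>
ls -l /dev/zvol/rpool/vm-102-disk-0         # resolves to /dev/zdN
lsblk /dev/zvol/rpool/vm-102-disk-0         # any child devices (partitions, LVM, ...)?
fuser -v /dev/zvol/rpool/vm-102-disk-0      # processes holding the device open
ls /sys/block/zd*/holders/                  # kernel-level holders (device-mapper etc.)
# A clone depending on the dataset can also block a destroy:
zfs list -H -o name,origin -r rpool | grep vm-102-disk-0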
 
Hi,
the replication uses ZFS snapshots to transfer the data. If there is no common snapshot between source and target, replication will not be possible. Please share the output of zpool history | grep vm-102-disk-0 on the target as well as pveversion -v and zfs list -r -t all rpool/data/vm-102-disk-0 on both source and target.

Is there any task referencing the disk still running, ps aux | grep -e pvesr -e vm-102-disk-0 -e zfs?
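
If it helps to narrow this down, here is a rough way to compare snapshot names between the two nodes (assuming the rpool/data/vm-102-disk-0 path mentioned above; adjust if the zvol lives directly under the pool):
Code:
# On the source node:
zfs list -H -o name -t snapshot -r rpool/data/vm-102-disk-0 | sed 's/.*@//' | sort > /tmp/snaps-source
# On the target node (then copy one file to the other node, e.g. with scp):
zfs list -H -o name -t snapshot -r rpool/data/vm-102-disk-0 | sed 's/.*@//' | sort > /tmp/snaps-target
comm -12 /tmp/snaps-source /tmp/snaps-target    # names present on both sides = candidate base snapshots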
 
Hello, sorry for the very long delay.
After an upgrade, I still can't delete the problematic disk image.

Here are the requested outputs, pasted externally when too long:

zpool history | grep vm-102-disk-0 on the target node [pastebin]
pveversion -v on the target [pastebin]
pveversion -v on the source [pastebin]

zfs list -r -t all rpool/vm-102-disk-0 on the target:
Code:
NAME                  USED  AVAIL  REFER  MOUNTPOINT
rpool/vm-102-disk-0  1.02G  1.42T     8K  -

zfs list -r -t all rpool/vm-102-disk-0 on the source:
Code:
NAME                  USED  AVAIL  REFER  MOUNTPOINT
rpool/vm-102-disk-0  1.02G  1.20T   445M  -


Also, thank you @shbaek, I've tried all the commands you provided, but with no success. Here are the results:
  • Run the following command to check active usage:
    lsof | grep rpool/vm-102-disk-0
On the target:
- No output
On the source:
- An extremely long list of "lsof: no pwd entry for UID 100100" messages, with various UIDs (100100, 100033, 100133, ...)
  • Ensure no mounts or locks are holding it:
    zfs get mounted rpool/vm-102-disk-0
On both target and source:
Code:
NAME                 PROPERTY  VALUE    SOURCE
rpool/vm-102-disk-0  mounted   -        -
  • If mounted, unmount it:
    zfs unmount rpool/vm-102-disk-0
On both target and source:
Code:
cannot open 'rpool/vm-102-disk-0': operation not applicable to datasets of this type
  • Check for lingering replication configurations:
    cat /etc/pve/replication.cfg
On both target and source:
- No mention of VM 102
  • If the dataset is still "busy":
    zfs destroy -f rpool/vm-102-disk-0
On both target and source:
cannot destroy 'rpool/vm-102-disk-0': dataset is busy
  • If it still refuses:
    zfs destroy -f -r rpool/vm-102-disk-0
On both target and source:
cannot destroy 'rpool/vm-102-disk-0': dataset is busy
  • Restart ZFS-related services:
    systemctl restart zfs.target

  • If the issue persists, reboot the node:
    reboot
Rebooting had no effect.
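Note that the "mounted -" value and the "operation not applicable to datasets of this type" message seem expected here, since a VM disk is a zvol rather than a filesystem, so there is nothing to unmount. A quick check of the dataset type:
Code:
zfs get type rpool/vm-102-disk-0    # reports "volume" for a VM disk zvol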
 
So the last operation in that zpool history output was a removal on March 4th; it's very strange that the image still seems to exist then:
2025-03-04.13:13:04 zfs destroy -r rpool/vm-102-disk-0

What does zpool status -v say? Do you see the image mentioned when you run ps aux | grep vm-102-disk-0?
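
In case it is useful, those checks as a single block (the last line is an extra, optional check that the zvol's block device still exists; path as used earlier in the thread):
Code:
zpool status -v                          # pool health and any reported errors
ps aux | grep vm-102-disk-0              # processes still referencing the image
ls -l /dev/zvol/rpool/vm-102-disk-0      # does the block device still exist?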