zfs error: cannot destroy : dataset is busy

svennd

I have a fresh install with ZFS raidz-3 and two LXC containers; one was for testing and I wanted to remove it. However, I got this error:
Code:
zfs error: cannot destroy 'rpool/ROOT/subvol-101-disk-1': dataset is busy

How can I force the removal? I have not rebooted the host, since that is not an acceptable solution ... The host is part of a cluster, but the container itself is obviously shut down.

cat /etc/pve/storage.cfg
Code:
dir: local
        path /var/lib/vz
        maxfiles 0
        content rootdir,images,iso,vztmpl

zfspool: lxc_storage
        pool rpool/ROOT
        content images,rootdir

Code:
root@noname:~# pveversion
pve-manager/4.1-5/f910ef5c (running kernel: 4.2.6-1-pve)
 
I have found that, yes; however, on the latest update this was already included:
Code:
    # Do not scan ZFS zvols (to avoid problems on ZFS zvols snapshots)
    filter = [ "r|/dev/zd*|" ]
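For reference, a sketch of where that filter lives (stock /etc/lvm/lvm.conf layout assumed) and a quick way to check that LVM no longer opens the zvol devices:
Code:
# /etc/lvm/lvm.conf -- inside the devices { ... } section
devices {
    # Do not scan ZFS zvols (to avoid problems on ZFS zvols snapshots)
    filter = [ "r|/dev/zd*|" ]
}

# afterwards this should not list any /dev/zd* devices any more
lvmdiskscan | grep /dev/zd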
 
Did you try to reboot the machine?
 
After shutting down both containers and rebooting, I could remove it. Then again, this is my test machine; I don't want to do that on a machine with multiple containers ... Is this a bug? How can I avoid this state? Thanks for the advice.

// edit
I took a snapshot of the other container (the one I did not want to remove); might that be the problem?
 
Can you give me a step-by-step description of how to reproduce this?
Please include the container's config.
 
- fresh install
- created a container (see the first post for /etc/pve/storage.cfg and the ZFS addition)
- created a snapshot
- added a server to the cluster
- on server 2, I added a container backup (from v3.x, OpenVZ)
- on server 1, I created a new container to test NFS
- after shutting down the new container, I tried to remove it ...

Rebooting on this setup is no problem, but it's not something I can do once I move the v3 cluster over to v4.
 
Same here. I powered the server off for maintenance on 19 March and started it again afterwards. Today one of the support guys killed MySQL in one of the LXC containers. I tried to recover it from an NFS backup, but got the same error:
TASK ERROR: zfs error: cannot destroy 'pool1/subvol-161-disk-1': dataset is busy
At the same time I tried to remove one of the unused LXC containers and got the error once more.

I have tried:
# zfs list -t snapshot
(no snapshots)
# zfs umount pool1/subvol-161-disk-1
(currently not mounted)
# zfs destroy pool1/subvol-161-disk-1
(same error)
None of it helped.

I then tried:
# grep pool1/subvol-161-disk-1 /proc/*/mounts
and got:
/proc/31344/mounts: pool1/subvol-161-disk-1 /pool1/subvol-161-disk-1 zfs rw,relatime,xattr,posixacl 0 0
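To see what that process actually is, something like this should do (31344 being the PID from the grep output above):
Code:
ps -fp 31344    # full listing for the PID that still has the subvol mounted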

I can't reboot the server to check right now, because it's a production server. I will try to reboot at midnight.
Any suggestions?

Thx.
 
The reboot fixed it for me, but it has held me back from moving our main production servers over to the latest version of Proxmox ... So wait for the next reboot ...
 
Yep, a reboot fixed it for me too. After the reboot I deleted everything manually using zfs destroy pool1/subvol_bla_bla_bla. Bugs like this are definitely not good on a production server.
 
Hi,

Same for me:

zfs destroy -r rpool/pve-container/vm-122-disk-1
cannot destroy 'rpool/pve-container/vm-122-disk-1': dataset is busy

ii proxmox-ve 4.2-60 all The Proxmox Virtual Environment

Linux ina-pmox-04 4.4.13-1-pve #1 SMP Tue Jun 28 10:16:33 CEST 2016 x86_64 GNU/Linux
 
Recently I tried to re-import some containers as unprivileged on our new Proxmox 4.3, but removing the containers gave me this error:

Task viewer: CT 1 - Destroy
TASK ERROR: zfs error: cannot destroy 'rpool/data/subvol-1-disk-1': dataset is busy

As the machine was stopped, no further mounts should have existed, but:

~# grep rpool/data/subvol-1-disk-1 /proc/*/mounts
/proc/16775/mounts:rpool/data/subvol-1-disk-1 /rpool/data/subvol-1-disk-1 zfs rw,noatime,xattr,posixacl 0 0

~# ps auxf | grep 16775
root 9718 0.0 0.0 14456 1696 pts/9 S+ 08:00 0:00 \_ grep 16775
root 16775 0.0 0.0 27876 2472 ? Ss 03:56 0:01 [lxc monitor] /var/lib/lxc 2

... so I restarted CT 2, and after that removing CT 1 was no longer a problem.
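As an alternative to restarting the other container, it might also work to drop the stale copy of the mount directly in that monitor's mount namespace (untested sketch; 16775 is the lxc monitor PID from the ps output above):
Code:
# unmount the leftover copy inside the mount namespace of PID 16775
nsenter --target 16775 --mount -- umount /rpool/data/subvol-1-disk-1
# the destroy should then go through without restarting the container
zfs destroy rpool/data/subvol-1-disk-1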
 
We're having the same issue here.

As Plexus has pointed out, the "lxc monitor" of every container seems to hold a handle on all mountpoints which are available when starting the container.

The issue only occurs after running a backup for another container.

Let's say we have two running containers: 1+2.
We'll try to remove 1 later. First we'll run a backup for 2.
During VZDUMP of 2, the root mountpoint is recursively made private ("mount --make-rprivate /"). 2 is started again after the backup.
Now the lxc-monitor of 2 holds a handle for the mountpoint of 1.

When you try to remove 1, the "umount" of the mountpoint isn't propagated to 2.
It won't be possible to delete the dataset, since 2 still holds the handle.
If you restart 2 now, then the dataset can be deleted.
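The effect can be reproduced without Proxmox at all. A rough illustration for a scratch machine (the dataset rpool/demo and the sleep stand-in are made up for this sketch, and it deliberately breaks mount propagation on /, so don't run it on a production host):
Code:
zfs create rpool/demo                                  # any filesystem you can mount/umount
mount --make-rprivate /                                # what vzdump did on the host
unshare --mount --propagation unchanged sleep 600 &    # stand-in for the lxc monitor of "2"
HOLDER=$!
umount /rpool/demo                                     # gone in the host namespace ...
grep rpool/demo /proc/$HOLDER/mounts                   # ... but the private copy is still there
zfs destroy rpool/demo                                 # -> cannot destroy: dataset is busy
kill $HOLDER                                           # "restart container 2"
zfs destroy rpool/demo                                 # now it works
mount --make-rshared /                                 # restore the default propagation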

@wolfgang: Did the commit b6c491ee4 on pve-container come from you? Would "--make-rslave" instead of "--make-rprivate" be enough to address the security issue you wanted to fix together with 2cfae16ee?
 
Thanks for catching this, a patch with a fix is on the pve-devel list!
 
I know this is a little old now, but for those still struggling with it: I was able to replicate and solve this without rebooting. Make sure the container you are trying to delete is not included in the current backup jobs. If it is, go to Datacenter > Backup, select the backup entry, click Edit, un-check the box for the container you want to delete, and click OK. I followed this process and was able to delete the ZFS volume without issues.
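If you prefer the shell, the scheduled backup jobs live in /etc/pve/vzdump.cron on these PVE versions, so a quick grep shows whether the container is still referenced by a job (101 is just an example VMID):
Code:
grep -n 101 /etc/pve/vzdump.cron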
 
I had a similar situation on a test server with an undestroyable "vm-137-disk-2" that was not mounted, had no snapshots, and whose VM 137 had not existed for a long time:

>zfs destroy -r -f rpool/data/vm-137-disk-2
cannot destroy 'rpool/data/vm-137-disk-2': dataset is busy

I fixed this with the following procedure:
>fuser -am /dev/rpool/data/vm-137-disk-2
/dev/zd64: 2633

PID 2633 was a KVM process.
I stopped all KVM VMs, and after that I was able to destroy vm-137-disk-2, even from the Proxmox GUI.
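For anyone who wants to double-check which zd device and which process are involved before stopping anything, roughly (the device and PID are the ones from my output above):
Code:
ls -l /dev/zvol/rpool/data/ | grep vm-137-disk-2   # which /dev/zdNN backs the zvol
fuser -vam /dev/zd64                               # verbose: shows user, access type and command
ps -fp 2633                                        # confirm what the PID is before stopping it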
 
Be careful not to destroy a disk that is still needed, as happened in my case :(. At least now I know whether my backup is working or not.
 
I ran into this same issue, and it turned out to be the SCSI multipath service.

After running

systemctl stop multipathd.service

I was able to destroy the disks.
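Rather than stopping multipathd altogether, blacklisting the zvol devices in /etc/multipath.conf should keep it from grabbing them in the first place (a sketch; merge this with whatever blacklist section you already have):
Code:
# /etc/multipath.conf -- keep multipathd away from ZFS zvols
blacklist {
    devnode "^zd[0-9]*"
}
Then restart the service (systemctl restart multipathd.service) so the new blacklist takes effect.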
 
