Can't remove old vm-disk "dataset is busy"

Stevebeer

New Member
Jan 29, 2023
Hi all,

From the GUI, I cannot delete a disk in the local-zfs storage list.
It's also not possible from the CLI.

Code:
root@pve:~# zfs destroy rpool/vm-109-disk-0
cannot destroy 'rpool/vm-109-disk-0': dataset is busy

I've stopped all the containers and VMs, and also restarted the hypervisor. Nothing helps.
Any idea where to look? I have been Googling for hours and can't find a working solution.
 
Then you should plug "zfs destroy cannot destroy dataset is busy" into Google and review the resulting articles. There could be many reasons for this condition, ranging from a PVE bug to system misconfiguration to something you inadvertently did. You just have to try various debug commands (e.g. fuser) to zero in on the culprit.
https://forum.proxmox.com/threads/zfs-error-cannot-destroy-dataset-is-busy.25811/
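
As a starting point, something along these lines (assuming the zvol shows up under /dev/zvol/ and as a /dev/zdN node, which is the default on ZFS-on-Linux; zd80 below is just an example name) can help narrow down what is holding the device:

Code:
# resolve the zvol to its /dev/zdN block device (udev creates these symlinks)
ls -l /dev/zvol/rpool/vm-109-disk-0

# check which processes are holding that block device open
fuser -vam /dev/zd80
lsof /dev/zd80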


 
Thank you for your quick responses.
Your exact Google suggestion pointed me towards a Github discussion:
link-discussion

I've looked up the device numbers and deleted the partitions on it, then created a new default partition.
After a reboot, I could finally remove the disks. :)

Code:
# find out which /dev/zdN device backs the zvol and what is using it
fuser -am /dev/rpool/vm-108-disk-0
/dev/zd80

# wipe the old partition table on that device (interactive gdisk session)
gdisk /dev/zd80
    # delete all partitions
    # create a new default partition
    # write the changes to disk

# reboot the host
reboot
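
For reference, the mapping between the zvol name and the /dev/zdN node can also be checked directly (assuming a default ZFS-on-Linux setup where udev creates the symlinks):

Code:
# both symlinks point at the same /dev/zdN device
ls -l /dev/rpool/vm-108-disk-0
ls -l /dev/zvol/rpool/vm-108-disk-0

# show the partitions the kernel still knows about on that device
lsblk /dev/zd80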
 
That worked here as well.
It's a bit annoying that we need to restart the entire PVE host when we want to delete an old VM disk...
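
Maybe a full reboot isn't strictly necessary; possibly it's enough to make the kernel drop the stale partition devices by re-reading the partition table after wiping it. I haven't tried this myself, so take it as a suggestion only:

Code:
# ask the kernel to re-read the (now empty) partition table of the zvol
partprobe /dev/zd80
# or alternatively
blockdev --rereadpt /dev/zd80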
 
Hello,

I had the same issue, and the reason was that the mdraid daemon was somehow keeping the ZFS resource busy. I had to stop the mdraid array sitting on top of the ZFS dataset, and then it could be destroyed without problems.
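
For anyone running into the same thing, roughly these steps should show whether an md array is sitting on the ZFS device and let you stop it (the array name md127 is just an example):

Code:
# list active md arrays and the devices they are built from
cat /proc/mdstat

# stop the array that references the ZFS device (example name)
mdadm --stop /dev/md127

# afterwards the zfs destroy should go through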
 
Not sure how relevant it is, but this is my experience after struggling A LOT with these almost random issues.

Note however that this is for a DATASET, NOT a ZVOL!

lsof unfortunately never worked for me when troubleshooting ZFS busy issues ...

Bash:
# Dataset to be destroyed
DATASET="zdata/PODMAN-TEST"

# Sequence of Attempts
root@HOSTNAME:~# zfs destroy "${DATASET}"
cannot destroy 'zdata/PODMAN-TEST': dataset is busy

root@HOSTNAME:~# zfs list -t snapshot | grep -i "${DATASET}"
zdata/PODMAN-TEST@zfs-auto-snap_frequent-2024-04-18-1415             0B      -    96K  -

root@HOSTNAME:~# zfs destroy "${DATASET}"@zfs-auto-snap_frequent-2024-04-18-1415

root@HOSTNAME:~# zfs destroy "${DATASET}"
cannot destroy 'zdata/PODMAN-TEST': dataset is busy

root@HOSTNAME:~#  zfs get mountpoint | grep -i "${DATASET}"  | grep -v @
zdata/PODMAN-TEST                                                 mountpoint  /zdata/PODMAN-TEST          inherited from zdata

# This is what FINALLY Fixes it
root@HOSTNAME:~# zfs set mountpoint=none "${DATASET}"
root@HOSTNAME:~# zfs set canmount=off "${DATASET}"
root@HOSTNAME:~# zfs umount "${DATASET}"
cannot unmount 'zdata/PODMAN-TEST': not currently mounted
root@HOSTNAME:~# zfs destroy "${DATASET}"
# No Error :)
root@HOSTNAME:~# zfs list
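
As a side note, checking with fuser for processes holding files open below the dataset's mountpoint might also be worth a try (it did not come up in my case, so this is just a suggestion):

Bash:
# list processes holding files open under the dataset's mountpoint
fuser -vm /zdata/PODMAN-TEST

# force the unmount if nothing important shows up (use with care)
zfs unmount -f zdata/PODMAN-TEST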

Alternatively you might also try using findmnt:
Bash:
# Search in static table of filesystems (FSTAB file)
findmnt -l -s --types zfs --output target

# Search in table of mounted filesystems (includes user space mount options)
findmnt -l -m --types zfs --output target

# Search in kernel table of mounted filesystems (default)
findmnt -l -k --types zfs --output target

Other options specifically for ZVOLs:
Bash:
# Try to see if any process is accessing your device
ps aux | grep -i "zvol"

# Try to see if there is some configuration still referencing them (and potentially Proxmox VE processes blocking them)
grep -ri "vm-MYVMID" /etc/pve/
grep -ri "vm-MYVMID" /etc/

# Try to see if there are any reads/writes to the concerned device
iostat --human --pretty

# Or using iotop in batch mode
logfile="test.log"
iotop --batch > "${logfile}"
cat "${logfile}" | grep -i "zvol" | grep -i "vm-MYVMID"

See also for instance: https://unix.stackexchange.com/questions/86875/determining-specific-file-responsible-for-high-i-o
 
