Can't remove old vm-disk "data set is busy"

Stevebeer

Jan 29, 2023
Hi all,

From the GUI, I cannot delete a disk in the local-zfs storage list.
It's also not possible from the CLI.

Code:
root@pve:~# zfs destroy rpool/vm-109-disk-0
cannot destroy 'rpool/vm-109-disk-0': dataset is busy

I've stopped all the containers and VMs, and also restarted the hypervisor. Nothing helps.
Any idea where to look? I have been googling for hours and can't find a working solution.
 
Then you should plug "zfs destroy cannot destroy dataset is busy" into Google and review the resulting articles. There could be many reasons for this condition, ranging from a PVE bug to system misconfiguration to something you inadvertently did. You just have to try various debug commands (e.g. fuser) to zero in on the culprit.
https://forum.proxmox.com/threads/zfs-error-cannot-destroy-dataset-is-busy.25811/
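
As a starting point, a few checks that usually narrow this down (dataset name taken from your post; the /dev/zvol path is an assumption about where your system links the zvol device nodes):

Bash:
# Any snapshots or clones still hanging off the zvol?
zfs list -t all -r rpool/vm-109-disk-0
zfs get origin rpool/vm-109-disk-0

# Is a process, or the kernel itself, holding the zvol device open?
fuser -vam /dev/zvol/rpool/vm-109-disk-0
lsblk /dev/zvol/rpool/vm-109-disk-0   # partitions listed here hint at md/LVM auto-assembly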


 
Thank you for your quick responses.
Your exact Google suggestion pointed me towards a GitHub discussion:
link-discussion

I've looked up the device numbers and deleted the partitions on the disk, then created a default partition.
After a reboot, I could finally remove the disks. :-)

Code:
# fuser shows what keeps the disk busy; /dev/rpool/vm-108-disk-0
# is a symlink to the zvol block device /dev/zd80
fuser -am /dev/rpool/vm-108-disk-0
/dev/zd80

gdisk /dev/zd80
    d    # delete all partitions (repeat for each one)
    n    # create a new default partition
    w    # write the table to disk and exit

reboot
 
That worked here as well.

It's a bit annoying that we need to restart the entire PVE host when we want to delete an old VM disk...
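
For what it's worth, the reboot can often be avoided. A hedged sketch, assuming that (as above) it is the kernel-side partition nodes on the zvol that keep it busy (/dev/zd80 and the dataset name are taken from the example above):

Bash:
# Wipe all partition-table signatures from the zvol instead of recreating a partition
wipefs --all /dev/zd80

# Ask the kernel to re-read the now-empty partition table so the zd80p* nodes disappear
partprobe /dev/zd80

# Retry the destroy without rebooting
zfs destroy rpool/vm-108-disk-0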
 
Hello,

I had the same issue, and the reason was that mdraid was somehow holding the ZFS resource busy. I had to stop the md array on top of the ZFS volume, and then it was easily destroyed.
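
In case it helps others, a minimal sketch of that fix (md127 is a hypothetical array name; check /proc/mdstat for the real one):

Bash:
# Check whether mdraid auto-assembled an array on top of the zvol's partitions
cat /proc/mdstat

# Stop the offending array, then retry the destroy
mdadm --stop /dev/md127
zfs destroy rpool/vm-109-disk-0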
 
Not sure how relevant it is, but this is my experience after struggling A LOT with these almost-random issues.

Note, however, that this is for a DATASET, NOT A ZVOL!

lsof unfortunately never worked for me when troubleshooting ZFS busy issues ...

Bash:
# Dataset to be destroyed
DATASET="zdata/PODMAN-TEST"

# Sequence of Attempts
root@HOSTNAME:~# zfs destroy "${DATASET}"
cannot destroy 'zdata/PODMAN-TEST': dataset is busy

root@HOSTNAME:~# zfs list -t snapshot | grep -i "${DATASET}"
zdata/PODMAN-TEST@zfs-auto-snap_frequent-2024-04-18-1415             0B      -    96K  -

root@HOSTNAME:~# zfs destroy "${DATASET}"@zfs-auto-snap_frequent-2024-04-18-1415

root@HOSTNAME:~# zfs destroy "${DATASET}"
cannot destroy 'zdata/PODMAN-TEST': dataset is busy

root@HOSTNAME:~#  zfs get mountpoint | grep -i "${DATASET}"  | grep -v @
zdata/PODMAN-TEST                                                 mountpoint  /zdata/PODMAN-TEST          inherited from zdata

# This is what FINALLY Fixes it
root@HOSTNAME:~# zfs set mountpoint=none "${DATASET}"
root@HOSTNAME:~# zfs set canmount=off "${DATASET}"
root@HOSTNAME:~# zfs umount "${DATASET}"
cannot unmount 'zdata/PODMAN-TEST': not currently mounted
root@HOSTNAME:~# zfs destroy "${DATASET}"
# No Error :)
root@HOSTNAME:~# zfs list
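
As a possible shortcut (untested here): zfs destroy also accepts -f, which forcibly unmounts a mounted filesystem before destroying it, so it may cover the mounted-dataset case in a single step:

Bash:
# -f forcibly unmounts the file system first (no effect on zvols or already-unmounted datasets)
zfs destroy -f "${DATASET}"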

Alternatively you might also try using findmnt:
Bash:
# Search in static table of filesystems (FSTAB file)
findmnt -l -s --types zfs --output target

# Search in table of mounted filesystems (includes user space mount options)
findmnt -l -m --types zfs --output target

# Search in kernel table of mounted filesystems (default)
findmnt -l -k --types zfs --output target

Other options specifically for ZVOLs:
Bash:
# Try to see if any process is accessing your device
ps aux | grep -i "zvol"

# Try to see if there is some configuration still referencing them (and potentially Proxmox VE processes blocking them)
grep -ri "vm-MYVMID" /etc/pve/
grep -ri "vm-MYVMID" /etc/

# Try to see if there are any reads/writes to the concerned device
iostat --human --pretty

# Or using iotop in batch mode
logfile="test.log"
iotop --batch > "${logfile}"
cat "${logfile}" | grep -i "zvol" | grep -i "vm-MYVMID"

See also for instance: https://unix.stackexchange.com/questions/86875/determining-specific-file-responsible-for-high-i-o
 
These are the gems you find on this forum (via Google, because the forum search is unfortunately not optimal).

I was going crazy: I couldn't delete a container because of a busy disk, tried 99 things and nothing worked. Then I found this post, and not only did I solve the issue, but I learned a lot of new stuff.

If you ever happen to be on vacation in Rome, I owe you a good coffee. :)

(sorry for the OT everybody, but I had to thank the person who wanted to share their hard-earned wisdom)
 
You're welcome :). I agree that unfortunately Google > built-in search feature :rolleyes:.

If you ever happen to rescue a system from a live CD/USB, and the pool of the chrooted system you are trying to recover refuses to export ("cannot export 'rpool': pool is busy"), that's an EVEN more difficult problem ...

I solved that via https://github.com/luckylinux/linux-setup/blob/main/umount_everything.sh, in particular the two for loops in https://github.com/luckylinux/linux-setup/blob/main/modules/umount_chroot.sh. You basically need to kill whatever chrooted processes might have been started (manually or automatically), e.g. apt / dpkg / etc. And obviously (not stated there, because it may not be applicable) make sure to cd out of the chroot (e.g. with cd /) :).
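
For illustration, a minimal sketch of that kind of loop (not the exact script from the repo; /mnt/rescue is a hypothetical chroot mount point):

Bash:
CHROOT="/mnt/rescue"  # hypothetical mount point of the chrooted system

# Kill every process whose root directory lies inside the chroot
for pid in $(ls /proc | grep -E '^[0-9]+$'); do
    if [ "$(readlink -f /proc/${pid}/root 2>/dev/null)" = "${CHROOT}" ]; then
        kill -9 "${pid}"
    fi
done

# Then unmount recursively and export the pool
umount -R "${CHROOT}"
zpool export rpool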

Not necessarily relevant on a live system, but just as complementary information :).
 
I've also bookmarked this post. You never know what can happen in these homelabs, but that's the fun part for me. It's so boring when everything works as expected. :)

It's good to learn these things; they can be really useful when working on customer projects.

Thanks again.
 
You're welcome :). I also just adapted that line from something I found online; I can't remember exactly where. I usually put a comment with the "Source" or "Reference" for it. I'm pretty sure I just googled something like "linux kill/list processes running inside chroot".

This was, I believe, the first reference I used, but for some reason I quickly switched to using lsof with the extra options (those options are ESSENTIAL; lsof is completely and utterly useless without them), without even committing the "testing" version I was troubleshooting. Especially the -D part is critical, I believe.
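
For reference, the lsof option that scans a whole directory tree for open files is +D (a minimal example; /mnt/rescue is again a hypothetical chroot path):

Bash:
# List every process with a file open anywhere under the chroot
# (+D recurses the tree, so it can be slow on large directories)
lsof +D /mnt/rescue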

I have enough "things" happening every day. I sure as hell don't want to get more just for the "fun" of it :rolleyes: ...
 