Error when moving disk on busy VM

Thomas Plant

Hello all,

I'm testing a Proxmox installation with local and iSCSI storage. I wanted to test how a disk move behaves while the VM is busy working on the disk, so I launched fio in the VM with a sequential write profile, and while it was running I tried to move the disk from local ZFS to the iSCSI storage (LVM on iSCSI).
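
The fio run inside the guest was along these lines (the exact parameters here are illustrative, not my original job):

fio --name=seqwrite --filename=/root/fio-seqwrite.dat --rw=write --bs=1M \
    --size=8G --ioengine=libaio --direct=1 --time_based --runtime=600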

At the end I got the following message:
all mirroring jobs are ready
drive-scsi0: Completing block job...
drive-scsi0: Completed successfully.
drive-scsi0 : finished
zfs error: cannot destroy 'rpool/data/vm-100-disk-1': dataset is busy
TASK OK

The VM never had an interruption and worked flawlessly, but the disk from the error message remained on the local storage. So my first question is: how do I remove it? When I click 'Remove' on the local storage, it tells me to remove the disk from the VM's hardware settings, but the only disk listed there is the moved one on iSCSI.

And second: is there a fix for this, or is this expected behaviour on a very busy disk?

Proxmox is version 5.3-8, with the no-subscription repository and all patches installed.

Thanks,
Thomas
 
If it does not appear in the VM config as 'unusedX', you can run the following from the console:
qm rescan

This rescans all storages for leftover disk images and adds them to the respective VM config.

Then you can delete the 'unusedX' disk.
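
For example, with VMID 100 and the default 'local-zfs' storage assumed:

qm rescan --vmid 100
qm config 100 | grep unused

Then remove the unused disk from the GUI (Hardware -> Unused Disk -> Remove), or free the volume directly, e.g.:

pvesm free local-zfs:vm-100-disk-1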
 
Hi,

that worked and the drive now shows up as 'Unused Disk 0', but removing it still fails with "zfs error: cannot destroy 'rpool/data/vm-100-disk-1': dataset is busy".

EDIT: Tried rebooting the Proxmox host, but I get the same error.
 
can you post the output of

zfs list -t all

and

lsof -n 2>&1 | grep vm-100-disk-1
 
I found the following Proxmox forum post: https://forum.proxmox.com/threads/zfs-error-cannot-destroy-dataset-is-busy.25811/
As this is a test machine, I did the following:

I looked in /dev/rpool/data, which contained the following:

lrwxrwxrwx 1 root root 9 Jan 25 16:40 vm-100-disk-0 -> ../../zd0
lrwxrwxrwx 1 root root 11 Jan 25 16:39 vm-100-disk-0-part1 -> ../../zd0p1
lrwxrwxrwx 1 root root 11 Jan 25 16:39 vm-100-disk-0-part2 -> ../../zd0p2
lrwxrwxrwx 1 root root 11 Jan 25 16:39 vm-100-disk-0-part3 -> ../../zd0p3
lrwxrwxrwx 1 root root 10 Jan 25 16:40 vm-100-disk-1 -> ../../zd16
lrwxrwxrwx 1 root root 12 Jan 25 16:39 vm-100-disk-1-part1 -> ../../zd16p1

Then, with "fuser -am /dev/rpool/data/vm-100-disk-1", I found the PID of the process holding the volume busy.

It was 'multipathd'. After killing it I could destroy the disks with 'zfs destroy'.
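
In short, the commands were roughly the following (stopping the service is probably cleaner than killing the process, as I did):

fuser -am /dev/rpool/data/vm-100-disk-1
systemctl stop multipathd
zfs destroy rpool/data/vm-100-disk-1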

So I think this probably happened because I did not create a multipath.conf with the general blacklist in it:

blacklist {
    wwid .*
}

So could it be that multipathd somehow attached to the ZFS volume and tried to manage it?
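
What I had in mind for /etc/multipath.conf is something like the following, blacklisting everything and whitelisting only the real iSCSI LUNs (the WWID below is just a placeholder, not one from my box):

blacklist {
    wwid .*
}
blacklist_exceptions {
    # WWID of the actual iSCSI LUN, e.g. as reported by /lib/udev/scsi_id -g -u -d /dev/sdb
    wwid "3600a098038304437415d4b6a59684a52"
}

followed by a restart of multipathd.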
 
I did my tests again. When I move the disks from iSCSI to local storage and then back to iSCSI, multipathd somehow interferes.
After the successful move and the resulting error, I restarted multipathd and could manually destroy the remaining disk. 'qm rescan' did not attach it as unused to the VM.
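
To double-check that multipathd now really leaves the zvols alone, something like this should do (zvol path taken from the listing above):

multipath -ll                              # should list only the real iSCSI LUNs, no zd* devices
fuser -am /dev/rpool/data/vm-100-disk-0    # should no longer show multipathd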
 
