Error when moving disk on busy VM

Thomas Plant

Hello all,

I'm testing a Proxmox installation with local and iSCSI storage. I wanted to test how a disk move behaves while the VM is busy working on the disk, so I launched fio in the VM with a sequential write profile, and while it was running I tried to move the disk from local ZFS to the iSCSI storage (LVM on iSCSI).
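
The fio run inside the guest was along these lines (the exact parameters here are illustrative, not my original job):

fio --name=seqwrite --filename=/root/fio-seqwrite.dat --rw=write --bs=1M \
    --size=8G --ioengine=libaio --direct=1 --time_based --runtime=600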

At the end I got the following message:
all mirroring jobs are ready
drive-scsi0: Completing block job...
drive-scsi0: Completed successfully.
drive-scsi0 : finished
zfs error: cannot destroy 'rpool/data/vm-100-disk-1': dataset is busy
TASK OK

The VM never had an interruption and worked flawlessly, but the disk from the error message remained on the local storage. So my first question is: how do I remove it? When I click 'Remove' on the local storage, it tells me to remove the disk from the VM's hardware settings, but the only disk listed there is the moved one on iSCSI.

And second: is there a fix for this, or is this expected behaviour on a very busy disk?

Proxmox is version 5.3-8, with the no-subscription repository and all patches installed.

Thanks,
Thomas
 
If it does not appear in the VM config as 'unusedX', you can run the following from the console:
qm rescan

This rescans all storages for leftover disk images and adds them to the respective VM config.

Then you can delete the 'unusedX' disk.
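
For example, with VMID 100 and the default 'local-zfs' storage assumed:

qm rescan --vmid 100
qm config 100 | grep unused

Then remove the unused disk from the GUI (Hardware -> Unused Disk -> Remove), or free the volume directly, e.g.:

pvesm free local-zfs:vm-100-disk-1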
 
Hi,

that worked and the drive now shows up as 'Unused Disk 0', but removing it still fails with "zfs error: cannot destroy 'rpool/data/vm-100-disk-1': dataset is busy".

EDIT: Tried rebooting the Proxmox host, but I get the same error.
 
can you post the output of

zfs list -t all

and

lsof -n 2>&1 | grep vm-100-disk-1
 
I found the following Proxmox forum post: https://forum.proxmox.com/threads/zfs-error-cannot-destroy-dataset-is-busy.25811/
As this is a test machine, I did the following:

I looked in /dev/rpool/data, which contained the following:

lrwxrwxrwx 1 root root 9 Jan 25 16:40 vm-100-disk-0 -> ../../zd0
lrwxrwxrwx 1 root root 11 Jan 25 16:39 vm-100-disk-0-part1 -> ../../zd0p1
lrwxrwxrwx 1 root root 11 Jan 25 16:39 vm-100-disk-0-part2 -> ../../zd0p2
lrwxrwxrwx 1 root root 11 Jan 25 16:39 vm-100-disk-0-part3 -> ../../zd0p3
lrwxrwxrwx 1 root root 10 Jan 25 16:40 vm-100-disk-1 -> ../../zd16
lrwxrwxrwx 1 root root 12 Jan 25 16:39 vm-100-disk-1-part1 -> ../../zd16p1

Then, with "fuser -am /dev/rpool/data/vm-100-disk-1", I found the PID of the process holding the volume busy.

It was 'multipathd'. After killing it I could destroy the disks with 'zfs destroy'.
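
In short, the commands were roughly the following (stopping the service is probably cleaner than killing the process, as I did):

fuser -am /dev/rpool/data/vm-100-disk-1
systemctl stop multipathd
zfs destroy rpool/data/vm-100-disk-1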

So I think this probably happened because I did not create a multipath.conf with the general blacklist in it:

blacklist {
    wwid .*
}

So could it be that multipathd somehow attached to the ZFS volume and tried to manage it?
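
What I had in mind for /etc/multipath.conf is something like the following, blacklisting everything and whitelisting only the real iSCSI LUNs (the WWID below is just a placeholder, not one from my box):

blacklist {
    wwid .*
}
blacklist_exceptions {
    # WWID of the actual iSCSI LUN, e.g. as reported by /lib/udev/scsi_id -g -u -d /dev/sdb
    wwid "3600a098038304437415d4b6a59684a52"
}

followed by a restart of multipathd.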
 
I did my tests again. When I move the disks from iSCSI to local storage and then back to iSCSI, multipathd somehow interferes.
After the successful move and the resulting error, I restarted multipathd and could manually destroy the remaining disk. 'qm rescan' did not attach it as unused to the VM.
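
To double-check that multipathd now really leaves the zvols alone, something like this should do (zvol path taken from the listing above):

multipath -ll                              # should list only the real iSCSI LUNs, no zd* devices
fuser -am /dev/rpool/data/vm-100-disk-0    # should no longer show multipathd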
 
