how to remove a disk from a live (running) vm

genfoch01

New Member
Oct 4, 2023
Proxmox VE 8.4
I have a VM to which I added a boot drive plus 4 drives that I set up in a RAID 6 (ZFS raidz2) array. I wanted to test the failure of a drive (and its recovery), so I tried to remove one of the drives while the VM was live.

According to the docs I have read, you can go into the VM's Hardware tab and detach the drive (which I did), and then it should show as unused and you can delete it.

When I detach the drive I see this; the only button that is live at this point is "Revert", and I see no way to remove the disk.

Can anyone shed some light on what I am doing wrong?

[screenshot: VM Hardware panel after detaching the disk]
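
For reference, a detached disk normally shows up as an unusedN entry in the VM's configuration, which can be checked from the Proxmox host shell; a minimal sketch, assuming a placeholder VM ID of 100:

Code:
# On the Proxmox host (100 is a placeholder VM ID -- use your own)
qm config 100 | grep -Ei 'sata|scsi|unused'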
Thanks for your time,
GF
 
Maybe hotplug is not enabled for disks (see Options > Hotplug > Disk), in which case you should enable it and then stop and restart the VM?
Or maybe the virtual SATA controller does not support hot(un)plugging, in which case you should try virtual SCSI instead (but stop the VM first)?
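Hotplug can also be enabled from the host CLI instead of the GUI; a minimal sketch, assuming a placeholder VM ID of 100:

Code:
# On the Proxmox host: allow hotplugging of disks, network devices and USB for VM 100
qm set 100 --hotplug disk,network,usb
# Confirm the option took effect
qm config 100 | grep hotplug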
 
BEGIN /root/bin/faildisk.sh
Code:
#!/bin/bash
# For ZFS / RAID disk-failure testing.
# Usage: faildisk.sh <device>   e.g. faildisk.sh sdq   (without /dev/)

if [ -z "${1:-}" ] || [ ! -b "/dev/$1" ]; then
    echo "usage: $0 <device>   (e.g. sdq, without /dev/)" >&2
    exit 1
fi

logf=/root/disk-fail-test.log

# Record the disk layout before pulling it
fdisk -l "/dev/$1" | tee -a "$logf"
echo "$(date) - Test-failing disk $1 - Enter to proceed or ^C" | tee -a "$logf"
read -r

# Take the device offline, then remove it from the SCSI subsystem
echo offline > "/sys/block/$1/device/state"
echo 1 > "/sys/block/$1/device/delete"

sleep 2
dmesg | grep "$1"

The disk should come back after a reboot of the VM.
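A quick usage example, assuming the pool member to be failed shows up as sdb on the system where the script runs (sdb is just an example name):

Code:
# Run as root on the system that sees the disk (the guest, in this thread)
chmod +x /root/bin/faildisk.sh
/root/bin/faildisk.sh sdb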
 
This may be a really stupid question, but do you think switching the disks from SATA to SCSI will clobber the RAID (ZFS disk pool)?
 
Thanks for this! I'll try to test it tomorrow
 

Getting time to do this took longer than expected.

However, I was able to "remove" a drive as follows (thank you!):
Code:
root@openmediavault:~# echo offline > /sys/block/sdb/device/state
root@openmediavault:~# echo 1 > /sys/block/sdb/device/delete

Now I'm seeing this, which is exactly what I thought I'd see.


Code:
root@openmediavault:~# zpool status diskpool
  pool: diskpool
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
config:

        NAME                           STATE     READ WRITE CKSUM
        diskpool                       DEGRADED     0     0     0
          raidz2-0                     DEGRADED     0     0     0
            ata-QEMU_HARDDISK_QM00007  REMOVED      0     0     0
            ata-QEMU_HARDDISK_QM00009  ONLINE       0     0     0
            ata-QEMU_HARDDISK_QM00011  ONLINE       0     0     0
            ata-QEMU_HARDDISK_QM00013  ONLINE       0     0     0

errors: No known data errors

I am running a checksum on the files just to verify everything is good. When I reboot and the drive comes back online, will this trigger a resilver operation?
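
If the pool does not bring the disk back on its own after the reboot, it can be onlined manually, which should kick off a short resilver of whatever was written while it was out; a sketch using the device name from the status output above:

Code:
# Inside the guest, once the disk is visible again
zpool online diskpool ata-QEMU_HARDDISK_QM00007
zpool status diskpool   # device should return to ONLINE, with any resilver progress shown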

Do you happen to know how I'd map ata-QEMU_HARDDISK_QM00007 (the hard disk as seen by the VM) to the hard disk created by Proxmox for the VM?
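
There is no direct pointer from the guest back to the Proxmox volume name, but the two ends can usually be lined up by hand; a hedged sketch, with 100 as a placeholder VM ID:

Code:
# Inside the guest: which /dev/sdX carries that QEMU serial
ls -l /dev/disk/by-id/ | grep QM00007

# On the Proxmox host: which sataN/scsiN entries (and Proxmox volumes) the VM has
qm config 100 | grep -E '^(sata|scsi|unused)'

Matching a serial to a specific sataN entry usually comes down to comparing disk sizes, or detaching one disk at a time and seeing which serial disappears from the guest.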

Thanks for your time! I'm really happy with the results so far.
 
To follow up on my own question about switching the disks from SATA to SCSI: it seems I have to delete the disks to recreate them as SCSI, so yes, that would clobber the disk pool.
 
You don't have to delete the disks to change the protocol: just detach them, then reattach them as SCSI. ZFS survives that, although don't do it while the VM is online.

Disks can be hot-plugged in and out if virtio-guest-tools is installed in the guest. The guest tools are what handle the detachment on the guest side; if a disk is considered in use (mounted or part of a volume manager), it cannot be unplugged, so the unplug has to happen while the system is offline.
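
A sketch of that detach-and-reattach flow from the host CLI, with the VM powered off (the VM ID, bus numbers, storage and volume names below are placeholders; check qm config for the real ones):

Code:
# On the Proxmox host, with the VM shut down
# Detach the SATA disk -- the volume is kept and moves to an unusedN entry
qm set 100 --delete sata1

# Reattach the same volume on the SCSI bus (storage/volume name is an example)
qm set 100 --scsi1 local-zfs:vm-100-disk-1

Repeat for each pool member; inside the guest the pool should import again on the next boot (or via zpool import diskpool if it does not).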
 
It is likely I have misunderstood what you are saying, but this VM is running from an OpenMediaVault installation ISO. OMV is Debian-based, and from what I can tell the QEMU guest agent is not installed. I am not familiar with virtio-guest-tools, though I'm fairly sure no virtual drivers are included with the image. Does Proxmox auto-install these tools if it detects a new installation? (My Proxmox skill is 1 out of 10; I recently switched from VirtualBox to Proxmox, so there is a lot to learn.)

Thanks for taking the time to help me out,
I really appreciate it.
-GF
 
If it's Debian-based, you can probably install it manually. Proxmox does not automatically do anything to guests; it's up to the guest to install whatever it needs.
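
For a Debian-based guest such as OMV, that would typically look something like the sketch below; the agent also has to be enabled on the VM itself (VM ID 100 is a placeholder):

Code:
# Inside the guest (Debian / OpenMediaVault)
apt update && apt install -y qemu-guest-agent
systemctl enable --now qemu-guest-agent

# On the Proxmox host: enable the QEMU Guest Agent option for the VM
qm set 100 --agent enabled=1

The VM usually needs a full stop and start afterwards so the agent's virtio device is actually presented to the guest.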