After 6.3 upgrade, VM boot device missing if member of an HA group

alyarb

Well-Known Member
Feb 11, 2020
If a virtual machine is not a member of an HA group, you can add a disk, ISO, or network boot device and expect it to boot from them in the order given on the options screen. If you power on the machine and interrupt POST with the escape key, you can make a one-time selection to boot from a different device, irrespective of the boot order.

However, if you power off the VM, and then add it to an HA group, you will still see your boot order on the options screen, but the VM will not have any boot devices available after power-on. If you interrupt POST with the escape key, all of your boot devices will be gone from the one-time menu.

If you leave the VM out of the HA group and power it on, it will of course boot fine. If you add the VM to the HA group after booting the system, all will be fine. However, if that machine reboots for any reason, it gets stuck in an endless boot loop with no boot device.

I'm pretty sure this started when we upgraded to 6.3. Is anyone else seeing this?
 
Hi,

For the actual start of the VM, both HA and non-HA use the same code path; they are just triggered by different processes. A change here is possible due to some side effect but, to be honest, it seems quite unlikely.

So, let's investigate a bit: can you please post the VM config here (e.g., with qm config VMID)?

Also, does "add to HA group" mean that the VM is always configured for HA and is only being added to or removed from a group, or does it mean that everything works when the VM is not under HA at all and the behaviour appears once it is added to HA (and an HA group)?
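One way to check the "same code path" point above would be to compare the boot-related arguments QEMU receives in both cases. The sketch below is only an illustration: `bootindex_args` is a hypothetical helper name, the sample command line is made up rather than real output, and it assumes the boot order reaches QEMU as `bootindex=N` device properties.

```shell
#!/bin/sh
# bootindex_args (hypothetical helper): read a QEMU command line on stdin,
# split it on spaces and NUL bytes, and print every argument that carries a
# "bootindex=" property. Printing nothing means no boot device was passed.
bootindex_args() {
    tr '\0 ' '\n\n' | grep 'bootindex=' || true
}

# Possible use on a node (assumption: VMID 103, pidfile under /var/run/qemu-server):
#   qm showcmd 103 | bootindex_args
#   bootindex_args < "/proc/$(cat /var/run/qemu-server/103.pid)/cmdline"

# Demo on a made-up command line (not real output from any host):
sample='-m 24576 -device ide-cd,id=ide2,bootindex=100 -device scsi-hd,id=scsi0,bootindex=101'
printf '%s' "$sample" | bootindex_args
```

Running the first form for the configured command line and the second for a VM started by HA, then comparing the two outputs, would show whether the boot devices are already missing from the arguments or are lost later.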
 
This issue is not confined to a single VM. It is affecting all the machines in the cluster. Here is the config from one affected VM:

root@virtual38:~# qm config 103
agent: enabled=1,fstrim_cloned_disks=1
balloon: 3072
boot: order=ide2;scsi0
cores: 8
cpu: host
ide2: none,media=cdrom
machine: q35
memory: 24576
name: JMI-DC1-01
net0: virtio=2A:DE:5E:E3:3E:0C,bridge=JMILAN
numa: 1
onboot: 1
ostype: win10
parent: AutoSnap_02_23_2021_05_00_32
scsi0: CephRBD_NVMe:vm-103-disk-0,cache=writeback,discard=on,iothread=1,size=401G
scsihw: virtio-scsi-pci
smbios1: uuid=69a75ac9-8582-4e47-a7bd-442b7eba0160
sockets: 2
vmgenid: 60c6d275-b12a-4d8c-a31e-6a1944519b1e



The machine boots if it is not under HA at all. If you enable HA and add the VM to a group, the VM config does not change, but the boot devices are inaccessible once the machine is powered on.
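Since the config itself does not change, it can help to confirm that every entry in the boot order still points at a configured device before and after enabling HA. The following is a minimal sketch, assuming the `boot: order=...` syntax shown in the config above; `boot_entries` is a hypothetical helper name, and the demo input is just the relevant lines from that config.

```shell
#!/bin/sh
# boot_entries (hypothetical helper): read `qm config` style output on stdin
# and print, for each device in the boot order, its position, its name, and
# the value configured for it (or MISSING if the device has no config line).
boot_entries() {
    awk '
        {
            # Split each line at the first ": " into key and value.
            i = index($0, ": ")
            if (i > 0) conf[substr($0, 1, i - 1)] = substr($0, i + 2)
        }
        END {
            order = conf["boot"]
            sub(/^order=/, "", order)
            n = split(order, dev, ";")
            for (k = 1; k <= n; k++) {
                val = (dev[k] in conf) ? conf[dev[k]] : "MISSING"
                printf "%d %s %s\n", k, dev[k], val
            }
        }
    '
}

# On a node this could be fed directly: qm config 103 | boot_entries
# Demo using lines from the config posted above:
printf '%s\n' \
    'boot: order=ide2;scsi0' \
    'ide2: none,media=cdrom' \
    'scsi0: CephRBD_NVMe:vm-103-disk-0,cache=writeback,discard=on,iothread=1,size=401G' \
  | boot_entries
```

For this VM the first entry is the empty CD-ROM drive (ide2) and the second is the Ceph-backed disk (scsi0), so the disk is what should show up in the one-time boot menu.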
 
