[SOLVED] VMs won't boot after upgrade

H4R0

I just installed upgrades via apt full-upgrade and rebooted afterwards.

Now I have the problem that three VMs that are based on a template won't boot anymore.

Even if I clone the template into a new VM and start it, I get the same problem.


"BdsDxe: failed to load Boot0001 ..."


The disks seem totally fine to me: I can mount them and browse the content, and the boot partition with EFI and the root partition are there and look fine.


VM config:
Code:
root@server1:/mnt# cat /etc/pve/qemu-server/103.conf
agent: 1
bios: ovmf
bootdisk: scsi0
cores: 1
cpu: host
efidisk0: local-zfs:base-102-disk-1/vm-103-disk-1,size=1M
machine: q35
memory: 300
name: dns1
net0: virtio=C2:4F:A8:C5:81:D8,bridge=vmbr0,tag=3
numa: 0
onboot: 1
ostype: l26
protection: 1
scsi0: local-zfs:base-102-disk-0/vm-103-disk-0,discard=on,size=256G,ssd=1
scsihw: virtio-scsi-pci
serial0: socket
smbios1: uuid=4bdd961c-ead3-449d-9d30-eb210a2dd138
sockets: 1
startup: order=2,up=30
vmgenid: 8a489845-75d0-4088-8cca-417f4588ab32


List of package versions that got replaced by upgrade:
Code:
Start-Date: 2020-05-07  09:11:12
Commandline: apt full-upgrade
Install: zstd:amd64 (1.3.8+dfsg-3, automatic), libproxmox-acme-perl:amd64 (1.0.2, automatic), idn:amd64 (1.33-2.2, automatic)
Upgrade: proxmox-widget-toolkit:amd64 (2.1-3, 2.1-6), libpve-access-control:amd64 (6.0-6, 6.0-7), libpve-storage-perl:amd64 (6.1-5, 6.1-7), libpve-cluster-api-perl:amd64 (6.1-4, 6.1-8), libpve-cluster-perl:amd64 (6.1-4, 6.1-8), pve-firewall:amd64 (4.0-10, 4.1-2), pve-container:amd64 (3.0-23, 3.1-4), pve-cluster:amd64 (6.1-4, 6.1-8), pve-i18n:amd64 (2.0-4, 2.1-1), pve-manager:amd64 (6.1-8, 6.1-11), libpve-guest-common-perl:amd64 (3.0-5, 3.0-10), libpve-common-perl:amd64 (6.0-17, 6.1-1), lxc-pve:amd64 (3.2.1-1, 4.0.2-1), qemu-server:amd64 (6.1-7, 6.1-20), pve-kernel-helper:amd64 (6.1-8, 6.1-9), lxcfs:amd64 (4.0.1-pve1, 4.0.3-pve2)
End-Date: 2020-05-07  09:11:28

pveversion -v
Code:
proxmox-ve: 6.1-2 (running kernel: 5.3.18-3-pve)
pve-manager: 6.1-11 (running version: 6.1-11/f2f18736)
pve-kernel-helper: 6.1-9
pve-kernel-5.3: 6.1-6
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.18-2-pve: 5.3.18-2
ceph: 14.2.9-pve1
ceph-fuse: 14.2.9-pve1
corosync: 3.0.3-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksmtuned: 4.20150325+b1
libjs-extjs: 6.0.1-10
libknet1: 1.15-pve1
libproxmox-acme-perl: 1.0.2
libpve-access-control: 6.0-7
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.1-1
libpve-guest-common-perl: 3.0-10
libpve-http-server-perl: 3.0-5
libpve-storage-perl: 6.1-7
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.2-1
lxcfs: 4.0.3-pve2
novnc-pve: 1.1.0-1
openvswitch-switch: 2.12.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-6
pve-cluster: 6.1-8
pve-container: 3.1-4
pve-docs: 6.1-6
pve-edk2-firmware: 2.20200229-1
pve-firewall: 4.1-2
pve-firmware: 3.0-7
pve-ha-manager: 3.0-9
pve-i18n: 2.1-1
pve-qemu-kvm: 4.1.1-4
pve-xtermjs: 4.3.0-1
qemu-server: 6.1-20
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.3-pve1

Any help is welcome
 
I couldn't find any solution on the forum or Google.

I'm going to roll back now.
 
Same problem here! VMs with OVMF (UEFI) won't boot anymore! No problems with default (SeaBIOS) VMs, though!

BdsDxe: failed to load Boot0002 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x1E,0x0)/Pci(0x1,0x0)/Pci(0x5,0x0)/Scsi(0x0,0x0): Not Found

Configuration of problematic VM:
Bash:
root@pve01:~# cat /etc/pve/qemu-server/110.conf
agent: 1
bios: ovmf
bootdisk: scsi0
cores: 2
cpu: host
efidisk0: images:vm-110-disk-0,size=1M
ide2: none,media=cdrom
machine: q35
memory: 4096
name: enterprise
net0: virtio=76:DF:78:E7:6A:A7,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: l26
protection: 1
scsi0: images:vm-110-disk-1,discard=on,size=240G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=0ca59e84-5831-401d-b88e-6047ad49c70c,base64=1,product=UHJveG1veCBWTQ==,serial=MTEw
sockets: 1
startup: order=10
vmgenid: ef327fc6-7dfc-4e37-8667-dd9b224be732

Proxmox Version:
Bash:
root@pve01:~# pveversion -v
proxmox-ve: 6.1-2 (running kernel: 5.4.30-1-pve)
pve-manager: 6.1-11 (running version: 6.1-11/f2f18736)
pve-kernel-5.4: 6.1-9
pve-kernel-helper: 6.1-9
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.30-1-pve: 5.4.30-1
pve-kernel-5.4.27-1-pve: 5.4.27-1
pve-kernel-5.4.24-1-pve: 5.4.24-1
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.18-2-pve: 5.3.18-2
pve-kernel-5.3.10-1-pve: 5.3.10-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.3-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.15-pve1
libproxmox-acme-perl: 1.0.2
libpve-access-control: 6.0-7
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.1-1
libpve-guest-common-perl: 3.0-10
libpve-http-server-perl: 3.0-5
libpve-storage-perl: 6.1-7
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.2-1
lxcfs: 4.0.3-pve2
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-6
pve-cluster: 6.1-8
pve-container: 3.1-4
pve-docs: 6.1-6
pve-edk2-firmware: 2.20200229-1
pve-firewall: 4.1-2
pve-firmware: 3.0-7
pve-ha-manager: 3.0-9
pve-i18n: 2.1-1
pve-qemu-kvm: 4.1.1-4
pve-xtermjs: 4.3.0-1
qemu-server: 6.1-20
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.3-pve1
 
OK, got my VM running again with the hint from @onepamopa in this thread to switch to machine type pc-q35-3.1.

I edited my configuration /etc/pve/qemu-server/110.conf and replaced the line
machine: q35
with
machine: pc-q35-3.1
and the VM booted just fine!
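
For anyone who wants to script the workaround above, here is a minimal sketch. It assumes a plain qemu-server style config without snapshot sections; the helper name pin_machine_type and the scratch file are made up for illustration. On a real host, `qm set 110 --machine pc-q35-3.1` should achieve the same without editing the file by hand.

```shell
#!/bin/sh
# Hypothetical helper: pin the machine type in a qemu-server style config file.
# usage: pin_machine_type <config-file> <machine-type>
pin_machine_type() {
    conf=$1
    type=$2
    if grep -q '^machine:' "$conf"; then
        # Replace the existing machine line (write to a temp file, then move).
        sed "s/^machine:.*/machine: $type/" "$conf" > "$conf.tmp" &&
            mv "$conf.tmp" "$conf"
    else
        # No machine line yet: append one.
        printf 'machine: %s\n' "$type" >> "$conf"
    fi
}

# Demo on a scratch file, deliberately not on /etc/pve.
demo=$(mktemp)
printf 'bios: ovmf\nmachine: q35\nmemory: 4096\n' > "$demo"
pin_machine_type "$demo" pc-q35-3.1
cat "$demo"
```

The sketch rewrites every line starting with "machine:", so a config that carries snapshot sections would need more care.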

The list of supported machine types:
Bash:
root@pve01:~# qemu-system-x86_64 -machine help
Supported machines are:
pc                   Standard PC (i440FX + PIIX, 1996) (alias of pc-i440fx-4.1)
pc-i440fx-4.1        Standard PC (i440FX + PIIX, 1996) (default)
pc-i440fx-4.0        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-3.1        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-3.0        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.9        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.8        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.7        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.6        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.5        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.4        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.3        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.2        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.12       Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.11       Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.10       Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.1        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.0        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-1.7        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-1.6        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-1.5        Standard PC (i440FX + PIIX, 1996)
pc-i440fx-1.4        Standard PC (i440FX + PIIX, 1996)
pc-1.3               Standard PC (i440FX + PIIX, 1996)
pc-1.2               Standard PC (i440FX + PIIX, 1996)
pc-1.1               Standard PC (i440FX + PIIX, 1996)
pc-1.0               Standard PC (i440FX + PIIX, 1996)
pc-0.15              Standard PC (i440FX + PIIX, 1996) (deprecated)
pc-0.14              Standard PC (i440FX + PIIX, 1996) (deprecated)
pc-0.13              Standard PC (i440FX + PIIX, 1996) (deprecated)
pc-0.12              Standard PC (i440FX + PIIX, 1996) (deprecated)
q35                  Standard PC (Q35 + ICH9, 2009) (alias of pc-q35-4.1)
pc-q35-4.1           Standard PC (Q35 + ICH9, 2009)
pc-q35-4.0.1         Standard PC (Q35 + ICH9, 2009)
pc-q35-4.0           Standard PC (Q35 + ICH9, 2009)
pc-q35-3.1           Standard PC (Q35 + ICH9, 2009)
pc-q35-3.0           Standard PC (Q35 + ICH9, 2009)
pc-q35-2.9           Standard PC (Q35 + ICH9, 2009)
pc-q35-2.8           Standard PC (Q35 + ICH9, 2009)
pc-q35-2.7           Standard PC (Q35 + ICH9, 2009)
pc-q35-2.6           Standard PC (Q35 + ICH9, 2009)
pc-q35-2.5           Standard PC (Q35 + ICH9, 2009)
pc-q35-2.4           Standard PC (Q35 + ICH9, 2009)
pc-q35-2.12          Standard PC (Q35 + ICH9, 2009)
pc-q35-2.11          Standard PC (Q35 + ICH9, 2009)
pc-q35-2.10          Standard PC (Q35 + ICH9, 2009)
isapc                ISA-only PC
none                 empty machine
 
My rollback went smoothly. I just upgraded again to test your fix; thanks for the details.

And it works. I tested some more, and this is what is really interesting:

Setting "machine: q35" doesn't work, but "pc-q35-4.1" works!
But "q35" is an alias of "pc-q35-4.1", so they should be the same?
 
very strange indeed.. if you can easily reproduce it by setting/unsetting 'machine', it would help if you could post 'qm showcmd XXX' output with 'q35' and 'pc-q35-4.1' set as machine type!
 
Here are the different results:

pc-q35-4.1
Code:
/usr/bin/kvm -id 110 -name enterprise -chardev 'socket,id=qmp,path=/var/run/qemu-server/110.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/110.pid -daemonize -smbios 'type=1,product=Proxmox VM,uuid=0ca59e84-5831-401d-b88e-6047ad49c70c,serial=110' -drive 'if=pflash,unit=0,format=raw,readonly,file=/usr/share/pve-edk2-firmware//OVMF_CODE.fd' -drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,file=/dev/zvol/datassd01/images/vm-110-disk-0' -smp '2,sockets=1,cores=2,maxcpus=2' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vnc unix:/var/run/qemu-server/110.vnc,password -cpu host,+kvm_pv_eoi,+kvm_pv_unhalt -m 4096 -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg -device 'vmgenid,guid=ef327fc6-7dfc-4e37-8667-dd9b224be732' -device 'usb-tablet,id=tablet,bus=ehci.0,port=1' -device 'VGA,id=vga,bus=pcie.0,addr=0x1' -chardev 'socket,path=/var/run/qemu-server/110.qga,server,nowait,id=qga0' -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:929a346be78a' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 'file=/dev/zvol/datassd01/images/vm-110-disk-1,if=none,id=drive-scsi0,discard=on,format=raw,cache=none,aio=native,detect-zeroes=unmap' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,rotation_rate=1,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap110i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 
'virtio-net-pci,mac=76:DF:78:E7:6A:A7,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' -machine 'type=pc-q35-4.1+pve0'

q35
Code:
/usr/bin/kvm -id 110 -name enterprise -chardev 'socket,id=qmp,path=/var/run/qemu-server/110.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/110.pid -daemonize -smbios 'type=1,serial=110,product=Proxmox VM,uuid=0ca59e84-5831-401d-b88e-6047ad49c70c' -drive 'if=pflash,unit=0,format=raw,readonly,file=/usr/share/pve-edk2-firmware//OVMF_CODE.fd' -drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,size=131072,file=/dev/zvol/datassd01/images/vm-110-disk-0' -smp '2,sockets=1,cores=2,maxcpus=2' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vnc unix:/var/run/qemu-server/110.vnc,password -cpu host,+kvm_pv_eoi,+kvm_pv_unhalt -m 4096 -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg -device 'vmgenid,guid=ef327fc6-7dfc-4e37-8667-dd9b224be732' -device 'usb-tablet,id=tablet,bus=ehci.0,port=1' -device 'VGA,id=vga,bus=pcie.0,addr=0x1' -chardev 'socket,path=/var/run/qemu-server/110.qga,server,nowait,id=qga0' -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:929a346be78a' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 'file=/dev/zvol/datassd01/images/vm-110-disk-1,if=none,id=drive-scsi0,discard=on,format=raw,cache=none,aio=native,detect-zeroes=unmap' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,rotation_rate=1,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap110i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 
'virtio-net-pci,mac=76:DF:78:E7:6A:A7,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' -machine 'type=q35+pve2'

Comparing the two commands shows three differences:
In the first one (-smbios), only the parameters are sorted differently.
In the second one (the efidisk0 pflash drive), the q35 command has an additional size=131072 parameter. And those 131072 bytes (128 KiB) do not match the EFI disk size of 1 MB!
The last one is the machine type, including the suffix: pve0 vs. pve2. Whatever those pveX suffixes do.
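
In case someone wants to reproduce the comparison without a diff screenshot, here is a rough sketch. It assumes both 'qm showcmd' outputs are available as strings; the helper names split_opts and diff_cmds are made up, and the two sample commands are shortened stand-ins for the full ones above.

```shell
#!/bin/sh
# Split a QEMU command line into one option per line
# (a line break is inserted before every " -x" style option).
split_opts() {
    printf '%s\n' "$1" | sed 's/ -\([a-z]\)/\
-\1/g'
}

# Print only the options that differ between two command lines.
diff_cmds() {
    a=$(mktemp)
    b=$(mktemp)
    split_opts "$1" > "$a"
    split_opts "$2" > "$b"
    diff "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return $rc
}

# Shortened, illustrative versions of the two commands.
pinned="/usr/bin/kvm -id 110 -drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,file=/dev/zvol/datassd01/images/vm-110-disk-0' -machine 'type=pc-q35-4.1+pve0'"
alias_q35="/usr/bin/kvm -id 110 -drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,size=131072,file=/dev/zvol/datassd01/images/vm-110-disk-0' -machine 'type=q35+pve2'"

# diff exits non-zero when the inputs differ, so do not treat that as an error.
diff_cmds "$pinned" "$alias_q35" || true
```

The split is purely textual, so an option value that itself contains " -x" would be broken up incorrectly; for the commands in this thread that does not happen.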
 
so after some further analysis this seems to be fallout from a bug fix for an issue affecting efidisks on some storages. unfortunately the fix could not be made backwards compatible for all cases - to fully fix your VMs, you need to re-write your EFI entries or re-install your bootloader after booting with the plain 'q35' machine type (e.g., using the EFI shell, or a live-CD of your choice).
 
So, a guide for this is available somewhere?
 
Can't we just leave it as it is? Or will pc-q35-4.1 be upgraded to pc-q35-4.1+pve2, including the fix, as well?
 
you can leave it as is, but you won't get any changes from future machine type updates, or fixes that are incompatible with 4.1 machines and therefore version-guarded (like the current one). the original issue was that with the EFI disk on some storages, EFI settings were not persisted correctly because of a mismatch of EFI disk size. changing the machine type to -4.1 will revert to the old, broken behaviour. unfortunately it was not possible to fix this in a compatible way, so you need to either re-initialize the EFI disk with the new machine type, or pin the affected VM to the old, buggy machine type.
 
What I don't understand is why it works with pc-q35-4.1 but not with q35, when the latter is only an alias for the first?
Code:
q35                  Standard PC (Q35 + ICH9, 2009) (alias of pc-q35-4.1)
 
because q35 gets an implicit pve-specific suffix +pve2 (the newest, since q35 is not version specific), but pc-q35-4.1 gets +pve0 (the oldest, since it's version-locked). the check whether to use the new or old behaviour is guarded by >= 4.1+pve2
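
To make that rule a bit more concrete, here is a small sketch of the selection logic as described in this post. It is not the actual qemu-server code: the function names are made up, the newest suffix is assumed to be +pve2 (as at the time of this thread), and only the +pveN part of the guard is modelled, assuming QEMU 4.1.

```shell
#!/bin/sh
NEWEST_PVE=2   # assumption: +pve2 is the newest suffix at the time of this thread

# Print the implicit +pveN revision for a machine type from the VM config.
pve_revision() {
    case $1 in
        pc-*) echo 0 ;;               # version-locked type: oldest revision
        *)    echo "$NEWEST_PVE" ;;   # plain alias (q35, pc): newest revision
    esac
}

# Succeeds if the guard ">= 4.1+pve2" would select the new efidisk behaviour.
uses_new_efidisk_handling() {
    [ "$(pve_revision "$1")" -ge 2 ]
}

for m in q35 pc-q35-4.1 pc-q35-3.1; do
    if uses_new_efidisk_handling "$m"; then
        echo "$m: +pve$(pve_revision "$m"), new behaviour"
    else
        echo "$m: +pve$(pve_revision "$m"), old behaviour"
    fi
done
```

This is why the same QEMU machine model can behave differently depending on whether the config says q35 or pc-q35-4.1.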
 
Alright, good to know. I will reinstall the EFI bootloader later this month.
 
Can someone tell me how to "reinstall the EFI bootloader" for the affected VMs?

that depends on your guest OS. for example, most Linux distributions have it as an option in their installer's rescue mode, or you could boot with a live CD and use that.
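
As one possible illustration of the live-CD route, here is a sketch for a Debian-style guest with GRUB. Everything specific in it is an assumption: the partition names (/dev/sda1 as EFI system partition, /dev/sda2 as root), the bootloader id "debian", and the helper names. The script only prints the commands unless DRY_RUN=0 is set, and it would have to run inside the guest from a live system booted in UEFI mode with the plain 'q35' machine type.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 only inside a live/rescue system booted in UEFI mode.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reinstall_grub_efi() {
    root_part=$1   # guest root partition, e.g. /dev/sda2 (assumption)
    esp_part=$2    # EFI system partition, e.g. /dev/sda1 (assumption)
    target=/mnt

    run mount "$root_part" "$target"
    run mount "$esp_part" "$target/boot/efi"
    # Make /dev, /proc and /sys (including efivars) visible inside the chroot.
    for fs in dev proc sys; do
        run mount --rbind "/$fs" "$target/$fs"
    done
    # Reinstall GRUB to the ESP, which also registers a fresh EFI boot entry.
    run chroot "$target" grub-install --target=x86_64-efi \
        --efi-directory=/boot/efi --bootloader-id=debian
    run chroot "$target" update-grub
}

reinstall_grub_efi /dev/sda2 /dev/sda1
```

Other distributions use different tools and paths, so treat this only as a starting point and check the guest OS documentation.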
 
