VM EFI boot corrupt

aax

Member
Jan 23, 2021
Hi,
I was messing around with my Home Assistant VM and stopped/rebooted it incorrectly. Now the VM will not boot anymore. A previous snapshot still works but is missing some data, so I would like to check whether the current state can be restored.

My question is: would it be possible to fix the boot issue (maybe with the help of a previous snapshot)? Or to access the data of the crashed snapshot and copy some of it out?
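
For reference, I know I could list the snapshots and roll back from the CLI, but a rollback discards the newer data, which is exactly what I want to avoid (snapshot name taken from my config below):

Code:
# list the snapshots of VM 101
qm listsnapshot 101
# rolling back to the last working snapshot boots again, but loses the newer data
qm rollback 101 restored_2020_10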

The firmware does not show any boot options under BIOS -> Add boot option -> EFI; the EFI section is empty and does not show anything.

The error that I get is:

BdsDxe: failed to load Boot0001 "UEFI QEMU HARDDISK QM00005 " from PciRoot(0x0)/Pci(0x7,0x0)/Sata(0x0,0xFFFF,0x0): Not Found


Some output:


Code:
root@pve:~# cat /etc/pve/qemu-server/101.conf

cores: 2
efidisk0: t_pool:vm-101-disk-0,size=4M
memory: 5120
name: hassosova-3.5
net0: virtio=C6:8E:DF:50:26:49,bridge=vmbr0
numa: 0
onboot: 1
ostype: l26
parent: after_os_update
runningmachine: pc-i440fx-4.0
sata0: t_pool:vm-101-disk-1,size=32G
scsihw: virtio-scsi

The crashed snapshot:
Code:
[crashed]
bios: ovmf
bootdisk: sata0
cores: 2
efidisk0: t_pool:vm-101-disk-0,size=4M
memory: 5120
name: hassosova-3.5
net0: virtio=C6:8E:DF:50:26:49,bridge=vmbr0
numa: 0
onboot: 1
ostype: l26
parent: thermostat
runningcpu: kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep
runningmachine: pc-i440fx-5.1+pve0
sata0: t_pool:vm-101-disk-1,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=8d258fe0-5cae-4495-a169-c7cb9b219648
snaptime: 1611353678
sockets: 1
usb0: host=0403:6015
usb1: host=0658:0200
vmgenid: 8157d35c-591d-4919-a2ce-c1352d19b48f
vmstate: t_pool:vm-101-state-crashed

A working version:
Code:
[restored_2020_10]
#restored 2020-10 version and updated hassio to latest version
bios: ovmf
bootdisk: sata0
cores: 2
efidisk0: t_pool:vm-101-disk-0,size=4M
memory: 5120
name: hassosova-3.5
net0: virtio=C6:8E:DF:50:26:49,bridge=vmbr0
numa: 0
onboot: 1
ostype: l26
parent: thermostat
runningcpu: kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep
runningmachine: pc-i440fx-5.1+pve0
sata0: t_pool:vm-101-disk-1,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=8d258fe0-5cae-4495-a169-c7cb9b219648
snaptime: 1611379887
sockets: 1
usb0: host=0403:6015
usb1: host=0658:0200
vmgenid: f77299e2-db22-4af4-9678-5eadd92bf49a
vmstate: t_pool:vm-101-state-restored_2020_10

Code:
root@pve:~# pveversion -v
proxmox-ve: 6.2-2 (running kernel: 5.4.65-1-pve)
pve-manager: 6.2-15 (running version: 6.2-15/48bd51b6)
pve-kernel-5.4: 6.2-7
pve-kernel-helper: 6.2-7
pve-kernel-5.0: 6.0-11
pve-kernel-5.4.65-1-pve: 5.4.65-1
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.18-1-pve: 5.0.18-3
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.5
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-2
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.2-9
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-backup-client: 0.9.4-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.3-6
pve-cluster: 6.2-1
pve-container: 3.2-2
pve-docs: 6.2-6
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-4
pve-xtermjs: 4.7.0-2
qemu-server: 6.2-18
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.4-pve2
 
It's probably a hack, but this usually works for me:

Code:
# Make sure VM is disabled:
ha-manager set vm:<VMID> --state disabled
# Open GDISK to modify disk partition map
gdisk /dev/zvol/rpool/vm-<VMID>-disk-<DISK#>
# Once GDISK opens, just use the W command to re-write the partition map
# Re-enable (start) VM to verify the VM boots using the disk
ha-manager set vm:<VMID> --state enabled
 
It's probably a hack, but this usually works for me:

Code:
# Make sure VM is disabled:
ha-manager set vm:<VMID> --state disabled
# Open GDISK to modify disk partition map
gdisk /dev/zvol/rpool/vm-<VMID>-disk-<DISK#>
# Once GDISK opens, just use the W command to re-write the partition map
# Re-enable (start) VM to verify the VM boots using the disk
ha-manager set vm:<VMID> --state enabled
What if I'm not using ZFS? I don't have the /dev/zvol directory... Please help.
 
Hi, I am still waiting for a solution here. I also don't use ZFS and don't have a /dev/zvol directory. What do I do? Please help.
 
@timdonovan and @batrovich
I haven't tested it, but there's a chance this same solution could work for you. You would want to find the path of the disk that the VM uses and run the same command with that path.
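
Something like this might work (untested; the storage and volume names below are only examples, take the actual volume ID from your VM's config):

Code:
# stop the VM first
qm stop <VMID>
# resolve the configured volume to a path on the host (example volume name)
pvesm path local-lvm:vm-<VMID>-disk-1
# run gdisk on the path that pvesm printed and use the W command as described above
gdisk $(pvesm path local-lvm:vm-<VMID>-disk-1)

Note that this only helps for raw block volumes (ZFS, LVM, raw files); a qcow2 image would first have to be mapped to a block device, e.g. with qemu-nbd.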
 
It's probably a hack, but this usually works for me:

Code:
# Make sure VM is disabled:
ha-manager set vm:<VMID> --state disabled
# Open GDISK to modify disk partition map
gdisk /dev/zvol/rpool/vm-<VMID>-disk-<DISK#>
# Once GDISK opens, just use the W command to re-write the partition map
# Re-enable (start) VM to verify the VM boots using the disk
ha-manager set vm:<VMID> --state enabled

You saved my life.
 
Code:
root@proxmox-2:~# qm stop 101
root@proxmox-2:~# gdisk /dev/zvol/rpool/data/vm-101-disk-1
GPT fdisk (gdisk) version 1.0.6

Partition table scan:
  MBR: not present
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with corrupt MBR; using GPT and will write new protective MBR on save.

Command (? for help): W

Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING PARTITIONS!! Do you want to proceed? (Y/N): y
OK; writing new GUID partition table (GPT) to /dev/zvol/rpool/data/vm-101-disk-1.
The operation has completed successfully.

This post is still saving me once a year ;) It seems that the way Home Assistant is built, it is not very fault tolerant to unplanned shutdowns.
 
Hi,
Code:
root@proxmox-2:~# qm stop 101
root@proxmox-2:~# gdisk /dev/zvol/rpool/data/vm-101-disk-1
GPT fdisk (gdisk) version 1.0.6

Partition table scan:
  MBR: not present
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with corrupt MBR; using GPT and will write new protective MBR on save.

Command (? for help): W

Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING PARTITIONS!! Do you want to proceed? (Y/N): y
OK; writing new GUID partition table (GPT) to /dev/zvol/rpool/data/vm-101-disk-1.
The operation has completed successfully.

This post is still saving me once a year ;) It seems that the way Home Assistant is built, it is not very fault tolerant to unplanned shutdowns.
Are you also using SATA as the disk controller, like @aax? There was a long-standing, rare bug in QEMU's SATA emulation that could lead to sector 0 being overwritten. It will be fixed in pve-qemu-kvm >= 8.0.2-7 (not yet packaged as of now) with https://git.proxmox.com/?p=pve-qemu.git;a=commit;h=816077299c92b2e20b692548c7ec40c9759963cf
 
Hi @fiona - I am indeed! I believe this came out of the box from how Home Assistant used to ship its setup instructions. Good to know it'll be patched, thanks! :)
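
In the meantime I'm thinking about moving the disk off the emulated SATA controller. A rough, untested sketch (VMID 101 and the volume name are taken from the config earlier in the thread; the guest needs virtio-scsi support, and the boot-order syntax may differ on older PVE releases):

Code:
# detach the SATA disk (it reappears as unusedN, the data is kept)
qm set 101 --delete sata0
# re-attach the same volume on the SCSI controller
qm set 101 --scsi0 t_pool:vm-101-disk-1
# point the boot order at the new disk
qm set 101 --boot order=scsi0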
 
