[SOLVED] Ubuntu VM can't boot after reboot of Proxmox

Steen_A

New Member
Oct 10, 2023
Hi group

I could really use some help!
After a reboot of my Proxmox server, one of my VMs can't boot any longer.
And of course it is the most important of the VMs I have running on my Proxmox server.

I get this error trying to boot the VM:

[Screenshot: error message shown when trying to boot the VM]

I do of course have backups of my VMs and LXCs, but restoring a backup of this particular VM gives the exact same problem. All of the backups!

There aren't any entries in the logs I've looked at that indicate why this VM won't boot.

The system can see the partitions on the disk belonging to this non-bootable VM:

[Screenshot: gdisk partition listing for the VM's disk]
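
(For reference, that listing comes from inspecting the VM's logical volume directly on the Proxmox host, the same command mentioned further down in this thread:)

gdisk -l /dev/pve/vm-103-disk-0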

Any help or suggestion is appreciated.

root@pve1:/var/log/pve/tasks# pveversion -v
proxmox-ve: 7.4-1 (running kernel: 5.15.126-1-pve)
pve-manager: 7.4-17 (running version: 7.4-17/513c62be)
pve-kernel-5.15: 7.4-7
pve-kernel-5.13: 7.1-9
pve-kernel-5.0: 6.0-11
pve-kernel-5.15.126-1-pve: 5.15.126-1
pve-kernel-5.15.116-1-pve: 5.15.116-1
pve-kernel-5.15.108-1-pve: 5.15.108-2
pve-kernel-5.15.107-2-pve: 5.15.107-2
pve-kernel-5.15.107-1-pve: 5.15.107-1
pve-kernel-5.15.104-1-pve: 5.15.104-2
pve-kernel-5.15.102-1-pve: 5.15.102-1
pve-kernel-5.15.85-1-pve: 5.15.85-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
pve-kernel-5.15.74-1-pve: 5.15.74-1
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph-fuse: 14.2.21-1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: 0.8.36+pve2
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4.1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.4-2
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.7
libpve-storage-perl: 7.4-3
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.3-1
proxmox-backup-file-restore: 2.4.3-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.2
proxmox-widget-toolkit: 3.7.3
pve-cluster: 7.3-3
pve-container: 4.4-6
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-4~bpo11+1
pve-firewall: 4.3-5
pve-firmware: 3.6-5
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-2
qemu-server: 7.4-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.11-pve1
 
I forgot the config for this VM:

root@pve1:/var/log/pve/tasks# qm config 103
boot: order=sata0;ide2;net0
cores: 2
description: * Type%3A Vm%0A* OS%3A Ubuntu server 20.04.5 LTS%0A* Function%3A NGINX proxy, ssh-GW, webserver, mailserver med iRedMail (postfix, dovecot & spamassisin m
fl)%0A* IP%3A 192.168.10.2%0A* Network%3A vmbr0-netv%C3%A6rket VLAN%3A 110 (DMZ)
ide2: none,media=cdrom
memory: 4096
name: mail.<my-domain>.dk
net0: virtio=B2:9B:B6:99:87:AF,bridge=vmbr0,firewall=1,tag=110
numa: 0
onboot: 1
ostype: l26
sata0: local-lvm:vm-103-disk-0,discard=on,size=25G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=8e08155a-70ad-4b45-9e0f-36c31faaf68a
sockets: 1
vmgenid: 717a654d-a8f6-4031-82cc-8a892ad73935
root@pve1:/var/log/pve/tasks#
 
Looks to me like the beginning of the virtual disk got corrupted at some point. The protective MBR was wiped and I would expect the second partition to be of type EF00, but I can't be sure, as it's also possible that you installed with a separate ext4 /boot. Maybe boot the VM with an Ubuntu live installer ISO (same version?) and do a (chroot) boot repair? Maybe Ubuntu support/forums know how to approach this in more detail?
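
(To boot the VM from a live/installer ISO, the ISO can be attached to the VM's existing CD-ROM drive and put first in the boot order from the Proxmox host. A rough sketch; the ISO filename and the "local" storage are placeholders for whatever ISO storage is actually available:)

# attach an Ubuntu ISO to the VM's ide2 CD-ROM drive (filename is an example)
qm set 103 --ide2 local:iso/ubuntu-20.04.5-live-server-amd64.iso,media=cdrom
# boot from the CD-ROM first, then fall back to the VM's disk
qm set 103 --boot 'order=ide2;sata0;net0'
qm start 103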
 
1. Boot Live CD
2. fsck your partitions
3. mount your partitions under /mnt (on live system)
e.g.
mount /dev/sda5 /mnt
mount /dev/sda2 /mnt/boot
mount /dev/sda1 /mnt/boot/efi

for i in /proc /sys /dev /run ; do mount -o bind $i /mnt/$i; done
chroot /mnt
update-grub
grub-install /dev/sda

exit

umount /mnt/*
umount /mnt

reboot
 
Hi "ubu"

Thank you for your suggestion !!
I've booted into Ubuntu-Desktop, that seems to be Ubuntu's live CD
I do see my sda disk with lsblk

[Screenshot: lsblk output in the live environment]

But neither fdisk nor gparted sees the partitions on the disk, so it is not possible to mount these partitions and run fsck on them.

Strangely enough, I do see the partitions from the Proxmox CLI using gdisk -l /dev/pve/vm-103-disk-0, as shown earlier in this thread.

Since Proxmox does see the partitions, is it possible in any way to mount them on a mountpoint in the Proxmox file tree?
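
(In general yes: the logical volume can be exposed as a loop device with partition scanning on the host, and the partitions mounted from there. A rough sketch, assuming the partition table is readable; with a wiped table, the scan would of course find nothing either:)

# on the Proxmox host: map the VM's logical volume to a loop device and scan its partitions
losetup -fP --show /dev/pve/vm-103-disk-0    # prints the loop device, e.g. /dev/loop0
lsblk /dev/loop0                             # partitions show up as /dev/loop0p1, /dev/loop0p2, ...
mount /dev/loop0p2 /mnt                      # example: mount one of the partitions
# if the guest uses LVM, scan and activate its volume group first
vgscan && vgchange -ay
# when done:
umount /mnt
losetup -d /dev/loop0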
 

Hi Ubu

Again your suggestions pointed me in the right direction.

I should have looked deeper into what gdisk -l told me: that the MBR was missing!

I wasn't sure if my VM (Ubuntu LTS) used MBR or GPT, so during my Google search someone suggested a command that should show the string "GRUB" if the disk uses MBR for booting: "dd bs=512 count=1 if=/dev/sda | strings"
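
(Run from the live environment against the VM's disk, assuming it shows up there as /dev/sda, that check looks like:)

# read the first 512-byte sector and look for readable strings;
# a classic MBR-booted GRUB install normally contains the string "GRUB" here
dd if=/dev/sda bs=512 count=1 | strings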

I got this output:

[Screenshot: strings output showing log-like text instead of "GRUB"]

So no wonder neither fdisk nor gdisk could see the MBR, and no wonder I couldn't see any partitions with lsblk.
Everything in the boot record and in the GPT partition table had been overwritten with something that looks like log entries.

Running gdisk /dev/pve/vm-103-disk-0 from the Proxmox command line told me that the primary GPT was lost, but that it could be regenerated from the secondary by writing the table back to disk with "w".
That done, I now had a valid partition table and got a "Booting from disk" message, but nothing more happened.
However, I could mount the individual partitions and LVM volumes and run fsck on them, and all passed this test... so far so good.
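
(Roughly, that gdisk recovery looks like the sketch below; the exact prompts depend on what gdisk detects, and it may offer to rebuild the main table on its own as soon as it starts:)

gdisk /dev/pve/vm-103-disk-0
#   r   -> open the recovery and transformation menu
#   b   -> rebuild the main GPT header from the backup header
#   c   -> load the backup partition table
#   w   -> write the rebuilt table back to disk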

When everything in the boot record got deleted, the GRUB bootloader got deleted too, so I had to reinstall it.

Since my root filesystem is on an LVM volume, I needed to mount it with these steps:

mkdir /mnt
mkdir /mnt/boot
mount /dev/mapper/<root-LVM> /mnt
mount /dev/sda2 /mnt/boot        (sda2 = /boot filesystem)
sudo grub-install --boot-directory=/mnt/boot /dev/sda
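
(If the live environment doesn't show the guest's LVM volumes under /dev/mapper at all, they may need to be scanned and activated first; a minimal sketch, where the volume group name is whatever the installer created, e.g. ubuntu-vg:)

sudo vgscan          # scan for LVM volume groups
sudo vgchange -ay    # activate all volume groups that were found
sudo lvs             # list logical volumes and note the root LV's name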


And now the VM could be booted again

I have no clue how or what overwrote the boot record with garbage, and it has been there for at least 4 backups, which is about a month.
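
(One way to catch this kind of silent corruption earlier would be to occasionally restore a backup to a spare VM ID and check its partition table without touching the original. A sketch; the archive filename is just a placeholder:)

# restore a backup to an unused VM ID (filename is a placeholder)
qmrestore /var/lib/vz/dump/vzdump-qemu-103-<timestamp>.vma.zst 999 --storage local-lvm
# inspect the restored disk's partition table from the host
gdisk -l /dev/pve/vm-999-disk-0
# remove the test VM again when done
qm destroy 999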
 
