PVE 5.2 KVM with UEFI/OVMF only sees the first drive

AMINET

Hi,

I just upgraded my server from 5.1 to 5.2, but most of my VMs now boot with only the first drive, /dev/sda, which holds only the UEFI boot partition (FAT32).
As a result, GRUB complains that it cannot find the OS disk (by UUID).
In the GRUB shell, ls returns only (hd0) (cd0).

I had no problems under PVE 5.1.

I downgraded the packages to their previous versions: pve-qemu-kvm to 2.11.1-5 and pve-edk2-firmware to 1.20180316-1, with no effect (I even tried a recent build of edk2 from kraxel.org: /repos/jenkins/edk2/edk2.git-ovmf-x64-0-20180807.217.g9e6c4f1527.noarch.rpm).
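For reference, a downgrade like that is normally done by pinning the versions with apt, assuming the older packages are still available in the repository or as local .deb files (a sketch, not necessarily the exact commands used):
Code:
apt install pve-qemu-kvm=2.11.1-5 pve-edk2-firmware=1.20180316-1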

One of the qemu-server config files:
agent: 1
bios: ovmf
boot: dc
bootdisk: scsi0
cores: 6
cpu: host
cpuunits: 1024
efidisk0: zfs-sas:vm-72102-disk-4,size=128K
ide2: none,media=cdrom
keyboard: fr
memory: 512
name: dev01
net0: virtio=A6:D5:D9:29:BE:C2,bridge=vmbr22,tag=22
numa: 1
ostype: l26
scsi0: zfs-sas:vm-72102-disk-2,size=200M
scsi1: zfs-sas:vm-72102-disk-1,size=12G
scsi2: zfs-sas:vm-72102-disk-3,size=1G
scsihw: virtio-scsi-single
smbios1: uuid=e960e799-74ea-4c0f-84a4-fadf865d53b3
sockets: 1
vga: qxl


Adding iothread or changing virtio-scsi-single to virtio-scsi doesn't help.

The arguments passed to KVM for the disks look OK:
-drive if=none,id=drive-ide2,media=cdrom,aio=threads
-device ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=100

-device virtio-scsi-pci,id=virtioscsi0,bus=pci.3,addr=0x1 -drive file=/dev/zvol/zfs-sas/vm-72102-disk-2,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on
-device scsi-hd,bus=virtioscsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=200

-device virtio-scsi-pci,id=virtioscsi1,bus=pci.3,addr=0x2 -drive file=/dev/zvol/zfs-sas/vm-72102-disk-1,if=none,id=drive-scsi1,format=raw,cache=none,aio=native,detect-zeroes=on
-device scsi-hd,bus=virtioscsi1.0,channel=0,scsi-id=0,lun=1,drive=drive-scsi1,id=scsi1

-device virtio-scsi-pci,id=virtioscsi2,bus=pci.3,addr=0x3 -drive file=/dev/zvol/zfs-sas/vm-72102-disk-3,if=none,id=drive-scsi2,format=raw,cache=none,aio=native,detect-zeroes=on
-device scsi-hd,bus=virtioscsi2.0,channel=0,scsi-id=0,lun=2,drive=drive-scsi2,id=scsi2


(with SeaBIOS all drives are present)

Does anyone have an idea what's going on?
 
I compared the kvm process arguments between a VM that was never stopped (live-migrated between servers) and a VM that was stopped and then started; the difference is not in the disk arguments.
The never-stopped VM (started under PVE 5.1) has extra arguments:
"-machine type=pc-i440fx-2.9": for new VMs the default is pc-i440fx-2.11, but adding the old machine type in args doesn't change anything.
"-incoming unix:/run/qemu-server/VMID.migrate -S" is present because the VM was migrated.
 
Is a newly installed VM working? I tested an Ubuntu 18.04 install with UEFI and it worked.

/usr/bin/kvm
-id 102
-name testuefi
-chardev 'socket,id=qmp,path=/var/run/qemu-server/102.qmp,server,nowait'
-mon 'chardev=qmp,mode=control'
-pidfile /var/run/qemu-server/102.pid
-daemonize
-smbios 'type=1,uuid=f35a9b17-50d5-4542-a87f-627ced02c471'
-drive 'if=pflash,unit=0,format=raw,readonly,file=/usr/share/pve-edk2-firmware//OVMF_CODE.fd'
-drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,file=/dev/zvol/nvme/vm-102-disk-2'
-smp '1,sockets=1,cores=1,maxcpus=1'
-nodefaults
-boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg'
-vga std
-vnc unix:/var/run/qemu-server/102.vnc,x509,password
-cpu kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce
-m 512
-device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e'
-device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f'
-device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2'
-device 'usb-tablet,id=tablet,bus=uhci.0,port=1'
-device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3'
-iscsi 'initiator-name=iqn.1993-08.org.debian:01:5652841c0d9'
-drive 'if=none,id=drive-ide2,media=cdrom,aio=threads'
-device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200'
-device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5'
-drive 'file=/dev/zvol/nvme/vm-102-disk-1,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on'
-device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100'
-netdev 'type=tap,id=net0,ifname=tap102i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on'
-device 'virtio-net-pci,mac=EE:D2:34:FC:C8:22,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300'
 
Hi Alwin,

In fact, I discovered the problem while creating a new VM.
But that one was a clone from a template, so I just created a new one from scratch, booting from a live CD:

bios: ovmf
bootdisk: scsi0
cores: 2
cpu: host
efidisk0: local:100/vm-100-disk-2.raw,size=128K
ide2: local:iso/systemrescuecd-x86-5.2.0_UEFI.iso,media=cdrom
memory: 512
name: testnewvm
net0: virtio=4A:AD:7A:3A:04:A1,bridge=vmbr227
numa: 1
ostype: l26
scsi0: local:100/vm-100-disk-1.raw,size=209716K
scsi1: local:100/vm-100-disk-3.raw,iothread=1,size=5G
scsi2: local:100/vm-100-disk-4.raw,size=1G
scsihw: virtio-scsi-single
smbios1: uuid=f0fd1c0c-69ae-4eef-917b-ad55eafdab5e
sockets: 1

So I booted the cloned VM from the same live CD, and lsblk shows the disks.
Then I installed Ubuntu Server 18.04.1 and it went well.
So could the problem come from GRUB? (the versions from Debian 9.5 and Oracle Linux 7.5)
I copied the GRUB files (by cold-mounting the raw disks) from Ubuntu 18.04 to an old VM, but no luck.
Then I restarted the Ubuntu 18.04 VM, and now it faces the same problem.
I reinstalled Ubuntu 18.04: at the end of the install the reboot was OK, but on the next reboot GRUB only sees hd0 :(
I also booted from the Ubuntu 18.04 installer; it sees the disks at the partitioning step, but after a reboot GRUB still only sees hd0.

So I think there is something in the "initialization" of the VM that the live CD fixes temporarily (Linux kernel, udev/systemd ...?).
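The "cold mount" copy mentioned above was along these lines (a sketch; paths are illustrative, the Ubuntu disk being a partitioned raw file on local storage and the old VM's OS disk a zvol):
Code:
mkdir -p /mnt/newvm /mnt/oldvm
# the Ubuntu disk is a raw file with a partition table: attach it with partition scanning
losetup -fP --show /var/lib/vz/images/100/vm-100-disk-3.raw    # prints e.g. /dev/loop0
mount /dev/loop0p1 /mnt/newvm
# the old VM's OS disk is ext4 directly on the zvol, no partition table
mount /dev/zvol/zfs-sas/vm-72102-disk-1 /mnt/oldvm
cp -a /mnt/newvm/boot/grub /mnt/oldvm/boot/
umount /mnt/newvm /mnt/oldvm
losetup -d /dev/loop0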

Regards

Aurélien
 
scsi0: zfs-sas:vm-72102-disk-2,size=200M
If I understand correctly, then on the first disk there is only the EFI partition and no kernel to boot from. AFAIK, this doesn't work.
 
Yes, the first disk has a GPT partition table with only a FAT32 partition (sda1) for the EFI bootloader (GRUB ...).
The other disks don't have any partition table; they are formatted directly. This allows simple and quick disk growing: "Resize disk" at the Proxmox level and then resize2fs at the OS level.
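For example, growing one of those partition-less disks looks roughly like this (the VMID and size are illustrative):
Code:
# on the PVE host: grow the volume backing scsi1
qm resize 72102 scsi1 +4G
# inside the guest: grow the ext4 filesystem to fill the device
resize2fs /dev/sdb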
 
The disk layout from the Ubuntu installer:
scsi-0/sda: 600 MB; "make a boot disk" created a FAT32 partition (/dev/sda1) using 50% of the available space, mounted as /boot/efi
scsi-1/sdb: 5 GB; the Ubuntu installer needs a partition for /, so /dev/sdb1 is a 5 GB ext4 partition
scsi-2/sdc: 1 GB, formatted as swap (no partition)

(Screenshot: Ubuntu installer partitioning, unbutu_part.png)
 
Hi Alwin,

Were you able to reproduce the problem?
Or do you have an idea how to fix it?
(I could create the /boot partition as sda2, but I would have to do it for every VM ... and it's not a fix.)

Regards
Aurélien
 
The current implementation doesn't allow having the EFI partition on a separate disk. AFAICS, the second disk needs a bootindex too, and this is currently not possible to set.

Please feel free to open up a bug report.
https://bugzilla.proxmox.com/

EDIT: Why are you using a separate EFI disk anyway?
 
Hi,

Sorry for the delay, I was offline (on vacation).

The EFI partition isn't on a separate disk; it's on the first disk (sda).
The OS (Linux) is installed on the second disk (sdb), partitioned or not (sdb or sdb1), with the swap on a third disk (sdc, without a partition).

As a result, EFI starts the GRUB binary (grubx64.efi), which in turn has to read grub.cfg, its modules, etc.:
If they are on sdb (in /boot/grub/...), it fails.
If they are on the EFI partition, it works (the menu is shown), but it then fails to boot the Linux kernel, which is in /boot on sdb.

With my disk layout, the workaround is to have /boot on sda2, or to have /boot directly in sda1 (but that's a FAT filesystem).

The disk layout:
disk1/sda: 200 MB with a GPT partition table and one FAT partition dedicated to EFI, with the boot and esp flags set (mounted as /boot/efi)
disk2/sdb: 5 GB+ with no partition table, formatted directly as ext4 for the operating system (mounted as /)
disk3/sdc: 1 GB+ with no partition table, formatted directly as Linux swap

I use this layout because disks without a partition table can be:
- resized easily (just "Resize disk" in the PVE interface and then resize2fs /dev/sdb from the guest)
- mounted directly, since the disks are raw rather than qcow2 or vmdk (no need to find the partition offset or to use qemu-nbd/partx)
This is useful because I have a script that uses "qm clone" and pvesh, then mounts the sdb disk of the new VM to set the IP, hostname, new accounts, SSH host keys, the user's authorized key, and a new user/password before the OS boots (the kind of automation cloud-init would do on a booted system); a rough sketch is below.
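A minimal sketch of that pre-boot provisioning, assuming an illustrative template ID, VMID, and hostname, with the OS disk as ext4 directly on a zvol as described above (the real script does more):
Code:
# clone the template and customise the new VM's OS disk before first boot
qm clone 9000 123 --name web01
mount /dev/zvol/zfs-sas/vm-123-disk-1 /mnt      # OS disk: no partition table, mountable as-is
echo web01 > /mnt/etc/hostname
mkdir -p /mnt/root/.ssh
cat ~/.ssh/id_rsa.pub >> /mnt/root/.ssh/authorized_keys
umount /mnt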

I filed a bug report: https://bugzilla.proxmox.com/show_bug.cgi?id=1896

Regards

Aurélien
 
Thanks for providing the report. For now, the workaround would be to go into the EFI setup screen (hit ESC) and select the disk to boot manually; this way it seems to add the bootindex to the second disk.
 
You're welcome for the report; it's the least we can do.

To avoid any misunderstanding, the problem is not about booting from the second disk.
The EDK2 firmware reads the EFI vars on the EFI disk properly and respects the boot order.
efibootmgr -v returns:
Code:
BootCurrent: 0002
Timeout: 0 seconds
BootOrder: 0003,0002,0001,0000,0008,0004,0005,0006,0007,0009
Boot0000* UiApp   FvVol(7cb8bdc9-f8eb-4f34-aaea-3ee4af6516a1)/FvFile(462caa21-7614-4503-836e-8ab6f4662331)
Boot0001* UEFI QEMU QEMU HARDDISK    PciRoot(0x0)/Pci(0x5,0x0)/Pci(0x1,0x0)/SCSI(0,0)N.....YM....R,Y.
Boot0002* debian   HD(1,GPT,b9834ae7-5e2a-4d6c-9be2-7bdbceccef18,0x800,0x637df)/File(\EFI\debian\grubx64.efi)
Boot0003* UEFI QEMU DVD-ROM QM00003    PciRoot(0x0)/Pci(0x1,0x1)/Ata(1,0,0)N.....YM....R,Y.
Boot0004* UEFI Floppy   PciRoot(0x0)/Pci(0x1,0x0)/Floppy(0x0)N.....YM....R,Y.
Boot0005* UEFI Floppy 2   PciRoot(0x0)/Pci(0x1,0x0)/Floppy(0x1)N.....YM....R,Y.
Boot0006* UEFI QEMU QEMU HARDDISK  2   PciRoot(0x0)/Pci(0x5,0x0)/Pci(0x2,0x0)/SCSI(0,1)N.....YM....R,Y.
Boot0007* UEFI QEMU QEMU HARDDISK  3   PciRoot(0x0)/Pci(0x5,0x0)/Pci(0x3,0x0)/SCSI(0,2)N.....YM....R,Y.
Boot0008* EFI Internal Shell   FvVol(7cb8bdc9-f8eb-4f34-aaea-3ee4af6516a1)/FvFile(7c04a583-9e3e-4f1c-ad65-e05268d0b4d1)
Boot0009* UEFI PXEv4 (MAC:A6D5D929BEC2)   PciRoot(0x0)/Pci(0x12,0x0)/MAC(a6d5d929bec2,1)N.....YM....R,Y.

As there is no CD, it boots entry 0002 (Debian for that VM).

Note: this VM now actually boots without problems, as I moved the /boot files onto the first disk (directly into the FAT32 partition).


The problem is that GRUB starts (\EFI\debian\grubx64.efi), but since with my disk layout /boot/grub is on the second disk (sdb/vdb), it has no access to grub.cfg and its modules, and it drops to its prompt.
At the GRUB prompt, the ls command only shows the first disk and the DVD, not the second and third disks.


As I posted in bug report 1896, I did cross-testing with Gentoo on my workstation. The problem seems to come from the EDK2 firmware.
 
Note: this VM now actually boots without problems, as I moved the /boot files onto the first disk (directly into the FAT32 partition)
That's how it works without intervention, but in your original case (leaving the efidisk aside), using separate disks for GRUB and /boot will not work with EFI. This is because, on starting the VM, our code sets the bootindex only for the selected drive and not for any other (e.g. /boot).
 
I don't understand why you say it will not work with EFI.
To avoid any misunderstanding: it was working well with PVE 5.1, and the main GRUB binary is installed on the EFI partition (via grub-install --efi-directory=/boot/efi ..., with sda1 mounted at /boot/efi), which contains the file EFI/debian/grubx64.efi.
The bootindex options on the KVM command line were set the same way in PVE 5.1.
GRUB is started fine by OVMF, since it is on the first disk (in the EFI partition), but it can't access grub.cfg, its modules, and the Linux kernel, which are on another drive. In the GRUB shell, the ls command only returns hd0 (and cd0).
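For reference, the install step that puts grubx64.efi on the ESP looks roughly like this for a standard Debian guest (a sketch; the bootloader id is an assumption, though it matches the "debian" entry in the efibootmgr output above):
Code:
# sda1 (the FAT32 ESP) mounted at /boot/efi
grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=debian
update-grub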

As I posted on the bug report after testing the latest EDK2 version, I found that pressing ESC to enter the setup and then directly choosing "Continue" fixes the problem for the current boot (it needs to be done at each boot/reboot) so that GRUB sees all the disks.
 