Our VM bringup procedure frequently results in VMs that don't start and need manual fixing with a partition utility before they will boot. It happens like this:
1. On creation, pvesh create prints a warning about detecting an existing GPT signature on both of the volumes it creates. An example is [1] at the bottom of this post.

2. When the VM is started, it hangs for a while waiting for the root device before dropping into the initramfs prompt. What resolves the problem is using gdisk to repair the GPT partition table of the VM's disk 0; gdisk's rescue submenu has some options that make this easy (a sketch of that session is below). Then we stop the VM, restart it, and it boots just fine. An abbreviated boot log is [2] at the bottom of this post.

We are running Proxmox 6.4 with an LVM storage backend (the data store is an iSCSI SAN device). The VMs are always based on recent Ubuntu server ISOs: 20.04 and 22.04.
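For reference, the gdisk fix from step 2 looks roughly like this. It's a sketch from memory, and the LV path is just the example from the clone log in [1]; the same steps apply if the repair is done from inside the guest against its own disk instead.

Code:
# Repair the GPT on the VM's disk 0 (path taken from the clone log in [1]).
gdisk /dev/STOR2-PVE1-01/vm-136-disk-0
#   r   - enter the recovery and transformation menu
#   v   - verify the disk to see what gdisk considers damaged
#   b/c - rebuild the main GPT header / partition table from the backup copies
#   w   - write the repaired table and exit
# Then stop and restart the VM; it boots normally afterwards.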
Our current hypothesis is that some vestige of ZFS or mdadm metadata exists on the newly created volumes. Maybe so, but I don't completely understand it: I wouldn't think a recognizable signature would appear on every "chunk" of the datastore allocated by Proxmox.
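If that hypothesis is right, the leftover signatures should be visible on a freshly allocated volume before any guest data is written to it. A quick way to check, assuming a throwaway LV on the same volume group (the VG name is taken from [1], the LV name is made up):

Code:
# Allocate a scratch LV on the iSCSI-backed VG and probe it for stale signatures.
lvcreate -L 32G -n sigtest STOR2-PVE1-01
wipefs /dev/STOR2-PVE1-01/sigtest      # without -a this only lists signatures, it wipes nothing
blkid -p /dev/STOR2-PVE1-01/sigtest    # low-level probe for filesystem/RAID magic
lvremove STOR2-PVE1-01/sigtest         # clean up the scratch LV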
It does seem that the way we do the creation is not quite right. We run pvesh create from a script, i.e. non-interactively, so it has no way to answer 'y' to the request to wipe the partition table. Do we need to figure out a way to do this? Is there an option to the create subcommand that instructs it to wipe?

More generally, conceptually (and practically, I guess), what goes into the "preparation process" when Proxmox creates a new block device for use by a VM? Regardless of this particular problem, I'm curious about how it works.
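In the meantime, a post-clone sanity check might at least let the script catch a broken table before the VM is ever started. A rough sketch, reusing the clone command from [1] and assuming sgdisk (the non-interactive counterpart of gdisk) is available on the node where the new LV is active:

Code:
#!/bin/sh
# Sketch only: clone as before, then have sgdisk report on the new disk's GPT
# before the VM is started. IDs, names and the LV path are the ones from [1].
set -e

pvesh create /nodes/pve1-c1n2/qemu/10004/clone \
    --newid 136 --target pve1-c4n3 --name server7.dc1.internal --full true

# sgdisk's verify mode prints any damage it finds in the partition tables;
# a clean disk should report no problems.
sgdisk --verify /dev/STOR2-PVE1-01/vm-136-disk-0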
[1]
Code:
RUN: pvesh create /nodes/pve1-c1n2/qemu/10004/clone --newid 136 --target pve1-c4n3 --name server7.dc1.internal --full true
create full clone of drive scsi0 (storage2:vm-10004-disk-0)
WARNING: PMBR signature detected on /dev/STOR2-PVE1-01/vm-136-disk-0 at offset 510. Wipe it? [y/n]: [n] Aborted wiping of PMBR.
Logical volume "vm-136-disk-0" created. 1 existing signature left on the device.
... many transferring messages ...
create full clone of drive ide2 (storage2:vm-10004-cloudinit)
WARNING: iso9660 signature detected on /dev/STOR2-PVE1-01/vm-136-cloudinit at offset 32769. Wipe it? [y/n]: [n] Aborted wiping of iso9660.
Logical volume "vm-136-cloudinit" created. 1 existing signature left on the device.
"UPID:pve1-c1n2:00006BA5:276EC1D88:67303F0E:qmclone:10004:root@pam:"
[2]
Code:
Booting from Hard Disk...
[ 0.000000] Linux version 5.15.0-46-generic (buildd@lcy02-amd64-115) (gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0, GNU ld (GNU Binutils for Ubuntu) 2.38)
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-46-generic root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0
... boot messages ...
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... [ 3.568375] Btrfs loaded, crc32c=crc32c-intel, zoned=yes, fsverity=yes
Scanning for Btrfs filesystems
done.
Begin: Waiting for root file system ... Begin: Running /scripts/local-block ... mdadm: No arrays found in config file or automatically
done.
mdadm: No arrays found in config file or automatically
... duplicates of above ....
mdadm: error opening /dev/md?*: No such file or directory
mdadm: No arrays found in config file or automatically
... duplicates of above ....
done.
Gave up waiting for root file system device. Common problems:
- Boot args (cat /proc/cmdline)
- Check rootdelay= (did the system wait long enough?)
- Missing modules (cat /proc/modules; ls /dev)
ALERT! LABEL=cloudimg-rootfs does not exist. Dropping to a shell!
BusyBox v1.30.1 (Ubuntu 1:1.30.1-7ubuntu3) built-in shell (ash)
Enter 'help' for a list of built-in commands.
(initramfs)