Cloning VM Template With Cloud Init Fails

vanilla_wizard

Feb 28, 2024
Hi all,

I've been trying to deploy machines via the Proxmox VE API by cloning an existing template and configuring it with cloud-init.
Sometimes this works; however, most of the time my VMs either get stuck in a boot loop or drop to the initramfs shell:

Boot loop:
[screenshot: VM stuck in a boot loop]

Initramfs:

ALERT! UUID=<UUID> does not exist. Dropping to a shell.
(initramfs)

My environment:

- Proxmox Virtual Environment 8.1.4 (Also tested on 8.0.x)
- Kernel Version: Linux 6.5.13-1-pve (2024-02-05T13:50Z)
- Template specs: 1 vCPU | 1 GB RAM | 3.5 GB bootdisk size

Machines are cloned from a template created from the official Ubuntu cloud image; the following steps are used to create this template:
Code:
qm create <VMID> --memory 1024 --net0 virtio,bridge=vmbr0 --scsihw virtio-scsi-pci --onboot=1 --ostype l26
qm set <VMID> --scsi0 <STORAGE>:0,import-from=<PATH-TO-IMG>
qm set <VMID> --ide2 <STORAGE>:cloudinit
qm set <VMID> --boot order=scsi0
qm set <VMID> --serial0 socket --vga serial0
qm set <VMID> --cicustom "vendor=local:snippets/ubuntu-22.04-vendor.yaml"
qm template <VMID>

Machines are then cloned via the Proxmox VE API, with the following parameters included:
- ciuser
- ipconfig0
- nameserver
- cipassword
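The clone-and-configure flow above can be sketched with pvesh (a sketch only: the node name pve12-test, the VMIDs, and the password placeholder are hypothetical):

```shell
# Full clone of the template (VMID 100) to a new VM (VMID 1144);
# node name and IDs are placeholders.
pvesh create /nodes/pve12-test/qemu/100/clone --newid 1144 --name test-clone --full 1

# Apply the cloud-init parameters listed above to the clone.
# 'changeme' stands in for the real password.
pvesh set /nodes/pve12-test/qemu/1144/config \
  --ciuser blingbling \
  --cipassword 'changeme' \
  --ipconfig0 'gw=10.251.0.1,ip=10.251.192.91/16' \
  --nameserver '1.1.1.1 8.8.8.8'
```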

Example cloned machine config:
Code:
boot: order=scsi0
cicustom: vendor=local:snippets/ubuntu-22.04-vendor.yaml
cipassword: **********
ciuser: blingbling
description: Post cloud init scripts found here%3A /var/lib/vz/snippets%0A%0AVM Configs found here%3A /etc/pve/qemu-server
ide2: vms-vhds:vm-1144-cloudinit,media=cdrom,size=4M
ipconfig0: gw=10.251.0.1,ip=10.251.192.91/16
memory: 1024
meta: creation-qemu=8.0.2,ctime=1702576222
name: S-hascicustom
nameserver: 1.1.1.1 8.8.8.8
net0: virtio=BC:24:11:4C:07:6C,bridge=vmbr0
onboot: 1
ostype: l26
scsi0: vms-vhds:vm-1144-disk-0,size=3584M
scsihw: virtio-scsi-pci
serial0: socket
smbios1: uuid=2a5a7016-9519-40d3-8c93-fe5008458f69
vga: serial0
vmgenid: b617bbc1-c8ee-4094-934f-e960a98b541b
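The description line is stored URL-encoded in /etc/pve/qemu-server configs (%3A is a colon, %0A a newline). A small sketch for decoding it when reading configs programmatically:

```python
from urllib.parse import unquote

# Description value copied from the config above; Proxmox stores it URL-encoded.
desc = ("Post cloud init scripts found here%3A /var/lib/vz/snippets"
        "%0A%0AVM Configs found here%3A /etc/pve/qemu-server")

print(unquote(desc))
```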

Template config:
Code:
boot: order=scsi0
cicustom: vendor=local:snippets/ubuntu-22.04-vendor.yaml
description: Post cloud init scripts found here%3A /var/lib/vz/snippets%0A%0AVM Configs found here%3A /etc/pve/qemu-server
ide2: vms-vhds:vm-100-cloudinit,media=cdrom
memory: 1024
meta: creation-qemu=8.0.2,ctime=1702576222
net0: virtio=A2:9C:B0:50:9F:CD,bridge=vmbr0
onboot: 1
ostype: l26
scsi0: vms-vhds:base-100-disk-0,size=3584M
scsihw: virtio-scsi-pci
serial0: socket
smbios1: uuid=068f9ff0-3196-4e37-9b87-0b52572881e2
template: 1
vga: serial0
vmgenid: af7c5060-f4c8-42b5-a8ed-f42de8e52d98
Am I missing something obvious here? I've been having this issue for the past 3 months.
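To compare a clone's config with the template's line by line, the key: value output above can be loaded into dicts with a small helper (a sketch; parse_qm_config is not part of any Proxmox tooling):

```python
def parse_qm_config(text):
    # Parse "key: value" lines as emitted by `qm config` and the config files.
    # Splits on the first colon only, so values such as
    # "vms-vhds:vm-1144-disk-0,size=3584M" survive intact.
    cfg = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            cfg[key.strip()] = value.strip()
    return cfg

sample = "boot: order=scsi0\nmemory: 1024\nscsi0: vms-vhds:vm-1144-disk-0,size=3584M"
print(parse_qm_config(sample))
```

Comparing the two resulting dicts (e.g. via the symmetric difference of their item sets) quickly shows which keys differ between clone and template.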
 
Hello, don't use IDE for the cloud-init drive; there are multiple issues with it. Try this as a reference: https://www.thomas-krenn.com/de/wiki/Cloud_Init_Templates_in_Proxmox_VE_-_Quickstart and you should use virtio-scsi-single for the scsihw controller.
Hey there jsterr, thanks for replying. I just followed this guide, but I still seem to encounter the same problem. Do you have any other suggestions I could try? Also, I wonder why the official Proxmox guide suggests using IDE drives if there are multiple issues with them...
 
Is that the configuration output of the template, or of the VM that was cloned from the template?
Can you post both, preferably as text and using CODE tags?

The failure to find the boot device happens well before any cloud-init interaction, so I can't imagine how CI could be related.
Can you boot the VM before you convert it to a template? (By the way, you posted "qm template" in your procedure; I assume it's a copy/paste error, since it's missing the VM ID?)

I would: a) check that my image is correct, i.e. amd64 and not arm64, b) checksum the image, c) try to boot a VM freshly made from the image, i.e. no template.

We use a very similar procedure with official cloud images daily and have never had an issue.

@jsterr IDE works fine as long as one doesn't use UEFI/OVMF BIOS. That said, you are not wrong that SCSI can be used for controller simplicity.



Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Hi,
I would: a) check that my image is correct, i.e. amd64 and not arm64, b) checksum the image, c) try to boot a VM freshly made from the image, i.e. no template.
See also here for some commands to verify that the disk contents are as expected. You can do this for the cloned image, the one from the template, and the original one.
 
Hey there,
The posted config is from the VM cloned from the template; I have also just added the template's config.
I just checked a), b) and c), and everything still looks good.

I updated the initial post to fix the template command; it was indeed a typo :).


Hi Fiona, I ran all of the commands and here are all of the results:

Code:
root@pve12-test:~# fdisk -l /dev/zvol/vms-vhds/vm-1150-disk-0
Disk /dev/zvol/vms-vhds/vm-1150-disk-0: 3.5 GiB, 3758096384 bytes, 7340032 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 16384 bytes
I/O size (minimum/optimal): 16384 bytes / 16384 bytes

Code:
root@pve12-test:~# lsblk -o NAME,FSTYPE /dev/zvol/vms-vhds/vm-1150-disk-0
NAME  FSTYPE
zd320 ext4

Code:
root@pve12-test:~# wipefs /dev/zvol/vms-vhds/vm-1150-disk-0
DEVICE         OFFSET TYPE UUID                                 LABEL
vm-1150-disk-0 0x438  ext4 43cf0ceb-1f9d-4018-89da-4f1585f40449 BOOT
 
Okay, so there is an ext4 filesystem on the volume, looking good. Is this a new clone (the original you posted had a different ID)? What does qemu-img compare /dev/zvol/vms-vhds/vm-1150-disk-0 /dev/zvol/vms-vhds/base-100-disk-0 say?

If a given clone starts successfully, does the same clone always start successfully, or is it random for each attempt to start?

Some things you could try (just wild guesses):
  • When editing the VM's disks, in the Advanced Settings, set Async IO to aio=threads.
  • Detach the disk and re-attach as VirtIO block instead of VirtIO SCSI (don't forget to update the Boot Order in the VM's Options).
  • Also enable the ide2 disk in the VM's Boot Order as a second entry.
 

Running
Code:
qemu-img compare /dev/zvol/vms-vhds/vm-1150-disk-0 /dev/zvol/vms-vhds/base-100-disk-0
gives:

Code:
root@pve12-test:~# qemu-img compare /dev/zvol/vms-vhds/vm-1150-disk-0 /dev/zvol/vms-vhds/base-100-disk-0
Warning: Image size mismatch!
Images are identical.
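The size-mismatch warning by itself can be benign on ZFS: a zvol's size is constrained to a multiple of its block size, so two volumes with identical content may differ in length. A toy illustration of that rounding (the 16384 figure is taken from the fdisk output above; this is an assumption about the cause, not a confirmed diagnosis):

```python
def round_up_to_block(size_bytes, block_size):
    # Round a volume size up to the next multiple of the block size,
    # mirroring how zvol sizes are aligned.
    return -(-size_bytes // block_size) * block_size

# 3.5 GiB is already 16 KiB-aligned, so it is unchanged:
print(round_up_to_block(3758096384, 16384))
```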

I'm not sure whether it always starts successfully; my workflow, when it does start, is to migrate the VM to another node after the initial cloud-init configuration completes. After migration the machine sometimes starts to fail, and migrating it back to the initial node can make it work again from time to time.

I also tried all of the wild guesses, but unfortunately none of them seem to work.

Do you reckon it could be related to this ZFS issue?: https://github.com/openzfs/zfs/issues/15904
If so, how would I verify it?
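One way to start verifying (assuming the vms-vhds storage is ZFS and the dataset names below match your setup): check the zvols' volblocksize, since the fdisk output above reports 16384-byte physical sectors, and check the installed ZFS version against the releases the linked issue mentions:

```shell
# Block size the zvol exposes to the guest (expected: 16K here).
zfs get -H -o value volblocksize vms-vhds/vm-1150-disk-0
zfs get -H -o value volblocksize vms-vhds/base-100-disk-0

# Installed OpenZFS version, to compare against the affected releases.
zfs version
```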
 
Nope; unfortunately, when I created a VM from the image directly, it did not boot correctly.
If you can't start a basic image-based VM, then it's neither a clone nor a cloud-init problem. Additionally, it means that (c) was not good?

My recommendation is to go back and concentrate on just the simplest case: reliably import the official cloud image and boot the VM.
Try different images, and try different (i.e. local) storage; it's possible your storage is lying to you and not flushing/writing everything to disk. Put templates, cloning, and cloud-init aside.

For the full picture, share with the forum what exact image you used and what other images you tried, as well as what hardware is behind the vms-vhds storage.

Good luck

