IDE cloud-init volumes are not usable when cold-booting a VM

jamesharr · Oct 31, 2022

Hi everyone,

I think I found an issue with cloud-init volumes. I'm wondering if anyone else has run into this and I'm also wondering if anyone else can replicate what I'm seeing.

It seems that when cold-starting a VM with a cloud-init volume attached to IDE2, that the OS (Alma9 Cloud image in this case) cannot see the volume. After restarting warm (VM reset, or ctrl-alt-delete in the VM), the volume is readable.

I'm wondering if anyone else has run into this and/or can find something silly I'm doing. I've looked to see if it's a bug and I haven't found anything specific to this. If this does wind up being a bug, I'm wondering if anyone has advice on things I can provide to help triage/get it addressed.

For the time-being, I found that attaching the cloud-init drive via SCSI works without issue.

Reproducing the issue

I did this all via pvesh, however I think it might actually work via the qm create or via the WebUI

Bash:

pvesh create /nodes/vdev-1/qemu \
    --name=kdevdev-compute-1 \
    --agent=1 \
    --boot=c \
    --bootdisk=scsi0 \
    --scsihw=virtio-scsi-pci \
    --bios=ovmf \
    --onboot=1 \
    --serial0=socket \
    --net0=virtio,bridge=vmbr301,firewall=1 \
    --ostype=l26 \
    --citype=nocloud \
    --ciuser=root \
    --cipassword=$6$--REDACTED-- \
    --sshkeys=ssh-rsa%20--REDACTED-- \
    --cpu=host \
    --memory=65536 \
    --cores=8 \
    --efidisk0=local-nvme2-blk:0,efitype=4m,pre-enrolled-keys=1 \
    --scsi0=local-nvme2-blk:0,discard=on,size=100G,import-from=/mnt/pve/ceph-fs/ci-images/AlmaLinux-9-GenericCloud-latest.x86_64.qcow2 \
    --scsi1=local-nvme2-blk:300,discard=on,size=300G \
    --ide2=local-nvme2-blk:cloudinit \
    --vmid=1061 \
    --ipconfig0=ip=--REDACTED--/26,gw=--REDACTED--,ip6=--REDACTED--/64,gw6=--REDACTED-- \
    --output-format=json-pretty

pvesh set /nodes/vdev-1/qemu/1061/firewall/options \
    --enable=1 \
    --output-format=json-pretty

pvesh create /nodes/vdev-1/qemu/1061/firewall/rules \
    --action=kdevdev-compute \
    --enable=1 \
    --pos=0 \
    --type=group

pvesh create /nodes/vdev-1/qemu/1061/status/start
# output
generating cloud-init ISO
UPID:vdev-1:002209BE:105C8F9C:635FD5DC:qmstart:1061:root@pam:

At the grub boot-menu, edit default entry and add init=/bin/bash

Bash:

linux ... init=/bin/bash

After the kernel has booted up:

Bash:

bash-5.1# lsblk -f
NAME   FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS
sda                                       
|-sda1                                    
|-sda2                                    
|-sda3                                    
`-sda4                            8.6G     8% /
sdb                                       

### Rescan scsi devices just to be sure...
bash-5.1# for h in /sys/class/scsi_host/host*; do echo "- - -" > $h/scan; done

bash-5.1# lsblk -f
NAME   FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS
sda                                       
|-sda1                                    
|-sda2                                    
|-sda3                                    
`-sda4                            8.6G     8% /
sdb                                       

### Look for any sr0 related messages in dmesg
bash-5.1# dmesg | grep sr0
[nothing]

After a system reset, sr0 device will be present

Bash:

echo "b" > /proc/sysrq-trigger

### At grub, add `init=/bin/bash`

bash-5.1# lsblk -f
NAME   FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS
sda                                       
|-sda1                                    
|-sda2                                    
|-sda3                                    
`-sda4                            8.6G     8% /
sdb                                       
sr0                                       

### Check for sr0 related messages in dmesg
bash-5.1# dmesg | grep sr0
[    1.810820] sr 2:0:0:0: [sr0] scsi3-mmc drive: 4x/4x cd/rw xa/form2 tray
[    1.833423] sr 2:0:0:0: Attached scsi CD-ROM sr0

Things I've done to narrow down the issue

Almost universally, resetting, or rebooting after that initial startup will clear the issue.
- reset via WebUI
- Guest VM self-reset via echo b > /proc/sysrq-trigger
- Guest VM reboot via ctrl-alt-del. I suspect a reboot would work too, except I'm not able to login to the cloud image.
Creating the VM via pvesh, but booting via WebUI
- Result: no block device present on sr0
Creating the VM via pvesh, clicking "Regenerate Image" in Cloud-Init in the WebUI, then booting via the WebUI
- Result: no block device present on sr0
Creating the VM via pvesh, but removing/re-adding the cloud-init drive prior to first-boot
- Result: no block device present on sr0
Creating the VM via pvesh, but changing cloud-init settings prior to first-boot
- Result: no block device present on sr0
Creating the VM using a SCSI to attach the cloud-init volume
- Result: This seems to work consistently

Other observations

These are things I noted, but may or may not be relevant to the problem.

Qemu info block command doesn't seem to show any difference before/after reset, apart from locked/not-locked. The only odd part to me is that it's attached to IDE2 instead of a scsi drive, but that might be irrelevant.

Bash:

# info block
pflash0 (#block197): /usr/share/pve-edk2-firmware//OVMF_CODE_4M.fd (raw, read-only)
    Attached to:      /machine/system.flash0
    Cache mode:       writeback

drive-efidisk0 (#block368): json:{"driver": "raw", "size": "540672", "file": {"driver": "host_device", "filename": "/dev/nvme2-lvm/vm-1062-disk-0"}} (raw)
    Attached to:      /machine/system.flash1
    Cache mode:       writeback

drive-ide2 (#block521): /dev/nvme2-lvm/vm-1062-cloudinit (raw, read-only)
    Attached to:      ide2
    Removable device: not locked, tray closed
    Cache mode:       writeback

drive-scsi0 (#block794): /dev/nvme2-lvm/vm-1062-disk-1 (raw)
    Attached to:      scsi0
    Cache mode:       writeback, direct
    Detect zeroes:    unmap

drive-scsi1 (#block988): /dev/nvme2-lvm/vm-1062-disk-2 (raw)
    Attached to:      scsi1
    Cache mode:       writeback, direct
    Detect zeroes:    unmap

What's curious is that after I do the reset/reboot to get the cloud-init drive working, that the UUID on the cloud-init partition (as seen by `lsblk -f`) shows the same timestamp as the VM start task...

Bash:

[root@localhost ~]# lsblk -f
NAME FSTYPE FSVER LABEL UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
sda                                                                     
├─sda1
│                                                                       
├─sda2
│    vfat   FAT16       57E9-4595                             192.8M     3% /boot/efi
├─sda3
│    xfs                abd5d231-f8cb-41d6-9b8a-528febbfb19e  395.3M    20% /boot
└─sda4
     xfs                953ee9e0-c4d0-4f57-8255-d356fc915215    8.4G    10% /
sdb                                                                     
sr0  iso966       cidata
                        2022-10-31-15-59-32-00

When looking at the WebUI task log, it shows the 'VM ### - Start' task was created at 2022-10-31 at 15:59:32, which was the first time the VM was trying to boot. So I think the disk image is fine, it's just not getting mapped into the VM somehow.

I couldn't figure out how to get the disk UUID when booting with init=/bin/bash, and I suspect it's because udev or some other userland subsystem hasn't booted.

I looked to see if there was a difference in qemu process flags at all, but there really wasn't anything different that I could notice

Bash:

root@vdev-3:~# ps auxw | grep 1063
root     1567387 75.7  0.0 68336012 121156 ?     Sl   18:33   0:09 /usr/bin/kvm -id 1063 -name kdevdev-compute-3,debug-threads=on -no-shutdown -chardev socket,id=qmp,path=/var/run/qemu-server/1063.qmp,server=on,wait=off -mon chardev=qmp,mode=control -chardev socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5 -mon chardev=qmp-event,mode=control -pidfile /var/run/qemu-server/1063.pid -daemonize -smbios type=1,uuid=b0e59e01-b833-4da7-ab0a-7cd319a33015 -drive if=pflash,unit=0,format=raw,readonly=on,file=/usr/share/pve-edk2-firmware//OVMF_CODE_4M.fd -drive if=pflash,unit=1,format=raw,id=drive-efidisk0,size=540672,file=/dev/nvme2-lvm/vm-1063-disk-0 -smp 8,sockets=1,cores=8,maxcpus=8 -nodefaults -boot menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg -vnc unix:/var/run/qemu-server/1063.vnc,password=on -cpu host,+kvm_pv_eoi,+kvm_pv_unhalt -m 65536 -device pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e -device pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f -device vmgenid,guid=dd06270a-db04-40c3-809e-23e340a0ca36 -device piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=tablet,bus=uhci.0,port=1 -chardev socket,id=serial0,path=/var/run/qemu-server/1063.serial0,server=on,wait=off -device isa-serial,chardev=serial0 -device VGA,id=vga,bus=pci.0,addr=0x2 -chardev socket,path=/var/run/qemu-server/1063.qga,server=on,wait=off,id=qga0 -device virtio-serial,id=qga0,bus=pci.0,addr=0x8 -device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on -iscsi initiator-name=iqn.1993-08.org.debian:01:5398f791fb65 -drive file=/dev/nvme2-lvm/vm-1063-cloudinit,if=none,id=drive-ide2,media=cdrom,aio=io_uring -device ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2 -device virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5 -drive file=/dev/nvme2-lvm/vm-1063-disk-1,if=none,id=drive-scsi0,discard=on,format=raw,cache=none,aio=io_uring,detect-zeroes=unmap -device scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=101 -drive file=/dev/nvme2-lvm/vm-1063-disk-2,if=none,id=drive-scsi1,discard=on,format=raw,cache=none,aio=io_uring,detect-zeroes=unmap -device scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=1,drive=drive-scsi1,id=scsi1 -netdev type=tap,id=net0,ifname=tap1063i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on -device virtio-net-pci,mac=8A:8C:A1:DD:E4:6A,netdev=net0,bus=pci.0,addr=0x12,id=net0 -machine type=pc+pve0

mira · Nov 2, 2022

This sounds like an issue of Alma Linux rather than PVE. Since the hardware doesn't change between runs/resets, it should find it both times.
Do you see similar issues with other distributions or is it just Alma Linux 9?
Have you tried other cloud image versions of Alma Linux? Could be just this one build that has this issue.

jamesharr · Nov 2, 2022

I tested out a few different cloud images.

What I found intriguing, but also confusing is that

Several distributions exhibited similar behavior, but ever so slightly different
Changing to SeaBios fixed the issue in SOME cases
Increasing the core count made it WORSE in some cases

Images used in tests

Bash:

root@vdev-1:/mnt/pve/ceph-fs/ci-images# ls -lh
total 2.5G
-rw-r--r-- 1 root root 528M Oct 25 15:23 AlmaLinux-8-GenericCloud-latest.x86_64.qcow2
-rw-r--r-- 1 root root 441M Oct 25 18:33 AlmaLinux-9-GenericCloud-latest.x86_64.qcow2
-rw-r--r-- 1 root root 243M Oct 20 19:07 debian-11-genericcloud-amd64.qcow2
-rw-r--r-- 1 root root 635M Nov  2 13:47 jammy-server-cloudimg-amd64.img
-rw-r--r-- 1 root root 678M Nov  2 15:31 kinetic-server-cloudimg-amd64.img

root@vdev-1:/mnt/pve/ceph-fs/ci-images# sha256sum *
e4b215dc807200db3a8934bdd785026c3776d3798e36f830897861074f29fc54  AlmaLinux-8-GenericCloud-latest.x86_64.qcow2
ae715819d7fc280633d3364c53cab05f52970b843afc530e6ce122d1b36b40dc  AlmaLinux-9-GenericCloud-latest.x86_64.qcow2
087508073f0e08ee1fd3d50ad3c3e90e0a3191245c2ed15869ec6664af1d9a14  debian-11-genericcloud-amd64.qcow2
2f35b57775ab2e28f50fa09a18d6ccc10ece5e72c30242a73291b9a0eec1f592  jammy-server-cloudimg-amd64.img
8dc6cbae004d61dcd6098a93eeddebc3ddc7221df6688d1cbbbf0d86909aecf4  kinetic-server-cloudimg-amd64.img

Deployment script

Bash:

# Reset environment
pvesh create /nodes/vdev-1/qemu/1199/status/stop
pvesh delete /nodes/vdev-1/qemu/1199

# Create, start test VM
pvesh create /nodes/vdev-1/qemu \
    --name=cloudinit-test \
    --vmid=1199 \
    --agent=1 \
    --boot=c \
    --bootdisk=scsi0 \
    --scsihw=virtio-scsi-pci \
    --bios=seabios \
    --onboot=1 \
    --serial0=socket \
    --net0=virtio,bridge=vmbr302,firewall=1 \
    --ostype=l26 \
    --citype=nocloud \
    --ciuser=root \
    --cipassword=TestPassword \
    --cpu=host \
    --memory=2048 \
    --cores=1 \
    --efidisk0=local-nvme2-blk:0,efitype=4m,pre-enrolled-keys=1 \
    --scsi0=local-nvme2-blk:0,discard=on,size=100G,import-from=/mnt/pve/ceph-fs/ci-images/$IMAGE \
    --ide2=local-nvme2-blk:cloudinit \
    --ipconfig0=ip=10.39.1.99/21,gw=10.39.0.1 \
    --output-format=json-pretty
pvesh create /nodes/vdev-1/qemu/1199/status/start

Alma Variants
Testing was done by interrupting the Grub process and adding `init=/bin/bash`, then running `lsblk -f` and observing the output.

PASS = cloud-init drive (sr0) detected
FAIL = coud-init drive (sr0 or otherwise) was not detected

Kernel Versions:

Alma8 - `Linux localhost 4.18.0-372.19.1.el8_6.x86_64 #1 SMP Tue Aug 2 13:42:59 EDT 2022 x86_64 GNU/Linux`
Alma9 - `Linux localhost 5.14.0-70.22.1.el9_0.x86_64 #1 SMP PREEMPT Tue Aug 9 11:45:52 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux`

Test Results:

Alma9 - 1 cores - OVMF bios
- cold boot - FAIL
- warm boot - PASS
Alma9 - 2 cores - OVMF bios
- cold boot - FAIL
- warm boot - PASS
Alma9 - 2 cores - OVMF bios
- cold boot - FAIL
- warm boot - PASS
Alma9 - 8 cores - OVMF bios
- cold boot - FAIL
- warm boot - FAIL
- Retried above scenarios 3 or 4 times
Alma9 - 8 cores - SeaBios
- cold boot - PASS
- warm boot - PASS

Alma8 - 2 cores - SeaBIOS (UEFI not supported)
- cold boot - PASS
- warmboot - PASS
Alma8 - 8 cores - SeaBIOS (UEFI not supported)
- cold boot - PASS
- warmboot - PASS

Debian 11 (bullseye) tests
Debian 11 was a bit more tricky. The grub settings don't permit the console to change the boot parameters, so I had to rely on.

FAIL - I could observe that no cloud-init settings were picked up (NIC wasn't initialized, user wasn't set up, no updates run)
PASS - I could observe that the user account was set up.
Warm restart was achieved via the Proxmox WebUI using "Reset"
Once an image PASSed any test I reset the disk image since I didn't have a good way to observe it fail.
Linux cloudinit-test 5.10.0-19-cloud-amd64 #1 SMP Debian 5.10.149-1 (2022-10-17) x86_64 GNU/Linux

Test results
Debian 11 - 1 cores - OVMF
- cold boot - FAIL
- warm boot - PASS
Debian 11 - 8 cores - OVMF
- cold boot - FAIL
- warm boot - FAIL (retried 4 times)
Debian 11 - 8 cores - SeaBios
- cold boot - FAIL
- warm boot - PASS. Getting it to PASS took 2 warm resets in most cases. I replicated this 3 times on a fresh image
- ONCE it worked on cold boot
Debian 11 - 2 cores - SeaBios
- cold boot - FAIL
- warm boot - PASS. Getting it to PASS took 2 warm resets in most cases. Replicated twice
Debian 11 - 1 cores - SeaBios
- cold boot - FAIL
- warm bot - PASS
- Replicated twice

Ubuntu 22.04 (Jammy) variants
Ubuntu had similar problems that Debian did with interrupting the boot process. The test process was the same.

FAIL - I could observe that no cloud-init settings were picked up (NIC wasn't initialized, user wasn't set up, no updates run)
PASS - I could observe that the user account was set up.
Warm restart was achieved via the Proxmox WebUI using "Reset"
Once an image PASSed any test I reset the disk image since I didn't have a good way to observe it fail.

Test results
Ubuntu 22.04 - 1 core - OVMF
- cold boot - FAIL
- warm boot - PASS
- Replicated twice
Ubuntu 22.04 - 8 cores - OVMF
- cold boot - FAIL
- warm boot - PASS
Ubuntu 22.04 - 1 cores - SeaBios
- cold boot - PASS
Ubuntu 22.04 - 8 cores - SeaBios
- cold boot - PASS

Ubuntu 22.10 - 1 core - OVMF
- cold boot - FAIL
- warm boot - PASS
Ubuntu 22.10 - 8 core - OVMF
- cold boot - FAIL
- warm boot - PASS
Ubuntu 22.10 - 1 core - SeaBios
- cold boot - PASS
ubuntu 22.10 - 8 core - SeaBios
- cold boot - PASS

jamesharr · Nov 2, 2022

Thinking through this a bit, I'm really doubting this is a proxmox-specific bug.

In my head I'm imagining it being one of a few things 1) Qemu IDE device bug 2) Kernel bug regarding this IDE devices 3) an issue with how distributions are building their devices. It seems like it *could* be a race condition.

So - what happens if that's the case? If it's not a proxmox-specific issue, what's the appropriate course of caction.
1. Escalate it to the right project -- I'm not sure what that is...
2. It might be a good idea to wave Proxmox users away from using IDE devices (which honestly, should probably be the case). I only used it because it was the default device. Then I got smacked with this bug and I really wanted to track it down.

In any case, I'm not equipped to do that deep level of debugging. It seems like IDE is probably due to fade away over time, so how much effort should one put into solving the problem vs just helping people use "the right setting"?

mira · Nov 3, 2022

Did you add the CD-ROM drive (IDE2) to the boot order under VM -> Options? If not, try adding it and see if it is recognized.
The priority should be below the disk you boot from.

jamesharr · Nov 3, 2022

Interestingly enough, it won't let me. I don't think I can add CloudInit drives to the boot order (see screenshots below). At least it won't let me through the WebUI.

I do want to take a moment to clarify that I'm not actually trying to boot off the CloudInit drive. I think you're aware of that but I wanted to state it just to be clear. I think I get what you're trying to test out, and I'll give it a try via the API and report back.

jamesharr · Nov 3, 2022

Putting the cloud-init drive (ide2) in the boot-order seemed to do the trick. This is available via the API, but not via the WebUI. Were you thinking that having it in the boot order initializes the IDE device?

Testing Results

Note that on all subsequent tests below, I made some changes compared to tests above:

I removed the --bootdisk=scsi0 parameter (deprecated)
The --boot= parameter is upgraded to the new format. I was previously using a deprecated format.

I don't believe these changes had an effect on the test, however I'm mentioning it for completeness.

Test Results - With cloud-init drive in boot order

Bash:

    --boot='order=scsi0;ide2'

Alma9 - OVMF - 1 core - ide2 in boot path
- cold boot - PASS
Alma9 - OVMF - 8 cores - ide2 in boot path
- cold boot - PASS

Debian 11 - OVMF - 8 cores - ide2 in boot path
- cold boot - PASS
- Note: Cloud-init on debian seems to pick up really late in the boot process (up to 70-80 seconds after boot). I think this is a distribution-specific problem though. This happens when I put the cloud-init drive on scsi15. I think some of my previous tests with Debian may be innaccurate becuase of this.

Ubuntu 22.04 (jammy) - OVMF - 8 cores - ide2 in boot path
- cold boot - PASS
Ubuntu 22.10 (kinetic) - OVMF - 8 cores - ide2 in boot path
- cold boot - PASS

Test Results - Another IDE drive in boot order

Bash:

    --boot='order=scsi0;ide0' \
    --ide0=none,media=cdrom \

Alma9 - OVMF - 8 cores - ide0 in boot path
- cold boot - FAIL
- warm boot - PASS
- replicated 3 times

mira · Nov 3, 2022

Yes, OVMF requires disks to be in the boot order to be seen by it. So it gets initialized early.
At least we know now that this workaround helps with OVMF. Thank you for testing it!

ryan4yin · Dec 2, 2022

jamesharr said:
Interestingly enough, it won't let me. I don't think I can add CloudInit drives to the boot order (see screenshots below). At least it won't let me through the WebUI.

I do want to take a moment to clarify that I'm not actually trying to boot off the CloudInit drive. I think you're aware of that but I wanted to state it just to be clear. I think I get what you're trying to test out, and I'll give it a try via the API and report back.

View attachment 42893

View attachment 42894

I have the cloudinit's `ide2` option in my Web UI's boot order tab, my pve cluster's version is 7.2

jamesharr · Dec 14, 2022

ryan4yin said:
I have the cloudinit's `ide2` option in my Web UI's boot order tab, my pve cluster's version is 7.2

That's probably my bad -- I see it there now.

I think I got ahead of myself when writing the post.

Search

Search

IDE cloud-init volumes are not usable when cold-booting a VM

jamesharr

Member

Reproducing the issue

Things I've done to narrow down the issue

Other observations

mira

Proxmox Staff Member

jamesharr

Member

jamesharr

Member

mira

Proxmox Staff Member

jamesharr

Member

jamesharr

Member

Testing Results

Test Results - With cloud-init drive in boot order

Test Results - Another IDE drive in boot order

mira

Proxmox Staff Member

ryan4yin

Member

Attachments

jamesharr

Member

IDE cloud-init volumes are not usable when cold-booting a VM

Member

Reproducing the issue​

Things I've done to narrow down the issue​

Other observations​

Proxmox Staff Member

Member

Member

Proxmox Staff Member

Member

Member

Testing Results​

Test Results - With cloud-init drive in boot order​

Test Results - Another IDE drive in boot order​

Proxmox Staff Member

Member

Attachments

Member

Reproducing the issue

Things I've done to narrow down the issue

Other observations

Testing Results

Test Results - With cloud-init drive in boot order

Test Results - Another IDE drive in boot order