I am running into a weird issue that I hope someone else has solved before. I am using Ansible to deploy a Debian VM with a few customizations via cloud-init. Basically, I create a template, do a full clone, resize the disk, and start the VM. This causes a kernel panic.

I then tried doing the full clone without resizing the disk first and booting it normally. This works. But if I then do a qm shutdown, resize the disk, and start it again, I get a kernel panic again. However, if I "stop" the VM and start it again, it boots fine and the disk is resized appropriately in the OS.

Why does this work? And is there something I can do differently in my Ansible playbook to get this right? Here is my cloud-init user template:
Code:
#cloud-config
growpart:
  mode: auto
  devices: ['/dev/sda1'] # Adjust based on your disk (e.g., /dev/vda1 for virtio)
  ignore_growroot_disabled: true
resize_rootfs: true
fqdn: "{{ new_vm_name }}.{{ new_vm_domain }}"
hostname: {{ new_vm_name }}
prefer_fqdn_over_hostname: true
create_hostname_file: true
chpasswd:
  expire: False
# This section enables SUDO
users:
  - name: {{ ciuser }}
    sudo: ALL=(ALL) ALL
    lock_passwd: false
    passwd: {{ encrypted_password }}
    shell: /bin/bash
#enables ssh password auth
ssh_pwauth: True
# Place distro specific packages here
packages:
  - qemu-guest-agent
  - tmux
  - sudo
  - openssl
package_update: true
package_upgrade: true
package_reboot_if_required: true
runcmd:
  - [ systemctl, daemon-reload ]
  - [ systemctl, enable, qemu-guest-agent.service ]
  - [ systemctl, start, --no-block, qemu-guest-agent.service ]
And here is my playbook:
Code:
- name: Download cloud-init Debian image
  ansible.builtin.get_url:
    url: https://cloud.debian.org/images/cloud/{{ debian_codename }}/daily/latest/debian-{{ debian_version }}-generic-amd64-daily.raw
    dest: /var/lib/vz/images
  tags:
    - download
- name: make sure libguestfs-tools are installed
  ansible.builtin.apt:
    name: libguestfs-tools
  tags:
    - clone
    - download
- name: Remove Template
  proxmox_kvm:
    vmid: "{{ clone_vm_id }}"
    api_user: root@pam
    api_password: "{{ password }}"
    api_host: pve-v2
    state: absent
  tags:
    - download
- name: Prepare vm for template
  command: "{{ item }}"
  args:
    chdir: /var/lib/vz/images
  with_items:
    - "qm create {{ clone_vm_id }} --memory 8192 --net0 virtio,bridge=vmbr0 --scsihw virtio-scsi-single"
    - "qm set {{ clone_vm_id }} --scsi0 vmstor:0,import-from=/var/lib/vz/images/debian-{{ debian_version }}-generic-amd64-daily.raw"
    - "qm set {{ clone_vm_id }} --ide2 vmstor:cloudinit"
    - "qm set {{ clone_vm_id }} --boot order=scsi0"
  ignore_errors: true
  tags:
    - download
- name: Copy user-data to Proxmox node
  ansible.builtin.template:
    src: "templates/{{ user_data_file }}"
    remote_src: false
    dest: "{{ user_data_file_storage_path }}/user-data.yaml"
    owner: root
    group: root
    mode: '0644'
  tags:
    - clone
- name: Copy file with owner and permissions
  ansible.builtin.copy:
    src: "templates/{{ network_data_file }}"
    dest: "{{ user_data_file_storage_path }}/network.yaml"
    owner: root
    group: root
    mode: '0644'
  tags:
    - clone
- name: Attach user-data to VM and convert to template
  command: "{{ item }}"
  with_items:
    - 'qm set {{ clone_vm_id }} --cicustom "user={{ user_data_file_storage }}:snippets/user-data.yaml,network={{ user_data_file_storage }}:snippets/network.yaml"'
    - "qm template {{ clone_vm_id }}"
  tags:
    - clone
    - download
- name: Clone VM and Start
  command: "{{ item }}"
  args:
    chdir: /var/lib/vz/images
  with_items:
    - "qm clone {{ clone_vm_id }} {{ new_vm_id }} --name {{ new_vm_name }} --full"
    - "qm set {{ new_vm_id }} --agent 1"
    - "qm start {{ new_vm_id }}"
  tags:
    - clone
    - download
- name: Pause for 45 seconds to complete first boot
  ansible.builtin.pause:
    seconds: 45
- name: Shutdown VM, Resize and Start again
  command: "{{ item }}"
  args:
    chdir: /var/lib/vz/images
  with_items:
    - "qm shutdown {{ new_vm_id }}"
  tags:
    - clone
    - download
- name: Pause for 20 seconds to complete shutdown
  ansible.builtin.pause:
    seconds: 20
- name: Resize disk and start VM again
  command: "{{ item }}"
  args:
    chdir: /var/lib/vz/images
  with_items:
    - "qm resize {{ new_vm_id }} scsi0 +{{ resize_disk }}G"
    - "qm start {{ new_vm_id }}"
  tags:
    - clone
    - download
Anyone have any ideas why the VM stop command fixes this but a graceful shutdown and start does not? Ideally, I want to fix my config and playbook to account for this issue.
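
For reference, here is a minimal sketch of the shutdown, resize, and start sequence with the fixed 20-second pause replaced by a poll of `qm status`. It only makes sure the resize runs after Proxmox reports the VM as stopped. Whether that timing has anything to do with the kernel panic is an assumption and has not been verified. The `vm_status` variable name and the retry and delay values are placeholders. `new_vm_id` and `resize_disk` are the variables from the playbook above.

```yaml
# Sketch only: assumes the same variables as the playbook above.
- name: Request a graceful shutdown
  ansible.builtin.command: "qm shutdown {{ new_vm_id }}"

- name: Wait until Proxmox reports the VM as stopped
  ansible.builtin.command: "qm status {{ new_vm_id }}"
  register: vm_status            # placeholder variable name
  until: "'status: stopped' in vm_status.stdout"
  retries: 30                    # arbitrary: up to 30 checks
  delay: 5                       # arbitrary: 5 seconds between checks
  changed_when: false            # a status query changes nothing

- name: Resize disk and start VM again
  ansible.builtin.command: "{{ item }}"
  loop:
    - "qm resize {{ new_vm_id }} scsi0 +{{ resize_disk }}G"
    - "qm start {{ new_vm_id }}"
```

If the wait task runs out of retries, the play fails at that point instead of resizing a disk that may still be in use, which at least makes the timing explicit.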