Debian 11 Cloud Image Networking broken

Hi,

I eventually solved this issue by overwriting the included ens18 dhcp statement and setting the network config to manual.
It took me countless hours to find the root cause for this issue as this isn't documented anywhere.

The Debian cloud team seems to assume there's always a DHCP server running, so they included this config by default.

Code:
virt-customize -a $IMAGE_NAME --install qemu-guest-agent --install resolvconf --update --run-command 'echo "auto ens18" >> /etc/network/interfaces.d/ens18' --run-command 'echo "iface ens18 inet manual" >> /etc/network/interfaces.d/ens18'
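To check whether an image or a running guest still carries a DHCP stanza for an interface, a small helper along these lines may be useful. This is only a sketch: it assumes ifupdown-style files in the directory given as the first argument (defaulting to /etc/network/interfaces.d), and the function name list_dhcp_ifaces is made up for illustration.

```shell
#!/bin/bash
# Print the name of every interface that an ifupdown directory configures via DHCP.
# list_dhcp_ifaces is a hypothetical helper, not part of Debian or Proxmox.
list_dhcp_ifaces() {
    local dir="${1:-/etc/network/interfaces.d}"
    # Match stanzas such as "iface ens18 inet dhcp" and print the interface name.
    cat "$dir"/* 2>/dev/null |
        awk '$1 == "iface" && $3 == "inet" && $4 == "dhcp" { print $2 }'
}
```

Calling `list_dhcp_ifaces` inside the guest before and after the workaround should show whether ens18 is still listed.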
To be fair, you can set your DHCP settings in cloud init, and if you don't use DHCP, you probably shouldn't use the option. So I wouldn't blame the debian cloud team, when you clearly had a choice. I know I had, and I made the wrong one as stated.
I also always prefer correct configs outside the VM pre-initialization rather than tinkering with a default image that actually works if used correctly. But as all things go, ymmv.
 
No, that won't work, as Debian tries to bring up the ens18 interface before the cloud-init configuration is applied.

So even if you completely disable DHCP via cloud-init, the VM still gets stuck during boot for 30-60s as it tries to obtain an IP via DHCP.
 
The Debian cloud image ships a helper script that assumes DHCP is always used to configure a NIC. The helper is /etc/network/cloud-ifupdown-helper. It uses a template file, /etc/network/cloud-interfaces-template, to generate the network configuration file. Look at that template and you will see that DHCP is always set as the NIC configuration method.
This behavior makes configuration via cloud-init ineffective when you want a static IP configuration.
To work around that, I modified the helper script by adding "exit 0" as its first instruction. That solved my problem, and cloud-init now works fine with a static IP address configuration.
The final solution is to build our own cloud image.
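The "exit 0" change to the helper could be scripted roughly as follows. This is a sketch under assumptions: it expects the first line of the helper to be a shebang, relies on GNU sed, and the function name disable_helper is hypothetical.

```shell
#!/bin/bash
# Insert "exit 0" as the first instruction of a script, right after its first line.
# disable_helper is a hypothetical name; the default path is the helper named in this thread.
disable_helper() {
    local helper="${1:-/etc/network/cloud-ifupdown-helper}"
    # Do nothing if the script has already been patched.
    if [ "$(sed -n '2p' "$helper")" = "exit 0" ]; then
        return 0
    fi
    cp -p "$helper" "$helper.orig"   # keep a backup of the original
    sed -i '1a exit 0' "$helper"     # GNU sed: append a line after line 1
}
```

Run as root inside the guest, `disable_helper` patches the default path. The same sed expression could probably be passed to virt-customize via --run-command to patch the image before first boot, though that is untested here.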
 
Why not use a network config or metadata with version 2?
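For reference, a version 2 network config with a static address might look like the sketch below, wrapped in a small function that writes it to a file. The interface name ens18 comes from this thread; every address is a placeholder from the documentation ranges, and the function name write_network_v2 is invented for illustration. Whether this avoids the DHCP delay described above is not confirmed here.

```shell
#!/bin/bash
# Write a cloud-init network config (version 2) with a static IPv4 address.
# write_network_v2 is a hypothetical helper; all addresses are placeholders.
write_network_v2() {
    local out="$1"
    cat > "$out" <<'EOF'
version: 2
ethernets:
  ens18:
    dhcp4: false
    addresses:
      - 192.0.2.50/24
    gateway4: 192.0.2.1
    nameservers:
      addresses:
        - 192.0.2.53
EOF
}
```

On Proxmox the file would typically go into a snippets directory (for example `write_network_v2 /var/lib/vz/snippets/net-static.yaml`, assuming the default "local" storage path) and be attached with `qm set <vmid> --cicustom "network=local:snippets/net-static.yaml"`.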
 
I just hit this kind of issue with PVE 8.1.3 and Debian 11 genericcloud image 2024-01-04.

Precisely:
- I downloaded the debian 11 genericcloud image in qcow2
- imported into a VM by using qm commands (see below for the script)
- I customized the cloud image using a yaml file to get the desired result, without knowing too much about the yaml and cloud init syntax ;) (yaml file shown below)

Bash:
#!/bin/bash

#Create template
#args:
# vm_id
# vm_name
# file name in the current directory
function create_template() {
    #Print all of the configuration
    echo "Creating template $2 ($1)"

    #Create new VM
    #Feel free to change any of these to your liking
    qm create $1 --name $2 --ostype l26 --hotplug disk,usb --tablet 0
    #Set networking to default bridge
    qm set $1 --net0 virtio,bridge=vmbr4000,mtu=1400
    #Set display to serial
    qm set $1 --serial0 socket --vga serial0
    #Set memory, cpu, type defaults
    #If you are in a cluster, you might need to change cpu type
    qm set $1 --memory 1024 --cores 4 --cpu host
    #Set boot device to new file
    qm set $1 --scsi0 ${storage}:0,import-from="$3",discard=on,ssd=1
    #Boot from scsi0 and use the VirtIO SCSI controller (virtio-scsi-pci)
    qm set $1 --boot order=scsi0 --scsihw virtio-scsi-pci
    #Enable Qemu guest agent in case the guest has it available
    qm set $1 --agent enabled=1,fstrim_cloned_disks=1
    #Add cloud-init device
    qm set $1 --ide2 ${storage}:cloudinit
    #Set CI ip config
    #IP6 = auto means SLAAC (a reliable default with no bad effects on non-IPv6 networks)
    #IP = DHCP means what it says, so leave that out entirely on non-IPv4 networks to avoid DHCP delays
    qm set $1 --ipconfig0 "ip=dhcp" --nameserver 10.13.14.250
    #Import the ssh keyfile
    qm set $1 --sshkeys ${ssh_keyfile}
    #If you want to do password-based auth instead
    #Then use this option and comment out the line above
    qm set $1 --cipassword verystrongandsecretpasswordbaby
    #Add the user, not relevant for us
    # qm set $1 --ciuser ${username}
    # Resize the disk to 8G, a reasonable minimum. You can expand it more later.
    # If the disk is already bigger than 8G, this will fail, and that is okay.
    qm disk resize $1 scsi0 8G
    # Custom actions
    qm set $1 --cicustom "user=local:snippets/rabbitmq.yaml"
    # Update cloudinit drive
    qm cloudinit update $1
    #Make it a template
    qm template $1

    #Remove file when done, prefer not yet
    #rm $3
}

#Path to your ssh authorized_keys file
export ssh_keyfile=/var/lib/vz/template/cloud-images/id_rsa_pongraczi.pub

#Name of your storage
export storage=local-zfs

## Debian
#Bullseye (11) (oldstable) - fix your PATH!!!!
wget "https://cloud.debian.org/images/cloud/bullseye/latest/debian-11-genericcloud-amd64.qcow2"
create_template 910  "deb11gencloud" "/var/lib/vz/template/cloud-images/debian-11-genericcloud-amd64.qcow2"

YAML:
#cloud-config
#ssh_pwauth: true
#disable_root: false
#chpasswd:
#  list: |
#    root:othergoodpassword
#  expire: false

write_files:
  - path: /etc/sudoers.d/cloud-init
    content: |
      Defaults !requiretty
  - path: /etc/rabbitmq/enabled_plugins
    content: |
        [rabbitmq_amqp1_0,rabbitmq_auth_backend_ldap,rabbitmq_federation,rabbitmq_federation_management,rabbitmq_management,rabbitmq_mqtt,rabbitmq_peer_discovery_common,rabbitmq_prometheus,rabbitmq_shovel,rabbitmq_shovel_management,rabbitmq_stomp,rabbitmq_stream,rabbitmq_stream_management,rabbitmq_web_mqtt,rabbitmq_web_mqtt_examples,rabbitmq_web_stomp].
  - path: /etc/apt/sources.list.d/rabbitmq
    content: |
      ## Provides modern Erlang/OTP releases from a Cloudsmith mirror
      ##
      deb [signed-by=/usr/share/keyrings/rabbitmq.E495BB49CC4BBE5B.gpg] https://ppa1.novemberain.com/rabbitmq/rabbitmq-erlang/deb/debian bullseye main
      deb-src [signed-by=/usr/share/keyrings/rabbitmq.E495BB49CC4BBE5B.gpg] https://ppa1.novemberain.com/rabbitmq/rabbitmq-erlang/deb/debian bullseye main
      
      deb [signed-by=/usr/share/keyrings/rabbitmq.E495BB49CC4BBE5B.gpg] https://ppa2.novemberain.com/rabbitmq/rabbitmq-erlang/deb/debian bullseye main
      deb-src [signed-by=/usr/share/keyrings/rabbitmq.E495BB49CC4BBE5B.gpg] https://ppa2.novemberain.com/rabbitmq/rabbitmq-erlang/deb/debian bullseye main
      
      ## Provides RabbitMQ from a Cloudsmith mirror
      ##
      deb [signed-by=/usr/share/keyrings/rabbitmq.9F4587F226208342.gpg] https://ppa1.novemberain.com/rabbitmq/rabbitmq-server/deb/debian bullseye main
      deb-src [signed-by=/usr/share/keyrings/rabbitmq.9F4587F226208342.gpg] https://ppa1.novemberain.com/rabbitmq/rabbitmq-server/deb/debian bullseye main
      
      # another mirror for redundancy
      deb [signed-by=/usr/share/keyrings/rabbitmq.9F4587F226208342.gpg] https://ppa2.novemberain.com/rabbitmq/rabbitmq-server/deb/debian bullseye main
      deb-src [signed-by=/usr/share/keyrings/rabbitmq.9F4587F226208342.gpg] https://ppa2.novemberain.com/rabbitmq/rabbitmq-server/deb/debian bullseye main
  - path: /etc/systemd/system/rabbitmq-server.service.d/limits.conf
    content: |
      [Service]
      LimitNOFILE=64000
  - path: /root/configure.sh
    content: |
      #!/usr/bin/env bash
      apt-get install curl gnupg apt-transport-https -y
      curl -1sLf "https://keys.openpgp.org/vks/v1/by-fingerprint/0A9AF2115F4687BD29803A206B73A36E6026DFCA" | gpg --dearmor | tee /usr/share/keyrings/com.rabbitmq.team.gpg > /dev/null
      curl -1sLf https://github.com/rabbitmq/signing-keys/releases/download/3.0/cloudsmith.rabbitmq-erlang.E495BB49CC4BBE5B.key | gpg --dearmor | tee /usr/share/keyrings/rabbitmq.E495BB49CC4BBE5B.gpg > /dev/null
      curl -1sLf https://github.com/rabbitmq/signing-keys/releases/download/3.0/cloudsmith.rabbitmq-server.9F4587F226208342.key | gpg --dearmor | tee /usr/share/keyrings/rabbitmq.9F4587F226208342.gpg > /dev/null
      mv /etc/apt/sources.list.d/rabbitmq  /etc/apt/sources.list.d/rabbitmq.list
      apt-get update -y
      apt-get install -y erlang-base    erlang-asn1 erlang-crypto erlang-eldap erlang-ftp erlang-inets    erlang-mnesia erlang-os-mon erlang-parsetools erlang-public-key    erlang-runtime-tools erlang-snmp erlang-ssl    erlang-syntax-tools erlang-tftp erlang-tools erlang-xmerl
      apt-get install rabbitmq-server -y --fix-missing
      service rabbitmq-server start
      rabbitmqctl add_user rabbituser belongingpassword
      rabbitmqctl set_user_tags rabbituser administrator

fqdn: rabbitmq.example.tld

ssh_authorized_keys:
  - ssh-rsa longkey pongi@ipc5

package_update: true
package_upgrade: true
packages:
  - qemu-guest-agent
  - mc

runcmd:
  - systemctl enable qemu-guest-agent
  - systemctl start qemu-guest-agent
  - bash /root/configure.sh

That said, it does what I need, but the whole process had some side effects on the network stack, specifically when the cloned VM started for the first time.

The experience:
- First boot: the VM started and did everything I needed to bring up a fresh RabbitMQ appliance. Everything is in place: qemu-guest-agent is running, mc is installed, and RabbitMQ is running with all necessary plugins enabled out of the box.
- The surprise: I got two interfaces (ens18 and eth0) for the same MAC, although the VM has a single virtio NIC (net0), and each got a different IP address via DHCP.

...
[ 0.699007] virtio_net virtio3 ens18: renamed from eth0
...
[ 1.916191] virtio_net virtio3 eth0: renamed from ens18

[ OK ] Finished Initial cloud-init job (pre-networking).
[ OK ] Reached target Network (Pre).
[DEPEND] Dependency failed for ifup for ens18.
Starting Raise network interfaces...
[ OK ] Finished Raise network interfaces.
[ OK ] Reached target Network.
Starting Initial cloud-ini… (metadata service crawler)...
[ 4.327976] cloud-init[587]: Cloud-init v. 20.4.1 running 'init' at Thu, 01 Feb 2024 09:53:51 +0000. Up 4.32 seconds.
[ 4.332521] cloud-init[587]: ci-info: +++++++++++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++++++++++
[ 4.333519] cloud-init[587]: ci-info: +--------+------+------------------------------+---------------+--------+-------------------+
[ 4.334476] cloud-init[587]: ci-info: | Device | Up | Address | Mask | Scope | Hw-Address |
[ 4.335443] cloud-init[587]: ci-info: +--------+------+------------------------------+---------------+--------+-------------------+
[ 4.336416] cloud-init[587]: ci-info: | eth0 | True | 10.0.1.115 | 255.255.255.0 | global | bc:24:11:cf:65:a7 |
[ 4.337383] cloud-init[587]: ci-info: | eth0 | True | 10.0.1.116 | 255.255.255.0 | global | bc:24:11:cf:65:a7 |
[ 4.338336] cloud-init[587]: ci-info: | eth0 | True | fe80::be24:11ff:fecf:65a7/64 | . | link | bc:24:11:cf:65:a7 |
[ 4.339349] cloud-init[587]: ci-info: | lo | True | 127.0.0.1 | 255.0.0.0 | host | . |
[ 4.340366] cloud-init[587]: ci-info: | lo | True | ::1/128 | . | host | . |
[ 4.341320] cloud-init[587]: ci-info: +--------+------+------------------------------+---------------+--------+-------------------+
[ 4.342254] cloud-init[587]: ci-info: +++++++++++++++++++++++++++Route IPv4 info++++++++++++++++++++++++++++
[ 4.343042] cloud-init[587]: ci-info: +-------+-------------+----------+---------------+-----------+-------+
[ 4.343833] cloud-init[587]: ci-info: | Route | Destination | Gateway | Genmask | Interface | Flags |
[ 4.344653] cloud-init[587]: ci-info: +-------+-------------+----------+---------------+-----------+-------+
[ 4.345455] cloud-init[587]: ci-info: | 0 | 0.0.0.0 | 10.0.1.5 | 0.0.0.0 | eth0 | UG |
[ 4.346394] cloud-init[587]: ci-info: | 1 | 10.0.1.0 | 0.0.0.0 | 255.255.255.0 | eth0 | U |
[ 4.347363] cloud-init[587]: ci-info: +-------+-------------+----------+---------------+-----------+-------+
[ 4.348336] cloud-init[587]: ci-info: +++++++++++++++++++Route IPv6 info+++++++++++++++++++
[ 4.349046] cloud-init[587]: ci-info: +-------+-------------+---------+-----------+-------+
[ 4.349743] cloud-init[587]: ci-info: | Route | Destination | Gateway | Interface | Flags |
[ 4.350435] cloud-init[587]: ci-info: +-------+-------------+---------+-----------+-------+
[ 4.351161] cloud-init[587]: ci-info: | 1 | fe80::/64 | :: | eth0 | U |
[ 4.351855] cloud-init[587]: ci-info: | 3 | local | :: | eth0 | U |
[ 4.352566] cloud-init[587]: ci-info: | 4 | multicast | :: | eth0 | U |

# ifconfig
ens18: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1400
inet 10.0.1.116 netmask 255.255.255.0 broadcast 10.0.1.255
ether bc:24:11:cf:65:a7 txqueuelen 1000 (Ethernet)

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1400
inet 10.0.1.115 netmask 255.255.255.0 broadcast 10.0.1.255
inet6 fe80::be24:11ff:fecf:65a7 prefixlen 64 scopeid 0x20<link>
ether bc:24:11:cf:65:a7 txqueuelen 1000 (Ethernet)
RX packets 10668 bytes 79959472 (76.2 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 6561 bytes 478910 (467.6 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

# ip a
...
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc pfifo_fast state UP group default qlen 1000
link/ether bc:24:11:cf:65:a7 brd ff:ff:ff:ff:ff:ff
altname enp0s18
altname ens18
inet 10.0.1.115/24 brd 10.0.1.255 scope global dynamic eth0
valid_lft 32182sec preferred_lft 32182sec
inet 10.0.1.116/24 brd 10.0.1.255 scope global secondary dynamic ens18
valid_lft 32183sec preferred_lft 32183sec
inet6 fe80::be24:11ff:fecf:65a7/64 scope link
valid_lft forever preferred_lft forever

- When I rebooted the VM, the YAML did not run again (as expected, I guess), and I got only one NIC (eth0, renamed from ens18, etc.).

ip a:
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc pfifo_fast state UP group default qlen 1000
link/ether bc:24:11:cf:65:a7 brd ff:ff:ff:ff:ff:ff
altname enp0s18
altname ens18
inet 10.0.1.115/24 brd 10.0.1.255 scope global dynamic eth0
valid_lft 43192sec preferred_lft 43192sec
inet6 fe80::be24:11ff:fecf:65a7/64 scope link
valid_lft forever preferred_lft forever
ifconfig:
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1400
inet 10.0.1.115 netmask 255.255.255.0 broadcast 10.0.1.255
inet6 fe80::be24:11ff:fecf:65a7 prefixlen 64 scopeid 0x20<link>
ether bc:24:11:cf:65:a7 txqueuelen 1000 (Ethernet)
....

Any idea what causes this and how to avoid this kind of anomaly? Thanks!
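To spot the first-boot situation quickly (one interface holding two DHCP leases), the one-line output of `ip -o -4 addr show` can be filtered as in this sketch. The function name multi_addr_ifaces is hypothetical, and the filter only counts IPv4 addresses per interface; it does not explain where the second lease comes from.

```shell
#!/bin/bash
# Read `ip -o -4 addr show` output on stdin and print every interface
# that carries more than one IPv4 address, followed by the address count.
# multi_addr_ifaces is a hypothetical helper name.
multi_addr_ifaces() {
    awk '$3 == "inet" { n[$2]++ } END { for (i in n) if (n[i] > 1) print i, n[i] }'
}
```

Usage: `ip -o -4 addr show | multi_addr_ifaces`. In the first-boot output above, the second address carries the label ens18, which suggests it was added while the interface still had its old name.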
 
