I just ran the upgrade to 5.4 and now I am running into an issue with Ceph and the cloud-init images. Every time I migrate or try to start a VM with a cloud-init image, it says the image already exists and fails to start or migrate. If I go into rbd and remove the image, it seems to work fine.
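For anyone who needs the manual workaround in the meantime, here is a rough sketch of it as a shell helper. It is not an official fix. The pool name `mypool` is a placeholder, the image name pattern `vm-<vmid>-cloudinit` is taken from the error messages in this thread, and an external Ceph cluster may need extra connection options (monitor address, keyring) that are not shown here. By default it only prints the command it would run.

```shell
#!/bin/sh
# Sketch of the manual workaround: remove a stale cloud-init image from RBD.
# Assumptions: the pool name is supplied by you ("mypool" is a placeholder)
# and the image is named vm-<vmid>-cloudinit, as in the error output.

cloudinit_image_name() {
    # $1 = VM ID
    printf 'vm-%s-cloudinit\n' "$1"
}

remove_stale_cloudinit() {
    # $1 = pool, $2 = VM ID
    # DRY_RUN defaults to 1 (print only); set DRY_RUN=0 to really call rbd.
    pool=$1
    image=$(cloudinit_image_name "$2")
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "rbd rm ${pool}/${image}"
    else
        rbd rm "${pool}/${image}"
    fi
}
```

Calling `remove_stale_cloudinit mypool 300` prints the `rbd rm` command for VM 300 so you can check it before running it with `DRY_RUN=0`.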
Is there any update on this bug? It was logged in bugzilla 5 days ago, but there are no further updates or estimated fix date.
This is a critical bug which renders Proxmox environments using cloud-init and Ceph fundamentally broken.
A bug fix (whether temporary or otherwise), or a roll back to a previously working version should be provided ASAP.
Users in this configuration currently can't shut down and restart a VM without manual intervention to remove the cloud-init disk from RBD. High Availability (which is one reason many people use Ceph) is also broken.
If an HA node were to crash, VMs would not migrate to a live node, nor would they restart on the original node.
Additionally, unless administrators have encountered this bug themselves, they are likely unaware of its existence and of the fact that their HA VMs are no longer HA.
However, once the respective VM under HA has been shut down, it is unable to boot again. The following error is shown:
task started by HA resource agent
rbd: create error: (17) File exists
TASK ERROR: error with cfs lock 'storage-mystorage': rbd create vm-300-cloudinit' error: rbd: create error: (17) File exists
Another issue: previously we didn't need to fill in the host's "search domain" field, but now you have to, otherwise an error is raised around these lines:
my $host_resolv_conf = PVE::INotify::read_file('resolvconf');
my $searchdomains = [
Interesting. I have an external Ceph cluster, and both the cloud-init drive and the VM disk are on Ceph.
I did some more tests and still see the same issue. The VM is an Ubuntu one, but it happens for all of them.
It only happens if you do the HA start.
Live or shutdown migration works fine.
So if you create a VM, then add it to HA and issue a start, you get that error.
If you create a VM and start it, then add it to HA, it starts OK.
The issue seems to be only with the HA start. It's almost as if it calls some other code that either bypasses Cloudinit.pm
or has its own, simpler code.
I used a hyper-converged cluster to test it, and I used HA to start and stop the VM repeatedly. I still could not reproduce it. I even migrated it in between to see if that makes a difference; none whatsoever.
What Ceph version are you running? Do you use krbd?
Steps to reproduce:
1) Shut down the existing VM with HA using the GUI.
2) Start the VM with HA using the GUI.
The VM with HA is unable to boot, and the task viewer shows the following log output:
Use of uninitialized value in split at /usr/share/perl5/PVE/QemuServer/Cloudinit.pm line 94.
rbd: create error: (17) File exists
TASK ERROR: error with cfs lock 'storage-2': rbd create vm-300-cloudinit' error: rbd: create error: (17) File exists
The error at line 94 can be resolved by adding a DNS search domain value, but the VM is still unable to boot due to error (17).
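To check whether a node is likely to hit the line 94 warning, something like the following could be used. This is only a sketch: it assumes the host's resolver settings live in `/etc/resolv.conf` and that a missing `search` entry is what leaves the value uninitialized, which is what the behaviour above suggests but is not confirmed from the code. The function name is made up for this example.

```shell
#!/bin/sh
# Sketch: report whether a resolv.conf file contains a "search" entry.
# Assumption: a missing search domain is what triggers the
# "uninitialized value in split" warning reported above.

has_search_domain() {
    # $1 = path to resolv.conf (defaults to /etc/resolv.conf)
    # Succeeds only if a "search" line with at least one domain is present.
    grep -Eq '^[[:space:]]*search[[:space:]]+[^[:space:]]' "${1:-/etc/resolv.conf}"
}
```

For example, `has_search_domain || echo "no search domain set on this node"` can be run on each node before starting the VM.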
Tested:
Remove HA: still unable to boot, similar log output shown.
Remove the cloud-init drive, or put the cloud-init drive on NFS/local disk, then reboot: OK.
So I believe this only happens when the cloud-init drive is on Ceph.
tailf /var/log/daemon.log
Apr 24 16:45:01 node1.local.host systemd[1]: Started Proxmox VE replication runner.
Apr 24 16:45:37 node1.local.host pvedaemon[1890943]: start VM 300: UPID:node1.local.host:001CDA7F:010F622F:5CC02231:qmstart:300:root@pam:
Apr 24 16:45:37 rk1-4u2-106C6a pvedaemon[1890943]: Use of uninitialized value in split at /usr/share/perl5/PVE/QemuServer/Cloudinit.pm line 94.
Apr 24 16:45:37 node1.local.host pvedaemon[1890943]: error with cfs lock 'storage-2': rbd create vm-300-cloudinit' error: rbd: create error: (17) File exists
Apr 24 16:46:00 node1.local.host systemd[1]: Starting Proxmox VE replication runner...
Apr 24 16:46:01 node1.local.host systemd[1]: Started Proxmox VE replication runner.
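Since affected HA VMs fail silently until someone tries to start them, a quick scan of the log for this error can show which VM IDs have already hit it. The sketch below matches the exact message format from the log lines above; the function name is made up, and the default log path is the one used in the `tailf` command.

```shell
#!/bin/sh
# Sketch: list VM IDs whose cloud-init image creation failed with
# "(17) File exists", based on the message format shown in this thread.

affected_vmids() {
    # $1 = log file (defaults to /var/log/daemon.log)
    # Prints each affected VM ID once, sorted numerically.
    sed -n "s/.*rbd create vm-\([0-9][0-9]*\)-cloudinit' error:.*(17) File exists.*/\1/p" \
        "${1:-/var/log/daemon.log}" | sort -un
}
```

Run against the excerpt above, this would print `300`. It only finds VMs that have already failed to start at least once, so it is no substitute for testing an HA restart.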