Proxmox node turned into a VM after reboot

nsky.link

New Member
Feb 3, 2022
Please help me understand how this could happen and why. How can I return the Proxmox node to its original state?
The Proxmox node has the hostname "node7" (as an example) and is part of a 7-node cluster. It uses ZFS RAID10. Two virtual machines (vm3 and vm4), deployed with Terraform, were running on the node.
What was done before the problem was noticed:
- restarted the network on the node
- shut down both VMs on the node
- ran the "reboot" command in the console

After that, I could no longer log in to the node via SSH. Over IPMI I saw a console login prompt, but under the hostname vm4.

I noticed that the "authorized_keys" file had been overwritten and that the Proxmox and HAProxy services were down. All other services, such as Zabbix, Nagios, etc., were working fine.

When I restart pvesr.service, it fails with the error "vm4 pvesr: ipcc_send_rec[1] failed: Connection refused".

The network is working normally.

From the web interface on "node6" I can see that "node7" is unavailable (grey).


I have also attached a screenshot from IPMI taken after the first reboot.
P.S. Thanks in advance!
 

Attachments

  • proxmox_issue.png
You installed cloud-init on the host. This has to be installed in the VM!

First I'd suggest purging cloud-init from your host: apt purge cloud-init
Afterwards check /etc/network/interfaces and make sure it is set up correctly again (as it was before).
Then edit /etc/hostname and /etc/hosts to match the ones on your other nodes, but with the right hostname and IP for this node.

This should be enough to get it up and running again.
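
If it helps, the hostname and hosts check described above can be sketched as a small script. This is only a rough illustration, not an official Proxmox tool: the function names are made up, "node7" and the 192.0.2.x address are placeholders, and it assumes the node's hostname should map to a non-loopback address in /etc/hosts, which is what the cluster services normally expect.

```python
import ipaddress


def hostname_entries(hosts_text, hostname):
    """Return the IPs that /etc/hosts-style text maps to `hostname`."""
    ips = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if hostname in names:
            ips.append(ip)
    return ips


def is_loopback(ip):
    """True for 127.0.0.0/8 or ::1; unparsable values count as not loopback."""
    try:
        return ipaddress.ip_address(ip).is_loopback
    except ValueError:
        return False


def check_node_identity(hostname_text, hosts_text, expected_hostname):
    """Return a list of problems; an empty list means the files look sane.

    hostname_text: contents of /etc/hostname
    hosts_text: contents of /etc/hosts
    expected_hostname: the name the node should have (hypothetical, e.g. "node7")
    """
    problems = []
    current = hostname_text.strip()
    if current != expected_hostname:
        problems.append(
            f"/etc/hostname says {current!r}, expected {expected_hostname!r}"
        )
    ips = hostname_entries(hosts_text, expected_hostname)
    if not ips:
        problems.append(f"/etc/hosts has no entry for {expected_hostname!r}")
    elif all(is_loopback(ip) for ip in ips):
        problems.append(
            f"/etc/hosts maps {expected_hostname!r} only to loopback: {ips}"
        )
    return problems
```

On the node itself you would pass in the real file contents, for example check_node_identity(open("/etc/hostname").read(), open("/etc/hosts").read(), "node7"), and compare the result with the same check on a healthy node.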
 
Thanks a lot! You are my savior. I did everything you advised and the node really did return to the cluster. Now everything is working normally!
 
That's great!
Made the same mistake a few years ago.
 