Changing the hostname killed my cluster

Moz

Member
Dec 15, 2022
Hello,
I made a mistake: I changed the hostname of a node in my cluster without realizing how many things it would break.

I changed the hostname from Ansible with:
Code:
- name: Set a hostname
  become: true
  ansible.builtin.hostname:
    name: "{{ local_dns }}" # changing mox01.ether-source.fr to cyclops-alpha.ether-source.fr
Code:
root@cyclops-alpha:~# pvecm add cyclops-alpha.ether-source.fr
ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused




Then I changed the hosts file with this template
Code:
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

{% for host in groups['init'] %}
{{ hostvars[host]['ansible_default_ipv4']['address'] }} {{ hostvars[host]['local_dns']}}
{% endfor %}

Well, it's not working anymore, and I would like to recover the nodes.
I read some tutorials and tried many things, but I think I need some help :(
Code:
● pveproxy.service - PVE API Proxy Server
     Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2022-12-15 23:06:49 CET; 18min ago
    Process: 1210 ExecStartPre=/usr/bin/pvecm updatecerts --silent (code=exited, status=111)
    Process: 1211 ExecStart=/usr/bin/pveproxy start (code=exited, status=0/SUCCESS)
   Main PID: 1212 (pveproxy)
      Tasks: 4 (limit: 76987)
     Memory: 133.9M
        CPU: 15.954s
     CGroup: /system.slice/pveproxy.service
             ├─1212 pveproxy
             ├─2080 pveproxy worker
             ├─2081 pveproxy worker
             └─2082 pveproxy worker

Dec 15 23:24:59 cyclops-alpha.ether-source.fr pveproxy[1212]: starting 2 worker(s)
Dec 15 23:24:59 cyclops-alpha.ether-source.fr pveproxy[1212]: worker 2080 started
Dec 15 23:24:59 cyclops-alpha.ether-source.fr pveproxy[1212]: worker 2081 started
Dec 15 23:24:59 cyclops-alpha.ether-source.fr pveproxy[2080]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key)>
Dec 15 23:24:59 cyclops-alpha.ether-source.fr pveproxy[2081]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key)>
Dec 15 23:24:59 cyclops-alpha.ether-source.fr pveproxy[2079]: worker exit
Dec 15 23:24:59 cyclops-alpha.ether-source.fr pveproxy[1212]: worker 2079 finished
Dec 15 23:24:59 cyclops-alpha.ether-source.fr pveproxy[1212]: starting 1 worker(s)
Dec 15 23:24:59 cyclops-alpha.ether-source.fr pveproxy[1212]: worker 2082 started
Dec 15 23:24:59 cyclops-alpha.ether-source.fr pveproxy[2082]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key)
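
The key path in the log above, /etc/pve/local, is a symlink to the node's own directory under /etc/pve/nodes, so one thing worth checking after a rename is whether a directory exists for the name the node currently has and whether it holds a key. The sketch below is only an illustration: `node_dir_check` is a hypothetical helper, not a Proxmox tool, and it assumes the node name is the hostname up to the first dot. If the cluster filesystem is not mounted at all (the "Connection refused" messages earlier hint at that), the directory will simply be missing.

```shell
#!/bin/sh
# node_dir_check is a hypothetical helper, not part of Proxmox VE.
# $1 = nodes directory (on a real node: /etc/pve/nodes)
# $2 = hostname, short or fully qualified
node_dir_check() {
    # Assumption: the node name is the hostname up to the first dot.
    short=${2%%.*}
    if [ ! -d "$1/$short" ]; then
        echo "no node directory for $short"
        return 1
    fi
    if [ ! -f "$1/$short/pve-ssl.key" ]; then
        echo "node directory for $short has no pve-ssl.key"
        return 1
    fi
    echo "ok"
}

# Example on a node (not run automatically):
#   node_dir_check /etc/pve/nodes "$(hostname)"
```

A "no node directory" result for the new name, next to an existing directory for the old name, would suggest the rename and the cluster's view of the node have drifted apart.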


Code:
root@cyclops-alpha:~# hostname
cyclops-alpha.ether-source.fr
root@cyclops-alpha:~# cat /etc/hostname
cyclops-alpha.ether-source.fr
root@cyclops-alpha:~# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.15 cyclops-alpha cyclops-alpha.ether-source.fr


192.168.1.23 cerberus-alpha.ether-source.fr
192.168.1.189 cerberus-beta.ether-source.fr
192.168.1.161 cerberus-gamma.ether-source.fr
192.168.1.3 mermaid-alpha.ether-source.fr
192.168.1.106 minotor-alpha.ether-source.fr
192.168.1.43 basilisk.ether-source.fr

192.168.1.144 cyclops-beta.ether-source.fr
192.168.1.42 cyclops-gamma.ether-source.fr
192.168.1.138 cyclops-epsilon.ether-source.fr
192.168.1.172 cyclops-zeta.ether-source.fr
192.168.1.198 cyclops-eta.ether-source.fr
192.168.1.113 cyclops-theta.ether-source.fr
192.168.1.41 centaurs-alpha.ether-source.fr
I don't know what I still need to fix to bring it back up. Any clues would be appreciated.
 
I read some tutorials and tried many things, but I think I need some help
The best approach would be to back out your changes and restore the configuration you saved before you started.

Beyond that I would recommend reading through this discussion https://askubuntu.com/questions/863132/should-one-use-fqdn-in-etc-hostname-instead-of-hostname - it will highlight the few critical configuration errors you have. There may be more across all nodes that you have not shown.

Other resources:
https://pve.proxmox.com/wiki/Renaming_a_PVE_node
https://forum.proxmox.com/threads/proxmox-rename-node-big-problem.69149/
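
As a rough sketch of the kind of check those pages lead to: Proxmox VE expects the node's hostname to resolve to its LAN address rather than to a loopback address. The `hosts_lookup` and `hosts_entry_ok` functions below are hypothetical helpers, not Proxmox tools; they only parse a hosts-style file and make no claim about what else may be wrong on the node.

```shell
#!/bin/sh
# hosts_lookup is a hypothetical helper: print the first address that a
# hosts-style file maps NAME to (nothing if NAME is not listed).
# $1 = hosts file, $2 = name to look up
hosts_lookup() {
    awk -v name="$2" '
        { sub(/#.*/, "") }          # drop comments before matching
        NF >= 2 {
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; exit }
        }
    ' "$1"
}

# hosts_entry_ok is a hypothetical helper: succeed only if NAME is listed
# and maps to a non-loopback address.
hosts_entry_ok() {
    ip=$(hosts_lookup "$1" "$2")
    case "$ip" in
        ""|127.*|::1) return 1 ;;
        *) return 0 ;;
    esac
}

# Example on a node (not run automatically):
#   hosts_entry_ok /etc/hosts "$(hostname)" || echo "hostname does not map to a LAN address"
```

On a live node, `hostname --ip-address` asks the resolver the same question directly; the file-based version is just easier to run against a generated template before deploying it.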



 
@bbgeek17 Thanks for the reply!
I tried without success to roll back to the previous hostname. I will try on another node (I actually have 7).
I will just change the hostname back to the previous one to see what happens.
I will read all of those links this evening, thanks a lot.
 
Well, I think the command

Code:
hostnamectl set-hostname mox03.ether-source.fr
cannot fix the problem on its own (mox03.ether-source.fr is the original name).
I totally rewrote /etc/hosts with Ansible, and I'm pretty sure the original file contained mandatory entries.

EDIT:

With a new /etc/hosts


Code:
127.0.0.1 localhost.localdomain localhost
192.168.1.42 mox03.ether-source.fr mox03

# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
It's working!
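
For anyone generating the file automatically: the node's own line in the working file lists the IP, then the FQDN, then the short name, whereas the template earlier in the thread emitted only the IP and the FQDN. A minimal sketch of producing a line in that shape follows; `hosts_line` is a hypothetical helper, and whether the missing short name was the actual cause here is not confirmed by anything in this thread.

```shell
#!/bin/sh
# hosts_line is a hypothetical helper: given an IP and a hostname, print a
# hosts entry in the "IP FQDN shortname" shape of the working file above.
# $1 = IP address, $2 = FQDN (or a bare short name)
hosts_line() {
    ip=$1
    fqdn=$2
    short=${fqdn%%.*}               # everything before the first dot
    if [ "$short" = "$fqdn" ]; then
        # No domain part given: emit the name once instead of twice.
        printf '%s %s\n' "$ip" "$fqdn"
    else
        printf '%s %s %s\n' "$ip" "$fqdn" "$short"
    fi
}

# Example (not run automatically):
#   hosts_line 192.168.1.42 mox03.ether-source.fr
```

The same rule could be expressed in the Jinja template by appending the first dot-separated component of each name to its line.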
 
