Hostname Change - PVECluster not starting - Corrupt DB?

alasdairc

New Member
Jun 25, 2021
Hi all,


I've attempted to change the hostname of my server and it seems to have sledgehammered my server. Hopefully there is an easier way to change this in the future.

To change the hostname, I edited /etc/hosts, /etc/hostname and the hostname entry in /etc/postfix/main.cf. I then moved nodes from the old directory to the newly (automatically) created directory in /etc/pve/. This caused SSL errors, and because of the downtime I've had to try to revert everything.

Currently pveproxy and pvedaemon are working/restarting fine.


At this point I have tried to change everything back to how it was to at least try and get my server back up again as it's been down for 3/4 hours now.


================

/etc/hosts:

127.0.0.1 localhost.localdomain localhost
145.239.252.194 ns3100125.stellar-network.co.uk ns3100125

/etc/hostname:

ns3100125.stellar-network.co.uk

I've tried different variations of the hosts and hostname files.
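
One thing worth ruling out after a hosts/hostname change is whether the node name still maps to a non-loopback address in /etc/hosts, which, as far as I know, Proxmox VE expects. Below is a minimal Python sketch of that check. The function names are made up for illustration, and the sample text mirrors the /etc/hosts entries shown above; on a real system you would pass in the contents of /etc/hosts and /etc/hostname instead.

```python
import ipaddress

# Sample data mirroring the /etc/hosts entries from this post.
SAMPLE_HOSTS = """\
127.0.0.1 localhost.localdomain localhost
145.239.252.194 ns3100125.stellar-network.co.uk ns3100125
"""


def resolve_from_hosts(hosts_text, hostname):
    """Return every address that the hosts-file text maps `hostname` to."""
    addresses = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if hostname in names:
            addresses.append(address)
    return addresses


def is_usable_address(address):
    """True for a parsable IP address that is not a loopback address."""
    try:
        return not ipaddress.ip_address(address).is_loopback
    except ValueError:
        return False  # not a valid IP literal, so not usable


def has_non_loopback_entry(hosts_text, hostname):
    """True if `hostname` maps to at least one non-loopback address."""
    return any(is_usable_address(a) for a in resolve_from_hosts(hosts_text, hostname))


for name in ("ns3100125", "ns3100125.stellar-network.co.uk", "localhost"):
    print(name, has_non_loopback_entry(SAMPLE_HOSTS, name))
```

This only inspects the hosts file text; it does not consult DNS or the resolver order, so treat it as a quick sanity check rather than proof that name resolution works.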

ip addr:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr0 state UP group default qlen 1000
    link/ether a4:bf:01:47:3a:93 brd ff:ff:ff:ff:ff:ff
3: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr1 state UP group default qlen 1000
    link/ether a4:bf:01:47:3a:94 brd ff:ff:ff:ff:ff:ff
4: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether a4:bf:01:47:3a:93 brd ff:ff:ff:ff:ff:ff
    inet 145.239.252.194/24 brd 145.239.252.255 scope global dynamic vmbr0
       valid_lft 83620sec preferred_lft 83620sec
    inet6 fe80::a6bf:1ff:fe47:3a93/64 scope link
       valid_lft forever preferred_lft forever
5: vmbr1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether a4:bf:01:47:3a:94 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::a6bf:1ff:fe47:3a94/64 scope link
       valid_lft forever preferred_lft forever

Versions:

proxmox-ve: 6.4-1 (running kernel: 5.4.119-1-pve)
pve-manager: 6.4-9 (running version: 6.4-9/5f5c0e3f)
pve-kernel-5.4: 6.4-3
pve-kernel-helper: 6.4-3
pve-kernel-5.4.119-1-pve: 5.4.119-1
pve-kernel-5.4.114-1-pve: 5.4.114-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.1.2-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.1.0
libproxmox-backup-qemu0: 1.0.3-1
libpve-access-control: 6.4-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.4-3
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.2-3
libpve-storage-perl: 6.4-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.1.10-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.5-6
pve-cluster: 6.4-1
pve-container: 3.3-5
pve-docs: 6.4-2
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-4
pve-firmware: 3.2-4
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-6
pve-xtermjs: 4.7.0-3
pve-zsync: 2.2
qemu-server: 6.4-2
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.4-pve1

=================

journalctl -b -u pve-cluster

Jun 26 18:34:26 ns3100125.stellar-network.co.uk systemd[1]: Starting The Proxmox VE cluster filesystem...
Jun 26 18:34:26 ns3100125.stellar-network.co.uk pmxcfs[1044]: [database] crit: found entry with duplicate name 'qemu-server' - A:(inode = 0x0000000000001F34, parent = 0x0000000000001F33, v./mtim
Jun 26 18:34:26 ns3100125.stellar-network.co.uk pmxcfs[1044]: [database] crit: DB load failed
Jun 26 18:34:26 ns3100125.stellar-network.co.uk pmxcfs[1044]: [database] crit: found entry with duplicate name 'qemu-server' - A:(inode = 0x0000000000001F34, parent = 0x0000000000001F33, v./mtim
Jun 26 18:34:26 ns3100125.stellar-network.co.uk pmxcfs[1044]: [database] crit: DB load failed
Jun 26 18:34:26 ns3100125.stellar-network.co.uk pmxcfs[1044]: [main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db'
Jun 26 18:34:26 ns3100125.stellar-network.co.uk pmxcfs[1044]: [main] notice: exit proxmox configuration filesystem (-1)
Jun 26 18:34:26 ns3100125.stellar-network.co.uk pmxcfs[1044]: [main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db'
Jun 26 18:34:26 ns3100125.stellar-network.co.uk pmxcfs[1044]: [main] notice: exit proxmox configuration filesystem (-1)
Jun 26 18:34:26 ns3100125.stellar-network.co.uk systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
Jun 26 18:34:26 ns3100125.stellar-network.co.uk systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jun 26 18:34:26 ns3100125.stellar-network.co.uk systemd[1]: Failed to start The Proxmox VE cluster filesystem.
Jun 26 18:34:26 ns3100125.stellar-network.co.uk systemd[1]: pve-cluster.service: Service RestartSec=100ms expired, scheduling restart.
Jun 26 18:34:26 ns3100125.stellar-network.co.uk systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 1.
Jun 26 18:34:26 ns3100125.stellar-network.co.uk systemd[1]: Stopped The Proxmox VE cluster filesystem.
Jun 26 18:34:26 ns3100125.stellar-network.co.uk systemd[1]: Starting The Proxmox VE cluster filesystem...
Jun 26 18:34:26 ns3100125.stellar-network.co.uk pmxcfs[1053]: [database] crit: found entry with duplicate name 'qemu-server' - A:(inode = 0x0000000000001F34, parent = 0x0000000000001F33, v./mtim
Jun 26 18:34:26 ns3100125.stellar-network.co.uk pmxcfs[1053]: [database] crit: DB load failed
Jun 26 18:34:26 ns3100125.stellar-network.co.uk pmxcfs[1053]: [database] crit: found entry with duplicate name 'qemu-server' - A:(inode = 0x0000000000001F34, parent = 0x0000000000001F33, v./mtim
Jun 26 18:34:26 ns3100125.stellar-network.co.uk pmxcfs[1053]: [database] crit: DB load failed
Jun 26 18:34:26 ns3100125.stellar-network.co.uk pmxcfs[1053]: [main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db'
Jun 26 18:34:26 ns3100125.stellar-network.co.uk pmxcfs[1053]: [main] notice: exit proxmox configuration filesystem (-1)
Jun 26 18:34:26 ns3100125.stellar-network.co.uk pmxcfs[1053]: [main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db'
Jun 26 18:34:26 ns3100125.stellar-network.co.uk pmxcfs[1053]: [main] notice: exit proxmox configuration filesystem (-1)
Jun 26 18:34:26 ns3100125.stellar-network.co.uk systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
Jun 26 18:34:26 ns3100125.stellar-network.co.uk systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jun 26 18:34:26 ns3100125.stellar-network.co.uk systemd[1]: Failed to start The Proxmox VE cluster filesystem.
Jun 26 18:34:26 ns3100125.stellar-network.co.uk systemd[1]: pve-cluster.service: Service RestartSec=100ms expired, scheduling restart.
Jun 26 18:34:26 ns3100125.stellar-network.co.uk systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 2.
Jun 26 18:34:26 ns3100125.stellar-network.co.uk systemd[1]: Stopped The Proxmox VE cluster filesystem.
Jun 26 18:34:26 ns3100125.stellar-network.co.uk systemd[1]: Starting The Proxmox VE cluster filesystem...
Jun 26 18:34:26 ns3100125.stellar-network.co.uk pmxcfs[1054]: [database] crit: found entry with duplicate name 'qemu-server' - A:(inode = 0x0000000000001F34, parent = 0x0000000000001F33, v./mtim
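
The "duplicate name 'qemu-server'" message suggests that two rows in config.db share the same parent and name. Assuming config.db is a SQLite file with a `tree` table that has `parent` and `name` columns (my understanding of the pmxcfs layout, not something confirmed in this thread), a read-only query along these lines could list the offending rows. The helper names are hypothetical, and this should only ever be pointed at a copy of the file, never the live database.

```python
import sqlite3


def open_read_only(db_path):
    """Open a SQLite file read-only so nothing can be written by accident."""
    return sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)


def find_duplicate_entries(conn):
    """Return (parent, name, count) for names that occur more than once
    under the same parent.

    Assumes a table called `tree` with `parent` and `name` columns.
    """
    query = (
        "SELECT parent, name, COUNT(*) FROM tree "
        "GROUP BY parent, name HAVING COUNT(*) > 1 "
        "ORDER BY parent, name"
    )
    return conn.execute(query).fetchall()
```

Usage would be something like `find_duplicate_entries(open_read_only("/root/config.db.copy"))`, where the path is just a placeholder for wherever the copy lives. The sketch only reports duplicates; deciding which row to keep is a separate, manual step.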

===================

tail -n 100 -f /var/log/syslog:

Jun 26 19:21:57 ns3100125 pveproxy[3601]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1891.
Jun 26 19:21:57 ns3100125 pveproxy[3599]: worker exit
Jun 26 19:21:57 ns3100125 pveproxy[3600]: worker exit
Jun 26 19:21:57 ns3100125 pveproxy[1066]: worker 3599 finished
Jun 26 19:21:57 ns3100125 pveproxy[1066]: worker 3600 finished
Jun 26 19:21:57 ns3100125 pveproxy[1066]: starting 2 worker(s)
Jun 26 19:21:57 ns3100125 pveproxy[1066]: worker 3602 started
Jun 26 19:21:57 ns3100125 pveproxy[1066]: worker 3603 started
Jun 26 19:21:57 ns3100125 pveproxy[3602]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1891.
Jun 26 19:21:57 ns3100125 pveproxy[3603]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1891.
Jun 26 19:22:00 ns3100125 systemd[1]: Starting Proxmox VE replication runner...
Jun 26 19:22:01 ns3100125 pvesr[3610]: ipcc_send_rec[1] failed: Connection refused
Jun 26 19:22:01 ns3100125 pvesr[3610]: ipcc_send_rec[2] failed: Connection refused
Jun 26 19:22:01 ns3100125 pvesr[3610]: ipcc_send_rec[3] failed: Connection refused
Jun 26 19:22:01 ns3100125 pvesr[3610]: Unable to load access control list: Connection refused
Jun 26 19:22:01 ns3100125 systemd[1]: pvesr.service: Main process exited, code=exited, status=111/n/a
Jun 26 19:22:01 ns3100125 systemd[1]: pvesr.service: Failed with result 'exit-code'.
Jun 26 19:22:01 ns3100125 systemd[1]: Failed to start Proxmox VE replication runner.
Jun 26 19:22:01 ns3100125 cron[1046]: (*system*vzdump) CAN'T OPEN SYMLINK (/etc/cron.d/vzdump)
Jun 26 19:22:02 ns3100125 pveproxy[3601]: worker exit
Jun 26 19:22:02 ns3100125 pveproxy[1066]: worker 3601 finished
Jun 26 19:22:02 ns3100125 pveproxy[1066]: starting 1 worker(s)
Jun 26 19:22:02 ns3100125 pveproxy[1066]: worker 3617 started
Jun 26 19:22:02 ns3100125 pveproxy[3617]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1891.
Jun 26 19:22:02 ns3100125 pveproxy[3602]: worker exit
Jun 26 19:22:02 ns3100125 pveproxy[3603]: worker exit
Jun 26 19:22:02 ns3100125 pveproxy[1066]: worker 3602 finished
Jun 26 19:22:02 ns3100125 pveproxy[1066]: worker 3603 finished
Jun 26 19:22:02 ns3100125 pveproxy[1066]: starting 2 worker(s)
Jun 26 19:22:02 ns3100125 pveproxy[1066]: worker 3618 started
Jun 26 19:22:02 ns3100125 pveproxy[1066]: worker 3619 started
Jun 26 19:22:02 ns3100125 pveproxy[3618]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1891.
Jun 26 19:22:02 ns3100125 pveproxy[3619]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1891.

===================


There is nothing in /etc/pve, which is why I assume it can't find pve-ssl.key.
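
An empty /etc/pve is what you would expect if the cluster filesystem is simply not mounted, since /etc/pve is (as far as I understand it) a FUSE mount provided by pmxcfs rather than an ordinary directory. A rough Python sketch for checking that from the contents of /proc/mounts is below. The function name and the sample mount lines are illustrative assumptions, not output captured from this server.

```python
# Illustrative sample in /proc/mounts format; not taken from the affected host.
SAMPLE_MOUNTS = """\
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
/dev/fuse /etc/pve fuse rw,nosuid,nodev,relatime 0 0
"""


def mounted_fs_type(mounts_text, mount_point):
    """Return the filesystem type mounted at `mount_point`, or None if nothing is."""
    fs_type = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[1] == mount_point:
            fs_type = fields[2]  # a later mount on the same path shadows earlier ones
    return fs_type


print(mounted_fs_type(SAMPLE_MOUNTS, "/etc/pve"))
```

On the server itself the text would come from `open("/proc/mounts").read()`. A result of `None` for /etc/pve would be consistent with pve-cluster failing to start, as in the journal output.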


Any help on this would be great. If there's any other information you need, please let me know.


Thanks in advance
 

alasdairc

I've isolated this down to a config.db issue.

To test, I renamed config.db to config.db.bak. It generated a new config and I was able to access the web GUI.
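
For anyone repeating this test, it may be safer to copy the database somewhere else first rather than only renaming it in place. The sketch below copies a SQLite file together with any `-wal` and `-shm` side files it finds next to it. Whether those side files exist for config.db is an assumption on my part, the function name is hypothetical, and the pve-cluster service should presumably be stopped before copying.

```python
import shutil
from pathlib import Path


def backup_database(db_path, backup_dir):
    """Copy db_path and any SQLite -wal/-shm siblings into backup_dir.

    Returns the list of files created in backup_dir.
    """
    src = Path(db_path)
    if not src.exists():
        raise FileNotFoundError(src)
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for candidate in (src, Path(f"{src}-wal"), Path(f"{src}-shm")):
        if candidate.exists():
            target = dest_dir / candidate.name
            shutil.copy2(candidate, target)  # copy2 also preserves timestamps
            copied.append(target)
    return copied
```

A call such as `backup_database("/var/lib/pve-cluster/config.db", "/root/pve-db-backup")` would then leave the original untouched; the backup directory name is only an example.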

Just going to rebuild and start again I think.

Please sort this out. Changing a hostname is too much hassle here, when on other hypervisors it's as easy as a few clicks.
 
