[SOLVED] Hostname changed, now nodes gone from /etc/pve

ropeguru

There is probably a more correct way of changing the hostname than the way I did it, but I'm curious as to why this happened.

I updated the hostname in /etc/hosts and /etc/hostname on the latest version of Proxmox. After a reboot, both the old and the new hostname showed up in the GUI, and I realized I had forgotten to move the files under /etc/pve/nodes, so I did that and rebooted again. Now the /etc/pve directory still exists after the reboot, but everything inside it is gone.
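For reference, the first step was essentially this (a sketch from memory rather than exact shell history; grafton-pm was the old name, core-routing the new one):

Code:
# Write the new hostname
echo core-routing > /etc/hostname
# Replace the old name in /etc/hosts (the 127.0.1.1 / LAN address entry)
sed -i 's/grafton-pm/core-routing/g' /etc/hosts
reboot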

It is a new setup and easy to rebuild, but I'm curious as to how the directories got deleted. History shows that "rm" was never used, and I have confirmed that the directories didn't just get moved somewhere else: a find command does not turn up either the old or the new hostname as a directory anywhere on the server.
 
That's strange. Could you provide the complete list of commands (in order) you ran?
Which version of PVE are you running? pveversion -v

Did you see the following wiki entry? https://pve.proxmox.com/wiki/Renaming_a_PVE_node

Is /etc/pve completely empty, with not even a single file in it?

What's the output of systemctl status pve-cluster.service and the output of mount?
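For comparison, on a healthy node pmxcfs appears as a FUSE mount on /etc/pve, roughly like this (illustrative output, not from your machine):

Code:
mount | grep /etc/pve
# /dev/fuse on /etc/pve type fuse (rw,nosuid,nodev,relatime,...)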
 
Yes, I followed that wiki entry for the renaming: the "Change Hostname" section, with the exception of the mail changes. I did not perform anything from the "Cleanup" section since it is a new system.

/etc/pve is completely empty:
root@core-routing:/etc/pve# ls -l /etc/pve
total 0

root@core-routing:/etc/pve# systemctl status pve-cluster.service
● pve-cluster.service - The Proxmox VE cluster filesystem
     Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Thu 2022-06-30 11:13:15 EDT; 1h 24min ago
    Process: 1983 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)
        CPU: 9ms

Jun 30 11:13:15 core-routing systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
Jun 30 11:13:15 core-routing systemd[1]: Stopped The Proxmox VE cluster filesystem.
Jun 30 11:13:15 core-routing systemd[1]: pve-cluster.service: Start request repeated too quickly.
Jun 30 11:13:15 core-routing systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jun 30 11:13:15 core-routing systemd[1]: Failed to start The Proxmox VE cluster filesystem.

These are the commands I used from history:

238  cd /etc/pve/
239  ls
240  cd nodes/
241  ls
242  cd grafton-pm/
243  ls
244  mv -R * ../core-routing/
245  mv -r * ../core-routing/
246  mv * ../core-routing/
247  cp -R * ../core-routing/
248  cd ../core-routing/
249  cd ../grafton-pm/
250  cp -Rf * ../core-routing/
251  ls -l
252  ls -l ../core-routing/
253  ls -l ../core-routing/qemu-server/
254  mv qemu-server/* ../core-routing/qemu-server/
255  reboot
256  cd /etc/pve/
257  ls
258  ls

Some of the commands were invalid, which is why you see the extra entries. But at entries 252 and 253, I verified that all the files were there before the reboot.

I want to be transparent about what I ran, including all the invalid commands. Unfortunately, the file listings from 252 and 253 have scrolled out of my SSH window buffer.
 
Can you start pmxcfs in local mode with debug output? pmxcfs -l -d -f
With this we should get enough information on why it failed to start.
The pve-cluster service is responsible for starting it; since that failed, there is no pmxcfs running, and as a result /etc/pve is empty because the filesystem couldn't be mounted.
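Roughly like this (a sketch; stop the failed unit first so nothing else holds the mount point):

Code:
systemctl stop pve-cluster
pmxcfs -l -d -f
# debug output follows; stop it with Ctrl+C when done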
 
root@core-routing:/var/lib/pve-cluster# ls -l
total 40
-rw------- 1 root root 36864 Jun 30 09:23 config.db

Running pmxcfs -l -d -f fails with:

crit: found entry with duplicate name 'qemu-server'

Full debug output:

Code:
[database] debug: name __version__ (inode = 0000000000000000, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name user.cfg (inode = 0000000000000004, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name datacenter.cfg (inode = 0000000000000006, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name virtual-guest (inode = 0000000000000008, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name priv (inode = 0000000000000009, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name nodes (inode = 000000000000000B, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name grafton-pm (inode = 000000000000000C, parent = 000000000000000B) (database.c:375:bdb_backend_load_index)
[database] debug: name lxc (inode = 000000000000000D, parent = 0000000000E42DF2) (database.c:375:bdb_backend_load_index)
[database] debug: name qemu-server (inode = 000000000000000E, parent = 000000000000000C) (database.c:375:bdb_backend_load_index)
[database] debug: name openvz (inode = 000000000000000F, parent = 0000000000E42DF2) (database.c:375:bdb_backend_load_index)
[database] debug: name priv (inode = 0000000000000010, parent = 0000000000E42DF2) (database.c:375:bdb_backend_load_index)
[database] debug: name lock (inode = 0000000000000011, parent = 0000000000000009) (database.c:375:bdb_backend_load_index)
[database] debug: name pve-www.key (inode = 0000000000000018, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name pve-ssl.key (inode = 000000000000001A, parent = 0000000000E42DF2) (database.c:375:bdb_backend_load_index)
[database] debug: name pve-root-ca.key (inode = 000000000000001C, parent = 0000000000000009) (database.c:375:bdb_backend_load_index)
[database] debug: name pve-root-ca.pem (inode = 000000000000001E, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name pve-root-ca.srl (inode = 0000000000000020, parent = 0000000000000009) (database.c:375:bdb_backend_load_index)
[database] debug: name pve-ssl.pem (inode = 0000000000000023, parent = 0000000000E42DF2) (database.c:375:bdb_backend_load_index)
[database] debug: name ha (inode = 0000000000000031, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name acme (inode = 0000000000000032, parent = 0000000000000009) (database.c:375:bdb_backend_load_index)
[database] debug: name sdn (inode = 0000000000000033, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name status.cfg (inode = 0000000000BDC990, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name replication.cfg (inode = 0000000000E2CB5F, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name vzdump.cron (inode = 0000000000E2CB63, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name storage.cfg (inode = 0000000000E2F713, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name authkey.pub.old (inode = 0000000000E3D58D, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name authkey.pub (inode = 0000000000E3D590, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name authkey.key (inode = 0000000000E3D593, parent = 0000000000000009) (database.c:375:bdb_backend_load_index)
[database] debug: name 100.conf (inode = 0000000000E42D4D, parent = 0000000000E42EE1) (database.c:375:bdb_backend_load_index)
[database] debug: name host.fw (inode = 0000000000E42D6A, parent = 0000000000E42DF2) (database.c:375:bdb_backend_load_index)
[database] debug: name core-routing (inode = 0000000000E42DF2, parent = 000000000000000B) (database.c:375:bdb_backend_load_index)
[database] debug: name qemu-server (inode = 0000000000E42DF3, parent = 0000000000E42DF2) (database.c:375:bdb_backend_load_index)
[database] debug: name authorized_keys (inode = 0000000000E42DFD, parent = 0000000000000009) (database.c:375:bdb_backend_load_index)
[database] debug: name known_hosts (inode = 0000000000E42E00, parent = 0000000000000009) (database.c:375:bdb_backend_load_index)
[database] debug: name qemu-server (inode = 0000000000E42EE1, parent = 0000000000E42DF2) (database.c:375:bdb_backend_load_index)
[database] crit: found entry with duplicate name 'qemu-server' - A:(inode = 0x0000000000E42DF3, parent = 0x0000000000E42DF2, v./mtime = 0xE42DF3/0x1656594653) vs. B:(inode = 0x0000000000E42EE1, parent = 0x0000000000E42DF2, v./mtime = 0xE42EE1/0x1656594965) (database.c:430:bdb_backend_load_index)
[database] crit: DB load failed (database.c:470:bdb_backend_load_index)
[main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db' (pmxcfs.c:888:main)
[main] notice: exit proxmox configuration filesystem (-1) (pmxcfs.c:1099:main)
 
Oh that explains it. You have a duplicate entry for qemu-server which is not allowed in the Cluster Filesystem.
It most likely comes from this command: 254 mv qemu-server/* ../core-routing/qemu-server/
You moved the content of qemu-server rather than the directory itself.
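The difference in shell terms (using the paths from your history):

Code:
# Moves only the contents; the qemu-server directory entry itself stays behind:
mv qemu-server/* ../core-routing/qemu-server/
# Moves the directory entry itself (fails if the target already exists and is not empty):
mv qemu-server ../core-routing/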

In theory you could manually edit the config.db file, similar to this thread: https://forum.proxmox.com/threads/pve-cluster-fails-to-start.82861/
But create a backup before trying anything on the config.db file!
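A rough sketch of how that could look, assuming the same schema as in the linked thread (config.db is an SQLite database whose directory tree lives in the tree table; the inode in the DELETE is a placeholder you'd take from your own query):

Code:
# Back up first!
systemctl stop pve-cluster
cp /var/lib/pve-cluster/config.db /root/config.db.bak

# List all entries named 'qemu-server' with their inode and parent
sqlite3 /var/lib/pve-cluster/config.db \
  "SELECT inode, parent, name FROM tree WHERE name = 'qemu-server';"

# Delete the stale duplicate (NNN = the inode you identified above)
# sqlite3 /var/lib/pve-cluster/config.db "DELETE FROM tree WHERE inode = NNN;"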
 
How do I figure out which one is the duplicate?
 
The one in nodes/<current-node-name>/qemu-server should be the right one, and the other, probably under the old node name, is the stale one.
If you have one at the root level, that is also a duplicate: when pmxcfs is mounted, the root-level entry is just a symlink to the node-specific one (for the local node).
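To illustrate the layout (example from a healthy node named core-routing; the exact listing may differ):

Code:
ls -l /etc/pve/qemu-server
# lrwxrwxrwx 1 root www-data ... /etc/pve/qemu-server -> nodes/core-routing/qemu-server

In your debug output both duplicates have the node directory as parent, but 100.conf lists 0000000000E42EE1 as its parent, and nothing points at 0000000000E42DF3; that suggests the latter is the stale entry.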
 
Ok, almost there...

The service is now starting, but it looks like it may be an auth issue, as the GUI shows a question mark next to everything.
 
What's the status output of pve-cluster? systemctl status pve-cluster.service

Did you reboot the node after getting pmxcfs to start? Maybe some other services require a restart, e.g. pvedaemon, pveproxy and pvestatd.
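If a full reboot is not convenient, restarting the stack should also work (service names as above):

Code:
systemctl restart pve-cluster pvedaemon pveproxy pvestatd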
 
Yeah... That was it.. Needed a reboot.

I thought about that after I posted...

All good now. Thanks for walking me through this; I now have a better understanding of how things work and can troubleshoot on my own in the future.
 
Well, I learned something new as well; until now I had no need to manually modify the config.db file. But Thomas explained it quite well in the other thread, I think.

I'm glad it worked out in the end! Usually something like this should not be possible just by copying and moving files around in pmxcfs.
 
