[SOLVED] Hostname changed, now nodes gone from /etc/pve

ropeguru

Member
Nov 18, 2019
There is probably a more correct way of changing the hostname than the way I did it, but I'm curious as to why this happened.

I updated the hostname in /etc/hosts and /etc/hostname on the latest version of Proxmox. After a reboot, both the old and the new hostname showed up in the GUI, and I realized I had forgotten to move the files under /etc/pve/nodes, so I did that and rebooted again. Now the /etc/pve directory still exists, but everything else in it is gone.

It is a new setup and easy to rebuild, but I'm curious how the directories got deleted. My shell history shows that "rm" was never used, and I have confirmed that the directories weren't simply moved somewhere else: a find command turns up neither the old nor the new hostname as a directory anywhere on the server.
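The check was something along these lines (reconstructed from memory, so the exact invocation may have differed):

Code:
# search the whole server for a directory named after either hostname
find / -type d \( -name 'grafton-pm' -o -name 'core-routing' \) 2>/dev/null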
 
That's strange. Could you provide the complete list of commands (in order) you ran?
Which version of PVE are you running? pveversion -v

Did you see the following wiki entry? https://pve.proxmox.com/wiki/Renaming_a_PVE_node
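For reference, the "Change Hostname" section there boils down to something like the following sketch (placeholder names, assuming no running guests; the wiki itself is the authoritative version):

Code:
# 1. update the hostname in both files
sed -i 's/<oldname>/<newname>/g' /etc/hosts
echo '<newname>' > /etc/hostname

# 2. reboot; pmxcfs will create a fresh /etc/pve/nodes/<newname>
reboot

# 3. move the guest configs from the old node directory into the new one
mv /etc/pve/nodes/<oldname>/qemu-server/*.conf /etc/pve/nodes/<newname>/qemu-server/
mv /etc/pve/nodes/<oldname>/lxc/*.conf /etc/pve/nodes/<newname>/lxc/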

Is /etc/pve completely empty, not even a single file exists?

What's the output of systemctl status pve-cluster.service and the output of mount?
 
Yes, I followed that wiki entry for the renaming. I went through the "Change Hostname" section, with the exception of the mail changes, and did not perform anything from the "Cleanup" section since it was a new system.

/etc/pve is completely empty:
root@core-routing:/etc/pve# ls -l /etc/pve
total 0

root@core-routing:/etc/pve# systemctl status pve-cluster.service

Code:
● pve-cluster.service - The Proxmox VE cluster filesystem
     Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Thu 2022-06-30 11:13:15 EDT; 1h 24min ago
    Process: 1983 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)
        CPU: 9ms

Jun 30 11:13:15 core-routing systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
Jun 30 11:13:15 core-routing systemd[1]: Stopped The Proxmox VE cluster filesystem.
Jun 30 11:13:15 core-routing systemd[1]: pve-cluster.service: Start request repeated too quickly.
Jun 30 11:13:15 core-routing systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jun 30 11:13:15 core-routing systemd[1]: Failed to start The Proxmox VE cluster filesystem.

These are the commands I used from history:

Code:
238  cd /etc/pve/
239  ls
240  cd nodes/
241  ls
242  cd grafton-pm/
243  ls
244  mv -R * ../core-routing/
245  mv -r * ../core-routing/
246  mv * ../core-routing/
247  cp -R * ../core-routing/
248  cd ../core-routing/
249  cd ../grafton-pm/
250  cp -Rf * ../core-routing/
251  ls -l
252  ls -l ../core-routing/
253  ls -l ../core-routing/qemu-server/
254  mv qemu-server/* ../core-routing/qemu-server/
255  reboot
256  cd /etc/pve/
257  ls
258  ls

Some of the commands were invalid, which is why you see the extra entries. But at entries 252 and 253, I verified all the files were there before the reboot.

I want to be transparent about what I ran, including all the invalid commands. Unfortunately, the file listings from 252 and 253 have scrolled out of my SSH window buffer.
 
Can you start pmxcfs in local mode with debug output? pmxcfs -l -d -f
With this we should get enough information on why it failed to start.
The pve-cluster service is responsible for starting it; since the service failed, there is no pmxcfs running, and as a result /etc/pve is empty because the filesystem couldn't be mounted.
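That is, roughly:

Code:
# make sure the service isn't trying to start pmxcfs at the same time
systemctl stop pve-cluster.service
# run the cluster filesystem in local mode (-l), with debug output (-d), in the foreground (-f)
pmxcfs -l -d -f
# stop it again with Ctrl+C once you have the output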
 
root@core-routing:/var/lib/pve-cluster# ls -l
total 40
-rw------- 1 root root 36864 Jun 30 09:23 config.db

crit: found entry with duplicate name 'qemu-server'


Code:
[database] debug: name __version__ (inode = 0000000000000000, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name user.cfg (inode = 0000000000000004, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name datacenter.cfg (inode = 0000000000000006, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name virtual-guest (inode = 0000000000000008, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name priv (inode = 0000000000000009, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name nodes (inode = 000000000000000B, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name grafton-pm (inode = 000000000000000C, parent = 000000000000000B) (database.c:375:bdb_backend_load_index)
[database] debug: name lxc (inode = 000000000000000D, parent = 0000000000E42DF2) (database.c:375:bdb_backend_load_index)
[database] debug: name qemu-server (inode = 000000000000000E, parent = 000000000000000C) (database.c:375:bdb_backend_load_index)
[database] debug: name openvz (inode = 000000000000000F, parent = 0000000000E42DF2) (database.c:375:bdb_backend_load_index)
[database] debug: name priv (inode = 0000000000000010, parent = 0000000000E42DF2) (database.c:375:bdb_backend_load_index)
[database] debug: name lock (inode = 0000000000000011, parent = 0000000000000009) (database.c:375:bdb_backend_load_index)
[database] debug: name pve-www.key (inode = 0000000000000018, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name pve-ssl.key (inode = 000000000000001A, parent = 0000000000E42DF2) (database.c:375:bdb_backend_load_index)
[database] debug: name pve-root-ca.key (inode = 000000000000001C, parent = 0000000000000009) (database.c:375:bdb_backend_load_index)
[database] debug: name pve-root-ca.pem (inode = 000000000000001E, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name pve-root-ca.srl (inode = 0000000000000020, parent = 0000000000000009) (database.c:375:bdb_backend_load_index)
[database] debug: name pve-ssl.pem (inode = 0000000000000023, parent = 0000000000E42DF2) (database.c:375:bdb_backend_load_index)
[database] debug: name ha (inode = 0000000000000031, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name acme (inode = 0000000000000032, parent = 0000000000000009) (database.c:375:bdb_backend_load_index)
[database] debug: name sdn (inode = 0000000000000033, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name status.cfg (inode = 0000000000BDC990, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name replication.cfg (inode = 0000000000E2CB5F, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name vzdump.cron (inode = 0000000000E2CB63, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name storage.cfg (inode = 0000000000E2F713, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name authkey.pub.old (inode = 0000000000E3D58D, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name authkey.pub (inode = 0000000000E3D590, parent = 0000000000000000) (database.c:375:bdb_backend_load_index)
[database] debug: name authkey.key (inode = 0000000000E3D593, parent = 0000000000000009) (database.c:375:bdb_backend_load_index)
[database] debug: name 100.conf (inode = 0000000000E42D4D, parent = 0000000000E42EE1) (database.c:375:bdb_backend_load_index)
[database] debug: name host.fw (inode = 0000000000E42D6A, parent = 0000000000E42DF2) (database.c:375:bdb_backend_load_index)
[database] debug: name core-routing (inode = 0000000000E42DF2, parent = 000000000000000B) (database.c:375:bdb_backend_load_index)
[database] debug: name qemu-server (inode = 0000000000E42DF3, parent = 0000000000E42DF2) (database.c:375:bdb_backend_load_index)
[database] debug: name authorized_keys (inode = 0000000000E42DFD, parent = 0000000000000009) (database.c:375:bdb_backend_load_index)
[database] debug: name known_hosts (inode = 0000000000E42E00, parent = 0000000000000009) (database.c:375:bdb_backend_load_index)
[database] debug: name qemu-server (inode = 0000000000E42EE1, parent = 0000000000E42DF2) (database.c:375:bdb_backend_load_index)
[database] crit: found entry with duplicate name 'qemu-server' - A:(inode = 0x0000000000E42DF3, parent = 0x0000000000E42DF2, v./mtime = 0xE42DF3/0x1656594653) vs. B:(inode = 0x0000000000E42EE1, parent = 0x0000000000E42DF2, v./mtime = 0xE42EE1/0x1656594965) (database.c:430:bdb_backend_load_index)
[database] crit: DB load failed (database.c:470:bdb_backend_load_index)
[main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db' (pmxcfs.c:888:main)
[main] notice: exit proxmox configuration filesystem (-1) (pmxcfs.c:1099:main)
 
Oh, that explains it. You have a duplicate entry for qemu-server, which is not allowed in the Cluster Filesystem.
It most likely comes from this command: 254 mv qemu-server/* ../core-routing/qemu-server/
You moved the contents of qemu-server rather than the directory itself.

In theory you could manually edit the config.db file, similar to this thread: https://forum.proxmox.com/threads/pve-cluster-fails-to-start.82861/
But create a backup before trying anything on the config.db file!
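A rough sketch of how that could look; the tree table and its columns come from the pmxcfs database schema, and in your debug output the two candidate entries are inodes 0xE42DF3 and 0xE42EE1 (verify against your own output before deleting anything):

Code:
systemctl stop pve-cluster.service
cp /var/lib/pve-cluster/config.db /root/config.db.bak   # backup first!

# list the conflicting entries
sqlite3 /var/lib/pve-cluster/config.db \
  "SELECT inode, parent, name FROM tree WHERE name = 'qemu-server';"

# check which of the two directory entries actually has children
sqlite3 /var/lib/pve-cluster/config.db \
  "SELECT inode, name FROM tree WHERE parent = 0xE42DF3;"

# delete the one that turns out to be empty
sqlite3 /var/lib/pve-cluster/config.db \
  "DELETE FROM tree WHERE inode = <inode-of-empty-duplicate>;"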
 
How do I figure out which one is the duplicate?
 
The one in nodes/<current-node-name>/qemu-server should be the right one; the other, probably under the old node name, is the stale one.
If you have one at the root level, that is also a duplicate: when mounted, the root-level qemu-server is just a symlink to the node-specific directory (for the local node), not a real directory.
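To illustrate, once pmxcfs is mounted again, the root-level entry shows up as a symlink, roughly like this:

Code:
ls -ld /etc/pve/qemu-server
# ... /etc/pve/qemu-server -> nodes/core-routing/qemu-server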
 
Ok, almost there...

The service is now starting, but it looks like there may be an auth issue, as the GUI shows a "?" next to everything.
 
What's the status output of pve-cluster? systemctl status pve-cluster.service

Did you reboot the node after getting pmxcfs to start? Maybe some other services require a restart, e.g. pvedaemon, pveproxy and pvestatd.
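For example:

Code:
systemctl restart pvedaemon pveproxy pvestatd
systemctl status pve-cluster pvedaemon pveproxy pvestatd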
 
Yeah... that was it. It needed a reboot.

I thought about that after I posted...

All good now. Thanks for walking me through this; I now have a better understanding of how things work and can troubleshoot on my own in the future.
 
Well, I learned something new as well; until now I had never needed to modify the config.db file manually.
But Thomas explained it quite well in the other thread, I think.

I'm glad it worked out in the end! Usually something like this should not be possible just by copying and moving files around in pmxcfs.
 
