Accidentally deleted /etc/pve/nodes on the main node of a cluster: how to recover?

feradz

May 21, 2024
Hi all,

I have a cluster of 3 nodes: node1, node2, node3
I accidentally ran `rm -rf /etc/pve/nodes` on the primary node.

After that I cannot log in to node1 through the web console.

I can still SSH into all three nodes.

None of the CTs are visible right now.

I don't have a backup of /etc/pve/nodes.

How can I recover the cluster node?
 
@feradz Do you have anything useful in the /var/lib/pve-cluster/backup directory on any of your nodes?

For example, this is on one of my nodes and seems to be set up/created automatically:

Bash:
# ls -la /var/lib/pve-cluster/backup/
total 36
drwxr-xr-x 2 root root     3 May 16 13:07 .
drwxr-xr-x 3 root root     7 May 21 23:18 ..
-rw-r--r-- 1 root root 14473 May 16 13:07 config-1715864850.sql.gz

That file looks like it's a compressed dump of the SQLite commands needed to recreate the cluster configuration:

Bash:
# zmore /var/lib/pve-cluster/backup/config-1715864850.sql.gz
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE tree (  inode INTEGER PRIMARY KEY NOT NULL,  parent INTEGER NOT NULL CHECK(typeof(parent)=='integer'),  version INTEGER NOT NULL CHECK(typeof(version)=='integer'),  writer INTEGER NOT NULL CHECK(typeof(writer)=='integer'),  mtime INTEGER NOT NULL CHECK(typeof(mtime)=='integer'),  type INTEGER NOT NULL CHECK(typeof(type)=='integer'),  name TEXT NOT NULL,  data BLOB);
INSERT INTO tree VALUES(0,0,1392,0,1715864846,8,'__version__',NULL);
INSERT INTO tree VALUES(2,0,3,0,1715862102,8,'datacenter.cfg',X'6b6579626f6172643a20656e2d75730a');
(etc)

So if you have a file in the backup directory, there's a chance it might be from before you nuked the /etc/pve/nodes directory. If that's the case, then it shouldn't be tooooo hard to recover.
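
A quick way to see whether a given dump still knows about the deleted directory is to list the entry names from its INSERT statements and look for `nodes`. This is only a sketch: the helper name `dump_entry_names` is made up for illustration, and it assumes every entry is written as a single-line `INSERT INTO tree VALUES(...)` statement like the ones shown above, with no quote characters inside the names.

```shell
#!/bin/sh
# Hypothetical helper (not part of Proxmox): print the "name" column of every
# INSERT INTO tree statement found in a gzip-compressed SQL dump.
# Assumes the line layout shown above, where the name is the first
# single-quoted value on each INSERT line.
dump_entry_names() {
    # $1: path to a config-*.sql.gz dump
    gzip -dc -- "$1" \
        | sed -n "s/^INSERT INTO tree VALUES([^']*'\([^']*\)'.*/\1/p"
}

# Example (read-only, does not touch the live database):
#   dump_entry_names /var/lib/pve-cluster/backup/config-1715864850.sql.gz | grep -x nodes
```

If `grep -x nodes` prints a match, the dump was taken while the directory entry still existed. It says nothing about how current the guest configs under it are.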
 
Thanks for the feedback.
The cluster configuration was a total mess. I reinstalled Proxmox from scratch and manually reconfigured the LXCs. It turned out to be easy, but before I understood that, I had set up a test environment, broken it in different ways, and tried restoring it until I worked out how to recover the existing LXC subvolumes.

From this experience I have learnt that this SQLite-backed file system can create a big mess. I came across many users who ran into this problem after an update/upgrade of Proxmox.

I have also learnt that it is necessary to set up backups in order to recover VMs/LXCs easily.

I want to thank the professional support from Proxmox, who guided me on how to proceed.

Ironically, I created this mess accidentally while trying to add a backup server to make proper backups :)
 
Hi all, listen carefully.

If this happens to you too, DO NOT restart any node!

Follow these steps to recover your cluster (copy and paste):
  1. cp /var/lib/pve-cluster/config.db /var/lib/clusterconfig.db
  2. systemctl stop pve-cluster.service && systemctl stop corosync.service
  3. cp /var/lib/clusterconfig.db /var/lib/pve-cluster/config.db
  4. systemctl start pve-cluster.service && systemctl start corosync.service
This helped me, I hope it will help you too. :)
 
This looks wrong: first you copy out config.db (the SQLite database backing the /etc/pve filesystem) AFTER you have already deleted the files from it and while everything is still running, then you stop the services and ... copy it back, then start them again?

I can only imagine it worked because your delete had not yet been checkpointed from the write-ahead log into the main database file. Very hacky.

Actually, you should keep proper backups, e.g.:
https://forum.proxmox.com/threads/backup-cluster-config-pmxcfs-etc-pve.154569/
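
As a rough sketch of what a simple file-level backup could look like (this is not taken from the linked thread): copy config.db together with its write-ahead-log sidecar files into a timestamped directory. The function name `backup_config_db` and the destination path in the example are hypothetical, and the sidecar names `config.db-wal` and `config.db-shm` are an assumption based on SQLite's usual WAL naming. For a consistent copy, run it while pve-cluster is stopped.

```shell
#!/bin/sh
# Hypothetical helper: copy config.db and any SQLite WAL sidecar files from
# a source directory into a new timestamped directory under dest_dir.
# Assumption: the sidecar files are named config.db-wal and config.db-shm.
backup_config_db() {
    src_dir=$1
    dest_dir=$2
    stamp=$(date +%Y%m%d-%H%M%S)
    target="$dest_dir/config-db-$stamp"

    # Refuse to create an empty "backup" if the database is missing.
    [ -f "$src_dir/config.db" ] || { echo "no config.db in $src_dir" >&2; return 1; }

    mkdir -p -- "$target" || return 1
    for f in config.db config.db-wal config.db-shm; do
        if [ -e "$src_dir/$f" ]; then
            # -p keeps mode and timestamps of the original files.
            cp -p -- "$src_dir/$f" "$target/" || return 1
        fi
    done

    # Print where the copy ended up so callers can log or verify it.
    printf '%s\n' "$target"
}

# Example (destination path is made up):
#   backup_config_db /var/lib/pve-cluster /root/pmxcfs-backups
```

Keeping the sidecar files next to the database matters because changes that have not been checkpointed yet live only in the write-ahead log.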
 
@feradz Do you have anything useful in the /var/lib/pve-cluster/backup directory on any of your nodes?

The config-*.sql.gz file in that directory is normally created just before the node joins a cluster. Unfortunately, it is not an automated or regular backup.
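
Because such a file may be quite old, it is worth checking its age before relying on it. The number in the file name appears to be a Unix epoch timestamp (1715864850 lines up with the May 16 13:07 modification time in the listing earlier in the thread). A small sketch to decode it, where the helper name `backup_timestamp` is made up and GNU `date` (as shipped on Debian-based systems) is assumed:

```shell
#!/bin/sh
# Hypothetical helper: turn a backup file name of the form
# config-<epoch>.sql.gz into a human-readable UTC time.
# Assumes GNU date, which accepts "-d @<epoch>".
backup_timestamp() {
    name=${1##*/}            # strip any leading directory
    epoch=${name#config-}    # strip the "config-" prefix
    epoch=${epoch%.sql.gz}   # strip the ".sql.gz" suffix
    case $epoch in
        ''|*[!0-9]*)
            echo "unexpected file name: $1" >&2
            return 1
            ;;
    esac
    date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S UTC'
}

# Example:
#   backup_timestamp /var/lib/pve-cluster/backup/config-1715864850.sql.gz
```

For the file name from this thread, that comes out as 2024-05-16 13:07:30 UTC. Any guests created or changed after that point will not be in the dump.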
 
