Failed to start pve-cluster.service / unable to access Web-GUI

gh0st125

New Member
Oct 27, 2023
Hi,

after searching online for hours for a solution I finally decided to post here, hoping someone could guide me.
I am a fairly new Proxmox user; I installed PVE on a mini PC and deployed some VMs / LXCs. Everything was working fine until I decided to change the hostname of my Proxmox server... (dumb, I know).
I changed the hostname in /etc/hosts as well as in /etc/hostname (I think I also used hostnamectl) and rebooted the host, but afterwards I was unable to connect via the web interface (timeout).
I can still access the host via SSH.
After some troubleshooting, I changed the hostname back, hoping it would fix the error, but to no avail.

Can anyone please point me to where I should look? I think it might have something to do with the pve-cluster service, but my knowledge of PVE is too limited...

Here are some logs / info. If you need more logs / commands, please let me know!
Thank you very much in advance for any tips!

Code:
root@P-SRV01:~# pveversion
pve-manager/8.0.4/d258a813cfa6b390 (running kernel: 6.2.16-19-pve)

Code:
root@P-SRV01:~# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
10.0.10.11 P-SRV01.lan.mydomain.de P-SRV01

# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

Code:
root@P-SRV01:~# cat /etc/hostname
P-SRV01

Code:
root@P-SRV01:~# systemctl status pve-cluster.service
× pve-cluster.service - The Proxmox VE cluster filesystem
     Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Wed 2023-11-08 15:26:15 CET; 3min 17s ago
    Process: 1295 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)
        CPU: 8ms

Nov 08 15:26:15 P-SRV01 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
Nov 08 15:26:15 P-SRV01 systemd[1]: Stopped pve-cluster.service - The Proxmox VE cluster filesystem.
Nov 08 15:26:15 P-SRV01 systemd[1]: pve-cluster.service: Start request repeated too quickly.
Nov 08 15:26:15 P-SRV01 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Nov 08 15:26:15 P-SRV01 systemd[1]: Failed to start pve-cluster.service - The Proxmox VE cluster filesystem.

Code:
root@P-SRV01:~# systemctl status pveproxy.service
● pveproxy.service - PVE API Proxy Server
     Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; preset: enabled)
     Active: active (running) since Wed 2023-11-08 15:23:47 CET; 23min ago
   Main PID: 1039 (pveproxy)
      Tasks: 4 (limit: 18793)
     Memory: 140.1M
        CPU: 31.590s
     CGroup: /system.slice/pveproxy.service
             ├─1039 pveproxy
             ├─2276 "pveproxy worker"
             ├─2277 "pveproxy worker"
             └─2278 "pveproxy worker"

Nov 08 15:46:52 P-SRV01 pveproxy[2276]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEven>
Nov 08 15:46:53 P-SRV01 pveproxy[2275]: worker exit
Nov 08 15:46:53 P-SRV01 pveproxy[2274]: worker exit
Nov 08 15:46:53 P-SRV01 pveproxy[1039]: worker 2274 finished
Nov 08 15:46:53 P-SRV01 pveproxy[1039]: worker 2275 finished
Nov 08 15:46:53 P-SRV01 pveproxy[1039]: starting 2 worker(s)
Nov 08 15:46:53 P-SRV01 pveproxy[1039]: worker 2277 started
Nov 08 15:46:53 P-SRV01 pveproxy[1039]: worker 2278 started
Nov 08 15:46:53 P-SRV01 pveproxy[2277]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEven>
Nov 08 15:46:53 P-SRV01 pveproxy[2278]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEven>

Code:
root@P-SRV01:~# qm list
ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused
 
Hi,
without the pve-cluster service almost nothing else will work, so we'll need to take a look at that first.
Could you post the output of journalctl -eu pve-cluster?
 
thanks for your reply!
Here we go:

Code:
root@P-SRV01:~# journalctl -eu pve-cluster
Nov 08 15:55:38 P-SRV01 pmxcfs[821]: [main] notice: exit proxmox configuration filesystem (-1)
Nov 08 15:55:38 P-SRV01 systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
Nov 08 15:55:38 P-SRV01 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Nov 08 15:55:38 P-SRV01 systemd[1]: Failed to start pve-cluster.service - The Proxmox VE cluster filesystem.
Nov 08 15:55:39 P-SRV01 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 2.
Nov 08 15:55:39 P-SRV01 systemd[1]: Stopped pve-cluster.service - The Proxmox VE cluster filesystem.
Nov 08 15:55:39 P-SRV01 systemd[1]: Starting pve-cluster.service - The Proxmox VE cluster filesystem...
Nov 08 15:55:39 P-SRV01 pmxcfs[879]: [main] notice: resolved node name 'P-SRV01' to '10.0.10.11' for default node IP address
Nov 08 15:55:39 P-SRV01 pmxcfs[879]: [main] notice: resolved node name 'P-SRV01' to '10.0.10.11' for default node IP address
Nov 08 15:55:39 P-SRV01 pmxcfs[879]: [database] crit: missing directory inode (inode = 00000000001501A5)
Nov 08 15:55:39 P-SRV01 pmxcfs[879]: [database] crit: DB load failed
Nov 08 15:55:39 P-SRV01 pmxcfs[879]: [database] crit: missing directory inode (inode = 00000000001501A5)
Nov 08 15:55:39 P-SRV01 pmxcfs[879]: [database] crit: DB load failed
Nov 08 15:55:39 P-SRV01 pmxcfs[879]: [main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db'
Nov 08 15:55:39 P-SRV01 pmxcfs[879]: [main] notice: exit proxmox configuration filesystem (-1)
Nov 08 15:55:39 P-SRV01 pmxcfs[879]: [main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db'
Nov 08 15:55:39 P-SRV01 pmxcfs[879]: [main] notice: exit proxmox configuration filesystem (-1)
Nov 08 15:55:39 P-SRV01 systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
Nov 08 15:55:39 P-SRV01 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Nov 08 15:55:39 P-SRV01 systemd[1]: Failed to start pve-cluster.service - The Proxmox VE cluster filesystem.
Nov 08 15:55:39 P-SRV01 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 3.
Nov 08 15:55:39 P-SRV01 systemd[1]: Stopped pve-cluster.service - The Proxmox VE cluster filesystem.
Nov 08 15:55:39 P-SRV01 systemd[1]: Starting pve-cluster.service - The Proxmox VE cluster filesystem...
Nov 08 15:55:39 P-SRV01 pmxcfs[891]: [main] notice: resolved node name 'P-SRV01' to '10.0.10.11' for default node IP address
Nov 08 15:55:39 P-SRV01 pmxcfs[891]: [main] notice: resolved node name 'P-SRV01' to '10.0.10.11' for default node IP address
Nov 08 15:55:39 P-SRV01 pmxcfs[891]: [database] crit: missing directory inode (inode = 00000000001501A5)
Nov 08 15:55:39 P-SRV01 pmxcfs[891]: [database] crit: missing directory inode (inode = 00000000001501A5)
Nov 08 15:55:39 P-SRV01 pmxcfs[891]: [database] crit: DB load failed
Nov 08 15:55:39 P-SRV01 pmxcfs[891]: [database] crit: DB load failed
Nov 08 15:55:39 P-SRV01 pmxcfs[891]: [main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db'
Nov 08 15:55:39 P-SRV01 pmxcfs[891]: [main] notice: exit proxmox configuration filesystem (-1)
Nov 08 15:55:39 P-SRV01 pmxcfs[891]: [main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db'
Nov 08 15:55:39 P-SRV01 pmxcfs[891]: [main] notice: exit proxmox configuration filesystem (-1)
Nov 08 15:55:39 P-SRV01 systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
Nov 08 15:55:39 P-SRV01 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Nov 08 15:55:39 P-SRV01 systemd[1]: Failed to start pve-cluster.service - The Proxmox VE cluster filesystem.
Nov 08 15:55:39 P-SRV01 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 4.
Nov 08 15:55:39 P-SRV01 systemd[1]: Stopped pve-cluster.service - The Proxmox VE cluster filesystem.
Nov 08 15:55:39 P-SRV01 systemd[1]: Starting pve-cluster.service - The Proxmox VE cluster filesystem...
Nov 08 15:55:39 P-SRV01 pmxcfs[979]: [main] notice: resolved node name 'P-SRV01' to '10.0.10.11' for default node IP address
Nov 08 15:55:39 P-SRV01 pmxcfs[979]: [main] notice: resolved node name 'P-SRV01' to '10.0.10.11' for default node IP address
Nov 08 15:55:39 P-SRV01 pmxcfs[979]: [database] crit: missing directory inode (inode = 00000000001501A5)
Nov 08 15:55:39 P-SRV01 pmxcfs[979]: [database] crit: DB load failed
Nov 08 15:55:39 P-SRV01 pmxcfs[979]: [database] crit: missing directory inode (inode = 00000000001501A5)
Nov 08 15:55:39 P-SRV01 pmxcfs[979]: [database] crit: DB load failed
Nov 08 15:55:39 P-SRV01 pmxcfs[979]: [main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db'
Nov 08 15:55:39 P-SRV01 pmxcfs[979]: [main] notice: exit proxmox configuration filesystem (-1)
Nov 08 15:55:39 P-SRV01 pmxcfs[979]: [main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db'
Nov 08 15:55:39 P-SRV01 pmxcfs[979]: [main] notice: exit proxmox configuration filesystem (-1)
Nov 08 15:55:39 P-SRV01 systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
Nov 08 15:55:39 P-SRV01 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Nov 08 15:55:39 P-SRV01 systemd[1]: Failed to start pve-cluster.service - The Proxmox VE cluster filesystem.
Nov 08 15:55:39 P-SRV01 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
Nov 08 15:55:39 P-SRV01 systemd[1]: Stopped pve-cluster.service - The Proxmox VE cluster filesystem.
Nov 08 15:55:39 P-SRV01 systemd[1]: pve-cluster.service: Start request repeated too quickly.
Nov 08 15:55:39 P-SRV01 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Nov 08 15:55:39 P-SRV01 systemd[1]: Failed to start pve-cluster.service - The Proxmox VE cluster filesystem.

The config.db file is present in /var/lib/pve-cluster/.
 
It looks like config.db has been corrupted somehow. For context, the content of /etc/pve/ is actually a virtual filesystem that is built from the config.db file, which fails to load in your case. The 'missing directory' is probably an inconsistency in the file tree that is saved in that file.
Could you run pmxcfs -d -f and post the output here?
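
As a side note: config.db is a plain SQLite file, so you can also look at its contents from the shell. A minimal sketch, assuming the single tree table (inode, parent, type, name, data) that pmxcfs uses; stop the service and work on a copy rather than the live file (install sqlite3 if it's missing):

Code:
systemctl stop pve-cluster
cp /var/lib/pve-cluster/config.db /root/config.db.inspect   # work on a copy, path is just an example
sqlite3 /root/config.db.inspect \
  "SELECT printf('%016X', inode), printf('%016X', parent), type, name FROM tree ORDER BY inode;"
# the hex values can then be compared against the inode numbers in the pmxcfs log messages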
 
I must say, I modified the config.db file because I saw the old hostname appearing in it, so I deleted that row:
[Screenshot of the config.db row that was deleted]
(P-SRV01 was the old name; I wanted to change it to SRV01, but then reverted back to P-SRV01.)



Code:
root@P-SRV01:~# pmxcfs -d -f
[main] notice: resolved node name 'P-SRV01' to '10.0.10.11' for default node IP address (pmxcfs.c:859:main)
[database] debug: name __version__ (inode = 0000000000000000, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name virtual-guest (inode = 0000000000000008, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name priv (inode = 0000000000000009, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name nodes (inode = 000000000000000B, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name P-SRV01 (inode = 000000000000000C, parent = 000000000000000B) (database.c:370:bdb_backend_load_index)
[database] debug: name lxc (inode = 000000000000000D, parent = 000000000000000C) (database.c:370:bdb_backend_load_index)
[database] debug: name qemu-server (inode = 000000000000000E, parent = 000000000000000C) (database.c:370:bdb_backend_load_index)
[database] debug: name openvz (inode = 000000000000000F, parent = 000000000000000C) (database.c:370:bdb_backend_load_index)
[database] debug: name priv (inode = 0000000000000010, parent = 000000000000000C) (database.c:370:bdb_backend_load_index)
[database] debug: name lock (inode = 0000000000000011, parent = 0000000000000009) (database.c:370:bdb_backend_load_index)
[database] debug: name pve-www.key (inode = 0000000000000018, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name pve-ssl.key (inode = 000000000000001A, parent = 000000000000000C) (database.c:370:bdb_backend_load_index)
[database] debug: name pve-root-ca.key (inode = 000000000000001C, parent = 0000000000000009) (database.c:370:bdb_backend_load_index)
[database] debug: name pve-root-ca.pem (inode = 000000000000001E, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name pve-root-ca.srl (inode = 0000000000000020, parent = 0000000000000009) (database.c:370:bdb_backend_load_index)
[database] debug: name pve-ssl.pem (inode = 0000000000000023, parent = 000000000000000C) (database.c:370:bdb_backend_load_index)
[database] debug: name firewall (inode = 0000000000000031, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name ha (inode = 0000000000000032, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name mapping (inode = 0000000000000033, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name acme (inode = 0000000000000034, parent = 0000000000000009) (database.c:370:bdb_backend_load_index)
[database] debug: name sdn (inode = 0000000000000035, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name shadow.cfg (inode = 000000000001AC38, parent = 0000000000000009) (database.c:370:bdb_backend_load_index)
[database] debug: name user.cfg (inode = 000000000001AC3C, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name storage.cfg (inode = 000000000001C30F, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name datacenter.cfg (inode = 0000000000039554, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name replication.cfg (inode = 00000000000D3211, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name jobs.cfg (inode = 00000000000D3215, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name vzdump.cron (inode = 00000000000D321A, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name authorized_keys (inode = 00000000000F1EB8, parent = 0000000000000009) (database.c:370:bdb_backend_load_index)
[database] debug: name known_hosts (inode = 00000000000F1EBB, parent = 0000000000000009) (database.c:370:bdb_backend_load_index)
[database] debug: name authkey.pub.old (inode = 000000000014F8D3, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name authkey.pub (inode = 000000000014F8D6, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name authkey.key (inode = 000000000014F8D9, parent = 0000000000000009) (database.c:370:bdb_backend_load_index)
[database] debug: name 100.conf (inode = 0000000000150140, parent = 000000000000000E) (database.c:370:bdb_backend_load_index)
[database] debug: name lxc (inode = 00000000001501A7, parent = 00000000001501A5) (database.c:370:bdb_backend_load_index)
[database] debug: name 101.conf (inode = 00000000001502B7, parent = 000000000000000D) (database.c:370:bdb_backend_load_index)
[database] debug: name qemu-server (inode = 000000000015045D, parent = 00000000001501A5) (database.c:370:bdb_backend_load_index)
[database] debug: name lrm_status (inode = 0000000000150497, parent = 000000000000000C) (database.c:370:bdb_backend_load_index)
[database] crit: missing directory inode (inode = 00000000001501A5) (database.c:453:bdb_backend_load_index)
[database] crit: DB load failed (database.c:465:bdb_backend_load_index)
[main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db' (pmxcfs.c:898:main)
[main] notice: exit proxmox configuration filesystem (-1) (pmxcfs.c:1109:main)
 
Then you have to restore the entry, or recursively find all references to it and delete them.
Restoring is probably the better option; there is less chance of destroying more.
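
If you still have an untouched copy of config.db from before the edit, restoring it is just a file copy while the service is stopped. A rough sketch (the backup path is an assumption, adjust it to wherever you saved your copy):

Code:
systemctl stop pve-cluster
cp /var/lib/pve-cluster/config.db /root/config.db.broken    # keep the broken one, just in case
cp /root/config.db.backup /var/lib/pve-cluster/config.db    # your own backup copy (assumed path)
systemctl start pve-cluster
systemctl status pve-cluster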
 
Next time, it's better to edit/remove the files in /etc/pve; that way pmxcfs has a chance to keep the file tree healthy.
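
For reference, on a standalone node the usual way to rename it is to update /etc/hostname and /etc/hosts, reboot, and then move the guest configs from the old node directory in /etc/pve to the new one that pmxcfs creates. A sketch only, and only with no guests running:

Code:
# after changing /etc/hostname and /etc/hosts and rebooting with the new name
mv /etc/pve/nodes/OLDNAME/qemu-server/*.conf /etc/pve/nodes/NEWNAME/qemu-server/
mv /etc/pve/nodes/OLDNAME/lxc/*.conf /etc/pve/nodes/NEWNAME/lxc/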
 
Alright, fortunately I backed up the config.db file before modifying it, so here's the output of the command with the restored DB:

Code:
root@P-SRV01:~# pmxcfs -d -f
[main] notice: resolved node name 'P-SRV01' to '10.0.10.11' for default node IP address (pmxcfs.c:859:main)
[database] debug: name __version__ (inode = 0000000000000000, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name virtual-guest (inode = 0000000000000008, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name priv (inode = 0000000000000009, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name nodes (inode = 000000000000000B, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name P-SRV01 (inode = 000000000000000C, parent = 000000000000000B) (database.c:370:bdb_backend_load_index)
[database] debug: name lxc (inode = 000000000000000D, parent = 000000000000000C) (database.c:370:bdb_backend_load_index)
[database] debug: name qemu-server (inode = 000000000000000E, parent = 000000000000000C) (database.c:370:bdb_backend_load_index)
[database] debug: name openvz (inode = 000000000000000F, parent = 000000000000000C) (database.c:370:bdb_backend_load_index)
[database] debug: name priv (inode = 0000000000000010, parent = 000000000000000C) (database.c:370:bdb_backend_load_index)
[database] debug: name lock (inode = 0000000000000011, parent = 0000000000000009) (database.c:370:bdb_backend_load_index)
[database] debug: name pve-www.key (inode = 0000000000000018, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name pve-ssl.key (inode = 000000000000001A, parent = 000000000000000C) (database.c:370:bdb_backend_load_index)
[database] debug: name pve-root-ca.key (inode = 000000000000001C, parent = 0000000000000009) (database.c:370:bdb_backend_load_index)
[database] debug: name pve-root-ca.pem (inode = 000000000000001E, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name pve-root-ca.srl (inode = 0000000000000020, parent = 0000000000000009) (database.c:370:bdb_backend_load_index)
[database] debug: name pve-ssl.pem (inode = 0000000000000023, parent = 000000000000000C) (database.c:370:bdb_backend_load_index)
[database] debug: name firewall (inode = 0000000000000031, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name ha (inode = 0000000000000032, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name mapping (inode = 0000000000000033, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name acme (inode = 0000000000000034, parent = 0000000000000009) (database.c:370:bdb_backend_load_index)
[database] debug: name sdn (inode = 0000000000000035, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name shadow.cfg (inode = 000000000001AC38, parent = 0000000000000009) (database.c:370:bdb_backend_load_index)
[database] debug: name user.cfg (inode = 000000000001AC3C, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name storage.cfg (inode = 000000000001C30F, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name datacenter.cfg (inode = 0000000000039554, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name replication.cfg (inode = 00000000000D3211, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name jobs.cfg (inode = 00000000000D3215, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name vzdump.cron (inode = 00000000000D321A, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name authorized_keys (inode = 00000000000F1EB8, parent = 0000000000000009) (database.c:370:bdb_backend_load_index)
[database] debug: name known_hosts (inode = 00000000000F1EBB, parent = 0000000000000009) (database.c:370:bdb_backend_load_index)
[database] debug: name authkey.pub.old (inode = 000000000014F8D3, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name authkey.pub (inode = 000000000014F8D6, parent = 0000000000000000) (database.c:370:bdb_backend_load_index)
[database] debug: name authkey.key (inode = 000000000014F8D9, parent = 0000000000000009) (database.c:370:bdb_backend_load_index)
[database] debug: name 100.conf (inode = 0000000000150140, parent = 000000000000000E) (database.c:370:bdb_backend_load_index)
[database] debug: name SRV01 (inode = 00000000001501A5, parent = 000000000000000B) (database.c:370:bdb_backend_load_index)
[database] debug: name lxc (inode = 00000000001501A7, parent = 00000000001501A5) (database.c:370:bdb_backend_load_index)
[database] debug: name 101.conf (inode = 00000000001502B7, parent = 000000000000000D) (database.c:370:bdb_backend_load_index)
[database] debug: name qemu-server (inode = 000000000015045D, parent = 00000000001501A5) (database.c:370:bdb_backend_load_index)
[database] debug: name lrm_status (inode = 0000000000150497, parent = 000000000000000C) (database.c:370:bdb_backend_load_index)
[main] crit: vmlist_add_dir: assertion 'subdir->type == DT_DIR' failed (:0:)
[main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db' (pmxcfs.c:898:main)
[main] notice: exit proxmox configuration filesystem (-1) (pmxcfs.c:1109:main)
 
I guess it's best to reinstall and start over at this point. For the most part I do have backups; however, just to make sure: where are my VMs and LXC containers stored, in case I want to restore them manually? As far as I know, /etc/pve/ is where the environment gets mounted, but because pve-cluster.service can't start, it doesn't get mounted. So the data must be stored somewhere else, right?
 
Unfortunately no, the filesystem tree and its data are contained in the config.db. The only other place would be other cluster members.
You could get the configs from the config.db; in your case they are named 100.conf and 101.conf. If you used LVM to set up your Proxmox, which should be the default AFAIK, you can mount the VM's partition at /etc/pve/vm-x-disk-y and back up the files you need.
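
If you only need those two config files, you could also pull them straight out of a copy of config.db, assuming the tree table layout mentioned earlier (the CAST just forces the blob out as text; the .inspect path is the example copy from above):

Code:
sqlite3 /root/config.db.inspect \
  "SELECT CAST(data AS TEXT) FROM tree WHERE name = '100.conf';" > /root/100.conf
sqlite3 /root/config.db.inspect \
  "SELECT CAST(data AS TEXT) FROM tree WHERE name = '101.conf';" > /root/101.conf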
 
I don't know if what I did was correct, but trying to mount /dev/pve/vm-100-disk-y to the (manually created) /etc/pve/vm-100-disk-y throws an error:

Code:
root@P-SRV01:/etc/pve# mount /dev/pve/vm-100-disk-1 /etc/pve/vm-100-disk-1
mount: /etc/pve/vm-100-disk-1: wrong fs type, bad option, bad superblock on /dev/mapper/pve-vm--100--disk--1, missing codepage or helper program, or other error.
       dmesg(1) may have more information after failed mount system call.

Code:
root@P-SRV01:/etc/pve# mount /dev/pve/vm-100-disk-0 /etc/pve/vm-100-disk-0
mount: /etc/pve/vm-100-disk-0: wrong fs type, bad option, bad superblock on /dev/mapper/pve-vm--100--disk--0, missing codepage or helper program, or other error.
       dmesg(1) may have more information after failed mount system call.

Mounting vm-101-disk-0 (NPM LXC container), however, worked.
 
Ah right, sorry, the VMs are full disk images and have to be handled separately. In that case you can use losetup: first use losetup -f to check which device is free, and then use that in losetup -P /dev/loopN /dev/pve/vm-X-disk-Y. After that you should have something like /dev/loop0p1 that you can mount.
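
Putting it together, a sketch of the whole sequence (device names and the disk you pick will differ on your system; mounting read-only avoids accidental changes):

Code:
LOOPDEV=$(losetup -f)                          # find a free loop device, e.g. /dev/loop0
losetup -P "$LOOPDEV" /dev/pve/vm-100-disk-1   # -P scans the partition table inside the image
lsblk "$LOOPDEV"                               # shows the partitions that appeared (loop0p1, ...)
mkdir -p /mnt/vm100
mount -o ro "${LOOPDEV}p1" /mnt/vm100          # mount one of the partitions read-only
# ... copy out what you need ...
umount /mnt/vm100
losetup -d "$LOOPDEV"                          # detach the loop device when finished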
 
thank you for your time.

I used losetup -P:
Code:
root@P-SRV01:/etc/pve# losetup -P /dev/loop0 /dev/pve/vm-100-disk-0

After that you should have something like /dev/loop0p1 that you can mount

How do I find out what the name is? When I try to tab-complete the command mount /dev/loop, it only gives me the following options:

Code:
root@P-SRV01:/etc/pve# mount /dev/loop /etc/pve/vm-100-disk-0
loop0         loop1         loop2         loop3         loop4         loop5         loop6         loop7         loop-control

loop0 didn't work:
Code:
root@P-SRV01:/etc/pve# mount /dev/loop0 /etc/pve/vm-100-disk-0
mount: /etc/pve/vm-100-disk-0: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error.
       dmesg(1) may have more information after failed mount system call.

losetup -a tells me loop0 is backed by /dev/dm-6 (?):
Code:
root@P-SRV01:/etc/pve# losetup -a
/dev/loop0: [0005]:416 (/dev/dm-6)

However, I get the same output if I try to mount /dev/dm-6:
Code:
root@P-SRV01:/etc/pve# mount /dev/dm-6 /etc/pve/vm-100-disk-0
mount: /etc/pve/vm-100-disk-0: wrong fs type, bad option, bad superblock on /dev/mapper/pve-vm--100--disk--0, missing codepage or helper program, or other error.
       dmesg(1) may have more information after failed mount system call.
 
Do you by any chance remember what was on that disk or if it actually was used?
Does the same work for vm-100-disk-1?
 
vm-100 was used for a Home Assistant install.
I ran the commands for vm-100-disk-1 now. This time, it created /dev/loop1p1 to /dev/loop1p8.
Mounting works for most of them, and on /dev/loop1p8 I found the data files of the VM, which I am currently saving to my PC.

After this, I will reinstall Proxmox on the host; I guess it's best to start from scratch. Can I simply boot from the Proxmox ISO on a USB stick and let the installer format everything, or do I need to format the drives manually?
 
The installer should allow you to pave right over it.
 
OK, so I reinstalled the host, set up the VMs / LXC containers, and restored the backups. Everything is working again now.

Thanks again for your time and help!
 
