pve-cluster fails to start

sysop777

New Member
Oct 20, 2023
After a data center change, I changed the IP address and hostname of a PVE server and removed it from its cluster. I evidently did something wrong, because the pve-cluster service now fails to start and /etc/pve is not mounted.

I'd really appreciate some help. It is very important that I don't lose anything on this hypervisor, and the rest of our network infrastructure depends on it. Thank you so much.

The symptoms are very similar to https://forum.proxmox.com/threads/pve-cluster-fails-to-start.82861, except that I'm unsure what to do next.

Code:
systemctl stop pve-cluster corosync
pmxcfs -l
rm -rf /etc/corosync/*
rm -rf /etc/pve/corosync.conf
killall pmxcfs
systemctl start pve-cluster
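
Since nothing on this node may be lost, it seems sensible to copy the pmxcfs database aside before trying anything else. The following is only a minimal sketch: `backup_config_db` is a hypothetical helper name, and it assumes the database lives in the default location /var/lib/pve-cluster and that /root has room for the copy.

```shell
#!/bin/sh
# Hypothetical helper: copy the pmxcfs database files to a timestamped directory.
# Assumes the default database location /var/lib/pve-cluster.
backup_config_db() {
    src_dir=${1:-/var/lib/pve-cluster}
    dest_dir=${2:-/root/pmxcfs-backup-$(date +%Y%m%d-%H%M%S)}
    mkdir -p "$dest_dir" || return 1
    # The glob also picks up config.db-wal and config.db-shm if they exist.
    cp -a "$src_dir"/config.db* "$dest_dir"/ || return 1
    # Print where the copy went so it can be found again later.
    echo "$dest_dir"
}

# Usage on the node (stop the services first so the copy is consistent):
#   systemctl stop pve-cluster corosync
#   backup_config_db
```

Any later experiment on config.db can then be undone by copying the files back while pve-cluster is stopped.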

(All Reverted)
/etc/hosts
Code:
127.0.0.1 localhost.localdomain localhost
23.160.200.66 colo-fl05.myinfra.lol colo-fl05

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

/etc/hostname
Code:
colo-fl05

---

Statuses

PVE Cluster
Code:
systemctl status pve-cluster -n 30
● pve-cluster.service - The Proxmox VE cluster filesystem
     Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Thu 2023-10-19 19:27:16 MDT; 1min 38s ago
    Process: 3178 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)
        CPU: 6ms

Oct 19 19:27:16 colo-fl05 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
Oct 19 19:27:16 colo-fl05 systemd[1]: Stopped The Proxmox VE cluster filesystem.
Oct 19 19:27:16 colo-fl05 systemd[1]: pve-cluster.service: Start request repeated too quickly.
Oct 19 19:27:16 colo-fl05 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Oct 19 19:27:16 colo-fl05 systemd[1]: Failed to start The Proxmox VE cluster filesystem.
root@colo-fl05:~#

PVE Proxy
Code:
root@colo-fl05:~# systemctl status pveproxy
● pveproxy.service - PVE API Proxy Server
     Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2023-10-19 19:27:17 MDT; 2min 10s ago
    Process: 3179 ExecStartPre=/usr/bin/pvecm updatecerts --silent (code=exited, status=111)
    Process: 3181 ExecStart=/usr/bin/pveproxy start (code=exited, status=0/SUCCESS)
   Main PID: 3182 (pveproxy)
      Tasks: 3 (limit: 76609)
     Memory: 176.7M
        CPU: 3.753s
     CGroup: /system.slice/pveproxy.service
             ├─3182 pveproxy
             ├─3271 pveproxy worker
             └─3272 pveproxy worker

Oct 19 19:29:23 colo-fl05 pveproxy[3182]: starting 1 worker(s)
Oct 19 19:29:23 colo-fl05 pveproxy[3182]: worker 3271 started
Oct 19 19:29:23 colo-fl05 pveproxy[3271]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key>
Oct 19 19:29:23 colo-fl05 pveproxy[3269]: worker exit
Oct 19 19:29:23 colo-fl05 pveproxy[3270]: worker exit
Oct 19 19:29:23 colo-fl05 pveproxy[3182]: worker 3269 finished
Oct 19 19:29:23 colo-fl05 pveproxy[3182]: starting 1 worker(s)
Oct 19 19:29:23 colo-fl05 pveproxy[3182]: worker 3270 finished
Oct 19 19:29:23 colo-fl05 pveproxy[3182]: worker 3272 started
Oct 19 19:29:23 colo-fl05 pveproxy[3272]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key>

The database exists and is openable.

and
Code:
root@colo-fl05:~# pvecm --help
ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused
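
As far as I understand, these "Connection refused" messages just mean that pvecm cannot reach pmxcfs, which fits with pve-cluster being down and /etc/pve not being mounted. A minimal sketch to confirm that state, assuming `pgrep` and `mountpoint` are installed (`pmxcfs_state` is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: report whether pmxcfs is running and /etc/pve is mounted.
pmxcfs_state() {
    mount_point=${1:-/etc/pve}
    # pgrep -x matches the process name exactly.
    if pgrep -x pmxcfs >/dev/null 2>&1; then
        echo "pmxcfs: running"
    else
        echo "pmxcfs: not running"
    fi
    # mountpoint -q exits 0 only if the path is an active mount point.
    if mountpoint -q "$mount_point" 2>/dev/null; then
        echo "$mount_point: mounted"
    else
        echo "$mount_point: not mounted"
    fi
}

pmxcfs_state
```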


---

Code:
root@colo-fl05:~# sqlite3 /var/lib/pve-cluster/config.db 'PRAGMA integrity_check'
sqlite3 /var/lib/pve-cluster/config.db .schema
sqlite3 /var/lib/pve-cluster/config.db 'SELECT inode,mtime,name FROM tree WHERE parent = 0'
sqlite3 /var/lib/pve-cluster/config.db 'SELECT inode,mtime,name FROM tree WHERE parent = 347747 or inode = 347747'

ok

CREATE TABLE tree (  inode INTEGER PRIMARY KEY NOT NULL,  parent INTEGER NOT NULL CHECK(typeof(parent)=='integer'),  version INTEGER NOT NULL CHECK(typeof(version)=='integer'),  writer INTEGER NOT NULL CHECK(typeof(writer)=='integer'),  mtime INTEGER NOT NULL CHECK(typeof(mtime)=='integer'),  type INTEGER NOT NULL CHECK(typeof(type)=='integer'),  name TEXT NOT NULL,  data BLOB);

0|1697762269|__version__
8|1645837046|virtual-guest
9|1645837048|priv
11|1645837048|nodes
24|1645837048|pve-www.key
30|1645837049|pve-root-ca.pem
49|1645837049|ha
51|1645837049|sdn
172516|1646032555|ceph.conf
12255393|1659058527|firewall
35733410|1672569429|colo-fl04
55270320|1677312746|domains.cfg
95589350|1693565371|storage.cfg
96172202|1694003514|datacenter.cfg
98870271|1696038836|user.cfg
98870276|1696038836|replication.cfg
98870280|1696038836|jobs.cfg
98870285|1696038836|vzdump.cron
98949744|1697759917|authkey.pub.old

---

Code:
[database] crit: missing directory inode (inode = 0000000005E5DA88) (database.c:453:bdb_backend_load_index)
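
The inode in this message is printed in hexadecimal, while the `tree` table stores inodes as decimal integers, so 0x5E5DA88 has to be converted (it is 98949768) before it can be used in a query. Below is a minimal sketch that does the conversion and lists any rows whose parent inode has no row of its own. `find_orphans` is a hypothetical helper name, the database path is the default one used above, and the query only reads from the database. It is a diagnostic, not a fix.

```shell
#!/bin/sh
# Hex inode copied from the pmxcfs log line.
missing_hex=0000000005E5DA88
# printf accepts a 0x-prefixed value and prints it in decimal.
missing_dec=$(printf '%d' "0x${missing_hex}")
echo "missing inode (decimal): ${missing_dec}"

# Hypothetical helper: list rows whose parent inode has no row of its own.
# Top-level entries use parent = 0 and are skipped.
find_orphans() {
    db=${1:-/var/lib/pve-cluster/config.db}
    sqlite3 -readonly "$db" \
        "SELECT inode, parent, name FROM tree
         WHERE parent <> 0 AND parent NOT IN (SELECT inode FROM tree);"
}

# Only query when the sqlite3 CLI and the database are both available.
if command -v sqlite3 >/dev/null 2>&1 && [ -r /var/lib/pve-cluster/config.db ]; then
    find_orphans
fi
```

If the orphaned rows all point at parent 98949768, that would suggest a directory entry was lost while its children remained, but that is an assumption to verify against a copy of the database, not on the live file.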