Earlier this evening I tried renaming 2 standalone hosts.
One rename went as expected, the other did not: afterwards the folder /etc/pve was empty, and I am not sure whether any other damage was done.
I compared the 2 hosts and was able to re-create some bits and pieces:
- Recreated the folder structure and symbolic links
- Restored the folder ./nodes with the VM-configs from a backup
- Recreated the storage.cfg
- Recreated the firewall config
The content of the folder is now:
Code:
root@vigilant:/etc/pve# ls -l
total 32
-rw-r--r-- 1 root root 44 Feb 22 22:51 datacenter.cfg
drwxr-xr-x 2 root root 4096 Feb 22 23:30 firewall
lrwxrwxrwx 1 root root 14 Feb 22 22:54 local -> nodes/vigilant
lrwxrwxrwx 1 root root 18 Feb 22 22:57 lxc -> nodes/vigilant/lxc
drwxr-xr-x 3 root root 4096 Feb 22 22:38 nodes
lrwxrwxrwx 1 root root 21 Feb 22 22:58 openvz -> nodes/vigilant/openvz
drwxr-xr-x 2 root root 4096 Feb 22 22:52 priv
lrwxrwxrwx 1 root root 26 Feb 22 22:59 qemu-server -> nodes/vigilant/qemu-server
drwxr-xr-x 2 root root 4096 Feb 22 23:32 sdn
-rw-r--r-- 1 root root 762 Feb 22 22:35 storage.cfg
-rw-r--r-- 1 root root 107 Feb 22 22:31 user.cfg
drwxr-xr-x 2 root root 4096 Feb 22 23:14 virtual-guest
The ./nodes folder is now:
Code:
root@vigilant:/etc/pve# ls -l nodes/vigilant
total 32
-rw-r----- 1 root root 34 Feb 22 22:40 host.fw
-rw-r----- 1 root root 83 Feb 22 22:40 lrm_status
drwxr-xr-x 2 root root 4096 Feb 22 22:40 lxc
drwxr-xr-x 2 root root 4096 Feb 22 22:40 openvz
drwx------ 2 root root 4096 Feb 22 22:40 priv
-rw-r----- 1 root root 1679 Feb 22 22:40 pve-ssl.key
-rw-r----- 1 root root 1692 Feb 22 22:40 pve-ssl.pem
drwxr-xr-x 2 root root 4096 Feb 22 22:40 qemu-server
root@vigilant:/etc/pve# ls -l ./nodes
total 4
drwxr-xr-x 6 root root 4096 Feb 22 22:40 vigilant
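For reference, the link layout from the listings above can be reproduced with something like the following. This is only a sketch: it builds the tree in a scratch directory created by mktemp (my assumption, so nothing under /etc/pve is touched), and the directory names and link targets are copied from the ls output.

```shell
#!/bin/sh
# Sketch only: rebuilds the link layout in a scratch directory, not in /etc/pve.
set -eu

node="vigilant"        # node name, taken from the listings
root="$(mktemp -d)"    # scratch location (assumption, for safe experimenting)

# Per-node directories seen under nodes/vigilant
mkdir -p "$root/nodes/$node/lxc" \
         "$root/nodes/$node/openvz" \
         "$root/nodes/$node/qemu-server" \
         "$root/nodes/$node/priv"

# Relative links, with the same targets as in the ls output
ln -s "nodes/$node"             "$root/local"
ln -s "nodes/$node/lxc"         "$root/lxc"
ln -s "nodes/$node/openvz"      "$root/openvz"
ln -s "nodes/$node/qemu-server" "$root/qemu-server"

ls -l "$root"
```

The links are relative on purpose, so they resolve the same way wherever the tree ends up.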
The host is now failing with lots of errors in the syslog.
Lots of these:
Code:
Feb 23 00:18:00 vigilant pveproxy[3177]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key)>
Feb 23 00:18:00 vigilant pveproxy[3178]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key)>
Some of these:
Code:
Feb 23 00:18:01 vigilant cron[964]: (*system*vzdump) CAN'T OPEN SYMLINK (/etc/cron.d/vzdump)
And before those there is this:
Code:
Feb 22 23:34:18 vigilant pmxcfs[1163]: [database] crit: found entry with duplicate name 'lxc' - A:(inode = 0x0000000000>
Feb 22 23:34:18 vigilant pmxcfs[1163]: [database] crit: found entry with duplicate name 'lxc' - A:(inode = 0x0000000000>
Feb 22 23:34:18 vigilant pmxcfs[1163]: [database] crit: DB load failed
Feb 22 23:34:18 vigilant pmxcfs[1163]: [main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/c>
Feb 22 23:34:18 vigilant pmxcfs[1163]: [main] notice: exit proxmox configuration filesystem (-1)
Feb 22 23:34:18 vigilant pmxcfs[1163]: [database] crit: DB load failed
Feb 22 23:34:18 vigilant pmxcfs[1163]: [main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/c>
Feb 22 23:34:18 vigilant pmxcfs[1163]: [main] notice: exit proxmox configuration filesystem (-1)
Feb 22 23:34:18 vigilant systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
Feb 22 23:34:18 vigilant systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Feb 22 23:34:18 vigilant systemd[1]: Failed to start The Proxmox VE cluster filesystem.
Feb 22 23:34:18 vigilant systemd[1]: Condition check resulted in Corosync Cluster Engine being skipped.
Feb 22 23:34:18 vigilant pve-firewall[975]: ipcc_send_rec[1] failed: Connection refused
Feb 22 23:34:18 vigilant pve-firewall[975]: ipcc_send_rec[2] failed: Connection refused
Feb 22 23:34:18 vigilant pve-firewall[975]: ipcc_send_rec[3] failed: Connection refused
Feb 22 23:34:18 vigilant pve-firewall[975]: Unable to load access control list: Connection refused
Feb 22 23:34:18 vigilant pve-firewall[975]: ipcc_send_rec[1] failed: Connection refused
Feb 22 23:34:18 vigilant pve-firewall[975]: ipcc_send_rec[2] failed: Connection refused
Feb 22 23:34:18 vigilant pve-firewall[975]: ipcc_send_rec[3] failed: Connection refused
Feb 22 23:34:18 vigilant systemd[1]: pve-firewall.service: Control process exited, code=exited, status=111/n/a
Feb 22 23:34:18 vigilant systemd[1]: pve-firewall.service: Failed with result 'exit-code'.
Feb 22 23:34:18 vigilant systemd[1]: Failed to start Proxmox VE firewall.
I tried to recreate the certificates with the following command, with this result:
Code:
root@vigilant:/etc/pve# pvecm updatecerts --force
ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused
The system name is vigilant. The contents of /etc/hosts and /etc/hostname are:
Code:
root@vigilant:/etc/pve# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
192.168.139.251 vigilant.itv.lan vigilant
root@vigilant:/etc/pve# cat /etc/hostname
vigilant
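To rule out a name-resolution problem, the sketch below checks which address a hosts-format file gives for the node name. It is an illustration only: resolve_in_hosts is a made-up helper (not a Proxmox tool), and it reads a scratch copy of the two entries above rather than the real /etc/hosts. As far as I understand, the node name should map to a non-loopback address.

```shell
#!/bin/sh
# Sketch: print the first address that a hosts-format file maps to a name.
# resolve_in_hosts is a hypothetical helper, not a Proxmox tool.
resolve_in_hosts() {
    # $1 = hosts file, $2 = host name to look for
    awk -v name="$2" '
        /^[[:space:]]*#/ { next }              # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break           # ignore trailing comments
                if ($i == name) { print $1; exit }
            }
        }
    ' "$1"
}

# Scratch copy of the entries shown above (assumption: same content as /etc/hosts)
hosts_file="$(mktemp)"
cat > "$hosts_file" <<'EOF'
127.0.0.1 localhost.localdomain localhost
192.168.139.251 vigilant.itv.lan vigilant
EOF

addr="$(resolve_in_hosts "$hosts_file" vigilant)"
case "$addr" in
    127.*|::1|"") echo "vigilant: loopback or missing" ;;
    *)            echo "vigilant -> $addr" ;;
esac
```

On the host itself, `getent hosts vigilant` should give the same answer through the system resolver.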
I searched around the forum in an attempt to fix this.
But after several hours and numerous attempts there is still no noticeable improvement.
Any suggestions?