Proxmox won't start after power cycle

Apr 13, 2021
I rebooted my single-node Proxmox installation yesterday, and since the power cycle Proxmox won't boot properly any more. I get an error message stating 'Failed to start Proxmox VE cluster filesystem' and telling me to check 'systemctl status pve-cluster.service'.

After a few minutes I get a login prompt for one of my VMs, a Debian Test VM, but I can't log in with the credentials I have for that VM.

Tried to reboot with the emergency boot option, but that results in the same errors.

It's really strange that this happened so suddenly. I have searched for how to reinstall or repair a broken installation, but found nothing helpful. Any suggestions?
 
output of systemctl status pve-cluster.service:
Code:
 pve-cluster.service - The Proxmox VE cluster filesystem
   Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Wed 2021-05-26 14:19:22 CEST; 29min ago
  Process: 1954 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)

May 26 14:19:22 DebianTest systemd[1]: pve-cluster.service: Service RestartSec=100ms expired, scheduling restart.
May 26 14:19:22 DebianTest systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
May 26 14:19:22 DebianTest systemd[1]: Stopped The Proxmox VE cluster filesystem.
May 26 14:19:22 DebianTest systemd[1]: pve-cluster.service: Start request repeated too quickly.
May 26 14:19:22 DebianTest systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
May 26 14:19:22 DebianTest systemd[1]: Failed to start The Proxmox VE cluster filesystem.

output of 'tail /var/log/syslog':
Code:
tail /var/log/syslog
May 26 14:50:55 DebianTest pveproxy[1242]: starting 2 worker(s)
May 26 14:50:55 DebianTest pveproxy[1242]: worker 3693 started
May 26 14:50:55 DebianTest pveproxy[1242]: worker 3694 started
May 26 14:50:55 DebianTest pveproxy[3692]: worker exit
May 26 14:50:55 DebianTest pveproxy[1242]: worker 3692 finished
May 26 14:50:55 DebianTest pveproxy[1242]: starting 1 worker(s)
May 26 14:50:55 DebianTest pveproxy[1242]: worker 3695 started
May 26 14:50:55 DebianTest pveproxy[3693]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1909.
May 26 14:50:55 DebianTest pveproxy[3694]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1909.
May 26 14:50:55 DebianTest pveproxy[3695]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1909.

journalctl -xe:
Code:
journalctl -xe
May 26 14:52:19 DebianTest postfix/pickup[1229]: 3842E20131A: uid=0 from=<root>
May 26 14:52:19 DebianTest postfix/cleanup[3324]: 3842E20131A: message-id=<20210526125219.3842E20131A@pve.cmr-net.io>
May 26 14:52:19 DebianTest postfix/cleanup[3324]: warning: 3842E20131A: write queue file: No space left on device
May 26 14:52:19 DebianTest postfix/pickup[1229]: warning: maildrop/B94BB2012D8: error writing 3842E20131A: queue file write error
May 26 14:52:19 DebianTest postfix/pickup[1229]: warning: 3862520131A: message has been queued for 2 days
May 26 14:52:19 DebianTest postfix/pickup[1229]: 3862520131A: uid=0 from=<root>
May 26 14:52:19 DebianTest postfix/cleanup[3324]: 3862520131A: message-id=<20210526125219.3862520131A@pve.cmr-net.io>
May 26 14:52:19 DebianTest postfix/cleanup[3324]: warning: 3862520131A: write queue file: No space left on device
May 26 14:52:19 DebianTest postfix/pickup[1229]: warning: maildrop/882D32012D6: error writing 3862520131A: queue file write error
May 26 14:52:22 DebianTest pveproxy[3780]: worker exit
May 26 14:52:22 DebianTest pveproxy[1242]: worker 3780 finished
May 26 14:52:22 DebianTest pveproxy[1242]: starting 1 worker(s)
May 26 14:52:22 DebianTest pveproxy[1242]: worker 3783 started
May 26 14:52:22 DebianTest pveproxy[3783]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share
May 26 14:52:22 DebianTest pveproxy[3782]: worker exit
May 26 14:52:22 DebianTest pveproxy[3781]: worker exit
May 26 14:52:22 DebianTest pveproxy[1242]: worker 3781 finished
May 26 14:52:22 DebianTest pveproxy[1242]: worker 3782 finished
May 26 14:52:22 DebianTest pveproxy[1242]: starting 2 worker(s)
May 26 14:52:22 DebianTest pveproxy[1242]: worker 3784 started
May 26 14:52:22 DebianTest pveproxy[1242]: worker 3785 started
May 26 14:52:22 DebianTest pveproxy[3784]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share
May 26 14:52:22 DebianTest pveproxy[3785]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share

journalctl -u pve-cluster:

Code:
journalctl -u pve-cluster
-- Logs begin at Wed 2021-05-26 14:06:53 CEST, end at Wed 2021-05-26 14:53:08 CEST. --
May 26 14:07:15 DebianTest systemd[1]: Starting The Proxmox VE cluster filesystem...
May 26 14:07:15 DebianTest pmxcfs[1155]: [main] crit: Unable to get local IP address
May 26 14:07:15 DebianTest pmxcfs[1155]: [main] crit: Unable to get local IP address
May 26 14:07:15 DebianTest systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
May 26 14:07:15 DebianTest systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
May 26 14:07:15 DebianTest systemd[1]: Failed to start The Proxmox VE cluster filesystem.
May 26 14:07:15 DebianTest systemd[1]: pve-cluster.service: Service RestartSec=100ms expired, scheduling restart.
May 26 14:07:15 DebianTest systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 1.
May 26 14:07:15 DebianTest systemd[1]: Stopped The Proxmox VE cluster filesystem.
May 26 14:07:15 DebianTest systemd[1]: Starting The Proxmox VE cluster filesystem...
May 26 14:07:15 DebianTest pmxcfs[1179]: [main] crit: Unable to get local IP address
May 26 14:07:15 DebianTest pmxcfs[1179]: [main] crit: Unable to get local IP address
May 26 14:07:15 DebianTest systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
May 26 14:07:15 DebianTest systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
May 26 14:07:15 DebianTest systemd[1]: Failed to start The Proxmox VE cluster filesystem.
May 26 14:07:16 DebianTest systemd[1]: pve-cluster.service: Service RestartSec=100ms expired, scheduling restart.
May 26 14:07:16 DebianTest systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 2.
May 26 14:07:16 DebianTest systemd[1]: Stopped The Proxmox VE cluster filesystem.
May 26 14:07:16 DebianTest systemd[1]: Starting The Proxmox VE cluster filesystem...
May 26 14:07:16 DebianTest pmxcfs[1201]: [main] crit: Unable to get local IP address
May 26 14:07:16 DebianTest pmxcfs[1201]: [main] crit: Unable to get local IP address
May 26 14:07:16 DebianTest systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION

cat /etc/hosts:
Code:
cat /etc/hosts
# Your system has configured 'manage_etc_hosts' as True.
# As a result, if you wish for changes to this file to persist
# then you will need to either
# a.) make changes to the master file in /etc/cloud/templates/hosts.debian.tmpl
# b.) change or remove the value of 'manage_etc_hosts' in
#     /etc/cloud/cloud.cfg or cloud-config from user-data
#
127.0.1.1 DebianTest DebianTest
127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

cat /etc/hostname:

Code:
cat /etc/hostname
DebianTest
Hope these help!
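A side note on the journalctl -xe output above: postfix reports "No space left on device" while writing its queue files, so it may be worth checking free space as well. The sketch below is only a quick check, and the paths in the loop are assumptions about where the postfix queue lives on this host. The function name usage_pct is made up for illustration.

```shell
#!/bin/sh
# usage_pct PATH: print the used-space percentage (without the % sign)
# of the filesystem that holds PATH, based on POSIX `df -P` output.
usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Report usage for a few locations (assumed, adjust to the real layout).
for path in / /var /var/spool/postfix; do
    if [ -e "$path" ]; then
        echo "$path: $(usage_pct "$path")% used"
    fi
done
```

A value at or near 100 for the filesystem holding the mail queue would match the postfix warnings. It would not by itself explain the pve-cluster failure.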
 
I mean an IP that is not in 127.0.0.0/8 (what I called a loopback address, see https://en.wikipedia.org/wiki/Reserved_IP_addresses ),
but a regular IP in your network (e.g. 10.0.x.y)

maybe you can show the network config, then we can tell which ip the hostname must resolve to
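To see which address the hostname currently maps to, something like the following sketch could be used. It only reads a hosts-format file and reports whether the entry is a loopback address. The function names are made up for illustration, and it looks at /etc/hosts only, not at DNS.

```shell
#!/bin/sh
# is_loopback IP: succeed if the IPv4 address is in 127.0.0.0/8.
is_loopback() {
    case "$1" in
        127.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# lookup_in_hosts NAME [FILE]: print the first address listed for NAME
# in a hosts-format file (FILE defaults to /etc/hosts).
lookup_in_hosts() {
    awk -v name="$1" '
        /^[[:space:]]*#/ { next }            # skip comment lines
        { for (i = 2; i <= NF; i++)          # fields 2..NF are host names
              if ($i == name) { print $1; exit } }
    ' "${2:-/etc/hosts}"
}

# check_host_entry NAME [FILE]: say whether NAME maps to a loopback address.
check_host_entry() {
    ip=$(lookup_in_hosts "$1" "${2:-/etc/hosts}")
    if [ -z "$ip" ]; then
        echo "no entry for $1"
    elif is_loopback "$ip"; then
        echo "$1 -> $ip (loopback)"
    else
        echo "$1 -> $ip (not loopback)"
    fi
}
```

On the affected host, check_host_entry "$(hostname)" would be the call to try. With the /etc/hosts shown earlier it should report the 127.0.1.1 entry as loopback.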
 
Changed 127.0.1.1 to 10.0.1.60 in the /etc/hosts file, rebooted, problem still exists. Opened /etc/hosts again and IP address was again at 127.0.1.1:
Code:
cat /etc/hosts
# Your system has configured 'manage_etc_hosts' as True.
# As a result, if you wish for changes to this file to persist
# then you will need to either
# a.) make changes to the master file in /etc/cloud/templates/hosts.debian.tmpl
# b.) change or remove the value of 'manage_etc_hosts' in
#     /etc/cloud/cloud.cfg or cloud-config from user-data
#
127.0.1.1 DebianTest DebianTest
127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
According to the comment lines on top, I have 'manage_etc_hosts' set to 'True'. Is this a standard thing that is done by Proxmox? I never set anything like this. And should I change it by using option A or B?

The strange thing is that the Proxmox install worked flawlessly for at least a month. Then I removed some VMs, rebooted, and suddenly this happened.
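For reference, option B from the comment block might look roughly like the sketch below, assuming the setting really lives in /etc/cloud/cloud.cfg as a line of the form "manage_etc_hosts: true". If it comes from cloud-config user-data instead, this would not help. The function name is made up for illustration.

```shell
#!/bin/sh
# disable_manage_etc_hosts FILE: set manage_etc_hosts to false in a
# cloud.cfg-style YAML file, keeping the original as FILE.bak.
disable_manage_etc_hosts() {
    cfg=$1
    cp "$cfg" "$cfg.bak"    # backup before touching anything
    # Replace whatever value follows the key with "false".
    sed -E 's/^([[:space:]]*manage_etc_hosts:[[:space:]]*).*/\1false/' \
        "$cfg.bak" > "$cfg"
}
```

It would be called as disable_manage_etc_hosts /etc/cloud/cloud.cfg, run as root. After that, manual changes to /etc/hosts should no longer be overwritten on boot.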
 
did you install cloud-init on the host?
if yes, why? normally you only need to install that in a guest...
 
Well... it should have been on the DebianTest VM; I wanted to make my own template VM. I might have done it in the wrong PuTTY window...
Any way to fix this?
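One possible way out, sketched under a few assumptions: cloud-init was installed on the host from the Debian package, the host's real address is the 10.0.1.60 mentioned above, and nothing else on the host depends on cloud-init. The function names are made up for illustration. The postfix message-id in the journal (pve.cmr-net.io) also hints that the host used to have a different name before cloud-init renamed it. That is only an inference from the logs, but if it is right, /etc/hostname would need to be restored as well, and the name passed below should be the original one.

```shell
#!/bin/sh
# fix_hosts_entry FILE NAME IP: replace the "127.0.1.1 NAME ..." line in a
# hosts-format file with "IP NAME", keeping the original as FILE.bak.
fix_hosts_entry() {
    file=$1
    name=$2
    ip=$3
    cp "$file" "$file.bak"
    awk -v name="$name" -v ip="$ip" '
        $1 == "127.0.1.1" && $2 == name { print ip, name; next }
        { print }                        # every other line passes through
    ' "$file.bak" > "$file"
}

# repair_host NAME IP: run as root on the Proxmox host (not in a VM).
repair_host() {
    apt-get purge -y cloud-init          # stop cloud-init from rewriting /etc/hosts
    fix_hosts_entry /etc/hosts "$1" "$2" # point the hostname at a non-loopback IP
    systemctl restart pve-cluster        # retry the service that failed at boot
}
```

With the values from this thread the call would be repair_host DebianTest 10.0.1.60, or with the original host name if it was changed. Afterwards 'systemctl status pve-cluster.service' should show whether pmxcfs starts again.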
 
