No GUI after power loss

butt3rballbeats

Mar 3, 2023
Hey, I'm a bit of a newbie when it comes to Proxmox/Linux. I've been running Proxmox for about a month with no issues until today, when I lost power. After booting back up I can ping and SSH into my host again, but I can't connect to the GUI. I checked /etc/hosts and everything still appears to be set up correctly.

I tried to do some digging and troubleshooting on my own on the forum and think I am stuck. When I run:
systemctl status pve-cluster
Code:
pve-cluster.service - The Proxmox VE cluster filesystem
     Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Thu 2023-03-02 23:31:56 EST; 25min ago
    Process: 2666 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)
        CPU: 10ms


Mar 02 23:31:56 one systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
Mar 02 23:31:56 one systemd[1]: Stopped The Proxmox VE cluster filesystem.
Mar 02 23:31:56 one systemd[1]: pve-cluster.service: Start request repeated too quickly.
Mar 02 23:31:56 one systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Mar 02 23:31:56 one systemd[1]: Failed to start The Proxmox VE cluster filesystem.


When I run:
journalctl -u pve-cluster
Code:
Journal file /var/log/journal/76190b52a0dd462a8caf31bc668ea0ac/system.journal is truncated, ignoring file.
-- Journal begins at Thu 2023-03-02 17:15:57 EST, ends at Thu 2023-03-02 23:58:24 EST. --
Mar 02 23:31:54 one systemd[1]: Starting The Proxmox VE cluster filesystem...
Mar 02 23:31:54 one pmxcfs[2643]: [database] crit: unable to set WAL mode: disk I/O error#010
Mar 02 23:31:54 one pmxcfs[2643]: [database] crit: unable to set WAL mode: disk I/O error#010
Mar 02 23:31:54 one pmxcfs[2643]: [main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db'
Mar 02 23:31:54 one pmxcfs[2643]: [main] notice: exit proxmox configuration filesystem (-1)
Mar 02 23:31:54 one pmxcfs[2643]: [main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db'
Mar 02 23:31:54 one pmxcfs[2643]: [main] notice: exit proxmox configuration filesystem (-1)
Mar 02 23:31:54 one systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
Mar 02 23:31:54 one systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Mar 02 23:31:54 one systemd[1]: Failed to start The Proxmox VE cluster filesystem.
Mar 02 23:31:55 one systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 1.
Mar 02 23:31:55 one systemd[1]: Stopped The Proxmox VE cluster filesystem.

This is where I am stuck. I am not sure what to do from here, as I am still new to all of this. Any help is appreciated, thanks!
 
The error message unable to set WAL mode: disk I/O error suggests that there may be an issue with the disk or the file system where the Proxmox configuration database is located. The error message memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db' indicates that the database file used by the service could not be opened, which could also be caused by disk or file system issues.
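A minimal sketch of read-only checks that could help narrow this down, assuming the database sits at the path shown in the log. The helper names `filter_pmxcfs_crit` and `file_state` are hypothetical, made up for this illustration and not part of Proxmox. The commented commands at the end use standard tools (`journalctl`, `dmesg`, `sqlite3`), though `sqlite3` may need to be installed first, and the `dmesg` pattern is only a guess at what a failing disk might log.

```shell
#!/bin/sh
# Read-only triage sketch; nothing here writes to the database.
# DB path is taken from the journal output in the original post.
DB="${DB:-/var/lib/pve-cluster/config.db}"

# Hypothetical helper: keep only pmxcfs lines logged at "crit" level.
# Reads journal text on stdin.
filter_pmxcfs_crit() {
    grep -E 'pmxcfs\[[0-9]+\]: \[[a-z]+\] crit:' || true
}

# Hypothetical helper: classify a file as missing, empty, unreadable, or ok.
file_state() {
    if [ ! -e "$1" ]; then echo missing
    elif [ ! -s "$1" ]; then echo empty
    elif [ ! -r "$1" ]; then echo unreadable
    else echo ok
    fi
}

# Suggested manual checks (run as root on the host):
#   file_state "$DB"
#   journalctl -u pve-cluster -b | filter_pmxcfs_crit
#   dmesg | grep -iE 'i/o error|ext4|zfs'      # kernel-level disk errors
#   sqlite3 "$DB" 'PRAGMA integrity_check;'     # SQLite's built-in consistency check
```

If `file_state` reports anything other than ok, or the kernel log shows I/O errors, the problem is more likely the disk or file system than the database contents. It is probably worth copying config.db somewhere safe before attempting any repair.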

I hope you have a battery backup; otherwise exciting stuff awaits you.
https://pve.proxmox.com/wiki/Cluster_Manager

Cluster Cold Start

It is obvious that a cluster is not quorate when all nodes are offline. This is a common case after a power failure.
Note: It is always a good idea to use an uninterruptible power supply (“UPS”, also called “battery backup”) to avoid this state, especially if you want HA.
On node startup, the pve-guests service is started and waits for quorum. Once quorate, it starts all guests which have the onboot flag set.
When you turn on nodes, or when power comes back after power failure, it is likely that some nodes will boot faster than others. Please keep in mind that guest startup is delayed until you reach quorum.
 
Yeah... it's my fault for not having a UPS in place yet; I've been slowly building out my lab. Do you reckon it would be easier to just reinstall from scratch than to try to fix this mess?
 
I ended up just nuking it and reinstalling after copying the data over. My zpool was still intact, and I was able to rebuild a few VMs from the disks that were still in the pool. I just had to start from scratch on the others, oh well! I'll use this as a learning moment to get a UPS and start scheduling daily backups ;)
 
All good now? Make sure you update the time servers on each node.
 
