Can't Access GUI, Upgrade, or VZDump

jonathanwash

I was working on my switch and accidentally pulled the power from my small server. Now, for some reason, I can't access the GUI and my VM won't run.
I tried to upgrade the installation in case a file was corrupted by the power loss, but I get a whole bunch of dpkg errors (https://pastebin.com/vWBwmTdR).
I looked into the errors and tried a couple things listed before on the forums (delete keys and recreate them, edit sqlite3 db, change hostnames) but no go.
So I looked into backing up my VM config so that I can reinstall, but I get an "ipcc_send_rec[1] failed: Connection refused" error when I run VZDump.

I'm at a loss as to what I can do without losing all my data.
I'm hoping it's not too hard, since Proxmox runs from an SSD while the actual VM data lives on a mirrored ZFS pool local to the server.

Thanks for reading.
 
Your ZFS pool is most likely safe: ZFS is copy-on-write and checksums its data, so a sudden power loss rarely corrupts it. Probably only your PVE installation is damaged.

Also, where did you read that you should run apt upgrade? Never, ever do that on Proxmox VE. Always use apt dist-upgrade, which can also add or remove packages when dependencies change. Please try a dist-upgrade.
 
So I looked into backing up my VM config so that I can reinstall, but I get an "ipcc_send_rec[1] failed: Connection refused" error when I run VZDump.

Are you running as root?


What is the output of:
Code:
systemctl status pve-ha-lrm.service
 
Yes, I'm running as root.

Here is the output from systemctl status pve-ha-lrm.service
Code:
● pve-ha-lrm.service - PVE Local HA Ressource Manager Daemon
   Loaded: loaded (/lib/systemd/system/pve-ha-lrm.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sun 2019-01-06 10:21:51 PST; 20h ago
  Process: 3086 ExecStart=/usr/sbin/pve-ha-lrm start (code=exited, status=111)
      CPU: 579ms

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
 
Please try upgrading again. If the error occurs again, post the output of
Code:
journalctl -xe

I looked into the errors and tried a couple things listed before on the forums (delete keys and recreate them, edit sqlite3 db, change hostnames) but no go.

The contents or output of the following files and commands would also help narrow down the issue:
* /etc/hosts
* 'pvesm status'
* 'journalctl | grep smartd'
 
Output of journalctl -xe:
Code:
https://pastebin.com/s0RHwgAF

/etc/hosts:
Code:
https://pastebin.com/Ue3FREC3

Code:
root@jonathanwash:~# pvesm status
ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused

journalctl | grep smartd:
Code:
https://pastebin.com/yiGpAkpK
 
This looks like a dependency/package management issue to me.

Please post the output of the following:

* 'pveversion -v'
* 'dpkg -l'
* 'ip a'
* 'find /etc/pve'

Note: The error messages in your journalctl output indicate that pmxcfs is not mounted properly. Possibly a mistake slipped in while editing /etc/hosts, or the service hasn't been restarted since.
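
One quick way to test the "pmxcfs not mounted" theory is to look for /etc/pve in the kernel's mount table. This is only a minimal sketch: the helper name is_mounted is made up for illustration, and it assumes a Linux system with /proc/mounts.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given path appears as a mount point
# (second field) in a mounts table, /proc/mounts by default.
is_mounted() {
    awk -v p="$1" '$2 == p { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

if is_mounted /etc/pve; then
    echo "/etc/pve is mounted"
else
    echo "/etc/pve is NOT mounted (pmxcfs is probably not running)"
fi
```

An empty /etc/pve directory together with "Connection refused" from the PVE tools would point the same way.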
 
The package database seems to be in an inconsistent state, with packages not fully installed. Given that you pulled the power, that's a likely outcome.

A good first step is:
Code:
dpkg --configure -a

That should try to clean up the interrupted installs/upgrades and finish them. If you still get errors along the lines of "subprocess installed post-installation script returned error exit status 1", you may need to (temporarily) edit the corresponding maintainer script of the package, either to make it more verbose (set -x) or to make it return success (exit 0). If you had to put in the exit 0, then once the package status is cleaned up you need to force a re-install of the package whose maintainer script you modified, so that the steps in that script are actually executed.

Tampering with a maintainer script should be done with caution, but depending on the state of the package database it might still be your best option (unless you can re-install from backup).
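
The "make it more verbose" variant could look roughly like the sketch below. The function names and the BACKUP_DIR and DPKG_INFO_DIR variables are illustrative assumptions, not part of dpkg or PVE. The sketch also assumes the postinst is a shell script whose first line is the shebang, and that dpkg keeps it under /var/lib/dpkg/info.

```shell
#!/bin/sh
# Sketch only: helper names and the two *_DIR variables are assumptions.

# Print the path of a package's post-installation maintainer script.
postinst_path() {
    printf '%s/%s.postinst\n' "${DPKG_INFO_DIR:-/var/lib/dpkg/info}" "$1"
}

# Save a pristine copy outside the dpkg info directory, then insert
# "set -x" after the first (shebang) line so each command is traced.
enable_postinst_trace() {
    script="$(postinst_path "$1")"
    backup_dir="${BACKUP_DIR:-/root/postinst-backup}"
    [ -f "$script" ] || { echo "no postinst found for $1" >&2; return 1; }
    mkdir -p "$backup_dir" && cp -p "$script" "$backup_dir/" || return 1
    tmp="$(mktemp)" || return 1
    # Write back with cat so the script keeps its original mode bits.
    awk 'NR == 1 { print; print "set -x"; next } { print }' "$script" > "$tmp" &&
        cat "$tmp" > "$script"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Intended order on the affected host, as root (not executed here):
#   enable_postinst_trace pve-cluster
#   dpkg --configure -a
#   apt install --reinstall pve-cluster   # after an "exit 0" workaround
```

Re-installing the package afterwards also replaces the edited script with the packaged one, so the backup copy is only a safety net.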

Good luck
 

pveversion -v
Code:
https://pastebin.com/pg2DzjDm
dpkg -l
Code:
https://pastebin.com/wb26prKc
ip a
Code:
https://pastebin.com/c8SyvXai
find /etc/pve
Code:
/etc/pve


I edited the postinst for pve-cluster, but no matter what I changed I got this:
Code:
https://pastebin.com/UMs13JuG

You say re-install from backup, but of course I don't have a backup.
So my ultimate question is: if I reinstall and re-configure the basic VM settings, can I just point the config at the existing data on the ZFS storage and have it continue where it left off?
 
Thanks. You picked the right postinst to look into: pve-cluster is responsible for mounting /etc/pve, and that mount doesn't seem to be happening on your system. The output shows that pve-ha-lrm can't be started. Can you check the corresponding logs to find out why ("journalctl -xe" and/or "systemctl status pve-ha-lrm")? Getting that service to start is a prerequisite for pve-cluster to finish configuring.

I see that you have posted that above, but since you tried to get it running again I hope there is something more useful in there now.
 
systemctl status pve-ha-lrm
Code:
https://pastebin.com/mn7tuz6i
journalctl -xe
Code:
https://pastebin.com/xLw0ZfQx
 
You could try starting pmxcfs in local mode:
`pmxcfs -l`
and afterwards try to repair the package database by running
`apt install -f`

If possible, please post the output of the commands directly here instead of via pastebin.

Is this a clustered setup or a single node?

What's the output of:
`hostname`
`hostname -f`
`uname -n`
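
The hostname commands are presumably requested because PVE expects the node name to resolve to a real (non-loopback) address. That reading is an assumption, as is the helper name hosts_addr_for in this small sketch, which checks what /etc/hosts maps the node name to:

```shell
#!/bin/sh
# Hypothetical helper: print the first non-loopback address that a
# hosts-style file maps the given name to (prints nothing if none).
hosts_addr_for() {
    awk -v n="$1" '
        { sub(/#.*/, "") }                      # drop comments
        $1 ~ /^127\./ || $1 == "::1" { next }   # skip loopback entries
        { for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit } }
    ' "${2:-/etc/hosts}"
}

addr="$(hosts_addr_for "$(uname -n)")"
echo "$(uname -n) -> ${addr:-no non-loopback entry in /etc/hosts}"
```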
 
pmxcfs -l
Code:
root@jonathanwash:~# pmxcfs -l
[database] crit: found entry with duplicate name (inode = 00000000014141CC, parent = 0000000001413E21, name = 'qemu-server')
[database] crit: DB load failed
[main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db'
[main] notice: exit proxmox configuration filesystem (-1)

apt install -f
Code:
root@jonathanwash:/var/lib/dpkg/info# apt install -f
Reading package lists... Done
Building dependency tree
Reading state information... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
8 not fully installed or removed.
After this operation, 0 B of additional disk space will be used.
Setting up pve-cluster (5.0-31) ...
Job for pve-ha-lrm.service failed because the control process exited with error code.
See "systemctl status pve-ha-lrm.service" and "journalctl -xe" for details.
dpkg: error processing package pve-cluster (--configure):
 subprocess installed post-installation script returned error exit status 1
dpkg: dependency problems prevent configuration of pve-firewall:
 pve-firewall depends on pve-cluster; however:
  Package pve-cluster is not configured yet.

dpkg: error processing package pve-firewall (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of qemu-server:
 qemu-server depends on pve-cluster; however:
  Package pve-cluster is not configured yet.
 qemu-server depends on pve-firewall; however:
  Package pve-firewall is not configured yet.

dpkg: error processing package qemu-server (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of libpve-storage-perl:
 libpve-storage-perl depends on pve-cluster; however:
  Package pve-cluster is not configured yet.

dpkg: error processing package libpve-storage-perl (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of proxmox-ve:
 proxmox-ve depends on qemu-server; however:
  Package qemu-server is not configured yet.

dpkg: error processing package proxmox-ve (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of pve-manager:
 pve-manager depends on libpve-storage-perl (>= 5.0-18); however:
  Package libpve-storage-perl is not configured yet.
 pve-manager depends on pve-cluster (>= 5.0-27); however:
  Package pve-cluster is not configured yet.
 pve-manager depends on pve-firewall; however:
  Package pve-firewall is not configured yet.
 pve-manager depends on qemu-server (>= 5.0-24); however:
  Package qemu-server is not configured yet.

dpkg: error processing package pve-manager (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of libpve-access-control:
 libpve-access-control depends on pve-cluster; however:
  Package pve-cluster is not configured yet.

dpkg: error processing package libpve-access-control (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of pve-container:
 pve-container depends on libpve-storage-perl (>= 5.0-31); however:
  Package libpve-storage-perl is not configured yet.
 pve-container depends on pve-cluster (>= 4.0-8); however:
  Package pve-cluster is not configured yet.

dpkg: error processing package pve-container (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent processing triggers for pve-ha-manager:
 pve-ha-manager depends on pve-cluster (>= 3.0-17); however:
  Package pve-cluster is not configured yet.
 pve-ha-manager depends on qemu-server; however:
  Package qemu-server is not configured yet.

dpkg: error processing package pve-ha-manager (--configure):
 dependency problems - leaving triggers unprocessed
Errors were encountered while processing:
 pve-cluster
 pve-firewall
 qemu-server
 libpve-storage-perl
 proxmox-ve
 pve-manager
 libpve-access-control
 pve-container
 pve-ha-manager
E: Sub-process /usr/bin/dpkg returned an error code (1)

This is a single node

Code:
root@jonathanwash:/var/lib/dpkg/info# uname -n
jonathanwash
root@jonathanwash:/var/lib/dpkg/info# hostname
jonathanwash
root@jonathanwash:/var/lib/dpkg/info# hostname -f
jonathanwash.com
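
Most of the errors in the apt install -f output are follow-on failures. A small filter can pull out only the packages whose own post-installation script failed. This is a sketch, and the function name root_failures is made up for illustration:

```shell
#!/bin/sh
# Hypothetical filter: read dpkg/apt output on stdin and print only the
# packages whose own post-installation script failed, skipping those
# that are merely blocked by "dependency problems".
root_failures() {
    awk '
        /^dpkg: error processing package / { pkg = $5; next }
        pkg != "" && /post-installation script returned error/ { print pkg }
        { pkg = "" }
    '
}

# Example with two entries copied from the output above.
root_failures <<'EOF'
dpkg: error processing package pve-cluster (--configure):
 subprocess installed post-installation script returned error exit status 1
dpkg: error processing package pve-firewall (--configure):
 dependency problems - leaving unconfigured
EOF
```

For these two entries it prints only pve-cluster, so that is the package to fix first. The others depend on it.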
 
[database] crit: found entry with duplicate name (inode = 00000000014141CC, parent = 0000000001413E21, name = 'qemu-server')
[database] crit: DB load failed

It seems your pmxcfs database is broken :/
See https://pve.proxmox.com/pve-docs/chapter-pmxcfs.html for the details.

If you have a backup, restore `/var/lib/pve-cluster/config.db` from it.
Otherwise, copy the current file to a different location and try accessing the copy with sqlite3.
 
I've checked /var/lib/pve-cluster/config.db and I'm not seeing anything glaringly wrong, but that might be because I'm unfamiliar with the structure.

This is what I'm seeing:
[Attached screenshot: RXCQ7TB.jpg]
 
First, make sure you have a copy of config.db before making any changes (if the file breaks, your PVE configuration is gone!).
Second, if you have a backup of config.db (or of the contents of `/etc/pve`), it would probably be best to just restore it.

The error message I quoted gives a hint as to where to look:
[database] crit: found entry with duplicate name (inode = 00000000014141CC, parent = 0000000001413E21, name = 'qemu-server')

I would check how many rows you have with `name='qemu-server'`, and also look at the row whose inode is 00000000014141CC. The log prints the inode in hex, so it has to be converted to decimal for the query (0x14141CC = 21053900).
Code:
select * from tree where name='qemu-server';
select * from tree where inode=21053900;
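
The inode and parent values in the pmxcfs error message are printed in hex, so a small conversion helper can be handy before querying. This is a sketch: the helper name hex_to_dec and the copy path are assumptions, and the parent column name is inferred from the error message rather than confirmed in this thread.

```shell
#!/bin/sh
# Convert a hex inode, as printed by pmxcfs, to the decimal form
# needed for a numeric comparison in SQL.
hex_to_dec() {
    printf '%d\n' "0x$1"
}

hex_to_dec 00000000014141CC   # inode from the error message
hex_to_dec 0000000001413E21   # its parent directory

# Hypothetical follow-up on a COPY of the database (example path):
#   cp /var/lib/pve-cluster/config.db /root/config.db.copy
#   sqlite3 /root/config.db.copy \
#     "select inode, parent, name from tree where parent = $(hex_to_dec 0000000001413E21);"
```

Listing everything under the parent directory should show both 'qemu-server' entries side by side, which makes it easier to decide which one is the stray.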

Hope that helps
 
