Can't Access GUI, Upgrade, or VZDump

jonathanwash · Jan 6, 2019

While working on my switch, I accidentally pulled the power from my small server. Now I can't access the GUI and my VM won't start.
I tried to upgrade the installation in case a file was corrupted by the power loss, but I get a whole bunch of dpkg errors (https://pastebin.com/vWBwmTdR).
I looked into the errors and tried a couple of things suggested on the forums before (deleting and recreating keys, editing the sqlite3 db, changing hostnames), but no go.
So I looked into backing up my VM config so that I can reinstall, but I get "ipcc_send_rec[1] failed: Connection refused" when I run VZDump.

I'm at a loss as to what I can do without losing all my data.
I'm hoping it's not too hard, since I run Proxmox from an SSD while the actual VM data is on a mirrored ZFS pool installed locally in the server.

Thanks for reading.

LnxBil · Jan 6, 2019

Your ZFS pool is most likely safe: ZFS is copy-on-write and is designed to stay consistent across a sudden power loss. Probably only your PVE installation is damaged.

Also, where did you read to do an apt upgrade? Never, ever do that. Always do apt dist-upgrade. Please try a dist-upgrade.

jonathanwash · Jan 6, 2019

I get the same errors I put on pastebin when I try dist-upgrade.

oguz · Jan 7, 2019

jonathanwash said:
So I looked into backing up my VM config so that I can reinstall, but I get "ipcc_send_rec[1] failed: Connection refused" when I run VZDump.

Are you running as root?

What is the output of:

Code:

systemctl status pve-ha-lrm.service

jonathanwash · Jan 7, 2019

Yes, I'm running as root.

Here is the output from systemctl status pve-ha-lrm.service:

Code:

● pve-ha-lrm.service - PVE Local HA Ressource Manager Daemon
   Loaded: loaded (/lib/systemd/system/pve-ha-lrm.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sun 2019-01-06 10:21:51 PST; 20h ago
  Process: 3086 ExecStart=/usr/sbin/pve-ha-lrm start (code=exited, status=111)
      CPU: 579ms

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.

oguz · Jan 8, 2019

Please try upgrading again. If the error occurs again, post the output of:

Code:

journalctl -xe

jonathanwash said:
I looked into the errors and tried a couple of things suggested on the forums before (deleting and recreating keys, editing the sqlite3 db, changing hostnames), but no go.

The contents or output of the following files/commands could also help in tracking down the issue:
* /etc/hosts
* 'pvesm status'
* 'journalctl | grep smartd'

jonathanwash · Jan 9, 2019

Output of journalctl -xe:

Code:

https://pastebin.com/s0RHwgAF

/etc/hosts:

Code:

https://pastebin.com/Ue3FREC3

Code:

root@jonathanwash:~# pvesm status
ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused

journalctl | grep smartd:

Code:

https://pastebin.com/yiGpAkpK

oguz · Jan 9, 2019

This looks like a dependency/package management issue to me.

Please send the output of the following:

* 'pveversion -v'
* 'dpkg -l'
* 'ip a'
* 'find /etc/pve'

Note: The error messages in your journal indicate that pmxcfs is not mounted properly. That can happen if a mistake was made while editing /etc/hosts, or if the service was not restarted afterwards.
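
A quick way to confirm the "pmxcfs not mounted" suspicion is to check whether /etc/pve is a mount point at all. The following is only a minimal sketch: it relies on pmxcfs being a FUSE filesystem mounted at /etc/pve, and the function names are made up for illustration.

```python
import os


def is_mounted(path="/etc/pve"):
    """True if `path` is a mount point (pmxcfs is a FUSE mount at /etc/pve)."""
    return os.path.ismount(path)


def is_empty_dir(path="/etc/pve"):
    """True if `path` is a directory without any entries."""
    return os.path.isdir(path) and not os.listdir(path)


if __name__ == "__main__":
    if not is_mounted():
        print("/etc/pve is not a mount point: pmxcfs is probably not running")
    if is_empty_dir():
        print("/etc/pve is empty")
```

An empty, unmounted /etc/pve matches what `find /etc/pve` prints when it lists nothing but the directory itself.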

rhonda · Jan 9, 2019

The package database seems to be in an inconsistent state, with packages not fully installed. Given that you pulled the power, that's a likely outcome.

What can help to get you started here is:

Code:

dpkg --configure -a

That should try to finish and clean up the interrupted installs/upgrades. If you still get errors along the lines of "subprocess installed post-installation script returned error exit status 1", you may need to (temporarily) edit the corresponding maintainer script of the package, either to make it more verbose (set -x) or to make it return success (exit 0). If you had to put in the exit 0, force a re-install of that package once the package status is cleaned up, so that the steps in its maintainer script are actually executed.

Tampering with maintainer scripts should be done with caution, but depending on the state of the packages it might nevertheless be your best option (unless you can re-install from backup).

Good luck
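
To see which packages are still stuck before and after running `dpkg --configure -a`, something like the following could help. It is a rough sketch, not an official tool: it parses the `${Package} ${Status}` output of `dpkg-query`, and the function names are invented for this example. `dpkg --audit` gives a similar overview directly on the command line.

```python
import subprocess


def unfinished_packages(status_text):
    """Return (package, state) pairs that are wanted but not fully installed.

    Expects one package per line in the form "<package> <want> <flag> <state>",
    e.g. "pve-cluster install ok half-configured".
    """
    result = []
    for line in status_text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip anything that is not a plain status line
        package, want, _flag, state = parts
        if want == "install" and state != "installed":
            result.append((package, state))
    return result


def query_dpkg():
    """Ask dpkg-query for the status of every package (Debian systems only)."""
    fmt = r"${Package} ${Status}\n"
    return subprocess.run(
        ["dpkg-query", "-W", "--showformat=" + fmt],
        check=True, capture_output=True, text=True,
    ).stdout
```

On the affected host, `unfinished_packages(query_dpkg())` would list the packages that dpkg still considers unpacked or half-configured.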

jonathanwash · Jan 10, 2019

oguz said:
* 'pveversion -v'
* 'dpkg -l'
* 'ip a'
* 'find /etc/pve'

Note: The error messages in your journal indicate that pmxcfs is not mounted properly. That can happen if a mistake was made while editing /etc/hosts, or if the service was not restarted afterwards.

pveversion -v

Code:

https://pastebin.com/pg2DzjDm

dpkg -l

Code:

https://pastebin.com/wb26prKc

ip a

Code:

https://pastebin.com/c8SyvXai

find /etc/pve

Code:

/etc/pve

rhonda said:
That should try to finish and clean up the interrupted installs/upgrades. If you still get errors along the lines of "subprocess installed post-installation script returned error exit status 1", you may need to (temporarily) edit the corresponding maintainer script of the package, either to make it more verbose (set -x) or to make it return success (exit 0). If you had to put in the exit 0, force a re-install of that package once the package status is cleaned up, so that the steps in its maintainer script are actually executed.

Tampering with maintainer scripts should be done with caution, but depending on the state of the packages it might nevertheless be your best option (unless you can re-install from backup).

Good luck

I edited the postinst for pve-cluster, and no matter what I changed I got this:

Code:

https://pastebin.com/UMs13JuG

You say re-install from backup, but of course I don't have a backup.
So my ultimate question is: if I reinstall and re-configure the basic VM settings, can I just point the config at the existing data on the ZFS storage and have it continue where it left off?

rhonda · Jan 10, 2019

Thanks. You picked the right postinst to look into, because pve-cluster is responsible for mounting /etc/pve, and that mount seems to be missing on your system. You can see in that output that pve-ha-lrm can't be started. Can you check the corresponding logs to find out why not ("journalctl -xe" and/or "systemctl status pve-ha-lrm")? That service has to start successfully before pve-cluster can finish configuring.

I see that you have posted that above, but since you tried to get it running again I hope there is something more useful in there now.

jonathanwash · Jan 10, 2019

systemctl status pve-ha-lrm

Code:

https://pastebin.com/mn7tuz6i

journalctl -xe

Code:

https://pastebin.com/xLw0ZfQx

Stoiko Ivanov · Jan 11, 2019

You could try starting pmxcfs in local mode:
`pmxcfs -l`
and afterwards try to repair the package database by running
`apt install -f`

If possible, please post the outputs of the commands directly here instead of via pastebin.

Is this a clustered setup or a single node?

What's the output of:
`hostname`
`hostname -f`
`uname -n`

jonathanwash · Jan 11, 2019

pmxcfs -l

Code:

root@jonathanwash:~# pmxcfs -l
[database] crit: found entry with duplicate name (inode = 00000000014141CC, parent = 0000000001413E21, name = 'qemu-server')
[database] crit: DB load failed
[main] crit: memdb_open failed - unable to open database '/var/lib/pve-cluster/config.db'
[main] notice: exit proxmox configuration filesystem (-1)

apt install -f

Code:

root@jonathanwash:/var/lib/dpkg/info# apt install -f
Reading package lists... Done
Building dependency tree
Reading state information... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
8 not fully installed or removed.
After this operation, 0 B of additional disk space will be used.
Setting up pve-cluster (5.0-31) ...
Job for pve-ha-lrm.service failed because the control process exited with error code.
See "systemctl status pve-ha-lrm.service" and "journalctl -xe" for details.
dpkg: error processing package pve-cluster (--configure):
 subprocess installed post-installation script returned error exit status 1
dpkg: dependency problems prevent configuration of pve-firewall:
 pve-firewall depends on pve-cluster; however:
  Package pve-cluster is not configured yet.

dpkg: error processing package pve-firewall (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of qemu-server:
 qemu-server depends on pve-cluster; however:
  Package pve-cluster is not configured yet.
 qemu-server depends on pve-firewall; however:
  Package pve-firewall is not configured yet.

dpkg: error processing package qemu-server (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of libpve-storage-perl:
 libpve-storage-perl depends on pve-cluster; however:
  Package pve-cluster is not configured yet.

dpkg: error processing package libpve-storage-perl (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of proxmox-ve:
 proxmox-ve depends on qemu-server; however:
  Package qemu-server is not configured yet.

dpkg: error processing package proxmox-ve (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of pve-manager:
 pve-manager depends on libpve-storage-perl (>= 5.0-18); however:
  Package libpve-storage-perl is not configured yet.
 pve-manager depends on pve-cluster (>= 5.0-27); however:
  Package pve-cluster is not configured yet.
 pve-manager depends on pve-firewall; however:
  Package pve-firewall is not configured yet.
 pve-manager depends on qemu-server (>= 5.0-24); however:
  Package qemu-server is not configured yet.

dpkg: error processing package pve-manager (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of libpve-access-control:
 libpve-access-control depends on pve-cluster; however:
  Package pve-cluster is not configured yet.

dpkg: error processing package libpve-access-control (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of pve-container:
 pve-container depends on libpve-storage-perl (>= 5.0-31); however:
  Package libpve-storage-perl is not configured yet.
 pve-container depends on pve-cluster (>= 4.0-8); however:
  Package pve-cluster is not configured yet.

dpkg: error processing package pve-container (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent processing triggers for pve-ha-manager:
 pve-ha-manager depends on pve-cluster (>= 3.0-17); however:
  Package pve-cluster is not configured yet.
 pve-ha-manager depends on qemu-server; however:
  Package qemu-server is not configured yet.

dpkg: error processing package pve-ha-manager (--configure):
 dependency problems - leaving triggers unprocessed
Errors were encountered while processing:
 pve-cluster
 pve-firewall
 qemu-server
 libpve-storage-perl
 proxmox-ve
 pve-manager
 libpve-access-control
 pve-container
 pve-ha-manager
E: Sub-process /usr/bin/dpkg returned an error code (1)

This is a single node.

Code:

root@jonathanwash:/var/lib/dpkg/info# uname -n
jonathanwash
root@jonathanwash:/var/lib/dpkg/info# hostname
jonathanwash
root@jonathanwash:/var/lib/dpkg/info# hostname -f
jonathanwash.com

Stoiko Ivanov · Jan 11, 2019

jonathanwash said:
[database] crit: found entry with duplicate name (inode = 00000000014141CC, parent = 0000000001413E21, name = 'qemu-server')
[database] crit: DB load failed

It seems your pmxcfs database is broken :/
See https://pve.proxmox.com/pve-docs/chapter-pmxcfs.html for the details.

If you have a backup, restore `/var/lib/pve-cluster/config.db` from it.
Otherwise, copy the current file to a different location and try accessing the copy with sqlite3.
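
The "copy first, then look" approach could be sketched as follows. This is only an illustration under a few assumptions: the helper name and the destination path are made up, and the copy is opened read-only so that nothing is changed by accident. If SQLite side files (`-wal`, `-shm`) exist next to the database, they may need to be copied as well.

```python
import shutil
import sqlite3


def inspect_copy(source, work_copy):
    """Copy the database file and inspect the copy in read-only mode."""
    shutil.copy2(source, work_copy)
    conn = sqlite3.connect(f"file:{work_copy}?mode=ro", uri=True)
    try:
        # SQLite's own structural check; returns a single "ok" row if clean.
        integrity = [row[0] for row in conn.execute("PRAGMA integrity_check")]
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    finally:
        conn.close()
    return integrity, tables
```

It might be called as `inspect_copy("/var/lib/pve-cluster/config.db", "/root/config.db.copy")`, where the second path is just an example. Note that `PRAGMA integrity_check` only validates the SQLite file structure; a logical problem such as a duplicate directory entry can still be present when it reports "ok".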

jonathanwash · Jan 11, 2019

I've checked /var/lib/pve-cluster/config.db and I'm not seeing anything glaringly wrong, but that might be because I'm unfamiliar with the structure.

This is what I'm seeing:

Stoiko Ivanov · Jan 14, 2019

First, make sure you have a copy of config.db before making any changes (if the file breaks, your configuration for PVE is gone!).
Second, if you have a backup of config.db (or of the contents of `/etc/pve`), it would probably be best to just restore it.

The error-message I quoted gives a hint as to where to search for the error:

jonathanwash said:
[database] crit: found entry with duplicate name (inode = 00000000014141CC, parent = 0000000001413E21, name = 'qemu-server')

I would check how many rows you have with `name='qemu-server'`, and maybe also the row whose inode equals 00000000014141CC (the log prints it in hex, so you may have to convert it to decimal).

Code:

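-- how many rows share the name from the error message?
select * from tree where name='qemu-server';
-- 00000000014141CC is hex; in decimal the inode is 21053900
select * from tree where inode=21053900;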

Hope that helps
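
The search described above can also be scripted against a copy of config.db. The sketch below is an illustration only: it assumes the `tree` table has `inode`, `parent` and `name` columns (the first and last appear in the queries above, `parent` is inferred from the error message), and that a "duplicate" means two rows with the same name under the same parent. The function names are hypothetical.

```python
import sqlite3


def hex_inode(text):
    """Convert an inode printed in hex (as in the pmxcfs log) to an integer."""
    return int(text, 16)


def duplicate_names(conn):
    """Return (inode, parent, name) rows that share the same parent and name."""
    return conn.execute(
        """
        SELECT t.inode, t.parent, t.name
        FROM tree AS t
        JOIN (
            SELECT parent, name FROM tree
            GROUP BY parent, name
            HAVING COUNT(*) > 1
        ) AS d ON d.parent = t.parent AND d.name = t.name
        ORDER BY t.parent, t.name, t.inode
        """
    ).fetchall()


# The inode from the error message, converted for use in a query:
print(hex_inode("00000000014141CC"))  # 21053900
```

Running `duplicate_names(sqlite3.connect("config.db.copy"))` on a working copy (the file name is an example) would show the conflicting rows side by side, which makes it easier to decide which one is the stray entry before touching anything.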
