[SOLVED] upgrade failed - proxmox not working anymore

astrakid

hi,
i upgraded my proxmox installation. due to some local issues the upgrade was interrupted, and i had to finish it later with dpkg --configure -a and apt full-upgrade.
everything now seems to be up to date, but proxmox is not working.
pveproxy is running, but i cannot reach the gui. the access log shows error code 501, and my browser just displays a blank page after the ssl warning.

::ffff:192.168.1.104 - - [05/09/2019:17:21:43 +0200] "GET / HTTP/1.1" 501 -

pveversion --verbose
root@server:~# pveversion --verbose
proxmox-ve: 6.0-2 (running kernel: 5.0.21-1-pve)
pve-manager: 6.0-7 (running version: 6.0-7/28984024)
pve-kernel-5.0: 6.0-7
pve-kernel-helper: 6.0-7
pve-kernel-4.15: 5.4-8
pve-kernel-5.0.21-1-pve: 5.0.21-2
pve-kernel-4.15.18-20-pve: 4.15.18-46
pve-kernel-4.15.18-19-pve: 4.15.18-45
pve-kernel-4.10.11-1-pve: 4.10.11-9
pve-kernel-4.10.8-1-pve: 4.10.8-7
pve-kernel-4.4.44-1-pve: 4.4.44-84
pve-kernel-4.4.40-1-pve: 4.4.40-82
pve-kernel-4.4.6-1-pve: 4.4.6-48
ceph-fuse: 12.2.11+dfsg1-2.1
corosync: 3.0.2-pve2
criu: 3.11-3
glusterfs-client: 5.5-3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.11-pve1
libpve-access-control: 6.0-2
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-4
libpve-guest-common-perl: 3.0-1
libpve-http-server-perl: 3.0-2
libpve-storage-perl: 6.0-8
libqb0: 1.0.5-1
lvm2: 2.03.02-pve3
lxc-pve: 3.1.0-64
lxcfs: 3.0.3-pve60
novnc-pve: 1.0.0-60
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.0-7
pve-cluster: 6.0-7
pve-container: 3.0-7
pve-docs: 6.0-4
pve-edk2-firmware: 2.20190614-1
pve-firewall: 4.0-7
pve-firmware: 3.0-2
pve-ha-manager: 3.0-2
pve-i18n: 2.0-3
pve-qemu-kvm: 4.0.0-5
pve-xtermjs: 3.13.2-1
qemu-server: 6.0-7
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1


any idea how to narrow down the issue (and solve it... ;-) )? it is only a home server, so i am the only one affected, but i would like to get my VMs running again. :-/


kind regards,
andre
 
Hi,
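please check whether all the services are up and running correctly with systemctl status 'pve*' (quote the pattern so the shell does not expand it against files in the current directory). Then check the journal with journalctl -b.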
 
hi,

pve-cluster does not start:

Code:
root@server:/etc/apt# journalctl -xe
--
-- An ExecStart= process belonging to unit pve-cluster.service has exited.
--
-- The process' exit code is 'exited' and its exit status is 255.
Sep 05 17:45:45 server systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- The unit pve-cluster.service has entered the 'failed' state with result 'exit-code'.
Sep 05 17:45:45 server systemd[1]: Failed to start The Proxmox VE cluster filesystem.
-- Subject: A start job for unit pve-cluster.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A start job for unit pve-cluster.service has finished with a failure.
--
-- The job identifier is 12522 and the job result is failed.
Sep 05 17:45:45 server systemd[1]: Condition check resulted in Corosync Cluster Engine being skipped.
-- Subject: A start job for unit corosync.service has finished successfully
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A start job for unit corosync.service has finished successfully.
--
-- The job identifier is 12525.
Sep 05 17:45:45 server systemd[1]: pve-cluster.service: Service RestartSec=100ms expired, scheduling restart.
Sep 05 17:45:45 server systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
-- Subject: Automatic restarting of a unit has been scheduled
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Automatic restarting of the unit pve-cluster.service has been scheduled, as the result for
-- the configured Restart= setting for the unit.
Sep 05 17:45:45 server systemd[1]: Stopped The Proxmox VE cluster filesystem.
-- Subject: A stop job for unit pve-cluster.service has finished
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A stop job for unit pve-cluster.service has finished.
--
-- The job identifier is 12712 and the job result is done.
Sep 05 17:45:45 server systemd[1]: pve-cluster.service: Start request repeated too quickly.
Sep 05 17:45:45 server systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- The unit pve-cluster.service has entered the 'failed' state with result 'exit-code'.
Sep 05 17:45:45 server systemd[1]: Failed to start The Proxmox VE cluster filesystem.
-- Subject: A start job for unit pve-cluster.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A start job for unit pve-cluster.service has finished with a failure.
--
-- The job identifier is 12712 and the job result is failed.
Sep 05 17:45:45 server systemd[1]: Condition check resulted in Corosync Cluster Engine being skipped.
-- Subject: A start job for unit corosync.service has finished successfully
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A start job for unit corosync.service has finished successfully.
--
-- The job identifier is 12715.

tried to reinstall pve-cluster, but didn't succeed.

anyway:
Code:
root@server:/etc/apt# systemctl status pve*
Unit pve-repo-ca-certificates.crt.service could not be found.

how to reinstall that?
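
A likely explanation for the odd unit name above, assuming /etc/apt (the working directory shown in the prompt) contains a file called pve-repo-ca-certificates.crt: the unquoted pve* was expanded by the shell into that file name before systemctl ever saw it, so nothing would need to be reinstalled. The sketch below reproduces the effect in a throwaway directory; the variable names are only for illustration.

```shell
# Reproduce the glob expansion in a scratch directory (illustrative names).
demo_dir=$(mktemp -d)
touch "$demo_dir/pve-repo-ca-certificates.crt"

# Unquoted: the shell replaces pve* with matching file names in the cwd.
unquoted=$(cd "$demo_dir" && echo pve*)

# Quoted: the pattern is passed through literally, so a tool such as
# systemctl can match it against unit names itself.
quoted=$(cd "$demo_dir" && echo 'pve*')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

rm -rf "$demo_dir"
```

On the server, the quoted form would be systemctl status 'pve*', which lets systemd match the pattern against its loaded units.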
 
hmmm... is that missing package related to proxmox-enterprise? i am using the no-subscription version... or rather: WAS using... but i still want to use it... ;)
 
errors from journalctl -b that i think are related to this issue:
Code:
Sep 05 17:29:52 server systemd[1]: Failed to start Cgroup management daemon.
[...]
Sep 05 17:30:00 server systemd[1]: Starting Proxmox VE replication runner...
Sep 05 17:30:00 server pvesr[861]: ipcc_send_rec[1] failed: Connection refused
Sep 05 17:30:00 server pvesr[861]: ipcc_send_rec[2] failed: Connection refused
Sep 05 17:30:00 server pvesr[861]: ipcc_send_rec[3] failed: Connection refused
Sep 05 17:30:00 server pvesr[861]: Unable to load access control list: Connection refused
Sep 05 17:30:00 server systemd[1]: pvesr.service: Main process exited, code=exited, status=111/n/a
Sep 05 17:30:00 server systemd[1]: pvesr.service: Failed with result 'exit-code'.
Sep 05 17:30:00 server systemd[1]: Failed to start Proxmox VE replication runner.
[...]
Sep 05 17:30:03 server pmxcfs[1325]: [main] crit: fuse_mount error: File exists
Sep 05 17:30:03 server pmxcfs[1325]: [main] notice: exit proxmox configuration filesystem (-1)
Sep 05 17:30:03 server systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
Sep 05 17:30:03 server pmxcfs[1325]: [main] crit: fuse_mount error: File exists
Sep 05 17:30:03 server systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Sep 05 17:30:03 server pmxcfs[1325]: [main] notice: exit proxmox configuration filesystem (-1)
Sep 05 17:30:03 server systemd[1]: Failed to start The Proxmox VE cluster filesystem.
 
ok, /etc/pve is its own filesystem - i guess this is my problem. i removed the folder, rebooted, and the webgui is back again. only my configuration is gone. is there any chance to get it back? my VM files are still there....
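
The thread does not establish why fuse_mount reported "File exists"; one plausible reading is that /etc/pve was not a clean, unmounted directory when pmxcfs tried to mount on it. A minimal sketch for inspecting such a directory before removing anything could look like this. The function name mount_dir_state is made up for this example, and the mountpoint tool is only used if it is installed.

```shell
# Report the state of a directory that is meant to serve as a mount point.
# Prints one of: missing, mounted, not-empty, empty.
mount_dir_state() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing"
    elif command -v mountpoint >/dev/null 2>&1 && mountpoint -q "$dir"; then
        echo "mounted"
    elif [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
        echo "not-empty"
    else
        echo "empty"
    fi
}

# On the Proxmox host the directory of interest is /etc/pve.
mount_dir_state /etc/pve
```

A result of "not-empty" while pve-cluster is stopped would mean that plain files are sitting in the underlying directory, which is worth knowing before deciding what to delete.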
 
ok, that information is stored in config.db in /var/lib/pve-cluster. i had moved that file away earlier, so i restored it and everything seems to be back!!!
thanks a lot for lending an ear!
 
Glad you are back up and running! Consider performing regular backups to be safe ;) (if you do not perform those already).
 
i am backing up my VMs. that has to be enough, because i cannot back up this single system other than with a file backup. but i will extend my existing backups with the important files i have just discovered. is the /etc/pve filesystem completely contained in config.db?

regards,
andre
 
Yes, config.db is an SQLite database which pmxcfs mounts via FUSE at /etc/pve.
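
Since the whole of /etc/pve lives in that one file, adding it to an existing file backup can be sketched roughly as follows. The helper name backup_file and the backup directory in the example call are assumptions made for illustration, not something from this thread.

```shell
# Copy a file into a backup directory under a timestamped name and print
# the path of the copy. Both arguments are placeholders for illustration.
backup_file() {
    src=$1
    dest_dir=$2
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    target="$dest_dir/$(basename "$src").$(date +%Y%m%d-%H%M%S)"
    cp -p "$src" "$target" && echo "$target"
}

# Hypothetical call on the Proxmox host (the backup directory is an assumption):
# backup_file /var/lib/pve-cluster/config.db /root/pve-config-backup
```

A plain copy taken while the database is being written may not be consistent. If the sqlite3 command-line tool is installed, its .backup dot-command is an alternative that is designed for copying a live database.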