[SOLVED] Proxmox 4 Upgrade Hanging "Setting up pve-manager"

Jospeh Huber · Jan 11, 2018

Hi,

Today I am trying to upgrade from an older Proxmox 4 version to the current Proxmox 4 version in my cluster, using the no-subscription repo.
The first node has no problems.
The second node hangs in the configuring step of "pve-manager".
My starting version:
proxmox-ve: 4.4-93 (running kernel: 4.4.76-1-pve)
pve-manager: 4.4-17 (running version: 4.4-17/70a65945)
pve-kernel-4.4.6-1-pve: 4.4.6-48
pve-kernel-4.4.35-1-pve: 4.4.35-77
pve-kernel-4.4.35-2-pve: 4.4.35-79
pve-kernel-4.4.59-1-pve: 4.4.59-87
pve-kernel-4.4.16-1-pve: 4.4.16-64
pve-kernel-4.4.24-1-pve: 4.4.24-72
pve-kernel-4.4.19-1-pve: 4.4.19-66
pve-kernel-4.4.76-1-pve: 4.4.76-93
lvm2: 2.02.116-pve3
corosync-pve: 2.4.2-2~pve4+1
libqb0: 1.0.1-1
pve-cluster: 4.0-52
qemu-server: 4.0-111
pve-firmware: 1.1-11
libpve-common-perl: 4.0-96
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-76
pve-libspice-server1: 0.12.8-2
vncterm: 1.3-2
pve-docs: 4.4-4
pve-qemu-kvm: 2.7.1-4
pve-container: 1.0-101
pve-firewall: 2.0-33
pve-ha-manager: 1.0-41
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-4
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-9
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.9-pve15~bpo80
drbdmanage: not correctly installed
ceph: 10.2.9-1~bpo80+1

This is the last printout:
Setting up pve-manager (4.4-21) ...

This command is executed:
/usr/bin/dpkg --status-fd 22 --configure perl-modules:all perl:amd64 libssl1.0.0:amd64 libkrb5support0:amd64 libk5crypto3:amd64 libkrb5-3:amd64 libgssapi-krb5-2:amd64 libgssrpc4:amd64 libkadm5clnt-mit9:amd64 libkdb5-7:amd64 libkadm5srv-mit9:amd64 libkrad0:amd64 libxml2:amd64 libcups2:amd64 libcurl3:amd64 libcurl3-gnutls:amd64 libisc-export95:amd64 libdns-export100:amd64 libx11-data:all libx11-6:amd64 libgdk-pixbuf2.0-common:all libgdk-pixbuf2.0-0:amd64 libicu52:amd64 libisccfg-export90:amd64 libirs-export91:amd64 mysql-common:all libmysqlclient18:amd64 libnss3:amd64 libwbclient0:amd64 samba-common:all samba-libs:amd64 libsmbclient:amd64 smbclient:amd64 libx11-xcb1:amd64 libxfixes3:amd64 libxcursor1:amd64 libxi6:amd64 libxrandr2:amd64 libxtst6:amd64 rsync:amd64 openssl:amd64 pve-cluster:amd64 openssh-client:amd64 openssh-sftp-server:amd64 openssh-server:amd64 ssh:all wget:amd64 libisc95:amd64 libdns100:amd64 libisccc90:amd64 libisccfg90:amd64 libbind9-90:amd64 liblwres90:amd64 bind9-host:amd64 dnsutils:amd64 krb5-locales:all ncurses-term:all procmail:amd64 librados2:amd64 librbd1:amd64 libradosstriper1:amd64 librgw2:amd64 python-rados:amd64 libcephfs1:amd64 python-cephfs:amd64 python-rbd:amd64 ceph-common:amd64 ceph-base:amd64 ceph-osd:amd64 ceph-mon:amd64 ceph:amd64 python-ceph:amd64 libdbi1:amd64 libio-socket-ssl-perl:all libxml-libxml-perl:amd64 pve-kernel-4.4.98-3-pve:amd64 pve-qemu-kvm:amd64 qemu-server:amd64 pve-container:all pve-manager:amd64 proxmox-ve:all pve-kernel-4.4.76-1-pve:amd64 tcpdump:amd64

fuser -v /var/cache/debconf/config.dat
USER PID ACCESS COMMAND
/var/cache/debconf/config.dat:

This is the hanging command:
root 8163 0.0 0.0 63592 17616 ? S 18:28 0:00 /usr/bin/perl -w /usr/share/debconf/frontend /var/lib/dpkg/info/pve-manager.postinst configure 4.4-17

There is nothing special in the logs...
Any ideas?

fabian · Jan 12, 2018

please post the whole process subtree containing the hanging command, as shown by "ps faxl"

Jospeh Huber · Jan 12, 2018

Unfortunately I killed the command, I was too impatient ;-)

So pve-manager and everything after it in the dpkg --configure command is unconfigured now, including the kernel and proxmox-ve:

"pve-manager:amd64 proxmox-ve:all pve-kernel-4.4.76-1-pve:amd64 tcpdump:amd64"
At the moment the system is broken.

I can rerun the configure step for pve-manager alone, but the result is the same and it hangs:
0 0 26518 19121 20 0 19020 4728 wait S+ pts/20 0:00 \_ /usr/bin/dpkg --configure pve-manager:amd64
0 0 26519 26518 20 0 63592 17716 wait S+ pts/20 0:00 \_ /usr/bin/perl -w /usr/share/debconf/frontend /var/lib/dpkg/info/pve-manager.postinst configure 4.4-17
0 0 26526 26519 20 0 13328 3084 wait S+ pts/20 0:00 \_ /bin/bash /var/lib/dpkg/info/pve-manager.postinst configure 4.4-17

The problem is in the script "/var/lib/dpkg/info/pve-manager.postinst" when it is called with "configure".
I added "set -x" to the script, and the output is:
Setting up pve-manager (4.4-21) ...
+ set -e
+ . /usr/share/debconf/confmodule
++ '[' '!' '' ']'
++ PERL_DL_NONLAZY=1
++ export PERL_DL_NONLAZY
++ '[' '' ']'
++ exec /usr/share/debconf/frontend /var/lib/dpkg/info/pve-manager.postinst configure 4.4-17
+ set -e
+ . /usr/share/debconf/confmodule
++ '[' '!' 1 ']'
++ '[' -z '' ']'
++ exec
++ '[' '' ']'
++ exec
++ DEBCONF_REDIR=1
++ export DEBCONF_REDIR
+ db_stop
+ echo STOP
+ case "$1" in
+ mkdir /etc/pve
+ true
+ rm -rf /var/lib/pve-manager/apl-available
+ test -e /etc/cron.daily/pve
+ rm -f /etc/init.d/pvebanner
+ rm -f /etc/init.d/pvenetcommit
++ shuf -i 0-59 -n 1
+ MIN=44
++ shuf -i 2-5 -n 1
+ HOUR=3
+ cat
+ test '!' -e /var/lib/pve-manager/apl-info/download.proxmox.com
+ test -f /root/.forward
+ grep -q '|/usr/bin/pvemailforward' /root/.forward
+ test -f /etc/lsb-base-logging.sh
+ '[' -f /etc/systemd/system/ceph.service ']'
++ md5sum /etc/systemd/system/ceph.service
+ md5='f716952fcc5dda4ecdb153c02627da52 /etc/systemd/system/ceph.service'
+ [[ f716952fcc5dda4ecdb153c02627da52 /etc/systemd/system/ceph.service == \2\1\b\2\e\7\a\7\c\4\f\f\c\f\9\2\a\d\0\e\c\2\c\9\0\5\e\8\8\e\5\b\ \ \/\e\t\c\/\s\y\s\t\e\m\d\/\s\y\s\t\e\m\/\c\e\p\h\.\s\e\r\v\i\c\e ]]
+ systemctl --system daemon-reload
+ for service in pvedaemon pveproxy spiceproxy pvestatd pvebanner pvenetcommit pve-manager
+ deb-systemd-helper unmask pvedaemon.service
+ deb-systemd-helper --quiet was-enabled pvedaemon.service
+ deb-systemd-helper enable pvedaemon.service
+ for service in pvedaemon pveproxy spiceproxy pvestatd pvebanner pvenetcommit pve-manager
+ deb-systemd-helper unmask pveproxy.service
+ deb-systemd-helper --quiet was-enabled pveproxy.service
+ deb-systemd-helper enable pveproxy.service
+ for service in pvedaemon pveproxy spiceproxy pvestatd pvebanner pvenetcommit pve-manager
+ deb-systemd-helper unmask spiceproxy.service
+ deb-systemd-helper --quiet was-enabled spiceproxy.service
+ deb-systemd-helper enable spiceproxy.service
+ for service in pvedaemon pveproxy spiceproxy pvestatd pvebanner pvenetcommit pve-manager
+ deb-systemd-helper unmask pvestatd.service
+ deb-systemd-helper --quiet was-enabled pvestatd.service
+ deb-systemd-helper enable pvestatd.service
+ for service in pvedaemon pveproxy spiceproxy pvestatd pvebanner pvenetcommit pve-manager
+ deb-systemd-helper unmask pvebanner.service
+ deb-systemd-helper --quiet was-enabled pvebanner.service
+ deb-systemd-helper enable pvebanner.service
+ for service in pvedaemon pveproxy spiceproxy pvestatd pvebanner pvenetcommit pve-manager
+ deb-systemd-helper unmask pvenetcommit.service
+ deb-systemd-helper --quiet was-enabled pvenetcommit.service
+ deb-systemd-helper enable pvenetcommit.service
+ for service in pvedaemon pveproxy spiceproxy pvestatd pvebanner pvenetcommit pve-manager
+ deb-systemd-helper unmask pve-manager.service
+ deb-systemd-helper --quiet was-enabled pve-manager.service
+ deb-systemd-helper enable pve-manager.service
+ test '!' -e /proxmox_install_mode
+ for service in pvedaemon pveproxy spiceproxy pvestatd
+ deb-systemd-invoke reload-or-restart pvedaemon
+ for service in pvedaemon pveproxy spiceproxy pvestatd
+ deb-systemd-invoke reload-or-restart pveproxy
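
In a trace like the one above, the last line with a single leading "+" is the command the script is stuck in. A minimal sketch for pulling that line out of a saved trace; the function name last_traced_command is made up for illustration, and it assumes the default "+ " trace prefix (PS4 unchanged):

```shell
# Print the last top-level command from a `set -x` trace read on stdin.
# Lines starting with "++" come from sourced files or command substitutions
# and are skipped; only lines starting with "+ " are considered.
last_traced_command() {
    grep -E '^\+ ' | tail -n 1 | sed 's/^+ //'
}

# Demo on the final lines of the trace above.
printf '%s\n' \
    '+ deb-systemd-invoke reload-or-restart pvedaemon' \
    '+ for service in pvedaemon pveproxy spiceproxy pvestatd' \
    '+ deb-systemd-invoke reload-or-restart pveproxy' | last_traced_command
```

Here that points at the reload-or-restart of pveproxy, which matches where the postinst stops.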

If you can help me get past this, I think I can continue like this ...
dpkg --configure proxmox-ve:all pve-kernel-4.4.76-1-pve:amd64 tcpdump:amd64
or an
apt-get upgrade
should do the job.

fabian · Jan 12, 2018

please post the full output of ps faxl..

Jospeh Huber · Jan 12, 2018

Do you really need the full output? It is a running system, so the output would be about 1,100 lines...
Or is this extract sufficient?

4 0 21543 31254 20 0 82740 5848 poll_s Ss ? 0:00 \_ sshd: root@pts/27
4 0 21655 21543 20 0 24344 6340 wait Ss pts/27 0:00 | \_ -bash
0 0 11611 21655 20 0 19020 4716 wait S+ pts/27 0:00 | \_ /usr/bin/dpkg --configure pve-manager:amd64
0 0 11612 11611 20 0 63608 17592 wait S+ pts/27 0:00 | \_ /usr/bin/perl -w /usr/share/debconf/frontend /var/lib/dpkg/info/pve-manager.postinst configu
0 0 11619 11612 20 0 13328 3092 wait S+ pts/27 0:00 | \_ /bin/bash /var/lib/dpkg/info/pve-manager.postinst configure 4.4-17
4 0 11682 11619 20 0 22484 2564 poll_s S+ pts/27 0:00 | \_ /bin/systemctl reload-or-restart pveproxy
4 0 18747 31254 20 0 82740 5884 - Ss ? 0:00 \_ sshd: root@pts/20
4 0 19121 18747 20 0 24352 6348 wait Ss pts/20 0:00 \_ -bash
0 0 12880 19121 20 0 11204 2640 - R+ pts/20 0:00 \_ ps faxl
4 0 8239 1 20 0 22484 2548 poll_s S ? 0:00 /bin/systemctl reload-or-restart pveproxy
4 0 30723 1 0 -20 17960 5684 pause S<L ? 0:11 /usr/bin/atop -a -w /var/log/atop/atop_20180112 600
4 0 20100 1 20 0 22484 2556 poll_s S pts/20 0:00 /bin/systemctl reload-or-restart pveproxy
4 0 22971 1 20 0 22484 2560 poll_s S ? 0:00 /bin/systemctl reload-or-restart pveproxy

I killed all the other "/bin/systemctl reload-or-restart pveproxy" processes, but it hangs again.
It looks like the problem is in the restart of pveproxy.

/bin/systemctl status pveproxy
● pveproxy.service - PVE API Proxy Server
Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled)
Active: inactive (dead) since Fri 2018-01-12 06:25:06 CET; 6h ago
Main PID: 20008 (code=exited, status=0/SUCCESS)

Jan 12 06:25:02 vmhost2 systemd[1]: Stopping PVE API Proxy Server...
Jan 12 06:25:05 vmhost2 pveproxy[20008]: received signal TERM
Jan 12 06:25:05 vmhost2 pveproxy[20008]: server closing
Jan 12 06:25:05 vmhost2 pveproxy[15525]: worker exit
Jan 12 06:25:05 vmhost2 pveproxy[3584]: worker exit
Jan 12 06:25:05 vmhost2 pveproxy[20008]: worker 15525 finished
Jan 12 06:25:05 vmhost2 pveproxy[20008]: worker 3584 finished
Jan 12 06:25:05 vmhost2 pveproxy[20008]: worker 13075 finished
Jan 12 06:25:05 vmhost2 pveproxy[20008]: server stopped
Jan 12 06:25:06 vmhost2 pveproxy[22983]: worker exit
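
The leftover "systemctl reload-or-restart pveproxy" processes in the listing above have PID 1 as their parent, i.e. they were orphaned by earlier attempts. A rough sketch for picking them out of "ps faxl" output; the function name orphaned_systemctl_pids is hypothetical, and the column positions (PID in column 3, PPID in column 4) are assumed from the listing in this thread:

```shell
# Print the PIDs of "systemctl ... pveproxy" processes whose parent is PID 1.
# Input: `ps faxl`-style lines on stdin, with the columns
#   F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
# (layout assumed from the listing in this thread).
orphaned_systemctl_pids() {
    awk '$4 == 1 && /systemctl/ && /pveproxy/ { print $3 }'
}

# Demo on three lines copied from the listing above; only 8239 qualifies.
orphaned_systemctl_pids <<'EOF'
4 0 11682 11619 20 0 22484 2564 poll_s S+ pts/27 0:00 | \_ /bin/systemctl reload-or-restart pveproxy
4 0 8239 1 20 0 22484 2548 poll_s S ? 0:00 /bin/systemctl reload-or-restart pveproxy
4 0 30723 1 0 -20 17960 5684 pause S<L ? 0:11 /usr/bin/atop -a -w /var/log/atop/atop_20180112 600
EOF
```

On a live system the input would come from "ps faxl | orphaned_systemctl_pids". "systemctl list-jobs" shows the jobs systemd itself has queued; whether it would have revealed the blocker in this case is not known from the thread.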

Jospeh Huber · Jan 12, 2018

It looks like a hanging cron job, which is also stuck in "/bin/systemctl restart pveproxy.service":

4 0 1648 1 20 0 27504 2840 hrtime Ss ? 0:56 /usr/sbin/cron -f
5 0 22364 1648 20 0 42240 2604 wait S ? 0:00 \_ /usr/sbin/CRON -f
4 0 22367 22364 20 0 4336 820 wait Ss ? 0:00 \_ /bin/sh -c test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )
0 0 22368 22367 20 0 4224 668 poll_s S ? 0:00 \_ run-parts --report /etc/cron.daily
0 0 22530 22368 20 0 4336 808 wait S ? 0:00 \_ /bin/sh /etc/cron.daily/logrotate
4 0 22531 22530 20 0 29456 2792 wait S ? 0:00 \_ /usr/sbin/logrotate /etc/logrotate.conf
0 0 22571 22531 20 0 4336 728 wait S ? 0:00 \_ sh -c ??/etc/init.d/pveproxy restart > /dev/null ??/etc/init.d/spiceproxy restart > /dev/null logrotate_script /var/log/pveproxy/access.log
0 0 22572 22571 20 0 4336 1668 wait S ? 0:00 \_ /bin/sh /etc/init.d/pveproxy restart
4 0 22583 22572 20 0 22484 2632 poll_s S ? 0:00 \_ /bin/systemctl restart pveproxy.service

But anyway, even if I kill all these restart jobs, I cannot start or restart pveproxy manually on the command line.
"service pveproxy start" hangs ...
There is nothing about pveproxy in the logs (/var/log/syslog, messages).
In my opinion the problem is that pveproxy cannot be started.

How can I solve this problem without rebooting?
... rebooting with an unconfigured kernel and system will fail...

fabian · Jan 12, 2018

in that case the question is why pveproxy does not restart - are there still leftover pveproxy processes? if so, can you do "cat /proc/PID/stack" for each of their pids? what happens if you kill -9 them? is there anything else in the logs that looks suspicious? can you please provide "pveversion -v"

Jospeh Huber · Jan 12, 2018

No, there is no pveproxy process running:
ps waux | grep pveproxy

pveversion -v
proxmox-ve: not correctly installed (running kernel: 4.4.76-1-pve)
pve-manager: not correctly installed (running version: 4.4-21/e0dadcf8)
pve-kernel-4.4.6-1-pve: 4.4.6-48
pve-kernel-4.4.35-1-pve: 4.4.35-77
pve-kernel-4.4.98-3-pve: 4.4.98-103
pve-kernel-4.4.35-2-pve: 4.4.35-79
pve-kernel-4.4.15-1-pve: 4.4.15-60
pve-kernel-4.4.59-1-pve: 4.4.59-87
pve-kernel-4.4.16-1-pve: 4.4.16-64
pve-kernel-4.4.24-1-pve: 4.4.24-72
pve-kernel-4.4.19-1-pve: 4.4.19-66
pve-kernel-4.4.76-1-pve: 4.4.76-94
lvm2: 2.02.116-pve3
corosync-pve: 2.4.2-2~pve4+1
libqb0: 1.0.1-1
pve-cluster: 4.0-54
qemu-server: 4.0-114
pve-firmware: 1.1-11
libpve-common-perl: 4.0-96
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-76
pve-libspice-server1: 0.12.8-2
vncterm: 1.3-2
pve-docs: 4.4-4
pve-qemu-kvm: 2.9.1-5~pve4
pve-container: 1.0-104
pve-firewall: 2.0-33
pve-ha-manager: 1.0-41
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-4
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-9
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.9-pve15~bpo80
drbdmanage: not correctly installed
ceph: 10.2.10-1~bpo80+1

Is this the problem with the hanging clustered filesystem?
forum.proxmox.com/threads/pveproxy-crashes-unable-to-start-it.35144/

Jospeh Huber · Jan 12, 2018

Jospeh Huber said:
No there is no pve-proxy running.

Is this the problem with the hanging clustered filesystem?
forum.proxmox.com/threads/pveproxy-crashes-unable-to-start-it.35144/

"pve-cluster restart" on the corrupt node did not help...

Jospeh Huber · Jan 12, 2018

# this hangs
service pveproxy start

# ps tree
0 19121 18747 20 0 24376 6376 wait Ss pts/20 0:00 \_ -bash
4 0 2761 19121 20 0 22484 2540 poll_s S+ pts/20 0:00 \_ systemctl start pveproxy.service
0 0 2795 2761 20 0 13176 1544 poll_s S+ pts/20 0:00 \_ /bin/systemd-tty-ask-password-agent --watch

cat /proc/2761/stack
[<ffffffff812246c9>] poll_schedule_timeout+0x49/0x70
[<ffffffff81225d52>] do_sys_poll+0x442/0x560
[<ffffffff8122618d>] SyS_ppoll+0x17d/0x1b0
[<ffffffff81865d76>] entry_SYSCALL_64_fastpath+0x16/0x75
[<ffffffffffffffff>] 0xffffffffffffffff

cat /proc/2795/stack
[<ffffffff812246c9>] poll_schedule_timeout+0x49/0x70
[<ffffffff81225d52>] do_sys_poll+0x442/0x560
[<ffffffff81225f87>] SyS_poll+0x97/0x120
[<ffffffff81865d76>] entry_SYSCALL_64_fastpath+0x16/0x75
[<ffffffffffffffff>] 0xffffffffffffffff

Any ideas?
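
Both stacks above end in a poll call, which suggests these two processes are only waiting for something else rather than being stuck themselves. For collecting the stacks of several PIDs in one go, a small sketch; the function name dump_stacks is made up, and reading /proc/PID/stack normally needs root and a kernel that exposes that file:

```shell
# Print the kernel stack of each PID given as an argument.
# /proc/PID/stack is Linux-specific and usually readable by root only,
# so a fallback message is printed when it cannot be read.
dump_stacks() {
    for pid in "$@"; do
        echo "== PID $pid"
        cat "/proc/$pid/stack" 2>/dev/null ||
            echo "(stack not readable: needs root and a live process)"
    done
}

# Demo: dump the stack of the current shell.
dump_stacks "$$"
```

For the case in this post the call would be "dump_stacks 2761 2795".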

Jospeh Huber · Jan 12, 2018

Is it possible and safe to reboot and then try the configure step again?

All packages except proxmox-ve and pve-manager are now configured, including the kernel.

/usr/bin/dpkg --configure proxmox-ve:all
dpkg: dependency problems prevent configuration of proxmox-ve:
proxmox-ve depends on pve-manager; however:
Package pve-manager is not configured yet.

dpkg: error processing package proxmox-ve (--configure):
dependency problems - leaving unconfigured
Errors were encountered while processing:
proxmox-ve

dpkg -l:
iU proxmox-ve 4.4-103 all The Proxmox Virtual Environment
ii psmisc 22.21-2 amd64 utilities that use the proc file system
ii pve-cluster 4.0-54 amd64 Cluster Infrastructure for Proxmox Virtual Environment
ii pve-container 1.0-104 all Proxmox VE Container management tool
ii pve-docs 4.4-4 all Proxmox VE Documentation
ii pve-firewall 2.0-33 amd64 Proxmox VE Firewall
ii pve-firmware 1.1-11 all Binary firmware code for the pve-kernel
ii pve-ha-manager 1.0-41 amd64 Proxmox VE HA Manager
ii pve-kernel-4.4.15-1-pve 4.4.15-60 amd64 The Proxmox PVE Kernel Image
ii pve-kernel-4.4.16-1-pve 4.4.16-64 amd64 The Proxmox PVE Kernel Image
ii pve-kernel-4.4.19-1-pve 4.4.19-66 amd64 The Proxmox PVE Kernel Image
ii pve-kernel-4.4.24-1-pve 4.4.24-72 amd64 The Proxmox PVE Kernel Image
ii pve-kernel-4.4.35-1-pve 4.4.35-77 amd64 The Proxmox PVE Kernel Image
ii pve-kernel-4.4.35-2-pve 4.4.35-79 amd64 The Proxmox PVE Kernel Image
ii pve-kernel-4.4.59-1-pve 4.4.59-87 amd64 The Proxmox PVE Kernel Image
ii pve-kernel-4.4.6-1-pve 4.4.6-48 amd64 The Proxmox PVE Kernel Image
ii pve-kernel-4.4.76-1-pve 4.4.76-94 amd64 The Proxmox PVE Kernel Image
ii pve-kernel-4.4.98-3-pve 4.4.98-103 amd64 The Proxmox PVE Kernel Image
ii pve-libspice-server1 0.12.8-2 amd64 SPICE remote display system server library
iF pve-manager 4.4-21 amd64 The Proxmox Virtual Environment
ii pve-qemu-kvm 2.9.1-5~pve4 amd64 Full virtualization on x86 hardware

Jospeh Huber · Jan 15, 2018

@fabian: Any ideas?

fabian · Jan 15, 2018

what happens if you kill the "0 0 2795 2761 20 0 13176 1544 poll_s S+ pts/20 0:00 \_ /bin/systemd-tty-ask-password-agent --watch" process?

Jospeh Huber · Jan 15, 2018

"kill" and "kill -9" ... nothing... I have tried some other "things" too ... I have absolutely no idea!

4 0 25072 24520 20 0 22484 2600 poll_s S+ pts/21 0:00 \_ systemctl start pveproxy.service
0 0 25093 25072 20 0 0 0 exit Z+ pts/21 0:00 \_ [systemd-tty-ask] <defunct>

What do you think: is it possible and safe to reboot and try the configuration again?
... I think there is no other option.

fabian · Jan 15, 2018

it probably won't make the situation any worse than it already is..

Jospeh Huber · Jan 15, 2018

Strange, the reboot solved my problem.

After that I could configure the two unconfigured packages:
dpkg --configure proxmox-ve:all pve-manager:amd64

Until now, the problem has occurred on only one host of 7 others ...

If somebody else has this issue, make sure that as many of the other packages as possible, ideally all of them, are configured before rebooting.
Before the reboot, I manually removed the dpkg locks and ran the configure step for the packages listed after the first failed one:
dpkg --configure proxmox-ve:all pve-kernel-4.4.76-1-pve:amd64 tcpdump:amd64
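
The list of packages to configure after the failed one can be derived from the original dpkg --configure command line instead of being typed by hand. A minimal sketch, assuming the package list is passed as separate arguments; the function name packages_after is made up for illustration:

```shell
# Print the packages that come after a given package in an argument list.
# Usage: packages_after FAILED_PACKAGE PACKAGE...
packages_after() {
    failed=$1
    shift
    seen=0
    rest=""
    for pkg in "$@"; do
        if [ "$seen" -eq 1 ]; then
            # Append with a separating space, but no leading space.
            rest="${rest:+$rest }$pkg"
        elif [ "$pkg" = "$failed" ]; then
            seen=1
        fi
    done
    printf '%s\n' "$rest"
}

# Demo with the tail of the package list from the first post.
packages_after pve-manager:amd64 \
    pve-container:all pve-manager:amd64 proxmox-ve:all \
    pve-kernel-4.4.76-1-pve:amd64 tcpdump:amd64
```

The output can then be passed to "dpkg --configure". Note that "dpkg --configure -a" would also retry the failed package itself, which in this thread means running into the same hang again.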
