pveproxy hanging

blackpaw · Dec 20, 2017

Proxmox 4.4
no-subscription repo

I dist-upgraded two nodes on 11 Dec. Both of those nodes now have multiple unkillable pveproxy processes, and dmesg has many entries like:

[50996.416909] INFO: task pveproxy:6798 blocked for more than 120 seconds.
[50996.416914] Tainted: P O 4.4.95-1-pve #1
[50996.416918] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[50996.416922] pveproxy D ffff8809194e3df8 0 6798 1 0x00000004
[50996.416925] ffff8809194e3df8 ffff880ff6f5ed80 ffff880ff84fe200 ffff880fded5e200
[50996.416927] ffff8809194e4000 ffff880fc7fb43ac ffff880fded5e200 00000000ffffffff
[50996.416929] ffff880fc7fb43b0 ffff8809194e3e10 ffffffff818643b5 ffff880fc7fb43a8
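
In case it helps anyone compare notes, here is a rough sketch for pulling the affected task names and PIDs out of hung-task warnings like the ones above. The function name `hung_tasks` and the regular expression are my own illustration, not part of any Proxmox tooling, and it assumes the kernel message keeps the "INFO: task NAME:PID blocked for more than N seconds" wording shown in this log.

```python
import re

# Matches kernel hung-task warnings such as:
# [50996.416909] INFO: task pveproxy:6798 blocked for more than 120 seconds.
# (assumption: the message wording matches the dmesg excerpt in this post)
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def hung_tasks(dmesg_text):
    """Return sorted, de-duplicated (name, pid) pairs from hung-task warnings."""
    found = set()
    for line in dmesg_text.splitlines():
        match = HUNG_TASK.search(line)
        if match:
            found.add((match.group("name"), int(match.group("pid"))))
    return sorted(found)
```

Passing the saved output of dmesg to `hung_tasks` gives one entry per blocked task, which makes it easier to see whether only pveproxy is affected or other processes are stuck as well.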

The cluster file system is fine.
pvesm reports all storage OK.
pvecm status is normal.
qm list and qm migrate just hang.
I can't connect to the web GUI on the two nodes in question.
The third node, which I didn't upgrade, is fine, no problems.

It took a remote hard reset to bring the nodes back. Unfortunately, this has happened multiple times since.

systemctl status pveproxy
● pveproxy.service - PVE API Proxy Server
Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled)
Active: failed (Result: timeout) since Wed 2017-12-20 06:49:06 AEST; 3h 44min ago
Main PID: 4325 (code=exited, status=0/SUCCESS)

Dec 20 06:46:06 vng systemd[1]: pveproxy.service start operation timed out. Terminating.
Dec 20 06:47:36 vng systemd[1]: pveproxy.service stop-final-sigterm timed out. Killing.
Dec 20 06:49:06 vng systemd[1]: pveproxy.service still around after final SIGKILL. Entering failed mode.
Dec 20 06:49:06 vng systemd[1]: Failed to start PVE API Proxy Server.
Dec 20 06:49:06 vng systemd[1]: Unit pveproxy.service entered failed state.
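
The "still around after final SIGKILL" line fits the D state shown in dmesg: a process in uninterruptible sleep does not act on signals until the kernel call it is blocked in returns. A minimal sketch for listing such processes by reading /proc directly is below. The helper names `parse_stat` and `d_state_processes` are hypothetical, and it assumes the usual Linux /proc/PID/stat layout of "pid (comm) state ...".

```python
import glob


def parse_stat(line):
    """Split one /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the last ')' instead of splitting on whitespace.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def d_state_processes():
    """List (pid, comm) for every process currently in state D."""
    blocked = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry unreadable while scanning
        if state == "D":
            blocked.append((pid, comm))
    return sorted(blocked)
```

Calling `d_state_processes()` on an affected node should list the stuck pveproxy PIDs alongside anything else blocked on the same resource.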

Anyone else seeing this?
