PVE is not starting / hangs and not upgrading

rodionov12

New Member
Nov 30, 2023
Hi! I have a 2-node cluster.
I upgraded to 7.14.2 last week and since then have had problems with communication between the nodes: 401 errors, question marks in the web UI, etc., but all VMs kept working fine until yesterday. Yesterday the nodes stopped communicating with each other entirely, and I rebooted the one that had stopped responding.

After that, the node takes a very long time to boot (20-30 minutes), the web UI does not start, and only one VM starts automatically (3 are configured to auto-start). I tried to upgrade both nodes to 8.1.3, but the upgrade does not complete on the bad node. journalctl/dmesg always show timeouts.

pve-manager


Code:
[  725.573099] INFO: task pvestatd:2101 blocked for more than 604 seconds.
[  725.573113]       Tainted: P           O       6.5.11-6-pve #1
[  725.573121] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.


Code:
[  604.741985] INFO: task pvecm:2132 blocked for more than 483 seconds.
[  604.741993]       Tainted: P           O       6.5.11-6-pve #1
[  604.742000] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  604.742008] task:pvecm           state:D stack:0     pid:2132  ppid:1      flags:0x00004006
[  604.742013] Call Trace:
[  604.742014]  <TASK>
[  604.742017]  __schedule+0x3fd/0x1450
[  604.742021]  ? srso_return_thunk+0x5/0x10
[  604.742025]  ? __wake_up_common_lock+0x8b/0xd0
[  604.742035]  schedule+0x63/0x110
[  604.742039]  request_wait_answer+0x1be/0x2a0
[  604.742044]  ? __pfx_autoremove_wake_function+0x10/0x10
[  604.742050]  fuse_simple_request+0x19d/0x2d0
[  604.742055]  fuse_create_open+0x243/0x570
[  604.742070]  fuse_atomic_open+0x139/0x180
[  604.742075]  path_openat+0x70f/0x1180
[  604.742084]  do_filp_open+0xaf/0x170
[  604.742097]  do_sys_openat2+0xb3/0xe0
[  604.742101]  ? handle_mm_fault+0xad/0x360
[  604.742106]  __x64_sys_openat+0x6c/0xa0
[  604.742111]  do_syscall_64+0x5b/0x90
[  604.742115]  ? srso_return_thunk+0x5/0x10
[  604.742119]  ? exc_page_fault+0x94/0x1b0
[  604.742124]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8


Code:
[  483.909330] INFO: task pvecm:2132 blocked for more than 362 seconds.
[  483.909338]       Tainted: P           O       6.5.11-6-pve #1
[  483.909345] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  483.909353] task:pvecm           state:D stack:0     pid:2132  ppid:1      flags:0x00004006


Code:
[  604.741773] INFO: task pvestatd:2101 blocked for more than 483 seconds.
[  604.741789]       Tainted: P           O       6.5.11-6-pve #1
[  604.741796] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
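
To make the hung-task messages above easier to compare, here is a minimal sketch that tallies the longest reported blocked time per task. It is only a helper for reading the log, not a Proxmox tool; the names `HUNG_RE` and `summarize_hung_tasks` are my own, and the sample text is just the "INFO: task ..." lines copied from the excerpts in this post.

```python
import re
from collections import defaultdict

# Matches kernel hung-task lines such as:
# [  725.573099] INFO: task pvestatd:2101 blocked for more than 604 seconds.
HUNG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)

def summarize_hung_tasks(log_text):
    """Return {(task, pid): longest reported blocked time in seconds}."""
    longest = defaultdict(int)
    for line in log_text.splitlines():
        m = HUNG_RE.search(line)
        if m:
            key = (m.group("name"), int(m.group("pid")))
            longest[key] = max(longest[key], int(m.group("secs")))
    return dict(longest)

# The "INFO: task" lines from the dmesg excerpts in this post.
sample = """\
[  725.573099] INFO: task pvestatd:2101 blocked for more than 604 seconds.
[  604.741985] INFO: task pvecm:2132 blocked for more than 483 seconds.
[  483.909330] INFO: task pvecm:2132 blocked for more than 362 seconds.
[  604.741773] INFO: task pvestatd:2101 blocked for more than 483 seconds.
"""

for (name, pid), secs in sorted(summarize_hung_tasks(sample).items()):
    print(f"{name} (pid {pid}): blocked for more than {secs} s")
```

On a live node the same function could be fed the text of `dmesg` instead of the pasted sample.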


Code:
Cluster information
-------------------
Name:             computing
Config Version:   2
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Thu Nov 30 09:56:18 2023
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000001
Ring ID:          1.1717
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      2
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.1.3.3 (local)
0x00000002          1 10.1.3.5


Code:
Job for pveproxy.service failed because a timeout was exceeded.
See "systemctl status pveproxy.service" and "journalctl -xeu pveproxy.service" for details.
dpkg: error processing package pve-manager (--configure):
 installed pve-manager package post-installation script subprocess returned error exit status 1
dpkg: dependency problems prevent configuration of proxmox-ve:
 proxmox-ve depends on pve-manager (>= 8.0.4); however:
  Package pve-manager is not configured yet.

dpkg: error processing package proxmox-ve (--configure):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 pve-manager
 proxmox-ve
E: Sub-process /usr/bin/dpkg returned an error code (1)
 