[SOLVED] Can't login and timeout when restarting services

tsulhc

Mar 17, 2022
Hey guys!

After adding a node to the cluster, I can no longer log in to the GUI, and I can't restart pvedaemon or pveproxy or run pvecm updatecerts --force.

pveversion -v

Bash:
proxmox-ve: 7.1-1 (running kernel: 5.13.19-6-pve)
pve-manager: 7.1-10 (running version: 7.1-10/6ddebafe)
pve-kernel-helper: 7.1-12
pve-kernel-5.13: 7.1-9
pve-kernel-5.13.19-6-pve: 5.13.19-14
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph-fuse: 15.2.15-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-6
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-3
libpve-guest-common-perl: 4.1-1
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.1-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-2
proxmox-backup-client: 2.1.5-1
proxmox-backup-file-restore: 2.1.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-7
pve-cluster: 7.1-3
pve-container: 4.1-4
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-5
pve-ha-manager: 3.3-3
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.1-2
pve-xtermjs: 4.16.0-1
qemu-server: 7.1-4
smartmontools: 7.2-1
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.2-pve1

The cluster is quorate without problems, according to pvecm status:

Bash:
Cluster information
-------------------
Name:             Netpass
Config Version:   32
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Thu Mar 17 17:58:49 2022
Quorum provider:  corosync_votequorum
Nodes:            8
Node ID:          0x00000008
Ring ID:          1.8a741
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   8
Highest expected: 8
Total votes:      8
Quorum:           5
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 192.168.15.70
0x00000002          1 192.168.15.21
0x00000003          1 192.168.15.80
0x00000004          1 192.168.15.50
0x00000005          1 192.168.15.90
0x00000006          1 192.168.15.40
0x00000007          1 192.168.15.60
0x00000008          1 192.168.15.30 (local)
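
For reference, the "Quorum: 5" value above is consistent with plain majority arithmetic over the 8 expected votes. This is a minimal sketch, assuming the default votequorum behaviour with no special options such as two_node; the function name quorum_for is made up for illustration and is not part of any Proxmox or corosync tool.

```shell
# Hypothetical helper: majority quorum for a given number of expected votes.
# Assumes default votequorum rules (integer division, then add one).
quorum_for() {
    expected=$1
    echo $(( expected / 2 + 1 ))
}

quorum_for 8   # prints 5, matching "Quorum: 5" in the output above
```

With pvecm expected 1 the same arithmetic gives a quorum of 1, which is why a single node can keep working on its own after that command.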

systemctl status pveproxy

Bash:
● pveproxy.service - PVE API Proxy Server
     Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2022-03-16 20:20:57 CET; 21h ago
    Process: 1923 ExecStartPre=/usr/bin/pvecm updatecerts --silent (code=exited, status=0/SUCCESS)
    Process: 1926 ExecStart=/usr/bin/pveproxy start (code=exited, status=0/SUCCESS)
   Main PID: 1929 (pveproxy)
      Tasks: 4 (limit: 77016)
     Memory: 228.1M
        CPU: 10.015s
     CGroup: /system.slice/pveproxy.service
             ├─  1929 pveproxy
             ├─  1930 pveproxy worker
             ├─507590 pveproxy worker
             └─517540 pveproxy worker

Mar 17 13:02:13 mikasa pveproxy[1931]: proxy detected vanished client connection
Mar 17 13:02:13 mikasa pveproxy[1930]: proxy detected vanished client connection
Mar 17 13:02:37 mikasa pveproxy[1931]: worker exit
Mar 17 13:02:37 mikasa pveproxy[1929]: worker 1931 finished
Mar 17 13:02:37 mikasa pveproxy[1929]: starting 1 worker(s)
Mar 17 13:02:37 mikasa pveproxy[1929]: worker 517540 started
Mar 17 13:03:14 mikasa pveproxy[517540]: Clearing outdated entries from certificate cache
Mar 17 13:06:16 mikasa pveproxy[507590]: proxy detected vanished client connection
Mar 17 17:36:58 mikasa pveproxy[517540]: proxy detected vanished client connection
Mar 17 18:10:52 mikasa pveproxy[1930]: proxy detected vanished client connection # logged every time I try to log in

systemctl restart pveproxy

Bash:
Job for pveproxy.service failed because a timeout was exceeded.
See "systemctl status pveproxy.service" and "journalctl -xe" for details.

systemctl status pveproxy.service

Bash:
● pveproxy.service - PVE API Proxy Server
     Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; vendor preset: enabled)
     Active: deactivating (final-sigterm) (Result: timeout) since Thu 2022-03-17 18:41:01 CET; 22min ago
    Process: 696475 ExecStartPre=/usr/bin/pvecm updatecerts --silent (code=killed, signal=KILL)
      Tasks: 3 (limit: 77016)
     Memory: 156.0M
        CPU: 182ms
     CGroup: /system.slice/pveproxy.service
             ├─687710 /usr/bin/perl -T /usr/bin/pveproxy stop
             ├─693698 /usr/bin/perl /usr/bin/pvecm updatecerts --silent
             └─696476 /usr/bin/perl /usr/bin/pvecm updatecerts --silent

Mar 17 18:59:04 mikasa systemd[1]: Starting PVE API Proxy Server...
Mar 17 18:59:35 mikasa pvecm[696475]: got timeout
Mar 17 19:00:35 mikasa systemd[1]: pveproxy.service: start-pre operation timed out. Terminating.
Mar 17 19:02:05 mikasa systemd[1]: pveproxy.service: State 'stop-sigterm' timed out. Killing.
Mar 17 19:02:05 mikasa systemd[1]: pveproxy.service: Killing process 696475 (pvecm) with signal SIGKILL.
Mar 17 19:02:05 mikasa systemd[1]: pveproxy.service: Killing process 687710 (pveproxy) with signal SIGKILL.
Mar 17 19:02:05 mikasa systemd[1]: pveproxy.service: Killing process 693698 (pvecm) with signal SIGKILL.
Mar 17 19:02:05 mikasa systemd[1]: pveproxy.service: Killing process 696476 (pvecm) with signal SIGKILL.

service pvedaemon restart

Bash:
Job for pvedaemon.service failed because a timeout was exceeded.
See "systemctl status pvedaemon.service" and "journalctl -xe" for details.

systemctl status pvedaemon.service

Bash:
● pvedaemon.service - PVE API Daemon
     Loaded: loaded (/lib/systemd/system/pvedaemon.service; enabled; vendor preset: enabled)
     Active: activating (start) since Thu 2022-03-17 18:32:11 CET; 17s ago
Cntrl PID: 683742 (pvedaemon)
      Tasks: 6 (limit: 77016)
     Memory: 355.1M
        CPU: 283ms
     CGroup: /system.slice/pvedaemon.service
             ├─134389 pvedaemon worker
             ├─134390 pvedaemon worker
             ├─134391 pvedaemon worker
             ├─677909 /usr/bin/perl -T /usr/bin/pvedaemon stop
             ├─680793 /usr/bin/perl -T /usr/bin/pvedaemon start
             └─683742 /usr/bin/perl -T /usr/bin/pvedaemon start

pvecm updatecerts --force

Bash:
(re)generate node files
generate new node certificate
got timeout

NTP is synced on all nodes.

I can only access the GUI or restart the services by killing the cluster and resetting the quorum with pvecm expected 1.
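
Before falling back to pvecm expected 1, it may be worth confirming what the node itself reports about quorum. The following is only a sketch: is_quorate is a hypothetical helper name, and it assumes the pvecm status output has the same "Quorate:" line format as shown earlier in this post.

```shell
# Hypothetical helper: reads `pvecm status` output on stdin and succeeds
# only if the "Quorate:" line says "Yes". The "Flags:" line is ignored
# because its first field is "Flags:", not "Quorate:".
is_quorate() {
    awk '$1 == "Quorate:" { ok = ($2 == "Yes") } END { exit !ok }'
}

# Example use on a cluster node (commented out; needs pvecm installed):
# if pvecm status | is_quorate; then
#     echo "node reports quorum"
# else
#     echo "node reports no quorum"
# fi
```

In this case the node did report quorum while the services still timed out, so a check like this only rules out one possible cause.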

EDIT: I was probably just impatient; the issue resolved itself overnight.
 