invalid PVE ticket (401) on 8-node PVE 6.4 cluster / chrony?

Hi,

After 292 days of uptime without issue on our 8-node PVE 6.4 cluster (with subscription, kept up to date), we started to randomly get "invalid PVE ticket (401)" messages in the web UI.

There are multiple threads about this particular error message on the forum; in some cases it seems to have been solved by replacing systemd-timesyncd with chrony.

The wiki mentions chrony but only in the context of PVE 7:

https://pve.proxmox.com/wiki/Time_Synchronization

Is it safe to "apt-get install chrony" on an 8-node PVE 6.4 cluster? I tried it on a standalone PVE 6.4 host and it simply stopped systemd-timesyncd, but the unit does not seem to be masked (status below, followed by a sketch of how we would mask it by hand):

Code:
# systemctl status systemd-timesyncd
● systemd-timesyncd.service - Network Time Synchronization
   Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled; vendor preset: enabled)
  Drop-In: /usr/lib/systemd/system/systemd-timesyncd.service.d
           └─disable-with-time-daemon.conf
   Active: inactive (dead) since Thu 2022-04-21 14:48:52 CEST; 1min 39s ago
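
To keep systemd-timesyncd from coming back behind chrony's back, this is roughly what we would run on the test host. It is just a sketch of the standard systemctl workflow, not an official PVE procedure:

Code:
# installing chrony stopped systemd-timesyncd for us, but left the unit unmasked
apt-get install chrony
# make sure the old service stays down and cannot be started again
systemctl stop systemd-timesyncd
systemctl disable systemd-timesyncd
systemctl mask systemd-timesyncd
# check that chrony is actually synchronising
chronyc tracking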

We also looked at various logs, including following "journalctl -f" on multiple nodes, but nothing is logged at the moment the web UI shows the "invalid PVE ticket" error. Is there a way to debug what happens?
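
As far as we understand, ticket verification is time based, so clock drift between the nodes is an obvious suspect. For the next occurrence we plan to compare the clocks across the nodes and watch only the relevant daemons, roughly like this (the hostnames are placeholders for our nodes, just a sketch):

Code:
# compare each node's clock against this one (hostnames are placeholders)
for n in node1 node2 node3; do
    echo -n "$n: "
    ssh root@$n date +%s.%N
done
# follow only the authentication-related daemons instead of the whole journal
journalctl -f -u pveproxy -u pvedaemon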

Thanks for your help!
 
Code:
# pveversion --verbose
proxmox-ve: 6.4-1 (running kernel: 5.4.119-1-pve)
pve-manager: 6.4-14 (running version: 6.4-14/15e2bf61)
pve-kernel-5.4: 6.4-15
pve-kernel-helper: 6.4-15
pve-kernel-5.4.174-2-pve: 5.4.174-2
pve-kernel-5.4.162-1-pve: 5.4.162-2
pve-kernel-5.4.119-1-pve: 5.4.119-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph: 15.2.15-pve1~bpo10
ceph-fuse: 15.2.15-pve1~bpo10
corosync: 3.1.5-pve2~bpo10+1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve4~bpo10
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.22-pve2~bpo10+1
libproxmox-acme-perl: 1.1.0
libproxmox-backup-qemu0: 1.1.0-1
libpve-access-control: 6.4-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.4-4
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.2-3
libpve-storage-perl: 6.4-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.1.13-2
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.6-2
pve-cluster: 6.4-1
pve-container: 3.3-6
pve-docs: 6.4-2
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-4
pve-firmware: 3.3-2
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-6
pve-xtermjs: 4.7.0-3
qemu-server: 6.4-2
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.7-pve1
 
We still get random 401 errors. Looking more closely at some PVE processes:

Code:
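# process-tree excerpt (ps auxf or similar), trimmed to the pveproxy processes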
www-data    2900  0.0  0.0 354016 140092 ?       Ss    2021  14:28 pveproxy
www-data 1774629  0.3  0.0 364960 136240 ?       S    08:56   0:10  \_ pveproxy worker
www-data 1796770  0.4  0.0 365224 136336 ?       S    09:09   0:10  \_ pveproxy worker
www-data 1840035  0.6  0.0 364496 135312 ?       S    09:35   0:04  \_ pveproxy worker

root@r640b:~# systemctl status pveproxy
● pveproxy.service - PVE API Proxy Server
   Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; vendor preset: enabled)
   Active: active (running) since Sat 2021-07-03 13:02:25 CEST; 9 months 18 days ago
  Process: 868855 ExecReload=/usr/bin/pveproxy restart (code=exited, status=0/SUCCESS)
 Main PID: 2900 (pveproxy)
    Tasks: 4 (limit: 618170)
   Memory: 203.3M
   CGroup: /system.slice/pveproxy.service
           ├─   2900 pveproxy
           ├─1774629 pveproxy worker
           ├─1796770 pveproxy worker
           └─1840035 pveproxy worker

Apr 22 09:09:34 r640b pveproxy[2900]: worker 1796770 started
Apr 22 09:09:39 r640b pveproxy[1796770]: Clearing outdated entries from certificate cache
Apr 22 09:20:41 r640b pveproxy[1713846]: Clearing outdated entries from certificate cache
Apr 22 09:26:52 r640b pveproxy[1774629]: Clearing outdated entries from certificate cache
Apr 22 09:35:24 r640b pveproxy[1713846]: worker exit
Apr 22 09:35:24 r640b pveproxy[2900]: worker 1713846 finished
Apr 22 09:35:24 r640b pveproxy[2900]: starting 1 worker(s)
Apr 22 09:35:24 r640b pveproxy[2900]: worker 1840035 started
Apr 22 09:36:20 r640b pveproxy[1840035]: Clearing outdated entries from certificate cache
Apr 22 09:40:04 r640b pveproxy[1796770]: Clearing outdated entries from certificate cache

It looks like the pveproxy main process (PID 2900) hasn't been restarted since its initial launch 9 months ago. The same is true for most other PVE service processes, e.g.:

Code:
# systemctl status pvestatd
● pvestatd.service - PVE Status Daemon
   Loaded: loaded (/lib/systemd/system/pvestatd.service; enabled; vendor preset: enabled)
   Active: active (running) since Sat 2021-07-03 13:02:23 CEST; 9 months 18 days ago
 Main PID: 2864 (pvestatd)
    Tasks: 1 (limit: 618170)
   Memory: 214.9M
   CGroup: /system.slice/pvestatd.service
           └─2864 pvestatd

Is this normal? Should we restart these services through the PVE web UI or with systemctl?
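
If restarting is the way to go, we would probably do it node by node with systemctl rather than through the web UI, along these lines (just our guess at the relevant units; the pveproxy status above shows an ExecReload, so a reload should at least cycle the workers):

Code:
# on one node at a time, cycle the API daemons and the status daemon
systemctl reload-or-restart pvedaemon pveproxy
systemctl restart pvestatd
# confirm everything came back up
systemctl status pvedaemon pveproxy pvestatd --no-pager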
 
