Problems after upgrading the cluster from 8 to 9

Kosh

Hi.
After upgrading the entire cluster (most of the machines haven't been rebooted yet and are still running the old 6.5 kernel), the web interface became very slow. Everything opens and runs sluggishly, and it throws "loading" errors and "broken pipe 596." I can't find the cause, and I don't see any obvious problems in the logs.


 
Hi!

Which exact version of pve-manager have you upgraded to (e.g. pveversion)? Could you post a syslog from one of the nodes you are experiencing the problems on?
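
In case it helps, here is a minimal sketch of how both pieces of information could be collected into a single file for posting. The function name `collect_diag` and the default file name are made up for illustration, and the list of units is only an assumption about which services are relevant to a slow web interface; adjust it as needed.

```shell
#!/bin/sh
# Sketch: gather the version info and recent logs into one file.
# collect_diag and pve-diag.txt are hypothetical names, not Proxmox tooling.
collect_diag() {
    out="${1:-pve-diag.txt}"
    {
        echo "== pveversion -v =="
        if command -v pveversion >/dev/null 2>&1; then
            pveversion -v
        else
            echo "pveversion not found (not a Proxmox VE node?)"
        fi
        echo "== journal, last hour =="
        if command -v journalctl >/dev/null 2>&1; then
            # Assumed set of units; add or remove -u options as needed.
            journalctl --since "1 hour ago" --no-pager \
                -u pveproxy -u pvedaemon -u pvescheduler \
                -u pve-cluster -u corosync || true
        else
            echo "journalctl not found"
        fi
    } > "$out" 2>&1
}

# Example: collect_diag /tmp/cloud-p001-diag.txt
```

Review the resulting file before attaching it, since the journal can contain user names and host names.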
 
pveversion
pve-manager/9.0.6/49c767b70aeb6648 (running kernel: 6.14.11-1-pve)


Code:
journalctl -f
Oct 02 12:21:31 cloud-p001 pveproxy[2091468]: proxy detected vanished client connection
Oct 02 12:21:31 cloud-p001 pveproxy[2091468]: proxy detected vanished client connection
Oct 02 12:21:31 cloud-p001 pveproxy[2091468]: proxy detected vanished client connection
Oct 02 12:21:31 cloud-p001 pveproxy[2091468]: proxy detected vanished client connection
Oct 02 12:21:34 cloud-p001 corosync[4007843]:   [TOTEM ] Retransmit List: 6cc2f
Oct 02 12:22:00 cloud-p001 corosync[4007843]:   [TOTEM ] Retransmit List: 6cfad
Oct 02 12:22:00 cloud-p001 corosync[4007843]:   [TOTEM ] Retransmit List: 6cfd2
Oct 02 12:22:08 cloud-p001 pvedaemon[1949228]: <root@pam> successful auth for user 'user'
Oct 02 12:22:12 cloud-p001 pvescheduler[2132589]: replication: cfs-lock 'file-replication_cfg' error: got lock request timeout
Oct 02 12:22:13 cloud-p001 pvescheduler[2132590]: jobs: cfs-lock 'file-jobs_cfg' error: got lock request timeout
Oct 02 12:22:20 cloud-p001 corosync[4007843]:   [TOTEM ] Retransmit List: 6d41c
Oct 02 12:22:20 cloud-p001 corosync[4007843]:   [TOTEM ] Retransmit List: 6d41d
Oct 02 12:22:20 cloud-p001 pveproxy[2086543]: proxy detected vanished client connection
Oct 02 12:22:20 cloud-p001 pveproxy[2086543]: proxy detected vanished client connection
Oct 02 12:22:20 cloud-p001 pveproxy[2086543]: proxy detected vanished client connection
Oct 02 12:22:20 cloud-p001 pveproxy[2086543]: proxy detected vanished client connection
Oct 02 12:22:20 cloud-p001 pveproxy[2086543]: proxy detected vanished client connection
Oct 02 12:22:23 cloud-p001 pveproxy[2086543]: proxy detected vanished client connection
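
To get a quick overview of which daemon is producing most of these messages, the journal output can be summarized per service. This is only a rough sketch: the function name `summarize_journal` is made up, and it assumes the default short journal format shown above, where the fifth field is `service[pid]:`.

```shell
#!/bin/sh
# Sketch: count journal lines per service name, most frequent first.
# Assumes lines look like: "Oct 02 12:21:31 host service[pid]: message".
summarize_journal() {
    awk '
        NF >= 5 {
            tag = $5
            sub(/\[.*/, "", tag)   # drop "[pid]:" if present
            sub(/:$/, "", tag)     # drop trailing ":" for tags without a pid
            count[tag]++
        }
        END {
            for (t in count) printf "%d %s\n", count[t], t
        }
    ' | sort -rn
}

# Example: journalctl --since "1 hour ago" --no-pager | summarize_journal
```

The counts only show where the noise comes from; they do not by themselves say which message is the cause.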



The problem isn't limited to a single server; it affects the entire cluster. No matter which node you log into, the result is the same: very slow web interface performance.

pveversion
Code:
pve-manager/9.0.6/49c767b70aeb6648 (running kernel: 6.14.11-1-pve)
root@cloud-p001:~# pveversion -v
proxmox-ve: 9.0.0 (running kernel: 6.14.11-1-pve)
pve-manager: 9.0.6 (running version: 9.0.6/49c767b70aeb6648)
proxmox-kernel-helper: 9.0.4
pve-kernel-5.15: 7.4-6
proxmox-kernel-6.14.11-1-pve-signed: 6.14.11-1
proxmox-kernel-6.14: 6.14.11-1
proxmox-kernel-6.8: 6.8.12-4
proxmox-kernel-6.8.12-4-pve-signed: 6.8.12-4
proxmox-kernel-6.5.13-3-pve-signed: 6.5.13-3
pve-kernel-5.15.116-1-pve: 5.15.116-1
pve-kernel-5.15.30-2-pve: 5.15.30-3
ceph-fuse: 19.2.3-pve1
corosync: 3.1.9-pve2
criu: 4.1.1-1
frr-pythontools: 10.3.1-1+pve4
ifupdown2: 3.3.0-1+pmx10
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libproxmox-acme-perl: 1.7.0
libproxmox-backup-qemu0: 2.0.1
libproxmox-rs-perl: 0.4.1
libpve-access-control: 9.0.3
libpve-apiclient-perl: 3.4.0
libpve-cluster-api-perl: 9.0.6
libpve-cluster-perl: 9.0.6
libpve-common-perl: 9.0.9
libpve-guest-common-perl: 6.0.2
libpve-http-server-perl: 6.0.4
libpve-network-perl: 1.1.6
libpve-rs-perl: 0.10.10
libpve-storage-perl: 9.0.13
libspice-server1: 0.15.2-1+b1
lvm2: 2.03.31-2+pmx1
lxc-pve: 6.0.4-2
lxcfs: 6.0.4-pve1
novnc-pve: 1.6.0-3
proxmox-backup-client: 4.0.14-1
proxmox-backup-file-restore: 4.0.14-1
proxmox-backup-restore-image: 1.0.0
proxmox-firewall: 1.1.2
proxmox-kernel-helper: 9.0.4
proxmox-mail-forward: 1.0.2
proxmox-mini-journalreader: 1.6
proxmox-offline-mirror-helper: 0.7.1
proxmox-widget-toolkit: 5.0.5
pve-cluster: 9.0.6
pve-container: 6.0.9
pve-docs: 9.0.8
pve-edk2-firmware: 4.2025.02-4
pve-esxi-import-tools: 1.0.1
pve-firewall: 6.0.3
pve-firmware: 3.16-4
pve-ha-manager: 5.0.4
pve-i18n: 3.5.2
pve-qemu-kvm: 10.0.2-4
pve-xtermjs: 5.5.0-2
qemu-server: 9.0.19
smartmontools: 7.4-pve1
spiceterm: 3.4.0
swtpm: 0.8.0+pve2
vncterm: 1.9.0
zfsutils-linux: 2.3.4-pve1
 
Does the performance improve if you upgrade to a newer version of pve-manager? There were some performance-related improvements in pve-manager >= 9.0.8.
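
A small sketch for checking whether a node is already at or above that version. The helper names `version_ge` and `manager_version` are made up for illustration, and the comparison relies on GNU `sort -V` being available.

```shell
#!/bin/sh
# version_ge A B: succeeds if version A >= version B (uses GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# manager_version: read "pve-manager/9.0.6/49c7... (running kernel: ...)"
# on stdin and print the second "/"-separated field, e.g. "9.0.6".
manager_version() {
    cut -d/ -f2
}

# Example on a node:
#   ver=$(pveversion | manager_version)
#   version_ge "$ver" 9.0.8 || echo "pve-manager $ver is older than 9.0.8"
```

If the node turns out to be older, the usual `apt update` followed by `apt full-upgrade` should pull in the newer pve-manager, assuming the package repositories are configured correctly.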
 
I'll try to update tomorrow and I'll definitely report back on the results.