Degrading proxy connection to ceph, timeouts with HTTP response code 596

Jan 6, 2026
Happy New Year,

since the end of November last year we have been experiencing issues with one of our Proxmox clusters which we cannot get a grip on. The issue first appeared when a Terraform provider tried to create some resources: the runs frequently failed with HTTP response code 596, a timeout, when hitting the Proxmox API.
The error can be reproduced with the following call:
Bash:
curl -ik https://f-cpu-104-01:8006/api2/json/nodes/f-cpu-64-01/storage/p-f-vm-twice/status -H "Authorization: PVEAPIToken=${PVEAPIToken}"
HTTP/1.1 596 Connection timed out
Cache-Control: max-age=0
Connection: close
Date: Tue, 30 Dec 2025 16:37:04 GMT
Pragma: no-cache
Server: pve-api-daemon/3.0
Expires: Tue, 30 Dec 2025 16:37:04 GMT

Since the issue does not happen deterministically, a colleague wrote a quick Python script to check whether these timeouts occur for all queried nodes and via all API entry nodes. The resulting graph is attached: the y-axis shows the HTTP status code (high = 596 = bad, low = 200 = good), the x-axis is time. The rows of the image are the different hosts we queried the API against, the columns are the nodes we query for, and the color encodes the storage (which, as the image shows, does not matter). A rough sketch of the probing idea follows after the plot.

[Attached plot 1767704687381.png: HTTP status codes over time, one row per API entry host, one column per queried node, color per storage]
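
The original script is not included here; as a minimal shell sketch of the same probing idea (host names, node names, storage name and the token variable are placeholders, not our actual inventory):

Bash:
# Probe sketch: for every combination of API entry host and queried node,
# record one HTTP status code per minute. All names below are placeholders;
# the API token is expected in $PVEAPIToken.
API_HOSTS="f-cpu-104-01 f-cpu-64-01"
NODES="f-data-hdd-01 f-data-hdd-02"
STORAGE="p-f-vm-twice"

while true; do
  for host in $API_HOSTS; do
    for node in $NODES; do
      code=$(curl -sk -o /dev/null -w '%{http_code}' --max-time 35 \
        "https://${host}:8006/api2/json/nodes/${node}/storage/${STORAGE}/status" \
        -H "Authorization: PVEAPIToken=${PVEAPIToken}")
      echo "$(date -Is) ${host} ${node} ${STORAGE} ${code}"
    done
  done
  sleep 60
done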
Initially, we discovered one node (2-01) to be especially unhealthy: for it no call succeeded at all, as can be seen in the third row, where all dots sit at 596 and none returned 200. On the server side we found error messages in the pveproxy service that corresponded to our requests, and Ceph reported an unhealthy status on that node.
Code:
/var/log/pveproxy/access.log:::ffff:192.168.199.153 - <redacted user and token> [...] "GET /api2/json/nodes/f-cpu-64-01/storage/p-f-vm-twice/status HTTP/1.1" 596 -

We restarted several services, which did not help. Finally we rebooted node 2-01, which resolved the issue temporarily, but after a few days the performance degraded again. When accessing Ceph information via the Proxmox UI we see the same timeouts, while the rest of the Proxmox UI is unaffected. We changed the network configuration of that node and finally removed it from the cluster entirely, but the issue returned, suggesting the cause lies elsewhere.
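
To take the browser out of the equation, the Ceph information shown in the UI can also be requested directly from the API, for example (host and node names are placeholders; we assume the per-node Ceph status endpoint is representative of what the UI fetches):

Bash:
# Query the per-node Ceph status endpoint with a generous timeout to see
# whether the plain API path times out as well. Names are placeholders.
curl -ik --max-time 60 \
  "https://f-cpu-104-01:8006/api2/json/nodes/f-data-hdd-01/ceph/status" \
  -H "Authorization: PVEAPIToken=${PVEAPIToken}"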

As a next step we upgraded our whole setup in December from 9.0 to the then most recent 9.1 version, which did not resolve the issue either. On one occasion it was enough to restart the pveproxy service to get rid of the issue, which is especially confusing since that did not help earlier: restarting pveproxy was the first thing we tried when we found the above-mentioned output in the pveproxy log.
My colleague also tried to dig into the pveproxy code to narrow down the issue, but has not come to a conclusion yet.

Code:
pveversion --verbose

proxmox-ve: 9.1.0 (running kernel: 6.14.11-4-pve)
pve-manager: 9.1.2 (running version: 9.1.2/9d436f37a0ac4172)
proxmox-kernel-helper: 9.0.4
proxmox-kernel-6.17.4-1-pve-signed: 6.17.4-1
proxmox-kernel-6.17: 6.17.4-1
proxmox-kernel-6.14.11-4-pve-signed: 6.14.11-4
proxmox-kernel-6.14: 6.14.11-4
proxmox-kernel-6.8: 6.8.12-16
proxmox-kernel-6.8.12-16-pve-signed: 6.8.12-16
proxmox-kernel-6.8.12-9-pve-signed: 6.8.12-9
ceph: 19.2.3-pve2
ceph-fuse: 19.2.3-pve2
corosync: 3.1.9-pve2
criu: 4.1.1-1
frr-pythontools: 10.4.1-1+pve1
ifupdown2: 3.3.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libproxmox-acme-perl: 1.7.0
libproxmox-backup-qemu0: 2.0.1
libproxmox-rs-perl: 0.4.1
libpve-access-control: 9.0.5
libpve-apiclient-perl: 3.4.2
libpve-cluster-api-perl: 9.0.7
libpve-cluster-perl: 9.0.7
libpve-common-perl: 9.1.1
libpve-guest-common-perl: 6.0.2
libpve-http-server-perl: 6.0.5
libpve-network-perl: 1.2.4
libpve-rs-perl: 0.11.4
libpve-storage-perl: 9.1.0
libspice-server1: 0.15.2-1+b1
lvm2: 2.03.31-2+pmx1
lxc-pve: 6.0.5-3
lxcfs: 6.0.4-pve1
novnc-pve: 1.6.0-3
proxmox-backup-client: 4.1.0-1
proxmox-backup-file-restore: 4.1.0-1
proxmox-backup-restore-image: 1.0.0
proxmox-firewall: 1.2.1
proxmox-kernel-helper: 9.0.4
proxmox-mail-forward: 1.0.2
proxmox-mini-journalreader: 1.6
proxmox-offline-mirror-helper: 0.7.3
proxmox-widget-toolkit: 5.1.5
pve-cluster: 9.0.7
pve-container: 6.0.18
pve-docs: 9.1.1
pve-edk2-firmware: 4.2025.05-2
pve-esxi-import-tools: 1.0.1
pve-firewall: 6.0.4
pve-firmware: 3.17-2
pve-ha-manager: 5.0.8
pve-i18n: 3.6.6
pve-qemu-kvm: 10.1.2-4
pve-xtermjs: 5.5.0-3
qemu-server: 9.1.2
smartmontools: 7.4-pve1
spiceterm: 3.4.1
swtpm: 0.8.0+pve3
vncterm: 1.9.1
zfsutils-linux: 2.3.4-pve1

Code:
sudo pvecm status

Cluster information
-------------------
Name:             fec-cluster
Config Version:   37
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Wed Jan  7 10:49:57 2026
Quorum provider:  corosync_votequorum
Nodes:            31
Node ID:          0x00000005
Ring ID:          3.1ac9
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   31
Highest expected: 31
Total votes:      31
Quorum:           16
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000003          1 192.168.199.149
0x00000005          1 192.168.199.130 (local)
0x00000006          1 192.168.199.131
0x00000007          1 192.168.199.132
0x00000008          1 192.168.199.133
0x00000009          1 192.168.199.134
0x0000000a          1 192.168.199.135
0x0000000b          1 192.168.199.136
0x0000000c          1 192.168.199.137
0x0000000d          1 192.168.199.138
0x0000000e          1 192.168.199.141
0x0000000f          1 192.168.199.142
0x00000010          1 192.168.199.143
0x00000011          1 192.168.199.144
0x00000012          1 192.168.199.145
0x00000013          1 192.168.199.146
0x00000014          1 192.168.199.147
0x00000015          1 192.168.199.148
0x00000016          1 192.168.199.150
0x00000017          1 192.168.199.152
0x00000018          1 192.168.199.153
0x00000019          1 192.168.199.154
0x0000001a          1 192.168.199.155
0x0000001b          1 192.168.199.156
0x0000001c          1 192.168.199.157
0x0000001d          1 192.168.199.158
0x0000001e          1 192.168.199.159
0x0000001f          1 192.168.199.160
0x00000020          1 192.168.199.161
0x00000021          1 192.168.199.162
0x00000022          1 192.168.199.163

Code:
/var/log/pveproxy/access.log:::ffff:192.168.199.153 - friday-nu-t1@fme-intern!tofu_token [19/12/2025:14:56:23 +0100] "GET /api2/json/nodes/f-data-hdd-01/storage/p-f-vm-twice/status HTTP/1.1" 596 -
/var/log/pveproxy/access.log:::ffff:192.168.199.153 - friday-nu-t1@fme-intern!tofu_token [19/12/2025:14:56:23 +0100] "GET /api2/json/nodes/f-data-hdd-01/storage/p-f-vm-thrice/status HTTP/1.1" 596 -
/var/log/pveproxy/access.log:::ffff:192.168.199.130 - user-one@fme-intern [19/12/2025:14:56:24 +0100] "GET /api2/json/nodes/f-data-hdd-01/storage/CephFS/rrddata?timeframe=month&cf=MAX HTTP/1.1" 596 -
/var/log/pveproxy/access.log:::ffff:10.142.100.137 - friday-nu-t1@fme-intern!tofu_token [19/12/2025:15:08:06 +0100] "GET /api2/json/nodes/f-data-hdd-01/storage/p-f-vm-thrice/status HTTP/1.1" 596 -
/var/log/pveproxy/access.log:::ffff:10.142.100.137 - friday-nu-t1@fme-intern!tofu_token [19/12/2025:15:08:06 +0100] "GET /api2/json/nodes/f-data-hdd-02/storage/p-f-vm-twice/status HTTP/1.1" 596 -
/var/log/pveproxy/access.log:::ffff:192.168.199.153 - friday-nu-t1@fme-intern!tofu_token [19/12/2025:15:09:06 +0100] "GET /api2/json/nodes/f-data-hdd-01/storage/p-f-vm-twice/status HTTP/1.1" 596 -
/var/log/pveproxy/access.log:::ffff:192.168.199.153 - friday-nu-t1@fme-intern!tofu_token [19/12/2025:15:09:06 +0100] "GET /api2/json/nodes/f-data-hdd-01/storage/p-f-vm-thrice/status HTTP/1.1" 596 -
/var/log/pveproxy/access.log:::ffff:192.168.199.144 - user-two@fme-intern [19/12/2025:15:10:39 +0100] "GET /api2/json/nodes/f-data-hdd-01/storage/p-f-data-nvme-twice/status HTTP/1.1" 596 -

Can anyone help us to find the root cause here?
 
Happy New Year to you too!

It seems the 30-second timeout (which already existed in 8.4) is hard-coded in /usr/share/perl5/PVE/APIServer/AnyEvent.pm at line 835. We just set it to 300.
Do not forget to restart pveproxy and pvedaemon afterwards.
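
Roughly, the workaround boils down to the following (verify the exact line in your version before editing; manual edits to this file will be overwritten by package updates):

Bash:
# Keep a backup, inspect the area around the line mentioned above,
# raise the timeout from 30 to 300 with an editor, then restart the daemons.
cp /usr/share/perl5/PVE/APIServer/AnyEvent.pm /root/AnyEvent.pm.bak
sed -n '825,845p' /usr/share/perl5/PVE/APIServer/AnyEvent.pm
# ...edit the timeout value (30 -> 300), e.g. with nano, then:
systemctl restart pveproxy pvedaemon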