Summary
noVNC consoles for QEMU VMs reliably disconnect within a few seconds, on every QEMU VM, while the LXC console (termproxy) works fine. This happens even when connecting from a Windows VM running on the same host, which rules out client-side network, proxy, and firewall issues. pveproxy logs show "SSL routines::record layer failure" at the exact moment of disconnect, and a packet capture confirms that the server sends the first FIN, with the underlying TLS session failing mid-stream.
Environment
pve-manager: 9.2.3 (running version: 9.2.3/d0fde103346cf89a)
libpve-http-server-perl: 6.0.5
novnc-pve: 1.7.0-1
libnet-ssleay-perl: 1.94-3
libssl3t64: 3.5.6-1~deb13u2
openssl: OpenSSL 3.5.6 7 Apr 2026
machine: pc-i440fx-11.0 (also reproduced on other machine types)
Single standalone node. No reverse proxy, no Cloudflare, no firewall/iptables rules affecting port 8006 (confirmed clean conntrack, no DROP/REJECT rules). Confirmed not MTU/fragmentation related (DF-ping at 1472 and 1400 bytes both succeed cleanly to all tested clients).
Steps to reproduce
- Open any QEMU VM's console (noVNC) from the web UI.
- Console connects successfully (HTTP/1.1 101 Switching Protocols, WebSocket upgrade succeeds, VNC handshake completes).
- After roughly 1–10 seconds of normal operation, the connection drops with the noVNC error Failed when connecting: Connection closed (code: 1006).
- Browser console shows a burst of net::ERR_SSL_PROTOCOL_ERROR across all concurrent requests to port 8006 (API calls, static assets, the websocket itself) — not just the VNC socket.
- LXC console (termproxy) on the same host does not exhibit this issue and stays connected normally for the same duration of testing.
Reproduced with:
- External client over public internet (different ISPs/source IPs)
- A client browser running on a Windows VM hosted on the same Proxmox node (rules out external network entirely)
- Multiple different QEMU VMs (different VMIDs, different vga settings tested: default, std, cirrus)
Server-side logs at the moment of disconnect
pveproxy[...]: problem with client ::ffff:<client-ip>; error:0A000139:SSL routines::record layer failure
pveproxy -debug output around the failure shows a clean WebSocket upgrade followed shortly after by client_do_disconnect, with no error printed by the AnyEvent/http-server layer itself — the failure originates lower, in OpenSSL's record layer, not in PVE::APIServer::AnyEvent.
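For anyone collecting the same log lines, here is a minimal Python sketch that pulls the client address and the OpenSSL error code out of a pveproxy line and splits the code into its library and reason numbers. The regex is inferred from the single log line quoted above and may not match other error types. The bit layout (library number in bits 23 to 30, reason in the low 23 bits) is my understanding of the OpenSSL 3.x error packing, and the sample address 192.0.2.10 and PID are placeholders.

```python
import re

# Layout inferred from the one pveproxy line quoted above (assumption):
#   problem with client <addr>; error:<hex code>:<lib>:<func>:<reason>
LOG_RE = re.compile(
    r"problem with client (?P<client>[^;\s]+); "
    r"error:(?P<code>[0-9A-Fa-f]+):(?P<lib>[^:]*):(?P<func>[^:]*):(?P<reason>.*)$"
)


def decode_openssl_error(code_hex):
    """Split a packed OpenSSL 3.x error code into (library, reason) numbers."""
    code = int(code_hex, 16)
    lib = (code >> 23) & 0xFF   # library number, 20 is the SSL library
    reason = code & 0x7FFFFF    # reason number within that library
    return lib, reason


def parse_line(line):
    """Return a dict for a matching log line, or None if it does not match."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    lib, reason = decode_openssl_error(m.group("code"))
    return {
        "client": m.group("client"),
        "code": m.group("code"),
        "lib": lib,
        "reason": reason,
        "text": m.group("reason"),
    }


if __name__ == "__main__":
    # Placeholder line; 192.0.2.10 stands in for the real client address.
    sample = ("pveproxy[1234]: problem with client ::ffff:192.0.2.10; "
              "error:0A000139:SSL routines::record layer failure")
    print(parse_line(sample))
```

For 0A000139 this gives library 20 and reason 313, which matches the "SSL routines" / "record layer failure" text in the log. That text is a generic wrapper error, so the decoded numbers confirm where the error was raised but not why.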
Packet capture findings
Captured with tcpdump on the host (tap*, fwln*, fwpr*, vmbr0 chain for the relevant VM) while reproducing from a same-host Windows VM client (65.109.121.94), filtering only that client's traffic (excluding background bot/scanner traffic on the public interface):
server.8006 > client.<port>: Flags [F.] <- server sends FIN first
client.<port> > server.8006: Flags [R.] <- client RSTs in response
This pattern repeats across multiple concurrent TCP connections almost simultaneously (4 separate sessions, ports 54896/54897/54899/54900, all closed by the server within ~1.5 seconds of each other), correlating exactly with the record layer failure log lines. This strongly suggests the failure is not connection-specific but tied to a shared resource/state in the pveproxy worker process or the underlying OpenSSL context at that moment.
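The "who closed first" check can be automated roughly as follows. This is a sketch that assumes the capture was printed as text with tcpdump -nn (numeric addresses and ports, default one-line-per-packet layout). The sample lines, timestamps, and 192.0.2.x addresses are made up for illustration and are not taken from my capture.

```python
import re

# Assumed tcpdump -nn text layout:
#   <time> IP <src>.<sport> > <dst>.<dport>: Flags [<flags>], ...
PKT_RE = re.compile(
    r"IP6? (?P<src>\S+)\.(?P<sport>\d+) > (?P<dst>\S+)\.(?P<dport>\d+): "
    r"Flags \[(?P<flags>[^\]]*)\]"
)


def first_closers(lines, server_port=8006):
    """Map each client (addr, port) to the side that sent the first FIN or RST."""
    result = {}
    for line in lines:
        m = PKT_RE.search(line)
        if m is None:
            continue
        flags = m.group("flags")
        if "F" not in flags and "R" not in flags:
            continue  # not a teardown packet
        from_server = int(m.group("sport")) == server_port
        if from_server:
            client = (m.group("dst"), int(m.group("dport")))
        else:
            client = (m.group("src"), int(m.group("sport")))
        # setdefault keeps only the first teardown packet seen per connection
        result.setdefault(client, "server" if from_server else "client")
    return result


if __name__ == "__main__":
    # Illustrative lines only (placeholder addresses and timestamps).
    capture = [
        "12:00:01.000000 IP 192.0.2.10.54896 > 192.0.2.1.8006: Flags [P.], seq 1:90, ack 1, win 1026, length 89",
        "12:00:01.500000 IP 192.0.2.1.8006 > 192.0.2.10.54896: Flags [F.], seq 1, ack 90, win 502, length 0",
        "12:00:01.500100 IP 192.0.2.10.54896 > 192.0.2.1.8006: Flags [R.], seq 90, ack 2, win 0, length 0",
    ]
    for client, side in first_closers(capture).items():
        print(client, "closed first by", side)
```

Running this over the full capture should report "server" for each of the affected client ports if the pattern described above holds.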
What has been ruled out
- Client-side network, VPN/proxy, browser, OS — reproduced from a VM on the same physical host
- Firewall rules / iptables / Docker conntrack interference — clean conntrack table (376/262144), no relevant DROP/REJECT rules, no NAT rules touching port 8006
- MTU/fragmentation — DF-ping tests at 1400 and 1472 bytes both succeed without fragmentation
- pveproxy worker resource exhaustion — file descriptor count (14) and process limits (1024 soft / 524288 hard) far from any limit; memory usage normal
- Certificate issues — certs valid, pvecm updatecerts --force and service restart did not change behavior
- VM-specific config (vga type, machine type, cpu type) — reproduced across different VM configurations
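The file descriptor check can be repeated with a short Python sketch that reads /proc on Linux. It assumes the limits quoted above correspond to the "Max open files" row of /proc/<pid>/limits, and it has to run as root (or as the same user as the worker) to read another process's fd directory. The PID is supplied by hand; nothing here looks up pveproxy workers automatically.

```python
import os
import sys


def fd_usage(pid):
    """Return (open_fd_count, soft_limit, hard_limit) for a Linux process.

    Limits are returned as strings because the kernel may report "unlimited".
    """
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    soft = hard = None
    with open(f"/proc/{pid}/limits") as fh:
        for line in fh:
            if line.startswith("Max open files"):
                # Columns: "Max open files", soft limit, hard limit, units
                soft, hard = line.split()[3:5]
                break
    return open_fds, soft, hard


if __name__ == "__main__":
    # Usage: python3 fd_usage.py <pid of a pveproxy worker>
    target = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    count, soft, hard = fd_usage(target)
    print(f"pid {target}: {count} open fds, limit {soft} soft / {hard} hard")
```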
Hypothesis
The failure is isolated to noVNC/vncwebsocket sessions (LXC termproxy and plain HTTPS API calls tested with openssl s_client are unaffected), and it correlates with the larger, bursty TLS record traffic typical of VNC framebuffer updates. This looks like a possible regression in how libpve-http-server-perl 6.0.5 / Net::SSLeay 1.94-3 interacts with OpenSSL 3.5.6 when handling sustained WebSocket traffic with larger TLS records, possibly related to TLS session ticket rotation or buffer handling under this specific combination of versions.
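To narrow down the TLS side, a small Python probe can show which protocol version and cipher pveproxy negotiates when the client is capped at TLS 1.2 versus TLS 1.3. This is only a sketch: it exercises the handshake, not the sustained WebSocket traffic that triggers the failure, so it can confirm what is being negotiated but cannot reproduce the disconnect on its own. Certificate verification is disabled on the assumption that the node uses its default self-signed certificate, and the host name in the usage example is a placeholder.

```python
import socket
import ssl


def make_context(max_version):
    """Client context capped at max_version, without certificate checks."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False       # must be disabled before CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE  # assumption: self-signed PVE certificate
    ctx.maximum_version = max_version
    return ctx


def probe(host, port=8006, max_version=ssl.TLSVersion.TLSv1_3, timeout=5.0):
    """Handshake once and return (negotiated protocol, cipher name)."""
    ctx = make_context(max_version)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]


if __name__ == "__main__":
    host = "pve.example.invalid"  # placeholder, replace with the node address
    for cap in (ssl.TLSVersion.TLSv1_2, ssl.TLSVersion.TLSv1_3):
        try:
            print(cap.name, "->", probe(host, max_version=cap))
        except OSError as exc:
            print(cap.name, "-> failed:", exc)
```

If the browser only ever negotiates TLS 1.3 with the node, comparing console behavior with the client capped at TLS 1.2 would be one way to test the session ticket part of the hypothesis.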
Questions for the Proxmox team
- Is anyone else seeing SSL routines::record layer failure specifically on noVNC/vncwebsocket sessions on PVE 9.2.x with OpenSSL 3.5.6?
- Is there a known interaction between libpve-http-server-perl 6.0.5 and recent OpenSSL 3.5.x point releases affecting long-lived WebSocket TLS sessions?
- Any recommended TLS cipher/protocol restriction in /etc/default/pveproxy known to work around this?
Happy to provide full pveproxy -debug output, complete pcap, or run further tests as needed.