[SOLVED] Proxmox Backup Proxy seemingly crashes after being accessed through reverse proxy after update to 4.0

nett_hier

Hello,

I just upgraded my PBS to 4.0 and noticed that I'm no longer able to access it through the reverse proxy hosted on the same server (tailscale serve).
After restarting the proxmox-backup-proxy service I can access it via port 8007, but even that stops working once I try to open the proxied version in the browser. Accessing the proxied version via curl does not trigger the crash, so I'm assuming it has something to do with one of the assets being fetched.
Code:
$ curl https://localhost:8007 -k
<!DOCTYPE html>
(etc.)

$ curl https://<proxied-host>
<!DOCTYPE html>
(etc.)
First load of the proxied host in the browser:
(screenshot attached)
Subsequent loads only generate a 502.
Following this, attempting to access the site via curl also fails:
Code:
$ curl https://localhost:8007 -k
curl: (56) Recv failure: Connection reset by peer
Restarting the proxmox-backup-proxy service fixes the issue until the next access through the reverse proxy. Accessing the website using the 8007 port works fine and doesn't trigger the crash.

The proxmox-backup-proxy service doesn't seem to log anything at the time of crash, but this error is being repeatedly printed:
Code:
get_network_interfaces failed - could not deserialize ip link output
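(For reference, I was watching the service journal while trying to reproduce this, roughly like so:)
Code:
$ journalctl -u proxmox-backup-proxy -f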
FWIW, my ip link output looks like this (MAC addresses obfuscated):
Code:
$ ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp7s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 11:22:23:12:55:44 brd ff:ff:ff:ff:ff:ff
    altname enx123456
3: enp8s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 13:12:aa:bb:cc:aa brd ff:ff:ff:ff:ff:ff
    altname enx123457
4: tailscale0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1280 qdisc fq_codel state UNKNOWN mode DEFAULT group default qlen 500
    link/none
 
Could you post the output of

Code:
ip -details -json link show
JSON:
[{"ifindex":1,"ifname":"lo","flags":["LOOPBACK","UP","LOWER_UP"],"mtu":65536,"qdisc":"noqueue","operstate":"UNKNOWN","linkmode":"DEFAULT","group":"default","txqlen":1000,"link_type":"loopback","address":"00:00:00:00:00:00","broadcast":"00:00:00:00:00:00","promiscuity":0,"allmulti":0,"min_mtu":0,"max_mtu":0,"inet6_addr_gen_mode":"eui64","num_tx_queues":1,"num_rx_queues":1,"gso_max_size":65536,"gso_max_segs":65535,"tso_max_size":524280,"tso_max_segs":65535,"gro_max_size":65536,"gso_ipv4_max_size":65536,"gro_ipv4_max_size":65536},{"ifindex":2,"ifname":"enp7s0","flags":["BROADCAST","MULTICAST","UP","LOWER_UP"],"mtu":1500,"qdisc":"mq","operstate":"UP","linkmode":"DEFAULT","group":"default","txqlen":1000,"link_type":"ether","address":"11:11:22:33:44:55","broadcast":"ff:ff:ff:ff:ff:ff","promiscuity":0,"allmulti":0,"min_mtu":68,"max_mtu":9216,"inet6_addr_gen_mode":"eui64","num_tx_queues":8,"num_rx_queues":8,"gso_max_size":65536,"gso_max_segs":65535,"tso_max_size":65536,"tso_max_segs":65535,"gro_max_size":65536,"gso_ipv4_max_size":65536,"gro_ipv4_max_size":65536,"parentbus":"pci","parentdev":"0000:07:00.0","vfinfo_list":[],"altnames":["enx111122334455"]},{"ifindex":3,"ifname":"enp8s0","flags":["BROADCAST","MULTICAST","UP","LOWER_UP"],"mtu":1500,"qdisc":"mq","operstate":"UP","linkmode":"DEFAULT","group":"default","txqlen":1000,"link_type":"ether","address":"22:11:22:33:44:55","broadcast":"ff:ff:ff:ff:ff:ff","promiscuity":0,"allmulti":0,"min_mtu":68,"max_mtu":9710,"inet6_addr_gen_mode":"eui64","num_tx_queues":64,"num_rx_queues":64,"gso_max_size":65536,"gso_max_segs":65535,"tso_max_size":65536,"tso_max_segs":65535,"gro_max_size":65536,"gso_ipv4_max_size":65536,"gro_ipv4_max_size":65536,"parentbus":"pci","parentdev":"0000:08:00.0","vfinfo_list":[],"altnames":["enx221122334455"]},{"ifindex":4,"ifname":"tailscale0","flags":["POINTOPOINT","MULTICAST","NOARP","UP","LOWER_UP"],"mtu":1280,"qdisc":"fq_codel","operstate":"UNKNOWN","linkmode":"DEFAULT","group":"default","txqlen":500,"link_type":"none","promiscuity":0,"allmulti":0,"min_mtu":68,"max_mtu":65535,"linkinfo":{"info_kind":"tun","info_data":{"type":"tun","pi":false,"vnet_hdr":true,"multi_queue":false,"persist":false}},"inet6_addr_gen_mode":"random","num_tx_queues":1,"num_rx_queues":1,"gso_max_size":65536,"gso_max_segs":65535,"tso_max_size":65536,"tso_max_segs":65535,"gro_max_size":65536,"gso_ipv4_max_size":65536,"gro_ipv4_max_size":65536}]
 
We changed the way we read information about network devices in this version; the problem seems to be that the parser cannot handle the tailscale interface. I'll look into preparing a fix ASAP!
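If you want to see what the parser is likely choking on, you can inspect just the relevant fields per interface (a quick sketch, assuming jq is installed; the tun device reporting no address is my current suspicion, not yet confirmed):
Code:
# show name, link type and MAC for each interface; tailscale0 has no "address" key at all
ip -details -json link show | jq '.[] | {ifname, link_type, address}'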
 
Thank you, but I'm not sure if this is related to the crashes. This error appears periodically both before and after the crash occurs.
 
Hi,

I don't know if I have the same issue, but I also upgraded to 4.0 and have the same problem (and the same temporary fix) when accessing the web GUI through port 8007.
In my case I access the GUI via a DNS name (with an internal, trusted Let's Encrypt certificate), which resolves to both an IPv4 and an IPv6 ULA address.

I got the same results: port 8007 works for a little while, but after 1-2 minutes it crashes with "connection reset". The services mentioned in this thread are not showing any errors.
 
Thank you, but I'm not sure if this is related to the crashes. This error appears periodically both before and after the crash occurs.
While I was able to easily reproduce the errors caused by trying to parse the ip link output, it seems that the crashes are indeed not related to it. Would you mind posting the full logs of your system since booting?

Code:
journalctl -b > syslog_$(hostname).txt
 
I cut off the logs at around 30min of uptime, but they technically contain a crash and restart of the proxy. Not seeing any interesting logs though.
 

Attachments

Hi,

I don't know if I have the same issue, but I also upgraded to 4.0 and have the same problem (and the same temporary fix) when accessing the web GUI through port 8007.
In my case I access the GUI via a DNS name (with an internal, trusted Let's Encrypt certificate), which resolves to both an IPv4 and an IPv6 ULA address.

I got the same results: port 8007 works for a little while, but after 1-2 minutes it crashes with "connection reset". The services mentioned in this thread are not showing any errors.
One interesting point:

After disabling IPv6 completely via sysctl and rebooting (so no IPv6 ULA and GUA addresses anymore), the web GUI has not crashed after 30 minutes.

If I enable IPv6 again and reboot, the crashes come back.
 
One interesting point:

After disabling IPv6 completely via sysctl and rebooting (so no IPv6 ULA and GUA addresses anymore), the web GUI has not crashed after 30 minutes.

If I enable IPv6 again and reboot, the crashes come back.
I'm not sure I can reproduce this: I tried adding a file like the one below and rebooting (I confirmed the values were being set), but the issue continues to occur:
Code:
$ cat /etc/sysctl.d/10-disable-ipv6.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
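For reference, this is roughly how I verified the values were applied after rebooting:
Code:
$ sysctl net.ipv6.conf.all.disable_ipv6 net.ipv6.conf.default.disable_ipv6 net.ipv6.conf.lo.disable_ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1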
Or did you set different parameters?
 
I'm not sure I can reproduce this: I tried adding a file like the one below and rebooting (I confirmed the values were being set), but the issue continues to occur:
Code:
$ cat /etc/sysctl.d/10-disable-ipv6.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
Or did you set different parameters?
I set it in the same place as you did, but only used the first line.
Maybe it's slightly different (or your problem has a different origin) with the same result?
 
First load of the proxied host in the browser:
Would it be possible to see the details (request headers, response headers, response body) of the first 502 request? Please make sure to censor the cookies in the request headers!
 
It's not consistent which request fails first, but while testing I noticed that the issue is probably unrelated to any request headers and more likely related to parallel requests. I am able to consistently trigger the issue like this:
Code:
$ systemctl restart proxmox-backup-proxy
$ curl https://<proxied-host>/qrcodejs/qrcode.min.js
var QRCode(etc.)

$ for i in {0..10}; do curl -s https://<proxied-host>/qrcodejs/qrcode.min.js > /dev/null & done
[1] 6400
[2] 6401
[3] 6402
[4] 6403
(etc.)

$ curl https://<proxied-host>/qrcodejs/qrcode.min.js -I
HTTP/2 502
date: Wed, 06 Aug 2025 15:40:06 GMT

I cannot trigger this issue when directing the requests directly at https://localhost:8007. I'd record the traffic to compare but I'm not sure how to decrypt it.
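For comparison, the equivalent test pointed directly at the PBS port looks like this (a sketch of what I ran; it never produces the 502 for me):
Code:
$ systemctl restart proxmox-backup-proxy
$ for i in {0..10}; do curl -sk https://localhost:8007/qrcodejs/qrcode.min.js > /dev/null & done; wait
$ curl -sk -o /dev/null -w '%{http_code}\n' https://localhost:8007/qrcodejs/qrcode.min.js
200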
 
Last edited:
It might be that the reverse proxy is adding headers or altering the request in some other way in between, which trips up the proxy. I tried adding e.g. the headers Tailscale adds, but was not successful in reproducing the issue. I might need to just set it up and try to reproduce it myself.
 
Would it be possible to create a tcpdump of the traffic while you are reproducing the problem?

Code:
tcpdump -w output.pcap port 8007
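If you also want to look at the decrypted traffic yourself, curl can write out TLS session keys that Wireshark can use; this only covers connections made by curl itself and depends on curl's TLS backend supporting it, so treat it as a sketch:
Code:
# write TLS session secrets while reproducing, then point Wireshark at the file
# (Preferences -> Protocols -> TLS -> "(Pre)-Master-Secret log filename")
export SSLKEYLOGFILE=/tmp/sslkeys.log
curl -sk https://localhost:8007/qrcodejs/qrcode.min.js > /dev/null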
 
Would it be possible to create a tcpdump of the traffic while you are reproducing the problem?

Code:
tcpdump -w output.pcap port 8007
I recorded this using tcpdump -w /tmp/output.pcap -i lo port 8007 in order to filter out traffic from my PVE instances and only capture traffic between the Tailscale and PBS proxies.
I first made a regular, successful HEAD request to the Tailscale proxy. I then waited a few seconds and made 3 GET requests in parallel to the same endpoint as in my previous post. I then waited another few seconds and made a single final HEAD request, which failed with a 502.
Not sure how helpful the capture is considering it's all encrypted traffic, but it does clearly show the TCP connections being reset at some point during the parallel requests.
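For anyone looking at the capture, the resets are easy to find with a display filter along these lines (sketch, assuming tshark is installed):
Code:
$ tshark -r /tmp/output.pcap -Y 'tcp.flags.reset == 1'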
 

Attachments

Thanks for all your data and reports! This issue should now be fixed in PBS version 4.0.12-1, available on pbs-test!
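If you want to test it before it reaches the regular repositories, you can temporarily enable the test repository; the repository line below is a sketch from memory, so please double-check it against the official package repository documentation before using it:
Code:
# temporarily enable the pbs-test repository (verify the exact suite/component in the docs first)
echo "deb http://download.proxmox.com/debian/pbs trixie pbstest" > /etc/apt/sources.list.d/pbstest.list
apt update && apt full-upgrade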
 
In my case I have re-enabled IPv6 and rebooted PBS, and after 25 minutes I can confirm it's working.

Thanks for your work!
 
Upgraded to 4.0.12 and enabled the check option in haproxy. No crashes so far, so this seems to be fixed.
Thanks for the good job :)