599 Too many Redirections, 596 Broken Pipe

Aug 13, 2021
Hi,

We have a cluster with 3 nodes in our school environment. When students create LXC containers, the errors "599 Too many Redirections" and "596 Broken Pipe" occur in about 33% of cases. In most cases the error occurs when the storage or the installer image is to be selected. We have checked DNS and added all nodes to the local hosts file on every node. We have three storages (local, local-lvm, and SSD-School), which have the same names on all nodes. No NFS or similar is configured, only local disks. I also checked corosync.conf and found nothing strange in it. I switched to debug logging to see more information, but there are no error or warning messages in the log. I also checked the syslog file. Only in the pveproxy access log could I find some messages:

::ffff:172.17.216.200 - if200183@htl [15/11/2022:10:10:51 +0100] "GET /api2/extjs/version?_dc=1668503445997 HTTP/1.1" 200 77
::ffff:172.17.216.200 - if200183@htl [15/11/2022:10:10:52 +0100] "GET /api2/extjs/cluster/sdn?_dc=1668503445999 HTTP/1.1" 200 92
::ffff:172.17.214.234 - if200182@htl [15/11/2022:10:13:16 +0100] "GET /api2/json/nodes/pve02/storage/local/content?content=vztmpl HTTP/1.1" 599 -
::ffff:172.17.210.233 - if200205@htl [15/11/2022:10:13:37 +0100] "GET /api2/json/nodes/pve01/storage?format=1&content=rootdir HTTP/1.1" 599 -
::ffff:172.17.216.200 - if200183@htl [15/11/2022:10:13:56 +0100] "GET /api2/json/nodes/pve01/storage/local/content?content=vztmpl HTTP/1.1" 599 -
::ffff:172.17.218.70 - root@pam [15/11/2022:10:14:42 +0100] "GET /api2/json/nodes/pve03/status HTTP/1.1" 599 -
::ffff:172.17.213.212 - if200104@htl [15/11/2022:10:17:19 +0100] "GET /api2/json/nodes/pve01/network?type=any_bridge HTTP/1.1" 599 -
::ffff:172.17.212.224 - if200188@htl [15/11/2022:10:21:32 +0100] "GET /api2/json/nodes/pve01/storage/local/content?content=vztmpl HTTP/1.1" 599 -
::ffff:172.17.218.70 - root@pam [15/11/2022:10:38:23 +0100] "GET /api2/json/nodes/pve03/status HTTP/1.1" 599 -
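
To see whether the 599 responses cluster on one node, entries like those above could be tallied per target node. This is only a rough sketch: the regular expression is an assumption derived from the excerpt above, and `count_status` is a hypothetical helper name, not a Proxmox tool.

```python
import re
from collections import Counter

# Assumed layout, derived from the access-log excerpt above:
# client - user [timestamp] "METHOD path PROTOCOL" status size
LINE_RE = re.compile(
    r'^(?P<client>\S+) - (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)$'
)
# The node name is the path segment after /nodes/
NODE_RE = re.compile(r"/nodes/([^/?]+)")


def count_status(lines, status="599"):
    """Count responses with the given status, grouped by target node."""
    per_node = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match or match.group("status") != status:
            continue
        node = NODE_RE.search(match.group("path"))
        per_node[node.group(1) if node else "(no node)"] += 1
    return per_node


# Three lines copied from the excerpt above
sample = [
    '::ffff:172.17.216.200 - if200183@htl [15/11/2022:10:10:51 +0100] "GET /api2/extjs/version?_dc=1668503445997 HTTP/1.1" 200 77',
    '::ffff:172.17.214.234 - if200182@htl [15/11/2022:10:13:16 +0100] "GET /api2/json/nodes/pve02/storage/local/content?content=vztmpl HTTP/1.1" 599 -',
    '::ffff:172.17.210.233 - if200205@htl [15/11/2022:10:13:37 +0100] "GET /api2/json/nodes/pve01/storage?format=1&content=rootdir HTTP/1.1" 599 -',
]
for node, hits in sorted(count_status(sample).items()):
    print(node, hits)
```

To run it against a real log, pass the open file object to `count_status` instead of `sample`. If nearly all 599s point at a node other than the one the browser is connected to, that would hint at the proxying between nodes rather than at the storage itself.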

What should I do to solve this problem?

Regards
Michael
 
Hi Moayad,

Thanks for your quick response. I have uploaded the syslog files from all nodes. We tested creating about 30 LXCs on 15.11.2022 at around 10:00.

Regards
Michael
 

Attachments

  • syslog-allnodes.zip
    376.4 KB · Views: 4
I also really hate this "error 599". As soon as my remote SMB/PBS storages can't be accessed, the web UI becomes basically unusable: the connection times out all the time (so I need several tries to disable these storages) or I see the "error 599". As soon as I disable the SMB/PBS storages or they become available again, everything works normally again.
Something really needs to be done so that at least the web UI doesn't stop working. Those PVE servers aren't clustered, and it's all local storage except for the SMB/PBS storages, where I just store my backups once per day.

@michael wagner:
Is the IO delay high when that happens? If it is the same as here, then maybe there is a problem with your storage and PVE somehow gets stuck while polling the state of your local storages.
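
One rough way to put a number on this outside the web UI is to sample the kernel's iowait counter from `/proc/stat` before and during the container creation. The sketch below assumes the IO delay shown by PVE roughly tracks that counter, which I have not verified; `cpu_times` and `iowait_percent` are hypothetical helper names.

```python
def cpu_times(stat_text):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two samples, in percent."""
    deltas = [b - a for a, b in zip(after, before)]
    deltas = [-d for d in deltas]  # after minus before
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # guest and guest_nice are already counted in user and nice, so sum only 8.
    total = sum(deltas[:8])
    return 100.0 * deltas[4] / total if total else 0.0


# Usage on a node (take the two samples a few seconds apart):
#   import time
#   first = cpu_times(open("/proc/stat").read())
#   time.sleep(5)
#   second = cpu_times(open("/proc/stat").read())
#   print(iowait_percent(first, second))
```

Comparing the value at idle with the value while the 599 errors appear would show whether the storage is actually the bottleneck.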
 
Hi Dunuin,

Thanks for your response. I have now checked these values while the system is under load. The change in that value (IO wait) between no load and load with 599 occurrences is very small. I will analyze that further, because the behavior you describe sounds very logical.

regards
Michael
 
Good morning, I have installed 7.3 and added the new MAC address option to all bridges. Still the same. I also installed a new node just for connecting to the cluster; still the same.
 
* On a hunch - please check your DNS entries and your /etc/hosts - are all IPs correctly assigned to the PVE node names?
* Do you see any issues in the system journal when this occurs? (`journalctl -f`)
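
The first check can be scripted so that it is easy to repeat on every node. This is a minimal sketch: the node names are taken from the access log earlier in the thread, but the IP addresses below are placeholders (documentation range) that you would replace with your real ones, and `check_nodes` is a hypothetical helper name.

```python
import socket

# Node names as seen in the access log; the addresses are PLACEHOLDERS.
EXPECTED = {
    "pve01": "192.0.2.11",
    "pve02": "192.0.2.12",
    "pve03": "192.0.2.13",
}


def check_nodes(expected, resolve=socket.gethostbyname):
    """Return {name: (expected_ip, resolved_ip_or_None)} for every mismatch."""
    mismatches = {}
    for name, want in expected.items():
        try:
            got = resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            got = None
        if got != want:
            mismatches[name] = (want, got)
    return mismatches


# Usage on each node: an empty dict means every name resolves as expected.
#   print(check_nodes(EXPECTED))
```

Because `socket.gethostbyname` goes through the system resolver, it should reflect both `/etc/hosts` and DNS in the order the node is configured to use them.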

I hope this helps!
 
