Hi all -
I have a production Proxmox node set up, and it has been working fine for the most part for a couple of months. One thing I've noticed is that it intermittently stops responding. The GUI stops loading and I get an "Unable to Connect" error. When that happens, SSH fails too. Ping used to work, but it no longer does.
The GUI and ping issues may be separate. The reason is that the GUI (port 8006) blocks show up in the firewall log, but they are attributed to a VM running on the node, even though that VM has no rules relating to port 8006. My GUI rule is in the Datacenter section of the firewall options. Additionally, the IP listed in the firewall log as rejected for GUI access is definitely whitelisted. The VM that the log ties to the rejected GUI request is the last VM in the list of all VMs (ID 150).
Here's an example of the FW blocking the request:
Code:
150 4 tap150i0-IN 20/Dec/2016:16:02:53 -0500 policy REJECT: IN=fwbr150i0 OUT=fwbr150i0 PHYSIN=fwln150i0 PHYSOUT=tap150i0 MAC=<macaddress> SRC=xx.xx.xx.243 DST=xx.xx.xx.11 LEN=64 TOS=0x00 PREC=0x00 TTL=58 ID=62076 DF PROTO=TCP SPT=4331 DPT=8006 SEQ=1293836082 ACK=0 WINDOW=65535 SYN
(Note that this is me trying to access the Node GUI, not VM #150 in any way.)
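In case it helps anyone compare entries, here is a minimal sketch for pulling the interesting fields out of a log line like the one above, so the hits on port 8006 can be grouped by VM ID and chain. The field layout is only assumed from this single sample line, and `parse_fw_line` is a hypothetical helper name, not part of any Proxmox tooling.

```python
import re

# Layout assumed from the one sample line in this post:
# "<vmid> <loglevel> <chain> <timestamp> <message>: KEY=VALUE ..."
LINE_RE = re.compile(
    r"^(?P<vmid>\d+)\s+(?P<level>\d+)\s+(?P<chain>\S+)\s+"
    r"(?P<timestamp>\S+\s+[+-]\d{4})\s+(?P<action>.+?):\s+(?P<fields>.*)$"
)


def parse_fw_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # The remainder is a run of KEY=VALUE tokens plus bare flags (DF, SYN).
    for token in entry.pop("fields").split():
        key, sep, value = token.partition("=")
        entry[key] = value if sep else True
    return entry


sample = (
    "150 4 tap150i0-IN 20/Dec/2016:16:02:53 -0500 policy REJECT: "
    "IN=fwbr150i0 OUT=fwbr150i0 PHYSIN=fwln150i0 PHYSOUT=tap150i0 "
    "MAC=<macaddress> SRC=xx.xx.xx.243 DST=xx.xx.xx.11 LEN=64 TOS=0x00 "
    "PREC=0x00 TTL=58 ID=62076 DF PROTO=TCP SPT=4331 DPT=8006 "
    "SEQ=1293836082 ACK=0 WINDOW=65535 SYN"
)

entry = parse_fw_line(sample)
print(entry["vmid"], entry["chain"], entry["SRC"], entry["DST"], entry["DPT"])
```

Running this over the saved log and filtering on `DPT == "8006"` should show whether every rejection is logged against the `tap150i0-IN` chain or whether other VMs appear as well.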
I can't find any real reason why this is happening, especially since it works 98% of the time, and when it stops working it tends to start working again on its own after a few minutes. Ping has yet to come back. I can ping within my subnet at the datacenter this node is in with no problem.
The ping issue may be a red herring related to some ACLs that I don't know about, but ping worked fine until recently. Also, I have 8 nodes in this datacenter; I can ping 2 of them from my office, and they can all ping each other within the datacenter.
So, TL;DR: I can usually get to my GUI, but intermittently it stops loading, and the firewall log shows rejections on the GUI port (8006) from a whitelisted IP, with my last VM listed as the source of the blocking rule.
Any ideas how a VM's firewall rules could intermittently be getting applied to traffic destined for the host?
Thanks!