Intermittent Lost GUI Access and Blocked ICMP

shawnk

New Member
Sep 9, 2016
Hi all -

I have a production Proxmox node set up, and it's been working fine for the most part for a couple of months now. One thing I've noticed is that it intermittently stops responding. The GUI stops loading and I get an "Unable to Connect" error. When that happens, SSH fails too. Ping used to work, but now it doesn't.

The GUI and ping issues may be separate. The reason is that the blocked GUI requests (port 8006) show up in the firewall log, but they are attributed to a VM running on the node, even though that VM has no rules relating to port 8006. My GUI rule is in the Datacenter section of the firewall options. Additionally, the source IP that the firewall log lists as rejected for GUI access is definitely whitelisted. The VM whose chain rejects the GUI request is the last VM in the list of all VMs (ID 150).

Here's an example of the FW blocking the request:

Code:
150 4 tap150i0-IN 20/Dec/2016:16:02:53 -0500 policy REJECT: IN=fwbr150i0 OUT=fwbr150i0 PHYSIN=fwln150i0 PHYSOUT=tap150i0 MAC=<macaddress> SRC=xx.xx.xx.243 DST=xx.xx.xx.11 LEN=64 TOS=0x00 PREC=0x00 TTL=58 ID=62076 DF PROTO=TCP SPT=4331 DPT=8006 SEQ=1293836082 ACK=0 WINDOW=65535 SYN

(Note that this is me trying to access the Node GUI, not VM #150 in any way.)
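To see how often this happens and which VM chains are involved, the rejections can be tallied per VMID and chain. This is only a rough sketch: `rejects_for_port` is a hypothetical helper name, and it assumes every log line has the same layout as the sample above (VMID in the first field, chain name in the third).

```shell
# Hypothetical helper: count rejected/dropped packets per VMID and chain
# for one destination port.
# Assumes the layout of the sample line above: field 1 is the VMID,
# field 3 is the chain, and the packet details contain DPT=<port>.
rejects_for_port() {
    port="$1"
    logfile="$2"
    grep -E 'policy (REJECT|DROP):' "$logfile" \
        | grep -E "DPT=${port}( |\$)" \
        | awk '{ print $1, $3 }' \
        | sort | uniq -c | sort -rn
}
```

It could be run as `rejects_for_port 8006 /var/log/pve-firewall.log`, assuming that is where the node writes its firewall log.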

I can't find any real reason why this is happening, especially since it works 98% of the time, and when it stops working it tends to start working again on its own after a few minutes. Ping has yet to come back. Within my subnet at the datacenter this node is in, I can ping with no problem.

Pinging may be a red herring and be related to some weird ACLs that I don't know about, but it used to work fine until recently. Plus I have 8 nodes in this data center and I can ping 2 of them from my office, and they can all ping each other up there.

So, TL;DR: I can usually get to my GUI, but intermittently it stops loading. When that happens, the firewall log shows rejections on port 8006 from a whitelisted IP and lists my last VM as the source of the blocking rule.

Any ideas how a VM's firewall rules are seemingly intermittently getting applied to accessing the Host?

Thanks!
 
What do the rules look like? Post the PVE firewall settings as shown by

Code:
grep "" /etc/pve/firewall/*
iptables-save

Is the PVE firewall the only firewall in the network?
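Since the log names the chain `tap150i0-IN`, it may also help to pull just that chain out of the full dump. A minimal sketch, where `chain_rules` is a hypothetical helper that reads `iptables-save` output on stdin:

```shell
# Hypothetical helper: from an iptables-save dump on stdin, print the
# declaration of one chain, every rule appended to it, and every rule
# elsewhere that jumps (-j) or goes (-g) to it.
chain_rules() {
    chain="$1"
    grep -E -e "^:${chain} " -e "^-A ${chain} " -e " -[jg] ${chain}( |\$)"
}
```

For example: `iptables-save | chain_rules tap150i0-IN`. The rules that jump to the chain show which match conditions send a packet there.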
 
Hi Richard -

Thanks for responding! I've attached the results below. I stripped out all the unique identifiers like MAC and IP addresses.

These nodes sit in a datacenter on a public subnet, but there are hardware firewalls and ACLs in addition to Proxmox. However, I'm not sure how those would affect us only intermittently.

Curious to hear what you find. The closest we've come so far is noticing RETURN targets in there that we did not set up in the GUI. So maybe a loop is being created somewhere?
 

Attachments

  • firewalls-edit.txt (18.5 KB)
  • iptables-edit.txt (38.3 KB)
@Richard just checking in on this. Any ideas? We've been looking to see if it's network related, but haven't found anything conclusive yet.
 
The firewall rules are rather complex - before going deeper: does the problem disappear when there is no firewall active?

This can be done temporarily as follows:

- disable the firewall on "Datacenter" level

- to be sure, also run

Code:
# Flush all rules in the filter table (chain policies are left unchanged)
iptables -F
 
@Richard

Thanks for following up. Unfortunately this issue is not easy to reproduce, as it happens inconsistently. Further, our Proxmox nodes are in the DMZ, so we can't really turn off the firewalls.

So far, what I've done is move that last VM (150) onto its own Proxmox node (they aren't clustered), and I haven't seen the issue since. Though I haven't had to interact with the GUI much since then.

TL;DR - it's hard to say whether moving that VM solved the issue, and unfortunately I can't troubleshoot by turning our firewall off entirely.

If you have any suggestions on how to reproduce the issue or any insight into the rules I sent, please send any info over!

Thanks,
Shawn
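
Since the issue is intermittent, one way to catch it without turning the firewall off might be to probe the GUI port on a timer from the whitelisted machine and line the failures up against the firewall log afterwards. A rough sketch only: `format_probe` and `watch_port` are hypothetical helper names, and it assumes an `nc` (netcat) that supports `-z` and `-w`.

```shell
# Hypothetical helper: format one timestamped probe result.
# $1 = host, $2 = port, $3 = exit status of the probe (0 means reachable).
# The timestamp uses the same layout as the firewall log lines.
format_probe() {
    if [ "$3" -eq 0 ]; then state="open"; else state="FAILED"; fi
    printf '%s %s:%s %s\n' "$(date '+%d/%b/%Y:%H:%M:%S %z')" "$1" "$2" "$state"
}

# Probe host:port every $3 seconds (default 30) until interrupted.
# Assumes an nc that supports -z (connect only) and -w (timeout in seconds).
watch_port() {
    host="$1"
    port="$2"
    interval="${3:-30}"
    while :; do
        if nc -z -w 3 "$host" "$port" >/dev/null 2>&1; then rc=0; else rc=1; fi
        format_probe "$host" "$port" "$rc"
        sleep "$interval"
    done
}
```

Something like `watch_port <node-ip> 8006 30 >> gui-probe.log` would then leave a record of each failure window to compare against the REJECT entries.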
 
