Got timeout (500) - Communication failure (0) - Connection timed out (595)

maverickws

Hi all,

Two nodes connected by a private LAN.
Three networks and three corosync links (link0, link1, link2), each on an independent network on a private vSwitch.

When accessing one node through the GUI of the other:

[Screenshot: error messages shown in the GUI when accessing the remote node]

Code:
# ping -c100 -f 10.11.49.2
PING 10.11.49.2 (10.11.49.2) 56(84) bytes of data.

--- 10.11.49.2 ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 258ms
rtt min/avg/max/mdev = 2.525/2.568/2.605/0.034 ms, ipg/ewma 2.606/2.574 ms

Does this look like a link with communication failures, connectivity issues, or timeouts?

[Screenshot: further errors shown in the GUI]

Code:
# ping -c1000 -f 10.11.49.2
PING 10.11.49.2 (10.11.49.2) 56(84) bytes of data.

--- 10.11.49.2 ping statistics ---
1000 packets transmitted, 1000 received, 0% packet loss, time 609ms
rtt min/avg/max/mdev = 2.537/2.571/2.613/0.057 ms, ipg/ewma 2.609/2.574 ms

The link clearly isn't having any; only Proxmox is. Why?
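
For anyone debugging something similar: a quick way to see what corosync itself thinks of the links is shown below. This is just a sketch assuming the stock corosync/Proxmox VE tools; adjust the journal time window as needed.

Code:
# knet link status for every configured corosync link, as seen by this node
corosync-cfgtool -s

# cluster membership and quorum state
pvecm status

# recent link down/up events logged by corosync
journalctl -u corosync --since "1 hour ago"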
 
Can you please post the /etc/network/interfaces and /etc/pve/corosync.conf files? If you use public IPs, redact them but keep them uniquely identifiable.
 
Hi @aaron, thanks for the reply.

Here goes as requested:
/etc/network/interfaces
Code:
auto lo
iface lo inet loopback

iface lo inet6 loopback

auto enp2s0
iface enp2s0 inet manual

auto enp2s1
iface enp2s1 inet manual

auto enp2s2
iface enp2s2 inet manual

auto enp2s1.4001
iface enp2s1.4001 inet manual
    mtu 1400
#Public

auto enp2s1.4005
iface enp2s1.4005 inet manual
    mtu 1400
#Private

auto enp2s2.4010
iface enp2s2.4010 inet manual
    mtu 1400
#Data

auto enp2s2.4011
iface enp2s2.4011 inet manual
    mtu 1400
#ClusterSync

auto enp2s2.4013
iface enp2s2.4013 inet manual
    mtu 1400
#Management

auto vmbr0
iface vmbr0 inet static
    address public_ipv4/27
    gateway ipv4_gw
    bridge-ports enp2s0
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094

iface vmbr0 inet6 static
    address ipv6_public::2/64
    gateway gw_v6

    up route add -net public_v4 netmask slash_27_netmask gw ipv4_gw dev vmbr0
    up route add -net public_v4 netmask slash_29_netmask gw ipv4_gw dev vmbr0
    up route -6 add ipv6_public::/64 dev vmbr0

auto vmbr1
iface vmbr1 inet manual
    bridge-ports none
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094

auto vmbr01
iface vmbr01 inet manual
    address 10.1.49.1/24
    gateway 10.1.49.254
    bridge-ports enp2s1.4001
    bridge-stp off
    bridge-fd 0
#Public

auto vmbr05
iface vmbr05 inet manual
    address 10.5.49.1/24
    gateway 10.5.49.254
    bridge-ports enp2s1.4005
    bridge-stp off
    bridge-fd 0
#Private

auto vmbr10
iface vmbr10 inet manual
    address 10.10.49.1/24
    gateway 10.10.49.254
    bridge-ports enp2s2.4010
    bridge-stp off
    bridge-fd 0
#Data

auto vmbr11
iface vmbr11 inet manual
    address 10.11.49.1/24
    gateway 10.10.49.254
    bridge-ports enp2s2.4011
    bridge-stp off
    bridge-fd 0
#ClusterSync

auto vmbr13
iface vmbr13 inet manual
    address 10.13.49.1/24
    gateway 10.13.49.254
    bridge-ports enp2s2.4013
    bridge-stp off
    bridge-fd 0
#Management

/etc/pve/corosync.conf
Code:
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: proxmox-01
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.11.49.1
    ring1_addr: 10.5.49.1
    ring2_addr: 10.1.49.1
  }
  node {
    name: proxmox-02
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.11.49.2
    ring1_addr: 10.5.49.2
    ring2_addr: 10.1.49.2
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: MyCluster
  config_version: 2
  interface {
    linknumber: 0
  }
  interface {
    linknumber: 1
  }
  interface {
    linknumber: 2
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}
 
Can the nodes reach each other on the public address?

My guess is that this is not possible, and that in /etc/hosts the public IP is set for the node. The proxy traffic to other nodes is usually sent to the IP configured there.

If my guess is correct, a possible workaround would be to change the IP set in /etc/hosts for the node to one of the private ones that you want to use for the proxy traffic.
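
A rough sketch of that workaround (the node names are taken from the corosync.conf above; the public addresses and the domain are placeholders, and 10.13.49.x is just the "Management" network from the interfaces file, picked as an example):

Code:
# /etc/hosts on proxmox-01 (mirror the same idea on proxmox-02)
# before: node names resolve to the public addresses
#203.0.113.11   proxmox-01.example.net proxmox-01
#203.0.113.12   proxmox-02.example.net proxmox-02

# after: resolve them via a private network instead
10.13.49.1   proxmox-01.example.net proxmox-01
10.13.49.2   proxmox-02.example.net proxmox-02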
IMPORTANT: if you use the firewall and access the nodes from outside, you will have to manually add rules allowing SSH and GUI access on the external IP, because the automatic rules are generated for the IP set in /etc/hosts, IIRC.

The following hint from the documentation helps to simplify that:
To simplify that task, you can instead create an IPSet called “management”, and add all remote IPs there. This creates all required firewall rules to access the GUI from remote.
https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_configuration_files
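
A minimal sketch of what that could look like in /etc/pve/firewall/cluster.fw (the addresses are placeholders for wherever you administer the cluster from; the same set can also be created in the GUI under Datacenter -> Firewall -> IPSet):

Code:
[IPSET management]
192.0.2.50 # admin workstation (placeholder)
198.51.100.0/24 # office network (placeholder)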
 
Oh shit, ok.
I get it, it's the same thing @t.lamprecht described on my other post. I'm having some IPv6 connectivity issues, so for now I'll assume that's what's causing this; it would also explain why changing link0, link1 and link2 resolved nothing.
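
A quick way to see what the proxy will actually connect to is to check what the node names resolve to on each host (a sketch; the node names are the ones from the corosync.conf above):

Code:
# every address (IPv4 and IPv6) the other node's name resolves to
getent ahosts proxmox-02

# and what this node's own name resolves to
getent ahosts proxmox-01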

Honestly, I think this implementation is wrong. The Proxmox cluster should have a feature to select which links the nodes use to communicate with each other via the API across the cluster; it shouldn't have to be a public IP, nor whatever happens to be in /etc/hosts.

When I created the cluster I chose three networks for Proxmox to use. Using a link other than the ones the administrator selected is wrong, imo.

Thanks for your replies.
 
Using the corosync links for proxying API traffic (and migration/storage replication traffic) by default would be wrong and dangerous: it could overload the corosync links, which might lead to a total cluster failure.

Making it easier to specify which network to use for proxying API traffic probably makes sense, but the current approach of resolving the other node's hostname works quite well in practice. You can already override the network used for migration traffic, and external access uses whatever address the client connects to anyway.
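
For completeness, the migration-network override mentioned above lives in /etc/pve/datacenter.cfg. A minimal sketch, picking the "Data" network from the interfaces file above purely as an example:

Code:
# /etc/pve/datacenter.cfg
# send migration traffic over 10.10.49.0/24 instead of the default network
migration: secure,network=10.10.49.0/24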
 
@fabian

I never meant that, nor should what I said be interpreted that way; if that's what you read, that's on you. Should corosync traffic have its own dedicated network? Yes, we're on the same page there. When I said "select which links to use for the nodes to communicate via API among each other", I meant selecting a link purely for that purpose: API traffic.

Making it easier to specify which network to use for proxying API traffic would make sense, yes.
I understand such a feature doesn't exist now, but I do hope you implement it in the future; it would be very good to have that control, and I imagine plenty of other people would benefit from it.

Thanks for the feedback.
 
