[SOLVED] SSH always hangs on the first connection

flyraz

Member
May 22, 2020
Hi all,

I have a host with 6.2-4 installed, using a 10G network card.

Everything looks good,
but when I connect via SSH to the host or to a guest VM,

the first attempt always hangs,
and then I get this message:
Code:
packet_write_wait: Connection to 10.0.0.1 port 22: Broken pipe

The strange thing is that after this, the SSH connection works normally...

I don't know how to resolve this...

Has anyone had the same experience?
 
The SSH server could be trying a DNS lookup of the connecting client, maybe?
Maybe you can try disabling dns lookup of the sshd server and see if the timeouts still occur?
 
Wow, thank you very much!
After I changed every sshd_config to use
Code:
UseDNS=no
the abnormal connection behavior does not seem to appear anymore.

Let me test it for a few days...
I'm using a MikroTik router as the DNS server;
maybe that's the reason...
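
To double-check that the UseDNS change above is really in effect on each machine, a small script can report which value a given sshd_config would apply. This is only a rough sketch: the function name effective_usedns is made up for illustration, it looks only at the global section (it stops at the first Match block), it does not follow Include directives, and it assumes sshd uses the first value it finds for a keyword.

```python
import re
from typing import Optional


def effective_usedns(config_text: str) -> Optional[str]:
    """Return the first global UseDNS value found in sshd_config text, or None."""
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Accept both "Keyword value" and the "Keyword=value" form used in this thread.
        parts = re.split(r"[\s=]+", line, maxsplit=1)
        if len(parts) < 2:
            continue  # keyword without a value
        keyword = parts[0].lower()  # keywords are case-insensitive
        value = parts[1].strip().lower()
        if keyword == "match":
            break  # everything below is conditional; ignored in this sketch
        if keyword == "usedns":
            return value  # first occurrence wins
    return None
```

It could be called as effective_usedns(open("/etc/ssh/sshd_config").read()). A result of None means the option is not set in the file, so sshd falls back to its built-in default.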
 
Usually (usually) the DNS server is not the problem. Typically workstations (laptop/desktop) and such don't have a DNS record on the LAN and hence the delay.

After logging in to the SSH server, what does an nslookup of your workstation's IP return?
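
If nslookup is not available, roughly the same check can be done with a few lines of Python run on the SSH server. This is a sketch under assumptions: timed_reverse_lookup is a hypothetical helper name, and it goes through the system resolver, so its timing depends on the resolver configuration of the machine it runs on.

```python
import socket
import time
from typing import Optional, Tuple


def timed_reverse_lookup(ip: str) -> Tuple[Optional[str], float]:
    """Reverse-resolve an IP address and report how long the lookup took."""
    start = time.monotonic()
    try:
        # gethostbyaddr returns (hostname, aliases, addresses); keep the hostname.
        hostname = socket.gethostbyaddr(ip)[0]
    except OSError:
        # No PTR record, resolver unreachable, or malformed address.
        hostname = None
    elapsed = time.monotonic() - start
    return hostname, elapsed
```

Calling timed_reverse_lookup with the workstation's IP address returns the name (or None) together with the elapsed seconds. A lookup that takes several seconds before giving up would fit the delay described above.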
 
Okay... today it happened again :eek:

After logging in to the server,
I tried running nslookup against another server in the same subnet.
It just hangs and then responds with "connection timed out; no servers could be reached"...

Oh my god...

I'm using Ubuntu 18.04. Running nmcli device show ens18 gives:
Code:
nmcli device show ens18
GENERAL.DEVICE:                         ens18
GENERAL.TYPE:                           ethernet
GENERAL.HWADDR:                         76:16:99:CB:C1:EF
GENERAL.MTU:                            1500
GENERAL.STATE:                          100 (connected)
GENERAL.CONNECTION:                     Wired connection 1
GENERAL.CON-PATH:                       /org/freedesktop/NetworkManager/ActiveConnection/1
WIRED-PROPERTIES.CARRIER:               on
IP4.ADDRESS[1]:                         10.6.6.2/24
IP4.GATEWAY:                            10.6.6.254
IP4.ROUTE[1]:                           dst = 0.0.0.0/0, nh = 10.6.6.254, mt = 100
IP4.ROUTE[2]:                           dst = 10.6.6.0/24, nh = 0.0.0.0, mt = 100
IP4.ROUTE[3]:                           dst = 169.254.0.0/16, nh = 0.0.0.0, mt = 1000
IP4.DNS[1]:                             10.6.6.254
IP6.ADDRESS[1]:                         fe80::d0b4:9444:ea47:9c4e/64
IP6.GATEWAY:                            --
IP6.ROUTE[1]:                           dst = ff00::/8, nh = ::, mt = 256, table=255
IP6.ROUTE[2]:                           dst = fe80::/64, nh = ::, mt = 256
IP6.ROUTE[3]:                           dst = fe80::/64, nh = ::, mt = 100

I guess my DNS server is not working?


I'm still trying things to resolve this...
 
I actually came to these forums for the very same reason as the original poster of this thread.

The first connection to my Proxmox hosts (3 separate ones) as well as to the guests (~10 or so at the moment) fails.

If it's a web server, the browser shows "Connecting..." for 3-5 seconds and then the page loads. After the initial load everything works smoothly and as fast as expected.

If it's an SSH connection, for example, I get "Connection refused" and then 4-5+ seconds later it starts working.

If it's an ICMP ping, I get something like this:

Code:
PING zz.71.xx.6 (zz.71.xx.6): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2
Request timeout for icmp_seq 3
Request timeout for icmp_seq 4
Request timeout for icmp_seq 5
Request timeout for icmp_seq 6
Request timeout for icmp_seq 7
Request timeout for icmp_seq 8
Request timeout for icmp_seq 9
Request timeout for icmp_seq 10
64 bytes from zz.71.xx.6: icmp_seq=11 ttl=63 time=5.674 ms
64 bytes from zz.71.xx.6: icmp_seq=12 ttl=63 time=5.410 ms
64 bytes from zz.71.xx.6: icmp_seq=13 ttl=63 time=5.762 ms

Then it keeps working as long as I am actively using it. As soon as no traffic is flowing, the problem comes back. I have not measured whether it takes 5, 10 or 15 minutes, but it is probably around 10.

I am still struggling to understand it, but I only started looking into it seriously after lunch today, so I have had just a few hours so far.
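
One way to pin down that idle time is to probe the same service after progressively longer pauses and note when the first attempt starts failing. The following is only a sketch: tcp_connect_latency and sweep_idle_times are hypothetical helper names, it uses a TCP connect rather than ICMP so it needs no special privileges, and it assumes nothing else is talking to the target from the same machine during the pauses.

```python
import socket
import time
from typing import Dict, Iterable, Optional


def tcp_connect_latency(host: str, port: int, timeout: float = 3.0) -> Optional[float]:
    """Return the seconds needed to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers timeouts, "connection refused" and unreachable hosts.
        return None


def sweep_idle_times(host: str, port: int,
                     idle_seconds: Iterable[float]) -> Dict[float, Optional[float]]:
    """For each idle period: warm up, stay quiet, then time the first connect."""
    results: Dict[float, Optional[float]] = {}
    for idle in idle_seconds:
        tcp_connect_latency(host, port)   # warm-up probe, result ignored
        time.sleep(idle)                  # generate no traffic for this long
        results[idle] = tcp_connect_latency(host, port)
    return results
```

For example, sweep_idle_times(host, 22, [60, 300, 600, 900]) against one of the affected machines would show the shortest pause after which the first connection fails (None) or becomes noticeably slow.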
 
Anything in syslog, dmesg, or the journal?
 
Sounds like an ARP cache or MAC address table timeout somewhere on your network.

Can you post your /etc/network/interfaces?
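
To see whether a neighbor entry really goes missing while the connection is idle, the kernel's ARP table can be inspected on a Linux machine. A minimal sketch, assuming the usual six-column layout of /proc/net/arp; parse_arp_table is a hypothetical helper name, and flag bit 0x02 is taken to mean a completed entry.

```python
from typing import Dict

ATF_COM = 0x02  # flag bit set when the entry has a resolved MAC address


def parse_arp_table(text: str) -> Dict[str, dict]:
    """Parse /proc/net/arp-style text into a dict keyed by IP address."""
    entries: Dict[str, dict] = {}
    for line in text.splitlines()[1:]:  # the first line is the column header
        fields = line.split()
        if len(fields) != 6:
            continue  # skip anything that does not look like a table row
        ip, _hw_type, flags, mac, _mask, device = fields
        entries[ip] = {
            "mac": mac,
            "device": device,
            "complete": bool(int(flags, 16) & ATF_COM),
        }
    return entries
```

Running parse_arp_table(open("/proc/net/arp").read()) before and after an idle period shows whether the gateway or the target host has dropped out of the table or become incomplete.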
 
Okay...

It was all my fault... the hang issue is resolved... lol

When I set up the RouterOS firewall filter,
I added a rule to drop all forwarded packets with an invalid connection state.

This is what caused all the problems...

2020-06-27_05-10-52.jpg

Thank you all for the help :D

I'm not very familiar with the principles behind this part,
so if anyone knows, I hope you can tell me why this happens.

Thank you very much :)
 
Sounds like an ARP cache or MAC address table timeout somewhere on your network.

Can you post your /etc/network/interfaces?

You are correct!

After a little further debugging we found a misconfigured bridge on an adjacent machine; the problem disappeared after correcting this one.
 
