virtio network cards don't work with ssh

John Ratliff

Apr 17, 2019
I'm setting up a new Proxmox cluster. If I use the virtio network card, I cannot SSH into the virtual machines. If I switch to e1000, it works.

By not working, I mean the SSH connection just hangs and never shows a login prompt. I see packets traveling in both directions, but the connection never completes. I've tried PuTTY on Windows and ssh from several different machines (mostly RHEL 7, plus the Debian install on the Proxmox hosts themselves).
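For anyone trying to reproduce this, here is a minimal sketch of a probe that separates "TCP connect fails" from "TCP connects but the server's SSH identification line never arrives". The function name `probe_ssh_banner` and the stage labels are made up for illustration. The thread does not say at which stage the session stalls, so if the banner does arrive, the hang happens later in the handshake and this probe will not catch it.

```python
import socket


def probe_ssh_banner(host, port=22, timeout=5.0):
    """Return (stage, detail) describing how far a plain TCP probe of sshd gets.

    Hypothetical helper for illustration only; it never authenticates.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        # Covers refused connections, unreachable hosts and connect timeouts.
        return ("connect_failed", str(exc))

    with sock:
        sock.settimeout(timeout)
        try:
            # An SSH server sends its identification line ("SSH-2.0-...") first.
            data = sock.recv(256)
        except socket.timeout:
            return ("no_banner", "TCP connected, but no data arrived before the timeout")
        except OSError as exc:
            return ("read_failed", str(exc))

    if not data:
        return ("closed", "peer closed the connection without sending anything")
    if data.startswith(b"SSH-"):
        first_line = data.split(b"\n", 1)[0]
        return ("banner", first_line.decode("ascii", "replace").strip())
    return ("unexpected", repr(data[:40]))
```

Calling `probe_ssh_banner("192.0.2.10")` (a placeholder address) once with the VM on virtio and once on e1000 would show whether the two cases already differ at this first step.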

I'm using a VLAN-aware bridge and tagging the interfaces with the VLAN ID.

I've run tcpdump on both sides and I see no packet checksum mismatches on either side. The traffic is flowing, and I can ping the VM when it uses the virtio network card.

There are no firewalls enabled on either the Proxmox side or the VM side.

I have put the VMs on different servers and used different Linux distros. I get the same result with a CentOS 7 VM or a Debian 9 VM (installed from the latest media: CentOS 7 1810 and Debian 9.8).

I am running Proxmox 5.4-3 with the pve-community repo, installed from the Proxmox 5.4 ISO and fully updated right after install.

I know there are (or were) some issues with older Linux releases not working with virtio, but I thought current Linux should be fine.

Thanks for any assistance.
 
We all use virtio with ssh, so I am quite sure this works well. Must be some other problem ...
 
I've been doing some testing. I have a couple of other Proxmox hosts that aren't part of this cluster, and I have no issues using virtio and SSH with them. I tried to find the differences between the installs.

I noticed that the working Proxmox hosts were on 5.3, while the non-working ones are on 5.4. However, after separating one of my non-working hosts from the cluster and reinstalling it with 5.3, there was no change.

So I looked at the underlying network card. In our non-working cluster, we have Brocade 1860 CNAs, with one port being the 10 Gbps Ethernet NIC. In the working machines, we have Broadcom or Intel NICs. One of the test hosts (which is working) is the exact same machine type as the machines in the cluster (Dell R710), so the cluster machines have the same Broadcom 1 Gbps adapters. I switched the bridge over to that Broadcom adapter and SSH started to work.

I tried updating the Brocade CNA firmware, but it did not help. I suppose my next step will be to replace that CNA with a different 10 Gbps NIC. We have some Chelsio 10 Gbps NICs that I will try next.
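When comparing hosts like this, it can help to list which kernel driver each physical interface is bound to. A minimal sketch, assuming a Linux host with sysfs mounted at `/sys`; the helper name `nic_driver` and the interface names are placeholders, not taken from these machines:

```python
import os


def nic_driver(iface, sysfs_root="/sys/class/net"):
    """Return the kernel driver name bound to a network interface, or None.

    Hypothetical helper: reads the sysfs 'device/driver' symlink.
    """
    link = os.path.join(sysfs_root, iface, "device", "driver")
    if not os.path.islink(link):
        # Purely virtual interfaces (bridges, taps) have no device/driver link.
        return None
    # The symlink target ends in the driver's directory name.
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    root = "/sys/class/net"
    if os.path.isdir(root):
        for name in sorted(os.listdir(root)):
            print(name, nic_driver(name))
```

Running it on a working and a non-working host gives a quick side-by-side view of the NIC drivers underneath each bridge.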

Are there any known issues with virtio and Brocade NICs?
 
I'm not sure if this is the same problem as https://forum.proxmox.com/threads/vm-network-freeze.37161/ but it looks similar. I would suggest trying an older kernel (I'm currently running kernel 4.15.18-9-pve on 5.4.3 with no issues). With the latest kernel, I'm getting similar problems with some of our Linux VMs.
If this post does not belong here, my apologies and please feel free to delete it. I hope this helps.

Thanks,
Mladen
 
Yes, changing to the Chelsio network cards fixed the issue. There seems to be a problem between the QLogic (formerly Brocade) BR1860 adapter and the virtio drivers.
 
