Hi,
we're planning an upgrade of our infrastructure based on Proxmox VE and Proxmox Backup Server (PBS). The proposed setup involves a three-node cluster: 2 PVE nodes and 1 PBS node.
The 2 PVE nodes (hpve1: 192.168.0.220 and hpve2: 192.168.21.221) will use ZFS storage. Node hpve1 will be the primary (local) node located at our main site (LAN 192.168.0.0/24), while hpve2 will be the secondary (remote) node located at a secondary site (LAN 192.168.21.0/24).
The two nodes will be connected via a VPN (currently OpenVPN, but possibly replaced with IPsec or WireGuard). VMs will always run on node hpve1 and will be configured to perform local snapshots and remote replication to hpve2.
Before purchasing the final hardware for this infrastructure, we set up a lab environment to run all tests, using two GMKtec NucBox M7 Pro Mini PCs ([https://tinyurl.com/2bbhbmfo](https://tinyurl.com/2bbhbmfo)) with AMD Ryzen 9 PRO 6950H CPUs, each equipped with 2 x 1TB NVMe SSDs and 32GB of RAM.
We successfully created the HA cluster, enabled snapshots and replication (via `cv4pve-admin` running in a Docker container on hpve1), and configured the SDN by enabling an EVPN Zone (MTU set to 1450 as per documentation), with its corresponding VNets and a Subnet (192.168.88.0/24) used to attach the VMs.
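For anyone wondering where the 1450 figure comes from: as far as I understand, the EVPN zone tunnels traffic over VXLAN, whose encapsulation eats 50 bytes of a standard 1500-byte underlay MTU. A small sketch of that arithmetic (illustrative only; a VPN underlay like ours would add further overhead on top of this):

```python
# VXLAN encapsulation overhead on an IPv4 underlay:
# the outer packet carries IPv4 + UDP + VXLAN headers plus the
# full inner Ethernet frame, so the inner (VM-side) MTU must be
# the underlay MTU minus all of that.

OUTER_IP4 = 20   # outer IPv4 header, no options
OUTER_UDP = 8    # outer UDP header
VXLAN_HDR = 8    # VXLAN header
INNER_ETH = 14   # inner Ethernet header, untagged

def inner_mtu(underlay_mtu: int) -> int:
    """Largest inner-interface MTU that fits in one underlay packet."""
    return underlay_mtu - (OUTER_IP4 + OUTER_UDP + VXLAN_HDR + INNER_ETH)

print(inner_mtu(1500))  # -> 1450, matching the documented EVPN zone MTU
```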
We set hpve1 as the primary exit node and configured our internal network accordingly: on our pfSense firewall we added a static route for the 192.168.88.0/24 subnet via 192.168.0.220, along with the appropriate firewall rules. This setup allowed us to reach the test VMs (192.168.88.10 and 192.168.88.20), running on hpve1 and attached to the VNet subnet, from a PC (192.168.0.149) on the main site's LAN, without going through the VPN. The VMs respond to pings, reach the Internet, and allow SSH and VNC access; everything seems to work fine, but:
1. Once connected via SSH to the test Linux VM, after a random interval (10–20 seconds), the SSH session freezes. If I run a ping from the VM (while connected via SSH) to the firewall or to any PC in the 192.168.0.0/24 LAN, the ping runs for a few packets (sometimes 10, sometimes 20) and then freezes. However, if I access the VM through the Proxmox VE console and run `htop`, I can still see the `ping` process running.
2. I can connect via VNC to the test Windows VMs, but after 10–20 seconds the VNC client freezes.
As mentioned, I set the MTU to 1450, and later to 1350, on all involved NICs. From the VM console, if I send "invalid" ICMP packets, the command fails immediately; otherwise, it works for only a few seconds. However, if I run a `ping` from the LAN PC (192.168.0.149) to the VMs, the ping never fails and never drops packets.
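In case it helps reproduce the test: the payload sizes at which an unfragmented ping should start failing work out from standard IPv4/ICMP header arithmetic (the MTU values below are just the ones from our lab; by "invalid" packets I mean ones sized above this boundary):

```python
# Largest ICMP echo payload that fits in a single unfragmented IPv4
# packet at a given MTU: MTU minus the IPv4 header (20 bytes, no
# options) and the ICMP echo header (8 bytes).
# On Linux this is the biggest <size> for `ping -M do -s <size>`.

def max_ping_payload(mtu: int) -> int:
    return mtu - 20 - 8

for mtu in (1500, 1450, 1350):
    print(mtu, "->", max_ping_payload(mtu))
# 1500 -> 1472, 1450 -> 1422, 1350 -> 1322
```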
From what I’ve read, it looks like an MTU-related issue, but I can’t figure out how to fix it.
Is there anyone who can help me understand what's happening and how to resolve it?
Thank you very much in advance.
Pierluigi