VM created after upgrade to Proxmox 8.1 can't reach other VMs

Hi

I upgraded to Proxmox 8.1 last weekend, and now new VMs I create cannot reach other VMs on the same network/VLAN. The problem occurs with both IPv4 and IPv6.
Internet access works and the new VMs can be reached from the outside. Below is the pveversion output showing what is running on my server.


Bash:
root@kg-virt01:~# pveversion --verbose
proxmox-ve: 8.1.0 (running kernel: 6.5.11-6-pve)
pve-manager: 8.1.3 (running version: 8.1.3/b46aac3b42da5d15)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.5: 6.5.11-6
proxmox-kernel-6.5.11-6-pve-signed: 6.5.11-6
proxmox-kernel-6.2.16-19-pve: 6.2.16-19
proxmox-kernel-6.2: 6.2.16-19
proxmox-kernel-6.2.16-15-pve: 6.2.16-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph-fuse: 16.2.11+ds-2
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx7
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.7
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.1.0
libpve-guest-common-perl: 5.0.6
libpve-http-server-perl: 5.0.5
libpve-network-perl: 0.9.5
libpve-rs-perl: 0.8.7
libpve-storage-perl: 8.0.5
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-3
openvswitch-switch: 3.1.0-2
proxmox-backup-client: 3.1.2-1
proxmox-backup-file-restore: 3.1.2-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.2
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.3
proxmox-widget-toolkit: 4.1.3
pve-cluster: 8.0.5
pve-container: 5.0.8
pve-docs: 8.1.3
pve-edk2-firmware: 4.2023.08-2
pve-firewall: 5.0.3
pve-firmware: 3.9-1
pve-ha-manager: 4.0.3
pve-i18n: 3.1.4
pve-qemu-kvm: 8.1.2-4
pve-xtermjs: 5.3.0-2
qemu-server: 8.0.10
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.0-pve4
 
Via tcpdump I can see that my server sends an ARP request and gets a reply, but nothing else follows.
Bash:
[root@portal ~]# tcpdump host my-public-ip
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on enp6s18, link-type EN10MB (Ethernet), snapshot length 262144 bytes
21:57:52.519803 ARP, Request who-has my-public-ip tell my-server-trying-to-send-a-ping, length 42
21:57:52.519824 ARP, Reply my-public-ip is-at bc:24:11:fb:62:5e (oui Unknown), length 28
 
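A next step to narrow down where the traffic stops would be to capture on every hop between the two VMs. The names below are only placeholders: tap100i0 assumes the affected guest is VMID 100, and ens3f0np0 is the OVS uplink from my config further down.
Bash:
# capture on the VM's tap device to confirm the packets actually leave the guest
tcpdump -ni tap100i0 'arp or icmp'

# capture on the physical OVS uplink to see whether they make it out of the bridge
tcpdump -ni ens3f0np0 'arp or icmp'
 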
This is (what I see) the relevant part of the interfaces file:

auto ens3f0np0
iface ens3f0np0 inet manual
        ovs_mtu 9000
        ovs_bridge vmbr0
        ovs_type OVSPort
        ovs_options tag=1 vlan_mode=native-untagged
        pre-up echo 56 > /sys/class/net/ens3f0np0/device/sriov_numvfs

auto vmbr0
iface vmbr0 inet manual
        ovs_type OVSBridge
        ovs_ports vlan30 vlan32 vlan50 vlan51 vlan52 vlan60 vlan70 vlan71 vlan80 vlan90 vlan200 vlan202
        ovs_mtu 9000

auto vlan51
iface vlan51 inet static
        ovs_mtu 1500
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        ovs_options tag=51
 
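Something else worth comparing is the OVS port record of an older, working VM against a newly created one, to check whether the tag and vlan_mode ended up the same. The tap names here are just examples (tapNNNiX, where NNN is the VMID):
Bash:
# list all ports currently attached to the bridge
ovs-vsctl list-ports vmbr0

# show the full port records; compare tag and vlan_mode between a working and a new VM
ovs-vsctl list port tap100i0
ovs-vsctl list port tap101i0
 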
Hi, @Veidit! I am facing a similar issue: two VMs running Ubuntu 18.04 were migrated (via PBS) from a Proxmox 6 cluster to a Proxmox 7 cluster. The first can connect to the second, but the second cannot connect to the first. Both can be reached from an LXC container I use as an Ansible controller.

It looks like a routing or firewall problem, but I just cannot find the source of the issue. Did you come up with a way to find out what happened in your case?
 
In my case, it turned out that the firewall had gotten into a bad state (for whatever reason). Restarting the firewall on the destination node (the one holding the VM that could not be reached by the other VM) solved the problem.

Kind of weird, to say the least.
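If anyone wants to try the same thing, the node firewall can be checked and restarted with the standard tooling, roughly like this:
Bash:
# check that the firewall is enabled and the ruleset looks sane
pve-firewall status

# restart the node firewall service so the rules get regenerated and re-applied
systemctl restart pve-firewall

# or the same via the pve-firewall CLI
pve-firewall restart
 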