VMs unable to reach default gateway

leeleatherwood · Aug 21, 2024

Hello, first-time Proxmox user here. I am familiar with other platforms such as ESXi, Hyper-V, OpenStack, OpenShift, etc. I have heard many good things about Proxmox over the years, and because of some quirkiness with my new server (involving dual-actuator hard drives), it turns out ESXi is not the best choice.

Anyway, long story short, I cannot figure out how to get VMs to communicate with anything past the Proxmox host itself. VMs can communicate with each other and with the Proxmox host, but with nothing beyond it, such as the default gateway or any other network device.

To simplify as much as possible, I have put Proxmox on a simple 192.168.1.0/24 network with just a switch and a router.

Ping from VM (192.168.1.45) to Proxmox (192.168.1.44) works & vice versa.
Ping from VM (192.168.1.45) to VM (192.168.1.46) works & vice versa.
Ping from Proxmox (192.168.1.44) to Default Gateway (192.168.1.1) works.
Ping from VMs (192.168.1.45 & 192.168.1.46) to Default Gateway fails.
Ping from VMs (192.168.1.45 & 192.168.1.46) to anything else on the network & internet fails.

Strangely enough, the VMs are getting DHCP addresses from 192.168.1.1.

Things I have tried:
Rebooting (duh)
Disabling Proxmox firewall at Datacenter, Host and VM layers
Making sure network is configured correctly (vmbr0 using enp9s0 as bridge. vmbr0 on correct subnet with correct default gateway)
Different network adapters in the VMs (tried VirtIO, Intel and Realtek)
Both Windows and Ubuntu
Reinstalled Proxmox
Tried VLAN Aware

Here is ip a

Code:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: enp9s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr0 state UP group default qlen 1000
    link/ether 40:01:7a:ff:0b:f5 brd ff:ff:ff:ff:ff:ff
3: enp10s0: <BROADCAST,MULTICAST> mtu 9000 qdisc noop state DOWN group default qlen 1000
    link/ether 40:01:7a:ff:0b:f6 brd ff:ff:ff:ff:ff:ff
4: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 40:01:7a:ff:0b:f5 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.44/24 scope global vmbr0
       valid_lft forever preferred_lft forever
    inet6 fe80::4201:7aff:feff:bf5/64 scope link
       valid_lft forever preferred_lft forever
8: tap100i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmbr0 state UNKNOWN group default qlen 1000
    link/ether 0e:54:dd:77:eb:64 brd ff:ff:ff:ff:ff:ff
9: tap101i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmbr0 state UNKNOWN group default qlen 1000
    link/ether 5a:96:aa:49:d6:6a brd ff:ff:ff:ff:ff:ff
10: vmbr0v1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 40:01:7a:ff:0b:f5 brd ff:ff:ff:ff:ff:ff
11: enp9s0.1@enp9s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0v1 state UP group default qlen 1000
    link/ether 40:01:7a:ff:0b:f5 brd ff:ff:ff:ff:ff:ff
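
To read the bridge membership out of that listing at a glance, here is a minimal Python sketch. It assumes the interface header lines from `ip a` have been pasted into a string; the name `bridge_members` is just an illustrative helper, not part of any Proxmox tooling. It groups each interface under the bridge named after `master`.

```python
import re
from collections import defaultdict

# Interface header lines copied from the `ip a` output above
# (loopback and address lines left out).
IP_A_HEADERS = """\
2: enp9s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr0 state UP group default qlen 1000
3: enp10s0: <BROADCAST,MULTICAST> mtu 9000 qdisc noop state DOWN group default qlen 1000
4: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
8: tap100i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmbr0 state UNKNOWN group default qlen 1000
9: tap101i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmbr0 state UNKNOWN group default qlen 1000
10: vmbr0v1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
11: enp9s0.1@enp9s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0v1 state UP group default qlen 1000
"""


def bridge_members(text):
    """Map each bridge name to the interfaces enslaved to it."""
    members = defaultdict(list)
    for line in text.splitlines():
        # Interface name is the token after "<index>: ", without any "@parent" suffix.
        head = re.match(r"\d+:\s+([^:@\s]+)", line)
        # "master <bridge>" is only present for interfaces attached to a bridge.
        master = re.search(r"\bmaster\s+(\S+)", line)
        if head and master:
            members[master.group(1)].append(head.group(1))
    return dict(members)


for bridge, ports in bridge_members(IP_A_HEADERS).items():
    print(bridge, "->", ", ".join(ports))
```

For the listing above this groups enp9s0, tap100i0 and tap101i0 under vmbr0, and enp9s0.1 under the second bridge, vmbr0v1.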

here is /etc/network/interfaces

Code:

auto lo
iface lo inet loopback

auto enp9s0
iface enp9s0 inet manual

iface enp10s0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.44/24
        gateway 192.168.1.1
        bridge-ports enp9s0
        bridge-stp on
        mtu 1500

source /etc/network/interfaces.d/*

here is ip route

Code:

default via 192.168.1.1 dev vmbr0 proto kernel onlink
192.168.1.0/24 dev vmbr0 proto kernel scope link src 192.168.1.44

fireon · Aug 21, 2024

Basically, it's a straightforward matter. Your network config also looks quite default and ok. I therefore suspect that something is wrong in your network.
Please post a VM config that cannot ping the GW.

Code:

qm config <VMID>

And the running network configuration of an Ubuntu VM.

Have you already tested an LXC? There you can easily configure the network manually in the Proxmox WebUI.

leeleatherwood · Aug 21, 2024

Hi fireon, thanks for the reply!

I suspected something "wrong" on the network as well, but other hypervisors are not having a similar issue. That said, I still disabled all the fancy network options such as Spanning Tree, Flow Control, Rogue DHCP Detection, IoT Auto-Discovery, etc. There could still be something wrong that I don't know about, though. The switch is a Ubiquiti USW Pro 48 PoE and the router is a Ubiquiti USG, both currently with nothing fancy configured (no VLANs, static routes, ACLs, etc.).

Here is the config for the Ubuntu VM (just running live, not installed)

Code:

root@pve:~# qm config 101
agent: 1
boot: order=scsi0;ide2;net0
cores: 4
cpu: x86-64-v2-AES
ide2: local:iso/ubuntu-24.04-desktop-amd64.iso,media=cdrom,size=5971344K
memory: 2048
meta: creation-qemu=8.1.5,ctime=1724210246
net0: virtio=BC:24:11:95:3D:D7,bridge=vmbr0
numa: 0
ostype: l26
scsi0: dev-sda:vm-101-disk-0,iothread=1,size=32G
scsihw: virtio-scsi-single
smbios1: uuid=2c7d3f88-4af1-4e43-a0ec-903b42a8dfe8
sockets: 1
vmgenid: 9b9fe1d3-b003-43cb-9f1b-7a9cbc430d57

I have not tried LXC, I will work on that now. Reading on it, seems pretty interesting!

Markku · Aug 21, 2024

Is there a reason why you have "bridge-stp on" in vmbr0, have you tried with it off?

leeleatherwood · Aug 21, 2024

I just disabled bridge-stp on vmbr0, no change.

An LXC container running Ubuntu Focal is experiencing the same issue.

Markku · Aug 21, 2024

After trying to ping the gw from the VMs, what does the ARP table on the USG say?

leeleatherwood · Aug 21, 2024

Markku said:
After trying to ping the gw from the VMs, what does the ARP table on the USG say?

(proxmox host) 192.168.1.44 dev eth1 lladdr 40:01:7a:ff:0b:f5 REACHABLE
(VM) 192.168.1.45 dev eth1 lladdr bc:24:11:73:45:5e STALE

leeleatherwood · Aug 21, 2024

Hmm, seems like it's probably some weird routing issue. The VMs do have unique MAC addresses, and since this is all on the same subnet and the same switch, I'm not sure why routing would even be involved in the first place. Well, technically the Proxmox bridge is a switch... hmm
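
A quick way to confirm that no routing hop should be involved is to check that every address mentioned so far falls inside the same /24. This is only a sketch using Python's standard `ipaddress` module; the host labels and the `on_link` helper are illustrative names, and the addresses are the ones quoted in this thread.

```python
import ipaddress

# Network implied by the vmbr0 configuration (192.168.1.44/24).
network = ipaddress.ip_network("192.168.1.0/24")

# Addresses quoted in the thread; the labels are only for readability.
hosts = {
    "gateway (USG)": "192.168.1.1",
    "Proxmox host": "192.168.1.44",
    "VM 1": "192.168.1.45",
    "VM 2": "192.168.1.46",
}


def on_link(addr, net=network):
    """Return True if addr belongs to net, i.e. it is reached by ARP, not via a router."""
    return ipaddress.ip_address(addr) in net


for name, addr in hosts.items():
    print(f"{name:14} {addr:13} on-link: {on_link(addr)}")
```

All four addresses are on-link, so traffic between them is delivered at layer 2 (ARP plus switching or bridging) and never consults a routing table beyond the connected route.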

traceroute from default gateway to proxmox:

Code:

traceroute to 192.168.1.44 (192.168.1.44), 30 hops max, 38 byte packets
 1  192.168.1.44 (192.168.1.44)  0.072 ms  0.439 ms  0.401 ms

traceroute from default gateway to VM; you can see one time it worked, the next time it didn't.

Code:

traceroute to 192.168.1.45 (192.168.1.45), 30 hops max, 38 byte packets
 1  *  *  *
 2  *  WIN-MNJ8V4Q0MOD.internal (192.168.1.45)  1.149 ms  0.709 ms

traceroute to 192.168.1.45 (192.168.1.45), 30 hops max, 38 byte packets
 1  *  *  *
 2  *  *  *
 3  *  *  *
 4  *  *  *
 5  *  *  *
 6  *  *  *
 7  *  *  *

Markku · Aug 21, 2024

leeleatherwood said:
(proxmox host) 192.168.1.44 dev eth1 lladdr 40:01:7a:ff:0b:f5 REACHABLE
(VM) 192.168.1.45 dev eth1 lladdr bc:24:11:73:45:5e STALE

So USG is able to get ARP responses and knows where the VM is.

You could try tcpdump on the host to see the packets while pinging the gw from the VM:

tcpdump -i enp9s0 -n host 192.168.1.45

You should see the outgoing packets, and also the incoming replies.

leeleatherwood · Aug 21, 2024

Markku said:
So USG is able to get ARP responses and knows where the VM is.

You could try tcpdump on the host to see the packets while pinging the gw from the VM:

tcpdump -i enp9s0 -n host 192.168.1.45

You should see the outgoing packets, and also the incoming replies.

Code:

tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on enp9s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
14:16:05.271281 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:16:05.271350 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:16:05.271368 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:16:05.271444 ARP, Reply 192.168.1.1 is-at 74:ac:b9:37:00:14, length 46
14:16:06.051768 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:16:06.051821 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:16:06.051838 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:16:06.051942 ARP, Reply 192.168.1.1 is-at 74:ac:b9:37:00:14, length 46
14:16:07.051803 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:16:07.051862 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:16:07.051904 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:16:07.051959 ARP, Reply 192.168.1.1 is-at 74:ac:b9:37:00:14, length 46
14:16:11.757779 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:16:11.757865 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:16:11.757898 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:16:11.757953 ARP, Reply 192.168.1.1 is-at 74:ac:b9:37:00:14, length 46
14:16:12.551756 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:16:12.551820 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:16:12.551839 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:16:12.551908 ARP, Reply 192.168.1.1 is-at 74:ac:b9:37:00:14, length 46
14:16:13.551753 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:16:13.551832 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:16:13.551860 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:16:13.551873 ARP, Reply 192.168.1.1 is-at 74:ac:b9:37:00:14, length 46
14:16:20.271169 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:16:20.271255 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:16:20.271301 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:16:20.271316 ARP, Reply 192.168.1.1 is-at 74:ac:b9:37:00:14, length 46
14:16:21.051712 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:16:21.051771 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:16:21.051803 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:16:21.051891 ARP, Reply 192.168.1.1 is-at 74:ac:b9:37:00:14, length 46

That's the correct MAC address for the USG LAN port.

Code:

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 74:ac:b9:37:00:14 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.1/24 brd 192.168.1.255 scope global eth1
       valid_lft forever preferred_lft forever

Markku · Aug 21, 2024

Right, so it looks like the VM doesn't receive the ARP replies as it keeps retrying.

Now do the same tcpdump but using the interface name of the VM (tapxxxxx from the "ip addr" list) instead of enp9s0

leeleatherwood · Aug 21, 2024

Markku said:
Right, so it looks like the VM doesn't receive the ARP replies as it keeps retrying.

Now do the same tcpdump but using the interface name of the VM (tapxxxxx from the "ip addr" list) instead of enp9s0

Code:

tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on tap100i0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
14:22:35.271368 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:22:35.271401 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:22:35.271428 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:22:36.051909 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:22:36.051954 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:22:36.051958 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:22:37.051899 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:22:37.051926 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:22:37.051930 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:22:50.271263 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:22:50.271367 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:22:50.271383 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:22:51.051869 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:22:51.051962 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:22:51.051978 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:22:52.051918 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:22:52.051999 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:22:52.052008 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:23:05.271204 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:23:05.271307 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:23:05.271323 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:23:06.051902 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:23:06.051981 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:23:06.051990 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:23:07.051860 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:23:07.051967 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:23:07.051979 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46

Markku · Aug 21, 2024

Ok so now we know that the PVE host blocks the return traffic for some reason.

Ideas, anyone? Why wouldn't the host forward packets to its own VM?

leeleatherwood · Aug 21, 2024

this is from vmbr0 if that helps

Code:

listening on vmbr0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
14:26:51.051978 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:26:51.052055 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:26:51.052059 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:26:51.052122 ARP, Reply 192.168.1.1 is-at 74:ac:b9:37:00:14, length 46
14:26:52.051928 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:26:52.052009 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:26:52.052014 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:26:52.052165 ARP, Reply 192.168.1.1 is-at 74:ac:b9:37:00:14, length 46
14:27:05.271386 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:27:05.271412 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:27:05.271418 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:27:05.271546 ARP, Reply 192.168.1.1 is-at 74:ac:b9:37:00:14, length 46
14:27:06.052009 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:27:06.052022 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:27:06.052024 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:27:06.052201 ARP, Reply 192.168.1.1 is-at 74:ac:b9:37:00:14, length 46

Markku · Aug 21, 2024

Ok so the ARP replies are shown in vmbr0 but the switch/host does not send them "out" to the VM.
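
The comparison across the three captures can be made explicit by counting ARP requests and replies per interface. The sketch below is in Python and assumes the `tcpdump -n` lines are pasted in as text; it uses one burst from each capture posted above, and `arp_counts` is an illustrative helper name rather than an existing tool.

```python
from collections import Counter

# One burst from each capture posted in this thread.
CAPTURES = {
    "enp9s0": """\
14:16:05.271281 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:16:05.271350 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:16:05.271368 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:16:05.271444 ARP, Reply 192.168.1.1 is-at 74:ac:b9:37:00:14, length 46
""",
    "vmbr0": """\
14:26:51.051978 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:26:51.052055 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:26:51.052059 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:26:51.052122 ARP, Reply 192.168.1.1 is-at 74:ac:b9:37:00:14, length 46
""",
    "tap100i0": """\
14:22:35.271368 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 28
14:22:35.271401 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
14:22:35.271428 ARP, Request who-has 192.168.1.1 tell 192.168.1.45, length 46
""",
}


def arp_counts(capture):
    """Count ARP 'Request' and 'Reply' lines in tcpdump -n text output."""
    counts = Counter()
    for line in capture.splitlines():
        fields = line.split()
        # Expected layout: <timestamp> ARP, Request|Reply ...
        if len(fields) >= 3 and fields[1] == "ARP,":
            counts[fields[2]] += 1
    return counts


for iface, text in CAPTURES.items():
    c = arp_counts(text)
    print(f"{iface:9} requests={c['Request']} replies={c['Reply']}")
```

With these samples, enp9s0 and vmbr0 each show three requests and one reply, while tap100i0 shows three requests and no reply, which is the point at which the reply goes missing.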

leeleatherwood · Aug 21, 2024

Markku said:
Ok so now we know that the PVE host blocks the return traffic for some reason.

Ideas, anyone? Why wouldn't the host forward packets to its own VM?

I am running Proxmox Virtual Environment 8.2.2, downloaded the ISO from the downloads page. I could try an older one?

Markku · Aug 21, 2024

Did you install the upgrades?

I'm a new PVE user myself as well, haven't experienced your problems.

leeleatherwood · Aug 22, 2024

I installed Proxmox 7.4-1 and had the same network issue. Even worse, a new issue appeared: the default local-lvm pool was not being created, and the Storage pane was stuck refreshing with an error, I assume because the local-lvm pool was missing. Syslog showed this error:

Code:

command '/sbin/lvs --separator : --noheadings --units b --unbuffered --nosuffix --config 'report/time_format="%s"' --options vg_name,lv_name,lv_size,lv_attr,pool_lv,data_percent,metadata_percent,snap_percent,uuid,tags,metadata_size,time' failed: exit code 5 (500)

I reformatted the boot disk and reinstalled Proxmox 7.4-1 yet again, and both errors remained.

I thank you for all the help @Markku and @fireon but I need something a little more reliable than this. Probably I'll just run KVM on Debian or Ubuntu.

fireon · Aug 22, 2024

Markku said:
Ok so the ARP replies are shown in vmbr0 but the switch/host does not send them "out" to the VM.

Reminds me of Mikrotik...

I would replace the router as a test. This is definitely not a problem with the Proxmox operating system, because then a lot of people would have it and it would be a major bug.

leeleatherwood · Aug 22, 2024

I do not believe there is anything wrong with Proxmox per se; as you have mentioned, a lot more people would have the same issue.

On the other hand all of these worked out of the box:
ESXi
Hyper-V
Unraid
TrueNAS Scale
Ubuntu Server + KVM

Ideally I wanted to run ESXi, but it does not recognize both LUNs on dual-actuator hard disks, which is why I tried all of these to figure out the best compromise. In this case I believe TrueNAS Scale will be the ideal solution. It does not recognize both LUNs out of the box, but with a few CLI commands they are all present and now working phenomenally.

I am sure that if I were more familiar with Linux routing, firewalling and iptables, and had more time to troubleshoot them, I could get Proxmox to work.
