VM VLAN Issue

cyrus104

In my homelab setup, I have 3 Proxmox hosts, each with 2x 10G connections. On each of these are two VLAN-aware bridges (vmbr0 and vmbr1) and, on top of them, two VLAN interfaces (vmbr0.10 and vmbr1.50) that give the Proxmox host itself an IP and a connection into the public and storage VLANs respectively.

Proxmox has its IPs set on those VLAN interfaces and has no comms issues.

I currently have 2x VMs that I've migrated over to the Proxmox setup. In the VM configuration, one has a connection to vmbr0 with VLAN tag 10 and is properly pulling a DHCP lease from that network. The second VM is configured for vmbr0 with a blank VLAN tag field, because I want it to be on the native, untagged VLAN. When I run tcpdump I can see that, even though the VM is not configured to tag a VLAN and is on vmbr0, its traffic still shows up as if it were on VLAN 10.
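To see that, I captured on the host roughly like this (tap111i0 is the tap device of one of the VMs on this node, as it shows up in the dmesg output further down; the name will differ for other VMs):

Code:
# link-level headers (-e) make any 802.1Q tag visible on the physical trunk
tcpdump -e -nn -i eno7 -c 20 vlan 10
# and on the VM's tap interface for comparison
tcpdump -e -nn -i tap111i0 -c 20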

Code:
auto lo
iface lo inet loopback

iface eno7 inet manual

iface eno5 inet manual

iface eno6 inet manual

iface eno8 inet manual

iface eno1 inet manual

iface eno2 inet manual

iface eno3 inet manual

iface eno4 inet manual

auto vmbr0
iface vmbr0 inet manual
        bridge-ports eno7
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

auto vmbr0.10
iface vmbr0.10 inet static
        address 10.100.10.21/24
        gateway 10.100.10.1

auto vmbr1
iface vmbr1 inet manual
        bridge-ports eno8
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

auto vmbr1.50
iface vmbr1.50 inet static
        address 10.100.50.21/24
 
I tried rebooting the host and also tried to migrate the VM to another of the 3 nodes. Each of the 3 nodes is the same apart from its IP addresses, and all of them have the same bridges/VLANs connected to the same eno7 and eno8.
 
Code:
root@proxmox-2:~# bridge -c vlan show
port               vlan ids
eno5
eno6
eno7               1 PVID Egress Untagged
                   2-50

eno8               1 PVID Egress Untagged
                   2-50

vmbr0              1 PVID Egress Untagged
                   10

vmbr1              1 PVID Egress Untagged
                   50

docker_gwbridge    1 PVID Egress Untagged

docker0            1 PVID Egress Untagged

vethb13277d        1 PVID Egress Untagged

vethc780f0c        1 PVID Egress Untagged

tap111i0           1 PVID Egress Untagged
                   2-4094

root@proxmox-2:~#
 
so, "1 PVID Egress Untagged" is showing the native default vlan "1".

are you sure that you have a dhcp server running on this vlan1 or without any vlan defined ? Does it work with static ip address ?
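Something like this on the host should also show whether DHCP traffic is reaching the bridge untagged (the interface name is taken from your config above):

Code:
# watch for DHCP traffic on the bridge; -e shows whether frames carry a VLAN tag
tcpdump -e -nn -i vmbr0 port 67 or port 68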
 
I've double checked that there is a DHCP server running on there. I have several machines on that network, including the desktop (using DHCP) that I'm on.

I have given the VM a static address in the same /24 that the DHCP server would have handed out, and it still can't ping anything that the rest of the machines can.
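Roughly what I did inside the guest (the addresses and NIC name below are just placeholders for the untagged network):

Code:
# temporary static address on the untagged network (placeholder values)
ip addr add 192.168.1.150/24 dev eth0
ip route add default via 192.168.1.1
ping -c 3 192.168.1.1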
 
The switch is a UniFi US-16-XG; because it's my homelab core, it's pushing all VLANs plus untagged traffic to all ports. Each of the Proxmox nodes is connected via 2x 10Gb SFP+ DAC cables. My configuration has vmbr0.10 as the management VLAN and vmbr1.50 as the storage/Ceph VLAN. In the near future I will have a second 10G switch dedicated to storage, and I should be able to just move the SFP+ over to the new switch (after basic config).
 
I also found the following. When I googled it and looked through the forums, it looks like it's mainly an issue with OVS, but I'm not using that.

dmesg:
Code:
[   17.481904] device tap111i0 entered promiscuous mode
[   17.491682] vmbr0: port 2(tap111i0) entered blocking state
[   17.491685] vmbr0: port 2(tap111i0) entered disabled state
[   17.491779] device eno7 entered promiscuous mode
[   17.491854] vmbr0: port 2(tap111i0) entered blocking state
[   17.491856] vmbr0: port 2(tap111i0) entered forwarding state
[   21.645152] FS-Cache: Loaded
[   21.652208] Key type ceph registered
[   21.652347] libceph: loaded (mon/osd proto 15/24)
[   21.657311] FS-Cache: Netfs 'ceph' registered for caching
[   21.657317] ceph: loaded (mds proto 32)
[   21.664451] libceph: mon2 (1)10.100.10.41:6789 session established
[   21.664735] libceph: client934562 fsid 93286cdf-7711-434e-be61-1b80cdf6fa63
[   21.936037] FS-Cache: Netfs 'nfs' registered for caching
[   22.046853] NFS: Registering the id_resolver key type
[   22.046865] Key type id_resolver registered
[   22.046865] Key type id_legacy registered
[   65.444206] vmbr0: port 2(tap111i0) entered disabled state
[   65.471617] device eno7 left promiscuous mode
[   65.493561] vmbr0: port 2(tap111i0) entered blocking state
[   65.493562] vmbr0: port 2(tap111i0) entered disabled state
[   65.493656] device eno7 entered promiscuous mode
[   65.493746] vmbr0: port 2(tap111i0) entered blocking state
[   65.493747] vmbr0: port 2(tap111i0) entered forwarding state
 
Ok, well, the issue was that I did something stupid. If you know me, you'd know this is nothing new.

Most of the services I want to run are either VMs or Docker containers from Docker Hub. I was struggling with how to work Docker into my 3-node cluster, so I followed a couple of guides and installed it on the Debian base OS of Proxmox. This worked fine until I put the 3 Proxmox/Docker nodes into a Docker swarm, which created an interface of its own that was the same across the Docker instances. Once I stopped the Docker services the problem went away.
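For anyone hitting the same thing, this is roughly how I confirmed it (the bridge/veth names Docker creates will differ on your system):

Code:
# list the bridges Docker created next to vmbr0/vmbr1
ip -br link show type bridge
# stop the Docker services, then re-check the VLAN table on the bridges
systemctl stop docker.service docker.socket
bridge vlan show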

I don't want to install Docker (Portainer) into VMs and create a swarm, because if 1 or 2 hosts go down I'll have 3 VMs all on one machine, which defeats the purpose of the swarm and of moving containers around. There's a similar issue with LXC, and those containers won't even be able to take advantage of HA.

I'm at a loss on how to use Docker now without another container/VM of overhead. I would use LXC, but none of the services my work uses are available for it, so I'm not really using it in the homelab either, because I need to use what I will be expected to work with.

Thanks for the help
 
I don't want to install Docker (Portainer) into VMs and create a swarm, because if 1 or 2 hosts go down I'll have 3 VMs all on one machine, which defeats the purpose of the swarm and of moving containers around.
If you don't enable HA on these VMs, they will not move automatically in case of host failure. (You already manage "HA" inside Docker anyway, as containers will be restarted on the other VMs.)
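HA is opt-in per guest, for example (the VMID here is just an example):

Code:
# add or remove a VM from HA and check the HA state
ha-manager add vm:100
ha-manager status
ha-manager remove vm:100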

I'm running a lot of Kubernetes clusters in VMs, and I don't have any problems with this.
 
Thank you for the troubleshooting and the advice. With your Kubernetes clusters in VMs, what base OS / configuration are you using? I don't like the idea of having another OS to manage and keep updated; even with apt, having a plan for security and kernel patching is an extra layer.
 
I'm using Debian as the VM OS, and I deploy Kubernetes with Rancher RKE (https://github.com/rancher/rke). (I'm using Calico as the CNI for the Kubernetes network, with IP-in-IP encapsulation.)
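A minimal cluster.yml sketch for that kind of setup (the addresses, SSH user, and node count are placeholders):

Code:
# deployed with: rke up --config cluster.yml
nodes:
  - address: 192.168.1.51
    user: debian
    role: [controlplane, etcd, worker]
  - address: 192.168.1.52
    user: debian
    role: [worker]
network:
  plugin: calico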
 
Thanks. Since I'm most familiar with Debian and it's what Proxmox is built on, that is what I was going to go with too. How do you handle a Debian VM needing more storage space, CPU cores, or memory assigned?
 
For storage, I'm using XFS directly on a raw disk (/dev/sdb) that is mounted at /var/lib/docker.
That way I can extend the disk from Proxmox and simply run "xfs_growfs /var/lib/docker" to extend XFS.

For CPU, I'm using CPU hotplug/unplug.

For memory, I'm using ballooning with shares=0 to simulate hotplug/unplug. (shares=0 forces the memory to the minimum value, so you can increase/decrease the minimum memory online, like hotplug.)
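Roughly, from the Proxmox host and then inside the VM (the VMID and disk name are just examples):

Code:
# grow the VM's Docker data disk from the host
qm resize 100 scsi1 +20G
# ballooning with shares=0: the minimum becomes the effective size and can be changed online
qm set 100 --memory 8192 --balloon 4096 --shares 0
# inside the VM, grow the XFS filesystem mounted at /var/lib/docker
xfs_growfs /var/lib/docker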
 
Thanks a bunch. I'm going to work on standing up a 2-3 virtual-node swarm as you recommend.

My last question is whether you are using VLANs in your network, and whether you have Docker containers in different VLANs?

When I create the Debian VM for Docker, my plan is to pass everything in via vmbr0. Then, similar to how my Proxmox config puts the management interface on VLAN 10, I'd do the same thing for the Debian machine. After that I'm not sure how to get Docker to put containers in different VLANs.
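From what I've read, Docker's macvlan driver can bind a network to an 802.1Q sub-interface inside the VM, something roughly like this (VLAN 20 and the subnet are placeholders):

Code:
# VLAN sub-interface inside the Debian VM (assumes the VM's NIC is passed in as a trunk)
ip link add link eth0 name eth0.20 type vlan id 20
ip link set eth0.20 up
# macvlan network bound to that sub-interface; containers attached to it land in VLAN 20
docker network create -d macvlan --subnet 10.0.20.0/24 --gateway 10.0.20.1 -o parent=eth0.20 vlan20
docker run --rm -it --network vlan20 alpine ip addr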
 
Were you able to fix the issue?
 
