Weird network behaviour with Proxmox?

Tozz

Active Member
Mar 11, 2012
We've had some issues with our shared storage, so I've migrated some VMs to another Proxmox server using local storage. Since this migration I am having network issues, which I narrowed down to yet another Proxmox server sending ARP replies for a machine that was never even on that server.

I have:
proxmox1: Ran this specific VM I migrated
proxmox2: The server that ARP replies for a VM that was never on pm2
proxmox3: The new host for my VM.

The VM has MAC 76:9c:5b:72:61:fd, however when running "arp -n" on a machine in my network I get:

Code:
# arp -n
Address                  HWtype  HWaddress           Flags Mask            Iface
1.2.3.100                ether   f6:8f:3b:9b:92:78   C                     eth0
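
A quick way to confirm this kind of mismatch is to compare the ARP cache entry against the MAC the VM is configured with. This is only a minimal sketch, assuming the "arp -n" column layout shown above; mac_for_ip and check_arp are hypothetical helper names, not part of any Proxmox tooling.

```shell
#!/bin/sh
# mac_for_ip IP: read `arp -n` output on stdin, print the HWaddress for IP.
mac_for_ip() {
    awk -v ip="$1" '$1 == ip && $2 == "ether" { print tolower($3); exit }'
}

# check_arp IP EXPECTED_MAC: read `arp -n` output on stdin and report
# whether the cached MAC matches the one we expect for that IP.
check_arp() {
    seen=$(mac_for_ip "$1")
    want=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    if [ -z "$seen" ]; then
        echo "no entry for $1"
    elif [ "$seen" = "$want" ]; then
        echo "ok: $1 is at $seen"
    else
        echo "MISMATCH: $1 is at $seen, expected $want"
    fi
}

# Usage on a live machine (not run here):
#   arp -n | check_arp 1.2.3.100 76:9c:5b:72:61:fd
```

The helpers only parse text, so they can be fed saved output from any machine in the VLAN.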

After checking our core switches to see where this MAC was coming from, I saw it came from proxmox2.

Running "brctl showmacs vmbr402 | grep f6:8f" showed that this machine has this MAC in its bridge MAC table too, learned on a port other than 1:

Code:
# brctl showmacs vmbr402 | grep f6:8f
  7     f6:8f:3b:9b:92:78       no                 2.06

The port being greater than 1 means it is not the physical interface; it is the virtual port of one of the VMs. brctl show output:

Code:
root@pm3:~# brctl show 
bridge name     bridge id               STP enabled     interfaces
vmbr402         8000.003048f81284       no              tap102i0
                                                        tap104i0
                                                        tap111i0
                                                        vlan402

As shown here, this bridge only contains three virtual interfaces, for VMs 102, 104 and 111. This is rather strange, because qm list shows:

Code:
# qm list
      VMID NAME                 STATUS     MEM(MB)    BOOTDISK(GB) PID
       108 vmX                  stopped    512                0.00 0
       117 xmY                  running    512                0.00 1003551

As you can see, VMs with IDs 102, 104 and 111 don't even exist. Why are the virtual interfaces still connected?
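
The comparison above (tap interfaces on the bridge versus VMIDs known to qm) can be scripted. This is a rough sketch, assuming the tap<VMID>i<N> naming seen in the brctl output and the "qm list" layout shown above; vmids_from_qm, taps_from_brctl and orphan_taps are hypothetical helper names. Note that "qm list" only reports VMs on the local node.

```shell
#!/bin/sh
# vmids_from_qm: read `qm list` output on stdin, print one VMID per line.
vmids_from_qm() {
    awk 'NR > 1 && $1 ~ /^[0-9]+$/ { print $1 }'
}

# taps_from_brctl: read `brctl show` output on stdin, print tap interface names.
taps_from_brctl() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^tap[0-9]+i[0-9]+$/) print $i }'
}

# orphan_taps QM_FILE BRCTL_FILE: print taps whose VMID is not in the qm list.
orphan_taps() {
    ids=$(vmids_from_qm < "$1")
    taps_from_brctl < "$2" | while read -r tap; do
        id=${tap#tap}      # strip the "tap" prefix, e.g. 111i0
        id=${id%%i*}       # strip the "iN" suffix, e.g. 111
        printf '%s\n' "$ids" | grep -qx "$id" || echo "$tap"
    done
}

# Usage on a live node (not run here):
#   qm list > /tmp/qm.txt; brctl show > /tmp/br.txt
#   orphan_taps /tmp/qm.txt /tmp/br.txt
```

Working from saved files keeps the check read-only; nothing on the bridge is touched.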

Now, back to brctl. We saw that my conflicting MAC was on port 7. According to brctl showstp vmbr402 that is on tap111i0:

Code:
# brctl showstp vmbr402
vmbr402
 bridge id              8000.003048f81284
 designated root        8000.003048f81284
 root port                 0                    path cost                  0
 max age                  19.99                 bridge max age            19.99
 hello time                1.99                 bridge hello time          1.99
 forward delay             0.00                 bridge forward delay       0.00
 ageing time             299.95
 hello timer               0.31                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                   0.31
 flags


tap102i0 (2)
 port id                8002                    state                forwarding
 designated root        8000.003048f81284       path cost                100
 designated bridge      8000.003048f81284       message age timer          0.00
 designated port        8002                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

tap104i0 (4)
 port id                8004                    state                forwarding
 designated root        8000.003048f81284       path cost                100
 designated bridge      8000.003048f81284       message age timer          0.00
 designated port        8004                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

tap111i0 (7)
 port id                8007                    state                forwarding
 designated root        8000.003048f81284       path cost                100
 designated bridge      8000.003048f81284       message age timer          0.00
 designated port        8007                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

vlan402 (1)
 port id                8001                    state                forwarding
 designated root        8000.003048f81284       path cost                 19
 designated bridge      8000.003048f81284       message age timer          0.00
 designated port        8001                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

See, tap111i0 has (7) behind it, indicating that tap111i0 is identified as port 7. As shown from the qm list output, VM 111 does not exist.
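
The two lookups used so far (MAC to port number via "brctl showmacs", then port number to interface via "brctl showstp") could be chained roughly like this. It is a sketch that assumes the output layouts shown above; port_for_mac and iface_for_port are hypothetical helper names.

```shell
#!/bin/sh
# port_for_mac MAC: read `brctl showmacs BRIDGE` output on stdin,
# print the bridge port number the MAC was learned on.
port_for_mac() {
    awk -v mac="$1" 'tolower($2) == tolower(mac) { print $1; exit }'
}

# iface_for_port PORT: read `brctl showstp BRIDGE` output on stdin,
# print the interface whose header line reads "<iface> (<PORT>)".
iface_for_port() {
    awk -v want="($1)" 'NF == 2 && $2 == want { print $1; exit }'
}

# Usage on a live node (not run here):
#   port=$(brctl showmacs vmbr402 | port_for_mac f6:8f:3b:9b:92:78)
#   brctl showstp vmbr402 | iface_for_port "$port"
```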

Can anyone explain to me why:

- Why is tap111i0 still bridged to vlan402 (using vmbr402)? Why isn't it removed?
- Why does tap111i0 respond to ARP requests for an IP that is not on this machine? (VM ID 111 does not exist, and even if it did exist, it is not running.)

The only way to resolve this issue is to manually remove tap111i0 from my bridge.
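
For reference, the manual removal can be wrapped in a small dry-run helper so the commands are visible before anything is changed. This is only a sketch: detach_tap is a hypothetical name, "brctl delif" is the step described above, and bringing the link down afterwards is an extra step I am assuming is harmless for an interface that no VM owns.

```shell
#!/bin/sh
# detach_tap BRIDGE TAP [--apply]: print (or, with --apply, run) the commands
# that take a stale tap interface off a bridge.
detach_tap() {
    bridge=$1 tap=$2 mode=${3:-}
    case $tap in
        tap[0-9]*i[0-9]*) ;;   # only accept tap<VMID>i<N> style names
        *) echo "refusing: $tap does not look like a tap interface" >&2
           return 1 ;;
    esac
    # Second command is an assumption, not something from the steps above.
    for cmd in "brctl delif $bridge $tap" "ip link set dev $tap down"; do
        if [ "$mode" = "--apply" ]; then
            $cmd || return 1
        else
            echo "$cmd"
        fi
    done
}

# Dry run: only prints the commands.
detach_tap vmbr402 tap111i0
```

Passing --apply as the third argument would actually run the commands, which needs root on the node.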
 
VM 111 does not exist in the cluster anymore. It is possible that VM 111 is the old VM that used this IP. This VM was removed from the cluster (using the "Remove" button in the interface or qm destroy <vmid>). I had the exact same issue with yet another VM that was also migrated. None of the troubled VMs were on "proxmox2".

If 111 was the ID of this VM on the old cluster, I would have thought that this issue would occur on proxmox1, the physical server that originally ran this VM. I don't understand why proxmox2, a cluster node that never ran this VM, replies to ARP requests for this IP.

Actually, I don't understand the issue at all. From what I understand, responding to ARP requests is something the VM itself should take care of, not the dom0. The physical server (hypervisor, dom0, whatever) only does Ethernet bridging and thus, theoretically, should have no business with or knowledge of ARP at all, except for its local management address.
 
Okay, let's assume that it ran on this host for some time. That doesn't explain my issue. I basically have two things I don't understand:

- Why wasn't the TAP interface removed when the VM was destroyed/stopped?
- What is answering the ARP requests? Why is the dom0 doing this instead of the VM?

In my opinion this is a pretty serious issue. It is hard to diagnose for someone with less experience or knowledge of ARP and layer 2/3 Ethernet/IP, and you might not notice it at first. Chances are 50/50 that you get an ARP reply from the correct host.
 
Quote:
Okay, let's assume that it ran on this host for some time. That doesn't explain my issue.

It would help to know what you have really done.

Quote:
- Why wasn't the TAP interface removed when the VM was destroyed/stopped?

How did you remove the VM: using the GUI, or simply by deleting the config files?
 
What I know for sure is that the VM that originally had this IP never ran on proxmox2. So "VM 111" (from tap111) must be another VM. That VM was in the same VLAN (402). VM 111 was also migrated without issues.

I removed it either using the GUI or using "qm destroy"; I did not simply remove the config files.

I had issues with (I think) the "HA manager" that my VM kept restarting after I shut it down. So what I did was run "qm stop 100 && qm destroy 100". Maybe this caused some kind of race condition that caused the interface to stay there.
 
Quote:
I had issues with (I think) the "HA manager" that my VM kept restarting after I shut it down. So what I did was run "qm stop 100 && qm destroy 100". Maybe this caused some kind of race condition that caused the interface to stay there.

Fencing would have killed the node. Also, tap111 can only be created by VM 111, but you say it was never there.
 
