How to fix this: "received packet on bond0 with own address as source address"?

I have one server on Proxmox 4.5; the other servers are still on version 4.4 until a fix resolves this issue.

On the Proxmox 4.5 server, we get kernel messages like the following:
Code:
[  202.482275] vmbr0: received packet on bond0 with own address as source address (addr:0c:c4:7a:40:c1:3a, vlan:0)
[  206.048280] net_ratelimit: 23 callbacks suppressed
[  206.048281] vmbr0: received packet on bond0 with own address as source address (addr:0c:c4:7a:40:c1:3a, vlan:0)
[  206.050799] vmbr0: received packet on bond0 with own address as source address (addr:0c:c4:7a:40:c1:3a, vlan:0)
[  206.051516] vmbr0: received packet on bond0 with own address as source address (addr:0c:c4:7a:40:c1:3a, vlan:0)
[  206.052227] vmbr0: received packet on bond0 with own address as source address (addr:0c:c4:7a:40:c1:3a, vlan:0)
[  206.052889] vmbr0: received packet on bond0 with own address as source address (addr:0c:c4:7a:40:c1:3a, vlan:0)
[  206.254154] vmbr0: received packet on bond0 with own address as source address (addr:0c:c4:7a:40:c1:3a, vlan:0)
[  206.573467] vmbr0: received packet on bond0 with own address as source address (addr:0c:c4:7a:40:c1:3a, vlan:0)
[  206.877175] vmbr0: received packet on bond0 with own address as source address (addr:0c:c4:7a:40:c1:3a, vlan:0)
[  207.180844] vmbr0: received packet on bond0 with own address as source address (addr:0c:c4:7a:40:c1:3a, vlan:0)
[  207.484490] vmbr0: received packet on bond0 with own address as source address (addr:0c:c4:7a:40:c1:3a, vlan:0)
[  211.053481] net_ratelimit: 10 callbacks suppressed

We have configured interface bonding as follows:
/etc/network/interfaces:
Code:
auto lo
iface lo inet loopback

auto bond0
iface bond0 inet manual
slaves eno1 eno2
bond-mode active-backup
bond_miimon 100
bond_downdelay 200
bond_updelay 200

auto vmbr0
iface vmbr0 inet static
address 192.168.14.14
netmask 255.255.255.0
gateway 192.168.14.1
bridge_ports bond0
bridge_stp off
bridge_fd 0
bridge_maxwait 0

We have tried different bonding modes without success.

For an unknown reason, if we restart the networking service (sudo service networking restart), the problem is resolved until the next reboot. When restarting the networking service, the dmesg log shows:
Code:
[  477.033692] vmbr0: port 1(bond0) entered disabled state
[  477.037305] device bond0 left promiscuous mode
[  477.037306] device eno1 left promiscuous mode
[  477.037728] device eno2 left promiscuous mode
[  477.038105] vmbr0: port 1(bond0) entered disabled state
[  477.061670] IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
[  477.219122] bond0: Releasing backup interface eno1
[  477.219124] bond0: the permanent HWaddr of eno1 - 0c:c4:7a:40:c1:3a - is still in use by bond0 - set the HWaddr of eno1 to a different address to avoid conflicts
[  477.311042] bond0: Releasing backup interface eno2
[  477.441354] e1000e: eno2 NIC Link is Down
[  478.680184] bond0: Enslaving eno1 as a backup interface with a down link
[  478.896243] bond0: Enslaving eno2 as a backup interface with a down link
[  478.899223] IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
[  479.020874] vmbr0: port 1(bond0) entered blocking state
[  479.020877] vmbr0: port 1(bond0) entered disabled state
[  479.020927] device bond0 entered promiscuous mode
[  479.023168] IPv6: ADDRCONF(NETDEV_UP): vmbr0: link is not ready
[  481.517275] igb 0000:05:00.0 eno1: igb: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[  481.604184] bond0: link status up for interface eno1, enabling it in 0 ms
[  481.604212] bond0: link status definitely up for interface eno1, 1000 Mbps full duplex
[  481.604213] bond0: making interface eno1 the new active one
[  481.604214] device eno1 entered promiscuous mode
[  481.605207] bond0: first active interface up!
[  481.605218] vmbr0: port 1(bond0) entered blocking state
[  481.605219] vmbr0: port 1(bond0) entered forwarding state
[  481.605553] IPv6: ADDRCONF(NETDEV_CHANGE): vmbr0: link becomes ready
[  481.738602] e1000e: eno2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[  481.812194] bond0: link status up for interface eno2, enabling it in 200 ms
[  482.020214] bond0: link status definitely up for interface eno2, 1000 Mbps full duplex
 
Why did you set the maxwait? I guess the problem comes from that: the bridge has an IP, and the bond was not ready while the bridge was already up.

From 'man bridge-utils-interfaces':
bridge_maxwait time
forces to time seconds the maximum time that the Debian bridge setup scripts will wait for the bridge ports to get to the forwarding status, doesn't allow fractional part. If it is equal to 0
then no waiting is done.
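
For illustration only (an untested sketch; the 5 seconds is just an example value), a vmbr0 stanza that lets the setup scripts wait for bond0 would look like:
Code:
auto vmbr0
iface vmbr0 inet static
address 192.168.14.14
netmask 255.255.255.0
gateway 192.168.14.1
bridge_ports bond0
bridge_stp off
bridge_fd 0
# example: wait up to 5 seconds for bond0 to reach the forwarding state
bridge_maxwait 5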
 
With or without "bridge_maxwait", nothing changes. I've removed "bridge_maxwait 0" and rebooted.
Yesterday, I came across this post: https://forum.proxmox.com/threads/a...rk-does-not-start-correctly-every-time.36677/
Might it be related?

Because I do see this error in my logs:
Code:
[   26.692423] Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
[   26.730296] vmbr0: port 1(bond0) entered blocking state
[   26.730298] vmbr0: port 1(bond0) entered disabled state
[   26.730337] device bond0 entered promiscuous mode
[   26.732534] IPv6: ADDRCONF(NETDEV_UP): vmbr0: link is not ready
[   26.738886] bond0: option mode: unable to set because the bond device is up
[   27.996310] device eno1 entered promiscuous mode
[   27.996456] bond0: Enslaving eno1 as an active interface with a down link
[   28.212236] device eno2 entered promiscuous mode
[   28.212289] bond0: Enslaving eno2 as an active interface with a down link
[   28.630590] audit: type=1400 audit(1523559940.480:8): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="/usr/bin/lxc-start" pid=8605 comm="apparmor_parser"
[   28.636013] audit: type=1400 audit(1523559940.486:9): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="lxc-container-default" pid=8609 comm="apparmor_parser"
[   28.636016] audit: type=1400 audit(1523559940.486:10): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="lxc-container-default-cgns" pid=8609 comm="apparmor_parser"
[   28.636017] audit: type=1400 audit(1523559940.486:11): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="lxc-container-default-with-mounting" pid=8609 comm="apparmor_parser"
[   28.636018] audit: type=1400 audit(1523559940.486:12): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="lxc-container-default-with-nesting" pid=8609 comm="apparmor_parser"
[   28.636020] audit: type=1400 audit(1523559940.486:13): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="lxc-container-default-with-nfs" pid=8609 comm="apparmor_parser"
[   30.905125] igb 0000:05:00.0 eno1: igb: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[   31.004035] bond0: link status up for interface eno1, enabling it in 0 ms
[   31.004061] bond0: link status definitely up for interface eno1, 1000 Mbps full duplex
[   31.004065] bond0: first active interface up!
[   31.004083] vmbr0: port 1(bond0) entered blocking state
[   31.004084] vmbr0: port 1(bond0) entered forwarding state
[   31.004457] IPv6: ADDRCONF(NETDEV_CHANGE): vmbr0: link becomes ready
[   31.258474] e1000e: eno2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[   31.316072] bond0: link status up for interface eno2, enabling it in 200 ms
[   31.524081] bond0: link status definitely up for interface eno2, 1000 Mbps full duplex
 
It looks like a timing problem, and it works with the 2 s delay.
 
Adding "options bonding max_bonds=0" to "/etc/modprobe.d/bonding.conf" did not work.

Applying the patch with "sleep 2" to the ifenslave script works. I don't really understand why, though.

Can you please elaborate on:
  • exactly which setting (i.e. "sleep 2"),
  • where in the file,
  • and in which file
resolved this issue for you?

I am currently experiencing the same issue using bond mode 6 + bridge.

Thanks in advance.
 
I am also experiencing this issue and would like to know exactly which setting you changed to make it work, and where to add the "sleep 2" line.
 
Sorry, I was referring to another post:
https://forum.proxmox.com/threads/a...rk-does-not-start-correctly-every-time.36677/

Code:
--- /etc/network/if-pre-up.d/ifenslave.orig    2018-01-31 00:39:53.408660244 +0100
+++ /etc/network/if-pre-up.d/ifenslave    2018-01-31 00:45:29.668216453 +0100
@@ -12,11 +12,15 @@
    # If the bonding module is not yet loaded, load it.
    if [ ! -r /sys/class/net/bonding_masters ]; then
        modprobe -q bonding
+        # GF20180131 Give the interface a chance to come up
+        sleep 2
    fi

    # Create the master interface.
    if ! grep -sq "\\<$BOND_MASTER\\>" /sys/class/net/bonding_masters; then
        echo "+$BOND_MASTER" > /sys/class/net/bonding_masters
+        # GF20180131 ... also with max_bonds=0
+        sleep 2
    fi
 }
 
It could be interesting to see if it works fine with the new ifupdown2 package (available in the Proxmox repo):

# apt-get install ifupdown2

It replaces all the old ifupdown bash scripts with Python code (it should be 100% compatible with the current /etc/network/interfaces syntax).
 
I was suffering from the same issue: massive amounts of dropped packets with bond0 configured. I could restart the networking service, but the issue would come back after a node restart.

I installed ifupdown2 per spirit's advice and restarted the system. The issue hasn't come back after 3 different reboots. I will keep an eye on this issue on other systems.
 
It could be interesting to see if it works fine with the new ifupdown2 package (available in the Proxmox repo):

# apt-get install ifupdown2

It replaces all the old ifupdown bash scripts with Python code (it should be 100% compatible with the current /etc/network/interfaces syntax).

After installing ifupdown2, I observed odd behavior: my bond 'lan0' was renamed to rename11, and was not configured because of that...

Rolled back to ifupdown
 
After installing ifupdown2, I observed odd behavior: my bond 'lan0' was renamed to rename11, and was not configured because of that...

Rolled back to ifupdown

This is strange; a "renamexxx" interface sounds like systemd/udev renaming (ifupdown2 doesn't magically rename interfaces).
Can you send your /etc/network/interfaces? I would like to test on my side to see where the problem is.
 
This is strange; a "renamexxx" interface sounds like systemd/udev renaming (ifupdown2 doesn't magically rename interfaces).
Can you send your /etc/network/interfaces? I would like to test on my side to see where the problem is.

Sure:

auto lo
iface lo inet loopback

iface r0p1 inet manual
iface r0p2 inet manual

iface aux0p1 inet manual
iface aux0p2 inet manual

iface lan0p1 inet manual
iface lan0p2 inet manual

auto r0
iface r0 inet static
bond-slaves r0p1 r0p2
bond-mode balance-rr
bond-miimon 100
address 172.22.255.1
netmask 255.255.255.0

auto aux0
iface aux0 inet static
bond-slaves aux0p1 aux0p2
bond-mode balance-rr
bond-miimon 100
address 172.22.254.1
netmask 255.255.255.0

auto lan0
iface lan0 inet manual
bond-slaves lan0p1 lan0p2
bond-miimon 100
bond-mode 802.3ad
bond-lacp-rate fast
xmit_hash_policy layer2+3

auto vmbr0
iface vmbr0 inet static
address 172.22.11.11
netmask 255.255.254.0
gateway 172.22.10.20
bridge_ports lan0
bridge_stp off
bridge_fd 0
pre-up sleep 2

But now, with the last 'sleep' line added, everything works with ifupdown. (Note: I had a different problem than the OP: the bond mode was not set at times.)
When I used ifupdown2, lan0 got renamed to rename11. Now the problem is gone with ifupdown.
 
But now, with the last 'sleep' line added, everything works with ifupdown. (Note: I had a different problem than the OP: the bond mode was not set at times.)
When I used ifupdown2, lan0 got renamed to rename11. Now the problem is gone with ifupdown.

How do you rename your interfaces to lan0, aux0, r0, ...?
 
How do you rename your interfaces to lan0, aux0, r0, ...?
I don't. I just define them in /etc/network/interfaces. However, I do rename the physical interfaces (r0p1, r0p2, aux0p1, ...) from their weird default names using udev rules matching their MAC addresses.
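To give an idea, the rules are just one-liners along these lines (the file name and MAC addresses here are placeholders, not my real ones):
Code:
# /etc/udev/rules.d/70-persistent-net.rules (example)
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="aa:bb:cc:dd:ee:01", NAME="r0p1"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="aa:bb:cc:dd:ee:02", NAME="r0p2"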
 
I don't. I just define them in /etc/network/interfaces. However, I do rename the physical interfaces (r0p1, r0p2, aux0p1, ...) from their weird default names using udev rules matching their MAC addresses.
OK, I'll look into udev rules together with ifupdown2 (maybe ifupdown2 is starting before the udev rename).
Also, udev rules shouldn't be used for renaming anymore; it's possible to do it with systemd directly. (I need to write documentation about this.)
I'll keep you posted on my tests.
Thanks for the report.
 
Thanks for the report.
You're welcome.
Also, udev rules shouldn't be used for renaming anymore; it's possible to do it with systemd directly.
Can you please explain why udev rules shouldn't be used anymore for renaming?
I much prefer laconic udev one-liners, kept in a single file for the whole cluster (this file is distributed between the nodes), to dealing with a heap of *.link files. Furthermore, each time you make a change you have to run `update-initramfs -u`. I don't think that's how it should be.
I hope the udev approach will still be supported!
 
Can you please explain why udev rules shouldn't be used anymore for renaming?
I much prefer laconic udev one-liners, kept in a single file for the whole cluster (this file is distributed between the nodes), to dealing with a heap of *.link files. Furthermore, each time you make a change you have to run `update-initramfs -u`. I don't think that's how it should be.
I hope the udev approach will still be supported!
Yes, it's still working, but Debian recommends using systemd network link files:
https://wiki.debian.org/NetworkInterfaceNames
It's not officially supported by the Debian team anymore:
https://www.debian.org/releases/bus...h-information.en.html#migrate-interface-names
(TL;DR: udev renaming can still work, but if you have bugs, it's your problem ;))


I'm not sure, but in your case, systemd's /lib/systemd/network/99-default.link is trying to name your interface from the PCI port/slot address, and maybe it's too long, so the interface gets the name "renameXXXX".
Then your udev rule renames it again.

Doing the rename with a custom .link file bypasses that first step from /lib/systemd/network/99-default.link.
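
A minimal sketch of such a custom .link file (the path, file name and MAC address are only placeholders):
Code:
# /etc/systemd/network/10-lan0p1.link (example)
[Match]
MACAddress=aa:bb:cc:dd:ee:03

[Link]
Name=lan0p1
As was mentioned earlier in the thread, the file also has to end up in the initramfs, so an `update-initramfs -u` is needed after changing it.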

I need to check whether ifupdown2 is started before the udev renaming.


If you have time, could you try the new ifupdown2 2.0.1 package? (There are some changes in the systemd service ordering.)
http://download.proxmox.com/debian/pve/dists/buster/pvetest/binary-amd64/ifupdown2_2.0.1-1+pve2_all.deb
 
Yes, it's still working, but Debian recommends using systemd network link files:
https://wiki.debian.org/NetworkInterfaceNames
It's not officially supported by the Debian team anymore:
https://www.debian.org/releases/bus...h-information.en.html#migrate-interface-names
(TL;DR: udev renaming can still work, but if you have bugs, it's your problem ;))
I see... Thanks for the info! OK, moving to the systemd rename.

I'm not sure, but in your case, systemd's /lib/systemd/network/99-default.link is trying to name your interface from the PCI port/slot address, and maybe it's too long, so the interface gets the name "renameXXXX".
Then your udev rule renames it again.

Doing the rename with a custom .link file bypasses that first step from /lib/systemd/network/99-default.link.

I need to check whether ifupdown2 is started before the udev renaming.

If you have time, could you try the new ifupdown2 2.0.1 package? (There are some changes in the systemd service ordering.)
http://download.proxmox.com/debian/pve/dists/buster/pvetest/binary-amd64/ifupdown2_2.0.1-1+pve2_all.deb
I'm sorry for the late answer, but in case you still need it: I've had a chance to check `ifupdown2` again on those servers after they were upgraded, and yes, all the names were correct; no 'rename11' or other inconsistent names.

UPDATE.
Actually, I posted too soon. The situation is the same with ifupdown2 2.0.1-1+pve4.
Apparently sometimes it works and sometimes it doesn't. The last time I rebooted the cluster, another interface, `r0`, appeared as `rename8`.
And that's only half of the trouble. On the cluster without a subscription, it just ignores 'post-up' statements.
 