How to fix this: received packet on bond0 with own address as source address?

spirit

Famous Member
Apr 2, 2010
5,696
642
133
www.odiso.com
@albert_a

For ifupdown2, you could try adding the following in

/lib/systemd/system/networking.service
"
ExecStartPre=/usr/sbin/udevadm settle
"

Just before
"
ExecStart=/sbin/ifup -a
"

It should force the renaming of interfaces to finish before the interfaces are started.
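
A minimal sketch of the edit described above, treated purely as a text transformation so it can be tried on a copy of the unit file first. It assumes the unit contains the exact `ExecStart=/sbin/ifup -a` line quoted above. The function name `add_settle_before_ifup` is hypothetical, and the sample unit text is illustrative only, not the real content of networking.service.

```python
SETTLE_LINE = "ExecStartPre=/usr/sbin/udevadm settle"
IFUP_LINE = "ExecStart=/sbin/ifup -a"


def add_settle_before_ifup(unit_text: str) -> str:
    """Return unit_text with SETTLE_LINE inserted just before IFUP_LINE.

    Hypothetical helper: it only edits text and never touches systemd.
    The text is returned unchanged if the settle line is already present
    or if the expected ExecStart line is not found.
    """
    lines = unit_text.splitlines()
    stripped = [line.strip() for line in lines]
    if SETTLE_LINE in stripped or IFUP_LINE not in stripped:
        return unit_text
    index = stripped.index(IFUP_LINE)
    lines.insert(index, SETTLE_LINE)
    return "\n".join(lines) + "\n"


# Illustrative sample only, not the real content of networking.service.
sample = "[Service]\nType=oneshot\nExecStart=/sbin/ifup -a\n"
patched = add_settle_before_ifup(sample)
print(patched)
```

After changing a unit file, `systemctl daemon-reload` is needed for systemd to pick it up. Files under /lib/systemd/system belong to the package and can be replaced on upgrade, so the change may need to be reapplied.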
 
Mar 22, 2018
33
6
28
39
@spirit Thanks for the suggestion!
But as I said, the first problem is that it happens only occasionally, so I may think everything is OK, and then one fine day the cluster will not start up.
The second thing I mentioned is that ifupdown2 ignores 'post-up' statements on a no-subscription cluster.
So... no more trust in ifupdown2.
Is there any reason to switch to ifupdown2 (other than applying changes without a reboot)? Is ifupdown deprecated?
 

spirit

ifupdown is not deprecated, but a lot of new features are coming for ifupdown2 (vxlan, ...).
So I'm trying to get it really stable.
I asked you about udevadm settle because ifupdown1 runs it before starting and ifupdown2 does not.

About the post-up: can you send your /etc/network/interfaces with the post-up lines?
 

Hyacin

Member
May 6, 2020
27
3
8
41
Ok, so I'm getting a flood of this same error, and then a Proxmox reboot, on all three of my nodes when I reboot my router?!??

I tried ifupdown2 and the problem continued. I added the sleeps to ifenslave (which helped my bonds pick up the interfaces that were being left out) - and I've confirmed both ends are set for 802.3ad/LACP. I'm stumped.

The only thing left I can think to do is disable untagged VLAN 1 on the member ports facing Proxmox boxes, which is not an idea I love, and really shouldn't be required - it also makes zero sense that that would trigger it, but only when I reboot my router (on a LACP LAG off the same switch) - but it's all I've got left.

My playtime is up and the network is in use again, so I'll have to give it a try another night. Any thoughts/insight would be greatly appreciated in the meantime.
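
One way to check which interfaces a bond actually picked up, and in which mode, is to read the bonding driver's status file, /proc/net/bonding/bond0. A rough Python sketch, assuming the usual `Bonding Mode:` and `Slave Interface:` lines in that file. The function name `parse_bond_status` and the sample text are illustrative, not output from the systems in this thread.

```python
def parse_bond_status(text: str) -> dict:
    """Extract the bonding mode and member interfaces from bond status text."""
    mode = None
    slaves = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without a "key: value" shape
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            mode = value
        elif key == "Slave Interface":
            slaves.append(value)
    return {"mode": mode, "slaves": slaves}


# Illustrative sample; on a real host read /proc/net/bonding/bond0 instead.
sample = """Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up

Slave Interface: eno1
MII Status: up

Slave Interface: eno2
MII Status: up
"""
status = parse_bond_status(sample)
print(status)
```

If an expected member is missing from the `slaves` list after boot, the bond came up without it.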
 

spirit

What is your router model?

This error should mean that a packet sent by your host is coming back to your host (for example, the switch is flooding to all ports, or there is a network loop).

Maybe it could be a router software bug when you reboot it. (Have you tried a hard power-off of your router to compare?)
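
To see which interface the returning packets arrive on and how often, the kernel log can be scanned for this message. A small Python sketch, assuming log lines contain the phrase from the thread title. The function name `count_own_address_warnings` and the sample lines are hypothetical, and real log lines may carry extra fields (timestamps, addresses) around the phrase.

```python
import re
from collections import Counter

# Matches the phrase from the thread title and captures the interface name.
PATTERN = re.compile(
    r"received packet on (\S+) with own address as source address"
)


def count_own_address_warnings(lines):
    """Count matching warnings per receiving interface."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines, not taken from a real log.
sample = [
    "vmbr0: received packet on bond0 with own address as source address",
    "vmbr0: received packet on bond0 with own address as source address",
    "some unrelated kernel message",
    "vmbr0: received packet on eno3 with own address as source address",
]
counts = count_own_address_warnings(sample)
print(counts.most_common())
```

The input lines could come from the output of `dmesg` or `journalctl -k`. A count that jumps only while the router reboots would point at the loop or flooding explanation above.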
 

Hyacin

spirit said: "What is your router model?"

Asus RT-AC5300 w/ Merlin WRT 384.19

spirit said: "This error should mean that a packet sent by your host is coming back to your host (for example, the switch is flooding to all ports, or there is a network loop)."

Yeah, but that shouldn't make Debian/Proxmox crash and burn, should it?

spirit said: "Maybe it could be a router software bug when you reboot it. (Have you tried a hard power-off of your router to compare?)"

Oooo, this sounds like a very good test. I'll give it a try as soon as I can and report back!
 

Hyacin

"If it's really a loop, with network amplification, it could overload the corosync process. (I have seen that once; as corosync only uses one core, it was at 100%.) If HA is enabled, it could reboot the hosts."

Alright, just had to reboot the router due to flaky internet, so I took the opportunity to test this theory and sadly, all four boxes again have 1 minute of uptime when I get back online :-/

One of them has a bond but only one link even (for easy switch to dual links when I get around to ordering a USB NIC) - maybe I'll try taking it out of the bond configuration and see if that one still goes down.
 

Hyacin

I'll let you guess which one does not have a bond, lol -

[Attached image: post-router-reboot.png]

I'm going to try disabling the untagged VLAN 1 on the member ports of one of the other units and then try another reboot tonight to see if that fixes it.
 
