Full mesh network using Batman protocol

mgiammarco

Feb 18, 2010
Hello,
I read the full mesh article, but for the past year I have been testing an improved solution that uses the batman routing protocol.
Batman is a layer 2 mesh routing protocol designed for wireless networks, but it works perfectly over Ethernet (I am testing on a 40G network).
So I can build a three-server, switchless, redundant configuration with fast recovery time.
The batman protocol is already in the Linux kernel, and the batctl CLI utility is packaged in Debian.
I would like to ask for official Proxmox support.
The first step, a very easy one, would be to recognize bat0, bat1 and so on as real network devices rather than "unknown" ones, so that I can edit them in the Proxmox GUI.

Thanks in advance for your interest.
Mario Giammarco
 
Can't help with your question, but I'm curious about any links you would recommend for learning about this protocol and how it is better than the full mesh setup linked from Proxmox. I'm about to implement Ceph (waiting on another 2 hard drives) and had just set up a 10G mesh as linked from Proxmox.
 
Hi,
Batman is an established protocol for mesh networks: batman
The advantages are:
- easy layer 2 configuration (no need to use layer 3)
- the source node talks directly to the destination node (no bandwidth wasted)
- very fast convergence time if a link goes down
 
batman is supported by ifupdown2, so you can use /etc/network/interfaces to configure it.
Yes, in fact I have configured it manually.
If you can validate and provide a sample working config, it should be easy to add support to the GUI.
Yes, I will check which commands I ran last year.
(Note: as an alternative, a vxlan meshed network is supported with the sdn feature too.)
I did not know about it. Two questions:
- is it production stable?
- is it complex to set up?

Batman is more than 10 years old, so it is stable, battle-tested and very easy to set up.
 
thanks !
Yes, I'm using vxlan in production, and it's really stable.
It's not complex to set up: define a vxlan interface, add peers, done :).
(In the sdn plugin, you define a vxlan zone with the different peers, then create the different bridges/vnets with tags in this zone.)

I could easily add a batman sdn plugin too. (I just need a working /etc/network/interfaces.)
 
Hi all,
the setup is just like in the link @vesalius has indicated.
My /etc/network/interfaces (omitted boring part) is:
Code:
auto eno1
iface eno1 inet manual
        mtu 9000

auto eno2
iface eno2 inet manual
        mtu 9000

auto eno3
iface eno3 inet manual
        mtu 9000

auto eno4
iface eno4 inet manual
        mtu 9000

auto ens2
iface ens2 inet manual
        mtu 9000

auto ens2d1
iface ens2d1 inet manual
        mtu 9000

...

auto bat0
iface bat0 inet static
        address 10.1.6.4/24
        pre-up /usr/sbin/batctl -m bat0 if add eno1
        pre-up /usr/sbin/batctl -m bat0 if add eno2

auto bat1
iface bat1 inet static
        address 10.1.5.4/24
        pre-up /usr/sbin/batctl -m bat1 if add ens2
        pre-up /usr/sbin/batctl -m bat1 if add ens2d1
 
Hi,
can you please point me to some documentation on using sdn in Proxmox? I think I am missing the starting point.
 
@spirit I currently use a 3-node full mesh with Ceph as below:
Node1
Code:
auto enp10s0f1
iface enp10s0f1 inet static
        address 10.15.15.18/24
        mtu 9000
        up ip route add 10.15.15.19/32 dev enp10s0f1
        down ip route del 10.15.15.19/32
#to axiom2(.19) ceph

auto enp12s0f1
iface enp12s0f1 inet static
        address 10.15.15.18/24
        mtu 9000
        up ip route add 10.15.15.20/32 dev enp12s0f1
        down ip route del 10.15.15.20/32
#to pve(.20) ceph
Node2
Code:
auto enp10s0f1
iface enp10s0f1 inet static
        address 10.15.15.19/24
        mtu 9000
        up ip route add 10.15.15.18/32 dev enp10s0f1
        down ip route del 10.15.15.18/32
#to axiom1(.18) ceph

auto enp12s0f1
iface enp12s0f1 inet static
        address 10.15.15.19/24
        mtu 9000
        up ip route add 10.15.15.20/32 dev enp12s0f1
        down ip route del 10.15.15.20/32
#to pve(.20) ceph
Node3
Code:
auto enp9s0f0
iface enp9s0f0 inet static
        address 10.15.15.20/24
        mtu 9000
        up ip route add 10.15.15.18/32 dev enp9s0f0
        down ip route del 10.15.15.18/32
#to axiom1(.18) ceph

auto enp9s0f1
iface enp9s0f1 inet static
        address 10.15.15.20/24
        mtu 9000
        up ip route add 10.15.15.19/32 dev enp9s0f1
        down ip route del 10.15.15.19/32
#to axiom2(.19) ceph

So to use a vxlan meshed network, I would do the following?
1. Remove the up ip route add and down ip route del lines from each of the mesh interfaces on every node above.
2. Create a VXLAN zone named ‘Ceph_vxlan’, with the MTU reduced by 50:
Code:
id: Ceph_vxlan
peers address list: 10.15.15.18,10.15.15.19,10.15.15.20
mtu: 8950

And that should be it? Does this vxlan meshed network offer additional redundancy over the full mesh, like the batman setup above, if one of my DAC cables or interfaces goes down?
 
I would like to have native ifupdown2 syntax, without pre-up and batctl.

https://github.com/CumulusNetworks/...s/examples/batman_adv/configure_batman_adv.sh
I'll look at the code, but I think it's something like:

Code:
auto bat0
iface bat0 inet static
        address 10.1.6.4/24
        batman-ifaces eno1 eno2

auto bat1
iface bat1 inet static
        address 10.1.5.4/24
        batman-ifaces ens2 ens2d1

Could you maybe test it?
 
Personally, I would go with a different subnet for each link to avoid routing problems, something like:

Code:
node1: eth0: 10.15.15.18 ------- node2: eth0: 10.15.15.19

node1: eth1: 10.15.16.18 ------- node3: eth0: 10.15.16.19

node2: eth1: 10.15.17.18 ------- node3: eth1: 10.15.17.19

(The IP addresses could be /31 point-to-point.)

Then, in the peer address list, add all 6 IPs. (I think the sdn plugin should handle this fine, filtering only the IPs needed by the local node.)
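
To make the per-link subnet idea concrete, here is a small Python sketch (not something from the sdn plugin) that carves one /31 out of a base range for each cable and lists the resulting 6 peer IPs. The base range 10.15.15.0/24, the node names and the function names are assumptions made up for this illustration.

```python
import ipaddress
from itertools import combinations


def plan_links(nodes, base="10.15.15.0/24"):
    """Assign one /31 subnet to every pair of nodes (one per direct cable).

    `base` is a hypothetical range used only for this example.
    """
    subnets = ipaddress.ip_network(base).subnets(new_prefix=31)
    plan = []
    for (a, b), net in zip(combinations(nodes, 2), subnets):
        ip_a, ip_b = list(net)  # a /31 holds exactly two addresses
        plan.append((a, str(ip_a), b, str(ip_b)))
    return plan


def all_peers(plan):
    """Every endpoint IP, i.e. what would go into the peer address list."""
    return [ip for _a, ip_a, _b, ip_b in plan for ip in (ip_a, ip_b)]


def remote_ips(plan, node):
    """Only the far-end IPs of the links attached to `node`."""
    remotes = []
    for a, ip_a, b, ip_b in plan:
        if a == node:
            remotes.append(ip_b)
        elif b == node:
            remotes.append(ip_a)
    return remotes


plan = plan_links(["node1", "node2", "node3"])
for a, ip_a, b, ip_b in plan:
    print(f"{a}: {ip_a}/31 ------- {b}: {ip_b}/31")
print("peers:", ",".join(all_peers(plan)))
print("node1 talks to:", remote_ips(plan, "node1"))
```

Three nodes give three cables and therefore six endpoint addresses, which matches the "all 6 IPs" above.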


Then, on top of that, if you want to have the Proxmox host IPs inside the vxlan network,
that is not yet implemented (currently sdn is mainly implemented for VMs), but you can add this in

/etc/network/interfaces
Code:
auto <vnetname>
iface <vnetname> inet static
           address X.X.X.X/X
 
So using "batman-ifaces" works fine for me. I tried it with the plain iface bat0 form from the https://github.com/CumulusNetworks/...s/examples/batman_adv/configure_batman_adv.sh example, and things work the same as with iface bat0 inet static.

My ceph full mesh network worked fine with just:

Code:
#node1
auto bat0
iface bat0 inet static
        address 10.15.15.18/24
        batman-ifaces enp10s0f1 enp12s0f1

#node2
auto bat0
iface bat0 inet static
        address 10.15.15.19/24
        batman-ifaces enp10s0f1 enp12s0f1
    
#node3
auto bat0
iface bat0 inet static
        address 10.15.15.20/24
        batman-ifaces enp9s0f0 enp9s0f1

However, speed decreased and the bat0 interface was limited to an mtu of 1500, even though each of the individual interfaces was set to 9000. On a whim I added mtu 9000 to each of the bat0 configs, but ifreload -a reported an error, and I subsequently found that batman does not support an mtu above 1500 yet. So I'll go back to the full mesh ceph network for now, until/unless I get a better understanding of the sdn suggestions from @spirit.
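
The three bat0 stanzas above differ only in the address and the member interfaces, so they are easy to generate. Here is a minimal Python sketch that just prints the text to paste into /etc/network/interfaces; the helper name bat_stanza and the nodes table are made up for this example, and only the batman-ifaces attribute comes from the tested config.

```python
def bat_stanza(name, address, members):
    """Render one ifupdown2 stanza for a batman interface as plain text."""
    lines = [
        f"auto {name}",
        f"iface {name} inet static",
        f"        address {address}",
        f"        batman-ifaces {' '.join(members)}",
    ]
    return "\n".join(lines)


# Addresses and interface names copied from the three-node config above.
nodes = {
    "node1": ("10.15.15.18/24", ["enp10s0f1", "enp12s0f1"]),
    "node2": ("10.15.15.19/24", ["enp10s0f1", "enp12s0f1"]),
    "node3": ("10.15.15.20/24", ["enp9s0f0", "enp9s0f1"]),
}

for node, (address, members) in nodes.items():
    print(f"#{node}")
    print(bat_stanza("bat0", address, members))
    print()
```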
 
Thanks for the test!

About vxlan: without the sdn feature, you can create something like this in /etc/network/interfaces:


Code:
 (10.0.0.1)  node1 (10.0.1.1)-----------(10.0.1.2)------node3
     |                                                     |
     |                                                     |
     |                                                     |
     |                                                     |
     |                                                     |
 (10.0.0.2)  node2 (10.0.2.1)----------(10.0.2.2)----------

Code:
node1
----------
auto enp10s0f1
iface enp10s0f1 inet static
        address 10.0.0.1/24
        mtu 9050


auto enp12s0f1
iface enp12s0f1 inet static
        address 10.0.1.1/24
        mtu 9050


auto vxlan2
iface vxlan2
        address 192.168.0.1/24
        vxlan-id 40000
        vxlan_remoteip 10.0.0.2  #node2
        vxlan_remoteip 10.0.1.2 #node3
        mtu 9000


node2
-----------
auto enp10s0f1
iface enp10s0f1 inet static
        address 10.0.0.2/24
        mtu 9050


auto enp12s0f1
iface enp12s0f1 inet static
        address   10.0.2.1/24
        mtu 9050


auto vxlan2
iface vxlan2
        address 192.168.0.2/24
        vxlan-id 40000
        vxlan_remoteip  10.0.0.1 #node1
        vxlan_remoteip 10.0.2.2 #node3
        mtu 9000

node3
-----------
auto enp9s0f0
iface enp9s0f0 inet static
        address  10.0.1.2/24
        mtu 9050


auto enp9s0f1
iface enp9s0f1 inet static
        address  10.0.2.2/24
        mtu 9050


auto vxlan2
iface vxlan2
        address 192.168.0.3/24
        vxlan-id 40000
        vxlan_remoteip 10.0.1.1 #node1
        vxlan_remoteip 10.0.2.1 #node2
        mtu 9000

Then you should be able to ping between the 192.168.0.X/24 IPs.

I have increased the mtu to 9050 on the physical interfaces (if they don't support it, you can instead decrease the mtu on the vxlan interface to 8950).
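
The 50 in 9050 and 8950 is the per-packet overhead of vxlan over IPv4. A short Python sketch of the arithmetic (the function names are made up for this example, and an IPv4 underlay without VLAN tags is assumed):

```python
# Bytes added in front of the inner payload when a frame is carried in vxlan
# over IPv4: outer IPv4 header + UDP header + vxlan header + the inner
# Ethernet header, which now travels as payload of the outer packet.
VXLAN_IPV4_OVERHEAD = 20 + 8 + 8 + 14  # 50 bytes


def overlay_mtu(physical_mtu):
    """Largest mtu the vxlan interface can carry on a given physical mtu."""
    return physical_mtu - VXLAN_IPV4_OVERHEAD


def physical_mtu_for(wanted_overlay_mtu):
    """Physical mtu needed to carry a given mtu on the vxlan interface."""
    return wanted_overlay_mtu + VXLAN_IPV4_OVERHEAD


print(physical_mtu_for(9000))  # physical interfaces raised to 9050
print(overlay_mtu(9000))       # or vxlan lowered to 8950
```

An IPv6 underlay has a larger outer header, so the overhead there would be bigger than 50.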
 
Thanks @spirit, I’ll look into this. I assume I would need to change the IPs of each node's Ceph monitors/managers if each link will be a separate subnet.
 
You can simply reuse your IPs 10.15.15.18/24, 10.15.15.19/24 and 10.15.15.20/24 on the vxlan interface (instead of my 192.168.0.0/24 example).
 
OK, I have checked, and unfortunately the mtu is indeed locked to 1500.
So I tried vxlan, but I discovered a big problem: vxlan is not fault tolerant!
Batman builds a full mesh, so if I detach the cable between two servers, batman reroutes packets (at layer 2) over the other path and all three servers can still ping each other.
With vxlan, if I detach the cable from server A to server B, they cannot ping each other anymore!
So vxlan is not usable at all for this.
I hope I have done something wrong in the vxlan configuration.
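
The difference can be shown with a toy model in Python. This is only a reachability check on a three-node triangle, not an implementation of either protocol. It assumes that in the vxlan case each peer address is reachable only over its own direct cable (as with the per-link subnets earlier in the thread, with no routing between them), while batman may forward through the third node. Node names A, B, C are hypothetical.

```python
from collections import deque


def reachable(links, src, dst, multihop):
    """True if dst can be reached from src over the cables that are still up.

    multihop=False: only the direct cable counts (static point-to-point peers).
    multihop=True:  traffic may be relayed by other nodes (mesh forwarding).
    """
    up = {frozenset(link) for link in links}
    if not multihop:
        return frozenset((src, dst)) in up
    seen, queue = {src}, deque([src])
    while queue:  # breadth-first search over the remaining cables
        node = queue.popleft()
        if node == dst:
            return True
        for link in up:
            if node in link:
                (other,) = link - {node}
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
    return False


triangle = [("A", "B"), ("A", "C"), ("B", "C")]
after_cut = [link for link in triangle if set(link) != {"A", "B"}]

print(reachable(after_cut, "A", "B", multihop=True))   # mesh: relayed via C
print(reachable(after_cut, "A", "B", multihop=False))  # direct-only: cut off
```

In this model the failure comes from the underlay having no alternative path, so a vxlan setup whose underlay can route around the broken cable would behave differently.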
 
@mgiammarco, FYI: since we had this conversation, a fault-tolerant Open vSwitch (OVS) mesh option that supports jumbo mtu has been added to the Full Mesh Network for Ceph Server wiki.

RSTP Loop Setup