MACVTAP as future replacement of classic NIC+BRIDGE+TAP interfaces

mipsH

Renowned Member
Hello.

Are you considering a future replacement of the classic Bridge+TAP interface (for VM networking) with MACVTAP?
MACVTAP is a relatively new replacement for the TAP interface, but it also uses /dev/tapXY like the old (classic) TAP, so it can easily be used with QEMU.
Since MACVTAP can be attached directly to the physical network card (same as a bridge), we would no longer need NIC+BRIDGE+TAP, but only NIC+MACVTAP: a simpler and probably faster solution (without the extra bridge layer).
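Something like this would be all that is needed on the host (just a rough, untested sketch; "eno1" is only an example NIC name):
Bash:
# attach a MACVTAP directly to the physical NIC - no bridge layer in between
ip link add link eno1 name macvtap0 type macvtap mode bridge
ip link set macvtap0 up
# QEMU then simply opens the matching character device
ls -l /dev/tap"$(cat /sys/class/net/macvtap0/ifindex)"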

Further reading:
Source-1
Source-2
Source-3 (example: QEMU+MACVTAP)

There is also MACVLAN as a replacement for the VETH connections used with network namespaces (Linux containers).
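For containers the idea would look roughly like this (untested sketch, all names/addresses are only examples), replacing VETH+bridge with a MACVLAN moved into the container's network namespace:
Bash:
ip netns add ct100
ip link add link eno1 name mvl100 type macvlan mode bridge
ip link set mvl100 netns ct100
ip netns exec ct100 ip addr add 192.168.1.100/24 dev mvl100
ip netns exec ct100 ip link set mvl100 up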


BR,
Hrvoje.
 
The advantage would be removing one layer (the bridge interface), which should speed things up and give a "lighter" configuration :)
...as I understand from the IBM presentation, there is a speed-up in network communication between VMs on the same node:
MACVTAP-IBM-presentation.jpg
IBM comments on the improvement with MACVTAP: HERE
(There are also some possible limitations noted there, but it depends on the scenario.)
I have no time/resources to test it right now but will in the future :cool:

Some of the considerations can be found HERE.

Anyway, I support your KISS and conservative method of doing everything inside Proxmox VE :) and in many cases cite your way of implementing things as an example of good engineering.
And I understand: "why change something that is working and is proven to be stable, secure, and well implemented/tested?"

BR,
Hrvoje.
 
It looks like I find myself here, hello. I am familiar with Hyper-V and VMware, but I have been playing with Proxmox, and by God is it glorious. Why has it taken me so long to arrive here?

But yeah, it also looks like I'm stumbling here from Macvtap LANd.
@dietmar
"What is the advantage"

From IBM
"The MacVTap driver provides exceptional transactional throughput and operations/sec results (up to 10-50%) better than either of the two software bridges. Additionally, throughput of MacVTap scales up with load more quickly compared to using a software bridge. This means that MacVTap is more CPU efficient, consuming less CPU resources to complete the same amount of work. Stated another way, MacVTap can do more work using the same amount of CPU resources.

Although MacVTap is the best performing, it suffers from a couple of issues that may limit the use cases where it would be a suitable choice.

The first limitation is that MacVTap can not readily enable network communication between the KVM host and any of the KVM guests using MacVTap."


So macvtap is a kernel driver tap vs. the software bridge that was used in the past (and is still used very heavily today)... Trust me, macvtap documentation appears to be fairly sparse across the interwebs, and yet it for the most part just seems to work as advertised (in my extremely limited testing).

That being said, I believe KVM already integrates with macvtaps pretty naturally. Take the following /etc/network/interfaces config for a KVM + virt-manager setup as an example. (Note: I haven't dug into why yet, but this same type of networking setup doesn't play nicely with some custom systemd settings the Proxmox hypervisor uses ¯\_(ツ)_/¯ systemd complains it can't restart networking, but it does anyway and works like it's supposed to... I have no idea. Not really relevant here, just thought I'd mention it.)


Code:
auto lo
iface lo inet loopback

allow-hotplug enp1s0f0
auto enp1s0f0
iface enp1s0f0 inet static
  address 10.8.0.30
  netmask 255.255.240.0
  gateway 10.8.0.1
  dns-nameservers 10.80.0.5 10.10.10.10

# VLAN sub-interfaces: no host IP ("manual"), so the gateway lines below are not
# applied by ifupdown; they only document each VLAN's gateway
auto enp1s0f0.4
iface enp1s0f0.4 inet manual
  gateway 10.80.0.1
  vlan-raw-device enp1s0f0

auto enp1s0f0.10
iface enp1s0f0.10 inet manual
  gateway 10.10.10.1
  vlan-raw-device enp1s0f0

auto enp1s0f0.24
iface enp1s0f0.24 inet manual
  gateway 10.24.0.1
  vlan-raw-device enp1s0f0

auto enp1s0f0.25
iface enp1s0f0.25 inet manual
  gateway 10.25.0.1
  vlan-raw-device enp1s0f0

I'm just declaring the VLAN interfaces in the network config file here. Now if I go into virt-manager and add a new VM, I have the option to use these VLAN interfaces as a macvtap bridge, vepa, private, or passthrough device.
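For reference, the four source modes in that dropdown correspond to the macvtap "mode" parameter, roughly like this (just a sketch with example names; all four are shown on the same parent only for illustration, and passthrough actually needs exclusive use of its parent):
Bash:
ip link add link enp1s0f0.4 name mvtap-br   type macvtap mode bridge    # guests on the same parent can talk to each other
ip link add link enp1s0f0.4 name mvtap-vepa type macvtap mode vepa      # traffic hairpins through the external switch
ip link add link enp1s0f0.4 name mvtap-priv type macvtap mode private   # guests isolated from each other
ip link add link enp1s0f0.4 name mvtap-pass type macvtap mode passthru  # exclusive use of the parent device/VF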

Screen Shot 2021-03-04 at 1.00.10 PM.png

and once that VM comes online, this is what it looks like in the ip command output (on the host) for any VM that's assigned a VLAN interface:

Code:
~# ip a
...
13: enp1s0f0.4@enp1s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500...
14: enp1s0f0.10@enp1s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500...
15: enp1s0f0.24@enp1s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500...
16: enp1s0f0.25@enp1s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500...
18: macvtap5@enp1s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500...
26: macvtap1@enp1s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500...
27: macvtap2@enp1s0f0.24: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500...
28: macvtap3@enp1s0f0.25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500...

I assume virt-manager is just making the `ip link add` calls for the macvtap interfaces here, and I believe once the interfaces are created, they're just handled by the kernel until they're manually deleted... But don't quote me on that.
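If I had to guess, the equivalent manual steps would be something like this (example names taken from the output above; I'm not sure which mode virt-manager actually picks, so treat it as a sketch):
Bash:
# create the macvtap on top of a VLAN sub-interface, like macvtap2@enp1s0f0.24 above
ip link add link enp1s0f0.24 name macvtap2 type macvtap mode bridge
ip link set macvtap2 up
# the interface (and its /dev/tapN node) sticks around until it's removed explicitly
ip link delete macvtap2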

Things to note with macvtaps: as mentioned before, you cannot ping the host IP from guests on the same subnet. That's where the hairpinning comes into play. However, if the guests are on a separate VLAN or any other separately routed subnet, the traffic crosses the external router/NAT, and that serves the same purpose as the hairpinning mentioned in the docs: allowing communication between guests and host over IP.

Regardless, even though I can't ping host to guest and vice versa, virt-manager is still able to manage these machines over VNC/SPICE, so some type of communication is still taking place here. Perhaps they're riding the VLAN interface connection into the macvtap, or ignoring the tap entirely from a management perspective? I'm not sure.

I don't think this is really an either/or scenario; it would just be nice to have the option to take advantage of something that's been built into the kernel for over a decade :p

My use case here is reviving old hardware by running Proxmox on older CPUs. When I get CPU spikes, that also hits my networking. I believe this could help alleviate some of that pressure. Or not, I'm not the brightest.

That's my two cents, but yall keep rocking, I think I'm gonna start migrating everything to proxmox.
Jack


TL;DR
As far as I can tell, the major documentation on macvtap's communication issues was written around the same time cgroup networking was being built into the kernel. So it was never considered as an easy solution to that problem, and no docs have really re-addressed it since. I guess Docker kind of consumed macvtaps and everyone assumed they were just for containers ¯\_(ツ)_/¯
 
Hi, in bridged mode you can communicate with all VMs + LXCs + the host.
For the host, the only difference is that you need to add a macvtap adapter as well. You can add one with another IP or in another VLAN, or you completely replace the Linux bridge with macvtap (i.e. your Linux bridge has the Proxmox IP in a default installation, and in our case it would be the macvtap).
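A rough sketch of that host-side piece (a macvlan is the usual way to give the host itself an address next to the macvtap guests; names and the address are only examples):
Bash:
# give the host its own macvlan leg on the same NIC so it can reach the macvtap guests
ip link add link eno1 name macvlan-host type macvlan mode bridge
ip addr add 192.0.2.10/24 dev macvlan-host
ip link set macvlan-host up
# (the host IP would normally be moved here from the old vmbr0 / the NIC first)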

There are still many things I don't understand, like why the host NIC needs to be in promiscuous mode, etc...
But overall, macvtap looks like it's the future.
To be honest, macvtap looks to me exactly the same as, or at least comparable to, SR-IOV. SR-IOV splits the NIC in hardware; macvtap does something similar in software.

However, the main problem is that we cannot even test it on Proxmox; the config files won't allow it. Or does anyone know a way, or has anyone tried and used it?
 
I was playing around with ipvtap, as I see a real-world advantage here over the usual bridge setup:
VMs (and CTs for that matter) don't need a unique MAC address but share the MAC address of the physical interface. This is an issue with several hosting providers (e.g. Hetzner only allows a single MAC even if you get an IPv6 /56).

Inter-VM communication could also be considerably faster compared to a classic bridge, and l3 mode (not tested yet) should scale better (according to the macvtap/ipvtap howto).
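A quick way to see the shared-MAC property (sketch only; "eno1" is an example name and the ipvtap type needs a reasonably recent iproute2/kernel):
Bash:
ip link add link eno1 name ipvtap0 type ipvtap mode l2
ip -br link | grep -E 'eno1|ipvtap0'   # both lines report the same MAC address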

However, it indeed looks like Proxmox doesn't accept the necessary syntax, even when manually editing the qm.conf and using "-args".

On the command line it works, however. I.e. take your favorite VM and watch for the full kvm invocation (ps axwww is your friend). Then replace the network part with something like:
-netdev type=tap,fd=31,id=net0,vhost=on,vhostfd=4 31<>/dev/tap31 4<>/dev/vhost-net -device virtio-net-pci,mac=34:97:f6:5b:11:55,netdev=net0,bus=pci.0,addr=0x12,id=net0

where "/dev/tap31" matches the character node created by the vtap creation and the mac address is the same as the physical interface mac.

Before you can do that, you obviously have to create and bring up the vtap interface. For me the following did the job:
ip link add link <physical-interface-name> name ipvtap0 type ipvtap mode l2
Take note of the character device! Its number is the interface index as displayed by "ip link show".
Bring up the interface with the IP (yes, you need to configure this on the Proxmox host, at least for ipvtap; otherwise the host doesn't know where to route which packets):
ifconfig ipvtap0 <guest-ip> netmask <guest-mask> up
 
As far as I can see the only thing that prevents semi-integrated usage in proxmox is the fact that "-args" in qm.conf are quoted. This makes the file descriptor creation (i.e. <>/dev/tap31 in above example) fail.

However, file descriptors from the command-line qm invocation are passed on to kvm, so this qm.conf entry works for me:
Code:
args: -netdev 'type=tap,fd=10,id=net0,vhost=on' -device 'virtio-net-pci,mac=xx:xx:xx:xx:xx:xx,netdev=net0,bus=pci.0,addr=0x12,id=net0'
when starting the vm like:
Bash:
qm start vmid 10<>/dev/tapid
so in the above example (with the tap index being "31", which can be derived from /sys/class/net/<tapname>/ifindex):
Bash:
qm start vmid 10<>/dev/tap31

(As far as I can tell the fd number is totally arbitrary; it just has to be the same in qm.conf and when creating the fd on the shell. You should probably stay away from 0, 1 and 2, and I have seen "3" being used as well.)
 
Another advantage of macvtap (if the above is not enough) is the ability to passthrough a VF of a network device (or an entire network device) without actual pci-passthrough, allowing migration capabilities. See https://access.redhat.com/documenta...Physical_interface_delivery_modes-passthrough or https://wenchma.github.io/2016/09/07/macvtap-deep-dive.html#title5 for details.
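As a sketch of what that delivery mode looks like on the host (the VF netdev name is only an example and depends on the NIC/driver):
Bash:
# create VFs on an SR-IOV capable NIC, then hand one VF to a guest via macvtap passthru
echo 2 > /sys/class/net/enp1s0f0/device/sriov_numvfs
ip link add link enp1s0f0v0 name macvtap-vf0 type macvtap mode passthru
ip link set macvtap-vf0 up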

So I hope these are enough good reasons why making macvtap/ipvtap usable in Proxmox would improve the user experience @dietmar :)

If you are generally willing to accept this, I could look into creating a PR.
 

It took me a while to research the subject, but it turns out that both MACVTAP in passthru mode and IPVLAN would be solutions to my issue described here:
https://forum.proxmox.com/threads/t...arently-relay-to-host-nic.113600/#post-490741

So if there's one incentive to add support for this, it is being able to use Proxmox as an L1 hypervisor and host a firewall in an L2 VM wherever the hosting provider does not allow multiple MAC addresses per L1 VF. EDIT: which, I now realize, you also mentioned as a known issue this would solve.
 
I've just come across the same question. At the moment I am using KVM virtualisation with QEMU and want to switch to Proxmox.
However, I am surprised that MACVTAP is not natively supported.

Is it, after the 3 years since this thread was started, still the only option to hack into each VM config manually in order to get MACVTAP set up?
 
