Hello all,
I have stumbled on a very weird problem while trying to set up a Mellanox InfiniBand ConnectX-5 card in Proxmox on a Dell EMC R640 server. I am trying to put it in a bridge so that I can pass it to a VM running on the server.
What I have managed so far is to install the drivers, configure the card, and get it to ping the switch. The setup is as follows:
1) Mellanox InfiniBand switch with IP 100.100.0.1.
2) Dell server with the InfiniBand card installed and IP 100.100.0.11. As you can see, the card is UP, has an IP, and can ping the switch:
Code:
root@thor1:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether b0:26:28:c5:d6:2a brd ff:ff:ff:ff:ff:ff
3: enp25s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr0 state UP group default qlen 1000
link/ether b0:26:28:c5:d6:2c brd ff:ff:ff:ff:ff:ff
4: enp25s0f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether b0:26:28:c5:d6:2d brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether b0:26:28:c5:d6:2b brd ff:ff:ff:ff:ff:ff
6: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc mq state UP group default qlen 256
link/infiniband 20:00:05:01:fe:80:00:00:00:00:00:00:b8:59:9f:03:00:c3:0e:76 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
inet 100.100.0.11/24 scope global ib0
valid_lft forever preferred_lft forever
inet6 fe80::ba59:9f03:c3:e76/64 scope link
valid_lft forever preferred_lft forever
7: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether b0:26:28:c5:d6:2c brd ff:ff:ff:ff:ff:ff
inet 10.10.0.11/24 scope global vmbr0
valid_lft forever preferred_lft forever
inet6 fe80::b226:28ff:fec5:d62c/64 scope link
valid_lft forever preferred_lft forever
root@thor1:~# ping 100.100.0.1
PING 100.100.0.1 (100.100.0.1) 56(84) bytes of data.
64 bytes from 100.100.0.1: icmp_seq=1 ttl=64 time=0.613 ms
64 bytes from 100.100.0.1: icmp_seq=2 ttl=64 time=0.320 ms
64 bytes from 100.100.0.1: icmp_seq=3 ttl=64 time=0.210 ms
^C
--- 100.100.0.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 49ms
rtt min/avg/max/mdev = 0.210/0.381/0.613/0.170 ms
Now that I know the card works, I want to create a bridge with it so I can give it to a VM I want to create. The reason is that I want to handle the drivers on the Proxmox bare metal and not have to worry about Mellanox driver issues when I update the VM.
So I create a bridge from the GUI (the same thing happens if I do it by editing the interfaces file).
As you can see, the bridge (vmbr1) is active and the InfiniBand card (ib0) is also active. However, it does not work:
Code:
root@thor1:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether b0:26:28:c5:d6:2a brd ff:ff:ff:ff:ff:ff
3: enp25s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr0 state UP group default qlen 1000
link/ether b0:26:28:c5:d6:2c brd ff:ff:ff:ff:ff:ff
4: enp25s0f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether b0:26:28:c5:d6:2d brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether b0:26:28:c5:d6:2b brd ff:ff:ff:ff:ff:ff
6: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 256
link/infiniband 20:00:05:01:fe:80:00:00:00:00:00:00:b8:59:9f:03:00:c3:0e:76 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
inet6 fe80::ba59:9f03:c3:e76/64 scope link
valid_lft forever preferred_lft forever
7: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether b0:26:28:c5:d6:2c brd ff:ff:ff:ff:ff:ff
inet 10.10.0.11/24 scope global vmbr0
valid_lft forever preferred_lft forever
inet6 fe80::b226:28ff:fec5:d62c/64 scope link
valid_lft forever preferred_lft forever
10: vmbr1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 16:13:09:5b:82:d8 brd ff:ff:ff:ff:ff:ff
inet 100.100.0.11/24 scope global vmbr1
valid_lft forever preferred_lft forever
inet6 fe80::1413:9ff:fe5b:82d8/64 scope link
valid_lft forever preferred_lft forever
root@thor1:~# ping 100.100.0.1
PING 100.100.0.1 (100.100.0.1) 56(84) bytes of data.
^C
--- 100.100.0.1 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 22ms
If I investigate a bit, systemctl gives me this:
Code:
root@thor1:~# systemctl status networking.service
● networking.service - Network initialization
Loaded: loaded (/lib/systemd/system/networking.service; enabled; vendor preset: enabled)
Active: active (exited) since Thu 2020-07-23 22:06:55 EEST; 1min 1s ago
Docs: man:interfaces(5)
man:ifup(8)
man:ifdown(8)
Process: 42523 ExecStart=/usr/share/ifupdown2/sbin/start-networking start (code=exited, status=0/SUCCESS)
Main PID: 42523 (code=exited, status=0/SUCCESS)
Jul 23 22:06:54 thor1 systemd[1]: Starting Network initialization...
Jul 23 22:06:54 thor1 networking[42523]: networking: Configuring network interfaces
Jul 23 22:06:54 thor1 networking[42523]: warning: vmbr1: skipping port ib0, invalid ether addr
Jul 23 22:06:55 thor1 systemd[1]: Started Network initialization.
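In case it is relevant, the link-layer type the kernel assigns to each interface can be read from sysfs. As far as I understand, 1 means Ethernet (ARPHRD_ETHER) and 32 means InfiniBand (ARPHRD_INFINIBAND), which might be related to the "invalid ether addr" warning above:

```shell
# Print the kernel link-layer type of every network interface.
# 1 = ARPHRD_ETHER (Ethernet), 32 = ARPHRD_INFINIBAND, 772 = loopback.
for dev in /sys/class/net/*; do
    printf '%-12s type=%s\n' "$(basename "$dev")" "$(cat "$dev/type")"
done
```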
Following some threads on this forum, it was suggested that ib0 should first be added to a bond, and then the bond added to a bridge. I tried that too: when I create bond0, I can use it to ping the switch, but as soon as I add bond0 to the vmbr1 bridge I get the same result as above. I do not understand what is happening. Unless I am completely misreading the documentation, my settings should work.
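For reference, the bond attempt looked roughly like this in /etc/network/interfaces (reconstructed from memory, so treat it as a sketch; I used active-backup because, as far as I could tell, that is the only bond mode supported for IPoIB):

```
auto bond0
iface bond0 inet manual
        bond-slaves ib0
        bond-mode active-backup
        bond-miimon 100

auto vmbr1
iface vmbr1 inet static
        address 100.100.0.11/24
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
```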
For reference, here is the /etc/network/interfaces file:
Code:
root@thor1:~# cat /etc/network/interfaces
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!
auto lo
iface lo inet loopback
auto enp25s0f0
iface enp25s0f0 inet manual
iface enp25s0f1 inet manual
iface eth1 inet manual
iface eth3 inet manual
iface eth0 inet manual
iface eth5 inet manual
auto ib0
iface ib0 inet manual
auto vmbr0
iface vmbr0 inet static
address 10.10.0.11/24
gateway 10.10.0.254
bridge-ports enp25s0f0
bridge-stp off
bridge-fd 0
#10Gb SFP+
auto vmbr1
iface vmbr1 inet static
address 100.100.0.11/24
bridge-ports ib0
bridge-stp off
bridge-fd 0
Can anyone help me figure out what is happening?