Quick HOWTO on setting up iSCSI Multipath

Spoke with Pure, they ran an `arping` from the array. Looks like there's no flapping (after the initial ping) on the array side.
[screenshots: arping output from the array]

Going to see if support knows of any logic that prevents it from flapping.

It looks like when arping either interface, the replies return the MAC of the first interface:
[screenshot: arping replies returning the first interface's MAC]
 
[screenshots: two arping runs]

The second time it reports correctly, other than the first response again. Is the first one a broadcast?
 
The second time it reports correctly, other than the first response again. Is the first one a broadcast?
Hi, sorry for the delay. In principle, it would be good to avoid such ARP "inconsistencies" (e.g., getting replies with different MAC addresses). In my test setup, I have seen intermittent connection issues due to these -- while the issues are to some degree mitigated by multipath, it would still be better to avoid them altogether. This should be possible by some variant of source-based routing, or using VRFs [1], but I'm still trying to find the most maintainable option.
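For reference, the "source-based routing" variant would pin egress traffic from each address to its own NIC via per-address routing tables, roughly like this (a sketch only; interface names and addresses are placeholders, and this by itself does not change which interface answers ARP requests):
Bash:
# one policy-routing table per iSCSI address, so traffic sourced from each
# address always leaves via its own NIC
ip rule add from 172.16.0.200 table 101
ip route add 172.16.0.0/24 dev ens19 src 172.16.0.200 table 101
ip rule add from 172.16.0.201 table 102
ip route add 172.16.0.0/24 dev ens20 src 172.16.0.201 table 102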

Regarding your screenshots from posts #21 and #22: Did you change anything in the network configuration between taking those two screenshots? I'm asking because, as you point out, they look different (in #22, "most" replies report the MAC address of the correct interface, whereas in #21 most replies report the MAC address of the first interface). Are they both from the same arping invocation?

[1] https://docs.kernel.org/networking/vrf.html
 
Regarding your screenshots from posts #21 and #22: Did you change anything in the network configuration between taking those two screenshots? I'm asking because, as you point out, they look different (in #22, "most" replies report the MAC address of the correct interface, whereas in #21 most replies report the MAC address of the first interface). Are they both from the same arping invocation?

Yup, this is from the same arping with no changes between them. On Pure arrays only tech support has access to these commands and no changes were made by them or myself, so this is just how it behaves normally.
 
The best way to remove any ambiguity is to isolate interfaces on their own VLANs/Subnets.
If that is not possible, one should carefully review vendor's best practices.

For example:
https://support.purestorage.com/bun...opics/concept/c_linux_host_configuration.html
Code:
Note: If multiple interfaces exist on the same subnet in RHEL, your iSCSI initiator may fail to connect to Pure Storage target.
In this case, you need to set sysctl's net.ipv4.conf.all.arp_ignore to 1 to force each interface to only answer ARP requests for its own
addresses. Please see RHEL KB for Issue Detail and Resolution Steps (requires Red Hat login).

The above recommendation is generally applicable to any Linux.
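For illustration, the "own VLANs/Subnets" approach would look roughly like this in /etc/network/interfaces (a sketch; interface names and addresses are made up):
Code:
auto ens19
iface ens19 inet static
    address 172.16.0.200/24

auto ens20
iface ens20 inet static
    address 172.17.0.200/24
With each array port then addressed in exactly one of these subnets, ARP for each path stays in its own broadcast domain.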


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
The best way to remove any ambiguity is to isolate interfaces on their own VLANs/Subnets.
If that is not possible, one should carefully review vendor's best practices.

How does ESXi get around this issue? It seems to be doing something similar, but I know they probably have a lot more magic behind the curtains to do so.

I will try setting the arp_ignore and report back with the results.



We can run them on different VLANs, but this is more to see if there's a technical limitation behind this, or simply that it's easier to do and therefore the best practice.
 
How does ESXi get around this issue?
They were/are running a proprietary "Linux-like" OS and kernel. The networking is proprietary as well.

I will try setting the arp_ignore and report back with the results.
We do not run Pure Storage, so I cannot make any recommendations or guarantees regarding the applicability of this solution.

That said, most storage vendors mention the "single subnet" use case in their documentation as a special case.
I strongly recommend extensive testing in your environment. It may initially seem like everything is configured correctly, only to find "gremlins causing unexplained behavior" a few months from now.


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
I strongly recommend extensive testing in your environment. It may initially seem like everything is configured correctly, only to find "gremlins causing unexplained behavior" a few months from now.

Absolutely. We have a couple of months until migration, so we're spending a lot of time getting ahead of issues.
 
The best way to remove any ambiguity is to isolate interfaces on their own VLANs/Subnets.
If that is not possible, one should carefully review vendor's best practices.

For example:
https://support.purestorage.com/bun...opics/concept/c_linux_host_configuration.html
Code:
Note: If multiple interfaces exist on the same subnet in RHEL, your iSCSI initiator may fail to connect to Pure Storage target.
In this case, you need to set sysctl's net.ipv4.conf.all.arp_ignore to 1 to force each interface to only answer ARP requests for its own
addresses. Please see RHEL KB for Issue Detail and Resolution Steps (requires Red Hat login).

Thanks for the suggestion! I personally have not tested arp_ignore=1 yet, but it sounds like it may be a possibility.

An alternative (that allows keeping arp_ignore at its default) might be to define one VRF [1] for each path in /etc/network/interfaces and assign each iSCSI interface to its respective VRF; for example, in my setup with two iSCSI interfaces ens19 and ens20:
Code:
auto ens19
iface ens19 inet static
    address 172.16.0.200/24
    vrf path1

auto ens20
iface ens20 inet static
    address 172.16.0.201/24
    vrf path2

auto path1
iface path1
    vrf-table auto

auto path2
iface path2
    vrf-table auto
... and then use Open-iSCSI's ifaces feature with the iSCSI interfaces ens19 and ens20 as described in the guide that was posted here.
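For completeness, the iface binding itself might look roughly like this (a sketch; the portal address is a placeholder):
Bash:
# one Open-iSCSI iface record per NIC, bound via iface.net_ifacename
iscsiadm -m iface -I ens19 --op=new
iscsiadm -m iface -I ens19 --op=update -n iface.net_ifacename -v ens19
iscsiadm -m iface -I ens20 --op=new
iscsiadm -m iface -I ens20 --op=update -n iface.net_ifacename -v ens20
# discover and log in through both ifaces
iscsiadm -m discovery -t sendtargets -p 172.16.0.50 -I ens19 -I ens20
iscsiadm -m node --login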

However, please note: I've run a few tests in the setup with VRFs and so far it looks good, but these were not very realistic workloads. So @PwrBank, if you are in a position to test setups, it would be interesting to hear your experience with the arp_ignore=1 and the VRF setups.

[1] https://docs.kernel.org/networking/vrf.html
 
Here are the baseline and the arp_ignore results; I will try to get the VRF and VRF+arp_ignore results today.

Pure got back with the baseline results:
Code:
= CT1 (primary) =
 
root@der-pure-ct1:~# date ; arping -c 8 10.10.254.60 | nl
Fri Feb 21 01:01:23 PM CST 2025
1 ARPING 10.10.254.60 from 10.10.254.52 eth18
2 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:69] 0.581ms
3 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.589ms
4 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.598ms
5 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.601ms
6 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.560ms
7 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.592ms
8 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.578ms
9 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.590ms
10 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.590ms
11 Sent 8 probes (1 broadcast(s))
12 Received 9 response(s)
root@der-pure-ct1:~# date ; arping -c 8 10.10.254.61 | nl
Fri Feb 21 01:01:34 PM CST 2025
1 ARPING 10.10.254.61 from 10.10.254.52 eth18
2 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.589ms
3 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.605ms
4 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.574ms
5 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.563ms
6 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.587ms
7 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.622ms
8 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.584ms
9 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.574ms
10 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.566ms
11 Sent 8 probes (1 broadcast(s))
12 Received 9 response(s)
root@der-pure-ct1:~#
 
= CT0 (secondary) =
 
root@der-pure-ct0:~# date ; arping -c 8 10.10.254.60 | nl
Fri Feb 21 01:02:34 PM CST 2025
1 ARPING 10.10.254.60 from 10.10.254.50 eth18
2 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:69] 0.578ms
3 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.584ms
4 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.546ms
5 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.573ms
6 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.572ms
7 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.552ms
8 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.548ms
9 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.559ms
10 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.556ms
11 Sent 8 probes (1 broadcast(s))
12 Received 9 response(s)
root@der-pure-ct0:~# date ; arping -c 8 10.10.254.61 | nl
Fri Feb 21 01:02:53 PM CST 2025
1 ARPING 10.10.254.61 from 10.10.254.50 eth18
2 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.578ms
3 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.588ms
4 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.569ms
5 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.569ms
6 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.566ms
7 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.667ms
8 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.596ms
9 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.575ms
10 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.570ms
11 Sent 8 probes (1 broadcast(s))
12 Received 9 response(s)
root@der-pure-ct0:~#

Edited the sysctl config file:

Bash:
micro /etc/sysctl.conf

Added
Code:
net.ipv4.conf.all.arp_ignore = 1

Added the change to the sysctl config and applied it using the following command:
Code:
sysctl -p /etc/sysctl.conf
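
A quick sanity check that the new value is active:
Bash:
sysctl net.ipv4.conf.all.arp_ignore
# expected output: net.ipv4.conf.all.arp_ignore = 1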

After making the changes:
Code:
= CT1 =
 
root@der-pure-ct1:~# date ; arping -c 8 10.10.254.60 | nl
Fri Feb 21 02:27:53 PM CST 2025
1 ARPING 10.10.254.60 from 10.10.254.52 eth18
2 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.611ms
3 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.601ms
4 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.582ms
5 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.579ms
6 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.583ms
7 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.564ms
8 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.567ms
9 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.571ms
10 Sent 8 probes (1 broadcast(s))
11 Received 8 response(s)
root@der-pure-ct1:~# date ; arping -c 8 10.10.254.61 | nl
Fri Feb 21 02:28:15 PM CST 2025
1 ARPING 10.10.254.61 from 10.10.254.52 eth18
2 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.582ms
3 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.587ms
4 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.569ms
5 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.586ms
6 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.598ms
7 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.582ms
8 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.595ms
9 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.585ms
10 Sent 8 probes (1 broadcast(s))
11 Received 8 response(s)
root@der-pure-ct1:~#
 
= CT0 =
 
root@der-pure-ct0:~# date ; arping -c 8 10.10.254.60 | nl
Fri Feb 21 02:28:46 PM CST 2025
1 ARPING 10.10.254.60 from 10.10.254.50 eth18
2 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.585ms
3 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.560ms
4 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.584ms
5 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.585ms
6 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.582ms
7 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.596ms
8 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.570ms
9 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.568ms
10 Sent 8 probes (1 broadcast(s))
11 Received 8 response(s)
root@der-pure-ct0:~# date ; arping -c 8 10.10.254.61 | nl
Fri Feb 21 02:29:04 PM CST 2025
1 ARPING 10.10.254.61 from 10.10.254.50 eth18
2 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.575ms
3 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.568ms
4 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.580ms
5 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.576ms
6 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.585ms
7 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.580ms
8 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.565ms
9 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.596ms
10 Sent 8 probes (1 broadcast(s))
11 Received 8 response(s)
root@der-pure-ct0:~#


EDIT:

So far haven't had any issues with VMs or networking with that setting enabled. Still maxing out the dual 25GbE connection too.
[screenshot: throughput graph]
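
For reference, session and path health can be double-checked with the usual tools:
Bash:
# list active iSCSI sessions (one per portal/interface pair)
iscsiadm -m session
# show the multipath topology; all paths should be "active ready running"
multipath -ll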
 
Okay, more results. I'm not sure I'm doing the VRF correctly, so I may need help getting this tested.

But here's my notes:

Created vrf

Bash:
ip link add vrf-blue type vrf table 10
ip link set dev vrf-blue up
# unreachable default route so lookups don't leak out of the VRF
# (command and metric as suggested in the kernel VRF docs)
ip route add table 10 unreachable default metric 4278198272
# enslave both iSCSI interfaces to the same VRF
ip link set dev scsi0 master vrf-blue
ip link set dev scsi1 master vrf-blue
# allow sockets not bound to the VRF to accept connections over it
sysctl -w net.ipv4.tcp_l3mdev_accept=1
sysctl -w net.ipv4.udp_l3mdev_accept=1

Listed the vrf
Bash:
# list devices bound to the vrf
ip link show vrf vrf-blue

[screenshot: ip link show vrf vrf-blue output]

The interfaces went down for some reason after enabling the VRF
Bash:
ip -br link
[screenshot: ip -br link output showing the interfaces DOWN]

Re-enabled them using ip
Bash:
ip link set ens2f0np0 up
ip link set ens2f1np1 up
ip link set scsi0 up
ip link set scsi1 up
ip link set vrf-blue up

Able to ping from the VRF:
[screenshot: successful ping from within the VRF]

Pure responded with:
Code:
= CT1 = 
root@der-pure-ct1:~# date ; arping -c 8 10.10.254.60 | nl 
Mon Mar 3 12:50:22 PM CST 2025 
1 ARPING 10.10.254.60 from 10.10.254.52 eth18 
2 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.750ms 
3 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.744ms 
4 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.770ms 
5 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.596ms 
6 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.576ms 
7 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.612ms 
8 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.602ms 
9 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.585ms 
10 Sent 8 probes (1 broadcast(s)) 
11 Received 8 response(s) 
root@der-pure-ct1:~# date ; arping -c 8 10.10.254.61 | nl 
Mon Mar 3 12:50:33 PM CST 2025 
1 ARPING 10.10.254.61 from 10.10.254.52 eth18 
2 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.591ms 
3 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.737ms 
4 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.590ms 
5 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.571ms 
6 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.588ms 
7 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.744ms 
8 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.582ms 
9 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.726ms 
10 Sent 8 probes (1 broadcast(s)) 
11 Received 8 response(s) 
root@der-pure-ct1:~# 
 
= CT0 = 
 
root@der-pure-ct0:~# date ; arping -c 8 10.10.254.60 | nl 
Mon Mar 3 12:51:37 PM CST 2025 
1 ARPING 10.10.254.60 from 10.10.254.50 eth18 
2 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.592ms 
3 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.747ms 
4 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.737ms 
5 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.728ms 
6 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.569ms 
7 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.739ms 
8 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.565ms 
9 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.731ms 
10 Sent 8 probes (1 broadcast(s)) 
11 Received 8 response(s) 
root@der-pure-ct0:~# date ; arping -c 8 10.10.254.61 | nl 
Mon Mar 3 12:51:47 PM CST 2025 
1 ARPING 10.10.254.61 from 10.10.254.50 eth18 
2 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.752ms 
3 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.733ms 
4 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.739ms 
5 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.729ms 
6 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.728ms 
7 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.732ms 
8 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.579ms 
9 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.730ms 
10 Sent 8 probes (1 broadcast(s)) 
11 Received 8 response(s) 
root@der-pure-ct0:~#

Now disabling arp_ignore and leaving the current vrf settings
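Reverting is something like the following (plus removing the line added to /etc/sysctl.conf earlier):
Bash:
# back to the kernel default
sysctl -w net.ipv4.conf.all.arp_ignore=0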
Looks like the interfaces are back to responding with whichever MAC. Probably due to rebooting the server and losing the VRF, since the ip commands above are not persistent.
Re-added the VRF as above

Looking at the route list, it looks like it should be good to go
[screenshot: ip route list output]

Code:
= CT1 = 
 
root@der-pure-ct1:~# date ; arping -c 8 10.10.254.60 | nl 
Tue Mar 4 10:09:18 AM CST 2025 
1 ARPING 10.10.254.60 from 10.10.254.52 eth18 
2 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:69] 0.668ms 
3 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.676ms 
4 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.583ms 
5 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.745ms 
6 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.580ms 
7 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.580ms 
8 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.604ms 
9 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.585ms 
10 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.582ms 
11 Sent 8 probes (1 broadcast(s)) 
12 Received 9 response(s) 
root@der-pure-ct1:~# date ; arping -c 8 10.10.254.61 | nl 
Tue Mar 4 10:09:31 AM CST 2025 
1 ARPING 10.10.254.61 from 10.10.254.52 eth18 
2 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.583ms 
3 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.618ms 
4 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.725ms 
5 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.585ms 
6 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.749ms 
7 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.758ms 
8 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.752ms 
9 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.750ms 
10 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.590ms 
11 Sent 8 probes (1 broadcast(s)) 
12 Received 9 response(s) 
root@der-pure-ct1:~# 
 
= CT0 = 
 
root@der-pure-ct0:~# date ; arping -c 8 10.10.254.60 | nl 
Tue Mar 4 10:10:15 AM CST 2025 
1 ARPING 10.10.254.60 from 10.10.254.50 eth18 
2 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:69] 0.743ms 
3 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.752ms 
4 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.737ms 
5 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.737ms 
6 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.741ms 
7 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.731ms 
8 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.572ms 
9 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.733ms 
10 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:62] 0.704ms 
11 Sent 8 probes (1 broadcast(s)) 
12 Received 9 response(s) 
root@der-pure-ct0:~# date ; arping -c 8 10.10.254.61 | nl 
Tue Mar 4 10:10:28 AM CST 2025 
1 ARPING 10.10.254.61 from 10.10.254.50 eth18 
2 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:69] 0.731ms 
3 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.752ms 
4 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.578ms 
5 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.580ms 
6 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.728ms 
7 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.732ms 
8 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.578ms 
9 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.573ms 
10 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:62] 0.719ms 
11 Sent 8 probes (1 broadcast(s)) 
12 Received 9 response(s) 
root@der-pure-ct0:~#

I don't think I have the VRF set up correctly
 
Okay, more results. I'm not sure I'm doing the VRF correctly, so I may need help getting this tested.
Thanks for trying this and the arp_ignore=1! Manually setting up the VRF like you did may be a bit tricky; it's probably easier to let ifupdown2 handle this. My previous answer [1] on this was a bit brief, so here are some more details:

If I understand correctly, your /etc/network/interfaces, after following the guide by the original poster, defines two IP addresses in the same subnet on two different physical interfaces. Similarly, I have these two stanzas for ens19/ens20 on my test system:
Code:
auto ens19
iface ens19 inet static
    address 172.16.0.200/24

auto ens20
iface ens20 inet static
    address 172.16.0.201/24

It should be enough to add two VRF stanzas (the names don't matter much) ...
Code:
auto path1
iface path1
    vrf-table auto

auto path2
iface path2
    vrf-table auto

... and modify the ens19/ens20 stanzas to attach to these VRF by adding two vrf options:
Code:
auto ens19
iface ens19 inet static
    address 172.16.0.200/24
    vrf path1

auto ens20
iface ens20 inet static
    address 172.16.0.201/24
    vrf path2

Then reload the network config with ifreload -a. Afterwards, ens19/ens20 should be attached to their respective VRFs, and each VRF should have one route defined (to the subnet where the iSCSI portals are located), e.g.:
Code:
# ip link | egrep 'ens(19|20)'
3: ens19: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master path1 state UP mode DEFAULT group default qlen 1000
4: ens20: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master path2 state UP mode DEFAULT group default qlen 1000
# ip route show vrf path1
172.16.0.0/24 dev ens19 proto kernel scope link src 172.16.0.200
# ip route show vrf path2
172.16.0.0/24 dev ens20 proto kernel scope link src 172.16.0.201
I assume you have also set up Open-iSCSI to bind directly to the ens19/ens20 interfaces using iscsiadm -m iface, as described in the guide by the original poster.

On my test system, this setup makes the host respond to ARP requests for 172.16.0.200/172.16.0.201 on only one interface each, and with the correct MAC address -- no further tweaks are necessary (in particular, there is no need to set the tcp_l3mdev_accept/udp_l3mdev_accept sysctls).

[1] https://forum.proxmox.com/threads/quick-howto-on-setting-up-iscsi-multipath.157532/post-750326
 
Alrighty, got that set up. I followed what you posted above; here are the results:
⚠️ Note: The MAC addresses are different from the last few posts I made because I was previously using a bridge for multiple things on those ports. This time they reflect the actual MAC addresses of the ports.

Code:
= CT1 =

root@der-pure-ct1:~# date ; arping -c 8 10.10.254.60 | nl
Fri Mar 7 10:42:04 AM CST 2025
1 ARPING 10.10.254.60 from 10.10.254.52 eth18
2 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:60] 0.774ms
3 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:60] 0.583ms
4 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:60] 0.581ms
5 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:60] 0.724ms
6 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:60] 0.742ms
7 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:60] 0.592ms
8 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:60] 0.740ms
9 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:60] 0.584ms
10 Sent 8 probes (1 broadcast(s))
11 Received 8 response(s)
root@der-pure-ct1:~# date ; arping -c 8 10.10.254.61 | nl
Fri Mar 7 10:42:16 AM CST 2025
1 ARPING 10.10.254.61 from 10.10.254.52 eth18
2 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:61] 9.547ms
3 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:61] 0.732ms
4 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:61] 0.730ms
5 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:61] 0.580ms
6 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:61] 0.744ms
7 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:61] 0.743ms
8 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:61] 0.753ms
9 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:61] 0.735ms
10 Sent 8 probes (1 broadcast(s))
11 Received 8 response(s)
root@der-pure-ct1:~#

= CT0 =

root@der-pure-ct0:~# date ; arping -c 8 10.10.254.60 | nl
Fri Mar 7 10:43:03 AM CST 2025
1 ARPING 10.10.254.60 from 10.10.254.50 eth18
2 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:60] 0.584ms
3 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:60] 0.726ms
4 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:60] 0.739ms
5 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:60] 0.693ms
6 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:60] 0.583ms
7 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:60] 0.736ms
8 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:60] 0.731ms
9 Unicast reply from 10.10.254.60 [BC:97:E1:78:47:60] 0.726ms
10 Sent 8 probes (1 broadcast(s))
11 Received 8 response(s)
root@der-pure-ct0:~# date ; arping -c 8 10.10.254.61 | nl
Fri Mar 7 10:43:19 AM CST 2025
1 ARPING 10.10.254.61 from 10.10.254.50 eth18
2 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:61] 0.580ms
3 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:61] 0.577ms
4 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:61] 0.741ms
5 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:61] 0.727ms
6 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:61] 0.741ms
7 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:61] 0.569ms
8 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:61] 0.733ms
9 Unicast reply from 10.10.254.61 [BC:97:E1:78:47:61] 0.730ms
10 Sent 8 probes (1 broadcast(s))
11 Received 8 response(s)
root@der-pure-ct0:~#

So it looks like that does work as intended as well.
 
Alrighty, got that set up. I followed what you posted above; here are the results:
⚠️ Note: The MAC addresses are different from the last few posts I made because I was previously using a bridge for multiple things on those ports. This time they reflect the actual MAC addresses of the ports.

[...]

So it looks like that does work as intended as well.
Thanks for testing and reporting back! The results look good.

Is there any reason to do the VRF vs arp_ignore?
In my opinion, the VRF solution is much cleaner than the arp_ignore change: for one, the VRF configuration is more visible (if configured in /etc/network/interfaces), whereas it's easy to forget about a changed sysctl in /etc/sysctl.d. Also, regarding sysctls, and especially network-related sysctls, my general recommendation would be to stay with the defaults if possible.

So I'd say the general recommendation for iSCSI multipath would still be to have multiple disjoint subnets for the different paths. If this is not desired and assigning multiple IPs in the same subnet to the Proxmox VE nodes is necessary, the VRF solution looks cleaner than the arp_ignore change.
 