Are not both correct?
There is data transfer between ME5024 controllers, used for cache coherency and other controller-to-controller communication, but it is also true that only one controller owns a disk group's I/O at a time.
@walter, if I am not mistaken, the statement below is only correct when using path_grouping_policy multibus, and does not apply when using group_by_prio. I did not test multibus with redundant cabling/switches/controllers, though.
RoundRobin access of ME4/5 + MSA20xy volumes is a really bad idea, as each volume is owned by the controller it is assigned to. Accessing a volume via the redundant path of the other controller just forwards the request from ctrl A to B over the internal link, and the answer comes back over the internal link again, which gives you the maximum possible latency.
I'm led to believe that one has to set path_grouping_policy multibus for the paths to both controller A and B to land in the same priority group; only then would path_selector round-robin 0 attempt to use the non-preferred paths, which is what produces the undesirable performance/configuration.
When using the recommended path_grouping_policy group_by_prio, with path priorities set from ALUA, path_selector round-robin 0 is valid: it round-robins only across the preferred (owning-controller) paths and leaves the less-preferred paths in a second group, mainly for failover, which keeps the configuration both redundant and performant.
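For reference, a minimal multipath.conf device-section sketch of that layout. The vendor/product strings and the timeout values are assumptions for an ME5-class array; check them against `multipathd show config` and Dell's host setup guide rather than taking them as-is:

```
# /etc/multipath.conf -- sketch only, not a verified Dell-recommended config
devices {
    device {
        vendor                "DellEMC"          # assumed vendor string, verify on your array
        product               "ME5"              # assumed product string, verify on your array
        path_grouping_policy  "group_by_prio"    # split paths into preferred / non-preferred groups
        prio                  "alua"             # group priority taken from ALUA (owning controller wins)
        hardware_handler      "1 alua"
        path_selector         "round-robin 0"    # or "service-time 0", Dell's recommendation
        path_checker          "tur"
        failback              "immediate"        # return to the preferred group as soon as it recovers
        no_path_retry         30                 # assumed value, tune to taste
    }
}
```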
In my own testing, using group_by_prio with either service-time 0 or round-robin 0 did not result in the non-preferred paths to controller B being used at all, unless I downed every path in the preferred group by downing NIC interfaces, pulling cables, taking a controller down, or similar.
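That also matches how the paths are presented: under group_by_prio with the alua prioritizer, `multipath -ll` lists two path groups and only the higher-priority one is active. Purely illustrative output (made-up WWID, device names and sizes), roughly what it looks like:

```
mpatha (3600c0ff000xxxxxxxxxxxxxxxxxxxxxx) dm-2 DellEMC,ME5
size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=50 status=active    <- preferred paths (owning controller)
| |- 12:0:0:1 sdb 8:16  active ready running
| `- 13:0:0:1 sdd 8:48  active ready running
`-+- policy='round-robin 0' prio=10 status=enabled   <- non-preferred paths, used only on failover
  |- 14:0:0:1 sdf 8:80  active ready running
  `- 15:0:0:1 sdh 8:112 active ready running
```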
FWIW to anybody else: this morning, after further testing of path_selector round-robin 0 (not recommended by Dell) as opposed to service-time 0 (recommended by Dell), I saw that maximum latency spikes were somewhat reduced. round-robin 0 helped a specific workload that had erratic performance and did not introduce new performance issues (as far as my setup/testing went).
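If anyone wants to A/B test the two selectors themselves, the switch is just an edit to path_selector in /etc/multipath.conf followed by a reconfigure; no reboot was needed in my case:

```
# newer multipath-tools accept the command directly; older versions need: multipathd -k'reconfigure'
multipathd reconfigure   # or: systemctl reload multipathd.service
multipath -ll            # confirm the policy= shown on each path group has changed
```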
The reduced latency was seen when performing in-guest data transfers to/from the same local disk (LVM, between 2 and 4 iSCSI 10GBASE-T paths, NTFS 4K allocation unit, VM disk cache off as per default). It did not show when using a single path. In the end, for prod settings I have decided to go with VM disk cache write-back, 4 x 10GBASE-T paths, and an NTFS 16K allocation unit for any Windows VMs whose NTFS disks need to be SMB-shared to many users. Ranked from largest to smallest impact: VM disk cache mode, then NTFS allocation unit size, and lastly path_selector round-robin 0.
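For anyone following along, the 16K allocation unit has to be chosen when the NTFS volume is formatted inside the Windows guest. A hypothetical example (drive letter D: is assumed, and this wipes the volume):

```
# PowerShell inside the Windows VM -- destructive, reformats the volume
Format-Volume -DriveLetter D -FileSystem NTFS -AllocationUnitSize 16384
```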