2-node cluster using iSCSI multipath to a Dell ME5012 with 4 x 10GbE ports - slow iSCSI performance?

robk

New Member
Jan 30, 2025
Hi All,

I am new to setting up iSCSI multipath, but I did manage to get everything working as far as I can tell. I am getting about 2400Mbps read using fio on the VMs. The ME5 is configured for RAID10 with 6 disks. Jumbo frames are enabled, and I confirmed that at least node to node I am getting 10Gig speeds.

I expected more performance, but things seem sluggish.

Each node shows 8 sessions available, as each has 2 x 10Gig NICs. They are configured on their own subnets as well.
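
For reference, this is roughly how I've been verifying the sessions and paths (assuming the standard open-iscsi and multipath-tools commands):

# list the active iSCSI sessions (I see 8 per node: 2 NICs x 4 target ports)
iscsiadm -m session

# show the multipath device, its path groups and per-path status
multipath -ll

# watch whether I/O actually spreads across all the sd* paths during a test
iostat -xm 2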

If anyone has any info to help, I would appreciate it.

Cheers,
Rob

---

multipath.conf:

defaults {
    find_multipaths strict
    polling_interval 2
    path_selector "round-robin 0"
    path_grouping_policy multibus
    uid_attribute ID_SERIAL
    rr_min_io 100
    failback immediate
    no_path_retry queue
    user_friendly_names yes
}

devices {
    device {
        vendor "DellEMC"
        product "ME5"
        path_grouping_policy "group_by_prio"
        path_checker "tur"
        hardware_handler "1 alua"
        prio "alua"
        failback immediate
        rr_weight "uniform"
        path_selector "service-time 0"
    }
}

multipaths {
    multipath {
        wwid "3600c0ff000fc850722869a6701000000"
        alias mpath0
    }
}
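
When I change multipath.conf I reload it like this (standard multipath-tools commands, if I have them right):

# re-read multipath.conf and rebuild the maps
systemctl reload multipathd
# or equivalently:
multipathd reconfigure

# confirm the DellEMC/ME5 device section was actually picked up
multipathd show config | grep -A 12 ME5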
 
Sluggish compared to what? Was it faster before multipathing and jumbo frames? It's hard to know where to look for bottlenecks, but in my experience jumbo frames often bring jumbo problems, so off the top of my head that's the first thing I would check. Maybe set the MTUs back to 1500 and get a baseline? Then step through your LAN devices to troubleshoot? Is there a benchmark you are shooting for on disk I/O?
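
Something like this is a quick way to baseline the MTU and to check whether jumbo frames actually make it end to end (the interface name and target IP below are just placeholders):

# temporarily drop the storage NIC back to 1500 for a baseline
ip link set dev eth2 mtu 1500

# or, if staying at 9000, verify a full-size frame reaches the target without fragmenting
# (8972 = 9000 minus 20 bytes IP header and 8 bytes ICMP header)
ping -M do -s 8972 -c 4 10.10.10.1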
 
Thanks for replying, dj423. I did do a baseline at 1500 and it is always the same: 2400Mbps, and only to the storage. Running iperf from node to node I get ~10Gig. I will do a baseline again just to be sure. I was expecting at least 10Gbps speeds, if not more, since I am using multipath.
 
Yeah, it's pretty challenging to fully saturate 10Gb links, bonded or not, just due to the limitations of the storage. Keeping everything on the same layer 2 network helps the most in my experience, since any kind of routing is expensive for iSCSI targets.
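
A quick sanity check that the iSCSI traffic isn't taking a routed hop (substitute your portal IP):

# should show a direct route out of the storage NIC, with no "via <gateway>" in the output
ip route get 10.10.10.1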
 
Multipath isn't active to the other controller as long as the primary controller that owns the raidset is online.
What kind of fio test did you run, random 4k read? 2400Mbps is 300MB/s, which from 6 HDDs isn't bad.
 
The ME5 only has one controller, but four 10Gig ports. The disks are HGST 12TB 3.5" 7200RPM SAS3 12Gb/s with 256MB cache. fio results below:

Commands used:

fio --filename=/dev/mapper/mpath0 --direct=1 --rw=read --bs=1m --size=20G --numjobs=200 --runtime=60 --group_reporting --name=file1

fio --filename=/dev/mapper/mpath0 --direct=1 --rw=write --bs=1m --size=20G --numjobs=200 --runtime=60 --group_reporting --name=file1

MTU: 1500

READ: bw=2241MiB/s (2350MB/s), 2241MiB/s-2241MiB/s (2350MB/s-2350MB/s), io=132GiB (141GB), run=60090-60090msec

WRITE: bw=575MiB/s (603MB/s), 575MiB/s-575MiB/s (603MB/s-603MB/s), io=34.0GiB (36.5GB), run=60542-60542msec

MTU: 9000

READ: bw=1827MiB/s (1916MB/s), 1827MiB/s-1827MiB/s (1916MB/s-1916MB/s), io=107GiB (115GB), run=60104-60104msec

WRITE: bw=523MiB/s (549MB/s), 523MiB/s-523MiB/s (549MB/s-549MB/s), io=30.9GiB (33.1GB), run=60414-60414msec
 
So you measured cache effects on the read: 2350 MB/s / 6 HDDs = 391 MB/s per 12TB 7200rpm HDD, which isn't possible.
Writes are ideal for a 6-disk RAID10, as that's 603 MB/s / 3 mirrors = 201 MB/s per 12TB HDD.
And yeah... with today's network chips, fiddling with the MTU makes no real difference, as seen here again - which is fine too!
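
If you want to see the disks instead of the controller cache, a random read over the whole LUN should do it - just a sketch, adjust iodepth and runtime as you like:

fio --filename=/dev/mapper/mpath0 --direct=1 --rw=randread --bs=4k --ioengine=libaio --iodepth=32 --numjobs=4 --runtime=60 --time_based --group_reporting --name=rand4k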
 
Thanks, waltar. When full-cloning a template, it consistently takes 30 minutes for a 128GB VM. Is this expected?
 
I started with a clean slate. Everything looks to be set up correctly, and a 128GB VM still takes 30 minutes to clone. I believe it is because the disk is not thin provisioned: the LVM created on top of iSCSI, as far as I know, cannot be thin provisioned, which means a lot of zeroes must be getting written for the full 128G. That's the best explanation I have so far.
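
For what it's worth, this is how I checked whether the volumes are thin or thick (plain LVM tools; the VG/LV names shown will be from my setup):

# list the volume group that sits on top of the iSCSI LUN
vgs

# lv_attr starting with 't' = thin pool, 'V' = thin volume, '-' = plain thick LV
lvs -o lv_name,vg_name,lv_size,lv_attr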

Migrating between nodes is fast, I assume because only pointers/metadata get rewritten and no data is actually copied or moved, since it is all on the same shared storage. The fio tests are the same as before as well.

Writes are around 500MB/s - I assume because technically only one disk is being written to, then the data gets distributed?
Reads are around 2.4GB/s - I assume because reads can be done on multiple drives simultaneously.

Please let me know if I got that wrong.

I appreciate any info you all can share. I will be putting this system into production next week and just want to make sure it's the best it can be.

Cheers,
Rob
 
Did you configure your disks inside the ME5 in "virtual" or "linear" mode, and with 1 or 3 disk groups in that mode?