Poor read performance with iSCSI and Dell ME5

Xoxxoxuatl · Jan 8, 2026

Hello,

we are currently evaluating PVE as a possible solution for our new environment, and I have noticed read performance problems that I cannot resolve. Maybe someone can help me with this...

HARDWARE
Server - R7525 (4x 25 Gbit for iSCSI and 2x 25 Gbit for VM traffic)
Switches - 2 x S5248-ON
Storage - Dell ME5024 with SSDs (4 ports per controller)

NETWORK
Server - 2 ports are connected to VLAN 4011 (iSCSI) and 2 ports to VLAN 4012 (iSCSI)
Storage - each controller (A/B) has 2 ports in VLAN 4011 and 2 ports in VLAN 4012

CONFIG
The MTU was set to 9000, and a 2 TB LUN on RAID 6 was mapped to the host.

/etc/multipath.conf

Code:

defaults {
    user_friendly_names no
    find_multipaths yes
    polling_interval 10
    no_path_retry queue
}

blacklist {
    devnode "^sda$"
}

devices {
    device {
        vendor "DellEMC"
        product "ME5"
        path_grouping_policy group_by_prio
        path_checker "tur"
        hardware_handler "1 alua"
        prio "alua"
        failback immediate
        rr_weight "uniform"
        path_selector "service-time 0"
    }
}

multipath -ll

Code:

3600c0ff000f949ac64435e6901000000 dm-15 DellEMC,ME5
size=1.8T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 14:0:0:6 sdav 66:240 active ready running
| |- 13:0:0:6 sdat 66:208 active ready running
| |- 18:0:0:6 sdbe 67:128 active ready running
| `- 15:0:0:6 sdbc 67:96  active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 12:0:0:6 sdao 66:128 active ready running
  |- 16:0:0:6 sdaw 67:0   active ready running
  |- 19:0:0:6 sdbd 67:112 active ready running
  `- 17:0:0:6 sdbf 67:144 active ready running

TEST
The write tests were quite OK. What may point to a problem are the read tests, which repeatedly show far too little performance.

Bash:

fio --filename=/dev/mapper/3600c0ff000f949ac64435e6901000000 --direct=1 --rw=write --bs=1M --ioengine=libaio --iodepth=32 --runtime=60 --time_based --group_reporting --name=seq_write_test

seq_write_test: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=32
fio-3.39
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][w=3347MiB/s][w=3347 IOPS][eta 00m:00s]
seq_write_test: (groupid=0, jobs=1): err= 0: pid=512685: Thu Jan  8 08:41:47 2026
  write: IOPS=3329, BW=3330MiB/s (3492MB/s)(195GiB/60009msec); 0 zone resets
    slat (usec): min=35, max=2799, avg=71.69, stdev=43.40
    clat (usec): min=1292, max=378743, avg=9537.44, stdev=13525.75
     lat (usec): min=1370, max=378806, avg=9609.13, stdev=13525.45
    clat percentiles (msec):
     |  1.00th=[    4],  5.00th=[    5], 10.00th=[    6], 20.00th=[    6],
     | 30.00th=[    7], 40.00th=[    7], 50.00th=[    8], 60.00th=[    9],
     | 70.00th=[   10], 80.00th=[   12], 90.00th=[   15], 95.00th=[   17],
     | 99.00th=[   22], 99.50th=[   27], 99.90th=[  215], 99.95th=[  218],
     | 99.99th=[  222]
   bw (  MiB/s): min= 2862, max= 4176, per=100.00%, avg=3330.38, stdev=162.67, samples=120
   iops        : min= 2862, max= 4176, avg=3330.38, stdev=162.67, samples=120
  lat (msec)   : 2=0.03%, 4=1.70%, 10=70.53%, 20=25.93%, 50=1.40%
  lat (msec)   : 100=0.01%, 250=0.41%, 500=0.01%
  cpu          : usr=7.40%, sys=17.53%, ctx=116662, majf=0, minf=2366
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,199823,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
  WRITE: bw=3330MiB/s (3492MB/s), 3330MiB/s-3330MiB/s (3492MB/s-3492MB/s), io=195GiB (210GB), run=60009-60009msec

Bash:

fio --filename=/dev/mapper/3600c0ff000f949ac64435e6901000000 --direct=1 --rw=read --bs=1M --ioengine=libaio --iodepth=32 --runtime=60 --time_based --group_reporting --name=seq_read_test
seq_read_test: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=32
fio-3.39
Starting 1 process
Jobs: 1 (f=1): [R(1)][0.3%][eta 06h:44m:30s]
seq_read_test: (groupid=0, jobs=1): err= 0: pid=512077: Thu Jan  8 08:40:04 2026
  read: IOPS=78, BW=78.3MiB/s (82.1MB/s)(4892MiB/62452msec)
    slat (usec): min=38, max=1062, avg=52.38, stdev=36.28
    clat (usec): min=1597, max=10719k, avg=408367.90, stdev=1218691.64
     lat (usec): min=1953, max=10719k, avg=408420.28, stdev=1218694.35
    clat percentiles (msec):
     |  1.00th=[    5],  5.00th=[    5], 10.00th=[    5], 20.00th=[    5],
     | 30.00th=[    6], 40.00th=[    6], 50.00th=[    6], 60.00th=[   17],
     | 70.00th=[   65], 80.00th=[  161], 90.00th=[  953], 95.00th=[ 4329],
     | 99.00th=[ 5269], 99.50th=[ 5403], 99.90th=[ 9866], 99.95th=[10671],
     | 99.99th=[10671]
   bw (  KiB/s): min= 6144, max=1245184, per=100.00%, avg=216375.65, stdev=254036.48, samples=46
   iops        : min=    6, max= 1216, avg=211.30, stdev=248.08, samples=46
  lat (msec)   : 2=0.04%, 4=0.51%, 10=57.07%, 20=3.31%, 50=6.23%
  lat (msec)   : 100=8.20%, 250=8.91%, 500=2.02%, 750=0.92%, 1000=3.76%
  lat (msec)   : 2000=3.21%, >=2000=5.81%
  cpu          : usr=0.02%, sys=0.42%, ctx=4525, majf=0, minf=8203
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.2%, 16=0.3%, 32=99.4%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=4892,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: bw=78.3MiB/s (82.1MB/s), 78.3MiB/s-78.3MiB/s (82.1MB/s-82.1MB/s), io=4892MiB (5130MB), run=62452-62452msec

Thanks in advance for the help.

Thomas

Falk R. · Jan 8, 2026

Hi, 8 paths is rather unusual, but it shouldn't bother the ME.
Since the write test runs properly, I don't assume a network fault.
How is the ME equipped? HDDs by any chance?
On writes, everything lands in the battery-backed cache, but a direct read comes straight from the disks.
78 MB/s at a block size of 1 MB is unusually slow even for HDDs, though.
Since the configuration looks clean, I would take a closer look at the ME.

Maybe also test with the standard 4 paths, as in the official guides.

Xoxxoxuatl · Jan 9, 2026

Morning,

the ME has SSDs installed.

I attached the LUN to another server and it runs better there, so I would rule out the storage?
Yesterday I also went through the configs and ports again but couldn't find anything.

Bad server

Code:

prmx03:~# fio --filename=/dev/mapper/3600c0ff000f949ac18915f6901000000 --direct=1 --rw=read --bs=1M --ioengine=libaio --iodepth=32 --runtime=60 --time_based --size=50G --name=seq_read

seq_read: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=32
fio-3.39
Starting 1 process
Jobs: 1 (f=1): [R(1)][1.3%][eta 01h:23m:44s]
seq_read: (groupid=0, jobs=1): err= 0: pid=3946418: Fri Jan  9 08:03:37 2026
  read: IOPS=10, BW=10.6MiB/s (11.1MB/s)(675MiB/63908msec)
    slat (usec): min=34, max=604, avg=71.43, stdev=81.42
    clat (usec): min=1104, max=26019k, avg=3025339.96, stdev=4286541.61
     lat (usec): min=1145, max=26019k, avg=3025411.38, stdev=4286534.90
    clat percentiles (usec):
     |  1.00th=[    1696],  5.00th=[    5014], 10.00th=[    5014],
     | 20.00th=[    5211], 30.00th=[    5997], 40.00th=[   62129],
     | 50.00th=[  221250], 60.00th=[ 4043310], 70.00th=[ 5133829],
     | 80.00th=[ 5469373], 90.00th=[ 6408897], 95.00th=[ 9462350],
     | 99.00th=[17112761], 99.50th=[17112761], 99.90th=[17112761],
     | 99.95th=[17112761], 99.99th=[17112761]
   bw (  KiB/s): min= 2048, max=143360, per=100.00%, avg=41216.00, stdev=42705.71, samples=32
   iops        : min=    2, max=  140, avg=40.25, stdev=41.70, samples=32
  lat (msec)   : 2=1.33%, 4=0.89%, 10=29.78%, 20=1.04%, 50=2.37%
  lat (msec)   : 100=12.00%, 250=2.67%, 500=1.63%, 1000=0.89%, 2000=3.70%
  lat (msec)   : >=2000=43.70%
  cpu          : usr=0.00%, sys=0.08%, ctx=564, majf=0, minf=8201
  IO depths    : 1=0.1%, 2=0.3%, 4=0.6%, 8=1.2%, 16=2.4%, 32=95.4%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.8%, 8=0.0%, 16=0.0%, 32=0.2%, 64=0.0%, >=64=0.0%
     issued rwts: total=675,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: bw=10.6MiB/s (11.1MB/s), 10.6MiB/s-10.6MiB/s (11.1MB/s-11.1MB/s), io=675MiB (708MB), run=63908-63908msec

Good server

Code:

prmx02:~# fio --filename=/dev/mapper/3600c0ff000f949ac18915f6901000000 --direct=1 --rw=read --bs=1M --ioengine=libaio --iodepth=32 --runtime=60 --time_based --size=50G --name=seq_read

seq_read: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=32
fio-3.39
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=8533MiB/s][r=8533 IOPS][eta 00m:00s]
seq_read: (groupid=0, jobs=1): err= 0: pid=2534292: Fri Jan  9 07:50:23 2026
  read: IOPS=8587, BW=8588MiB/s (9005MB/s)(503GiB/60001msec)
    slat (usec): min=108, max=1291, avg=115.27, stdev= 9.54
    clat (usec): min=2, max=32080, avg=3609.62, stdev=167.35
     lat (usec): min=116, max=33372, avg=3724.89, stdev=174.51
    clat percentiles (usec):
     |  1.00th=[ 3523],  5.00th=[ 3523], 10.00th=[ 3556], 20.00th=[ 3556],
     | 30.00th=[ 3589], 40.00th=[ 3589], 50.00th=[ 3621], 60.00th=[ 3621],
     | 70.00th=[ 3621], 80.00th=[ 3621], 90.00th=[ 3654], 95.00th=[ 3654],
     | 99.00th=[ 3720], 99.50th=[ 3982], 99.90th=[ 4883], 99.95th=[ 5538],
     | 99.99th=[ 6063]
   bw (  MiB/s): min= 7840, max= 8758, per=100.00%, avg=8587.70, stdev=108.53, samples=120
   iops        : min= 7840, max= 8758, avg=8587.70, stdev=108.53, samples=120
  lat (usec)   : 4=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=99.55%, 10=0.44%, 20=0.01%, 50=0.01%
  cpu          : usr=0.82%, sys=99.11%, ctx=1095, majf=0, minf=8212
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=515262,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: bw=8588MiB/s (9005MB/s), 8588MiB/s-8588MiB/s (9005MB/s-9005MB/s), io=503GiB (540GB), run=60001-60001msec
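
The fio summaries in this thread can be cross-checked with Little's law: average requests in flight = IOPS x average latency. A minimal sketch, using the IOPS and `lat` averages printed by fio above (the helper name `in_flight` is made up for illustration, and `awk` is assumed to be available):

```shell
#!/bin/sh
# in_flight IOPS LAT_USEC
# Little's law: average outstanding requests = IOPS * average latency (in seconds).
in_flight() {
    LC_ALL=C awk -v iops="$1" -v lat_usec="$2" \
        'BEGIN { printf "%.1f\n", iops * lat_usec / 1000000 }'
}

# Good server (prmx02): IOPS=8587, avg lat=3724.89 usec
in_flight 8587 3724.89
# First slow read test: 78.3 MiB/s at bs=1M, avg lat=408420.28 usec
in_flight 78.3 408420.28
```

Both runs come out at roughly 32 outstanding requests, which matches `--iodepth=32`. So the slow host does keep its queue full; each 1 MiB read just takes about 400 ms on average instead of about 3.7 ms, which suggests the time is lost somewhere between submission and completion (initiator, network path or array) rather than in fio failing to issue I/O.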

Falk R. · Jan 9, 2026

Please put the results in code blocks, that makes them easier to read.
If they differ that much, you should check the network of the "bad" one again. Jumbo frames configured correctly everywhere? Errors on ports?

Xoxxoxuatl · Jan 9, 2026

I went through the switches again...

Jumbo frames are enabled on all ports
no CRC errors or discards
Flow control tx/rx is also active.

I'll swap the two around, but I won't get to that until next week.

Falk R. · Jan 9, 2026

Xoxxoxuatl said:
I went through the switches again...

Jumbo frames are enabled on all ports
no CRC errors or discards
Flow control tx/rx is also active.

I'll swap the two around, but I won't get to that until next week.

Especially when only reading is affected, I would also have checked the ports on the PVE host, in case one of them is set wrong.

Xoxxoxuatl · Jan 12, 2026

New week, new luck...

Re-plugging all 4 iSCSI ports didn't help; the fault did not move with them.
So a server problem after all?

Falk R. · Jan 12, 2026

Have you checked everywhere that jumbo frames are set correctly on all ports? I could imagine that being the issue on the host side.

Xoxxoxuatl · Jan 12, 2026

Every server port can send jumbo frames to every SAN port in the same VLAN.
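
A repeatable way to check this is a ping with fragmentation prohibited and a payload of MTU minus 28 bytes (20 bytes IPv4 header plus 8 bytes ICMP header). The sketch below only prints the command instead of running it. The interface name is the one that appears in the ethtool output in this thread, the target address 192.0.2.10 is a placeholder, and the `-M do`, `-s`, `-c` and `-I` flags assume the Linux iputils `ping`.

```shell
#!/bin/sh
# Largest ICMP echo payload that fits into one unfragmented IPv4 packet:
# MTU minus 20 bytes IPv4 header minus 8 bytes ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Print (do not run) the ping command for one interface/target pair.
# Third argument is the MTU and defaults to 9000.
jumbo_ping_cmd() {
    iface=$1
    target=$2
    mtu=${3:-9000}
    echo "ping -M do -c 3 -s $(icmp_payload_for_mtu "$mtu") -I $iface $target"
}

# Placeholder target address; replace with a real SAN port IP.
jumbo_ping_cmd eno12409np1 192.0.2.10
```

Running the printed command once per host port and SAN port pair would show whether a full 9000-byte frame really passes in each direction without fragmentation.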

Xoxxoxuatl · Jan 12, 2026

Something else I noticed... above a certain queue size (--iodepth), performance collapses.

fio --filename=/dev/mapper/3600c0ff000f949ac18915f6901000000 --direct=1 --rw=read --bs=1M --ioengine=libaio --iodepth=4 --runtime=10 --time_based --size=200G --name=seq_read

READ: bw=3876MiB/s (4064MB/s), 3876MiB/s-3876MiB/s (4064MB/s-4064MB/s), io=37.9GiB (40.6GB), run=10001-10001msec

fio --filename=/dev/mapper/3600c0ff000f949ac18915f6901000000 --direct=1 --rw=read --bs=1M --ioengine=libaio --iodepth=6 --runtime=10 --time_based --size=200G --name=seq_read

READ: bw=1396MiB/s (1464MB/s), 1396MiB/s-1396MiB/s (1464MB/s-1464MB/s), io=14.2GiB (15.3GB), run=10450-10450msec

fio --filename=/dev/mapper/3600c0ff000f949ac18915f6901000000 --direct=1 --rw=read --bs=1M --ioengine=libaio --iodepth=16 --runtime=10 --time_based --size=200G --name=seq_read

READ: bw=12.2MiB/s (12.8MB/s), 12.2MiB/s-12.2MiB/s (12.8MB/s-12.8MB/s), io=161MiB (169MB), run=13238-13238msec
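
The three manual runs above could be turned into a finer sweep to find the depth at which throughput collapses. A dry-run sketch under stated assumptions: it only prints the fio commands, the list of depths and the job names are made up for illustration, and `--readonly` is added as a safety guard against accidental writes to the LUN.

```shell
#!/bin/sh
# Multipath device used in the tests above.
DEV=/dev/mapper/3600c0ff000f949ac18915f6901000000

# Print one read-only fio command per queue depth given as argument.
# Nothing is executed; pipe the output to sh to actually run it.
sweep_cmds() {
    for depth in "$@"; do
        echo "fio --filename=$DEV --readonly --direct=1 --rw=read --bs=1M" \
             "--ioengine=libaio --iodepth=$depth --runtime=10 --time_based" \
             "--size=200G --name=seq_read_qd$depth"
    done
}

# Assumed depth list, denser around the range where the drop was seen.
sweep_cmds 1 2 4 5 6 8 16 32
```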

A comparison with this system (HDD and 2x10G) shows that something is not running optimally.

fio --filename=/dev/mapper/mpath0 --direct=1 --rw=read --bs=1m --size=20G --numjobs=200 --runtime=60 --group_reporting --name=file1

READ: bw=2241MiB/s (2350MB/s), 2241MiB/s-2241MiB/s (2350MB/s-2350MB/s), io=132GiB (141GB), run=60090-60090msec

My system (SSDs, 4x25G)

fio --filename=/dev/mapper/3600c0ff000f949ac18915f6901000000 --direct=1 --rw=read --bs=1m --size=20G --numjobs=200 --runtime=60 --group_reporting --name=file1

READ: bw=6670KiB/s (6830kB/s), 6670KiB/s-6670KiB/s (6830kB/s-6830kB/s), io=567MiB (595MB), run=87043-87043msec

Falk R. · Jan 13, 2026

It is not normal for it to collapse like that at high I/O depth, but it sounds like the storage is the cause.
Which host type did you specify on the storage?

Xoxxoxuatl · Jan 13, 2026

The profile is set to Standard and the host type is iSCSI.

Falk R. · Jan 18, 2026

I don't have a spontaneous idea there. Is the firmware on the storage up to date?

Xoxxoxuatl · Jan 19, 2026

Morning, the FW is up to date.
Dell also looked at the storage -> found nothing unusual and pointed to the servers...

...that will be my next ticket then.

Xoxxoxuatl · Jan 19, 2026

Shouldn't flow control be on? What could be causing this, the cables?

Code:

ethtool eno12409np1 | grep "advertised pause"
        Link partner advertised pause frame use: No
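
That ethtool field reflects the pause capability the link partner (here the switch port) advertised during autonegotiation; it does not by itself show whether flow control is forced on at either end. `ethtool -a eno12409np1` shows the host's own pause settings. A small sketch for checking all iSCSI ports the same way, assuming `awk` is available (the helper name `partner_pause` is hypothetical):

```shell
#!/bin/sh
# Reads `ethtool IFACE` output on stdin and prints what the link partner
# advertised for pause frames, or "unknown" if the line is missing.
partner_pause() {
    awk -F': *' '
        /Link partner advertised pause frame use/ { print $2; found = 1 }
        END { if (!found) print "unknown" }
    '
}

# Demo with the line posted above; on a host one would run:
#   ethtool eno12409np1 | partner_pause
printf '\tLink partner advertised pause frame use: No\n' | partner_pause
```

Comparing the result across all four iSCSI ports on the good and the bad server would show whether the missing pause advertisement is specific to the slow host.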
