What Ceph performance can I expect?

fwinkler

We have a new Proxmox cluster with 5 nodes and one backup server. I simply cannot really judge what performance I should expect.

Server:
4 x Dell DC NVMe ISE 7450 RI U.2 7.68TB
4 x 10 GbE network card
2 x 1 GbE network card
2 x 100 GbE network

Ceph runs on the 100 GbE cards.



Bash:
root@pve5:~# fio --ioengine=libaio --filename=/dev/nvme0n1 --direct=1 --sync=1 --rw=write --bs=4K --numjobs=1 --iodepth=1 --runtime=60 --time_based --name=fio
fio: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.33
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][w=296MiB/s][w=75.8k IOPS][eta 00m:00s]
fio: (groupid=0, jobs=1): err= 0: pid=16447: Thu Jul 25 14:09:27 2024
  write: IOPS=75.6k, BW=295MiB/s (310MB/s)(17.3GiB/60001msec); 0 zone resets
    slat (nsec): min=2220, max=46981, avg=2428.28, stdev=309.09
    clat (nsec): min=620, max=620751, avg=10537.05, stdev=1044.35
     lat (usec): min=9, max=630, avg=12.97, stdev= 1.10
    clat percentiles (nsec):
     |  1.00th=[10176],  5.00th=[10176], 10.00th=[10304], 20.00th=[10304],
     | 30.00th=[10304], 40.00th=[10432], 50.00th=[10432], 60.00th=[10432],
     | 70.00th=[10560], 80.00th=[10560], 90.00th=[10688], 95.00th=[10944],
     | 99.00th=[13248], 99.50th=[15936], 99.90th=[24192], 99.95th=[24960],
     | 99.99th=[31872]
   bw (  KiB/s): min=298736, max=305360, per=100.00%, avg=302530.22, stdev=1471.07, samples=119
   iops        : min=74684, max=76340, avg=75632.61, stdev=367.76, samples=119
  lat (nsec)   : 750=0.01%, 1000=0.01%
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.08%, 20=99.51%, 50=0.41%
  lat (usec)   : 100=0.01%, 250=0.01%, 750=0.01%
  cpu          : usr=6.92%, sys=39.76%, ctx=4534644, majf=0, minf=12
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,4535227,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=295MiB/s (310MB/s), 295MiB/s-295MiB/s (310MB/s-310MB/s), io=17.3GiB (18.6GB), run=60001-60001msec

Disk stats (read/write):
  nvme0n1: ios=350/4541653, merge=8/2916, ticks=56/36713, in_queue=36769, util=99.91%



On a VM with Windows 2022 I got 24,000 IOPS on reads.

Linux with Kubernetes:

1. with OpenEBS Local PV (this is the local disk of the host):

Bash:
./kubestr fio -s openebs-hostpath
PVC created kubestr-fio-pvc-tkhlg
Pod created kubestr-fio-pod-hjbwb
Running FIO test (default-fio) on StorageClass (openebs-hostpath) with a PVC of Size (100Gi)
Elapsed time- 24.549538772s
FIO test results:
 
FIO version - fio-3.36
Global options - ioengine=libaio verify=0 direct=1 gtod_reduce=1

JobName: read_iops
  blocksize=4K filesize=2G iodepth=64 rw=randread
read:
  IOPS=42727.835938 BW(KiB/s)=170928
  iops: min=33366 max=45955 avg=42794.929688
  bw(KiB/s): min=133464 max=183823 avg=171180.000000

JobName: write_iops
  blocksize=4K filesize=2G iodepth=64 rw=randwrite
write:
  IOPS=34595.921875 BW(KiB/s)=138400
  iops: min=26228 max=38732 avg=34435.414062
  bw(KiB/s): min=104912 max=154928 avg=137741.859375

JobName: read_bw
  blocksize=128K filesize=2G iodepth=64 rw=randread
read:
  IOPS=32374.083984 BW(KiB/s)=4144420
  iops: min=24112 max=34513 avg=32431.792969
  bw(KiB/s): min=3086336 max=4417667 avg=4151284.750000

JobName: write_bw
  blocksize=128k filesize=2G iodepth=64 rw=randwrite
write:
  IOPS=28725.218750 BW(KiB/s)=3677365
  iops: min=14702 max=30946 avg=28716.689453
  bw(KiB/s): min=1881856 max=3961088 avg=3675741.000000

Disk stats (read/write):
  sdb: ios=1273852/1068990 merge=0/16 ticks=1965217/1675217 in_queue=3647005, util=42.120041%
  -  OK

2. with the ceph-csi block driver:

Bash:
./kubestr fio -s ceph-block
PVC created kubestr-fio-pvc-m9qzw
Pod created kubestr-fio-pod-4cbd2
Running FIO test (default-fio) on StorageClass (ceph-block) with a PVC of Size (100Gi)
Elapsed time- 26.418651031s
FIO test results:
 
FIO version - fio-3.36
Global options - ioengine=libaio verify=0 direct=1 gtod_reduce=1

JobName: read_iops
  blocksize=4K filesize=2G iodepth=64 rw=randread
read:
  IOPS=2961.541016 BW(KiB/s)=11862
  iops: min=2899 max=3046 avg=2966.466553
  bw(KiB/s): min=11599 max=12184 avg=11866.266602

JobName: write_iops
  blocksize=4K filesize=2G iodepth=64 rw=randwrite
write:
  IOPS=1739.318481 BW(KiB/s)=6974
  iops: min=1510 max=1814 avg=1742.433350
  bw(KiB/s): min=6040 max=7256 avg=6969.799805

JobName: read_bw
  blocksize=128K filesize=2G iodepth=64 rw=randread
read:
  IOPS=2925.999756 BW(KiB/s)=375064
  iops: min=2854 max=3018 avg=2931.933350
  bw(KiB/s): min=365312 max=386304 avg=375301.875000

JobName: write_bw
  blocksize=128k filesize=2G iodepth=64 rw=randwrite
write:
  IOPS=1756.755005 BW(KiB/s)=225401
  iops: min=1724 max=1808 avg=1757.933350
  bw(KiB/s): min=220672 max=231424 avg=225020.406250

Disk stats (read/write):
  rbd0: ios=100394/59699 merge=0/830 ticks=2173550/1308741 in_queue=3482292, util=99.479439%
  -  OK


How should I rate this? ceph-csi seems too slow to me.
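For reference, one way to take the Kubernetes/CSI layer out of the picture would be to benchmark the pool directly from a Proxmox node with rados bench; the pool name (test) and the parameters below are only an example:

Bash:
# 4K writes with 64 requests in flight; keep the objects for the read pass
rados bench -p test 60 write -b 4096 -t 64 --no-cleanup
# random reads against the objects written above
rados bench -p test 60 rand -t 64
# remove the benchmark objects again
rados -p test cleanup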
 
Quoted from the first post: the fio run on /dev/nvme0n1 with --iodepth=1 (295 MiB/s, 75.6k IOPS).
In this test you benchmarked an NVMe directly, without Ceph. Those numbers do not look like normal NVMe performance, though.
How are the NVMe drives attached? Directly onboard or via a PERC controller? I do see this kind of thing now and then, depending on who did the sizing at Dell.
If the NVMe already does not perform, the additional network latency will not make it any better.

P.S. You might want to run the test again without iodepth=1. The NVMe protocol is built for parallelism and benefits enormously from it.
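A sketch of such a re-run, with example values for numjobs and iodepth (still a destructive write test against the raw device, like the run above):

Bash:
fio --ioengine=libaio --filename=/dev/nvme0n1 --direct=1 --sync=1 --rw=write --bs=4K \
    --numjobs=4 --iodepth=32 --group_reporting --runtime=60 --time_based --name=fio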
 
Bash:
root@pve5:~# fio --ioengine=libaio --filename=/dev/nvme4n1 --direct=1 --sync=1 --rw=write --bs=4K --numjobs=1 -runtime=60 --time_based --name=fio -iodepth=16
fio: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
fio-3.33
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][w=794MiB/s][w=203k IOPS][eta 00m:00s]
fio: (groupid=0, jobs=1): err= 0: pid=375347: Tue Oct  1 10:27:15 2024
  write: IOPS=216k, BW=845MiB/s (886MB/s)(49.5GiB/60001msec); 0 zone resets
    slat (usec): min=2, max=668, avg= 3.79, stdev= 1.14
    clat (usec): min=12, max=738, avg=69.97, stdev= 4.95
     lat (usec): min=16, max=741, avg=73.76, stdev= 5.07
    clat percentiles (usec):
     |  1.00th=[   67],  5.00th=[   68], 10.00th=[   69], 20.00th=[   69],
     | 30.00th=[   69], 40.00th=[   69], 50.00th=[   69], 60.00th=[   69],
     | 70.00th=[   70], 80.00th=[   71], 90.00th=[   76], 95.00th=[   81],
     | 99.00th=[   89], 99.50th=[   93], 99.90th=[  105], 99.95th=[  113],
     | 99.99th=[  133]
   bw (  KiB/s): min=696776, max=877664, per=100.00%, avg=866072.40, stdev=24259.75, samples=119
   iops        : min=174194, max=219416, avg=216518.18, stdev=6064.95, samples=119
  lat (usec)   : 20=0.01%, 50=0.01%, 100=99.81%, 250=0.19%, 500=0.01%
  lat (usec)   : 750=0.01%
  cpu          : usr=12.48%, sys=87.50%, ctx=990, majf=0, minf=11
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,12980872,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: bw=845MiB/s (886MB/s), 845MiB/s-845MiB/s (886MB/s-886MB/s), io=49.5GiB (53.2GB), run=60001-60001msec

Disk stats (read/write):
  nvme4n1: ios=0/12948135, merge=0/0, ticks=0/97522, in_queue=97522, util=99.52%
 
The disks are Micron:

Bash:
DC NVMe ISE 7450 RI U.2 7.68TB    Micron Technology Inc

And the disks are attached directly.
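A quick way to double-check the drives and their negotiated PCIe link (assuming nvme-cli and pciutils are installed; the bus address below is only an example) would be:

Bash:
# model, firmware and capacity overview of all NVMe namespaces
nvme list
# find the PCIe addresses of the NVMe controllers
lspci -nn | grep -i 'non-volatile'
# check the negotiated link speed/width for one of them
lspci -vv -s 01:00.0 | grep -E 'LnkCap|LnkSta'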
 
I just wonder why ceph-csi is so much slower.

ceph-csi: IOPS=2961.541016 BW(KiB/s)=11862
virtual disk on the same system (also on Ceph): IOPS=42727.835938 BW(KiB/s)=170928
 
I just wonder why ceph-csi is so much slower.
Well, because the data has to go through the network stack, then through the cable, the switch and the other cable, through the network drivers again, and then on to the replica destination?

Reads may (sometimes) be fast, but writes force this several times over, depending on size/min_size. Only once a piece of data has been written everywhere is the operation complete.
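As a rough sanity check (assuming a replicated pool with size=3 on the 100 GbE Ceph network): sync-write IOPS are bounded by roughly iodepth divided by the per-write latency, and that latency includes at least the round trip to the primary OSD plus its round trips to the replicas.

Bash:
# round-trip time on the Ceph network (replace the address with another node's cluster IP)
ping -c 20 -i 0.2 <ceph-cluster-ip>
# example arithmetic: at ~0.5 ms total per 4K write,
#   iodepth=1   -> about 1 / 0.0005 s  =   2000 IOPS
#   iodepth=64  -> about 64 / 0.0005 s = 128000 IOPS (only if nothing else saturates first)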

Local is always faster than any network file system...

Disclaimer: I have no clue about Ceph... ;-)
 
I have no clue about Kubernetes, but I have heard several times that CSI is not quite as fast as plain RBD.
But even with iodepth=16 the result is lower than I would have expected. What kind of server is this? I have also measured NVMe performance improvements after disabling C-states (power saving).
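If C-states were still enabled, one way to check and to temporarily park the CPU in the shallowest idle state from the OS (using the cpupower tool from Debian's linux-cpupower package; an illustration, not a tuning recommendation) would be:

Bash:
apt install linux-cpupower
# list the available C-states and whether they are enabled
cpupower idle-info
# temporarily disable every idle state deeper than POLL
cpupower idle-set -D 0
# pin the frequency governor to performance for the duration of the test
cpupower frequency-set -g performance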
 


Module | Description | SKU | Control type | Qty
Base | PowerEdge R6615 | 210-BFUO | SR | 1
Smart Selection | Smart Selection PowerEdge R6615 | 486-82082 | SR | 1
Front storage | 2.5" chassis | 379-BDTF | SR | 1
Backplane | NVMe backplane | 379-BDSX | SR | 1
Rear storage | No rear storage | 379-BDTE | SR | 1
Trusted Platform Module | Trusted Platform Module 2.0 V3 | 461-AAIG | SR | 1
Chassis configuration | 2.5" chassis with up to 10 NVMe direct drives | 321-BIFY | SR | 1
Processor | AMD EPYC 9354P, 3.25 GHz, 32C/64T, 256 MB cache (280 W), DDR5-4800 | 338-CGWZ | SR | 1
Processor thermal configuration | High-performance heatsink | 412-BBGB | SR | 1
Memory configuration type | Performance optimized | 370-AHLL | SR | 1
Memory DIMM type and speed | 4800 MT/s RDIMMs | 370-AHCL | SR | 1
Memory capacity | 64 GB RDIMM, 4800 MT/s, dual rank | 370-AGZR | SR | 12
RAID configurations | C30, no RAID for NVMe chassis | 780-BCDO | SR | 1
RAID/internal storage controllers | No controller | 405-AACD | SR | 1
Hard drive | No hard drive | 400-ABHL | SR | 1
Hard drives (PCIe SSD/Flex Bay) | 7.68 TB Data Center NVMe, read-optimized, AG drive, U.2, Gen 4, with carrier | 345-BJNW | SR | 4
BIOS and advanced system configuration settings | Power-saving BIOS settings | 384-BBBH | SR | 1
Advanced system configurations | Without Energy Star | 387-BBEY | SR | 1
Cooling | 4 very-high-performance fans for 1 CPU | 384-BDHS | SR | 1
Power supply | Dual, redundant (1+1), hot-plug PSU, 1100 W MM (100-240 V AC), Titanium | 450-AKLF | SR | 1
Power cord | C13 to C14, PDU style, 10 A, 2 m (6.5 ft) power cord | 450-AADY | SR | 2
PCIe riser | Riser configuration 3, 2x x16 FH (Gen 5) | 330-BCBX | SR | 1
Motherboard | PowerEdge R6615 motherboard V2 | 329-BJSB | SR | 1
Ethernet mezzanine adapter | Broadcom 57508, 2 ports, 100 GbE, QSFP56, OCP NIC 3.0 | 540-BDXQ | SR | 1
Ethernet mezzanine adapter | R6615/R7615, OCP 3.0 cable | 540-BFDX | SR | 1
Additional network adapters | Broadcom 5720, 2 ports, 1 GbE, LOM | 540-BDKD | SR | 1
Additional network adapters | Broadcom 57454 BASE-T adapter, 4 ports, 10 GbE, PCIe full height | 540-BDLK | SR | 1
Bezel | Standard bezel for x8/x10 chassis, R6615 | 325-BETN | SR | 1
Boot-optimized storage cards | BOSS-N1 controller card with 2x M.2 480 GB (RAID 1) | 403-BCRU | SR | 1
Boot-optimized storage cards | BOSS cable, bracket for R6615 | 470-AFNB | SR | 1
Embedded systems management | iDRAC9, Enterprise 16G | 528-CTIC | SR | 1
 
C-states are already disabled:

System Profile: Performance Per Watt (OS) / Performance / Custom
CPU Power Management: Maximum Performance
Memory Frequency: Maximum Performance
Turbo Boost: Enabled
C-States: Disabled
Memory Patrol Scrub: Standard
Memory Refresh Rate: 1x
Workload Profile: Not Configured / HPC Profile
PCI ASPM L1 Link Power Management: Disabled
Determinism Slider: Power Determinism
Power Profile Select: High Performance Mode
PCIE Speed PMM Control: Auto
EQ Bypass To Highest Rate: Disabled
DF PState Frequency Optimizer: Enabled
DF PState Latency Optimizer: Enabled
DF CState: Enabled
Host System Management Port (HSMP) Support: Enabled
Boost FMax: 0 - Auto
Algorithm Performance Boost Disable (ApbDis): Disabled
 
Bash:
root@pve5:~# fio -ioengine=rbd -pool=test -direct=1 -sync=1 -rw=write -bs=4K -numjobs=1 -runtime=60 -time_based -name=fio -iodepth=4 -rbdname=vm-113-disk-0
fio: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=4
fio-3.33
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][w=33.4MiB/s][w=8555 IOPS][eta 00m:00s]
fio: (groupid=0, jobs=1): err= 0: pid=298365: Tue Oct  1 15:01:51 2024
  write: IOPS=8451, BW=33.0MiB/s (34.6MB/s)(1981MiB/60001msec); 0 zone resets
    slat (nsec): min=510, max=44300, avg=4646.06, stdev=2108.97
    clat (usec): min=307, max=13419, avg=468.33, stdev=156.84
     lat (usec): min=318, max=13423, avg=472.98, stdev=156.83
    clat percentiles (usec):
     |  1.00th=[  392],  5.00th=[  408], 10.00th=[  416], 20.00th=[  424],
     | 30.00th=[  433], 40.00th=[  441], 50.00th=[  449], 60.00th=[  457],
     | 70.00th=[  465], 80.00th=[  478], 90.00th=[  498], 95.00th=[  537],
     | 99.00th=[ 1037], 99.50th=[ 1172], 99.90th=[ 1844], 99.95th=[ 2311],
     | 99.99th=[ 7504]
   bw (  KiB/s): min=30368, max=35768, per=100.00%, avg=33829.65, stdev=918.50, samples=119
   iops        : min= 7592, max= 8942, avg=8457.41, stdev=229.62, samples=119
  lat (usec)   : 500=90.11%, 750=7.36%, 1000=1.40%
  lat (msec)   : 2=1.05%, 4=0.05%, 10=0.03%, 20=0.01%
  cpu          : usr=4.00%, sys=2.63%, ctx=277256, majf=0, minf=19
  IO depths    : 1=0.1%, 2=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,507081,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=4

Run status group 0 (all jobs):
  WRITE: bw=33.0MiB/s (34.6MB/s), 33.0MiB/s-33.0MiB/s (34.6MB/s-34.6MB/s), io=1981MiB (2077MB), run=60001-60001msec

Disk stats (read/write):
    dm-1: ios=7/3407, merge=0/0, ticks=1/109, in_queue=110, util=1.19%, aggrios=47/1545, aggrmerge=0/0, aggrticks=15/26, aggrin_queue=41, aggrutil=0.06%
  nvme2n1: ios=47/1545, merge=0/0, ticks=15/26, in_queue=41, util=0.06%

Like this?
 
No :)

Bash:
root@pve5:~# fio -ioengine=rbd -pool=test -direct=1 -sync=1 -rw=write -bs=4K -numjobs=1 -runtime=60 -time_based -name=fio -iodepth=128 -rbdname=vm-113-disk-0
fio: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=128
fio-3.33
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][w=262MiB/s][w=67.0k IOPS][eta 00m:00s]
fio: (groupid=0, jobs=1): err= 0: pid=318694: Tue Oct  1 15:22:08 2024
  write: IOPS=66.6k, BW=260MiB/s (273MB/s)(15.3GiB/60002msec); 0 zone resets
    slat (nsec): min=770, max=374811, avg=6234.89, stdev=2866.94
    clat (usec): min=431, max=41339, avg=1914.04, stdev=446.25
     lat (usec): min=437, max=41344, avg=1920.28, stdev=446.07
    clat percentiles (usec):
     |  1.00th=[ 1074],  5.00th=[ 1467], 10.00th=[ 1614], 20.00th=[ 1696],
     | 30.00th=[ 1762], 40.00th=[ 1811], 50.00th=[ 1844], 60.00th=[ 1893],
     | 70.00th=[ 1958], 80.00th=[ 2040], 90.00th=[ 2376], 95.00th=[ 2671],
     | 99.00th=[ 3195], 99.50th=[ 3425], 99.90th=[ 4621], 99.95th=[ 5604],
     | 99.99th=[ 8029]
   bw (  KiB/s): min=230752, max=283400, per=100.00%, avg=266698.89, stdev=8539.38, samples=119
   iops        : min=57688, max=70850, avg=66674.76, stdev=2134.81, samples=119
  lat (usec)   : 500=0.01%, 750=0.01%, 1000=0.74%
  lat (msec)   : 2=75.87%, 4=23.21%, 10=0.16%, 20=0.01%, 50=0.01%
  cpu          : usr=42.82%, sys=14.02%, ctx=344658, majf=0, minf=289
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwts: total=0,3998730,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
  WRITE: bw=260MiB/s (273MB/s), 260MiB/s-260MiB/s (273MB/s-273MB/s), io=15.3GiB (16.4GB), run=60002-60002msec

Disk stats (read/write):
    dm-1: ios=0/3411, merge=0/0, ticks=0/213, in_queue=213, util=0.07%, aggrios=36/1700, aggrmerge=0/0, aggrticks=10/59, aggrin_queue=69, aggrutil=0.07%
  nvme2n1: ios=36/1700, merge=0/0, ticks=10/59, in_queue=69, util=0.07%

Bash:
root@pve5:~# fio -ioengine=rbd -pool=test -direct=1 -sync=1 -rw=write -bs=4M -numjobs=1 -runtime=60 -time_based -name=fio -iodepth=8 -rbdname=vm-113-disk-0
fio: (g=0): rw=write, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=rbd, iodepth=8
fio-3.33
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][w=3219MiB/s][w=804 IOPS][eta 00m:00s]
fio: (groupid=0, jobs=1): err= 0: pid=320506: Tue Oct  1 15:24:02 2024
  write: IOPS=700, BW=2804MiB/s (2940MB/s)(164GiB/60011msec); 0 zone resets
    slat (usec): min=189, max=3059, avg=1158.16, stdev=501.84
    clat (usec): min=4342, max=65173, avg=10252.40, stdev=2559.82
     lat (usec): min=4913, max=66848, avg=11410.57, stdev=2664.94
    clat percentiles (usec):
     |  1.00th=[ 6063],  5.00th=[ 6915], 10.00th=[ 7439], 20.00th=[ 8225],
     | 30.00th=[ 8848], 40.00th=[ 9372], 50.00th=[10028], 60.00th=[10683],
     | 70.00th=[11207], 80.00th=[11863], 90.00th=[13173], 95.00th=[14615],
     | 99.00th=[17957], 99.50th=[19530], 99.90th=[25297], 99.95th=[30802],
     | 99.99th=[57934]
   bw (  MiB/s): min= 2400, max= 3392, per=100.00%, avg=2804.64, stdev=280.48, samples=119
   iops        : min=  600, max=  848, avg=701.16, stdev=70.12, samples=119
  lat (msec)   : 10=49.96%, 20=49.61%, 50=0.42%, 100=0.01%
  cpu          : usr=80.82%, sys=0.66%, ctx=10984, majf=0, minf=9221
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,42066,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=8

Run status group 0 (all jobs):
  WRITE: bw=2804MiB/s (2940MB/s), 2804MiB/s-2804MiB/s (2940MB/s-2940MB/s), io=164GiB (176GB), run=60011-60011msec

Disk stats (read/write):
    dm-1: ios=0/3184, merge=0/0, ticks=0/926, in_queue=926, util=0.31%, aggrios=36/1921, aggrmerge=0/0, aggrticks=7/30999, aggrin_queue=31006, aggrutil=0.30%
  nvme2n1: ios=36/1921, merge=0/0, ticks=7/30999, in_queue=31006, util=0.30%
 
Yes, a second 100 GbE switch will be added as well.

What puzzles me is ceph-csi in Kubernetes. The OS is Talos, and the IOPS on the local disk, which is already on Ceph anyway, are fine.
But I do not understand why the ceph-csi driver is so slow.

Or am I testing it wrong?
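One thing that might be worth ruling out is the default kubestr job profile itself (4K at iodepth=64 on a 2G file). If your kubestr version supports the -f option for a custom fio file, a job matching the rbd run above would make the numbers directly comparable; a sketch with example values:

Bash:
cat > rbd-compare.fio <<'EOF'
[global]
ioengine=libaio
direct=1
time_based
runtime=60

[write_4k_qd128]
rw=write
bs=4K
iodepth=128
size=10G
EOF
./kubestr fio -s ceph-block -z 100Gi -f rbd-compare.fio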
 
