Hi,
I know there are plenty of Ceph threads here struggling with performance, and I thought I had it figured out, but things are telling me a different story...
Here is our setup:
9 PVE nodes, each with at least 128 GB RAM and a 12-core processor.
2x HDDs or SSDs (the newer servers have SSDs) in a ZFS RAID 1 mirror for the system itself. Nothing else is stored there, just the OS.
8 of the servers are used for Ceph; each of them has 8x 4 TB SSDs, all combined into one big pool.
The Ceph network runs over a 2x 10 Gbit/s backbone for fast recovery and fast data delivery.
I thought we wouldn't face any performance problems with this setup, but since the recent upgrade we have been struggling with performance inside the VMs.
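In case it helps, this is the kind of information I could post right away (just a sketch of the commands I would run on one of the nodes; happy to add anything else):
Code:
# Proxmox / Ceph versions, to show exactly what the recent upgrade put us on
pveversion -v
ceph versions

# overall cluster health plus OSD and pool layout
ceph -s
ceph osd df tree
ceph osd pool ls detail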
I ran fio with direct=1 and a file size of 20 MB inside one of the VMs:
Code:
fio --rw=write --name=test --size=20M --direct=1
test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.12
Starting 1 process
test: Laying out IO file (1 file / 20MiB)
Jobs: 1 (f=1): [W(1)][100.0%][w=80KiB/s][w=20 IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=18002: Wed May 5 12:07:20 2021
write: IOPS=18, BW=75.8KiB/s (77.7kB/s)(20.0MiB/270024msec); 0 zone resets
clat (msec): min=6, max=1130, avg=52.73, stdev=128.65
lat (msec): min=6, max=1130, avg=52.73, stdev=128.65
clat percentiles (msec):
| 1.00th=[ 7], 5.00th=[ 7], 10.00th=[ 8], 20.00th=[ 9],
| 30.00th=[ 11], 40.00th=[ 12], 50.00th=[ 14], 60.00th=[ 16],
| 70.00th=[ 18], 80.00th=[ 23], 90.00th=[ 55], 95.00th=[ 405],
| 99.00th=[ 584], 99.50th=[ 718], 99.90th=[ 986], 99.95th=[ 1083],
| 99.99th=[ 1133]
bw ( KiB/s): min= 7, max= 552, per=100.00%, avg=80.54, stdev=120.46, samples=508
iops : min= 1, max= 138, avg=20.09, stdev=30.12, samples=508
lat (msec) : 10=29.49%, 20=44.63%, 50=15.76%, 100=0.72%, 250=1.19%
lat (msec) : 500=6.37%, 750=1.41%, 1000=0.33%
cpu : usr=0.01%, sys=0.10%, ctx=5487, majf=0, minf=12
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,5120,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=75.8KiB/s (77.7kB/s), 75.8KiB/s-75.8KiB/s (77.7kB/s-77.7kB/s), io=20.0MiB (20.0MB), run=270024-270024msec
Disk stats (read/write):
vda: ios=0/7090, merge=0/1204, ticks=0/731652, in_queue=450144, util=55.69%
Awful: 18 IOPS for a Ceph cluster backed by 59 SSDs, isn't it? I cannot find the issue or the config setting that I need to change.
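If I read the fio output right, with iodepth=1 the IOPS are basically 1/latency (1000 ms / ~53 ms avg clat ≈ 19), so it looks more like a per-write latency problem than a throughput problem. For comparison I could also run a deeper-queue test and a benchmark directly against the pool, roughly like this (only a sketch; the pool name is a placeholder and the parameters are my guess, not something I have run yet):
Code:
# 4k random writes with a deeper queue, to see whether the cluster scales with parallelism
fio --name=qd32test --rw=randwrite --bs=4k --ioengine=libaio --iodepth=32 \
    --direct=1 --size=1G --runtime=60 --time_based --group_reporting

# write benchmark straight against the Ceph pool, bypassing the VM layer
rados bench -p <poolname> 60 write -b 4096 -t 16
rados -p <poolname> cleanup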
If you need any further information, please let me know and I'll provide it as quickly as I can.
Thanks