Speed test with the above NFS over RDMA setup = full speed!
fio --name=testfile --directory=/clientfoldername --size=2G --numjobs=10 --rw=write --bs=1000M --ioengine=libaio --fdatasync=1 --runtime=60 --time_based --group_reporting --eta-newline=1s
testfile: (g=0): rw=write, bs=(R) 1000MiB-1000MiB, (W) 1000MiB-1000MiB, (T) 1000MiB-1000MiB, ioengine=libaio, iodepth=1
...
fio-3.33
Starting 10 processes
testfile: Laying out IO file (1 file / 2048MiB)
testfile: Laying out IO file (1 file / 2048MiB)
testfile: Laying out IO file (1 file / 2048MiB)
testfile: Laying out IO file (1 file / 2048MiB)
testfile: Laying out IO file (1 file / 2048MiB)
testfile: Laying out IO file (1 file / 2048MiB)
testfile: Laying out IO file (1 file / 2048MiB)
testfile: Laying out IO file (1 file / 2048MiB)
testfile: Laying out IO file (1 file / 2048MiB)
testfile: Laying out IO file (1 file / 2048MiB)
Jobs: 10 (f=10): [W(10)][4.9%][eta 00m:58s]
Jobs: 10 (f=10): [W(10)][6.6%][eta 00m:57s]
Jobs: 10 (f=10): [W(10)][8.3%][w=8008MiB/s][w=8 IOPS][eta 00m:55s]
Jobs: 10 (f=10): [W(10)][11.7%][eta 00m:53s]
Jobs: 10 (f=10): [W(10)][15.0%][w=1001MiB/s][w=1 IOPS][eta 00m:51s]
Jobs: 10 (f=10): [W(10)][18.3%][eta 00m:49s]
Jobs: 10 (f=10): [W(10)][21.7%][eta 00m:47s]
Jobs: 10 (f=10): [W(10)][25.0%][w=3000MiB/s][w=3 IOPS][eta 00m:45s]
Jobs: 10 (f=10): [W(10)][28.3%][eta 00m:43s]
Jobs: 10 (f=10): [W(10)][31.7%][w=1000MiB/s][w=1 IOPS][eta 00m:41s]
Jobs: 10 (f=10): [W(10)][35.0%][w=1000MiB/s][w=1 IOPS][eta 00m:39s]
Jobs: 10 (f=10): [W(10)][39.0%][w=1001MiB/s][w=1 IOPS][eta 00m:36s]
Jobs: 10 (f=10): [W(10)][41.7%][w=9009MiB/s][w=9 IOPS][eta 00m:35s]
Jobs: 10 (f=10): [W(10)][45.0%][w=1000MiB/s][w=1 IOPS][eta 00m:33s]
Jobs: 10 (f=10): [W(10)][48.3%][w=7000MiB/s][w=7 IOPS][eta 00m:31s]
Jobs: 10 (f=10): [W(10)][52.5%][eta 00m:28s]
Jobs: 10 (f=10): [W(10)][55.9%][w=4000MiB/s][w=4 IOPS][eta 00m:26s]
Jobs: 10 (f=10): [W(10)][59.3%][eta 00m:24s]
Jobs: 10 (f=10): [W(10)][61.7%][w=2002MiB/s][w=2 IOPS][eta 00m:23s]
Jobs: 10 (f=10): [W(10)][65.0%][w=6000MiB/s][w=6 IOPS][eta 00m:21s]
Jobs: 10 (f=10): [W(10)][68.3%][eta 00m:19s]
Jobs: 10 (f=10): [W(10)][71.7%][w=3003MiB/s][w=3 IOPS][eta 00m:17s]
Jobs: 10 (f=10): [W(10)][75.0%][w=5000MiB/s][w=5 IOPS][eta 00m:15s]
Jobs: 10 (f=10): [W(10)][78.3%][w=2000MiB/s][w=2 IOPS][eta 00m:13s]
Jobs: 10 (f=10): [W(10)][81.7%][w=1000MiB/s][w=1 IOPS][eta 00m:11s]
Jobs: 10 (f=10): [W(10)][85.0%][w=3003MiB/s][w=3 IOPS][eta 00m:09s]
Jobs: 10 (f=10): [W(10)][88.3%][eta 00m:07s]
Jobs: 10 (f=10): [W(10)][91.7%][w=7000MiB/s][w=7 IOPS][eta 00m:05s]
Jobs: 10 (f=10): [W(10)][95.0%][w=1001MiB/s][w=1 IOPS][eta 00m:03s]
Jobs: 10 (f=10): [W(10)][98.3%][w=5000MiB/s][w=5 IOPS][eta 00m:01s]
Jobs: 10 (f=10): [W(10)][100.0%][w=2000MiB/s][w=2 IOPS][eta 00m:00s]
Jobs: 2 (f=2): [f(2),_(8)][100.0%][w=9.77GiB/s][w=10 IOPS][eta 00m:00s]
testfile: (groupid=0, jobs=10): err= 0: pid=69013: Thu Jul 4 18:45:46 2024
write: IOPS=2, BW=2700MiB/s (2831MB/s)(161GiB/61119msec); 0 zone resets
slat (msec): min=954, max=7524, avg=3553.49, stdev=875.68
clat (usec): min=2, max=51049, avg=453.58, stdev=4013.11
lat (msec): min=954, max=7525, avg=3553.95, stdev=875.72
clat percentiles (usec):
| 1.00th=[ 3], 5.00th=[ 4], 10.00th=[ 5], 20.00th=[ 5],
| 30.00th=[ 6], 40.00th=[ 7], 50.00th=[ 7], 60.00th=[ 8],
| 70.00th=[ 12], 80.00th=[ 61], 90.00th=[ 297], 95.00th=[ 693],
| 99.00th=[ 6390], 99.50th=[51119], 99.90th=[51119], 99.95th=[51119],
| 99.99th=[51119]
bw ( MiB/s): min=20000, max=20004, per=100.00%, avg=20000.86, stdev= 0.50, samples=155
iops : min= 20, max= 20, avg=20.00, stdev= 0.00, samples=155
lat (usec) : 4=9.09%, 10=58.79%, 20=9.09%, 50=0.61%, 100=6.67%
lat (usec) : 250=4.24%, 500=5.45%, 750=1.21%, 1000=1.21%
lat (msec) : 2=1.21%, 4=0.61%, 10=1.21%, 100=0.61%
fsync/fdatasync/sync_file_range:
sync (msec): min=7, max=301, avg=116.38, stdev=63.11
sync percentiles (msec):
| 1.00th=[ 12], 5.00th=[ 31], 10.00th=[ 42], 20.00th=[ 59],
| 30.00th=[ 71], 40.00th=[ 87], 50.00th=[ 113], 60.00th=[ 133],
| 70.00th=[ 148], 80.00th=[ 169], 90.00th=[ 203], 95.00th=[ 234],
| 99.00th=[ 279], 99.50th=[ 292], 99.90th=[ 300], 99.95th=[ 300],
| 99.99th=[ 300]
cpu : usr=1.22%, sys=28.97%, ctx=1180399, majf=755122, minf=5863483
IO depths : 1=240.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,165,0,0 short=231,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=2700MiB/s (2831MB/s), 2700MiB/s-2700MiB/s (2831MB/s-2831MB/s), io=161GiB (173GB), run=61119-61119msec
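As a sanity check, the summary bandwidth above can be reproduced from the other numbers in the same report: 165 issued writes (the `issued rwts` line), a 1000 MiB block size, and a 61119 msec runtime. The following is a minimal sketch, assuming a POSIX shell with `awk` available; `bw_mib_s` is a hypothetical helper name, not part of fio.

```shell
#!/bin/sh
# Cross-check fio's reported bandwidth:
#   bandwidth (MiB/s) = issued writes * block size (MiB) / runtime (s)
# bw_mib_s is a hypothetical helper for illustration, not an fio tool.
bw_mib_s() {
    # $1 = number of issued writes
    # $2 = block size in MiB
    # $3 = runtime in milliseconds
    awk -v n="$1" -v bs="$2" -v ms="$3" \
        'BEGIN { printf "%.0f\n", n * bs / (ms / 1000) }'
}

# Values taken from the 60-second run above.
bw_mib_s 165 1000 61119   # prints 2700
```

The result matches the `bw=2700MiB/s` that fio reports, so the headline figure is the total data written divided by wall-clock time, not one of the momentary values from the progress lines.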
A second test, run for only 10 seconds, gave even better results:
Run status group 0 (all jobs):
WRITE: bw=2889MiB/s (3029MB/s), 2889MiB/s-2889MiB/s (3029MB/s-3029MB/s), io=30.3GiB (32.5GB), run=10732-10732msec
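To judge "full speed" against a network link, it helps to express fio's MiB/s figures in Gbit/s, the unit link rates are usually quoted in. The link speed itself is not stated in this section, so the sketch below only does the unit conversion; it assumes a POSIX shell with `awk`, and `mib_s_to_gbit_s` is a hypothetical helper name.

```shell
#!/bin/sh
# Convert MiB/s (as reported by fio) to Gbit/s (as used for link rates).
#   1 MiB = 1048576 bytes, 1 byte = 8 bits, 1 Gbit = 1000000000 bits
# mib_s_to_gbit_s is a hypothetical helper for illustration.
mib_s_to_gbit_s() {
    # $1 = bandwidth in MiB/s
    awk -v m="$1" 'BEGIN { printf "%.2f\n", m * 1048576 * 8 / 1000000000 }'
}

mib_s_to_gbit_s 2700   # 60-second run, prints 22.65
mib_s_to_gbit_s 2889   # 10-second run, prints 24.23
```

These figures are payload throughput as seen by fio; protocol overhead on the wire is not included, so the comparison with the nominal link rate is approximate.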