Speed test with the NFS RDMA setup above: full speed!
fio --name=testfile --directory=/clientfoldername --size=2G --numjobs=10 --rw=write --bs=1000M --ioengine=libaio --fdatasync=1 --runtime=60 --time_based --group_reporting --eta-newline=1s
testfile: (g=0): rw=write, bs=(R) 1000MiB-1000MiB, (W) 1000MiB-1000MiB, (T) 1000MiB-1000MiB, ioengine=libaio, iodepth=1
...
fio-3.33
Starting 10 processes
testfile: Laying out IO file (1 file / 2048MiB)
testfile: Laying out IO file (1 file / 2048MiB)
testfile: Laying out IO file (1 file / 2048MiB)
testfile: Laying out IO file (1 file / 2048MiB)
testfile: Laying out IO file (1 file / 2048MiB)
testfile: Laying out IO file (1 file / 2048MiB)
testfile: Laying out IO file (1 file / 2048MiB)
testfile: Laying out IO file (1 file / 2048MiB)
testfile: Laying out IO file (1 file / 2048MiB)
testfile: Laying out IO file (1 file / 2048MiB)
Jobs: 10 (f=10): [W(10)][4.9%][eta 00m:58s]
Jobs: 10 (f=10): [W(10)][6.6%][eta 00m:57s]
Jobs: 10 (f=10): [W(10)][8.3%][w=8008MiB/s][w=8 IOPS][eta 00m:55s]
Jobs: 10 (f=10): [W(10)][11.7%][eta 00m:53s]
Jobs: 10 (f=10): [W(10)][15.0%][w=1001MiB/s][w=1 IOPS][eta 00m:51s]
Jobs: 10 (f=10): [W(10)][18.3%][eta 00m:49s]
Jobs: 10 (f=10): [W(10)][21.7%][eta 00m:47s]                       
Jobs: 10 (f=10): [W(10)][25.0%][w=3000MiB/s][w=3 IOPS][eta 00m:45s]
Jobs: 10 (f=10): [W(10)][28.3%][eta 00m:43s]                      
Jobs: 10 (f=10): [W(10)][31.7%][w=1000MiB/s][w=1 IOPS][eta 00m:41s]
Jobs: 10 (f=10): [W(10)][35.0%][w=1000MiB/s][w=1 IOPS][eta 00m:39s]
Jobs: 10 (f=10): [W(10)][39.0%][w=1001MiB/s][w=1 IOPS][eta 00m:36s]
Jobs: 10 (f=10): [W(10)][41.7%][w=9009MiB/s][w=9 IOPS][eta 00m:35s]
Jobs: 10 (f=10): [W(10)][45.0%][w=1000MiB/s][w=1 IOPS][eta 00m:33s]
Jobs: 10 (f=10): [W(10)][48.3%][w=7000MiB/s][w=7 IOPS][eta 00m:31s]
Jobs: 10 (f=10): [W(10)][52.5%][eta 00m:28s]                      
Jobs: 10 (f=10): [W(10)][55.9%][w=4000MiB/s][w=4 IOPS][eta 00m:26s]
Jobs: 10 (f=10): [W(10)][59.3%][eta 00m:24s]                      
Jobs: 10 (f=10): [W(10)][61.7%][w=2002MiB/s][w=2 IOPS][eta 00m:23s]
Jobs: 10 (f=10): [W(10)][65.0%][w=6000MiB/s][w=6 IOPS][eta 00m:21s]
Jobs: 10 (f=10): [W(10)][68.3%][eta 00m:19s]                      
Jobs: 10 (f=10): [W(10)][71.7%][w=3003MiB/s][w=3 IOPS][eta 00m:17s]
Jobs: 10 (f=10): [W(10)][75.0%][w=5000MiB/s][w=5 IOPS][eta 00m:15s]
Jobs: 10 (f=10): [W(10)][78.3%][w=2000MiB/s][w=2 IOPS][eta 00m:13s]
Jobs: 10 (f=10): [W(10)][81.7%][w=1000MiB/s][w=1 IOPS][eta 00m:11s]
Jobs: 10 (f=10): [W(10)][85.0%][w=3003MiB/s][w=3 IOPS][eta 00m:09s]
Jobs: 10 (f=10): [W(10)][88.3%][eta 00m:07s]                      
Jobs: 10 (f=10): [W(10)][91.7%][w=7000MiB/s][w=7 IOPS][eta 00m:05s]
Jobs: 10 (f=10): [W(10)][95.0%][w=1001MiB/s][w=1 IOPS][eta 00m:03s]
Jobs: 10 (f=10): [W(10)][98.3%][w=5000MiB/s][w=5 IOPS][eta 00m:01s]
Jobs: 10 (f=10): [W(10)][100.0%][w=2000MiB/s][w=2 IOPS][eta 00m:00s]
Jobs: 2 (f=2): [f(2),_(8)][100.0%][w=9.77GiB/s][w=10 IOPS][eta 00m:00s]
testfile: (groupid=0, jobs=10): err= 0: pid=69013: Thu Jul  4 18:45:46 2024
  write: IOPS=2, BW=2700MiB/s (2831MB/s)(161GiB/61119msec); 0 zone resets
    slat (msec): min=954, max=7524, avg=3553.49, stdev=875.68
    clat (usec): min=2, max=51049, avg=453.58, stdev=4013.11
     lat (msec): min=954, max=7525, avg=3553.95, stdev=875.72
    clat percentiles (usec):
     |  1.00th=[    3],  5.00th=[    4], 10.00th=[    5], 20.00th=[    5],
     | 30.00th=[    6], 40.00th=[    7], 50.00th=[    7], 60.00th=[    8],
     | 70.00th=[   12], 80.00th=[   61], 90.00th=[  297], 95.00th=[  693],
     | 99.00th=[ 6390], 99.50th=[51119], 99.90th=[51119], 99.95th=[51119],
     | 99.99th=[51119]
   bw (  MiB/s): min=20000, max=20004, per=100.00%, avg=20000.86, stdev= 0.50, samples=155
   iops        : min=   20, max=   20, avg=20.00, stdev= 0.00, samples=155
  lat (usec)   : 4=9.09%, 10=58.79%, 20=9.09%, 50=0.61%, 100=6.67%
  lat (usec)   : 250=4.24%, 500=5.45%, 750=1.21%, 1000=1.21%
  lat (msec)   : 2=1.21%, 4=0.61%, 10=1.21%, 100=0.61%
  fsync/fdatasync/sync_file_range:
    sync (msec): min=7, max=301, avg=116.38, stdev=63.11
    sync percentiles (msec):
     |  1.00th=[   12],  5.00th=[   31], 10.00th=[   42], 20.00th=[   59],
     | 30.00th=[   71], 40.00th=[   87], 50.00th=[  113], 60.00th=[  133],
     | 70.00th=[  148], 80.00th=[  169], 90.00th=[  203], 95.00th=[  234],
     | 99.00th=[  279], 99.50th=[  292], 99.90th=[  300], 99.95th=[  300],
     | 99.99th=[  300]
  cpu          : usr=1.22%, sys=28.97%, ctx=1180399, majf=755122, minf=5863483
  IO depths    : 1=240.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,165,0,0 short=231,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
  WRITE: bw=2700MiB/s (2831MB/s), 2700MiB/s-2700MiB/s (2831MB/s-2831MB/s), io=161GiB (173GB), run=61119-61119msec
A second test with a duration of only 10 seconds gave even better results:
Run status group 0 (all jobs):
  WRITE: bw=2889MiB/s (3029MB/s), 2889MiB/s-2889MiB/s (3029MB/s-3029MB/s), io=30.3GiB (32.5GB), run=10732-10732msec
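
As a rough cross-check of the summary lines above, the aggregate bandwidth can be recomputed from the number of completed writes, the block size and the runtime. This is only a sketch: the function name `summarize` is made up for illustration, the 165 writes come from the "issued rwts" line of the first run, and the 31 writes for the second run are an assumption inferred from io=30.3GiB at a 1000MiB block size (that run's full output is not shown). The Gbit/s figure is a plain unit conversion, handy for comparing against the link speed.

```python
MIB = 1024 ** 2  # bytes in one MiB


def summarize(ios: int, block_mib: int, runtime_ms: int) -> dict:
    """Derive aggregate throughput from completed writes, block size and runtime."""
    total_bytes = ios * block_mib * MIB
    seconds = runtime_ms / 1000
    bytes_per_s = total_bytes / seconds
    return {
        "MiB/s": bytes_per_s / MIB,        # binary units, as in fio's "bw=" field
        "MB/s": bytes_per_s / 1e6,         # decimal units, fio's value in parentheses
        "Gbit/s": bytes_per_s * 8 / 1e9,   # decimal gigabits per second
    }


# First run: 165 writes of 1000 MiB in 61119 ms (values taken from the output above).
first = summarize(ios=165, block_mib=1000, runtime_ms=61119)
# Second run: 31 writes is an inferred value (assumption), runtime 10732 ms.
second = summarize(ios=31, block_mib=1000, runtime_ms=10732)

for name, r in (("60 s run", first), ("10 s run", second)):
    print(f"{name}: {r['MiB/s']:.0f} MiB/s, {r['MB/s']:.0f} MB/s, {r['Gbit/s']:.1f} Gbit/s")
```

The recomputed MiB/s and MB/s values should line up with the bw= figures fio reports in both "Run status" blocks.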