Slow VM disks over NFS shares

GLaDOS

New Member
Sep 30, 2024
Hey guys, first time posting here, so forgive me if I skip any important details. Also, English is not my first language, so apologies in advance.

In the past I had 2 PVE nodes and 1 NAS running Ubuntu 22.04 hosting 2 NFS shares (1x RAIDz5 and 1x 500GB NVMe). Both PVE nodes were interconnected via 10Gb cards, while the NAS still had a 1Gb NIC. Most of my VMs were stored on that NFS share (the NVMe one) and were running fine over 1Gb.

Yesterday I upgraded the NAS to a new platform and switched it from Ubuntu to PVE to take advantage of the new hardware (the old NAS was ancient), and I've added a 10Gb card to the NAS, trying to improve the speeds even further.

Since then, I've been getting horrible VM speeds and general unresponsiveness. I launched a W10 VM and it took about 7 minutes to start, when it should normally take 1 minute at most.

I've checked connectivity between nodes with iperf and it all looks fine to me:

Code:
root@pveprod01:~# iperf -e -t 30 -i 3 -c 192.168.5.3
------------------------------------------------------------
Client connecting to 192.168.5.3, TCP port 5001 with pid 131785 (1 flows)
Write buffer size: 131072 Byte
TOS set to 0x0 (Nagle on)
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.5.1%cluster1 port 38158 connected with 192.168.5.3 port 5001 (sock=3) (icwnd/mss/irtt=87/8948/189) (ct=0.22 ms) on 2024-11-10 00:17:17 (CET)
[ ID] Interval            Transfer    Bandwidth       Write/Err  Rtry     Cwnd/RTT(var)        NetPwr
[  1] 0.0000-3.0000 sec  2.78 GBytes  7.95 Gbits/sec  22746/0         91      795K/806(256) us  1232987
[  1] 3.0000-6.0000 sec  2.85 GBytes  8.16 Gbits/sec  23337/0         45     1371K/1257(313) us  811145
[  1] 6.0000-9.0000 sec  1.93 GBytes  5.51 Gbits/sec  15771/0         16     1494K/2190(373) us  314633
[  1] 9.0000-12.0000 sec  1.60 GBytes  4.58 Gbits/sec  13112/0          0     1607K/1614(243) us  354939
[  1] 12.0000-15.0000 sec  1.65 GBytes  4.72 Gbits/sec  13498/0          0     1669K/2252(894) us  261872
^C[  1] 15.0000-16.3698 sec  1.13 GBytes  7.08 Gbits/sec  9248/0          5     1328K/1047(108) us  845196
[  1] 0.0000-16.3698 sec  11.9 GBytes  6.26 Gbits/sec  97712/0        157     1328K/1047(108) us  747254

Code:
root@pveprod02:~# iperf -e -t 30 -i 1 -c 192.168.5.3
------------------------------------------------------------
Client connecting to 192.168.5.3, TCP port 5001 with pid 75395 (1 flows)
Write buffer size: 131072 Byte
TOS set to 0x0 (Nagle on)
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.5.2%cluster1 port 43322 connected with 192.168.5.3 port 5001 (sock=3) (icwnd/mss/irtt=87/8948/1446) (ct=1.47 ms) on 2024-11-10 00:17:24 (CET)
[ ID] Interval            Transfer    Bandwidth       Write/Err  Rtry     Cwnd/RTT(var)        NetPwr
[  1] 0.0000-1.0000 sec   510 MBytes  4.28 Gbits/sec  4078/0          0     3469K/6645(282) us  80438
[  1] 1.0000-2.0000 sec   511 MBytes  4.28 Gbits/sec  4085/0          0     3469K/6417(236) us  83439
[  1] 2.0000-3.0000 sec   500 MBytes  4.20 Gbits/sec  4001/0          0     3469K/6373(272) us  82288
[  1] 3.0000-4.0000 sec   494 MBytes  4.15 Gbits/sec  3956/0          0     3469K/5661(104) us  91595
[  1] 4.0000-5.0000 sec   486 MBytes  4.08 Gbits/sec  3889/0          2     2656K/5189(150) us  98235
[  1] 5.0000-6.0000 sec   477 MBytes  4.00 Gbits/sec  3817/0          0     3303K/5386(169) us  92889
[  1] 6.0000-7.0000 sec   478 MBytes  4.01 Gbits/sec  3821/0          0     3320K/5295(226) us  94585
[  1] 7.0000-8.0000 sec   477 MBytes  4.00 Gbits/sec  3816/0          0     3320K/5563(120) us  89910
^C[  1] 8.0000-8.6933 sec   324 MBytes  3.92 Gbits/sec  2589/0          0     3320K/6914(322) us  70798
[  1] 0.0000-8.6933 sec  4.16 GBytes  4.11 Gbits/sec  34052/0          2     3320K/6914(322) us  74258
root@pveprod02:~#

Note: these measurements were taken with both nodes hitting the NAS at the same time.
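Since iperf only exercises the network path, I also want to measure writes against the NFS mount itself to separate storage from network (the mount point below is a placeholder; on a PVE node an NFS storage normally lives under /mnt/pve/<storage-id>):

```shell
# Quick sequential-write check against the NFS-backed storage.
# Set TARGET_DIR to the NFS mount point on the PVE node
# (e.g. /mnt/pve/<storage-id>); /tmp is only a safe fallback here.
TARGET_DIR="${TARGET_DIR:-/tmp}"
TESTFILE="$TARGET_DIR/nfs-throughput-test.bin"

# conv=fsync makes dd wait until the data is committed on the server,
# so NFS client-side caching does not inflate the reported speed.
dd if=/dev/zero of="$TESTFILE" bs=1M count=256 conv=fsync
rm -f "$TESTFILE"
```

If this number is far below what iperf shows, the bottleneck is in the NFS/storage layer rather than the network.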

What am I missing? I'm at a loss here.
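One thing I still plan to rule out is the mount options themselves, since a sync export or small rsize/wsize can tank VM I/O even when iperf looks fine (just a generic check, nothing specific to my storage names):

```shell
# List the options each NFS share is currently mounted with on this node.
# rsize/wsize, the NFS version and sync/async all directly affect VM disk
# latency; large rsize/wsize over proto=tcp is what you want on a 10Gb link.
grep -E '\snfs4?\s' /proc/mounts || echo "no NFS mounts on this node"
```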

Thanks,
Iyán

Edit: I also tried stressing the network and the NVMe at the same time to check that there weren't any weird shenanigans with the PCIe lanes:

Attachments

  • 2024-11-10 01_17_18-pveprod02.glados.es - PuTTY.png (130 KB)