Hello everyone. I am a bit lost at this stage, but maybe I am being too concerned about it as well. Your opinion is appreciated.
Server setup:
• DL380 Gen9 (141GB RAM - 2x Xeon E5-2620 v3) running TrueNAS Scale with a ZFS pool (4x 10TB disks in raidz1, plus a mirrored 1TB NVMe SLOG - see the pool layout check right after this list).
• DL360 Gen9 (128GB RAM - 2x Xeon E5-2620 v3) running PVE. PVE has 1x Samsung MZPLJ6T4HALA-00007 (6.4TB) as VM storage in a single-disk ZFS setup.
• Dell server running TrueNAS Scale, which receives replication from the DL380 for further data redundancy.
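In case the pool topology matters for the discussion, this is how I'd double-check it on the TrueNAS box; "tank" is just a placeholder for my real pool name:

# should list the 4-disk raidz1 data vdev plus the mirrored SLOG under "logs"
zpool status tank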
PVE (7.4.3) hosts:
Proxmox Backup Server (2.4.1) with its system drive located on that 6.4TB Samsung disk, while its datastore disks sit on TrueNAS Scale (PVE mounts the NFS share as storage, the virtual disks are created there and passed to the VM, so the VM is not aware they live on NFS) - 10 datastore VM disks in total.
See attached 110.txt for further details.
See attached storage.txt for further details.
The connection between Proxmox VE and TrueNAS Scale is direct (no switch), via 2x 40Gb QSFP ports in a balanced bond (80Gb total) over Direct Attach Copper cables. All network interfaces have MTU 9000 set, including inside the PBS VM.
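For anyone wanting to verify the jumbo frame part: sending a non-fragmentable ping just under the MTU from PVE (and from inside the PBS VM) confirms whether MTU 9000 really survives end to end. The IP below is a placeholder for the TrueNAS interface:

# 8972 = 9000 minus 28 bytes of IP/ICMP headers; -M do forbids fragmentation
ping -M do -s 8972 -c 4 192.168.x.x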
Issue:
After I replaced the old DL380 with the current DL360 server (fresh Proxmox installation, VMs restored from backups), I see around 30% IO delay on average on PVE while Proxmox Backup Server runs verify jobs, and around 6% IO delay while backups come in over WAN to the PBS VM (1 Gigabit fibre download). I don't remember seeing these delays before; they used to sit at 0.0%. Also, verify jobs now take hours where they used to finish in minutes.
I have a suspicion that NFS is to blame here, but I can't find anything wrong (see the Studies section).
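My next step to narrow this down (sharing in case it helps): watch per-device stats while a verify job runs, to see whether the wait sits on the NFS side or on local storage. Assuming sysstat and nfs-common are installed, something like:

# inside the PBS VM: per-disk latency and utilisation (watch await and %util)
iostat -x 5
# on the PVE host: per-NFS-mount ops/s and round trip times for the TrueNAS share
nfsiostat 5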
Concern:
Since I replaced the old DL380 with the current DL360, I am wondering whether the swap itself is the issue. I thought the difference between a DL380 and a DL360 was really just the chassis type and layout.
Studies:
iperf3 results between Proxmox and TrueNAS Scale average around 20 Gbit/s on a single connection. When I run 4 connections in one go I get around 17.5 Gbit/s per connection, so roughly 70 Gbit/s combined, meaning I am saturating that 80Gb bond well enough (I think, HAHA).
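The tests were nothing fancy, roughly this (with iperf3 -s running on the TrueNAS side; the IP is a placeholder):

# single stream
iperf3 -c 192.168.x.x
# 4 parallel streams; the SUM line at the end gives total throughput
iperf3 -c 192.168.x.x -P 4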
Proxmox Backup Client Benchmark on local LAN results:
Time per request: 9201 microseconds.
TLS speed: 455.83 MB/s
SHA256 speed: 344.63 MB/s
Compression speed: 341.65 MB/s
Decompress speed: 515.24 MB/s
AES256/GCM speed: 1037.65 MB/s
Verify speed: 183.61 MB/s
Proxmox Backup Client Benchmark over WAN results:
Time per request: 230241 microseconds.
TLS speed: 18.22 MB/s
SHA256 speed: 369.91 MB/s
Compression speed: 403.25 MB/s
Decompress speed: 604.02 MB/s
AES256/GCM speed: 1183.09 MB/s
Verify speed: 243.91 MB/s
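Both result sets come from the built-in benchmark; the invocation was roughly the following (the repository spec is a placeholder, not my real one):

# TLS speed is only measured when a repository is specified
proxmox-backup-client benchmark --repository backup@pbs@192.168.x.x:datastore1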
I've read here on the forum that PBS writes data in small chunks, so I thought that running an iSCSI disk from TrueNAS to PVE would improve IO, since iSCSI is known to be better for small read/write operations, but no joy, it's pretty much the same.
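To put an actual number on small-block performance over NFS, I'm planning a small random-read fio run straight against the NFS mount on the PVE host (the directory and sizes below are just examples, not my real layout):

# 4k random reads with direct IO to bypass the page cache; compare IOPS and completion latency
fio --name=nfs-randread --directory=/mnt/pve/truenas-nfs --rw=randread \
    --bs=4k --size=2G --numjobs=4 --iodepth=16 --ioengine=libaio \
    --direct=1 --runtime=30 --time_based --group_reporting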
I've also tried amending the NFS share options inside PVE (see storage.txt) to see if it would improve NFS performance; that didn't help either.
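One thing I still want to double-check here (noting it for completeness): the options PVE requests and the options the kernel actually negotiates can differ, so on the PVE host I'd confirm the effective rsize/wsize and NFS version with:

# shows each NFS mount with the options as actually mounted
nfsstat -m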
Notes:
The previous DL380 that the DL360 replaced had the same configuration; all components were moved to the new chassis. The only difference is that Proxmox was previously installed on a single NVMe drive, whereas it now sits on a mirrored pool of 2 SAS drives behind the RAID card. (From my internet research I convinced myself that Proxmox doesn't care what it is installed on and that it doesn't affect VM performance at all, as long as the VM disks are not stored on the RAID card, which they are not.)
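To rule the boot mirror in or out, I could also compare fsync rates on the root filesystem against the NVMe VM storage with the stock PVE tool (the /vmstorage path is only a placeholder for wherever the NVMe pool is mounted):

# reports CPU, buffered reads and FSYNCS/SECOND for the given path
pveperf /
pveperf /vmstorage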
Conclusion:
Either I do indeed have an issue somewhere, or I am wrong and these IO delays are normal for my setup.
I would appreciate any pointers to relevant topics on the internet, any user input, or even better an actual solution to my problem, hehe.