I suspect the answer here will be consumer SSD / read the FAQ - but I'm trying to understand if there is anything else I am missing. I'm curious as all four of these SSDs were running on Proxmox 6 in a cluster on different hosts without issue. When I built these two new-to-me servers the drives were wiped and re-initialized.
I built two similar servers. HP DL380P G8's with the following specs:
Server 1:
- Dual Xeon E5-2643 v2
- 384 GB RAM
- Integrated P420i in HBA mode
- Dual Kingston SA400S3 SSDs (ZFS01 and ZFS02)
Server 2:
- Dual Xeon E5-2690
- 384 GB RAM
- H220 HBA in PCI-e x8
- One Kingston SA400S3 SSD (ZFS01)
- One Kingston SUV400S3 SSD (ZFS02)
When Server 2 is "idle" (~10% CPU usage) it's fine. Once it starts doing real work, the IO delay ramps up in parallel with the CPU usage and often exceeds it: IO delay gets into the 20% range, and I've seen it over 50%.
Even when "idle" it is running an NVR VM (Windows 10) that records a 4K camera continuously to the SUV400S3 drive, which is attached as a VirtIO SCSI disk with Write Back cache. I would assume that is "high", or at least constant, IO, so you'd think it would cause IO delay all the time.
I've tried tuning ZFS based on recommendations on this forum and my own research, but to be honest I'm in over my head at this point. Server 1 seems "fine", but I haven't paid that close attention to it: I've been focused on troubleshooting Server 2 before I do anything that takes Server 1 offline.
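To give an idea of the kind of tuning I mean: one forum suggestion I tried was capping the ARC so it doesn't compete with VM RAM (the sizes below are just the values I experimented with, not a recommendation):

```shell
# /etc/modprobe.d/zfs.conf
# zfs_arc_max / zfs_arc_min are in bytes
options zfs zfs_arc_max=17179869184   # 16 GiB cap
options zfs zfs_arc_min=4294967296    # 4 GiB floor
```

followed by `update-initramfs -u` and a reboot to apply. It didn't noticeably change the IO delay behavior.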
These are currently running as independent hosts. I previously had them configured in a cluster, and when I tried to migrate a VM from Server 1 to Server 2, the IO delay on Server 2 caused the cluster sync to die and made the Server 2 web GUI borderline unresponsive while the migration was underway.
Thoughts as to what might be causing the IO delay issue on Server 2? Anything I can do to test, troubleshoot, etc.?
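Happy to collect more data. Here is what I can run while the load is happening and post output from (assuming the pool name ZFS02 and that sysstat/smartmontools are installed; device name is illustrative):

```shell
# Per-device latency and utilization while the NVR is writing
iostat -x 2

# Per-vdev average latency from ZFS's side
zpool iostat -vl ZFS02 2

# SSD health/wear (adjust device to match)
smartctl -a /dev/sdb

# How much time tasks are stalled waiting on IO
cat /proc/pressure/io
```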