Hello,
I'm having trouble with an install of Proxmox where the majority of the VMs running on it are a combination of Windows servers and virtual workstations running Windows 10. This workload was originally running on Hyper-V, and according to the customer it is now operating up to 20x slower (their words). The difference between then and now is that under Hyper-V all of the VMs' data lived locally on the host the VM was running on. They were running on Dell R510 servers with PERC 700 RAID controllers and 12x 3TB 7200 RPM spinning SAS drives in a RAID 5 configuration.
The current setup is that the "compute" hardware is two HP DL380 Gen 9 servers with 128GB of RAM and a pair of mirrored 1TB hard disks for booting the OS (the servers came with the drives so we just used what we had - overkill, but whatever). The storage they are connected to is a TrueNAS machine running on one of the Dell R510s with a PERC H200 in IT mode, a ZFS RAIDZ1 pool of 10x 3TB drives, and a cache vdev (L2ARC) on a 120GB SSD. They are connected via HP 10GbE Ethernet adapters to an unmanaged 10GbE switch using Twinax cables, and the TrueNAS box has the same adapter as the HP machines. There should be plenty of bandwidth on the storage network for whatever we want to throw at it.
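At least, that's the theory - one thing I plan to do while I'm there is verify the 10GbE path actually delivers with a quick iperf3 run between one of the Proxmox hosts and the TrueNAS box. Rough sketch (the IP is made up, and iperf3 may need an apt install iperf3 on the Proxmox side):

On the TrueNAS box: iperf3 -s
On the Proxmox host: iperf3 -c 10.10.10.10 -t 30

That only proves raw network bandwidth, not storage latency, but it rules the switch/cabling in or out.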
The second HP "compute" server is identical to the first and is clustered with it. There is a separate 1GbE card for the cluster network, set up on a VLAN on their main Cisco switches, and no issues with communication there.
The shared storage is an NFS share served by TrueNAS over the 10GbE network, and all of the VM disks are in raw format.
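For what it's worth, while I'm on site I'll also grab the mount options Proxmox actually negotiated for that share, something along these lines (the storage name in /etc/pve/storage.cfg is whatever they called it, not something I know offhand):

pvesm status
findmnt -t nfs,nfs4
nfsstat -m

That should show the NFS version, rsize/wsize, and proto in use, in case anything there looks off.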
The way this customer works is that everyone connects to a virtual Windows 10 machine with four monitors and works that way all day. They can then remote in from home and get the same desktop they have at the office.
My gut feeling is that the disk channel is the bottleneck, but I'm not sure how to confirm that theory. The fact that they are 3TB 7200 RPM spinning drives, and I don't even think they are dual-port, makes me point the finger heavily in that direction. I also know that when I kick off a backup (using Proxmox Backup Server running on another Dell R510) over a 1GbE link (which I think is the same link as the management network), the whole system grinds to a halt.
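What I'm thinking of trying in order to confirm it: watch the pool from the TrueNAS side while users are working, and run a small random-write test against the NFS share from one of the Proxmox hosts. Something like this - pool and storage names are just placeholders for whatever theirs are actually called, and fio needs an apt install fio on the Proxmox host first:

On TrueNAS (per-vdev throughput and latency, 5-second intervals): zpool iostat -vl tank 5
On the Proxmox host, against the NFS mount (normally /mnt/pve/<storage name>):
fio --name=nfs-randwrite --directory=/mnt/pve/truenas-vmstore --ioengine=libaio --direct=1 --rw=randwrite --bs=4k --size=2G --numjobs=4 --iodepth=16 --runtime=60 --time_based --group_reporting

If the 4k random write IOPS come back in the low hundreds while zpool iostat shows the spinning drives pegged, that would pretty much confirm the disks are the bottleneck rather than the network.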
I have a similar install elsewhere that runs much like their Hyper-V setup did, with everything local on HP 10,000 RPM 6G dual-port SAS drives, and that system screams in comparison.
Any help would be greatly appreciated. I'm going on site tomorrow to see what I can do to speed things up, but I'm not sure where to go from here. I've thought about moving the owner's VM storage to one of the Proxmox servers' local disks and having him run that way as a test... I figure that would tell us something if we eliminate the shared storage and run on the faster local HP drives. A rough sketch of how I'd do that move is below.
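If I go that route, the plan would be to add a local LVM-thin or directory storage on the DL380 and move the disk over, roughly like this (VMID 101, the disk name, and the storage name are examples, not their actual config - qm config 101 would show the real disk name first):

qm config 101
qm disk move 101 scsi0 local-lvm

On older Proxmox versions the same thing is qm move_disk 101 scsi0 local-lvm. The original disk stays on the NFS storage unless it's explicitly deleted, so it's easy to move back after the test.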
Rich