Disk performance doubts

virusbcn

New Member
Mar 5, 2024
Hi all, I am looking to move my virtual machines from VMware to Proxmox. I finally decided and bought a second-hand HP DL360 G10 with an HBA E208 controller, two mirrored SSDs for Proxmox itself, and four 2 TB Samsung 870 SSDs for VM storage. The old VMware server is an HP DL360 G9 with the same four 2 TB Samsung 870 disks behind an HP RAID controller, and disk performance there is much better. I thought four disks under ZFS in Proxmox would give me far better performance, but in benchmarks I only get about 2x the performance of a single 2 TB 870 SSD, roughly 1100 MB/s, while the old VMware box gives me about 6000 MB/s. It is true that the old controller has 2 GB of RAM that distorts the benchmark numbers a little, but that 2 GB of RAID cache really does help the VMs run well, and I don't see any latency increase on the VM datastore in my old VMware setup. I'm afraid of running into performance problems once I migrate my VMs to Proxmox.
On Proxmox I have benchmarked the four 2 TB SSDs with RAIDZ1, with LVM, and with every other option, and apart from small variations I always get about the same sequential write/read figures, around 1000/1200 MB/s.

If I cannot increase performance, I see two options. The safest would be to do the same as before: buy an HP RAID controller with cache RAM, create a RAID5 on the controller, and pass that storage to Proxmox as a single unit. The other would be to try an NVMe card with four NVMe disks and pass those through to Proxmox.

The downside I see with the HP RAID controller option is that Proxmox would lose direct access to the disks, so I could not see the status of the SSDs, for example. Or do you know of any way to do that?

What do you recommend?
 
Hello,

It’s not unusual that your old RAID controller shows much higher throughput, since its cache memory heavily influences benchmark results. Many hardware RAID controllers “enhance” performance figures by writing to their built-in cache first, which can significantly boost sequential read/write rates. However, that doesn’t necessarily reflect real-world performance under typical workloads.

A few key points to consider:


1. ZFS with an HBA vs. Hardware RAID


• Using an HBA (in IT or JBOD mode) with ZFS gives you full visibility of each individual SSD. This allows you to read SMART data, monitor SSD health, and detect issues early.


• With a hardware RAID controller, Proxmox only sees the virtual drive presented by the RAID controller (e.g., RAID5). Direct monitoring of SSDs is then only possible through the controller’s management tools, assuming they’re supported on your system.
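
For example, with the disks on the HBA you can query each SSD directly from the Proxmox host (the device name below is just an example, adjust it to your system):

# list the physical disks the host can see
lsblk -o NAME,MODEL,SERIAL,SIZE

# full SMART health report and attributes for one of the 870s
smartctl -a /dev/sda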


2. Performance with ZFS


• RAIDZ1 (similar to RAID5) is space-efficient, but for VM workloads on SSDs, you often see better performance from a mirrored setup (e.g., a RAID10-like configuration in ZFS). This is especially true for random I/O.


• Pure sequential write/read benchmarks don’t always represent actual VM workloads, where IOPS and latency are more critical. You may find latency is more important than raw throughput in day-to-day operations.


• If sequential throughput is still your priority, you might try RAID10 in ZFS (two mirrored pairs in a stripe), which sacrifices storage capacity but often yields higher I/O rates.
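
As a rough sketch, a RAID10-like pool over the four 870s would look like this (the pool name and disk IDs are placeholders; using /dev/disk/by-id paths is recommended):

# two mirrored pairs striped together ("ZFS RAID10")
zpool create -o ashift=12 vmpool \
  mirror /dev/disk/by-id/ata-Samsung_SSD_870_A /dev/disk/by-id/ata-Samsung_SSD_870_B \
  mirror /dev/disk/by-id/ata-Samsung_SSD_870_C /dev/disk/by-id/ata-Samsung_SSD_870_D

# verify the layout
zpool status vmpool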


3. NVMe as an Option


• NVMe SSDs generally provide much higher I/O performance than SATA SSDs, particularly for random reads/writes. If you need maximum performance for multiple VMs and don’t have massive capacity requirements, NVMe could be a good choice.


• On the other hand, four NVMe drives might be more expensive than four SATA SSDs. Weigh whether you actually need that extra performance or if your bottleneck might lie elsewhere (CPU, RAM, network, etc.).


4. ZFS Cache and Logs


• If you stick with ZFS and your workloads perform a lot of synchronous writes (e.g., databases), adding a dedicated SLOG device (like a small, very fast NVMe SSD) can significantly reduce latency.


• L2ARC (an additional read cache on an SSD or NVMe) can help when you have large datasets that are frequently accessed.
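
Adding either later is straightforward; a sketch, assuming the pool is called vmpool as in the example above and with placeholder device names (a SLOG only needs a few GB):

# add a small, fast, power-loss-protected device as SLOG
zpool add vmpool log /dev/disk/by-id/nvme-FAST_SSD-part1

# add another device or partition as L2ARC read cache
zpool add vmpool cache /dev/disk/by-id/nvme-FAST_SSD-part2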


5. Real-World Testing vs. Benchmark Numbers


• Seeing 6000 MB/s vs. 1100 MB/s in benchmarks looks dramatic, but it really depends on your workload (random vs. sequential I/O, block sizes, how many VMs are accessing data simultaneously, etc.).


• I recommend setting up a test workload on Proxmox to see how latency and IOPS behave in practice. If everything runs smoothly, the lower “paper” benchmark numbers may not be as big of a concern.
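
A quick way to look at IOPS and latency instead of sequential MB/s is fio with a small random 4k workload against a scratch file on the pool (the path and sizes are placeholders; don't point it at data you care about):

# 4k random writes for 60 seconds; watch the IOPS and latency lines in the output
fio --name=randwrite --filename=/vmpool/fio-test --size=10G \
    --rw=randwrite --bs=4k --iodepth=32 --numjobs=4 \
    --ioengine=libaio --direct=1 --runtime=60 --time_based --group_reporting
# note: depending on the ZFS version, direct=1 may not fully bypass the ARC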





Overall recommendation: if monitoring and transparency for your SSDs are important, you might prefer sticking with an HBA and letting ZFS manage your drives directly. If your current RAIDZ1 setup doesn’t deliver the speed you want, consider switching to a mirrored setup (ZFS RAID10) or adding a fast SLOG device for synchronous writes.

If you absolutely need the high sequential performance figures (and are comfortable losing direct SSD monitoring), then returning to a hardware RAID controller is an option. However, many in the Proxmox community have found that ZFS with an HBA is very reliable in real-world usage, and you get the added benefit of simpler monitoring.

In short:


HBA + ZFS = better monitoring, generally solid real-world performance, fewer “inflated” benchmark figures.


Hardware RAID = higher sequential benchmark scores (thanks to the cache), but limited monitoring and no true ZFS control over individual SSDs.


NVMe = significantly faster (especially for random I/O), but also more expensive and typically offering less total capacity.

Good luck with your decision!
 
  • Like
Reactions: virusbcn and UdoB
If your SSDs don't have power loss protection (PLP), write performance with ZFS will be worse, because writes can't be cached without risking data safety.
Samsung 870 SSDs are consumer grade and don't have PLP, so ZFS on these SSDs is not really recommended. Their TBW rating is also low, and with ZFS write amplification your SSDs could die early...
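
You can keep an eye on that yourself: smartctl exposes the wear-related counters on these drives (device name is an example):

# wear level and total data written on a Samsung 870
smartctl -A /dev/sda | grep -Ei 'wear_leveling|total_lbas_written'
# on Samsung drives, Total_LBAs_Written x 512 bytes is roughly the data written so far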
 
Thanks to both of you. I forgot to mention that I also tested RAID10 with the HBA controller in Proxmox, with the same sequential performance, about 1100 MB/s. The server will sit in a datacenter with guaranteed, dual-feed power, so I don't think it will ever lose power; at least it hasn't for the last xx years.

Although I'm aware that the HP RAID controller distorts the numbers because of its RAM, it's also true that you really do get a lot more performance out of it. I know the Samsung 870s are not the professional range, but after 5 years I've pulled them out of the VMware servers with the RAID controller and they still have a lot of life left, 60/70/70/80% remaining. I don't know to what extent ZFS amplifies writes to the disks, but maybe that is another point in favour of the RAID controller. Do you know of any HP driver or software for Proxmox that lets you see drive life and so on, like ssacli in VMware?

I have a lot of free RAM, but I don't see Proxmox using it. I've tried adjusting a few of the usual tweaks but haven't got a big improvement in performance.
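
For reference, this is roughly how to check whether the ZFS ARC is actually using that RAM, and how to raise its limit if it has been capped (the 16 GiB figure is only an example):

# current ARC size and maximum
arc_summary | head -n 30
grep -E '^(size|c_max)' /proc/spl/kstat/zfs/arcstats

# to raise the limit, add a line like this to /etc/modprobe.d/zfs.conf,
# then run "update-initramfs -u" and reboot:
# options zfs zfs_arc_max=17179869184   (16 GiB)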
 
Last edited:
Surprise! I installed an NVMe drive, created an LVM disk, and got the same speed with CrystalDiskMark on Windows as I did with the 2 TB 870 SSDs in the RAID10 and RAIDZ setups with ZFS encryption: sequential speeds of approximately 1100/1000 MB/s in both configurations.

In my home test server, an older i7-7700 paired with a 980 2TB NVMe drive (using ZFS), I get impressive sustained read/write speeds of around 6000 MB/s thanks to ZFS caching. However, the DL360 G10 with a Xeon Gold 6138 and a single NVMe SSD only reaches approximately ±1000 MB/s.

Any ideas for further investigation?
 
Last edited:
Well, I've found the reason for the low performance and I'm banging my head against the wall ...:mad:

The reason was that I was creating the VM disk as IDE rather than SCSI or VirtIO. Now I get 10 GB/s write/read, admittedly thanks to the Proxmox cache, but the performance difference is huge.

Now I just have to work out whether these 2 TB Samsung 870 disks will die too soon under ZFS. In the old VMware + HP RAID controller servers, the drives I have pulled were at 50/60/70% remaining life after 5-6 years... does ZFS really write that much more to the disks??? And I don't understand: if these disks have their own RAM, how does the PLP battery change how much gets written?
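
For anyone who hits the same thing, the bus can also be changed on an existing VM from the CLI; a sketch with made-up IDs (VM 100, storage local-zfs), and the guest needs the VirtIO/SCSI drivers installed first:

qm set 100 --scsihw virtio-scsi-single      # switch to the VirtIO SCSI controller
qm set 100 --delete ide0                    # detach the IDE disk (it becomes unused0)
qm set 100 --scsi0 local-zfs:vm-100-disk-0  # re-attach the same disk image as scsi0
qm set 100 --boot order=scsi0               # keep booting from that disk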
 
  • Like
Reactions: _gabriel
Well, I've found the reason for the low performance and I'm banging my head against the wall ...:mad:

The reason was that I was creating the VM disk as IDE rather than SCSI or VirtIO. Now I get 10 GB/s write/read, admittedly thanks to the Proxmox cache, but the performance difference is huge.

Now I just have to work out whether these 2 TB Samsung 870 disks will die too soon under ZFS. In the old VMware + HP RAID controller servers, the drives I have pulled were at 50/60/70% remaining life after 5-6 years... does ZFS really write that much more to the disks??? And I don't understand: if these disks have their own RAM, how does the PLP battery change how much gets written?

PLP does not change how much is written; it lets the SSD tell the filesystem that data has been written successfully earlier, while in reality the data is still sitting in the SSD's cache and has not been written to flash yet.

Without PLP, an SSD has to write the data to flash and only then tell the filesystem that the data is written (even if it has a cache, because without PLP a power loss means any data still in the cache is lost, since it has not reached flash yet).

It's mostly about write latency: the filesystem gets the write acknowledged by the SSD sooner, and that is only possible when the SSD's faster cache is protected against power loss (capacitors or battery backup).
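
This is easy to see with a small synchronous-write test; the interesting number is the completion/fsync latency rather than the MB/s (the file path is a placeholder):

# 4k writes with an fsync after every write, queue depth 1
fio --name=synctest --filename=/vmpool/sync-test --size=1G \
    --rw=randwrite --bs=4k --iodepth=1 --numjobs=1 \
    --ioengine=psync --fsync=1 --runtime=30 --time_based
# consumer SSDs without PLP typically show far higher latencies here than datacenter SSDs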

I'm not an expert on ZFS, but how much more ZFS writes compared to other filesystems like ext4 depends on multiple factors (RAID level, compression, the parameters used when creating the pool, ...).

I'm just guessing here, but perhaps ZFS writes about 1.3 to 1.6 times more data than ext4? Maybe someone with more insight can give better numbers.
Either way, it means your SSDs will die sooner because they reach their TBW value faster.
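
As a rough back-of-the-envelope estimate (the numbers are assumptions, not measurements): a 2 TB 870 EVO is commonly rated at around 1200 TBW (the QVO is lower). If the VMs write, say, 100 GB a day and ZFS amplifies that by 1.5x, that is 150 GB/day, or roughly 55 TB a year, so the rating would last on the order of 20 years; at 1 TB of guest writes a day the same maths gives only about two years. Watching Total_LBAs_Written over a few weeks is the easiest way to see which end of that range you are really on.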
 
Last edited:
  • Like
Reactions: virusbcn