ZFS Pool Seemingly Slow on Striped Mirror With SATA SSDs

dizzydre21
Apr 10, 2023
Hello,

I have been running Proxmox for a month or so in a homelab-type environment. It runs on the W680 chipset with an i5-13500 and 64GB of RAM. There are two ZFS striped mirror pools, Tank and Tank2, where I store the VMs I'm running (4x 500GB 870 EVO and 4x 250GB MX500, respectively). I don't know if this is a sensible way to store and run VMs, and I am aware these are not enterprise-level SSDs. They are also all on motherboard SATA ports. All VMs are currently Ubuntu Server, apart from TrueNAS, which gets HDDs passed through via an LSI 9211-8i. I set up TrueNAS with a striped mirror and appreciated the read and write speeds it gave me, so I did the same for my VMs. When creating the VM pools in Proxmox, I left ashift at 12 and set compression to lz4. To be clear, the VM pools were created in Proxmox and the TrueNAS pool was created within TrueNAS.
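
For reference, I believe what Proxmox does under the hood when creating such a pool is roughly equivalent to something like this (device names here are placeholders, not my actual disks):

# device names below are placeholders, not my real drives
zpool create -o ashift=12 -O compression=lz4 Tank2 mirror /dev/disk/by-id/ata-SSD1 /dev/disk/by-id/ata-SSD2 mirror /dev/disk/by-id/ata-SSD3 /dev/disk/by-id/ata-SSD4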

The issue I am seeing is with a VM on the Crucial MX500 drives. I first noticed it with an Ubuntu VM running SABnzbd in a Docker container. During downloads, the speed fluctuates all over the place. I have 1 Gbps up/down from my ISP and this VM runs through a VPN. Speeds will go from about 95 MB/s down to basically nothing and then back up, repeating over and over. I've tried a number of things and do not believe this is an issue with my network connection or anything like that.

If I move the VM in question from Tank2 to Tank, my downloads max out my network connection (accounting for VPN overhead) and barely fluctuate at all. This is with identical VM parameters and the same network setup.

I have used dd to test disk speeds within the VM, and they also seem a little slow when I do a larger transfer test; some runs were as low as 100 MB/s. If I run consecutive dd commands, some speeds look closer to what I would expect. To be honest, I don't know if I am running dd correctly. My understanding is that the RAM cache makes transfers look really fast until it needs to flush to disk.
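
The dd invocation I've been using looks roughly like this (file name and size are just placeholders); I assumed conv=fdatasync would force a flush so the page cache doesn't inflate the result:

# test file name and size are placeholders
dd if=/dev/zero of=./ddtest.bin bs=1M count=4096 conv=fdatasync status=progress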

Does anyone have suggestions for troubleshooting?
 
I have used dd to actually test disk speeds within the VM and they seem a little slow too if I do a larger transfer test.
dd doesn't really work as a benchmark tool, especially when not reading from /dev/urandom: with lz4 enabled, a stream of zeros from /dev/zero compresses down to almost nothing, so you end up measuring compression rather than the disks. Try fio instead.
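
For example, something along these lines inside the VM gives a much better picture of sustained 4k random write performance (path, size and runtime are just examples, adjust them to your setup):

# path, size and runtime are examples, adjust to your setup
fio --name=randwrite --filename=/path/to/fiotest --size=4G --rw=randwrite --bs=4k --ioengine=libaio --iodepth=32 --direct=1 --runtime=60 --time_based --group_reporting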

where I store the VMs I'm running (4x500GB 870 Evo, 4x250GB MX500 respectively).
At least they should all be TLC SSDs; with QLC SSDs it would be even worse. But yes, don't expect good performance when continuously writing to them. They are only fast when doing async writes, and only as long as the DRAM and SLC caches haven't filled up.
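
You can see that cache running out for yourself with a sequential write that is larger than the SLC cache, for example (size and path are again just examples):

# pick a size larger than the drive's SLC cache
fio --name=seqwrite --filename=/path/to/fiotest --size=20G --rw=write --bs=1M --ioengine=libaio --iodepth=8 --direct=1 --group_reporting

Throughput typically starts high and then drops sharply once the cache is exhausted.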

Also they are all on motherboard SATA ports.
This can also be a bottleneck. All onboard SATA ports, NICs, the sound card, all USB ports and so on share the same few PCIe lanes that connect the CPU to the chipset. But I guess this isn't a big problem here, as the W680 uses DMI 4.0, which should provide plenty of bandwidth.
 
I will test with fio later this morning. Thanks for the suggestion.

How come I don't see the speed dropouts with the Samsung SSDs, though? I guess that is my real question. I think both models are supposed to be similar in performance. Even so, we're not talking huge speeds. Shouldn't under 100 MB/s be sustainable indefinitely, at least for transfers that aren't just a bunch of tiny files?

Lastly, what exactly is the issue with using the TLC drives? I have not looked into this much, but will do so today as well.
 
There are multiple technologies/grades of NAND flash. From fastest and most durable but smallest, to slowest and least durable but biggest:
SLC > eMLC > MLC > TLC > QLC
The only NAND flash you get these days is TLC or QLC (Intel Optane SSDs are the odd ones out, using 3D XPoint rather than NAND).

An SSD using QLC flash often can't even handle 100 MB/s of continuous writes once its cache is full, and would be beaten by an old HDD.
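
If you want to watch it happen, run something like this on the Proxmox host while a download is going (pool name as in your setup) and look at how write throughput and latency develop as the caches fill up:

# pool name as in your setup
zpool iostat -v -l Tank2 1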
 
