What is the best storage option for high read / IO wait?

shubhank008

New Member
Aug 26, 2020
I wanted some suggestions on what storage option to use for my Proxmox containers for high read throughput. My current LVM-Thin setup has huge IO wait, the container lags at times, and file downloads take quite some time to start.

I need to serve 100-200MB files to around 1000 users at once. My server spec is decent enough, and CPU/RAM usage on the host and even in the container never peaks; it is mostly the IO wait/delay that rises.

I was wondering whether using something like ZFS or NFS directly on the host and mounting it in my CT would be a better option compared to the LVM/directory-based approach where the files are stored in the container itself.

I have a 4 x 4TB HDD setup. Would using RAID, compared to individual drives, make any difference? I remember reading somewhere that accessing drives separately provides more overall throughput (as data and R/W are spread across the 4 disks).
 
Overall, the biggest improvement will come from using SSDs instead of HDDs. I would stay away from any QLC SSDs, as they have terrible write performance and not-so-great read performance.

If you use ZFS you can also make use of its cache (ARC). Give the machine plenty of RAM and monitor how many read operations can be satisfied from RAM without going down to the disks. The tool arcstat can do that; even better, have some performance monitoring in place that keeps an eye on the ARC hit ratio.
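For reference, a quick way to keep an eye on this on the host could look like the following (assuming the standard OpenZFS userland tools are installed; the exact column names vary a bit between versions):

# print ARC statistics every 5 seconds (reads, hits, misses, hit ratio, ARC size)
arcstat 5

# one-off summary of the ARC, including the overall hit ratio
arc_summary

# the raw counters are also exposed via procfs
grep -E '^(hits|misses|size|c_max)' /proc/spl/kstat/zfs/arcstats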

If you like your data, use some kind of redundancy! If you create a ZFS pool only to store data on, and no VMs, you could also consider a raidz. If you plan to store VM disks on it as well, use a RAID10-like setup made up of mirrored vdevs.
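As a rough sketch, a RAID10-like pool out of the four disks could be created like this (the pool name tank and the device paths are placeholders, not your actual disks; zpool create wipes whatever is on them):

# two mirrored vdevs striped together (RAID10-like): ~8TB usable out of 4x4TB
zpool create -o ashift=12 tank \
    mirror /dev/disk/by-id/ata-DISK1 /dev/disk/by-id/ata-DISK2 \
    mirror /dev/disk/by-id/ata-DISK3 /dev/disk/by-id/ata-DISK4

# verify the layout and the usable space
zpool status tank
zfs list tank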
 

It's an online server, not a homelab, so SSDs are not an option for my ~10TB storage need; I can only use HDDs.
I do have the option to go for 32GB or 64GB of RAM if that helps with the cache.

What is the difference between storing data directly in the ZFS pool vs. in VMs? Currently I use LVM and all my data is stored inside the VMs (so some VMs are 6TB). Do you mean I could run a VM with just the OS, store all my data directly in the ZFS pool (which lives on the host), and mount it locally in the VM, keeping the VM itself at a few GB?

I am open to RAIDZ for increased performance, as I can get a server with more storage to make up for the space lost to RAID. Redundancy is currently not a priority, but it is something I might opt for in the future via Ceph (I cannot right now, as it would only let me use 1/3 of the storage space, so for 15TB I would need 45-50TB of HDD).
 
You mentioned containers. Containers work differently than VMs: since they run in the context of the host's kernel, a local directory can be mounted into them directly.
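As an example, a dataset on the host could be handed to a container as a bind mount roughly like this (the CT ID 101, the dataset name tank/data, and the target path /data are made-up values; see the Proxmox documentation on bind mount points for the caveats around backups and permissions):

# create a dataset on the host's ZFS pool to hold the files
zfs create tank/data

# bind-mount it into container 101 at /data
pct set 101 -mp0 /tank/data,mp=/data

# restart the container if the mount point does not show up while it is running
pct stop 101 && pct start 101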

If you serve the data from VMs, then you cannot do this easily. For the ZFS pool layout, if VMs are running on it, go with a set of mirrored vdevs. We have a chapter in the documentation covering the different aspects of ZFS pool design.


Be aware that for Ceph you need at least 3 nodes with a good, fast, and low-latency network between them.
 

My bad, I actually do not use VMs but containers only. As such, I am thinking of using ZFS with RAIDZ (similar to RAID5): I have 4 HDDs and would get to use the capacity of 3 of them, getting as many IOPS as I can with some redundancy and without losing too much storage space.
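A rough sketch of that layout, again with placeholder device paths (with 4x4TB in a raidz1, roughly three disks' worth of capacity ends up usable):

# single raidz1 vdev over all four disks: one disk of parity, ~3 disks of capacity
zpool create -o ashift=12 tank \
    raidz1 /dev/disk/by-id/ata-DISK1 /dev/disk/by-id/ata-DISK2 \
           /dev/disk/by-id/ata-DISK3 /dev/disk/by-id/ata-DISK4

# zpool list shows the raw size of all disks; zfs list shows what is actually usable
zfs list tank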

Ceph is not possible due to the lack of 10Gbit, or even a secondary NIC, for an internal private network between the servers, not to mention only being able to use 1/3 of the storage.
I was looking into GlusterFS, but quite a few users recommended not using it with Proxmox due to some compatibility issues and the non-native integration.

I am going to give ZFS with a cache a try. Do you recommend a single 200GB SSD or more RAM (32-64GB) for it?
 
What kind of cache? Write cache for sync writes (ZIL/SLOG) or read cache (L2ARC)? If you mean read cache, then go for more RAM. It is way faster, and if RAM is the limiting factor, adding an L2ARC does not help too much, because it too needs some RAM for ZFS to hold its index.
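If you go the RAM route, the ARC size on Proxmox can be capped (or raised) via the zfs_arc_max module parameter. The 48GiB below is just an example value for a 64GB host, not a recommendation, and the pool name and SSD path in the last line are placeholders:

# limit the ARC to 48 GiB (value in bytes); this overwrites /etc/modprobe.d/zfs.conf
echo "options zfs zfs_arc_max=51539607552" > /etc/modprobe.d/zfs.conf
update-initramfs -u    # then reboot for the new limit to apply

# check the current and maximum ARC size afterwards
awk '/^size|^c_max/ {print $1, $3}' /proc/spl/kstat/zfs/arcstats

# only if you still want to experiment with an L2ARC later on
zpool add tank cache /dev/disk/by-id/ata-SOME-SSD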
 
