Not another Best Practices Noob question please!!!

A_Wraithe

New Member
Nov 3, 2020
Greetings All!

Not trying to be -that- guy, so I've spent the better part of a week combing through this forum (and others), and I would much appreciate any thoughts/ideas/posts/personal experiences on the summary of what I've got and where I think I need to go:

1. Old Server: Repurposed gaming machine (Proxmox 6.4 / AMD Ryzen 2700X 8-core/16-thread 3.7GHz / 64GB DDR4 / 2x Samsung 2TB NVMe / 6x WD Yellow/Ent 6TB 7.2k spinners / 4x 1Gb Intel Pro → 2x bonded to switch, 2x bonded to 48TB NAS). Up to this point it has primarily been used to play around / learn / run a media server, etc. Currently it uses around 40-50GB of RAM with a single (permanent) Ubuntu VM running Plex / Docker (6 containers), 4 occasional VMs (Arch/Kali/etc.) on the NVMes, and the 36TB of spinners mounted/presented over NFS (76% full).

2. New/Additional Server: Dell R720 (waiting on a couple of parts): 2x Intel Xeon E5-2670 (8 cores/16 threads each, 32 threads total) / 192GB DDR3 / 8x 2TB SAS 7.2k spinners / 4x 1Gb onboard / 2x Mellanox SFP+ 10Gb...

So far I have not used / needed L2ARC or a SLOG, and it has mostly just been a fun messing-around project for spinning up new VMs to play with and hosting home automation / media / etc...

Now I am at a point where I can start being a bit more serious and look at tinkering with LXCs, additional VMs, etc., and actually use the old/new/combination of the two to do more than look over at it once a month and sigh to myself and say 'One day my friend, I will have time... One day...' lol...

Assumptions/Questions that I have specifically are:

1. To my knowledge, installing PMVE 7.2 on the new server and forming a cluster with the old one is most likely going to:
A. Give me a bit of experience playing with clustering via Proxmox (I'm not sure whether, once clustered, they can take advantage of all of each other's resources, e.g. CPU/RAM/graphics, or only storage?)
B. Also add overhead, as I would literally have two physical hosts running, as opposed to possibly ripping the components out of the old gaming tower, getting a cheap 2U chassis, transplanting the parts into that, and presenting it as ZFS over iSCSI / some other DAS-type back end over the 10Gb fiber/DAC connection?

2. If using ZFS over iSCSI, is clustering in a (separate) Proxmox instance going to be my only/best option (vice some sort of dumbed-down PMVE 'storage' server), or is there a better solution (e.g. installing <Insert Linux Flavor Here / FreeNAS / OpenZFS>) that allows minimal overhead and would just present the storage?
*** Sub-note on that: would the current AMD Ryzen / 64GB RAM be overkill for that, in your opinion? I have no problem repurposing what I've got lol ***

3. From my reading, it appears that ZFS over iSCSI really likes to see each disk it is managing presented individually / over a separate iSCSI connection, so I am assuming that would be a consideration for #2 above.

4. There are 4 additional PCIe expansion slots in the rear (3 low-profile & 1 of the 2 normal ones) where I can/could add in / transfer the 2TB NVMes and/or look at an Intel Optane/etc. So far I've not run an L2ARC nor a SLOG, as I've not seen any need to, but if I'm going to be running a lot more off of it, I thought I might plan ahead there as well.

5. Last but not least, the Dell R720 has an H710P PERC with 1028MB of cache on it. From everything I'm reading online, it should have no issues being flashed to IT mode, but if anyone has positive/negative experiences with that, it would be most appreciated (if it won't work, I need to be looking around for an HBA).

Just trying to get ahead of this and figure out if I need to order/procure anything else, attempt building my own NAS, etc., at this point.

Thanks in advance, and even a link dropped or something that points me in the right direction would be most appreciated!

Cheers!
~AW
 
Cluster nodes won't share any resources. The only thing that can be shared is the storage, and that won't come out of the box; you will need to set up a shared filesystem yourself, like an NFS share. (Ceph won't work with just 2 nodes, and ZFS isn't a shared storage: replication will just keep a copy of the guests on both nodes, so the storage isn't shared, it's more like mirroring across nodes.)

L2ARC is usually only recommended if you:
1.) already maxed out your RAM, so you can't upgrade your ARC
2.) have a specific workload of a known size that is too big to fit in your ARC. Let's say you've got a 200GB DB and only a 100GB ARC; here it could make sense to add a 100GB L2ARC so the DB would be read from SSD instead of the slow HDDs. But it won't make much sense if you've just got 64TB of movies or something similar.
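
If you want to sanity-check whether an L2ARC would even help, looking at the current ARC size and hit ratio is a good first step. Here is a minimal Python sketch, assuming ZFS on Linux where the stats live in /proc/spl/kstat/zfs/arcstats (the exact counter names can differ between OpenZFS versions):

Code:
#!/usr/bin/env python3
# Rough ARC health check: parse /proc/spl/kstat/zfs/arcstats and print
# the hit ratio plus current/target ARC size. Sketch only; the kstat
# path and field names assume ZFS on Linux and may vary by version.

ARCSTATS = "/proc/spl/kstat/zfs/arcstats"

def read_arcstats(path=ARCSTATS):
    stats = {}
    with open(path) as f:
        for line in f.readlines()[2:]:   # first two lines are kstat headers
            name, _type, value = line.split()
            stats[name] = int(value)
    return stats

if __name__ == "__main__":
    s = read_arcstats()
    hits, misses = s["hits"], s["misses"]
    total = hits + misses
    ratio = 100.0 * hits / total if total else 0.0
    print(f"ARC size  : {s['size'] / 2**30:.1f} GiB (target max {s['c_max'] / 2**30:.1f} GiB)")
    print(f"ARC hits  : {hits}, misses: {misses}")
    print(f"Hit ratio : {ratio:.1f} %")
    # If the hit ratio is already high, an L2ARC is unlikely to change much.

A consistently high hit ratio under your real workload is a sign the ARC already covers your working set and an L2ARC would mostly sit idle.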

And a SLOG will only help you with sync writes; it won't help at all with async writes. So you should first check the async/sync write ratio of your workload. If you are not running a lot of DBs or something similar, a SLOG won't help that much.
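
If you want to get a feel for how much sync-write traffic there actually is before buying anything, the ZIL counters can give a hint. A hedged Python sketch, again assuming ZFS on Linux; the /proc/spl/kstat/zfs/zil path and the zil_* counter names are assumptions and may differ between OpenZFS versions:

Code:
#!/usr/bin/env python3
# Dump the ZIL kstat counters to see how much sync-write activity there
# is. Sketch only: the path and the zil_* counter names are based on
# ZFS on Linux and can differ between OpenZFS versions.

ZIL_KSTAT = "/proc/spl/kstat/zfs/zil"

def read_kstat(path):
    stats = {}
    with open(path) as f:
        for line in f.readlines()[2:]:   # skip the two kstat header lines
            name, _type, value = line.split()
            stats[name] = int(value)
    return stats

if __name__ == "__main__":
    for name, value in sorted(read_kstat(ZIL_KSTAT).items()):
        print(f"{name:35} {value}")
    # Counters like zil_commit_count and the zil_itx_*_bytes values (if
    # present) show how many sync commits happen and how much data goes
    # through the ZIL. If they stay near zero under your normal workload,
    # a SLOG won't buy you much.

Take a snapshot of the counters, run your normal workload for a while, take another snapshot and compare the deltas; the absolute values since boot are less telling.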

Better might be to add some SSDs as special metadata devices ("special" vdevs). These boost the performance of both async and sync writes, and also the read performance.
 
L2ARC is usually only recommended if you:
1.) already maxed out your RAM, so you can't upgrade your ARC
2.) have a specific workload of a known size that is too big to fit in your ARC. Let's say you've got a 200GB DB and only a 100GB ARC; here it could make sense to add a 100GB L2ARC so the DB would be read from SSD instead of the slow HDDs. But it won't make much sense if you've just got 64TB of movies or something similar.

1. I do know that all 24 slots are already populated with 8GB modules (and from reading the manual, it appears that the only viable config past that is to start using 16GB modules or higher), so that would check that box, however... ↓

2. As of now I do not have a workload that would max out the ARC, although that does make me wonder about the memory allocation itself. For example, say I am running 2 VMs @ 32GB plus 4 VMs @ 8GB (96GB), and a handful of LXCs for another ~10GB, which adds up to roughly 106GB for running VMs/containers on the NVMes, plus whatever overhead PMVE itself needs (2GB?). Am I correct to assume that I would also add the 1GB of RAM per 1TB of storage per pool into my calculations? With the specs listed above that would be 4TB NVMe + 36TB spinners = 40GB, bringing the total to roughly 148GB of the listed 192GB RAM. Would this mean (not counting dynamic allocation/etc.) that in theory I would only have roughly 44GB left for the ARC?
(Or am I doubling up there, and the 2GB for the PMVE OS plus 1GB per 1TB of storage actually includes the reservation for the ARC, so I should just calculate it out as 2GB for PMVE plus 1GB per TB of storage, subtract that number, and keep my VMs/LXCs below -that- number?)

And a SLOG will only help you with sync writes; it won't help at all with async writes. So you should first check the async/sync write ratio of your workload. If you are not running a lot of DBs or something similar, a SLOG won't help that much.

Perfect! That is what I thought on that. From everything I read, I can dynamically add that later if I look at the ratios and see that it is necessary.

Better might be to add some SSDs as special metadata devices ("special" vdevs). These boost the performance of both async and sync writes, and also the read performance.
I'll have to look that one up. Is this basically just adding in the additional SSDs and then presenting them as an additional/new pool (for a specific use)?

And thanks again, I'm sure glad I verified the cluster/node bit before getting that up and running and being sorely disappointed.
My second option (B) was to try to put it on a specific (compatible) ZFS-formatted (I assume) DIY NAS/DAS and then present that to the new server via ZFS over iSCSI.
 
2. As of now I do not have a workload that would max out the ARC, although that does make me wonder about the memory allocation itself. For example, say I am running 2 VMs @ 32GB plus 4 VMs @ 8GB (96GB), and a handful of LXCs for another ~10GB, which adds up to roughly 106GB for running VMs/containers on the NVMes, plus whatever overhead PMVE itself needs (2GB?). Am I correct to assume that I would also add the 1GB of RAM per 1TB of storage per pool into my calculations? With the specs listed above that would be 4TB NVMe + 36TB spinners = 40GB, bringing the total to roughly 148GB of the listed 192GB RAM. Would this mean (not counting dynamic allocation/etc.) that in theory I would only have roughly 44GB left for the ARC?
(Or am I doubling up there, and the 2GB for the PMVE OS plus 1GB per 1TB of storage actually includes the reservation for the ARC, so I should just calculate it out as 2GB for PMVE plus 1GB per TB of storage, subtract that number, and keep my VMs/LXCs below -that- number?)
That 4GB + 1GB per 1TB of raw storage rule of thumb is for roughly dimensioning the ARC. By default with PVE, ZFS will always use up to 50% of your RAM, so if you've got 192GB RAM, ZFS will try to use 96GB for its ARC. So don't wonder when your server is always at 90+ % RAM usage.
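
To make the numbers from the question above concrete, here is a small Python sketch; the VM/LXC figures are just the ones from this thread, and the 4GB + 1GB-per-TB figure is only the sizing rule of thumb, not a hard reservation on top of the guest RAM:

Code:
# Back-of-the-envelope RAM budget for the R720, using the numbers from
# this thread. The "4GB + 1GB per TB raw storage" figure is only a rule
# of thumb for sizing the ARC; by default PVE/ZFS simply lets the ARC
# grow to 50% of RAM unless zfs_arc_max is set lower.

TOTAL_RAM_GB   = 192
VM_RAM_GB      = 2 * 32 + 4 * 8      # 2 VMs @ 32GB + 4 VMs @ 8GB = 96GB
LXC_RAM_GB     = 10                  # rough guess for a handful of LXCs
PVE_RAM_GB     = 2                   # host/OS overhead, rough guess
RAW_STORAGE_TB = 4 + 36              # 4TB NVMe + 36TB spinners

arc_rule_of_thumb_gb = 4 + 1 * RAW_STORAGE_TB          # ~44GB ARC target
arc_default_max_gb   = TOTAL_RAM_GB // 2               # 96GB (50% of RAM)

guests_and_host = VM_RAM_GB + LXC_RAM_GB + PVE_RAM_GB  # ~108GB
left_for_arc    = TOTAL_RAM_GB - guests_and_host       # ~84GB

print(f"Guests + host         : {guests_and_host} GB")
print(f"Left over for the ARC : {left_for_arc} GB")
print(f"ARC rule of thumb     : {arc_rule_of_thumb_gb} GB")
print(f"ARC default max (50%) : {arc_default_max_gb} GB")
# The rule of thumb is NOT an extra reservation on top of the guest RAM;
# it is just a guide for how big the ARC should be allowed to grow.

So with those numbers there is comfortably more RAM left over than the rule of thumb asks for, and the practical knob is capping the ARC (zfs_arc_max) if the default 50% would collide with the guests.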
I'll have to look that one up. Is this basically just adding in the additional SSDs and then presenting them as an additional/new pool (for a specific use)?
Not a new pool.
A pool can consist of several vdevs striped together, or even vdevs of different types. The "special" vdev is a special type of vdev that will only store metadata (but you can optionally tell it to also store small data blocks). Without one, your normal vdevs have to store both data and metadata. So when adding SSDs as a "special" vdev, your HDDs only need to store data. The HDDs should then be faster, because they are hit by far less IO, since the metadata IO is handled by the special vdev SSDs.
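
If you later want to decide whether to also push small data blocks onto the special vdev (the special_small_blocks dataset property), a rough look at your file-size distribution helps. A minimal Python sketch; the thresholds are just examples, and it counts file sizes rather than actual ZFS block sizes, so treat the output as an estimate only:

Code:
#!/usr/bin/env python3
# Rough estimate of how much data would land on a special vdev for a
# given special_small_blocks threshold, by walking a directory tree and
# bucketing file sizes. This looks at file sizes, not actual ZFS block
# sizes, so it is only an approximation.

import os
import sys

# Example thresholds to test (bytes); special_small_blocks is set per
# dataset to a power-of-two value like these.
THRESHOLDS = [4 * 1024, 16 * 1024, 64 * 1024, 128 * 1024]

def scan(root):
    sizes = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            try:
                sizes.append(os.path.getsize(os.path.join(dirpath, name)))
            except OSError:
                pass                     # skip unreadable/vanished files
    return sizes

if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    sizes = scan(root)
    total = sum(sizes)
    print(f"{len(sizes)} files, {total / 2**30:.1f} GiB total under {root}")
    for t in THRESHOLDS:
        small = [s for s in sizes if s <= t]
        print(f"<= {t // 1024:3d}K: {len(small):8d} files, "
              f"{sum(small) / 2**30:.2f} GiB would go to the special vdev")

That way you can see up front whether a given threshold would only catch metadata-sized leftovers or accidentally pull a large chunk of your data onto the SSDs.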
 
That 4GB + 1GB per 1TB of raw storage rule of thumb is for roughly dimensioning the ARC. By default with PVE, ZFS will always use up to 50% of your RAM, so if you've got 192GB RAM, ZFS will try to use 96GB for its ARC. So don't wonder when your server is always at 90+ % RAM usage.

Not a new pool.
A pool can consist of several vdevs striped together, or even vdevs of different types. The "special" vdev is a special type of vdev that will only store metadata (but you can optionally tell it to also store small data blocks). Without one, your normal vdevs have to store both data and metadata. So when adding SSDs as a "special" vdev, your HDDs only need to store data. The HDDs should then be faster, because they are hit by far less IO, since the metadata IO is handled by the special vdev SSDs.
Excellent! That answers whether it is 50% of total RAM vs 50% of non-provisioned RAM vs 50% of RAM available in real time!

Thank you again for the clarification!
 
